Patent classifications
G06F40/47
METHOD FOR TRAINING NON-AUTOREGRESSIVE TRANSLATION MODEL
A method for training a non-autoregressive translation (NAT) model includes: acquiring a source language text, a target language text corresponding to the source language text and a target length of the target language text; generating a target language prediction text and a prediction length by inputting the source language text into the NAT model, in which initialization parameters of the NAT model are determined based on parameters of a pre-trained translation model; and obtaining a target NAT model by training the NAT model based on the target language text, the target language prediction text, the target length and the prediction length.
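The abstract does not state how the four quantities are combined during training. As a minimal sketch, assume the model emits a probability distribution per target position and a distribution over candidate lengths, and that the objective is a token cross-entropy plus a weighted length cross-entropy. The names `nat_loss` and `length_weight` are hypothetical and do not come from the patent.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(probs[target_index])


def nat_loss(token_probs, target_ids, length_probs, target_length, length_weight=1.0):
    """Assumed objective: mean token loss plus weighted length loss.

    token_probs   -- one probability vector per target position (all positions
                     are predicted in parallel, which is the NAT property)
    target_ids    -- reference token id for each position
    length_probs  -- probability vector indexed by candidate length
    target_length -- reference length of the target text
    """
    token_loss = sum(
        cross_entropy(p, t) for p, t in zip(token_probs, target_ids)
    ) / len(target_ids)
    length_loss = cross_entropy(length_probs, target_length)
    return token_loss + length_weight * length_loss
```

Under this assumption, a perfectly predicted length contributes zero to the total, so only the token term drives the update.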
METHODS AND SYSTEMS FOR EXPANDING VOCABULARY
The present disclosure provides a method and a system for expanding vocabulary. The method includes: obtaining a target vocabulary, the target vocabulary including a single word or a phrase composed of two or more words; obtaining at least one candidate text associated with the target vocabulary; determining a plurality of candidate vocabularies from the at least one candidate text, the plurality of candidate vocabularies including words from the at least one candidate text and a phrase formed by at least two consecutive words in position; and determining at least one expansion vocabulary of the target vocabulary from the plurality of candidate vocabularies.
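The candidate-generation step described above (single words plus phrases of at least two consecutive words) can be sketched as n-gram enumeration. This is an illustration only: whitespace tokenization, lowercasing, and the `max_phrase_len` cap are assumptions, and `candidate_vocabularies` is a hypothetical name.

```python
def candidate_vocabularies(text, max_phrase_len=3):
    """Return single words and phrases of 2..max_phrase_len consecutive words.

    Duplicates are dropped while preserving first-seen order.
    """
    words = text.lower().split()  # assumed tokenizer: split on whitespace
    candidates = []
    seen = set()
    for n in range(1, max_phrase_len + 1):
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            if phrase not in seen:
                seen.add(phrase)
                candidates.append(phrase)
    return candidates
```

Selecting the expansion vocabulary from these candidates would need a scoring step that the abstract does not describe, so it is left out here.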
Document translation method and apparatus, storage medium, and electronic device
A document translation method includes: displaying a source text display region, a translated text region, and an editing region, wherein textual content in a document to be translated is displayed in the source text display region, and reference translated text for the textual content is displayed in the translated text region; and providing a translation recommendation from the reference translated text according to input from a user within the editing region. The method further includes: displaying the translation recommendation in the editing region as a translation result, if a confirmation operation for the translation recommendation is detected; and receiving a translation inputted by the user that is different from the translation recommendation and displaying the translation inputted by the user in the editing region as the translation result, if a non-confirmation operation for the translation recommendation is detected.
Enhanced graphical user interface for voice communications
Enhanced graphical user interfaces for transcription of audio and video messages are disclosed. Audio data may be transcribed, and the transcription may include emphasized words and/or punctuation corresponding to emphasis of user speech. Additionally, the transcription may be translated into a second language. A message spoken by a user depicted in one or more images of video data may also be transcribed and provided to one or more devices.
Natural language processing engine for translating questions into executable database queries
A system and method for translating questions into database queries are provided. A text-to-database-query system receives a natural language question and a structure in a database. Question tokens are generated from the question and query tokens are generated from the structure in the database. The question tokens and query tokens are concatenated into a sentence and a sentence token is added to the sentence. A BERT network generates question hidden states for the question tokens, query hidden states for the query tokens, and a classifier hidden state for the sentence token. A translatability predictor network determines if the question is translatable or untranslatable. A decoder converts a translatable question into an executable query. A confusion span predictor network identifies a confusion span in the untranslatable question that causes the question to be untranslatable. An auto-correction module auto-corrects the tokens in the confusion span.
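The input-construction step (question tokens and query tokens concatenated, with a sentence token added) might look like the sketch below. It assumes the sentence token is the `[CLS]` marker and segments are separated by `[SEP]`, as is conventional for BERT inputs, and it uses whitespace and underscore splitting in place of a real subword tokenizer. The abstract specifies none of these details, and `build_input` is a hypothetical name.

```python
CLS = "[CLS]"  # assumed sentence (classifier) token
SEP = "[SEP]"  # assumed separator between segments


def build_input(question, schema_columns):
    """Concatenate question tokens and schema-derived query tokens.

    question       -- natural language question
    schema_columns -- column names standing in for "a structure in a database"
    """
    question_tokens = question.lower().split()
    query_tokens = []
    for column in schema_columns:
        # Assumed convention: snake_case column names split into word tokens.
        query_tokens.extend(column.lower().split("_"))
        query_tokens.append(SEP)
    return [CLS] + question_tokens + [SEP] + query_tokens
```

In this layout the hidden state at position 0 would serve as the classifier hidden state that feeds the translatability predictor.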
PREDICTING FUTURE TRANSLATIONS
Technology is disclosed for snippet pre-translation and dynamic selection of translation systems. Pre-translation uses snippet attributes such as characteristics of a snippet author, snippet topics, snippet context, expected snippet viewers, etc., to predict how many translation requests for the snippet are likely to be received. An appropriate translator can be dynamically selected to produce a translation of a snippet either as a result of the snippet being selected for pre-translation or from another trigger, such as a user requesting a translation of the snippet. Some translators can generate high-quality translations after a period of time, while other translators can generate lower-quality translations earlier. Dynamic selection of translators involves dynamically selecting machine or human translation, e.g., based on a quality of translation that is desired. Translations can be improved over time by employing better machine or human translators, such as when a snippet is identified as being more popular.
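A toy sketch of the two ideas above, predicting translation demand from snippet attributes and then picking a translator tier, is given below. The attributes, the demand formula, the thresholds, and the tier names are all hypothetical; the abstract does not define any of them.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    expected_viewers: int     # assumed attribute: how many users will see the snippet
    foreign_fraction: float   # assumed attribute: share of viewers needing a translation


def predicted_requests(snippet):
    """Assumed demand model: viewers who cannot read the source language."""
    return snippet.expected_viewers * snippet.foreign_fraction


def select_translator(snippet, pretranslate_threshold=100.0, human_threshold=10000.0):
    """Pick a translator tier from predicted demand (thresholds are illustrative)."""
    demand = predicted_requests(snippet)
    if demand < pretranslate_threshold:
        return "on_demand_machine"     # translate only when a user asks
    if demand < human_threshold:
        return "pretranslate_machine"  # fast, lower-quality translation up front
    return "pretranslate_human"        # slower, higher-quality translation


print(select_translator(Snippet(expected_viewers=1000, foreign_fraction=0.5)))
```

Re-running the selection as the predicted demand grows reflects the abstract's point that a translation can be upgraded once a snippet becomes more popular.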