Patent classifications
G06F40/45
Automatic synthesis of translated speech using speaker-specific phonemes
An embodiment includes converting an original audio signal to an original text string, the original audio signal being from a recording of the original text string spoken by a specific person in a source language. The embodiment generates a translated text string by translating the original text string from the source language to a target language, including translation of a word from the source language to the target language. The embodiment assembles a standard phoneme sequence from a set of standard phonemes, where the standard phoneme sequence includes a standard pronunciation of the translated word. The embodiment also associates a custom phoneme with a standard phoneme of the standard phoneme sequence, where the custom phoneme includes the specific person's pronunciation of a sound in the translated word. The embodiment synthesizes the translated text string to a translated audio signal including the translated word pronounced using the custom phoneme.
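The association between standard and custom phonemes can be pictured as a lookup applied to the standard phoneme sequence before synthesis. The following is only a minimal sketch: the phoneme labels, the `custom_map` dictionary, and the function name `apply_custom_phonemes` are illustrative assumptions, not details taken from the patent.

```python
def apply_custom_phonemes(standard_sequence, custom_map):
    """Swap in the speaker's own phoneme wherever one has been recorded.

    standard_sequence: list of standard phoneme labels for a translated word.
    custom_map: hypothetical mapping from a standard phoneme label to the
                label of that speaker's custom phoneme.
    """
    # Phonemes without a custom counterpart keep their standard form.
    return [custom_map.get(phoneme, phoneme) for phoneme in standard_sequence]


# Standard pronunciation of a translated word (labels are made up for illustration).
standard = ["HH", "AH", "L", "OW"]
# Only one sound has a speaker-specific recording in this example.
custom = {"OW": "OW_speaker"}

result = apply_custom_phonemes(standard, custom)
print(result)
```

A synthesizer would then render `result`, using the speaker-specific unit for the last sound and standard units elsewhere.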
TRANSLATION APPARATUS, TRANSLATION SYSTEM, AND NON-TRANSITORY COMPUTER READABLE MEDIUM
A translation apparatus includes a translation unit which translates content of a document into a different language, a history creating unit which, in translation of the content from a first language into a second language, creates history information including a correspondence between original text in the first language and translated text in the second language, an extraction unit which, in translation of the content from the second language into another language, if content (present content) of the document in the second language is present in the history information, extracts content (absent content) that is not present in the history information, and a combining unit which combines a translation result obtained by translating the present content from the second language into the other language, with a replacement result obtained by replacing the absent content from the second language to the other language based on the history information.
Method and system for suggesting revisions to an electronic document
A method for suggesting revisions to a document-under-analysis from a seed database, the seed database including a plurality of original texts each respectively associated with one of a plurality of final texts, the method for suggesting revisions including selecting a statement-under-analysis (“SUA”), selecting a first original text of the plurality of original texts, determining a first edit-type classification of the first original text with respect to its associated final text, generating a first similarity score for the first original text based on the first edit-type classification, the first similarity score representing a degree of similarity between the SUA and the first original text, selecting a second original text of the plurality of original texts, determining a second edit-type classification of the second original text with respect to its associated final text, generating a second similarity score for the second original text based on the second edit-type classification, the second similarity score representing a degree of similarity between the SUA and the second original text, selecting a candidate original text from one of the first original text and the second original text, and creating an edited SUA (“ESUA”) by modifying a copy of the SUA consistent with a candidate final text associated with the candidate original text.
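The classify, score, and select steps described above can be sketched as follows. This is an illustration under stated assumptions: the abstract does not say how the edit-type classification feeds into the similarity score, so the `EDIT_TYPE_WEIGHT` table, the token-level use of `difflib.SequenceMatcher`, and all function names are hypothetical. The sketch stops at candidate selection and does not build the ESUA.

```python
from difflib import SequenceMatcher

# Hypothetical weights per edit type; the abstract does not specify them.
EDIT_TYPE_WEIGHT = {
    "unchanged": 0.5,
    "insertion": 1.0,
    "deletion": 1.0,
    "replacement": 1.0,
}


def classify_edit(original, final):
    """Classify how an original text was edited to produce its final text."""
    if original == final:
        return "unchanged"
    matcher = SequenceMatcher(None, original.split(), final.split())
    ops = {tag for tag, *_ in matcher.get_opcodes() if tag != "equal"}
    if ops == {"insert"}:
        return "insertion"
    if ops == {"delete"}:
        return "deletion"
    return "replacement"


def similarity_score(sua, original, edit_type):
    """Token overlap between the SUA and an original text, scaled by edit type."""
    overlap = SequenceMatcher(None, sua.split(), original.split()).ratio()
    return overlap * EDIT_TYPE_WEIGHT[edit_type]


def select_candidate(sua, seed_database):
    """Return (score, original, final) for the best-scoring seed entry."""
    scored = []
    for original, final in seed_database:
        edit_type = classify_edit(original, final)
        scored.append((similarity_score(sua, original, edit_type), original, final))
    return max(scored)


seed_database = [
    ("The supplier shall deliver goods within 30 days.",
     "The supplier shall deliver goods within 60 days."),
    ("This agreement is governed by the laws of Ohio.",
     "This agreement is governed by the laws of Ohio."),
]
sua = "The supplier shall deliver the goods within 30 days."

best = select_candidate(sua, seed_database)
print(best[1], "->", best[2])
```

Here the first seed entry is selected because it shares most of its tokens with the SUA; its associated final text would then guide the revision.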
Method and system of translating a source phrase in a first language into a target phrase in a second language
There is disclosed a method and system for translating a source phrase in a first language into a target phrase in a second language. The method is executable by a device configured to access an index comprising a set of source sentences in the first language and a set of target sentences in the second language, each of the target sentences corresponding to a translation of a given source sentence. The method comprises: acquiring the source phrase; generating, by a translation algorithm, one or more target phrases, each of the one or more target phrases having a different semantic meaning within the second language; retrieving, from the index, a respective target sentence for each of the one or more target phrases, the respective target sentence comprising one of the one or more target phrases; and selecting each of the one or more target phrases and the respective target sentences for display.
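The retrieval step can be illustrated with a small in-memory index of sentence pairs. This is a sketch only: the translation algorithm is replaced by a hard-coded dictionary (`CANDIDATE_TRANSLATIONS`), the substring match is an assumed retrieval strategy, and every name below is hypothetical.

```python
# Stand-in for the translation algorithm: an English source phrase mapped to
# French target phrases with different meanings (illustrative data only).
CANDIDATE_TRANSLATIONS = {"bank": ["banque", "rive"]}

# Index of (source sentence, target sentence) pairs.
INDEX = [
    ("I work in a bank.", "Je travaille dans une banque."),
    ("The boat touches the bank.", "Le bateau touche la rive."),
]


def find_examples(target_phrases, index):
    """Return, for each target phrase, the first target sentence containing it."""
    examples = {}
    for phrase in target_phrases:
        for _source_sentence, target_sentence in index:
            if phrase.lower() in target_sentence.lower():
                examples[phrase] = target_sentence
                break
    return examples


target_phrases = CANDIDATE_TRANSLATIONS["bank"]
examples = find_examples(target_phrases, INDEX)
for phrase, sentence in examples.items():
    print(f"{phrase}: {sentence}")
```

Each target phrase is shown together with a sentence that uses it, which lets a reader tell the meanings apart.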
END-TO-END NEURAL WORD ALIGNMENT PROCESS OF SUGGESTING FORMATTING IN MACHINE TRANSLATIONS
In an embodiment, the disclosure provides a programmed computer system implemented via client-server Software as a Service (SaaS) techniques that allows for machine translation of digital content. When translating digital content, linguists must translate more than just the text on the page. Formatting, for example, is a commonly used and important aspect of online content that is typically managed with tags, such as <b> for bold and <i> for italics. When linguists work, they must ensure these tags are placed accurately as part of the translation. Projecting tags accurately depends on successfully accomplishing the challenging task of word alignment. Unfortunately, if word alignment is inaccurate, it makes placing formatting tags very difficult. In an embodiment, the present disclosure provides a method of not only translating text, but also efficiently and accurately projecting tags from input text in one language to output text in another language.
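To show why tag projection depends on word alignment, the sketch below moves a single tag from a source token span to the target tokens aligned with it. The alignment is supplied by hand here, whereas the disclosure obtains it from a neural model; the function name `project_tag`, the token-index convention, and the example sentences are assumptions made for illustration.

```python
def project_tag(source_span, alignment, target_tokens, tag):
    """Wrap the target tokens aligned to a tagged source span in the same tag.

    source_span: (start, end) inclusive token indices of the tagged source text.
    alignment: list of (source_index, target_index) pairs.
    """
    start, end = source_span
    aligned = sorted(t for s, t in alignment if start <= s <= end)
    if not aligned:
        # Nothing aligned to the span: leave the target text untagged.
        return " ".join(target_tokens)
    first, last = aligned[0], aligned[-1]
    out = list(target_tokens)
    out[first] = f"<{tag}>" + out[first]
    out[last] = out[last] + f"</{tag}>"
    return " ".join(out)


# Source: "the <b>red car</b> brakes" -> tokens 1..2 are bold.
source_tokens = ["the", "red", "car", "brakes"]
target_tokens = ["la", "voiture", "rouge", "freine"]
# Word order differs between the languages: "red" aligns to "rouge", "car" to "voiture".
alignment = [(0, 0), (1, 2), (2, 1), (3, 3)]

tagged = project_tag((1, 2), alignment, target_tokens, "b")
print(tagged)
```

If the alignment pairs were wrong, the tag would wrap the wrong target words, which is the difficulty the abstract describes.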
Word vector retrofitting method and apparatus
The present disclosure discloses a word vector retrofitting method. The method includes obtaining, by a computing device, a first model and a second model that are generated when original word vectors are trained, the first model being configured to predict a context according to an inputted word, and the second model being configured to predict a target word according to a context; inputting a corpus unit from a target corpus into the first model, inputting an output of the first model into the second model, and determining losses generated by the first model and the second model when the second model outputs the corpus unit; and retrofitting the first model and the second model according to the losses.
TRANSLATION APPARATUS, TRANSLATION METHOD AND PROGRAM
A translation apparatus includes: a preprocessing unit that takes an input sentence in a source language and outputs a token string in which the input sentence has been segmented into tokens, the tokens being a predetermined unit of processing; an output sequence prediction unit that inputs the token string output by the preprocessing unit to a trained translation model and predicts a word translation probability of a translation candidate for each token of the token string from the trained translation model; a word set prediction unit that checks each token of the token string output by the preprocessing unit against entry words of a bilingual dictionary, and upon detecting an entry word that agrees with the token in the bilingual dictionary, generates a target-language word set from a set of tokens constituting a translation phrase corresponding to the detected entry word; and an output sequence determination unit that computes a reward which is based on whether a translation candidate for each token of the input sentence is included in the target-language word set or not and determines a translated sentence of the input sentence based on a word translation score computed by adding the reward to the word translation probability of the translation candidate. Units of tokens constituting the translation phrase in the bilingual dictionary are subwords.
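The scoring rule, a reward added to the model's word translation probability when a candidate appears in the dictionary-derived word set, can be sketched as below. The reward value, the toy probabilities, the dictionary contents, and the function names are all assumptions; a real system would apply this rule inside the decoder at each output step.

```python
def build_target_word_set(source_tokens, bilingual_dictionary):
    """Collect target-language subword tokens for source tokens found in the dictionary."""
    word_set = set()
    for token in source_tokens:
        if token in bilingual_dictionary:
            word_set.update(bilingual_dictionary[token])
    return word_set


def rescore(candidates, target_word_set, reward=0.1):
    """Add a fixed reward (hypothetical value) to candidates in the word set."""
    return {
        token: probability + (reward if token in target_word_set else 0.0)
        for token, probability in candidates.items()
    }


# Toy data: German source token, English candidates with model probabilities.
source_tokens = ["die", "Maus"]
bilingual_dictionary = {"Maus": ["mouse"]}  # entry word -> subword tokens of its translation
candidates = {"rat": 0.45, "mouse": 0.40}

word_set = build_target_word_set(source_tokens, bilingual_dictionary)
scores = rescore(candidates, word_set)
best_token = max(scores, key=scores.get)
print(best_token)
```

Without the reward the model would prefer "rat"; with it, the dictionary-backed candidate "mouse" receives the higher word translation score.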
MACHINE TRANSLATION GUIDED BY REFERENCE DOCUMENTS
Systems and methods for computerized machine translation of a document from a source language to a target language, where the translation is guided by additional inputs, which are one or more reference documents in the source language and their corresponding reference translation(s) in the target language, or are links thereto.