G06F40/45

AUTOMATIC SYNTHESIS OF TRANSLATED SPEECH USING SPEAKER-SPECIFIC PHONEMES

An embodiment includes converting an original audio signal to an original text string, the original audio signal being from a recording of the original text string spoken by a specific person in a source language. The embodiment generates a translated text string by translating the original text string from the source language to a target language, including translating a word from the source language into a translated word in the target language. The embodiment assembles a standard phoneme sequence from a set of standard phonemes, where the standard phoneme sequence includes a standard pronunciation of the translated word. The embodiment also associates a custom phoneme with a standard phoneme of the standard phoneme sequence, where the custom phoneme includes the specific person's pronunciation of a sound in the translated word. The embodiment synthesizes the translated text string into a translated audio signal in which the translated word is pronounced using the custom phoneme.
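The phoneme-substitution step described above can be illustrated with a minimal sketch. The lexicon, the speaker profile, and all function names below are hypothetical and chosen only for illustration; speech recognition, translation, and waveform synthesis are out of scope here, and the abstract does not specify how custom phonemes are stored.

```python
from typing import Dict, List

# Hypothetical pronouncing lexicon: target-language word -> standard phonemes.
STANDARD_LEXICON: Dict[str, List[str]] = {
    "hola": ["o", "l", "a"],
    "mundo": ["m", "u", "n", "d", "o"],
}

# Hypothetical speaker profile: standard phoneme -> identifier of a unit
# recorded from the specific person. Phonemes without an entry stay standard.
CUSTOM_PHONEMES: Dict[str, str] = {"o": "spk1_o", "l": "spk1_l"}


def assemble_standard_sequence(word: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Look up the standard pronunciation of a translated word."""
    return list(lexicon[word])


def apply_custom_phonemes(sequence: List[str], custom: Dict[str, str]) -> List[str]:
    """Swap each standard phoneme for the speaker's custom one when available."""
    return [custom.get(phoneme, phoneme) for phoneme in sequence]


def phonemes_for_text(translated_text: str) -> List[List[str]]:
    """Return one phoneme sequence per word, ready for a synthesizer back end."""
    sequences = []
    for word in translated_text.lower().split():
        standard = assemble_standard_sequence(word, STANDARD_LEXICON)
        sequences.append(apply_custom_phonemes(standard, CUSTOM_PHONEMES))
    return sequences
```

Calling `phonemes_for_text("Hola mundo")` yields one list of phoneme identifiers per word, mixing speaker-specific and standard units depending on what the profile covers.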

MULTILINGUAL SUPPORT FOR NATURAL LANGUAGE PROCESSING APPLICATIONS

A data processing system implements obtaining textual content in a first language from a first client device and segmenting the textual content into a plurality of first tokens. The system also implements translating the first tokens from the first language into second tokens in a second language using a bilingual dictionary, and extracting feature information from the second tokens to create a feature vector. The feature vector is provided to a first natural language processing model trained to analyze textual input in the second language and to output contextual information indicating one or more topics or subject matter of the textual content. The contextual information is provided to a first machine learning model configured to analyze the contextual information and to identify one or more content items predicted to be relevant to the contextual information. The system further implements providing information identifying the one or more content items to the first client device.
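The front end of this pipeline (segmentation, dictionary translation, feature extraction) can be sketched as follows. The dictionary entries, the vocabulary, the whitespace tokenizer, and the bag-of-words features are all assumptions made for illustration; the abstract does not name a tokenizer or a feature scheme, and the downstream models are omitted.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical bilingual dictionary (first language -> second language).
BILINGUAL_DICT: Dict[str, str] = {
    "gato": "cat",
    "perro": "dog",
    "come": "eats",
    "pescado": "fish",
}

# Hypothetical fixed vocabulary that defines the feature vector's dimensions.
VOCABULARY: List[str] = ["cat", "dog", "eats", "fish"]


def segment(text: str) -> List[str]:
    """Split text into first tokens (assumption: whitespace tokenization)."""
    return text.lower().split()


def translate_tokens(tokens: List[str], dictionary: Dict[str, str]) -> List[str]:
    """Translate token by token; unknown tokens pass through unchanged (assumption)."""
    return [dictionary.get(token, token) for token in tokens]


def feature_vector(tokens: List[str], vocabulary: List[str]) -> List[int]:
    """Build a bag-of-words count vector over the fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


second_tokens = translate_tokens(segment("El gato come pescado"), BILINGUAL_DICT)
features = feature_vector(second_tokens, VOCABULARY)
```

In this sketch the resulting `features` list is what would be handed to the second-language model; tokens outside the vocabulary simply contribute nothing to the vector.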

SYSTEMS AND METHODS FOR CODE-MIXING ADVERSARIAL TRAINING
20220164547 · 2022-05-26

Embodiments described herein provide adversarial attacks targeting the cross-lingual generalization ability of massive multilingual representations, demonstrating their effectiveness on multilingual models for natural language inference and question answering. An efficient adversarial training scheme built on these attacks takes the same number of steps as standard supervised training and is shown to encourage language-invariance in representations, thereby improving both clean and robust accuracy.

METHOD AND SYSTEM FOR SUGGESTING REVISIONS TO AN ELECTRONIC DOCUMENT

A method for suggesting revisions to a document-under-analysis from a seed database, the seed database including a plurality of original texts, each associated with a respective one of a plurality of final texts. The method includes selecting a statement-under-analysis (“SUA”); selecting a first original text of the plurality of original texts; determining a first edit-type classification of the first original text with respect to its associated final text; and generating, based on the first edit-type classification, a first similarity score representing a degree of similarity between the SUA and the first original text. The same steps are applied to a second original text of the plurality of original texts, yielding a second edit-type classification and a second similarity score representing a degree of similarity between the SUA and the second original text. The method further includes selecting a candidate original text from the first original text and the second original text, and creating an edited SUA (“ESUA”) by modifying a copy of the SUA consistent with the candidate final text associated with the candidate original text.
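A rough sketch of this flow, using Python's standard `difflib`, is given below. The seed pairs, the edit-type labels, the choice of similarity metric per edit type, and the token-replacement rule used to build the ESUA are all assumptions for illustration; the abstract does not define any of them, and only replacement-type edits are transferred to the SUA here.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical seed database of (original text, final text) pairs.
SEED: List[Tuple[str, str]] = [
    ("The fee is due in 30 days", "The fee is due in 60 days"),
    ("Either party may terminate", "Either party may terminate with notice"),
]


def edit_type(original: str, final: str) -> str:
    """Classify how the original was edited into the final (labels are assumptions)."""
    if original == final:
        return "unchanged"
    ops = {op[0] for op in SequenceMatcher(None, original.split(), final.split()).get_opcodes()}
    ops.discard("equal")
    if ops == {"insert"}:
        return "insertion"
    if ops == {"delete"}:
        return "deletion"
    return "replacement"


def similarity(sua: str, original: str, kind: str) -> float:
    """Pick the metric from the edit type: token-level for pure insertions
    or unchanged text, character-level otherwise (an assumed policy)."""
    if kind in ("insertion", "unchanged"):
        return SequenceMatcher(None, sua.split(), original.split()).ratio()
    return SequenceMatcher(None, sua, original).ratio()


def replace_tokens(tokens: List[str], old: List[str], new: List[str]) -> List[str]:
    """Replace every occurrence of the token run `old` with `new`."""
    out, i = [], 0
    while i < len(tokens):
        if old and tokens[i:i + len(old)] == old:
            out.extend(new)
            i += len(old)
        else:
            out.append(tokens[i])
            i += 1
    return out


def suggest(sua: str) -> str:
    """Select the best-scoring seed pair and apply its replacements to a copy of the SUA."""
    original, final = max(SEED, key=lambda p: similarity(sua, p[0], edit_type(*p)))
    o, f = original.split(), final.split()
    esua = sua.split()
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, o, f).get_opcodes():
        if tag == "replace":
            esua = replace_tokens(esua, o[i1:i2], f[j1:j2])
    return " ".join(esua)
```

For the SUA "The payment is due in 30 days", the first seed pair scores highest, and its single replacement (30 to 60) is carried over to the copy of the SUA while the rest of the statement is left as written.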

System, method, and recording medium for corpus pattern paraphrasing

A corpus pattern paraphrasing method, system, and non-transitory computer-readable medium include aligning slots of patterns for verbal phrases based on syntactical and lexical features, along with calculated synonyms, to predict paraphrases that are not previously stored in a corpus of sentences in a database.

CONTEXT-AWARE MACHINE LANGUAGE IDENTIFICATION
20220147720 · 2022-05-12

A machine translation system, a ChatOps system, a method for context-aware machine language identification, and a computer program product are provided. One embodiment of the machine translation system may include a density calculator. The density calculator may be adapted to calculate a part of speech (POS) density for a plurality of word tokens in an input text, calculate a knowledge density for the plurality of word tokens, and calculate an information density for the plurality of word tokens using the POS density and the knowledge density. In some embodiments, the machine translation system may further comprise a sememe attacher and a context translator.
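The abstract does not give formulas for the three densities, so the following is only a toy sketch of how a density calculator might combine them. The tag lexicon, the per-tag weights, the knowledge base, and the equal-weight average are all assumptions made for illustration, not the method of the application.

```python
from typing import Dict, List, Set

# Hypothetical part-of-speech lexicon; unlisted tokens are tagged "OTHER".
TOY_TAGS: Dict[str, str] = {"restart": "VERB", "server": "NOUN", "database": "NOUN"}

# Hypothetical weight per POS tag (assumption: content words carry more weight).
POS_WEIGHT: Dict[str, float] = {"NOUN": 1.0, "VERB": 0.5, "OTHER": 0.0}

# Hypothetical knowledge base of domain terms.
KNOWLEDGE_BASE: Set[str] = {"server", "database"}


def pos_density(token: str) -> float:
    """Score a token by the weight of its part-of-speech tag."""
    return POS_WEIGHT[TOY_TAGS.get(token, "OTHER")]


def knowledge_density(token: str) -> float:
    """Score 1.0 when the token appears in the knowledge base, else 0.0."""
    return 1.0 if token in KNOWLEDGE_BASE else 0.0


def information_density(tokens: List[str]) -> List[float]:
    """Combine both densities per token (assumption: equal-weight average)."""
    return [0.5 * pos_density(t) + 0.5 * knowledge_density(t) for t in tokens]


densities = information_density("restart the server".split())
```

A per-token score like this could, for example, let later stages focus on high-density tokens, though how the scores are consumed is not stated in the abstract.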

Translation engine suggestion via targeted probes

A translation-engine suggestion method, system, and computer program product include identifying probes for third-party translation engines for an input text, segmenting sections of the input text into a plurality of segments according to the identified probes, fragmenting the input text into fragments according to the segments, applying each fragment to the identified probe using the corresponding third-party translation engine, and outputting a translation by combining the fragments.
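One way to picture the probe-and-fragment flow is the sketch below. The two engines are stand-in functions rather than real third-party services, and modelling a probe as a predicate paired with an engine, fragmenting on sentence boundaries, and joining results with spaces are all assumptions for illustration.

```python
from typing import Callable, List, Tuple

Engine = Callable[[str], str]
Probe = Callable[[str], bool]


def legal_engine(text: str) -> str:
    """Stand-in for a hypothetical engine specialised in legal text."""
    return f"[legal:{text}]"


def general_engine(text: str) -> str:
    """Stand-in for a hypothetical general-purpose engine."""
    return f"[general:{text}]"


# Hypothetical probes in priority order; the last one matches everything.
PROBES: List[Tuple[Probe, Engine]] = [
    (lambda s: "contract" in s.lower(), legal_engine),
    (lambda s: True, general_engine),
]


def fragment(text: str) -> List[str]:
    """Split the input into fragments (assumption: one fragment per sentence)."""
    return [part.strip() for part in text.split(".") if part.strip()]


def translate(text: str) -> str:
    """Route each fragment to the first engine whose probe matches, then combine."""
    outputs = []
    for frag in fragment(text):
        engine = next(e for probe, e in PROBES if probe(frag))
        outputs.append(engine(frag))
    return " ".join(outputs)


result = translate("The contract is void. See you tomorrow.")
```

Keeping the probes in an ordered list makes the routing policy easy to inspect and extend; a real system would presumably derive probes from engine quality data rather than keyword checks.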