G06F40/47

UNIVERSAL DATA LANGUAGE TRANSLATOR
20230004729 · 2023-01-05 ·

The present disclosure is directed to a universal data language (UDL) translator. Specifically, the systems and methods disclosed enable input data from a variety of sources to be translated into a UDL that can be consistently analyzed and compared against other sources of data. For example, an entity may upload input data that has a plurality of data terms and definitions (e.g., header column in a spreadsheet). These terms may be duplicative and/or inaccurate with respect to the underlying data. If the entity wishes to compare and transact data within a data marketplace, the entity may not fully comprehend what data it is missing and/or what data another entity may have to offer for trade. To remedy this problem of business semantic management, the present invention discloses steps for creating a UDL and a UDL translator so that any input data can be translated to UDL.
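
The translation step described above can be illustrated with a minimal sketch. It assumes the UDL is a fixed vocabulary of canonical terms with known aliases; the vocabulary, the alias lists, and the function names are hypothetical and are not taken from the disclosure.

```python
import re

# Hypothetical UDL vocabulary: canonical term -> known source aliases.
UDL_VOCABULARY = {
    "customer_id": {"customer id", "cust id", "client number"},
    "postal_code": {"zip", "zip code", "postcode"},
    "email_address": {"email", "e mail", "email addr"},
}


def normalize(term: str) -> str:
    """Lowercase a header and collapse punctuation/whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", term.lower()).strip()


# Reverse index: normalized alias (or the canonical term itself) -> UDL term.
ALIAS_TO_UDL = {
    normalize(alias): udl_term
    for udl_term, aliases in UDL_VOCABULARY.items()
    for alias in aliases | {udl_term}
}


def translate_headers(headers):
    """Map each source header to a UDL term, or None when no alias matches."""
    return {h: ALIAS_TO_UDL.get(normalize(h)) for h in headers}


def missing_terms(headers):
    """List the UDL terms that the uploaded headers do not cover."""
    covered = {t for t in translate_headers(headers).values() if t}
    return sorted(set(UDL_VOCABULARY) - covered)


if __name__ == "__main__":
    uploaded = ["Cust_ID", "ZIP Code", "Revenue"]
    print(translate_headers(uploaded))
    print(missing_terms(uploaded))
```

Once two entities' headers are expressed in the same vocabulary, a comparison such as `missing_terms` shows what data one side lacks. Unmatched headers are returned as `None` rather than guessed.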

MODEL MAPPING AND ENRICHMENT SYSTEM
20230004728 · 2023-01-05 ·

Disclosed herein are various embodiments for training and enriching a natural language processing system. An embodiment operates by determining that a first prediction from a first machine model has been generated based on a dataset comprising a plurality of attributes. A technical map identifying a first subset of attributes of the plurality of attributes used to generate the first prediction by the first machine model is generated. Natural language translations corresponding to at least a portion of the first subset of attributes used to generate the first prediction by the first machine model are identified. A natural language map of the first subset of attributes is generated based on the natural language translations. The natural language map is provided with the first prediction.
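
As a rough illustration of the two maps, the sketch below assumes the model exposes per-attribute importance weights and that a lookup table of natural language translations exists. The attribute names, the weights, and the top-k selection rule are all hypothetical.

```python
def technical_map(importances, top_k=3):
    """Keep the top_k attributes by absolute weight (assumed selection rule)."""
    ranked = sorted(importances.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return dict(ranked[:top_k])


def natural_language_map(tech_map, translations):
    """Relabel attributes that have a translation; skip the ones that do not."""
    return {
        translations[attr]: weight
        for attr, weight in tech_map.items()
        if attr in translations
    }


# Hypothetical per-attribute weights behind one prediction.
IMPORTANCES = {"acct_age_m": 0.1, "n_late_pmt": -0.6, "zip3": 0.05, "bal_usd": 0.3}

# Hypothetical translations covering only a portion of the attributes.
TRANSLATIONS = {
    "n_late_pmt": "number of late payments",
    "bal_usd": "account balance in US dollars",
}

if __name__ == "__main__":
    tech = technical_map(IMPORTANCES)
    print(tech)
    print(natural_language_map(tech, TRANSLATIONS))
```

Attributes without a translation are dropped from the natural language map, which matches the abstract's wording that translations cover "at least a portion" of the subset.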

Multilingual speech translation with adaptive speech synthesis and adaptive physiognomy

Techniques for the generation of dubbed audio for an audio/visual file are described. An exemplary approach is to receive a request to generate dubbed speech for an audio/visual file and, in response to the request: extract speech segments from an audio track of the audio/visual file associated with identified speakers; translate the extracted speech segments into a target language; determine a trained machine learning model per identified speaker, the trained machine learning models to be used to generate a spoken version of the translated, extracted speech segments based on the identified speaker; generate, per translated, extracted speech segment, a spoken version of the translated, extracted speech segment using the trained machine learning model that corresponds to the identified speaker of that segment and prosody information for the extracted speech segments; and replace the extracted speech segments in the audio track of the audio/visual file with the spoken versions of the translated, extracted speech segments to generate a modified audio track.
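
The per-segment flow can be sketched as a small pipeline. This is only an outline under stated assumptions: the `Segment` type, the `translate` callable, and the per-speaker `voices` table are hypothetical stand-ins for the speech extraction, translation, and trained synthesis models, none of which are implemented here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Segment:
    """One extracted speech segment (hypothetical structure)."""
    speaker: str
    start: float
    end: float
    text: str
    prosody: dict = field(default_factory=dict)


# A voice takes translated text plus prosody and returns synthesized audio.
Voice = Callable[[str, dict], str]


def dub(
    segments: List[Segment],
    translate: Callable[[str], str],
    voices: Dict[str, Voice],
) -> List[Tuple[float, float, str]]:
    """Build the replacement track: one synthesized clip per original segment."""
    track = []
    for seg in segments:
        translated = translate(seg.text)      # translate into the target language
        voice = voices[seg.speaker]           # model chosen per identified speaker
        audio = voice(translated, seg.prosody)
        track.append((seg.start, seg.end, audio))  # same time slot as the original
    return track


if __name__ == "__main__":
    # Toy stand-ins: a dictionary "translator" and voices that return labels.
    lexicon = {"hello": "hola", "goodbye": "adiós"}
    voices = {
        "alice": lambda text, p: f"alice[{p.get('pitch', 'mid')}]:{text}",
        "bob": lambda text, p: f"bob[{p.get('pitch', 'mid')}]:{text}",
    }
    segs = [
        Segment("alice", 0.0, 1.2, "hello", {"pitch": "high"}),
        Segment("bob", 1.5, 2.4, "goodbye"),
    ]
    print(dub(segs, lambda t: lexicon[t], voices))
```

Keeping each synthesized clip keyed to the original start and end times is what lets the clips replace the extracted segments in the audio track.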

SYSTEMS, METHODS, AND APPARATUS FOR DETERMINING AN OFFICIAL TRANSCRIPTION AND SPEAKER LANGUAGE FROM A PLURALITY OF TRANSCRIPTS OF TEXT IN DIFFERENT LANGUAGES

A method is disclosed for determining an official transcription and speaker language from a plurality of transcripts of text in different languages. The method includes receiving a preselection of a plurality of different languages in which a first speaker can speak during a session of a cloud-based meeting; receiving, from a microphone, first audio content which originated from the first speaker; transcribing the first audio content of the first speaker into text in each of the plurality of languages in which the first speaker can speak to generate a first plurality of text transcripts; identifying a first untranslated speech bubble as making the most sense among the first plurality of transcripts for the first speaker; and adding the first untranslated speech bubble to a master transcript for the first speaker.
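
The abstract does not say how "making the most sense" is measured. The sketch below assumes a simple plausibility score, the fraction of words found in a per-language vocabulary, purely for illustration. The vocabularies, the scoring rule, and the function names are hypothetical.

```python
# Hypothetical per-language vocabularies used only for scoring.
VOCAB = {
    "en": {"the", "meeting", "starts", "now"},
    "es": {"la", "reunión", "empieza", "ahora"},
}


def plausibility(lang: str, text: str) -> float:
    """Fraction of words in the transcript found in that language's vocabulary."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in VOCAB.get(lang, set()) for w in words) / len(words)


def pick_official(transcripts: dict) -> tuple:
    """Return (language, text) of the transcript with the highest score."""
    lang = max(transcripts, key=lambda l: plausibility(l, transcripts[l]))
    return lang, transcripts[lang]


if __name__ == "__main__":
    # The same audio transcribed under each preselected language.
    candidates = {
        "en": "the meeting starts now",
        "es": "de mitin estar nao",
    }
    master_transcript = []
    master_transcript.append(pick_official(candidates))
    print(master_transcript)
```

A production system would more plausibly use recognizer confidence or a language model score. The selection step stays the same: score every candidate transcript and append the winner to the master transcript.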

Structured text translation

Approaches for the translation of structured text include an embedding module for encoding and embedding source text in a first language, an encoder for encoding output of the embedding module, a decoder for iteratively decoding output of the encoder based on generated tokens in translated text from previous iterations, a beam module for constraining output of the decoder with respect to possible embedded tags to include in the translated text for a current iteration using a beam search, and a layer for selecting a token to be included in the translated text for the current iteration. The translated text is in a second language different from the first language. In some embodiments, the approach further includes scoring and pointer modules for selecting the token based on the output of the beam module or copied from the source text or reference text from a training pair best matching the source text.
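
One way to picture the constraint on embedded tags is as a filter applied to each beam step. The sketch below assumes two rules that the abstract does not spell out: an opening tag may be emitted only if it occurs in the source and has not been used yet, and a closing tag only if it closes the innermost open tag. The function names and the toy scores are hypothetical.

```python
import re

TAG = re.compile(r"</?\w+>")


def allowed_tags(source_tags, generated):
    """Tags permitted next, given the tokens generated so far."""
    stack, used = [], set()
    for tok in generated:
        if not TAG.fullmatch(tok):
            continue                      # ordinary word, not a tag
        if tok.startswith("</"):
            if stack:
                stack.pop()               # closes the innermost open tag
        else:
            stack.append(tok)
            used.add(tok)
    allowed = {t for t in source_tags if t not in used}
    if stack:
        allowed.add("</" + stack[-1][1:])  # "<b>" -> "</b>"
    return allowed


def constrain(candidates, source_tags, generated, beam_width=2):
    """Drop disallowed tag tokens, then keep the beam_width best candidates."""
    ok = allowed_tags(source_tags, generated)
    kept = {
        tok: score
        for tok, score in candidates.items()
        if not TAG.fullmatch(tok) or tok in ok
    }
    return sorted(kept, key=kept.get, reverse=True)[:beam_width]


if __name__ == "__main__":
    # Toy decoder scores (log-probabilities) for the next token.
    scores = {"</i>": -0.1, "<b>": -0.2, "</b>": -0.5, "Welt": -0.7}
    print(constrain(scores, ["<b>", "<i>"], ["<b>", "Hallo"]))
```

Here the two best-scoring tokens are rejected because `</i>` has no matching open tag and `<b>` was already used, so the beam keeps `</b>` and `Welt`.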

Generating and customizing summarized notes

Provided are techniques for generating and customizing summarized notes. A template is selected from a plurality of templates based on a context using a machine learning model. The template includes one or more translatable string resources with variables to represent key attributes extracted from historical notes. A summarized note is generated using values of the key attributes for the variables in the translatable string resources of the template.
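
A minimal sketch of the template step follows, using Python's standard `string.Template` for the translatable string resources. The template texts, the attribute names, and the keyword rule in `select_template` are hypothetical. In particular, the keyword rule merely stands in for the machine learning model that the abstract describes.

```python
from string import Template

# Hypothetical templates: each holds one translatable string per locale,
# with $variables standing for key attributes.
TEMPLATES = {
    "incident": {
        "en": Template("Ticket $ticket_id: $issue resolved by $action."),
        "de": Template("Ticket $ticket_id: $issue behoben durch $action."),
    },
    "followup": {
        "en": Template("Follow up with $customer about $issue."),
        "de": Template("Rückfrage bei $customer wegen $issue."),
    },
}


def select_template(context: str) -> str:
    """Stand-in for the ML-based selector: a simple keyword rule."""
    return "incident" if "outage" in context.lower() else "followup"


def summarize(context: str, attributes: dict, locale: str = "en") -> str:
    """Fill the selected template's variables with the key attribute values."""
    template = TEMPLATES[select_template(context)][locale]
    return template.substitute(attributes)


if __name__ == "__main__":
    attrs = {
        "ticket_id": "T-17",
        "issue": "router failure",
        "action": "firmware rollback",
    }
    print(summarize("Network outage reported", attrs))
    print(summarize("Network outage reported", attrs, locale="de"))
```

Because the variables live inside the string resource rather than in code, the same attribute values can fill any locale's version of the template. `substitute` raises `KeyError` if an attribute is missing, which surfaces incomplete extractions early.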