G06F40/58

Automated voice translation dubbing for prerecorded video

A method for aligning a translation of original caption data with an audio portion of a video is provided. The method includes identifying, by a processing device, original caption data for a video that includes a plurality of caption character strings. The processing device identifies speech recognition data that includes a plurality of generated character strings and associated timing information for each generated character string. The processing device maps the plurality of caption character strings to the plurality of generated character strings using assigned values indicative of semantic similarities between character strings. The processing device assigns timing information to the individual caption character strings based on timing information of mapped individual generated character strings. The processing device aligns a translation of the original caption data with the audio portion of the video using assigned timing information of the individual caption character strings.
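The mapping and timing-assignment steps can be sketched as follows. This is a minimal illustration, not the patented method: it stands in character-level similarity (`difflib.SequenceMatcher`) for the semantic-similarity values the abstract describes, and represents speech recognition data as hypothetical `(string, start, end)` tuples.

```python
from difflib import SequenceMatcher

def align_captions(captions, asr_results):
    """Assign timing to each caption string from its most similar ASR string.

    captions: list of caption character strings.
    asr_results: list of (generated_string, start_time, end_time) tuples.
    Character-level ratio is used here as a crude stand-in for the
    semantic similarity values described in the abstract.
    """
    aligned = []
    for cap in captions:
        # Map the caption to the generated string it most resembles.
        best = max(
            asr_results,
            key=lambda r: SequenceMatcher(None, cap.lower(), r[0].lower()).ratio(),
        )
        # The caption inherits the matched string's timing information.
        aligned.append((cap, best[1], best[2]))
    return aligned

captions = ["hello world", "goodbye now"]
asr = [("helo world", 0.0, 1.2), ("good bye now", 1.5, 2.8)]
print(align_captions(captions, asr))  # each caption carries its match's timing
```

A translation of each caption could then be scheduled against the audio track using the inherited start and end times.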

Machine translation of chat sessions

An embodiment may involve a database containing a first user profile that specifies a first preferred language of a first user and a second user profile that specifies a second preferred language of a second user. The embodiment may also involve one or more processors configured to: receive, from the first user and within a chat session, a first set of messages in the first preferred language; cause the first set of messages to be translated into the second preferred language; provide, to the second user and within the chat session, the first set of messages as translated; receive, from the second user and within the chat session, a second set of messages in the second preferred language; cause the second set of messages to be translated into the first preferred language; and provide, to the first user and within the chat session, the second set of messages as translated.
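The message-routing logic can be sketched as below. The `PROFILES` table and the `translate` stub are hypothetical stand-ins: a real system would look up profiles in a database and call a machine translation service.

```python
# Hypothetical profile table: user -> preferred language.
PROFILES = {"alice": "en", "bob": "es"}

def translate(text, src, dst):
    # Stub translator: tags the text instead of really translating it.
    return text if src == dst else f"[{src}->{dst}] {text}"

def relay_message(sender, recipient, text):
    """Translate a chat message from the sender's preferred language
    into the recipient's preferred language before delivery."""
    src = PROFILES[sender]
    dst = PROFILES[recipient]
    return translate(text, src, dst)

print(relay_message("alice", "bob", "hello"))  # [en->es] hello
print(relay_message("bob", "alice", "hola"))   # [es->en] hola
```

The same path handles both directions of the session, so each participant reads and writes only in their own preferred language.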

Input method language determination

Techniques are disclosed for determining a target language for a communication session and configuring a language mode of an input method editor (IME) to the target language. An example methodology implementing the techniques includes, by a computing device, detecting a communication to a recipient via a software application running on the computing device, determining a target language for the communication, and configuring a language mode of an input method editor to the target language. The target language may be determined based on one or more attributes of the recipient of the communication. In some cases, the target language may be determined based on one or more attributes of the contents of a prior communication.
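A minimal sketch of the determination step, under stated assumptions: the recipient attribute is a hypothetical `locale` field, and the content check is a toy keyword heuristic standing in for real language identification.

```python
def determine_target_language(recipient_profile, prior_messages=None):
    """Pick the IME language mode for a new communication.

    recipient_profile: dict of hypothetical recipient attributes,
    e.g. a 'locale' field such as "fr-FR".
    prior_messages: optional list of earlier messages whose contents
    can hint at the language when no recipient attribute is available.
    """
    locale = recipient_profile.get("locale")
    if locale:
        return locale.split("-")[0]  # e.g. "fr-FR" -> "fr"
    if prior_messages:
        # Crude keyword heuristic standing in for real language detection.
        last = prior_messages[-1].lower()
        if any(word in last for word in ("bonjour", "merci")):
            return "fr"
    return "en"  # fall back to a default language mode
```

The returned code would then be passed to the IME's configuration API to switch its language mode before the user starts typing.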

Preventing audio delay-induced miscommunication in audio/video conferences

Embodiments for delay-induced miscommunication reduction are provided. The embodiment may include capturing data streams transmitted between participants in an A/V exchange; translating, on a sender device prior to transmission to a recipient device, an audio stream within the data streams to text; timestamping, on the sender device prior to transmission to the recipient device, each word in the translated audio stream; transmitting the audio stream and the sender-side translated and timestamped audio stream to the recipient device; translating, on the recipient device, the transmitted audio stream to text; timestamping, on the recipient device, each word in the translated audio stream; determining that a lag exists in the A/V exchange based on a comparison of each timestamp for corresponding words in the sender-side translated and timestamped audio stream and the recipient-side translated and timestamped audio stream; and generating a true transcript of the intended exchange between the participants based on the comparison.
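The timestamp-comparison step can be sketched as below. The per-word transcripts are represented as `(word, timestamp)` tuples, and the lag threshold is an illustrative value, not taken from the abstract.

```python
def detect_lag(sender_words, recipient_words, threshold=0.5):
    """Compare per-word timestamps from the sender-side and recipient-side
    transcripts and report whether transmission lag exceeds a threshold.

    Each argument is a list of (word, timestamp_seconds) tuples produced
    by transcribing and timestamping the same audio on each device.
    """
    # Only compare timestamps where the transcribed words agree.
    lags = [rt - st for (w1, st), (w2, rt)
            in zip(sender_words, recipient_words) if w1 == w2]
    max_lag = max(lags, default=0.0)
    return max_lag, max_lag > threshold

sender = [("we", 0.0), ("should", 0.3), ("proceed", 0.6)]
recipient = [("we", 0.9), ("should", 1.2), ("proceed", 1.5)]
print(detect_lag(sender, recipient))
```

When the flag is raised, the sender-side transcript (which predates the lag) could serve as the basis for the "true transcript" the abstract describes.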

INFORMATION COMMUNICATION SYSTEM AND METHOD FOR CONTROLLING TERMINAL
20230043401 · 2023-02-09 ·

An in-flight announcement system (500) includes a voice input device (300) that receives voice input, a central control device (100) that translates an announcement content based on the input voice and outputs a modulation signal, a lighting device (400) that emits light based on the input modulation signal, and a receiving terminal (200) that identifies the announcement content based on an input light signal and outputs a translation result. The central control device (100) includes a recognition unit (101) that recognizes the input voice as utterance information, a determination unit (102) that determines whether the utterance information is a fixed sentence and outputs identification information of the fixed sentence, a text information group (103) including the text information the determination unit (102) needs to make the determination and obtain the identification information, a translation unit (104) that translates utterance information not determined to be a fixed sentence and outputs reference information for the translation result, a storage (105) for storing the translation result, a generation unit (106) that generates a data set, a conversion unit (107) that generates the modulation signal based on the data set, and a moving body information management unit (108) that stores various information about the aircraft.
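The determination unit's fixed-sentence check can be sketched as a table lookup. The `FIXED_SENTENCES` table and its identifiers are hypothetical stand-ins for the text information group (103).

```python
# Hypothetical fixed-sentence table standing in for the text information
# group (103); the identifiers are illustrative.
FIXED_SENTENCES = {
    "please fasten your seatbelt": "FIX-001",
    "we will be landing shortly": "FIX-002",
}

def classify_announcement(utterance):
    """Return ('fixed', id) when the recognized utterance matches a stored
    fixed sentence, else ('free', utterance) so the translation unit can
    translate it and produce reference information."""
    key = utterance.strip().lower()
    if key in FIXED_SENTENCES:
        return ("fixed", FIXED_SENTENCES[key])
    return ("free", utterance)

print(classify_announcement("Please fasten your seatbelt"))  # ('fixed', 'FIX-001')
```

Sending only a short identifier for fixed sentences keeps the payload that must be modulated onto the light signal small; free-form utterances take the slower translation path.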

Neural network model compression method, corpus translation method and device

A method for compressing a neural network model includes: obtaining a set of training samples including a plurality of pairs of training samples, each pair of the training samples including source data and target data corresponding to the source data; training an original teacher model by using the source data as an input and using the target data as verification data; training intermediate teacher models based on the set of training samples and the original teacher model, one or more intermediate teacher models forming a set of teacher models; training multiple candidate student models based on the set of training samples, the original teacher model, and the set of teacher models, the multiple candidate student models forming a set of student models; and selecting a candidate student model of the multiple candidate student models as the target student model according to the training results of the multiple candidate student models.
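The final selection step can be sketched as below. The candidate "models" and the `evaluate` scorer are hypothetical stand-ins for the distilled students and their training results; the abstract does not specify the evaluation metric.

```python
def select_student(candidates, evaluate):
    """Pick the target student model from the candidate set by training
    results. 'candidates' maps a model name to a trained model object;
    'evaluate' returns a score where higher is better."""
    return max(candidates, key=lambda name: evaluate(candidates[name]))

# Toy example: each "model" is just a dict carrying a validation score.
candidates = {"student_a": {"bleu": 27.1}, "student_b": {"bleu": 29.4}}
print(select_student(candidates, lambda m: m["bleu"]))  # student_b
```

In a real distillation pipeline the score would come from evaluating each candidate student on held-out data, e.g. BLEU for the corpus-translation use case this patent targets.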