G10L13/08

SYSTEMS AND METHODS FOR PROVIDING AUDIBLE FLIGHT INFORMATION
20230046264 · 2023-02-16

Disclosed are methods and systems for providing audible flight information to an operator of an aircraft. A method, for example, may include receiving flight information detected by one or more sensors positioned on the aircraft, causing an image to be displayed on a display device, the image including a plurality of text items corresponding to the flight information, receiving a first operator selection indicative of one or more of the text items, parsing the one or more text items to generate a set of intermediate data, synthesizing audio data based on the intermediate data, and causing audible content corresponding to the audio data to be emitted by one or more audio emitting devices, wherein the audible content includes speech corresponding to the flight information.
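The parsing step in this abstract (selected text items turned into intermediate data for a synthesizer) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the application: the abbreviation table, the digit-by-digit reading of numbers, and the function names `parse_text_item` and `to_speech_text`. Audio synthesis itself is left out.

```python
# Hypothetical abbreviation table; the application does not specify one.
ABBREVIATIONS = {
    "ALT": "altitude",
    "HDG": "heading",
    "SPD": "airspeed",
    "FT": "feet",
    "KT": "knots",
}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def parse_text_item(item):
    """Turn one displayed text item into a list of speakable words."""
    words = []
    for token in item.split():
        if token.isdecimal():
            # Assumption: numbers are read digit by digit.
            words.extend(DIGITS[int(d)] for d in token)
        else:
            # Expand known abbreviations, otherwise pass the token through.
            words.append(ABBREVIATIONS.get(token.upper(), token.lower()))
    return words


def to_speech_text(selected_items):
    """Join the parsed items into one string a TTS engine could consume."""
    return ". ".join(" ".join(parse_text_item(i)) for i in selected_items)


print(to_speech_text(["HDG 270", "SPD 140 KT"]))
```

The word list returned by `parse_text_item` stands in for the "intermediate data"; a real system would likely carry richer structure (units, priorities, pronunciation hints).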

Synthetic speech processing

A speech-processing system receives input data representing text. A first encoder processes segments of the text to determine embedding data representing the text, and a second encoder processes corresponding audio data to determine prosodic data corresponding to the text. The embedding data and prosodic data are processed to create output data including a representation of speech corresponding to the text and prosody.

Modification of audio-based computer program output
11582169 · 2023-02-14

Modifying computer program output in a voice or non-text input activated environment is provided. A system can receive audio signals detected by a microphone of a device. The system can parse the audio signal to identify a computer program to invoke. The computer program can identify a dialog data structure. The system can modify the identified dialog data structure to include a content item. The system can provide the modified dialog data structure to a computing device for presentation.

Systems and methods of handling speech audio stream interruptions

A device for communication includes one or more processors configured to receive, during an online meeting, a speech audio stream representing speech of a first user. The one or more processors are also configured to receive a text stream representing the speech of the first user. The one or more processors are further configured to selectively generate an output based on the text stream in response to an interruption in the speech audio stream.
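The selection logic described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the `Frame` type, the use of `None` to model an interruption, and the `select_output` function are hypothetical and not taken from the patent, and the text-based output is returned as plain text rather than synthesized speech.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Frame:
    """One time slice of the meeting (hypothetical structure)."""
    audio: Optional[bytes]  # None models an interruption in the audio stream
    text: Optional[str]     # matching slice of the text stream, if received


def select_output(frame: Frame) -> Tuple[str, object]:
    """Prefer the speech audio; fall back to the text stream on interruption."""
    if frame.audio is not None:
        return ("audio", frame.audio)
    if frame.text:
        # Audio is missing, so generate the output from the text stream.
        return ("text", frame.text)
    # Neither stream delivered anything for this slice.
    return ("none", None)


frames = [
    Frame(audio=b"\x01\x02", text="hello"),
    Frame(audio=None, text="everyone"),   # interruption: text takes over
    Frame(audio=None, text=None),
]
for f in frames:
    print(select_output(f)[0])
```

In a fuller system the `"text"` branch might feed a speech synthesizer or a caption display; the sketch only shows the per-slice choice between the two streams.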

Method for providing speech and intelligent computing device controlling speech providing apparatus
11580953 · 2023-02-14

A method for providing speech and an intelligent computing device controlling a speech providing apparatus are disclosed. A method for providing speech according to an embodiment of the present invention includes obtaining a message, converting the message into speech, and determining an output pattern based on the situation in which the message was generated, so that the situation at the time of message generation can be conveyed more realistically to the receiver of the TTS output. One or more of the speech providing method, devices, intelligent computing devices controlling the speech providing apparatus, and servers of the present invention may include artificial intelligence modules, drones (Unmanned Aerial Vehicles, UAVs), robots, Augmented Reality (AR) devices, Virtual Reality (VR) devices, devices related to 5G services, and the like.

Multilingual speech synthesis and cross-language voice cloning

A method includes receiving an input text sequence to be synthesized into speech in a first language and obtaining a speaker embedding, the speaker embedding specifying specific voice characteristics of a target speaker for synthesizing the input text sequence into speech that clones a voice of the target speaker. The target speaker includes a native speaker of a second language different than the first language. The method also includes generating, using a text-to-speech (TTS) model, an output audio feature representation of the input text by processing the input text sequence and the speaker embedding. The output audio feature representation includes the voice characteristics of the target speaker specified by the speaker embedding.