IPIQ

G10L13/033

VOICE COMMUNICATION BETWEEN A SPEAKER AND A RECIPIENT OVER A COMMUNICATION NETWORK

20230005465 · 2023-01-05 ·

Voice communication, between a speaker and a recipient, either or both of which may be in a motor vehicle, is provided via a communication network. In a first step, an input speech utterance is received from the speaker. Optionally, a bandwidth of a connection to the communication network is evaluated at the side of the speaker. The input speech utterance is then converted to text. At least the text is transmitted over the communication network. In case of a sufficiently large bandwidth, the input speech utterance may be transmitted as voice and as text. The transmitted text is converted into an output speech utterance that simulates a voice of the speaker. Finally, the output speech utterance is provided to the recipient.

VOICE COMMUNICATION BETWEEN A SPEAKER AND A RECIPIENT OVER A COMMUNICATION NETWORK

20230005465 · 2023-01-05 ·

Voice recognition method using artificial intelligence and apparatus thereof

11568853 · 2023-01-31 ·

Lg Electronics Inc.

Jonghoon Chae

Disclosed is a voice recognition method and apparatus using artificial intelligence. A voice recognition method using artificial intelligence may include: generating a utterance by receiving a voice command of a user; obtaining a user's intention by analyzing the generated utterance; deriving an urgency level of the user on the basis of the generated utterance and prestored user information; generating a first response in association with the user's intention; obtaining main vocabularies included in the first response; generating a second response by using the main vocabularies and the urgency level of the user; determining a speech rate of the second response on the basis of the urgency level of the user; and outputting the second response according to the speech rate by synthesizing the second response to a voice signal.

Voice recognition method using artificial intelligence and apparatus thereof

11568853 · 2023-01-31 ·

Lg Electronics Inc.

Jonghoon Chae

TEXT-TO-SPEECH SYNTHESIS METHOD AND SYSTEM, AND A METHOD OF TRAINING A TEXT-TO-SPEECH SYNTHESIS SYSTEM

20230230576 · 2023-07-20 ·

A text-to-speech synthesis method includes receiving text, inputting the received text in a synthesizer that includes a prediction network configured to convert the received text into speech data having a speech attribute that includes emotion, intention, projection, pace, and/or accent, and outputting said speech data. The prediction network is obtained by obtaining a first sub-dataset and a second sub-dataset, where the first sub-dataset and the second sub-dataset each include audio samples and corresponding text, and the speech attribute of the audio samples of the second sub-dataset is more pronounced than the speech attribute of the audio samples of the first sub-dataset, training a first model using the first sub-dataset until a performance metric reaches a first predetermined value, training a second model by further training the first model using the second sub-dataset until the performance metric reaches a second predetermined value, and selecting one trained model as the prediction network.

TEXT-TO-SPEECH SYNTHESIS METHOD AND SYSTEM, AND A METHOD OF TRAINING A TEXT-TO-SPEECH SYNTHESIS SYSTEM

20230230576 · 2023-07-20 ·

Electronic device and method of controlling thereof

11561760 · 2023-01-24 ·

Samsung Electronics Co., Ltd.

An electronic device for changing a voice of a personal assistant function, and a method therefor are provided. The electronic device includes a display, a transceiver, processor, and a memory for storing commands executable by the processor. The processor is configured to, based on a user command to request acquisition of voice data feature of a person included in a media content displayed on the display being received, control the display to display information of a person, based on a user input to select the one of the information of a person being received, acquire voice data corresponding to an utterance of a person related to the selected information of a person, and acquire voice data feature from the acquired voice data, control the transceiver to transmit the acquired voice data feature to a server.

Content output management based on speech quality

11562739 · 2023-01-24 ·

Amazon Technologies, Inc.

Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.

Content output management based on speech quality

11562739 · 2023-01-24 ·

Amazon Technologies, Inc.

Stylizing text-to-speech (TTS) voice response for assistant systems

11562744 · 2023-01-24 ·

Meta Platforms Technologies, LLC

In one embodiment, a method includes receiving a voice input from a user and determining a first style of the voice input, based on first features extracted from the voice input. A second style for a voice response having second features may then be determined based on the first style. Finally, the voice response may be generated based on the second features of the second style, and this voice response may be provided in response to the voice input.

Patent classifications

G10L13/033