G10L13/033

METHOD AND SYSTEM FOR GENERATING AN INTELLIGENT VOICE ASSISTANT RESPONSE
20230223007 · 2023-07-13 ·

A method and a system for generating an intelligent voice assistant response are provided. The method includes receiving a preliminary voice assistant response to a user command and determining a subjective polarity score of the preliminary voice assistant response and a dynamic polarity score indicative of an instant user reaction to the preliminary voice assistant response, once the preliminary voice assistant response is delivered. The method thereafter determines a sentiment score of the preliminary voice assistant response based on the subjective polarity score and the dynamic polarity score. The method identifies emotionally uplifting information for the user that is to be combined with the preliminary voice assistant response. The method further includes generating a personalized note to be combined with the preliminary voice assistant response and generating the intelligent voice assistant response by combining the preliminary voice assistant response with the emotionally uplifting information and the personalized note.
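The scoring-and-composition flow described in this abstract might be sketched as follows; the weights, threshold, and function names are illustrative assumptions, not taken from the patent:

```python
# Hypothetical sketch of the sentiment-scoring and response-composition
# steps described above. Weights and thresholds are assumptions.

def sentiment_score(subjective_polarity: float,
                    dynamic_polarity: float,
                    w_subjective: float = 0.4,
                    w_dynamic: float = 0.6) -> float:
    """Combine the response's own polarity with the user's instant
    reaction into a single score in [-1, 1]."""
    return w_subjective * subjective_polarity + w_dynamic * dynamic_polarity

def compose_response(preliminary: str,
                     score: float,
                     uplifting: str,
                     note: str,
                     threshold: float = 0.0) -> str:
    """Append uplifting information and a personalized note when the
    combined sentiment falls below a threshold."""
    if score < threshold:
        return f"{preliminary} {uplifting} {note}"
    return preliminary
```

A negative combined score would trigger the augmented response, while a neutral or positive score would leave the preliminary response unchanged.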

Using speech to text data in training text to speech models

A system and method for providing a text-to-speech output by receiving user audio data, determining a user region-specific-pronunciation classification according to the audio data, determining text for a response to the user according to the audio data, identifying a portion of the text that appears in a region-specific pronunciation dictionary, and using a phoneme string from the dictionary, selected according to the user region-specific-pronunciation classification, for that portion in a text-to-speech output to the user.
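The dictionary-lookup step could look roughly like this; the dictionary structure, region labels, and phoneme strings are illustrative assumptions:

```python
# Hypothetical illustration of the region-specific pronunciation
# dictionary lookup described above. Entries and labels are assumptions.

PRONUNCIATION_DICT = {
    "tomato": {"us": "T AH M EY T OW", "uk": "T AH M AA T OW"},
    "schedule": {"us": "S K EH JH UH L", "uk": "SH EH D Y UW L"},
}

def phonemes_for_response(text: str, region: str) -> list:
    """Substitute the region-specific phoneme string for dictionary
    words; pass other words through for default grapheme-to-phoneme
    handling by the TTS engine."""
    out = []
    for word in text.lower().split():
        entry = PRONUNCIATION_DICT.get(word)
        out.append(entry[region] if entry and region in entry else word)
    return out
```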

Audio Processing Apparatus
20230213349 · 2023-07-06 ·

An apparatus configured to: determine, with a position sensor, position information; determine at least one keyword within at least one audio signal, wherein at least the at least one keyword is configured to be spatially processed; obtain at least one spatial processing parameter based at least partially, on the position information, wherein the at least one spatial processing parameter is configured to be used to spatially process at least the at least one keyword to be perceived from a direction during rendering, wherein the direction indicates a navigation direction; generate at least one processed audio signal, comprising processing at least the at least one keyword based on the at least one spatial processing parameter; and provide the at least one processed audio signal, comprising the at least one processed keyword, for generation of a virtual audio image.
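One way to read "spatially process the keyword to be perceived from a navigation direction" is as deriving rendering gains from an azimuth. The constant-power panning law below is an assumption for illustration; the patent does not specify a rendering method:

```python
import math

# Hypothetical sketch: derive stereo panning gains from a navigation
# direction so a keyword is perceived from that direction during
# rendering. Constant-power panning is an assumption.

def panning_gains(azimuth_deg: float) -> tuple:
    """Map an azimuth in [-90, 90] degrees (negative = left) to
    constant-power (left, right) gains."""
    theta = (azimuth_deg + 90.0) / 180.0 * (math.pi / 2.0)
    return math.cos(theta), math.sin(theta)

def spatialize_keyword(samples: list, azimuth_deg: float) -> list:
    """Apply the gains to a mono keyword signal, yielding (L, R) pairs."""
    gl, gr = panning_gains(azimuth_deg)
    return [(s * gl, s * gr) for s in samples]
```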

Terminal and Operating Method Thereof
20230215418 · 2023-07-06 ·

A terminal may include a display that, when a real-time broadcast in which the user of the terminal is the host starts through a broadcasting channel, is divided into at least two areas, one of which is allocated to the host; an input/output interface that receives a voice of the host; a communication interface that receives, from a terminal of a certain guest among at least one or more guests who entered the broadcasting channel, one item selected from at least one or more items and a certain text; and a processor that generates a voice message by converting the certain text into the voice of the host or a voice of the certain guest.

Multi-Purpose Protective Face Mask
20230210201 · 2023-07-06 ·

A protective face mask implemented with a pocket located on a front surface of the mask, and a removable amplifier unit configured to be placed into the pocket, the removable amplifier unit comprising: a micro-processor configured to process voice data; a rechargeable battery coupled to the micro-processor; a Bluetooth device coupled to the micro-processor; a microphone coupled to the micro-processor and configured to provide the voice data to the micro-processor; and a speaker unit configured to output the voice data processed by the micro-processor.

PSYCHOLOGY COUNSELING DEVICE AND METHOD THEREOF

A psychology counseling device is provided. The device includes a user interface configured to receive an input from a user and provide information; a microphone configured to collect a voice of the user; a speaker configured to convey auditory information to the user; a processor configured to control the user interface, the microphone, and the speaker; and a memory accessible by the processor and configured to store executable instructions. The memory is configured to further store texts to be provided to the user and voice data received from the user. The executable instructions, when executed by the processor, cause the processor to perform: recognizing an emotional state of the user based on the user's input; providing texts including different contents to the user according to the emotional state of the user; receiving a voice in which the user articulates the texts and storing the voice in the memory as the voice data; obtaining a plurality of modulated voices by converting the voice data; and providing at least two among the plurality of modulated voices to the user.
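The "plurality of modulated voices" step might be sketched as below; resampling-based pitch shifting is an assumption for illustration, since the patent does not specify the modulation method:

```python
# Hypothetical sketch of producing several modulated voices from one
# recording, as described above. The modulation technique (naive
# resampling) and the factor values are assumptions.

def pitch_shift(samples: list, factor: float) -> list:
    """Naive resample: factor > 1 raises pitch (and shortens the clip)."""
    n = int(len(samples) / factor)
    return [samples[int(i * factor)] for i in range(n)]

def modulated_voices(samples: list, factors=(0.8, 1.0, 1.25)) -> list:
    """Return a set of modulated variants for presentation to the user."""
    return [pitch_shift(samples, f) for f in factors]
```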

SYSTEMS AND METHODS FOR AUTOMATED REAL-TIME GENERATION OF AN INTERACTIVE AVATAR UTILIZING SHORT-TERM AND LONG-TERM COMPUTER MEMORY STRUCTURES

Systems and methods for rendering an avatar attuned to a user. The systems and methods include receiving audio-visual data of user communications of a user. Using the audio-visual data, the systems and methods may determine vocal characteristics of the user, facial action units representative of facial features of the user, and speech of the user based on a speech recognition model and/or natural language understanding model. Based on the vocal characteristics, an acoustic emotion metric can be determined. Based on the speech recognition data, a speech emotion metric may be determined. Based on the facial action units, a facial emotion metric may be determined. An emotional complex signature may be determined to represent an emotional state of the user for rendering the avatar attuned to the emotional state, based on a combination of the acoustic emotion metric, the speech emotion metric, and the facial emotion metric.
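The fusion of the three modality metrics into an emotional complex signature could be sketched as a weighted per-emotion combination; the emotion categories and weights below are illustrative assumptions:

```python
# Hypothetical sketch of fusing acoustic, speech, and facial emotion
# metrics into an "emotional complex signature". Categories and
# weights are assumptions, not from the source.

EMOTIONS = ("joy", "sadness", "anger", "neutral")

def fuse_metrics(acoustic: dict, speech: dict, facial: dict,
                 weights=(0.3, 0.3, 0.4)) -> dict:
    """Weighted per-emotion combination of the three modality scores."""
    wa, ws, wf = weights
    return {e: wa * acoustic.get(e, 0.0)
               + ws * speech.get(e, 0.0)
               + wf * facial.get(e, 0.0)
            for e in EMOTIONS}

def dominant_emotion(signature: dict) -> str:
    """The emotional state used to attune the rendered avatar."""
    return max(signature, key=signature.get)
```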