Patent classifications
G10L15/24
Dynamic Language and Command Recognition
Systems and methods are described for processing and interpreting audible commands spoken in one or more languages. Speech recognition systems disclosed herein may be used as a stand-alone speech recognition system or comprise a portion of another content consumption system. A requesting user may provide audio input (e.g., command data) to the speech recognition system via a computing device to request an entertainment system to perform one or more operational commands. The speech recognition system may analyze the audio input across a variety of linguistic models, and may parse the audio input to identify a plurality of phrases and corresponding action classifiers. In some embodiments, the speech recognition system may utilize the action classifiers and other information to determine the one or more identified phrases that appropriately match the desired intent and operational command associated with the user’s spoken command.
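The final selection step in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the `Candidate` fields, the `SUPPORTED_ACTIONS` set, and the rule of taking the highest-scoring candidate whose action classifier names a supported command are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    language: str   # linguistic model that produced the phrase
    phrase: str     # phrase identified in the audio input
    action: str     # action classifier attached to the phrase
    score: float    # hypothetical recognizer confidence in [0, 1]


# Hypothetical operational commands the entertainment system accepts.
SUPPORTED_ACTIONS = {"tune_channel", "play_title", "volume_up"}


def pick_command(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the best-scoring candidate with a supported action, or None."""
    valid = [c for c in candidates if c.action in SUPPORTED_ACTIONS]
    return max(valid, key=lambda c: c.score, default=None)


candidates = [
    Candidate("en", "watch sea and and", "unknown", 0.91),
    Candidate("es", "ver CNN", "tune_channel", 0.84),
    Candidate("en", "play cnn", "play_title", 0.40),
]
best = pick_command(candidates)
```

In this sketch the English hypothesis with the highest raw score is discarded because its action classifier does not map to a supported command, so the Spanish hypothesis is selected.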
COMMUNICATION DEVICE, COMMUNICATION METHOD, AND NON-TRANSITORY STORAGE MEDIUM
A communication device includes: a vibration transmitting unit configured to transmit an input vibration wave to a first portion of a subject; a vibration receiving unit configured to receive, at a second portion of the subject, an output vibration wave generated based on the input vibration wave propagated through at least a part of the subject; and a speech recognition device configured to recognize a phoneme which is uttered by the subject based on a difference wave between the input vibration wave and the output vibration wave, wherein the first portion and the second portion are arranged on right and left of the subject.
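As a rough illustration of the difference wave this abstract relies on, the sketch below subtracts the input vibration wave from the output vibration wave sample by sample and matches the result against stored templates. The sign convention (output minus input), the equal-length sampling, the nearest-template classifier, and the template values are all assumptions; the abstract does not specify how the phoneme is recognized from the difference wave.

```python
from typing import Dict, List


def difference_wave(input_wave: List[float], output_wave: List[float]) -> List[float]:
    """Sample-wise difference, assumed here to be output minus input."""
    if len(input_wave) != len(output_wave):
        raise ValueError("waves must have the same number of samples")
    return [out - inp for inp, out in zip(input_wave, output_wave)]


def nearest_phoneme(diff: List[float], templates: Dict[str, List[float]]) -> str:
    """Return the template label with the smallest squared distance to diff."""
    def squared_distance(template: List[float]) -> float:
        return sum((d - t) ** 2 for d, t in zip(diff, template))
    return min(templates, key=lambda label: squared_distance(templates[label]))


# Hypothetical sampled waves and phoneme templates, for illustration only.
input_wave = [0.0, 1.0, 0.0, -1.0]
output_wave = [0.0, 0.5, 0.0, -0.5]
templates = {
    "a": [0.0, -0.5, 0.0, 0.5],
    "i": [0.0, 0.5, 0.0, -0.5],
}

diff = difference_wave(input_wave, output_wave)
phoneme = nearest_phoneme(diff, templates)
```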
LINGUISTIC MODEL SELECTION FOR ADAPTIVE AUTOMATIC SPEECH RECOGNITION
The present disclosure describes dynamically adjusting linguistic models for automatic speech recognition based on biometric information to produce a more reliable speech recognition experience. Embodiments include receiving a speech signal, receiving a biometric signal from a biometric sensor implemented at least partially in hardware, determining a linguistic model based on the biometric signal, and processing the speech signal for speech recognition using the linguistic model based on the biometric signal.
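The four steps in this abstract (receive speech, receive a biometric signal, determine a linguistic model, process the speech with that model) can be sketched as below. The choice of heart rate as the biometric signal, the 100 bpm threshold, the model names, and the placeholder `recognize` function are assumptions for illustration; the abstract names none of them.

```python
from typing import Dict, List

# Hypothetical registry of linguistic models keyed by speaker state.
LINGUISTIC_MODELS: Dict[str, str] = {
    "neutral": "lm-neutral",
    "elevated": "lm-elevated",
}

# Hypothetical threshold; a real system would calibrate this per user.
ELEVATED_BPM = 100.0


def select_linguistic_model(heart_rate_bpm: float) -> str:
    """Determine a linguistic model from the biometric signal."""
    state = "elevated" if heart_rate_bpm >= ELEVATED_BPM else "neutral"
    return LINGUISTIC_MODELS[state]


def recognize(speech_signal: List[float], heart_rate_bpm: float) -> Dict[str, object]:
    """Placeholder pipeline: selects the model, then stands in for decoding."""
    model = select_linguistic_model(heart_rate_bpm)
    # A real recognizer would decode speech_signal with the selected model here.
    return {"model": model, "num_samples": len(speech_signal)}


result = recognize([0.1, -0.2, 0.05], heart_rate_bpm=118.0)
```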
CONVERSATIONAL ARTIFICIAL INTELLIGENCE DRIVEN METHODS AND SYSTEM FOR DELIVERING PERSONALIZED THERAPY AND TRAINING SESSIONS
A user-directed verbal interactive method and system for requesting an evaluation and obtaining a customized verbal therapy routine based on the evaluation obtained. The method and system allow users to interact with an artificial intelligence agent by answering a series of system-directed questions that guide the users through evaluation and treatment of physical pain using a customized verbal interaction and delivery regimen. Users verbally engage with the artificial intelligence agent to create respective profiles. The system develops therapies based on each user’s current physiological state and profile. The users are then delivered verbal therapy prompts through the system to implement the developed therapy routines.
ADDING BACKGROUND SOUND TO SPEECH-CONTAINING AUDIO DATA
An editing method facilitates the task of adding background sound to speech-containing audio data so as to augment the listening experience. The editing method is executed by a processor in a computing device and comprises: obtaining characterization data that characterizes time segments in the audio data by at least one of topic and sentiment; deriving, for a respective time segment in the audio data and based on the characterization data, a desired property of a background sound to be added to the audio data in the respective time segment; and providing the desired property for the respective time segment so as to enable the audio data to be combined, within the respective time segment, with background sound having the desired property. The background sound may be selected and added automatically or by manual user intervention.
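The deriving and providing steps of this editing method might look like the sketch below. The `Segment` fields, the sentiment labels, and the sentiment-to-mood table are hypothetical; the abstract only states that a desired property is derived per time segment from topic and/or sentiment.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    start_s: float   # segment start time in seconds
    end_s: float     # segment end time in seconds
    topic: str       # characterization by topic
    sentiment: str   # characterization by sentiment


# Hypothetical mapping from sentiment to a desired background-sound mood.
MOOD_BY_SENTIMENT: Dict[str, str] = {
    "positive": "uplifting",
    "negative": "somber",
    "neutral": "calm",
}


def desired_properties(segments: List[Segment]) -> List[Dict[str, object]]:
    """Derive one desired background-sound property record per time segment."""
    return [
        {
            "start_s": seg.start_s,
            "end_s": seg.end_s,
            "mood": MOOD_BY_SENTIMENT.get(seg.sentiment, "calm"),
            "theme": seg.topic,
        }
        for seg in segments
    ]


segments = [
    Segment(0.0, 12.5, "travel", "positive"),
    Segment(12.5, 30.0, "weather", "negative"),
]
properties = desired_properties(segments)
```

The returned records could then drive either automatic selection of a matching background sound or a suggestion shown to a human editor, corresponding to the two options the abstract mentions.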
Subvocalized Speech Recognition and Command Execution by Machine Learning
Provided is an in-ear device and associated computational support system that leverages machine learning to interpret sensor data descriptive of one or more in-ear phenomena during subvocalization by the user. An electronic device can receive sensor data generated by at least one sensor at least partially positioned within an ear of a user, wherein the sensor data was generated by the at least one sensor concurrently with the user subvocalizing a subvocalized utterance. The electronic device can then process the sensor data with a machine-learned subvocalization interpretation model to generate an interpretation of the subvocalized utterance as an output of the machine-learned subvocalization interpretation model.