Patent classifications
G10L2015/027
Speech characteristic recognition and conversion
Systems, devices, media, and methods are presented for converting sounds in an audio stream. The systems and methods receive an audio conversion request initiating conversion of one or more sound characteristics of an audio stream from a first state to a second state. The systems and methods access an audio conversion model associated with an audio signature for the second state. The audio stream is converted based on the audio conversion model and an audio construct is compiled from the converted audio stream and a base audio segment. The compiled audio construct is presented at a client device.
SPEECH TO TEXT CONVERSION ENGINE FOR NON-STANDARD SPEECH
Using a computing device to convert verbal communications including non-standard speech to text. The computing device receives an audio recording of voice and generates a standard text log. A standard word dictionary is retrieved. Non-standard words not found in the word dictionary are determined. Portions of the audio recording corresponding to the non-standard words are retrieved. Portions of the audio recording corresponding to non-standard words are input into a natural language understanding model. The computing device utilizes the results of the natural language understanding model to determine a best-match non-standard dictionary. One or more portions of the audio recording are used to generate a non-standard text log. The standard text log and non-standard text log are merged.
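The dictionary-lookup step in the abstract above can be illustrated with a minimal sketch. The function name `find_non_standard_words` and the case-insensitive comparison are assumptions made for illustration; the abstract does not specify how the lookup is performed.

```python
def find_non_standard_words(transcript_words, standard_dictionary):
    """Return (position, word) pairs for words absent from the dictionary.

    Hypothetical helper: comparison is case-insensitive, which is an
    assumption and not something the abstract states.
    """
    known = {word.lower() for word in standard_dictionary}
    return [
        (position, word)
        for position, word in enumerate(transcript_words)
        if word.lower() not in known
    ]
```

The returned positions could then be used to locate the matching portions of the audio recording, provided the transcript carries word-level timestamps.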
Voice-controlled secure remote actuation system
A secure remote actuation system is described herein that operates based on voice commands provided by a user and/or owner of the system. The system may include: a remote input receptor having a user interface for receiving one or more user inputs from a user, the user interface having a voice input processor, and the user inputs including vocalization; and a cloud-based network storing one or more acceptable inputs and including a network device for obtaining said one or more user inputs from the remote input receptor. The network device may obtain said one or more user inputs from the remote input receptor while the user is using the user interface. The cloud-based network may compare said one or more user inputs to said one or more acceptable inputs. The voice input processor may include a microphone, a speaker, or both, and may perform various types of voice recognition.
Systems and Methods for Providing Reading Assistance Using Speech Recognition and Error Tracking Mechanisms
Methods and systems for providing reading assistance to a user are provided. One or more written words are transmitted for display to a user's computing device, for the user to read aloud. An audio segment is received from the user's computing device. The audio segment comprises the user's spoken (audible) words as the user read aloud the one or more written words. The audio segment is processed by utilizing speech recognition to determine if the user's spoken word or words match with the one or more written words.
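As a rough sketch of the matching step described above, the recognized words can be aligned against the displayed words and any differences recorded. The alignment via Python's standard `difflib.SequenceMatcher`, the normalization rules, and the function names are all assumptions made for illustration, not the method the abstract discloses.

```python
import difflib
import string


def normalize(text):
    """Lowercase, strip ASCII punctuation, and split into words."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def reading_errors(written, recognized):
    """List the differences between the displayed text and the recognized speech.

    Each entry is (kind, expected_words, spoken_words), where kind is one of
    difflib's opcode tags: "replace", "delete", or "insert".
    """
    expected = normalize(written)
    spoken = normalize(recognized)
    matcher = difflib.SequenceMatcher(a=expected, b=spoken, autojunk=False)
    return [
        (tag, expected[i1:i2], spoken[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

An empty result means the spoken words matched the written words after normalization.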
SYSTEM AND METHOD OF GENERATING EFFECTS DURING LIVE RECITATIONS OF STORIES
One aspect of this disclosure relates to presentation of a first effect on one or more presentation devices during an oral recitation of a first story. The first effect is associated with a first trigger point, first content, and/or first story. The first trigger point is one or more specific syllables from a word and/or phrase in the first story. A first transmission point associated with the first effect can be determined based on a latency of a presentation device and a user speaking profile. The first transmission point is one or more specific syllables from a word and/or phrase before the first trigger point in the first story. Control signals for instructions to present the first content at the first trigger point are transmitted to the presentation device when a user recites the first transmission point such that the first content is presented at the first trigger point.
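One way to picture the transmission point is as the trigger syllable moved earlier by however many syllables the user speaks during the device's latency. The sketch below assumes the user speaking profile reduces to a single rate in syllables per second and that syllables are indexed from zero; both are illustrative assumptions, and the function name is hypothetical.

```python
import math


def transmission_syllable_index(trigger_index, device_latency_s, syllables_per_second):
    """Return the syllable index at which control signals should be sent.

    The lead is the number of syllables spoken during the device latency,
    rounded up so the effect is not late. The result is clamped to the start
    of the story.
    """
    lead = math.ceil(device_latency_s * syllables_per_second)
    return max(0, trigger_index - lead)
```

For example, with a trigger at syllable 40, a latency of 0.5 s, and a rate of 4 syllables per second, the signal would be sent two syllables early.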
Speech analysis algorithmic system and method for objective evaluation and/or disease detection
Systems and methods use patient speech samples as inputs, use subjective multi-point ratings by speech-language pathologists of multiple perceptual dimensions of patient speech samples as further inputs, and extract laboratory-implemented features from the patient speech samples. A predictive software model learns the relationship between speech acoustics and the subjective ratings of such speech obtained from speech-language pathologists, and is configured to apply this information to evaluate new speech samples. Outputs may include objective evaluation of the plurality of perceptual dimensions for new speech samples and/or evaluation of disease onset, disease progression, or disease treatment efficacy for a condition involving dysarthria as a symptom, utilizing the new speech samples.
METHOD AND DEVICE FOR EXTRACTING FACTOID ASSOCIATED WORDS FROM NATURAL LANGUAGE SENTENCES
A method and system for extracting factoid associated words from natural language sentences is disclosed. The method includes creating an input vector that includes a plurality of parameters for each target word in a sentence. For a target word, the plurality of parameters includes a Part of Speech (POS) vector, a word embedding, a word embedding for a head word of the target word, a dependency label, and a semantic role label. The method includes processing for each target word, the input vector through a trained neural network and assigning one or more factoid tags to each target word in the sentence. The method includes extracting text associated with factoids from the sentence based on the one or more factoid tags. The method further includes providing a response to the sentence inputted by the user based on the text associated with the factoids.
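The per-word input vector described above can be sketched as a concatenation of its five parts. The one-hot encoding of the POS tag, dependency label, and semantic role label is an assumption made for illustration, as are the function names and the ordering of the parts; the abstract only lists which parameters the vector includes.

```python
def one_hot(label, vocabulary):
    """Encode a label as a one-hot list over a fixed, ordered vocabulary."""
    return [1.0 if label == entry else 0.0 for entry in vocabulary]


def build_input_vector(pos, embedding, head_embedding, dep, srl,
                       pos_tags, dep_labels, srl_labels):
    """Concatenate the five per-word parameters into one flat feature list."""
    return (
        one_hot(pos, pos_tags)
        + list(embedding)
        + list(head_embedding)
        + one_hot(dep, dep_labels)
        + one_hot(srl, srl_labels)
    )
```

Each target word's vector would then be passed to the trained network, which is outside the scope of this sketch.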
Wearable word counter
This disclosure generally relates to a word counting device. Specifically, this disclosure generally relates to a wearable word counter device. The word counter device includes a microphone to receive speech input. The word counter device further includes a light sensor to receive data representative of an amount of light in an environment of the word counter device. The word counter device also includes an accelerometer to receive data representative of an amount of movement of the word counter device or the wearer of the word counter device.
COMMUNICATION WITH IN-GAME CHARACTERS
A method for coordinating reactions of a virtual character with a script spoken by a user involves specifying keywords in the script, making a first prediction of times for individual keywords and for responses by a virtual character, displaying the script, sensing the time that the user reaches a first keyword, recalculating the predictions based on the syllables in the script and the time the user reaches the first keyword, sensing the time that the user reaches a second keyword, recalculating the predictions of times based on the syllables and the time to the second keyword, continuing to sense times and recalculating until a last keyword is reached, and causing specific actions and responses of the virtual character according to the recalculated predictions of times.
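The predict-then-recalculate loop above can be sketched with a simple pace model. The sketch assumes each keyword is identified by its cumulative syllable offset from the start of the script and that the user's pace is constant between updates; the function names and this linear model are illustrative assumptions, not details given in the abstract.

```python
def predict_times(keyword_offsets, seconds_per_syllable, start_time=0.0, start_offset=0):
    """Predict when each keyword will be reached at a constant pace."""
    return [
        start_time + (offset - start_offset) * seconds_per_syllable
        for offset in keyword_offsets
    ]


def recalculate(keyword_offsets, reached_index, observed_time):
    """Re-predict the remaining keyword times once a keyword has been reached.

    The pace is re-estimated as the observed time divided by the syllables
    spoken so far, then applied to the keywords that remain.
    """
    reached_offset = keyword_offsets[reached_index]
    if reached_offset <= 0 or observed_time <= 0:
        raise ValueError("reached keyword must have a positive offset and time")
    pace = observed_time / reached_offset  # seconds per syllable
    remaining = keyword_offsets[reached_index + 1:]
    return predict_times(remaining, pace, observed_time, reached_offset)
```

With keywords at syllable offsets 10, 20, and 40 and an initial guess of 0.25 s per syllable, the first prediction is 2.5 s, 5.0 s, and 10.0 s. If the user actually reaches the first keyword at 5.0 s, the remaining keywords are re-predicted at 10.0 s and 20.0 s.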
METHOD, DEVICE AND STORAGE MEDIUM FOR SPEECH RECOGNITION
Disclosed are a method, device and readable storage medium for speech recognition. The method includes: determining speech features of the speech data by feature extraction on the speech data; determining syllable data corresponding to each of the speech features based on a plurality of feature extraction layers and a softmax function layer included in an acoustic model, where the acoustic model is configured to convert the speech feature into the syllable data; determining text data corresponding to the speech data based on a language model, a pronouncing dictionary and the syllable data, where the pronouncing dictionary is configured to convert the syllable data into the text data, and the language model is configured to evaluate the text data; and outputting the text data.
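The flow from speech features to text can be sketched in a heavily simplified form. Here the acoustic model is reduced to precomputed per-frame scores followed by a softmax, the pronouncing dictionary maps a whole syllable sequence to candidate texts, and the language model is any scoring function supplied by the caller. All names and these simplifications are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def frames_to_syllables(frame_logits, syllable_inventory):
    """Pick the most probable syllable for each frame of scores."""
    syllables = []
    for logits in frame_logits:
        probabilities = softmax(logits)
        best = max(range(len(probabilities)), key=probabilities.__getitem__)
        syllables.append(syllable_inventory[best])
    return syllables


def syllables_to_text(syllables, pronouncing_dict, lm_score):
    """Look up candidate texts for the syllables and keep the best-scoring one."""
    candidates = pronouncing_dict.get(tuple(syllables), [])
    if not candidates:
        return None
    return max(candidates, key=lm_score)
```

A real decoder would search over many segmentations and hypotheses rather than a single lookup; the sketch only shows how the three components hand data to one another.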