G10L2015/027

Filtering Model Training Method and Speech Recognition Method
20200258499 · 2020-08-13

A filtering model training method includes obtaining N original syllables, obtaining N recognized syllables, and obtaining N syllable distances based on the N original syllables and the N recognized syllables, where the N syllable distances are in a one-to-one correspondence with N syllable pairs, the N original syllables and the N recognized syllables form the N syllable pairs, each syllable pair includes an original syllable and a recognized syllable that correspond to each other, and each syllable distance is used to indicate a similarity between an original syllable and a recognized syllable that are included in a corresponding syllable pair.
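
The pairing step described above can be sketched as follows. This is a minimal illustration only: the abstract does not say how a syllable distance is computed, so Levenshtein edit distance over romanized syllable strings is assumed here, and the function names `edit_distance` and `syllable_distances` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (an assumed metric, not from the abstract)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            # deletion, insertion, substitution
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def syllable_distances(original, recognized):
    """Return one distance per (original, recognized) syllable pair, in order."""
    if len(original) != len(recognized):
        raise ValueError("expected the same number N of original and recognized syllables")
    return [edit_distance(o, r) for o, r in zip(original, recognized)]


if __name__ == "__main__":
    print(syllable_distances(["ni", "hao", "ma"], ["ni", "gao", "man"]))
```

A smaller value means the recognized syllable is closer to the original; any other similarity measure could be substituted for `edit_distance` without changing the one-to-one pairing.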

System and method of generating effects during live recitations of stories

One aspect of this disclosure relates to presentation of a first effect on one or more presentation devices during an oral recitation of a first story. The first effect is associated with a first trigger point, first content, and/or the first story. The first trigger point is one or more specific syllables from a word and/or phrase in the first story. A first transmission point associated with the first effect can be determined based on a latency of a presentation device and a user speaking profile. The first transmission point is one or more specific syllables from a word and/or phrase before the first trigger point in the first story. Control signals carrying instructions to present the first content at the first trigger point are transmitted to the presentation device when a user recites the first transmission point, such that the first content is presented at the first trigger point.
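
One way to picture the transmission point is as a syllable index that leads the trigger point by enough syllables to cover the device latency. The sketch below assumes the user speaking profile reduces to a single speaking rate in syllables per second, which the abstract does not state; `transmission_index` and its parameters are hypothetical names.

```python
import math


def transmission_index(trigger_index: int, latency_s: float, syllables_per_s: float) -> int:
    """Index of the syllable at which to send control signals.

    Assumes a constant speaking rate; the lead is rounded up so the
    effect is never late, and clamped to the start of the story.
    """
    if latency_s < 0 or syllables_per_s <= 0:
        raise ValueError("latency must be >= 0 and speaking rate must be > 0")
    lead = math.ceil(latency_s * syllables_per_s)
    return max(0, trigger_index - lead)


if __name__ == "__main__":
    # Trigger at syllable 10, 0.5 s device latency, 4 syllables per second.
    print(transmission_index(10, 0.5, 4.0))
```

Rounding the lead up favors sending the signal slightly early rather than late; a real system would presumably adapt the rate estimate as the recitation proceeds.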

METHODS, DEVICES AND COMPUTER-READABLE STORAGE MEDIA FOR REAL-TIME SPEECH RECOGNITION
20200219486 · 2020-07-09

Methods, apparatuses, devices and computer-readable storage media for real-time speech recognition are provided. The method includes: based on an input speech signal, obtaining truncating information for truncating a sequence of features of the speech signal; based on the truncating information, truncating the sequence of features into a plurality of subsequences; and for each subsequence in the plurality of subsequences, obtaining a real-time recognition result through an attention mechanism.
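
The truncation step alone can be sketched as below, assuming the truncating information takes the form of a list of frame indices at which to cut. That representation is an assumption, `truncate_features` is a hypothetical name, and the attention-based recognition applied to each subsequence is not shown.

```python
def truncate_features(features, cut_points):
    """Split a feature sequence into consecutive subsequences at the given indices.

    Cut points outside the open interval (0, len(features)) are ignored,
    and duplicates are collapsed.
    """
    if not features:
        return []
    n = len(features)
    bounds = [0] + sorted({c for c in cut_points if 0 < c < n}) + [n]
    return [features[start:end] for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    frames = list(range(10))  # stand-in for per-frame feature vectors
    for chunk in truncate_features(frames, [3, 7]):
        print(chunk)
```

Each returned subsequence could then be handed to a recognizer as soon as its closing cut point is known, which is what makes streaming output possible.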

VOICE WAKE-UP APPARATUS AND METHOD THEREOF
20200219502 · 2020-07-09

A voice wake-up apparatus for use in an electronic device is provided. The apparatus includes a voice activity detection circuit, a storage circuit and a smart detection circuit. The voice activity detection circuit receives an input sound signal and detects a voice activity section of the input sound signal. The storage circuit stores a predetermined voice sample. The smart detection circuit receives the input sound signal, performs a time domain and a frequency domain detection on the voice activity section to generate a syllable and frequency characteristic detection result, compares the syllable and frequency characteristic detection result with the predetermined voice sample, and, when the result matches the predetermined voice sample, generates a wake-up signal to a processing circuit of the electronic device to wake up the processing circuit.

Methods for explainability of deep-learning models

Embodiments are disclosed for health assessment and diagnosis implemented in an artificial intelligence (AI) system. In an embodiment, a method comprises: feeding a first set of input features to the AI model; obtaining a first set of raw output predictions from the model; determining a first set of impact scores for the input features fed into the model; training a neural network with the first set of impact scores as input to the network and pre-determined sentences describing the model's behavior as output; feeding a second set of input features to the AI model; obtaining a second set of raw output predictions from the model; determining a second set of impact scores based on the second set of output predictions; feeding the second set of impact scores to the neural network; and generating a sentence describing the AI model's behavior on the second set of input features.

TRANSPORTATION VEHICLE
20200209008 · 2020-07-02

A transportation vehicle has a navigation system and an operating system connected to the navigation system for data transmission via a bus system. The transportation vehicle has a microphone and includes a phoneme generation module for generating phonemes from an acoustic voice signal or the output signal of the microphone, the phonemes being part of a predefined selection of exclusively monosyllabic phonemes, and a phoneme-to-grapheme module for generating inputs to operate the transportation vehicle based on the monosyllabic phonemes generated by the phoneme generation module.

Production of speech based on whispered speech and silent speech

A method, a system, and a computer program product are provided for interpreting low amplitude speech and transmitting amplified speech to a remote communication device. At least one computing device receives sensor data from multiple sensors. The sensor data is associated with the low amplitude speech. At least one of the at least one computing device analyzes the sensor data to map the sensor data to at least one syllable resulting in a string of one or more words. An electronic representation of the string of the one or more words may be generated and transmitted to a remote communication device for producing the amplified speech from the electronic representation.

ASSESSMENT OF SPEECH CONSUMABILITY BY TEXT ANALYSIS

Methods, computer program products, and systems are presented. The methods include, for instance, obtaining an input text for an output speech. The numbers of words and syllables in each sentence are counted, and a mean sentence length of the input text is calculated. Each sentence length is checked against the mean sentence length, and a variation is calculated for each sentence. For the input text, the consumability-readability score is produced as the average of the variations of all sentences in the input text. The consumability-readability score indicates the level of satisfaction for the listener of the output speech based on the input text.
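
The scoring arithmetic can be illustrated with a short sketch. Several choices here are assumptions rather than details from the abstract: sentence length is measured in words only (syllable counts are left out), sentences are split on `.`, `!` and `?`, and a sentence's variation is taken to be its absolute deviation from the mean length. The function names are hypothetical.

```python
import re


def split_sentences(text: str):
    """Naive sentence splitter on ., ! and ? (an assumption for this sketch)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def consumability_score(text: str) -> float:
    """Average absolute deviation of sentence lengths (in words) from their mean."""
    lengths = [len(s.split()) for s in split_sentences(text)]
    if not lengths:
        raise ValueError("input text contains no sentences")
    mean_length = sum(lengths) / len(lengths)
    variations = [abs(n - mean_length) for n in lengths]
    return sum(variations) / len(variations)


if __name__ == "__main__":
    sample = "One two three. One two three four five."
    print(consumability_score(sample))
```

For the sample text the sentence lengths are 3 and 5, the mean is 4, and each sentence deviates by 1, so the average variation is 1.0; a text whose sentences all have the same length scores 0.0.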

Intelligent Health Monitoring

Embodiments are disclosed for health assessment and diagnosis implemented in an artificial intelligence (AI) system. In an embodiment, a method comprises: capturing, using one or more sensors of a device, signals including information about a user's symptoms; using one or more processors of the device to: collect other data correlative of symptoms experienced by the user; and implement pre-trained data-driven methods to: determine one or more symptoms of the user; determine a disease or disease state of the user based on the determined one or more symptoms; determine a medication effectiveness in suppressing at least one determined symptom or improving the determined disease state of the user; and present, using an output device, one or more pieces of evidence for at least one of the determined symptoms, the disease, the disease state, or an indication of the medication effectiveness for the user.

Intelligent Health Monitoring
20200151519 · 2020-05-14

Embodiments are disclosed for health assessment and diagnosis implemented in an artificial intelligence (AI) system. The AI system takes as input information from a multitude of sensors measuring different biomarkers in a continuous or intermittent fashion. The proposed techniques disclosed herein address the unique challenges encountered in implementing such an AI system.