Patent classifications
G10L25/27
ARTIFICIAL INTELLIGENCE (AI) BASED GARBLED SPEECH ELIMINATION
An AI-based approach to garbled speech (GS) detection is described. Machine learning (ML) models are created that can distinguish between garbled and non-garbled speech with high accuracy. The models take as input an encoded speech frame that has passed a CRC check. The input data (predictors) for the models are a selected set of information elements (IEs), each a set of one or more bits, of the encoded speech frame. The selected IEs are part of the input parameters to the speech decoder. Because the models work on encoded frames, they can operate on a single frame; working on decoded frames would require taking the previous encoded frame into account in order to perform the decoding.
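As a minimal sketch of the feature-extraction step, the snippet below reads a few bit fields from a single encoded frame and returns them as integer predictors. The layout `IE_LAYOUT`, the field names, and the 3-byte frame are hypothetical and chosen only for illustration; the actual IEs depend on the codec, and the abstract does not name them. The resulting values would be passed to whatever classifier is trained to separate garbled from non-garbled frames.

```python
from typing import Dict, Tuple

# Hypothetical IE layout: name -> (bit offset from frame start, width in bits).
# Real field positions depend on the speech codec and are not given in the abstract.
IE_LAYOUT: Dict[str, Tuple[int, int]] = {
    "gain_index": (0, 5),
    "pitch_lag": (5, 8),
    "codebook_index": (13, 7),
}


def extract_ies(frame: bytes, layout: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Read each IE as an unsigned integer from one frame, MSB first."""
    total_bits = len(frame) * 8
    as_int = int.from_bytes(frame, "big")
    features: Dict[str, int] = {}
    for name, (offset, width) in layout.items():
        if offset + width > total_bits:
            raise ValueError(f"IE {name!r} does not fit in the frame")
        # Shift the field down to the least significant bits, then mask it.
        shift = total_bits - offset - width
        features[name] = (as_int >> shift) & ((1 << width) - 1)
    return features


# A single frame is enough: no previous frame is needed to build the predictors.
frame = bytes([0b10110011, 0b01010111, 0b00010000])
features = extract_ies(frame, IE_LAYOUT)
print(features)
```

Only frames that have already passed the CRC check would reach this step, so the classifier targets frames that look valid to the channel decoder but still decode to garbled audio.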
Real-time verbal harassment detection system
In some cases, a verbal harassment detection system may use machine learning models to detect verbal harassment in real-time or near real-time. The system may receive an audio segment comprising a portion of audio captured by a microphone located within a vehicle. Further, the system may convert the audio segment to a text segment. The system may provide at least the text segment to a prediction model associated with verbal harassment detection to obtain a harassment prediction. Further, the system may provide the audio segment to an emotion detector to obtain a detected emotion of a speaking user that made an utterance included in the audio segment. Based at least in part on the harassment prediction and the detected emotion, the system may automatically, and without user intervention, determine whether a user is being harassed.
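One way to picture the final decision step is a simple fusion rule that combines the text-based harassment prediction with the detected emotion. The sketch below is an assumption, not the method claimed: the score range, the emotion labels, and both thresholds are hypothetical, and the speech-to-text, prediction, and emotion models are taken as already having produced their outputs.

```python
from dataclasses import dataclass


@dataclass
class SegmentResult:
    harassment_score: float  # assumed output of the text prediction model, in [0, 1]
    emotion: str             # assumed label from the emotion detector


# Hypothetical settings: a hostile emotion lowers the score needed to flag a segment.
HOSTILE_EMOTIONS = {"angry", "disgusted"}
BASE_THRESHOLD = 0.8
HOSTILE_THRESHOLD = 0.6


def is_harassment(result: SegmentResult) -> bool:
    """Combine the harassment prediction and the detected emotion into one decision."""
    if result.emotion in HOSTILE_EMOTIONS:
        threshold = HOSTILE_THRESHOLD
    else:
        threshold = BASE_THRESHOLD
    return result.harassment_score >= threshold


print(is_harassment(SegmentResult(0.7, "angry")))    # same score, hostile emotion
print(is_harassment(SegmentResult(0.7, "neutral")))  # same score, neutral emotion
```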
On-device self training in a two-stage wakeup system comprising a system on chip which operates in a reduced-activity mode
In one embodiment, an electronic device includes an input device configured to provide an input stream, a first processing device, and a second processing device. The first processing device is configured to use a keyword-detection model to determine if the input stream comprises a keyword, wake up the second processing device in response to determining that a segment of the input stream comprises the keyword, and modify the keyword-detection model in response to a training input received from the second processing device. The second processing device is configured to use a first neural network to determine whether the segment of the input stream comprises the keyword and provide the training input to the first processing device in response to determining that the segment of the input stream does not comprise the keyword.
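The feedback loop between the two processing devices can be sketched as follows. This is an illustration under stated assumptions: a score threshold stands in for the keyword-detection model, the second stage's neural network is replaced by a precomputed confidence value, and the update rule in `train` (raise the threshold just above the rejected score) is hypothetical.

```python
class FirstStage:
    """Always-on detector; a score threshold stands in for the keyword-detection model."""

    def __init__(self, threshold: float = 0.5, step: float = 0.05) -> None:
        self.threshold = threshold
        self.step = step

    def detects(self, score: float) -> bool:
        return score >= self.threshold

    def train(self, rejected_score: float) -> None:
        # Training input from the second stage: the segment was a false alarm,
        # so move the threshold just above the score that triggered it (capped at 1.0).
        self.threshold = min(1.0, max(self.threshold, rejected_score) + self.step)


def process_segment(first: FirstStage, first_score: float,
                    second_score: float, second_threshold: float = 0.9) -> str:
    """Run one segment through both stages and return what happened."""
    if not first.detects(first_score):
        return "asleep"            # second processing device is never woken
    # The second processing device is woken here and re-checks the same segment.
    if second_score >= second_threshold:
        return "keyword"
    first.train(first_score)       # on-device self training after a false wake-up
    return "false_alarm"


first = FirstStage()
print(process_segment(first, 0.6, 0.3))    # first stage fires, second stage rejects
print(process_segment(first, 0.6, 0.3))    # the same input no longer wakes the second stage
print(process_segment(first, 0.95, 0.95))  # a confirmed keyword
```

The point of the design is that each false wake-up makes the low-power stage more selective, so the higher-power device is woken less often over time.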
METHOD FOR ESTABLISHING DEFECT DETECTION MODEL AND ELECTRONIC APPARATUS
A method for establishing a defect detection model and an electronic apparatus are provided. A first classification model is established based on a training sample set including a plurality of training samples. The training samples are respectively input to the first classification model to obtain a classification result of each training sample. A plurality of outlier samples that are classified incorrectly are obtained from the training samples based on the classification result. A part of outlier samples that are classified incorrectly is deleted from the training samples, and the remaining training samples are used as an optimal sample set. A second classification model is established based on the optimal sample set so as to perform a defect detection through the second classification model.
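The two-pass training procedure can be sketched with a toy classifier. The nearest-centroid model on a single numeric feature, the `drop_fraction` parameter, and the rule for choosing which outliers to delete (those farthest from their own label's centroid) are assumptions made for illustration; the abstract does not specify the classifier or the selection rule.

```python
import math
from statistics import mean
from typing import Dict, List, Tuple

Sample = Tuple[float, str]  # (feature value, label)


def fit_centroids(samples: List[Sample]) -> Dict[str, float]:
    """Toy classification model: one centroid per label."""
    labels = {label for _, label in samples}
    return {lab: mean(x for x, l in samples if l == lab) for lab in labels}


def predict(centroids: Dict[str, float], x: float) -> str:
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))


def build_defect_model(samples: List[Sample], drop_fraction: float = 0.5):
    # 1. Establish the first classification model on the full training set.
    first = fit_centroids(samples)
    # 2. Classify every training sample and collect the misclassified outliers.
    wrong = [i for i, (x, lab) in enumerate(samples) if predict(first, x) != lab]
    # 3. Delete only a part of the outliers (assumed rule: farthest from own centroid).
    wrong.sort(key=lambda i: abs(samples[i][0] - first[samples[i][1]]), reverse=True)
    dropped = set(wrong[: math.ceil(len(wrong) * drop_fraction)])
    kept = [s for i, s in enumerate(samples) if i not in dropped]
    # 4. Establish the second model on the remaining (optimal) sample set.
    return fit_centroids(kept), kept


training = [(1.0, "ok"), (1.2, "ok"), (0.9, "ok"), (1.1, "ok"), (5.1, "ok"),
            (5.0, "defect"), (5.2, "defect"), (4.8, "defect"), (1.0, "defect")]
second_model, optimal_set = build_defect_model(training)
print(len(optimal_set), predict(second_model, 4.5))
```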
Automatic gain control based on machine learning level estimation of the desired signal
A method includes receiving, through a plurality of channels, audio data corresponding to a plurality of frequency ranges; determining, for each channel's frequency ranges, speech audio and/or noise energy level using a model trained by machine learning; determining a speech signal with removed noise for each channel; determining one or more statistical values associated with an energy level of a channel's speech signal with the removed noise; determining a strongest channel that has highest statistical values associated with an energy level of a speech signal; determining that the one or more statistical values associated with the energy level of the strongest channel's speech signal satisfy a threshold condition; comparing statistical values associated with an energy level of a speech signal of each channel with those of the strongest channel; and determining whether to update a gain value for a channel based on the channel's statistical values associated with the energy level.
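The channel-comparison steps can be sketched as below. The sketch assumes the denoised speech energy levels per channel (in dB) are already available from the trained model, uses the mean as the only statistical value, and applies a hypothetical rule: a channel's gain is updated only if the strongest channel is above `min_level_db` and the channel is within `max_gap_db` of it. Both parameters and the rule itself are illustrative assumptions.

```python
from statistics import mean
from typing import Dict, List


def channels_to_update(speech_energy_db: Dict[str, List[float]],
                       min_level_db: float = -50.0,
                       max_gap_db: float = 6.0) -> Dict[str, bool]:
    """Decide per channel whether its gain value should be updated."""
    # Statistical value per channel: mean energy of the denoised speech signal.
    stats = {ch: mean(levels) for ch, levels in speech_energy_db.items()}
    # Strongest channel: the one with the highest statistical value.
    strongest = max(stats, key=stats.get)
    # Threshold condition on the strongest channel; if it fails, update nothing.
    if stats[strongest] < min_level_db:
        return {ch: False for ch in stats}
    # Compare every channel with the strongest one.
    return {ch: stats[strongest] - level <= max_gap_db for ch, level in stats.items()}


energies = {
    "mic1": [-20.0, -22.0, -21.0],
    "mic2": [-24.0, -25.0, -23.0],
    "mic3": [-40.0, -42.0, -41.0],
}
decisions = channels_to_update(energies)
print(decisions)
```

With these inputs the first two channels are close enough to the strongest one to be treated as carrying the desired speech, while the third is left alone so that its gain is not driven by a much weaker signal.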
ELECTRONIC DEVICE AND METHOD FOR ANALYZING SPEECH RECOGNITION RESULTS
An electronic device and a method for analyzing a speech recognition result are provided. The electronic device includes a display module configured to present information outside the electronic device, a processor electrically connected to the display module, and a memory electrically connected to the processor. The processor is configured to generate feature information of a text corresponding to a user utterance based on the text, determine an output domain for processing the user utterance based on the feature information of the text, identify an expected domain predetermined by a user, extract, from the memory, feature information associated with the output domain and feature information associated with the expected domain, and display the feature information associated with the output domain and the feature information associated with the expected domain using the display module.