G10L25/09

TIME-BASED FREQUENCY TUNING OF ANALOG-TO-INFORMATION FEATURE EXTRACTION
20220215829 · 2022-07-07

A sound recognition system including time-dependent analog filtered feature extraction and sequencing. An analog front end (AFE) in the system receives input analog signals, such as signals representing an audio input to a microphone. Features in the input signal are extracted by measuring attributes such as zero-crossing events and total energy in filtered versions of the signal, with different frequency characteristics applied at different times during the audio event. In one embodiment, a tunable analog filter is controlled to change its frequency characteristics at different times during the event. In another embodiment, multiple analog filters with different filter characteristics filter the input signal in parallel, and signal features are extracted from each filtered signal; a multiplexer selects the desired features at different times during the event.
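The tunable-filter embodiment can be illustrated with a small digital simulation. This is only a sketch under stated assumptions: the abstract describes analog circuitry, whereas the code below stands in a one-pole digital low-pass for the tunable analog filter, and the names `one_pole_lowpass`, `extract_features`, and `cutoff_schedule` are hypothetical, not taken from the patent.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate, state=0.0):
    """Digital stand-in (assumption) for a tunable analog low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out = []
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out, state

def zero_crossings(frame):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))

def energy(frame):
    """Total energy of the frame (sum of squared samples)."""
    return sum(x * x for x in frame)

def extract_features(signal, sample_rate, frame_len, cutoff_schedule):
    """Re-tune the filter once per frame and measure features on its output.

    cutoff_schedule[i] is the (hypothetical) cutoff used during frame i, so
    the filter's frequency characteristic changes over the audio event.
    """
    features = []
    state = 0.0
    for i, cutoff in enumerate(cutoff_schedule):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        if len(frame) < frame_len:
            break  # not enough samples left for a full frame
        filtered, state = one_pole_lowpass(frame, cutoff, sample_rate, state)
        features.append((cutoff, zero_crossings(filtered), energy(filtered)))
    return features
```

For a signal containing both a low and a high tone, a low cutoff in the first frame suppresses the high tone, so that frame reports fewer zero crossings and less energy than a later frame filtered with a high cutoff.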

Time-based frequency tuning of analog-to-information feature extraction
11302306 · 2022-04-12

A sound recognition system including time-dependent analog filtered feature extraction and sequencing. An analog front end (AFE) in the system receives input analog signals, such as signals representing an audio input to a microphone. Features in the input signal are extracted by measuring attributes such as zero-crossing events and total energy in filtered versions of the signal, with different frequency characteristics applied at different times during the audio event. In one embodiment, a tunable analog filter is controlled to change its frequency characteristics at different times during the event. In another embodiment, multiple analog filters with different filter characteristics filter the input signal in parallel, and signal features are extracted from each filtered signal; a multiplexer selects the desired features at different times during the event.

MODEL CONSTRUCTING METHOD FOR AUDIO RECOGNITION

A model constructing method for audio recognition is provided. In the method, audio data is obtained. A predicted result for the audio data is determined by using a classification model trained by a machine learning algorithm. The predicted result includes a label defined by the classification model. A prompt message is provided according to a loss level of the predicted result. The loss level is related to the difference between the predicted result and the corresponding actual result. The prompt message is used to query the correlation between the audio data and the label. The classification model is modified according to a confirmation response to the prompt message, where the confirmation response relates to a confirmation of the correlation between the audio data and the label. Accordingly, labeling efficiency and prediction accuracy can be improved.
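The prompt-and-confirm loop described in this abstract might be sketched as follows. Everything specific here is an assumption made for illustration: the abstract does not say how the loss level is computed, so the sketch uses the cross-entropy of the prediction against its own top label as a stand-in for the "actual result", and the threshold and the helper names `make_prompt` and `apply_confirmation` are hypothetical. Retraining is represented only by appending a confirmed example to a training set.

```python
import math

def loss_against_top_label(probs):
    """Cross-entropy against a one-hot target at the predicted label.

    Assumption: the predicted label stands in for the unknown actual result.
    """
    return -math.log(max(probs.values()))

def make_prompt(clip_id, probs, threshold=0.2):
    """Return a question for the user when the loss level is high enough."""
    label = max(probs, key=probs.get)
    if loss_against_top_label(probs) < threshold:
        return None  # low loss: accept the label without asking
    return f"Is clip {clip_id} related to '{label}'? (yes/no)"

def apply_confirmation(training_set, clip_id, label, confirmed):
    """Keep the (clip, label) pair for retraining only if the user confirms."""
    if confirmed:
        training_set.append((clip_id, label))
    return training_set
```

A confident prediction such as `{"dog": 0.95, "cat": 0.05}` produces no prompt, while a borderline one such as `{"dog": 0.55, "cat": 0.45}` asks the user to confirm the label before it is used to modify the model.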

Sound recognition apparatus
11120817 · 2021-09-14

A sound recognition apparatus (100) comprises a microphone (110) for capturing a posterior sound signal; and a processing circuit comprising a processor (180). The processing circuit is configured to process the posterior sound signal to derive posterior data, generate, using the processor (180), amalgamated data from the posterior data and anterior data derived from a previously captured anterior signal, determine, by the processor (180), whether there are correlations between the amalgamated data, the posterior data, and the anterior data that indicate that the posterior data matches the anterior data by comparing the posterior data and the amalgamated data, and the anterior data and the amalgamated data, and upon the posterior data matching the anterior data, output, by the processor (180), an indication that the posterior data matches the anterior data.

Methods and system for cue detection from audio input, low-power data processing and related arrangements

Methods and arrangements involving electronic devices, such as smartphones, tablet computers, wearable devices, etc., are disclosed. One arrangement involves a low-power processing technique for discerning cues from audio input. Another involves a technique for detecting audio activity based on the Kullback-Leibler divergence (KLD) (or a modified version thereof) of the audio input. Still other arrangements concern techniques for managing the manner in which policies are embodied on an electronic device. Others relate to distributed computing techniques. A great variety of other features are also detailed.
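As a rough illustration of divergence-based activity detection, the sketch below compares the amplitude histogram of an incoming frame with that of a background-noise reference and flags activity when the divergence exceeds a threshold. The abstract does not specify which distributions are compared, so the histogram choice, the smoothing constant, the threshold, and the names `histogram` and `is_active` are all assumptions.

```python
import math

def histogram(samples, bins=16, lo=-1.0, hi=1.0, eps=1e-6):
    """Normalized amplitude histogram; eps keeps every bin strictly positive."""
    counts = [eps] * bins
    width = (hi - lo) / bins
    for x in samples:
        idx = min(bins - 1, max(0, int((x - lo) / width)))
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def is_active(frame, noise_reference, threshold=0.5):
    """Flag a frame whose amplitude distribution diverges from the noise model."""
    return kl_divergence(histogram(frame), histogram(noise_reference)) > threshold
```

Because the divergence is zero when the two distributions coincide and grows as the frame departs from the noise model, a single threshold gives a cheap per-frame decision.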

ELECTRONIC DEVICE AND METHOD FOR CONTROLLING THE SAME, AND STORAGE MEDIUM
20210158824 · 2021-05-27

Disclosed is an electronic device capable of improving voice recognition. The electronic device includes a sound receiver, and a processor configured to: acquire a sound signal received by the sound receiver, separate the acquired sound signal into a plurality of sound source signals, detect signal characteristics of each of the plurality of separated sound source signals, and identify a sound source signal corresponding to a user utterance voice among the plurality of sound source signals based on predefined information on a correlation between the detected signal characteristics and the user utterance voice.

SPEECH SIGNAL CASCADE PROCESSING METHOD, TERMINAL, AND COMPUTER-READABLE STORAGE MEDIUM
20210035596 · 2021-02-04

A method for improving speech signal intelligibility is performed at a device. A speech signal is obtained. A correspondence between the speech signal and a respective user group among different user groups having distinct voice characteristics is identified. Pre-encoding signal augmentation is performed on the speech signal with a respective pre-augmentation filtering coefficient that corresponds to the respective user group to obtain a group-specific pre-augmented speech signal. The device encodes the pre-augmented speech signal for subsequent transmission through the voice communication channel. An encoded version of the pre-augmented speech signal has reduced loss of signal quality as compared to an encoded version of the speech signal that is obtained without the pre-encoding signal augmentation.

Voice detection method and apparatus, and storage medium

Embodiments of the present disclosure provide a voice detection method. An audio signal can be divided into a plurality of audio segments. Audio characteristics can be extracted from each of the plurality of audio segments. The audio characteristics of the respective audio segment include a time domain characteristic and a frequency domain characteristic of the respective audio segment. At least one target voice segment can be detected from the plurality of audio segments according to the audio characteristics of the plurality of audio segments.
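The segment-wise detection described here could look something like the following sketch. The abstract does not name the characteristics used, so short-time energy (time domain) and spectral centroid (frequency domain) are chosen purely for illustration, and the thresholds and function names are hypothetical. A naive DFT is used so the example needs only the standard library.

```python
import cmath
import math

def split_segments(signal, seg_len):
    """Divide the signal into non-overlapping segments of seg_len samples."""
    return [signal[i:i + seg_len] for i in range(0, len(signal) - seg_len + 1, seg_len)]

def short_time_energy(seg):
    """Time-domain characteristic: mean squared amplitude."""
    return sum(x * x for x in seg) / len(seg)

def spectral_centroid(seg, sample_rate):
    """Frequency-domain characteristic: magnitude-weighted mean frequency in Hz."""
    n = len(seg)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(seg[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total

def detect_voice_segments(signal, sample_rate, seg_len=256,
                          min_energy=0.01, band=(100.0, 4000.0)):
    """Return indices of segments that pass both (assumed) tests."""
    hits = []
    for idx, seg in enumerate(split_segments(signal, seg_len)):
        e = short_time_energy(seg)
        c = spectral_centroid(seg, sample_rate)
        if e >= min_energy and band[0] <= c <= band[1]:
            hits.append(idx)
    return hits
```

Requiring both tests to pass means a loud segment outside the expected frequency band, or a quiet segment inside it, is not reported as a target voice segment.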