G10L21/0308

EFFICIENT BLIND SOURCE SEPARATION USING TOPOLOGICAL APPROACH

Aspects disclosed herein generally relate to a method and system for efficient blind source separation using a topological approach. The method and system locate and separate the audio streams by constructing and simplifying a contour tree over a time-frequency smooth weighted histogram built in the included subsystems. Thus, in one example, the audio streams can be separated and reproduced in a faster, more reliable, higher-quality, and more robust way.

Audio device and method of audio processing with improved talker discrimination

An audio device for improved talker discrimination is provided. To improve suppression of close talker interference, the audio device comprises at least a first and a second audio input to receive a first and second voice input signal; a first filter bank, configured to provide a plurality of first sub-band signals; a second filter bank, configured to provide a plurality of second sub-band signals; a correlator, configured to determine at least one signal correlation between at least a group of the first sub-band signals and at least a group of the second sub-band signals; and an attenuator, arranged to receive at least the group of the first sub-band signals and configured to conduct signal attenuation on the group of the first sub-band signals to provide gain-controlled sub-band signals, wherein the signal attenuation is based on the determined at least one signal correlation.
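The correlator and attenuator stages described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the two filter banks have already produced equal-length sub-band signals, it uses a zero-lag normalized correlation, and the gain mapping `max(floor, 1 - |rho|)` along with the names `normalized_correlation`, `attenuate_subbands`, and `floor` are hypothetical choices made for this example.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag normalized correlation of two equal-length sample lists."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


def attenuate_subbands(first_bands, second_bands, floor=0.1):
    """Scale each first-channel sub-band by a correlation-dependent gain.

    Assumption for illustration: gain = max(floor, 1 - |rho|), so a band
    that is strongly correlated with the second channel is attenuated most.
    """
    out = []
    for a, b in zip(first_bands, second_bands):
        rho = normalized_correlation(a, b)
        gain = max(floor, 1.0 - abs(rho))
        out.append([gain * x for x in a])
    return out


# Band 0 is identical in both channels (fully correlated).
# Band 1 holds orthogonal signals (uncorrelated).
first = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
second = [[1.0, -1.0, 1.0, -1.0], [1.0, -1.0, 1.0, -1.0]]
gained = attenuate_subbands(first, second)
```

In this toy input the fully correlated band is scaled down to the floor gain, while the uncorrelated band passes through unchanged. A real device would compute the correlation over short frames and smooth the gain over time, which the abstract does not specify.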

Online target-speech extraction method based on auxiliary function for robust automatic speech recognition

A target speech signal extraction method for robust speech recognition includes: initializing a steering vector for a target speech source and an adaptive vector, setting a real output channel of the target speech source as an output by the adaptive vector, initializing adaptive vectors for a noise and setting a dummy channel as an output by the adaptive vectors for the noise; setting a cost function for minimizing dependency between a real output for the target speech source and a dummy output for the noise; setting an auxiliary function to the cost function, and updating the adaptive vector for the target speech source and the adaptive vectors for the noise by using the auxiliary function and the steering vector; estimating the target speech signal by using the adaptive vector thereby extracting the target speech signal from the input signals; and updating the steering vector for the target speech source.

Audio Processing Method, Method for Training Estimation Model, and Audio Processing System
20220406325 · 2022-12-22

An audio processing method by which input data are obtained that include first sound data representing first components of a first frequency band, included in a first sound corresponding to a first sound source, second sound data representing second components of the first frequency band, included in a second sound corresponding to a second sound source, and mix sound data representing mix components of an input frequency band including a second frequency band, the mix components being included in a mix sound of the first sound and the second sound. The input data are then input to a trained estimation model, to generate at least one of first output data representing first estimated components within an output frequency band including the second frequency band, included in the first sound, or second output data representing second estimated components within the output frequency band, included in the second sound.

ELECTRONIC DEVICE AND PERSONALIZED AUDIO PROCESSING METHOD OF THE ELECTRONIC DEVICE
20220406324 · 2022-12-22

According to an embodiment, an electronic device comprises: a microphone configured to receive an audio signal comprising a speech of a user; a memory storing instructions therein; and a processor electrically connected to the memory and configured to execute the instructions, wherein execution of the instructions by the processor causes the processor to perform a plurality of operations, the plurality of operations comprising: removing noise from the audio signal, thereby generating a first output result; performing speaker separation on the audio signal or the first output result, thereby generating a second output result; and processing a command corresponding to the audio signal based on the first output result and the second output result.

EXTRACTION OF AN AUDIO OBJECT
20220383894 · 2022-12-01

A method for extracting at least one audio object from at least two audio input signals, each of which contains the audio object. The second audio input signal is synchronized with the first audio input signal to obtain a synchronized second audio input signal. The audio object is extracted by applying at least one trained model to the first audio input signal and to the synchronized second audio input signal. The audio object is then output. Further, the step of synchronizing the second audio input signal with the first audio input signal includes the steps of: generating audio signals; analytically calculating a correlation between the audio signals; optimizing the correlation vector; and determining the synchronized second audio input signal using the optimized correlation vector.
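The synchronization step can be illustrated with a small cross-correlation sketch. It is a simplification under stated assumptions: the abstract's "optimizing the correlation vector" is reduced here to picking the lag with the largest correlation value, the search is a brute-force scan over a bounded lag range, and the names `best_lag`, `synchronize`, and `max_lag` are hypothetical and do not come from the patent.

```python
def best_lag(ref, sig, max_lag):
    """Return the lag in [-max_lag, max_lag] that maximizes
    the sum over n of ref[n] * sig[n + lag]."""
    def corr(lag):
        total = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(sig):
                total += r * sig[m]
        return total

    return max(range(-max_lag, max_lag + 1), key=corr)


def synchronize(ref, sig, max_lag):
    """Shift sig so it lines up with ref; samples shifted in from
    outside sig are filled with zeros."""
    lag = best_lag(ref, sig, max_lag)
    return [sig[n + lag] if 0 <= n + lag < len(sig) else 0.0
            for n in range(len(ref))]


# The second signal is the first one delayed by two samples.
ref = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
aligned = synchronize(ref, delayed, max_lag=3)
```

Here the estimated lag is two samples, and shifting the delayed signal by that lag reproduces the reference. The aligned pair would then be passed to the trained model, which is outside the scope of this sketch.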

Vowel sensing voice activity detector
11587579 · 2023-02-21

Methods and apparatuses for detecting user speech are described. In one example, a method for detecting user speech includes receiving a microphone output signal corresponding to sound received at a microphone and identifying a spoken vowel sound in the microphone signal. The method further includes outputting an indication of user speech detection responsive to identifying the spoken vowel sound.