G10L2021/02165

METHOD AND DEVICE FOR PROCESSING AUDIO SIGNAL, AND STORAGE MEDIUM

An original noisy signal is acquired for each of at least two microphones by capturing, with those microphones, the audio signal emitted by each of at least two sound sources. For each frame in the time domain, an estimated frequency-domain signal of each sound source is acquired according to the original noisy signal of each of the at least two microphones. A frequency collection containing a plurality of predetermined static frequencies and dynamic frequencies is determined within a predetermined frequency band range. A weighting coefficient for each frequency in the frequency collection is determined according to the estimated frequency-domain signal at that frequency. A separation matrix for each frequency is determined according to the weighting coefficient. The audio signal emitted by each of the at least two sound sources is acquired based on the separation matrix and the original noisy signal.

VOICE INPUT/OUTPUT APPARATUS, HEARING AID, VOICE INPUT/OUTPUT METHOD, AND VOICE INPUT/OUTPUT PROGRAM

By performing both noise cancellation and echo cancellation, a high-quality main voice signal is generated. A voice input/output apparatus includes a noise acquirer that is arranged toward an outside of a body of a user and acquires external noise arriving from the outside of the user, a voice output unit that accepts an input of a voice signal and outputs a voice to an ear canal of the user, a main voice acquirer that acquires a mixed voice, in which the external noise, the output voice, and a main voice of the user transmitted from a vocal cord of the user through the ear canal are mixed, and outputs a mixed voice signal, a noise canceler that processes the mixed voice signal using a noise signal based on the external noise, and an echo canceler that processes the mixed voice signal using the voice signal.

Method and device for improving voice quality
11200908 · 2021-12-14

A method for improving voice quality is provided herein. The method includes receiving acoustic signals from a microphone array; receiving sensor signals from an accelerometer sensor of a headset; generating, by a beamformer, a speech output signal and a noise output signal according to the acoustic signals; best-estimating the speech output signal according to the sensor signals to generate a best-estimated signal; and generating a mixed signal according to the speech output signal and the best-estimated signal.

VOICE ACTIVITY DETECTION
20210383825 · 2021-12-09

A headset that can detect the voice activity of a user includes an inner microphone generating an inner microphone signal; an outer microphone generating an outer microphone signal, wherein the inner microphone and outer microphone are positioned such that, when the headset is worn by a user, the inner microphone is disposed nearer to the user's head; and a voice-activity detector determining a sign of a phase difference between the inner microphone signal and the outer microphone signal and generating a voice activity detection signal representing a user's voice activity when the sign of the phase difference indicates that the outer microphone received an audio signal after the inner microphone received the audio signal.
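
The phase-sign test described in this abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the function names `dft_bin` and `own_voice_detected` are hypothetical, a single DFT bin `k` is analyzed, and the delay between the microphones is assumed to be shorter than half a period at that bin so the phase does not wrap.

```python
import cmath
import math


def dft_bin(frame, k):
    """Return the single DFT coefficient X[k] of a real-valued frame."""
    n = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))


def own_voice_detected(inner, outer, k):
    """Return True when the outer microphone lags the inner microphone at bin k.

    A delay of the outer signal makes the phase of inner * conj(outer)
    positive (assumption: the delay is under half a period at bin k).
    """
    cross = dft_bin(inner, k) * dft_bin(outer, k).conjugate()
    return cmath.phase(cross) > 0.0
```

With a tone that reaches the inner microphone first, the function returns True; swapping the two inputs returns False.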

AUDIO SIGNAL GENERATION METHOD AND SYSTEM

An audio generation method and system provided in this disclosure can dynamically select a frequency splicing point of an audio signal based on voice quality of a first audio signal and a second audio signal corresponding to each frequency in a frequency domain, divide the frequency domain into a first frequency interval and a second frequency interval, select audio signals of higher voice quality that correspond to each frequency interval for splicing, and obtain a target audio signal after fusion of the first audio signal and the second audio signal, so that voice quality of the target audio signal in each frequency interval in the frequency domain is the best, thereby improving voice quality of the target audio signal after fusion.
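
One way to picture the dynamic choice of a splicing point is the sketch below. It rests on assumptions not stated in the abstract: each signal has a per-bin quality score (higher is better), the first signal fills the lower interval and the second the upper one, and the splice bin is the one that maximizes the summed quality. The names `choose_splice_bin` and `splice` are hypothetical.

```python
def choose_splice_bin(quality_first, quality_second):
    """Return the bin index that maximizes total quality when bins below it
    come from the first signal and bins from it upward come from the second."""
    total = sum(quality_second)  # split = 0: every bin taken from the second signal
    best_total, best_split = total, 0
    for split in range(1, len(quality_first) + 1):
        # Moving the splice up by one bin swaps that bin to the first signal.
        total += quality_first[split - 1] - quality_second[split - 1]
        if total > best_total:
            best_total, best_split = total, split
    return best_split


def splice(spectrum_first, spectrum_second, split):
    """Concatenate the lower bins of the first spectrum with the upper bins of the second."""
    return spectrum_first[:split] + spectrum_second[split:]
```

For example, with quality scores `[0.9, 0.8, 0.3, 0.2]` and `[0.4, 0.5, 0.7, 0.9]`, the splice falls at bin 2, so the two lowest bins are taken from the first signal and the rest from the second.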

AUDIO SIGNAL PROCESSING METHOD AND SYSTEM FOR ECHO SUPPRESSION

In an audio signal processing method and system for echo suppression, selection of a target audio processing mode is controlled based on strength of a speaker signal. In the method and system, a control signal is generated based on the strength of the speaker signal, and the target audio processing mode is controlled based on the control signal to perform signal processing on a microphone signal so as to obtain better voice quality. When the speaker signal does not exceed a threshold, the system selects a first mode, and performs signal processing on a first audio signal and a second audio signal to obtain a first target audio; or when the speaker signal exceeds a threshold, the system selects a second mode, and performs signal processing on a second audio signal to obtain a second target audio, and the mode can be switched based on the speaker signal.
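
The threshold-based mode switch can be sketched as follows. The abstract does not say how the strength of the speaker signal is measured, so root-mean-square level over a frame is an assumption here, and the names `rms` and `select_mode` are hypothetical.

```python
import math


def rms(frame):
    """Root-mean-square level of a frame; 0.0 for an empty frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def select_mode(speaker_frame, threshold):
    """Pick the processing mode from the speaker-signal strength.

    "first": strength does not exceed the threshold (both audio signals are used).
    "second": strength exceeds the threshold (only the second audio signal is used).
    """
    return "first" if rms(speaker_frame) <= threshold else "second"
```

Calling `select_mode` once per frame lets the mode follow the speaker signal as it rises and falls.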

Voice isolation system

The disclosure includes a voice isolation system comprising an acoustic echo-cancelation subsystem configured to receive a plurality of input signals, subtract an interference component from the input signals, and provide a plurality of output signals. The system also includes an adaptive beamformer subsystem configured to receive the plurality of output signals from the acoustic echo-cancelation subsystem and compute a signal-to-noise ratio enhanced signal based on the received output signals. The system also includes a residual noise suppressor subsystem configured to attenuate at least one portion of the SNR enhanced signal received from the adaptive beamformer subsystem based on the at least one portion having an SNR below a predetermined SNR threshold. The system also includes an automatic gain control subsystem configured to process a signal outputted from the residual noise suppressor subsystem and transmit a resulting signal as an output signal.
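
The residual noise suppressor stage can be illustrated with a short sketch. The per-band representation, the fixed attenuation factor, and the name `suppress_residual_noise` are assumptions for illustration; the abstract specifies only that portions with an SNR below a predetermined threshold are attenuated.

```python
def suppress_residual_noise(band_values, band_snr_db, snr_threshold_db, attenuation=0.1):
    """Scale down every band whose SNR is below the threshold; pass the rest unchanged."""
    return [
        value * attenuation if snr_db < snr_threshold_db else value
        for value, snr_db in zip(band_values, band_snr_db)
    ]
```

In the chain described above, this step would sit between the adaptive beamformer output and the automatic gain control.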

SIGNAL PROCESSING DEVICE, SIGNAL PROCESSING METHOD, AND PROGRAM
20220189498 · 2022-06-16

A signal processing device includes: an input unit to which a microphone signal including a mixed sound in which a target sound and a sound other than the target sound are mixed and a one-dimensional time-series signal acquired by an auxiliary sensor and synchronized with the target sound are input; and a sound source extraction unit that extracts a target sound signal corresponding to the target sound from the microphone signal on the basis of the one-dimensional time-series signal.

BONE CONDUCTION HEADPHONE SPEECH ENHANCEMENT SYSTEMS AND METHODS
20220189497 · 2022-06-16

Systems and methods for enhancing a headset user's own voice include at least two outside microphones, an inside microphone, audio input components operable to receive and process the microphone signals, a voice activity detector operable to detect speech presence and absence in the received and/or processed signals, and a cross-over module configured to generate an enhanced voice signal. The audio processing components include a low-frequency branch comprising low-pass filter banks, a low-frequency spatial filter, a low-frequency spectral filter, and an equalizer, and a high-frequency branch comprising high-pass filter banks, a high-frequency spatial filter, and a high-frequency spectral filter.

Sound outputting device including plurality of microphones and method for processing sound signal using plurality of microphones

An electronic device and method are disclosed. The electronic device includes a first microphone, a second microphone, a memory, and a processor. The processor implements the method, including: determining whether a voice is detected in a first sound signal detected by the first microphone; determining whether a present recording period is a voice period or a silent period based on the determination; when the present period is the silent period, receiving a second sound signal via the second microphone and analyzing a noise signal included therein; removing noise signals from one of the first and second sound signals based on characteristics of the voice period or the analyzed noise signal; and combining the first and second sound signals into an output signal and transmitting the output signal to an external device.