G10L21/0232

MICROPHONE ARRAY SPEECH ENHANCEMENT
20180012616 · 2018-01-11 ·

Speech received from a microphone array is enhanced. In one example, a noise filtering system receives audio from the plurality of microphones, determines a beamformer output from the received audio, applies a first auto-regressive moving average smoothing filter to the beamformer output, determines noise estimates from the received audio, applies a second auto-regressive moving average smoothing filter to the noise estimates, and combines the first and second smoothing filter outputs to produce a power spectral density output of the received audio with reduced noise.

MICROPHONE ARRAY SPEECH ENHANCEMENT
20180012616 · 2018-01-11 ·

Speech received from a microphone array is enhanced. In one example, a noise filtering system receives audio from the plurality of microphones, determines a beamformer output from the received audio, applies a first auto-regressive moving average smoothing filter to the beamformer output, determines noise estimates from the received audio, applies a second auto-regressive moving average smoothing filter to the noise estimates, and combines the first and second smoothing filter outputs to produce a power spectral density output of the received audio with reduced noise.

MICROPHONE ARRAY NOISE SUPPRESSION USING NOISE FIELD ISOTROPY ESTIMATION
20180012617 · 2018-01-11 ·

Noise is suppressed from a microphone array by estimating a noise field isotropy. In some examples audio is received from a plurality of microphones. A power spectral density of a beamformer output is determined and a power spectral density of microphone noise differences is determined. A noise power spectral density is determined using a transfer function and the noise power spectral density is applied to the beamformer output power spectral density to produce a power spectral density output of the received audio with reduced noise.

MICROPHONE ARRAY NOISE SUPPRESSION USING NOISE FIELD ISOTROPY ESTIMATION
20180012617 · 2018-01-11 ·

Noise is suppressed from a microphone array by estimating a noise field isotropy. In some examples audio is received from a plurality of microphones. A power spectral density of a beamformer output is determined and a power spectral density of microphone noise differences is determined. A noise power spectral density is determined using a transfer function and the noise power spectral density is applied to the beamformer output power spectral density to produce a power spectral density output of the received audio with reduced noise.

Audio data processing method, apparatus and storage medium for detecting wake-up words based on multi-path audio from microphone array

An audio data processing method is provided. The method includes: obtaining multi-path audio data in an environmental space, obtaining a speech data set based on the multi-path audio data, and separately generating, in a plurality of enhancement directions, enhanced speech information corresponding to the speech data set; matching a speech hidden feature in the enhanced speech information with a target matching word, and determining an enhancement direction corresponding to the enhanced speech information having a highest degree of matching with the target matching word as a target audio direction; obtaining speech spectrum features in the enhanced speech information, and obtaining, from the speech spectrum features, a speech spectrum feature in the target audio direction; and performing speech authentication on the speech hidden feature and the speech spectrum feature that are in the target audio direction based on the target matching word, to obtain a target authentication result.

Audio data processing method, apparatus and storage medium for detecting wake-up words based on multi-path audio from microphone array

An audio data processing method is provided. The method includes: obtaining multi-path audio data in an environmental space, obtaining a speech data set based on the multi-path audio data, and separately generating, in a plurality of enhancement directions, enhanced speech information corresponding to the speech data set; matching a speech hidden feature in the enhanced speech information with a target matching word, and determining an enhancement direction corresponding to the enhanced speech information having a highest degree of matching with the target matching word as a target audio direction; obtaining speech spectrum features in the enhanced speech information, and obtaining, from the speech spectrum features, a speech spectrum feature in the target audio direction; and performing speech authentication on the speech hidden feature and the speech spectrum feature that are in the target audio direction based on the target matching word, to obtain a target authentication result.

APPROACH FOR DETECTING ALERT SIGNALS IN CHANGING ENVIRONMENTS
20180014112 · 2018-01-11 ·

In an audio system, an audio signal is preprocessed to provide an input signal to a fast detector and a slow detector, the input signal comprising alert signals and ambient sounds. The slow detector determines the ambient sound level of the input signal which is output to an alert signal detector. The alert signal detector uses the ambient sound level to compute an adaptive threshold level using an adaptive threshold function. The fast detector determines the envelope level of the input signal which is output to the alert signal detector. The alert signal detector compares the envelope level to the adaptive threshold level to determine if an alert signal is present in the input signal. The adaptive threshold level varies depending on the ambient sound level of the input signal and the alert signal detection of the audio system automatically adapts to changing acoustic environments having different ambient sound levels.

APPROACH FOR DETECTING ALERT SIGNALS IN CHANGING ENVIRONMENTS
20180014112 · 2018-01-11 ·

In an audio system, an audio signal is preprocessed to provide an input signal to a fast detector and a slow detector, the input signal comprising alert signals and ambient sounds. The slow detector determines the ambient sound level of the input signal which is output to an alert signal detector. The alert signal detector uses the ambient sound level to compute an adaptive threshold level using an adaptive threshold function. The fast detector determines the envelope level of the input signal which is output to the alert signal detector. The alert signal detector compares the envelope level to the adaptive threshold level to determine if an alert signal is present in the input signal. The adaptive threshold level varies depending on the ambient sound level of the input signal and the alert signal detection of the audio system automatically adapts to changing acoustic environments having different ambient sound levels.

Customized automated audio tuning

An example method of operation may include identifying, in a particular room environment, a number of speakers and one or more microphones on a network controlled by a controller and amplifier, providing test signals to play sequentially from each amplifier channel of the amplifier and the speakers, monitoring the test signals from the one or more microphones simultaneously to detect operational speakers and amplifier channels, providing additional test signals to the speakers to determine tuning parameters, detecting the additional test signals at the one or more microphones controlled by the controller, and automatically establishing a background noise level and noise spectrum of the room environment based on the detected additional test signals.

Customized automated audio tuning

An example method of operation may include identifying, in a particular room environment, a number of speakers and one or more microphones on a network controlled by a controller and amplifier, providing test signals to play sequentially from each amplifier channel of the amplifier and the speakers, monitoring the test signals from the one or more microphones simultaneously to detect operational speakers and amplifier channels, providing additional test signals to the speakers to determine tuning parameters, detecting the additional test signals at the one or more microphones controlled by the controller, and automatically establishing a background noise level and noise spectrum of the room environment based on the detected additional test signals.