G10L2021/02166

Audio signal processing for noise reduction

A headphone, headphone system, and speech-enhancing method are provided to enhance speech pick-up from the user of a headphone. The method includes receiving a plurality of signals from a set of microphones and generating a primary signal by array-processing the microphone signals to steer a beam toward the user's mouth. A noise reference signal is also derived from one or more of the microphones via a delay-and-sum technique, and a voice estimate signal is generated by filtering the primary signal to remove components that are correlated with the noise reference signal.
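Delay-and-sum beamforming, the generic technique named in this abstract, can be sketched in a few lines of numpy. This is a minimal illustration with integer sample delays, not the patent's actual implementation:

```python
import numpy as np

def delay_and_sum(signals, delays_samples):
    """Steer a beam by advancing each microphone channel so the target
    direction aligns, then averaging across channels.

    signals: (num_mics, num_samples) array
    delays_samples: per-mic arrival delay (in samples) for the target direction
    """
    num_mics, n = signals.shape
    out = np.zeros(n)
    for sig, d in zip(signals, delays_samples):
        out += np.roll(sig, -d)  # undo the propagation delay for this mic
    return out / num_mics

# Two mics; the target wavefront reaches mic 1 one sample after mic 0
t = np.arange(64)
target = np.sin(2 * np.pi * t / 16)
mics = np.stack([target, np.roll(target, 1)])
beam = delay_and_sum(mics, [0, 1])
```

After alignment the two channels add coherently, so the beam output reproduces the target; uncorrelated noise on each channel would instead average down by roughly the number of microphones.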

Modeling and Reduction of Drone Propulsion System Noise

In some embodiments, a method, apparatus, and computer program reduce noise from an audio signal captured by a drone (e.g., cancel the noise signature of the drone from the audio signal) using a model of noise emitted by the drone's propulsion system set. The propulsion system set includes one or more propulsion systems, each including an electric motor, and the noise reduction is performed in response to voltage data indicative of the instantaneous voltage supplied to each electric motor. In other embodiments, a method, apparatus, and computer program generate a noise model by determining the noise signature of at least one drone from a database of noise signals corresponding to at least one propulsion system, and cancel the drone's noise signature in an audio signal based on the noise model.
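One simple way to realize a voltage-driven propulsion noise model is to map voltage to a rotation rate and synthesize a harmonic series to subtract from the captured audio. The linear voltage-to-frequency constant and the decaying harmonic weights below are assumptions for illustration, not values from the patent:

```python
import numpy as np

# Hypothetical linear voltage-to-rotation-rate constant (Kv-style); assumed.
KV_HZ_PER_VOLT = 20.0

def motor_noise_model(voltage, n_harmonics, t):
    """Synthesize a harmonic noise estimate from instantaneous motor voltage."""
    f0 = KV_HZ_PER_VOLT * voltage  # fundamental tracks the supply voltage
    noise = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        noise += np.cos(2 * np.pi * k * f0 * t) / k  # decaying harmonic series
    return noise

def cancel_motor_noise(audio, voltage, t, n_harmonics=3):
    """Subtract the modeled propulsion noise from the captured audio."""
    return audio - motor_noise_model(voltage, n_harmonics, t)

fs = 8000
t = np.arange(fs) / fs
speech = 0.1 * np.sin(2 * np.pi * 300 * t)
noisy = speech + motor_noise_model(5.0, 3, t)  # 5 V -> 100 Hz fundamental
clean = cancel_motor_noise(noisy, 5.0, t)
```

In practice the model's amplitude and phase would have to be adapted to the recording (e.g., by least-squares fitting per harmonic); the sketch assumes they are known exactly.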

MICROPHONE ARRAY SPEECH ENHANCEMENT
20180012616 · 2018-01-11

Speech received from a microphone array is enhanced. In one example, a noise filtering system receives audio from the plurality of microphones, determines a beamformer output from the received audio, applies a first auto-regressive moving-average (ARMA) smoothing filter to the beamformer output, determines noise estimates from the received audio, applies a second ARMA smoothing filter to the noise estimates, and combines the two smoothing filter outputs to produce a power spectral density output of the received audio with reduced noise.
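A first-order ARMA smoother of the kind this abstract applies to power estimates can be written directly from its difference equation. The coefficients below are illustrative, not taken from the patent:

```python
import numpy as np

def arma_smooth(x, ar=0.9, ma=(0.05, 0.05)):
    """First-order auto-regressive moving-average smoother:
    y[n] = ar * y[n-1] + ma[0] * x[n] + ma[1] * x[n-1].
    With ar + ma[0] + ma[1] = 1 the DC gain is unity."""
    y = np.zeros(len(x))
    prev_x = 0.0
    for n, xn in enumerate(x):
        y[n] = ar * (y[n - 1] if n else 0.0) + ma[0] * xn + ma[1] * prev_x
        prev_x = xn
    return y

# Smooth a noisy, nominally constant power estimate toward its mean
rng = np.random.default_rng(0)
noisy_psd = 1.0 + 0.1 * rng.standard_normal(500)
smoothed = arma_smooth(noisy_psd)
```

Because the filter's DC gain is 1, the smoothed estimate converges to the true power level while its sample-to-sample variance is greatly reduced, which is the point of smoothing both the beamformer and noise PSD tracks before combining them.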

MICROPHONE ARRAY NOISE SUPPRESSION USING NOISE FIELD ISOTROPY ESTIMATION
20180012617 · 2018-01-11

Noise is suppressed from a microphone array output by estimating the isotropy of the noise field. In some examples, audio is received from a plurality of microphones. A power spectral density of the beamformer output is determined, as is a power spectral density of the microphone noise differences. A noise power spectral density is then computed using a transfer function, and that noise power spectral density is applied to the beamformer output power spectral density to produce a power spectral density output of the received audio with reduced noise.
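The abstract does not spell out how the noise PSD is "applied" to the beamformer PSD; one common choice is spectral subtraction with a spectral floor, sketched here as an assumption:

```python
import numpy as np

def suppress_noise_psd(beam_psd, noise_psd, floor=1e-3):
    """Subtract a noise PSD estimate from the beamformer output PSD,
    clamping each bin to a small fraction of the input power so the
    result never goes negative (basic spectral subtraction)."""
    clean = beam_psd - noise_psd
    return np.maximum(clean, floor * beam_psd)

beam_psd = np.array([2.0, 1.5, 0.4, 0.3])   # per-bin beamformer power
noise_psd = np.array([0.5, 0.5, 0.5, 0.5])  # per-bin noise power estimate
out = suppress_noise_psd(beam_psd, noise_psd)
```

Bins dominated by noise (where the subtraction would go negative) are clamped to the floor rather than zeroed, which reduces musical-noise artifacts in a subsequent resynthesis.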

Audio data processing method, apparatus and storage medium for detecting wake-up words based on multi-path audio from microphone array

An audio data processing method is provided. The method includes: obtaining multi-path audio data in an environmental space, obtaining a speech data set based on the multi-path audio data, and separately generating, in a plurality of enhancement directions, enhanced speech information corresponding to the speech data set; matching a speech hidden feature in the enhanced speech information against a target matching word, and determining the enhancement direction whose enhanced speech information has the highest degree of matching with the target matching word as the target audio direction; obtaining speech spectrum features from the enhanced speech information, and selecting from them the speech spectrum feature in the target audio direction; and performing speech authentication on the speech hidden feature and the speech spectrum feature in the target audio direction, based on the target matching word, to obtain a target authentication result.
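The direction-selection step reduces to an argmax over per-direction wake-word match scores. The direction labels and scores below are hypothetical placeholders for whatever the matching model produces:

```python
def pick_target_direction(match_scores):
    """Return the enhancement direction whose enhanced speech best matches
    the target wake word (argmax over per-direction match scores).

    match_scores: dict mapping direction label -> match score in [0, 1]
    """
    return max(match_scores, key=match_scores.get)

# Hypothetical scores for four beam directions after wake-word matching
scores = {"0deg": 0.21, "90deg": 0.87, "180deg": 0.35, "270deg": 0.10}
direction = pick_target_direction(scores)  # -> "90deg"
```

Only the spectrum features from that winning direction are then passed to the second-stage authentication, so the cheap matching pass acts as a spatial gate for the more expensive verification.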

Customized automated audio tuning

An example method of operation may include: identifying, in a particular room environment, a number of speakers and one or more microphones on a network controlled by a controller and amplifier; providing test signals to be played sequentially from each amplifier channel and each speaker; monitoring the test signals at the one or more microphones simultaneously to detect operational speakers and amplifier channels; providing additional test signals to the speakers to determine tuning parameters; detecting the additional test signals at the one or more microphones controlled by the controller; and automatically establishing a background noise level and noise spectrum of the room environment based on the detected additional test signals.
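The final step, establishing a background noise level and spectrum, amounts to measuring the RMS level and a coarse band spectrum of a microphone capture taken between test signals. This is a generic sketch, with the band count chosen arbitrarily:

```python
import numpy as np

def noise_floor(mic_samples, n_bands=4):
    """Estimate the background noise level (dBFS RMS) and a coarse
    band-averaged power spectrum from a block of microphone samples."""
    rms = np.sqrt(np.mean(mic_samples ** 2))
    level_db = 20 * np.log10(max(rms, 1e-12))      # guard against log(0)
    spectrum = np.abs(np.fft.rfft(mic_samples)) ** 2
    bands = np.array_split(spectrum, n_bands)       # equal-width bands
    band_power = np.array([b.mean() for b in bands])
    return level_db, band_power

# One second of simulated quiet-room noise at 16 kHz, ~ -40 dBFS
rng = np.random.default_rng(1)
quiet_room = 0.01 * rng.standard_normal(16000)
level, bands = noise_floor(quiet_room)
```

A tuning system would compare per-band test-signal power against this floor to decide which corrections are measurable and which would sit below the room's ambient noise.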

Audio-based detection and tracking of emergency vehicles

Techniques are provided for audio-based detection and tracking of an acoustic source. A methodology implementing the techniques according to an embodiment includes generating acoustic signal spectra from signals provided by a microphone array, and performing beamforming on the acoustic signal spectra to generate beam signal spectra, using time-frequency masks to reduce noise. The method also includes detecting, by a deep neural network (DNN) classifier, an acoustic event associated with the acoustic source in the beam signal spectra; the DNN is trained on acoustic features associated with the acoustic event. The method further includes performing pattern extraction, in response to the detection, to identify the time-frequency bins of the acoustic signal spectra that are associated with the acoustic event, and estimating the motion direction of the source relative to the microphone array based on the Doppler frequency shift of the acoustic event, calculated from the time-frequency bins of the extracted pattern.
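The motion-direction step rests on the standard Doppler relation f_obs = f_src · c / (c − v) for a source moving at radial speed v toward the microphones. Inverting it gives the sign (approach/recede) and speed directly; the siren frequencies below are illustrative:

```python
def doppler_direction(f_observed, f_source, c=343.0):
    """Infer approach/recede and radial speed from a Doppler shift.

    Uses f_obs = f_src * c / (c - v), so v = c * (1 - f_src / f_obs);
    v > 0 means the source is moving toward the microphone array.
    """
    v = c * (1.0 - f_source / f_observed)
    if v > 0:
        direction = "approaching"
    elif v < 0:
        direction = "receding"
    else:
        direction = "stationary"
    return direction, v

# A 700 Hz siren tone observed at 730 Hz: upward shift => approaching
direction, speed = doppler_direction(730.0, 700.0)
```

In the patented pipeline, f_observed would come from tracking the event's time-frequency bins over successive frames and f_source from the known (or slowly varying) siren fundamental.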

ACOUSTIC OUTPUT APPARATUS

The present disclosure discloses an acoustic output apparatus including at least one acoustic driver, a controller, and a supporting structure. The at least one acoustic driver may be configured to output sounds through at least two sound guiding holes. The at least two sound guiding holes may include a first sound guiding hole and a second sound guiding hole. The controller may be configured to control a phase and an amplitude of the sounds generated by the at least one acoustic driver using a control signal such that the sounds output by the at least one acoustic driver through the first and second sound guiding holes have opposite phases. The supporting structure may be provided with a baffle and configured to support the at least one acoustic driver such that the first and second sound guiding holes are located on both sides of the baffle.

Voice controlled assistant with coaxial speaker and microphone arrangement
11521624 · 2022-12-06

A voice controlled assistant has a housing to hold one or more microphones, one or more speakers, and various computing components. The housing has an elongated cylindrical body extending along a center axis between a base end and a top end. The microphone(s) are mounted in the top end and the speaker(s) are mounted proximal to the base end. The microphone(s) and speaker(s) are coaxially aligned along the center axis. The speaker(s) are oriented to output sound directionally toward the base end and opposite to the microphone(s) in the top end. The sound may then be redirected radially outward from the center axis at the base end so that the sound is output symmetric to, and equidistant from, the microphone(s).

Detecting self-generated wake expressions

A speech-based audio device may be configured to detect a user-uttered wake expression. For example, the audio device may generate one or more parameters indicating whether output audio is currently being produced by an audio speaker, whether the output audio contains speech, whether the output audio contains a predefined expression, the loudness of the output audio, the loudness of the input audio, and/or an echo characteristic. Based on the parameters, the audio device may determine whether an occurrence of the predefined expression in the input audio is the result of an utterance of the predefined expression by a user.
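A decision of this kind can be sketched as a simple rule over the listed parameters: if the device's own speaker was playing the wake expression and the microphone input is not meaningfully louder than the expected echo, treat the detection as self-generated. The 6 dB margin and the function itself are hypothetical, not from the patent:

```python
def is_self_generated(speaker_active, output_has_expression,
                      echo_level_db, input_level_db, margin_db=6.0):
    """Heuristic: a detected wake word is likely self-generated when the
    device's own output contained the expression and the input level is
    not sufficiently above the expected echo level."""
    if not speaker_active or not output_has_expression:
        return False  # nothing the device played could explain the detection
    return input_level_db < echo_level_db + margin_db

# Device is playing the wake word; mic input barely exceeds the echo
ignore = is_self_generated(True, True, echo_level_db=-20.0, input_level_db=-18.0)
```

A real device would combine several such parameters (speech presence, echo characteristics, loudness ratios) rather than a single threshold, but the gating structure is the same: suppress the wake trigger only when the output audio plausibly explains it.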