Patent classifications
G10L2021/02166
Linear filtering for noise-suppressed speech detection via multiple network microphone devices
Systems and methods for suppressing noise and detecting voice input in a multi-channel audio signal captured by two or more network microphone devices include receiving an instruction to process one or more audio signals captured by a first network microphone device and after receiving the instruction (i) disabling at least a first microphone of a plurality of microphones of a second network microphone device, (ii) capturing a first audio signal via a second microphone of the plurality of microphones, (iii) receiving over a network interface of the second network microphone device a second audio signal captured via at least a third microphone of the first network microphone device, (iv) using estimated noise content to suppress first and second noise content in the first and second audio signals, (v) combining the suppressed first and second audio signals into a third audio signal, and (vi) determining that the third audio signal includes a voice input comprising a wake word.
AUDIO SYSTEMS AND METHODS FOR VOICE ACTIVITY DETECTION
Audio systems, methods, and processor instructions are provided that detect voice activity of a user and provide an output voice signal. The systems, methods, and instructions receive a plurality of microphone signals and combine the plurality of microphone signals according to a first combination and a second combination. The first combination produces a primary signal having enhanced response in the direction of the user's mouth, and the second combination produces a reference signal having reduced response in the direction of the user's mouth. The primary signal and the reference signal are added and subtracted to produce a voice-enhanced signal and a voice-reduced signal, respectively. The voice-enhanced signal and the voice-reduced signal are compared, and an output voice signal is provided based upon the comparison.
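The add, subtract, and compare steps described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the two signals are compared; the frame-energy ratio test, the `threshold_ratio` parameter, and the function names below are assumptions made for illustration only.

```python
from typing import Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def detect_voice(primary: Sequence[float],
                 reference: Sequence[float],
                 threshold_ratio: float = 2.0) -> bool:
    """Compare voice-enhanced and voice-reduced signals for one frame.

    The energy-ratio comparison and threshold_ratio are hypothetical;
    the abstract only states that the two signals are compared.
    """
    if len(primary) == 0 or len(primary) != len(reference):
        raise ValueError("frames must be non-empty and of equal length")
    # Sum gives the voice-enhanced signal, difference the voice-reduced one.
    enhanced = [p + r for p, r in zip(primary, reference)]
    reduced = [p - r for p, r in zip(primary, reference)]
    return frame_energy(enhanced) > threshold_ratio * frame_energy(reduced)
```

In this sketch a frame is flagged as voice when the summed signal carries clearly more energy than the difference signal, which happens when the primary and reference frames share a strongly correlated component.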
Dynamic control of multiple feedforward microphones in active noise reduction devices
Technology described in this document can be embodied in an earpiece of an active noise reduction (ANR) device. The earpiece includes a plurality of microphones, wherein each of the plurality of microphones is usable for capturing ambient audio to generate input signals for both an ANR mode of operation and a hear-through mode of operation of the ANR device. The earpiece further includes a controller configured to: process a first subset of microphones from the plurality of microphones to generate input signals for the ANR mode of operation, process a second subset of microphones from the plurality of microphones to generate input signals for the hear-through mode of operation, detect that a particular microphone of the second subset is acoustically coupled to an acoustic transducer of the ANR device in the hear-through mode of operation, and in response to the detection, process the input signals from the second subset of microphones without using input signals from the particular microphone.
HEARING DEVICE COMPRISING A TRANSMITTER
A hearing device, e.g. a hearing aid, is configured to be arranged at least partly on a user’s head or at least partly implanted in a user’s head. The hearing device comprises a) at least one input transducer for picking up an input sound signal from the environment and providing at least one electric input signal representing said input sound signal; b) a signal processor connected to the at least one input transducer, the signal processor being configured to analyze the electric input signal and to provide a transmit control signal in dependence thereof; c) a memory buffer, e.g. a cyclic buffer, for storing a current time segment of a certain duration of said at least one electric input signal, or a processed version thereof; and d) a transmitter for transmitting at least a part of said time segment, or a processed version thereof, to an external device in dependence of said transmit control signal.
Cascade Architecture for Noise-Robust Keyword Spotting
A method (400) includes receiving, at a first processor (110) of a user device (102), streaming multi-channel audio (118) captured by an array of microphones (107), each channel (119) including respective audio features. For each channel, the method also includes processing, by the first processor, using a first stage hotword detector (210), the respective audio features to determine whether a hotword is detected. When the first stage hotword detector detects the hotword, the method also includes the first processor providing chomped raw audio data (212) to a second processor that processes, using a first noise cleaning algorithm (250), the chomped raw audio data to generate a clean monophonic audio chomp (260). The method also includes processing, by the second processor using a second stage hotword detector (220), the clean monophonic audio chomp to detect the hotword.
AUDIO DEVICE WITH DISTRACTOR ATTENUATOR
An audio device comprising an interface, memory, and a processor is disclosed. The audio device is configured to process a first microphone input signal and a second microphone input signal for provision of an output audio signal, and to output the output audio signal. To process the microphone input signals, the audio device is configured to: determine a first distractor indicator based on features associated with the input signals; determine a first distractor attenuation parameter based on the first distractor indicator; determine a second distractor indicator based on one or more features associated with the first microphone input signal and the second microphone input signal; determine a second distractor attenuation parameter based on the second distractor indicator; determine an attenuator gain based on the first distractor attenuation parameter and the second gain compensation parameter; and apply a noise suppression scheme to a first beamforming output signal according to the attenuator gain for provision of the output audio signal.
Bone conduction headphone speech enhancement systems and methods
Systems and methods for enhancing a headset user's own voice include at least two outside microphones, an inside microphone, audio input components operable to receive and process the microphone signals, a voice activity detector operable to detect speech presence and absence in the received and/or processed signals, and a cross-over module configured to generate an enhanced voice signal. The audio processing components include a low frequency branch comprising low pass filter banks, a low frequency spatial filter, a low frequency spectral filter and an equalizer, and a high frequency branch comprising high pass filter banks, a high frequency spatial filter, and a high frequency spectral filter.
METHOD FOR REDUCING RESIDUAL ECHO AND ELECTRONIC DEVICE USING THE SAME
Disclosed is a method for reducing residual echo including: performing an echo cancellation process on a voice input signal according to an echo reference signal to obtain an echo cancellation signal; performing an FFT on the echo reference signal to obtain a reference spectrum signal for each frame; performing the FFT on the echo cancellation signal to obtain a speech spectrum signal for each frame; using the reference spectrum signal and the speech spectrum signal of a current frame to obtain an a priori signal-to-noise ratio of the current frame according to a principle of additive noise; filtering the speech spectrum signal of the current frame by a Wiener filter coefficient of the current frame determined by the a priori signal-to-noise ratio of the current frame to obtain a target spectrum signal of each frame; and performing an IFFT on the target spectrum signal of each frame to obtain a target voice signal.
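The per-frame filtering step can be sketched as follows. The abstract does not give the exact a priori signal-to-noise estimator, so this sketch assumes that the residual echo power in each frequency bin is a fixed fraction (`leakage`, a hypothetical parameter) of the reference power, and that the clean speech power is the remainder under the additive-noise model. The FFT and IFFT steps are omitted; the inputs are already complex spectra for one frame.

```python
from typing import List, Sequence


def wiener_residual_echo_frame(speech_spec: Sequence[complex],
                               ref_spec: Sequence[complex],
                               leakage: float = 0.1,
                               floor: float = 1e-12) -> List[complex]:
    """Apply a per-bin Wiener gain to one frame of the echo-cancelled spectrum.

    leakage and floor are assumed values, not taken from the source.
    """
    if len(speech_spec) != len(ref_spec):
        raise ValueError("spectra must have the same number of bins")
    target = []
    for y, r in zip(speech_spec, ref_spec):
        # Assumed residual-echo (noise) power in this bin.
        noise_pow = leakage * abs(r) ** 2 + floor
        # Additive-noise model: observed power = speech power + noise power.
        clean_pow = max(abs(y) ** 2 - noise_pow, 0.0)
        xi = clean_pow / noise_pow      # a priori SNR estimate
        gain = xi / (1.0 + xi)          # Wiener filter coefficient
        target.append(gain * y)
    return target
```

Bins where the reference is silent pass through almost unchanged, while bins dominated by the assumed echo power are attenuated toward zero.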
Voice Awakening Method and Apparatus, Device, and Medium
A voice awakening method and apparatus, a device, and a medium are provided. The voice awakening method includes: collecting a voice signal by using a microphone array (S201); separately performing denoising processing on the voice signal by using N beamforming denoising algorithms to obtain N denoising signals, where each beamforming denoising algorithm corresponds to one of N areas, different beamforming denoising algorithms correspond to different areas, a union set of the N areas covers a signal collection area of the microphone array, and N is a positive integer greater than 1 (S202); and performing voice awakening by using an awakening engine based on at least one of the N denoising signals (S203). The method can maximize denoising performance of the denoising algorithms and can cancel an echo outside a beam to enhance echo cancellation performance, featuring a relatively high awakening rate and recognition rate. In addition, the method is easy to implement and popularize.
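The three steps S201 to S203 can be sketched in outline. The beamforming denoisers and the awakening engine are passed in as plain callables because the abstract does not specify them; `wake_from_beams`, `wake_score`, and `threshold` are hypothetical names, and choosing the highest-scoring beam is only one possible way to use "at least one of the N denoising signals".

```python
from typing import Callable, List, Sequence, Tuple

Frames = Sequence[Sequence[float]]          # channels x samples
Beamformer = Callable[[Frames], List[float]]


def wake_from_beams(mic_frames: Frames,
                    beamformers: Sequence[Beamformer],
                    wake_score: Callable[[List[float]], float],
                    threshold: float = 0.5) -> Tuple[bool, int]:
    """Run N area-specific denoisers and wake on the best-scoring output.

    Returns (woke, index_of_best_beam). threshold is an assumed value.
    """
    if len(beamformers) < 2:
        raise ValueError("N must be greater than 1")
    # S202: one denoised signal per beamforming algorithm (one per area).
    scores = [wake_score(bf(mic_frames)) for bf in beamformers]
    # S203: let the awakening decision rest on the strongest beam.
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best] >= threshold, best
```

A caller would supply one beamformer per spatial area so that, taken together, the areas cover the array's pickup region.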
APPARATUS AND METHODS FOR CANCELLING THE NOISE OF A SPEAKER FOR SPEECH RECOGNITION
The present disclosure relates to an apparatus for cancelling a noise signal for speech recognition. The apparatus includes one or more microphones configured on a mesh enclosure to receive a first set of signals pertaining to a user command. A speaker located in the mesh enclosure is configured to generate a second set of signals pertaining to a noise signal, wherein each of the one or more microphones is arranged perpendicularly above the speaker at predefined degrees to cancel the generated second set of signals reaching the one or more microphones. A processor is configured to process the received first set of signals by cancellation of the second set of signals and to enable, on receipt of the first set of signals, an operational mode of the apparatus to execute a corresponding action.