Patent classifications
G10L2021/02161
APPARATUS AND METHOD FOR PROCESSING AN AUDIO SIGNAL
An apparatus for processing an audio signal includes an audio signal analyzer and a filter. The audio signal analyzer is configured to analyze an audio signal to determine a plurality of noise suppression filter values for a plurality of bands of the audio signal, wherein the analyzer is configured to determine each noise suppression filter value so that it is greater than or equal to a minimum noise suppression filter value, and so that the minimum noise suppression filter value depends on a characteristic of the audio signal. The filter is configured for filtering the audio signal, wherein the filter is adjusted based on the noise suppression filter values.
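The gain-floor idea in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the Wiener-style gain rule, the choice of the broadband noise fraction as the "characteristic of the audio signal", and the names suppression_gains, minimum_gain, floor_low and floor_high. None of these are taken from the patent.

```python
def minimum_gain(band_power, noise_power, floor_low=0.1, floor_high=0.5):
    """Hypothetical mapping: derive the gain floor from the broadband
    noise fraction of the signal (an assumed 'characteristic')."""
    total_power = sum(band_power)
    total_noise = sum(noise_power)
    if total_power <= 0:
        return floor_high
    noise_fraction = min(total_noise / total_power, 1.0)
    return floor_low + (floor_high - floor_low) * noise_fraction


def suppression_gains(band_power, noise_power):
    """Per-band gains, each clamped to be >= the signal-dependent floor."""
    floor = minimum_gain(band_power, noise_power)
    gains = []
    for power, noise in zip(band_power, noise_power):
        if noise <= 0:
            wiener = 1.0  # no noise estimate in this band: pass it through
        else:
            snr = max(power - noise, 0.0) / noise
            wiener = snr / (1.0 + snr)  # assumed Wiener-style gain rule
        gains.append(max(wiener, floor))
    return gains


# Usage: two bands, the second one dominated by noise.
gains = suppression_gains([4.0, 1.0], [1.0, 1.0])
filtered = [g * p for g, p in zip(gains, [4.0, 1.0])]
```

In this sketch the noisy second band is attenuated only down to the floor rather than to zero, which is the behaviour the abstract describes in general terms.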
Earbud speech estimation
Embodiments of the invention determine a speech estimate using a bone conduction sensor or accelerometer, without employing voice activity detection gating of speech estimation. Speech estimation is based either exclusively on the bone conduction signal, or is performed in combination with a microphone signal. The speech estimate is then used to condition an output signal of the microphone. There are multiple use cases for speech processing in audio devices.
NOISE REDUCTION SYSTEM AND METHOD FOR AUDIO DEVICE WITH MULTIPLE MICROPHONES
An audio device has an array of microphones and a voice processing system that obtains a multi-dimensional spatial feature vector comprising at least a correlation of the microphones and a calculation of at least one ratio of energies of the microphones, uses the multi-dimensional feature vector to estimate an energy of near-field speech and background noise, uses a ratio of the near-field speech energy and background noise estimates to estimate a probability of a presence of the near-field speech, adaptively combines signals from the microphones based on the estimated near-field speech presence probability to provide a combined output signal comprising a near-field speech signal and a residual background noise signal, estimates a power spectral density of the residual background noise signal present at the combined output signal using the estimated near-field speech presence probability, and reduces the background noise by using the estimated power spectral density.
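As a rough illustration of the first steps in this abstract, the sketch below computes a two-element spatial feature (inter-microphone correlation and an energy ratio) and maps a speech-to-noise energy ratio to a presence probability. The function names and the ratio-to-probability mapping are assumptions for illustration only; the patent does not specify them here, and the estimation of the speech and noise energies from the features is not shown.

```python
import math


def spatial_features(mic_a, mic_b):
    """Return (normalized correlation, energy ratio) for two microphone frames."""
    energy_a = sum(x * x for x in mic_a)
    energy_b = sum(x * x for x in mic_b)
    cross = sum(x * y for x, y in zip(mic_a, mic_b))
    denom = math.sqrt(energy_a * energy_b)
    correlation = cross / denom if denom > 0 else 0.0
    ratio = energy_a / energy_b if energy_b > 0 else float("inf")
    return correlation, ratio


def speech_presence_probability(speech_energy, noise_energy):
    """Assumed mapping from an energy ratio to a probability in [0, 1]."""
    if noise_energy <= 0:
        return 1.0
    snr = speech_energy / noise_energy
    return snr / (1.0 + snr)


# Usage: mic_b sees the same waveform at twice the amplitude.
corr, ratio = spatial_features([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
prob = speech_presence_probability(3.0, 1.0)
```

A near-field talker typically produces a high correlation and an energy ratio far from one across the array, which is why these two quantities are plausible inputs to the probability estimate.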
Distributed environmental microphones to minimize noise during speech recognition
A device, system, and method whereby a speech-driven system used in an industrial environment distinguishes speech obtained from users of the system from other background sounds. In one aspect, the present system and method provides for a first audio stream from a user microphone collocated with a source of human speech (that is, a user) and a second audio stream from an environmental microphone which is proximate to the source of human speech but more remote than the user microphone. The audio signals from the two microphones are asynchronous. A processor is configured to identify a common, distinctive sound event in the environment, such as an impulse sound or a periodic sound signal. Based on the common sound event, the processor provides for synchronization of the two audio signals. In another aspect, the present system and method provides for a determination of whether or not the sound received at the user microphone is suitable for identification of words in a human voice, based on a comparison of sound elements in the first audio stream and the second audio stream, for example based on a comparison of the sound intensities of the sound elements in the audio streams.
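One way to picture the synchronization and intensity comparison described above is the sketch below. It assumes the common sound event is located by a brute-force cross-correlation search, and it uses a hypothetical energy-ratio threshold (min_ratio) for the suitability check; the names best_lag, align and is_user_speech are illustrative, not from the patent.

```python
def best_lag(user, env, max_lag):
    """Lag d (in samples) maximizing sum(user[i] * env[i + d])."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(user):
            j = i + lag
            if 0 <= j < len(env):
                score += x * env[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def align(env, lag):
    """Shift env so that aligned[i] corresponds to env[i + lag]."""
    return env[lag:] if lag >= 0 else [0.0] * (-lag) + env


def is_user_speech(user, aligned_env, min_ratio=2.0):
    """Assumed rule: the user microphone must be clearly louder."""
    user_energy = sum(x * x for x in user)
    env_energy = sum(x * x for x in aligned_env[:len(user)])
    if env_energy == 0:
        return user_energy > 0
    return user_energy / env_energy >= min_ratio


# Usage: an impulse appears at sample 2 in one stream and sample 5 in the other.
user = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
env = [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0]
lag = best_lag(user, env, max_lag=4)
aligned = align(env, lag)
```

The exhaustive lag search costs O(N × max_lag) and is only meant to show the idea; a practical system would more likely use an FFT-based correlation.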
Adaptive multi-microphone beamforming
Provided is a method and computer program product for producing an enhanced audio signal for an output device from audio signals received by two or more microphones in close proximity to each other. For example, one embodiment of the present invention comprises the steps of receiving a first input audio signal from the first microphone, digitizing the first input audio signal to produce a first digitized audio input signal, receiving a second input audio signal from the second microphone, digitizing the second input audio signal to produce a second digitized audio input signal, using the first digitized audio input signal as a reference signal to an adaptive prediction filter, using the second digitized audio input signal as input to said adaptive prediction filter, and finally adding a prediction result signal from the adaptive prediction filter to the first digitized audio input signal to produce the enhanced audio signal. In other embodiments, any number of microphones can be used, and in all embodiments there is no requirement to detect or locate the source or direction of arrival of the input audio signals.
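The two-microphone embodiment above can be sketched as follows. The abstract does not name the adaptation algorithm, so this sketch assumes a normalized LMS (NLMS) update; the function name enhance and the parameters taps, mu and eps are illustrative choices.

```python
def enhance(first, second, taps=4, mu=0.5, eps=1e-8):
    """Predict `first` from `second` with an NLMS filter (an assumed
    algorithm) and add the prediction to `first`."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent samples of `second`, newest first
    output = []
    for desired, sample in zip(first, second):
        history = [sample] + history[:-1]
        prediction = sum(w * h for w, h in zip(weights, history))
        error = desired - prediction
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm
                   for w, h in zip(weights, history)]
        # Components common to both microphones are predictable and add up.
        output.append(desired + prediction)
    return output


# Usage: both microphones capture the same alternating signal.
signal = [1.0, -1.0] * 50
enhanced = enhance(signal, signal)
```

Once the filter has converged on this toy input, the output approaches twice the first signal, while components that the second microphone cannot predict (uncorrelated noise) would not be reinforced in the same way.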
Voice enhancement method for distributed system
A voice enhancement method for a distributed system is disclosed. In the method, a plurality of picking devices are disposed in a space to pick up voice signals. The picking devices communicate with each other and perform an enhancement operation on the voice information from each picking device to generate an enhanced voice signal.
Noise Mitigation for a Voice Interface Device
A method at an electronic device with one or more microphones and a speaker, the electronic device configured to be responsive to any of a plurality of affordances including a voice-based affordance, includes determining background noise of an environment associated with the electronic device, and before detecting the voice-based affordance: determining whether the background noise would interfere with recognition of the hotword in voice inputs detected by the electronic device, and if so, indicating to a user to use an affordance other than the voice-based affordance.
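The decision described above might look like the following sketch. The level-based interference test, the default values expected_speech_db and min_snr_db, and the hint text are all assumptions; the abstract does not say how interference is determined.

```python
import math


def level_db(samples):
    """Mean-square level in dB relative to full scale (samples in [-1, 1]).
    Assumes a non-empty sample list."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def affordance_hint(noise_samples, expected_speech_db=-20.0, min_snr_db=10.0):
    """Return a hint if the noise would likely mask the hotword, else None.
    Both thresholds are hypothetical defaults."""
    snr_db = expected_speech_db - level_db(noise_samples)
    if snr_db < min_snr_db:
        return "It is noisy here: press the button instead of speaking."
    return None


# Usage: a quiet room gives no hint, a loud one does.
quiet_hint = affordance_hint([0.001] * 100)
loud_hint = affordance_hint([0.5] * 100)
```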
Adaptive nullforming for selective audio pick-up
Audio pickup systems and methods are provided to enhance an audio signal by removing noise components related to an acoustic environment. The systems and methods receive a primary signal and a reference signal. The reference signal is adaptively filtered and subtracted from the primary signal to minimize an energy content of a resulting output signal.
Audio signal processing in a vehicle
The present invention relates to a method for audio signal processing in a vehicle. In order to allow simple and reliable echo cancellation for voice recognition during simultaneous reproduction of a multichannel audio source signal in a vehicle, a mono audio signal is generated on the basis of a multichannel audio source signal. The mono audio signal is limited to a frequency range between a prescribed lower frequency and a prescribed upper frequency, for example to a range from 100 Hz to 8 kHz. The limited mono audio signal is output via multiple loudspeakers in the vehicle. An influence of the limited mono audio signal that is output via the multiple loudspeakers on a voice audio signal received in the vehicle via a microphone is compensated for by means of the limited mono audio signal in an echo canceller.
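The downmix and band-limiting steps can be sketched as below. The averaging downmix and the crude one-pole high-pass and low-pass filters are stand-ins chosen for brevity, not the filters the patent uses; the 100 Hz and 8 kHz defaults follow the example range in the abstract. The echo canceller itself is not shown.

```python
import math


def downmix_to_mono(frames):
    """Average the channels of each frame (an assumed downmix rule)."""
    return [sum(frame) / len(frame) for frame in frames]


def band_limit(samples, sample_rate, low_hz=100.0, high_hz=8000.0):
    """Cascade a one-pole low-pass (corner high_hz) with a one-pole
    high-pass (corner low_hz). A crude illustrative band-pass."""
    dt = 1.0 / sample_rate
    rc_lp = 1.0 / (2.0 * math.pi * high_hz)
    rc_hp = 1.0 / (2.0 * math.pi * low_hz)
    a_lp = dt / (rc_lp + dt)
    a_hp = rc_hp / (rc_hp + dt)
    lp = hp = prev_lp = 0.0
    out = []
    for x in samples:
        lp += a_lp * (x - lp)            # low-pass stage
        hp = a_hp * (hp + lp - prev_lp)  # high-pass stage on the low-passed signal
        prev_lp = lp
        out.append(hp)
    return out


# Usage: downmix a short stereo clip, then band-limit it at 48 kHz.
mono = downmix_to_mono([(1.0, 3.0), (0.0, 0.0)])
limited = band_limit(mono, sample_rate=48000)
```

Feeding the same limited mono signal both to the loudspeakers and to the echo canceller means the canceller needs only a single reference, which is the simplification the abstract is after.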
Noise mitigation for a voice interface device
A method at an electronic device with one or more microphones and a speaker, the electronic device configured to be awakened by any of a plurality of affordances including a voice-based affordance, includes determining a noise profile of an environment around the electronic device; determining whether the noise profile interferes with the voice-based affordance; and in accordance with a determination that the noise profile interferes with the voice-based affordance, presenting a hint to a user to use an affordance of the plurality of affordances other than the voice-based affordance to awaken the electronic device.