Patent classifications
G10L21/0264
APPROACH FOR DETECTING ALERT SIGNALS IN CHANGING ENVIRONMENTS
In an audio system, an audio signal is preprocessed to provide an input signal to a fast detector and a slow detector, the input signal comprising alert signals and ambient sounds. The slow detector determines the ambient sound level of the input signal, which is output to an alert signal detector. The alert signal detector uses the ambient sound level to compute an adaptive threshold level using an adaptive threshold function. The fast detector determines the envelope level of the input signal, which is also output to the alert signal detector. The alert signal detector compares the envelope level to the adaptive threshold level to determine whether an alert signal is present in the input signal. Because the adaptive threshold level varies with the ambient sound level of the input signal, the alert signal detection of the audio system automatically adapts to changing acoustic environments having different ambient sound levels.
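The fast-detector, slow-detector, and adaptive-threshold arrangement described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the one-pole smoothers, the coefficient values, the `margin` multiplier, the `floor` value, and the names `OnePole` and `detect_alerts`. The adaptive threshold function is assumed to be a simple multiple of the ambient level.

```python
from dataclasses import dataclass


@dataclass
class OnePole:
    """One-pole smoother of the rectified signal; a coeff near 1 reacts slowly."""
    coeff: float
    level: float = 0.0

    def update(self, sample: float) -> float:
        self.level = self.coeff * self.level + (1.0 - self.coeff) * abs(sample)
        return self.level


def detect_alerts(samples, fast_coeff=0.5, slow_coeff=0.995, margin=4.0, floor=0.01):
    """Return one boolean per sample: True where the envelope exceeds the threshold."""
    fast = OnePole(fast_coeff)   # fast detector: tracks the envelope level
    slow = OnePole(slow_coeff)   # slow detector: tracks the ambient sound level
    flags = []
    for x in samples:
        envelope = fast.update(x)
        ambient = slow.update(x)
        # Assumed adaptive threshold function: a fixed margin above ambient,
        # never lower than a small floor.
        threshold = max(floor, margin * ambient)
        flags.append(envelope > threshold)
    return flags


# The same loud burst is flagged in a quiet room but not in a noisy one.
quiet_room = [0.005 * (-1) ** n for n in range(500)] + [1.0] * 20
noisy_room = [0.5 * (-1) ** n for n in range(2000)] + [1.0] * 20
quiet_flags = detect_alerts(quiet_room)
noisy_flags = detect_alerts(noisy_room)
```

In this sketch the threshold rises with the ambient estimate, so a burst that stands out in the quiet room stays below the threshold once the slow detector has settled on the noisy room's level.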
Smart Noise Reduction Device and the Method Thereof
The present invention discloses a smart noise reduction device including a control device; an audio waveform pattern recognizer coupled to the control device for identifying an audio mixed signal including a regularity signal and a non-regularity signal; an audio waveform pattern database coupled to the control device, including at least one audio type, each having a plurality of preset second regularity signals; and an audio filter coupled to the control device to obtain the regularity signal.
SENSITIVITY MODE FOR AN AUDIO SPOTTING SYSTEM
An audio spotting system configured for various operating modes including a regular mode and sensitivity mode is described. An example cascade audio spotting system may include a high-power subsystem including a high-power trigger and a transfer module. This high-power trigger includes one or more detection models used to detect whether a target sound activity is included in the one or more audio streams. The one or more detection models are associated with a first set of hyperparameters when the cascade audio spotting system is in a regular mode, and the one or more detection models are associated with a second set of hyperparameters when the cascade audio spotting system is in a sensitivity mode. The transfer module provides at least one of one or more processed audio streams for further processing in response to the high-power trigger detecting the target sound activity in the one or more audio streams.
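As a rough illustration of the mode-dependent hyperparameters, the sketch below switches a trigger between two parameter sets. The hyperparameters chosen (`score_threshold`, `min_consecutive_frames`), their values, and the function names are hypothetical, and the detection model is replaced by a list of per-frame scores supplied by the caller.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    REGULAR = "regular"
    SENSITIVITY = "sensitivity"


@dataclass(frozen=True)
class Hyperparameters:
    score_threshold: float        # minimum per-frame score that counts as a hit
    min_consecutive_frames: int   # hits in a row required to fire the trigger


# Assumed values: the sensitivity mode is simply more permissive.
HYPERPARAMETERS = {
    Mode.REGULAR: Hyperparameters(score_threshold=0.8, min_consecutive_frames=3),
    Mode.SENSITIVITY: Hyperparameters(score_threshold=0.6, min_consecutive_frames=2),
}


def high_power_trigger(frame_scores, mode):
    """Return True if the scores contain a long enough run above the threshold."""
    params = HYPERPARAMETERS[mode]
    run = 0
    for score in frame_scores:
        run = run + 1 if score >= params.score_threshold else 0
        if run >= params.min_consecutive_frames:
            return True
    return False


def transfer(processed_streams, frame_scores, mode):
    """Forward the processed streams only when the trigger fires."""
    if high_power_trigger(frame_scores, mode):
        return list(processed_streams)
    return []


scores = [0.2, 0.65, 0.7, 0.3]
regular_out = transfer(["stream-a"], scores, Mode.REGULAR)
sensitive_out = transfer(["stream-a"], scores, Mode.SENSITIVITY)
```

With these assumed values, the same borderline scores pass nothing on in regular mode and forward the stream in sensitivity mode.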
Dynamic adjustment of audio detected by a microphone array
Techniques for dynamically adjusting received audio are described. In an example, a computer system receives audio data representing noise and utterance received by a device during a first time interval that has a start and an end. The start corresponds to a beginning of the utterance. The end corresponds to a selection by the device of an audio beam associated with a direction towards an utterance source. The computer system determines a value associated with an audio adjustment factor. The audio adjustment factor is represented by values that vary during the first time interval. The value is one of the values associated with a time point of the first time interval. The computer system generates, based at least in part on the audio data and the value, first data that indicates a measurement of at least one of the noise or the utterance.
Dynamic Player Selection for Audio Signal Processing
In one aspect, a first playback device is configured to (i) receive a set of voice signals, (ii) process the set of voice signals using a first set of audio processing algorithms, (iii) identify, from the set of voice signals, at least two voice signals that are to be further processed, (iv) determine that the first playback device does not have a threshold amount of computational power available, (v) receive an indication of an available amount of computational power of a second playback device, (vi) send the at least two voice signals to the second playback device, (vii) cause the second playback device to process the at least two voice signals using a second set of audio processing algorithms, (viii) receive, from the second playback device, the processed at least two voice signals, and (ix) combine the processed at least two voice signals into a combined voice signal.
Systems and methods for generating a cleaned version of ambient sound
A first electronic device is provided. While a media content item provided by a media-providing service is emitted by a second electronic device that is remote from the first electronic device, the first electronic device receives, from the media-providing service, data that includes an audio stream that corresponds to the media content item. The first electronic device detects ambient sound that includes sound corresponding to the media content item emitted by the second electronic device. The first electronic device generates a cleaned version of the ambient sound, which includes: using the data received from the media-providing service to align the audio stream with the ambient sound; and performing a subtraction operation to subtract the audio stream from the ambient sound. The first electronic device detects a voice command in the cleaned version of the ambient sound.
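The align-then-subtract step can be sketched as follows. The abstract says only that data from the media-providing service is used for alignment. The brute-force cross-correlation search used here is a stand-in technique chosen for illustration, and `best_lag`, `clean_ambient`, `max_lag`, and `gain` are hypothetical names. The sketch also assumes the emitted audio reaches the microphone as a delayed, unfiltered copy of the stream, which a real room would not guarantee.

```python
def best_lag(reference, ambient, max_lag):
    """Return the delay in samples (0..max_lag) that best matches reference to ambient."""
    def correlation(lag):
        return sum(r * ambient[i + lag]
                   for i, r in enumerate(reference)
                   if i + lag < len(ambient))
    return max(range(max_lag + 1), key=correlation)


def clean_ambient(reference, ambient, max_lag, gain=1.0):
    """Align the reference stream with the ambient sound, then subtract it."""
    lag = best_lag(reference, ambient, max_lag)
    cleaned = list(ambient)
    for i, r in enumerate(reference):
        if i + lag < len(cleaned):
            cleaned[i + lag] -= gain * r
    return lag, cleaned


# Toy data: a short "voice" plus the media stream delayed by three samples.
stream = [1.0, -2.0, 3.0, -1.0, 0.5]
voice = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25, 0.0, 0.0, 0.0]
ambient = list(voice)
for i, s in enumerate(stream):
    ambient[i + 3] += s

lag, cleaned = clean_ambient(stream, ambient, max_lag=5)
```

On this toy input the search recovers the three-sample delay and the subtraction leaves only the voice samples, which would then be passed to voice-command detection.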
Microphone with adjustable signal processing
A microphone may comprise a microphone element for detecting sound, and a digital signal processor configured to process a first audio signal that is based on the sound in accordance with a selected one of a plurality of digital signal processing (DSP) modes. Each of the DSP modes may be for processing the first audio signal in a different way. For example, the DSP modes may account for distance of the person speaking (e.g., near versus far) and/or desired tone (e.g., darker, neutral, or bright tone). At least some of the modes may have, for example, an automatic level control setting to provide a more consistent volume as the user changes their distance from the microphone or changes their speaking level, and that may be associated with particular default (and/or adjustable) values of the parameters attack, hold, decay, maximum gain, and/or target gain, each depending upon which DSP mode is being applied.
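One way to picture per-mode automatic level control is the sketch below. The abstract names the parameters but not their semantics, so the interpretation here is an assumption: `attack` and `decay` are the fractions of the gain gap closed per block when gain falls or rises, `hold_blocks` delays any gain increase after a reduction, and `target_level` stands in for the target parameter. The mode names and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlcSettings:
    target_level: float   # desired output level
    max_gain: float       # upper bound on applied gain
    attack: float         # fraction of the gap closed per block when gain must fall
    decay: float          # fraction of the gap closed per block when gain may rise
    hold_blocks: int      # blocks to wait after a reduction before raising gain


# Assumed defaults for two hypothetical DSP modes.
DSP_MODES = {
    "near": AlcSettings(target_level=0.25, max_gain=4.0, attack=0.5, decay=0.1, hold_blocks=2),
    "far": AlcSettings(target_level=0.25, max_gain=16.0, attack=0.5, decay=0.2, hold_blocks=2),
}


def run_alc(block_levels, settings):
    """Return the gain applied to each block, given the measured input levels."""
    gain = 1.0
    hold = 0
    gains = []
    for level in block_levels:
        desired = min(settings.max_gain, settings.target_level / max(level, 1e-9))
        if desired < gain:
            gain += settings.attack * (desired - gain)   # reduce gain quickly
            hold = settings.hold_blocks
        elif hold > 0:
            hold -= 1                                    # hold before recovering
        else:
            gain += settings.decay * (desired - gain)    # raise gain gradually
        gains.append(gain)
    return gains


quiet_talker = [0.01] * 200
near_gains = run_alc(quiet_talker, DSP_MODES["near"])
far_gains = run_alc(quiet_talker, DSP_MODES["far"])
```

Under these assumptions the same quiet input settles at a higher gain in the "far" mode than in the "near" mode, because only the maximum gain differs enough to matter.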