Patent classifications
G10L2025/786
Systems and methods for capturing noise for pattern recognition processing
Systems and methods provide a first sample of audio data and detect speech onset in the first sample of the audio data. Responsive to detecting the speech onset, systems and methods switch from capturing second samples of the audio data at first intervals, to capturing the second samples of the audio data at second intervals. Systems and methods provide contiguous audio data using the second samples of the audio data captured at the first intervals and at least one captured portion of the second samples of the audio data captured at the second intervals.
Adaptive comfort noise parameter determination
A method for generating a comfort noise (CN) parameter is provided. The method includes receiving an audio input; detecting, with a Voice Activity Detector (VAD), a current inactive segment in the audio input; as a result of detecting, with the VAD, the current inactive segment in the audio input, calculating a CN parameter CN_used; and providing the CN parameter CN_used to a decoder. The CN parameter CN_used is calculated based at least in part on the current inactive segment and a previous inactive segment.
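As a rough illustration of how a CN parameter might draw on both the current and a previous inactive segment, the sketch below blends the mean energy of the current segment with the value carried over from the previous one. The choice of energy as the parameter, the fixed blending weight, and the names `segment_energy`, `cn_used`, and `weight_current` are all assumptions made for illustration; the abstract does not specify them.

```python
def segment_energy(frames):
    """Mean per-frame energy of one inactive segment (a list of sample lists)."""
    energies = [sum(s * s for s in frame) / len(frame) for frame in frames]
    return sum(energies) / len(energies)


def cn_used(current_segment, previous_cn, weight_current=0.7):
    """Blend the current segment's energy with the previous segment's value.

    The fixed weight is an assumption; the abstract only says that both
    segments contribute to the result.
    """
    cn_current = segment_energy(current_segment)
    if previous_cn is None:
        # No earlier inactive segment yet: use the current one alone.
        return cn_current
    return weight_current * cn_current + (1.0 - weight_current) * previous_cn
```

The returned value would then be passed on to the decoder, and kept as `previous_cn` for the next inactive segment.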
Speech recognition apparatus and speech recognition method
An apparatus includes a lip image recognition unit 103 to recognize a user state from image data which is information other than speech; a non-speech section deciding unit 104 to decide from the recognized user state whether the user is talking; a speech section detection threshold learning unit 106 to set a first speech section detection threshold (SSDT) from speech data when decided not talking, and a second SSDT from the speech data after conversion by a speech input unit when decided talking; a speech section detecting unit 107 to detect a speech section indicating talking from the speech data using the thresholds set, wherein if it cannot detect the speech section using the second SSDT, it detects the speech section using the first SSDT; and a speech recognition unit 108 to recognize speech data in the speech section detected, and to output a recognition result.
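The fallback between the two thresholds can be sketched as follows. This is a minimal illustration, assuming per-frame energies as the speech data and a simple "first run of frames above the threshold" detector; the helper names `detect_section` and `detect_with_fallback` are hypothetical and the threshold learning step is not modeled.

```python
def detect_section(energies, threshold):
    """Return (start, end) of the first run of frames above threshold, or None."""
    start = None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i
        elif energy <= threshold and start is not None:
            return (start, i)
    return (start, len(energies)) if start is not None else None


def detect_with_fallback(energies, first_threshold, second_threshold):
    """Try the second threshold first; fall back to the first if nothing is found."""
    section = detect_section(energies, second_threshold)
    if section is None:
        section = detect_section(energies, first_threshold)
    return section
```

For example, with frame energies `[0.1, 0.6, 0.7, 0.1]`, a second threshold of 0.9 finds nothing, so the first threshold of 0.5 is used and frames 1 to 2 are reported as the speech section.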
Object sound period detection apparatus, noise estimating apparatus and SNR estimation apparatus
An object sound period detection apparatus includes a first calculating unit, a second calculating unit, a first detecting unit, and a second detecting unit. The first calculating unit calculates a first threshold every unit time. The second calculating unit calculates a second threshold every unit time. The first detecting unit compares a first feature amount based on the input signal with the first threshold and detects the object sound period in the input signal. The second detecting unit compares a second feature amount based on the input signal with the second threshold, detects the object sound period in the input signal, and outputs a detection result. The first calculating unit calculates the first threshold based on the detection result of the second detecting unit from the preceding unit time. The second calculating unit calculates the second threshold based on the detection result of the first detecting unit in the same unit time.
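The cross-dependence between the two thresholds can be sketched in a few lines. This is only an illustration: the rule that a positive detection lowers the other threshold by a fixed factor, and the names `detect_periods`, `base1`, `base2`, and `relax`, are assumptions, since the abstract states which results feed which threshold but not how.

```python
def detect_periods(features1, features2, base1=1.0, base2=1.0, relax=0.5):
    """Per-unit-time detection with two mutually dependent thresholds."""
    prev_second = False  # second detector's result from the preceding unit time
    results = []
    for f1, f2 in zip(features1, features2):
        # First threshold depends on the second detector's previous result.
        th1 = base1 * (relax if prev_second else 1.0)
        first = f1 > th1
        # Second threshold depends on the first detector's result in the same unit time.
        th2 = base2 * (relax if first else 1.0)
        second = f2 > th2
        results.append(second)
        prev_second = second
    return results
```

Under these assumptions the scheme behaves like hysteresis: once a period is detected, both thresholds are relaxed, so the detection tends to persist until the features drop clearly.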
Audio systems and methods for voice activity detection
Audio systems, methods, and processor instructions are provided that detect voice activity of a user and provide an output voice signal. The systems, methods, and instructions receive a plurality of microphone signals and combine the plurality of microphone signals according to a first combination and a second combination. The first combination produces a primary signal having enhanced response in the direction of the user's mouth, and the second combination produces a reference signal having reduced response in the direction of the user's mouth. The primary signal and the reference signal are added and subtracted to produce a summation signal and a difference signal, respectively. The summation signal and the difference signal are compared, and an output voice signal is provided based upon the comparison.
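One way to picture the sum-and-difference comparison is sketched below. The decision rule is an assumption, not taken from the abstract: when the user speaks, the primary signal dominates and the two energies are similar, whereas sound that reaches both signals equally largely cancels in the difference. The names `energy`, `voice_detected`, and `ratio_threshold` are hypothetical, and the microphone combination step is not modeled.

```python
def energy(signal):
    """Mean squared value of a block of samples."""
    return sum(v * v for v in signal) / len(signal)


def voice_detected(primary, reference, ratio_threshold=0.5):
    """Compare the energies of the summation and difference signals."""
    summation = [p + r for p, r in zip(primary, reference)]
    difference = [p - r for p, r in zip(primary, reference)]
    e_sum = energy(summation)
    if e_sum == 0.0:
        return False  # silent block
    # Assumed rule: a large difference-to-sum energy ratio means the primary
    # (mouth-facing) signal carries content the reference does not.
    return energy(difference) / e_sum > ratio_threshold
```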
Approach for detecting alert signals in changing environments
In an audio system, an audio signal is preprocessed to provide an input signal to a fast detector and a slow detector, the input signal comprising alert signals and ambient sounds. The slow detector determines the ambient sound level of the input signal which is output to an alert signal detector. The alert signal detector uses the ambient sound level to compute an adaptive threshold level using an adaptive threshold function. The fast detector determines the envelope level of the input signal which is output to the alert signal detector. The alert signal detector compares the envelope level to the adaptive threshold level to determine if an alert signal is present in the input signal. The adaptive threshold level varies depending on the ambient sound level of the input signal and the alert signal detection of the audio system automatically adapts to changing acoustic environments having different ambient sound levels.
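The fast and slow detector arrangement can be approximated with two one-pole smoothers running at different rates. This is a minimal sketch under stated assumptions: the smoothing coefficients, the linear threshold function `margin * ambient + floor`, and the names `smooth` and `detect_alerts` are illustrative choices, not details from the abstract.

```python
def smooth(prev, value, alpha):
    """One-pole smoother: move prev toward value by a fraction alpha."""
    return prev + alpha * (value - prev)


def detect_alerts(samples, fast_alpha=0.5, slow_alpha=0.01, margin=4.0, floor=0.01):
    """Flag samples whose fast envelope exceeds an ambient-dependent threshold."""
    if not samples:
        return []
    # Start both detectors at the first level to avoid a start-up transient.
    envelope = ambient = abs(samples[0])
    flags = []
    for s in samples:
        level = abs(s)
        envelope = smooth(envelope, level, fast_alpha)  # fast detector
        ambient = smooth(ambient, level, slow_alpha)    # slow detector
        threshold = margin * ambient + floor            # assumed adaptive threshold function
        flags.append(envelope > threshold)
    return flags
```

Because the threshold scales with the slowly tracked ambient level, the same burst that triggers a detection in a quiet room would need to be proportionally louder in a noisy one.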
Method and apparatus for detecting a voice activity in an input audio signal
A method for detecting voice activity in an input audio signal composed of frames includes determining a noise characteristic of the input signal based on a received frame of the input audio signal. A voice activity detection (VAD) parameter is derived based on the noise characteristic of the input audio signal using an adaptive function. The derived VAD parameter is compared with a threshold value to provide a voice activity detection decision. The input audio signal is processed according to the voice activity detection decision.
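The three steps (noise characteristic, adaptively derived VAD parameter, threshold comparison) can be sketched as below. Everything specific here is an assumption for illustration: the noise characteristic is taken to be a running noise-energy estimate seeded from the first frame, the adaptive function `adaptive_gain` simply down-weights the frame SNR as the noise estimate grows, and the threshold and smoothing values are arbitrary.

```python
import math


def frame_energy(frame):
    """Mean squared value of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def adaptive_gain(noise_energy):
    """Hypothetical adaptive function: weight the SNR less in noisier conditions."""
    return 1.0 / (1.0 + noise_energy)


def vad_decisions(frames, threshold=3.0, noise_alpha=0.1):
    """Return one voice-activity decision per frame."""
    noise = None
    decisions = []
    for frame in frames:
        e = frame_energy(frame)
        if noise is None:
            noise = e  # assumption: the first frame contains only noise
        snr_db = 10.0 * math.log10((e + 1e-12) / (noise + 1e-12))
        vad_param = adaptive_gain(noise) * snr_db
        active = vad_param > threshold
        decisions.append(active)
        if not active:
            # Update the noise estimate only on frames judged inactive.
            noise = (1.0 - noise_alpha) * noise + noise_alpha * e
    return decisions
```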
Analog voice activity detection
According to some embodiments, an analog processing portion may receive an audio signal from a microphone. The analog processing portion may then convert the audio signal into sub-band signals and estimate an energy statistic value, such as a Signal-to-Noise Ratio (“SNR”) value, for each sub-band signal. A classification element may classify the estimated energy statistic values with analog processing such that a wakeup signal is generated when voice activity is detected. The wakeup signal may be associated with, for example, a battery-powered, always-listening audio application.
Method and apparatus for detecting a voice activity in an input audio signal
The disclosure provides a method and an apparatus for detecting a voice activity in an input audio signal composed of frames. A noise attribute of the input signal is determined based on a received frame of the input audio signal. A voice activity detection (VAD) parameter is derived based on the noise attribute of the input audio signal using an adaptive function. The derived VAD parameter is compared with a threshold value to provide a voice activity detection decision. The input audio signal is processed according to the voice activity detection decision.
User sensing system and method for low power voice command activation in wireless communication systems
A method of activating voice control on a wireless device includes sampling signals from a plurality of sensors on the device, determining if the device is in a hands-on state by a user on the basis of the signal sampling, and enabling a voice activated detection (VAD) application on the device on the basis of the determination. A voice controlled apparatus in a wireless device includes a plurality of sensors arranged on the device, a microphone, a controller to sample signals from one or more of the plurality of sensors, a processor coupled to the controller, and a voice activated detection (VAD) application running on the processor coupled to the controller and the microphone.