Patent classifications
G10L21/0224
SIGNAL PROCESSING APPARATUS, SIGNAL PROCESSING METHOD, AND SIGNAL PROCESSING PROGRAM
A wideband signal is enhanced or suppressed to the same extent at every frequency without increasing the size of the overall sensor array. To achieve this, a signal processing apparatus is provided that includes: a direction estimator that obtains a direction of arrival from signals received at a plurality of sensors, each signal containing a target signal and noise; a first gain calculator that calculates a first gain from the direction of arrival; an integrator that obtains an integrated signal by integrating the signals received from the plurality of sensors; and a multiplier that multiplies the integrated signal by the first gain.
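The four blocks named in this abstract (direction estimator, gain calculator, integrator, multiplier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the two-sensor far-field delay model, the Gaussian gain shape, averaging as the form of integration, and all function and parameter names. The point it shows is that a single direction-dependent scalar gain applied to the integrated signal scales every frequency by the same amount.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed propagation speed


def estimate_doa(delay_samples, sample_rate, spacing_m):
    """Hypothetical direction estimator: angle (radians) from the
    inter-sensor delay under a far-field, two-sensor model."""
    s = delay_samples / sample_rate * SPEED_OF_SOUND / spacing_m
    s = max(-1.0, min(1.0, s))  # clamp so asin stays defined
    return math.asin(s)


def direction_gain(doa, target=0.0, width=math.radians(20)):
    """Hypothetical first gain: 1 at the target direction, decaying
    smoothly (Gaussian) as the arrival direction moves away from it."""
    return math.exp(-0.5 * ((doa - target) / width) ** 2)


def integrate(channels):
    """Integrate the sensor signals by averaging them sample by sample."""
    n = len(channels)
    return [sum(samples) / n for samples in zip(*channels)]


def process(channels, delay_samples, sample_rate, spacing_m):
    """Multiply the integrated signal by the direction-dependent gain."""
    gain = direction_gain(estimate_doa(delay_samples, sample_rate, spacing_m))
    return [gain * x for x in integrate(channels)]
```

With zero inter-sensor delay the estimated direction equals the assumed target direction, so the gain is 1 and the averaged signal passes unchanged; a large delay yields a small gain that attenuates the whole band uniformly.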
INTEGRATED SENSOR-ARRAY PROCESSOR
An integrated sensor-array processor and method include sensor array time-domain input ports to receive sensor signals from time-domain sensors. A sensor transform engine (STE) creates sensor transform data from the sensor signals and applies sensor calibration adjustments. Transducer time-domain input ports receive time-domain transducer signals, and a transducer output transform engine (TTE) generates transducer output transform data from the transducer signals. A spatial filter engine (SFE) applies suppression coefficients to the sensor transform data to suppress target signals received from noise locations and/or amplification locations. A blocking filter engine (BFE) applies subtraction coefficients to the sensor transform data to subtract the target signals from the sensor transform data. A noise reduction filter engine (NRE) subtracts noise signals from the BFE output. An inverse transform engine (ITE) generates time-domain data from the NRE output.
Adaptive audio enhancement for multichannel speech recognition
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
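As a rough illustration of how two filtered channels can be merged into a single combined channel, the sketch below applies one FIR filter per channel and sums the results (filter-and-sum beamforming). The abstract does not spell out the combination rule, so filter-and-sum is an assumption here, as are the names `fir_filter` and `filter_and_sum`. In the described method the filter parameters are generated from both channels of audio data; in this sketch they are simply passed in as fixed tap lists.

```python
def fir_filter(signal, taps):
    """Causal FIR convolution, truncated to the input length."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def filter_and_sum(ch1, ch2, taps1, taps2):
    """Filter each channel with its own taps, then sum sample by sample
    to obtain a single combined channel."""
    y1 = fir_filter(ch1, taps1)
    y2 = fir_filter(ch2, taps2)
    return [a + b for a, b in zip(y1, y2)]
```

The combined channel returned by `filter_and_sum` stands in for the audio data that would then be passed to the neural network for transcription.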
Ambient cooperative intelligence system and method
A method, computer program product, and computing system for obtaining calibration information for a three-dimensional space incorporating an ACI system; and processing the calibration information to calibrate the ACI system.
ACOUSTIC EVENT DETECTION SYSTEM AND METHOD
An acoustic event detection system and method are provided. The system includes a voice activity detection subsystem, a database, and an acoustic event detection subsystem. The voice activity detection subsystem includes a voice receiving module, a feature extraction module, and a first determination module. The voice receiving module receives an original sound signal, the feature extraction module extracts a plurality of features from the original sound signal, and the first determination module executes a first classification process to determine whether the plurality of features match a start-up voice. The acoustic event detection subsystem includes a second determination module and a function response module. The second determination module executes a second classification process to determine whether the features match at least one of a plurality of predetermined voices. The function response module executes the function corresponding to the predetermined voice that is matched.
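The two classification stages and the function response can be sketched as a small dispatch routine. This is only an illustration under stated assumptions: the abstract does not say that the second classification runs only after the start-up voice is matched, so that gating is an assumption, and the classifier callables, the `functions` mapping, and the name `detect_event` are all hypothetical.

```python
def detect_event(features, is_startup_voice, classify_voice, functions):
    """Two-stage detection sketch.

    is_startup_voice: hypothetical first classifier, returns True or False.
    classify_voice: hypothetical second classifier, returns a label or None.
    functions: mapping from a predetermined voice label to a callable.
    """
    # First classification: stop unless the start-up voice is matched.
    if not is_startup_voice(features):
        return None
    # Second classification: look for a matching predetermined voice.
    label = classify_voice(features)
    action = functions.get(label)
    # Function response: run the function tied to the matched voice.
    return action() if action is not None else None
```

A caller would supply its own trained classifiers; any simple threshold test on the feature list is enough to exercise the control flow.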