Patent classifications
G10L21/0208
Audio signal processing for noise reduction
A headphone, headphone system, and speech enhancing method is provided to enhance speech pick-up from the user of a headphone and includes receiving a plurality of signals from a set of microphones and generating a primary signal by array processing the microphone signals to steer a beam toward the user's mouth. A noise reference signal is also derived from one or more microphones via a delay-and-sum technique, and a voice estimate signal is generated by filtering the primary signal to remove components that are correlated to the noise reference signal.
DEVICE INCLUDING SPEECH RECOGNITION FUNCTION AND METHOD OF RECOGNIZING SPEECH
A device including a speech recognition function which recognizes speech from a user, includes: a loudspeaker which outputs speech to a space; a microphone which collects speech in the space; a first speech recognition unit which recognizes the speech collected by the microphone; a command control unit which issues a command for controlling the device, based on the speech recognized by the first speech recognition unit; and a control unit which prohibits the command issuance unit from issuing the command, based on the speech to be output from the loudspeaker.
DEVICE INCLUDING SPEECH RECOGNITION FUNCTION AND METHOD OF RECOGNIZING SPEECH
A device including a speech recognition function which recognizes speech from a user, includes: a loudspeaker which outputs speech to a space; a microphone which collects speech in the space; a first speech recognition unit which recognizes the speech collected by the microphone; a command control unit which issues a command for controlling the device, based on the speech recognized by the first speech recognition unit; and a control unit which prohibits the command issuance unit from issuing the command, based on the speech to be output from the loudspeaker.
MICROPHONE UNIT COMPRISING INTEGRATED SPEECH ANALYSIS
A microphone unit has a transducer, for generating an electrical audio signal from a received acoustic signal; a speech coder, for obtaining compressed speech data from the audio signal; and a digital output, for supplying digital signals representing said compressed speech data. The speech coder may be a lossy speech coder, and may contain a bank of filters with centre frequencies that are non-uniformly spaced, for example mel frequencies.
MICROPHONE UNIT COMPRISING INTEGRATED SPEECH ANALYSIS
A microphone unit has a transducer, for generating an electrical audio signal from a received acoustic signal; a speech coder, for obtaining compressed speech data from the audio signal; and a digital output, for supplying digital signals representing said compressed speech data. The speech coder may be a lossy speech coder, and may contain a bank of filters with centre frequencies that are non-uniformly spaced, for example mel frequencies.
Modeling and Reduction of Drone Propulsion System Noise
In some embodiments, a method, apparatus and computer program for reducing noise from an audio signal captured by a drone (e.g., canceling the noise signature of a drone from the audio signal) using a model of noise emitted by the drone's propulsion system set, where the propulsion system set includes one or more propulsion systems, each of the propulsion systems including an electric motor, and wherein the noise reduction is performed in response to voltage data indicative of instantaneous voltage supplied to each electric motor of the propulsion system set. In some other embodiments, a method, apparatus and computer program for generating a noise model by determining the noise signature of at least one drone based upon a database of noise signals corresponding to at least one propulsion system and canceling the noise signature of the drone in an audio signal based upon the noise model.
Spatial audio processing
An apparatus comprising at least one processor and at least one memory, the memory comprising machine-readable instructions, that when executed cause the apparatus to: store in a non-volatile memory multiple sets of predetermined spatial audio processing parameters for differently moving sound sources; provide in a man machine interface an option for a user to select one of the stored multiple sets of predetermined spatial audio processing parameters for differently moving sound sources; and in response to the user selecting one of the stored multiple sets of predetermined spatial audio processing parameters for differently moving sound sources, the apparatus is further caused to use the selected one of the stored multiple sets of predetermined spatial audio processing parameters to spatially process audio from one or more sound sources.
Spatial audio processing
An apparatus comprising at least one processor and at least one memory, the memory comprising machine-readable instructions, that when executed cause the apparatus to: store in a non-volatile memory multiple sets of predetermined spatial audio processing parameters for differently moving sound sources; provide in a man machine interface an option for a user to select one of the stored multiple sets of predetermined spatial audio processing parameters for differently moving sound sources; and in response to the user selecting one of the stored multiple sets of predetermined spatial audio processing parameters for differently moving sound sources, the apparatus is further caused to use the selected one of the stored multiple sets of predetermined spatial audio processing parameters to spatially process audio from one or more sound sources.
IMAGING DEVICE
An imaging device includes an optical system including a movable lens, an image capture unit configured to capture a subject image through the optical system, a sound-collecting microphone, and a control unit configured to control the optical system and the image capture unit and to receive an sound signal from the microphone. The imaging device has a first mode and a second mode as moving-image shooting modes. The control unit makes the movable lens move faster in the first mode than in the second mode when a moving image is captured. The control unit filters the sound signal with a narrower-band filter in the first mode than in the second mode.
Audio data processing method, apparatus and storage medium for detecting wake-up words based on multi-path audio from microphone array
An audio data processing method is provided. The method includes: obtaining multi-path audio data in an environmental space, obtaining a speech data set based on the multi-path audio data, and separately generating, in a plurality of enhancement directions, enhanced speech information corresponding to the speech data set; matching a speech hidden feature in the enhanced speech information with a target matching word, and determining an enhancement direction corresponding to the enhanced speech information having a highest degree of matching with the target matching word as a target audio direction; obtaining speech spectrum features in the enhanced speech information, and obtaining, from the speech spectrum features, a speech spectrum feature in the target audio direction; and performing speech authentication on the speech hidden feature and the speech spectrum feature that are in the target audio direction based on the target matching word, to obtain a target authentication result.