Patent classifications
G10L15/20
VOICE REINFORCEMENT IN MULTIPLE SOUND ZONE ENVIRONMENTS
A microphone signal is received from at least one microphone. Acoustic echo cancellation (AEC) produces an echo-cancelled microphone signal using first adaptive filters to estimate and cancel feedback that is a result of the environment. Acoustic feedback cancellation (AFC) produces a processed microphone signal using second adaptive filters to estimate and cancel feedback resulting from application of the reinforced voice signal within the environment. The uttered speech is reinforced in the processed microphone signal to produce the reinforced voice signal. The reinforced voice signal and the audio signal are applied to the loudspeakers. A step size of adjustment of the second adaptive filters may be increased responsive to detection of reverberation in the microphone signal. The reverberation used to control the step size of the second adaptive filters may be added artificially. This may provide multiple benefits, including improving adjustment of the second adaptive filters and improving the sound impression of the voice.
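The adaptive feedback cancellation described above can be sketched as a normalized LMS (NLMS) filter whose update step size is scaled up when reverberation is detected. This is a minimal illustration, not the patented method; the `reverb_boost` control signal and all parameter values are assumptions.

```python
import numpy as np

def nlms_cancel(mic, ref, n_taps=64, base_mu=0.05, reverb_boost=None, eps=1e-8):
    """NLMS adaptive filter cancelling feedback of `ref` (the reinforced /
    loudspeaker signal) from `mic`. `reverb_boost` is an optional per-sample
    factor > 1 that raises the step size when reverberation is detected
    (hypothetical control, per the abstract's step-size idea)."""
    w = np.zeros(n_taps)           # adaptive estimate of the feedback path
    buf = np.zeros(n_taps)         # most recent reference samples
    out = np.zeros_like(mic)
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = ref[n]            # buf[k] == ref[n - k]
        y = w @ buf                # estimated feedback component
        e = mic[n] - y             # feedback-cancelled sample
        out[n] = e
        mu = base_mu * (reverb_boost[n] if reverb_boost is not None else 1.0)
        w += mu * e * buf / (buf @ buf + eps)  # NLMS update
    return out
```

With a stationary feedback path the filter converges and the residual energy drops well below the raw microphone energy.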
ELECTRONIC DEVICE FOR CONTROLLING BEAMFORMING AND OPERATING METHOD THEREOF
An electronic device is provided. To determine a customized beamformer filter, the electronic device includes an input module including a plurality of microphones configured to receive an external sound signal, a memory configured to store computer-executable instructions and an initial value of a voice parameter used to perform beamforming on the external sound signal, and a processor configured to execute the instructions by accessing the memory. The instructions may be configured to estimate a feature value of the external sound signal, calculate the initial value of the voice parameter used to perform beamforming based on the external sound signal received by the plurality of microphones, determine whether to store the calculated initial value according to the feature value, determine which of the calculated initial value or an initial value stored in the memory is to be used according to the feature value, and obtain a target voice parameter.
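The decision between a freshly calculated initial value and a cached one can be sketched as a simple feature-gated lookup. The SNR feature, the threshold, and the caching rule here are all illustrative assumptions, not the claimed method.

```python
def choose_initial_value(calculated, stored, snr_db, snr_threshold=10.0):
    """Hypothetical selection rule: when the estimated feature value (here an
    SNR in dB) indicates clean input, trust and cache the freshly calculated
    beamformer initial value; otherwise reuse the stored one.
    Returns (value_to_use, value_to_store)."""
    if snr_db >= snr_threshold or stored is None:
        return calculated, calculated   # adopt and persist the new value
    return stored, stored               # keep the previously stored value
```

The beamforming adaptation would then start from `value_to_use` to converge toward the target voice parameter faster than from scratch.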
VOICE CONTROL SYSTEM AND VOICE CONTROL METHOD FOR AUTOMATIC DOOR
A voice control system and a voice control method for an automatic door are provided. The voice control system includes a sound detection device, a storage device, a first determination circuit, a second determination circuit and a control circuit. The sound detection device detects a sound signal of a sound source, and the storage device includes a voiceprint database that includes reference voiceprint features. The first determination circuit analyzes a voiceprint feature of the sound signal and compares the voiceprint feature with the reference voiceprint features. The second determination circuit determines whether a velocity of the sound source falls within a reference speed range according to a frequency variation of the sound signal that matches one of the reference voiceprint features. In response to the velocity of the sound source falling within the reference speed range, the control circuit controls the automatic door to be in an open state.
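Inferring source velocity from a frequency variation is classic Doppler estimation. A minimal sketch, assuming a stationary receiver and the standard relation f_obs = f_0 · c / (c − v); the speed-range bounds are hypothetical values for a person walking toward the door.

```python
def source_velocity(f_emitted, f_observed, c=343.0):
    """Approach speed (m/s) of a sound source from its Doppler shift;
    positive means moving toward the microphone. Assumes a stationary
    receiver: f_observed = f_emitted * c / (c - v)."""
    return c * (1.0 - f_emitted / f_observed)

def in_speed_range(v, v_min=0.2, v_max=3.0):
    """Hypothetical reference speed range (walking pace toward the door)."""
    return v_min <= v <= v_max
```

A 1000 Hz tone observed at about 1002.9 Hz corresponds to roughly 1 m/s of approach speed, which would fall inside the assumed range and trigger the open state.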
Systems and methods for generating a cleaned version of ambient sound
A first electronic device is provided. While a media content item provided by a media-providing service is emitted by a second electronic device that is remote from the first electronic device, the first electronic device receives, from the media-providing service, data that includes an audio stream that corresponds to the media content item. The first electronic device detects ambient sound that includes sound corresponding to the media content item emitted by the second electronic device. The first electronic device generates a cleaned version of the ambient sound, which includes: using the data received from the media-providing service to align the audio stream with the ambient sound; and performing a subtraction operation to subtract the audio stream from the ambient sound. The first electronic device detects a voice command in the cleaned version of the ambient sound.
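The align-then-subtract step can be sketched with cross-correlation for alignment and a least-squares scalar gain before subtraction. A real system would also estimate the room response; the single-gain model here is a simplifying assumption.

```python
import numpy as np

def clean_ambient(ambient, stream):
    """Align the known audio stream to the ambient recording via
    cross-correlation, fit a scalar playback gain (assumption), and
    subtract to expose any remaining voice command."""
    corr = np.correlate(ambient, stream, mode="full")
    lag = int(np.argmax(corr)) - (len(stream) - 1)   # stream delayed by `lag`
    shifted = np.zeros_like(ambient)
    if lag >= 0:
        n = min(len(stream), len(ambient) - lag)
        shifted[lag:lag + n] = stream[:n]
    else:
        n = min(len(stream) + lag, len(ambient))
        shifted[:n] = stream[-lag:-lag + n]
    g = (ambient @ shifted) / (shifted @ shifted + 1e-12)  # least-squares gain
    return ambient - g * shifted
```

When the ambient sound is a delayed, attenuated copy of the stream, the residual is near zero; a superimposed voice command would survive the subtraction.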
Electronic device and method for speech recognition of the same
An electronic device for recognizing a user's speech and a speech recognition method therefor are provided. The electronic device includes a microphone configured to receive a user's speech, a memory for storing speech recognition models, and at least one processor configured to select a speech recognition model from among the speech recognition models stored in the memory based on an operation state of the electronic device, and recognize the user's speech received by the microphone based on the selected speech recognition model.
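The model selection can be sketched as a lookup keyed by the device's operation state. The state names and fallback behavior are illustrative assumptions, not details from the abstract.

```python
def select_model(models, operation_state, default="general"):
    """Hypothetical lookup: pick the speech recognition model matched to the
    device's current operation state (e.g. media playback vs. standby),
    falling back to a general model when no dedicated one is stored."""
    return models.get(operation_state, models[default])
```

For instance, a noise-robust model might be preferred while the device is playing media.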
IDENTIFICATION AND CLASSIFICATION OF TALK-OVER SEGMENTS DURING VOICE COMMUNICATIONS USING MACHINE LEARNING MODELS
A system and methods are provided to analyze audio signals from an incoming voice call. The system includes a processor and a computer readable medium operably coupled thereto, to perform voice analysis operations which include receiving a first audio signal comprising a first audio waveform of a first speech between at least two users during the incoming voice call, accessing speech segment parameters for analyzing the audio signals, determining one or more talk-over segments in the first audio waveform using the speech segment parameters, extracting audio features from each of the one or more talk-over segments, determining, using a machine learning (ML) model trained for interruption analysis of the audio signals, whether each of the one or more talk-over segments is a negative interruption or a non-negative interruption based on the audio features, and determining whether to output a first notification for the negative interruption or the non-negative interruption.
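The talk-over detection step can be sketched as finding frames where per-speaker voice-activity flags overlap. The per-speaker VAD inputs and the minimum-length parameter stand in for the abstract's "speech segment parameters" and are assumptions.

```python
def talk_over_segments(vad_a, vad_b, min_len=3):
    """Given per-frame voice-activity flags for two speakers, return
    (start, end) frame-index pairs (end exclusive) where both speak
    simultaneously for at least `min_len` frames."""
    segments, start = [], None
    for i, (a, b) in enumerate(zip(vad_a, vad_b)):
        if a and b:
            if start is None:
                start = i          # overlap begins
        elif start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None           # overlap ended (or was too short)
    if start is not None and len(vad_a) - start >= min_len:
        segments.append((start, len(vad_a)))
    return segments
```

Audio features extracted from each returned segment would then be fed to the ML model to classify the interruption as negative or non-negative.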
System and method to correct for packet loss in ASR systems
A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
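One simple stand-in for the best-estimate replacement of lost frames is to substitute a decayed copy of the last good feature frame, leaving the acoustic models untouched. The decay factor and frame representation are assumptions for illustration.

```python
import numpy as np

def repair_lost_frames(frames, lost, decay=0.9):
    """Recognition-stage packet-loss concealment sketch: each lost frame of
    acoustic features is replaced with a decayed copy of the most recent
    good frame; good frames pass through unchanged."""
    repaired = [f.copy() for f in frames]
    last_good = None
    for i, is_lost in enumerate(lost):
        if is_lost and last_good is not None:
            repaired[i] = decay * last_good   # estimate from previous frame
        if not is_lost:
            last_good = repaired[i]
    return repaired
```

Recognition scores over the repaired frames could then be down-weighted (normalized) to reflect their lower reliability, as the abstract suggests.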