G10L21/0224

METHODS AND DEVICES FOR ENCODING AND/OR DECODING SPATIAL BACKGROUND NOISE WITHIN A MULTI-CHANNEL INPUT SIGNAL

The present document describes a method (600) for encoding a multi-channel input signal (101) which comprises N different channels. The method (600) comprises, for a current frame of a sequence of frames, determining (601) whether the current frame is an active frame or an inactive frame using a signal and/or a voice activity detector, and determining (602) a downmix signal (103) based on the multi-channel input signal (101), wherein the downmix signal (103) comprises N channels or fewer. In addition, the method (600) comprises determining (603) upmixing metadata (105) comprising a set of parameters for generating, based on the downmix signal (103), a reconstructed multi-channel signal (111) comprising N channels, wherein the upmixing metadata (105) is determined depending on whether the current frame is an active frame or an inactive frame. The method (600) further comprises encoding (604) the upmixing metadata (105) into a bitstream.
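The per-frame flow described in this abstract (activity decision, downmix, activity-dependent metadata) could look roughly like the sketch below. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the function name `encode_frame`, the energy-threshold activity decision, the mono average downmix, and the contents of the metadata dictionary are all hypothetical.

```python
import math


def encode_frame(channels, activity_threshold_db=-50.0):
    """Return (downmix, metadata) for one frame of an N-channel signal.

    channels: list of N equal-length lists of float samples.
    The threshold and the metadata layout are illustrative assumptions.
    """
    n = len(channels)
    length = len(channels[0])

    # Assumed downmix: a single channel (fewer than N) averaging all inputs.
    downmix = [sum(ch[i] for ch in channels) / n for i in range(length)]

    # Assumed activity detector: simple level threshold on the downmix.
    power = sum(x * x for x in downmix) / length
    level_db = 10.0 * math.log10(max(power, 1e-12))
    active = level_db >= activity_threshold_db

    energies = [sum(x * x for x in ch) / length for ch in channels]
    if active:
        # Hypothetical active-frame metadata: per-channel gains
        # relative to the downmix energy.
        reference = max(power, 1e-12)
        metadata = {"active": True,
                    "gains": [math.sqrt(e / reference) for e in energies]}
    else:
        # Hypothetical inactive-frame metadata: only per-channel
        # background-noise energies.
        metadata = {"active": False, "noise_energies": energies}
    return downmix, metadata
```

The point of the sketch is only that the metadata branch depends on the active/inactive decision; a real encoder would also quantize the metadata and write it into a bitstream.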

AUTOMATIC NOISE GATING
20230215450 · 2023-07-06

An audio processing system for automatically noise gating an audio signal. The audio processing system comprises a voice activity detector configured to identify one or more segments of the audio signal not representative of speech; a level detector configured to determine at least one noise level associated with the one or more segments of the audio signal identified as not representative of speech; and a noise gate configured to noise gate the audio signal using a variable noise gate threshold that is automatically set based on the at least one determined noise level.
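As a rough illustration of the three components named above (voice activity detector, level detector, noise gate with an automatically set threshold), the following sketch assumes the voice activity decisions are already available as one boolean per frame. The function names, the 6 dB margin above the measured noise level, and the hard on/off gating are assumptions, not details from the abstract.

```python
import math


def rms_db(frame):
    """Return the mean-square level of a frame in dB (floored at -120 dB)."""
    if not frame:
        return -120.0
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))


def auto_noise_gate(frames, is_speech, margin_db=6.0, attenuation=0.0):
    """Gate frames below a threshold derived from non-speech frames.

    frames: list of lists of float samples.
    is_speech: one bool per frame, assumed to come from a VAD.
    Returns (gated_frames, threshold_db); threshold_db is None when
    no non-speech frame was available to measure the noise level.
    """
    # Level detector: measure only the segments flagged as non-speech.
    noise_levels = [rms_db(f) for f, s in zip(frames, is_speech) if not s]
    if not noise_levels:
        return [list(f) for f in frames], None

    # Assumed rule: threshold = average noise level plus a fixed margin.
    threshold_db = sum(noise_levels) / len(noise_levels) + margin_db

    gated = []
    for f in frames:
        gain = 1.0 if rms_db(f) >= threshold_db else attenuation
        gated.append([x * gain for x in f])
    return gated, threshold_db
```

A practical gate would normally add attack and release smoothing to avoid audible switching; that is left out here to keep the threshold logic visible.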

Post-processing gains for signal enhancement

A method, an apparatus, and logic to post-process raw gains determined by input processing to generate post-processed gains, comprising using one or both of delta gain smoothing and decision-directed gain smoothing. The delta gain smoothing comprises applying a smoothing filter to the raw gain with a smoothing factor that depends on the gain delta: the absolute value of the difference between the raw gain for the current frame and the post-processed gain for a previous frame. The decision-directed gain smoothing comprises converting the raw gain to a signal-to-noise ratio, applying a smoothing filter with a smoothing factor to the signal-to-noise ratio to calculate a smoothed signal-to-noise ratio, and converting the smoothed signal-to-noise ratio to determine the second smoothed gain, with smoothing factor possibly dependent on the gain delta.
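Delta gain smoothing as described above can be sketched as a first-order smoother whose factor grows with the gain delta. The abstract does not say how the smoothing factor depends on the delta, so the linear mapping below, the parameter names `base_alpha` and `slope`, and the initial gain of 1.0 are assumptions for illustration only. Decision-directed smoothing is not shown.

```python
def delta_gain_smooth(raw_gains, base_alpha=0.2, slope=1.0, initial_gain=1.0):
    """Smooth a sequence of raw gains, one per frame.

    The smoothing factor for each frame depends on the gain delta:
    |raw gain of the current frame - post-processed gain of the
    previous frame|. Large deltas are followed quickly, small deltas
    are smoothed heavily (assumed linear mapping, clipped to 1.0).
    """
    previous = initial_gain
    smoothed = []
    for raw in raw_gains:
        delta = abs(raw - previous)
        alpha = min(1.0, base_alpha + slope * delta)
        # First-order recursive smoothing toward the raw gain.
        previous = previous + alpha * (raw - previous)
        smoothed.append(previous)
    return smoothed
```

With these assumed defaults, a small change such as 1.0 to 0.9 is only partly followed in one frame, while a jump from 1.0 to 0.0 is followed immediately because the smoothing factor saturates at 1.0.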

INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM
20220406306 · 2022-12-22

Provided is an information processing system including: an information processing device (20) and a playback device (10), the information processing device including: a first detection unit (204) that detects, from collected sound, audio processing superimposed on the sound by the playback device; a specifying unit (206) that specifies an utterance subject of the sound on the basis of the audio processing that has been detected; and a determination unit (208) that determines whether or not to execute a command included in the sound on the basis of a result of the specification.

METHOD FOR SELECTING OUTPUT WAVE BEAM OF MICROPHONE ARRAY
20220399028 · 2022-12-15

A method for selecting an output wave beam of a microphone array, comprising: (a) receiving a plurality of voice signals from the microphone array comprising a plurality of microphones, and performing beamforming on the voice signals to obtain a plurality of wave beams and corresponding wave beam output signals (102); (b) performing the following operation on each wave beam: converting the wave beam output signal of a current wave beam to frequency domain from time domain to obtain a frequency spectrum vector and a power spectrum vector of the current wave beam (104); on the basis of the frequency spectrum vector and the power spectrum vector of the current wave beam, calculating comprehensive voice signal energy of the current wave beam, wherein the comprehensive voice signal energy is the product of comprehensive energy of the current wave beam and a comprehensive voice existence probability, the comprehensive energy indicates the energy level of the wave beam output signal of the current wave beam, the comprehensive voice existence probability indicates an existence probability of voice in the wave beam output signal of the current wave beam, and the comprehensive voice existence probability and the comprehensive energy are scalar quantities (106); and (c) selecting the wave beam with a maximal comprehensive voice signal energy value as the output wave beam (110).
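The selection rule in steps (b) and (c) reduces each beam to two scalars, multiplies them, and picks the maximum. The sketch below assumes the power spectrum and a per-bin voice presence probability are already available for each beam; reducing each to a scalar by a plain average is an assumption, since the abstract does not state how the comprehensive quantities are computed, and the function name is hypothetical.

```python
def select_output_beam(power_spectra, speech_probs):
    """Pick the beam with the largest comprehensive voice signal energy.

    power_spectra: one list of per-bin powers per beam.
    speech_probs: one list of per-bin voice presence probabilities
                  (0..1) per beam; how these are estimated is out of scope.
    Returns (index_of_best_beam, list_of_scores).
    """
    scores = []
    for power, prob in zip(power_spectra, speech_probs):
        # Assumed scalar reductions: mean power and mean probability.
        energy = sum(power) / len(power)
        presence = sum(prob) / len(prob)
        # Comprehensive voice signal energy = energy * presence probability.
        scores.append(energy * presence)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

Weighting energy by voice presence probability means a loud beam dominated by noise can lose to a quieter beam that mostly contains speech.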

OPERATION DEVICE

Disclosed is an operation device including an interaction member, a microphone, a control circuit, and an audio signal processing circuit. The interaction member is used for interacting with a user. The control circuit periodically acquires scan data indicating the acting status of the interaction member. The audio signal processing circuit executes a noise removal process of removing noise from a collected audio signal collected by the microphone. The control circuit periodically transmits previously acquired scan data to the audio signal processing circuit. The audio signal processing circuit executes the noise removal process by using the scan data transmitted from the control circuit.

METHOD AND SYSTEM FOR PROTECTING USER PRIVACY DURING AUDIO CONTENT PROCESSING
20220375458 · 2022-11-24

A method and system for protecting user privacy in audio content are disclosed. Audio content including private information related to at least one user is received. The audio content is segmented to generate a plurality of audio blocks. Each audio block is associated with a sequence number based on its chronological position in the audio content. A random key of predefined length is generated for each audio block. The plurality of audio blocks are randomly distributed to a plurality of agents for audio-to-text transcription. The random distribution is configured to scramble the data context, protecting the privacy of the at least one user during the audio-to-text transcription. A textual transcript corresponding to the audio content is generated based on the audio-to-text transcription, the sequence number and the random key generated for each audio block.
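The segment, key, distribute, and reassemble steps can be sketched as follows. The function names, the dictionary layout of a block, the fixed block size, and the idea that agents return `(key, text)` pairs are all assumptions made for illustration; the abstract does not specify them.

```python
import random
import secrets


def split_into_blocks(samples, block_size, key_bytes=8):
    """Cut audio into blocks, each with a sequence number and a random key."""
    blocks = []
    for seq, start in enumerate(range(0, len(samples), block_size)):
        blocks.append({
            "seq": seq,                            # chronological position
            "key": secrets.token_hex(key_bytes),   # random key, fixed length
            "audio": samples[start:start + block_size],
        })
    return blocks


def distribute(blocks, agent_ids, rng=None):
    """Randomly assign blocks to agents so no agent sees ordered context."""
    rng = rng or random.Random()
    assignment = {agent: [] for agent in agent_ids}
    shuffled = blocks[:]
    rng.shuffle(shuffled)
    for block in shuffled:
        assignment[rng.choice(agent_ids)].append(block)
    return assignment


def reassemble(transcribed, key_to_seq):
    """Rebuild the transcript from (key, text) pairs returned by agents.

    key_to_seq maps each block key back to its sequence number and is
    assumed to stay with the coordinating system, not the agents.
    """
    ordered = sorted(transcribed, key=lambda pair: key_to_seq[pair[0]])
    return " ".join(text for _, text in ordered)
```

In this sketch an agent would be handed only the key and the audio of a block, so the sequence numbers needed to restore the original order never leave the coordinator.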