G10L19/06

Signal decorrelation in an audio processing system

Audio processing methods may involve receiving audio data corresponding to a plurality of audio channels. The audio data may include a frequency domain representation corresponding to filterbank coefficients of an audio encoding or processing system. A decorrelation process may be performed with the same filterbank coefficients used by the audio encoding or processing system. The decorrelation process may be performed without converting coefficients of the frequency domain representation to another frequency domain or time domain representation. The decorrelation process may involve selective or signal-adaptive decorrelation of specific channels and/or specific frequency bands. The decorrelation process may involve applying a decorrelation filter to a portion of the received audio data to produce filtered audio data. The decorrelation process may involve using a non-hierarchical mixer to combine a direct portion of the received audio data with the filtered audio data according to spatial parameters.
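
The filter-then-mix step described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the first-order all-pass filter, the gain `g`, and the single mixing parameter `alpha` (standing in for the spatial parameters) are all assumptions chosen for illustration. The filter runs along time on the coefficients of one frequency bin, so no conversion to another domain is needed.

```python
import math

def allpass_decorrelate(coeffs, g=0.5):
    """Filter one bin's coefficient sequence with a first-order all-pass.

    Hypothetical filter choice: y[n] = -g*x[n] + x[n-1] + g*y[n-1].
    """
    out = []
    x_prev = 0.0
    y_prev = 0.0
    for x in coeffs:
        y = -g * x + x_prev + g * y_prev
        out.append(y)
        x_prev, y_prev = x, y
    return out

def mix(direct, filtered, alpha):
    """Combine direct and filtered coefficients.

    alpha is an assumed spatial parameter in [0, 1]; the weights are chosen
    so that alpha**2 + beta**2 == 1.
    """
    beta = math.sqrt(1.0 - alpha * alpha)
    return [alpha * d + beta * f for d, f in zip(direct, filtered)]

# Example: decorrelate one bin and mix it with its direct portion.
direct = [1.0, 0.0, 0.0, 0.0]
filtered = allpass_decorrelate(direct)
output = mix(direct, filtered, alpha=0.8)
```

With `alpha` equal to 1 the output is the direct signal alone; lower values add more of the decorrelated signal.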

Enhanced soundfield coding using parametric component generation

The present document relates to multichannel audio coding and more precisely to techniques for discrete multichannel audio encoding and decoding. In particular, the present document relates to systems and methods for coding soundfields. An audio encoder (200) configured to encode a frame of a soundfield signal (110) comprising a plurality of audio signals is described. The audio encoder (200) comprises a transform determination unit (203, 204) configured to determine an energy-compacting orthogonal transform (V) based on the frame of the soundfield signal (110). Furthermore, the encoder (200) comprises a transform unit (202) configured to apply the energy-compacting orthogonal transform (V) to the frame of the soundfield signal (110), and configured to provide a frame of a rotated soundfield signal (112) comprising a plurality of rotated audio signals (E1, E2, E3). The audio encoder (200) comprises a waveform encoding unit (103) configured to encode a first rotated audio signal (E1) of the plurality of rotated audio signals (E1, E2, E3), and a parametric encoding unit (104) configured to determine a set of spatial parameters (ae2, be2) for determining a second rotated audio signal (E2) of the plurality of rotated audio signals (E1, E2, E3) based on the first rotated audio signal (E1).
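
One way to obtain an energy-compacting orthogonal transform from a frame is an eigendecomposition of the frame's inter-channel covariance (a Karhunen-Loeve transform). The abstract does not say which transform is used, so the sketch below is an assumption, and the function name and array layout are hypothetical.

```python
import numpy as np

def energy_compacting_transform(frame):
    """Derive an orthogonal transform V from one frame and apply it.

    frame: array of shape (channels, samples).
    Returns (V, rotated), where the rows of V are the eigenvectors of the
    frame covariance sorted by decreasing eigenvalue, and rotated = V @ frame.
    """
    cov = frame @ frame.T / frame.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder: strongest component first
    V = eigvecs[:, order].T
    rotated = V @ frame
    return V, rotated
```

Under this assumption the first row of `rotated` plays the role of E1 and carries the most energy, which is what makes it the natural candidate for waveform coding. Because V is orthogonal, `V.T @ rotated` recovers the original frame.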

Audio Encoding/Decoding based on an Efficient Representation of Auto-Regressive Coefficients
20230178087 · 2023-06-08

An encoder is described for encoding a parametric spectral representation (f) of auto-regressive coefficients that partially represent an audio signal. The encoder includes a low-frequency encoder configured to quantize elements of a part of the parametric spectral representation that correspond to a low-frequency part of the audio signal. It also includes a high-frequency encoder configured to encode a high-frequency part (f^H) of the parametric spectral representation (f) by weighted averaging based on the quantized elements (f̂^L) flipped around a quantized mirroring frequency (f̂_m), which separates the low-frequency part from the high-frequency part, and a frequency grid determined from a frequency grid codebook in a closed-loop search procedure. Also described are a corresponding decoder, corresponding encoding/decoding methods and UEs including such an encoder/decoder.
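
The flipping and closed-loop grid search can be sketched as follows. The abstract gives no formulas, so several details are assumptions: flipping is taken to be a plain reflection about the mirroring frequency, the weighted average uses one scalar `weight`, the search minimizes squared error, and the low part, high part and grids all have the same length. The function names are hypothetical.

```python
def flip_around(f_low_q, f_mirror_q):
    """Reflect the quantized low-frequency elements about the mirroring frequency."""
    return [2.0 * f_mirror_q - f for f in reversed(f_low_q)]

def encode_high_part(f_high, f_low_q, f_mirror_q, grid_codebook, weight=0.5):
    """Closed-loop search: try every grid and keep the best-matching one.

    Returns (index of the chosen grid, resulting estimate of f_high).
    The squared-error criterion and the scalar weight are assumptions.
    """
    flipped = flip_around(f_low_q, f_mirror_q)
    best_index, best_err, best_est = None, float("inf"), None
    for index, grid in enumerate(grid_codebook):
        # Weighted average of the flipped low part and the candidate grid.
        est = [(1.0 - weight) * fl + weight * g for fl, g in zip(flipped, grid)]
        err = sum((a - b) ** 2 for a, b in zip(f_high, est))
        if err < best_err:
            best_index, best_err, best_est = index, err, est
    return best_index, best_est
```

In this sketch only the grid index would need to be transmitted for the high-frequency part, since a decoder holding the same quantized low part and mirroring frequency could repeat the flipping and averaging.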

Audio or video encoder, audio or video decoder and related methods for processing multi-channel audio or video signals using a variable prediction direction

An encoder/decoder is based on a combination of two audio or video channels to obtain a first combination signal as a mid-signal and a residual signal derivable using a predicted side signal derived from the mid-signal. A decoder uses the prediction residual signal, the first combination signal, a prediction direction indicator and prediction information to derive decoded first channel and second channel signals. A real-to-imaginary transform can be applied for estimating the imaginary part of the spectrum of the first combination signal. For calculating the prediction signal used in the derivation of the prediction residual signal, the real-valued first combination signal is multiplied by a real portion of the complex prediction information and the estimated imaginary part of the first combination signal is multiplied by an imaginary portion of the complex prediction information.
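
The prediction arithmetic in this abstract can be sketched per spectral coefficient. Several points are assumptions rather than statements from the patent: the two products are summed, mid and side are formed as half-sum and half-difference, the imaginary-part estimate is passed in precomputed (the real-to-imaginary transform itself is not shown), and the direction indicator is modelled as a boolean named `side_from_mid`.

```python
def prediction_signal(first_re, first_im_est, alpha):
    """Real part times Re(alpha) plus estimated imaginary part times Im(alpha)."""
    return [alpha.real * r + alpha.imag * i
            for r, i in zip(first_re, first_im_est)]

def prediction_residual(second, first_re, first_im_est, alpha):
    """Encoder side: what remains of the second combination signal after prediction."""
    pred = prediction_signal(first_re, first_im_est, alpha)
    return [s - p for s, p in zip(second, pred)]

def reconstruct(first_re, first_im_est, resid, alpha, side_from_mid=True):
    """Decoder side: rebuild both channels from first combination signal and residual.

    side_from_mid is a hypothetical stand-in for the prediction direction
    indicator: True means the side signal was predicted from the mid signal,
    False means the mid signal was predicted from the side signal.
    """
    pred = prediction_signal(first_re, first_im_est, alpha)
    second = [d + p for d, p in zip(resid, pred)]
    mid, side = (first_re, second) if side_from_mid else (second, first_re)
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

Encoding with `prediction_residual` and decoding with `reconstruct` using the same `alpha` and the same imaginary-part estimate returns the original channels, up to rounding.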

Apparatus and method realizing a fading of an MDCT spectrum to white noise prior to FDNS application

An apparatus for decoding an encoded audio signal to obtain a reconstructed audio signal includes a receiving interface for receiving one or more frames comprising information on a plurality of audio signal samples of an audio signal spectrum of the encoded audio signal, and a processor for generating the reconstructed audio signal. The processor is configured to generate the reconstructed audio signal by fading a modified spectrum to a target spectrum, if a current frame is not received by the receiving interface or if the current frame is received by the receiving interface but is corrupted, wherein the modified spectrum includes a plurality of modified signal samples, wherein, for each of the modified signal samples of the modified spectrum, an absolute value of the modified signal sample is equal to an absolute value of one of the audio signal samples of the audio signal spectrum.
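
A minimal sketch of the concealment step follows. The abstract only requires that each modified sample keep the magnitude of some sample of the received spectrum; random sign flipping is one way to satisfy that and is an assumption here, as are the linear crossfade, the `fade` parameter and the function names.

```python
import random

def sign_scramble(spectrum, rng):
    """Keep each sample's magnitude but pick its sign at random."""
    return [abs(x) * rng.choice((-1.0, 1.0)) for x in spectrum]

def fade_to_target(modified, target, fade):
    """Linear crossfade: fade=0.0 gives the modified spectrum, 1.0 the target."""
    return [(1.0 - fade) * m + fade * t for m, t in zip(modified, target)]

def conceal_frame(last_spectrum, target, fade, seed=0):
    """Build a replacement spectrum for a lost or corrupted frame.

    last_spectrum: spectrum of the last good frame (assumed available).
    target: target spectrum, for example a white-noise-shaped spectrum.
    """
    rng = random.Random(seed)
    modified = sign_scramble(last_spectrum, rng)
    return fade_to_target(modified, target, fade)
```

Raising `fade` over consecutive lost frames would move the output gradually from the sign-scrambled copy toward the target spectrum.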

MDCT-based complex prediction stereo coding

The invention provides methods and devices for stereo encoding and decoding using complex prediction in the frequency domain. In one embodiment, a decoding method, for obtaining an output stereo signal from an input stereo signal encoded by complex prediction coding and comprising first frequency-domain representations of two input channels, comprises the upmixing steps of: (i) computing a second frequency-domain representation of a first input channel; and (ii) computing an output channel on the basis of the first and second frequency-domain representations of the first input channel, the first frequency-domain representation of the second input channel and a complex prediction coefficient. The upmixing can be suspended responsive to control data.
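
The two upmixing steps can be sketched as below. The abstract does not give the formula, so this assumes that the first input channel is a downmix, the second is a residual, and the side signal is the residual plus the real part of the prediction coefficient times the complex downmix spectrum. The callable `to_second_repr` is a hypothetical placeholder for whatever computes the second frequency-domain representation, and passing the inputs through unchanged when upmixing is suspended is also an assumption.

```python
def upmix(downmix, residual, alpha, to_second_repr, suspend=False):
    """Return (left, right) spectra from a downmix/residual pair.

    downmix, residual: first frequency-domain representations of the two
    input channels. alpha: complex prediction coefficient.
    to_second_repr: hypothetical callable producing the second
    representation of the downmix.
    suspend: models the control data that suspends upmixing.
    """
    if suspend:
        # Assumed behaviour: the input channels are output unchanged.
        return list(downmix), list(residual)
    # Step (i): second frequency-domain representation of the first channel.
    downmix_second = to_second_repr(downmix)
    # Step (ii): side = residual + Re(alpha * (downmix + j * downmix_second)).
    side = [d + (alpha * complex(m, ms)).real
            for d, m, ms in zip(residual, downmix, downmix_second)]
    left = [m + s for m, s in zip(downmix, side)]
    right = [m - s for m, s in zip(downmix, side)]
    return left, right
```

When `alpha` has no imaginary part, the second representation does not affect the result, so a decoder built along these lines could skip step (i) in that case.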