Patent classifications
G10L19/13
Audio coding based on audio pattern recognition
In general, techniques are described by which to perform audio coding based on audio pattern recognition. A source device comprising a memory and a processor may be configured to perform the techniques. The memory may store audio data. The processor may obtain, from a plurality of categories, a category to which the audio data corresponds, and obtain, based on the category, a set of pyramid vector quantization (PVQ) parameters from a plurality of sets of PVQ parameters. The processor may also perform, based on the set of PVQ parameters, PVQ with respect to the audio data to obtain a residual identifier representative of the audio data, and specify, in a bitstream, the residual identifier.
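The category-driven PVQ step described above can be illustrated with a minimal sketch. The category names, the parameter table `PVQ_PARAMS`, and the greedy pulse search are assumptions made for illustration only; the abstract does not specify them. The sketch returns the integer pulse vector, which stands in for the codeword that a residual identifier would index.

```python
import numpy as np

# Hypothetical table: each category selects its own PVQ parameter set.
# "pulses" is K, the number of unit pulses placed on the pyramid.
PVQ_PARAMS = {"speech": {"pulses": 4}, "music": {"pulses": 8}}

def pvq_quantize(x, pulses):
    """Return an integer vector y with sum(|y|) == pulses that follows the shape of x."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    total = abs_x.sum()
    if total == 0.0:
        # Degenerate input: put all pulses on the first coefficient.
        y = np.zeros(len(x), dtype=int)
        y[0] = pulses
        return y
    target = abs_x * pulses / total          # ideal (non-integer) pulse counts
    y = np.floor(target).astype(int)         # round down first
    while y.sum() < pulses:                  # then hand out the leftover pulses
        y[np.argmax(target - y)] += 1        # to the most under-served coefficient
    signs = np.where(x < 0, -1, 1)
    return signs * y

def encode(x, category):
    """Look up the PVQ parameters for the category, then quantize."""
    params = PVQ_PARAMS[category]
    return pvq_quantize(x, params["pulses"])

codeword = encode([0.5, -1.0, 0.25, 0.0], "speech")
```

A real codec would additionally map the pulse vector to a single integer index by enumerating the codebook; that enumeration is omitted here.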
Method for adaptively encoding an audio signal in dependence on noise information for higher encoding accuracy
An audio encoder for providing an encoded representation on the basis of an audio signal, wherein the audio encoder is configured to obtain noise information describing noise included in the audio signal, and wherein the audio encoder is configured to adaptively encode the audio signal in dependence on the noise information, such that encoding accuracy is higher for parts of the audio signal that are less affected by the noise than for parts of the audio signal that are more affected by the noise.
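One way to picture noise-dependent encoding accuracy is a quantizer whose step size grows with the noise level of each band. The sketch below is only an illustration under assumptions: the per-band noise level normalized to [0, 1], the linear mapping, and the step limits `base_step` and `max_step` are hypothetical and not taken from the abstract.

```python
import numpy as np

def allocate_steps(noise_level, base_step=0.05, max_step=0.5):
    """Map per-band noise levels in [0, 1] to quantizer step sizes.

    Clean bands (noise near 0) get the fine base_step; noisy bands
    (noise near 1) get the coarse max_step.
    """
    noise = np.clip(np.asarray(noise_level, dtype=float), 0.0, 1.0)
    return base_step + noise * (max_step - base_step)

def encode_bands(bands, noise_level):
    """Quantize each band value with its noise-dependent step size."""
    steps = allocate_steps(noise_level)
    indices = np.round(np.asarray(bands, dtype=float) / steps).astype(int)
    return indices, steps

def decode_bands(indices, steps):
    """Reconstruct band values from quantizer indices."""
    return indices * steps

bands = np.array([0.33, 0.33])
noise = np.array([0.0, 1.0])        # first band clean, second band noisy
indices, steps = encode_bands(bands, noise)
errors = np.abs(decode_bands(indices, steps) - bands)
```

With these inputs the clean band is reconstructed more accurately than the noisy band, which is the behavior the abstract describes.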
Audio signal compression method and apparatus using deep neural network-based multilayer structure and training method thereof
A method, executed by a processor, for compressing an audio signal in multiple layers may comprise: (a) restoring, in a highest layer, an input audio signal as a first signal; (b) restoring, in at least one intermediate layer, as a second signal, a signal obtained by subtracting an upsampled signal from the input audio signal, the upsampled signal being obtained by upsampling the audio signal restored in the highest layer or in an immediately previous intermediate layer; and (c) restoring, in a lowest layer, as a third signal, a signal obtained by subtracting an upsampled signal from the input audio signal, the upsampled signal being obtained by upsampling the audio signal restored in the intermediate layer immediately before the lowest layer, wherein the first signal, the second signal, and the third signal are combined to output a final restored audio signal.
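The layered residual structure can be sketched as follows. This is one plausible reading of the abstract, not the claimed method: a uniform quantizer (`toy_codec`) stands in for the deep neural network in each layer, the downsampling factors (4, 2, 1) are hypothetical, and each layer is assumed to code whatever the upsampled output of the layers above it has not yet captured.

```python
import numpy as np

def toy_codec(signal, step):
    """Stand-in for a per-layer neural codec: coarse uniform quantization."""
    return np.round(signal / step) * step

def downsample(signal, factor):
    """Naive decimation: keep every factor-th sample."""
    return signal[::factor]

def upsample(signal, factor):
    """Naive upsampling: repeat each sample factor times."""
    return np.repeat(signal, factor)

def multilayer_restore(x, factors=(4, 2, 1), step=0.1):
    """Restore x layer by layer, highest (coarsest) layer first.

    Each layer codes the input minus the upsampled result of the layers
    above it; the upsampled layer outputs are summed into the final signal.
    The length of x must be divisible by the largest factor.
    """
    x = np.asarray(x, dtype=float)
    approximation = np.zeros_like(x)
    for factor in factors:
        residual = x - approximation
        restored = toy_codec(downsample(residual, factor), step)
        approximation = approximation + upsample(restored, factor)
    return approximation

x = np.sin(np.linspace(0.0, 2.0 * np.pi, 16))
y = multilayer_restore(x)
```

Because the lowest layer in this sketch runs at the full sample rate, the final error is bounded by half the quantizer step of that layer.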
Audio encoder for encoding a multichannel signal and audio decoder for decoding an encoded audio signal
An audio encoder for encoding a multichannel signal is shown. The audio encoder includes a downmixer for downmixing the multichannel signal to obtain a downmix signal having a low band and a high band; a linear prediction domain core encoder for encoding the downmix signal, configured to apply bandwidth extension processing for parametrically encoding the high band; a filterbank for generating a spectral representation of the multichannel signal; and a joint multichannel encoder configured to process the spectral representation, including the low band and the high band of the multichannel signal, to generate multichannel information.
Audio encoder for encoding a multichannel signal and audio decoder for decoding an encoded audio signal
A schematic block diagram of an audio encoder for encoding a multichannel audio signal is shown. The audio encoder includes a linear prediction domain encoder, a frequency domain encoder, and a controller for switching between the linear prediction domain encoder and the frequency domain encoder. The controller is configured such that a portion of the multichannel signal is represented either by an encoded frame of the linear prediction domain encoder or by an encoded frame of the frequency domain encoder. The linear prediction domain encoder includes a downmixer for downmixing the multichannel signal to obtain a downmix signal, a linear prediction domain core encoder for encoding the downmix signal, and a first joint multichannel encoder for generating first multichannel information from the multichannel signal.