Patent classifications
G10L19/032
AUDIO PROCESSING FOR VOICE ENCODING AND DECODING
The present document relates to an audio encoding and decoding system (referred to as an audio codec system). In particular, the present document relates to an audio codec system that is particularly well suited for voice encoding/decoding. A transform-based speech encoder configured to encode a speech signal into a bitstream is described. A speech decoder configured to decode audio signals from a bitstream is further described.
APPARATUS AND METHOD FOR ENCODING A PLURALITY OF AUDIO OBJECTS USING DIRECTION INFORMATION DURING A DOWNMIXING OR APPARATUS AND METHOD FOR DECODING USING AN OPTIMIZED COVARIANCE SYNTHESIS
An apparatus for encoding a plurality of audio objects and related metadata indicating direction information on the plurality of audio objects has: a downmixer for downmixing the plurality of audio objects to obtain one or more transport channels; a transport channel encoder for encoding one or more transport channels to obtain one or more encoded transport channels; and an output interface for outputting an encoded audio signal comprising the one or more encoded transport channels, wherein the downmixer is configured to downmix the plurality of audio objects in response to the direction information on the plurality of audio objects.
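As a minimal sketch of a direction-dependent downmix, the Python snippet below mixes a set of objects into two transport channels using gains derived from each object's azimuth. The virtual-cardioid gain law, the left/right orientation at +90 and -90 degrees, and the function name `direction_weighted_downmix` are assumptions made for illustration; the abstract only states that the downmix responds to the direction information.

```python
import numpy as np


def direction_weighted_downmix(objects, azimuths_deg):
    """Downmix audio objects to two transport channels.

    objects:      array of shape (n_objects, n_samples)
    azimuths_deg: one azimuth per object, in degrees (hypothetical metadata
                  layout; +90 is taken as left, -90 as right)
    Returns an array of shape (2, n_samples).
    """
    objects = np.asarray(objects, dtype=np.float64)
    az = np.deg2rad(np.asarray(azimuths_deg, dtype=np.float64))
    if objects.shape[0] != az.shape[0]:
        raise ValueError("one azimuth is required per object")

    # Assumed gain law: two virtual cardioids pointing left and right.
    left_gain = 0.5 * (1.0 + np.cos(az - np.pi / 2.0))
    right_gain = 0.5 * (1.0 + np.cos(az + np.pi / 2.0))

    gains = np.stack([left_gain, right_gain])  # shape (2, n_objects)
    return gains @ objects                     # shape (2, n_samples)
```

With this gain law, an object at +90 degrees lands entirely in the first transport channel and an object at -90 degrees entirely in the second, while a frontal object is split equally between them.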
Audio signal coding apparatus, audio signal decoding apparatus, audio signal coding method, and audio signal decoding method
An audio signal coding apparatus includes a time-frequency transformer that outputs sub-band spectra from an input signal; a sub-band energy quantizer; a tonality calculator that analyzes tonality of the sub-band spectra; a bit allocator that selects a second sub-band on which quantization is performed by a second quantizer on the basis of the analysis result of the tonality and quantized sub-band energy, and determines a first number of bits to be allocated to a first sub-band on which quantization is performed by a first quantizer; the first quantizer that performs first coding using the first number of bits; the second quantizer that performs coding using a second coding method; and a multiplexer.
Inter-channel phase difference parameter encoding method and apparatus
This application discloses an IPD parameter encoding method, including: obtaining a reference parameter used to determine an IPD parameter encoding scheme of a current frame of a multi-channel signal; determining the IPD parameter encoding scheme of the current frame based on the reference parameter, where the determined IPD parameter encoding scheme of the current frame is one of at least two preset IPD parameter encoding schemes; and processing an IPD parameter of the current frame based on the determined IPD parameter encoding scheme of the current frame. The technical solutions provided in this application can improve encoding quality of the multi-channel signal.
Decoding apparatus, encoding apparatus, and methods and programs therefor
A decoding apparatus includes: a bandwidth extending part 25 obtaining a decoded extended frequency spectrum sequence by arranging samples based on K samples included in a frequency-domain sample sequence obtained by decoding, on a higher side than the frequency-domain sample sequence; and a fricative sound adjustment releasing part 23 obtaining, if inputted information indicating whether a hissing sound or not indicates being a hissing sound, what is obtained by exchanging all or a part of a low-side frequency sample sequence existing on a lower side than a predetermined frequency in the decoded extended frequency spectrum sequence for all or a part of a high-side frequency sample sequence existing on a higher side than the predetermined frequency in the decoded extended frequency spectrum sequence as an adjusted frequency spectrum sequence, the number of all or the part of the high-side frequency spectrum sequence being the same as the number of all or the part of the low-side frequency spectrum sequence.
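The two decoder stages named in this abstract can be sketched in Python as follows. This is an illustrative reading, not the claimed implementation: copying the last K decoded samples as the extension, the parameters `split` (standing in for the predetermined frequency, as a bin index) and `count` (the number of samples exchanged on each side), and both function names are assumptions.

```python
import numpy as np


def extend_bandwidth(decoded, k):
    """Append samples based on the top k decoded samples above the band.

    Assumption: the extension is a plain copy of the last k samples.
    """
    decoded = np.asarray(decoded, dtype=np.float64)
    if not 0 < k <= decoded.size:
        raise ValueError("k must be between 1 and the number of decoded samples")
    return np.concatenate([decoded, decoded[-k:]])


def release_fricative_adjustment(spectrum, is_hissing, split, count):
    """Swap `count` samples below `split` with `count` samples above it.

    The swap is applied only when the hissing-sound flag is set; otherwise
    the spectrum is returned unchanged (as a copy).
    """
    spectrum = np.asarray(spectrum, dtype=np.float64)
    out = spectrum.copy()
    if not is_hissing:
        return out
    if count > split or split + count > spectrum.size:
        raise ValueError("swap range does not fit inside the spectrum")

    low = slice(split - count, split)
    high = slice(split, split + count)
    # Equal-length exchange: the low-side block and high-side block trade places.
    out[low] = spectrum[high]
    out[high] = spectrum[low]
    return out
```

Both blocks have the same length by construction, which matches the abstract's requirement that the number of exchanged high-side samples equals the number of exchanged low-side samples.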
MAINTAINING INVARIANCE OF SENSORY DISSONANCE AND SOUND LOCALIZATION CUES IN AUDIO CODECS
A method including receiving a plurality of audio channels based on an audio stream, applying a model based on at least one acoustic perception algorithm to the plurality of audio channels to generate a first modelled audio stream, quantizing the plurality of audio channels using a first set of quantization parameters, dequantizing the quantized plurality of audio channels using the first set of quantization parameters, applying the model based on at least one acoustic perception algorithm to the dequantized plurality of audio channels to generate a second modelled audio stream, comparing the first modelled audio stream and the second modelled audio stream, in response to determining the comparison of the first modelled audio stream and the second modelled audio stream does not meet a criterion, generating a second set of quantization parameters, and quantizing the plurality of audio channels using the second set of quantization parameters.
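The quantize, dequantize, model, compare, and re-quantize loop described in this abstract can be sketched in Python as below. The perceptual model here is a toy stand-in (per-channel log energy), and the uniform quantizer, the halving of the step size, the tolerance, and all function names are assumptions chosen for illustration; the abstract does not specify any of them.

```python
import numpy as np


def toy_perceptual_model(channels):
    """Hypothetical stand-in for an acoustic perception model.

    Returns the log energy of each channel.
    """
    return np.log10(np.sum(channels ** 2, axis=1) + 1e-12)


def quantize(channels, step):
    """Uniform scalar quantizer; `step` plays the role of the parameter set."""
    return np.round(channels / step).astype(np.int64)


def dequantize(indices, step):
    return indices.astype(np.float64) * step


def encode_with_perceptual_check(channels, step=0.5, tolerance=0.01,
                                 shrink=0.5, max_iters=20):
    """Re-quantize with finer parameters until the modelled output matches."""
    channels = np.asarray(channels, dtype=np.float64)
    reference = toy_perceptual_model(channels)  # first modelled stream

    for _ in range(max_iters):
        indices = quantize(channels, step)
        candidate = toy_perceptual_model(dequantize(indices, step))  # second modelled stream
        if np.max(np.abs(candidate - reference)) <= tolerance:
            return indices, step  # criterion met
        step *= shrink  # generate a new set of quantization parameters

    # Criterion never met within max_iters: use the finest step reached.
    return quantize(channels, step), step
```

The function returns both the quantized indices and the step that produced them, so a decoder-side `dequantize` call can use the matching parameter.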