Patent classifications
G10L19/032
SYSTEM AND METHOD FOR PROCESSING AUDIO DATA
An encoder is operable to filter audio signals into a plurality of frequency band components, generate quantized digital components for each band, identify a potential for pre-echo events within the generated quantized digital components, generate an approximate signal by decoding the quantized digital components using inverse pulse code modulation, generate an error signal by comparing the approximate signal with the sampled audio signal, and process the error signal and quantized digital components. The encoder is operable to process the error signal by processing delayed audio signals and Q band values, determining the potential for pre-echo events from the Q band values, and determining scale factors and MDCT block sizes for the potential for pre-echo events. The encoder is further operable to transform the error signal into high resolution frequency components using the MDCT block sizes, quantize the scale factors and frequency components, and encode the quantized lines, block sizes, and quantized scale factors for inclusion in the bitstream.
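The abstract does not say how the pre-echo potential is measured or which block sizes are used. As a rough illustration only, the sketch below flags a frame when the energy jumps sharply between consecutive sub-blocks and then picks a short MDCT block for it. It works on plain time-domain samples as a stand-in for the Q band values, and every name, threshold, and block size in it (`has_pre_echo_potential`, `ratio_threshold`, 1024/128) is an assumption, not taken from the patent.

```python
def subblock_energies(frame, num_subblocks):
    """Split a frame into equal sub-blocks and return the energy of each."""
    size = len(frame) // num_subblocks
    return [
        sum(x * x for x in frame[i * size:(i + 1) * size])
        for i in range(num_subblocks)
    ]


def has_pre_echo_potential(frame, num_subblocks=8, ratio_threshold=10.0, floor=1e-9):
    """Flag a frame whose energy jumps sharply from one sub-block to the next.

    Hypothetical criterion: the abstract derives the flag from Q band values,
    which are not modelled here.
    """
    energies = subblock_energies(frame, num_subblocks)
    return any(
        cur > ratio_threshold * max(prev, floor)
        for prev, cur in zip(energies, energies[1:])
    )


def choose_mdct_block_size(frame, long_size=1024, short_size=128):
    """Pick a short block for transient frames and a long block otherwise.

    The two sizes are illustrative defaults, not values from the source.
    """
    return short_size if has_pre_echo_potential(frame) else long_size


steady = [0.1] * 1024                 # constant level, no attack
attack = [0.0] * 512 + [1.0] * 512    # silence followed by a sudden onset
print(choose_mdct_block_size(steady))
print(choose_mdct_block_size(attack))
```

A shorter block confines the quantization noise of the onset to a shorter time window, which is the usual motivation for switching block sizes when a pre-echo is likely.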
AUDIO SIGNAL ENCODING METHOD AND APPARATUS, AND AUDIO SIGNAL DECODING METHOD AND APPARATUS
An audio signal encoding method and apparatus, and an audio signal decoding method and apparatus, are described. The encoding method includes obtaining a target frequency-domain coefficient of a current frame and a reference target frequency-domain coefficient of the current frame. The encoding method further includes calculating a cost function based on the target frequency-domain coefficient and the reference target frequency-domain coefficient of the current frame, where the cost function is for determining whether to perform long-term prediction (LTP) processing on the current frame during encoding of the target frequency-domain coefficient of the current frame. Additionally, the method includes encoding the target frequency-domain coefficient of the current frame based on the cost function.
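The abstract does not give the form of the cost function. One plausible choice, shown here purely as a sketch, is the share of the target's energy that remains after subtracting a gain-scaled reference: a small value means the reference predicts the current frame well, so LTP is worth enabling. The function names and the `max_cost` threshold are hypothetical.

```python
def ltp_gain(target, reference):
    """Least-squares gain that best scales the reference onto the target."""
    denom = sum(r * r for r in reference)
    if denom == 0.0:
        return 0.0
    return sum(t * r for t, r in zip(target, reference)) / denom


def ltp_cost(target, reference):
    """Residual energy divided by target energy (lower favours LTP).

    Assumed cost function; the source only states that a cost is computed
    from the target and reference target frequency-domain coefficients.
    """
    g = ltp_gain(target, reference)
    target_energy = sum(t * t for t in target)
    if target_energy == 0.0:
        return 1.0
    residual_energy = sum((t - g * r) ** 2 for t, r in zip(target, reference))
    return residual_energy / target_energy


def use_ltp(target, reference, max_cost=0.5):
    """Enable LTP for the frame when the prediction removes enough energy."""
    return ltp_cost(target, reference) < max_cost


predictable = use_ltp([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 1.5, 2.0])
unrelated = use_ltp([1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0])
print(predictable, unrelated)
```

In the first call the reference is an exact scaled copy of the target, so the residual vanishes; in the second the two vectors are orthogonal and the prediction removes nothing.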
AUDIO SIGNAL ENCODING METHOD AND APPARATUS, AND AUDIO SIGNAL DECODING METHOD AND APPARATUS
An audio signal encoding method and apparatus, and an audio signal decoding method and apparatus are provided. The audio signal encoding method includes: obtaining a frequency-domain coefficient of a current frame and a frequency-domain coefficient of a reference signal of the current frame; performing filtering processing on the frequency-domain coefficient of the current frame to obtain a filtering parameter; determining a target frequency-domain coefficient of the current frame based on the filtering parameter; performing filtering processing on the frequency-domain coefficient of the reference signal and a reference frequency-domain coefficient based on the filtering parameter to obtain a target frequency-domain coefficient of the reference signal; and encoding the target frequency-domain coefficient of the current frame based on the target frequency-domain coefficient of the current frame, the target frequency-domain coefficient of the reference signal, and a reference target frequency-domain coefficient. The method can improve audio signal encoding/decoding efficiency.
Audio processing for voice encoding and decoding using spectral shaper model
The present disclosure relates to an audio encoding and decoding (codec) system for voice encoding/decoding using a spectral shaper model. In an embodiment, a method of audio signal decoding comprises: receiving a bit stream associated with an audio signal, the bit stream including encoded transform coefficients, spectral envelope data and one or more parameters of a spectral shaper model, the spectral shaper model indicative of a fundamental frequency of a multi-sinusoidal signal model, where the fundamental frequency corresponds to a time domain delay; decoding the encoded transform coefficients; adjusting the decoded transform coefficients using the spectral envelope data and the spectral shaper model; reconstructing transform coefficients of the audio signal using the adjusted, decoded transform coefficients; and transforming the reconstructed transform coefficients into a time domain audio signal.
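The link between the fundamental frequency and a time-domain delay can be made concrete: a delay of `delay_samples` at sampling rate fs corresponds to f0 = fs / delay_samples, and with N transform bins spanning 0 to fs/2 the k-th harmonic lands near bin 2·N·k / delay_samples. The sketch below uses that mapping to emphasize harmonic bins after envelope scaling. The approximate bin mapping, the multiplicative `boost`, and both function names are assumptions made for illustration; the abstract does not describe how the shaper is applied.

```python
def harmonic_bins(delay_samples, num_coeffs):
    """Transform-bin indices nearest to multiples of the fundamental.

    Assumes num_coeffs bins spanning 0..fs/2 and f0 = fs / delay_samples,
    so harmonic k falls near bin 2 * num_coeffs * k / delay_samples.
    """
    if delay_samples <= 0:
        raise ValueError("delay_samples must be positive")
    bins = []
    k = 1
    while True:
        index = round(2 * num_coeffs * k / delay_samples)
        if index >= num_coeffs:
            break
        bins.append(index)
        k += 1
    return bins


def apply_shaper(decoded, envelope, delay_samples, boost=0.5):
    """Scale decoded coefficients by the envelope and emphasize harmonic bins.

    Hypothetical shaper: a flat gain of (1 + boost) at harmonic bins and
    1 elsewhere.
    """
    peaks = set(harmonic_bins(delay_samples, len(decoded)))
    return [
        c * e * ((1.0 + boost) if i in peaks else 1.0)
        for i, (c, e) in enumerate(zip(decoded, envelope))
    ]


shaped = apply_shaper([1.0] * 32, [2.0] * 32, delay_samples=8)
print(harmonic_bins(64, 256)[:3])
print(shaped[7], shaped[8])
```

Because the sampling rate cancels out of the bin formula, only the delay and the number of coefficients are needed to locate the harmonics.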
Direct mapping
A single-bit audio stream can be converted to a modified single-bit audio stream with a constant edge rate while maintaining a modulation index of the original audio stream using direct mapping. With direct mapping, a pre-filter bank may be combined with a multi-bit symbol mapper to select symbols for the modified audio stream with a constant edge rate per symbol and the same modulation index as the original audio stream. The output of the pre-filter bank may be an audio stream with no consecutive full-scale symbols. Using the output of the pre-filter bank, a multi-bit symbol mapper may use the symbol selector to output a symbol with a constant edge rate per symbol and the same modulation index as the original signal. The symbols may be converted to an analog signal for reproduction of audio content using a transducer.
Methods and apparatus for rate quality scalable coding with generative models
Described herein is a method of decoding an audio or speech signal, the method including the steps of: (a) receiving, by a decoder, a coded bitstream including the audio or speech signal and conditioning information; (b) providing, by a bitstream decoder, decoded conditioning information in a format associated with a first bitrate; (c) converting, by a converter, the decoded conditioning information from the format associated with the first bitrate to a format associated with a second bitrate; and (d) providing, by a generative neural network, a reconstruction of the audio or speech signal according to a probabilistic model conditioned by the conditioning information in the format associated with the second bitrate. Also described are an apparatus for decoding an audio or speech signal, a respective encoder, a system comprising the encoder and the apparatus for decoding an audio or speech signal, and a respective computer program product.