Patent classification: G10L19/032
INTER-CHANNEL PHASE DIFFERENCE PARAMETER ENCODING METHOD AND APPARATUS
The present disclosure describes an inter-channel phase difference (IPD) parameter encoding method, where a current frame is obtained; a signal type and an IPD parameter encoding scheme of a previous frame are obtained; a current IPD parameter encoding scheme is determined at least based on the signal type of the previous frame and the previous IPD parameter encoding scheme; and an IPD parameter of the current frame is processed based on the current IPD parameter encoding scheme.
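The scheme-selection step above can be sketched as a small decision function. The signal-type names and the decision rule here are illustrative assumptions, not taken from the patent:

```python
def select_ipd_scheme(prev_signal_type: str, prev_scheme: str) -> str:
    """Pick the IPD parameter encoding scheme for the current frame from
    the previous frame's signal type and its encoding scheme.
    Type names and the rule itself are hypothetical placeholders."""
    if prev_signal_type == "transient":
        # re-anchor with absolute coding after a transient frame
        return "absolute"
    # stationary content: keep differential coding if already in use
    return "differential" if prev_scheme == "differential" else "absolute"
```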
SUPPORT FOR GENERATION OF COMFORT NOISE, AND GENERATION OF COMFORT NOISE
A method for generation of comfort noise for at least two audio channels. The method comprises determining a spatial coherence between audio signals on the respective audio channels, wherein at least one spatial coherence value per frame and frequency band is determined to form a vector of spatial coherence values. A vector of predicted spatial coherence values is formed as a weighted combination of a first coherence prediction and a second coherence prediction, combined using a weight factor α. The method further comprises signaling information about the weight factor α to a receiving node, enabling the generation of the comfort noise for the at least two audio channels at the receiving node.
Audio Transcoding Method and Apparatus, Audio Transcoder, Device, and Storage Medium
Provided is an audio transcoding method, including: (301) performing entropy decoding on a first audio stream with a first bitrate, to obtain an audio feature parameter and an excitation signal of the first audio stream, the excitation signal being a quantized audio signal; (302) obtaining a time-domain audio signal corresponding to the excitation signal based on the audio feature parameter and the excitation signal; (303) re-quantizing the excitation signal and the audio feature parameter based on the time-domain audio signal and a target transcoding bitrate, to obtain a target excitation signal and a target audio feature parameter; and (304) performing entropy coding on the target audio feature parameter and the target excitation signal, to obtain a second audio stream with a second bitrate, the second bitrate being lower than the first bitrate.
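The core of step (303) is re-quantizing the already-quantized excitation to a coarser grid that fits the lower target bitrate. A toy sketch of that one step, with the grid sizes and rounding rule as illustrative assumptions:

```python
def requantize(excitation, levels_in: int, levels_out: int):
    """Toy version of step (303): map quantized excitation samples from a
    fine grid (levels_in steps) onto a coarser grid (levels_out steps)
    for a lower target transcoding bitrate. Illustrative only; the actual
    method also uses the time-domain signal and feature parameters."""
    scale = (levels_out - 1) / (levels_in - 1)
    return [round(q * scale) for q in excitation]
```

Because decoding stops at the excitation rather than re-running a full encoder, this kind of transcoder avoids a complete decode/re-encode cycle.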
METHOD AND APPARATUS FOR DETERMINING WEIGHTING FACTOR DURING STEREO SIGNAL ENCODING
Various embodiments provide a method and an apparatus for determining a weighting factor during stereo signal encoding. In those embodiments, a parameter value is determined based on the encoding mode of a to-be-encoded signal in a stereo signal and a predefined correspondence between encoding modes and parameter values. Based on the determined parameter value and the energy spectrum of a linear prediction filter corresponding to the original line spectral frequency (LSF) parameter of the to-be-encoded signal, a weighting factor is calculated for measuring the distance between the original LSF parameter and a target LSF parameter.
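The quantity the weighting factor ultimately feeds into is a weighted distance between LSF vectors. A minimal sketch, assuming a squared-error form and per-coefficient weights derived from the LP energy spectrum and the mode-dependent parameter (both the weight formula and the distance form are assumptions, not the patented expressions):

```python
def lsf_weights(energy_spectrum, mode_param: float):
    """Per-coefficient weights from the LP filter energy spectrum, scaled
    by the mode-dependent parameter. The square-root form is illustrative."""
    return [mode_param * e ** 0.5 for e in energy_spectrum]


def weighted_lsf_distance(lsf, target_lsf, weights):
    """Weighted squared distance between the original and target LSF
    vectors, the quantity the weighting factor is computed for."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, lsf, target_lsf))
```

Making the weights mode-dependent lets the quantizer emphasize different spectral regions for, e.g., speech-like versus music-like encoding modes.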
Methods of encoding and decoding speech signal using neural network model recognizing sound sources, and encoding and decoding apparatuses for performing the same
Methods of encoding and decoding a speech signal using a neural network model that recognizes sound sources, and encoding and decoding apparatuses for performing the methods are provided. A method of encoding a speech signal includes identifying an input signal for a plurality of sound sources; generating a latent signal by encoding the input signal; obtaining a plurality of sound source signals by separating the latent signal for each of the plurality of sound sources; determining a number of bits used for quantization of each of the plurality of sound source signals according to a type of each of the plurality of sound sources; quantizing each of the plurality of sound source signals based on the determined number of bits; and generating a bitstream by combining the plurality of quantized sound source signals.
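The type-dependent bit-allocation step can be sketched as a lookup from sound-source type to a bit budget. The type names, bit counts, and default are hypothetical placeholders, not values from the patent:

```python
# Illustrative per-type bit budgets; a real codec would derive these
# from rate-distortion considerations, not a fixed table.
BITS_PER_TYPE = {"speech": 8, "music": 6, "noise": 2}

def allocate_bits(source_types, default: int = 4):
    """Choose a quantization bit count for each separated sound-source
    signal according to its type, as in the abstract's allocation step."""
    return [BITS_PER_TYPE.get(t, default) for t in source_types]
```

Each separated latent sound-source signal is then quantized with its allocated bit count before the quantized signals are combined into the bitstream.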