Patent classifications
G10L19/0208
Audio encoder, audio decoder and related methods using two-channel processing within an intelligent gap filling framework
An apparatus for generating a decoded two-channel signal, comprising: a parametric decoder for providing parametric data for a second set of second spectral portions and a two-channel identification identifying for a second spectral portion of the second set of second spectral portions either a first two-channel representation for the second spectral portion of the second set of second spectral portions or a second two-channel representation for the second spectral portion of the second set of second spectral portions, the second two-channel representation being different from the first two-channel representation; and a frequency regenerator for regenerating the second spectral portion of the second set of second spectral portions depending on a first spectral portion of a first set of first spectral portions, the parametric data for the second spectral portion of the second set of second spectral portions and the two-channel identification for the second spectral portion of the second set of second spectral portions to acquire a regenerated second spectral portion of the second set of second spectral portions.
AUDIO ENCODER, AUDIO DECODER AND RELATED METHODS USING TWO-CHANNEL PROCESSING WITHIN AN INTELLIGENT GAP FILLING FRAMEWORK
An apparatus for generating a decoded two-channel signal, comprising: a parametric decoder for providing parametric data for a second set of second spectral portions and a two-channel identification identifying for a second spectral portion of the second set of second spectral portions either a first two-channel representation for the second spectral portion of the second set of second spectral portions or a second two-channel representation for the second spectral portion of the second set of second spectral portions, the second two-channel representation being different from the first two-channel representation; and a frequency regenerator for regenerating the second spectral portion of the second set of second spectral portions depending on a first spectral portion of a first set of first spectral portions, the parametric data for the second spectral portion of the second set of second spectral portions and the two-channel identification for the second spectral portion of the second set of second spectral portions to acquire a regenerated second spectral portion of the second set of second spectral portions.
ENCODING DEVICE, DECODING DEVICE, ENCODING METHOD, DECODING METHOD, AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM
An encoding device according to the disclosure includes a first encoder, which in operation, encodes a low-band signal from a voice or audio input signal to generate a first encoded signal; a decoder, which in operation, decodes the first encoded signal to generate a low-band decoded signal; a second encoder, which in operation, encodes, on the basis of the low-band decoded signal, a high-band signal comprising a band from the voice or audio input signal, the band being higher than that of the low-band signal to generate a high-band encoded signal; an energy calculator, which in operation, calculates an energy of the voice or audio input signal for each subband of a plurality of subbands of the voice or audio input signal to acquire a calculated energy for each subband of the plurality of subbands of the voice or audio input signal, quantizes the calculated energy for each subband of the plurality of subbands of the voice or audio input signal to acquire a quantized band energy for each subband of the plurality of subbands of the voice or audio input signal and outputs the quantized band energy for each subband of the plurality of subbands of the voice or audio input signal; and a multiplexer, which in operation, multiplexes the quantized band energy for each subband of the plurality of subbands of the voice or audio input signal, the first encoded signal, and the high-band encoded signal to generate and output an encoded signal.
Model Based Prediction in a Critically Sampled Filterbank
The present document relates to audio source coding systems. In particular, the present document relates to audio source coding systems which make use of linear prediction in combination with a filterbank. A method for estimating a first sample (615) of a first subband signal in a first subband of an audio signal is described. The first subband signal of the audio signal is determined using an analysis filterbank (612) comprising a plurality of analysis filters which provide a plurality of subband signals in a plurality of subbands from the audio signal, respectively. The method comprises determining a model parameter (613) of a signal model; determining a prediction coefficient to be applied to a previous sample (614) of a first decoded subband signals derived from the first subband signal, based on the signal model, based on the model parameter (613) and based on the analysis filterbank (612); wherein a time slot of the previous sample (614) is prior to a time slot of the first sample (615); and determining an estimate of the first sample (615) by applying the prediction coefficient to the previous sample (614).
Apparatus and method for decoding or encoding an audio signal using energy information values for a reconstruction band
An apparatus for decoding an encoded audio signal having an encoded representation of a first set of first spectral portions and an encoded representation of parametric data indicating spectral energies for a second set of second spectral portions, has: an audio decoder for decoding the encoded representation of the first set of the first spectral portions to obtain a first set of first spectral portions and for decoding the encoded representation of the parametric data to obtain a decoded parametric data for the second set of second spectral portions indicating, for individual reconstruction bands, individual energies; a frequency regenerator for reconstructing spectral values in a reconstruction band having a second spectral portion using a first spectral portion of the first set of the first spectral portions and an individual energy for the reconstruction band, the reconstruction band having a first spectral portion and the second spectral portion.
Bandwidth extension method and apparatus, electronic device, and computer-readable storage medium
Embodiments of this application disclose a bandwidth extension (BWE) method and apparatus. The method is performed by an electronic device, and includes: performing a time-frequency transform on a to-be-processed narrowband signal to obtain a corresponding initial low-frequency spectrum; obtaining a correlation parameter of a high-frequency portion and a low-frequency portion of a target broadband spectrum based on the initial low-frequency spectrum by using a neural network model; obtaining an initial high-frequency spectrum based on the correlation parameter and the initial low-frequency spectrum; and obtaining a broadband signal according to a target low-frequency spectrum and a target high-frequency spectrum.
ESTIMATION OF BACKGROUND NOISE IN AUDIO SIGNALS
Background noise estimators and methods are disclosed for estimating background noise in an audio signal. Some methods include obtaining at least one parameter associated with an audio signal segment, such as a frame or part of a frame, based on a first linear prediction gain, calculated as a quotient between a residual signal from a 0th-order linear prediction and a residual signal from a 2nd-order linear prediction for the audio signal segment. A second linear prediction gain is calculated as a quotient between a residual signal from a 2nd-order linear prediction and a residual signal from a 16th-order linear prediction for the audio signal segment. Whether the audio signal segment comprises a pause is determined based at least on the obtained at least one parameter; and a background noise estimate is updated based on the audio signal segment when the audio signal segment comprises a pause.
Encoding device, decoding device, encoding method, decoding method, and non-transitory computer-readable recording medium
An encoding device according to the disclosure includes a first encoding unit that generates a first encoded signal in which a low-band signal having a frequency lower than or equal to a predetermined frequency from a voice or audio input signal is encoded, and a low-band decoded signal; a second encoding unit that encodes, on the basis of the low-band decoded signal, a high-band signal having a band higher than that of the low-band signal to generate a high-band encoded signal; and a first multiplexing unit that multiplexes the first encoded signal and the high-band encoded signal to generate and output an encoded signal. The second encoding unit calculates an energy ratio between a high-band noise component, which is a noise component of the high-band signal, and a high-band non-tonal component of a high-band decoded signal generated from the low-band decoded signal and outputs the ratio as the high-band encoded signal.
MACHINE LEARNING-BASED AUDIO CODEC SWITCHING
Described herein are techniques, devices, and systems for selectively using a music-capable audio codec on-demand during a communication session. A user equipment (UE) may adaptively transition between using a first audio codec that provides a first audio bandwidth and a second audio codec (e.g., the EVS-FB codec) that provides a second audio bandwidth that is greater than the first audio bandwidth. The transition to the second audio codec may occur in response to determining that sound in the environment of the UE includes frequencies outside of a range of frequencies associated with a human voice, such as by determining that music is being played in the environment of the UE, which allows for selectively using a music-capable audio codec when it would be beneficial to do so.
Apparatus for decoding an encoded audio signal with frequency tile adaption
Apparatus for decoding an encoded audio signal including an encoded core signal and parametric data, including: a core decoder for decoding the encoded core signal to obtain a decoded core signal; an analyzer for analyzing the decoded core signal before or after performing a frequency regeneration operation to provide an analysis result; and a frequency regenerator for regenerating spectral portions not included in the decoded core signal using a spectral portion of the decoded core signal, the parametric data, and the analysis result.