Patent classifications
G10L19/028
Transform Encoding/Decoding of Harmonic Audio Signals
An encoder for encoding frequency transform coefficients of a harmonic audio signal includes the following elements: A peak locator configured to locate spectral peaks having magnitudes exceeding a predetermined frequency-dependent threshold. A peak region encoder configured to encode peak regions including and surrounding the located peaks. A low-frequency set encoder configured to encode at least one low-frequency set of coefficients outside the peak regions and below a crossover frequency that depends on the number of bits used to encode the peak regions. A noise-floor gain encoder configured to encode a noise-floor gain of at least one high-frequency set of not yet encoded coefficients outside the peak regions.
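The peak-location step described above can be illustrated with a minimal sketch. The function names, the linear threshold, and the fixed half-width of a peak region are assumptions made for illustration; the abstract does not specify them.

```python
def locate_peaks(magnitudes, threshold):
    """Return indices of local maxima whose magnitude exceeds threshold(k)."""
    peaks = []
    for k in range(1, len(magnitudes) - 1):
        is_local_max = (magnitudes[k] > magnitudes[k - 1]
                        and magnitudes[k] >= magnitudes[k + 1])
        if is_local_max and magnitudes[k] > threshold(k):
            peaks.append(k)
    return peaks


def peak_regions(peaks, half_width, n_bins):
    """Return inclusive (start, end) bin ranges including and surrounding each peak."""
    return [(max(0, p - half_width), min(n_bins - 1, p + half_width))
            for p in peaks]


def example_threshold(k):
    # Hypothetical frequency-dependent threshold: rises linearly with bin index.
    return 1.0 + 0.1 * k


spectrum = [0.1, 0.2, 3.0, 0.3, 0.1, 0.2, 1.4, 0.2, 0.1, 5.0, 0.4, 0.1]
peaks = locate_peaks(spectrum, example_threshold)
regions = peak_regions(peaks, 1, len(spectrum))
```

In this example the local maximum at bin 6 (magnitude 1.4) is rejected because the assumed threshold at that bin is 1.6, which shows the effect of making the threshold depend on frequency.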
Apparatus and method for stereo filling in multichannel coding
An apparatus for decoding an encoded multichannel signal of a current frame to obtain three or more current audio output channels is provided. A multichannel processor is adapted to select two decoded channels from three or more decoded channels depending on first multichannel parameters. Moreover, the multichannel processor is adapted to generate a first group of two or more processed channels based on the selected channels. A noise filling module is adapted to identify, for at least one of the selected channels, one or more frequency bands within which all spectral lines are quantized to zero; to generate a mixing channel using, depending on side information, a proper subset of three or more previous audio output channels that have been decoded; and to fill the spectral lines of those frequency bands with noise generated using spectral lines of the mixing channel.
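As a rough sketch of the noise filling step, the following code finds bands whose lines are all zero and fills them from a mixing channel. Averaging the selected previous channels and applying a single scalar gain are assumptions for illustration only; the abstract does not state how the mixing channel or the noise is computed.

```python
def build_mixing_channel(previous_outputs, selected):
    """Average the selected previous output channels (assumed mixing rule)."""
    n_lines = len(previous_outputs[selected[0]])
    return [sum(previous_outputs[c][k] for c in selected) / len(selected)
            for k in range(n_lines)]


def zero_bands(spectrum, band_edges):
    """Return indices of bands in which every spectral line is quantized to zero."""
    return [b for b in range(len(band_edges) - 1)
            if all(x == 0 for x in spectrum[band_edges[b]:band_edges[b + 1]])]


def fill_zero_bands(spectrum, mixing, band_edges, gain):
    """Fill all-zero bands with scaled lines of the mixing channel."""
    filled = list(spectrum)
    for b in zero_bands(spectrum, band_edges):
        for k in range(band_edges[b], band_edges[b + 1]):
            filled[k] = gain * mixing[k]
    return filled


# Three previous output channels; side information (hypothetical) selects 0 and 1.
previous_outputs = [[1.0, 2.0, 3.0, 4.0],
                    [3.0, 2.0, 1.0, 0.0],
                    [9.0, 9.0, 9.0, 9.0]]
mixing = build_mixing_channel(previous_outputs, [0, 1])
band_edges = [0, 2, 4]          # two bands of two lines each
spectrum = [5, -3, 0, 0]        # second band is quantized to zero
empty = zero_bands(spectrum, band_edges)
filled = fill_zero_bands(spectrum, mixing, band_edges, 0.5)
```

Only the band that is entirely zero is touched; bands that contain any nonzero line keep their decoded values.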
Audio content recognition method and system
A method implemented by a computing system comprises generating, by the computing system, a fingerprint comprising a plurality of bin samples associated with audio content. Each bin sample is specified within a frame of the fingerprint and is associated with one of a plurality of non-overlapping frequency ranges and a value indicative of a magnitude of energy associated with a corresponding frequency range. The computing system removes, from the fingerprint, a plurality of bin samples associated with a frequency sweep in the audio content.
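One way to picture the removal step is sketched below. The fingerprint is modeled as a mapping from (frame, bin) to an energy magnitude, and a frequency sweep is assumed to be a run of bin samples that rises by exactly one bin per frame for at least `min_length` frames. Both the data layout and the sweep criterion are assumptions; the abstract does not define how a sweep is detected.

```python
def remove_sweeps(fingerprint, min_length):
    """Drop bin samples lying on an upward diagonal run of at least min_length."""
    to_remove = set()
    for frame, bin_index in fingerprint:
        # Only start a chain at its first element.
        if (frame - 1, bin_index - 1) in fingerprint:
            continue
        chain = [(frame, bin_index)]
        while (chain[-1][0] + 1, chain[-1][1] + 1) in fingerprint:
            chain.append((chain[-1][0] + 1, chain[-1][1] + 1))
        if len(chain) >= min_length:
            to_remove.update(chain)
    return {key: value for key, value in fingerprint.items()
            if key not in to_remove}


# Keys are (frame, bin); values are energy magnitudes.
fingerprint = {
    (0, 5): 0.9, (1, 6): 0.8, (2, 7): 0.9, (3, 8): 0.7,  # rising sweep
    (0, 2): 0.4, (2, 2): 0.5, (3, 1): 0.3,               # unrelated samples
}
cleaned = remove_sweeps(fingerprint, 4)
```

This sketch handles only upward sweeps at one bin per frame; a downward or faster sweep would need a different chain rule.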
Encoding device, decoding device, encoding method, decoding method, and non-transitory computer-readable recording medium
An encoding device according to the disclosure includes: a first encoder, which in operation encodes a low-band signal from a voice or audio input signal to generate a first encoded signal; a decoder, which in operation decodes the first encoded signal to generate a low-band decoded signal; a second encoder, which in operation encodes, on the basis of the low-band decoded signal, a high-band signal from the voice or audio input signal, occupying a band higher than that of the low-band signal, to generate a high-band encoded signal; an energy calculator, which in operation calculates an energy of the voice or audio input signal for each of a plurality of subbands, quantizes each calculated energy to obtain a quantized band energy for each subband, and outputs the quantized band energies; and a multiplexer, which in operation multiplexes the quantized band energies, the first encoded signal, and the high-band encoded signal to generate and output an encoded signal.
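The energy calculator can be sketched as follows. Measuring energy as a sum of squared spectral values and quantizing it uniformly in the decibel domain with a 3 dB step are assumptions chosen for illustration; the abstract does not state the energy definition or the quantizer.

```python
import math


def subband_energies(spectrum, band_edges):
    """Sum of squared values in each subband [band_edges[b], band_edges[b + 1])."""
    return [sum(x * x for x in spectrum[band_edges[b]:band_edges[b + 1]])
            for b in range(len(band_edges) - 1)]


def quantize_energy(energy, step_db=3.0, floor=1e-12):
    """Uniform scalar quantization of the energy in dB (assumed scheme)."""
    return round(10.0 * math.log10(max(energy, floor)) / step_db)


def dequantize_energy(index, step_db=3.0):
    """Map a quantization index back to a linear energy value."""
    return 10.0 ** (index * step_db / 10.0)


spectrum = [1.0, 1.0, 2.0, 2.0]
band_edges = [0, 2, 4]
energies = subband_energies(spectrum, band_edges)
indices = [quantize_energy(e) for e in energies]
```

The integer indices are what a multiplexer would pack alongside the other encoded signals; the `floor` argument only guards against taking the logarithm of zero.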
Audio decoder supporting a set of different loss concealment tools
One of a set of different loss concealment tools of an audio decoder may be assigned to a portion of the audio signal to be decoded from a data stream, where that portion is affected by loss. This selection out of the set of different loss concealment tools leads to a more pleasant loss concealment if it is based on two measures: a first measure, which indicates the spectral position of the spectral centroid of the spectrum of the audio signal, and a second measure, which indicates the temporal predictability of the audio signal. The assigned or selected loss concealment tool may then be used to recover the portion of the audio signal.
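A minimal sketch of such a two-measure selection follows. The centroid is computed in bin units, temporal predictability is approximated by a normalized correlation at a given lag, and the tool names and decision rule are hypothetical; the abstract names the two measures but not how they are computed or combined.

```python
import math


def spectral_centroid(magnitudes):
    """Magnitude-weighted mean bin index of the spectrum."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(k * m for k, m in enumerate(magnitudes)) / total


def temporal_predictability(samples, lag):
    """Normalized correlation between the signal and itself delayed by lag."""
    current = samples[lag:]
    delayed = samples[:len(samples) - lag]
    num = sum(a * b for a, b in zip(current, delayed))
    den = math.sqrt(sum(a * a for a in current) * sum(b * b for b in delayed))
    return num / den if den > 0 else 0.0


def select_tool(centroid, predictability, centroid_limit, predictability_limit):
    """Assumed rule: predictable, low-centroid signals use the time-domain tool."""
    if predictability >= predictability_limit and centroid <= centroid_limit:
        return "time_domain_tool"       # hypothetical tool name
    return "frequency_domain_tool"      # hypothetical tool name


samples = [1.0, 0.0, -1.0, 0.0] * 4     # periodic with period 4
centroid = spectral_centroid([0.0, 4.0, 0.0, 0.0])
predictability = temporal_predictability(samples, 4)
tool = select_tool(centroid, predictability, 2.0, 0.8)
```

A real decoder would derive both measures from previously received frames, since the lost portion itself is not available.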
Frequency band expansion device, frequency band expansion method, and storage medium storing frequency band expansion program
A frequency band expansion device includes processing circuitry to calculate a weighting coefficient based on a frequency gradient of an input signal; to generate a white noise signal; to generate a first white noise signal by performing filtering on the white noise signal; to generate a second white noise signal by regulating a phase characteristic of the white noise signal; to generate a third white noise signal by performing weighted addition on the first white noise signal and the second white noise signal by using the weighting coefficient; and to generate an output signal by adding together the input signal and a signal corresponding to the third white noise signal, wherein the processing circuitry is configured so that the phase characteristic of the second white noise signal becomes the same as the phase characteristic of the first white noise signal.
Encoding device, decoding device, encoding method, decoding method, and non-transitory computer-readable recording medium
An encoding device according to the disclosure includes a first encoding unit that generates a first encoded signal in which a low-band signal having a frequency lower than or equal to a predetermined frequency from a voice or audio input signal is encoded, and a low-band decoded signal; a second encoding unit that encodes, on the basis of the low-band decoded signal, a high-band signal having a band higher than that of the low-band signal to generate a high-band encoded signal; and a first multiplexing unit that multiplexes the first encoded signal and the high-band encoded signal to generate and output an encoded signal. The second encoding unit calculates an energy ratio between a high-band noise component, which is a noise component of the high-band signal, and a high-band non-tonal component of a high-band decoded signal generated from the low-band decoded signal and outputs the ratio as the high-band encoded signal.
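The energy ratio computed by the second encoding unit can be illustrated with a short sketch. Defining energy as a sum of squares and taking the two components as already-separated sample lists are assumptions; the abstract does not say how the noise and non-tonal components are extracted.

```python
def energy(values):
    """Sum of squared values."""
    return sum(v * v for v in values)


def noise_energy_ratio(high_band_noise, decoded_non_tonal):
    """Ratio of high-band noise energy to decoded non-tonal energy."""
    reference = energy(decoded_non_tonal)
    if reference == 0:
        raise ValueError("non-tonal component has zero energy")
    return energy(high_band_noise) / reference


# Hypothetical, already-separated components of one frame.
high_band_noise = [0.5, 0.5, 0.5, 0.5]
decoded_non_tonal = [1.0, 1.0, 1.0, 1.0]
ratio = noise_energy_ratio(high_band_noise, decoded_non_tonal)
```

Here the ratio is 0.25, meaning the non-tonal component derived from the low-band decoded signal carries four times the energy of the actual high-band noise.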