Patent classifications
G10L19/03
ADAPTIVE AUDIO CODEC SYSTEM, METHOD AND ARTICLE
An adaptive noise shaping filter flattens signal components below a threshold frequency range in a filtered signal to be encoded. An encoder generates quantized signals based on a difference signal and includes an adaptive quantizer and a decoder. The decoder generates feedback signals and has an inverse quantizer and a predictor. The control parameters of the predictor are determined based on the threshold frequency range.
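The encoder structure in this abstract (adaptive quantizer plus an embedded decoder with inverse quantizer and predictor) can be illustrated with a minimal ADPCM-style sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the first-order predictor, the step adaptation rule, the constants, and all names. The noise shaping filter is left out entirely.

```python
# Hypothetical sketch: difference coding with a feedback decoder in the loop.
# Predictor order, step adaptation rule and all constants are assumptions.

def adapt_step(step, code, levels):
    """Grow the step when the quantizer saturates, shrink it on small codes."""
    if abs(code) == levels:
        step *= 1.5
    elif abs(code) <= 1:
        step *= 0.9
    return min(max(step, 1e-4), 1.0)


class FeedbackDecoder:
    """Inverse quantizer plus first-order predictor."""

    def __init__(self, predictor_coef=0.9, step=0.05, levels=4):
        self.coef = predictor_coef
        self.step = step
        self.levels = levels
        self.prev = 0.0  # last reconstructed sample

    def predict(self):
        return self.coef * self.prev

    def decode(self, code):
        # Inverse quantization of the difference, added to the prediction.
        recon = self.predict() + code * self.step
        self.step = adapt_step(self.step, code, self.levels)
        self.prev = recon
        return recon


def encode(samples, **kwargs):
    """Quantize the difference between each sample and the fed-back prediction."""
    dec = FeedbackDecoder(**kwargs)
    codes = []
    for s in samples:
        diff = s - dec.predict()
        code = max(-dec.levels, min(dec.levels, round(diff / dec.step)))
        codes.append(code)
        dec.decode(code)  # keep encoder state identical to the receiver's
    return codes


def decode(codes, **kwargs):
    dec = FeedbackDecoder(**kwargs)
    return [dec.decode(c) for c in codes]


codes = encode([0.0, 0.2, 0.4, 0.3, -0.1, -0.5, 0.1])
reconstructed = decode(codes)
```

Running the decoder inside the encoder means both sides adapt the step size from the transmitted codes alone, so no side information about the step is needed in this sketch.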
Support for generation of comfort noise, and generation of comfort noise
A method for generation of comfort noise for at least two audio channels. The method comprises determining a spatial coherence between audio signals on the respective audio channels, wherein at least one spatial coherence value per frame and frequency band is determined to form a vector of spatial coherence values. A vector of predicted spatial coherence values is formed by a weighted combination of a first coherence prediction and a second coherence prediction that are combined using a weight factor a. The method comprises signaling information about the weight factor a to the receiving node, for enabling the generation of the comfort noise for the at least two audio channels at the receiving node.
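The weighted combination described above can be sketched as follows. The abstract does not give the formula, so the convex form `a * first + (1 - a) * second`, the restriction of `a` to [0, 1], and the function and variable names are assumptions for illustration only.

```python
# Hypothetical sketch: blend two coherence predictions with a weight factor a.
# The convex-combination form is an assumption; the abstract only says
# "weighted combination ... using a weight factor a".

def combine_predictions(first, second, a):
    """Return the per-band predicted coherence vector."""
    if len(first) != len(second):
        raise ValueError("prediction vectors must have the same length")
    if not 0.0 <= a <= 1.0:
        raise ValueError("weight factor must lie in [0, 1]")
    return [a * p1 + (1.0 - a) * p2 for p1, p2 in zip(first, second)]


# One value per frequency band for a single frame (made-up numbers).
first_prediction = [0.75, 0.5, 0.25]
second_prediction = [0.25, 0.5, 0.75]
predicted = combine_predictions(first_prediction, second_prediction, 0.5)
```

Under this assumed form, only `a` has to be signaled for the receiving node to repeat the same combination, provided it can form both predictions itself.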
COMFORT NOISE GENERATION
Apparatuses, arrangements and methods therein for generation of comfort noise are disclosed. In short, the solution relates to exploiting the spatial coherence of multiple input audio channels in order to generate high-quality multi-channel comfort noise.
Audio bandwidth extension by insertion of temporal pre-shaped noise in frequency domain
An audio decoder device for decoding a bitstream includes a bitstream receiver configured to receive the bitstream and to derive an encoded audio signal from the bitstream; a core decoder module configured for deriving a decoded audio signal in a time domain from the encoded audio signal; a temporal envelope generator configured to determine a temporal envelope of the decoded audio signal; a bandwidth extension module configured to produce a frequency domain bandwidth extension signal; a time-to-frequency converter configured to transform the decoded audio signal into a frequency domain decoded audio signal; a combiner configured to combine the frequency domain decoded audio signal and the frequency domain bandwidth extension signal in order to produce a bandwidth extended frequency domain audio signal; and a frequency-to-time converter configured to transform the bandwidth extended frequency domain audio signal into a bandwidth-extended time domain audio signal.
CODING DEVICE AND METHOD, DECODING DEVICE AND METHOD, AND PROGRAM
The present technology relates to a coding device and method, a decoding device and method, and a program capable of reducing the amount of calculation required for decoding.
A separating unit separates a supplied bit stream into coded data of channel sources including a dialog source, coded data of additional data sources, and coded data of dialog information. A dialog information decoding unit decodes the coded data of the dialog information. When the dialog information acquired by the decoding is presented to a viewer, the viewer selects one source from the dialog source and some additional dialog sources. An additional dialog source decoding unit decodes only the coded data of an additional dialog source selected by the viewer. An additional dialog selection unit outputs a viewer-selected audio signal from among the audio signals of the dialog source and additional dialog sources in response to the selection instruction of the viewer. The present technology is applicable to coding devices and decoding devices.
Duration informed attention network (DURIAN) for audio-visual synthesis
A method and apparatus include receiving a text input that includes a sequence of text components. Respective temporal durations of the text components are determined using a duration model. A spectrogram frame is generated based on the duration model. An audio waveform is generated based on the spectrogram frame. Video information is generated based on the audio waveform. The audio waveform is provided as an output along with a corresponding video.
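The role of the duration model can be illustrated with a small sketch: each text component is repeated for as many frames as its predicted duration, which yields a frame-level sequence from which spectrogram frames could then be generated. The function name, the use of whole frame counts, and the sample values are assumptions for illustration and do not come from the patent.

```python
# Hypothetical sketch: expand text components to frame level using durations.
# Real duration models predict these counts; here they are given by hand.

def expand_by_duration(components, frame_counts):
    """Repeat each component once per predicted frame."""
    if len(components) != len(frame_counts):
        raise ValueError("need exactly one duration per text component")
    frames = []
    for component, count in zip(components, frame_counts):
        if count < 0:
            raise ValueError("durations must be non-negative")
        frames.extend([component] * count)
    return frames


frame_sequence = expand_by_duration(["h", "i"], [2, 3])
```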
Speech decoder with high-band generation and temporal envelope shaping
A linear prediction coefficient of a signal represented in a frequency domain is obtained by performing linear prediction analysis in a frequency direction by using a covariance method or an autocorrelation method. After the filter strength of the obtained linear prediction coefficient is adjusted, filtering may be performed in the frequency direction on the signal by using the adjusted coefficient, whereby the temporal envelope of the signal is shaped. This reduces the occurrence of pre-echo and post-echo and improves the subjective quality of the decoded signal, without significantly increasing the bit rate in a bandwidth extension technique in the frequency domain represented by SBR.
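The steps named in this abstract can be sketched as follows: linear prediction along the frequency axis with the autocorrelation method, followed by a filter strength adjustment and filtering in the frequency direction. Several choices are assumptions made for illustration, not details from the patent: the Levinson-Durbin recursion, the strength adjustment by scaling the k-th coefficient with `rho ** k`, the use of the prediction-error (FIR) filter form, real-valued coefficients, and all names.

```python
# Hypothetical sketch: linear prediction across frequency bins.
# The rho ** k strength adjustment and the FIR filter form are assumptions.

def autocorrelation(x, order):
    """Autocorrelation of x for lags 0..order."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations; returns a[0..order] with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input, keep the coefficients found so far
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def adjust_strength(a, rho):
    """Scale coefficient k by rho ** k; rho = 1 leaves the filter unchanged."""
    return [coef * rho ** k for k, coef in enumerate(a)]


def filter_along_frequency(spectrum, a):
    """Prediction-error filter applied over the frequency index."""
    return [sum(a[j] * spectrum[k - j] for j in range(len(a)) if k - j >= 0)
            for k in range(len(spectrum))]


# Made-up spectral coefficients of one frame, decaying over frequency.
spectrum = [0.9 ** k for k in range(64)]
coeffs = levinson_durbin(autocorrelation(spectrum, 1), 1)
weaker = adjust_strength(coeffs, 0.5)
residual = filter_along_frequency(spectrum, coeffs)
```

Because prediction runs over frequency bins rather than time samples, the filter acts on the temporal envelope of the frame, which is the duality the abstract relies on. A smaller `rho` weakens that shaping.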