Patent classifications
G10L19/0208
Digital Filterbank for Spectral Envelope Adjustment
An apparatus and method are disclosed for processing an audio signal. The apparatus includes an input interface, a digital filterbank having an analysis part and a synthesis part, a first phase shifter, a spectral envelope adjuster, a second phase shifter, and an output interface. The first phase shifter and the second phase shifter reduce a complexity of the digital filterbank, which includes both analysis and synthesis filters that are complex-exponential modulated versions of a prototype filter.
System and method for non-destructively normalizing loudness of audio signals within portable devices
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
AUDIO ENCODER AND DECODER USING A FREQUENCY DOMAIN PROCESSOR, A TIME DOMAIN PROCESSOR, AND A CROSS PROCESSING FOR CONTINUOUS INITIALIZATION
An audio encoder for encoding an audio signal includes: a first encoding processor for encoding a first audio signal portion in a frequency domain, wherein the first encoding processor includes: a time frequency converter for converting the first audio signal portion into a frequency domain representation having spectral lines up to a maximum frequency of the first audio signal portion; a spectral encoder for encoding the frequency domain representation; a second encoding processor for encoding a second different audio signal portion in the time domain; a cross-processor for calculating, from the encoded spectral representation of the first audio signal portion, initialization data of the second encoding processor, so that the second encoding processor is initialized to encode the second audio signal portion immediately following the first audio signal portion in time in the audio signal; a controller configured for analyzing the audio signal and for determining which portion of the audio signal is the first audio signal portion encoded in the frequency domain and which portion of the audio signal is the second audio signal portion encoded in the time domain; and an encoded signal former for forming an encoded audio signal including a first encoded signal portion for the first audio signal portion and a second encoded signal portion for the second audio signal portion.
Frame Loss Compensation Processing Method and Apparatus
A frame loss compensation processing method and apparatus are presented, where the method includes, when an i-th frame is a lost frame, estimating a spectrum frequency parameter, a pitch period, and a gain of the i-th frame according to at least one of an inter-frame relationship between the first N frames of the i-th frame or an intra-frame relationship between the first N frames of the i-th frame. A parameter of the i-th frame is determined using the signal correlation between the first N frames, the signal energy stability between the first N frames, the intra-frame signal correlation of each frame, and the intra-frame signal energy stability of each frame.
Encoding device and decoding device
An encoding device (200) includes an MDCT unit (202) that transforms an input signal in a time domain into a frequency spectrum including a lower frequency spectrum, a BWE encoding unit (204) that generates extension data which specifies a higher frequency spectrum at a higher frequency than the lower frequency spectrum, and an encoded data stream generating unit (205) that encodes to output the lower frequency spectrum obtained by the MDCT unit (202) and the extension data obtained by the BWE encoding unit (204). The BWE encoding unit (204) generates as the extension data (i) a first parameter which specifies a lower subband which is to be copied as the higher frequency spectrum from among a plurality of the lower subbands which form the lower frequency spectrum obtained by the MDCT unit (202) and (ii) a second parameter which specifies a gain of the lower subband after being copied.
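The two extension parameters described above can be illustrated with a minimal Python sketch. It is not the patented encoder: the function names, the least-squares selection criterion, and the energy-matching gain are assumptions made for illustration, and the spectra are plain lists of real coefficients already split into equal-width subbands.

```python
def bwe_encode(low_subbands, high_subbands):
    """Return one (low subband index, gain) pair per high subband.

    Assumption: the subband to copy is the one that, after energy matching,
    has the smallest squared error against the high subband.
    """
    params = []
    for high in high_subbands:
        high_energy = sum(x * x for x in high)
        best_index, best_gain, best_err = 0, 0.0, float("inf")
        for index, low in enumerate(low_subbands):
            low_energy = sum(x * x for x in low)
            # Gain that makes the copied subband match the high-band energy.
            gain = (high_energy / low_energy) ** 0.5 if low_energy > 0 else 0.0
            err = sum((h - gain * l) ** 2 for h, l in zip(high, low))
            if err < best_err:
                best_index, best_gain, best_err = index, gain, err
        params.append((best_index, best_gain))
    return params


def bwe_decode(low_subbands, params):
    """Rebuild the high band by copying each selected low subband and scaling it."""
    return [[gain * x for x in low_subbands[index]] for index, gain in params]


low = [[1.0, 2.0], [3.0, -1.0]]
high = [[1.5, -0.5]]              # half of low[1]
params = bwe_encode(low, high)    # first parameter: index, second parameter: gain
rebuilt = bwe_decode(low, params)
```

Only the index and the gain would need to be transmitted for each high subband, which is why extension data of this kind is small compared with coding the higher frequency spectrum directly.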
Speech decoder with high-band generation and temporal envelope shaping
A linear prediction coefficient of a signal represented in a frequency domain is obtained by performing linear prediction analysis in a frequency direction by using a covariance method or an autocorrelation method. After the filter strength of the obtained linear prediction coefficient is adjusted, filtering may be performed in the frequency direction on the signal by using the adjusted coefficient, whereby the temporal envelope of the signal is shaped. This reduces the occurrence of pre-echo and post-echo and improves the subjective quality of the decoded signal, without significantly increasing the bit rate in a bandwidth extension technique in the frequency domain represented by SBR.
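As a rough illustration of the steps in this abstract, the sketch below runs linear prediction along the frequency axis using the autocorrelation method, scales the coefficients to adjust the filter strength, and then filters the spectrum in the frequency direction. The real-valued spectrum, the strength factor `rho`, and the use of the FIR (prediction error) form of the filter are assumptions; the abstract does not specify them.

```python
def autocorrelation(spectrum, order):
    """Autocorrelation of the spectral coefficients for lags 0..order."""
    n = len(spectrum)
    return [sum(spectrum[i] * spectrum[i - lag] for i in range(lag, n))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for a[0..order] with a[0] = 1 so that x[k] + sum a[j] x[k-j] is minimal."""
    a = [1.0] + [0.0] * order
    error = r[0]
    for m in range(1, order + 1):
        if error <= 0.0:
            break
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        reflection = -acc / error
        previous = a[:]
        for k in range(1, m):
            a[k] = previous[k] + reflection * previous[m - k]
        a[m] = reflection
        error *= 1.0 - reflection * reflection
    return a


def adjust_strength(coeffs, rho):
    """Hypothetical strength control: scale the j-th coefficient by rho ** j."""
    return [c * rho ** j for j, c in enumerate(coeffs)]


def filter_along_frequency(spectrum, coeffs):
    """Apply the FIR form of the predictor across frequency bins."""
    return [sum(c * spectrum[k - j] for j, c in enumerate(coeffs) if k - j >= 0)
            for k in range(len(spectrum))]


spectrum = [0.5 ** k for k in range(20)]      # toy spectrum with strong bin-to-bin correlation
coeffs = levinson_durbin(autocorrelation(spectrum, 2), 2)
softened = adjust_strength(coeffs, 0.8)       # weaker filter, gentler envelope shaping
residual = filter_along_frequency(spectrum, coeffs)
```

Because prediction across frequency bins corresponds to envelope structure in time, changing `rho` is one way to trade off how strongly the temporal envelope is reshaped.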
Method and apparatus for detecting a voice activity in an input audio signal
A method for detecting a voice activity in an input audio signal composed of frames includes determining a noise characteristic of the input audio signal based on a received frame of the input audio signal. A voice activity detection (VAD) parameter is derived based on the noise characteristic of the input audio signal using an adaptive function. The derived VAD parameter is compared with a threshold value to provide a voice activity detection decision. The input audio signal is processed according to the voice activity detection decision.
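The decision chain in this abstract (noise characteristic, adaptive function, VAD parameter, threshold) might be sketched as follows. Everything specific here is an assumption for illustration: the noise characteristic is a running noise energy estimate, `adaptive_scale` is a made-up adaptive function, and the threshold and smoothing constants are arbitrary.

```python
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def adaptive_scale(noise_energy):
    """Hypothetical adaptive function: trust the SNR less as background noise grows."""
    return 1.0 / (1.0 + 10.0 * noise_energy)


def detect_voice(frames, threshold=3.0, initial_noise=1e-4):
    """Return one boolean voice activity decision per frame."""
    noise = initial_noise
    decisions = []
    for frame in frames:
        energy = max(frame_energy(frame), 1e-12)
        snr_db = 10.0 * math.log10(energy / noise)
        # VAD parameter: frame SNR modified by the noise-dependent adaptive function.
        vad_parameter = snr_db * adaptive_scale(noise)
        active = vad_parameter > threshold
        decisions.append(active)
        if not active:
            # Update the noise characteristic only on frames judged inactive.
            noise = 0.9 * noise + 0.1 * energy
    return decisions


quiet = [0.01] * 160
loud = [0.5] * 160
decisions = detect_voice([quiet, loud, quiet])
```

A downstream stage could then, for example, encode only the frames flagged as active, which is one way of processing the signal according to the decision.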
ENCODING OF MULTIPLE AUDIO SIGNALS
A device includes an encoder and a transmitter. The encoder is configured to determine a mismatch value indicative of an amount of temporal mismatch between a reference channel and a target channel. The encoder is also configured to determine whether to perform a first temporal-shift operation on the target channel at least based on the mismatch value and a coding mode to generate an adjusted target channel. The encoder is further configured to perform a first transform operation on the reference channel to generate a frequency-domain reference channel and perform a second transform operation on the adjusted target channel to generate a frequency-domain adjusted target channel. The encoder is also configured to estimate one or more stereo cues based on the frequency-domain reference channel and the frequency-domain adjusted target channel. The transmitter is configured to transmit the one or more stereo cues to a receiver.
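One plausible way to obtain a mismatch value of this kind is a cross-correlation search over candidate shifts, sketched below. The search method, the function names, and the zero padding are assumptions. The coding-mode decision, the transform operations, and the stereo cue estimation from the abstract are left out.

```python
def estimate_mismatch(reference, target, max_shift):
    """Return the shift (in samples) of target relative to reference that
    maximizes their cross-correlation; positive means target lags reference."""
    best_shift, best_corr = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        corr = 0.0
        for i, ref_sample in enumerate(reference):
            j = i + shift
            if 0 <= j < len(target):
                corr += ref_sample * target[j]
        if corr > best_corr:
            best_shift, best_corr = shift, corr
    return best_shift


def shift_target(target, mismatch):
    """Temporal-shift operation: realign target by mismatch samples, zero padded."""
    n = len(target)
    return [target[i + mismatch] if 0 <= i + mismatch < n else 0.0 for i in range(n)]


reference = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
target = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]   # same pulse, two samples later
mismatch = estimate_mismatch(reference, target, max_shift=3)
adjusted = shift_target(target, mismatch)
```

Once the adjusted target channel is aligned with the reference channel, both can be transformed to the frequency domain and compared bin by bin.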
APPARATUS AND METHOD FOR ENCODING OR DECODING AN AUDIO SIGNAL WITH INTELLIGENT GAP FILLING IN THE SPECTRAL DOMAIN
An apparatus for decoding an encoded audio signal includes a spectral domain audio decoder for generating a first decoded representation of a first set of first spectral portions, the decoded representation having a first spectral resolution; a parametric decoder for generating a second decoded representation of a second set of second spectral portions having a second spectral resolution being lower than the first spectral resolution; a frequency regenerator for regenerating a reconstructed second spectral portion having the first spectral resolution using a first spectral portion and spectral envelope information for the second spectral portion; and a spectrum time converter for converting the first decoded representation and the reconstructed second spectral portion into a time representation.
CODING DENSE TRANSIENT EVENTS WITH COMPANDING
Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A method of processing an audio signal includes the following operations. A system receives an audio signal. The system determines that a first frame of the audio signal includes a sparse transient signal. The system determines that a second frame of the audio signal includes a dense transient signal. The system compresses/expands (compands) the audio signal using a companding rule that applies a first companding exponent to the first frame of the audio signal and applies a second companding exponent to the second frame of the audio signal, each companding exponent being used to derive a respective degree of dynamic range compression and expansion for a corresponding frame. The system then provides the companded audio signal to a downstream device.
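A per-frame companding exponent can be illustrated with the following Python sketch. It assumes the sparse/dense classification is already available as a list of flags, derives the gain from the mean absolute level of each frame, and uses made-up exponent values; none of these details are taken from the abstract.

```python
def frame_level(frame):
    """Mean absolute amplitude of one frame."""
    return sum(abs(x) for x in frame) / len(frame)


def compress_frame(frame, exponent):
    """Compress dynamic range: the frame level L becomes L ** exponent."""
    level = frame_level(frame)
    if level == 0.0:
        return list(frame)
    gain = level ** (exponent - 1.0)
    return [x * gain for x in frame]


def expand_frame(frame, exponent):
    """Invert compress_frame using only the compressed frame and the exponent."""
    level = frame_level(frame)
    if level == 0.0:
        return list(frame)
    gain = level ** ((1.0 - exponent) / exponent)
    return [x * gain for x in frame]


def compand(frames, dense_flags, sparse_exponent=0.65, dense_exponent=0.5, expand=False):
    """Apply the companding rule: one exponent for sparse frames, another for dense ones."""
    operation = expand_frame if expand else compress_frame
    return [operation(frame, dense_exponent if dense else sparse_exponent)
            for frame, dense in zip(frames, dense_flags)]


frames = [[0.1, -0.4, 0.2, 0.05], [0.3, 0.3, -0.3, 0.3]]
dense_flags = [False, True]       # hypothetical output of a transient classifier
compressed = compand(frames, dense_flags)
restored = compand(compressed, dense_flags, expand=True)
```

In this sketch a smaller exponent gives stronger compression, so the dense-transient frame is compressed more than the sparse one, and the expander recovers the original frame as long as it uses the same exponent.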