Patent classifications
G10L19/0204
OVERSAMPLING IN A COMBINED TRANSPOSER FILTERBANK
The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described. The system comprises an analysis filter bank (501) comprising an analysis transformation unit (601) having a frequency resolution of Δf; and an analysis window (611) having a duration of D_A; the analysis filter bank (501) being configured to provide a set of analysis subband signals from the low frequency component of the signal; a nonlinear processing unit (502, 650) configured to determine a set of synthesis subband signals based on a portion of the set of analysis subband signals, wherein the portion of the set of analysis subband signals is phase shifted by a transposition order T; and a synthesis filter bank (504) comprising a synthesis transformation unit (602) having a frequency resolution of QΔf; and a synthesis window (612) having a duration of D_S; the synthesis filter bank (504) being configured to generate the high frequency component of the signal from the set of synthesis subband signals; wherein Q is a frequency resolution factor with Q≥1 and smaller than the transposition order T; and wherein the value of the product of the frequency resolution Δf and the duration D_A of the analysis filter bank is selected based on the frequency resolution factor Q.
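The nonlinear processing step can be illustrated with a minimal sketch. It assumes one common reading of the abstract: the phase of each analysis subband sample is multiplied by the transposition order T while its magnitude is kept, and synthesis bin n (centred at n·Q·Δf) draws from the analysis bin nearest to n·Q/T. The function name, the nearest-bin rule, and the zero fill for out-of-range bins are illustrative assumptions, not details taken from the patent.

```python
import cmath
import math

def transpose_subbands(analysis, T, Q, num_synth):
    """Hypothetical sketch of the nonlinear processing unit.

    analysis  : list of complex subband samples; bin k is centred at k * delta_f
    T         : transposition order
    Q         : frequency resolution factor (synthesis bin n is centred at n * Q * delta_f)
    num_synth : number of synthesis subband samples to produce
    """
    out = []
    for n in range(num_synth):
        # A sinusoid in analysis bin k lands near synthesis bin k * T / Q after
        # transposition, so synthesis bin n reads from analysis bin n * Q / T
        # (rounded to the nearest bin; this rule is an assumption).
        k = math.floor(n * Q / T + 0.5)
        if k >= len(analysis):
            out.append(0j)  # no source bin available: fill with zero (assumption)
            continue
        magnitude, phase = cmath.polar(analysis[k])
        # Keep the magnitude, multiply the phase by the transposition order T.
        out.append(cmath.rect(magnitude, T * phase))
    return out

# One active analysis bin with phase 0.3 rad, transposed with T = 2, Q = 1.
analysis = [0j, cmath.rect(1.0, 0.3), 0j, 0j]
synthesis = transpose_subbands(analysis, T=2, Q=1, num_synth=4)
```

With T = 2 and Q = 1, synthesis bins 1 and 2 both read from analysis bin 1 and carry a phase of 0.6 rad.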
SYSTEM AND METHOD FOR PROCESSING AUDIO DATA
An encoder is operable to filter audio signals into a plurality of frequency band components, generate quantized digital components for each band, identify a potential for pre-echo events within the generated quantized digital components, generate an approximate signal by decoding the quantized digital components using inverse pulse code modulation, generate an error signal by comparing the approximate signal with the sampled audio signal, and process the error signal and quantized digital components. The encoder is operable to process the error signal by processing delayed audio signals and Q band values, determining the potential for pre-echo events from the Q band values, and determining scale factors and MDCT block sizes for the potential for pre-echo events. The encoder is further operable to transform the error signal into high resolution frequency components using the MDCT block sizes, quantize the scale factors and frequency components, and encode the quantized lines, block sizes, and quantized scale factors for inclusion in the bitstream.
METHODS AND APPARATUS FOR SUPPLEMENTING PARTIALLY READABLE AND/OR INACCURATE CODES IN MEDIA
Methods and apparatus are disclosed for supplementing partially readable and/or inaccurate codes. An example apparatus includes a watermark analyzer to select a first watermark and a second watermark decoded from media; a comparator to compare a first decoded timestamp of the first watermark to a second decoded timestamp of the second watermark; and a timestamp adjuster to adjust the second decoded timestamp based on the first decoded timestamp of the first watermark when at least a threshold number of symbols of the second decoded timestamp match corresponding symbols of the first decoded timestamp.
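The comparator and timestamp adjuster can be sketched as follows. This is only an illustration under stated assumptions: timestamps are modelled as equal-length lists of symbols, an unreadable symbol is represented by None, and the adjustment simply copies the first timestamp's symbols into the positions where the second does not match. The function name and these modelling choices are hypothetical and not taken from the patent.

```python
def adjust_timestamp(first, second, threshold):
    """Hypothetical sketch: repair `second` using `first` when enough symbols agree.

    first, second : equal-length lists of decoded symbols (None = unreadable)
    threshold     : minimum number of matching symbol positions required
    """
    if len(first) != len(second):
        return list(second)  # cannot compare symbol by symbol; leave unchanged
    # Comparator: count positions where the second timestamp agrees with the first.
    matches = sum(1 for a, b in zip(first, second) if b is not None and a == b)
    if matches < threshold:
        return list(second)  # too few matches: no adjustment
    # Adjuster: keep matching symbols, take the rest from the first timestamp.
    return [b if a == b else a for a, b in zip(first, second)]

repaired = adjust_timestamp([1, 2, 3, 4], [1, 2, None, 4], threshold=3)
untouched = adjust_timestamp([1, 2, 3, 4], [1, None, None, 9], threshold=3)
```

In the first call three symbols match, so the unreadable symbol is filled in from the first timestamp; in the second call only one symbol matches, so the second timestamp is returned unchanged.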
AUDIO ENCODING METHOD AND DEVICE AND AUDIO DECODING METHOD AND DEVICE
An audio encoding method and device and an audio decoding method and device are provided. The audio encoding method includes: obtaining a current frame of an audio signal, where the current frame includes a high frequency band signal and a low frequency band signal; obtaining a compatible layer encoding parameter of the current frame based on the high frequency band signal and the low frequency band signal; obtaining an enhancement layer encoding parameter of the current frame based on the high frequency band signal; and performing bitstream multiplexing on the compatible layer encoding parameter and the enhancement layer encoding parameter to obtain an encoded bitstream.
APPARATUS AND METHOD FOR PROCESSING AN INPUT AUDIO SIGNAL USING CASCADED FILTERBANKS
An apparatus for processing an input audio signal relies on a cascade of filterbanks, the cascade having a synthesis filterbank for synthesizing an audio intermediate signal from the input audio signal, the input audio signal being represented by a plurality of first subband signals generated by an analysis filterbank, wherein a number of filterbank channels of the synthesis filterbank is smaller than a number of channels of the analysis filterbank. The apparatus furthermore has a further analysis filterbank for generating a plurality of second subband signals from the audio intermediate signal, wherein the further analysis filterbank has a number of channels being different from the number of channels of the synthesis filterbank, so that a sampling rate of a subband signal of the plurality of second subband signals is different from a sampling rate of a first subband signal of the plurality of first subband signals.
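The effect of the cascade on subband sampling rates can be checked with a short calculation. The sketch assumes critically sampled filterbanks, where an M-channel analysis bank produces subband signals at 1/M of its input rate and an M-channel synthesis bank produces output at M times its subband rate; the abstract does not state this, and the function name and channel counts are illustrative.

```python
def cascade_rates(fs, analysis_channels, synthesis_channels, further_channels):
    """Hypothetical sketch: sampling rates through a cascade of filterbanks,
    assuming every bank is critically sampled."""
    first_subband_rate = fs / analysis_channels                    # rate of each first subband signal
    intermediate_rate = first_subband_rate * synthesis_channels    # smaller synthesis bank lowers the rate
    second_subband_rate = intermediate_rate / further_channels     # rate of each second subband signal
    return first_subband_rate, intermediate_rate, second_subband_rate

# Example values (assumed): 48 kHz input, 64-channel analysis,
# 32-channel synthesis, 16-channel further analysis.
first_rate, mid_rate, second_rate = cascade_rates(48000, 64, 32, 16)
```

Under these assumptions the second subband signals run at 1500 Hz while the first run at 750 Hz, which matches the abstract's statement that the two sampling rates differ whenever the further analysis bank and the synthesis bank have different channel counts.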
Audio Coding Method and Apparatus
An audio coding method includes obtaining a current frame of an audio signal, where the current frame includes a high frequency band signal; coding the high frequency band signal to obtain a coding parameter of the current frame, where coding includes tonal component screening, the coding parameter indicates information about a target tonal component of the high frequency band signal, the target tonal component is obtained after tonal component screening, and information about a tonal component includes location information, quantity information, and amplitude information or energy information of the tonal component; and performing bitstream multiplexing on the coding parameter to obtain a coded bitstream.
SOUND SIGNAL DATABASE GENERATION APPARATUS, SOUND SIGNAL SEARCH APPARATUS, SOUND SIGNAL DATABASE GENERATION METHOD, SOUND SIGNAL SEARCH METHOD, DATABASE GENERATION APPARATUS, DATA SEARCH APPARATUS, DATABASE GENERATION METHOD, DATA SEARCH METHOD, AND PROGRAM
Database generation techniques are provided that can accurately and efficiently generate a database usable in text-based sound signal search. A sound signal database generation apparatus includes: a latent variable generation unit that generates, from a sound signal, a latent variable corresponding to the sound signal using a sound signal encoder; a data generation unit that generates a natural language representation corresponding to the sound signal from the latent variable and a condition concerning an index for a natural language representation using a natural language representation decoder; and a sound signal database generation unit that generates a record including the natural language representation corresponding to the sound signal and the sound signal from the natural language representation corresponding to the sound signal and the sound signal, and generates a sound signal database made up of the record.
Parametric joint-coding of audio sources
The following coding scenario is addressed: A number of audio source signals need to be transmitted or stored for the purpose of mixing wave field synthesis, multi-channel surround, or stereo signals after decoding the source signals. The proposed technique offers significant coding gain when jointly coding the source signals, compared to separately coding them, even when no redundancy is present between the source signals. This is possible by considering statistical properties of the source signals, the properties of mixing techniques, and spatial hearing. The sum of the source signals is transmitted plus the statistical properties of the source signals, which mostly determine the perceptually important spatial cues of the final mixed audio channels. Source signals are recovered at the receiver such that their statistical properties approximate the corresponding properties of the original source signals. Subjective evaluations indicate that high audio quality is achieved by the proposed scheme.
Method and apparatus for processing multimedia signals
The present invention relates to a method and an apparatus for processing a signal, which are used for effectively reproducing a multimedia signal, and more particularly, to a method and an apparatus for processing a signal, which are used for implementing filtering of a multimedia signal having a plurality of subbands with a low amount of calculation. To this end, provided are a method for processing a multimedia signal, including: receiving a multimedia signal having a plurality of subbands; receiving at least one set of prototype filter coefficients for filtering each subband signal of the multimedia signal; converting the prototype filter coefficients into a plurality of sets of subband filter coefficients; truncating each set of subband filter coefficients based on filter order information obtained by at least partially using characteristic information extracted from the corresponding subband filter coefficients, the length of at least one truncated set of subband filter coefficients being different from the length of the truncated subband filter coefficients of another subband; and filtering the multimedia signal by using the truncated subband filter coefficients corresponding to each subband signal; and an apparatus for processing a multimedia signal using the same.
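The per-subband truncation step can be sketched as below. The abstract only says the filter order comes from "characteristic information" extracted from each subband's coefficients; using cumulative energy as that characteristic, the 99% default, and the function names are assumptions made for illustration.

```python
def truncation_length(coeffs, energy_fraction=0.99):
    """Hypothetical rule: shortest prefix holding `energy_fraction` of the energy."""
    total = sum(abs(c) ** 2 for c in coeffs)
    if total == 0:
        return 0
    accumulated = 0.0
    for i, c in enumerate(coeffs):
        accumulated += abs(c) ** 2
        if accumulated >= energy_fraction * total:
            return i + 1
    return len(coeffs)

def truncate_subband_filters(filters, energy_fraction=0.99):
    """Truncate each subband's filter independently, so lengths may differ per subband."""
    return [f[:truncation_length(f, energy_fraction)] for f in filters]

# Two subbands with different decay: the second needs fewer taps than the first.
subband_filters = [
    [1.0, 0.5, 0.01, 0.001],
    [1.0, 0.0, 0.0, 0.0],
]
truncated = truncate_subband_filters(subband_filters)
lengths = [len(f) for f in truncated]
```

Here the first subband keeps two taps and the second keeps one, illustrating the claim that truncated lengths can differ between subbands.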
AUDIO CODING METHOD AND RELATED APPARATUS, AND COMPUTER-READABLE STORAGE MEDIUM
An audio decoding method includes: obtaining an encoded bitstream; performing bitstream demultiplexing on the encoded bitstream to obtain a first coding parameter of a current frame; performing bitstream demultiplexing on the encoded bitstream based on a configuration parameter for tonal component coding to obtain a second coding parameter of the current frame, where the second coding parameter of the current frame includes a tonal component parameter; obtaining a first high frequency band signal and a first low frequency band signal of the current frame based on the first coding parameter; obtaining a second high frequency band signal of the current frame based on the second coding parameter and the configuration parameter for tonal component coding; and obtaining a decoded signal of the current frame based on the first high frequency band signal, the second high frequency band signal, and the first low frequency band signal.