Patent classifications
G10L19/18
FRAME ELEMENT POSITIONING IN FRAMES OF A BITSTREAM REPRESENTING AUDIO CONTENT
A better compromise between excessive bitstream and decoding overhead on the one hand and flexibility of frame element positioning on the other is achieved as follows. Each frame in the sequence of frames of the bitstream comprises a sequence of N frame elements, and the bitstream comprises a configuration block having a field indicating the number of elements N and a type indication syntax portion that indicates, for each element position in the sequence of N element positions, an element type out of a plurality of element types. Within each frame, every frame element is of the element type that the type indication syntax portion indicates for the element position the frame element occupies in that frame's sequence of N frame elements.
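The arrangement in this abstract can be illustrated with a minimal sketch. The byte layout used here is an assumption made only for illustration and is not taken from the patent: the configuration block is one byte holding N followed by N type-id bytes, and each frame element is a one-byte length followed by its payload. The type names in `ELEMENT_TYPES` and the functions `parse_config` and `parse_frame` are likewise hypothetical.

```python
from typing import List, Tuple

# Hypothetical element types; the abstract only speaks of "a plurality of element types".
ELEMENT_TYPES = {0: "single_channel", 1: "channel_pair", 2: "extension"}


def parse_config(config: bytes) -> List[str]:
    """Read N and the per-position element types from a toy configuration block."""
    n = config[0]                      # field indicating the number of elements N
    type_ids = config[1:1 + n]         # type indication portion: one id per position
    if len(type_ids) != n:
        raise ValueError("configuration block is truncated")
    return [ELEMENT_TYPES[t] for t in type_ids]


def parse_frame(frame: bytes, types: List[str]) -> List[Tuple[str, bytes]]:
    """Split a frame into N elements; the type comes from the position, not the frame."""
    elements = []
    pos = 0
    for element_type in types:
        length = frame[pos]            # assumed 1-byte length prefix per element
        payload = frame[pos + 1:pos + 1 + length]
        if len(payload) != length:
            raise ValueError("frame is truncated")
        elements.append((element_type, payload))
        pos += 1 + length
    return elements


# The configuration is sent once; every frame then carries no type information.
types = parse_config(bytes([3, 0, 2, 1]))
elements = parse_frame(bytes([2, 0xAA, 0xBB, 0, 1, 0xCC]), types)
```

Because the type of each element is fixed by its position, the frames themselves need no per-element type field, which is where the overhead saving in this sketch comes from.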
Method, Apparatus, and System for Processing Audio Data
A method for processing an audio signal includes: receiving a bitstream corresponding to the audio signal; obtaining a silence insertion descriptor (SID) type of a current frame of the audio signal by decoding the bitstream; obtaining a low-band parameter of the current frame by decoding the bitstream; obtaining a low-band signal of the current frame based on the low-band parameter; obtaining, based on the SID type of the current frame, a high-band parameter of the current frame; obtaining a high-band signal of the current frame based on the high-band parameter; and obtaining a synthesis signal of the current frame based on the low-band signal and the high-band signal.
MDCT-based complex prediction stereo coding
The invention provides methods and devices for stereo encoding and decoding using complex prediction in the frequency domain. In one embodiment, a decoding method for obtaining an output stereo signal from an input stereo signal, which is encoded by complex prediction coding and comprises first frequency-domain representations of two input channels, comprises the upmixing steps of: (i) computing a second frequency-domain representation of a first input channel; and (ii) computing an output channel on the basis of the first and second frequency-domain representations of the first input channel, the first frequency-domain representation of the second input channel and a complex prediction coefficient. The method comprises applying independent bandwidth limits for the input channels.
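Upmixing step (ii) can be sketched as follows, under several assumptions that go beyond the abstract: the first input channel is a downmix whose first representation is MDCT and whose second representation is MDST, the second input channel is a prediction residual, the side signal is reconstructed as the residual plus the real part of the complex coefficient times the complex downmix spectrum, and the outputs are the sum and difference of downmix and side. The function name `upmix`, its parameters, and the treatment of a bandwidth limit as "bins at or above the limit are zero" are hypothetical, and sign conventions differ between formulations.

```python
from typing import List, Optional, Tuple


def upmix(
    dmx_mdct: List[float],
    dmx_mdst: List[float],
    res_mdct: List[float],
    alpha: complex,
    dmx_limit: Optional[int] = None,
    res_limit: Optional[int] = None,
) -> Tuple[List[float], List[float]]:
    """Reconstruct left/right spectra from downmix, residual and a prediction coefficient."""
    left, right = [], []
    for k, (m_re, m_im, d) in enumerate(zip(dmx_mdct, dmx_mdst, res_mdct)):
        # Independent bandwidth limits (assumption): each channel is zero above its own limit.
        if dmx_limit is not None and k >= dmx_limit:
            m_re, m_im = 0.0, 0.0
        if res_limit is not None and k >= res_limit:
            d = 0.0
        # Side signal = residual + Re{alpha * (MDCT + j * MDST)} of the downmix.
        side = d + (alpha * complex(m_re, m_im)).real
        left.append(m_re + side)
        right.append(m_re - side)
    return left, right


left, right = upmix([1.0, 2.0], [0.5, -1.0], [0.1, 0.2], complex(0.5, 0.25))
```

In this sketch the MDST spectrum is passed in directly; step (i) of the abstract, which derives the second representation from the first, is not modeled.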
Method and apparatus for encoding and decoding high frequency for bandwidth extension
Disclosed are a method and apparatus for encoding and decoding a high frequency for bandwidth extension. The method includes: estimating a weight; and generating a high frequency excitation signal by applying the weight between random noise and a decoded low frequency spectrum.
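One way to read "applying the weight between random noise and a decoded low frequency spectrum" is as a linear cross-fade between the two sources. The sketch below assumes exactly that; the abstract does not give the mixing rule or say how the weight is estimated, so the function `high_band_excitation`, the uniform noise source, and the formula `weight * noise + (1 - weight) * coeff` are all illustrative assumptions.

```python
import random
from typing import List


def high_band_excitation(low_spectrum: List[float], weight: float, seed: int = 0) -> List[float]:
    """Mix random noise with the decoded low-band spectrum (assumed linear cross-fade)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    rng = random.Random(seed)          # seeded so the sketch is reproducible
    excitation = []
    for coeff in low_spectrum:
        noise = rng.uniform(-1.0, 1.0)
        # weight = 1 gives pure noise, weight = 0 copies the low-band spectrum.
        excitation.append(weight * noise + (1.0 - weight) * coeff)
    return excitation


tonal = high_band_excitation([0.3, -0.7, 0.2], weight=0.0)   # follows the low band
noisy = high_band_excitation([0.3, -0.7, 0.2], weight=1.0)   # ignores the low band
```

A weight near 0 would suit strongly tonal content whose harmonic structure should carry over to the high band, while a weight near 1 would suit noise-like content; that interpretation is also an assumption rather than something the abstract states.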
AUDIO ENCODER AND DECODER USING A FREQUENCY DOMAIN PROCESSOR WITH FULL-BAND GAP FILLING AND A TIME DOMAIN PROCESSOR
An audio encoder for encoding an audio signal has: a first encoding processor for encoding a first audio signal portion in a frequency domain, having: a time-frequency converter for converting the first audio signal portion into a frequency domain representation; an analyzer for analyzing the frequency domain representation to determine first spectral portions to be encoded with a first spectral resolution and second spectral portions to be encoded with a second spectral resolution; and a spectral encoder for encoding the first spectral portions with the first spectral resolution and encoding the second spectral portions with the second spectral resolution; a second encoding processor for encoding a second, different audio signal portion in the time domain; a controller for analyzing the audio signal and determining which portion of the audio signal is the first audio signal portion encoded in the frequency domain and which portion is the second audio signal portion encoded in the time domain; and an encoded signal former for forming an encoded audio signal having a first encoded signal portion for the first audio signal portion and a second encoded signal portion for the second audio signal portion.
SOUND QUALITY DETECTION METHOD AND DEVICE FOR HOMOLOGOUS AUDIO AND STORAGE MEDIUM
Provided are a method, a device and a storage medium for detecting the tone quality of homologous audio, belonging to the technical field of audio. The method comprises: acquiring a plurality of audio files to be detected that are homologous audio files (101); extracting features from each of the plurality of audio files to obtain at least one audio feature of each audio file, and generating a correspondence list between the at least one audio feature of each audio file and its audio file identifier (102); and, on the basis of the correspondence list, determining a tone quality score for each of the plurality of audio files by means of a tone quality detection model (103). Tone quality detection of homologous audio files is thereby achieved, which makes it convenient to store, acquire and manage the homologous audio files according to tone quality and reduces the cost of storing, acquiring and managing them.
METHODS AND DEVICES FOR GENERATING OR DECODING A BITSTREAM COMPRISING IMMERSIVE AUDIO SIGNALS
The present document describes a method (500) for generating a bitstream (101), wherein the bitstream (101) comprises a sequence of superframes (400) for a sequence of frames of an immersive audio signal (111). The method (500) comprises, repeatedly for the sequence of superframes (400), inserting (501) coded audio data (206) for one or more frames of one or more downmix channel signals (203) derived from the immersive audio signal (111), into data fields (411, 421, 412, 422) of a superframe (400); and inserting (502) metadata (202, 205) for reconstructing one or more frames of the immersive audio signal (111) from the coded audio data (206), into a metadata field (403) of the superframe (400).
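The two insertion steps (501, 502) can be sketched with a simple in-memory model. Everything concrete here is an assumption for illustration: the `Superframe` class, the function `build_superframes`, a fixed number of frames per superframe, one coded chunk per downmix channel per frame, and a metadata field formed by concatenating per-frame metadata. The abstract does not specify the field layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Superframe:
    data_fields: List[bytes]   # coded audio data, one field per downmix channel per frame
    metadata_field: bytes      # metadata for reconstructing the immersive frames


def build_superframes(
    coded_frames: List[List[bytes]],
    frame_metadata: List[bytes],
    frames_per_superframe: int,
) -> List[Superframe]:
    """Group coded downmix frames and their metadata into a sequence of superframes."""
    if len(coded_frames) != len(frame_metadata):
        raise ValueError("need exactly one metadata block per frame")
    if frames_per_superframe < 1:
        raise ValueError("frames_per_superframe must be positive")
    superframes = []
    for start in range(0, len(coded_frames), frames_per_superframe):
        stop = start + frames_per_superframe
        # Step 501: insert the coded audio data into the data fields.
        data_fields = [chunk for frame in coded_frames[start:stop] for chunk in frame]
        # Step 502: insert the metadata into the superframe's metadata field.
        metadata_field = b"".join(frame_metadata[start:stop])
        superframes.append(Superframe(data_fields, metadata_field))
    return superframes


superframes = build_superframes(
    [[b"a0", b"b0"], [b"a1", b"b1"], [b"a2", b"b2"]],  # two downmix channels, three frames
    [b"m0", b"m1", b"m2"],
    frames_per_superframe=2,
)
```

With two frames per superframe and two downmix channels, a full superframe in this model holds four data fields, which happens to match the four data field reference numerals (411, 421, 412, 422) in the abstract; whether the patent intends that mapping is not confirmed by the text.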