Patent classifications
G10L19/0017
Audio Transcoding Method and Apparatus, Audio Transcoder, Device, and Storage Medium
Provided is an audio transcoding method, including: (301) performing entropy decoding on a first audio stream with a first bitrate, to obtain an audio feature parameter and an excitation signal of the first audio stream, the excitation signal being a quantized audio signal; (302) obtaining a time-domain audio signal corresponding to the excitation signal based on the audio feature parameter and the excitation signal; (303) re-quantizing the excitation signal and the audio feature parameter based on the time-domain audio signal and a target transcoding bitrate, to obtain a target excitation signal and a target audio feature parameter; and (304) performing entropy coding on the target audio feature parameter and the target excitation signal, to obtain a second audio stream with a second bitrate, the second bitrate being lower than the first bitrate.
Low bitrate audio encoding/decoding scheme having cascaded switches
An audio encoder has a first information sink oriented encoding branch such as a spectral domain encoding branch, a second information source or SNR oriented encoding branch such as an LPC-domain encoding branch, and a switch for switching between the first and second encoding branches, the second encoding branch having a converter into a specific domain different from the spectral domain such as an LPC analysis stage generating an excitation signal, and the second encoding branch having a specific domain coding branch such as LPC domain processing branch, and a specific spectral domain coding branch such as LPC spectral domain processing branch, and an additional switch for switching between the specific domain coding branch and the specific spectral domain coding branch. An audio decoder has a first domain decoder, a second domain decoder, and a third domain decoder as well as two cascaded switches for switching between the decoders.
PROCESSING OF AUDIO SIGNALS DURING HIGH FREQUENCY RECONSTRUCTION
The application relates to HFR (High Frequency Reconstruction/Regeneration) of audio signals. In particular, the application relates to a method and system for performing HFR of audio signals having large variations in energy level across the low frequency range which is used to reconstruct the high frequencies of the audio signal. A system configured to generate a plurality of high frequency subband signals covering a high frequency interval from a plurality of low frequency subband signals is described. The system comprises means for receiving the plurality of low frequency subband signals; means for receiving a set of target energies, each target energy covering a different target interval within the high frequency interval and being indicative of the desired energy of one or more high frequency subband signals lying within the target interval; means for generating the plurality of high frequency subband signals from the plurality of low frequency subband signals and from a plurality of spectral gain coefficients associated with the plurality of low frequency subband signals, respectively; and means for adjusting the energy of the plurality of high frequency subband signals using the set of target energies.
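The generate-then-adjust flow in the abstract above can be sketched as follows. This is a minimal illustration, not the claimed method: it assumes real-valued subband samples, a simple copy-up mapping from low to high bands, and target energies expressed as mean squared sample values. The names `generate_high_bands` and `adjust_energy` are hypothetical.

```python
import math

def generate_high_bands(low_bands, gains, n_high):
    """Build n_high high-frequency subbands by copying low bands (assumed
    copy-up mapping) and applying each source band's spectral gain."""
    high = []
    for k in range(n_high):
        src = k % len(low_bands)  # assumed mapping: cycle through the low bands
        high.append([gains[src] * s for s in low_bands[src]])
    return high

def adjust_energy(high_bands, targets):
    """Scale each target interval so its mean squared sample value matches
    the target energy. targets: (first_band, end_band_exclusive, energy)."""
    out = [list(b) for b in high_bands]
    for lo, hi, target in targets:
        samples = [s for b in high_bands[lo:hi] for s in b]
        current = sum(s * s for s in samples) / len(samples)
        scale = math.sqrt(target / current) if current > 0 else 0.0
        for k in range(lo, hi):
            out[k] = [scale * s for s in high_bands[k]]
    return out

# Two low bands with very different levels; the gains flatten them first.
low_bands = [[2.0, 2.0], [1.0, 1.0]]
gains = [0.5, 1.0]
high = generate_high_bands(low_bands, gains, n_high=2)
adjusted = adjust_energy(high, [(0, 2, 4.0)])
```

In this toy case the gains equalize the two low bands before copy-up, so one scale factor per target interval is enough to reach the target energy.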
Compressing audio waveforms using neural networks and vector quantizers
Methods, systems and apparatus, including computer programs encoded on computer storage media. One of the methods includes receiving an audio waveform that includes a respective audio sample for each of a plurality of time steps, processing the audio waveform using an encoder neural network to generate a plurality of feature vectors representing the audio waveform, generating a respective coded representation of each of the plurality of feature vectors using a plurality of vector quantizers that are each associated with a respective codebook of code vectors, wherein the respective coded representation of each feature vector identifies a plurality of code vectors, including a respective code vector from the codebook of each vector quantizer, that define a quantized representation of the feature vector, and generating a compressed representation of the audio waveform by compressing the respective coded representation of each of the plurality of feature vectors.
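As a rough illustration of coding one feature vector with several vector quantizers, the sketch below applies the quantizers in sequence, each one quantizing the residual left by the previous one. The sequential residual arrangement, the tiny hand-written codebooks, and the function names are assumptions for illustration only; the abstract specifies only that each quantizer contributes one code vector from its own codebook.

```python
def nearest(codebook, vec):
    """Index of the code vector closest to vec (squared Euclidean distance)."""
    best, best_d = 0, float("inf")
    for i, c in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(vec, c))
        if d < best_d:
            best, best_d = i, d
    return best

def encode(feature, codebooks):
    """Return one code index per quantizer for a single feature vector."""
    residual = list(feature)
    indices = []
    for cb in codebooks:
        i = nearest(cb, residual)
        indices.append(i)
        # The next quantizer sees what this one failed to represent.
        residual = [r - c for r, c in zip(residual, cb[i])]
    return indices

def decode(indices, codebooks):
    """Sum the selected code vectors to get the quantized feature vector."""
    out = [0.0] * len(codebooks[0][0])
    for i, cb in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, cb[i])]
    return out

codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],   # coarse quantizer
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25]],  # fine quantizer
]
indices = encode([1.2, 1.0], codebooks)
approx = decode(indices, codebooks)
```

The list of indices is the coded representation of the feature vector; a real system would then compress those indices, for example with entropy coding.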
AUDIO ENCODER, AUDIO DECODER, METHOD FOR ENCODING AN AUDIO INFORMATION, METHOD FOR DECODING AN AUDIO INFORMATION AND COMPUTER PROGRAM USING A DETECTION OF A GROUP OF PREVIOUSLY-DECODED SPECTRAL VALUES
An audio decoder for providing decoded audio information includes an arithmetic decoder for providing a plurality of decoded spectral values on the basis of an arithmetically-encoded representation of the spectral values and a frequency-domain-to-time-domain converter for providing a time-domain audio representation using the decoded spectral values. The arithmetic decoder is configured to select a mapping rule describing a mapping of a code value onto a symbol code in dependence on a context state. The arithmetic decoder is configured to determine or modify the current context state in dependence on a plurality of previously-decoded spectral values. The arithmetic decoder is configured to detect a group of a plurality of previously-decoded spectral values, which fulfill, individually or taken together, a predetermined condition regarding their magnitudes, and to determine the current context state in dependence on a result of the detection. An audio encoder uses similar principles.
SYSTEM AND METHOD FOR PROCESSING AUDIO DATA
An encoder is operable to filter audio signals into a plurality of frequency band components, generate quantized digital components for each band, identify a potential for pre-echo events within the generated quantized digital components, generate an approximate signal by decoding the quantized digital components using inverse pulse code modulation, generate an error signal by comparing the approximate signal with the sampled audio signal, and process the error signal and quantized digital components. The encoder is operable to process the error signal by processing delayed audio signals and Q band values, determining the potential for pre-echo events from the Q band values, and determining scale factors and MDCT block sizes for the potential for pre-echo events. The encoder is operable to transform the error signal into high resolution frequency components using the MDCT block sizes, quantize the scale factors and frequency components, and encode the quantized lines, block sizes, and quantized scale factors for inclusion in the bitstream.
Frame error concealment method and apparatus, and audio decoding method and apparatus
A frame error concealment method is provided that includes predicting a parameter by performing a regression analysis on a group basis for a plurality of groups formed from a first plurality of bands forming an error frame and concealing an error in the error frame by using the parameter predicted on a group basis.
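One way to picture group-wise regression is sketched below. It assumes the predicted parameter is the average band energy of each group, that the regression is an ordinary least-squares line fitted over the most recent good frames, and that the fitted line is extrapolated one frame ahead to the error frame. None of these choices is stated in the abstract, and the function names are hypothetical.

```python
def linear_fit(ys):
    """Least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    xs = range(n)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx

def predict_group_params(history, group_size):
    """history: past good frames, each a list of per-band energies.
    Returns one predicted average energy per group for the next frame."""
    n_bands = len(history[0])
    preds = []
    for start in range(0, n_bands, group_size):
        # Average the bands of this group in every past frame.
        avgs = []
        for frame in history:
            group = frame[start:start + group_size]
            avgs.append(sum(group) / len(group))
        slope, intercept = linear_fit(avgs)
        # Extrapolate one frame ahead; energies cannot be negative.
        preds.append(max(0.0, slope * len(history) + intercept))
    return preds

history = [
    [4.0, 4.0, 1.0, 1.0],
    [3.0, 3.0, 1.0, 1.0],
    [2.0, 2.0, 1.0, 1.0],
]
predicted = predict_group_params(history, group_size=2)
```

Here the first group is decaying by one unit per frame and the second is steady, so the extrapolation continues the decay for the first group and holds the second constant. The predicted values could then be used to scale the spectrum substituted for the error frame.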
Audio encoder, audio decoder, methods and computer program using jointly encoded residual signals
An audio decoder for providing at least four audio channel signals on the basis of an encoded representation is configured to provide a first residual signal and a second residual signal on the basis of a jointly encoded representation of the first residual signal and of the second residual signal using a multi-channel decoding. The audio decoder is configured to provide a first audio channel signal and a second audio channel signal on the basis of a first downmix signal and the first residual signal using a residual-signal-assisted multi-channel decoding. The audio decoder is configured to provide a third audio channel signal and a fourth audio channel signal on the basis of a second downmix signal and the second residual signal using a residual-signal-assisted multi-channel decoding. An audio encoder is based on corresponding considerations.
Sample sequence converter, signal encoding apparatus, signal decoding apparatus, sample sequence converting method, signal encoding method, signal decoding method and program
Performance of an encoding process and a decoding process for a sound signal is enhanced. A representative value calculating part 110 calculates, for each predetermined time section, a representative value for each frequency section of the sample sequence of a frequency domain signal corresponding to an input acoustic signal, using the sample values of the samples included in that frequency section; each frequency section consists of a plurality of samples fewer than the number of frequency samples in the sequence. A signal companding part 120 obtains, for each predetermined time section, a sample sequence of a weighted frequency domain signal by multiplying each sample corresponding to a representative value by a weight that depends on the value of a companding function, for which an inverse function can be defined, evaluated at that representative value.
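The weighting step can be illustrated with a small sketch. The specific choices here are assumptions, not taken from the abstract: the representative value is the mean absolute sample value of a section, the companding function is the square root (invertible for non-negative inputs), and the weight is the companded value divided by the representative value. The function name `compand_sequence` is hypothetical.

```python
import math

def compand_sequence(samples, section_size, f=math.sqrt):
    """Weight each frequency section by f(rep) / rep, where rep is the
    section's mean absolute sample value (assumed representative value)."""
    out = []
    for start in range(0, len(samples), section_size):
        section = samples[start:start + section_size]
        rep = sum(abs(s) for s in section) / len(section)
        # Assumed weight: companded representative over the representative.
        weight = f(rep) / rep if rep > 0 else 1.0
        out.extend(weight * s for s in section)
    return out

weighted = compand_sequence([4.0, -4.0, 1.0, 1.0], section_size=2)
```

With these choices the loud section is attenuated and the quiet section is left unchanged, which compresses the dynamic range across frequency sections. Because the companding function is invertible, a decoder can recover the original representative values from the weighted sequence and undo the weighting.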
ELECTRONIC DEVICE FOR PERFORMING AUDIO STREAMING AND OPERATING METHOD THEREOF
An electronic device includes a memory configured to store computer-executable instructions; and a processor configured to execute the computer-executable instructions to: determine a bitrate of an audio signal based on a result of analyzing a transmission environment of a wireless communication channel through which the audio signal is transmitted; encode the audio signal into packets according to the bitrate, the packets including a main packet for audio streaming and a plurality of extension packets for sound quality improvement; determine, based on at least one of a type of the packets and the result of analyzing the transmission environment, a packet type indicating a modulation scheme and a number of time slots used for transmitting each packet of the packets; and configure and transmit audio packets reflecting the packet type for each packet of the packets.