G10L19/18

MDCT-BASED COMPLEX PREDICTION STEREO CODING

The invention provides methods and devices for stereo encoding and decoding using complex prediction in the frequency domain. In one embodiment, a decoding method, for obtaining an output stereo signal from an input stereo signal encoded by complex prediction coding and comprising first frequency-domain representations of two input channels, comprises the upmixing steps of: (i) computing a second frequency-domain representation of a first input channel; and (ii) computing an output channel on the basis of the first and second frequency-domain representations of the first input channel, the first frequency-domain representation of the second input channel and a complex prediction coefficient. The upmixing can be suspended responsive to control data.
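The upmix the abstract describes can be sketched as follows. This is a minimal illustration, not the patented implementation: the function and variable names are hypothetical, the MDST estimate of the downmix (the "second frequency-domain representation") is assumed to be supplied by the caller, and the sum/difference upmix convention is one common choice.

```python
import numpy as np

def complex_prediction_upmix(dmx_mdct, dmx_mdst_est, residual,
                             alpha_re, alpha_im, active=True):
    """Reconstruct left/right spectra from a downmix and a residual.

    dmx_mdct     -- first frequency-domain representation of the downmix
    dmx_mdst_est -- second (imaginary-part) representation, estimated elsewhere
    residual     -- first frequency-domain representation of the second channel
    alpha_re/im  -- real and imaginary parts of the complex prediction coefficient
    """
    if not active:  # upmixing suspended responsive to control data
        return dmx_mdct, residual
    # Undo the prediction: rebuild the side signal from residual + alpha * downmix
    side = residual + alpha_re * dmx_mdct + alpha_im * dmx_mdst_est
    left = dmx_mdct + side
    right = dmx_mdct - side
    return left, right
```

With a zero prediction coefficient the scheme degenerates to plain mid/side upmixing, which is a useful sanity check.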

Audio coding and decoding methods and devices, and audio coding and decoding system

An audio coding method, comprising: obtaining an i-th audio frame in n consecutive audio frames and obtaining an i-th piece of coded data and an i-th piece of redundant data based on the i-th audio frame, wherein the i-th piece of coded data is obtained by coding the i-th audio frame, and the i-th piece of redundant data is obtained by coding and buffering the i-th audio frame, wherein n is a positive integer and 1 ≤ i ≤ n; and packing the i-th piece of coded data and at most m pieces of redundant data before the i-th piece of redundant data into an i-th audio data packet, wherein m is a preset positive integer. An audio decoding method and a computer device are further provided.
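The packing rule can be sketched as a sliding buffer of the m most recent redundant pieces; everything here (function names, the dict-based packet layout, the stand-in encoders) is an illustrative assumption, not the patented format.

```python
from collections import deque

def make_packets(frames, m, encode, encode_redundant):
    """For each frame i, pack its coded data plus at most m redundant
    pieces produced for frames before i (a simple FEC-style scheme)."""
    buffer = deque(maxlen=m)  # redundant data of the most recent m frames
    packets = []
    for frame in frames:
        coded = encode(frame)
        # Redundant pieces *before* the i-th one are already in the buffer.
        packets.append({"coded": coded, "redundant": list(buffer)})
        buffer.append(encode_redundant(frame))  # buffered for later packets
    return packets
```

A receiver that loses packet i can then recover frame i's redundant copy from any of the next m packets.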

Systems and methods of audio decoder determination and selection
11348592 · 2022-05-31

Playback devices can support audio encoded using various encoding schemes. Playing back such content includes receiving, at a playback device, audio data from an audio source; and receiving an indication from the audio source that the audio data is encoded in the compressed audio format. The device determines, independently of receiving the indication from the audio source that the audio data is encoded in the compressed audio format, whether the audio data is encoded in a compressed audio format. If the audio data is determined to be encoded in the compressed audio format: the device selects a decoder from among a plurality of decoders; decodes the audio data using the selected decoder; and plays back the decoded audio data via the playback device. If the audio data is determined not to be encoded in the compressed audio format, the device inhibits playback of the audio data.
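The independent-determination step can be sketched as simple signature sniffing. The magic-byte table and decoder names below are illustrative assumptions; a real playback device would use far more robust detection.

```python
# Hypothetical signature table for illustration only.
MAGIC_BYTES = {
    b"fLaC": "flac-decoder",
    b"OggS": "ogg-decoder",
}

def handle_audio(audio_data: bytes, source_says_compressed: bool):
    """Return the name of the decoder to use, or None to inhibit playback.

    The source's indication is treated as advisory: the device decides
    independently, based on the data itself, whether it is compressed."""
    for magic, decoder in MAGIC_BYTES.items():
        if audio_data.startswith(magic):
            return decoder  # select a decoder from among several decoders
    return None  # not determined to be compressed: inhibit playback
```

Deciding from the data itself guards against sources that mislabel their streams.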

System and method for intelligent contextual session management for videoconferencing applications

An information handling system executing an intelligent collaboration contextual session management system may comprise a display screen, a speaker, a video camera, a microphone, a sensor hub detecting participation of a user in a videoconference session, and a processor to execute a multimedia multi-user collaboration application to join the videoconference session with a remotely located computing device. The processor may also input a detected user participation level into a trained neural network and output an optimized media capture instruction to the video camera, predicted to adjust performance of the multimedia multi-user collaboration application to meet a preset performance benchmark value during the videoconference session. The video camera may be configured to capture a video sample based on the optimized media capture instruction in response to the detected user participation level, and a network interface device may be configured to transmit the captured video sample to the remotely located computing device.
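The control loop (participation level in, capture instruction out) can be sketched as below. The thresholding "model" is a toy stand-in for the trained neural network, and all names and values are illustrative assumptions.

```python
def capture_instruction(participation_level, model):
    """Map a sensed participation level to a media capture instruction."""
    fps, resolution = model(participation_level)
    return {"fps": fps, "resolution": resolution}

def toy_model(level):
    """Stand-in for the trained neural network: lower capture quality for
    less engaged participants to reduce bandwidth and compute load."""
    return (30, "1080p") if level > 0.5 else (10, "480p")
```

The instruction dict would then drive the camera, and the resulting sample would be sent over the network interface.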

Vector quantizer

A vector quantizer and a method therein for vector quantization, e.g. in a transform audio codec. The method comprises comparing an input target vector with four centroids C₀, C₁, C₀,flip and C₁,flip, wherein centroid C₀,flip is a flipped version of centroid C₀ and centroid C₁,flip is a flipped version of centroid C₁, each centroid representing a respective class of codevectors. A starting point for a search in the codebook related to the input target vector is determined based on the comparison. A search is performed in the codebook, starting at the determined starting point, and a codevector is identified to represent the input target vector. The number of input target vectors per block or time segment is variable, and the search space is dynamically adjusted to that number. The codevectors are sorted according to a distortion measure reflecting the distance between each codevector and the centroids C₀ and C₁.
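The centroid comparison and flip handling can be sketched as follows. This is a simplified illustration under stated assumptions: "flipped" is taken to mean element-reversed, the codebook is assumed pre-sorted by the distortion measure, and for brevity the sketch searches the whole codebook rather than a restricted neighborhood of the starting point.

```python
import numpy as np

def nearest_centroid_class(target, c0, c1):
    """Compare the target with C0, C1 and their flipped versions."""
    candidates = {"c0": c0, "c1": c1,
                  "c0_flip": c0[::-1], "c1_flip": c1[::-1]}
    return min(candidates,
               key=lambda k: float(np.sum((target - candidates[k]) ** 2)))

def quantize(target, codebook, c0, c1):
    """Classify the target, then search the (pre-sorted) codebook."""
    cls = nearest_centroid_class(target, c0, c1)
    flip = cls.endswith("flip")
    t = target[::-1] if flip else target  # search in the un-flipped domain
    errs = np.sum((codebook - t) ** 2, axis=1)
    return int(np.argmin(errs)), flip
```

Restricting the search to codevectors near the chosen class's centroid is what makes the starting point useful in practice.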

Audio encoder for encoding a multichannel signal and audio decoder for decoding an encoded audio signal

A schematic block diagram of an audio encoder for encoding a multichannel audio signal is shown. The audio encoder includes a linear prediction domain encoder, a frequency domain encoder, and a controller for switching between the two. The controller is configured such that a portion of the multichannel signal is represented either by an encoded frame of the linear prediction domain encoder or by an encoded frame of the frequency domain encoder. The linear prediction domain encoder includes a downmixer for downmixing the multichannel signal to obtain a downmix signal, a linear prediction domain core encoder for encoding the downmix signal, and a first joint multichannel encoder for generating first multichannel information from the multichannel signal.
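The controller's either/or structure can be sketched as below. The toy encoder class and the speech/music decision flag are illustrative assumptions standing in for the actual LPD core (e.g. an ACELP/TCX-style coder) and the switching logic.

```python
class ToyLPDEncoder:
    """Illustrative linear-prediction-domain encoder path."""
    def downmix(self, channels):
        # Downmixer: average the channels sample by sample.
        return [sum(s) / len(s) for s in zip(*channels)]
    def core_encode(self, mono):
        # Stand-in for the linear prediction domain core encoder.
        return [round(x, 2) for x in mono]
    def joint_multichannel(self, channels):
        # Stand-in for the first multichannel information (e.g. level cues).
        return [max(c) - min(c) for c in channels]

def encode_portion(channels, speech_like, lpd, fd_encode):
    """Controller: each portion is represented by exactly one path's frame."""
    if speech_like:
        dmx = lpd.downmix(channels)
        return {"domain": "lpd",
                "core": lpd.core_encode(dmx),
                "mch": lpd.joint_multichannel(channels)}
    return {"domain": "fd", "frame": fd_encode(channels)}
```

Note that the multichannel information is derived from the original signal, not from the downmix, mirroring the abstract.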

METHOD AND APPARATUS FOR UPDATING A NEURAL NETWORK
20220156584 · 2022-05-19

Described herein is a method of generating a media bitstream to transmit parameters for updating a neural network implemented in a decoder, wherein the method includes the steps of: (a) determining at least one set of parameters for updating the neural network; (b) encoding the at least one set of parameters and media data to generate the media bitstream; and (c) transmitting the media bitstream to the decoder for updating the neural network with the at least one set of parameters. Described herein are further a method for updating a neural network implemented in a decoder, an apparatus for generating a media bitstream to transmit parameters for updating a neural network implemented in a decoder, an apparatus for updating a neural network implemented in a decoder, and computer program products comprising a computer-readable storage medium with instructions that, when executed by a device having processing capability, cause the device to carry out said methods.
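Steps (a)–(c) amount to multiplexing parameter sets with media data. The toy container below (length-prefixed JSON header followed by the media payload) is an illustrative assumption, not the patented bitstream syntax.

```python
import json
import struct

def pack_bitstream(param_sets, media_payload: bytes) -> bytes:
    """Encoder side: multiplex NN-update parameter sets with media data."""
    header = json.dumps(param_sets).encode("utf-8")
    # 4-byte big-endian length prefix so the decoder can split the stream.
    return struct.pack(">I", len(header)) + header + media_payload

def unpack_bitstream(blob: bytes):
    """Decoder side: recover the parameter sets (to update the NN) and
    the media data (to be decoded as usual)."""
    (n,) = struct.unpack(">I", blob[:4])
    params = json.loads(blob[4:4 + n].decode("utf-8"))
    media = blob[4 + n:]
    return params, media
```

On the decoder side, the recovered parameter sets would be applied to the neural network before (or while) the media payload is decoded.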