G10L19/173

Optimized partial mixing of audio streams encoded by sub-band encoding
09984698 · 2018-05-29

The invention relates to a method for combining a plurality of audio streams encoded by frequency sub-band encoding, comprising the following steps: decoding (E301) a portion of the encoded streams over at least one frequency sub-band; combining (E302) the streams thus decoded to form a mixed stream; selecting (E303), from among the plurality of encoded audio streams, at least one encoded replication stream, over at least one frequency sub-band that is different from that of the decoding step. The method is such that the selection of the at least one encoded replication stream is carried out according to a criterion which takes into consideration the presence of a predetermined frequency band in the encoded stream (E304). The invention also relates to a device which implements the described method and can be integrated into a conference bridge, a communication terminal or a communication gateway.
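The steps of this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the dictionary keys (`low_decoded`, `high_encoded`, `has_target_band`, `high_energy`) are hypothetical, and the energy tie-break among candidate streams is an assumption that the abstract does not state.

```python
def partial_mix(streams):
    """Mix one sub-band in the decoded domain and pick one stream to replicate
    for another sub-band. All field names are hypothetical."""
    # E301/E302: the low sub-band is assumed to be already decoded; sum it
    # sample by sample over the shortest stream.
    length = min(len(s["low_decoded"]) for s in streams)
    mixed_low = [sum(s["low_decoded"][i] for s in streams) for i in range(length)]

    # E303/E304: prefer streams that contain the predetermined frequency band;
    # fall back to all streams if none does.
    candidates = [s for s in streams if s["has_target_band"]] or streams

    # Assumption: among candidates, keep the one with the most high-band energy.
    replication = max(candidates, key=lambda s: s["high_energy"])

    # The high sub-band of the chosen stream is reused in encoded form.
    return mixed_low, replication["high_encoded"]
```

Here the second sub-band is never decoded: the selected stream's encoded data is copied through, which is what keeps the mixing "partial".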

Transition from a transform coding/decoding to a predictive coding/decoding
09984696 · 2018-05-29

Methods and apparatus are provided for coding and decoding a digital audio signal. Decoding includes: decoding according to an inverse transform decoding of a previous frame of samples of the digital signal, which is received and coded according to a transform coding; and decoding according to a predictive decoding of a current frame of samples of the digital signal, which is received and coded according to a predictive coding. The predictive decoding of the current frame is a transition predictive decoding which does not use any adaptive dictionary arising from the previous frame. At least one state of the predictive decoding is reinitialized to a predetermined default value, and an add-overlap step combines a signal segment synthesized by predictive decoding of the current frame and a signal segment synthesized by inverse transform decoding, corresponding to a stored segment of the decoding of the previous frame.
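The add-overlap step can be sketched as a cross-fade between the stored transform-decoded segment and the newly synthesized predictive segment. This is a simplified illustration: the linear fade weights and the function name `overlap_add_transition` are assumptions, since the abstract does not specify a window shape.

```python
def overlap_add_transition(stored_transform_segment, predictive_segment):
    """Cross-fade from the stored inverse-transform segment (previous frame)
    to the predictive-decoding segment (current frame)."""
    n = len(stored_transform_segment)
    if n != len(predictive_segment):
        raise ValueError("segments must have the same length")
    out = []
    for i in range(n):
        # Assumed linear fade: the predictive weight rises across the overlap.
        w = (i + 1) / (n + 1)
        out.append((1.0 - w) * stored_transform_segment[i] + w * predictive_segment[i])
    return out
```

Because the weights of the two segments sum to one at every sample, a signal that is identical in both segments passes through the overlap unchanged.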

FRAME CODING FOR SPATIAL AUDIO DATA

The techniques disclosed herein provide apparatuses and related methods for the communication of spatial audio and related metadata. In some implementations, a source provides prerecorded spatial audio that has embedded metadata. A computing device processes the prerecorded spatial audio to generate an audio codec that is segmented to include a first section of audio data and a second section that includes metadata extracted from the prerecorded spatial audio. The generated audio codec may be received by a device that includes an encoder. The encoder may process the generated audio codec to generate audio data that includes the metadata.

Methods, Encoder And Decoder For Linear Predictive Encoding And Decoding Of Sound Signals Upon Transition Between Frames Having Different Sampling Rates
20180137871 · 2018-05-17

Methods, an encoder and a decoder are configured for transition between frames with different internal sampling rates. Linear predictive (LP) filter parameters are converted from a sampling rate S1 to a sampling rate S2. A power spectrum of a LP synthesis filter is computed, at the sampling rate S1, using the LP filter parameters. The power spectrum of the LP synthesis filter is modified to convert it from the sampling rate S1 to the sampling rate S2. The modified power spectrum of the LP synthesis filter is inverse transformed to determine autocorrelations of the LP synthesis filter at the sampling rate S2. The autocorrelations are used to compute the LP filter parameters at the sampling rate S2.
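The conversion chain described above (power spectrum, spectrum modification, inverse transform, autocorrelations, LP parameters) can be sketched in plain Python. This is a sketch under stated assumptions: the spectrum is modified by truncating it when S2 < S1 and by repeating the last bin when S2 > S1, the frequency grid uses bin midpoints, and the LP coefficients are recovered with the Levinson-Durbin recursion. The function names and the default of 256 bins are illustrative, not taken from the source.

```python
import math

def power_spectrum(a, num_bins):
    """Sample 1/|A(e^{jw})|^2 at w = pi*(k+0.5)/num_bins for a = [1, a1, ..., aM]."""
    spectrum = []
    for k in range(num_bins):
        w = math.pi * (k + 0.5) / num_bins
        re = sum(c * math.cos(w * n) for n, c in enumerate(a))
        im = -sum(c * math.sin(w * n) for n, c in enumerate(a))
        spectrum.append(1.0 / (re * re + im * im))
    return spectrum

def resize_spectrum(spectrum, s1, s2):
    """Assumed modification: truncate for S2 < S1, extend with the last bin for S2 > S1."""
    new_len = round(len(spectrum) * s2 / s1)
    if new_len <= len(spectrum):
        return spectrum[:new_len]
    return spectrum + [spectrum[-1]] * (new_len - len(spectrum))

def autocorrelations(spectrum, order):
    """Inverse cosine transform of the power spectrum, lags 0..order."""
    n_bins = len(spectrum)
    return [
        sum(p * math.cos(lag * math.pi * (k + 0.5) / n_bins)
            for k, p in enumerate(spectrum)) / n_bins
        for lag in range(order + 1)
    ]

def levinson_durbin(r):
    """Solve the normal equations for A(z) = 1 + a1 z^-1 + ... + aM z^-M."""
    order = len(r) - 1
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a

def convert_lp(a, s1, s2, num_bins=256):
    """Convert LP coefficients from sampling rate s1 to sampling rate s2."""
    spectrum = resize_spectrum(power_spectrum(a, num_bins), s1, s2)
    return levinson_durbin(autocorrelations(spectrum, len(a) - 1))
```

For example, `convert_lp([1.0, -1.2, 0.5], 16000, 12800)` returns a second-order filter at the lower rate; when `s1 == s2`, the input coefficients are recovered to within numerical precision.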

Conversion from channel-based audio to HOA
09961467 · 2018-05-01

In one example, a method includes obtaining a representation of a multi-channel audio signal for a source loudspeaker configuration; obtaining a representation of a plurality of spatial positioning vectors (SPVs), in a Higher-Order Ambisonics (HOA) domain, that are based on a source rendering matrix, which is based on the loudspeaker configuration; and generating a HOA soundfield based on the multi-channel audio signal and the plurality of spatial positioning vectors.
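One plausible reading of the last step is that each channel signal is scaled by its spatial positioning vector and the results are summed per HOA coefficient. The sketch below assumes the SPVs are already available (deriving them from the source rendering matrix is not shown), and the function name `channels_to_hoa` is hypothetical.

```python
def channels_to_hoa(channel_signals, spatial_vectors):
    """Build an HOA soundfield as the sum over channels of (signal x SPV).

    channel_signals: one list of samples per loudspeaker channel.
    spatial_vectors: one SPV (list of HOA coefficients) per channel.
    Returns one list of samples per HOA coefficient.
    """
    if len(channel_signals) != len(spatial_vectors):
        raise ValueError("need exactly one spatial vector per channel")
    num_samples = len(channel_signals[0])
    num_coeffs = len(spatial_vectors[0])
    hoa = [[0.0] * num_samples for _ in range(num_coeffs)]
    for signal, vector in zip(channel_signals, spatial_vectors):
        for c, gain in enumerate(vector):
            for t, sample in enumerate(signal):
                hoa[c][t] += gain * sample
    return hoa
```

A first-order soundfield would use four coefficients per vector; the sketch places no restriction on the order.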

Conversion from object-based audio to HOA
09961475 · 2018-05-01

A device obtains an object-based representation of an audio signal of an audio object. The audio signal corresponds to a time interval. Additionally, the device obtains a representation of a spatial vector for the audio object, wherein the spatial vector is defined in a Higher-Order Ambisonics (HOA) domain and is based on a first plurality of loudspeaker locations. The device generates, based on the audio signal of the audio object and the spatial vector, a plurality of audio signals. Each respective audio signal of the plurality of audio signals corresponds to a respective loudspeaker in a plurality of local loudspeakers at a second plurality of loudspeaker locations different from the first plurality of loudspeaker locations.

System and method for reducing tandeming effects in a communication system

The present disclosure is directed towards a system and method for reducing tandeming effects in a communications system. The method may include receiving, at a speech decoder, an input bitstream associated with an incoming initial speech signal from a speech encoder. The method may further include determining whether or not coding is required and if coding is required, modifying an excitation signal associated with the bitstream. The method may also include providing the modified excitation signal to an adaptive encoder.

Methods, apparatus and systems for determining reconstructed audio signal

According to an aspect of the present invention, a method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes obtaining a decoded baseband audio signal by decoding an encoded audio signal and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal. The method further includes generating a noise component based on a noise parameter. Finally, the method includes adjusting a phase of the high-frequency reconstructed signal and obtaining a time-domain reconstructed audio signal by combining the decoded baseband audio signal and the combined high-frequency signal.

Audio decoder, audio encoder, method for providing a decoded audio signal, method for providing an encoded audio signal, audio stream, audio stream provider and computer program using a stream identifier

An audio decoder for providing a decoded audio signal representation on the basis of an encoded audio signal representation is configured to adjust decoding parameters in dependence on a configuration information, to decode one or more audio frames using a current configuration information, to compare a configuration information in a configuration structure associated with one or more frames to be decoded with the current configuration information, and to make a transition to perform decoding using the configuration information in the configuration structure associated with the one or more frames to be decoded as a new configuration information if the configuration information in the configuration structure associated with the one or more frames to be decoded, or a relevant portion thereof, is different from the current configuration information, and to consider a stream identifier information included in the configuration structure when comparing the configuration information.

AUDIO AND VIDEO TRANSCODING APPARATUS AND METHOD, DEVICE, MEDIUM, AND PRODUCT
20240373047 · 2024-11-07

Provided is an audio and video transcoding method performed by a computer device. The method includes: obtaining first multimedia data in a first format; processing the first multimedia data in the first format into intermediate data through a first transcoding operation; processing the intermediate data into at least two pieces of second multimedia data in second formats through at least two second transcoding operations, the first transcoding operation and the at least two second transcoding operations being performed over a data communication channel provided by a media bus; and outputting the at least two pieces of second multimedia data in the second formats.