H04S2420/03

SPATIAL AUDIO PARAMETER ENCODING AND ASSOCIATED DECODING
20230047237 · 2023-02-16

An apparatus comprising means configured to obtain direction parameter values (108) associated with at least two time-frequency parts (202) of at least one audio signal (102); and encode the obtained direction parameter values based on a codebook (206), wherein the codebook comprises two or more quantization levels arranged such that a first quantization level comprises a first set of quantization values, and a second or succeeding quantization level comprises a second or further set of quantization values and preceding quantization level quantization values.
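
The nested arrangement of quantization levels described above can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: the azimuth values in degrees, the names build_nested_codebook and quantize, and the nearest-value search are all hypothetical.

```python
def build_nested_codebook(level_sets):
    """Return one sorted value list per level.

    Level k holds its own new values plus every value of the preceding
    levels, so each level is a superset of the one before it.
    """
    levels = []
    accumulated = set()
    for new_values in level_sets:
        accumulated |= set(new_values)
        levels.append(sorted(accumulated))
    return levels


def quantize(value, level_values):
    """Index of the nearest codebook value (plain distance, no wrap-around)."""
    return min(range(len(level_values)),
               key=lambda i: abs(level_values[i] - value))


# Hypothetical azimuth values in degrees; each inner list is the set of
# values that a level adds on top of the preceding levels.
LEVEL_SETS = [[-90, 0, 90], [-45, 45], [-67.5, -22.5, 22.5, 67.5]]
codebook = build_nested_codebook(LEVEL_SETS)

# Direction values of two time-frequency parts, encoded at the second level.
indices = [quantize(azimuth, codebook[1]) for azimuth in (10.0, 50.0)]
```

Because every level contains the values of the preceding levels, a value chosen at a coarse level remains representable at any finer level.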

Reconstruction of audio scenes from a downmix

Audio objects are associated with positional metadata. A received downmix signal comprises downmix channels that are linear combinations of one or more audio objects and are associated with respective positional locators. In a first aspect, the downmix signal, the positional metadata and frequency-dependent object gains are received. An audio object is reconstructed by applying the object gain to an upmix of the downmix signal in accordance with coefficients based on the positional metadata and the positional locators. In a second aspect, audio objects have been encoded together with at least one bed channel positioned at a positional locator of a corresponding downmix channel. The decoding system receives the downmix signal and the positional metadata of the audio objects. A bed channel is reconstructed by suppressing the content representing audio objects from the corresponding downmix channel on the basis of the positional locator of the corresponding downmix channel.
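
The first aspect can be illustrated with a small sketch. The abstract does not say how the upmix coefficients are derived from the positional metadata and the positional locators, so the inverse-distance weighting below is purely an assumption, as are the function names and the per-band list layout.

```python
import math


def upmix_coefficients(object_pos, locators, eps=1e-9):
    """Hypothetical rule: weight each downmix channel by inverse distance
    between the object position and the channel's positional locator,
    then normalise the weights to sum to 1."""
    weights = [1.0 / (math.dist(object_pos, loc) + eps) for loc in locators]
    total = sum(weights)
    return [w / total for w in weights]


def reconstruct_object(downmix_bands, locators, object_pos, object_gains):
    """Apply a frequency-dependent object gain to an upmix of the downmix.

    downmix_bands[c][b] is the value of downmix channel c in band b;
    object_gains[b] is the received gain for band b.
    """
    coeffs = upmix_coefficients(object_pos, locators)
    return [
        gain * sum(c * channel[b] for c, channel in zip(coeffs, downmix_bands))
        for b, gain in enumerate(object_gains)
    ]


# Two downmix channels located left and right, object midway between them.
locators = [(-1.0, 0.0), (1.0, 0.0)]
downmix_bands = [[2.0, 4.0], [6.0, 8.0]]
reconstructed = reconstruct_object(downmix_bands, locators, (0.0, 0.0), [1.0, 0.5])
```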

Audio encoder and decoder

The present disclosure provides methods, devices and computer program products for encoding and decoding of a vector of parameters in an audio coding system. The disclosure further relates to a method and apparatus for reconstructing an audio object in an audio decoding system. According to the disclosure, a modulo differential approach for encoding and decoding a vector of a non-periodic quantity may improve the coding efficiency and provide encoders and decoders with lower memory requirements. Moreover, an efficient method for encoding and decoding a sparse matrix is provided.
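
A modulo differential scheme can be sketched as below, assuming the parameters are already quantized to integer indices in the range 0 to n_levels - 1. The function names and the choice to send the first element unchanged are assumptions made for illustration; the abstract does not fix these details.

```python
def modulo_diff_encode(indices, n_levels):
    """Replace each index after the first by its difference to the
    previous index, taken modulo n_levels."""
    if not indices:
        return []
    symbols = [indices[0]]
    for prev, cur in zip(indices, indices[1:]):
        symbols.append((cur - prev) % n_levels)
    return symbols


def modulo_diff_decode(symbols, n_levels):
    """Invert modulo_diff_encode by accumulating modulo n_levels."""
    if not symbols:
        return []
    indices = [symbols[0]]
    for diff in symbols[1:]:
        indices.append((indices[-1] + diff) % n_levels)
    return indices


original = [3, 5, 4, 0, 7]
encoded = modulo_diff_encode(original, n_levels=8)
decoded = modulo_diff_decode(encoded, n_levels=8)
```

A plain difference of two indices can take 2 * n_levels - 1 distinct values, whereas the modulo difference takes only n_levels, so any probability or code table indexed by the symbol can be roughly half the size. This is one plausible reading of the memory saving mentioned above.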

Noise filling in multichannel audio coding

In multichannel audio coding, improved coding efficiency is achieved by the following measure: the noise filling of zero-quantized scale factor bands is performed using noise filling sources other than artificially generated noise or spectral replica. In particular, multichannel audio coding may be rendered more efficient by performing the noise filling based on noise generated using spectral lines from a previous frame of, or a different channel of the current frame of, the multichannel audio signal.
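
The idea can be sketched as follows, under several assumptions that are not taken from the abstract: spectra are plain lists of floats, band_edges gives the scale factor band boundaries, and band_rms is a hypothetical per-band target RMS standing in for the transmitted scale factor. The source spectrum may be the previous frame or another channel of the current frame.

```python
import math


def fill_zero_bands(quantized, source, band_edges, band_rms):
    """Fill bands whose lines were all quantized to zero with the
    co-located lines of a source spectrum, rescaled to a target RMS."""
    out = list(quantized)
    bands = zip(band_edges, band_edges[1:])
    for (start, stop), target in zip(bands, band_rms):
        if any(out[start:stop]):
            continue  # band carries nonzero lines, leave it untouched
        src = source[start:stop]
        energy = sum(s * s for s in src)
        if energy == 0.0:
            continue  # nothing usable in the source for this band
        scale = target / math.sqrt(energy / len(src))
        out[start:stop] = [s * scale for s in src]
    return out


quantized = [1.0, -2.0, 0.0, 0.0, 0.0, 0.0]
previous_frame = [0.5, 0.5, 3.0, -4.0, 0.0, 0.0]
filled = fill_zero_bands(quantized, previous_frame,
                         band_edges=[0, 2, 4, 6],
                         band_rms=[1.0, 0.5, 0.5])
```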

Spatial audio processing

An apparatus comprising at least one processor and at least one memory, the memory comprising machine-readable instructions that, when executed, cause the apparatus to: store in a non-volatile memory multiple sets of predetermined spatial audio processing parameters for differently moving sound sources; provide in a man-machine interface an option for a user to select one of the stored multiple sets of predetermined spatial audio processing parameters for differently moving sound sources; and, in response to the user selecting one of the stored multiple sets of predetermined spatial audio processing parameters for differently moving sound sources, use the selected one of the stored multiple sets of predetermined spatial audio processing parameters to spatially process audio from one or more sound sources.

METHODS AND SYSTEMS FOR GENERATING AND RENDERING OBJECT BASED AUDIO WITH CONDITIONAL RENDERING METADATA

Methods and audio processing units for generating an object based audio program including conditional rendering metadata corresponding to at least one object channel of the program, where the conditional rendering metadata is indicative of at least one rendering constraint, based on playback speaker array configuration, which applies to each corresponding object channel, and methods for rendering audio content determined by such a program, including by rendering content of at least one audio channel of the program in a manner compliant with each applicable rendering constraint in response to at least some of the conditional rendering metadata. Rendering of a selected mix of content of the program may provide an immersive experience.

Apparatus, Method, or Computer Program for Processing an Encoded Audio Scene using a Parameter Conversion

An apparatus for processing an encoded audio scene representing a sound field related to a virtual listener position, the encoded audio scene including information on a transport signal and a first set of parameters related to the virtual listener position, includes a parameter converter for converting the first set of parameters into a second set of parameters related to a channel representation including two or more channels for a reproduction at predefined spatial positions for the two or more channels, and an output interface for generating a processed audio scene using the second set of parameters and the information on the transport signal.

Apparatus, Method, or Computer Program for Processing an Encoded Audio Scene using a Parameter Smoothing

Apparatus for processing an audio scene representing a sound field, the audio scene having information on a transport signal and a first set of parameters. The apparatus has a parameter processor for processing the first set of parameters to obtain a second set of parameters, wherein the parameter processor is configured to calculate at least one raw parameter for each output time frame using at least one parameter of the first set of parameters for the input time frame, to calculate smoothing information, such as a factor, for each raw parameter in accordance with a smoothing rule, and to apply the corresponding smoothing information to the corresponding raw parameter to derive the parameter of the second set of parameters for the output time frame. The apparatus further has an output interface for generating a processed audio scene using the second set of parameters and the information on the transport signal.
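
One way to picture the smoothing step is a first-order recursive filter whose factor is chosen per raw parameter. The rule below (smooth strongly while the raw value stays close to the current state, weakly after a large jump) and all thresholds and names are assumptions for illustration only; the abstract does not specify the smoothing rule.

```python
def smoothing_factor(raw, previous, threshold=0.5, slow=0.9, fast=0.2):
    """Hypothetical smoothing rule: a large factor means heavy smoothing."""
    return slow if abs(raw - previous) <= threshold else fast


def smooth_parameters(raw_values, initial=0.0):
    """Derive one smoothed parameter per output time frame."""
    smoothed = []
    state = initial
    for raw in raw_values:
        factor = smoothing_factor(raw, state)
        # Blend the previous smoothed value with the new raw value.
        state = factor * state + (1.0 - factor) * raw
        smoothed.append(state)
    return smoothed


result = smooth_parameters([1.0, 1.0])
```

In this sketch the first frame jumps from 0.0 towards 1.0 with the fast factor, and the second frame, now close to the target, is smoothed with the slow factor.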

Apparatus, Method, or Computer Program for Processing an Encoded Audio Scene using a Bandwidth Extension

Apparatus for processing an audio scene representing a sound field, the audio scene comprising information on a transport signal and a set of parameters. The apparatus comprises an output interface for generating a processed audio scene using the set of parameters and the information on the transport signal, wherein the output interface is configured to generate a raw representation of two or more channels using the set of parameters and the transport signal and a multichannel enhancer for generating an enhancement representation of the two or more channels using the transport signal, and a signal combiner for combining the raw representation of the two or more channels and the enhancement representation of the two or more channels to obtain the processed audio scene.

APPARATUS AND METHOD FOR ENCODING A PLURALITY OF AUDIO OBJECTS USING DIRECTION INFORMATION DURING A DOWNMIXING OR APPARATUS AND METHOD FOR DECODING USING AN OPTIMIZED COVARIANCE SYNTHESIS

An apparatus for encoding a plurality of audio objects and related metadata indicating direction information on the plurality of audio objects has: a downmixer for downmixing the plurality of audio objects to obtain one or more transport channels; a transport channel encoder for encoding one or more transport channels to obtain one or more encoded transport channels; and an output interface for outputting an encoded audio signal comprising the one or more encoded transport channels, wherein the downmixer is configured to downmix the plurality of audio objects in response to the direction information on the plurality of audio objects.
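
A direction-dependent downmix could look like the sketch below. It assumes two transport channels formed by virtual cardioid microphones pointing to azimuths of +90 and -90 degrees, with each object weighted by the cardioid response at its azimuth. The abstract only states that the downmix responds to the direction information, so the weighting rule, the function names, and the sample layout are all hypothetical.

```python
import math


def cardioid_gain(object_azimuth_deg, mic_azimuth_deg):
    """Cardioid response: 1 when the object is on-axis, 0 when opposite."""
    angle = math.radians(object_azimuth_deg - mic_azimuth_deg)
    return 0.5 * (1.0 + math.cos(angle))


def downmix(objects, azimuths_deg, mic_azimuths_deg=(90.0, -90.0)):
    """Mix the objects into one transport channel per virtual microphone.

    objects[i] is the sample list of object i; azimuths_deg[i] is its
    direction. All objects are assumed to have the same length.
    """
    n_samples = len(objects[0])
    channels = []
    for mic in mic_azimuths_deg:
        gains = [cardioid_gain(az, mic) for az in azimuths_deg]
        channels.append([
            sum(g * obj[t] for g, obj in zip(gains, objects))
            for t in range(n_samples)
        ])
    return channels


# One object hard left, one hard right.
left, right = downmix([[1.0, 1.0], [2.0, 0.0]], azimuths_deg=[90.0, -90.0])
```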