Patent classifications
H04S2420/03
Binaural Dialogue Enhancement
Methods for dialogue enhancing audio content, comprising providing a first audio signal presentation of the audio components, providing a second audio signal presentation, receiving a set of dialogue estimation parameters configured to enable estimation of dialogue components from the first audio signal presentation, applying said set of dialogue estimation parameters to said first audio signal presentation, to form a dialogue presentation of the dialogue components; and combining the dialogue presentation with said second audio signal presentation to form a dialogue enhanced audio signal presentation for reproduction on the second audio reproduction system, wherein at least one of said first and second audio signal presentation is a binaural audio signal presentation.
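The signal flow in this abstract can be illustrated with a minimal sketch. It assumes that each presentation is a list of two-channel [left, right] frames and that the dialogue estimation parameters form a single 2x2 matrix; the names `apply_matrix`, `enhance_dialogue` and `gain` are hypothetical and do not come from the patent.

```python
def apply_matrix(matrix, frame):
    """Multiply a 2x2 parameter matrix by one [left, right] frame."""
    return [sum(m * x for m, x in zip(row, frame)) for row in matrix]


def enhance_dialogue(first_pres, second_pres, params, gain):
    """Estimate dialogue from the first presentation and add it,
    scaled by `gain`, to the second presentation."""
    enhanced = []
    for first_frame, second_frame in zip(first_pres, second_pres):
        # Apply the dialogue estimation parameters to the first presentation.
        dialogue = apply_matrix(params, first_frame)
        # Combine the dialogue presentation with the second presentation.
        enhanced.append([s + gain * d for s, d in zip(second_frame, dialogue)])
    return enhanced


first = [[1.0, 0.5], [0.2, 0.4]]    # e.g. a stereo presentation (assumed)
second = [[0.8, 0.6], [0.1, 0.3]]   # e.g. a binaural presentation (assumed)
params = [[0.5, 0.5], [0.5, 0.5]]   # illustrative parameter values only
enhanced = enhance_dialogue(first, second, params, gain=2.0)
```

A real system would likely apply such parameters per time/frequency tile rather than per sample; the sketch keeps one matrix to show only the estimate-then-combine structure.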
Apparatus and method for downmixing or upmixing a multichannel signal using phase compensation
An apparatus for downmixing a multi-channel signal having at least two channels, has: a downmixer for calculating a downmix signal from the multi-channel signal, wherein the downmixer is configured to calculate the downmix using an absolute phase compensation, so that a channel having a lower energy among the at least two channels is only rotated or is rotated stronger than a channel having a greater energy in calculating the downmix signal; and an output interface for generating an output signal, the output signal having information on the downmix signal.
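As a rough illustration of the idea that only the lower-energy channel is rotated, the sketch below works on two lists of complex spectral bins. The function name and the choice to estimate the phase difference from the summed cross-spectrum are assumptions made for this example, not details taken from the patent.

```python
import cmath


def downmix_phase_compensated(ch1, ch2):
    """Downmix two channels of complex bins, rotating only the weaker one."""
    e1 = sum(abs(x) ** 2 for x in ch1)
    e2 = sum(abs(x) ** 2 for x in ch2)
    strong, weak = (ch1, ch2) if e1 >= e2 else (ch2, ch1)
    # Phase of the cross-spectrum = phase(strong) - phase(weak).
    cross = sum(s * w.conjugate() for s, w in zip(strong, weak))
    rotation = cmath.exp(1j * cmath.phase(cross)) if cross != 0 else 1.0
    # The stronger channel keeps its phase; the weaker one is aligned to it.
    return [0.5 * (s + rotation * w) for s, w in zip(strong, weak)]


ch1 = [1 + 0j, 2 + 0j]
ch2 = [-0.5 + 0j, -1 + 0j]   # out of phase with ch1, lower energy
downmix = downmix_phase_compensated(ch1, ch2)
plain = [0.5 * (a + b) for a, b in zip(ch1, ch2)]
```

With these inputs the plain average partly cancels (first bin 0.25), while the compensated downmix adds the channels coherently (first bin approximately 0.75).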
Audio signal decoder, audio signal encoder, method for providing an upmix signal representation, method for providing a downmix signal representation, computer program and bitstream using a common inter-object-correlation parameter value
An audio signal decoder for providing an upmix signal representation on the basis of a downmix signal representation and an object-related parametric information and in dependence on a rendering information has an object parameter determinator. The object parameter determinator is configured to obtain inter-object-correlation values for a plurality of pairs of audio objects. The object parameter determinator is configured to evaluate a bitstream signaling parameter in order to decide whether to evaluate individual inter-object-correlation bitstream parameter values to obtain inter-object-correlation values for a plurality of pairs of related audio objects, or to obtain inter-object-correlation values for a plurality of pairs of related audio objects using a common inter-object-correlation bitstream parameter value. The audio signal decoder also has a signal processor configured to obtain the upmix signal representation on the basis of the downmix signal representation and using the inter-object-correlation values for a plurality of pairs of related objects and the rendering information.
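The decision described here, between individual and common inter-object-correlation values, might be sketched as follows. The function name, the boolean `use_common` standing in for the bitstream signaling parameter, and the list-of-pairs layout are all hypothetical; the actual bitstream syntax is not given in the abstract.

```python
def build_ioc_matrix(num_objects, related_pairs, use_common,
                     common_value=None, individual_values=None):
    """Fill a symmetric inter-object-correlation matrix for related pairs."""
    # Each object is fully correlated with itself; unrelated pairs stay at 0.
    ioc = [[1.0 if i == j else 0.0 for j in range(num_objects)]
           for i in range(num_objects)]
    for idx, (i, j) in enumerate(related_pairs):
        # The signaling flag selects one shared value or per-pair values.
        value = common_value if use_common else individual_values[idx]
        ioc[i][j] = ioc[j][i] = value
    return ioc


pairs = [(0, 1), (1, 2)]
common = build_ioc_matrix(3, pairs, use_common=True, common_value=0.4)
individual = build_ioc_matrix(3, pairs, use_common=False,
                              individual_values=[0.2, 0.7])
```

Signaling one common value trades per-pair accuracy for fewer transmitted parameters, which is presumably the motivation for the flag.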
Object clustering for rendering object-based audio content based on perceptual criteria
Embodiments are directed to a method of rendering object-based audio comprising determining an initial spatial position of objects having object audio data and associated metadata, determining a perceptual importance of the objects, and grouping the audio objects into a number of clusters based on the determined perceptual importance of the objects, such that a spatial error caused by moving an object from an initial spatial position to a second spatial position in a cluster is minimized for objects with a relatively high perceptual importance. The perceptual importance is based at least in part on a partial loudness of an object and content semantics of the object.
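One simple way to keep the spatial error small for important objects is to let the most important objects define the cluster positions, so that they do not move at all. The sketch below uses that heuristic; it is an assumption chosen for illustration, not the clustering procedure of the patent, and the importance scores are made-up numbers rather than values derived from partial loudness or content semantics.

```python
import math


def cluster_objects(objects, num_clusters):
    """objects: list of (name, position, importance) tuples.
    The most important objects become cluster centroids (zero spatial
    error); every other object joins the nearest centroid."""
    ranked = sorted(objects, key=lambda o: o[2], reverse=True)
    centroids = ranked[:num_clusters]
    clusters = {name: [name] for name, _, _ in centroids}
    for name, position, _ in ranked[num_clusters:]:
        nearest = min(centroids, key=lambda c: math.dist(position, c[1]))
        clusters[nearest[0]].append(name)
    return clusters


objects = [
    ("dialog", (0.0, 1.0, 0.0), 0.9),
    ("music", (-1.0, 0.0, 0.0), 0.6),
    ("ambience", (-0.8, 0.2, 0.0), 0.1),
    ("fx", (0.1, 0.9, 0.0), 0.3),
]
clusters = cluster_objects(objects, num_clusters=2)
```

Here "dialog" and "music" keep their positions, while "fx" and "ambience" are moved to the nearer of the two.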
Methods and systems for generating and interactively rendering object based audio
Methods for generating an object based audio program, renderable in a personalizable manner, and including a bed of speaker channels renderable in the absence of selection of other program content (e.g., to provide a default full range audio experience). Other embodiments include steps of delivering, decoding, and/or rendering such a program. Rendering of content of the bed, or of a selected mix of other content of the program, may provide an immersive experience. The program may include multiple object channels (e.g., object channels indicative of user-selectable and user-configurable objects), the bed of speaker channels, and other speaker channels. Another aspect is an audio processing unit (e.g., encoder or decoder) configured to perform, or which includes a buffer memory which stores at least one frame (or other segment) of an object based audio program (or bitstream thereof) generated in accordance with, any embodiment of the method.
Encoding device and method, decoding device and method, and program
The present technique relates to an encoding device and a method, a decoding device and a method, and a program capable of obtaining higher quality audio. An encoding unit encodes position information and a gain of an object in a current frame in multiple encoding modes. A compressing unit generates, for each combination of encoding modes of each piece of position information and each gain, encoded metadata including encoding mode information indicating the encoding modes and encoded data, which are the encoded position information and gains, and compresses the encoding mode information included in the encoded metadata. A determining unit selects the encoded metadata with the smallest amount of data from among the encoded metadata generated for each combination, thus determining the encoding mode of each piece of position information and each gain. The present technique can be applied to an encoder and a decoder.
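The selection step, trying every combination of encoding modes and keeping the smallest result, can be sketched as an exhaustive search. Everything concrete below is assumed for illustration: the two modes ("raw" and "diff"), the 8-bit quantized values, the bit-cost model, and the fixed one-bit mode signal (the abstract instead compresses the mode information, which this sketch does not model).

```python
from itertools import product

MODE_BITS = 1  # hypothetical: one bit per field signals its encoding mode


def encoded_bits(mode, current, previous):
    """Hypothetical cost model for one quantized metadata value."""
    if mode == "raw":
        return 8  # send the full 8-bit value
    delta = abs(current - previous)
    return 1 + delta.bit_length()  # sign bit plus magnitude bits


def choose_modes(current, previous):
    """Try every mode combination and keep the one with the fewest bits."""
    fields = list(current)
    best = None
    for combo in product(("raw", "diff"), repeat=len(fields)):
        total = sum(MODE_BITS + encoded_bits(m, current[f], previous[f])
                    for m, f in zip(combo, fields))
        if best is None or total < best[1]:
            best = (dict(zip(fields, combo)), total)
    return best


previous = {"azimuth": 100, "elevation": 20, "gain": 64}
current = {"azimuth": 103, "elevation": 20, "gain": 200}
modes, total_bits = choose_modes(current, previous)
```

With these values the small azimuth change and the unchanged elevation are cheaper as differences, while the large gain jump is cheaper sent raw.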
Decoding method and decoder for dialog enhancement
There is provided a method for enhancing dialog in a decoder of an audio system. The method comprises receiving a plurality of downmix signals being a downmix of a larger plurality of channels; receiving parameters for dialog enhancement being defined with respect to a subset of the plurality of channels that is downmixed into a subset of the plurality of downmix signals; upmixing the subset of downmix signals parametrically in order to reconstruct the subset of the plurality of channels with respect to which the parameters for dialog enhancement are defined; applying dialog enhancement to the subset of the plurality of channels with respect to which the parameters for dialog enhancement are defined using the parameters for dialog enhancement to provide at least one dialog enhanced signal; and subjecting the at least one dialog enhanced signal to mixing to provide dialog enhanced versions of the subset of downmix signals.
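The three stages named in this abstract (parametric upmix, dialog enhancement on the reconstructed channels, mixing back to the downmix) can be sketched for a single downmix sample. The scalar weights and the function name are hypothetical simplifications; a real decoder would operate on several downmix signals and on time/frequency tiles.

```python
def enhance_dialog_in_downmix(downmix, upmix_weights, dialog_weights, gain):
    """Return a dialog-enhanced version of one downmix sample."""
    # Parametric upmix: reconstruct the channels the parameters refer to.
    channels = [w * downmix for w in upmix_weights]
    # Dialog enhancement: boost the dialog share of each channel.
    enhanced = [c + (gain - 1.0) * d * c
                for c, d in zip(channels, dialog_weights)]
    # Mixing: fold the enhanced channels back into the downmix domain.
    return sum(enhanced)


# Assumed example: the downmix holds a left channel (25%) and a centre
# channel (75%), and the dialog sits entirely in the centre channel.
result = enhance_dialog_in_downmix(1.0, [0.25, 0.75], [0.0, 1.0], gain=2.0)
```

Only the subset of channels carrying dialog parameters needs to be reconstructed, which is the point of upmixing a subset rather than the full channel configuration.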
Binaural decoder to output spatial stereo sound and a decoding method thereof
A binaural decoder for an MPEG surround stream, which decodes an MPEG surround stream into a stereo 3D signal, and a decoding method thereof. The method includes dividing a compressed audio stream and head related transfer function (HRTF) data into subbands, selecting predetermined subbands of the HRTF data divided into subbands and filtering the HRTF data to obtain the selected subbands, decoding the audio stream divided into subbands into a stream of multi-channel audio data with respect to subbands according to spatial additional information, and binaural-synthesizing the HRTF data of the selected subbands with the multi-channel audio data of corresponding subbands.
Stereo audio signal encoder
An apparatus comprising a mapper configured to map an instance of a parameter according to a first mapping to generate a first mapped instance; a remapper configured to remap the first mapped instance dependent on the frequency distribution of mapped instances to generate a remapped instance with an associated order position; and an encoder configured to encode the remapped instance dependent on an order position of the remapped instance.
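The map, remap, encode chain can be illustrated with a small sketch. The uniform quantizer used as the first mapping, the frequency-sorted remapping table, and the unary code are all assumptions chosen to make the idea concrete; the abstract does not specify any of them.

```python
from collections import Counter


def first_mapping(value, step=0.5):
    """Hypothetical first mapping: uniform quantization to an index."""
    return round(value / step)


def build_remap(mapped_history):
    """Order mapped instances by decreasing frequency, so that the most
    frequent instance gets order position 0."""
    counts = Counter(mapped_history)
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    return {symbol: pos for pos, symbol in enumerate(ordered)}


def encode_position(position):
    """Unary code: lower order positions get shorter codewords."""
    return "1" * position + "0"


values = [1.0, 1.1, 0.9, 2.0, 1.0, 0.0]
mapped = [first_mapping(v) for v in values]
remap = build_remap(mapped)
codewords = [encode_position(remap[m]) for m in mapped]
```

Because the most common mapped instance lands at position 0, it receives the one-bit codeword, which is what makes the frequency-dependent remapping worthwhile.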
Reflected sound rendering for object-based audio
Embodiments are described for rendering spatial audio content through a system that is configured to reflect audio off of one or more surfaces of a listening environment. The system includes an array of audio drivers distributed around a room, wherein at least one driver of the array of drivers is configured to project sound waves toward one or more surfaces of the listening environment for reflection to a listening area within the listening environment and a renderer configured to receive and process audio streams and one or more metadata sets that are associated with each of the audio streams and that specify a playback location in the listening environment.