H04S5/00

Reconstruction of audio scenes from a downmix

Audio objects are associated with positional metadata. A received downmix signal comprises downmix channels that are linear combinations of one or more audio objects and are associated with respective positional locators. In a first aspect, the downmix signal, the positional metadata and frequency-dependent object gains are received. An audio object is reconstructed by applying the object gain to an upmix of the downmix signal in accordance with coefficients based on the positional metadata and the positional locators. In a second aspect, audio objects have been encoded together with at least one bed channel positioned at a positional locator of a corresponding downmix channel. The decoding system receives the downmix signal and the positional metadata of the audio objects. A bed channel is reconstructed by suppressing the content representing audio objects from the corresponding downmix channel on the basis of the positional locator of the corresponding downmix channel.
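
The first aspect above can be sketched as follows. This is a minimal illustration, not the method claimed in the patent: the inverse-distance rule for deriving upmix coefficients, the band-wise data layout and the names `upmix_coefficients` and `reconstruct_object` are all assumptions made for the example. The abstract states only that the coefficients are based on the positional metadata and the positional locators.

```python
import math


def upmix_coefficients(object_pos, locators):
    """Assumed rule: weight each downmix channel by inverse distance
    between the object position and that channel's positional locator,
    then normalise the weights so they sum to one."""
    raw = [1.0 / (math.dist(object_pos, loc) + 1e-3) for loc in locators]
    total = sum(raw)
    return [r / total for r in raw]


def reconstruct_object(downmix_bands, object_pos, locators, object_gains):
    """downmix_bands[m][b] is the value of downmix channel m in band b.
    object_gains[b] is the received frequency-dependent object gain."""
    coeffs = upmix_coefficients(object_pos, locators)
    n_bands = len(object_gains)
    # Upmix: position-dependent linear combination of the downmix channels.
    upmix = [
        sum(c * ch[b] for c, ch in zip(coeffs, downmix_bands))
        for b in range(n_bands)
    ]
    # Apply the per-band object gain to the upmix.
    return [g * u for g, u in zip(object_gains, upmix)]
```

For an object midway between two locators, both downmix channels contribute equally before the per-band gain is applied.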

AUDIO PROVIDING APPARATUS AND AUDIO PROVIDING METHOD

An audio providing apparatus and method are provided. The audio providing apparatus includes: an object renderer configured to render an object audio signal based on geometric information regarding the object audio signal; a channel renderer configured to render an audio signal having a first channel number into an audio signal having a second channel number; and a mixer configured to mix the rendered object audio signal with the audio signal having the second channel number.
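
The three stages named above (object renderer, channel renderer, mixer) might be sketched as below. Everything specific here is an assumption for illustration: the geometric information is reduced to a single azimuth, the object is panned with a constant-power law to two channels, and `DOWNMIX_3_TO_2` is a hypothetical matrix converting three channels (L, R, C) into two.

```python
import math

# Hypothetical 3-to-2 matrix (rows: output L, R; columns: input L, R, C).
DOWNMIX_3_TO_2 = [
    [1.0, 0.0, math.sqrt(0.5)],
    [0.0, 1.0, math.sqrt(0.5)],
]


def render_object(samples, azimuth_deg):
    """Object renderer: constant-power pan of a mono object to stereo.
    Assumes azimuth in [-45, 45] degrees, with -45 meaning fully left."""
    theta = math.radians(azimuth_deg + 45.0)
    gain_left, gain_right = math.cos(theta), math.sin(theta)
    return [[gain_left * s for s in samples],
            [gain_right * s for s in samples]]


def render_channels(channels, matrix):
    """Channel renderer: convert the first channel count (matrix columns)
    into the second channel count (matrix rows)."""
    n = len(channels[0])
    return [
        [sum(w * ch[t] for w, ch in zip(row, channels)) for t in range(n)]
        for row in matrix
    ]


def mix(a, b):
    """Mixer: sample-wise sum of two signals with equal channel counts."""
    return [[x + y for x, y in zip(ca, cb)] for ca, cb in zip(a, b)]
```

A typical call chain would be `mix(render_object(obj, az), render_channels(bed, DOWNMIX_3_TO_2))`.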

ACOUSTIC SIGNAL PROCESSING APPARATUS, ACOUSTIC SIGNAL PROCESSING METHOD, AND PROGRAM
20180007485 · 2018-01-04 ·

The present technology relates to an acoustic signal processing apparatus, an acoustic signal processing method, and a program for expanding a range of listening positions in which an effect of a transaural reproduction system can be obtained. First and second output signals for localizing a sound image in front of or behind and on the left of a first position located on the left of a listening position are output from first and second speakers, respectively. Third and fourth output signals for localizing a sound image in front of or behind and on the right of a second position located on the right of the listening position are output from third and fourth speakers, respectively. The first speaker is disposed in a first direction in front of or behind the listening position and on the left of the listening position. The second speaker is disposed in the first direction and on the right of the listening position. The third speaker is disposed in the first direction and on the left of the listening position and on the right of the first speaker. The fourth speaker is disposed in the first direction of the listening position and on the right of the second speaker. The present technology can be applied, for example, to an acoustic processing system.

Manipulation of Playback Device Response Using Signal Processing
20180014137 · 2018-01-11 ·

An example playback device receives left and right channels of audio content and generates a center channel of the audio content by combining at least a portion of the left and right channels. The playback device generates first and second side channels of the audio content by combining the center channel and a difference of the left channel and the right channel and combining the center channel and an inverse of the difference of the left channel and the right channel, respectively. The playback device plays back the center channel of the audio content according to a first radiation pattern having a maximum aligned with a first direction, the first side channel according to a second radiation pattern having a maximum aligned with a second direction, and the second side channel according to a third radiation pattern having a maximum aligned with a fourth direction.
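
The channel derivation described above can be written out as a short sketch. The weight `center_weight` and the function name `derive_channels` are assumptions; the abstract says only that the center channel combines at least a portion of the left and right channels, and that the side channels combine the center with the left-right difference and with its inverse.

```python
def derive_channels(left, right, center_weight=0.5):
    """Return (center, side1, side2) sample lists from stereo input.

    center = center_weight * (L + R)   (assumed weighting)
    side1  = center + (L - R)
    side2  = center - (L - R)          (inverse of the difference)
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    center = [center_weight * (l + r) for l, r in zip(left, right)]
    diff = [l - r for l, r in zip(left, right)]
    side1 = [c + d for c, d in zip(center, diff)]
    side2 = [c - d for c, d in zip(center, diff)]
    return center, side1, side2
```

Where the left and right samples are identical, the difference is zero and both side channels equal the center channel.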

AUDIO CHANNEL SPATIAL TRANSLATION

The present invention is directed to methods and apparatus for translating a first plurality of audio input channels to a second plurality of audio output channels. This includes determining that there is pair-wise coding among any of the first plurality of audio input channels; determining an input/output-mapping matrix for mapping at least a first set of the first plurality of audio input channels to at least a second set of the second plurality of audio output channels; and deriving the second plurality of audio output channels based on the first plurality of audio input channels, the input/output-mapping matrix and the determined pair-wise coding. The first plurality of audio input channels represents the same soundfield as the second plurality of audio output channels.

Apparatus, Method, or Computer Program for Processing an Encoded Audio Scene using a Parameter Conversion

An apparatus for processing an encoded audio scene representing a sound field related to a virtual listener position, the encoded audio scene including information on a transport signal and a first set of parameters related to the virtual listener position, includes: a parameter converter for converting the first set of parameters into a second set of parameters related to a channel representation including two or more channels for a reproduction at predefined spatial positions for the two or more channels; and an output interface for generating a processed audio scene using the second set of parameters and the information on the transport signal.

COLORLESS GENERATION OF ELEVATION PERCEPTUAL CUES USING ALL-PASS FILTER NETWORKS
20230025801 · 2023-01-26 ·

A system includes one or more computing devices that encode spatial perceptual cues into a monaural channel to generate a plurality of output channels. A computing device determines a target amplitude response for the mid and side channels of the plurality of output channels, defining a spatial perceptual cue associated with one or more frequency-dependent phase shifts. The computing device determines a transfer function of a single-input, multi-output allpass filter based on the target amplitude response, determines coefficients of the allpass filter based on the transfer function, and processes the monaural channel with the coefficients of the allpass filter to generate the plurality of channels having the encoded spatial perceptual cues. The allpass filter is configured to be colorless with respect to the individual output channels, allowing the placement of spatial cues into the audio stream to be decoupled from the overall coloration of the audio.
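
The "colorless" property can be illustrated with a textbook first-order allpass section. This is not the single-input, multi-output filter of the abstract, whose design is not given here; it is a sketch, under that stated simplification, showing that an allpass changes phase with frequency while leaving magnitude at unity. The function name and the coefficient `a` are illustrative.

```python
import cmath


def allpass_response(a, omega):
    """Frequency response of H(z) = (a + z^-1) / (1 + a * z^-1)
    evaluated at z = exp(j * omega), for a real coefficient |a| < 1."""
    z_inv = cmath.exp(-1j * omega)
    return (a + z_inv) / (1 + a * z_inv)


def magnitude_and_phase(a, omega):
    """Return (|H|, arg H) at radian frequency omega."""
    h = allpass_response(a, omega)
    return abs(h), cmath.phase(h)
```

For any real `a` with magnitude below one, the returned magnitude stays at one across frequency, while the phase moves from zero at `omega = 0` to plus or minus pi at `omega = pi`.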

Methods and apparatus for rendering audio objects

Multiple virtual source locations may be defined for a volume within which audio objects can move. A set-up process for rendering audio data may involve receiving reproduction speaker location data and pre-computing gain values for each of the virtual sources according to the reproduction speaker location data and each virtual source location. The gain values may be stored and used during “run time,” during which audio reproduction data are rendered for the speakers of the reproduction environment. During run time, for each audio object, contributions from virtual source locations within an area or volume defined by the audio object position data and the audio object size data may be computed. A set of gain values for each output channel of the reproduction environment may be computed based, at least in part, on the computed contributions. Each output channel may correspond to at least one reproduction speaker of the reproduction environment.
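
The split between set-up and run time described above could look like the sketch below. The inverse-distance gain law, the power-based combination and the nearest-source fallback are assumptions chosen to keep the example short; the abstract does not specify how the gain values or contributions are computed.

```python
import math


def precompute_gains(virtual_sources, speakers):
    """Set-up: store one gain per (virtual source, speaker) pair.
    Assumed law: inverse distance, power-normalised per virtual source."""
    table = []
    for vs in virtual_sources:
        raw = [1.0 / (math.dist(vs, sp) + 1e-3) for sp in speakers]
        norm = math.sqrt(sum(g * g for g in raw))
        table.append([g / norm for g in raw])
    return table


def render_object_gains(position, size, virtual_sources, table):
    """Run time: combine the stored gains of all virtual sources lying
    within `size` of the object position into one gain per output channel."""
    inside = [i for i, vs in enumerate(virtual_sources)
              if math.dist(vs, position) <= size]
    if not inside:
        # Assumed fallback: use the nearest virtual source.
        inside = [min(range(len(virtual_sources)),
                      key=lambda i: math.dist(virtual_sources[i], position))]
    n_out = len(table[0])
    power = [sum(table[i][ch] ** 2 for i in inside) for ch in range(n_out)]
    total = math.sqrt(sum(power))
    return [math.sqrt(p) / total for p in power]
```

Because the table is built once per speaker layout, the run-time step reduces to lookups and sums over the virtual sources covered by each object.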