Patent classifications
H04S5/00
Renderer controlled spatial upmix
An audio decoder device for decoding a compressed input audio signal comprises: at least one core decoder having one or more processors for generating a processor output signal based on a processor input signal, wherein the number of output channels of the processor output signal is higher than the number of input channels of the processor input signal, wherein each of the one or more processors has a decorrelator and a mixer, wherein a core decoder output signal having a plurality of channels comprises the processor output signal, and wherein the core decoder output signal is suitable for a reference loudspeaker setup; at least one format converter device configured to convert the core decoder output signal into an output audio signal that is suitable for a target loudspeaker setup; and a control device configured to control at least one of the one or more processors in such a way that the decorrelator of the processor may be controlled independently of the mixer of the processor, wherein the control device is configured to control at least one of the decorrelators of the one or more processors depending on the target loudspeaker setup.
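The separation of decorrelator and mixer described above can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `UpmixProcessor` class, the plain sample delay standing in for a decorrelator, the sum/difference mixer, and the `control` policy, which assumes the decorrelator is only worth running when the two upmixed channels stay separate in the target loudspeaker setup.

```python
from dataclasses import dataclass


@dataclass
class UpmixProcessor:
    """One-to-two upmix stage whose decorrelator can be switched without touching the mixer."""
    decorrelator_on: bool = True
    delay: int = 2          # hypothetical decorrelator: a plain sample delay
    wet_gain: float = 0.5   # hypothetical mixer weight for the decorrelated part

    def decorrelate(self, x):
        # When switched off, the decorrelator contributes silence.
        if not self.decorrelator_on:
            return [0.0] * len(x)
        return ([0.0] * self.delay + list(x))[:len(x)]

    def mix(self, dry, wet):
        # Sum/difference mixing of the dry and decorrelated signals.
        left = [d + self.wet_gain * w for d, w in zip(dry, wet)]
        right = [d - self.wet_gain * w for d, w in zip(dry, wet)]
        return left, right

    def process(self, x):
        return self.mix(x, self.decorrelate(x))


def control(processor, pair_is_separate_in_target):
    # Assumed policy: skip decorrelation when the format converter would
    # merge both output channels into one target loudspeaker anyway.
    processor.decorrelator_on = pair_is_separate_in_target


proc = UpmixProcessor()
control(proc, pair_is_separate_in_target=False)
left, right = proc.process([1.0, 2.0, 3.0])
```

With the decorrelator switched off, both outputs equal the input, while the mixer itself is unchanged.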
AUDIO DECODER FOR AUDIO CHANNEL RECONSTRUCTION
A method and apparatus for reconstructing N audio channels from M audio channels are disclosed. The method includes receiving a bitstream containing an encoded audio signal representing the M audio channels and decoding the encoded audio signal to obtain a frequency domain representation of the M audio channels. The method further includes extracting a parameter from the bitstream and reconstructing at least one of the N audio channels using the parameter. The parameter represents an angle between two signals, at least one of which is included in the M audio channels.
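As a rough illustration of reconstruction from an angle parameter, the Python sketch below takes M = 1 and N = 2 and treats the angle as an energy-preserving rotation applied to each frequency bin. This is only one possible reading of the abstract, and the function name `upmix_with_angle` and the cosine/sine weighting are assumptions, not the claimed method.

```python
import math


def upmix_with_angle(mono_bins, angle):
    """Split one channel of frequency-domain bins into two channels.

    Assumption: the transmitted angle sets the energy split, so that
    |left|^2 + |right|^2 equals |mono|^2 in every bin.
    """
    c, s = math.cos(angle), math.sin(angle)
    left = [c * b for b in mono_bins]
    right = [s * b for b in mono_bins]
    return left, right


# Hypothetical decoded bins (complex values) and a decoded angle of 45 degrees.
bins = [1 + 0j, 0 + 2j]
left, right = upmix_with_angle(bins, math.pi / 4)
```

An angle of 0 sends all energy to the first channel, and an angle of pi/2 sends it all to the second.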
Systems and Methods for Spatial Audio Rendering
- Christopher John Stringer
- Afrooz Family
- Fabian Renn-Giles
- David Narajowski
- Joshua Phillip Song
- John Moreland
- Pooja Patel
- Pere Aizcorbe Arrocha
- Nicholas Knudson
- Nathan Hoyt
- Marc Carino
- Mark Rakes
- Ryan Mihelich
- Matthew Brown
- Bas Ording
- Robert Tilton
- Jay Sterling Coggin
- Lasse Vetter
- Christos Kyriakakis
- Matthew Robbetts
- Matthias Kronlachner
Systems and methods for rendering spatial audio in accordance with embodiments of the invention are illustrated. One embodiment includes a spatial audio system including a primary network-connected speaker that includes a plurality of sets of drivers, where each set of drivers is oriented in a different direction; a processor system; and memory containing an audio player application. The audio player application configures the processor system to obtain an audio source stream from an audio source via a network interface, spatially encode the audio source, and decode the spatially encoded audio source to obtain driver inputs for the individual drivers in the plurality of sets of drivers, where the driver inputs cause the drivers to generate directional audio.
RECONSTRUCTION OF AUDIO SCENES FROM A DOWNMIX
Audio objects are associated with positional metadata. A received downmix signal comprises downmix channels that are linear combinations of one or more audio objects and are associated with respective positional locators.
In a first aspect, the downmix signal, the positional metadata and frequency-dependent object gains are received. An audio object is reconstructed by applying the object gain to an upmix of the downmix signal in accordance with coefficients based on the positional metadata and the positional locators.
In a second aspect, audio objects have been encoded together with at least one bed channel positioned at a positional locator of a corresponding downmix channel. The decoding system receives the downmix signal and the positional metadata of the audio objects. A bed channel is reconstructed by suppressing the content representing audio objects from the corresponding downmix channel on the basis of the positional locator of the corresponding downmix channel.
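The first aspect above (upmix the downmix with position-dependent coefficients, then apply a frequency-dependent object gain) can be sketched in Python as follows. The inverse-distance weighting in `upmix_coefficients` is a hypothetical stand-in for whatever coefficient rule the patent actually uses, and the per-band list layout is likewise assumed for illustration.

```python
import math


def upmix_coefficients(obj_pos, locators):
    # Hypothetical rule: downmix channels whose positional locator is
    # closer to the object get a larger weight; weights sum to 1.
    weights = [1.0 / (1e-9 + math.dist(obj_pos, loc)) for loc in locators]
    total = sum(weights)
    return [w / total for w in weights]


def reconstruct_object(downmix, locators, obj_pos, obj_gains):
    """downmix[c][k] is downmix channel c in frequency band k;
    obj_gains[k] is the received object gain for band k."""
    coeffs = upmix_coefficients(obj_pos, locators)
    return [
        obj_gains[k] * sum(c * ch[k] for c, ch in zip(coeffs, downmix))
        for k in range(len(obj_gains))
    ]


# Two downmix channels located left and right, object halfway between them.
locators = [(-1.0, 0.0), (1.0, 0.0)]
downmix = [[2.0, 4.0], [2.0, 0.0]]
estimate = reconstruct_object(downmix, locators, (0.0, 0.0), [1.0, 0.5])
```

Because the object sits midway between the two locators, both downmix channels contribute equally before the band gains are applied.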
Methods and Apparatus for Rendering Audio Objects
Multiple virtual source locations may be defined for a volume within which audio objects can move. A set-up process for rendering audio data may involve receiving reproduction speaker location data and pre-computing gain values for each of the virtual sources according to the reproduction speaker location data and each virtual source location. The gain values may be stored and used during “run time,” during which audio reproduction data are rendered for the speakers of the reproduction environment. During run time, for each audio object, contributions from virtual source locations within an area or volume defined by the audio object position data and the audio object size data may be computed. A set of gain values for each output channel of the reproduction environment may be computed based, at least in part, on the computed contributions. Each output channel may correspond to at least one reproduction speaker of the reproduction environment.
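The two phases described above, a set-up pass that pre-computes gains per virtual source and a run-time pass that combines the virtual sources inside an object's extent, might be sketched in Python as below. The inverse-distance gain rule, the spherical "inside" test based on a single size radius, the nearest-source fallback, and the unit-power normalisation are all assumptions made for this example.

```python
import math


def precompute_gains(virtual_sources, speakers):
    """Set-up: one gain per (virtual source, speaker), normalised to unit power."""
    table = []
    for vs in virtual_sources:
        raw = [1.0 / (0.1 + math.dist(vs, sp)) for sp in speakers]  # hypothetical rule
        norm = math.sqrt(sum(g * g for g in raw))
        table.append([g / norm for g in raw])
    return table


def render_object_gains(position, size, virtual_sources, table):
    """Run time: sum stored gains of virtual sources within `size` of the object."""
    inside = [i for i, vs in enumerate(virtual_sources)
              if math.dist(vs, position) <= size]
    if not inside:
        # Assumed fallback: use the nearest virtual source for very small objects.
        inside = [min(range(len(virtual_sources)),
                      key=lambda i: math.dist(virtual_sources[i], position))]
    summed = [sum(table[i][ch] for i in inside) for ch in range(len(table[0]))]
    norm = math.sqrt(sum(g * g for g in summed))
    return [g / norm for g in summed]


speakers = [(-1.0, 0.0), (1.0, 0.0)]
virtual_sources = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
table = precompute_gains(virtual_sources, speakers)       # done once at set-up
gains = render_object_gains((-1.0, 0.0), 0.1, virtual_sources, table)
```

The stored table is the only part that depends on the speaker layout, so the run-time step reduces to lookups and sums.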
Method to expedite playing of binaural sound to a listener
A method expedites processing and playing of binaural sound during an electronic communication between a first user and a second user. An electronic device of the first user convolves sound into binaural sound for the second user before the binaural sound is transmitted to the electronic device of the second user. In this way, the binaural sound is already convolved and ready to play upon receipt at the electronic device of the second user.
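A minimal Python sketch of the sender-side convolution described above is given here. It assumes the sender already holds a pair of head-related impulse responses for the second user, that both responses have the same length, and that the payload is simply the two convolved channels. The function names and the toy impulse responses are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution (fine for short example signals)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def sender_prepare(mono, hrir_left, hrir_right):
    # Runs on the first user's device, before transmission.
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


def receiver_play(payload):
    # Runs on the second user's device: no convolution is needed,
    # the channels are only interleaved into stereo frames.
    left, right = payload
    return list(zip(left, right))


# Toy impulse responses of equal length (assumed, not measured data).
payload = sender_prepare([1.0, 0.5], hrir_left=[1.0, 0.0], hrir_right=[0.0, 0.5])
frames = receiver_play(payload)
```

The receiver's work is limited to playback, which is the source of the claimed speed-up.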
Acoustic simulation apparatus
A virtual reproduction signal generation unit generates a virtual reproduction signal based on a sound pickup signal of a stereophonic sound at a listening position in a compartment, assuming that virtual speakers are respectively located at portions of Np positions in a vehicle, the virtual reproduction signal causing the virtual speakers of the Np positions to reproduce the stereophonic sound. A virtual prediction signal generation unit generates a virtual prediction signal based on the virtual reproduction signal and information representing a change of acoustic characteristics when at least part of the portions of the Np positions is changed, the virtual prediction signal causing the virtual speakers of the Np positions to output a predicted sound at the listening position. An output signal generation unit generates an output signal based on the virtual prediction signal, the output signal causing speakers of a plurality of positions to output the predicted sound.