H04S5/00

Method and apparatus for reproducing stereophonic sound
09749767 · 2017-08-29

A method and apparatus for reproducing stereophonic sound are provided. The method includes obtaining sound depth information, which denotes a distance between at least one sound object within a sound signal and a reference position, and providing sound perspective to the sound object output from a speaker based on the sound depth information.

Placement of talkers in 2D or 3D conference scene

The present document relates to setting up and managing two-dimensional or three-dimensional scenes for audio conferences. A conference controller (111, 175) configured to place an upstream audio signal (123, 173) associated with a conference participant within a 2D or 3D conference scene to be rendered to a listener (211) is described. An X-point conference scene with X different spatial talker locations (212) is set up within the conference scene, wherein the X talker locations (212) are positioned within a cone around a midline (215) in front of a head of the listener (211). A generatrix (216) of the cone and the midline (215) form an angle which is smaller than or equal to a pre-determined maximum cone angle. The upstream audio signal (123, 173) is assigned to one of the talker locations (212) and metadata identifying the assigned talker location (212) are generated, thus enabling a spatialized audio signal.
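The placement described above can be illustrated for the 2D case with a minimal sketch. The abstract only bounds the angle between each talker location and the midline; the even spacing, the first-free assignment policy, the metadata field names, and the function names `talker_angles` and `assign_talker` are all assumptions made for illustration.

```python
def talker_angles(x, max_cone_angle_deg):
    """Return x azimuth angles in degrees, spread around the midline (0 degrees).

    Even spacing is an assumption; the source only requires that every
    angle stays within the maximum cone angle.
    """
    if x < 1:
        raise ValueError("need at least one talker location")
    if x == 1:
        return [0.0]
    step = 2 * max_cone_angle_deg / (x - 1)
    return [-max_cone_angle_deg + i * step for i in range(x)]


def assign_talker(upstream_id, occupied, angles):
    """Assign an upstream signal to the first free location (hypothetical policy).

    `occupied` maps location index -> upstream id and is updated in place.
    The returned dict stands in for the metadata identifying the location.
    """
    for index, angle in enumerate(angles):
        if index not in occupied:
            occupied[index] = upstream_id
            return {"upstream": upstream_id, "location": index, "azimuth_deg": angle}
    raise RuntimeError("no free talker location")
```

A renderer could read the `azimuth_deg` field to spatialize each upstream signal; a 3D scene would need an elevation angle as well.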

Audio Processing Systems and Methods

Embodiments are directed to processing adaptive audio content by determining an audio type as one of channel-based audio and object-based audio for each audio segment of an adaptive audio bitstream, tagging each audio segment with a metadata definition indicating the audio type of the corresponding audio segment, processing audio segments tagged as channel-based audio in a channel audio renderer component, and processing audio segments tagged as object-based audio in an object audio renderer component that is distinct from the channel audio renderer component. Object-based audio is rendered through an object audio renderer interface that dynamically adjusts processing block sizes of the object audio segments based on timing and alignment of metadata updates and maximum/minimum block size parameters.
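One way to picture the dynamic block sizing is a splitter that prefers block boundaries at metadata update positions while respecting the size limits. This is a sketch under stated assumptions, not the claimed interface: the function name `split_blocks`, the sample-index representation of updates, and the greedy policy are hypothetical.

```python
def split_blocks(total, updates, min_block, max_block):
    """Split `total` samples into (start, end) blocks.

    A block ends at the earliest metadata update that leaves it at least
    `min_block` long; otherwise it runs to `max_block`. Updates closer than
    `min_block` to a block start are not aligned, and the final block may
    be shorter than `min_block` (both are limits of this greedy sketch).
    """
    pending = sorted(u for u in set(updates) if 0 < u < total)
    blocks, start = [], 0
    while start < total:
        end = min(start + max_block, total)
        for u in pending:
            if start + min_block <= u <= end:
                end = u
                break
        blocks.append((start, end))
        start = end
    return blocks
```

For example, `split_blocks(1000, [200, 700], 64, 256)` ends blocks at samples 200 and 700 so that each metadata update takes effect at a block boundary.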

ORIENTATION-AWARE SURROUND SOUND PLAYBACK

Example embodiments disclosed herein relate to orientation-aware surround sound playback. A method is disclosed for processing audio on an electronic device that includes a plurality of loudspeakers arranged in more than one dimension of the device. The method includes, responsive to receipt of a plurality of received audio streams, generating a rendering component associated with the plurality of received audio streams, determining an orientation-dependent component of the rendering component, processing the rendering component by updating the orientation-dependent component according to an orientation of the loudspeakers, and dispatching the received audio streams to the plurality of loudspeakers for playback based on the processed rendering component. Corresponding system and computer program products are also disclosed.
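As a rough illustration of an orientation-dependent rendering component, the sketch below rotates 2D loudspeaker positions by the device orientation and derives left/right gains from the rotated horizontal coordinate. The linear pan law, the coordinate convention, and the names `rotate` and `stereo_gains` are assumptions; the abstract does not specify how the component is computed.

```python
import math


def rotate(point, angle_deg):
    """Rotate a 2D point counter-clockwise by angle_deg about the origin."""
    a = math.radians(angle_deg)
    x, y = point
    return (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))


def stereo_gains(speakers, orientation_deg):
    """Return {name: (left_gain, right_gain)} for the given device orientation.

    `speakers` maps a name to an (x, y) position in device coordinates,
    normalized so that x lies in [-1, 1]. Linear panning is an assumption.
    """
    gains = {}
    for name, pos in speakers.items():
        x, _ = rotate(pos, orientation_deg)
        pan = max(-1.0, min(1.0, x))  # -1 is fully left, +1 is fully right
        gains[name] = ((1 - pan) / 2, (1 + pan) / 2)
    return gains
```

Turning the device by 180 degrees swaps which physical loudspeaker receives the left and right streams, which is the kind of update the orientation-dependent component is meant to capture.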

Apparatus and method for efficient object metadata coding

An apparatus for generating one or more audio channels is provided. The apparatus includes a metadata decoder for receiving one or more compressed metadata signals. Each of the one or more compressed metadata signals includes a plurality of first metadata samples. The metadata decoder is configured to generate one or more reconstructed metadata signals and to generate each of the second metadata samples of each reconstructed metadata signal of the one or more reconstructed metadata signals depending on at least two of the first metadata samples of the reconstructed metadata signal. The apparatus includes an audio channel generator for generating the one or more audio channels depending on the one or more audio object signals and depending on the one or more reconstructed metadata signals. An apparatus for generating encoded audio information including one or more encoded audio signals and one or more compressed metadata signals is provided.
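The idea of deriving each second metadata sample from at least two first metadata samples can be sketched with linear interpolation. Interpolation is an assumption here, chosen as the simplest function of two neighboring samples; the abstract does not name the reconstruction rule, and `reconstruct_metadata` and `factor` are hypothetical names.

```python
def reconstruct_metadata(first_samples, factor):
    """Upsample a metadata track by `factor` using linear interpolation.

    `first_samples` holds the transmitted (first) samples. The values
    generated between them play the role of the second metadata samples.
    """
    if not first_samples:
        raise ValueError("need at least one first metadata sample")
    if factor < 1:
        raise ValueError("factor must be at least 1")
    out = []
    for a, b in zip(first_samples, first_samples[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(first_samples[-1])
    return out
```

For instance, an object azimuth transmitted as `[0.0, 10.0, 20.0]` with `factor=2` would be reconstructed with one intermediate value between each transmitted pair.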

Audio calibration and adjustment

The subject disclosure is directed towards calibrating sound pressure levels of speakers to determine desired attenuation data for use in later playback. A user may be guided to a calibration location to place a microphone, and each speaker is calibrated to output a desired sound pressure level in its current acoustic environment based upon the attenuation data learned during calibration. During playback, the attenuation data is used. Also described is testing the setup of the speakers, and dynamically adjusting the attenuation data in real time based upon tracking the listener's current location.
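A minimal sketch of the attenuation step, assuming each speaker's sound pressure level has already been measured in dB at the calibration location. Matching every speaker to the quietest one by default, and the names `attenuation_db` and `linear_gain`, are assumptions for illustration.

```python
def attenuation_db(measured_spl, target_spl=None):
    """Return per-speaker attenuation in dB needed to reach the target level.

    `measured_spl` maps speaker name -> measured level in dB SPL. If no
    target is given, the quietest speaker sets it (an assumption).
    Speakers already at or below the target get no attenuation.
    """
    if target_spl is None:
        target_spl = min(measured_spl.values())
    return {name: max(0.0, spl - target_spl) for name, spl in measured_spl.items()}


def linear_gain(att_db):
    """Convert an attenuation in dB to a linear amplitude gain."""
    return 10 ** (-att_db / 20)
```

During playback, each speaker's signal would be multiplied by `linear_gain` of its stored attenuation; tracking the listener would amount to recomputing the attenuation table as the listener moves.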

DECODING OF AUDIO SCENES

Exemplary embodiments provide encoding and decoding methods, and associated encoders and decoders, for encoding and decoding of an audio scene which is represented by one or more audio signals. The encoder generates a bit stream which comprises downmix signals and side information which includes individual matrix elements of a reconstruction matrix which enables reconstruction of the one or more audio signals in the decoder.
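On the decoder side, applying a reconstruction matrix to the downmix signals amounts to a matrix product per sample. The sketch below assumes the matrix elements have already been parsed from the side information; the function name `reconstruct` and the list-of-lists layout are hypothetical.

```python
def reconstruct(matrix, downmix):
    """Reconstruct N audio signals from M downmix signals.

    `matrix` is an N x M list of rows (the reconstruction matrix) and
    `downmix` is a list of M equal-length sample lists. Each output sample
    is the weighted sum of the downmix samples at the same time index.
    """
    n_samples = len(downmix[0])
    if any(len(ch) != n_samples for ch in downmix):
        raise ValueError("downmix channels must have equal length")
    return [
        [sum(row[m] * downmix[m][t] for m in range(len(downmix))) for t in range(n_samples)]
        for row in matrix
    ]
```

With a 3 x 2 matrix, two downmix signals yield three reconstructed signals, which is how fewer transmitted channels can carry a larger audio scene.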