H04S2420/03

System and method for capturing, encoding, distributing, and decoding immersive audio
09794721 · 2017-10-17

A sound field coding system and method that provides flexible capture, distribution, and reproduction of immersive audio recordings encoded in a generic digital audio format compatible with standard two-channel or multi-channel reproduction systems. This end-to-end system and method removes the need for standard multi-channel microphone array configurations, which are impractical in consumer mobile devices such as smart phones or cameras. From flexible multi-channel microphone array configurations, the system and method capture and spatially encode two-channel or multi-channel immersive audio signals that are compatible with legacy playback systems.

Method and an apparatus for processing an audio signal
09787266 · 2017-10-10

A method of processing an audio signal is provided. A decoding apparatus receives the audio signal, which includes a plurality of objects. The decoding apparatus further receives object information, object gain information, object correlation information, preset presence information, preset number information, preset information based on the preset presence information and the preset number information, and preset rendering data. The preset information exists in a header region or in each frame. When the preset type information indicates that the preset rendering data is defined by the user, the decoding apparatus obtains a preset matrix from the user via an input unit of the audio decoding apparatus, and adjusts an output level of the plurality of objects for an output channel based on the object information and the preset matrix. The decoding apparatus outputs an adjusted audio signal including the plurality of objects with an adjusted output level.

Parametric Mixing of Audio Signals

In an encoding section (100), a downmix section (110) forms first and second channels ($L_1$, $L_2$) of a downmix signal as linear combinations of first and second groups (401, 402) of channels, respectively, of an M-channel audio signal; and an analysis section (120) determines upmix parameters ($\alpha_{LU}$) for parametric reconstruction of the audio signal, and mixing parameters ($\alpha_{LM}$). In a decoding section (1200), a decorrelating section (1210) outputs a decorrelated signal ($D$) based on the downmix signal; and a mixing section (1220) determines mixing coefficients based on the mixing parameters or the upmix parameters, and forms a K-channel output signal ($\tilde{L}_1, \ldots, \tilde{L}_K$) as a linear combination of the downmix signal and the decorrelated signal in accordance with the mixing coefficients. The channels of the output signal approximate linear combinations of K groups (501-502, 1301-1303) of channels, respectively, of the audio signal. The K groups constitute a different partition of the audio signal than the first and second groups, and $2 \le K < M$.
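
The linear-combination structure described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function names `downmix` and `mix_output`, the unit downmix gains, and the example coefficient values are all assumptions made for illustration, and the abstract does not say how the mixing coefficients are derived from the upmix or mixing parameters, so here they are simply passed in.

```python
def downmix(channels, group1, group2, gains=None):
    """Form two downmix channels as linear combinations of two channel groups.

    channels: list of M equal-length sample lists.
    group1, group2: channel indices for each group (hypothetical grouping).
    gains: per-channel downmix gains (assumed to be 1.0 when not given).
    """
    gains = gains or [1.0] * len(channels)
    n = len(channels[0])

    def combine(group):
        return [sum(gains[c] * channels[c][t] for c in group) for t in range(n)]

    return combine(group1), combine(group2)


def mix_output(l1, l2, d, coeffs):
    """Form K output channels from the downmix (l1, l2) and a decorrelated signal d.

    coeffs: K rows of (a, b, c), so that output_k = a*l1 + b*l2 + c*d.
    The coefficient values are supplied by the caller; how they follow from
    the transmitted parameters is outside this sketch.
    """
    return [[a * x + b * y + c * z for x, y, z in zip(l1, l2, d)]
            for a, b, c in coeffs]


# Example with M = 5 input channels and K = 3 output channels (2 <= K < M).
channels = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
l1, l2 = downmix(channels, group1=[0, 1, 2], group2=[3, 4])
d = [1.0, -1.0]  # stand-in for the decorrelator output, chosen arbitrarily
outputs = mix_output(l1, l2, d, [(0.5, 0.0, 0.5), (0.5, 0.0, -0.5), (0.0, 1.0, 0.0)])
```

In this example the first downmix channel is split into two output channels with the help of the decorrelated signal, while the second is passed through, so the three outputs correspond to a different partition of the input channels than the two downmix groups.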

TRANSMISSION DEVICE, TRANSMISSION METHOD, RECEPTION DEVICE, AND RECEPTION METHOD
20170289720 · 2017-10-05

A new service can be provided while maintaining compatibility with a conventional audio receiver, without impairing efficient use of the transmission band. A predetermined number of audio streams having first encoded data and second encoded data which is related to the first encoded data are generated, and a container in a predetermined format including these audio streams is transmitted. The predetermined number of audio streams are generated so that the second encoded data is discarded in a receiver which is not compatible with the second encoded data.

Stereo Signal Processing Method and Apparatus
20220051680 · 2022-02-17

A stereo signal processing method includes performing delay estimation on a stereo signal of a current frame to determine an inter-channel time difference of the current frame, identifying that a sign of the inter-channel time difference of the current frame is different from a sign of an inter-channel time difference of a previous frame of the current frame, performing delay alignment processing on the first-channel signal of the current frame based on the inter-channel time difference of the current frame, and performing delay alignment processing on the second-channel signal of the current frame based on the inter-channel time difference of the previous frame.
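
The sign-change handling described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed method: the names `delay`, `sign`, and `align_frame` are hypothetical, the inter-channel time difference (ITD) is taken as an integer number of samples, "delay alignment" is reduced to a plain zero-padded delay, and the convention that a non-negative ITD selects the first channel for adjustment is an assumption.

```python
def delay(signal, n):
    """Delay a signal by n samples, zero-padding the start and keeping the length."""
    n = min(max(n, 0), len(signal))
    return [0.0] * n + list(signal[:len(signal) - n])


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def align_frame(ch1, ch2, itd_cur, itd_prev):
    """Delay-align one stereo frame.

    Assumed convention: itd >= 0 means channel 1 is the channel to adjust,
    itd < 0 means channel 2 is. When the sign differs from the previous
    frame, the channel selected by the current ITD is aligned using the
    current ITD and the other channel is aligned using the previous ITD.
    """
    sign_changed = sign(itd_cur) * sign(itd_prev) < 0
    if itd_cur >= 0:
        d1 = abs(itd_cur)
        d2 = abs(itd_prev) if sign_changed else 0
    else:
        d2 = abs(itd_cur)
        d1 = abs(itd_prev) if sign_changed else 0
    return delay(ch1, d1), delay(ch2, d2)


# Sign change between frames: both channels are adjusted.
changed = align_frame([1, 2, 3, 4], [5, 6, 7, 8], itd_cur=1, itd_prev=-2)
# Same sign as the previous frame: only one channel is adjusted.
unchanged = align_frame([1, 2, 3, 4], [5, 6, 7, 8], itd_cur=1, itd_prev=2)
```

A real codec would typically smooth the transition rather than apply hard delays; the sketch only shows which ITD value drives which channel.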

ADAPTIVE PANNER OF AUDIO OBJECTS
20170280264 · 2017-09-28

An audio object including audio content and object metadata is received. The object metadata indicates an object spatial position of the audio object to be rendered by audio speakers in a playback environment. Based on the object spatial position and source spatial positions of the audio speakers, initial gain values for the audio speakers are determined. The initial gain values can be used to select a set of audio speakers from among the audio speakers. Based on the object spatial position and a set of source spatial positions at which the set of audio speakers are respectively located in the playback environment, a set of non-negative optimized gain values for the set of audio speakers is determined. The audio object at the object spatial position is rendered with the set of optimized gain values for the set of audio speakers.
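
The three stages in this abstract (initial gains, speaker selection, non-negative optimized gains) can be sketched in two dimensions. Everything specific here is an assumption for illustration: the function names `unit` and `pan`, the clipped-cosine initial gains, the choice of `k` speakers, and the use of projected gradient descent on a least-squares objective as a stand-in for whatever optimization the patent actually uses.

```python
import math


def unit(angle_deg):
    """Unit vector on the horizontal plane for an azimuth given in degrees."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))


def pan(object_deg, speaker_degs, k=2, steps=500, lr=0.1):
    """Return {speaker index: gain} for the k selected speakers.

    Stage 1: initial gain = cosine similarity to the object direction, clipped at 0.
    Stage 2: keep the k speakers with the largest initial gains.
    Stage 3: minimise |sum_i g_i * s_i - p|^2 subject to g_i >= 0 by
             projected gradient descent (an assumed, simple optimiser).
    """
    p = unit(object_deg)
    speakers = [unit(a) for a in speaker_degs]
    init = [max(0.0, p[0] * s[0] + p[1] * s[1]) for s in speakers]
    chosen = sorted(range(len(speakers)), key=lambda i: init[i], reverse=True)[:k]
    g = [init[i] for i in chosen]
    for _ in range(steps):
        # Residual between the panned direction and the object direction.
        rx = sum(gj * speakers[i][0] for gj, i in zip(g, chosen)) - p[0]
        ry = sum(gj * speakers[i][1] for gj, i in zip(g, chosen)) - p[1]
        # Gradient step followed by projection onto the non-negative orthant.
        g = [max(0.0, gj - lr * (rx * speakers[i][0] + ry * speakers[i][1]))
             for gj, i in zip(g, chosen)]
    return dict(zip(chosen, g))


# Object at 45 degrees, four speakers at 0, 90, 180, and 270 degrees.
gains = pan(45, [0, 90, 180, 270])
```

With this layout the two nearest speakers (indices 0 and 1) are selected and receive equal gains; the speakers behind the object are never considered in the optimization stage.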

Systems, Devices and Methods for Multi-Dimensional Audio Recording and Playback
20220046374 · 2022-02-10

Systems and methods for recording and playback of multi-dimensional sound are described herein. The systems and methods may include positioning a plurality of multi-dimensional sound recording devices in a location and positioning a plurality of multi-dimensional sound recording sensors within the location. Then, acoustical footprint data can be generated. Next, positional data within the location may be recorded utilizing the plurality of multi-dimensional sound recording devices. The systems and methods may continue by generating spatial data from the recorded positional data and storing the generated acoustical footprint data and spatial data. An audio mix-down utilizing the stored acoustical footprint and spatial data is generated. Finally, a consumer-device audio track mix based on the audio mix-down can be generated. Further embodiments may also replace audio tracks to mimic the original recording conditions in other languages and environments. Playback may occur on a device that generates a profile of the playback area.

Insertion of Sound Objects Into a Downmixed Audio Signal

A method for inserting a first audio signal into a bitstream which comprises a downmix signal and associated bitstream metadata is described. The downmix signal and associated bitstream metadata are indicative of an audio program comprising a plurality of spatially diverse audio signals. The downmix signal comprises at least one audio channel and the bitstream metadata comprise upmix metadata for reproducing the plurality of spatially diverse audio signals from the at least one channel. The method comprises mixing the first audio signal with the at least one audio channel to generate a modified downmix signal. The method further comprises generating an output bitstream comprising the modified downmix signal and the associated modified bitstream metadata indicative of a modified audio program comprising a plurality of modified spatially diverse audio signals.

Method and apparatus for rendering sound signal, and computer-readable recording medium
11245998 · 2022-02-08

A method of reproducing a multi-channel audio signal including an elevation sound signal in a horizontal layout environment is provided. The method obtains a rendering parameter according to a rendering type and configures a down-mix matrix, so that effective rendering performance may be obtained even for an audio signal that is not suitable for virtual rendering. A method of rendering an audio signal includes receiving a multi-channel signal including a plurality of input channels to be converted into a plurality of output channels; determining a rendering type for elevation rendering based on a parameter determined from a characteristic of the multi-channel signal; and rendering at least one height input channel according to the determined rendering type, wherein the parameter is included in a bitstream of the multi-channel signal.

APPARATUS AND METHOD FOR FRONTAL AUDIO RENDERING IN INTERACTION WITH SCREEN SIZE

Provided is an apparatus and method for frontal audio rendering in interaction with a screen size, the method including measuring playback environment information used to play back input content; and correcting an audio signal to be output based on the measured playback environment information and production environment information included in the input content.