H04S2420/11

SPATIAL AUDIO FOR INTERACTIVE AUDIO ENVIRONMENTS

Systems and methods of presenting an output audio signal to a listener located at a first location in a virtual environment are disclosed. According to embodiments of a method, an input audio signal is received. A first intermediate audio signal corresponding to the input audio signal is determined, based on a location of a sound source in the virtual environment, and the first intermediate audio signal is associated with a first bus. A second intermediate audio signal is determined. The second intermediate audio signal corresponds to a reverberation of the input audio signal in the virtual environment. The second intermediate audio signal is determined based on the location of the sound source, and further based on an acoustic property of the virtual environment. The second intermediate audio signal is associated with a second bus. The output audio signal is presented to the listener via the first and second buses.
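The two-bus structure in this abstract (a dry bus driven by source position, a reverb bus driven by position plus a room acoustic property) can be sketched roughly as below. This is an illustrative toy, not the patented method: the inverse-distance direct gain, the single comb-filter reverb, and the RT60-derived feedback are all assumptions chosen for brevity.

```python
import math

def render_two_bus(input_signal, src_pos, listener_pos, rt60, sr=48000):
    """Toy two-bus render: direct bus from source/listener geometry,
    reverb bus from an acoustic property of the environment (RT60)."""
    dist = math.dist(src_pos, listener_pos)
    direct_gain = 1.0 / max(dist, 1.0)           # inverse-distance law, dry path
    direct_bus = [direct_gain * s for s in input_signal]

    # Reverb bus: one feedback comb filter whose decay follows RT60.
    delay = int(0.03 * sr)                        # 30 ms comb delay (arbitrary)
    feedback = 10 ** (-3 * delay / (rt60 * sr))   # reaches -60 dB after rt60 seconds
    reverb_bus = [0.0] * len(input_signal)
    for n, s in enumerate(input_signal):
        fb = reverb_bus[n - delay] if n >= delay else 0.0
        reverb_bus[n] = s + feedback * fb

    # Output presented to the listener via both buses.
    return [d + 0.2 * r for d, r in zip(direct_bus, reverb_bus)]
```

A farther source then yields a weaker direct component while the reverb send is unchanged, which is the usual motivation for splitting the two paths onto separate buses.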

Sound Field Related Rendering
20220328056 · 2022-10-13

An apparatus including circuitry configured to obtain a defocus direction; process a spatial audio signal that represents an audio scene to generate a processed spatial audio signal that represents a modified audio scene based on the defocus direction, so as to control relative deemphasis in, at least in part, a portion of the spatial audio signal in the defocus direction relative to at least in part other portions of the spatial audio signal; and output the processed spatial audio signal, wherein the modified audio scene based on the defocus direction enables the deemphasis in, at least in part, the portion of the spatial audio signal in the defocus direction relative to at least in part other portions of the spatial audio signal.
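A minimal sketch of the deemphasis idea, directions near the defocus direction are attenuated relative to the rest of the scene, might look as follows. The window width, attenuation floor, and cosine crossfade are illustrative assumptions, not taken from the patent.

```python
import math

def defocus_gain(source_azimuth, defocus_azimuth, width=math.pi / 2, attenuation=0.25):
    """Gain applied to one direction of the scene given a defocus direction.
    Directions within `width` of the defocus azimuth are deemphasised toward
    `attenuation`; all other directions pass at unity gain."""
    # Wrapped angular distance between the source and the defocus direction.
    diff = abs((source_azimuth - defocus_azimuth + math.pi) % (2 * math.pi) - math.pi)
    if diff >= width:
        return 1.0
    # Smooth cosine crossfade: fully attenuated at diff=0, unity at diff=width.
    t = 0.5 - 0.5 * math.cos(math.pi * diff / width)
    return attenuation + (1.0 - attenuation) * t
```

Applying such a gain per direction (e.g., per parametric direction estimate of the spatial audio signal) realises a relative deemphasis of the defocused portion of the scene.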

Method and Apparatus for Low Complexity Low Bitrate 6DOF HOA Rendering

An apparatus for generating an immersive audio scene, the apparatus including circuitry configured to: obtain audio scene based sources, wherein the audio scene based sources are associated with one or more positions in an audio scene and each audio scene based source includes at least one spatial parameter and at least one audio signal; determine at least one position associated with at least one of the audio scene based sources; generate at least one audio source based on the determined at least one position, wherein the circuitry is configured to: generate at least one spatial audio parameter; and generate at least one audio source signal; and generate information about a relationship between the generated at least one spatial audio parameter and the at least one audio signal, wherein the generated at least one audio source is selected based on a renderer preference.

Recording and rendering audio signals

A method, apparatus and computer program, the method comprising: receiving a plurality of input signals representing a sound space; using the received plurality of input signals to obtain spatial metadata corresponding to the sound space; using the received plurality of input signals to obtain a first spatial audio signal corresponding to the spatial metadata; and associating the first spatial audio signal with the spatial metadata to enable the spatial metadata to be used to process the first spatial audio signal to obtain a second spatial audio signal.
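The core of this abstract is the association between a first spatial audio signal and its spatial metadata, so that a renderer can later derive a second spatial audio signal from the pair. A toy sketch, with entirely illustrative field names (the patent does not specify azimuth or direct-to-total ratio as the metadata):

```python
import math
from dataclasses import dataclass

@dataclass
class SpatialAudioFrame:
    """A first spatial audio signal carried together with its spatial
    metadata; the pairing is what enables later re-rendering."""
    transport_channels: list          # e.g. samples of a downmix signal
    azimuth: float                    # direction-of-arrival estimate (radians)
    direct_to_total_ratio: float      # energy ratio in [0, 1]

def render_second_signal(frame, target_azimuth):
    """Toy renderer: derive a second signal by weighting the transport
    signal by how well the stored direction matches the look direction."""
    match = 0.5 * (1.0 + math.cos(frame.azimuth - target_azimuth))
    g = frame.direct_to_total_ratio * match + (1.0 - frame.direct_to_total_ratio)
    return [g * s for s in frame.transport_channels]
```

The point of the association is that the same stored frame can be re-rendered for any later target direction without re-capturing the sound space.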

Audio encoding device and method

A method and a device encode N audio signals from N microphones, where N≥3. For each pair of the N audio signals, an angle of incidence of direct sound is estimated. A-format direct sound signals are derived from the estimated angles of incidence by deriving from each estimated angle an A-format direct sound signal. Each A-format direct sound signal is a first-order virtual microphone signal, for example a cardioid signal.
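Two building blocks of such a scheme can be sketched as follows: estimating a pairwise angle of incidence from the inter-microphone time delay, and the first-order cardioid pattern used for a virtual microphone signal. Sampling rate, spacing, and the TDOA-based estimator are illustrative assumptions, not the patent's specific procedure.

```python
import math

def pairwise_angle_of_incidence(delay_samples, mic_spacing, sr=48000, c=343.0):
    """Estimate the angle of incidence of direct sound for one microphone
    pair from the time-difference of arrival (broadside gives 90 degrees)."""
    tdoa = delay_samples / sr
    cos_theta = max(-1.0, min(1.0, tdoa * c / mic_spacing))
    return math.acos(cos_theta)

def cardioid_gain(pattern_azimuth, incidence_azimuth):
    """First-order cardioid virtual-microphone gain: unity on-axis,
    a null in the opposite direction."""
    return 0.5 * (1.0 + math.cos(pattern_azimuth - incidence_azimuth))
```

Steering one such cardioid per estimated angle yields the set of first-order virtual microphone signals that the abstract calls A-format direct sound signals.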

Immersive media with media device
11632642 · 2023-04-18

Aspects of the subject disclosure may include, for example, a method, comprising: receiving, by a media processor including a processor, spherical audiovisual media content from a content delivery network; rendering, by the media processor, video for a point of view in the spherical audiovisual media content at a display device coupled to the media processor; receiving, from a remote control device coupled to the media processor, a control signal panning the point of view, resulting in a new field of view; and generating, by the media processor, audio signals from the spherical audiovisual media content corresponding to the new field of view, wherein the audio signals are adapted to audio reproduction equipment coupled to the media processor. Other embodiments are disclosed.
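The audio side of panning the point of view amounts to re-rendering source directions relative to the new field of view. A constant-power stereo sketch (purely illustrative; the patent targets arbitrary audio reproduction equipment, not specifically stereo):

```python
import math

def pan_for_view(source_azimuth, view_yaw):
    """Constant-power stereo gains for a source in spherical content after
    the point of view is panned by view_yaw (radians, positive = left)."""
    # Source azimuth relative to the new field of view, wrapped to (-pi, pi].
    rel = (source_azimuth - view_yaw + math.pi) % (2 * math.pi) - math.pi
    # Map relative azimuth to a pan position: 0 = hard left, 1 = hard right.
    pos = max(0.0, min(1.0, 0.5 - rel / math.pi * 0.5))
    left = math.cos(pos * math.pi / 2)
    right = math.sin(pos * math.pi / 2)
    return left, right
```

A source to the listener's left pans left until the control signal rotates the view toward it, at which point it returns to centre, which is the behaviour the abstract describes for the new field of view.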

Spatial transformation of ambisonic audio data

A device configured to decode a bitstream, where the device includes a memory configured to store a temporally encoded representation of spatial audio signals. The device is also configured to receive the bitstream that includes an indication of a spatial transformation, and includes a temporal decoding unit, coupled to the memory, configured to decode one or more spatial audio signals represented in a spatial domain, where the one or more spatial audio signals are associated with different angles in the spatial domain. In addition, the device includes an inverse spatial transformation unit, coupled to the temporal decoding unit, configured to convert the one or more spatial audio signals represented in the spatial domain into at least three ambisonic coefficients that, in part, represent a soundfield in an ambisonics domain, and perform a spatial transformation of the soundfield based on the indication of the spatial transformation received in the bitstream.
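The simplest example of a spatial transformation of an ambisonic soundfield is a yaw rotation of a first-order (W, X, Y, Z) representation, where W and Z are invariant and X/Y rotate as a 2-D vector. This is a generic illustration of the concept, not the patent's signalled transformation, and the sign convention here is an assumption:

```python
import math

def rotate_foa_yaw(w, x, y, z, yaw):
    """Yaw-rotate one sample of a first-order ambisonic soundfield.
    The omnidirectional (W) and vertical (Z) components are unchanged;
    X and Y rotate in the horizontal plane."""
    xr = math.cos(yaw) * x - math.sin(yaw) * y
    yr = math.sin(yaw) * x + math.cos(yaw) * y
    return w, xr, yr, z
```

Signalling only the transformation parameters (here a single yaw angle) in the bitstream is far cheaper than re-encoding the rotated coefficients themselves.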

6DOF Rendering of Microphone-Array Captured Audio For Locations Outside The Microphone-Arrays

An apparatus for generating a spatialized audio output based on a listener position, the apparatus including circuitry configured to: obtain two or more audio signal sets; obtain a listener position within an audio environment, wherein the audio environment includes one or more areas having inside and outside regions in relation to the respective audio signal set positions; obtain metadata based on a processing of the two or more audio signal sets; determine, for a listener position within the audio environment outside the inside region, a second listener position; determine modified metadata for the second listener position based on the metadata; determine at least two modified audio signals for the second listener position based on the audio signal sets; determine spatial metadata for the listener position; and output the at least two modified audio signals and the spatial metadata.
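One plausible reading of the "second listener position" step is a projection of an outside listener onto the boundary of the inside region, where the captured metadata is still valid. The circular region and nearest-point projection below are assumptions for illustration only:

```python
import math

def second_listener_position(listener, region_center, region_radius):
    """If the listener is outside a circular inside region, return the
    closest point on the region boundary as the second listener position;
    positions already inside are returned unchanged."""
    dx = listener[0] - region_center[0]
    dy = listener[1] - region_center[1]
    d = math.hypot(dx, dy)
    if d <= region_radius:
        return listener  # already inside: no projection needed
    s = region_radius / d
    return (region_center[0] + dx * s, region_center[1] + dy * s)
```

The modified signals and metadata are then derived for this boundary position, and separate spatial metadata extrapolates the result to the true outside listener position.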

Spatial audio parameters and associated spatial audio playback

An apparatus including at least one processor and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to: determine, for two or more microphone audio signals, at least one spatial audio parameter for providing spatial audio reproduction; determine at least one coherence parameter associated with a sound field based on the two or more microphone audio signals, such that another sound field is configured to be reproduced based on the at least one spatial audio parameter and the at least one coherence parameter.
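A coherence parameter of the kind this abstract refers to can be illustrated with a broadband normalised cross-correlation between two microphone signals; real systems typically compute it per time-frequency tile, so treat this as a simplified stand-in:

```python
def coherence(a, b):
    """Broadband coherence parameter for two microphone signals:
    |inner product| normalised by the signal energies, so 1.0 means
    fully coherent and 0.0 means orthogonal (incoherent)."""
    num = abs(sum(x * y for x, y in zip(a, b)))
    den = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return num / den if den > 0 else 0.0
```

High coherence suggests a single coherent source spread across channels; low coherence suggests diffuse or ambient sound, and the renderer can reproduce the other sound field accordingly.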

Layered coding for compressed sound or sound field representations
11626119 · 2023-04-11

The present document relates to a method of layered encoding of a compressed sound representation of a sound or sound field. The compressed sound representation comprises a basic compressed sound representation comprising a plurality of components, basic side information for decoding the basic compressed sound representation to a basic reconstructed sound representation of the sound or sound field, and enhancement side information including parameters for improving the basic reconstructed sound representation. The method comprises sub-dividing the plurality of components into a plurality of groups of components and assigning each of the plurality of groups to a respective one of a plurality of hierarchical layers, the number of groups corresponding to the number of layers, and the plurality of layers including a base layer and one or more hierarchical enhancement layers, adding the basic side information to the base layer, and determining a plurality of portions of enhancement side information from the enhancement side information and assigning each of the plurality of portions of enhancement side information to a respective one of the plurality of layers, wherein each portion of enhancement side information includes parameters for improving a reconstructed sound representation obtainable from data included in the respective layer and any layers lower than the respective layer. The document further relates to a method of decoding a compressed sound representation of a sound or sound field, wherein the compressed sound representation is encoded in a plurality of hierarchical layers that include a base layer and one or more hierarchical enhancement layers, as well as to an encoder and a decoder for layered coding of a compressed sound representation.
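The grouping-and-layering step described above, sub-dividing the components into groups and assigning one group per hierarchical layer, can be sketched structurally as below. The group sizes are caller-supplied assumptions; the abstract does not prescribe how components are partitioned.

```python
def assign_layers(components, group_sizes):
    """Sub-divide compressed-representation components into groups, one
    group per hierarchical layer (the first group is the base layer)."""
    assert sum(group_sizes) == len(components), "groups must cover all components"
    layers, start = [], 0
    for size in group_sizes:
        layers.append(components[start:start + size])
        start += size
    return layers

def decodable_components(layers, upto_layer):
    """A decoder operating at `upto_layer` uses that layer plus every
    lower layer, mirroring the enhancement-side-information rule."""
    out = []
    for layer in layers[:upto_layer + 1]:
        out.extend(layer)
    return out
```

Each layer's portion of enhancement side information then only needs to improve the reconstruction obtainable from its own components and those of the lower layers, which is what makes the bitstream truncatable at any layer boundary.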