H04S2420/11

Method and apparatus for compressing and decompressing a Higher Order Ambisonics representation

Higher Order Ambisonics represents three-dimensional sound independent of a specific loudspeaker set-up. However, transmission of an HOA representation results in a very high bit rate. Therefore compression with a fixed number of channels is used, in which directional and ambient signal components are processed differently. The ambient HOA component is represented by a minimum number of HOA coefficient sequences. The remaining channels contain either directional signals or additional coefficient sequences of the ambient HOA component, depending on what will result in optimum perceptual quality. This processing can change on a frame-by-frame basis.

Determining renderers for spherical harmonic coefficients

In general, techniques are described for determining renderers used for rendering spherical harmonic coefficients to generate one or more loudspeaker signals. A device comprising one or more processors may perform the techniques. The one or more processors may be configured to determine a local speaker geometry of one or more speakers used for playback of spherical harmonic coefficients representative of a sound field, and configure the device to operate based on the local speaker geometry.

User interface for controlling audio rendering for extended reality experiences

A device may be configured to play one or more of a plurality of audio streams. The device may include a memory configured to store the plurality of audio streams, each of the audio streams representative of a soundfield. The device may also include one or more processors coupled to the memory and configured to present a user interface to a user, obtain, via the user interface, an indication from the user representing a desired listening position, and select, based on the indication, at least one audio stream of the plurality of audio streams.
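
The selection step can be illustrated with a minimal sketch. The abstract does not state how a stream is chosen from the indicated listening position, so the nearest-capture-location rule below is an assumption, as are the `AudioStream` type, its `position` field, and the `select_streams` helper.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioStream:
    # Hypothetical record: a stream label plus the (x, y, z) location
    # at which its soundfield was captured.
    name: str
    position: tuple


def select_streams(streams, listening_position, count=1):
    """Return the `count` streams captured closest to the listening position.

    Nearest-distance ranking is an assumed rule, used only for illustration.
    """
    ranked = sorted(
        streams,
        key=lambda s: math.dist(s.position, listening_position),
    )
    return ranked[:count]


streams = [
    AudioStream("left", (0.0, 0.0, 0.0)),
    AudioStream("centre", (5.0, 0.0, 0.0)),
    AudioStream("right", (10.0, 0.0, 0.0)),
]
# Listening position as it might be reported by the user interface.
chosen = select_streams(streams, (4.0, 0.0, 0.0))
```

Requesting `count=2` returns the two nearest streams in order of distance, which a renderer could then mix or interpolate.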

METHODS AND APPARATUS FOR DETERMINING FOR DECODING A COMPRESSED HOA SOUND REPRESENTATION

When compressing an HOA data frame representation, a gain control (15, 151) is applied to each channel signal before it is perceptually encoded (16). The gain values are transferred differentially as side information. However, to start decoding such a streamed compressed HOA data frame representation, absolute gain values are required, and these should be coded with a minimum number of bits. To determine this lowest integer number of bits ($\beta_e$), the HOA data frame representation (C(k)) is rendered in the spatial domain to virtual loudspeaker signals lying on a unit sphere, followed by normalisation of the HOA data frame representation (C(k)). The lowest integer number of bits is then set to $\beta_e = \lceil \log_2 ( \lceil \log_2 ( \sqrt{K_{\mathrm{MAX}}} \cdot O ) \rceil + 1 ) \rceil$.
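
As a worked illustration, the bit-count formula can be evaluated directly. The function name `lowest_gain_bits` is hypothetical, and the input values (K_MAX = 1.5, O = 49) are assumptions chosen only to show the arithmetic; the abstract does not give concrete values for either symbol.

```python
import math


def lowest_gain_bits(k_max: float, num_coeffs: int) -> int:
    """Evaluate beta_e = ceil(log2(ceil(log2(sqrt(K_MAX) * O)) + 1)).

    k_max      -- the quantity K_MAX from the formula
    num_coeffs -- the quantity O from the formula
    """
    product = math.sqrt(k_max) * num_coeffs
    if product < 1.0:
        # Below 1 the inner logarithm is negative and the outer one
        # may be undefined, so such inputs are rejected in this sketch.
        raise ValueError("sqrt(k_max) * num_coeffs must be at least 1")
    inner = math.ceil(math.log2(product))
    return math.ceil(math.log2(inner + 1))


# Illustrative (assumed) inputs: sqrt(1.5) * 49 is about 60.01,
# so the inner ceiling is 6 and the outer one is ceil(log2(7)).
beta_e = lowest_gain_bits(1.5, 49)
```

With these illustrative inputs the function returns 3.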

NON-COINCIDENT AUDIO-VISUAL CAPTURE SYSTEM
20220272477 · 2022-08-25

Systems and methods discussed herein can change a frame of reference for a first spatial audio signal. The first spatial audio signal can include signal components representing audio information from different depths or directions relative to an audio capture location associated with an audio capture source device with a first frame of reference relative to an environment. Changing the frame of reference can include receiving a component of the first spatial audio signal, receiving information about a second frame of reference relative to the same environment, determining a difference between the first and second frames of reference, and, using the determined difference, determining a first filter to use to generate at least one component of a second spatial audio signal that is based on the first spatial audio signal and is referenced to the second frame of reference.

Making available a sound signal for higher order ambisonics signals

Audio signals are recorded with microphones receiving acoustic information from one or more directions. The corresponding audio signals can be pre-listened to in production studios. However, Higher Order Ambisonics (HOA) audio signals are matrixed in such a way that the matrixing prevents listening to the matrixed sound signals without dematrixing the matrixed sound signals. For enabling a sound engineer to listen to such a matrixed signal without full HOA decoding, an informative audio signal is added together with related side information data at encoding side to a selected part of the matrixed signal. This informative audio signal is removed before the inverse matrixing process at decoding side.

Sound system
09773506 · 2017-09-26

Methods and systems for processing audio data, such as spatial audio data, modify sound characteristics of a given component of a spatial audio signal based on a relationship between a direction characteristic of the given component and a defined range of direction characteristics. Spatial audio in a format using a spherical harmonic representation of sound components is decoded by performing a transform on the spherical harmonic representation, where the transform is based on a predefined speaker layout and a predefined rule, the predefined rule indicating a speaker gain of each speaker arranged according to the predefined layout when reproducing sound incident from a given direction. This provides an alternative to existing methods of decoding spatial audio streams, which focus on soundfield reconstruction. A plurality of matrix transforms is combined into a combined transform that is performed on an audio signal; this saves processing resources of the audio system.
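
The saving from combining matrix transforms can be sketched as follows. The helper names and the two example matrices are hypothetical; the point is only that the product of the matrices is computed once, after which each audio frame needs a single matrix-vector multiplication instead of one per transform.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply(matrix, frame):
    """Apply a matrix to one frame (a vector of channel samples)."""
    return [sum(m * x for m, x in zip(row, frame)) for row in matrix]


def combine(transforms):
    """Combine transforms listed in application order into one matrix."""
    combined = transforms[0]
    for t in transforms[1:]:
        # A later transform multiplies from the left.
        combined = matmul(t, combined)
    return combined


# Hypothetical 2x2 transforms: a 90-degree rotation, then a per-channel gain.
rotate = [[0, -1], [1, 0]]
gain = [[2, 0], [0, 3]]
combined = combine([rotate, gain])

frame = [1, 0]
step_by_step = apply(gain, apply(rotate, frame))
single_pass = apply(combined, frame)
```

Both paths give the same output frame, but `single_pass` costs one multiplication per frame regardless of how many transforms were combined.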

Synthesizing a headphone signal using a rotating head-related transfer function
11252524 · 2022-02-15

The present technology relates to a signal processing device and method that make it possible to reproduce sound more effectively. A signal processing device includes a rotation operation unit that rotates a head-related transfer function in a spherical harmonic domain by an operation based on a rotation matrix corresponding to rotation of a listener's head, the order of the rotation matrix being limited in the operation, and a synthesis unit that synthesizes the rotated head-related transfer function obtained by the operation with a sound signal of the spherical harmonic domain to generate a headphone drive signal. The present technology is applicable to an audio processor.

Method for and apparatus for decoding an ambisonics audio soundfield representation for audio playback using 2D setups

Sound scenes in 3D can be synthesized or captured as a natural sound field. For decoding, a decode matrix is required that is specific to a given loudspeaker setup and is generated using the known loudspeaker positions. However, some source directions are attenuated for 2D loudspeaker setups such as 5.1 surround. An improved method for decoding an encoded audio signal in soundfield format for L loudspeakers at known positions comprises steps of adding (10) a position of at least one virtual loudspeaker to the positions of the L loudspeakers, generating (11) a 3D decode matrix (D′), wherein the positions (Formula I) of the L loudspeakers and the at least one virtual position (Formula II) are used, downmixing (12) the 3D decode matrix (D′), and decoding (14) the encoded audio signal (i14) using the downscaled 3D decode matrix (Formula III). As a result, a plurality of decoded loudspeaker signals (q14) is obtained.
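
The downmixing and decoding steps can be sketched under stated assumptions. The abstract does not say how the virtual loudspeaker is folded into the real ones, so the rule used here (adding each virtual row to every real row with a constant gain, defaulting to 1/sqrt(L)) is an assumption, as are the function names and the toy matrix values. Generating the 3D decode matrix itself is outside this sketch.

```python
import math


def downmix_decode_matrix(d3, num_real, gain=None):
    """Fold virtual-loudspeaker rows of a 3D decode matrix into the real rows.

    d3       -- list of rows; the first `num_real` rows belong to the real
                loudspeakers, the remaining rows to virtual loudspeakers
    gain     -- weight for the virtual rows; 1/sqrt(num_real) if omitted
                (an assumed default, not taken from the abstract)
    """
    if gain is None:
        gain = 1.0 / math.sqrt(num_real)
    real_rows = [list(row) for row in d3[:num_real]]
    for virtual_row in d3[num_real:]:
        for row in real_rows:
            for j, value in enumerate(virtual_row):
                row[j] += gain * value
    return real_rows


def decode(matrix, coeffs):
    """Produce one sample per loudspeaker from one frame of coefficients."""
    return [sum(m * c for m, c in zip(row, coeffs)) for row in matrix]


# Toy 3D decode matrix: two real loudspeakers plus one virtual loudspeaker.
d3 = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
downmixed = downmix_decode_matrix(d3, num_real=2, gain=0.5)
speaker_signals = decode(downmixed, [1.0, 1.0])
```

The downmixed matrix has one row per real loudspeaker, so the decoder outputs exactly L signals while still carrying the energy that the virtual position would have received.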

METHOD AND APPARATUS FOR PROVIDING 3D SOUND FOR SURROUND SOUND CONFIGURATIONS
20170272863 · 2017-09-21

A system for listening to binaural audio through a plurality of speakers having at least two pairs of speakers, incorporating applying at least two Crosstalk Cancellation Filters to a corresponding at least two binaural signals to create a corresponding at least two pairs of speaker signals, and inputting the at least two pairs of speaker signals to a corresponding at least two pairs of speakers of the plurality of speakers. The invention also relates to a system and method for listening to binaural audio through a plurality of speakers by dividing the speakers into groups, generating a Crosstalk Cancellation Filter for each group, and distributing the binaural audio among the speaker groups. The invention also relates to a system for placing a binaural audio signal onto a plurality of pairs of crosstalk-cancelled loudspeakers.