H04S5/00

Decoding of audio scenes

Exemplary embodiments provide encoding and decoding methods, and associated encoders and decoders, for encoding and decoding of an audio scene which is represented by one or more audio signals. The encoder generates a bit stream which comprises downmix signals and side information which includes individual matrix elements of a reconstruction matrix which enables reconstruction of the one or more audio signals in the decoder.
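
As a rough illustration of the decoder-side step, reconstruction from a downmix can be thought of as a matrix product applied sample by sample. The sketch below is a minimal example under that assumption; the function name `reconstruct`, the data layout, and the coefficient values are hypothetical and are not taken from the bit stream format described above.

```python
def reconstruct(downmix, matrix):
    """Apply a reconstruction matrix to downmix signals.

    downmix: list of M channels, each a list of samples.
    matrix:  list of N rows, each holding M matrix elements.
    Returns N reconstructed signals (hypothetical layout).
    """
    num_samples = len(downmix[0])
    return [
        [sum(coef * channel[t] for coef, channel in zip(row, downmix))
         for t in range(num_samples)]
        for row in matrix
    ]

# Two downmix channels, three signals to reconstruct (made-up values).
downmix = [[1.0, 2.0], [0.5, 0.0]]
matrix = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
signals = reconstruct(downmix, matrix)
```

In a real codec the matrix elements would typically vary over time and frequency; here a single fixed matrix is used to keep the example short.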

SOUND PROCESSING APPARATUS AND SOUND PROCESSING SYSTEM

The present technology relates to a sound processing apparatus and a sound processing system for enabling more stable localization of a sound image.

A virtual speaker is assumed to exist on the lower side of a tetragon whose corners are formed by four speakers surrounding a target sound image position on a spherical plane. Three-dimensional VBAP is performed with respect to the virtual speaker and the two speakers located at the upper right and the upper left, to calculate the gains of these three speakers that fix a sound image at the target sound image position. Further, two-dimensional VBAP is performed with respect to the lower right and lower left speakers, to calculate the gains of those two speakers that fix a sound image at the position of the virtual speaker. The values obtained by multiplying these gains by the gain of the virtual speaker are set as the gains of the lower right and lower left speakers for fixing a sound image at the target sound image position. The present technology can be applied to sound processing apparatuses.
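
The two-stage gain calculation described above can be sketched as follows. This is a minimal example, not the apparatus's actual implementation: the speaker angles, the choice of the virtual speaker as the normalized midpoint of the lower pair, the absence of gain normalization, and all function names are assumptions made for illustration.

```python
import math

def unit(az_deg, el_deg):
    """Unit vector for an azimuth/elevation pair given in degrees."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

def vbap3(p, l1, l2, l3):
    """3D VBAP: solve g1*l1 + g2*l2 + g3*l3 = p (Cramer's rule)."""
    d = det3(l1, l2, l3)
    return (det3(p, l2, l3) / d, det3(l1, p, l3) / d, det3(l1, l2, p) / d)

def vbap2(p, l1, l2):
    """2D VBAP: solve g1*l1 + g2*l2 = p for p in the plane of l1 and l2."""
    a, b, c = dot(l1, l1), dot(l1, l2), dot(l2, l2)
    r1, r2 = dot(l1, p), dot(l2, p)
    d = a * c - b * b
    return ((r1 * c - b * r2) / d, (a * r2 - b * r1) / d)

def four_speaker_gains(target, ul, ur, ll, lr, virtual):
    # Stage 1: upper right, upper left, and the virtual speaker.
    g_ur, g_ul, g_v = vbap3(target, ur, ul, virtual)
    # Stage 2: lower pair places an image at the virtual speaker.
    h_lr, h_ll = vbap2(virtual, lr, ll)
    # Lower gains are scaled by the virtual speaker's gain.
    return {"ul": g_ul, "ur": g_ur, "ll": g_v * h_ll, "lr": g_v * h_lr}

# Hypothetical layout: upper pair at 30 degrees elevation, lower pair at 0.
ul, ur = unit(30, 30), unit(-30, 30)
ll, lr = unit(30, 0), unit(-30, 0)
mid = tuple(a + b for a, b in zip(ll, lr))
norm = math.sqrt(dot(mid, mid))
virtual = tuple(x / norm for x in mid)  # assumed virtual speaker position
target = unit(0, 15)
gains = four_speaker_gains(target, ul, ur, ll, lr, virtual)
```

With this layout all four gains come out positive, and the gain-weighted sum of the four speaker vectors points at the target position. A practical renderer would usually also normalize the gains, which is omitted here.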

SYSTEM AND METHOD FOR SYNCHRONIZATION OF MULTI-CHANNEL WIRELESS AUDIO STREAMS FOR DELAY AND DRIFT COMPENSATION

In at least one embodiment, a system for synchronizing an audio stream is provided. The system includes a first loudspeaker and an audio controller. The first loudspeaker plays back a first audio output signal including first signature information. The audio controller provides a first audio input signal and superimposes the first signature information on the first audio input signal. The audio controller receives the first audio output signal, including the first audio packets and the first signature information, and detects the first signature information. The audio controller determines a delay attributed to a transmission of the first audio input signal and the first audio output signal based on the first signature information, and synchronizes the transmission of a second audio input signal from the audio controller to the first loudspeaker with the playback of another audio output signal from a second loudspeaker based at least on the delay.
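
One simple way to picture the delay determination is a sliding correlation between the received signal and the known signature. The sketch below assumes that approach; the abstract does not state how the signature is detected, and the signature pattern, the sample values, and the name `estimate_delay` are hypothetical.

```python
def estimate_delay(received, signature):
    """Return the lag (in samples) at which the signature correlates best."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(received) - len(signature) + 1):
        score = sum(s * received[lag + i] for i, s in enumerate(signature))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Hypothetical signature: a short +/-1 pattern with a sharp correlation peak.
signature = [1.0, 1.0, 1.0, -1.0, 1.0]

# Audio input with the signature superimposed at its start (made-up samples).
audio = [0.1] * 16
sent = [a + (signature[i] if i < len(signature) else 0.0)
        for i, a in enumerate(audio)]

# Simulate a transmission and playback delay of 7 samples.
received = [0.0] * 7 + sent

lag = estimate_delay(received, signature)
delay_seconds = lag / 48000  # assuming a 48 kHz sample rate
```

The estimated lag could then be used to offset the transmission to one loudspeaker relative to playback on another; drift compensation would require repeating the measurement over time.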

Parametric reconstruction of audio signals

An encoding system encodes an N-channel audio signal (X), wherein N≥3, as a single-channel downmix signal (Y) together with dry and wet upmix parameters (C̃, P̃). In a decoding system, a decorrelating section outputs, based on the downmix signal, an (N−1)-channel decorrelated signal (Z); a dry upmix section maps the downmix signal linearly in accordance with dry upmix coefficients (C) determined based on the dry upmix parameters; a wet upmix section populates an intermediate matrix based on the wet upmix parameters and knowing that the intermediate matrix belongs to a predefined matrix class, obtains wet upmix coefficients (P) by multiplying the intermediate matrix by a predefined matrix, and maps the decorrelated signal linearly in accordance with the wet upmix coefficients; and a combining section combines outputs from the upmix sections to obtain a reconstructed signal (X̂) corresponding to the signal to be reconstructed.
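
The combining step can be read as X̂ = C·Y + P·Z. The sketch below shows only that final combination for N = 3, assuming the dry and wet upmix coefficients and the decorrelated signal are already available; the decorrelator, the intermediate matrix, and the predefined matrix are not modeled, and all values and names are made up.

```python
def parametric_upmix(y, z, dry, wet):
    """Combine dry and wet upmix contributions.

    y:   single-channel downmix, a list of samples (Y).
    z:   N-1 decorrelated channels, each a list of samples (Z).
    dry: N dry upmix coefficients (C).
    wet: N rows of N-1 wet upmix coefficients (P).
    Returns the N reconstructed channels.
    """
    return [
        [dry[n] * y[t] + sum(p * zc[t] for p, zc in zip(wet[n], z))
         for t in range(len(y))]
        for n in range(len(dry))
    ]

# N = 3: one downmix channel, two decorrelated channels (made-up values).
y = [1.0, 2.0]
z = [[0.5, -0.5], [0.0, 1.0]]
dry = [1.0, 0.5, 0.5]
wet = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
x_hat = parametric_upmix(y, z, dry, wet)
```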

Systems and methods for spatial audio rendering

Systems and methods for rendering spatial audio in accordance with embodiments of the invention are illustrated. One embodiment includes a spatial audio system, including a primary network connected speaker that includes a plurality of sets of drivers, where each set of drivers is oriented in a different direction; a processor system; and memory containing an audio player application. The audio player application configures the processor system to obtain an audio source stream from an audio source via the network interface, spatially encode the audio source, and decode the spatially encoded audio source to obtain driver inputs for the individual drivers in the plurality of sets of drivers, where the driver inputs cause the drivers to generate directional audio.

VIRTUAL SIMULATION OF SPATIAL AUDIO CHARACTERISTICS

Embodiments of the present invention are directed to a system and method for demonstrating spatial performance of a demonstration speaker model to consumers in order to evaluate different speakers. The system and method comprise a microphone array for recording the output of the demonstration speaker model. The system and method comprise acoustic input samples for processing to an acoustic output and a processor for determining characteristics of each microphone recording, and processing an acoustic input sample and characteristics of each microphone recording corresponding to a selected demonstration speaker model. The system and method further comprise a reference speaker model for outputting an acoustic signal based on the result of the processing. The processing compensates for the performance characteristic of the reference speaker and the performance characteristic of the selected demonstration speaker so as to mimic the spatial characteristics of the demonstration speaker while avoiding bias from the reference speaker.

Processing of microphone signals for spatial playback

Disclosed are methods and systems which convert a multi-microphone input signal to a multichannel output signal making use of a time- and frequency-varying matrix. For each time and frequency tile, the matrix is derived as a function of a dominant direction of arrival and a steering strength parameter. Likewise, the dominant direction and steering strength parameter are derived from characteristics of the multi-microphone signals, where those characteristics include values representative of the inter-channel amplitude and group-delay differences.
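
To make the idea of a per-tile matrix concrete, the sketch below builds a 2x2 matrix for one time and frequency tile from a direction parameter and a steering strength. The abstract states only that the matrix is a function of these two quantities, so the linear blend between a pass-through matrix and a steered matrix, the sine/cosine panning, the two-channel layout, and all names are assumptions. Estimating the direction and strength from amplitude and group-delay differences is not shown.

```python
import math

def tile_matrix(pan, strength):
    """Build a 2x2 mixing matrix for one time/frequency tile.

    pan:      hypothetical direction parameter in [0, 1] (0 = left, 1 = right).
    strength: steering strength in [0, 1] (0 = pass-through, 1 = fully steered).
    """
    passthrough = [[1.0, 0.0], [0.0, 1.0]]
    g_left = math.cos(pan * math.pi / 2)
    g_right = math.sin(pan * math.pi / 2)
    # Steered matrix: average the two microphones, then pan the result.
    steered = [[0.5 * g_left, 0.5 * g_left], [0.5 * g_right, 0.5 * g_right]]
    return [
        [(1.0 - strength) * passthrough[r][c] + strength * steered[r][c]
         for c in range(2)]
        for r in range(2)
    ]

def apply_tile(matrix, mics):
    """Apply the tile's matrix to the microphone values of that tile."""
    return [sum(m * x for m, x in zip(row, mics)) for row in matrix]

# A fully steered tile whose dominant direction is hard left (made-up values).
outputs = apply_tile(tile_matrix(0.0, 1.0), [1.0, 3.0])
```

With zero steering strength the microphone signals pass through unchanged; with full strength the tile's energy is routed toward the dominant direction.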