H04S5/005

SOUND PROCESSING APPARATUS AND SOUND PROCESSING SYSTEM

The present technology relates to a sound processing apparatus and a sound processing system for enabling more stable localization of a sound image.

A virtual speaker is assumed to exist on the lower side of a tetragon whose corners are formed by four speakers surrounding a target sound image position on a spherical surface. Three-dimensional VBAP is performed on the virtual speaker and the upper-right and upper-left speakers to calculate the gains of these three speakers for fixing a sound image at the target sound image position. Two-dimensional VBAP is then performed on the lower-right and lower-left speakers to calculate the gains of those two speakers for fixing a sound image at the position of the virtual speaker. These gains, multiplied by the gain of the virtual speaker, are set as the gains of the lower-right and lower-left speakers for fixing a sound image at the target sound image position. The present technology can be applied to sound processing apparatuses.
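The two-stage gain calculation described in this abstract can be sketched as follows. This is a minimal illustration and not the claimed implementation: the speaker angles, the function names, and the placement of the virtual speaker at the midpoint of the lower side are all assumptions made for the example, and gain normalization is left out.

```python
import math

def direction(az_deg, el_deg):
    """Unit vector for azimuth/elevation in degrees (x front, y left, z up)."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

def vbap3(p, l1, l2, l3):
    """Three-dimensional VBAP: solve p = g1*l1 + g2*l2 + g3*l3 (Cramer's rule)."""
    d = det3(l1, l2, l3)
    return det3(p, l2, l3) / d, det3(l1, p, l3) / d, det3(l1, l2, p) / d

def vbap2(p, l1, l2):
    """Two-dimensional VBAP: solve p = g1*l1 + g2*l2 for unit vectors l1, l2,
    with p lying in the plane they span."""
    c, a, b = dot(l1, l2), dot(l1, p), dot(l2, p)
    d = 1.0 - c * c
    return (a - c * b) / d, (b - c * a) / d

def four_speaker_gains(p, ul, ur, ll, lr):
    # Assumption: the virtual speaker sits at the midpoint of the lower side.
    mid = tuple(x + y for x, y in zip(ll, lr))
    norm = math.sqrt(dot(mid, mid))
    virtual = tuple(x / norm for x in mid)
    # Stage 1: gains for upper left, upper right, and the virtual speaker.
    g_ul, g_ur, g_virtual = vbap3(p, ul, ur, virtual)
    # Stage 2: gains that place a sound image at the virtual speaker position.
    a_ll, a_lr = vbap2(virtual, ll, lr)
    # Lower-speaker gains are the stage-2 gains scaled by the virtual gain.
    return {"ul": g_ul, "ur": g_ur, "ll": g_virtual * a_ll, "lr": g_virtual * a_lr}

# Hypothetical layout: speakers at +/-30 degrees azimuth, 0 and 30 degrees elevation.
speakers = {"ul": direction(30, 30), "ur": direction(-30, 30),
            "ll": direction(30, 0), "lr": direction(-30, 0)}
target = direction(10, 15)
gains = four_speaker_gains(target, speakers["ul"], speakers["ur"],
                           speakers["ll"], speakers["lr"])
```

In this layout all four gains come out positive, and the gain-weighted sum of the four speaker directions reproduces the target direction. A practical renderer would typically also normalize the gains, which this sketch omits.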

Apparatus and method for efficient object metadata coding

An apparatus for generating one or more audio channels is provided. The apparatus includes a metadata decoder for receiving one or more compressed metadata signals. Each of the one or more compressed metadata signals includes a plurality of first metadata samples. The metadata decoder is configured to generate one or more reconstructed metadata signals, generating each second metadata sample of a reconstructed metadata signal depending on at least two of the first metadata samples of that reconstructed metadata signal. The apparatus also includes an audio channel generator for generating the one or more audio channels depending on one or more audio object signals and on the one or more reconstructed metadata signals. An apparatus for generating encoded audio information, including one or more encoded audio signals and one or more compressed metadata signals, is also provided.
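One way to derive intermediate metadata samples from at least two transmitted samples is linear interpolation. The sketch below assumes that scheme for illustration only; the abstract does not state which reconstruction rule is used, and the function name and the `step` parameter are hypothetical.

```python
def reconstruct_metadata(first_samples, step):
    """Rebuild a dense metadata signal from sparsely transmitted samples.

    first_samples: values received every `step` positions (the "first" samples).
    Each in-between ("second") sample is derived from its two neighbouring
    first samples. Linear interpolation is an assumption of this sketch.
    """
    dense = []
    for left, right in zip(first_samples, first_samples[1:]):
        dense.append(left)
        for k in range(1, step):
            dense.append(left + (right - left) * k / step)
    dense.append(first_samples[-1])
    return dense

# Hypothetical azimuth trajectory in degrees, transmitted every 4th sample.
azimuth = reconstruct_metadata([0.0, 10.0, 20.0], step=4)
```

Here `azimuth` holds nine values: the three transmitted samples plus three interpolated samples in each gap.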

Parametric reconstruction of audio signals

An encoding system encodes an N-channel audio signal (X), wherein N≥3, as a single-channel downmix signal (Y) together with dry and wet upmix parameters (C̃, P̃). In a decoding system, a decorrelating section outputs, based on the downmix signal, an (N−1)-channel decorrelated signal (Z); a dry upmix section maps the downmix signal linearly in accordance with dry upmix coefficients (C) determined based on the dry upmix parameters; a wet upmix section populates an intermediate matrix based on the wet upmix parameters and knowing that the intermediate matrix belongs to a predefined matrix class, obtains wet upmix coefficients (P) by multiplying the intermediate matrix by a predefined matrix, and maps the decorrelated signal linearly in accordance with the wet upmix coefficients; and a combining section combines outputs from the upmix sections to obtain a reconstructed signal (X̂) corresponding to the signal to be reconstructed.
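The decoder-side arithmetic can be sketched for a single sample as X̂ = C·Y + P·Z. The sketch below makes several assumptions that the abstract does not confirm: the predefined matrix class is taken to be symmetric matrices, the predefined matrix `predefined_v` is an arbitrary illustrative choice, and the product is taken in the order predefined matrix times intermediate matrix. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def symmetric_from_params(params, size):
    """Populate a size x size symmetric matrix from its upper-triangular
    entries (assumed matrix class; only size*(size+1)/2 values are needed)."""
    h = [[0.0] * size for _ in range(size)]
    values = iter(params)
    for i in range(size):
        for j in range(i, size):
            h[i][j] = h[j][i] = next(values)
    return h

def reconstruct(y, z, dry_c, wet_params, predefined_v):
    """y: downmix sample, z: (N-1) decorrelated samples, dry_c: N coefficients."""
    n = len(dry_c)
    intermediate = symmetric_from_params(wet_params, n - 1)
    wet_p = matmul(predefined_v, intermediate)   # N x (N-1) wet coefficients
    wet = matvec(wet_p, z)                       # wet upmix of Z
    return [c * y + w for c, w in zip(dry_c, wet)]  # combine dry and wet

# Illustrative values for N = 3.
predefined_v = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
x_hat = reconstruct(y=2.0, z=[0.1, -0.2], dry_c=[0.5, 0.3, 0.2],
                    wet_params=[1.0, 0.0, 1.0], predefined_v=predefined_v)
```

Because the columns of this illustrative `predefined_v` each sum to zero, the wet contributions cancel across channels, so only the dry part affects the channel sum. That property belongs to the example matrix and is not claimed by the abstract.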

Audio channel spatial translation

The present invention is directed to methods and apparatus for translating a first plurality of audio input channels to a second plurality of audio output channels. This includes determining that there is pair-wise coding among any of the first plurality of audio input channels; determining an input/output-mapping matrix for mapping at least a first set of the first plurality of audio input channels to at least a second set of the second plurality of audio output channels; and deriving the second plurality of audio output channels based on the first plurality of audio input channels, the input/output-mapping matrix, and the determined pair-wise coding. The first plurality of audio input channels represents the same soundfield as the second plurality of audio output channels.

SELECTABLE LINEAR PREDICTIVE OR TRANSFORM CODING MODES WITH ADVANCED STEREO CODING

Methods and systems for advanced stereo processing of an audio signal are disclosed. The methods and systems include selecting a coding mode of either transform coding or linear predictive coding and performing advanced stereo processing when in the selected coding mode. Both encoding and decoding operations are provided.

System and method for adaptive audio signal generation, coding and rendering

Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
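The distinction between channel-based and object-based streams can be illustrated with a small data structure. This sketch is hypothetical throughout: the field names, the room-relative coordinate convention, and the nearest-speaker routing are assumptions for illustration, and a real renderer would pan an object across several speakers instead of snapping it to one.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Position = Tuple[float, float, float]

@dataclass
class Stream:
    name: str
    kind: str                              # "channel" or "object"
    channel_name: Optional[str] = None     # rendering info for channel-based streams
    position: Optional[Position] = None    # room-relative (allocentric) x, y, z in 0..1

def route(stream: Stream, speakers: Dict[str, Position]) -> str:
    """Pick the speaker that plays a stream in a given room layout."""
    if stream.kind == "channel":
        # Channel-based: the channel name identifies the speaker directly.
        return stream.channel_name
    # Object-based: choose the speaker closest to the room-relative position
    # (a deliberate simplification of rendering).
    return min(
        speakers,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(speakers[s], stream.position)),
    )

# Hypothetical four-speaker room, coordinates normalized to the room size.
room = {"L": (0.0, 1.0, 0.0), "R": (1.0, 1.0, 0.0),
        "Ls": (0.0, 0.0, 0.0), "Rs": (1.0, 0.0, 0.0)}
bed = Stream(name="dialog bed", kind="channel", channel_name="L")
fly = Stream(name="fly-by", kind="object", position=(0.9, 0.2, 0.0))
```

Because the object position is expressed relative to the room and not to a fixed speaker layout, the same metadata can be routed in rooms with different speaker sets.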

Sound processing apparatus and sound processing system

The present technology relates to a sound processing apparatus and a sound processing system for enabling more stable localization of a sound image. A virtual speaker is assumed to exist on the lower side of a tetragon whose corners are formed by four speakers surrounding a target sound image position on a spherical surface. Three-dimensional VBAP is performed on the virtual speaker and the upper-right and upper-left speakers to calculate the gains of these three speakers for fixing a sound image at the target sound image position. Two-dimensional VBAP is then performed on the lower-right and lower-left speakers to calculate the gains of those two speakers for fixing a sound image at the position of the virtual speaker. These gains, multiplied by the gain of the virtual speaker, are set as the gains of the lower-right and lower-left speakers for fixing a sound image at the target sound image position. The present technology can be applied to sound processing apparatuses.

AUDIO DEVICE, AUDIO SYSTEM, AND COMPUTER-READABLE PROGRAM
20220225046 · 2022-07-14

The aim is to enable listening with a desired number of channels at lower cost and without wasting resources. According to the present invention, a multi-channel digital audio signal is processed through the cooperation of a plurality of AV amplifier devices 1. One AV amplifier device 1 serves as a master: it distributes the audio channels to be processed among the AV amplifier devices 1, including itself, and determines, on the basis of the signal processing time of each AV amplifier device 1, an output delay time for each AV amplifier device 1 such that the output timings of the analog audio signals of all AV amplifier devices 1 match. Each AV amplifier device 1 then decodes the input digital audio signal into an analog audio signal for the audio channel distributed to it by the master, and outputs the decoded analog audio signal delayed by the output delay time that the master determined for it.
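The master's delay calculation can be sketched as aligning every device to the slowest one. That alignment rule is an assumption of this example: the abstract only states that the delays are chosen so that all output timings match. The device names and timings are hypothetical.

```python
def output_delays(processing_ms):
    """Given each device's signal processing time in milliseconds, return the
    extra output delay per device so that all outputs coincide.

    Assumption: devices are aligned to the slowest device, which gets zero delay.
    """
    slowest = max(processing_ms.values())
    return {device: slowest - t for device, t in processing_ms.items()}

# Hypothetical processing times reported to the master.
delays = output_delays({"master": 12, "amp2": 20, "amp3": 15})
```

With these timings every device emits its analog output 20 ms after the input arrives: the master waits 8 ms, `amp3` waits 5 ms, and `amp2` outputs immediately.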

Method for transmitting and receiving audio data and apparatus therefor
11393483 · 2022-07-19

A method for transmitting audio data performed by an audio data transmission apparatus in accordance with the present invention comprises the steps of: generating playback environment information of three-dimensional audio content; encoding a three-dimensional audio signal of the three-dimensional audio content; and transmitting, to an audio data reception apparatus, the encoded three-dimensional audio signal of the three-dimensional audio content and the generated playback environment information, wherein the playback environment information includes environment information of a room in which the three-dimensional audio content is played.

Audio system height channel up-mixing
11373662 · 2022-06-28

Audio system height channel up-mixing is configured to develop two or more height channels from audio sources that do not include height-related encoding. The up-mixing involves determining correlations and normalized channel energies between input audio signals. At least two height channels (e.g., left and right height audio signals) are developed from the correlations and normalized energies.
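The two analysis quantities named in this abstract can be computed for a pair of input channels as follows. This sketch covers only the analysis step: how the height channels are then derived from these values is not given in the abstract and is not attempted here. The function names and the choice of a zero-lag normalized correlation are assumptions.

```python
import math

def energy(x):
    return sum(s * s for s in x)

def correlation(left, right):
    """Zero-lag normalized correlation between two equal-length frames."""
    denom = math.sqrt(energy(left) * energy(right))
    return sum(a * b for a, b in zip(left, right)) / denom if denom else 0.0

def normalized_energies(left, right):
    """Each channel's share of the total frame energy (the shares sum to 1)."""
    e_left, e_right = energy(left), energy(right)
    total = e_left + e_right
    return (e_left / total, e_right / total) if total else (0.5, 0.5)

# Hypothetical short frames: identical channels, then orthogonal channels.
frame = [1.0, 0.0, -1.0, 0.0]
shifted = [0.0, 1.0, 0.0, -1.0]
same = correlation(frame, frame)
orthogonal = correlation(frame, shifted)
shares = normalized_energies(frame, shifted)
```

Fully correlated inputs give a correlation of 1 and orthogonal inputs give 0, while the energy shares indicate how the signal energy is balanced between the two channels.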