H04S2420/13

PARAMETRIC JOINT-CODING OF AUDIO SOURCES
20220392467 · 2022-12-08 ·

The following coding scenario is addressed: a number of audio source signals need to be transmitted or stored so that, after decoding, they can be mixed into wave field synthesis, multi-channel surround, or stereo signals. The proposed technique offers significant coding gain when the source signals are coded jointly rather than separately, even when no redundancy is present between them. This is possible by considering the statistical properties of the source signals, the properties of mixing techniques, and spatial hearing. The sum of the source signals is transmitted, together with their statistical properties, which largely determine the perceptually important spatial cues of the final mixed audio channels. Source signals are recovered at the receiver such that their statistical properties approximate those of the original source signals. Subjective evaluations indicate that the proposed scheme achieves high audio quality.
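The encode/decode idea in this abstract — transmit only the sum of the sources plus their statistical properties, then rescale at the receiver — can be sketched as follows. This is a hypothetical illustration that uses short-time power as the only statistical property; the function names are invented and this is not the patented algorithm:

```python
import math

def encode(sources):
    """Joint-encode: transmit only the sum (downmix) of the source signals
    plus each source's short-time power as statistical side information."""
    n = len(sources[0])
    downmix = [sum(s[i] for s in sources) for i in range(n)]
    powers = [sum(x * x for x in s) / n for s in sources]  # per-source power
    return downmix, powers

def decode(downmix, powers):
    """Recover source approximations whose powers match the originals by
    scaling the single transmitted sum signal once per source."""
    n = len(downmix)
    mix_power = sum(x * x for x in downmix) / n
    out = []
    for p in powers:
        g = math.sqrt(p / mix_power) if mix_power > 0 else 0.0
        out.append([g * x for x in downmix])
    return out
```

A real system would apply this in time-frequency subbands so that the recovered sources also match the originals' spectral envelopes, but the scaling principle is the same.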

PARAMETRIC JOINT-CODING OF AUDIO SOURCES
20220392466 · 2022-12-08 ·

PARAMETRIC JOINT-CODING OF AUDIO SOURCES
20220392468 · 2022-12-08 ·

RENDERING AUDIO OBJECTS WITH MULTIPLE TYPES OF RENDERERS

An apparatus and method for rendering audio objects with multiple types of renderers. The weighting between the selected renderers depends on the position information in each audio object. Because each type of renderer has a different output coverage, combining their weighted outputs causes the audio to be perceived at the position specified by the position information.
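The position-dependent weighting between renderer types might look like the following constant-power crossfade, here assuming (hypothetically) a bed-speaker renderer and a height renderer blended by the object's height coordinate:

```python
import math

def render_weights(z):
    """Constant-power crossfade between two renderer types based on an
    object's normalized height z in [0, 1] (0 = ear level, 1 = overhead).
    Hypothetical sketch: the renderer pairing and the z-driven rule are
    assumptions, not the patented weighting."""
    theta = z * math.pi / 2
    return math.cos(theta), math.sin(theta)  # (bed_gain, height_gain)
```

The squared gains always sum to one, so total radiated power stays constant as the object moves between the two renderers' coverage zones.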

Signal processing apparatus and method, and program

The present technology relates to a signal processing apparatus, method, and program that make leaked sound harder to hear. The signal processing apparatus includes a masking sound generation unit that, when first content is reproduced in a first region and second content is reproduced in a second region by wave field synthesis using a speaker array, generates a masking sound that masks the sound of the first content and the sound of the second content heard in the region between the first and second regions. The present technology can be applied to content reproduction systems.
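One simple way to realize a masking sound generation unit of the kind described — purely a sketch, not the patented method — is to scale broadband noise so that it sits a few dB above the stronger of the two leaked signals in the in-between region:

```python
import math
import random

def masking_noise(leak_a, leak_b, n, margin_db=3.0, seed=0):
    """Generate broadband noise whose RMS sits margin_db above the stronger
    of the two leaked signals heard between the regions. A real system
    would shape the noise per frequency band; margin_db is an assumed
    parameter, not one taken from the patent."""
    rms_a = math.sqrt(sum(x * x for x in leak_a) / len(leak_a))
    rms_b = math.sqrt(sum(x * x for x in leak_b) / len(leak_b))
    target = max(rms_a, rms_b) * 10 ** (margin_db / 20)
    rng = random.Random(seed)
    noise = [rng.uniform(-1, 1) for _ in range(n)]
    rms_n = math.sqrt(sum(x * x for x in noise) / n)
    g = target / rms_n if rms_n else 0.0
    return [g * x for x in noise]
```

The generated noise would then itself be rendered by wave field synthesis so that it is confined to the region between the two listening zones.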

VOICE CONTROL DEVICE AND VOICE CONTROL SYSTEM

The voice control device includes a sound source signal input unit, a frequency determination unit, a band controller, a sound image controller, and a voice output unit. The sound source signal input unit receives a sound source signal of content from a sound source. The frequency determination unit determines a cutoff frequency. The band controller extracts, from the sound source signal, a high-frequency signal in the band at or above the cutoff frequency and a low-frequency signal in the band at or below the cutoff frequency. The sound image controller generates a plurality of sound image control signals for controlling the sound images of the plurality of speakers by controlling at least one of the phase and the sound pressure level of the high-frequency signal. The voice output unit outputs the low-frequency signal to a first speaker and outputs the sound image control signals to a second speaker composed of a plurality of speakers.
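The band controller's split at a cutoff frequency can be sketched with a one-pole low-pass filter and its complementary residual (an illustrative assumption; a real product would use steeper crossover filters):

```python
import math

def split_bands(signal, cutoff_hz, fs):
    """Split a signal at cutoff_hz into complementary low/high bands using a
    one-pole low-pass filter and its residual. By construction the two bands
    sum back to the input sample-for-sample."""
    a = math.exp(-2 * math.pi * cutoff_hz / fs)  # one-pole coefficient
    low, high, y = [], [], 0.0
    for x in signal:
        y = (1 - a) * x + a * y  # low-pass state update
        low.append(y)
        high.append(x - y)       # complementary high band
    return low, high
```

The low band would feed the first (e.g., woofer) speaker and the high band would feed the sound image controller driving the speaker array.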

METHOD FOR PROVIDING A SPATIALIZED SOUNDFIELD
20220322025 · 2022-10-06 ·

A signal processing system and method for delivering spatialized sound by optimizing sound waveforms from a sparse array of speakers to the ears of a user. The system can provide listening areas within a room or space with spatialized sound to create a 3D audio effect. In a binaural mode, the speaker array provides targeted beams aimed toward a user's ears.
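Targeted beams toward a listener's ears are commonly produced by delay-and-sum focusing; the sketch below computes per-speaker delays so that all wavefronts arrive at a focus point simultaneously (a generic, assumed technique, not necessarily the patented optimization):

```python
import math

def steering_delays(speaker_positions, focus_point, c=343.0):
    """Delay-and-sum focusing: delay each speaker so that all wavefronts
    arrive at the focus point (e.g., a listener's ear) at the same time.
    c is the speed of sound in m/s; positions are coordinate tuples."""
    dists = [math.dist(p, focus_point) for p in speaker_positions]
    d_max = max(dists)
    return [(d_max - d) / c for d in dists]  # seconds of delay per speaker
```

Running one such set of delays per ear, with suitably filtered signals, yields the two targeted beams of the binaural mode.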

METHODS AND APPARATUS FOR COMPRESSING AND DECOMPRESSING A HIGHER ORDER AMBISONICS REPRESENTATION

Higher Order Ambisonics (HOA) represents three-dimensional sound independently of a specific loudspeaker set-up. However, transmitting an HOA representation requires a very high bit rate. Compression with a fixed number of channels is therefore used, in which directional and ambient signal components are processed differently. The ambient HOA component is represented by a minimum number of HOA coefficient sequences. The remaining channels carry either directional signals or additional coefficient sequences of the ambient HOA component, depending on which yields the better perceptual quality. This allocation can change on a frame-by-frame basis.
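The per-frame channel allocation described — a fixed minimum of ambient channels, with the remaining channels carrying either directional signals or extra ambient coefficients — might be decided as follows. The power-threshold rule here is an invented placeholder for whatever perceptual criterion the actual codec uses:

```python
def allocate_channels(num_channels, min_ambient, dir_powers, threshold):
    """Per-frame allocation sketch: min_ambient channels always carry the
    ambient HOA component; each remaining channel carries a directional
    signal if that signal's power exceeds threshold, otherwise it carries an
    additional ambient coefficient sequence. (Hypothetical decision rule.)"""
    spare = num_channels - min_ambient
    # Consider the strongest directional candidates for the spare channels.
    chosen = sorted(range(len(dir_powers)), key=lambda i: -dir_powers[i])[:spare]
    directional = [i for i in chosen if dir_powers[i] > threshold]
    extra_ambient = spare - len(directional)
    return directional, extra_ambient
```

Because the decision is recomputed each frame, a channel can switch between carrying a directional signal and an ambient coefficient sequence as the sound scene evolves.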

System and method for adaptive audio signal generation, coding and rendering

Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated metadata that specifies whether the stream is channel-based or object-based. Channel-based streams carry rendering information encoded by channel name; object-based streams carry location information encoded as location expressions in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) so as to correspond to the mixer's intent. The object position metadata contains the allocentric frame-of-reference information required to play the sound correctly using the speaker positions available in a room set up to play the adaptive audio content.
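Adapting allocentric position metadata to whatever speakers a room actually has could be sketched as inverse-distance panning over the available speaker positions. This is a generic illustration of the idea, not the patented renderer:

```python
import math

def allocentric_gains(obj_pos, speakers):
    """Map a normalized allocentric object position ([0, 1] room coordinates)
    to gains over the speakers actually present, so that the same metadata
    adapts to any room layout. Inverse-distance weighting with constant-power
    normalization is an assumed panning law, not the patented one."""
    eps = 1e-9  # avoids division by zero when the object sits on a speaker
    w = [1.0 / (math.dist(obj_pos, s) + eps) for s in speakers]
    norm = math.sqrt(sum(x * x for x in w))  # constant-power normalization
    return [x / norm for x in w]
```

Because the gains are computed against the room's actual speaker positions at playback time, the same object metadata renders sensibly in rooms of different size and shape.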