H04S2420/13

Wave field synthesis system

One example provides a decentrally structured apparatus including sound transducers and operating according to wave field synthesis principles. The decentrally structured apparatus includes a plurality of assembly units, each including several sound transducers, wherein the decentrally structured apparatus is configured to use a model-based approach to carry out a synthesis of wave fronts within each assembly unit for sound transducers of the respective assembly unit using audio signals and associated data for their form, and to actuate the sound transducers of the respective assembly unit with actuation signals corresponding to the synthesis.

Method for controlling a three-dimensional multi-layer speaker arrangement and apparatus for playing back three-dimensional sound in an audience area
09674631 · 2017-06-06

A method for controlling a three-dimensional multi-layer speaker arrangement having a plurality of speakers arranged in spaced layers. The method includes: providing information for a sound to be played back from a 3D source position assigned to the sound, wherein the source position is defined with respect to a reference point (RP) within the multi-layer speaker arrangement; extracting a 2D source position (SPXY) from the source position and calculating layer-specific speaker coefficients using a 2D calculator to position the sound at the two-dimensional source position; and feeding a vertical pan or the 3D source position into a multilayer calculator to obtain a layer gain factor for each layer, from which speaker coefficients are obtained and used as individual gains enabling the speakers to play back the sound.
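The two-stage structure described above (2D coefficients per layer, then one gain factor per layer) can be illustrated with a minimal sketch. The abstract does not state the panning law, so the constant-power pan between the two layers that bracket the source height is an assumption, as are the function names `layer_gains` and `speaker_gains` and the requirement that layer heights are sorted in ascending order.

```python
import math

def layer_gains(z, layer_heights):
    """Return one gain per layer for a source at height z.

    Assumption: constant-power pan between the two layers that bracket z;
    layer_heights must be sorted in ascending order.
    """
    gains = [0.0] * len(layer_heights)
    if z <= layer_heights[0]:
        gains[0] = 1.0          # source at or below the lowest layer
        return gains
    if z >= layer_heights[-1]:
        gains[-1] = 1.0         # source at or above the highest layer
        return gains
    for i in range(len(layer_heights) - 1):
        lo, hi = layer_heights[i], layer_heights[i + 1]
        if lo <= z <= hi:
            t = (z - lo) / (hi - lo)               # 0 at lower layer, 1 at upper
            gains[i] = math.cos(t * math.pi / 2)
            gains[i + 1] = math.sin(t * math.pi / 2)
            break
    return gains

def speaker_gains(coeffs_2d_per_layer, gains):
    """Scale each layer's 2D speaker coefficients by that layer's gain."""
    return [[g * c for c in layer]
            for layer, g in zip(coeffs_2d_per_layer, gains)]
```

Here the 2D coefficients are taken as given; any 2D panner could supply them, since the layer gain is applied as a separate multiplicative step.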

Audio Representation and Associated Rendering
20250056176 · 2025-02-13

An apparatus configured to: receive, at least, at least one first audio channel and at least one second audio channel, wherein at least one of the at least one first audio channel or the at least one second audio channel comprises spatial audio configured to enable immersive audio communication; determine a format of at least one of the at least one first or the at least one second audio channel to identify which of the received at least one first audio channel and the at least one second audio channel comprises the spatial audio; process the identified at least one audio channel with at least one parameter dependent on the determined format; and render the processed at least one audio channel and another of the at least one first audio channel or the at least one second audio channel.

Device and method for calculating loudspeaker signals for a plurality of loudspeakers while using a delay in the frequency domain

A device for calculating loudspeaker signals using a plurality of audio sources, an audio source including an audio signal, includes a forward transform stage for transforming each audio signal to a spectral domain to obtain a plurality of temporally consecutive short-term spectra, a memory for storing a plurality of temporally consecutive short-term spectra for each audio signal, a memory access controller for accessing a specific short-term spectrum for a combination consisting of a loudspeaker and an audio signal based on a delay value, a filter stage for filtering the specific short-term spectrum by using a filter, so that a filtered short-term spectrum is obtained for each audio signal and loudspeaker combination, a summing stage for summing up the filtered short-term spectra for a loudspeaker to obtain summed-up short-term spectra, and a backtransform stage for backtransforming summed-up short-term spectra for the loudspeakers to a time domain to obtain the loudspeaker signals.
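As a rough illustration of the memory, memory access, filter, and summing stages named above, the following sketch stores short-term spectra per audio signal and selects one of them by a delay value expressed in whole blocks. The forward and back transforms are omitted, and the class `SpectrumMemory`, the function `loudspeaker_spectrum`, and the choice of a block-granular delay are assumptions made for the example, not details taken from the abstract.

```python
from collections import deque

class SpectrumMemory:
    """Holds the most recent short-term spectra of one audio signal."""

    def __init__(self, depth):
        self.blocks = deque(maxlen=depth)

    def push(self, spectrum):
        # The newest spectrum is stored at index 0.
        self.blocks.appendleft(list(spectrum))

    def fetch(self, delay_blocks):
        # A delay of d blocks selects the spectrum stored d pushes ago.
        return self.blocks[delay_blocks]

def loudspeaker_spectrum(memories, delays, filters):
    """Sum the filtered, delayed spectra of all sources for one loudspeaker.

    memories: one SpectrumMemory per audio signal
    delays:   one block delay per (loudspeaker, signal) combination
    filters:  one per-bin filter (list of complex gains) per combination
    """
    n_bins = len(filters[0])
    total = [0j] * n_bins
    for mem, delay, h in zip(memories, delays, filters):
        block = mem.fetch(delay)
        for k in range(n_bins):
            total[k] += h[k] * block[k]   # filtering is a per-bin product
    return total
```

The point of the arrangement is that the delay costs only an index lookup into the stored spectra, so each source is transformed once regardless of how many loudspeakers use it.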

Parametric joint-coding of audio sources

The following coding scenario is addressed: A number of audio source signals need to be transmitted or stored for the purpose of mixing wave field synthesis, multi-channel surround, or stereo signals after decoding the source signals. The proposed technique offers significant coding gain when jointly coding the source signals, compared to separately coding them, even when no redundancy is present between the source signals. This is possible by considering statistical properties of the source signals, the properties of mixing techniques, and spatial hearing. The sum of the source signals is transmitted plus the statistical properties of the source signals which mostly determine the perceptually important spatial cues of the final mixed audio channels. Source signals are recovered at the receiver such that their statistical properties approximate the corresponding properties of the original source signals. Subjective evaluations indicate that high audio quality is achieved by the proposed scheme.
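The idea of transmitting the sum signal plus statistical properties can be sketched in a heavily simplified form. The example below uses a single broadband power value per source as the only statistic and recovers each source by scaling the sum signal to that power; a practical scheme would work per subband and per time frame and would use further statistics. The function names `encode` and `decode` are hypothetical.

```python
def encode(sources):
    """Return the sum signal and one power value per source.

    Assumption: broadband power over the whole signal is the only
    transmitted statistic.
    """
    n = len(sources[0])
    mix = [sum(s[i] for s in sources) for i in range(n)]
    powers = [sum(v * v for v in s) / n for s in sources]
    return mix, powers

def decode(mix, powers):
    """Approximate each source by scaling the sum signal to its power."""
    n = len(mix)
    mix_power = sum(v * v for v in mix) / n
    recovered = []
    for p in powers:
        scale = (p / mix_power) ** 0.5 if mix_power > 0 else 0.0
        recovered.append([scale * v for v in mix])
    return recovered
```

The recovered signals are not the original waveforms; only their power matches, which is the sense in which the statistical properties are approximated at the receiver.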

APPARATUS AND METHOD FOR COPY-PROTECTED GENERATION AND REPRODUCTION OF A WAVE FIELD SYNTHESIS AUDIO REPRESENTATION
20170150286 · 2017-05-25

An embodiment provides an apparatus for generating a copy-protected wave field synthesis audio representation of an audio scene with a plurality of audio objects, wherein each audio object includes an audio file and position information. The apparatus includes a watermark embedder for embedding a watermark in the audio file of at least one of the plurality of audio objects for generating a modified audio file for the at least one audio object, wherein the watermark specifies a reproduction room. Further, the apparatus includes a wave field synthesis processor for generating the copy-protected wave field synthesis audio representation of the audio scene by using a loudspeaker configuration of the specific reproduction room of the modified audio file and the position for the at least one audio object.

Audio signal playback device, method, and recording medium
09661436 · 2017-05-23

An audio signal playback device includes a conversion unit that performs discrete Fourier transform on each of 2 channel audio signals obtained from a multi-channel input audio signal, a correlation signal extraction unit that, disregarding a direct current component, extracts a correlation signal from the 2 channel audio signals that result from the discrete Fourier transform, and additionally pulls a correlation signal in a lower frequency than a predetermined frequency f_low out of the correlation signal, and an output unit that allocates the pulled-out correlation signal to a virtual sound source in such a manner that a time difference in a sound output between adjacent speakers falls within a range of 2x/c (where x is a distance between the adjacent speakers, and c is the speed of sound), and outputs a result of the allocation from one portion or all portions of the speaker group.
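The 2x/c bound on the time difference between adjacent speakers is simple arithmetic, shown here as a small sketch. The default speed of sound of 343 m/s (air at roughly 20 °C) and the function names are assumptions for illustration.

```python
def max_time_difference(spacing_m, speed_of_sound=343.0):
    """Upper bound 2x/c, in seconds, for adjacent speakers x metres apart."""
    return 2.0 * spacing_m / speed_of_sound

def within_bound(delay_a_s, delay_b_s, spacing_m, speed_of_sound=343.0):
    """Check that two adjacent speakers' output delays respect the bound."""
    limit = max_time_difference(spacing_m, speed_of_sound)
    return abs(delay_a_s - delay_b_s) <= limit
```

For a spacing of 0.1715 m and c = 343 m/s the bound works out to about 1 ms.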

PARAMETRIC JOINT-CODING OF AUDIO SOURCES

The following coding scenario is addressed: A number of audio source signals need to be transmitted or stored for the purpose of mixing wave field synthesis, multi-channel surround, or stereo signals after decoding the source signals. The proposed technique offers significant coding gain when jointly coding the source signals, compared to separately coding them, even when no redundancy is present between the source signals. This is possible by considering statistical properties of the source signals, the properties of mixing techniques, and spatial hearing. The sum of the source signals is transmitted plus the statistical properties of the source signals which mostly determine the perceptually important spatial cues of the final mixed audio channels. Source signals are recovered at the receiver such that their statistical properties approximate the corresponding properties of the original source signals. Subjective evaluations indicate that high audio quality is achieved by the proposed scheme.

System and method for adaptive audio signal generation, coding and rendering

Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
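The distinction between channel-based and object-based streams can be pictured with a small data-structure sketch. The field names, the `render_target` function, and the convention that object positions are normalized to the room dimensions (0 to 1 on each axis) are all hypothetical and are not taken from the abstract.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class StreamMetadata:
    kind: str                                   # "channel" or "object"
    channel_name: Optional[str] = None          # used by channel-based streams
    position: Optional[Tuple[float, float, float]] = None  # used by objects

def render_target(meta, room_size):
    """Resolve where a stream should be rendered in a given room.

    Channel-based streams resolve to their channel name. Object-based
    streams resolve to a position in metres, assuming the metadata holds
    room-relative coordinates in the range 0..1.
    """
    if meta.kind == "channel":
        return meta.channel_name
    if meta.kind == "object":
        return tuple(p * d for p, d in zip(meta.position, room_size))
    raise ValueError("unknown stream kind: " + meta.kind)
```

Because the object position is expressed relative to the room, the same metadata yields different absolute positions in rooms of different size, which is the behaviour the allocentric frame of reference is meant to provide.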

Information processing system and storage medium

Provided is an information processing system including a recognizing unit configured to recognize a first target and a second target on the basis of signals detected by a plurality of sensors arranged around a specific user, an identifying unit configured to identify the first target and the second target recognized by the recognizing unit, an estimating unit configured to estimate a position of the specific user in accordance with a signal detected by any one of the plurality of sensors, and a signal processing unit configured to process each of the signals acquired from sensors around the first target and the second target identified by the identifying unit in such a manner that, when output from a plurality of actuators arranged around the specific user, the signals are localized near the position of the specific user estimated by the estimating unit.