Patent classifications
H04S2420/13
Rendering audio objects with multiple types of renderers
An apparatus and method of rendering audio objects with multiple types of renderers. The weighting between the selected renderers depends upon the position information in each audio object. As each type of renderer has a different output coverage, the combination of their weighted outputs results in the audio being perceived at the position according to the position information.
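The position-dependent weighting described in this abstract can be illustrated with a minimal sketch. It assumes two hypothetical renderers (one covering a lower speaker layer, one covering an upper layer) and a normalized elevation value in [0, 1]. The sine/cosine crossfade is an illustrative choice, not the weighting rule claimed in the patent.

```python
import math


def renderer_weights(elevation):
    """Return (w_low, w_high) for a normalized elevation in [0, 1].

    Assumption: a power-preserving sine/cosine crossfade, so that
    w_low**2 + w_high**2 == 1 for every elevation.
    """
    z = min(max(elevation, 0.0), 1.0)  # clamp out-of-range positions
    return math.cos(z * math.pi / 2), math.sin(z * math.pi / 2)


def mix_renderers(sample, elevation, render_low, render_high):
    """Weight the speaker feeds of two renderers and concatenate them.

    render_low and render_high are hypothetical callables that map one
    input sample to a list of speaker feeds for their own coverage area.
    """
    w_low, w_high = renderer_weights(elevation)
    low_feeds = [w_low * x for x in render_low(sample)]
    high_feeds = [w_high * x for x in render_high(sample)]
    return low_feeds + high_feeds
```

For an object halfway between the two layers, both renderers contribute equally, which is what lets the combined output be perceived between their coverage areas.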
Apparatus, method and computer program for encoding, decoding, scene processing and other procedures related to DirAC based spatial audio coding using low-order, mid-order and high-order components generators
An apparatus for generating a sound field description using an input signal having a mono-signal or a multi-channel signal comprises: an input signal analyzer for analyzing the input signal to derive direction data and diffuseness data; a low-order components generator for generating a low-order sound field description from the input signal up to a predetermined order and mode; a mid-order components generator for generating a mid-order sound field description above the predetermined order or at the predetermined order and above the predetermined mode and below or at a first truncation order using a synthesis of at least one direct portion and of at least one diffuse portion using the direction data and the diffuseness data; and a high-order components generator for generating a high-order sound field description having a component above the first truncation order using a synthesis of at least one direct portion.
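The split between the three generators can be sketched as a lookup on the order and mode of each sound field component. The function name and parameter names below are hypothetical, and the sketch assumes that modes are compared as plain numeric indices, which the abstract does not specify.

```python
def generator_for(order, mode, low_order, low_mode, trunc_order):
    """Pick the generator responsible for one sound field component.

    low_order / low_mode: the predetermined order and mode (assumed names).
    trunc_order: the first truncation order (assumed name).
    """
    # Low-order generator: everything up to the predetermined order and mode.
    if order < low_order or (order == low_order and mode <= low_mode):
        return "low"
    # Mid-order generator: above that limit, up to the first truncation order.
    if order <= trunc_order:
        return "mid"
    # High-order generator: anything above the first truncation order.
    return "high"
```

Under these assumptions, with `low_order=1`, `low_mode=0` and `trunc_order=2`, the component at order 1 and mode 1 falls to the mid-order generator, because it sits at the predetermined order but above the predetermined mode.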
Methods and apparatus for compressing and decompressing a higher order ambisonics representation
Higher Order Ambisonics represents three-dimensional sound independent of a specific loudspeaker set-up. However, transmission of an HOA representation results in a very high bit rate. Therefore, compression with a fixed number of channels is used, in which directional and ambient signal components are processed differently. The ambient HOA component is represented by a minimum number of HOA coefficient sequences. The remaining channels contain either directional signals or additional coefficient sequences of the ambient HOA component, depending on what will result in optimum perceptual quality. This processing can change on a frame-by-frame basis.
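The per-frame channel assignment can be sketched as follows. This is an illustration only: the `score` values stand in for whatever perceptual-quality criterion the codec uses, which the abstract does not define, and all names are hypothetical.

```python
def assign_channels(total_channels, min_ambient, candidates):
    """Assign a fixed number of transport channels for one frame.

    candidates: list of (kind, label, score) tuples, where kind is
    "directional" or "ambient" and score is a placeholder for the
    expected perceptual benefit of carrying that signal.
    """
    if min_ambient > total_channels:
        raise ValueError("min_ambient exceeds the number of channels")
    # The minimum set of ambient HOA coefficient sequences is always sent.
    fixed = [("ambient", f"coeff_{i}") for i in range(min_ambient)]
    # Remaining channels go to the highest-scoring candidates of either kind.
    free = total_channels - min_ambient
    best = sorted(candidates, key=lambda c: c[2], reverse=True)[:free]
    return fixed + [(kind, label) for kind, label, _ in best]
```

Calling the function once per frame with fresh candidates reflects the statement that the assignment can change on a frame-by-frame basis.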
Spatial sound generation device, spatial sound generation system, spatial sound generation method, and spatial sound generation program
A spatial sound generation device including a storage (106) and a controller (102) and connected to a plurality of speakers (116) is provided. In the spatial sound generation device, referring to information indicating a movable sounding body, the controller varies a transfer characteristic for each time in accordance with movement of the sounding body and applies an inverse filtering to calculate a plurality of input signals for the respective speakers from a sound source signal indicating a sound emitted by the sounding body. The inverse filtering outputs the input signals into the speakers to form a three-dimensional acoustic wave front under boundary surface control in accordance with a transfer characteristic for a space in which the plurality of speakers are arranged.
System and method for adaptive audio signal generation, coding and rendering
Embodiments are described for an adaptive audio system that processes audio data comprising a number of independent monophonic audio streams. One or more of the streams has associated with it metadata that specifies whether the stream is a channel-based or object-based stream. Channel-based streams have rendering information encoded by means of channel name; and the object-based streams have location information encoded through location expressions encoded in the associated metadata. A codec packages the independent audio streams into a single serial bitstream that contains all of the audio data. This configuration allows for the sound to be rendered according to an allocentric frame of reference, in which the rendering location of a sound is based on the characteristics of the playback environment (e.g., room size, shape, etc.) to correspond to the mixer's intent. The object position metadata contains the appropriate allocentric frame of reference information required to play the sound correctly using the available speaker positions in a room that is set up to play the adaptive audio content.
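The distinction between channel-based and object-based streams can be sketched as a small metadata record with a dispatch function. The field names, the `kind` values and the three-component position are assumptions made for illustration and are not the bitstream format described in the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class StreamMetadata:
    """Hypothetical metadata attached to one monophonic audio stream."""
    kind: str  # "channel" or "object"
    channel_name: Optional[str] = None  # used by channel-based streams
    position: Optional[Tuple[float, float, float]] = None  # used by objects


def rendering_target(meta):
    """Return how a renderer would place the stream."""
    if meta.kind == "channel":
        if meta.channel_name is None:
            raise ValueError("channel-based stream needs a channel name")
        return ("speaker", meta.channel_name)
    if meta.kind == "object":
        if meta.position is None:
            raise ValueError("object-based stream needs a position")
        return ("position", meta.position)
    raise ValueError(f"unknown stream kind: {meta.kind!r}")
```

A channel-based stream resolves to a named speaker feed, while an object-based stream keeps its location so the renderer can map it onto whatever speakers the playback room provides.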
Sound field reproduction device, sound field reproduction method, and program
The present technique relates to a sound field reproduction device, a sound field reproduction method, and a program that make it possible to reproduce a given sound field more accurately. A feature amount extraction unit extracts a main sound source feature amount from a sound pickup signal obtained by picking up a sound from a main sound source. Using the main sound source feature amount, a main sound source separation unit separates the sound pickup signal, obtained with a microphone array that mainly picks up a sound from the main sound source, into a main sound source component and an auxiliary sound source component. On the basis of the main sound source component and the auxiliary sound source component that have been separated, a main sound source emphasis unit generates a signal in which the main sound source components are emphasized. A drive signal for a speaker array is generated from the signal generated in this manner and supplied to the speaker array. The present technique can be applied to a sound field reproduction apparatus.
Apparatus and method for generating spatial audio
The present disclosure pertains to an apparatus comprising circuitry configured to: determine a loudspeaker dependent spread factor for at least one individual loudspeaker of a loudspeaker arrangement, wherein the loudspeaker dependent spread factor depends on a specification of the at least one individual loudspeaker; and control the outputs of the loudspeakers of the loudspeaker arrangement based on the loudspeaker dependent spread factor for the at least one individual loudspeaker to generate at least one virtual sound source.
Method for providing a spatialized soundfield
A signal processing system and method for delivering spatialized sound by optimizing sound waveforms from a sparse array of speakers to the ears of a user. The system can provide listening areas within a room or space, to provide spatialization sounds to create a 3D audio effect. In a binaural mode, a binary speaker array provides targeted beams aimed towards a user's ears.
Sound field forming apparatus and method, and program
The present technology relates to a sound field forming apparatus and method, and a program, that make it possible to improve the reproducibility of a wave front with a smaller amount of computation. A sound field forming apparatus includes a listener position acquisition section configured to acquire listener positional information indicating a position of a listener, a drive speaker selection section configured to select, on the basis of the listener positional information, one or a plurality of speakers from among the speakers configuring a speaker array as drive speakers used to form a sound field, and a drive signal generation section configured to generate, in accordance with the selection result, a speaker drive signal that drives the drive speakers to form the sound field. The present technology can be applied to the sound field forming apparatus.
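Selecting drive speakers from listener position can be sketched with a simple nearest-speaker rule. The abstract does not state the selection criterion, so choosing the `count` closest speakers is an assumption, and the function name is hypothetical.

```python
import math


def select_drive_speakers(speaker_positions, listener_position, count):
    """Return the indices of the `count` speakers closest to the listener.

    speaker_positions: list of coordinate tuples, one per array element.
    listener_position: coordinate tuple with the same dimensionality.
    Assumption: Euclidean distance is the selection criterion.
    """
    ranked = sorted(
        range(len(speaker_positions)),
        key=lambda i: math.dist(speaker_positions[i], listener_position),
    )
    # Report the chosen speakers in array order rather than distance order.
    return sorted(ranked[:count])
```

Driving only a subset of the array is one way the amount of computation could shrink, since drive signals are generated for the selected speakers alone.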