Patent classifications
H04S2420/07
STEREO-BASED IMMERSIVE CODING
Disclosed is an audio codec that represents an immersive signal by a two-channel stereo signal that is a stereo rendering of the immersive signal and directional parameters. The directional parameters may be based on a perceptual model describing the direction of virtual speaker pairs to recreate the perceived location of dominant sounds. Audio processing at the decoder may be performed on the stereo signal in the frequency domain for multiple channel pairs using time-frequency tiles. Spatial localization of the audio signals may use a panning approach by applying weightings to the time-frequency tiles of the stereo signal for each output channel pair. The weightings for the time-frequency tiles may be derived based on the directional parameters, an analysis of the stereo signal, and the output channel layout. The weightings may be used to adaptively process the time-frequency tiles using a decorrelator to reduce or minimize spectral distortions from spatial rendering.
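The panning step described above can be illustrated with a minimal sketch. It assumes a constant-power sine/cosine panning law, a single output pair at +/-30 degrees, and one decoded direction per tile; none of these specifics come from the abstract, and the function names `pan_gains` and `render_pair` are hypothetical. The decorrelator stage is not modeled.

```python
import math

def pan_gains(azimuth_deg, left_deg=30.0, right_deg=-30.0):
    """Constant-power gains (left, right) for one output channel pair.

    The sine/cosine law and the +/-30 degree speaker angles are assumptions
    made for this sketch, not values taken from the abstract.
    """
    # Clamp the direction into the angular span of the speaker pair.
    lo, hi = min(left_deg, right_deg), max(left_deg, right_deg)
    az = max(lo, min(hi, azimuth_deg))
    # Map the direction to a position in [0, 1]: 0 = left, 1 = right.
    position = (left_deg - az) / (left_deg - right_deg)
    theta = position * math.pi / 2.0
    return math.cos(theta), math.sin(theta)

def render_pair(tiles, azimuths_deg):
    """Weight complex time-frequency tiles for one channel pair.

    tiles[t][k] is the complex tile at frame t and frequency bin k;
    azimuths_deg[t][k] is the (hypothetical) decoded direction for that tile.
    """
    left, right = [], []
    for tile_row, az_row in zip(tiles, azimuths_deg):
        gains = [pan_gains(az) for az in az_row]
        left.append([g[0] * x for g, x in zip(gains, tile_row)])
        right.append([g[1] * x for g, x in zip(gains, tile_row)])
    return left, right
```

Because the two gains are the cosine and sine of the same angle, their squares sum to one, so the total power of each tile is preserved wherever it is panned within the pair.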
Audio signal processing method and apparatus
The present invention relates to a method and an apparatus for processing an audio signal, and more particularly to a method and an apparatus that synthesize an object signal and a channel signal and effectively perform binaural rendering of the synthesized signal. To this end, a method for processing an audio signal is provided, which includes: receiving an input audio signal including a multi-channel signal; receiving truncated subband filter coefficients for filtering the input audio signal, the truncated subband filter coefficients being at least some of the subband filter coefficients obtained from binaural room impulse response (BRIR) filter coefficients for binaural filtering of the input audio signal, and the length of the truncated subband filter coefficients being determined based on filter order information obtained at least partially from reverberation time information extracted from the corresponding subband filter coefficients; obtaining vector information indicating the BRIR filter coefficients corresponding to each channel of the input audio signal; and filtering each subband signal of the multi-channel signal, based on the vector information, by using the truncated subband filter coefficients corresponding to the relevant channel and subband. An apparatus for processing an audio signal using the same method is also provided.
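As a rough illustration of choosing a truncation length from decay behavior, the sketch below keeps only the leading taps of one subband filter until the remaining tail energy falls below a threshold. The energy-decay criterion, the `decay_db` parameter, and the function name `truncation_length` are assumptions standing in for the reverberation-time-based filter order information; the abstract does not specify how that information is computed.

```python
def truncation_length(coeffs, decay_db=-60.0):
    """Return how many leading taps to keep for one subband filter.

    Taps are kept until the energy remaining in the tail drops to
    `decay_db` (relative to the total energy) or below. This threshold
    rule is an assumption of this sketch.
    """
    energies = [abs(c) ** 2 for c in coeffs]
    total = sum(energies)
    if total == 0.0:
        return 0
    threshold = total * 10.0 ** (decay_db / 10.0)
    remaining = total  # energy from tap n to the end of the filter
    for n, energy in enumerate(energies):
        if remaining <= threshold:
            return n
        remaining -= energy
    return len(coeffs)

def truncate(coeffs, decay_db=-60.0):
    """Return the truncated copy of one subband filter."""
    return coeffs[:truncation_length(coeffs, decay_db)]
```

A subband whose response decays quickly ends up with a short filter, while a reverberant subband keeps more taps, which is the general effect the abstract attributes to the filter order information.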
Method, apparatus or systems for processing audio objects
Diffuse or spatially large audio objects may be identified for special processing. A decorrelation process may be performed on audio signals corresponding to the large audio objects to produce decorrelated large audio object audio signals. These decorrelated large audio object audio signals may be associated with object locations, which may be stationary or time-varying locations. For example, the decorrelated large audio object audio signals may be rendered to virtual or actual speaker locations. The output of such a rendering process may be input to a scene simplification process. The decorrelation, associating and/or scene simplification processes may be performed prior to a process of encoding the audio data.
Subband spatial processing and crosstalk cancellation system for conferencing
Embodiments relate to providing a conference for client devices with spatialized audio. Input audio streams are received from the client devices. For each client device, placement data defining spatial locations of other client devices within a sound field is determined. A mixed stream including a left mixed channel and a right mixed channel for the client device is generated by mixing and panning input audio streams of the other client devices according to the placement data. A spatially enhanced stream including a left enhanced channel for a left speaker and a right enhanced channel for a right speaker is generated by applying subband spatial processing and crosstalk processing on the left mixed channel and the right mixed channel of the mixed stream.
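The mixing and panning step can be sketched as follows, under the assumption that the placement data reduces to one pan value per remote client in the range -1 (full left) to +1 (full right) and that a constant-power panning law is used. Both are illustrative choices, and `mix_for_client` is a hypothetical name. The subband spatial processing and crosstalk processing stages are not shown.

```python
import math

def mix_for_client(client_id, streams, placements):
    """Build the left/right mixed channels heard by one client.

    streams:    {client id: list of mono samples}
    placements: {other client id: pan in [-1.0, 1.0]} for this listener
                (an assumed, simplified form of the placement data)
    """
    others = {cid: s for cid, s in streams.items() if cid != client_id}
    length = max((len(s) for s in others.values()), default=0)
    left = [0.0] * length
    right = [0.0] * length
    for cid, samples in others.items():
        # Constant-power pan: -1 maps to angle 0, +1 maps to pi/2.
        theta = (placements[cid] + 1.0) * math.pi / 4.0
        gain_left, gain_right = math.cos(theta), math.sin(theta)
        for i, x in enumerate(samples):
            left[i] += gain_left * x
            right[i] += gain_right * x
    return left, right
```

Each client gets its own mix because its own stream is excluded and its placement data may differ from that of every other listener.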
METHODS AND SYSTEMS FOR DESIGNING AND APPLYING NUMERICALLY OPTIMIZED BINAURAL ROOM IMPULSE RESPONSES
Methods and systems for designing binaural room impulse responses (BRIRs) for use in headphone virtualizers, and methods and systems for generating a binaural signal in response to a set of channels of a multi-channel audio signal, including by applying a BRIR to each channel of the set, thereby generating filtered signals, and combining the filtered signals to generate the binaural signal, where each BRIR has been designed in accordance with an embodiment of the design method. Other aspects are audio processing units configured to perform any embodiment of the inventive method. In accordance with some embodiments, BRIR design is formulated as a numerical optimization problem based on a simulation model (which generates candidate BRIRs) and at least one objective function (which evaluates each candidate BRIR), and includes identification of a best one of the candidate BRIRs as indicated by performance metrics determined for the candidate BRIRs by each objective function.
Facilitating Calibration of an Audio Playback Device
Example techniques facilitate calibration of a playback device. An example implementation involves a computing device capturing, via a microphone, data representing multiple iterations of a calibration sound as played by a playback device. The computing device identifies multiple sections within the captured data. Two or more sections represent respective iterations of the calibration sound as played by the playback device. Based on the multiple identified sections, the computing device determines a frequency response of the playback device, the frequency response of the playback device representing audio output by the playback device and acoustic characteristics of an environment around the playback device. Based on the frequency response of the playback device and a target frequency response, the computing device determines one or more parameters of an audio processing algorithm and sends, to the playback device, the one or more parameters of the audio processing algorithm.
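One simple way to picture the last two steps is the sketch below: per-band magnitudes from the identified sections are averaged into a measured response, and the difference from the target gives per-band correction gains. The band-wise averaging in dB, the `max_boost_db` limit, and the function name `correction_gains_db` are assumptions for illustration; the abstract does not say how the parameters are derived.

```python
def correction_gains_db(section_responses_db, target_db, max_boost_db=6.0):
    """Derive per-band correction gains (in dB) from repeated measurements.

    section_responses_db: one list of per-band magnitudes (dB) per
                          identified section of the captured data
    target_db:            target magnitude (dB) for each band
    max_boost_db:         assumed safety limit on boost, not from the source
    """
    count = len(section_responses_db)
    # Average the sections band by band to estimate the measured response.
    measured = [sum(band) / count for band in zip(*section_responses_db)]
    # Gain needed to reach the target, with boost capped at max_boost_db.
    return [min(t - m, max_boost_db) for t, m in zip(target_db, measured)]
```

Averaging over several iterations of the calibration sound reduces the influence of any single noisy capture on the resulting parameters.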
Spatial Audio Processing
According to an example embodiment, a technique for spatial audio processing is provided, the technique including: determining at least one spatial parameter based, at least partially, on at least one input audio signal captured with at least one first device, configured to represent at least a portion of an audio scene; identifying a portion of interest of the audio scene based, at least partially, on the at least one spatial parameter; generating at least one first audio signal based, at least partially, on the at least one input audio signal; generating at least one second audio signal based, at least partially, on at least one audio signal captured with at least one second device; and combining, at least partially, the at least one first audio signal and the at least one second audio signal into at least one combined audio signal.
APPARATUS AND METHOD FOR AUDIO ANALYSIS
An apparatus comprises a receiver (201) receiving a multi-channel audio signal representing audio for a scene. An extractor (203) extracts at least a first directional audio component by applying spatial filtering to the multi-channel audio signal, where the spatial filtering is dependent on the multi-channel audio signal. A feature processor (205) determines a set of features for the first directional audio component, and a categorizer (207) determines a first audio source category out of a plurality of audio source categories for the first directional audio component in response to the set of features. An assigner (209) assigns a first audio source property to the first directional audio component from a set of audio source properties for the first audio source category. The apparatus may provide advantageous categorization and characterization of individual audio sources or components present in a multi-channel signal. This may be useful, for example, for visualization of audio events.
SYSTEMS AND METHODS FOR PROVIDING AUGMENTED AUDIO
A system for providing augmented spatialized audio in a vehicle, including a plurality of speakers disposed in a perimeter of a cabin of the vehicle; and a controller configured to receive a position signal indicative of the position of a first user's head in the vehicle and to output to a first binaural device, according to the position signal, a first spatial audio signal, such that the first binaural device produces a first spatial acoustic signal perceived by the first user as originating from a first virtual source location within the vehicle cabin, wherein the first spatial audio signal comprises at least an upper range of a first content signal, and wherein the controller is further configured to drive the plurality of speakers with a driving signal such that a first bass content of the first content signal is produced in the vehicle cabin.
Decoding of audio scenes
Exemplary embodiments provide encoding and decoding methods, and associated encoders and decoders, for encoding and decoding of an audio scene which is represented by one or more audio signals. The encoder generates a bit stream which comprises downmix signals and side information which includes individual matrix elements of a reconstruction matrix which enables reconstruction of the one or more audio signals in the decoder.
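On the decoder side, applying a reconstruction matrix to downmix signals amounts to a matrix product. The sketch below assumes a single, time-invariant matrix applied to full-band signals stored as plain lists; the function name `reconstruct` and this data layout are illustrative assumptions, and the bit stream parsing that would supply the matrix elements is not shown.

```python
def reconstruct(matrix, downmix):
    """Approximate N audio signals from M downmix signals.

    matrix:  N rows of M coefficients (the reconstruction matrix elements
             that the side information is said to carry)
    downmix: M signals, each a list of T samples
    Returns N signals of T samples: output[n][t] = sum_m matrix[n][m] * downmix[m][t].
    """
    num_samples = len(downmix[0]) if downmix else 0
    return [
        [sum(c * d[t] for c, d in zip(row, downmix)) for t in range(num_samples)]
        for row in matrix
    ]
```

For instance, with two downmix signals and a three-row matrix whose last row is `[0.5, 0.5]`, the third reconstructed signal is the average of the two downmix signals.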