Patent classifications
H04S2420/11
RENDERING BINAURAL AUDIO OVER MULTIPLE NEAR FIELD TRANSDUCERS
An apparatus and method of rendering audio. A binaural signal is split on an amplitude weighting basis into a front binaural signal and a rear binaural signal, based on perceived position information of the audio. In this manner, the front-back differentiation of the binaural signal is improved.
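The amplitude-weighted split described above can be sketched as follows. This is a minimal illustration, not the patented method: the sine/cosine weighting law, the azimuth convention (0 degrees ahead, 180 degrees behind), and the names `front_rear_weights` and `split_binaural` are all assumptions introduced here.

```python
import math

def front_rear_weights(azimuth_deg):
    """Return (front, rear) amplitude weights for a perceived azimuth.

    Assumed convention: 0 degrees is directly ahead, 180 is directly behind.
    The sine/cosine law is a hypothetical choice that keeps total power constant.
    """
    # Fold the azimuth into [0, 180] so left and right are treated alike.
    az = abs(((azimuth_deg + 180.0) % 360.0) - 180.0)
    half = math.radians(az) / 2.0
    return math.cos(half), math.sin(half)

def split_binaural(left, right, azimuth_deg):
    """Split one binaural (left, right) pair into front and rear binaural pairs."""
    w_front, w_rear = front_rear_weights(azimuth_deg)
    front = ([w_front * s for s in left], [w_front * s for s in right])
    rear = ([w_rear * s for s in left], [w_rear * s for s in right])
    return front, rear
```

With this weighting, a source perceived straight ahead goes entirely to the front signal, a source directly behind goes entirely to the rear signal, and a source at the side is shared equally between the two.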
METHOD AND APPARATUS FOR DECODING A BITSTREAM INCLUDING ENCODED HIGHER ORDER AMBISONICS REPRESENTATIONS
Higher Order Ambisonics represents three-dimensional sound independent of a specific loudspeaker set-up. However, transmission of an HOA representation results in a very high bit rate. Therefore compression with a fixed number of channels is used, in which directional and ambient signal components are processed differently. For coding, portions of the original HOA representation are predicted from the directional signal components. This prediction provides side information which is required for a corresponding decoding. By using some additional specific purpose bits, a known side information coding processing is improved in that the required number of bits for coding that side information is reduced on average.
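To give a sense of why an uncompressed HOA representation is costly to transmit, the sketch below computes its raw bit rate. It relies on the standard fact that a full three-dimensional representation of order N has (N + 1)^2 coefficient channels; the sample rate and sample width are illustrative assumptions, and the function names are hypothetical.

```python
def hoa_channel_count(order):
    """Number of coefficient channels in a full 3D HOA representation of the given order."""
    return (order + 1) ** 2

def raw_bit_rate(order, sample_rate_hz=48000, bits_per_sample=32):
    """Uncompressed bit rate in bits per second (assumed PCM-style coefficient samples)."""
    return hoa_channel_count(order) * sample_rate_hz * bits_per_sample
```

Under these assumptions, a fourth-order representation has 25 channels and needs 38.4 Mbit/s before any compression.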
Method and apparatus for processing multimedia signals
The present invention relates to a method and an apparatus for processing a signal, which are used for effectively reproducing a multimedia signal, and more particularly, to a method and an apparatus for processing a signal, which are used for implementing filtering for a multimedia signal having a plurality of subbands with a low calculation amount. To this end, provided are a method for processing a multimedia signal including: receiving a multimedia signal having a plurality of subbands; receiving at least one set of prototype filter coefficients for filtering each subband signal of the multimedia signal; converting the prototype filter coefficients into a plurality of sets of subband filter coefficients; truncating each set of subband filter coefficients based on filter order information obtained by at least partially using characteristic information extracted from the corresponding subband filter coefficients, the length of at least one set of truncated subband filter coefficients being different from the length of the truncated subband filter coefficients of another subband; and filtering the multimedia signal by using the truncated subband filter coefficients corresponding to each subband signal, and an apparatus for processing a multimedia signal using the same.
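The per-subband truncation step can be sketched as below. The abstract does not say which characteristic information determines the filter order, so this example assumes cumulative energy: each subband filter is cut at the shortest length that retains a given fraction of its energy. The names `truncation_length` and `truncate_subband_filters` and the 0.99 default are hypothetical.

```python
def truncation_length(coeffs, energy_fraction=0.99):
    """Shortest prefix length that keeps `energy_fraction` of the filter energy.

    Assumption: cumulative energy stands in for the "characteristic
    information" mentioned in the abstract.
    """
    total = sum(abs(c) ** 2 for c in coeffs)
    if total == 0.0:
        return 0
    running = 0.0
    for n, c in enumerate(coeffs, start=1):
        running += abs(c) ** 2
        if running >= energy_fraction * total:
            return n
    return len(coeffs)

def truncate_subband_filters(subband_filters, energy_fraction=0.99):
    """Truncate each subband filter independently, so lengths may differ per subband."""
    return [f[:truncation_length(f, energy_fraction)] for f in subband_filters]
```

A filter whose energy decays quickly ends up much shorter than one whose energy is spread out, which is where the saving in computation would come from.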
Apparatus, a method and a computer program for delivering audio scene entities
In an example embodiment, a method, an apparatus, and a computer program product are provided. The apparatus includes at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to perform: assign one or more audio representations to one or more audio scene entities in an audio scene; generate one or more audio scene entity combinations based on the one or more audio scene entities and the one or more audio representations; and signal the one or more audio scene entity combinations to a client, wherein the one or more audio representations assigned to the one or more audio scene entities cause the client to select an appropriate audio scene entity combination from the one or more audio scene entity combinations to render the audio scene.
Music collection navigation device and method
An audio navigation device comprising an input means for inputting two or more audio pieces into the navigation device; a spatialization means for allocating a position in the form of a unique spatial coordinate to each audio piece and arranging the audio pieces in a multi-dimensional arrangement; a generating means for generating a binaural audio output for each audio piece, wherein the audio output simulates sounds that would be made by one or more physical sources located at the given position of each audio piece; an output means for simultaneously outputting multiple audio pieces as binaural audio output to a user; a navigation means for enabling a user to navigate around the audio outputs in the multi-dimensional arrangement; and a selection means for allowing a user to select a single audio output.
Adjusting Spatial Congruency in a Video Conferencing System
Example embodiments disclosed herein relate to spatial congruency adjustment. A method for adjusting spatial congruency in a video conference is disclosed. The method includes unwarping a visual scene captured by a video endpoint device into at least one rectilinear scene, the video endpoint device being configured to capture the visual scene in an omnidirectional manner, and detecting spatial congruency between the at least one rectilinear scene and an auditory scene captured by an audio endpoint device that is positioned in relation to the video endpoint device. The spatial congruency is a degree of alignment between the auditory scene and the at least one rectilinear scene. In response to the detected spatial congruency being below a threshold, the spatial congruency is adjusted. Corresponding system and computer program products are also disclosed.
AUDIO RENDERING USING 6-DOF TRACKING
The methods and apparatus described herein optimally represent full 3D audio mixes (e.g., azimuth, elevation, and depth) as “sound scenes” in which the decoding process facilitates head tracking. Sound scene rendering can be performed for the listener's orientation (e.g., yaw, pitch, roll) and 3D position (e.g., x, y, z), and can be modified for a change in the listener's orientation or 3D position. As described below, rendering an audio object in both the near-field and far-field makes it possible to fully render the depth not just of objects, but of any spatial audio mix decoded with active steering/panning, such as Ambisonics, matrix encoding, etc., thereby enabling full translational head tracking (e.g., user movement) beyond simple rotation in the horizontal plane, or 6-degrees-of-freedom (6-DOF) tracking and rendering.
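For contrast with the full 6-DOF case, the "simple rotation in the horizontal plane" mentioned above can be sketched for one first-order Ambisonics sample. This is a generic yaw compensation, not the method of the abstract. It assumes X points forward, Y points left, and positive yaw means the listener turns to the left; the function name is hypothetical.

```python
import math

def rotate_foa_for_yaw(w, x, y, z, head_yaw_deg):
    """Counter-rotate one first-order Ambisonics sample (W, X, Y, Z) for listener yaw.

    Assumed conventions: X forward, Y left, positive yaw = head turns left.
    The scene is rotated by the opposite angle so sources stay fixed in the room.
    """
    phi = -math.radians(head_yaw_deg)
    c, s = math.cos(phi), math.sin(phi)
    # W (omnidirectional) and Z (vertical) are unaffected by rotation about the vertical axis.
    return w, x * c - y * s, x * s + y * c, z
```

Under these conventions, when the listener turns 90 degrees to the left, a source that was straight ahead moves to the listener's right. Translation in x, y, z is not handled by a rotation of this kind.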
Audio Processing Method and Apparatus
An audio processing method includes processing, by M first virtual speakers, a to-be-processed audio signal to obtain M first audio signals; processing, by N second virtual speakers, the to-be-processed audio signal to obtain N second audio signals; obtaining M first head-related transfer functions (HRTFs) centered at a left ear position and N second HRTFs centered at a right ear position; obtaining a first target audio signal based on the M first audio signals and the M first HRTFs; and obtaining a second target audio signal based on the N second audio signals and the N second HRTFs.
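One common way to combine virtual speaker signals with HRTFs is to convolve each signal with the time-domain impulse response for its speaker and sum the results per ear. The abstract does not state that this is how the target signals are obtained, so the sketch below is an assumption, and the names `convolve` and `render_ear` are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_ear(speaker_signals, hrirs):
    """Sum the per-speaker convolutions to form one ear's target signal.

    `hrirs` holds one impulse response per virtual speaker (assumed to be the
    time-domain form of the HRTFs for that ear).
    """
    if len(speaker_signals) != len(hrirs):
        raise ValueError("need one impulse response per virtual speaker")
    if not speaker_signals:
        return []
    filtered = [convolve(s, h) for s, h in zip(speaker_signals, hrirs)]
    mix = [0.0] * max(len(f) for f in filtered)
    for f in filtered:
        for k, v in enumerate(f):
            mix[k] += v
    return mix
```

Calling `render_ear` once with the M first audio signals and the left-ear responses, and once with the N second audio signals and the right-ear responses, would give the two target signals under this assumption.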
Spatial Audio Representation and Rendering
An apparatus including circuitry configured to: receive a spatial audio signal, the spatial audio signal including at least one audio signal and associated spatial metadata; generate at least one decorrelated audio signal; determine at least one control parameter, wherein the at least one control parameter is at least based on at least one target further property of at least two output audio signals and at least one of: the spatial metadata and at least one property determined based on the at least one audio signal; and generate the at least two output audio signals based on the spatial audio signal and the at least one decorrelated audio signal, wherein the amount of the at least one decorrelated audio signal within the at least two output audio signals is controlled based on the at least one control parameter.
Higher order ambisonics signal compression
Systems and techniques for compression and decoding of audio data are generally disclosed. An example device for compressing higher order ambisonic (HOA) coefficients representative of a soundfield includes a memory configured to store audio data and one or more processors configured to: determine when to use ambient HOA coefficients of the HOA coefficients to augment one or more foreground audio objects obtained through decomposition of the HOA coefficients based on one or more singular values also obtained through the decomposition of the HOA coefficients, the ambient HOA coefficients representative of an ambient component of the soundfield.