Patent classifications
H04S2420/07
METHOD FOR AND APPARATUS FOR DECODING/RENDERING AN AMBISONICS AUDIO SOUNDFIELD REPRESENTATION FOR AUDIO PLAYBACK USING 2D SETUPS
Improved methods and/or apparatus for decoding an encoded audio signal in soundfield format for L loudspeakers. The method and/or apparatus can render an Ambisonics format audio signal to 2D loudspeaker setup(s) based on a rendering matrix. The rendering matrix has elements based on loudspeaker positions and is determined by weighting at least one element of a first matrix with a weighting factor.
The first matrix is determined based on positions of the L loudspeakers and at least a virtual position of at least a virtual loudspeaker that is added to the positions of the L loudspeakers.
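The construction described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the first matrix is computed or what the weighting factor is, so the following are assumptions made only for illustration: a first-order sampling (projection) decoder, a single virtual loudspeaker placed overhead, and a weight of 1/sqrt(L). The function names are hypothetical.

```python
import math

def first_order_sh(azimuth, elevation):
    """First-order real spherical harmonics in W, Y, Z, X order (angles in radians)."""
    ce = math.cos(elevation)
    return [1.0, math.sin(azimuth) * ce, math.sin(elevation), math.cos(azimuth) * ce]

def rendering_matrix(speaker_dirs, virtual_dirs, weight):
    """Build an L x 4 rendering matrix from a first matrix that also covers virtual positions.

    speaker_dirs / virtual_dirs: lists of (azimuth, elevation) tuples.
    weight: weighting factor applied to the virtual-loudspeaker rows (assumed form).
    """
    all_dirs = list(speaker_dirs) + list(virtual_dirs)
    n = len(all_dirs)
    # First matrix: one row per real or virtual loudspeaker.
    # Assumption: a plain sampling decoder (harmonics divided by the speaker count).
    first = [[c / n for c in first_order_sh(az, el)] for az, el in all_dirs]
    num_real = len(speaker_dirs)
    rendered = [row[:] for row in first[:num_real]]
    # Weight each virtual row and fold it into every real loudspeaker row.
    for virtual_row in first[num_real:]:
        for row in rendered:
            for j, coeff in enumerate(virtual_row):
                row[j] += weight * coeff
    return rendered

# Square 2D layout plus one virtual loudspeaker directly overhead.
square = [(math.radians(a), 0.0) for a in (45, 135, 225, 315)]
overhead = [(0.0, math.pi / 2)]
D = rendering_matrix(square, overhead, weight=1.0 / math.sqrt(len(square)))
```

In this sketch the height component (the Z column), which a purely horizontal layout would otherwise discard, is spread over the real loudspeakers through the weighted virtual row.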
AUDIO DECODER AND DECODING METHOD
A method for representing a second presentation of audio channels or objects as a data stream, the method comprising the steps of: (a) providing a set of base signals, the base signals representing a first presentation of the audio channels or objects; (b) providing a set of transformation parameters, the transformation parameters intended to transform the first presentation into the second presentation; the transformation parameters further being specified for at least two frequency bands and including a set of multi-tap convolution matrix parameters for at least one of the frequency bands.
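One way to read steps (a) and (b) is as a per-band matrix convolution: each tap is a mixing matrix applied to a delayed copy of the base signals. The sketch below assumes that reading. The data layout (a dict keyed by band, with lists of per-tap matrices) and the function names are hypothetical and are not taken from the patent.

```python
def apply_convolution_matrix(base, taps):
    """Apply a multi-tap convolution matrix to the base signals of one band.

    base: list of input channels, each a list of samples for this band.
    taps: list of matrices, taps[t][out][in]; tap t acts on the input delayed by t samples.
    A single-tap list reduces to an ordinary mixing matrix.
    """
    num_in = len(base)
    length = len(base[0])
    num_out = len(taps[0])
    out = [[0.0] * length for _ in range(num_out)]
    for t, matrix in enumerate(taps):
        for o in range(num_out):
            for i in range(num_in):
                coeff = matrix[o][i]
                if coeff == 0:
                    continue
                for k in range(t, length):
                    out[o][k] += coeff * base[i][k - t]
    return out

def transform_presentation(base_bands, params):
    """Transform the first presentation into the second, band by band."""
    return {band: apply_convolution_matrix(base_bands[band], params[band])
            for band in base_bands}

# Two bands: band 0 uses two taps, band 1 uses a single-tap (plain) matrix.
base_bands = {0: [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
              1: [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]}
params = {0: [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 0.5], [0.0, 0.0]]],
          1: [[[2.0, 0.0], [0.0, 2.0]]]}
second = transform_presentation(base_bands, params)
```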
SPATIAL AUDIO PROCESSING
According to an example embodiment, a method for processing a multi-channel input audio signal representing a sound field into a multi-channel output audio signal representing said sound field in accordance with a predefined loudspeaker layout is provided, the method comprising the following for at least one frequency band: obtaining spatial audio parameters that are descriptive of spatial characteristics of said sound field; estimating a signal energy of the sound field represented by the multi-channel input audio signal; estimating, based on said signal energy and the obtained spatial audio parameters, respective output signal energies for channels of the multi-channel output audio signal according to said predefined loudspeaker layout; determining a maximum output energy as the largest of the output signal energies across channels of said multi-channel output audio signal; and deriving, on basis of said maximum output energy, a gain value for adjusting sound reproduction gain in at least one of said channels of the multi-channel output audio signal.
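The sequence of steps in this abstract (estimate the input energy, predict per-channel output energies, take the maximum, derive a gain) can be sketched as follows for one frequency band. The abstract does not specify the spatial parameters or the gain rule. The sketch assumes a direct-to-total energy ratio plus per-channel panning gains as the spatial parameters, and a gain that scales the loudest channel down to a hypothetical `max_energy` ceiling.

```python
import math

def limiter_gain(input_band, panning_gains, direct_ratio, max_energy):
    """Derive a reproduction gain for one frequency band.

    input_band: list of input channels (lists of samples) for this band.
    panning_gains: assumed amplitude gain of the direct sound in each output channel.
    direct_ratio: assumed direct-to-total energy ratio in [0, 1].
    max_energy: hypothetical ceiling for the energy of any single output channel.
    """
    # Signal energy of the sound field in this band.
    energy = sum(s * s for channel in input_band for s in channel)
    num_out = len(panning_gains)
    # Assumption: diffuse energy is spread evenly over the output channels.
    diffuse_share = (1.0 - direct_ratio) * energy / num_out
    out_energies = [direct_ratio * energy * g * g + diffuse_share
                    for g in panning_gains]
    # Maximum output energy across channels.
    peak = max(out_energies)
    if peak <= max_energy:
        return 1.0
    # Amplitude gain that brings the loudest channel down to the ceiling.
    return math.sqrt(max_energy / peak)

band = [[1.0, 1.0], [1.0, 1.0]]           # total energy 4
gain = limiter_gain(band, [1.0, 0.0], direct_ratio=0.5, max_energy=0.75)
```

Here the predicted output energies are 3 and 1, so the gain is driven by the first channel only.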
METHODS AND SYSTEMS FOR AUDIO SIGNAL FILTERING
Systems and methods for rendering audio signals are disclosed. In some embodiments, a method may receive an input signal including a first portion and a second portion. A first processing stage comprising a first filter is applied to the first portion to generate a first filtered signal. A second processing stage comprising a second filter is applied to the first portion to generate a second filtered signal. A third processing stage comprising a third filter is applied to the second portion to generate a third filtered signal. A fourth processing stage comprising a fourth filter is applied to the second portion to generate a fourth filtered signal. A first output signal is determined based on a sum of the first filtered signal and the third filtered signal. A second output signal is determined based on a sum of the second filtered signal and the fourth filtered signal. The first output signal is presented to a first ear of a user of a virtual environment, and the second output signal is presented to a second ear of the user. The first portion of the input signal corresponds to a first location in the virtual environment, and the second portion of the input signal corresponds to a second location in the virtual environment.
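The four-stage signal flow in this abstract maps two input portions onto two ear signals. A minimal sketch follows, assuming each processing stage is a short FIR filter. The abstract does not state the filter type, and the coefficients below are placeholders rather than measured responses.

```python
def fir(signal, coeffs):
    """Direct-form FIR filter; the output has the same length as the input."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                out[n] += c * signal[n - k]
    return out

def render_two_portions(first, second, f1, f2, f3, f4):
    """Filter each portion once per ear and sum the results per ear.

    f1 and f2 act on the first portion, f3 and f4 on the second portion.
    Returns (first_ear_signal, second_ear_signal).
    """
    first_ear = [a + b for a, b in zip(fir(first, f1), fir(second, f3))]
    second_ear = [a + b for a, b in zip(fir(first, f2), fir(second, f4))]
    return first_ear, second_ear

# Placeholder filters: the second portion reaches the first ear delayed and attenuated.
left, right = render_two_portions(
    first=[1.0, 0.0, 0.0], second=[0.0, 1.0, 0.0],
    f1=[1.0], f2=[0.5], f3=[0.0, 0.25], f4=[1.0])
```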
Object-based Audio Spatializer
A 3D sound spatializer provides delay-compensated HRTF interpolation techniques and efficient cross-fading between current and delayed HRTF filter results to mitigate artifacts caused by interpolation between HRTF filters and the use of time-varying HRTF filters.
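The cross-fading idea can be illustrated in isolation. The sketch below assumes a linear ramp over one block, fading from the output of the earlier HRTF filter to the output of the current one. The ramp shape and block handling are assumptions, and the delay-compensated interpolation itself is not modelled.

```python
def crossfade(previous_result, current_result):
    """Linearly fade from the previous filter's output to the current filter's output.

    Both arguments are equally long blocks of already-filtered samples.
    """
    n = len(previous_result)
    out = []
    for i in range(n):
        # Weight of the current result rises from 0 to 1 across the block.
        w = i / (n - 1) if n > 1 else 1.0
        out.append((1.0 - w) * previous_result[i] + w * current_result[i])
    return out

faded = crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
```

Fading between the two filter outputs, rather than switching filters abruptly at a block boundary, avoids a step discontinuity in the output signal.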
Distributed audio capture and mixing controlling
An apparatus including a processor configured to determine a position for at least one sound source relative to a reference position, and a position for a sound source tracker relative to the reference position. The processor is further configured to determine a direction associated with the sound source tracker, and to select the at least one sound source based on an analysis of the direction associated with the sound source tracker, the position for the at least one sound source and the position of the sound source tracker. The processor is further configured to receive at least one control interaction associated with the selected at least one sound source from at least one controller, process at least one audio signal associated with the selected sound source based on the control interaction and output the processed at least one audio signal to be rendered.
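The selection step can be sketched as a bearing comparison in 2D. The abstract leaves the analysis unspecified, so the sketch assumes that the selected source is the one whose bearing from the tracker lies closest to the tracker's pointing direction, within a hypothetical tolerance `max_angle_deg`. All names are illustrative.

```python
import math

def select_source(sources, tracker_pos, tracker_dir_deg, max_angle_deg=15.0):
    """Pick the source the tracker points at, or None if nothing is within tolerance.

    sources: dict mapping a source name to its (x, y) position relative to the reference.
    tracker_pos: (x, y) position of the tracker relative to the same reference.
    tracker_dir_deg: pointing direction of the tracker, in degrees.
    """
    tx, ty = tracker_pos
    best_name, best_error = None, max_angle_deg
    for name, (x, y) in sources.items():
        bearing = math.degrees(math.atan2(y - ty, x - tx))
        # Wrap the angular difference into [-180, 180) before taking its magnitude.
        error = abs((bearing - tracker_dir_deg + 180.0) % 360.0 - 180.0)
        if error <= best_error:
            best_name, best_error = name, error
    return best_name

sources = {"vocals": (1.0, 0.0), "drums": (0.0, 1.0)}
picked = select_source(sources, tracker_pos=(0.0, 0.0), tracker_dir_deg=80.0)
```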
Parametric Spatial Audio Rendering with Near-Field Effect
An apparatus including circuitry configured to: obtain two or more audio signals, wherein each audio signal is associated with a microphone array; obtain at least one value associated with an inter-channel difference based on the two or more audio signals; obtain at least one parameter value associated with the two or more audio signals; obtain at least one value associated with an inter-aural difference based at least on the at least one parameter value; generate at least two output audio signals by controlling inter-aural level differences of the generated at least two output audio signals based on the at least one value associated with the inter-channel difference and the at least one value associated with the inter-aural difference, such that sounds nearer to the microphone array are reproduced with a higher inter-aural difference at the at least two output audio signals.
METHOD, APPARATUS OR SYSTEMS FOR PROCESSING AUDIO OBJECTS
Diffuse or spatially large audio objects may be identified for special processing. A decorrelation process may be performed on audio signals corresponding to the large audio objects to produce decorrelated large audio object audio signals. These decorrelated large audio object audio signals may be associated with object locations, which may be stationary or time-varying locations. For example, the decorrelated large audio object audio signals may be rendered to virtual or actual speaker locations. The output of such a rendering process may be input to a scene simplification process. The decorrelation, associating and/or scene simplification processes may be performed prior to a process of encoding the audio data.