Patent classifications
H04S2420/03
THE MERGING OF SPATIAL AUDIO PARAMETERS
There is inter alia disclosed an apparatus for spatial audio encoding comprising: means for determining at least two of a type of spatial audio parameter for one or more audio signals, wherein a first of the type of spatial audio parameter is associated with a first group of samples in a domain of the one or more audio signals and a second of the type of spatial audio parameter is associated with a second group of samples in the domain of the one or more audio signals; and means for merging the first of the type of spatial audio parameter and the second of the type of spatial audio parameter into a merged spatial audio parameter.
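One way to picture the merging step is an energy-weighted combination of two per-group parameters. The sketch below merges two azimuth direction parameters (a common type of spatial audio parameter) from two sample groups via an energy-weighted circular mean; the weighting rule and the choice of azimuth as the parameter type are illustrative assumptions, not the claimed method.

```python
import math

def merge_direction_params(azi1_deg, energy1, azi2_deg, energy2):
    """Merge two azimuth parameters (degrees) from two sample groups
    into one, weighting each by its group's signal energy.
    A circular mean avoids wrap-around errors near +/-180 degrees."""
    a1, a2 = math.radians(azi1_deg), math.radians(azi2_deg)
    # Sum energy-weighted unit vectors, then take the resultant angle.
    x = energy1 * math.cos(a1) + energy2 * math.cos(a2)
    y = energy1 * math.sin(a1) + energy2 * math.sin(a2)
    return math.degrees(math.atan2(y, x))
```

With equal energies, azimuths of +30 and -30 degrees merge to 0; when one group dominates energetically, the merged direction leans toward it.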
Method, medium, and system generating a stereo signal
Surround audio decoding for selectively generating an audio signal from a multi-channel signal. In the surround audio decoding, a down-mixed signal, e.g., as down-mixed by an encoding terminal, is selectively up-mixed to either a stereo signal or a multi-channel signal; the spatial information needed to generate the stereo signal is derived from the spatial information used for up-mixing the down-mixed signal to the multi-channel signal.
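A minimal sketch of this kind of parametric stereo up-mix, assuming the spatial information reduces to a channel level difference (CLD) per band; the gain formula is the standard power-preserving split, not the patent's exact mapping.

```python
import numpy as np

def upmix_to_stereo(downmix, cld_db):
    """Up-mix a mono down-mix to a stereo pair using a channel
    level difference (CLD, in dB) derived from the multi-channel
    spatial information. Gains are chosen so that the summed
    power of left and right equals the down-mix power."""
    ratio = 10.0 ** (cld_db / 10.0)          # left/right power ratio
    g_left = np.sqrt(ratio / (1.0 + ratio))  # power-preserving gains
    g_right = np.sqrt(1.0 / (1.0 + ratio))
    return g_left * downmix, g_right * downmix
```

A CLD of 0 dB yields identical left and right channels at -3 dB each; any CLD preserves total power.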
Spatial Audio Representation and Rendering
An apparatus including circuitry configured to: receive a spatial audio signal, the spatial audio signal including at least one audio signal and associated spatial metadata; generate at least one decorrelated audio signal; determine at least one control parameter, wherein the at least one control parameter is based at least on at least one target further property of the at least two output audio signals and at least one of: the spatial metadata and at least one property determined based on the at least one audio signal; and generate the at least two output audio signals based on the spatial audio signal and the at least one decorrelated audio signal, wherein the amount of the at least one decorrelated audio signal within the at least two output audio signals is controlled based on the at least one control parameter.
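The core operation — controlling how much decorrelated signal enters the output — can be sketched as an energy-preserving crossfade. The assumption that the control parameter maps directly to a wet/dry proportion (e.g. from a diffuseness value in the metadata) is mine, for illustration.

```python
import numpy as np

def mix_with_decorrelation(direct, decorrelated, control):
    """Mix a direct signal with its decorrelated version.
    `control` in [0, 1] sets the proportion of decorrelated
    energy; the sqrt gains keep total energy constant when the
    two signals are uncorrelated."""
    g_wet = np.sqrt(control)
    g_dry = np.sqrt(1.0 - control)
    return g_dry * direct + g_wet * decorrelated
```

At control = 0 the output is purely direct; at control = 1 it is purely decorrelated, with a smooth, energy-neutral transition in between.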
Decorrelator structure for parametric reconstruction of audio signals
An encoding system encodes multiple audio signals (X) as a downmix signal (Y) together with wet and dry upmix coefficients (P, C). In a decoding system, a pre-multiplier (101) computes an intermediate signal (W) by mapping the downmix signal linearly in accordance with a first set of coefficients (Q); a decorrelating section (102) outputs a decorrelated signal (Z) based on the intermediate signal; a wet upmix section (103) computes a wet upmix signal by mapping the decorrelated signal linearly in accordance with the wet upmix coefficients; a dry upmix section (104) computes a dry upmix signal by mapping the downmix signal linearly in accordance with the dry upmix coefficients; a combining section (105) provides a multidimensional reconstructed signal (X) by combining the wet and dry upmix signals; and a converter (106) computes the first set of coefficients based on the wet and dry upmix coefficients and supplies this to the pre-multiplier.
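The decoder signal flow above maps directly onto a few matrix products. This sketch follows the abstract's structure (pre-multiplier, decorrelator, wet and dry upmix, combiner); the matrix shapes and the pluggable `decorrelate` callable are illustrative assumptions.

```python
import numpy as np

def parametric_upmix(Y, Q, P, C, decorrelate):
    """Reconstruct X_hat from the downmix Y:
       W = Q @ Y          (pre-multiplier, 101)
       Z = decorrelate(W) (decorrelating section, 102)
       wet = P @ Z        (wet upmix section, 103)
       dry = C @ Y        (dry upmix section, 104)
       X_hat = wet + dry  (combining section, 105)
    Rows of Y are downmix channels, columns are samples."""
    W = Q @ Y
    Z = decorrelate(W)
    return P @ Z + C @ Y
```

With an identity decorrelator the output reduces to (P @ Q + C) @ Y, which makes the roles of the wet and dry coefficient sets easy to verify numerically.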
ADAPTIVE PANNER OF AUDIO OBJECTS
An audio object including audio content and object metadata is received. The object metadata indicates an object spatial position of the audio object to be rendered by audio speakers in a playback environment. Based on the object spatial position and source spatial positions of the audio speakers, initial gain values for the audio speakers are determined. The initial gain values can be used to select a set of audio speakers from among the audio speakers. Based on the object spatial position and a set of source spatial positions at which the set of audio speakers are respectively located in the playback environment, a set of non-negative optimized gain values for the set of audio speakers is determined. The audio object at the object spatial position is rendered with the set of optimized gain values for the set of audio speakers.
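A simplified stand-in for the two-stage gain computation described above: select a set of speakers from initial distance-based gains, then produce non-negative, power-normalized gains for that set. The inverse-distance law and the nearest-k selection are illustrative assumptions, not the patent's optimization.

```python
import math

def panner_gains(obj_pos, speaker_positions, k=3):
    """Gains for rendering an object at obj_pos over the k nearest
    speakers: initial selection by distance, then non-negative
    gains inversely proportional to distance, power-normalized so
    the squared gains sum to one."""
    dists = [(math.dist(obj_pos, s), i) for i, s in enumerate(speaker_positions)]
    nearest = sorted(dists)[:k]                       # speaker selection
    raw = {i: 1.0 / max(d, 1e-9) for d, i in nearest}  # non-negative gains
    norm = math.sqrt(sum(g * g for g in raw.values()))
    return {i: g / norm for i, g in raw.items()}
```

An object placed exactly on a speaker receives nearly all the energy from that speaker, and the squared gains always sum to one.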
Parametric joint-coding of audio sources
The following coding scenario is addressed: A number of audio source signals need to be transmitted or stored for the purpose of mixing into wave field synthesis, multi-channel surround, or stereo signals after decoding the source signals. The proposed technique offers significant coding gain when jointly coding the source signals, compared to separately coding them, even when no redundancy is present between the source signals. This is possible by considering statistical properties of the source signals, the properties of mixing techniques, and spatial hearing. The sum of the source signals is transmitted, plus the statistical properties of the source signals, which mostly determine the perceptually important spatial cues of the final mixed audio channels. Source signals are recovered at the receiver such that their statistical properties approximate the corresponding properties of the original source signals. Subjective evaluations indicate that high audio quality is achieved by the proposed scheme.
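The transmit-sum-plus-statistics idea can be sketched in a few lines. This toy version uses full-band source powers as the statistical side information and recovers each source as a scaled copy of the sum whose power matches the transmitted statistic; a real system works per subband and with richer statistics.

```python
import numpy as np

def encode(sources):
    """Transmit the sum of the source signals plus each source's
    power (the statistical side information)."""
    s = np.sum(sources, axis=0)
    powers = [float(np.mean(x ** 2)) for x in sources]
    return s, powers

def decode(s, powers):
    """Recover each source as a scaled sum signal whose power
    approximates the transmitted statistic (full-band here for
    brevity; per-subband in practice)."""
    p_sum = float(np.mean(s ** 2)) or 1.0  # guard against a silent sum
    return [np.sqrt(p / p_sum) * s for p in powers]
```

The decoded sources are not waveform-accurate, but their powers — which drive the perceptually important spatial cues after mixing — match the originals.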
Decoding of audio scenes
Exemplary embodiments provide encoding and decoding methods, and associated encoders and decoders, for encoding and decoding of an audio scene which is represented by one or more audio signals. The encoder generates a bit stream which comprises downmix signals and side information which includes individual matrix elements of a reconstruction matrix which enables reconstruction of the one or more audio signals in the decoder.
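Applying the side information amounts to one matrix product. The sketch below assembles the individually transmitted matrix elements into a reconstruction matrix and applies it to the downmix; the row-major element ordering is an assumption for illustration.

```python
import numpy as np

def reconstruct(downmix, matrix_elements, n_signals, n_downmix):
    """Rebuild the audio signals of the scene from the downmix
    channels using the reconstruction-matrix elements carried in
    the bit stream's side information (assumed row-major order).
    downmix: array of shape (n_downmix, n_samples)."""
    R = np.array(matrix_elements).reshape(n_signals, n_downmix)
    return R @ downmix
```

For example, with two downmix channels and elements [1, 0, 0, 1, 1, 1], the third reconstructed signal is the sum of both downmix channels.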
Controlling sounds of individual objects in a video
A method is disclosed for modifying a sound produced by a sound source in a video; the method includes capturing video and audio of a scene. Audio is captured using a microphone array. A sound source is isolated and a direction of arrival of the sound source with respect to a capture location is identified. One or more visual objects in the captured video are identified. One of the isolated sound sources is associated with one of the identified visual objects. An input identifying one of the isolated sound sources is received during playing of the captured video and audio. The input includes a command. Responsive to receiving the input, an attribute of the identified isolated sound source is modified. The input may identify a visual object associated with a sound source. A system and article of manufacture are also disclosed.
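Direction-of-arrival identification with a microphone array typically starts from inter-microphone time delays. This two-microphone sketch estimates the delay from the cross-correlation peak; converting the lag to an angle via the array geometry is left out, and the approach is a generic illustration rather than the patent's method.

```python
import numpy as np

def estimate_delay(mic_a, mic_b):
    """Delay of mic_b relative to mic_a, in samples, taken from
    the peak of their cross-correlation (positive if the sound
    reaches mic_a first). With known microphone spacing this lag
    converts to a direction of arrival."""
    corr = np.correlate(mic_b, mic_a, mode="full")
    return int(np.argmax(corr)) - (len(mic_a) - 1)
```

Delaying one channel of a test signal by n samples makes the estimator return n, which is the basic sanity check for this technique.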
METHODS AND APPARATUS FOR DECODING ENCODED AUDIO SIGNAL(S)
There are provided decoding and encoding methods for encoding and decoding of multichannel audio content for playback on a speaker configuration with N channels. The decoding method comprises decoding, in a first decoding module, M input audio signals into M mid signals which are suitable for playback on a speaker configuration with M channels; and for each of the N channels in excess of M channels, receiving an additional input audio signal corresponding to one of the M mid signals and decoding the input audio signal and its corresponding mid signal so as to generate a stereo signal including a first and a second audio signal which are suitable for playback on two of the N channels of the speaker configuration.
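One plausible way the additional signal combines with its corresponding mid signal is a mid/side-style rotation into a stereo pair; the exact decoding matrix is not given in the abstract, so this is a hypothetical illustration.

```python
import numpy as np

def mid_side_to_stereo(mid, side):
    """Combine a mid signal with its additional (side) signal into
    a stereo pair via an orthonormal mid/side rotation, suitable
    for playback on two channels of the speaker configuration."""
    left = (mid + side) / np.sqrt(2.0)
    right = (mid - side) / np.sqrt(2.0)
    return left, right
```

With a zero side signal the two outputs collapse to identical scaled copies of the mid signal, consistent with the mid signal alone being playable on a smaller speaker configuration.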
Method for generating filter for audio signal, and parameterization device for same
The present invention relates to a method for generating a filter for an audio signal and a parameterization device for the same, and more particularly, to a method for generating a filter for an audio signal, to implement filtering of an input audio signal with a low computational complexity, and a parameterization device therefor. To this end, provided are a method for generating a filter for an audio signal, including: receiving at least one binaural room impulse response (BRIR) filter coefficients for binaural filtering of an input audio signal; converting the BRIR filter coefficients into a plurality of subband filter coefficients; obtaining average reverberation time information of a corresponding subband by using reverberation time information extracted from the subband filter coefficients; obtaining at least one coefficient for curve fitting of the obtained average reverberation time information; obtaining flag information indicating whether the length of the BRIR filter coefficients in a time domain is more than a predetermined value; obtaining filter order information for determining a truncation length of the subband filter coefficients, the filter order information being obtained by using the average reverberation time information or the at least one coefficient according to the obtained flag information and the filter order information of at least one subband being different from filter order information of another subband; and truncating the subband filter coefficient by using the obtained filter order information and a parameterization device therefor.