Patent classifications
H04S2420/11
Intermediate compression for higher order ambisonic audio data
In general, techniques are directed to intermediate compression of higher order ambisonic audio data. For example, a device comprising one or more processors and a memory may be configured to perform the techniques. The memory may be configured to store intermediately formatted audio data generated as a result of an intermediate compression of higher order ambisonic audio data. The one or more processors may be configured to process the intermediately formatted audio data.
AUDIO PROCESSING METHOD AND APPARATUS
M audio signals are obtained by processing an audio signal by M virtual speakers; M first HRTFs and M second HRTFs are obtained, where the M first HRTFs correspond to a left ear position, and the M second HRTFs correspond to a right ear position; high-band impulse responses of some of the M first HRTFs are modified to obtain modified first target HRTFs, and high-band impulse responses of some of the M second HRTFs are modified to obtain modified second target HRTFs; a first target audio signal corresponding to the left ear position is obtained based on the modified first target HRTFs, the un-modified first HRTFs, and the M audio signals; and a second target audio signal corresponding to the right ear position is obtained based on the modified second target HRTFs, the un-modified second HRTFs, and the M audio signals.
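The final rendering step in the abstract above can be illustrated with a minimal sketch: each of the M virtual-speaker signals is filtered with the impulse response of its HRTF for one ear, and the results are summed. The function names (`convolve`, `render_ear`) and the toy data are assumptions made for illustration. The high-band modification itself is not shown, because the abstract does not say how it is performed; the sketch simply accepts whatever mix of modified and un-modified impulse responses is supplied.

```python
def convolve(signal, impulse_response):
    """Full linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_ear(speaker_signals, hrirs):
    """Sum every virtual-speaker signal filtered by its impulse response for one ear."""
    length = max(len(x) + len(h) - 1 for x, h in zip(speaker_signals, hrirs))
    ear = [0.0] * length
    for x, h in zip(speaker_signals, hrirs):
        for n, value in enumerate(convolve(x, h)):
            ear[n] += value
    return ear


# Hypothetical toy data: M = 2 virtual speakers, 2-tap impulse responses.
# In the abstract, some of these responses would be the modified "target" ones.
speaker_signals = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
left_hrirs = [[0.5, 0.25], [0.2, 0.1]]
right_hrirs = [[0.2, 0.1], [0.5, 0.25]]

left = render_ear(speaker_signals, left_hrirs)    # first target audio signal
right = render_ear(speaker_signals, right_hrirs)  # second target audio signal
```

A practical renderer would use FFT-based convolution and measured responses; the direct-form loop is used here only to keep the sketch dependency-free.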
Method and device for processing music file, terminal and storage medium
Provided are a method and device for processing a music file, a terminal and a storage medium. The method comprises: in response to a received sound effect adjustment instruction, acquiring the music file that the sound effect adjustment instruction indicates is to be adjusted; carrying out vocal and accompaniment separation on the music file to obtain vocal data and accompaniment data in the music file; carrying out first sound effect processing on the vocal data to obtain target vocal data, and carrying out second sound effect processing on the accompaniment data to obtain target accompaniment data; and synthesizing the target vocal data and the target accompaniment data to obtain a target music file.
Acquisition of spatialized sound data
A data-processing method for determining at least one spatial coordinate of a sound source emitting a sound signal, in a three-dimensional space, includes the following steps: obtaining at least one first signal and one second signal from the sound signal, collected according to separate directivities by a first sensor and a second sensor; deducing from the first and second signals an expression of at least one first spatial coordinate of the sound source, the expression comprising an uncertainty; determining additional information relating to the first spatial coordinate of the sound source, from a comparison between the respective features of the signals collected by the first and second sensors; and determining the first spatial coordinate of the sound source on the basis of the expression and the additional information.
Method and apparatus for enhancing directivity of a 1st order ambisonics signal
Recordings from microphones that provide 1st order Ambisonics signals, so-called B-format signals, convey only limited sound directivity. Sound sources are perceived as broader than they actually are, especially for off-center listening positions, and the sound sources are often localized at the closest speaker positions. In a method and apparatus for enhancing the directivity of 1st order Ambisonics signals, additional directivity information is extracted (SFA) from the lower order Ambisonics input signal. The additional directivity information is used to estimate higher order Ambisonics coefficients, which are then combined with the coefficients of the input signal. Thus, the directivity of the Ambisonics signal is enhanced, which leads to an increased accuracy of spatial source localization when the Ambisonics signal is decoded to loudspeaker signals. The resulting output signal has more energy than the input signal.
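To make the idea of extracting directivity information from a B-format signal concrete, here is a minimal sketch. It is not the analysis used in the patent, which the abstract does not specify. It swaps in a widely used stand-in: the direction of a single plane-wave source is estimated from the products of the omnidirectional channel W with the dipole channels X, Y and Z (an intensity-vector estimate). The traditional B-format convention with W scaled by 1/sqrt(2) is assumed, and all function names and the test source are hypothetical.

```python
import math


def encode_b_format(samples, azimuth, elevation):
    """Encode a mono plane-wave source into traditional B-format (W, X, Y, Z)."""
    w = [s / math.sqrt(2.0) for s in samples]
    x = [s * math.cos(azimuth) * math.cos(elevation) for s in samples]
    y = [s * math.sin(azimuth) * math.cos(elevation) for s in samples]
    z = [s * math.sin(elevation) for s in samples]
    return w, x, y, z


def estimate_direction(w, x, y, z):
    """Estimate (azimuth, elevation) in radians from the summed W*X, W*Y, W*Z products."""
    ix = sum(a * b for a, b in zip(w, x))
    iy = sum(a * b for a, b in zip(w, y))
    iz = sum(a * b for a, b in zip(w, z))
    azimuth = math.atan2(iy, ix)
    elevation = math.atan2(iz, math.hypot(ix, iy))
    return azimuth, elevation


# Hypothetical test source: a short sine burst at 60 degrees azimuth, 20 degrees elevation.
source = [math.sin(2.0 * math.pi * 5.0 * n / 64.0) for n in range(64)]
w, x, y, z = encode_b_format(source, math.radians(60.0), math.radians(20.0))
azimuth, elevation = estimate_direction(w, x, y, z)
```

With a single source and no reverberation the estimate recovers the encoding direction. Real recordings contain several sources and diffuse sound, so an estimate like this is normally computed per time-frequency tile rather than over the whole signal.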
METHODS, APPARATUS AND SYSTEMS FOR DECOMPRESSING A HIGHER ORDER AMBISONICS (HOA) SIGNAL
A method for compressing an HOA signal being an input HOA representation with input time frames ($C(k)$) of HOA coefficient sequences comprises spatial HOA encoding of the input time frames and subsequent perceptual encoding and source encoding. Each input time frame is decomposed (802) into a frame of predominant sound signals ($X_{PS}(k-1)$) and a frame of an ambient HOA component ($\tilde{C}_{AMB}(k-1)$). The ambient HOA component ($\tilde{C}_{AMB}(k-1)$) comprises, in a layered mode, first HOA coefficient sequences of the input HOA representation ($c_n(k-1)$) in lower positions and second HOA coefficient sequences ($c_{AMB,n}(k-1)$) in remaining higher positions. The second HOA coefficient sequences are part of an HOA representation of a residual between the input HOA representation and the HOA representation of the predominant sound signals.
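The layered arrangement of the ambient component described above can be sketched as follows. The sketch assumes the HOA representation of the predominant sound signals is already available as an input; how it is derived is outside what the abstract states. The function name `layered_ambient`, the parameter `num_low` (the number of lower positions) and the toy frames are hypothetical.

```python
def layered_ambient(input_hoa, predominant_hoa, num_low):
    """Build an ambient HOA frame in the layered arrangement.

    input_hoa and predominant_hoa are lists of coefficient sequences,
    one list of samples per HOA coefficient index.
    """
    ambient = []
    for n, (c_in, c_ps) in enumerate(zip(input_hoa, predominant_hoa)):
        if n < num_low:
            # Lower positions: copy the input HOA coefficient sequence as is.
            ambient.append(list(c_in))
        else:
            # Higher positions: residual between the input and the
            # HOA representation of the predominant sound signals.
            ambient.append([a - b for a, b in zip(c_in, c_ps)])
    return ambient


# Hypothetical first-order frame: 4 coefficient sequences, 2 samples each.
input_hoa = [[1.0, 2.0], [0.5, 0.5], [0.25, 0.75], [2.0, 1.0]]
predominant_hoa = [[0.5, 1.0], [0.5, 0.25], [0.25, 0.25], [1.0, 1.0]]

ambient = layered_ambient(input_hoa, predominant_hoa, num_low=1)
```

Keeping unmodified input coefficients in the lower positions means those positions can be read on their own as a low-order version of the original soundfield, which is one plausible reading of the layered mode.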
TRANSFORMING AUDIO SIGNALS CAPTURED IN DIFFERENT FORMATS INTO A REDUCED NUMBER OF FORMATS FOR SIMPLIFYING ENCODING AND DECODING OPERATIONS
The disclosed embodiments enable converting audio signals captured in various formats by various capture devices into a limited number of formats that can be processed by an audio codec (e.g., an Immersive Voice and Audio Services (IVAS) codec). In an embodiment, a simplification unit of the audio device receives an audio signal captured by one or more audio capture devices coupled to the audio device. The simplification unit determines whether the audio signal is in a format that is supported by an encoding unit of the audio device. Based on the determining, the simplification unit converts the audio signal into a format that is supported by the encoding unit. In an embodiment, if the simplification unit determines that the audio signal is in a spatial format, the simplification unit can convert the audio signal into a spatial “mezzanine” format supported by the encoding unit.
Reducing correlation between higher order ambisonic (HOA) background channels
In general, techniques for compression and decoding of audio data are disclosed. An example device for compressing audio data includes one or more processors configured to apply a decorrelation transform to ambient ambisonic coefficients to obtain a decorrelated representation of the ambient ambisonic coefficients, the ambient ambisonic coefficients having been extracted from a plurality of higher order ambisonic coefficients and representative of a background component of a soundfield described by the plurality of higher order ambisonic coefficients, wherein at least one of the plurality of higher order ambisonic coefficients is associated with a spherical basis function having an order greater than one.
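As a toy illustration of what a decorrelation transform does, the sketch below applies an orthonormal sum/difference rotation to two correlated channels of equal energy. This is a stand-in chosen for simplicity, not the transform used in the patent, which the abstract does not identify. The channel data and the names `correlation` and `rotate_pair` are hypothetical. Because the rotation is orthonormal and symmetric, applying it a second time restores the original channels, which is what a decoder would need.

```python
import math


def correlation(a, b):
    """Normalized correlation of two zero-mean sequences."""
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den


def rotate_pair(a, b):
    """Orthonormal sum/difference transform; applying it twice restores the input."""
    k = 1.0 / math.sqrt(2.0)
    total = [k * (p + q) for p, q in zip(a, b)]
    diff = [k * (p - q) for p, q in zip(a, b)]
    return total, diff


# Hypothetical ambient channels: zero mean, equal energy, strongly correlated.
amb_a = [1.0, 2.0, -1.0, -2.0]
amb_b = [2.0, 1.0, -2.0, -1.0]

before = correlation(amb_a, amb_b)
dec_a, dec_b = rotate_pair(amb_a, amb_b)   # encoder side
after = correlation(dec_a, dec_b)
rec_a, rec_b = rotate_pair(dec_a, dec_b)   # decoder side: inverse transform
```

The sum and difference of two equal-energy channels are uncorrelated, so `after` is close to zero while `before` is not. Channels with unequal energy would need a different rotation angle or a data-dependent transform.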
USER INTERFACE FEEDBACK FOR CONTROLLING AUDIO RENDERING FOR EXTENDED REALITY EXPERIENCES
A device may be configured to play one or more of a plurality of audio streams. The device may include a memory configured to store the plurality of audio streams, each of the audio streams representative of a soundfield. The device also may include one or more processors coupled to the memory, and configured to present a user interface to a user, obtain an indication from the user via the user interface representing a desired listening position, select, based on the indication, at least one audio stream of the plurality of audio streams, and output, for a display and in response to obtaining the indication representing the desired listening position, a graphical user interface element suggesting an alternative listening position.
VIRTUAL DRIVING SIMULATION DEVICE AND METHOD FOR IMPROVING SENSATION OF IMMERSION THEREFOR
A virtual driving simulation device, and a method for improving a sensation of immersion therefor, may improve the sensation of immersion for a driving simulation in a virtual environment. The device includes a microphone for measuring a 3D sound, and a processor that is configured to record the 3D sound through the microphone, analyze a sound realization influence by reproducing the recorded 3D sound through higher-order ambisonics (HOA) encoding and HOA decoding, and realize the sensation of immersion based on a result of analyzing the sound realization influence.