Patent classifications
H04S2400/01
Recording and rendering spatial audio signals
Examples of the disclosure relate to a method, apparatus and computer program, the method including: obtaining audio signals, wherein the audio signals represent spatial sound and can be used to render spatial audio using linear methods; obtaining spatial metadata corresponding to the spatial sound represented by the audio signals; and associating the spatial metadata with the obtained audio signals so that in a first rendering context the obtained audio signals can be rendered without using the spatial metadata and in a second rendering context the obtained audio signals can be rendered using the spatial metadata.
TRANSMISSION APPARATUS, RECEPTION APPARATUS, AND ACOUSTIC SYSTEM
A transmission apparatus includes a first transmission unit that transmits sound data to a first sound channel in a transmission path, and a second transmission unit that transmits meta data related to the sound data to a second sound channel in the transmission path while ensuring synchronization with the sound data.
METHOD AND SYSTEM FOR MONITORING AND REPORTING SPEAKER HEALTH
A method is provided, including: defining a plurality of frequency bins; sending, during a training phase, a test signal at different amplitude levels to one or more speakers, and gathering resulting test voltage (V) and current (I) points for the different amplitude levels and for each frequency bin; for each frequency bin, applying a linear regression algorithm to the gathered test voltage and current points for the different amplitudes to obtain a reference electrical impedance of said one or more speakers; sending, during a monitoring phase subsequent to said training phase, an audio signal to said one or more speakers, and gathering resulting new voltage and current points to obtain an operating electrical impedance for said one or more speakers for each frequency bin; determining a deviation between the operating and the reference electrical impedance; and, if the deviation exceeds a defined tolerance, reporting the deviation to a user.
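The training and monitoring phases described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the impedance of a bin is taken to be the least-squares slope of V against I, the tolerance is a relative threshold, and all function names, bin labels and sample values are hypothetical rather than taken from the patent.

```python
def fit_impedance(currents, voltages):
    """Least-squares slope of V against I, used here as the impedance in ohms."""
    n = len(currents)
    mean_i = sum(currents) / n
    mean_v = sum(voltages) / n
    sxx = sum((i - mean_i) ** 2 for i in currents)
    sxy = sum((i - mean_i) * (v - mean_v) for i, v in zip(currents, voltages))
    return sxy / sxx


def find_deviations(reference, operating, tolerance=0.10):
    """Return {bin: relative deviation} for every bin that exceeds the tolerance."""
    report = {}
    for freq_bin, z_ref in reference.items():
        deviation = abs(operating[freq_bin] - z_ref) / z_ref
        if deviation > tolerance:
            report[freq_bin] = deviation
    return report


# Training phase: hypothetical (I, V) points per bin at three amplitude levels.
training = {
    "100Hz": ([0.1, 0.2, 0.4], [0.8, 1.6, 3.2]),
    "1kHz": ([0.1, 0.2, 0.4], [0.6, 1.2, 2.4]),
}
reference = {b: fit_impedance(i, v) for b, (i, v) in training.items()}

# Monitoring phase: new hypothetical (I, V) points gathered during playback.
monitoring = {
    "100Hz": ([0.1, 0.3], [0.82, 2.46]),
    "1kHz": ([0.1, 0.3], [0.75, 2.25]),
}
operating = {b: fit_impedance(i, v) for b, (i, v) in monitoring.items()}

# Only bins whose impedance drifted by more than 10 % are reported.
deviations = find_deviations(reference, operating)
```

With these sample values only the "1kHz" bin ends up in `deviations`, since its operating impedance differs from the reference by about 25 %, while the "100Hz" bin stays within the assumed 10 % tolerance.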
System for and method of controlling a three-dimensional audio engine
A system for and a method of controlling generation of a 3D audio stream are disclosed. The method comprises accessing an audio stream; determining a value of a feature associated with the audio stream; selecting one or more 3D control parameters from a set of 3D control parameters, the selecting being based on the value of the feature associated with the audio stream; and generating the 3D audio stream based on the selected one or more 3D control parameters. In some embodiments, the feature is a metric associated with a frequency distribution of correlations of the audio stream.
METHODS AND APPARATUS FOR DECODING A COMPRESSED HOA SIGNAL
Methods and apparatus for decoding a compressed Higher Order Ambisonics (HOA) representation of a sound or soundfield. The method may include receiving a bitstream containing the compressed HOA representation and decoding, based on a determination that there are multiple layers, the compressed HOA representation from the bitstream to obtain a sequence of decoded HOA representations. A first subset of the sequence of decoded HOA representations is determined based only on corresponding ambient HOA components. A second subset of the sequence of decoded HOA representations is determined based on corresponding ambient HOA components and corresponding predominant sound components. For a frame k, the sequence of decoded HOA representations is represented at least in part by […], where […] corresponds to the corresponding ambient HOA components and […]
SYSTEM AND METHOD FOR MULTICHANNEL SPEECH DETECTION
Embodiments of the disclosure provide systems and methods for speech detection. The method may include receiving a multichannel audio input that includes a set of audio signals from a set of audio channels in an audio detection array. The method may further include processing the multichannel audio input using a neural network classifier to generate a series of classification results in a series of time windows for the multichannel audio input. The neural network classifier includes a causal temporal convolutional network (TCN) configured to determine a classification result for each time window based on portions of the multichannel audio input in the corresponding time window and one or more time windows before the corresponding time window. The method may additionally include determining whether the multichannel audio input includes one or more speech segments in the series of time windows based on the series of classification results.
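The causal property mentioned in this abstract, where each result depends only on the current and earlier time windows, can be illustrated with a dilated causal convolution. The sketch below is not the patent's classifier: it uses fixed weights instead of learned ones, a single scalar feature per window, and a hypothetical threshold, purely to show how stacking causal layers widens the look-back without using future windows.

```python
def causal_conv1d(x, weights, dilation=1):
    """Causal dilated convolution: y[t] = sum_k weights[k] * x[t - k * dilation].

    Positions before the start of the sequence count as zero, so y[t]
    never depends on x[t + 1] or any later sample.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


# Hypothetical per-window feature, e.g. energy averaged over all channels.
features = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]

# Two stacked layers with growing dilation, as is typical for a TCN.
hidden = causal_conv1d(features, weights=[0.5, 0.5], dilation=1)
scores = causal_conv1d(hidden, weights=[0.5, 0.5], dilation=2)

# Assumed decision rule: a window is flagged as speech above a fixed threshold.
is_speech = [s > 0.25 for s in scores]
```

Changing the last entry of `features` leaves every earlier entry of `hidden` and `scores` unchanged, which is the property that lets such a network run on streaming input.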
METHOD FOR PROCESSING AUDIO SIGNAL AND ELECTRONIC DEVICE
A method for processing an audio signal and an electronic device relate to the field of audio and video technology. The method includes: detecting beat information of the audio signal; and obtaining virtual surround sound for the audio signal by performing a convolution operation on a head-related transfer function and the audio signal based on the beat information of the audio signal.
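The convolution step named in this abstract can be sketched as below. The example works with head-related impulse responses (HRIRs), the time-domain counterpart of a head-related transfer function. The abstract does not say how the beat information steers the convolution, so that part is left out; the function names and the two-tap toy HRIRs are hypothetical.

```python
def convolve(signal, kernel):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for n, s in enumerate(signal):
        for k, h in enumerate(kernel):
            out[n + k] += s * h
    return out


def binauralize(mono, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right HRIR pair to get two ear signals."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy HRIR pair for a source on the listener's left (assumed values):
# the right ear receives the sound one sample later and at half amplitude.
hrir_left = [1.0, 0.0]
hrir_right = [0.0, 0.5]

mono = [1.0, 2.0, 3.0]
left, right = binauralize(mono, hrir_left, hrir_right)
```

A beat-driven variant could, for instance, choose a different HRIR pair for each segment between detected beats, but that choice is an assumption and is not stated in the abstract.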
METHOD AND APPARATUS FOR SPACE OF INTEREST OF AUDIO SCENE
Aspects of the disclosure include methods, apparatuses, and non-transitory computer-readable storage mediums for decoding audio data of an audio scene. One apparatus includes processing circuitry that receives first audio source data and second audio source data. The first audio source data corresponds to a space of interest in the audio scene and the second audio source data does not correspond to the space of interest in the audio scene. The space of interest in the audio scene is represented by at least one of a listener space, an audio channel, or an audio object. The processing circuitry decodes the first audio source data based on the space of interest.
ACOUSTIC PROCESSING DEVICE, ACOUSTIC PROCESSING METHOD, AND STORAGE MEDIUM
A storage unit stores a first transfer function representing a transfer characteristic of a sound from a sound source for each sound source direction; a sound source direction estimating unit calculates a conversion coefficient of an acoustic signal for each channel in a frequency domain and a spatial spectrum for each sound source direction on the basis of the first transfer function, and estimates the sound source direction in which the spatial spectrum reaches a maximum as an estimated sound source direction; a transfer function estimating unit estimates a transfer function for the estimated sound source direction as a second transfer function by normalizing the conversion coefficients among channels; and a transfer function updating unit updates the first transfer function for the estimated sound source direction using the second transfer function.
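The estimate, normalize and update loop in this abstract can be sketched for a single frequency bin as follows. Several details are assumptions made for illustration: the spatial spectrum is a plain matched-filter power (the abstract does not name the estimator), normalization divides by a reference channel, and the update is a fixed-rate blend. All names and numbers are hypothetical.

```python
def spatial_spectrum(x, steering):
    """Normalized matched-filter power |a^H x|^2 / (a^H a) for one direction."""
    num = abs(sum(a.conjugate() * xc for a, xc in zip(steering, x))) ** 2
    den = sum(abs(a) ** 2 for a in steering)
    return num / den


def estimate_direction(x, transfer_functions):
    """Pick the stored direction whose spatial spectrum is largest."""
    return max(transfer_functions,
               key=lambda d: spatial_spectrum(x, transfer_functions[d]))


def normalize_channels(x, ref=0):
    """Second transfer function: coefficients divided by a reference channel."""
    return [xc / x[ref] for xc in x]


def update(old, new, rate=0.5):
    """Blend the stored (first) transfer function toward the new estimate."""
    return [(1 - rate) * o + rate * n for o, n in zip(old, new)]


# First transfer functions for two directions (degrees) and two channels.
transfer_functions = {0: [1 + 0j, 1 + 0j], 90: [1 + 0j, 0 + 1j]}

# Conversion coefficients of the observed signal in one frequency bin.
x = [2 + 0j, 0 + 1.8j]

direction = estimate_direction(x, transfer_functions)
second_tf = normalize_channels(x)
transfer_functions[direction] = update(transfer_functions[direction], second_tf)
```

In this example the 90 degree entry is selected and its second-channel coefficient moves from 1j toward the observed 0.9j.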