H04S3/008

Audio processing apparatus and method, and program
11540080 · 2022-12-27

The present technology relates to an audio processing apparatus, method, and program that make it possible to obtain higher-quality sound. An acquisition unit acquires an audio signal and metadata of an object. A vector calculation unit calculates, based on a horizontal direction angle and a vertical direction angle that are included in the object's metadata and that indicate the extent of a sound image, spread vectors indicating positions within the region of that extent. A gain calculation unit calculates, based on the spread vectors, a VBAP (vector base amplitude panning) gain of the audio signal for each speaker. The present technology can be applied to an audio processing apparatus.
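The spread-vector idea in this abstract can be illustrated with a minimal sketch. For simplicity this uses two-loudspeaker (2-D, pairwise) VBAP rather than the 3-D triplet form, and samples the sound-image extent uniformly; the function names, the pairwise formulation, and the sampling scheme are illustrative assumptions, not the patented method.

```python
import math

def vbap_pair_gains(source_az, spk1_az, spk2_az):
    """2-D VBAP: solve g @ L = p for a loudspeaker pair, then power-normalize."""
    def unit(a):
        return (math.cos(math.radians(a)), math.sin(math.radians(a)))
    p = unit(source_az)
    l1, l2 = unit(spk1_az), unit(spk2_az)
    # 2x2 inversion of the loudspeaker base matrix
    det = l1[0] * l2[1] - l1[1] * l2[0]
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det
    norm = math.sqrt(g1 * g1 + g2 * g2)
    return g1 / norm, g2 / norm

def spread_gains(center_az, spread_deg, spk1_az, spk2_az, n=5):
    """Average the VBAP gains over spread vectors sampled across the
    horizontal extent of the sound image, then renormalize."""
    azimuths = [center_az + spread_deg * (i / (n - 1) - 0.5) for i in range(n)]
    gains = [vbap_pair_gains(a, spk1_az, spk2_az) for a in azimuths]
    g1 = sum(g[0] for g in gains) / n
    g2 = sum(g[1] for g in gains) / n
    norm = math.sqrt(g1 * g1 + g2 * g2)
    return g1 / norm, g2 / norm
```

A source centered between speakers at ±30° receives equal gains; widening the spread pulls energy toward both speakers rather than collapsing the image to a point.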

Methods, apparatus and systems for a pre-rendered signal for audio rendering

The present disclosure relates to a method of decoding audio scene content from a bitstream by a decoder that includes an audio renderer with one or more rendering tools. The method comprises: receiving the bitstream; decoding a description of an audio scene from the bitstream; determining one or more effective audio elements from the description; determining effective audio element information indicative of the positions of the one or more effective audio elements from the description; and decoding a rendering mode indication from the bitstream. The rendering mode indication indicates whether the one or more effective audio elements represent a sound field obtained from pre-rendered audio elements and should be rendered using a predetermined rendering mode. When the indication so signals, the one or more effective audio elements are rendered using the predetermined rendering mode, taking the effective audio element information into account. The predetermined rendering mode defines a predetermined configuration of the rendering tools that controls the impact of the acoustic environment of the audio scene on the rendering output. The disclosure further relates to a method of generating audio scene content and a method of encoding audio scene content into a bitstream.
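The rendering-mode dispatch described above can be sketched as a simple configuration switch. The tool names and the choice of which tools the predetermined mode disables are illustrative assumptions, motivated by the abstract's point that a pre-rendered sound field already contains the acoustic environment:

```python
# Illustrative rendering-tool configurations; names are assumptions.
DEFAULT_TOOLS = {"early_reflections": True, "late_reverb": True, "occlusion": True}
# Pre-rendered effective audio elements already encode the room response,
# so the predetermined configuration switches the environment tools off.
PRE_RENDERED_TOOLS = {"early_reflections": False, "late_reverb": False, "occlusion": False}

def select_tool_config(rendering_mode_indication):
    """Choose the rendering-tool configuration signalled by the bitstream."""
    return PRE_RENDERED_TOOLS if rendering_mode_indication else DEFAULT_TOOLS
```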

Method and device for processing audio signal, using metadata
11540075 · 2022-12-27

Disclosed is a device for processing and rendering an audio signal. The device includes a processor. The processor receives an audio signal and metadata including first element reference distance information, and renders a first element signal on the basis of that information; the first element reference distance information indicates the reference distance of an element signal. The audio signal may include a second element signal that can be rendered simultaneously with the first element signal, and the metadata may include second element distance information indicating the distance of the second element signal. The number of bits required to represent the first element reference distance information is smaller than the number of bits required to represent the second element distance information.
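The bit-count asymmetry can be illustrated with a uniform quantizer sketch: the shared reference distance is carried with a short code while per-element distances get a longer, finer one. The bit widths, range, and uniform coding are illustrative assumptions; the abstract does not specify the actual bitstream coding.

```python
def quantize_distance(d, bits, max_dist=1000.0):
    """Uniformly quantize a distance in [0, max_dist] metres to a
    `bits`-bit code (illustrative assumption, not the patented coding)."""
    levels = (1 << bits) - 1
    clamped = min(max(d, 0.0), max_dist)
    return round(clamped / max_dist * levels)

# Coarse 4-bit code for the reference distance vs. a finer 9-bit code
# for an element distance (bit widths are hypothetical).
ref_code = quantize_distance(3.0, bits=4)
elem_code = quantize_distance(3.0, bits=9)
```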

Apparatus and method for screen related audio object remapping

An apparatus for generating loudspeaker signals includes an object renderer configured to receive an audio object and to generate the loudspeaker signals depending on the audio object and on position information, and an object metadata processor configured to receive metadata including a first position of the audio object. If the audio object is indicated in the metadata as being screen-related, the metadata processor calculates a second position of the audio object depending on the first position and on a size of a screen, and feeds the second position into the object renderer as the position information; if the audio object is indicated as not screen-related, it feeds the first position into the object renderer as the position information.
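The screen-related remapping can be sketched as a linear stretch of the object's azimuth from a nominal reference screen extent to the reproduction screen extent. The specific angles, the linear mapping, and the pass-through behaviour outside the screen area are illustrative assumptions:

```python
def remap_azimuth(az, ref_left=29.0, ref_right=-29.0,
                  screen_left=40.0, screen_right=-40.0):
    """Linearly remap an azimuth (degrees) from a nominal reference screen
    extent to the actual screen extent; values are illustrative."""
    if ref_right <= az <= ref_left:
        t = (az - ref_right) / (ref_left - ref_right)
        return screen_right + t * (screen_left - screen_right)
    return az  # outside the screen area: leave the position unchanged

def object_position(first_az, screen_related):
    """Second (remapped) position if screen-related, else the first position."""
    return remap_azimuth(first_az) if screen_related else first_az
```

With these angles, a wider reproduction screen spreads on-screen objects outward while leaving off-screen objects untouched.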

Integration of high frequency audio reconstruction techniques

A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether spectral translation or harmonic transposition is to be performed on the audio data, and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
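The flag-controlled dispatch can be sketched over a toy band-domain representation: spectral translation copies the low bands up unchanged, while harmonic transposition maps patch band k back to source band k//2 (a 2nd-order stretch). Both patching functions are heavily simplified illustrations, not the actual eSBR processing.

```python
def spectral_translation(low_bands):
    """Copy-up patching: the highband patch mirrors the low bands (sketch)."""
    return list(low_bands)

def harmonic_transposition(low_bands):
    """2nd-order transposition sketch: patch band k derives from band k // 2,
    preserving harmonic relationships instead of shifting them."""
    return [low_bands[k // 2] for k in range(len(low_bands))]

def regenerate_highband(low_bands, flag_harmonic):
    """Select the regeneration method according to the extracted flag."""
    return harmonic_transposition(low_bands) if flag_harmonic else spectral_translation(low_bands)
```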

Distributed audio capturing techniques for virtual reality (VR), augmented reality (AR), and mixed reality (MR) systems

Systems, devices, and methods for capturing audio which can be used in applications such as virtual reality, augmented reality, and mixed reality systems. Some systems can include a plurality of distributed monitoring devices. Each monitoring device can include a microphone and a location tracking unit. The monitoring devices can capture audio signals in an environment, as well as location tracking signals which respectively indicate the locations of the monitoring devices over time during capture of the audio signals. The system can also include a processor to receive the audio signals and the location tracking signals. The processor can determine one or more acoustic properties of the environment based on the audio signals and the location tracking signals.

Sum-difference arrays for audio playback devices
11528574 · 2022-12-13

In some embodiments, a method comprises receiving audio content comprising left input channel signals and right input channel signals, and generating first and second input signals from the left and right input channel signals. The first input signal is based on a sum of the left and right input channel signals, and the second input signal is based on a difference of the left and right input channel signals. An array transfer function is applied to the first and second input signals to produce audio output signals, which can be provided to a plurality of audio transducers on one or more playback devices.
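The sum/difference decomposition is the classic mid/side split, and the array transfer function can be stood in for by per-driver gains; the 0.5 scaling and the gain-matrix stand-in are illustrative assumptions:

```python
def sum_difference(left, right):
    """First input = sum (mid), second input = difference (side) of L/R."""
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, side

def apply_array_transfer(mid, side, mid_gains, side_gains):
    """Map the sum/difference signals to N transducer feeds with per-driver
    gains, a simplified stand-in for the array transfer function."""
    return [[gm * m + gs * s for m, s in zip(mid, side)]
            for gm, gs in zip(mid_gains, side_gains)]
```

Correlated (center) content lands entirely in the sum signal and uncorrelated content in the difference signal, which is what lets the array steer them differently.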

ELECTRONIC DEVICE, METHOD AND COMPUTER PROGRAM
20220392461 · 2022-12-08

An electronic device comprising circuitry configured to analyze the results of a stereo or multi-channel source separation to determine one or more time-varying parameters, and to create spatially dynamic audio objects based on the one or more time-varying parameters.

SURROUND SOUND HEADPHONE DEVICE
20220394387 · 2022-12-08

Described herein is an embodiment of a surround sound headphone device for the production of surround sound, which may comprise first and second headphone assemblies having acoustic chamber surrounds. The acoustic chamber surrounds may envelop first and second acoustic chamber assemblies. The first and second acoustic chamber assemblies may each have an acoustic chamber dividing partition, one or more high frequency external ports, one or more low frequency external ports, one or more low frequency high frequency ports, one or more high frequency high frequency ports, one or more high frequency auditory sources, and one or more low frequency auditory sources.

Device, system and method for identifying a scene based on an ordered sequence of sounds captured in an environment
11521626 · 2022-12-06

An identification device, method and system for identifying a scene in an environment. The environment includes at least one sound capture device. The identification device is configured to identify the scene based on at least two sounds captured in the environment. Each of the at least two sounds is respectively associated with at least one sound class. The scene is identified by taking into account the chronological order in which the at least two sounds were captured.
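Order-aware scene identification can be sketched as an ordered-subsequence match: a scene matches only if its characteristic sound classes occur in the captured sequence in the right order. The signature dictionary and first-match policy are illustrative assumptions:

```python
def matches_scene(captured_classes, signature):
    """True if the signature's sound classes occur in captured_classes
    in the same chronological order (ordered-subsequence test)."""
    remaining = iter(captured_classes)
    # `cls in remaining` consumes the iterator, enforcing the ordering
    return all(cls in remaining for cls in signature)

def identify_scene(captured_classes, scene_signatures):
    """Return the first scene whose ordered signature matches, else None."""
    for name, signature in scene_signatures.items():
        if matches_scene(captured_classes, signature):
            return name
    return None
```

Note that the same two sound classes in reversed order do not match, which is exactly the chronological constraint the abstract describes.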