H04S2400/13

Audio processing
11516615 · 2022-11-29

A method for rendering a spatial audio signal that represents a sound field in a selectable viewpoint audio environment that includes one or more audio objects associated with respective audio content and a respective position in the audio environment. The method includes receiving an indication of a selected listening position and orientation in the audio environment; detecting an interaction concerning a first audio object on the basis of one or more predefined interaction criteria; modifying the first audio object and one or more further audio objects linked thereto; and deriving the spatial audio signal that includes at least audio content associated with the modified first audio object in a first spatial position of the sound field that corresponds to its position in the audio environment in relation to said selected listening position and orientation, and audio content associated with the modified one or more further audio objects.
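As a rough illustration of the claimed flow (interaction detection, modification of linked objects, derivation of listener-relative positions), here is a minimal 2-D sketch; the proximity criterion, the 0.5 attenuation, and all names are hypothetical stand-ins, not taken from the patent:

```python
import math

def render_scene(objects, links, listener_pos, listener_yaw, radius=1.0):
    # objects: name -> ((x, y), gain); links: name -> [names of linked objects]
    # 1) Interaction criterion (hypothetical): listener within `radius` of the object.
    interacted = {n for n, ((x, y), _) in objects.items()
                  if math.hypot(x - listener_pos[0], y - listener_pos[1]) < radius}
    # 2) Modify the interacted object and all objects linked to it.
    modified = set(interacted)
    for n in interacted:
        modified.update(links.get(n, []))
    # 3) Derive spatial positions relative to the selected listening pose.
    scene = {}
    for n, ((x, y), gain) in objects.items():
        angle = math.degrees(math.atan2(y - listener_pos[1],
                                        x - listener_pos[0])) - listener_yaw
        scene[n] = (angle % 360.0, gain * (0.5 if n in modified else 1.0))
    return scene
```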

Apparatus and Method for Reproducing a Spatially Extended Sound Source or Apparatus and Method for Generating a Description for a Spatially Extended Sound Source Using Anchoring Information
20220377489 · 2022-11-24

An apparatus for reproducing a spatially extended sound source having a defined position or orientation and geometry in a space has an interface for receiving a listener position. The apparatus has a projector for calculating a projection of a two- or three-dimensional hull associated with the sound source onto a projection plane using the listener position, information on the geometry of the sound source, and on the position of the sound source; a sound position calculator for calculating positions of at least two sound sources for the spatially extended sound source using the projection plane; and a renderer for rendering the at least two sound sources at the positions to obtain a reproduction of the sound source having two or more output signals, the renderer being configured to use different sound signals for the different positions.
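A simplified 2-D sketch of the projection idea: the hull is projected onto the lateral axis of a listener-facing plane, and the two extreme hull points serve as positions for two substitute sound sources (function and variable names are illustrative, not from the patent):

```python
import math

def edge_sources(hull, listener):
    """Project a sound source's hull onto the axis perpendicular to the
    listener->centroid direction; return the two extreme hull points as
    positions for two substitute point sources."""
    cx = sum(p[0] for p in hull) / len(hull)
    cy = sum(p[1] for p in hull) / len(hull)
    dx, dy = cx - listener[0], cy - listener[1]
    norm = math.hypot(dx, dy)
    # Lateral axis of the projection plane (perpendicular to view direction).
    lx, ly = -dy / norm, dx / norm
    key = lambda p: (p[0] - listener[0]) * lx + (p[1] - listener[1]) * ly
    return min(hull, key=key), max(hull, key=key)
```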

METHODS, APPARATUS AND SYSTEMS FOR REPRESENTATION, ENCODING, AND DECODING OF DISCRETE DIRECTIVITY DATA

The present disclosure relates to a method of processing audio content including directivity information for at least one sound source, the directivity information comprising a first set of first directivity unit vectors representing directivity directions and associated first directivity gains. The disclosure further relates to corresponding methods of encoding and decoding audio content including directivity information for at least one sound source.
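The discrete representation described above (a set of directivity unit vectors with associated gains) can be queried by nearest-direction lookup; this sketch uses a plain dot-product search, which is an illustrative choice rather than the disclosure's actual decoding method:

```python
import math

def directivity_gain(direction, dir_vectors, gains):
    """Return the gain of the stored directivity unit vector closest
    (largest dot product) to the query direction."""
    n = math.sqrt(sum(c * c for c in direction))
    d = [c / n for c in direction]  # normalize the query direction
    best = max(range(len(dir_vectors)),
               key=lambda i: sum(a * b for a, b in zip(d, dir_vectors[i])))
    return gains[best]
```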

INFORMATION PROCESSING APPARATUS AND INFORMATION PROCESSING METHOD, AND PROGRAM
20220377488 · 2022-11-24

The present technology relates to an information processing apparatus and an information processing method, and a program capable of realizing content reproduction based on the intention of a content creator. An information processing apparatus includes: a listener position information acquisition unit that acquires listener position information of a viewpoint of a listener; a reference viewpoint information acquisition unit that acquires position information of a first reference viewpoint and object position information of an object at the first reference viewpoint, and position information of a second reference viewpoint and object position information of the object at the second reference viewpoint; and an object position calculation unit that calculates position information of the object at the viewpoint of the listener on the basis of the listener position information, the position information of the first reference viewpoint and the object position information at the first reference viewpoint, and the position information of the second reference viewpoint and the object position information at the second reference viewpoint. The present technology can be applied to content reproduction systems.
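One plausible form of the object position calculation is to interpolate between the object's positions at the two reference viewpoints, weighted by the listener's proximity to each viewpoint; the linear weighting below is an assumption for illustration, not the patent's exact formula:

```python
import math

def object_position(listener, vp1, obj1, vp2, obj2):
    """Interpolate the object position between its positions at two
    reference viewpoints, weighted by listener proximity."""
    d1 = math.dist(listener, vp1)
    d2 = math.dist(listener, vp2)
    t = d1 / (d1 + d2)  # t = 0 at vp1, t = 1 at vp2
    return tuple(a + t * (b - a) for a, b in zip(obj1, obj2))
```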

Interaural time difference crossfader for binaural audio rendering

Examples of the disclosure describe systems and methods for presenting an audio signal to a user of a wearable head device. According to an example method, a first input audio signal is received, the first input audio signal corresponding to a source location in a virtual environment presented to the user via the wearable head device. The first input audio signal is processed to generate a left output audio signal and a right output audio signal. The left output audio signal is presented to the left ear of the user via a left speaker associated with the wearable head device. The right output audio signal is presented to the right ear of the user via a right speaker associated with the wearable head device. Processing the first input audio signal comprises applying a delay process to the first input audio signal to generate a left audio signal and a right audio signal; adjusting a gain of the left audio signal; adjusting a gain of the right audio signal; applying a first head-related transfer function (HRTF) to the left audio signal to generate the left output audio signal; and applying a second HRTF to the right audio signal to generate the right output audio signal. Applying the delay process to the first input audio signal comprises applying an interaural time delay (ITD) to the first input audio signal, the ITD determined based on the source location.
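The delay step can be sketched with the classic Woodworth ITD model, which is a hypothetical stand-in here (the patent determines the ITD from the source location but does not mandate this formula); head radius and sample rate are illustrative defaults:

```python
import math

def apply_itd(signal, azimuth_deg, sr=48000, head_radius=0.0875, c=343.0):
    """Split a mono signal into left/right, delaying the ear farther
    from the source by the azimuth-dependent interaural time difference."""
    th = math.radians(abs(azimuth_deg))
    itd = head_radius / c * (th + math.sin(th))   # Woodworth ITD, seconds
    d = round(itd * sr)                           # whole-sample delay
    delayed = ([0.0] * d + list(signal))[:len(signal)]
    # Positive azimuth = source on the right, so the far (left) ear is delayed.
    return (delayed, list(signal)) if azimuth_deg > 0 else (list(signal), delayed)
```

In the full chain described above, per-ear gains and the left/right HRTFs would then be applied to these two signals.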

Acoustic signal processing device, acoustic signal processing method, and program for determining a steering coefficient which depends on angle between sound source and microphone

An acoustic signal processing device calculates a signal waveform that a microphone receives when at least one of a sound source and the microphone is moving. The acoustic signal processing device includes a coefficient calculation unit configured to model a steering coefficient g.sub.k,m representing how much an amplitude of a sound source signal emitted at an mth discrete time, where m is an integer between 1 and M and M is a length of the sound source signal, is transferred to an amplitude of a signal that the microphone receives at a kth discrete time, where k is an integer between 1 and K and K is a length of a recording signal, using N-order Fourier series expansion where N is an integer of 1 or more, and a recording signal calculation unit configured to calculate the signal waveform that the microphone receives using the modeled steering coefficient g.sub.k,m.
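The modeled steering coefficient and the resulting received waveform can be sketched as follows; the Fourier coefficients `a`, `b` and the angle trajectory are hypothetical inputs (the patent models g_k,m with an N-order Fourier series of the source-microphone angle but does not fix the coefficients):

```python
import math

def steering_gain(theta, a, b):
    """N-order Fourier-series model of the steering coefficient as a
    function of the source-microphone angle theta."""
    g = a[0]
    for n in range(1, len(a)):
        g += a[n] * math.cos(n * theta) + b[n] * math.sin(n * theta)
    return g

def received_signal(source, angles, a, b):
    """y_k = sum_m g(theta_{k,m}) * s_m: amplitude of each source sample
    transferred to each recording sample via the modeled coefficient."""
    K, M = len(angles), len(source)
    return [sum(steering_gain(angles[k][m], a, b) * source[m] for m in range(M))
            for k in range(K)]
```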

Signaling loudness adjustment for an audio scene
11595730 · 2023-02-28

Aspects of the disclosure include methods, apparatuses, and non-transitory computer-readable storage media for loudness adjustment for an audio scene associated with an MPEG-I immersive audio stream. One apparatus includes processing circuitry that receives a first syntax element indicating a number of sound signals included in the audio scene. The processing circuitry determines whether one or more speech signals are included in the sound signals indicated by the first syntax element. The processing circuitry determines a reference speech signal from the one or more speech signals based on the one or more speech signals being included in the sound signals. The processing circuitry adjusts a loudness level of the reference speech signal of the audio scene based on an anchor speech signal. The processing circuitry adjusts loudness levels of the sound signals based on the adjusted loudness level of the reference speech signal.
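The two-stage adjustment (align the reference speech to the anchor level, then scale the whole scene accordingly) can be sketched with a simple RMS measure; RMS stands in for the codec's actual loudness measure, and all names are illustrative:

```python
import math

def rms_db(sig):
    """RMS level in dB (a stand-in for a proper loudness measure)."""
    return 20 * math.log10(math.sqrt(sum(x * x for x in sig) / len(sig)))

def adjust_scene(signals, ref_name, anchor_db):
    """Gain that brings the reference speech signal to the anchor level,
    applied uniformly to every sound signal in the scene."""
    gain_db = anchor_db - rms_db(signals[ref_name])
    g = 10 ** (gain_db / 20)
    return {name: [x * g for x in sig] for name, sig in signals.items()}
```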

Device to amplify and clarify voice
11508391 · 2022-11-22

A voice enhancing device amplifies and clarifies the voice of a user with hypophonia or other voice issues. The device includes a collar of either a rigid or a soft material that is shaped to comfortably sit on the shoulders of the user. One or more microphone arrays are adjustably mounted to the collar to capture audio of the user talking. An electronics module enhances the captured audio signal and generates an enhanced audio signal that drives at least one speaker adjustably attached to the collar. The electronics module implements one or more of an AGC amplifier to correct amplitude variation in spoken words, adaptive filtering to actively filter out background noise, a variable attack and decay function to improve intelligibility of the spoken words, a diphthong modification function to clarify the spoken words, and an echo cancellation function to reduce echo and feedback in the enhanced audio.
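The AGC with variable attack and decay can be sketched as an envelope follower whose smoothing coefficient depends on whether the amplitude is rising or falling; the coefficient values and target level here are illustrative, not taken from the patent:

```python
def agc(signal, target=0.5, attack=0.5, decay=0.01):
    """Automatic gain control: the envelope tracks |x| quickly on rising
    amplitude (attack) and slowly on falling amplitude (decay), and the
    output is normalized toward `target`."""
    env, out = 1e-6, []
    for x in signal:
        mag = abs(x)
        coeff = attack if mag > env else decay
        env += coeff * (mag - env)
        out.append(x * (target / max(env, 1e-6)))
    return out
```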

AUDIO APPARATUS AND METHOD OF OPERATION THEREFOR

An audio apparatus, e.g. for rendering audio for a virtual/augmented reality application, comprises a receiver (201) for receiving audio data for an audio scene including a first audio component representing a real-world audio source present in an audio environment of a user. A determinator (203) determines a first property of a real-world audio component from the real-world audio source and a target processor (205) determines a target property for a combined audio component being a combination of the real-world audio component received by the user and rendered audio of the first audio component received by the user. An adjuster (207) determines a render property by modifying a property of the first audio component indicated by the audio data for the first audio component in response to the target property and the first property. A renderer (209) renders the first audio component in response to the render property.
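One concrete reading of the adjuster, sketched under the assumption that the property in question is signal power and that the real-world and rendered components add incoherently (both assumptions and all names are hypothetical):

```python
def render_gain(real_power, data_power, target_power):
    """Gain for the rendered first audio component so that its power plus
    the measured real-world component's power meets the target for the
    combined audio component (linear powers, incoherent addition)."""
    needed = max(target_power - real_power, 0.0)  # power still to be supplied
    return (needed / data_power) ** 0.5 if data_power > 0 else 0.0
```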

ENCODING DEVICE AND METHOD, DECODING DEVICE AND METHOD, AND PROGRAM
20230056690 · 2023-02-23

The present technology relates to an encoding device and method, a decoding device and method, and a program capable of realizing sense-of-distance control based on the intention of a content creator.

The encoding device includes: an object encoding unit that encodes audio data of an object; a metadata encoding unit that encodes metadata including position information of the object; a sense-of-distance control information determination unit that determines sense-of-distance control information for sense-of-distance control processing to be performed on the audio data; a sense-of-distance control information encoding unit that encodes the sense-of-distance control information; and a multiplexer that multiplexes the coded audio data, the coded metadata, and the coded sense-of-distance control information to generate coded data. The present technology can be applied to a content reproduction system.
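The multiplexer step can be sketched as a length-prefixed concatenation of the three coded substreams; this layout is purely illustrative and is not the actual bitstream syntax:

```python
import struct

def mux(coded_audio, coded_metadata, coded_distance_ctrl):
    """Toy multiplexer: big-endian 32-bit length prefix per substream."""
    out = b""
    for blob in (coded_audio, coded_metadata, coded_distance_ctrl):
        out += struct.pack(">I", len(blob)) + blob
    return out

def demux(data):
    """Inverse of mux: recover the list of substreams."""
    parts, i = [], 0
    while i < len(data):
        (n,) = struct.unpack_from(">I", data, i)
        parts.append(data[i + 4:i + 4 + n])
        i += 4 + n
    return parts
```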