H04S7/30

Audio Representation and Associated Rendering
20230085918 · 2023-03-23

An apparatus for immersive audio communication including circuitry configured to: receive at least a first audio data stream and a second audio data stream, wherein at least one of the first and second audio data streams includes a spatial audio stream to enable immersive audio during a communication; determine a type of each of the first and second audio data streams to identify which of the received first and second audio data streams is the spatial audio stream; process the second audio data stream with at least one parameter dependent on the determined type; and render the first audio data stream and the processed second audio data stream.

Electronic device and method for outputting audio signal

An electronic device is provided. The electronic device includes a display, at least one speaker, at least one memory configured to store instructions, and at least one processor operatively connected to the display, the at least one speaker, and the at least one memory, wherein, when executing the instructions, the at least one processor is configured to display a plurality of screens based on applications which are distinct from one another, within sub regions which are distinct from one another within a display region of the display, in a state where the plurality of screens are displayed, output a first audio signal provided from a first application from among the applications at a first volume through the at least one speaker, while outputting the first audio signal, identify a user input for outputting a second audio signal related to a second application within a screen provided from the second application from among the plurality of screens, in response to the user input being identified, reduce the first volume of the first audio signal being outputted through the at least one speaker, and output the second audio signal provided from the second application at a second volume which is higher than the reduced first volume through the at least one speaker.
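The volume handling described in this abstract can be illustrated with a minimal sketch. It assumes audio is held as equal-length lists of float samples; the function name `duck_and_mix`, the `duck_factor` parameter, and the default for `second_volume` are hypothetical and are not taken from the patent.

```python
def duck_and_mix(first, second, first_volume, duck_factor=0.3, second_volume=None):
    """Reduce the first signal's volume and mix in the second at a higher volume.

    Assumption: `duck_factor` (0 < duck_factor < 1) scales the first volume down;
    the abstract only states that the first volume is reduced.
    """
    if len(first) != len(second):
        raise ValueError("signals must have the same length")
    if not 0.0 < duck_factor < 1.0:
        raise ValueError("duck_factor must be between 0 and 1")

    reduced_first = first_volume * duck_factor
    if second_volume is None:
        # Hypothetical default: play the second signal at the original first volume.
        second_volume = first_volume
    if second_volume <= reduced_first:
        raise ValueError("second volume must exceed the reduced first volume")

    # Sample-wise sum of the two scaled signals.
    mixed = [reduced_first * a + second_volume * b for a, b in zip(first, second)]
    return reduced_first, second_volume, mixed


reduced, second_vol, mixed = duck_and_mix(
    [1.0, 1.0], [0.5, -0.5], first_volume=0.8, duck_factor=0.25
)
```

The only property the sketch enforces from the abstract is that the second volume ends up higher than the reduced first volume; how far to reduce is left as a parameter.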

Time domain neural networks for spatial audio reproduction
11490218 · 2022-11-01

A device for reproducing spatial audio using a machine learning model may include at least one processor configured to receive multiple audio signals corresponding to a sound scene captured by respective microphones of a device. The at least one processor may be further configured to provide the multiple audio signals to a machine learning model, the machine learning model having been trained based at least in part on a target rendering configuration. The at least one processor may be further configured to provide, responsive to providing the multiple audio signals to the machine learning model, multichannel audio signals that comprise a spatial reproduction of the sound scene in accordance with the target rendering configuration.

Method and Apparatus for Communication Audio Handling in Immersive Audio Scene Rendering

An apparatus for rendering a communication audio signal within an immersive audio scene, the apparatus comprising means configured to: obtain at least one spatial audio signal for rendering within the immersive audio scene; obtain the communication audio signal and positional information associated with the communication audio signal; obtain a rendering processing parameter associated with the communication audio signal; determine a rendering method based on the rendering processing parameter; and determine an insertion point in a rendering processing for the determined rendering method and/or a selection of rendering elements for the determined rendering method based on the rendering processing parameter.

Method, Systems and Apparatus for Hybrid Near/Far Virtualization for Enhanced Consumer Surround Sound

Embodiments are disclosed for hybrid near/far-field speaker virtualization. In an embodiment, a method comprises: receiving a source signal including channel-based audio or audio objects; generating near-field gain(s) and far-field gain(s) based on the source signal and a blending mode; generating a far-field signal based, at least in part, on the source signal and the far-field gain(s); rendering, using a speaker virtualizer, the far-field signal for playback of far-field acoustic audio through far-field speakers into an audio reproduction environment; generating a near-field signal based at least in part on the source signal and the near-field gain(s); prior to providing the far-field signal to the far-field speakers, sending the near-field signal to a near-field playback device or an intermediate device coupled to the near-field playback device; providing the far-field signal to the far-field speakers; and providing the near-field signal to the near-field speakers to synchronously overlay the far-field acoustic audio.
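As a rough illustration of splitting a source into near-field and far-field signals by gain, the sketch below uses a constant-power crossfade. This is an assumption made for illustration: the abstract does not say how the gains are derived from the source signal and the blending mode, and the names `blend_gains`, `split_signal`, and the scalar `blend` parameter are hypothetical.

```python
import math


def blend_gains(blend):
    """Return (near_gain, far_gain) for a blend value in [0, 1].

    Assumption: a constant-power crossfade, so near_gain**2 + far_gain**2 == 1.
    blend = 0 sends everything to the near field, blend = 1 to the far field.
    """
    if not 0.0 <= blend <= 1.0:
        raise ValueError("blend must be in [0, 1]")
    theta = blend * math.pi / 2
    return math.cos(theta), math.sin(theta)


def split_signal(source, blend):
    """Scale one source (a list of float samples) into near- and far-field signals."""
    near_gain, far_gain = blend_gains(blend)
    near = [near_gain * s for s in source]
    far = [far_gain * s for s in source]
    return near, far


near, far = split_signal([1.0, -0.5, 0.25], blend=0.5)
```

The sketch covers only the gain stage; speaker virtualization and the synchronization between near-field and far-field playback described in the abstract are outside its scope.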

Methods and systems for processing and mixing signals using signal decomposition

A method for mixing, processing and enhancing signals using signal decomposition is presented. A method for improving sorting of decomposed signal parts using cross-component similarity is also provided.

Signal processor and signal processing method

A signal processor and a signal processing method thereof are provided. The signal processor obtains a first audio signal of a first channel and obtains a second audio signal of a second channel. The signal processor performs a first signal processing on the input first audio signal. When the signal processor does not obtain the second audio signal, it performs a second signal processing on the input first audio signal that has undergone the first signal processing, and outputs a further processed first audio signal.
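The conditional second stage in this abstract can be sketched as follows. The two processing functions are placeholders: the abstract does not say what the first and second signal processing are, so a gain and a hard limiter are assumed here, and all function names are hypothetical. Skipping the second stage when the second signal is present is also an assumption.

```python
def first_processing(samples, gain=0.5):
    """Placeholder first stage (assumed): apply a fixed gain."""
    return [gain * s for s in samples]


def second_processing(samples, limit=0.25):
    """Placeholder second stage (assumed): hard-limit samples to [-limit, limit]."""
    return [max(-limit, min(limit, s)) for s in samples]


def process_first_channel(first, second=None):
    """Always run the first stage; run the second stage only if no second signal exists."""
    out = first_processing(first)
    if second is None:
        # Second audio signal not obtained: process the first signal further.
        out = second_processing(out)
    return out


without_second = process_first_channel([1.0, -1.0, 0.2])
with_second = process_first_channel([1.0, -1.0, 0.2], second=[0.0, 0.0, 0.0])
```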

METHOD AND DEVICE FOR DECODING A HIGHER-ORDER AMBISONICS (HOA) REPRESENTATION OF AN AUDIO SOUNDFIELD

The invention discloses rendering sound field signals, such as Higher-Order Ambisonics (HOA), for arbitrary loudspeaker setups, where the rendering results in highly improved localization properties and is energy preserving. This is obtained by rendering an audio sound field representation for arbitrary spatial loudspeaker setups and/or by a decoder that decodes based on a decode matrix (D). The decode matrix (D) is based on smoothing and scaling of a first decode matrix D̂ with smoothing coefficients. The first decode matrix D̂ is based on a mix matrix G and a mode matrix Ψ̃, where the mix matrix G was determined based on L speakers and positions of a spherical modelling grid related to a HOA order N, and the mode matrix Ψ̃ was determined based on the spherical modelling grid and the HOA order N.

Processing audio signals

An apparatus, method and computer program are described comprising: receiving a near-field audio source signal from a near-field microphone (22); receiving a far-field audio signal from an array comprising one or more far-field microphones (23); determining a filter length of a first portion of a room impulse response filter for the near-field microphone, wherein said filter length of said first portion is the same at each of a plurality of frequency bands of the filter and wherein said filter length of said first portion includes a direct acoustic propagation delay; and determining a filter length of a second portion of the room impulse response filter at each of the plurality of frequency bands, wherein the filter length of said second portion is frequency-dependent.

Apparatus, Methods and Computer Programs for Repositioning Spatial Audio Streams

Examples of the disclosure relate to apparatus, methods and computer programs for repositioning spatial audio streams. The apparatus is configured to receive a plurality of spatial audio streams, wherein the spatial audio streams include one or more audio signals and associated spatial metadata. The apparatus is also configured to obtain repositioning information relating to at least one of the plurality of spatial audio streams and to reposition the at least one of the plurality of spatial audio streams based on the repositioning information.