H04S1/002

SOUND EFFECT ADJUSTMENT
20230041730 · 2023-02-09 ·

A sound effect adjustment method is provided. In the method, a video frame and an audio signal of a corresponding time unit of a target video are obtained. A sound source orientation and a sound source distance of a sound source object in the video frame are determined. Scene information corresponding to the video frame is determined. The audio signal is filtered based on the sound source orientation and the sound source distance. An echo coefficient is determined according to the scene information. Further, an adjusted audio signal with an adjusted sound effect is generated based on the filtered audio signal and the echo coefficient.
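The abstract does not say which filter or echo model is used. As a minimal sketch, assuming an inverse-distance gain, a constant-power stereo pan driven by the source azimuth, and a single delayed echo scaled by the echo coefficient, the processing chain could look like the following. The function name `adjust_sound_effect`, the azimuth convention (-90 = left, +90 = right), and the `delay_samples` parameter are illustrative assumptions and do not come from the patent.

```python
import math

def adjust_sound_effect(samples, azimuth_deg, distance_m, echo_coeff, delay_samples):
    """Return (left, right) sample lists for a mono input (hypothetical helper)."""
    # Assumed distance model: inverse-distance gain, clamped so sources
    # closer than 1 m are not amplified.
    gain = 1.0 / max(distance_m, 1.0)
    # Assumed orientation model: constant-power pan, azimuth clamped to [-90, 90].
    azimuth = max(-90.0, min(90.0, azimuth_deg))
    theta = (azimuth + 90.0) / 180.0 * (math.pi / 2)
    g_left, g_right = math.cos(theta), math.sin(theta)
    left, right = [], []
    for n, x in enumerate(samples):
        dry = gain * x
        # Assumed echo model: one feed-forward echo scaled by the echo coefficient.
        wet = echo_coeff * gain * samples[n - delay_samples] if n >= delay_samples else 0.0
        left.append(g_left * (dry + wet))
        right.append(g_right * (dry + wet))
    return left, right
```

In this sketch, a scene classified as reverberant would map to a larger `echo_coeff`; how scene information maps to a coefficient value is left open here, as it is in the abstract.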

PROCESSING DEVICE AND PROCESSING METHOD
20230045207 · 2023-02-09 ·

A processing device according to this embodiment includes: a frequency characteristics acquisition unit configured to acquire frequency characteristics of an input signal; an extreme value extraction unit configured to extract an extreme value of spectral data; a kurtosis calculation unit configured to: calculate an evaluation value from spectral data; and calculate a kurtosis of a peak or a dip based on a plurality of evaluation values calculated by changing a calculation width, the evaluation value being used for evaluating the peak or the dip corresponding to the extreme value; a determination unit configured to determine whether to suppress the peak or the dip according to a comparison result between the kurtosis and a threshold value; and a suppression unit configured to suppress the peak or the dip with the extreme value that is determined to be suppressed.
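The abstract leaves the evaluation value and the kurtosis formula unspecified. The sketch below is one possible reading, not the patented method: the evaluation value is assumed to be the peak height above the mean of a window of a given half-width, and a ratio of the narrow-window to the wide-window value stands in for the kurtosis. All function names and the 0.5 threshold are hypothetical, and only peaks (not dips) are handled.

```python
def local_peaks(spectrum):
    """Indices of strict local maxima (the 'extreme values')."""
    return [i for i in range(1, len(spectrum) - 1)
            if spectrum[i] > spectrum[i - 1] and spectrum[i] > spectrum[i + 1]]

def evaluation_value(spectrum, idx, width):
    """Assumed evaluation value: peak height above the mean of a window
    spanning `width` bins on each side of the peak."""
    lo = max(0, idx - width)
    hi = min(len(spectrum), idx + width + 1)
    window = spectrum[lo:hi]
    return spectrum[idx] - sum(window) / len(window)

def sharpness(spectrum, idx, widths=(1, 4)):
    """Stand-in for the kurtosis: a narrow peak already stands out in a small
    window, so the narrow/wide ratio is close to 1; a broad peak gives a small ratio."""
    narrow = evaluation_value(spectrum, idx, widths[0])
    wide = evaluation_value(spectrum, idx, widths[-1])
    return narrow / wide if wide > 0 else 0.0

def suppress_sharp_peaks(spectrum, threshold=0.5):
    """Replace peaks whose sharpness exceeds the threshold by the mean of their neighbours."""
    out = list(spectrum)
    for i in local_peaks(spectrum):
        if sharpness(spectrum, i) > threshold:
            out[i] = (spectrum[i - 1] + spectrum[i + 1]) / 2
    return out
```

The point of varying the calculation width is visible in `sharpness`: a single window size cannot distinguish a tall narrow spike from a tall broad hump, while comparing two sizes can.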

METHOD FOR PROCESSING SOUND ON BASIS OF IMAGE INFORMATION, AND CORRESPONDING DEVICE

A method of processing an audio signal including at least one audio object based on image information includes: obtaining the audio signal and a current image that corresponds to the audio signal; dividing the current image into at least one block; obtaining motion information of the at least one block; generating index information including information for giving a three-dimensional (3D) effect in at least one direction to the at least one audio object, based on the motion information of the at least one block; and processing the audio object, in order to give the 3D effect in the at least one direction to the audio object, based on the index information.
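How motion information becomes index information is not detailed in the abstract. As a rough sketch, assuming each block carries a motion vector in pixels per frame and the index is a pair of normalized direction values, the mapping might be written as follows. The name `motion_to_index`, the dictionary keys, and `max_displacement` are hypothetical.

```python
def motion_to_index(block_motion, max_displacement=16.0):
    """Map per-block motion vectors (dx, dy) to direction indices in [-1, 1].

    Assumptions: dx > 0 means motion to the right, dy > 0 means motion
    downward (image coordinates), and the index is the clamped mean motion.
    """
    if not block_motion:
        return {"left_right": 0.0, "up_down": 0.0}
    mean_dx = sum(dx for dx, _ in block_motion) / len(block_motion)
    mean_dy = sum(dy for _, dy in block_motion) / len(block_motion)

    def clamp(value):
        return max(-1.0, min(1.0, value / max_displacement))

    # Negate dy so that a positive "up_down" index means upward motion.
    return {"left_right": clamp(mean_dx), "up_down": clamp(-mean_dy)}
```

A renderer could then use `left_right` to steer a pan or a head-related filter for the audio object; that step is outside this sketch.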

Sound Localization for an Electronic Call
20230224658 · 2023-07-13 ·

During an electronic call between two individuals, a sound localization point simulates a location in empty space from which the voice of one individual appears to originate for the other individual.

ARRAY AUGMENTATION FOR AUDIO PLAYBACK DEVICES
20250234132 · 2025-07-17 ·

Systems and methods for providing augmented arrays for audio playback are disclosed. An example playback device includes a first transducer configured to output audio along a first acoustic axis and a second transducer configured to output audio along a second acoustic axis. The playback device is configured to receive a source stream of audio content including at least a first input channel and a second input channel. The device plays back first audio output via the first transducer based on the first input channel and directed along the first acoustic axis, and plays back second audio output via the second transducer based on the second input channel and directed along the second acoustic axis, wherein the second audio output at least partially cancels the first audio output along a first spatial region offset from the first acoustic axis.

Converting Binaural Signals to Stereo Audio Signals
20220417691 · 2022-12-29 ·

An apparatus including circuitry configured to: obtain a binaural audio signal; obtain, based on the binaural audio signal, at least one direction parameter of at least one frequency band of the binaural audio signal; process the binaural audio signal to generate at least two audio signals for loudspeaker reproduction by modifying an inter-channel difference of the at least one frequency band of the binaural audio signal based on the at least one direction parameter for the at least one frequency band; and output the at least two audio signals for loudspeaker reproduction.

Apparatus and Method for Synthesizing a Spatially Extended Sound Source Using Cue Information Items
20220417694 · 2022-12-29 ·

An apparatus for synthesizing a spatially extended sound source includes: a spatial information interface for receiving a spatial range indication indicating a limited spatial range for the spatially extended sound source within a maximum spatial range; a cue information provider for providing one or more cue information items in response to the limited spatial range; and an audio processor for processing an audio signal representing the spatially extended sound source using the one or more cue information items.

Loudness adjustment for downmixed audio content

Audio content coded for a reference speaker configuration is downmixed to downmix audio content coded for a specific speaker configuration. One or more gain adjustments are performed on individual portions of the downmix audio content coded for the specific speaker configuration. Loudness measurements are then performed on the individual portions of the downmix audio content. An audio signal that comprises the audio content coded for the reference speaker configuration and downmix loudness metadata is generated. The downmix loudness metadata is created based at least in part on the loudness measurements on the individual portions of the downmix audio content.
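The sequence of downmixing, gain adjustment, measurement, and metadata creation can be sketched as below. This is an illustration under stated assumptions: the reference configuration is taken to be 5.1 with the LFE channel ignored, the downmix uses the commonly seen -3 dB coefficients for the center and surround channels, and a plain RMS level in dB stands in for a real loudness measure (which would normally add frequency weighting and gating). All function and key names are hypothetical.

```python
import math

def downmix_5_1_to_stereo(fl, fr, c, ls, rs):
    """Fold five full-range channels into stereo (LFE omitted by assumption)."""
    k = math.sqrt(0.5)  # about -3 dB
    left = [a + k * b + k * d for a, b, d in zip(fl, c, ls)]
    right = [a + k * b + k * d for a, b, d in zip(fr, c, rs)]
    return left, right

def rms_level_db(channels):
    """Mean-square level over all channels in dB; a stand-in for loudness."""
    total = sum(x * x for ch in channels for x in ch)
    count = sum(len(ch) for ch in channels)
    mean_square = total / count if count else 0.0
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")

def downmix_loudness_metadata(fl, fr, c, ls, rs, gain_db=0.0):
    """Downmix one portion, apply a gain adjustment, and report its level."""
    left, right = downmix_5_1_to_stereo(fl, fr, c, ls, rs)
    g = 10 ** (gain_db / 20)
    left = [g * x for x in left]
    right = [g * x for x in right]
    return {"downmix_gain_db": gain_db,
            "downmix_level_db": rms_level_db([left, right])}
```

Calling `downmix_loudness_metadata` once per portion of the content would yield the per-portion values that travel alongside the untouched reference-configuration audio.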

ELECTRONIC DEVICE, METHOD AND COMPUTER PROGRAM
20220392461 · 2022-12-08 ·

An electronic device comprising circuitry configured to analyze the results of a stereo or multi-channel source separation to determine one or more time-varying parameters, and to create spatially dynamic audio objects based on the one or more time-varying parameters.
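The abstract does not name the time-varying parameters. One plausible example, assumed here purely for illustration, is a per-frame pan position estimated from the left and right energies of a separated stem, which could then drive the position of an audio object. The function name `pan_trajectory` and the frame-based analysis are hypothetical choices.

```python
def pan_trajectory(left, right, frame_size):
    """Return one pan value per frame in [-1, 1] for a separated stereo stem.

    -1 means all energy is in the left channel, +1 all in the right.
    Silent frames are reported as centered (0.0) by assumption.
    """
    positions = []
    for start in range(0, min(len(left), len(right)), frame_size):
        e_left = sum(x * x for x in left[start:start + frame_size])
        e_right = sum(x * x for x in right[start:start + frame_size])
        total = e_left + e_right
        positions.append((e_right - e_left) / total if total > 0 else 0.0)
    return positions
```

For a stem that moves from left to right, the returned list rises from negative to positive values over time, which is the kind of trajectory a spatially dynamic object would follow.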

STEREOPHONIC AUDIO REARRANGEMENT BASED ON DECOMPOSED TRACKS
20220386062 · 2022-12-01 ·

The present invention provides a method for processing audio data, comprising providing input audio data containing a mixture of different timbres, decomposing the input audio data to generate decomposed data representing a predetermined timbre selected from the timbres contained in the input audio data, determining a set point position of a virtual sound source outputting the predetermined timbre relative to a position of a virtual listener, and generating stereophonic output data based on the decomposed data and the determined set point position.
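The final step, turning a decomposed track and a set point position into stereophonic output, could be approximated as follows. The sketch assumes the decomposition has already produced a mono track, places the virtual listener at the origin with two ears on the x axis, and models each ear with an inverse-distance gain and a propagation delay. The function name, ear spacing, and speed of sound value are illustrative assumptions rather than details from the patent.

```python
import math

def render_stereo(samples, source_xy, sample_rate=48000,
                  ear_offset_m=0.09, speed_of_sound=343.0):
    """Render a mono track at a virtual position (x, y) in metres.

    The listener sits at the origin; x > 0 is to the listener's right.
    """
    x, y = source_xy
    outputs = []
    for ear_x in (-ear_offset_m, ear_offset_m):  # left ear, then right ear
        dist = math.hypot(x - ear_x, y)
        gain = 1.0 / max(dist, 0.1)  # clamp to avoid huge gains near the ear
        delay = round(dist / speed_of_sound * sample_rate)  # delay in samples
        outputs.append([0.0] * delay + [gain * s for s in samples])
    # Pad both channels to a common length.
    length = max(len(ch) for ch in outputs)
    left, right = (ch + [0.0] * (length - len(ch)) for ch in outputs)
    return left, right
```

With a source to the listener's right, the right channel receives the signal earlier and louder than the left, which is the basic cue pair used to localize a virtual sound source.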