H04S7/301

Information processing device, information processing method, and information processing program

An information processing device (100) according to the present disclosure includes: an acquisition unit (141) configured to acquire a first image including a content image of an ear of a user; and a calculation unit (142) configured to calculate, based on the first image acquired by the acquisition unit (141), a head-related transfer function corresponding to the user by using a learned model having learned to output a head-related transfer function corresponding to an ear when an image including a content image of the ear is input.

Voice Control of a Media Playback System

Multiple aspects of systems and methods for voice control and related features and functionality for various embodiments of media playback devices, networked microphone devices, microphone-equipped media playback devices, and speaker-equipped networked microphone devices are disclosed and described herein, including but not limited to designating and managing default networked devices, audio response playback, room-corrected voice detection, content mixing, music service selection, metadata exchange between networked playback systems and networked microphone systems, handling loss of pairing between networked devices, actions based on user identification, and other voice control of networked devices.

Distributed feedback echo cancellation

A system configured to perform distributed echo cancellation processing to attenuate the feedback echo that occurs when two devices are acoustically coupled during a communication session. To reduce the feedback echo, one of the devices is configured as a hub device and receives microphone signals, synchronizes the microphone signals, and generates a mixed microphone signal. To enable distributed echo cancellation, the system includes bidirectional feedback link(s) between the hub device and each device synchronized with the hub device. For example, a first bidirectional feedback link sends a microphone signal from a second device to the hub device and sends the mixed microphone signal from the hub device to the second device, which the second device uses to perform echo cancellation. In addition, a second bidirectional feedback link sends a playback signal from the hub device to the second device and sends the output of echo cancellation back to the hub device.

SPEAKER TO ADJUST ITS SPEAKER SETTINGS
20220369060 · 2022-11-17

Examples disclosed herein include a speaker. The speaker may include a group of microphones and a processor. The processor may determine a first speaker-channel identifier for a multi-speaker system at least partially responsive to a first tone captured at the group of microphones. The processor may also determine a position of a source of the captured first tone relative to the speaker at least partially responsive to position information derived from the captured first tone. The processor may also determine a second speaker-channel identifier at least partially responsive to the first speaker-channel identifier and the position of the source of the captured first tone. The processor may also determine speaker settings at least partially responsive to the second speaker-channel identifier. Related devices, systems and methods are also disclosed.

Audio processing apparatus and method therefor

An audio processing apparatus comprises a receiver (705) which receives audio data including audio components and render configuration data including audio transducer position data for a set of audio transducers (703). A renderer (707) generates audio transducer signals for the set of audio transducers from the audio data. The renderer (707) is capable of rendering audio components in accordance with a plurality of rendering modes. A render controller (709) selects the rendering modes for the renderer (707) from the plurality of rendering modes based on the audio transducer position data. The renderer (707) can employ different rendering modes for different subsets of the set of audio transducers, and the render controller (709) can independently select rendering modes for each of the different subsets of the set of audio transducers (703). The render controller (709) can select the rendering mode for a first audio transducer of the set of audio transducers (703) in response to a position of the first audio transducer relative to a predetermined position for the audio transducer. The approach may provide improved adaptation, e.g. to scenarios where most speakers are at desired positions whereas a subset deviate from the desired position(s).
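
The per-transducer mode selection described above might be sketched as follows. This is a minimal illustration only, not the claimed implementation: the mode names `standard` and `adaptive`, the distance threshold `tolerance_m`, and the function `select_rendering_modes` are hypothetical and do not come from the abstract, which only states that the mode depends on a transducer's position relative to a predetermined position.

```python
import math

def select_rendering_modes(actual, nominal, tolerance_m=0.5):
    """Pick a rendering mode per transducer from its positional deviation.

    actual, nominal: dicts mapping a transducer name to an (x, y) position in metres.
    tolerance_m: hypothetical threshold; the abstract does not specify one.
    """
    modes = {}
    for name, position in actual.items():
        # Euclidean distance between the reported and the predetermined position.
        deviation = math.dist(position, nominal[name])
        # Hypothetical rule: keep the standard mode when the speaker is close
        # to where it should be, otherwise switch to an adaptive mode.
        modes[name] = "standard" if deviation <= tolerance_m else "adaptive"
    return modes

# Hypothetical layout: the front pair is roughly in place, one surround is misplaced.
actual = {"L": (-1.0, 1.0), "R": (1.0, 1.0), "SL": (-3.0, 0.0)}
nominal = {"L": (-1.0, 1.0), "R": (1.1, 1.0), "SL": (-1.5, -1.0)}
modes = select_rendering_modes(actual, nominal)
```

Because the decision is made per transducer, a subset of misplaced speakers can use a different mode while the remaining speakers keep the default one, which matches the scenario the abstract mentions.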

BASS MANAGEMENT IN AUDIO SYSTEMS
20220360926 · 2022-11-10

There is provided a method for controlling bass reproduction properties of a multichannel audio system, wherein the audio system has inputs for at least two audio input signals and includes a set of loudspeakers, including at least one bass-capable loudspeaker and at least two high-range loudspeakers, each loudspeaker being associated with a loudspeaker channel. The method includes obtaining impulse responses or transfer functions that represent the sound reproduction properties of each loudspeaker channel at a number of measurement or control positions. The method also includes tuning, when the audio system includes more than one bass-capable loudspeaker, loudspeaker channels of at least two bass loudspeakers to each other so that their sum impulse response has minimum spatial variability, and/or controlling high-range loudspeaker channels to be in-phase with each other and/or with a bass-capable loudspeaker channel in a crossover frequency band.

ROOM CALIBRATION BASED ON GAUSSIAN DISTRIBUTION AND K-NEAREST NEIGHBORS ALGORITHM

A method of room calibration comprises measuring a plurality of impulse responses at a plurality of measurement points in a room for each speaker of a plurality of speakers. The method also comprises determining a plurality of transfer functions at the plurality of measurement points for each speaker based on the plurality of impulse responses. Furthermore, the method also comprises weighting and summing the transfer functions to obtain a weighted and summed sound curve for each speaker.
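
The three steps above (measure impulse responses, derive transfer functions, weight and sum them) can be illustrated with a short sketch for a single speaker. It rests on assumptions the abstract does not state: the weights are supplied by the caller and normalised to sum to one, magnitudes rather than complex values are summed, and the names `dft` and `weighted_sound_curve` are hypothetical. How the weights are derived (the title mentions a Gaussian distribution and k-nearest neighbors) is not covered here.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform; adequate for short illustrative responses."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def weighted_sound_curve(impulse_responses, weights):
    """Weighted sum of magnitude responses over all measurement points of one speaker.

    impulse_responses: one equal-length list of samples per measurement point.
    weights: one non-negative weight per measurement point (assumed given).
    """
    total = sum(weights)
    n = len(impulse_responses[0])
    curve = [0.0] * n
    for h, w in zip(impulse_responses, weights):
        transfer_function = dft(h)  # transfer function at this measurement point
        for k in range(n):
            # Assumption: combine magnitudes, with weights normalised to sum to 1.
            curve[k] += (w / total) * abs(transfer_function[k])
    return curve

# Two hypothetical measurement points for one speaker, four samples each.
irs = [[1.0, 0.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]
curve = weighted_sound_curve(irs, weights=[3.0, 1.0])
```

In a multi-speaker setup the same function would be called once per speaker, giving one weighted and summed curve each.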

METHOD AND APPARATUS FOR REPRESENTING SPACE OF INTEREST OF AUDIO SCENE
20220360929 · 2022-11-10

Aspects of the disclosure include methods, apparatuses, and non-transitory computer-readable storage mediums for representing a space of interest of an audio scene. One apparatus includes processing circuitry that decodes audio scene data for the audio scene. The audio scene data includes (i) audio content for a plurality of items representing the audio scene and (ii) a first syntax element indicating a type of a subset of the plurality of items. The subset of the plurality of items represents the space of interest of the audio scene. The processing circuitry determines a part of the audio content for the subset of the plurality of items based on the type of the subset of the plurality of items indicated in the first syntax element. The processing circuitry renders the determined part of the audio content.

MULTIBAND LIMITER MODES AND NOISE COMPENSATION METHODS

Some implementations involve receiving a content stream that includes audio data, receiving at least one type of level adjustment indication relating to playback of the audio data and controlling a level of the input audio data, based on the at least one type of level adjustment indication, to produce level-adjusted audio data. Some examples involve determining, based at least in part on the type(s) of level adjustment indication, a multiband limiter configuration, applying the multiband limiter to the level-adjusted audio data, to produce multiband limited audio data and providing the multiband limited audio data to one or more audio reproduction transducers of an audio environment.

MULTI-CHANNEL AUDIO SYSTEM, MULTI-CHANNEL AUDIO DEVICE, PROGRAM, AND MULTI-CHANNEL AUDIO PLAYBACK METHOD
20230101944 · 2023-03-30

[Problem] To provide a technology that allows audio content to be enjoyed comfortably via multiple channels even in a noisy environment. [Solution] A wireless terminal 3 is disposed at a listening point of a multi-channel audio device 1. The multi-channel audio device 1 plays a multi-channel audio signal as audio playback signals of a plurality of channels, and outputs an audio playback signal for each channel from the corresponding speaker 2, and the wireless terminal 3 collects the environmental sound at the listening point, and transmits the sound collection signal to the multi-channel audio device 1. The multi-channel audio device 1 identifies, as a noise component, the difference between the sound collection signal received from the wireless terminal 3 and the audio playback signals of the plurality of channels output from the plurality of speakers, generates a noise canceling signal with the opposite phase to the noise component, and outputs the noise canceling signal from any speaker 2.
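
The noise identification and phase inversion described above can be sketched in a few lines. This is a simplified illustration under strong assumptions that the abstract does not make: all signals are already time-aligned, each channel reaches the listening point with unit gain, and room acoustics are ignored. The function name `noise_canceling_signal` is hypothetical.

```python
def noise_canceling_signal(collected, playback_channels):
    """Return an opposite-phase signal for the estimated noise component.

    collected: samples picked up at the listening point.
    playback_channels: one list of samples per playback channel, assumed
        time-aligned with `collected` and arriving with unit gain.
    """
    n = len(collected)
    # Sum of all channel playback signals as heard at the listening point.
    playback_sum = [sum(channel[i] for channel in playback_channels) for i in range(n)]
    # Noise component: what was collected minus what was played back.
    noise = [c - p for c, p in zip(collected, playback_sum)]
    # Opposite phase: invert the sign of every noise sample.
    return [-x for x in noise]

# Hypothetical three-sample example with two playback channels.
collected = [0.5, 0.25, -0.25]
channels = [[0.25, 0.0, -0.125], [0.125, 0.0, -0.125]]
anti_noise = noise_canceling_signal(collected, channels)
```

Adding `anti_noise` to the estimated noise sample by sample gives zero, which is the cancellation condition the abstract relies on.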