H04R2227/009

METADATA FOR DUCKING CONTROL

An audio encoding device and an audio decoding device are described herein. The audio encoding device may examine a set of audio channels/channel groups representing a piece of sound program content and produce a set of ducking values to associate with one of the channels/channel groups. During playback of the piece of sound program content, the ducking values may be applied to all other channels/channel groups. Application of these ducking values may cause (1) the reduction in dynamic range of ducked channels/channel groups and/or (2) movement of channels/channel groups in the sound field. This ducking may improve intelligibility of audio in the non-ducked channel/channel group. For instance, a narration channel/channel group may be more clearly heard by listeners through the use of selective ducking of other channels/channel groups during playback.
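As a rough illustration of the playback side, the sketch below applies per-frame ducking values to every channel except the one they are associated with. It covers gain attenuation only, not movement in the sound field. The names `apply_ducking` and `db_to_gain`, the dB representation of the ducking values, and the fixed frame size are assumptions made for this example and are not taken from the abstract.

```python
from typing import Dict, List


def db_to_gain(attenuation_db: float) -> float:
    """Convert an attenuation in dB (positive = quieter) to a linear gain."""
    return 10.0 ** (-attenuation_db / 20.0)


def apply_ducking(
    channels: Dict[str, List[float]],
    protected: str,
    ducking_db: List[float],
    frame_size: int,
) -> Dict[str, List[float]]:
    """Attenuate every channel except `protected` by the per-frame ducking values.

    Assumption: one ducking value per frame of `frame_size` samples; the last
    value is held if the audio is longer than the metadata.
    """
    if not ducking_db:
        raise ValueError("ducking_db must contain at least one value")
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")

    ducked: Dict[str, List[float]] = {}
    for name, samples in channels.items():
        if name == protected:
            # The channel that carries the ducking values is left untouched.
            ducked[name] = list(samples)
            continue
        out = []
        for i, sample in enumerate(samples):
            frame = min(i // frame_size, len(ducking_db) - 1)
            out.append(sample * db_to_gain(ducking_db[frame]))
        ducked[name] = out
    return ducked
```

For example, with a narration channel as `protected` and ducking values of 0 dB and 20 dB, a music channel would pass through unchanged in the first frame and be scaled by 0.1 in the second.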

Methods of managing audio data transmissions over a network to ensure live voice quality

Methods of managing audio data transmissions over a network disclosed herein may include the step of selecting a client device from a plurality of client devices as a participating device, each client device of the plurality of client devices being in data communication with a network. The methods may include the step of signaling the participating device over said network thereby initiating transmitting of audio data from the participating device at least in part over said network for live broadcasting, the audio data being indicative of a speaking voice being input into a participating device microphone of the participating device. The methods may include the step of minimizing latency in transmitting of the audio data by throttling data being communicated over said network by one or more client devices of the plurality of client devices only while the participating device is transmitting audio data over said network.

Methods, systems, and media for identifying a plurality of sets of coordinates for a plurality of devices
11716569 · 2023-08-01

Methods, systems, and media for identifying a plurality of sets of coordinates for a plurality of devices are provided. In some embodiments, the method comprises: identifying each device in a plurality of devices associated with a user account; instructing the plurality of devices to perform an audio sequence; receiving a plurality of transit times from the plurality of devices; determining a plurality of distances based on the plurality of transit times; determining a plurality of sets of coordinates based on the plurality of distances; associating to each of the plurality of devices a corresponding unique one of the plurality of sets of coordinates; and causing at least one of the plurality of devices to play spatial audio determined from the plurality of sets of coordinates.
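The step from transit times to distances to coordinates can be sketched for the simplest case of three devices in a plane. This is only an illustration under stated assumptions: the transit times are one-way, sound travels at 343 m/s, the first device is placed at the origin, and the second lies on the x-axis. The function names are hypothetical, and the abstract does not say how coordinates are actually computed.

```python
import math
from typing import List, Tuple

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees Celsius


def transit_time_to_distance(transit_time_s: float) -> float:
    """Distance covered by sound during a one-way transit time."""
    return SPEED_OF_SOUND_M_PER_S * transit_time_s


def locate_three_devices(t01: float, t02: float, t12: float) -> List[Tuple[float, float]]:
    """Return 2D coordinates for devices 0, 1 and 2 from pairwise transit times.

    Device 0 is fixed at the origin and device 1 on the positive x-axis;
    device 2 follows from the law of cosines (mirror ambiguity resolved to y >= 0).
    """
    d01, d02, d12 = (transit_time_to_distance(t) for t in (t01, t02, t12))
    if d01 <= 0.0:
        raise ValueError("devices 0 and 1 must be a positive distance apart")
    x2 = (d01 ** 2 + d02 ** 2 - d12 ** 2) / (2.0 * d01)
    # Clamp at zero so measurement noise cannot produce a negative radicand.
    y2 = math.sqrt(max(d02 ** 2 - x2 ** 2, 0.0))
    return [(0.0, 0.0), (d01, 0.0), (x2, y2)]
```

With more than three devices, or in three dimensions, a least-squares or multidimensional-scaling approach would typically replace this closed-form construction.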

Dynamic Player Selection for Audio Signal Processing
20230215452 · 2023-07-06

In one aspect, a first playback device is configured to (i) receive a set of voice signals, (ii) process the set of voice signals using a first set of audio processing algorithms, (iii) identify, from the set of voice signals, at least two voice signals that are to be further processed, (iv) determine that the first playback device does not have a threshold amount of computational power available, (v) receive an indication of an available amount of computational power of a second playback device, (vi) send the at least two voice signals to the second playback device, (vii) cause the second playback device to process the at least two voice signals using a second set of audio processing algorithms, (viii) receive, from the second playback device, the processed at least two voice signals, and (ix) combine the processed at least two voice signals into a combined voice signal.

Crosstalk data detection method and electronic device
11551706 · 2023-01-10

A method and an electronic device for detecting crosstalk data are provided. The method for detecting crosstalk data can detect whether an audio data stream includes crosstalk data. The method includes: receiving a first audio data block, a second audio data block, and a reference time difference, wherein the first audio data block and the second audio data block separately include a plurality of audio data segments; using a time difference between an acquisition time of an audio data segment in the first audio data block and a corresponding audio data segment in the second audio data block as an audio segment time difference; and determining that the audio data segment of the first audio data block includes crosstalk data when the audio segment time difference does not match the reference time difference.
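The comparison described above can be sketched as follows. The abstract only says that a segment is flagged when its time difference "does not match" the reference; the tolerance used here to decide a match, the function name, and the representation of each segment by a single acquisition time in seconds are all assumptions made for this example.

```python
from typing import List


def find_crosstalk_segments(
    first_times: List[float],
    second_times: List[float],
    reference_diff: float,
    tolerance: float = 0.001,
) -> List[int]:
    """Return indices of segments in the first block flagged as crosstalk.

    A segment is flagged when (acquisition time in first block minus
    acquisition time in second block) differs from `reference_diff`
    by more than `tolerance` (an assumed matching rule).
    """
    flagged = []
    for index, (t_first, t_second) in enumerate(zip(first_times, second_times)):
        segment_diff = t_first - t_second
        if abs(segment_diff - reference_diff) > tolerance:
            flagged.append(index)
    return flagged
```

In this sketch, segments whose time difference stays within the tolerance of the reference are treated as matching and are not flagged.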

METHODS, SYSTEMS, AND MEDIA FOR IDENTIFYING A PLURALITY OF SETS OF COORDINATES FOR A PLURALITY OF DEVICES
20230217171 · 2023-07-06

Methods, systems, and media for identifying a plurality of sets of coordinates for a plurality of devices are provided. In some embodiments, the method comprises: identifying each device in a plurality of devices associated with a user account; instructing the plurality of devices to perform an audio sequence; receiving a plurality of transit times from the plurality of devices; determining a plurality of distances based on the plurality of transit times; determining a plurality of sets of coordinates based on the plurality of distances; associating to each of the plurality of devices a corresponding unique one of the plurality of sets of coordinates; and causing at least one of the plurality of devices to play spatial audio determined from the plurality of sets of coordinates.

Robust step-size control for multi-channel acoustic echo canceller
11539833 · 2022-12-27

A multi-channel acoustic echo cancellation (AEC) system includes a step-size controller that dynamically determines a step-size value for each channel and each tone index on a frame-by-frame basis. The system determines that near-end signals are present by calculating a scaled error and determining that the scaled error exceeds a threshold value. When the scaled error exceeds the threshold value, the system may switch from a first cost function to a second cost function and determine a step-size value using a robust algorithm. The robust algorithm may prevent the system from diverging due to the presence of the near-end signal. For example, the robust algorithm may select a different cost function to determine the step-size value and/or combine different step-size computations, resulting in the step-size value being temporarily reduced. Thus, the robust algorithm may enable the AEC to better model the near-end disturbance statistics while the near-end signal is present.
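One familiar way to obtain this behavior is a Huber-style rule: keep the nominal step size while the scaled error is small, and shrink it in proportion once the scaled error exceeds the threshold. The sketch below uses that rule as a stand-in; it is not the algorithm claimed in the patent, and the function name, parameters, and the way the error is scaled are assumptions.

```python
def robust_step_size(
    error: float,
    error_scale: float,
    base_step: float,
    threshold: float,
) -> float:
    """Return a step size for one channel and one tone index in one frame.

    Below the threshold the nominal step is used (quadratic-cost regime).
    Above it, the step is reduced by threshold / scaled_error, which is the
    effective step that results from a Huber-style linear cost.
    """
    if error_scale <= 0.0:
        raise ValueError("error_scale must be positive")
    scaled_error = abs(error) / error_scale
    if scaled_error <= threshold:
        return base_step
    # Near-end signal presumed present: temporarily reduce the step size.
    return base_step * threshold / scaled_error
```

In a full canceller this function would be evaluated per channel and per tone index each frame, with `error_scale` tracking a running estimate of the error magnitude.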

Information processing device, mixing device using the same, and latency reduction method

An information processing device includes a first time-frequency converter configured to perform a time-frequency conversion with respect to an input signal, using a window function having a first width, a second time-frequency converter configured to perform a time-frequency conversion with respect to the input signal, using a second window function having a second width smaller than the first width, and a modification processing unit configured to modify an output of the second time-frequency converter, using a frequency analysis result based on an output of the first time-frequency converter.

Communication system for multiple acoustic zones

A communication system supports communication paths within an environment by receiving speech signals of a speaker and playing them back for one or more listeners. Signal processing tasks are split into a microphone related part and into a loudspeaker related part. A sound processing system suitable for use in an environment having multiple acoustic zones includes a plurality of microphone communication instances coupled and a plurality of loudspeaker instances.

Method, Systems and Apparatus for Hybrid Near/Far Virtualization for Enhanced Consumer Surround Sound

Embodiments are disclosed for hybrid near/far-field speaker virtualization. In an embodiment, a method comprises: receiving a source signal including channel-based audio or audio objects; generating near-field gain(s) and far-field gain(s) based on the source signal and a blending mode; generating a far-field signal based, at least in part, on the source signal and the far-field gain(s); rendering, using a speaker virtualizer, the far-field signal for playback of far-field acoustic audio through far-field speakers into an audio reproduction environment; generating a near-field signal based at least in part on the source signal and the near-field gain(s); prior to providing the far-field signal to the far-field speakers, sending the near-field signal to a near-field playback device or an intermediate device coupled to the near-field playback device; providing the far-field signal to the far-field speakers; and providing the near-field signal to the near-field speakers to synchronously overlay the far-field acoustic audio.
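The abstract does not say how the near-field and far-field gains are derived from the blending mode. Purely as an illustration, the sketch below assumes the blending mode reduces to a single value between 0 (all far-field) and 1 (all near-field) and uses an equal-power crossfade so that the combined power stays constant. The function names and the crossfade law are assumptions.

```python
import math
from typing import List, Tuple


def near_far_gains(blend: float) -> Tuple[float, float]:
    """Return (near_gain, far_gain) for an assumed blend value in [0, 1].

    Equal-power law: near_gain**2 + far_gain**2 == 1 for every blend value.
    """
    if not 0.0 <= blend <= 1.0:
        raise ValueError("blend must be between 0 and 1")
    angle = blend * math.pi / 2.0
    return math.sin(angle), math.cos(angle)


def split_signal(source: List[float], blend: float) -> Tuple[List[float], List[float]]:
    """Scale one source signal into a near-field signal and a far-field signal."""
    near_gain, far_gain = near_far_gains(blend)
    near_signal = [sample * near_gain for sample in source]
    far_signal = [sample * far_gain for sample in source]
    return near_signal, far_signal
```

The far-field signal would then go through the speaker virtualizer, while the near-field signal is sent ahead to the near-field playback device so that both can be played in sync.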