H04R2227/009

Characterizing environment using ultrasound pilot tones

A voice-activated computing device is configured to transmit a pilot tone and then capture or receive a signal, which corresponds to the pilot tone, reflected from within the environment containing the voice-activated computing device. The voice-activated computing device, or some other computing system or device, analyzes the received signal in order to determine one or more characteristics present within the signal, such as noise or echo. Based upon the analysis, models for signal processing can be determined, selected and/or altered. Future signals received by the voice-activated computing device can be processed with such models. The analysis can also allow for the models to be dynamically updated and for models to be dynamically created.

System and method for distributed call processing and audio reinforcement in conferencing environments

Systems, apparatus, and methods for processing audio signals associated with conferencing devices communicatively connected in a daisy-chain configuration using local connection ports included on each device are provided. One method involving a first conferencing device comprises receiving auxiliary mixed microphone signal(s) from at least one other conferencing device via at least one local connection port, each auxiliary signal comprising a mix of microphone signals captured by the at least one other conferencing device; determining a gain adjustment value for each auxiliary mixed microphone signal based on a daisy-chain position of the at least one other conferencing device relative to the position of the first conferencing device; adjusting a gain value for each auxiliary mixed microphone signal based on the corresponding gain adjustment value; generating a loudspeaker output signal from the gain-adjusted auxiliary mixed microphone signal(s); and providing the loudspeaker signal to the loudspeaker of the first conferencing device.
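
The position-dependent gain step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: the abstract does not say how the gain adjustment value is derived from daisy-chain position, so the fixed per-hop value `DB_PER_HOP` and the function names `gain_adjustment_db` and `loudspeaker_output` are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumed gain change per daisy-chain hop, in dB. The abstract gives no value
# or sign; this constant exists only to make the sketch concrete.
DB_PER_HOP = -3.0


def gain_adjustment_db(own_position: int, other_position: int) -> float:
    """Gain adjustment (dB) from the hop distance between two chained devices."""
    hops = abs(own_position - other_position)
    return DB_PER_HOP * hops


def loudspeaker_output(
    own_position: int, aux_signals: Dict[int, Sequence[float]]
) -> List[float]:
    """Sum the gain-adjusted auxiliary mixed microphone signals.

    aux_signals maps each other device's chain position to its mixed
    microphone signal. All signals are assumed to have the same length.
    """
    if not aux_signals:
        return []
    length = len(next(iter(aux_signals.values())))
    out = [0.0] * length
    for position, signal in aux_signals.items():
        # Convert the dB adjustment to a linear amplitude factor.
        gain = 10.0 ** (gain_adjustment_db(own_position, position) / 20.0)
        for i, sample in enumerate(signal):
            out[i] += gain * sample
    return out
```

For example, `loudspeaker_output(0, {1: mix_a, 2: mix_b})` would scale the signal from the adjacent device by one hop's worth of gain and the next device's signal by two hops' worth before summing them.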

Audio device with dynamically responsive volume

Described herein is an audio device with a microphone which may adapt the audio output volume of a speaker by either increasing or decreasing output volume based on an audio input volume from a user and a distance from the user to the audio device. The audio device may also adapt its output volume to lower the audio output based on detecting one or more interruptions including occupancy and acoustic sounds.

Noise mitigation using machine learning

This disclosure relates to solutions for eliminating undesired audio artifacts, such as background noises, on an audio channel. A process for implementing the technology can include receiving a set of audio segments, analyzing the segments using a first ML model to identify a first probability of unwanted background noises in the segments, and if the first probability exceeds a threshold, analyzing the segments using a second ML model to determine a second probability that the one or more background features exist in the segments. In some aspects, the process can include attenuating audio artifacts in the segments, if the second probability exceeds a second threshold. In some implementations, dynamic time stretching and shrinking can be applied to the noise attenuation. Systems and machine-readable media are also provided.
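
The two-stage gating described above can be sketched as follows. This is only an illustration of the control flow: the two models are passed in as plain callables returning a probability, and the threshold defaults, the whole-segment scaling used for attenuation, and the name `mitigate_noise` are assumptions not taken from the abstract.

```python
from typing import Callable, List, Sequence

# A "model" here is any callable that maps a segment to a probability in [0, 1].
Model = Callable[[Sequence[float]], float]


def mitigate_noise(
    segment: Sequence[float],
    first_model: Model,
    second_model: Model,
    first_threshold: float = 0.5,   # assumed value
    second_threshold: float = 0.5,  # assumed value
    attenuation: float = 0.1,       # assumed linear scale factor
) -> List[float]:
    """Attenuate a segment only if both models flag background noise."""
    # Stage 1: first-pass check; stop here if noise is unlikely.
    if first_model(segment) <= first_threshold:
        return list(segment)
    # Stage 2: the second model runs only when stage 1 exceeds its threshold.
    if second_model(segment) <= second_threshold:
        return list(segment)
    # Both thresholds exceeded: attenuate (here, by scaling every sample).
    return [attenuation * x for x in segment]
```

One consequence of this arrangement is that the second model is never invoked for segments the first model considers clean.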

Audio output control

Systems and methods for audio output control are disclosed. Audio may be output via a speaker of a communal device associated with a first portion of an environment. A user may provide a user utterance indicating an intent to add another device in a second portion of the environment to the audio-output session, and/or an intent to move the audio-output session from the first device to the second device, and/or an intent to remove a device from an audio-output session. Based on this determined intent, audio-session queues may be associated and dissociated from devices and device states may be altered to effectuate the intent of the user utterance.

Linear filtering for noise-suppressed speech detection
10847178 · 2020-11-24

Systems and methods for suppressing noise and detecting voice input in a multi-channel audio signal captured by a plurality of microphones include (i) capturing a first audio signal via a first microphone and a second audio signal via a second microphone, wherein the first and second audio signals respectively comprise first and second noise content from a noise source; (ii) identifying the first noise content in the first audio signal; (iii) using the identified first noise content to determine an estimated noise content captured by the plurality of microphones; (iv) using the estimated noise content to suppress the first and second noise content in the first and second audio signals; (v) combining the suppressed first and second audio signals into a third audio signal; and (vi) determining that the third audio signal includes a voice input comprising a wake word.

Electronic device and method for controlling the electronic device thereof
10825463 · 2020-11-03

An electronic device and a controlling method therefor are provided. The electronic device includes a speaker, a microphone, and an audio processor. The audio processor is configured to adjust the size of a signal of a predetermined frequency band in an input audio signal, to determine, based on the output level of the speaker, whether to adjust the size of the audio signal in which the frequency band was adjusted, and to output through the speaker the audio signal processed according to whether that adjustment was performed. On receiving, through the microphone, a sound signal that includes the output audio signal, the audio processor is further configured to perform acoustic echo cancellation on the sound signal using either the input audio signal or the size-adjusted audio signal, depending on whether the adjustment was performed.

Dynamic Player Selection for Audio Signal Processing
20200342888 · 2020-10-29

In one aspect, a first playback device is configured to (i) receive a set of voice signals, (ii) process the set of voice signals using a first set of audio processing algorithms, (iii) identify, from the set of voice signals, at least two voice signals that are to be further processed, (iv) determine that the first playback device does not have a threshold amount of computational power available, (v) receive an indication of an available amount of computational power of a second playback device, (vi) send the at least two voice signals to the second playback device, (vii) cause the second playback device to process the at least two voice signals using a second set of audio processing algorithms, (viii) receive, from the second playback device, the processed at least two voice signals, and (ix) combine the processed at least two voice signals into a combined voice signal.
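
The offloading decision in steps (iv) through (ix) can be sketched as follows. This is a simplified illustration, not the claimed implementation: the remote device is stood in for by a callable, the check that the second device itself meets the threshold is an added assumption, and combining by sample-wise averaging is also an assumption, since the abstract does not say how the signals are combined. All names are hypothetical.

```python
from typing import Callable, List, Sequence

Signal = Sequence[float]
Processor = Callable[[Signal], List[float]]


def process_selected_signals(
    selected: Sequence[Signal],
    local_power: float,
    remote_power: float,
    required_power: float,
    process_locally: Processor,
    process_remotely: Processor,
) -> List[float]:
    """Process the selected voice signals on one device, then combine them."""
    if local_power >= required_power:
        processed = [process_locally(s) for s in selected]
    elif remote_power >= required_power:
        # Stand-in for sending the signals to the second playback device
        # and receiving the processed results back.
        processed = [process_remotely(s) for s in selected]
    else:
        raise RuntimeError("no device has enough computational power available")
    # Combine by sample-wise averaging (assumed; signals must be equal length).
    return [sum(samples) / len(samples) for samples in zip(*processed)]
```

In a real system `process_remotely` would wrap a network round trip; keeping it as a callable makes the decision logic easy to exercise in isolation.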

Improvements in sound reproduction

A method, and system, of digital room correction for a device, such as a smart speaker, including a loudspeaker. The method comprises capturing audio from an environment local to the device, for example from one or more microphones of a smart speaker. The captured audio is then processed to recognize one or more categories of sound. A digital room correction procedure may then be controlled dependent upon recognition and/or analysis of at least one of the categories of sound.
