H04R1/406

MICROPHONE ARRAY SYSTEM WITH SOUND WIRE INTERFACE AND ELECTRONIC DEVICE
20230043176 · 2023-02-09

A microphone array system comprises N microphones, including a first microphone . . . an Nth microphone, wherein N is a natural number greater than 2. Each of the N microphones is provided with: an acoustic transducer for picking up a sound signal and converting the sound signal into an electric signal; a voice activation detector, connected to the corresponding acoustic transducer, and configured to perform voice activation detection on the electric signal and form an activation signal; a buffer memory, connected to the acoustic transducer, and configured to store a 1/N electric signal of a predetermined segment; and a sound wire interface, connected to the corresponding acoustic transducer, the buffer memory, and the voice activation detector, wherein the sound wire interface is connected to an external master chip via a sound wire bus for outputting the activation signal to the external master chip.
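
The per-microphone combination of a voice activation detector and a buffer memory can be sketched as below. All names and the energy-based trigger are illustrative assumptions, not the patent's implementation; the point is that each node buffers recent audio locally so the pre-trigger segment survives until the master chip reads it out.

```python
from collections import deque

class MicNode:
    """One microphone node (hypothetical sketch): a ring buffer of
    recent samples plus a simple energy-based voice activation
    detector that raises an activation signal."""
    def __init__(self, buffer_len=160, threshold=0.1):
        self.buffer = deque(maxlen=buffer_len)  # holds the node's share of the segment
        self.threshold = threshold

    def push(self, frame):
        """Buffer a frame; return True (the 'activation signal')
        when its mean energy crosses the VAD threshold."""
        self.buffer.extend(frame)
        energy = sum(x * x for x in frame) / len(frame)
        return energy > self.threshold
```

In a real device the activation decision would be more elaborate than an energy gate, and readout would go over the SoundWire bus rather than a Python attribute.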

Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition and voice activity detection functionality

Array microphone systems and methods that can automatically focus and/or place beamformed lobes in response to detected sound activity are provided. The automatic focus and/or placement of the beamformed lobes can be inhibited based on a remote far end audio signal. The quality of the coverage of audio sources in an environment may be improved by ensuring that beamformed lobes are optimally picking up the audio sources even if they have moved and changed locations.
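
The inhibition idea can be illustrated with a toy selection rule, assuming a per-direction activity map and a far-end activity flag (both invented here): the lobe re-aims toward the strongest local activity, but holds its current aim while the far end is talking, so it does not chase loudspeaker echo.

```python
def place_lobe(direction_energies, far_end_active, current_lobe):
    """Toy sketch of lobe auto-placement with far-end inhibition.
    direction_energies: dict mapping a candidate direction to its
    detected sound activity; far_end_active: remote talker present."""
    if far_end_active:
        return current_lobe  # inhibition: keep the current aim
    # otherwise steer toward the most active direction
    return max(direction_energies, key=direction_energies.get)
```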

Dynamically assigning multi-modality circumstantial data to assistant action requests for correlating with subsequent requests

Implementations set forth herein relate to an automated assistant that uses circumstantial condition data, generated based on circumstantial conditions of an input, to determine whether the input should affect an action that has been initialized by a particular user. The automated assistant can allow each user to manipulate their respective ongoing action without necessitating interruptions for soliciting explicit user authentication. For example, when an individual in a group of persons interacts with the automated assistant to initialize or affect a particular ongoing action, the automated assistant can generate data that correlates that individual to the particular ongoing action. The data can be generated using a variety of different input modalities, which can be dynamically selected based on changing circumstances of the individual. Therefore, different sets of input modalities can be processed each time a user provides an input for modifying an ongoing action and/or initializing another action.
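
The correlation mechanism can be sketched as a mapping from a circumstantial key (e.g. a voice embedding or seat position, represented here as an opaque string) to the ongoing action the same person started. The class and key names are invented for illustration only.

```python
class AssistantSessions:
    """Toy sketch: correlate circumstantial data with the ongoing
    action the same user initialized, so later inputs modify the
    right action without explicit re-authentication."""
    def __init__(self):
        self.actions = {}  # circumstantial key -> ongoing action

    def initialize(self, key, action):
        self.actions[key] = action

    def modify(self, key, change):
        """Apply a change only to the action correlated with this key."""
        if key in self.actions:
            self.actions[key] = self.actions[key] + ":" + change
            return True
        return False
```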

Method and system for speech enhancement

A method and a system for speech enhancement are provided, including a time synchronization unit configured to synchronize microphone signals sent from at least two microphones; a source separation unit configured to separate the synchronized microphone signals and output a separated speech signal, which corresponds to a speech source; and a noise reduction unit including a feature extraction unit configured to extract a speech feature of the separated speech signal and a neural network configured to receive the speech feature and output a clean speech feature.
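
The time-synchronization stage can be sketched with integer-lag cross-correlation (one common approach, not necessarily the patent's): estimate how many samples one microphone signal lags a reference, then shift it into alignment. Function names are illustrative.

```python
import numpy as np

def synchronize(ref, other):
    """Estimate the integer sample lag of `other` relative to `ref`
    by cross-correlation and shift `other` into alignment."""
    corr = np.correlate(other, ref, mode="full")
    lag = int(corr.argmax()) - (len(ref) - 1)  # samples `other` lags behind
    return np.roll(other, -lag)
```

A production system would handle fractional delays and drifting clocks; this only covers a fixed integer offset.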

VEHICLE AVATAR DEVICES FOR INTERACTIVE VIRTUAL ASSISTANT

A system and method for providing avatar device status indicators for voice assistants in multi-zone vehicles. The method comprises: receiving at least one signal from a plurality of microphones, wherein each microphone is associated with one of a plurality of spatial zones, and one of a plurality of avatar devices; wherein the at least one signal further comprises a speech signal component from a speaker; wherein the speech signal component is a voice command or question; sending zone information associated with the speaker and with one of the plurality of spatial zones to an avatar; and activating one of the plurality of avatar devices in a respective one of the plurality of spatial zones associated with the speaker.
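
The zone-to-avatar mapping can be illustrated with a minimal selection rule, assuming per-zone speech energies and a zone-to-device table (both invented names): the avatar in the speaker's zone is the one activated.

```python
def activate_avatar(zone_energies, avatars):
    """Toy sketch: pick the spatial zone whose microphone shows the
    strongest speech energy and return the avatar device mapped to it.
    zone_energies: dict zone -> energy; avatars: dict zone -> device."""
    speaker_zone = max(zone_energies, key=zone_energies.get)
    return avatars[speaker_zone]
```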

Deep multi-channel acoustic modeling using multiple microphone array geometries

Techniques for speech processing using a deep neural network (DNN) based acoustic model front-end are described. A new modeling approach directly models multi-channel audio data received from a microphone array using a first model (e.g., multi-geometry/multi-channel DNN) that is trained using a plurality of microphone array geometries. Thus, the first model may receive a variable number of microphone channels, generate multiple outputs using multiple microphone array geometries, and select the best output as a first feature vector that may be used similarly to beamformed features generated by an acoustic beamformer. A second model (e.g., feature extraction DNN) processes the first feature vector and transforms it to a second feature vector having a lower dimensional representation. A third model (e.g., classification DNN) processes the second feature vector to perform acoustic unit classification and generate text data. The DNN front-end enables improved performance despite a reduction in microphones.
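
The multi-geometry selection idea can be sketched with toy stand-ins for the DNNs: run each geometry-specific model over the available channels and keep the best output as the beamformed-like feature vector. The energy-based scoring below is a placeholder assumption; the patent only says the best output is selected, not how.

```python
import numpy as np

def front_end(channels, geometry_models):
    """Toy sketch of the multi-geometry front-end: `channels` is an
    array of shape (num_mics, frame_len); each model is a stand-in
    for a geometry-specific DNN producing a candidate feature vector."""
    outputs = [model(channels) for model in geometry_models]
    # select the 'best' candidate -- here, simply by output energy
    return max(outputs, key=lambda v: float(np.sum(v ** 2)))
```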

Terrestrial acoustic sensor array

A terrestrial acoustic sensor array for detecting and preventing airspace collision with an unmanned aerial vehicle (UAV) includes a plurality of ground-based acoustic sensor installations, each of the acoustic sensor installations including a sub-array of microphones. The terrestrial acoustic sensor array may further include a processor for detecting an aircraft based on sensor data collected from the microphones of at least one of the plurality of acoustic sensor installations and a network link for transmitting a signal based on the detection of the aircraft to a control system of the UAV.
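
A minimal detection-fusion sketch, assuming each installation's sub-array reduces to a single acoustic-energy score (the names and threshold rule are illustrative, not the patent's detector): a detection is declared when any installation exceeds the threshold, and the firing installations are reported so a signal can be sent toward the UAV's control system.

```python
def detect_aircraft(installation_levels, threshold=0.5):
    """Toy sketch: installation_levels maps an installation name to
    its sub-array's acoustic level; return (detected, which fired)."""
    hits = [name for name, level in installation_levels.items()
            if level > threshold]
    return len(hits) > 0, hits
```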

Detection and removal of wind noise
11594239 · 2023-02-28

An electronic device includes one or more microphones that generate audio signals and a wind noise detection subsystem. The electronic device may also include a wind noise reduction subsystem. The wind noise detection subsystem applies multiple wind noise detection techniques to the set of audio signals to generate corresponding indications of whether wind noise is present. The wind noise detection subsystem determines whether wind noise is present based on the indications generated by each detection technique and generates an overall indication of whether wind noise is present. The wind noise reduction subsystem applies one or more wind noise reduction techniques to the audio signal if wind noise is detected. The wind noise detection and reduction techniques may work in multiple domains (e.g., the time, spatial, and frequency domains).
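
The fusion step, in which the overall indication is derived from the per-technique indications, can be sketched as a simple majority vote. This is one plausible fusion rule chosen for illustration; the abstract does not commit to a specific one.

```python
def wind_noise_present(indications):
    """Toy sketch of the fusion step: each detection technique
    (e.g. time-, spatial-, and frequency-domain) contributes a
    boolean indication; the overall decision is a majority vote."""
    return sum(indications) > len(indications) / 2
```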

Audio signal processing for noise reduction

A headphone, headphone system, and speech enhancing method are provided to enhance speech pick-up from the user of a headphone. The method includes receiving a plurality of signals from a set of microphones and generating a primary signal by array processing the microphone signals to steer a beam toward the user's mouth. A noise reference signal is also derived from one or more microphones via a delay-and-sum technique, and a voice estimate signal is generated by filtering the primary signal to remove components that are correlated to the noise reference signal.
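
The delay-and-sum step referenced above can be sketched in a few lines, using integer steering delays for simplicity (real beamformers use fractional delays): each channel is advanced by its delay and the channels are averaged, so sound arriving with exactly those delays adds coherently.

```python
import numpy as np

def delay_and_sum(mic_signals, delays):
    """Minimal delay-and-sum beamformer sketch: `mic_signals` is a
    list of equal-length 1-D arrays, `delays` the per-mic steering
    delays in whole samples."""
    aligned = [np.roll(sig, -d) for sig, d in zip(mic_signals, delays)]
    return np.mean(aligned, axis=0)
```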

Discrete binaural spatialization of sound sources on two audio channels

Embodiments relate to binaural spatialization of more than two sound sources on two audio channels of an audio system. Sound signals each emitted from a corresponding sound source are collected, and a respective virtual position within an angular range of a sound scene is assigned to each sound source. Multi-source audio signals are generated by panning each sound signal according to the respective virtual position. A first multi-source audio signal is spatialized to a first direction to generate a first left signal and a first right signal. A second multi-source audio signal is spatialized to a second direction to generate a second left signal and a second right signal. A binaural signal is generated using the first left signal, the second left signal, the first right signal, and the second right signal. The binaural signal is such that each sound source appears to originate from its respective virtual position.
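
The panning step, which places each sound signal at its virtual position before spatialization, can be illustrated with a constant-power pan law (a standard choice assumed here for illustration; the abstract does not specify the pan law):

```python
import math

def pan(sample, angle, angular_range=90.0):
    """Constant-power pan sketch: place one source sample at `angle`
    degrees within the scene's angular range (0 = hard left,
    angular_range = hard right); returns (left, right)."""
    theta = (angle / angular_range) * (math.pi / 2)
    return sample * math.cos(theta), sample * math.sin(theta)
```

With a constant-power law, left² + right² stays equal to the source power at every position, so a source does not get louder or quieter as its virtual position moves across the scene.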