G10L2021/02166

CAMERA-VIEW ACOUSTIC FENCE
20230053202 · 2023-02-16

The angle of a sound relative to the centerline of a microphone array is determined, along with the angle of the centerline of the camera field of view (FoV) and the angular width of the camera FoV. Knowing the sound's angle from the microphone array centerline, the angle of the camera FoV centerline, and the FoV width allows a determination of whether the sound originates inside the camera FoV. If so, the microphones are unmuted; if not, the microphones are muted. As the camera zooms or pans, the changes in camera FoV and centerline angle are recomputed and used with the sound angle, so that muting and unmuting occur automatically as the camera zoom and pan angles change.
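The angle test described above can be sketched in a few lines. This is an illustrative reconstruction, not the patented implementation; the angle convention (degrees from a shared reference with wrap-around at ±180°) and the function names are assumptions.

```python
import math

def sound_in_camera_fov(sound_angle_deg: float,
                        fov_center_deg: float,
                        fov_width_deg: float) -> bool:
    """Return True if the sound's direction falls inside the camera FoV.

    All angles are measured from a shared reference (e.g. the microphone
    array centerline), in degrees. Wrap-around at +/-180 degrees is
    handled so a FoV straddling the reference still works.
    """
    # Smallest signed difference between sound angle and FoV centerline.
    diff = (sound_angle_deg - fov_center_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov_width_deg / 2.0

def microphones_muted(sound_angle_deg: float,
                      fov_center_deg: float,
                      fov_width_deg: float) -> bool:
    """Mute the microphones unless the sound originates inside the FoV."""
    return not sound_in_camera_fov(sound_angle_deg, fov_center_deg, fov_width_deg)
```

As the camera pans or zooms, the caller would re-invoke `microphones_muted` with the updated FoV centerline and width, which is what makes the muting automatic.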

Wideband DOA Improvements for Fixed and Dynamic Beamformers
20230050677 · 2023-02-16

This disclosure describes an apparatus and method that improve Direction of Arrival (DOA) determinations. An embodiment of the apparatus includes a plurality of microphones coupled together as a microphone array used for beamforming, the microphones positioned at predetermined locations and producing audio signals used to form a directional pickup pattern; and a processor, memory, storage, and a power supply operably coupled to the microphone array, the processor configured to execute the following steps: processing an algorithm for a DOA determination, and supplemental processing that improves the DOA determination.
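The abstract does not name the DOA algorithm. As a point of reference, a standard wideband building block for array DOA is GCC-PHAT time-delay estimation between a microphone pair; the sketch below is that textbook method, offered only as an illustration of what "an algorithm for a DOA determination" might look like.

```python
import numpy as np

def gcc_phat_delay(sig_a: np.ndarray, sig_b: np.ndarray, fs: float) -> float:
    """Estimate the delay of sig_a relative to sig_b (seconds, positive
    if sig_a lags) using GCC-PHAT, a common wideband DOA front-end."""
    n = len(sig_a) + len(sig_b)
    A = np.fft.rfft(sig_a, n=n)
    B = np.fft.rfft(sig_b, n=n)
    cross = A * np.conj(B)
    cross /= np.abs(cross) + 1e-12          # PHAT weighting (whitening)
    corr = np.fft.irfft(cross, n=n)
    # Re-center so index n//2 corresponds to zero lag.
    centered = np.concatenate((corr[-(n // 2):], corr[:n // 2 + 1]))
    shift = int(np.argmax(centered))
    return (shift - n // 2) / fs

def doa_angle(delay_s: float, mic_spacing_m: float, c: float = 343.0) -> float:
    """Convert an inter-microphone delay into a broadside DOA angle (radians)
    for a two-element array with the given spacing."""
    return float(np.arcsin(np.clip(c * delay_s / mic_spacing_m, -1.0, 1.0)))
```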

CONTACT AND ACOUSTIC MICROPHONES FOR VOICE WAKE AND VOICE PROCESSING FOR AR/VR APPLICATIONS
20230050954 · 2023-02-16

A method to combine contact and acoustic microphones in a headset for voice wake and voice processing in immersive reality applications is provided. The method includes receiving, from a contact microphone, a first acoustic signal, determining a fidelity and a quality of the first acoustic signal, receiving, from an acoustic microphone, a second acoustic signal, and when the fidelity and quality of the first acoustic signal exceeds a pre-selected threshold, combining the first acoustic signal and the second acoustic signal to provide an enhanced acoustic signal to a smart glass user. A non-transitory, computer-readable medium storing instructions to cause a headset to perform the above method, and the headset, are also provided.
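The threshold-gated combination in the method above can be sketched as follows. The fidelity metric and the equal-weight mix are assumptions; the abstract specifies neither how fidelity is computed nor how the two signals are combined.

```python
import numpy as np

def fidelity_score(signal: np.ndarray) -> float:
    """Hypothetical fidelity metric: signal energy relative to an
    assumed noise-floor estimate (here, the median absolute sample)."""
    noise_floor = float(np.median(np.abs(signal))) + 1e-12
    return float(np.mean(signal ** 2)) / noise_floor

def combine_microphones(contact: np.ndarray,
                        acoustic: np.ndarray,
                        threshold: float = 1.0) -> np.ndarray:
    """Combine the contact-microphone signal with the acoustic-microphone
    signal only when the contact signal's fidelity exceeds the threshold;
    otherwise fall back to the acoustic microphone alone."""
    if fidelity_score(contact) > threshold:
        # Simple equal-weight mix as a stand-in for the enhancement step.
        return 0.5 * (contact + acoustic)
    return acoustic
```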

Automated clinical documentation system and method
11581077 · 2023-02-14

A method, computer program product, and computing system for proactive encounter scanning is executed on a computing device and includes obtaining encounter information of a patient encounter. The encounter information is proactively processed to determine whether it is indicative of one or more medical conditions and to generate one or more result sets. The one or more result sets are provided to the user.

MICROPHONE ARRAY SYSTEM WITH SOUND WIRE INTERFACE AND ELECTRONIC DEVICE
20230043176 · 2023-02-09

A microphone array system comprises N microphones, including a first microphone through an Nth microphone, wherein N is a natural number greater than 2. Each of the N microphones is provided with: an acoustic transducer for picking up a sound signal and converting it into an electric signal; a voice activation detector, connected to the corresponding acoustic transducer and configured to perform voice activation detection on the electric signal and form an activation signal; a buffer memory, connected to the acoustic transducer and configured to store a 1/N electric signal of a predetermined segment; and a sound wire interface, connected to the corresponding acoustic transducer, the buffer memory, and the voice activation detector, wherein the sound wire interface is connected to an external master chip via a sound wire bus for outputting the activation signal to the external master chip.
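The per-microphone pairing of a short buffer with a voice activation detector can be modeled roughly as below. The energy-based VAD, its threshold, and the buffer length are illustrative assumptions; the abstract does not specify the detection algorithm.

```python
import numpy as np
from collections import deque

class MicChannel:
    """One of the N microphones: transducer samples pass through a
    simple energy-based voice activation detector (VAD) and a small
    ring buffer holding the most recent segment."""

    def __init__(self, buffer_len: int = 256, vad_threshold: float = 0.01):
        self.buffer = deque(maxlen=buffer_len)   # per-channel segment store
        self.vad_threshold = vad_threshold

    def process(self, frame: np.ndarray) -> bool:
        """Store the frame and return the activation signal that the
        sound wire interface would forward to the external master chip."""
        self.buffer.extend(frame.tolist())
        return bool(float(np.mean(frame ** 2)) > self.vad_threshold)
```

A real implementation would also multiplex the buffered segment onto the shared bus once the master chip requests it; that transport step is omitted here.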

Method and System for Dereverberation of Speech Signals

A system and method for reverberation reduction is disclosed. A first Deep Neural Network (DNN) produces a first estimate of a target direct-path signal from a mixture of acoustic signals that include the target direct-path signal and a reverberation of the target direct-path signal. A filter modeling a room impulse response (RIR) for the first estimate is estimated. The filter when applied to the first estimate of the target direct-path signal generates a result closest to a residual between the mixture of the acoustic signals and the first estimate of the target direct-path signal according to a distance function. A mixture with reduced reverberation of the target direct-path signal is obtained by removing the result of applying the filter to the first estimate of the target direct-path signal from the received mixture. A second DNN produces a second estimate of the target direct-path signal from the mixture with reduced reverberation.
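The middle step of the pipeline, fitting a filter so that, applied to the first DNN's estimate, it best matches the residual, reduces to a least-squares FIR fit under a Euclidean distance. The sketch below shows that step with a plain convolution-matrix formulation; the two DNN stages are outside its scope, and the tap count is an assumption.

```python
import numpy as np

def _conv_matrix(estimate: np.ndarray, taps: int) -> np.ndarray:
    """Convolution matrix whose columns are delayed copies of the
    direct-path estimate (one column per filter tap)."""
    n = len(estimate)
    A = np.zeros((n, taps))
    for k in range(taps):
        A[k:, k] = estimate[:n - k]
    return A

def estimate_rir_filter(estimate: np.ndarray,
                        mixture: np.ndarray,
                        taps: int = 16) -> np.ndarray:
    """Least-squares FIR filter g such that g * estimate is closest to
    the residual (mixture - estimate), i.e. the reverberant tail."""
    residual = mixture - estimate
    A = _conv_matrix(estimate, taps)
    g, *_ = np.linalg.lstsq(A, residual, rcond=None)
    return g

def dereverberate(estimate: np.ndarray,
                  mixture: np.ndarray,
                  taps: int = 16) -> np.ndarray:
    """Remove the filtered estimate from the mixture, yielding the
    reduced-reverberation mixture fed to the second DNN."""
    g = estimate_rir_filter(estimate, mixture, taps)
    return mixture - _conv_matrix(estimate, taps) @ g
```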

Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition and voice activity detection functionality

Array microphone systems and methods that can automatically focus and/or place beamformed lobes in response to detected sound activity are provided. The automatic focus and/or placement of the beamformed lobes can be inhibited based on a remote far end audio signal. The quality of the coverage of audio sources in an environment may be improved by ensuring that beamformed lobes are optimally picking up the audio sources even if they have moved and changed locations.
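The inhibition behavior, suspending automatic focus and placement while the remote far end is active, can be gated as simply as below. Energy-based far-end activity detection and the threshold value are illustrative assumptions; the abstract does not say how far-end activity is judged.

```python
import numpy as np

def should_inhibit_autofocus(far_end_frame: np.ndarray,
                             energy_threshold: float = 1e-3) -> bool:
    """Inhibit automatic lobe focus/placement while the far end is
    active, so loudspeaker playback of the remote talker is not
    mistaken for a local audio source to steer toward."""
    return bool(float(np.mean(far_end_frame ** 2)) > energy_threshold)
```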

User voice control system

Embodiments include techniques and objects related to a wearable audio device that includes a microphone to detect a plurality of sounds in an environment in which the wearable audio device is located. The wearable audio device further includes a non-acoustic sensor to detect that a user of the wearable audio device is speaking. The wearable audio device further includes one or more processors communicatively coupled to the microphone and the non-acoustic sensor, the one or more processors to alter, based on an identification by the non-acoustic sensor that the user of the wearable audio device is speaking, one or more of the plurality of sounds to generate a sound output. Other embodiments may be described or claimed.

Deep multi-channel acoustic modeling using multiple microphone array geometries

Techniques for speech processing using a deep neural network (DNN) based acoustic model front-end are described. A new modeling approach directly models multi-channel audio data received from a microphone array using a first model (e.g., multi-geometry/multi-channel DNN) that is trained using a plurality of microphone array geometries. Thus, the first model may receive a variable number of microphone channels, generate multiple outputs using multiple microphone array geometries, and select the best output as a first feature vector that may be used similarly to beamformed features generated by an acoustic beamformer. A second model (e.g., feature extraction DNN) processes the first feature vector and transforms it to a second feature vector having a lower dimensional representation. A third model (e.g., classification DNN) processes the second feature vector to perform acoustic unit classification and generate text data. The DNN front-end enables improved performance despite a reduction in microphones.
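The selection step of the first model, running geometry-specific branches and keeping the best output, can be sketched as below. The placeholder models and the mean-activation score (standing in for a learned confidence) are assumptions; the real front-end is a trained multi-geometry DNN.

```python
import numpy as np

def select_best_geometry(multi_channel_features: np.ndarray,
                         geometry_models) -> np.ndarray:
    """Run each geometry-specific front-end and keep the output with
    the highest confidence score; the chosen output plays the role of
    the first feature vector passed to the feature-extraction stage."""
    outputs = [model(multi_channel_features) for model in geometry_models]
    scores = [float(np.mean(out)) for out in outputs]
    return outputs[int(np.argmax(scores))]
```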

Detection and removal of wind noise
11594239 · 2023-02-28

An electronic device includes one or more microphones that generate audio signals and a wind noise detection subsystem. The electronic device may also include a wind noise reduction subsystem. The wind noise detection subsystem applies multiple wind noise detection techniques to the set of audio signals to generate corresponding indications of whether wind noise is present. The wind noise detection subsystem determines whether wind noise is present based on the indications generated by each detection technique and generates an overall indication of whether wind noise is present. The wind noise reduction subsystem applies one or more wind noise reduction techniques to the audio signals if wind noise is detected. The wind noise detection and reduction techniques may work in multiple domains (e.g., the time, spatial, and frequency domains).
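The fusion of per-technique indications into an overall indication can be sketched with two hypothetical detectors, one time-domain and one spatial, combined by majority vote. Both cues (adjacent-sample correlation for low-frequency wind rumble, low inter-microphone coherence for turbulence local to each capsule), their thresholds, and the voting rule are illustrative assumptions.

```python
import numpy as np

def detect_wind_time_domain(frames: np.ndarray, threshold: float = 0.5) -> bool:
    """Hypothetical time-domain cue: wind energy is concentrated at low
    frequencies, which shows up as strongly correlated adjacent samples."""
    x = frames.ravel()
    corr = float(np.corrcoef(x[:-1], x[1:])[0, 1])
    return corr > threshold

def detect_wind_spatial(frames: np.ndarray, threshold: float = 0.3) -> bool:
    """Hypothetical spatial cue: wind turbulence is local to each
    microphone capsule, so inter-microphone coherence is low."""
    corr = float(np.corrcoef(frames[0], frames[1])[0, 1])
    return corr < threshold

def wind_noise_present(frames: np.ndarray, min_votes: int = 2) -> bool:
    """Fuse the per-technique indications into one overall indication,
    here by a simple vote count over a 2-channel frame matrix."""
    votes = [detect_wind_time_domain(frames), detect_wind_spatial(frames)]
    return sum(votes) >= min_votes
```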