H04R2430/23

Microphone Array Beamforming Control
20230215432 · 2023-07-06

Systems, apparatuses, and methods are described for controlling source tracking and delaying beamforming in a microphone array system. A source tracker may continuously determine a direction of an audio source. A source tracker controller may pause the source tracking while the user is likely to continue speaking to the system, and may resume the source tracking when the user ceases to speak to the system, or when one or more pause durations have been reached.
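The pause/resume behavior described above can be sketched as a small state machine. The class name, method names, and the single maximum-pause duration below are illustrative assumptions; the patent only describes pausing while speech is expected and resuming on speech end or pause timeout.

```python
class SourceTrackerController:
    """Minimal sketch of the pause/resume control logic (names assumed)."""

    def __init__(self, max_pause_s=2.0):
        self.max_pause_s = max_pause_s  # assumed single pause duration
        self.tracking = True
        self._paused_at = None

    def on_speech_likely_to_continue(self, now):
        # Pause source tracking so the beam stays on the current speaker.
        if self.tracking:
            self.tracking = False
            self._paused_at = now

    def on_speech_ended(self):
        # Resume tracking as soon as the user stops speaking.
        self.tracking = True
        self._paused_at = None

    def tick(self, now):
        # Resume tracking once the pause duration has been reached.
        if not self.tracking and now - self._paused_at >= self.max_pause_s:
            self.tracking = True
            self._paused_at = None
        return self.tracking
```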

Hearing device or system for evaluating and selecting an external audio source
11553285 · 2023-01-10

A hearing system comprises a hearing device worn on the head, or fully or partially implanted in the head, of a user, and external audio transmitters. The hearing system allows wireless communication to be established between the hearing device and the audio transmitters. The hearing device comprises microphones providing respective electric input signals; a beamformer filter providing a beamformed signal from the electric input signals; and an output unit. The hearing system further comprises a selector/mixer for selecting and possibly mixing one or more of the electric input signals or the beamformed signal and external electric signals from the audio transmitters, and providing a current input sound signal based thereon for presentation to the user. The selector/mixer is controlled by a source selection processor, which determines the source selection control signal based on a comparison of the beamformed signal and the external electric sound signals or processed versions thereof.
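One way the source selection processor's comparison could work is to pick the external stream whose content best matches the acoustic beamformed signal, falling back to the beamformer output otherwise. The normalized cross-correlation criterion and the threshold below are assumptions; the patent only specifies a comparison-based selection.

```python
import numpy as np

def select_source(beamformed, external_streams, corr_threshold=0.6):
    """Sketch of a source selection processor: choose the external stream
    most correlated with the beamformed signal (criterion assumed)."""
    best_idx, best_corr = None, corr_threshold
    b = (beamformed - beamformed.mean()) / (beamformed.std() + 1e-12)
    for i, ext in enumerate(external_streams):
        e = (ext - ext.mean()) / (ext.std() + 1e-12)
        corr = float(np.dot(b, e)) / len(b)  # normalized correlation
        if corr > best_corr:
            best_idx, best_corr = i, corr
    if best_idx is None:
        return "beamformer", beamformed
    return f"external:{best_idx}", external_streams[best_idx]
```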

System and method for a voice-controllable apparatus

In accordance with an embodiment, an apparatus includes a millimeter wave radar sensor system configured to detect a location of a body of a person, where the detected location of the body of the person defines a direction of the person relative to the apparatus; and a microphone system configured to generate at least one audio beam as a function at least of the direction.
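Generating an audio beam as a function of an externally sensed direction can be sketched with far-field delay-and-sum steering for a linear microphone array. The array geometry, integer-sample delays, and function names are assumptions; the patent does not fix the beamforming method.

```python
import numpy as np

def steer_delays(mic_positions_m, angle_deg, c=343.0):
    """Per-microphone delays so sound from `angle_deg` adds in phase
    (far-field, linear-array assumption)."""
    theta = np.deg2rad(angle_deg)
    return mic_positions_m * np.sin(theta) / c

def delay_and_sum(signals, delays_s, fs):
    """Apply integer-sample delays and average the aligned channels."""
    shifts = np.round(delays_s * fs).astype(int)
    shifts -= shifts.min()
    n = signals.shape[1] - shifts.max()
    aligned = np.stack([sig[s:s + n] for sig, s in zip(signals, shifts)])
    return aligned.mean(axis=0)
```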

Multi-stream target-speech detection and channel fusion

Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal, target-speech detection logic, and an automatic speech recognition engine or VoIP application. An audio processing device includes: a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams; a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines, each configured to determine a probability of detecting a specific target speech of interest in its stream, the generator being further configured to determine a plurality of weights associated with the enhanced target streams; and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
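The weighting-and-fusion step can be sketched as normalizing the per-stream detection probabilities into weights and taking a weighted sum of the enhanced streams. The normalization rule is an assumption; the abstract does not specify how probabilities map to weights.

```python
import numpy as np

def fuse_streams(enhanced_streams, detection_probs):
    """Sketch of channel fusion: probability-derived weights applied to
    enhanced target streams (weighting rule assumed)."""
    p = np.asarray(detection_probs, dtype=float)
    # Normalize probabilities into weights summing to one.
    weights = p / p.sum() if p.sum() > 0 else np.full(len(p), 1.0 / len(p))
    streams = np.asarray(enhanced_streams, dtype=float)
    # Weighted sum across streams yields the fused output signal.
    return weights, np.tensordot(weights, streams, axes=1)
```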

Electronic Apparatus and Controlling Method Thereof
20230005477 · 2023-01-05

An electronic apparatus includes: a memory storing a first threshold value and a second threshold value corresponding to a receiving direction of a wake-up word; a sound receiver comprising sound receiving circuitry; and a processor configured to: identify a receiving direction of a sound received through the sound receiver; based on a similarity between sound data obtained from the received sound and the wake-up word being greater than or equal to the first threshold value corresponding to the identified receiving direction, perform voice recognition on a subsequent sound received through the sound receiver; and, based on the similarity being less than the first threshold value and greater than or equal to the second threshold value, change the first threshold value.
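The dual-threshold logic can be sketched as follows. The direction-keyed threshold table, the decision labels, and the specific adaptation step (lowering the first threshold on a near-miss) are assumptions; the abstract only states that the first threshold is changed.

```python
def evaluate_wake_word(similarity, thresholds, direction):
    """Sketch of the per-direction two-threshold wake-word check.
    `thresholds` maps a direction to (first, second); names assumed."""
    first, second = thresholds[direction]
    if similarity >= first:
        # Confident match: recognize subsequent speech.
        return "recognize"
    if similarity >= second:
        # Near-miss: adapt the first threshold for this direction
        # (direction of change is an assumption).
        thresholds[direction] = (max(second, first - 0.05), second)
        return "adapt"
    return "ignore"
```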

Head-tracked spatial audio

Spatial filters are generated that map the response of an audio capture device to head-related transfer functions (HRTFs) for different positions of the audio capture device relative to the HRTFs. A current set of spatial filters is determined based on the plurality of spatial filters and a head position of a user. The microphone signals are convolved with the current set of spatial filters, resulting in a left audio channel and a right audio channel that form output binaural audio channels. The binaural audio channels can be used to drive the speakers of a headphone set to generate sound that is perceived to have a spatial quality. Other aspects are described and claimed.
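The convolution step can be sketched as filtering each microphone signal with the left/right spatial filter selected for the current head position and summing the results into two binaural channels. The filter layout (microphones × 2 ears × taps) is an assumed convention.

```python
import numpy as np

def render_binaural(mic_signals, spatial_filters):
    """Sketch: convolve each mic signal with its left/right spatial
    filter and sum into binaural channels (filter layout assumed)."""
    n = mic_signals.shape[1] + spatial_filters.shape[2] - 1
    left = np.zeros(n)
    right = np.zeros(n)
    for sig, (h_l, h_r) in zip(mic_signals, spatial_filters):
        left += np.convolve(sig, h_l)   # full convolution per mic
        right += np.convolve(sig, h_r)
    return left, right
```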

Speech-Tracking Listening Device
20220417679 · 2022-12-29

A system (20) includes a plurality of microphones (22) configured to generate different respective signals in response to acoustic waves (36) arriving at the microphones, and a processor (34). The processor is configured to receive the signals and combine them into multiple channels, each channel corresponding to a different respective direction relative to the microphones by representing any portion of the acoustic waves arriving from that direction with greater weight, relative to the other directions. The processor calculates respective energy measures of the channels, selects one of the directions in response to the energy measure of the corresponding channel passing one or more energy thresholds, and outputs a combined signal representing the selected direction with greater weight, relative to the other directions. Other embodiments are also described.
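The energy-based selection can be sketched as computing a mean-square energy per directional channel and switching to the loudest direction only when its energy passes the threshold. The single threshold and the fallback to the current direction are simplifying assumptions.

```python
import numpy as np

def select_direction(channels, threshold, current=None):
    """Sketch: pick the directional channel with the highest energy,
    switching only if it passes the threshold (fallback assumed)."""
    energies = np.array([float(np.mean(np.square(ch))) for ch in channels])
    best = int(np.argmax(energies))
    if energies[best] >= threshold:
        return best, energies
    return current, energies  # keep the previously selected direction
```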

Sound Source Localization with Co-Located Sensor Elements

A system includes a plurality of acoustic sensor elements co-located with one another, each acoustic sensor element of the plurality of acoustic sensor elements being configured to generate a signal representative of sound incident upon the plurality of acoustic sensor elements, and a processor configured to determine data indicative of a location of a source of the sound based on the signals representative of the incident sound. The plurality of acoustic sensor elements include a directional acoustic sensor element configured to generate a signal representative of a directional component of the sound.

Matching Active Speaker Pose Between Two Cameras
20220408015 · 2022-12-22

Described are multiple cameras in a conference room, each pointed in a different direction. A primary camera includes a microphone array to perform sound source localization (SSL). The SSL is used in combination with a video image to identify the speaker from among multiple individuals that appear in the video image. Pose information of the speaker is developed. Pose information of each individual identified in each other camera is developed. The speaker pose information is compared to the pose information of the individuals from the other cameras. The best match for each other camera is selected as the speaker in that camera. The speaker views of each camera are compared to determine the speaker view with the most frontal view of the speaker. That camera is selected to provide the video for provision to the far end.
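The cross-camera pose comparison can be sketched as picking, for each other camera, the detected individual whose pose vector is most similar to the SSL-identified speaker's pose. Cosine similarity over flattened keypoint vectors is an assumed metric; the patent does not specify one.

```python
import numpy as np

def match_speaker_pose(speaker_pose, candidate_poses):
    """Sketch: find the candidate pose most similar to the speaker's
    pose via cosine similarity (metric assumed)."""
    s = speaker_pose / (np.linalg.norm(speaker_pose) + 1e-12)
    scores = [float(np.dot(s, c / (np.linalg.norm(c) + 1e-12)))
              for c in candidate_poses]
    best = int(np.argmax(scores))
    return best, scores[best]
```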

Wearer identification based on personalized acoustic transfer functions

A wearable device includes an audio system. In one embodiment, the audio system includes a sensor array that includes a plurality of acoustic sensors. When a user wears the wearable device, the audio system determines an acoustic transfer function for the user based upon detected sounds within a local area surrounding the sensor array. Because the acoustic transfer function is based upon the size, shape, and density of the user's body (e.g., the user's head), different acoustic transfer functions will be determined for different users. The determined acoustic transfer functions are compared with stored acoustic transfer functions of known users in order to authenticate the user of the wearable device.
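The authentication step can be sketched as a nearest-neighbor comparison between the measured acoustic transfer function and the stored ATFs of known users, accepting the closest match only if it is near enough. Euclidean distance and the acceptance threshold are assumptions.

```python
import numpy as np

def authenticate(measured_atf, enrolled_atfs, max_distance=0.5):
    """Sketch: match a measured ATF against enrolled users' ATFs
    (distance metric and threshold assumed)."""
    best_user, best_dist = None, float("inf")
    for user, atf in enrolled_atfs.items():
        dist = float(np.linalg.norm(measured_atf - atf))
        if dist < best_dist:
            best_user, best_dist = user, dist
    # Reject if even the closest enrolled ATF is too far away.
    return best_user if best_dist <= max_distance else None
```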