H04R2201/405

Drone assisted setup for building specific sound localization model
11581010 · 2023-02-14

Techniques and systems are described for generating and using a sound localization model. A described technique includes obtaining for a building a sound sensor map indicating locations of first and second sound sensor devices in respective first and second rooms of the building; causing an autonomous device to navigate to the first room and to emit, during a time window, sound patterns at one or more frequencies within the first room; receiving sound data including first and second sound data respectively from the first and second sound sensor devices that are observed during the time window; and generating and storing a sound localization model based on the sound sensor map, autonomous device location information, and the received sound data, the model being configured to compensate for how sound travels among rooms in at least a portion of the building such that an origin room of a sound source is identifiable.
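As an illustrative sketch only (the room names, calibration levels, and nearest-profile matching below are assumptions, not the patented model), a localization model of this kind might map an observed per-sensor level vector back to its origin room:

```python
import math

# Hypothetical calibration data: while the autonomous device emitted test
# tones in each room, each sound sensor recorded an average level (dB).
# Keys are emission rooms; values are readings at (sensor 1, sensor 2).
calibration = {
    "room1": (70.0, 52.0),   # loud at sensor 1, attenuated at sensor 2
    "room2": (55.0, 68.0),
}

def origin_room(observed, profiles):
    """Return the room whose calibration profile is closest (Euclidean
    distance) to the observed per-sensor level vector."""
    def dist(profile):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(observed, profile)))
    return min(profiles, key=lambda room: dist(profiles[room]))

print(origin_room((68.0, 54.0), calibration))  # -> room1
```

The nearest-profile lookup stands in for whatever statistical model the patent actually trains; the point is that calibration with a roaming emitter makes per-room attenuation separable.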

Automated transcript generation from multi-channel audio

Systems and methods are described for generating a transcript of a legal proceeding or other multi-speaker conversation or performance in real time or near-real time using multi-channel audio capture. Different speakers or participants in a conversation may each be assigned a separate microphone that is placed in proximity to the given speaker, where each audio channel includes audio captured by a different microphone. Filters may be applied to isolate each channel to include speech utterances of a different speaker, and these filtered channels of audio data may then be processed in parallel to generate speech-to-text results that are interleaved to form a generated transcript.
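The final interleaving step can be sketched as a timestamp-ordered merge of per-channel speech-to-text results; the speaker names, timestamps, and utterances here are illustrative assumptions, not output from the described system:

```python
import heapq

# Hypothetical per-channel speech-to-text results as (start_time_sec, text).
# In the described system each channel would come from a separate microphone
# and be transcribed in parallel; here the results are hard-coded.
channel_a = [(0.0, "Please state your name."), (9.5, "Thank you.")]
channel_b = [(4.2, "Jane Doe."), (12.0, "You're welcome.")]

def interleave(speakers):
    """Merge per-speaker (timestamp, text) streams, each already sorted by
    time, into one transcript ordered by utterance start time."""
    streams = [
        [(t, name, text) for t, text in utts] for name, utts in speakers.items()
    ]
    return [f"{name}: {text}" for _, name, text in heapq.merge(*streams)]

transcript = interleave({"Attorney": channel_a, "Witness": channel_b})
```

Because each channel is isolated to one speaker before transcription, attribution is free: the channel identity is the speaker label.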

ARRAY MICROPHONE SYSTEM AND METHOD OF ASSEMBLING THE SAME

Embodiments include a microphone assembly comprising an array microphone and a housing configured to support the array microphone and sized and shaped to be mountable in a drop ceiling in place of at least one of a plurality of ceiling tiles included in the drop ceiling. A front face of the housing includes a sound-permeable screen having a size and shape that is substantially similar to the at least one of the plurality of ceiling tiles. Embodiments also include an array microphone system comprising a plurality of microphones arranged, on a substrate, in a number of concentric, nested rings of varying sizes around a central point of the substrate. Each ring comprises a subset of the plurality of microphones positioned at predetermined intervals along a circumference of the ring.
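The concentric-ring geometry can be sketched as follows; the radii and per-ring microphone counts are illustrative values, not ones taken from the patent:

```python
import math

def ring_array(ring_radii, mics_per_ring):
    """Place microphones at even angular intervals along concentric rings
    centred on the substrate origin; returns a list of (x, y) positions."""
    positions = []
    for radius, count in zip(ring_radii, mics_per_ring):
        for k in range(count):
            theta = 2 * math.pi * k / count
            positions.append((radius * math.cos(theta), radius * math.sin(theta)))
    return positions

# e.g. three nested rings of 8 microphones each (radii in metres)
pts = ring_array([0.05, 0.10, 0.20], [8, 8, 8])
```

Nesting rings of different radii is a common way to keep beamwidth usable across a wide frequency band: small rings serve high frequencies, large rings low ones.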

Audio transducer system and audio transducer device of the same
11540045 · 2022-12-27

An audio transducer device includes an audio transducer, a controller and a direction adjusting mechanism. The audio transducer has a sound receiving surface formed with multiple sound collecting holes, and multiple microphones corresponding in position to the sound collecting holes. The controller is detachably mounted to an electronic device, and controls the microphones to cooperatively perform directional sound reception to obtain audio data. The direction adjusting mechanism interconnects an audio transducer shell and the controller such that the sound receiving surface can be rotated to a position where a normal direction thereof and an image capturing direction of the electronic device form a desired angle therebetween.

LINEAR DIFFERENTIAL DIRECTIONAL MICROPHONE ARRAY
20220408183 · 2022-12-22

Apparatus and method provided herein are directed to a linear differential directional microphone array (LDDMA), which takes into account the directionality of the array elements. The LDDMA may be designed by generating a steering vector for a linear array (LA) having preselected parameters including parameters δ, p, θ, N, and M, generating a constraint matrix based on the steering vector, reformulating the constraint matrix based on a microphone response matrix and a steering matrix, obtaining a beamformer by applying a minimum norm solution in terms of the constraint matrix, verifying a desired characteristic of the LA by calculating the beamformer for a desired direction, and constructing the LA based on the preselected parameters and the beamformer.
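A minimal sketch of the minimum-norm design step, assuming two omnidirectional elements (the full LDDMA additionally folds a microphone response matrix into the constraints to account for directional elements); the spacing, design frequency, and constraint angles are illustrative:

```python
import cmath

# First-order differential beamformer as the (here exact) minimum-norm
# solution of the constraint equations: unit response toward theta = 0
# and a null toward theta = pi. Parameters are illustrative.
c = 343.0       # speed of sound, m/s
delta = 0.01    # element spacing, m
f = 1000.0      # design frequency, Hz
omega = 2 * cmath.pi * f

def steering(theta):
    """Steering vector for a 2-element line array, phase-referenced
    to element 0."""
    return [1.0, cmath.exp(-1j * omega * delta * cmath.cos(theta) / c)]

# Constraint matrix rows: steering vectors at the constrained angles.
d0, dpi = steering(0.0), steering(cmath.pi)

# Two constraints, two elements: solve the 2x2 system
# [d0; dpi] h = [1; 0] directly by Cramer's rule.
det = d0[0] * dpi[1] - d0[1] * dpi[0]
h = [(1 * dpi[1] - 0 * d0[1]) / det, (d0[0] * 0 - dpi[0] * 1) / det]

def response(theta):
    """Beampattern magnitude |d(theta) . h|."""
    d = steering(theta)
    return abs(d[0] * h[0] + d[1] * h[1])
```

With more microphones than constraints, the system is underdetermined and the minimum-norm solution the abstract refers to picks the shortest filter satisfying the constraint matrix; here the 2x2 case makes that solution unique and lets the distortionless and null constraints be verified directly.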

Pattern-forming microphone array

Embodiments include a microphone array with a plurality of microphone elements comprising a first set of elements arranged along a first axis, comprising at least two microphone elements spaced apart by a first distance; a second set of elements arranged along the first axis, comprising at least two microphone elements spaced apart by a second, greater distance, such that the first set is nested within the second set; a third set of elements arranged along a second axis orthogonal to the first axis, comprising at least two microphone elements spaced apart by the second distance; and a fourth set of elements nested within the third set along the second axis, comprising at least two microphone elements spaced apart by the first distance, wherein each set includes a first cluster of microphone elements and a second cluster of microphone elements spaced apart by the respective distance for that set.
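A sketch of the described nested-pair geometry, with illustrative spacings (the coordinates and centring convention are assumptions, not taken from the patent):

```python
def pattern_array(d1, d2):
    """(x, y) coordinates for the four element sets described: pairs
    spaced d1 and d2 along the first (x) axis, and pairs spaced d2 and
    d1 along the orthogonal (y) axis, all centred on the origin."""
    half1, half2 = d1 / 2, d2 / 2
    return {
        "set1": [(-half1, 0.0), (half1, 0.0)],  # x axis, spacing d1
        "set2": [(-half2, 0.0), (half2, 0.0)],  # x axis, spacing d2 (set1 nests inside)
        "set3": [(0.0, -half2), (0.0, half2)],  # y axis, spacing d2
        "set4": [(0.0, -half1), (0.0, half1)],  # y axis, spacing d1 (nested in set3)
    }
```

Mirroring the same nested spacings on both axes gives the array matched beam behaviour in the two orthogonal directions.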

EARLOOP MICROPHONE

A headset implements an earloop microphone and includes a housing. An earloop of the headset secures the headset to an ear of a user. A first microphone is acoustically coupled to a first opening in the housing. A second microphone is acoustically coupled to a second opening in the housing. A third microphone is acoustically coupled to a third opening in the earloop.

Orientation-based playback device microphone selection
11516610 · 2022-11-29

Aspects of a multi-orientation playback device including at least one microphone array are discussed. A method may include determining an orientation of the playback device which includes at least one microphone array and determining at least one microphone training response for the playback device from a plurality of microphone training responses based on the orientation of the playback device. The at least one microphone array can detect a sound input, and the location information of a source of the sound input can be determined based on the at least one microphone training response and the detected sound input. Based on the location information of the source, the directional focus of the at least one microphone array can be adjusted, and the sound input can be captured based on the adjusted directional focus.

Emergency sound localization

Techniques for determining information associated with sounds detected in an environment based on audio data are discussed herein. Audio sensors of a vehicle may determine audio data associated with sounds from the environment. Sounds may be caused by objects in the environment such as emergency vehicles, construction zones, non-emergency vehicles, humans, audio speakers, nature, etc. A model may determine a classification of the audio data and/or a probability value representing a likelihood that sound in the audio data is associated with the classification. A direction of arrival may be determined based on receiving classification values from multiple audio sensors of the vehicle, and other actions can be performed or the vehicle can be controlled based on the direction of arrival.
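One simple way to fuse per-sensor classification values into a direction of arrival is a probability-weighted circular mean; this is a sketch assuming each sensor has a known bearing, not necessarily the method the patent claims:

```python
import math

# Hypothetical per-sensor outputs: the bearing each sensor faces (radians,
# vehicle frame) and the model's probability that its audio contains an
# emergency siren.
sensor_bearings = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
siren_probs = [0.1, 0.8, 0.6, 0.05]

def direction_of_arrival(bearings, probs):
    """Estimate a direction of arrival as the probability-weighted mean of
    sensor bearings (circular mean, so 359 deg and 1 deg average to 0)."""
    x = sum(p * math.cos(b) for b, p in zip(bearings, probs))
    y = sum(p * math.sin(b) for b, p in zip(bearings, probs))
    return math.atan2(y, x)
```

Averaging on the unit circle rather than on raw angles avoids the wrap-around artifact at 0/2π; a downstream planner could then use the fused bearing to, e.g., yield toward the siren.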

Asynchronous ad-hoc distributed microphone array processing in smart home applications using voice biometrics

Voice biometrics scoring is performed on received asynchronous audio outputs from microphones distributed at ad hoc locations to generate confidence scores that indicate a likelihood of an enrolled user speech utterance in the output, a subset of the outputs is selected based on the confidence scores, and the subset is spatially processed to provide audio output for voice application use. Alternatively, asynchronous spatially processed audio outputs and corresponding biometric identifiers are received from corresponding devices distributed at ad hoc locations, audio frames of the outputs are synchronized using the biometric identifiers, and the synchronized frames are coherently combined. Alternatively, uttered speech associated with respective ad hoc distributed devices is received and non-coherently combined to generate a final output of uttered speech. The uttered speech is recognized from respective spatially processed outputs generated by the respective devices using biometrics of talkers enrolled by the devices.
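The subset-selection step of the first variant can be sketched as follows; the device names, scores, and thresholding policy are illustrative assumptions:

```python
def select_channels(confidences, k=2, threshold=0.5):
    """Keep the k microphone channels whose voice-biometric confidence
    scores (likelihood that the enrolled user is speaking in that channel)
    are highest and above a threshold. The selected subset would then be
    passed to spatial processing."""
    ranked = sorted(
        ((score, ch) for ch, score in confidences.items() if score >= threshold),
        reverse=True,
    )
    return [ch for _, ch in ranked[:k]]

# e.g. five ad hoc devices around the home, scored against the enrolled user
scores = {"kitchen": 0.92, "hall": 0.35, "tv": 0.61, "desk": 0.80, "door": 0.10}
```

Gating on biometric confidence before spatial processing is what lets the ad hoc, unsynchronized array discard channels dominated by other talkers or noise.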