Patent classifications
G01S3/802
Acoustic sensing and alerting
A housing can be wearable by a user. At least one microphone in or on the housing can sense ambient audio and produce at least one time-domain audio signal. The housing can be passive, such that the sensed ambient audio does not include any sound emitted from the housing. A transformation circuit can transform the at least one time-domain audio signal to form at least one frequency-domain audio signal. An identification circuit can identify a spectral feature in the at least one frequency-domain audio signal. A tracking circuit can track a time evolution of the spectral feature. A determination circuit can determine from the tracked time evolution of the spectral feature that the spectral feature corresponds to an object moving toward the housing. An alert circuit can alert the user, in response to the determination circuit determining that the object is moving toward the housing.
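The transform, identify, track, and determine stages described above can be illustrated with a minimal Python sketch. The abstract does not say which spectral feature is used or how approach is decided, so everything here is an assumption: the feature is taken to be the strongest spectral peak per frame, and the hypothetical rule `is_approaching` flags an approach when that peak's magnitude grows by a chosen factor over the observation window. The names `track_peak`, `is_approaching`, and `min_growth` are illustrative and do not come from the patent.

```python
import numpy as np

def track_peak(signal, sample_rate, frame_len=1024, hop=512):
    """Return one (frequency_hz, magnitude) row per frame for the strongest peak."""
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    track = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Time domain -> frequency domain for this frame.
        spectrum = np.abs(np.fft.rfft(signal[start:start + frame_len] * window))
        k = int(np.argmax(spectrum))  # assumed spectral feature: strongest bin
        track.append((freqs[k], spectrum[k]))
    return np.array(track)

def is_approaching(track, min_growth=1.5):
    """Assumed rule: mean peak magnitude in the last third of the track
    exceeds min_growth times the mean in the first third."""
    n = len(track) // 3
    if n == 0:
        return False  # too few frames to compare
    early = track[:n, 1].mean()
    late = track[-n:, 1].mean()
    return bool(late > min_growth * early)
```

A practical detector would likely also use the frequency column of the track, since a Doppler-shifted tone from an approaching source sits above its rest frequency and drops as the source passes. The sketch only compares magnitudes to stay short.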
POSITION DETERMINATION SYSTEM HAVING A DECONVOLUTION DECODER
The present disclosure relates to an acoustic position determination system that includes a mobile communication device and at least one base transmitter unit. The mobile communication device is configured to transmit and receive acoustic signals. Relative movement between the mobile communication device and the base transmitter unit shifts the frequencies of the received signals through the Doppler effect. The mobile communication device is configured to compensate for Doppler frequency shifts in the received acoustic signals before performing a deconvolution decoding process. The mobile communication device is further configured to compensate for Doppler frequency shifts and perform the deconvolution decoding process on acoustic signals received over multiple signal transmission paths.
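As a rough illustration of Doppler compensation ahead of decoding, the Python sketch below undoes a known time-scaling of the received signal by resampling. It is an assumption-laden example, not the method of the disclosure: the Doppler factor is assumed to be already known (a real system would have to estimate it), the receiver is assumed to move directly toward a stationary transmitter, and the deconvolution decoding step is not shown. The names `doppler_factor` and `compensate_doppler` are hypothetical.

```python
import numpy as np

def doppler_factor(receiver_speed, sound_speed=343.0):
    """Frequency scaling for a receiver moving toward a stationary source:
    f_received = f_emitted * (1 + v / c)."""
    return 1.0 + receiver_speed / sound_speed

def compensate_doppler(received, alpha):
    """Undo a time-scaling r(t) = s(alpha * t) by evaluating r at t / alpha.

    Linear interpolation is used for brevity; it is an assumed simplification.
    """
    n_out = int(np.floor((len(received) - 1) * alpha)) + 1
    positions = np.arange(n_out) / alpha  # fractional sample positions in r
    return np.interp(positions, np.arange(len(received)), received)
```

With a factor of 1.01, a 1000 Hz tone received at about 1010 Hz is mapped back to about 1000 Hz, so a decoder matched to the emitted waveform sees the frequencies it expects.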
Multi-modal speech localization
Multi-modal speech localization is achieved using image data captured by one or more cameras, and audio data captured by a microphone array. Audio data captured by each microphone of the array is transformed to obtain a frequency domain representation that is discretized in a plurality of frequency intervals. Image data captured by each camera is used to determine a positioning of each human face. Input data is provided to a previously-trained, audio source localization classifier, including: the frequency domain representation of the audio data captured by each microphone, and the positioning of each human face captured by each camera in which the positioning of each human face represents a candidate audio source. An identified audio source is indicated by the classifier based on the input data that is estimated to be the human face from which the audio data originated.
VIDEO CONFERENCE SYSTEM, VIDEO CONFERENCE APPARATUS, AND VIDEO CONFERENCE METHOD
A video conference system, a video conference apparatus and a video conference method are provided. The video conference system includes a video conference apparatus and a display apparatus. The video conference apparatus includes an image detection device, a sound source detection device, and a processor. The image detection device obtains a conference image of a conference space. When the sound source detection device detects a sound generated by a sound source in the conference space, the sound source detection device outputs a positioning signal. The processor receives the positioning signal, and determines whether a real face image exists in a sub-image block of the conference image corresponding to the sound source according to the positioning signal to output the image signal. The display apparatus displays a close-up conference image including the real face image according to the image signal.
System and method for voice activity detection and generation of characteristics respective thereof
A system and method for analyzing sound signals within a predetermined space, including: analyzing a plurality of sound signals captured within a predetermined space via at least one sound sensor; generating a grid corresponding to the predetermined space based on the plurality of sound signals, wherein the grid is utilized to identify areas within the predetermined space as interest points; identifying, based on the interest point, at least one sound generating object within the grid based on the analysis of the plurality of sound signals; and identifying at least one characteristic of the plurality of sound signals.
Methods and systems for sound source locating
A method and system for locating a sound source are provided. The method may include detecting a sound signal of a sound by each of two audio sensors. The method may also include converting the sound signals detected by the two audio sensors from a time domain to a frequency domain. The method may further include determining a high frequency ratio of each of the sound signals in the frequency domain. The method may further include determining a direction of the sound source based on the high frequency ratios.
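The steps in this abstract can be sketched in Python as follows. The abstract does not define the high frequency ratio or the decision rule, so both are assumptions here: the ratio is taken to be the share of spectral energy at or above a cutoff, and the hypothetical `coarse_direction` rule points toward the sensor with the larger ratio, on the premise that high frequencies are attenuated more on the far side. The names `cutoff_hz` and `margin` are illustrative parameters.

```python
import numpy as np

def high_frequency_ratio(signal, sample_rate, cutoff_hz=2000.0):
    """Assumed definition: energy at or above cutoff_hz divided by total energy."""
    energy = np.abs(np.fft.rfft(signal)) ** 2  # time domain -> frequency domain
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = energy.sum()
    if total == 0.0:
        return 0.0  # silent input: no meaningful ratio
    return float(energy[freqs >= cutoff_hz].sum() / total)

def coarse_direction(left, right, sample_rate, margin=0.05):
    """Assumed rule: the sensor with the clearly larger ratio faces the source."""
    diff = (high_frequency_ratio(left, sample_rate)
            - high_frequency_ratio(right, sample_rate))
    if diff > margin:
        return "left"
    if diff < -margin:
        return "right"
    return "center"
```

The `margin` parameter keeps small ratio differences from flipping the decision.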
ACOUSTIC TRANSFER FUNCTION PERSONALIZATION USING SOUND SCENE ANALYSIS AND BEAMFORMING
An audio system for a wearable device dynamically updates acoustic transfer functions. The audio system is configured to estimate a direction of arrival (DoA) of each sound source detected by a microphone array relative to a position of the wearable device within a local area. The audio system may track the movement of each sound source. The audio system may form a beam in the direction of each sound source. The audio system may identify and classify each sound source based on the sound source properties. Based on the DoA estimates, the movement tracking, and the beamforming, the audio system generates or updates the acoustic transfer functions for the sound sources.
NARROWBAND DIRECTION OF ARRIVAL FOR FULL BAND BEAMFORMER
A system and method improve the performance of a hands-free voice user interface system while reducing computational complexity, specifically when estimating the location of the talker in order to steer a directional beam toward the active talker. A hands-free voice user interface system requires a clean signal to be streamed to the cloud for recognition. One way to improve the speech signal is to estimate where the talker is and steer a beam in that direction. To locate the talker, a direction of arrival (DoA) estimation algorithm is used. DoA estimation generally requires a noise-free and echo-free signal for optimal results, but it is computationally expensive to run audio pre-processing, such as acoustic echo cancellation, for each microphone in a microphone array. To reduce computational complexity, the system and method extract a certain range of frequencies and apply pre-processing only to the selected range. With a properly selected frequency range, DoA accuracy is not degraded while computational complexity is significantly reduced.
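The idea of estimating direction of arrival from a restricted frequency range can be illustrated with a short Python sketch for two microphones. It swaps in a band-limited GCC-PHAT delay estimate, which the abstract does not name, and it omits the echo cancellation and other pre-processing entirely. The function names, the band edges, and the far-field two-microphone geometry are all assumptions made for illustration.

```python
import numpy as np

def band_limited_lag(x1, x2, sample_rate, low_hz, high_hz, max_lag):
    """Delay of x2 relative to x1, in samples, using only bins in [low_hz, high_hz]."""
    n = len(x1)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    cross = np.conj(np.fft.rfft(x1)) * np.fft.rfft(x2)
    cross /= np.abs(cross) + 1e-12                    # PHAT weighting: keep phase only
    cross[(freqs < low_hz) | (freqs > high_hz)] = 0.0  # discard out-of-band bins
    cc = np.fft.irfft(cross, n=n)
    # Gather lags -max_lag..+max_lag from the circular correlation.
    window = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(window)) - max_lag

def lag_to_angle(lag, sample_rate, mic_spacing, sound_speed=343.0):
    """Far-field angle in degrees from broadside for a two-microphone pair."""
    sine = np.clip(sound_speed * lag / (sample_rate * mic_spacing), -1.0, 1.0)
    return float(np.degrees(np.arcsin(sine)))
```

Only the bins inside the chosen band contribute to the estimate, so any per-bin pre-processing placed ahead of this step would need to run on that band alone. Whether accuracy holds up depends on the band chosen and on the microphone spacing.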