Patent classifications
G10L25/72
Embedded audio sensor system and methods
An embedded sensor can include an audio detector, a digital signal processor (DSP), a library, and a rules engine. The DSP can be configured to receive signals from the audio detector and to identify the environment in which the embedded sensor is located. The library can store statistical models associated with specific environments, and the DSP can be configured to identify specific events based on detected sounds within the particular environment by utilizing the statistical model associated with that environment. The DSP can associate a probability of accuracy with the identified audible event. The rules engine can be configured to receive the probability and transmit a report of the detected audible event.
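The flow in this abstract (pick the model for the current environment, score the sound, attach a probability, let a rules engine decide whether to report) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say what the statistical models are, so the sketch assumes one Gaussian per event over a single made-up loudness feature, takes the environment as already identified, and uses a hypothetical reporting threshold of 0.8. All names (`LIBRARY`, `identify_event`, `rules_engine`) are invented for the example.

```python
import math

# Hypothetical library: for each environment, a Gaussian (mean, std) per event
# over one invented feature (peak loudness in dB). Not taken from the patent.
LIBRARY = {
    "office": {"door_slam": (70.0, 5.0), "glass_break": (90.0, 4.0)},
    "street": {"car_horn": (95.0, 6.0), "glass_break": (88.0, 5.0)},
}


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def identify_event(environment, feature):
    """Return (event, probability) using only the given environment's models."""
    models = LIBRARY[environment]
    likelihoods = {e: gaussian_pdf(feature, m, s) for e, (m, s) in models.items()}
    total = sum(likelihoods.values())
    event = max(likelihoods, key=likelihoods.get)
    # Normalized likelihood stands in for the "probability of accuracy".
    return event, likelihoods[event] / total


def rules_engine(event, probability, threshold=0.8):
    """Assumed rule: report only when the probability clears the threshold."""
    if probability >= threshold:
        return {"event": event, "probability": round(probability, 3)}
    return None


event, prob = identify_event("office", 90.0)
report = rules_engine(event, prob)
```

With these invented numbers, a 90 dB sound in the office is confidently labeled `glass_break` and reported, while an 80 dB sound falls between the two models and is suppressed by the threshold.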
VOICE CONFERENCE APPARATUS, VOICE CONFERENCE SYSTEM AND VOICE CONFERENCE METHOD
A voice conference apparatus that provides a plurality of voice conferences via a network, the voice conference apparatus including: a voice conference section that transmits and receives a sound generated in each of the plurality of voice conferences to and from a plurality of user terminals used by a plurality of users participating in the voice conference; a sound analyzing section that analyzes the sound generated in each of the plurality of voice conferences; and a display control section that causes an administrator terminal used by an administrator administering the plurality of voice conferences to display a result of the analysis, by the sound analyzing section, of the sound generated in each of the plurality of voice conferences, in association with the voice conference.
Predicting acoustic features for geographic locations
The technology described in this document can be embodied in a computer-implemented method that includes receiving identification information associated with a geographic location. The identification information includes one or more features that affect an acoustic environment of the geographic location at a particular time. The method also includes determining one or more parameters representing at least a subset of the one or more features, and estimating at least one acoustic parameter that represents the acoustic environment of the geographic location at the particular time. The at least one parameter can be estimated using a mapping function that generates the estimate of the at least one acoustic parameter as a weighted combination of the one or more parameters. The method further includes presenting, using a user-interface displayed on a computing device, information representing the at least one acoustic parameter estimated for the geographic location for the particular time.
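The mapping function described here, an acoustic parameter estimated as a weighted combination of location parameters, can be illustrated with a short sketch. The feature names, weights, and bias below are hypothetical values chosen for the example; the abstract does not specify them or how the weights are obtained.

```python
def estimate_acoustic_parameter(parameters, weights, bias=0.0):
    """Weighted combination of parameters; both dicts must share the same keys."""
    if parameters.keys() != weights.keys():
        raise ValueError("parameters and weights must have the same keys")
    return bias + sum(weights[name] * parameters[name] for name in parameters)


# Hypothetical parameters for one location at one time of day.
parameters = {"occupancy": 50.0, "music_level": 6.0, "hard_surface_fraction": 0.5}
# Hypothetical weights; in practice these would presumably be fitted to data.
weights = {"occupancy": 0.2, "music_level": 0.5, "hard_surface_fraction": 10.0}

# Estimated noise level in dB, with an assumed 40 dB baseline.
estimated_db = estimate_acoustic_parameter(parameters, weights, bias=40.0)
```

Keeping the weights in a dictionary keyed by feature name makes a mismatch between the two inputs fail loudly rather than silently misalign values.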
Method and apparatus for camera activation
During operation, a first personal-area network will activate a first camera. The first camera may be manually activated or triggered by an audio signal. The event that causes the first camera to activate will also cause the personal-area network to send an acoustic signature to other personal-area networks. Personal-area networks that receive the acoustic signature will modify their audio triggers so that the acoustic signature can be better distinguished from other noises.
VOICE CAPTCHA
A method of Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) includes: recording, by a voice CAPTCHA module, speech spoken by a user; determining, by a voice biometric service (VBS), whether a voiceprint matching the user's speech exists; and if a voiceprint matching the user's speech exists, verifying the user as a human user by the VBS. If a voiceprint matching the user's speech does not exist, the VBS i) generates a unique voiceprint for the user based on the user's speech, and/or ii) determines whether the user's speech is at least one of synthetically generated speech and previously recorded audio being played back. The user can perform a guest checkout without logging into the voice CAPTCHA module, in which case the VBS compares previously used voiceprints to the user's speech.
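The decision flow in this abstract can be sketched as a small function. Several things here are assumptions rather than anything the abstract states: voiceprints are represented as plain embedding vectors, matching uses cosine similarity against a hypothetical threshold of 0.9, the spoof check (synthetic or replayed audio) is passed in as a precomputed boolean, and the sketch chooses to run the spoof check before enrolling a new voiceprint, which is only one reading of the "and/or" in the text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def voice_captcha(embedding, voiceprints, looks_spoofed, threshold=0.9):
    """Return (status, user_id) for one recorded utterance.

    voiceprints is a dict of user_id -> stored embedding and may be updated.
    looks_spoofed is the result of a separate, unspecified spoof detector.
    """
    # Case 1: a matching voiceprint exists, so the speaker is verified as human.
    for user_id, stored in voiceprints.items():
        if cosine_similarity(embedding, stored) >= threshold:
            return ("verified", user_id)
    # Case 2: no match. Reject synthetic or replayed audio (assumed ordering).
    if looks_spoofed:
        return ("rejected", None)
    # Case 3: no match and no sign of spoofing, so enroll a new voiceprint.
    new_id = f"user-{len(voiceprints) + 1}"
    voiceprints[new_id] = list(embedding)
    return ("enrolled", new_id)


store = {"user-1": [1.0, 0.0]}
first = voice_captcha([0.99, 0.05], store, looks_spoofed=False)
```

A real voice biometric service would derive the embedding from audio and run its own liveness detection; both are stubbed out here to keep the control flow visible.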
ACOUSTIC ECHO CANCELLATION WITH DELAY UNCERTAINTY AND DELAY CHANGE
An echo cancellation method includes receiving an echo reference signal, receiving a microphone signal, decomposing, with a first filter bank, the echo reference signal into a series of subband echo reference signals, decomposing, with a second filter bank, the microphone signal into a series of subband microphone signals, estimating a group delay between the echo reference signal and the microphone signal using the series of subband echo reference signals and the series of subband microphone signals, estimating, using adaptive filters, acoustic echoes in the echo reference signal based at least in part on the group delay, subtracting the acoustic echoes from the series of subband microphone signals to obtain a series of acoustic echo removed subband signals, combining the series of acoustic echo removed subband signals into a single time domain echo removed signal, and sending the single time domain echo removed signal to a host operating system.
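The core idea, estimate the delay between the reference and the microphone signal, then let an adaptive filter model and subtract the echo, can be shown with a much simpler stand-in than the method claimed. The sketch below is a fullband, time-domain simplification: it skips the filter banks and subband processing entirely, estimates a bulk delay by cross-correlation instead of a subband group delay, and uses a normalized LMS (NLMS) filter as one common choice of adaptive filter. The function names, tap count, and step size are assumptions for illustration.

```python
import random


def estimate_delay(reference, mic, max_delay):
    """Return the lag in samples that maximizes reference/mic cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_delay + 1):
        score = sum(r * m for r, m in zip(reference, mic[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def nlms_cancel(reference, mic, delay, taps=4, mu=0.5, eps=1e-8):
    """Subtract an NLMS echo estimate from mic using the delay-aligned reference."""
    weights = [0.0] * taps
    residual = []
    for n in range(len(mic)):
        # Most recent `taps` reference samples, shifted back by the delay.
        x = [reference[n - delay - k] if n - delay - k >= 0 else 0.0
             for k in range(taps)]
        echo_estimate = sum(w * xi for w, xi in zip(weights, x))
        err = mic[n] - echo_estimate
        norm = sum(xi * xi for xi in x) + eps
        # Normalized LMS update: step scaled by the input energy.
        weights = [w + mu * err * xi / norm for w, xi in zip(weights, x)]
        residual.append(err)
    return residual


# Synthetic test signal: the mic hears only a delayed, filtered copy of the
# reference (no near-end speech, no noise).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(2000)]
mic = [
    (0.6 * reference[n - 5] if n >= 5 else 0.0)
    + (0.3 * reference[n - 6] if n >= 6 else 0.0)
    for n in range(len(reference))
]

delay = estimate_delay(reference, mic, max_delay=20)
cleaned = nlms_cancel(reference, mic, delay)
```

Aligning the reference by the estimated delay before adapting lets a short filter cover the echo path; without it, the filter would need enough taps to span the full delay.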