Patent classifications
G10L21/00
Device pairing via device to device contact
A system may include and/or involve a first device, a second device, and logic to effect pairing of the first and second devices upon detection of physical contact between the devices.
System and method for enhancing speech recognition accuracy using weighted grammars based on user profile including demographic, account, time and date information
Disclosed herein are systems, computer-implemented methods, and computer-readable media for enhancing speech recognition accuracy. The method includes dividing a system dialog turn into segments based on timing of probable user responses, generating a weighted grammar for each segment, exclusively activating the weighted grammar generated for a current segment of the dialog turn during the current segment of the dialog turn, and recognizing user speech received during the current segment using the activated weighted grammar generated for the current segment. The method can further include assigning a probability to each weighted grammar based on historical user responses and activating each weighted grammar based on the assigned probability. Weighted grammars can be generated based on a user profile. A weighted grammar can be generated for two or more segments. The weighted grammar is weighted based on a user profile that consists of information about a number called from, demographic information, account information, a time of day, and a date. Exclusively activating each weighted grammar can include a transition period blending the previously activated grammar and the grammar to be activated.
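The segment-by-segment activation described above can be illustrated with a minimal Python sketch. The `Segment` class, the phrase weights, and the scoring rule (acoustic score multiplied by grammar weight) are assumptions made for illustration and are not taken from the abstract; the transition-period blending is omitted.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float   # seconds from the start of the dialog turn
    end: float     # exclusive end time of the segment
    grammar: dict  # phrase -> weight (hypothetical values)


def active_grammar(segments, t):
    """Return the one grammar active at time t (exclusive activation)."""
    for seg in segments:
        if seg.start <= t < seg.end:
            return seg.grammar
    return {}


def recognize(candidates, grammar):
    """Pick the candidate maximizing acoustic score * grammar weight."""
    best, best_score = None, 0.0
    for phrase, acoustic in candidates.items():
        score = acoustic * grammar.get(phrase, 0.0)
        if score > best_score:
            best, best_score = phrase, score
    return best


# Hypothetical dialog turn: a yes/no answer is likely early, an account type later.
segments = [
    Segment(0.0, 2.0, {"yes": 0.7, "no": 0.3}),
    Segment(2.0, 5.0, {"checking": 0.6, "savings": 0.4}),
]
candidates = {"yes": 0.5, "savings": 0.6, "checking": 0.55}

early = recognize(candidates, active_grammar(segments, 1.0))
late = recognize(candidates, active_grammar(segments, 3.0))
```

The same acoustic candidates yield different results depending on which segment's grammar is active, which is the effect the abstract attributes to exclusive activation.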
Method and apparatus for recognizing speech by lip reading
A dictation device includes: an audio input device configured to receive a voice utterance including a plurality of words; a video input device configured to receive video of lip motion during the voice utterance; a memory portion; a controller configured according to instructions in the memory portion to generate first data packets including an audio stream representative of the voice utterance and a video stream representative of the lip motion; and a transceiver for sending the first data packets to a server end device and receiving second data packets including combined dictation based upon the audio stream and the video stream from the server end device. In the combined dictation, first dictation generated based upon the audio stream has been corrected by second dictation generated based upon the video stream.
Sound localization for user in motion
Methods, apparatus, and computer programs for simulating the source of sound are provided. One method includes operations for determining a location in space of the head of a user utilizing face recognition of images of the user. Further, the method includes an operation for determining a sound for two speakers, and an operation for determining an emanating location in space for the sound, each speaker being associated with one ear of the user. The acoustic signals for each speaker are established based on the location in space of the head, the sound, the emanating location in space, and the auditory characteristics of the user. In addition, the acoustic signals are transmitted to the two speakers. When the acoustic signals are played by the two speakers, the acoustic signals simulate that the sound originated at the emanating location in space.
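As a rough illustration of how per-ear signals might depend on head location and emanating location, the following Python sketch derives a delay and a gain for each ear from straight-line distances. The ear offset, the inverse-distance gain, and the function name are assumptions; head orientation and the user's auditory characteristics, which the abstract also uses, are not modeled here.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 degrees C
EAR_OFFSET = 0.09       # assumed distance from head centre to each ear, in metres


def ear_parameters(head, source):
    """Return {ear: (delay_seconds, gain)} for a virtual source.

    head and source are (x, y, z) positions in metres. The ears are
    assumed to lie on the x axis through the head centre.
    """
    hx, hy, hz = head
    params = {}
    for ear, dx in (("left", -EAR_OFFSET), ("right", EAR_OFFSET)):
        distance = math.dist((hx + dx, hy, hz), source)
        delay = distance / SPEED_OF_SOUND
        gain = 1.0 / max(distance, 0.1)  # clamp to avoid division by zero
        params[ear] = (delay, gain)
    return params


# Virtual source one metre to the right of the head centre.
params = ear_parameters((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

With the source to the user's right, the right ear receives the sound earlier and louder than the left, which is the cue that makes the sound appear to originate at that location.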
Audio user interaction recognition and context refinement
A system that tracks a social interaction between a plurality of participants includes a fixed beamformer that is adapted to output a first spatially filtered output and configured to receive a plurality of second spatially filtered outputs from a plurality of steerable beamformers. Each steerable beamformer outputs a respective one of the second spatially filtered outputs associated with a different one of the participants. The system also includes a processor capable of determining a similarity between the first spatially filtered output and each of the second spatially filtered outputs. The processor determines the social interaction between the participants based on the similarity between the first spatially filtered output and each of the second spatially filtered outputs.
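The similarity step can be sketched in Python as follows. Pearson correlation is used here only as one plausible similarity measure; the abstract does not name a measure, and the function names and sample signals are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-rate sample sequences."""
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0


def most_similar_participant(fixed_output, steerable_outputs):
    """Compare the fixed beamformer output with each steerable output.

    steerable_outputs maps a participant name to that participant's
    spatially filtered signal. Returns (best_name, all_scores).
    """
    scores = {
        name: correlation(fixed_output, signal)
        for name, signal in steerable_outputs.items()
    }
    return max(scores, key=scores.get), scores


fixed = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
steerable = {
    "alice": [2.0 * x for x in fixed],                    # same waveform, louder
    "bob": [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0],    # uncorrelated waveform
}
best, scores = most_similar_participant(fixed, steerable)
```

In this toy input the fixed output matches the signal associated with "alice", which a system could interpret as the fixed beam being directed at that participant.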
Method and apparatus for decoding speech/audio bitstream
A method and an apparatus for decoding a speech/audio bitstream are disclosed, where the method for decoding a speech/audio bitstream includes determining whether a current frame is a normal decoding frame or a redundancy decoding frame, obtaining a decoded parameter of the current frame by means of parsing when the current frame is a normal decoding frame or a redundancy decoding frame, performing post-processing on the decoded parameter of the current frame to obtain a post-processed decoded parameter of the current frame, and using the post-processed decoded parameter of the current frame to reconstruct a speech/audio signal.
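The decoding flow above (classify the frame, parse its parameters, post-process them, reconstruct the signal) can be sketched in Python. The frame layout, the `gain` and `excitation` parameters, and the gain-smoothing post-processing are hypothetical stand-ins; the abstract does not specify what the post-processing does.

```python
from enum import Enum


class FrameType(Enum):
    NORMAL = "normal"
    REDUNDANCY = "redundancy"


def post_process(params, prev_params, alpha=0.5):
    """Illustrative post-processing: smooth the gain against the previous frame."""
    out = dict(params)
    if prev_params is not None:
        out["gain"] = alpha * params["gain"] + (1 - alpha) * prev_params["gain"]
    return out


def reconstruct(params):
    """Toy reconstruction: scale the excitation samples by the gain."""
    return [params["gain"] * x for x in params["excitation"]]


def decode_frame(frame, prev_params=None):
    """Parse, post-process, and reconstruct one frame."""
    if frame["type"] not in (FrameType.NORMAL, FrameType.REDUNDANCY):
        raise ValueError("this sketch handles only normal and redundancy frames")
    params = dict(frame["payload"])  # stands in for bitstream parsing
    params = post_process(params, prev_params)
    return params, reconstruct(params)


frame1 = {"type": FrameType.NORMAL,
          "payload": {"gain": 1.0, "excitation": [1.0, -1.0]}}
frame2 = {"type": FrameType.REDUNDANCY,
          "payload": {"gain": 0.5, "excitation": [2.0, 2.0]}}

p1, samples1 = decode_frame(frame1)
p2, samples2 = decode_frame(frame2, prev_params=p1)
```

Both frame types pass through the same parse and post-process path, so a frame recovered from redundancy data is smoothed against its predecessor before reconstruction.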
Secure mechanism for mute alert
In one implementation, an apparatus includes an audio detection circuit, a central processor, and a switch. The audio detection circuit is configured to determine whether audio is present in an input signal and generate an audio presence indicator indicative of the audio. The central processor is configured to receive the audio presence indicator and a mute command. The central processor generates a switch command based on the mute command. The switch is configured to block the input signal from a digital signal processor in response to the switch command. The central processor generates a dynamic mute message that indicates audio is detected while a mute command is active.
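The control logic described above can be modeled in a few lines of Python. This is a software sketch of what the abstract presents as hardware components; the class name, the amplitude threshold, and the message text are assumptions.

```python
from dataclasses import dataclass


def audio_present(samples, threshold=0.01):
    """Stand-in for the audio detection circuit: simple amplitude threshold."""
    return any(abs(s) > threshold for s in samples)


@dataclass
class MuteController:
    """Stand-in for the central processor."""
    muted: bool = False

    def set_mute(self, muted):
        self.muted = muted

    def switch_command(self):
        # The switch blocks the input from the DSP while mute is active.
        return "block" if self.muted else "pass"

    def process(self, samples):
        """Return (switch_command, dynamic_mute_message_or_None)."""
        message = None
        if self.muted and audio_present(samples):
            message = "Audio detected while muted"
        return self.switch_command(), message


controller = MuteController()
controller.set_mute(True)
command, message = controller.process([0.0, 0.2, -0.1])
```

Because detection runs ahead of the switch, the alert can be raised even though the muted audio never reaches the digital signal processor.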
Routing natural language commands to the appropriate applications
A device is configured with multiple applications that each respond to various commands. The correct application to receive a natural language command is identified by consideration of how well the command matches functions of the application. A target application to receive the command may additionally be selected by consideration of which application is most likely to receive a command. The likelihood of an application to receive a command may be determined by considering context. The command may be a voice input that is analyzed by speech recognition technology to determine word strings representing possible commands. Thus, the selection of a target application to receive the command may be based on any or all of the word strings from the natural language input, a closeness of fit between the command and an application, and the likelihood an application is the target for the next incoming command.
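One way to combine the three signals named above (candidate word strings, closeness of fit, and likelihood of each application) is a simple product score, sketched here in Python. The keyword-overlap fit measure, the product rule, and all names and numbers are assumptions for illustration, not the method claimed in the abstract.

```python
def fit(text, keywords):
    """Closeness of fit: fraction of an application's keywords present in the text."""
    words = set(text.lower().split())
    return len(words & keywords) / len(keywords) if keywords else 0.0


def route(hypotheses, apps, context_prior):
    """Choose the (application, word string) pair with the highest combined score.

    hypotheses: list of (word_string, recognizer_confidence)
    apps: application name -> set of command keywords
    context_prior: application name -> likelihood of receiving the next command
    """
    best = (None, None, 0.0)
    for text, confidence in hypotheses:
        for app, keywords in apps.items():
            score = confidence * fit(text, keywords) * context_prior.get(app, 0.0)
            if score > best[2]:
                best = (app, text, score)
    return best


apps = {
    "music": {"play", "song", "pause"},
    "maps": {"navigate", "route", "home"},
}
hypotheses = [("play the song", 0.6), ("lay the song", 0.4)]
context_prior = {"music": 0.7, "maps": 0.3}

target_app, chosen_text, score = route(hypotheses, apps, context_prior)
```

Scoring every hypothesis against every application lets a lower-ranked recognition result win when it fits a likely application much better than the top-ranked one.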