Patent classification: G10L2015/221
PASSIVE DISAMBIGUATION OF ASSISTANT COMMANDS
Implementations set forth herein relate to an automated assistant that can initialize execution of an assistant command associated with an interpretation that is predicted to be responsive to a user input, while simultaneously providing suggestions for alternative assistant commands associated with alternative interpretations that are also predicted to be responsive to the user input. The suggested alternative assistant commands can be selectable such that, when selected, the automated assistant can pivot from executing the assistant command to initializing execution of the selected alternative assistant command. Further, the suggested alternative assistant commands can be partially fulfilled prior to any user selection thereof. Accordingly, implementations set forth herein can enable the automated assistant to quickly and efficiently pivot between assistant commands that are predicted to be responsive to the user input.
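The pivoting behavior described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the candidate interpretations, scores, and function names are all assumptions.

```python
# Hypothetical sketch of passive disambiguation: execute the top-ranked
# interpretation immediately while keeping alternatives warm, so the
# assistant can pivot cheaply if the user selects one.

def rank_interpretations(utterance):
    # Toy scorer: in a real assistant these would come from an NLU model.
    candidates = {
        "play beethoven": [("play_music", 0.9), ("play_movie", 0.6)],
    }
    return sorted(candidates.get(utterance, []), key=lambda c: -c[1])

def disambiguate(utterance):
    ranked = rank_interpretations(utterance)
    if not ranked:
        return None, []
    primary, *alternatives = ranked
    # Begin executing the primary command...
    executing = primary[0]
    # ...while surfacing the alternatives as selectable suggestions
    # (which could be partially fulfilled in the background).
    suggestions = [cmd for cmd, _ in alternatives]
    return executing, suggestions

def pivot(current_command, selected_alternative):
    # Abandon the current command and switch to the selected alternative.
    return selected_alternative
```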
Method for controlling the operation of an appliance by a user through voice control
A method for controlling operation of an appliance by a user through voice control includes at least the steps of: detecting, by the appliance, a control action performed by the user on the appliance; activating a voice control system by the appliance; capturing, by the voice control system, a voice input from the user as a captured voice input; recognizing, by the voice control system, a piece of information and/or an instruction in the captured voice input from the user as a recognized information and/or instruction; and executing, by the voice control system, a user control action on the appliance in accordance with the recognized information and/or instruction.
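The claimed sequence of steps can be sketched as a small state machine. The class, trigger action, and toy "recognizer" below are illustrative assumptions, not details from the patent.

```python
# Illustrative sketch of the claimed flow: a physical control action on
# the appliance activates its voice control system, which then captures
# speech, recognizes an instruction, and executes it.

class Appliance:
    def __init__(self):
        self.voice_control_active = False
        self.last_instruction = None

    def detect_control_action(self, action):
        # Steps 1-2: a user action on the appliance activates voice control.
        if action == "press_voice_button":
            self.voice_control_active = True

    def handle_voice(self, captured_voice_input):
        # Steps 3-5: capture, recognize, and execute the instruction.
        if not self.voice_control_active:
            return None
        instruction = captured_voice_input.strip().lower()  # toy recognizer
        self.last_instruction = instruction                 # "execute" it
        return instruction
```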
Methods and systems for correcting, based on speech, input generated using automatic speech recognition
Methods and systems for correcting, based on subsequent second speech, an error in an input generated from first speech using automatic speech recognition, without an explicit indication in the second speech that a user intended to correct the input with the second speech, include determining that a time difference between when search results in response to the input were displayed and when the second speech was received is less than a threshold time, and based on the determination, correcting the input based on the second speech. The methods and systems also include determining that a difference in acceleration of a user input device, used to input the first speech and second speech, between when the search results in response to the input were displayed and when the second speech was received is less than a threshold acceleration, and based on the determination, correcting the input based on the second speech.
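The two-threshold trigger described above can be sketched directly. The threshold values and parameter names are illustrative assumptions; the patent does not specify them.

```python
# Hedged sketch of the claimed correction trigger: second speech corrects
# the first input only when both the elapsed time since results were
# displayed and the change in the input device's acceleration stay under
# thresholds, implying the user is continuing the same interaction.

TIME_THRESHOLD_S = 5.0   # assumed value
ACCEL_THRESHOLD = 0.2    # assumed value, arbitrary units

def should_correct(results_shown_at, second_speech_at, accel_delta):
    within_time = (second_speech_at - results_shown_at) < TIME_THRESHOLD_S
    device_steady = abs(accel_delta) < ACCEL_THRESHOLD
    return within_time and device_steady

def correct_input(original, second_speech, results_shown_at,
                  second_speech_at, accel_delta):
    if should_correct(results_shown_at, second_speech_at, accel_delta):
        # Toy correction: replace the input with the re-spoken text.
        return second_speech
    return original
```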
USAGE OF VOICE RECOGNITION CONFIDENCE LEVELS IN A PASSENGER INTERFACE
A voice recognition system for an elevator system including: one or more microphones configured to capture a voice command from an individual and convert the voice command into an audio signal; a command arbitrator including one or more speech interpretation systems, the command arbitrator being configured to analyze the audio signal and determine an interpreted command for the elevator system from the audio signal using the one or more speech interpretation systems, wherein the interpreted command includes a confidence measure associated with the interpreted command, and wherein the confidence measure is an indicator of how confident the command arbitrator is that the interpreted command matches the voice command from the individual.
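A command arbitrator of this kind can be sketched with a simple string-similarity matcher standing in for the patent's speech interpretation systems; the command list and scoring method are assumptions for illustration only.

```python
# Minimal sketch: pick the closest known elevator command and report the
# match quality as the confidence measure attached to the interpretation.

import difflib

KNOWN_COMMANDS = ["floor one", "floor two", "door open", "door close"]

def arbitrate(audio_text):
    best, best_conf = None, 0.0
    for cmd in KNOWN_COMMANDS:
        conf = difflib.SequenceMatcher(None, audio_text.lower(), cmd).ratio()
        if conf > best_conf:
            best, best_conf = cmd, conf
    return {"command": best, "confidence": round(best_conf, 2)}
```

A downstream component could then require the confidence to exceed a threshold before moving the elevator, falling back to a confirmation prompt otherwise.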
Smart wearable electric devices and methods for communication
This disclosure generally relates to a wireless, rechargeable smartphone and communication device. The device can be controlled through voice and is embedded with software for a number of functionalities. The smartphone, named Chear, can respond to voice commands, access databases through internet connectivity, and create a novel communication stream that informs both the wearer and those around them. The device has a protective covering of rubber and plastic to protect it from damage. It can connect to the World Wide Web (internet), live-stream audio and video, and provide the complete functionality of texting, including reading, writing, and sending messages and e-mails. The device can operate via speakerphone. As a smart gadget, it can access the Internet, take voice commands, transfer, exchange, and store cloud-based data, and exchange data with other devices through a USB port. It includes the functionalities of a camera and is GPS enabled.
SYSTEM AND METHOD FOR EXTRACTING AND DISPLAYING SPEAKER INFORMATION IN AN ATC TRANSCRIPTION
A system for extracting speaker information in an ATC transcription and displaying the speaker information on a graphical display unit is provided. The system is configured to: segment a stream of audio received from an ATC and other aircraft into a plurality of chunks; determine, for each chunk, if the speaker is enrolled in an enrolled speaker database; when the speaker is enrolled in the enrolled speaker database, decode the chunk using a speaker-dependent automatic speech recognition (ASR) model and tag the chunk with a permanent name for the speaker; when the speaker is not enrolled in the enrolled speaker database, assign a temporary name for the speaker, tag the chunk with the temporary name, and decode the chunk using a speaker independent speech recognition model; format the decoded chunk as text; and signal the graphical display unit to display the formatted text along with an identity for the speaker.
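The enrolled-versus-unenrolled routing described above can be sketched as follows. The database contents, voiceprint keys, and stand-in decoder functions are all assumptions for illustration.

```python
# Illustrative sketch of the claimed routing: chunks from enrolled
# speakers are decoded with a speaker-dependent model and tagged with a
# permanent name; unknown speakers get a temporary name and a
# speaker-independent model.

import itertools

ENROLLED = {"voiceprint-42": "Capt. Reyes"}   # voiceprint -> permanent name
_temp_ids = itertools.count(1)

def decode_speaker_dependent(chunk):
    return chunk.upper()    # stand-in for a per-speaker ASR model

def decode_speaker_independent(chunk):
    return chunk.lower()    # stand-in for a generic ASR model

def process_chunk(chunk, voiceprint):
    if voiceprint in ENROLLED:
        name = ENROLLED[voiceprint]
        text = decode_speaker_dependent(chunk)
    else:
        name = f"Speaker {next(_temp_ids)}"   # temporary name
        text = decode_speaker_independent(chunk)
    # The formatted text and speaker identity would then be sent to the
    # graphical display unit.
    return {"speaker": name, "text": text}
```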
EXPLAINING ANOMALOUS PHONETIC TRANSLATIONS
A method includes: receiving, by a computing device, a digital voice stream; receiving, by the computing device, converted text that represents the digital voice stream; identifying, by the computing device, an erroneously converted portion of the converted text; selecting, by the computing device, the erroneously converted portion for explainability processing; parsing, by the computing device, the erroneously converted portion into parts based on a predetermined parsing level; collecting, by the computing device, supplementary input data related to the erroneously converted portion; and determining, by the computing device and based on the supplementary input data, a reason why the erroneously converted portion was erroneously converted.
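The explainability step can be sketched with toy heuristics. The parsing level, the noise threshold, and the candidate reasons below are illustrative assumptions, not the patent's logic.

```python
# Toy sketch of the pipeline's final steps: parse a flagged portion of
# converted text at word level and use supplementary data (here, an
# ambient-noise reading) to propose a reason for the conversion error.

def explain_error(erroneous_span, supplementary):
    parts = erroneous_span.split()   # predetermined parsing level: words
    if supplementary.get("ambient_noise_db", 0) > 70:
        reason = "high ambient noise during capture"
    elif any(len(p) <= 2 for p in parts):
        reason = "short function words are acoustically confusable"
    else:
        reason = "likely out-of-vocabulary term"
    return {"parts": parts, "reason": reason}
```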
Using multiple languages during speech to text input
A method and apparatus for correcting a wrongly-translated word in a device employing speech recognition is provided herein. During operation, a device will use a second language to correct a wrongly-translated word that was wrongly translated using a first language. More particularly, after speech recognition is performed using the first language, when a user selects text to be corrected, the user will utter the speech again using the second language that differs from the first language. Both the first and the second language can be used by the device to determine a best translation of the speech.
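Combining evidence from both languages' recognizers can be sketched as a simple n-best re-ranking; the hypothesis lists and scoring scheme are assumptions for illustration, not the patent's method.

```python
# Hedged sketch: keep the n-best hypotheses from the first language's
# recognizer and, when the user repeats the word in a second language,
# combine them with the second recognizer's hypotheses to pick the best
# transcription supported by both.

def best_transcription(first_lang_nbest, second_lang_nbest):
    # Each argument is a list of (word, score) hypothesis pairs.
    combined = {}
    for word, score in first_lang_nbest + second_lang_nbest:
        combined[word] = combined.get(word, 0.0) + score
    return max(combined, key=combined.get)
```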
Systems and methods to accept speech input and edit a note upon receipt of an indication to edit
Systems and methods to accept speech input and edit a note upon receipt of an indication to edit are disclosed. Exemplary implementations may: effectuate presentation of a graphical user interface that includes a note, the note including note sections, the note sections including a first note section, the individual note sections including body fields; obtain user input from the client computing platform, the user input representing an indication to edit a first body field of the first note section; obtain audio information representing sound captured by an audio section of the client computing platform, the audio information including value definition information specifying one or more values to be included in the individual body fields; perform speech recognition on the audio information to obtain a first value; and populate the first body field with the first value so that the first value is included in the first body field.
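The edit flow above can be sketched with a minimal note model. The class, section/field names, and stand-in recognizer are hypothetical, chosen only to show the claimed sequence.

```python
# Minimal sketch of the claimed flow: the user indicates a body field to
# edit, speech recognition yields a value, and that value populates the
# indicated field of the note.

def recognize(audio_text):
    # Stand-in for the speech recognition step on captured audio.
    return audio_text.strip()

class Note:
    def __init__(self, sections):
        # sections: {section_name: {body_field_name: value}}
        self.sections = sections

    def edit_field(self, section, field, audio_text):
        # The (section, field) pair is the user's indication to edit.
        value = recognize(audio_text)
        self.sections[section][field] = value
        return value
```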
Intelligent text and voice feedback for voice assistant
A method for text feedback includes: receiving, by a controller, an utterance from a user; determining, by an automatic speech recognition engine of the controller, a plurality of speech recognition results based on the utterance from the user, wherein the speech recognition results include probable commands; determining, by the automatic speech recognition engine of the controller, a plurality of confidence scores for each of the plurality of speech recognition results; determining, by the controller, a text characteristic for each of the plurality of probable commands as a function of the confidence scores for each of the plurality of speech recognition results; and commanding, by the controller, a display to show text corresponding to each of the plurality of probable commands with the text characteristic determined by the controller.
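Mapping confidence scores to a text characteristic can be sketched as follows. The bucket boundaries and the particular characteristics (size, color) are assumptions, not values from the patent.

```python
# Illustrative sketch: derive a text characteristic for each probable
# command as a function of its confidence score, then pair each command
# with its characteristic for display.

def text_characteristic(confidence):
    if confidence >= 0.8:
        return {"size": "large", "color": "black"}
    if confidence >= 0.5:
        return {"size": "medium", "color": "gray"}
    return {"size": "small", "color": "light-gray"}

def render_feedback(results):
    # results: list of (probable_command, confidence) pairs.
    return [(cmd, text_characteristic(conf)) for cmd, conf in results]
```

A display controller could then render high-confidence commands more prominently, giving the user an at-a-glance sense of how the utterance was understood.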