Patent classifications
G10L2015/221
Display apparatus and method for registration of user command
A display apparatus includes an input unit configured to receive a user command; an output unit configured to output a registration suitability determination result for the user command; and a processor configured to generate phonetic symbols for the user command, analyze the generated phonetic symbols to determine registration suitability for the user command, and control the output unit to output the registration suitability determination result for the user command. Therefore, the display apparatus may register a user command that is resistant to misrecognition and guarantees a high recognition rate among user commands defined by a user.
DISPLAY APPARATUS, VOICE ACQUIRING APPARATUS AND VOICE RECOGNITION METHOD THEREOF
Disclosed are a display apparatus, a voice acquiring apparatus and a voice recognition method thereof, the display apparatus including: a display unit which displays an image; a communication unit which communicates with a plurality of external apparatuses; and a controller which includes a voice recognition engine to recognize a user's voice, receives a voice signal from a voice acquiring unit, and controls the communication unit to receive candidate instruction words from at least one of the plurality of external apparatuses to recognize the received voice signal.
CONDITIONAL RESPONSE FULFILLMENT CACHE FOR LOCALLY RESPONDING TO AUTOMATED ASSISTANT INPUTS
Implementations set forth herein relate to conditionally caching responses to automated assistant queries according to certain contextual data that may be associated with each automated assistant query. Each query can be identified based on historical interactions between a user and an automated assistant, and, depending on the query, fulfillment data can be cached according to certain contextual data that influences the query response. Depending on how the contextual data changes, a cached response stored at a client device can be discarded and/or replaced with an updated cached response. For example, a query that users commonly ask prior to leaving for work can have a corresponding assistant response that depends on features of an environment of the users. This unique assistant response can be cached, before the users provide the query, to minimize latency that can occur when network or processing bandwidth is unpredictable.
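The caching behavior described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `ConditionalResponseCache`, the `fulfill` callback (a hypothetical stand-in for a server round trip), and the use of a plain dictionary as "contextual data" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CachedResponse:
    response: str
    context: Dict[str, str]  # contextual data the response was computed from


class ConditionalResponseCache:
    """Client-side cache whose entries are tied to the context they were built from."""

    def __init__(self, fulfill: Callable[[str, Dict[str, str]], str]) -> None:
        # `fulfill` is a hypothetical callback standing in for remote fulfillment.
        self._fulfill = fulfill
        self._entries: Dict[str, CachedResponse] = {}

    def prefetch(self, query: str, context: Dict[str, str]) -> None:
        """Cache a response before the user asks, e.g. for a habitual query."""
        self._entries[query] = CachedResponse(self._fulfill(query, context), dict(context))

    def update_context(self, query: str, context: Dict[str, str]) -> None:
        """Replace the cached response only if the relevant context changed."""
        entry = self._entries.get(query)
        if entry is not None and entry.context != context:
            self.prefetch(query, context)

    def respond(self, query: str) -> Optional[str]:
        """Answer locally if a cached response exists; otherwise return None."""
        entry = self._entries.get(query)
        return entry.response if entry else None
```

In this sketch a context change triggers an immediate refresh; discarding the entry and refetching lazily would be an equally plausible reading of "discarded and/or replaced".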
INFORMATION PROCESSING DEVICE, METHOD OF INFORMATION PROCESSING, AND PROGRAM
[Object] To provide technology capable of performing processing on a string recognized from input speech more efficiently. [Solution] Provided is an information processing device including: a processing unit acquisition portion configured to acquire one or more processing units, on the basis of noise, from a first recognition string obtained by performing speech recognition on first input speech; and a processor configured to, when any one of the one or more processing units is selected as a processing target, process the processing target.
Apparatus, method and program to facilitate retrieval of voice messages
An information processing apparatus includes a display, an input unit, and a controller. The input unit is configured to receive an input of a first keyword from a user. The controller is configured to retrieve first character information including the input first keyword from a database configured to store a plurality of character information items converted from a plurality of voice information items by voice recognition processing, extract a second keyword that is included in the first character information acquired by the retrieval and is different from the first keyword, and control the display to display a list of items including first identification information with which the acquired first character information is identified and the second keyword included in the first character information.
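As a rough illustration of the retrieval step described here, the sketch below searches transcribed messages for a first keyword and pairs each hit with a second, different keyword taken from the same transcript. The function name, the stopword list, and the "longest remaining word" heuristic are assumptions for illustration only; the abstract does not say how the second keyword is chosen.

```python
from typing import Dict, List, Tuple

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "to", "at", "on", "is", "for", "and", "of"}


def search_transcripts(transcripts: Dict[str, str], first_keyword: str) -> List[Tuple[str, str]]:
    """Return (message id, second keyword) for each transcript containing first_keyword."""
    needle = first_keyword.lower()
    results: List[Tuple[str, str]] = []
    for message_id, text in transcripts.items():
        words = [w.strip(".,!?").lower() for w in text.split()]
        if needle not in words:
            continue
        # Assumed heuristic: the longest non-stopword other than the first keyword.
        candidates = [w for w in words if w and w != needle and w not in STOPWORDS]
        second_keyword = max(candidates, key=len, default="")
        results.append((message_id, second_keyword))
    return results
```

The returned pairs correspond to the list items the abstract describes: identification information for the matching transcript plus a second keyword drawn from it.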
Compounding corrective actions and learning in mixed mode dictation
Techniques performed by a data processing system for processing voice content received from a user herein include receiving a first audio input from the user comprising a mixed-mode dictation, analyzing, using one or more machine learning (ML) models, the first audio input to obtain a first interpretation of the mixed-mode dictation, presenting the first interpretation to the user in an application on the data processing system, receiving a second audio input from the user comprising a corrective command, analyzing the second audio input to obtain a second interpretation of the restatement of the mixed-mode dictation, presenting the second interpretation to the user, receiving an indication from the user that the second interpretation is a correct interpretation of the mixed-mode dictation, and modifying the operating parameters of the one or more machine learning models to interpret the subsequent instances of the mixed-mode dictation based on the second interpretation.
SMART INTERFACE WITH FACILITATED INPUT AND MISTAKE RECOVERY
Systems, methods, and devices including smart interfaces with facilitated input and mistake recovery are described. For example, a smart interface system can identify one or more portions of user input as alterable decisions, and, for each of the one or more alterable decisions, store, in a memory, information about one or more alternative options for the alterable decision. The system can also identify one of the alterable decisions as the currently alterable decision, and upon receiving an input indicative of an actuation of the alteration key, alter the currently alterable decision to another of the one or more alternative options based on the stored information.
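The alterable-decision mechanism can be sketched as follows. This is only one plausible reading: the names `AlterableDecision` and `SmartInput`, the choice of the most recent decision as the "currently alterable" one, and cycling through options on each key press are assumptions, not details confirmed by the abstract.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class AlterableDecision:
    position: int       # index of the token this decision produced
    options: List[str]  # original choice followed by stored alternatives
    chosen: int = 0     # index into options of the current choice


class SmartInput:
    def __init__(self) -> None:
        self.tokens: List[str] = []
        self.decisions: List[AlterableDecision] = []

    def add(self, text: str, alternatives: Sequence[str] = ()) -> None:
        """Append a token; if it has alternatives, remember it as alterable."""
        self.tokens.append(text)
        if alternatives:
            self.decisions.append(
                AlterableDecision(len(self.tokens) - 1, [text, *alternatives])
            )

    def press_alteration_key(self) -> None:
        """Switch the currently alterable decision to its next stored option."""
        if not self.decisions:
            return
        decision = self.decisions[-1]  # assumption: most recent decision is current
        decision.chosen = (decision.chosen + 1) % len(decision.options)
        self.tokens[decision.position] = decision.options[decision.chosen]

    def text(self) -> str:
        return " ".join(self.tokens)
```

For example, after `add("their", ["there", "they're"])` and `add("house")`, one press of the alteration key changes the text from "their house" to "there house" without the user moving the cursor.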
Synchronized voice application to present accurate real time content uttered by a text reader/reciter
The embodiments of the invention allow retrieval of information or processing of commands through a speech interface and/or a combination of a speech interface and a non-speech interface, thus facilitating verbal search of religious and non-religious texts and publication of the resulting finds, along with exegesis and/or explanations. Beneficial uses can be found in fields including, but not limited to, religious worship and education. The embodiments of the invention ease interaction of the user(s) with the text(s) and allow for in-depth comprehension and greater access to knowledge, among other benefits. The scope and ramifications of the use(s) and benefits cannot be measured in a limited manner. As technology and imagination advance, the true scope and ramifications will increase as well.
PHRASE ALTERNATIVES REPRESENTATION FOR AUTOMATIC SPEECH RECOGNITION AND METHODS OF USE
Phrase alternative data structures are generated from the lattice output of an audio input to Automatic Speech Recognition (ASR) system. A user interface is supported for users to view phrase alternatives to selected portions of an audio transcript of the audio input, search the transcript based on query phrases, or edit the transcript based on phrase alternatives.
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM
There is provided an information processing apparatus, an information processing method, and a program with which a database in which unknown words are registered can be created efficiently, the information processing apparatus including: an identifying portion that identifies a word uttered according to a predetermined utterance method, from within utterance information of a user, as an unknown word; and a processing portion that performs processing to register the unknown word that has been identified.