Patent classifications
G10L2015/226
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM
An information processing apparatus is provided that includes a control unit (150) controlling an action of an autonomous operation unit. The control unit controls transitions among plural states relating to speech recognition processing through the autonomous operation unit based on a detected trigger. The states include a first active state, in which an action of the autonomous operation unit is restricted, and a second active state, in which the speech recognition processing is performed. An information processing method is also provided, in which a processor controls an action of an autonomous operation unit, and the controlling includes controlling transitions among the same states based on a detected trigger.
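The trigger-driven state transitions in this abstract can be pictured as a small state machine. The sketch below is illustrative only: the `NORMAL` state, the trigger names, and the transition table are assumptions, since the abstract names only the two active states.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()         # assumed baseline state (not named in the abstract)
    FIRST_ACTIVE = auto()   # actions of the autonomous unit are restricted
    SECOND_ACTIVE = auto()  # speech recognition processing is performed


# Hypothetical triggers; the abstract only says "a detected trigger".
TRANSITIONS = {
    (State.NORMAL, "user_detected"): State.FIRST_ACTIVE,
    (State.FIRST_ACTIVE, "wake_word"): State.SECOND_ACTIVE,
    (State.FIRST_ACTIVE, "timeout"): State.NORMAL,
    (State.SECOND_ACTIVE, "utterance_end"): State.NORMAL,
}


class ControlUnit:
    def __init__(self) -> None:
        self.state = State.NORMAL

    def on_trigger(self, trigger: str) -> State:
        # Unknown (state, trigger) pairs leave the state unchanged.
        self.state = TRANSITIONS.get((self.state, trigger), self.state)
        return self.state

    def actions_restricted(self) -> bool:
        return self.state is State.FIRST_ACTIVE

    def recognition_enabled(self) -> bool:
        return self.state is State.SECOND_ACTIVE
```

A table-driven design keeps the set of legal transitions in one place, so a trigger that is irrelevant in the current state is simply ignored.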
INTELLIGENT AUTOMATED ASSISTANT IN A HOME ENVIRONMENT
Systems and processes for operating an intelligent automated assistant are provided. In one example process, discourse input representing a user request is received. The process determines whether the discourse input relates to a device of an established location. In response to determining that the discourse input relates to a device of an established location, a data structure representing a set of devices of the established location is retrieved. The process determines, using the data structure, a user intent corresponding to the discourse input, the user intent associated with an action to be performed by a device of the set of devices, and a criterion to be satisfied prior to performing the action. The action and the device are stored in association with the criterion, where, in accordance with a determination that the criterion is satisfied, the action is performed by the device.
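Storing an action and device together with a criterion, then performing the action once the criterion holds, might look roughly like the following. The class names, the context dictionary, and the example thermostat rule are all hypothetical; the abstract does not describe a concrete data layout.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Context = Dict[str, float]


@dataclass
class PendingAction:
    device: str
    action: str
    criterion: Callable[[Context], bool]  # must be satisfied before acting


class Scheduler:
    def __init__(self) -> None:
        self.pending: List[PendingAction] = []
        self.performed: List[Tuple[str, str]] = []

    def store(self, device: str, action: str,
              criterion: Callable[[Context], bool]) -> None:
        # Keep the action and device in association with the criterion.
        self.pending.append(PendingAction(device, action, criterion))

    def evaluate(self, context: Context) -> None:
        # Perform every stored action whose criterion is now satisfied.
        remaining = []
        for item in self.pending:
            if item.criterion(context):
                self.performed.append((item.device, item.action))
            else:
                remaining.append(item)
        self.pending = remaining
```

For example, a request such as "turn on the heater when it drops below 18 degrees" could be stored as `store("heater", "turn_on", lambda ctx: ctx["temperature_c"] < 18)` and checked whenever a new temperature reading arrives.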
Recorded media hotword trigger suppression
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for suppressing hotword triggers when detecting a hotword in recorded media are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio corresponding to playback of an item of media content. The actions further include determining, by the computing device, that the audio includes an utterance of a predefined hotword and that the audio includes an audio watermark. The actions further include analyzing, by the computing device, the audio watermark. The actions further include, based on analyzing the audio watermark, determining, by the computing device, whether to perform speech recognition on a portion of the audio following the predefined hotword.
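The decision described here reduces to a gate placed in front of the speech recognizer. The following is a minimal sketch under stated assumptions: hotword detection and watermark decoding are taken as already done, and the payload value `"recorded-media"` and the rule for audio with no watermark are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioSegment:
    contains_hotword: bool
    watermark: Optional[str]  # decoded watermark payload, or None if absent


# Hypothetical payloads that mark audio as playback of recorded media.
SUPPRESSING_WATERMARKS = {"recorded-media"}


def should_run_speech_recognition(segment: AudioSegment) -> bool:
    """Decide whether to recognize the audio that follows the hotword."""
    if not segment.contains_hotword:
        return False
    if segment.watermark is None:
        # Assumption: audio without a watermark is treated as live speech.
        return True
    return segment.watermark not in SUPPRESSING_WATERMARKS
```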
ELECTRONIC APPARATUS AND SERVICE PROVIDING METHOD THEREOF
An electronic apparatus and a service providing method thereof are provided. The service providing method includes receiving a sound generated from outside through at least one microphone, extracting a noise feature from the received sound, and providing a service corresponding to the extracted noise feature by comparing the extracted noise feature and a pre-stored noise feature.
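One way to compare an extracted noise feature against pre-stored features is a nearest-match lookup. The sketch below assumes features are plain numeric vectors compared by cosine similarity; the stored vectors, service names, and threshold are hypothetical, and the abstract does not say which comparison is used.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical pre-stored noise features and the service tied to each.
STORED: Dict[str, Tuple[List[float], str]] = {
    "doorbell": ([0.9, 0.1, 0.0], "notify_visitor"),
    "running_water": ([0.1, 0.8, 0.3], "remind_tap"),
}


def service_for(feature: List[float], threshold: float = 0.9) -> Optional[str]:
    """Return the service of the best-matching stored feature, if any."""
    best_service, best_sim = None, threshold
    for reference, service in STORED.values():
        sim = cosine(feature, reference)
        if sim >= best_sim:
            best_service, best_sim = service, sim
    return best_service
```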
Contextual hotwords
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for contextual hotwords are disclosed. In one aspect, a method, during a boot process of a computing device, includes the actions of determining, by a computing device, a context associated with the computing device. The actions further include, based on the context associated with the computing device, determining a hotword. The actions further include, after determining the hotword, receiving audio data that corresponds to an utterance. The actions further include determining that the audio data includes the hotword. The actions further include, in response to determining that the audio data includes the hotword, performing an operation associated with the hotword.
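The idea of choosing hotwords from the device context can be sketched as a lookup followed by a match. The context names, hotwords, and operation names below are assumptions, and matching on a text transcript is a simplification: the abstract describes detecting the hotword in audio data.

```python
from typing import Dict, Optional

# Hypothetical mapping: device context -> {hotword: operation}.
CONTEXT_HOTWORDS: Dict[str, Dict[str, str]] = {
    "alarm_ringing": {"stop": "dismiss_alarm", "snooze": "snooze_alarm"},
    "media_playing": {"pause": "pause_playback", "next": "skip_track"},
}


def active_hotwords(context: str) -> Dict[str, str]:
    """Determine the hotwords that apply in the given context."""
    return CONTEXT_HOTWORDS.get(context, {})


def handle_utterance(context: str, transcript: str) -> Optional[str]:
    """Return the operation for the first active hotword in the utterance."""
    hotwords = active_hotwords(context)
    for word in transcript.lower().split():
        if word in hotwords:
            return hotwords[word]
    return None
```

Because the hotword set depends on context, the word "stop" triggers an operation while an alarm is ringing but is ignored during media playback in this example.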
SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION SYSTEM, AND SPEECH RECOGNITION METHOD
A speech signal processing unit individually separates uttered speech of a plurality of passengers each seated in one of a plurality of speech recognition target seats in a vehicle. A speech recognition unit performs speech recognition on uttered speech of each of the passengers separated by the speech signal processing unit and calculates a speech recognition score. A score-using determining unit determines a speech recognition result of which of the passengers is to be used from among speech recognition results for the passengers, using the speech recognition score of each of the passengers.
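The score-based selection step could be sketched as follows. The rule used here (take the highest score, and reject everything below a minimum) is an assumption; the abstract only states that the speech recognition scores are used to decide whose result to adopt.

```python
from typing import Dict, Optional, Tuple


def select_result(
    results: Dict[str, Tuple[str, float]],
    min_score: float = 0.5,
) -> Optional[Tuple[str, str]]:
    """Pick which passenger's recognition result to use.

    `results` maps a seat label to (recognized text, recognition score).
    Returns (seat, text), or None if no score reaches `min_score`.
    """
    best_seat = None
    best_score = min_score
    for seat, (_text, score) in results.items():
        if score >= best_score:
            best_seat, best_score = seat, score
    if best_seat is None:
        return None
    return best_seat, results[best_seat][0]
```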
Device, method, and program
A device communicates with a human through voice recognition of voice of the human. The device includes: a drive mechanism that drives the device; and a processor. The processor controls the drive mechanism to drive the device to a waiting place for the device to contact the human, and the waiting place is determined based on contact information that is history of contact between the device and the human.
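Deriving a waiting place from contact history might be as simple as counting where past contacts happened. The sketch below assumes the history is a list of (place, hour of day) records and picks the most frequent place; the record layout, the default place, and the frequency rule are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def choose_waiting_place(
    history: List[Tuple[str, int]],
    default: str = "charging_dock",
) -> str:
    """Return the place where contact with the human happened most often.

    Each history entry is (place, hour_of_day); the hour is unused in this
    simple version but could be used to filter by time of day.
    """
    counts = Counter(place for place, _hour in history)
    if not counts:
        return default
    return counts.most_common(1)[0][0]
```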
Electronic device providing response corresponding to user conversation style and emotion and method of operating same
An electronic device includes a microphone, a communication circuit, and a processor configured to obtain a user's utterance through the microphone, transmit first information about the utterance through the communication circuit to an external server for at least partially automatic speech recognition (ASR) or natural language understanding (NLU), obtain a second text from the external server through the communication circuit, the second text being a text resulting from modifying at least part of a first text included in a neutral response to the utterance based on parameters corresponding to the user's conversation style and emotion identified based on the first information, and provide a voice corresponding to the second text or a message including the second text in response to the utterance.
Voice activation based on user recognition
A device for voice activation includes one or more processors. The one or more processors are configured to receive, via one or more microphones, a keyword and a first command spoken by a first user. The one or more processors are also configured to, subsequent to receiving the first command, receive a second command via the one or more microphones without an intervening receipt of the keyword. The one or more processors are further configured to, based at least in part on determining that the second command is spoken by the same first user, selectively process the second command.
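The follow-up behavior described above can be sketched with a small handler that remembers who spoke the keyword. Speaker identification is assumed to happen elsewhere and to yield a speaker ID, and commands are represented as text; both are simplifications not taken from the abstract.

```python
from typing import Optional


class VoiceActivator:
    def __init__(self, keyword: str) -> None:
        self.keyword = keyword.lower()
        self.active_speaker: Optional[str] = None

    def handle(self, speaker_id: str, text: str) -> Optional[str]:
        """Return the command to process, or None to ignore the input."""
        words = text.split(maxsplit=1)
        if words and words[0].lower() == self.keyword:
            # Keyword heard: this speaker becomes the active user.
            self.active_speaker = speaker_id
            return words[1] if len(words) > 1 else None
        if self.active_speaker is not None and speaker_id == self.active_speaker:
            # Follow-up from the same user, accepted without the keyword.
            return text
        return None
```

In this sketch, a second command from a different speaker is dropped unless that speaker says the keyword first.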
COMPOUND GESTURE-SPEECH COMMANDS
A multimedia entertainment system combines both gestures and voice commands to provide an enhanced control scheme. A user's body position or motion may be recognized as a gesture, and may be used to provide context to recognize user generated sounds, such as speech input. Likewise, speech input may be recognized as a voice command, and may be used to provide context to recognize a body position or motion as a gesture. Weights may be assigned to the inputs to facilitate processing. When a gesture is recognized, a limited set of voice commands associated with the recognized gesture are loaded for use. Further, additional sets of voice commands may be structured in a hierarchical manner such that speaking a voice command from one set of voice commands leads to the system loading a next set of voice commands.
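Loading a limited command set per gesture, and chaining sets hierarchically, can be sketched with two lookup tables. The gesture names and command words below are invented for illustration, and the input weighting mentioned in the abstract is left out.

```python
from typing import Dict, Set

# Hypothetical: voice commands loaded when a gesture is recognized.
GESTURE_COMMANDS: Dict[str, Set[str]] = {
    "raise_hand": {"menu", "pause"},
    "swipe_left": {"back"},
}

# Hypothetical hierarchy: speaking a command loads the next command set.
NEXT_SET: Dict[str, Set[str]] = {
    "menu": {"games", "movies"},
    "movies": {"play", "info"},
}


class CommandContext:
    def __init__(self) -> None:
        self.active: Set[str] = set()

    def on_gesture(self, gesture: str) -> None:
        # A recognized gesture loads its limited set of voice commands.
        self.active = set(GESTURE_COMMANDS.get(gesture, ()))

    def on_speech(self, word: str) -> bool:
        """Accept the word only if it is in the currently loaded set."""
        if word not in self.active:
            return False
        if word in NEXT_SET:
            self.active = set(NEXT_SET[word])
        return True
```

Restricting recognition to the active set is what lets the gesture supply context: "movies" is rejected until "menu" has been spoken after a raised hand.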