Patent classifications
H04M3/42204
System and method for calling a service representative using an intelligent voice assistant
A system and method for making a call to a service provider on behalf of a user is disclosed. The system and method include using an intelligent voice assistant to call the service provider and having the intelligent voice assistant navigate an interactive voice response system to reach a representative. The system and method also include using the intelligent voice assistant to reconnect a user with a representative after an interrupted call.
VOICE-CONTROLLED COMMUNICATION REQUESTS AND RESPONSES
Systems and methods for establishing communication connections using speech, such as establishing calls between speech-controlled devices, are described. A first speech-controlled device receives a communication request in the form of audio and sends audio data corresponding to the captured audio to a server. The server performs speech processing on the audio data to determine a recipient, a subject for the call, and a device associated with the recipient. The server then sends a message indicating the communication request and audio data corresponding to the communication topic to the recipient's speech-controlled device. The recipient device outputs audio asking whether the recipient accepts the communication request. The recipient audibly refuses or accepts the communication request, and the recipient's speech-controlled device sends an indication of the recipient's audible decision to the server. If the recipient accepted the communication request, the server causes a communication connection to be established between the two speech-controlled devices.
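The accept-or-refuse step in the abstract above can be sketched as a small state update. This is only an illustration under stated assumptions, not the patented implementation: `CommunicationRequest`, `prompt_for`, `handle_recipient_reply`, and the two word lists are hypothetical names, and real speech processing is replaced by keyword matching on a text transcript.

```python
from dataclasses import dataclass

# Hypothetical word lists standing in for real speech understanding.
ACCEPT_WORDS = {"yes", "accept", "sure", "okay"}
REFUSE_WORDS = {"no", "decline", "refuse"}


@dataclass
class CommunicationRequest:
    caller_device: str
    recipient_device: str
    topic: str
    state: str = "pending"  # pending -> connected | refused


def prompt_for(request: CommunicationRequest) -> str:
    """Text the recipient device would speak to ask for a decision."""
    return f"Incoming call about {request.topic}. Do you accept?"


def handle_recipient_reply(request: CommunicationRequest, transcript: str) -> str:
    """Update the request from the recipient's spoken reply (given as text)."""
    # Lowercase and strip basic punctuation so "Yes," matches "yes".
    words = {w.strip(".,!?") for w in transcript.lower().split()}
    if words & ACCEPT_WORDS:
        request.state = "connected"
    elif words & REFUSE_WORDS:
        request.state = "refused"
    # Otherwise the reply is unclear and the request stays pending.
    return request.state
```

In this sketch an unclear reply leaves the request pending, so a caller of `handle_recipient_reply` could re-prompt the recipient. The abstract does not say how unclear replies are handled.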
REUSABLE MULTIMODAL APPLICATION
A method and system are disclosed herein for accepting multimodal inputs and deriving synchronized and processed information. A reusable multimodal application is provided on the mobile device. A user transmits a multimodal command to the multimodal platform via the mobile network. The one or more modes of communication that are inputted are transmitted to the multimodal platform(s) via the mobile network(s) and thereafter synchronized and processed at the multimodal platform. The synchronized and processed information is transmitted to the multimodal application. If required, the user verifies and appropriately modifies the synchronized and processed information. The verified and modified information is transferred from the multimodal application to the visual application. The final result(s) are derived by inputting the verified and modified results into the visual application.
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND COMPUTER PROGRAM
Provided is an information processing apparatus capable of reliably delivering a message to a third party desired by a user.
Provided is an information processing apparatus including an acquisition unit configured to acquire information including a sound message, and a recognition unit configured to recognize a sender of the sound message, a destination of a message included in the sound message, and content of the message from the information acquired by the acquisition unit, in which the recognition unit generates information for inputting the destination of the message in a case where the destination cannot be uniquely specified.
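The handling of a destination that cannot be uniquely specified can be illustrated with a short sketch. It assumes the destination has already been recognized as a text name and that known recipients are a plain list of full names. The function `resolve_destination` and the prompt wording are hypothetical and do not come from the abstract.

```python
def resolve_destination(name, contacts):
    """Return ("resolved", contact) or ("ask", prompt) for a recognized name.

    A name matches a contact when it equals one of the contact's
    whitespace-separated name parts, ignoring case.
    """
    matches = [c for c in contacts if name.lower() in c.lower().split()]
    if len(matches) == 1:
        # Exactly one candidate: the destination is uniquely specified.
        return ("resolved", matches[0])
    if not matches:
        # No candidate: ask the sender to identify the destination.
        return ("ask", f"Who is {name}?")
    # Several candidates: ask the sender to choose among them.
    options = ", ".join(matches)
    return ("ask", f"Which {name} do you mean: {options}?")
```

With contacts `["Alex Kim", "Alex Park", "Dana Lee"]`, the name "Dana" resolves directly, while "Alex" produces a prompt listing both candidates.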
Voice control of remote device
A system is configured to enable remote control so that a first user can provide assistance to a second user. The system may receive a command from the second user granting remote control to the first user, enabling the first user to initiate a voice command on behalf of the second user. In some examples, the system may enable the remote control by enabling wakeword detection for incoming audio data, enabling a second device to detect a wakeword and corresponding voice command from incoming audio data originating from a first device. For example, the second device may disable and/or modify echo cancellation processing, enabling the second device to detect the voice command from audio output based on the incoming audio data and/or from the incoming audio data itself.
Voice recognition-based dialing
A voice recognition-based dialing method and a voice recognition-based dialing system are provided. The method includes determining a recognition result based on a user's voice input, at least one acoustic model, and at least one language model, where the at least one acoustic model and the at least one language model are obtained based on information collected in an electronic device. The system is configured to obtain at least one acoustic model and at least one language model based on information collected in an electronic device, and to determine a recognition result based on a user's voice input, the at least one acoustic model, and the at least one language model. The acoustic models and the language models are updated based on the information collected in the electronic device, which may improve voice recognition-based dialing.
AUTOMATIC UPLOAD OF PICTURES FROM A CAMERA
A system and method are disclosed for enabling user-friendly interaction with a camera system. Specifically, the inventive system and method have several aspects to improve the interaction with a camera system, including voice recognition, gaze tracking, touch-sensitive inputs, and others. The voice recognition unit is operable for, among other things, receiving multiple different voice commands, recognizing the voice commands, associating the different voice commands with one camera command, and controlling at least some aspect of the digital camera operation in response to these voice commands. The gaze tracking unit is operable for, among other things, determining the location on the viewfinder image that the user is gazing upon. One aspect of the touch-sensitive inputs provides that the touch-sensitive pad is mouse-like and is operable for, among other things, receiving user touch inputs to control at least some aspect of the camera operation. Another aspect of the disclosed invention provides for gesture recognition to be used to interface with and control the camera system.
INTELLIGENT TELECONFERENCE OPERATIONS IN AN INTERNET OF THINGS (IOT) COMPUTING ENVIRONMENT
Embodiments are provided for intelligent teleconference operations in an Internet of Things (IoT) computing environment by a processor. A communication connection for a conference call session may be cognitively initiated or terminated with one or more users according to one or more parameters associated with a user profile, a schedule of the one or more users, activities of daily living (ADL), one or more contextual factors, or a combination thereof.
Network conference management and arbitration via voice-capturing devices
Systems and methods are provided for managing a conference call with multiple voice-enabled and voice-capturing devices, such as smart speakers. Reproduced, duplicate voice commands can cause unexpected results in a conference call. The voice commands can be determined to be received from the same conference call. A voice command for a particular voice-enabled device can be selected based on an energy level of an audio signal, event data, time data, and/or user identification.
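One way to picture the selection among duplicate voice commands is the sketch below. It uses only two of the signals the abstract lists, time data and audio energy, and ignores event data and user identification. The `Capture` type, the `arbitrate` function, and the one-second window are hypothetical choices for illustration, not details from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Capture:
    device_id: str
    command: str
    timestamp: float  # seconds
    energy: float     # audio signal energy, arbitrary units


def arbitrate(captures, window=1.0):
    """Keep one capture per duplicate group.

    Captures of the same command whose timestamps fall within `window`
    seconds of the group's earliest capture are treated as duplicates;
    the highest-energy capture of each group is selected.
    """
    groups = []
    for cap in sorted(captures, key=lambda c: c.timestamp):
        for group in groups:
            first = group[0]
            if cap.command == first.command and cap.timestamp - first.timestamp <= window:
                group.append(cap)
                break
        else:
            # No existing group matched, so this capture starts a new one.
            groups.append([cap])
    return [max(group, key=lambda c: c.energy) for group in groups]
```

Under these assumptions, a "mute" command heard by two smart speakers 0.2 seconds apart yields a single selected capture from the device with the stronger signal, while the same command repeated several seconds later is treated as a separate command.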
Reusable multimodal application
A method and system are disclosed herein for accepting multimodal inputs and deriving synchronized and processed information. A reusable multimodal application is provided on the mobile device. A user transmits a multimodal command to the multimodal platform via the mobile network. The one or more modes of communication that are inputted are transmitted to the multimodal platform(s) via the mobile network(s) and thereafter synchronized and processed at the multimodal platform. The synchronized and processed information is transmitted to the multimodal application. If required, the user verifies and appropriately modifies the synchronized and processed information. The verified and modified information is transferred from the multimodal application to the visual application. The final result(s) are derived by inputting the verified and modified results into the visual application.