Patent classifications
H04M2201/40
BACKGROUND AUDIO IDENTIFICATION FOR SPEECH DISAMBIGUATION
Implementations relate to techniques for providing context-dependent search results. A computer-implemented method includes receiving an audio stream at a computing device during a time interval, the audio stream comprising user speech data and background audio, separating the audio stream into a first substream that includes the user speech data and a second substream that includes the background audio, identifying concepts related to the background audio, generating a set of terms related to the identified concepts, influencing a speech recognizer based on at least one of the terms related to the background audio, and obtaining a recognized version of the user speech data using the speech recognizer.
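One way to picture the "influencing a speech recognizer" step is as a rescoring pass over candidate transcriptions. The following is only a minimal sketch under that assumption: the abstract does not say how the recognizer is influenced, and the function name `pick_hypothesis`, the `boost` parameter, and the sample scores are all hypothetical.

```python
def pick_hypothesis(hypotheses, background_terms, boost=0.1):
    """Pick the best transcription after rewarding background-related terms.

    hypotheses: list of (text, score) pairs from a recognizer (assumed input).
    background_terms: terms derived from the identified background audio.
    boost: hypothetical bonus added per matching word.
    """
    terms = {t.lower() for t in background_terms}

    def biased_score(item):
        text, score = item
        # Count the words in this hypothesis that match a background term.
        hits = sum(1 for word in text.lower().split() if word in terms)
        return score + boost * hits

    return max(hypotheses, key=biased_score)[0]


# Illustrative data: the background audio was identified as a film soundtrack.
candidates = [("who directed joss", 0.50), ("who directed jaws", 0.48)]
best = pick_hypothesis(candidates, ["jaws", "shark", "spielberg"])
```

With no background terms the higher-scored "who directed joss" would win; the bonus for "jaws" flips the choice.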
COMMUNICATION SYSTEM
A communication system includes a management system connected to plural mobile communication terminals through wireless communication and configured to broadcast utterance voice data received from one of the mobile communication terminals to the other mobile communication terminals, to chronologically accumulate the result of utterance voice recognition from voice recognition processing on the received utterance voice data as a user-to-user communication history, and to control text delivery such that the communication history is displayed on the mobile communication terminals in synchronization. Each mobile communication terminal is configured to store notification setting information including a keyword and a predetermined notification function associated with the keyword and provided for the mobile communication terminal, to reproduce the received utterance voice data, display the result of utterance voice recognition, and perform operation control for the notification function associated with the keyword included in the result of utterance voice recognition.
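The terminal-side keyword handling described above can be sketched as a lookup from keywords to notification functions. This is an illustrative sketch only: the function name, the dictionary layout, and the notification names are assumptions, and simple case-insensitive substring matching stands in for whatever matching the actual system performs.

```python
def triggered_notifications(recognized_text, notification_settings):
    """Return the notification functions whose keyword occurs in the text.

    notification_settings: mapping of keyword -> notification function name,
    standing in for the stored notification setting information.
    """
    text = recognized_text.lower()
    return [
        function
        for keyword, function in notification_settings.items()
        if keyword.lower() in text  # naive substring match (assumption)
    ]


# Hypothetical settings stored on one terminal.
settings = {"urgent": "vibrate", "fire": "alarm_sound"}
actions = triggered_notifications("Urgent: please come to the lobby", settings)
```

Substring matching would also fire on "fireplace"; a real terminal would likely match on word boundaries.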
VOICE COMMUNICATION SYSTEM AND METHOD FOR PROVIDING CALL SESSIONS BETWEEN PERSONAL COMMUNICATION DEVICES OF CALLER USERS AND RECIPIENT USERS
A system and method for providing call sessions between personal communication devices (PCDs) of sender users and recipient users are described. The system includes one or more voice communication domains (VCDs) interacting with each other, and a cross-domain coordinator configured to coordinate interaction between the voice communication domains over the Global Communication Network. Each VCD includes PCDs associated with the corresponding users, and a voice communication server (VCS) deployed within the voice communication domain. The VCS is configured to control operation of the PCDs verbally by user voice commands, and to provide call sessions between a sender user and a recipient user within the same voice communication domain and between the users of different voice communication domains. Initiation of the call sessions can be carried out either by voice commands to the VCS of the system, or by delivering a voice call proposal of the caller user for voice communication directly to the recipient in a natural manner.
METHOD AND SYSTEM FOR CHALLENGING POTENTIAL UNWANTED CALLS
Aspects of the subject disclosure may include, for example, detecting, over a network, a call originating from a call originator and intended for a user of a user equipment, responsive to the detecting the call, determining whether to challenge the call originator, based on a determination to challenge the call originator, transmitting a request to the call originator, wherein the request prompts the call originator to specify an identity of the call originator and a purpose for the call, obtaining information from a call originator input responsive to the transmitting the request, deriving enhanced Caller Name or Caller ID data that includes the information, and causing the enhanced Caller Name or Caller ID data to be provided to the user equipment, thereby enabling the user of the user equipment to determine whether to answer the call. Other embodiments are disclosed.
SYSTEMS AND METHODS FOR DETECTING EMOTION FROM AUDIO FILES
Disclosed embodiments may include a system that may receive an audio file comprising an interaction between a first user and a second user. The system may detect, using a deep neural network (DNN), moment(s) of interruption between the first and second users from the audio file. The system may extract, using the DNN, vocal feature(s) from the moment(s) of interruption. The system may determine, using a machine learning model (MLM) and based on the vocal feature(s), whether a threshold number of moments of the moment(s) of interruption corresponds to a first emotion type. When the threshold number of moments corresponds to the first emotion type, the system may transmit a first message comprising a first binary indication. When the threshold number of moments does not correspond to the first emotion type, the system may transmit a second message comprising a second binary indication.
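The final thresholding step can be illustrated in isolation. The sketch below assumes the DNN and MLM have already produced one emotion label per moment of interruption; the function name, the label strings, and the threshold values are hypothetical and only show how a count is turned into a binary indication.

```python
def emotion_indication(moment_labels, target_emotion, threshold):
    """Return True when at least `threshold` moments carry the target emotion.

    moment_labels: one predicted emotion label per moment of interruption
    (assumed to come from an upstream model that is not shown here).
    """
    matches = sum(1 for label in moment_labels if label == target_emotion)
    return matches >= threshold


# Illustrative labels for three detected interruptions.
labels = ["anger", "neutral", "anger"]
first_message = emotion_indication(labels, "anger", threshold=2)
second_message = emotion_indication(labels, "anger", threshold=3)
```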
SYSTEM, METHOD, OR APPARATUS FOR EFFICIENT OPERATIONS OF CONVERSATIONAL INTERACTIONS
The present disclosure can include a system, method, or platform for automated call management utilizing a switch capable of being utilized with signals from a communication device. The switch interfaces with an artificial intelligence engine that provides contextual interactions to the switch. One or more databases of playback assets may be utilized to send the playback messages to the communications device. Middleware may provide analysis of data coming into the switch along with a call engine for configuring one or more calls made through the switch. Each call can include a call detail record (CDR) which is updateable with information regarding the calls.
VOICEMAIL AS A MESSAGE DELIVERED TO THE DEVICE
Aspects of the subject disclosure may include, for example, transcribing a voicemail message left by a call originator and intended for a call recipient. The resulting text is scheduled to be delivered as an SMS message to the call recipient. The source phone number in the SMS message is set to the call originator’s phone number so that the transcribed voicemail text appears in a messaging application’s conversation view between the call originator and the call recipient. Other embodiments are disclosed.
Transcription of communications through a device
A method to transcribe communications is provided. The method may include obtaining first communication data during a communication session between a first communication device and a second communication device and transmitting the first communication data to the second communication device by way of a mobile device that is locally coupled with the first communication device. The method may also include receiving, at the first communication device, second communication data from the second communication device through the mobile device and transmitting the second communication data to a remote transcription system. The method may further include receiving, at the first communication device, transcription data from the remote transcription system, the transcription data corresponding to a transcription of the second communication data, the transcription generated by the remote transcription system and presenting, by the first communication device, the transcription of the second communication data.
Systems and methods for computerized interactive skill training
The present invention is directed to interactive training, and in particular, to methods and systems for computerized interactive skill training. An example embodiment provides a method and system for providing skill training using a computerized system. The computerized system receives a selection of a first training subject. Several related training components can be invoked, such as reading, watching, performing, and/or reviewing components. In addition, a scored challenge session is provided, wherein a training challenge is provided to a user via a terminal, optionally in video form.
Transcription of communications
A method to transcribe communications may include obtaining audio data originating at a first device during a communication session between the first device and a second device and providing the audio data to an automated speech recognition system configured to transcribe the audio data. The method may further include obtaining multiple hypothesis transcriptions generated by the automated speech recognition system. Each of the multiple hypothesis transcriptions may include one or more words determined by the automated speech recognition system to be a transcription of a portion of the audio data. The method may further include determining one or more consistent words that are included in two or more of the multiple hypothesis transcriptions and in response to determining the one or more consistent words, providing the one or more consistent words to the second device for presentation of the one or more consistent words by the second device.
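The selection of consistent words can be sketched as a count over hypotheses. This is a simplified illustration, not the claimed method: it ignores word position and timing, treats a word as consistent when it appears in at least two hypotheses, and the function name and `min_count` parameter are assumptions.

```python
from collections import Counter


def consistent_words(hypotheses, min_count=2):
    """Return words found in at least `min_count` hypotheses, in first-seen order."""
    counts = Counter()
    for hypothesis in hypotheses:
        # Count each word at most once per hypothesis.
        counts.update(set(hypothesis.lower().split()))

    seen = set()
    result = []
    for hypothesis in hypotheses:
        for word in hypothesis.lower().split():
            if counts[word] >= min_count and word not in seen:
                seen.add(word)
                result.append(word)
    return result


# Illustrative hypotheses for the same portion of audio.
words = consistent_words([
    "please call me back",
    "please fall me back",
    "police call me",
])
```

Here "fall" and "police" each appear in only one hypothesis, so they are withheld, while the remaining words would be passed on for presentation.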