Patent classifications
H04M2201/405
Personal Voice-Based Information Retrieval System
The present invention relates to a system for retrieving information from a network such as the Internet. A user creates a user-defined record in a database that identifies an information source, such as a web site, containing information of interest to the user. This record identifies the location of the information source and also contains a recognition grammar based upon a speech command assigned by the user. Upon receiving the speech command from the user that is described within the recognition grammar, a network interface system accesses the information source and retrieves the information requested by the user.
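The lookup described above can be sketched in a few lines: a user-defined record ties a recognition grammar (the phrases the user assigned) to the location of an information source, and the recognized speech command is resolved to that location. The record fields, phrases, and example URLs below are illustrative assumptions, not from the patent.

```python
from dataclasses import dataclass

@dataclass
class SourceRecord:
    """A user-defined record pairing a recognition grammar with a source location."""
    phrases: tuple   # recognition grammar: spoken phrases assigned by the user
    location: str    # location of the information source (e.g. a URL)

# Hypothetical user database: each record maps speech commands to a web site.
records = [
    SourceRecord(phrases=("my weather", "weather report"),
                 location="http://example.com/weather"),
    SourceRecord(phrases=("my stocks",),
                 location="http://example.com/stocks"),
]

def resolve_command(spoken, db):
    """Return the location of the source whose grammar matches the command."""
    text = spoken.strip().lower()
    for record in db:
        if text in record.phrases:
            return record.location
    return None
```

A network interface system would then fetch the resolved location; the matching here is exact-phrase for brevity, whereas a real recognition grammar would allow richer patterns.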
SEMIAUTOMATED RELAY METHOD AND APPARATUS
A call captioning system for captioning a hearing user's (HU's) voice signal during an ongoing call with an assisted user (AU) includes an AU communication device with a display screen and a caption service activation feature, and a first processor programmed to receive the HU's voice signal during an ongoing call. Prior to activation of the caption service via the activation feature, the processor uses an automated speech recognition (ASR) engine to generate HU voice signal captions, detects errors in the HU voice signal captions, uses the errors to train the ASR engine on the HU's voice signal to increase the accuracy of the HU captions generated by the ASR engine, and stores the trained ASR engine for subsequent use. Upon activation of the caption service during the ongoing call, the processor uses the trained ASR engine to generate HU voice signal captions and present them to the AU via the display screen.
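The pre-activation loop above (caption, detect errors, fold corrections back in, store the trained engine per speaker) can be sketched as follows. Real ASR adaptation retrains acoustic or language models; this sketch stands in with a learned word-level correction table, and all names are assumptions for illustration.

```python
class SpeakerAdaptedASR:
    """Sketch of the pre-activation loop: caption the HU's voice, detect
    errors, learn corrections, and keep the adapted engine for reuse."""

    def __init__(self):
        self.corrections = {}   # word-level fixes learned for this HU's voice

    def transcribe(self, raw_hypothesis):
        # Apply learned corrections to the base engine's hypothesis.
        return " ".join(self.corrections.get(w, w) for w in raw_hypothesis.split())

    def train_on_error(self, recognized, actual):
        # An error was detected: remember the fix for subsequent captions.
        self.corrections[recognized] = actual

engines = {}  # stored trained engines, keyed by hearing user

def engine_for(hu_id):
    """Retrieve the stored engine for a hearing user, creating one if new."""
    return engines.setdefault(hu_id, SpeakerAdaptedASR())
```

Once the caption service is activated, `engine_for` returns the already-trained engine so captions start out at the improved accuracy.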
Selective internal forwarding in conferences with distributed media servers
A computer-implemented method comprises: establishing, by media servers, a video conference for client computing devices, each media server receiving audio data and video data from a local subset of the client computing devices; selecting, by each media server, a portion of the local subset for which to send audio data to other media servers; sending, by each media server, audio data associated with the portion to the other media servers; after receiving audio data from the other media servers, generating, by each media server, ordered global list data that identifies each client computing device for which the media server has received audio data; and sending, by each media server to the other media servers, based on the global list data, video data for each client computing device of the local subset that satisfies a threshold value.
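The per-server decisions above (pick a local portion to forward, merge local and received audio into an ordered global list, then forward video only for local clients above a threshold) can be sketched as below. Ranking by audio level and rank-based thresholding are assumptions; the abstract does not fix the selection criterion.

```python
def select_local_portion(local_audio_levels, portion_size):
    """Each media server picks the loudest local clients to forward audio for."""
    ranked = sorted(local_audio_levels, key=local_audio_levels.get, reverse=True)
    return ranked[:portion_size]

def global_list(local_levels, received_levels):
    """After receiving audio from peer servers, order every known client."""
    merged = {**local_levels, **received_levels}
    return sorted(merged, key=merged.get, reverse=True)

def video_targets(ordered, local_clients, threshold_rank):
    """Send video only for local clients ranked above the threshold."""
    return [c for c in ordered[:threshold_rank] if c in local_clients]
```

Forwarding only a portion of each local subset keeps inter-server bandwidth bounded while still letting every server build a consistent global ordering.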
PROVIDING HIGH QUALITY SPEECH RECOGNITION
A computer-implemented method, system and computer program product for providing high quality speech recognition. A first speech-to-text model is selected to perform speech recognition of a customer's spoken words and a second speech-to-text model is selected to perform speech recognition of the agent's spoken words during a call. The combined results of the speech-to-text models used to process the customer's and agent's spoken words are then analyzed to generate a reference speech-to-text result. The customer speech data that was processed by the first speech-to-text model is reprocessed by multiple other speech-to-text models. A similarity analysis is performed on the results of these speech-to-text models with respect to the reference speech-to-text result resulting in similarity scores being assigned to these speech-to-text models. The speech-to-text model with the highest similarity score is then selected as the new speech-to-text model for performing speech recognition of the customer's spoken words during the call.
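The final selection step (score each candidate model's reprocessed transcript against the reference result, pick the highest) can be sketched with a simple word-level similarity. `difflib.SequenceMatcher` stands in for whatever similarity analysis the patent intends; the model names are illustrative.

```python
from difflib import SequenceMatcher

def similarity(candidate, reference):
    """Word-level similarity between a model's transcript and the reference."""
    return SequenceMatcher(None, candidate.split(), reference.split()).ratio()

def select_best_model(reprocessed, reference):
    """Assign each candidate speech-to-text model a similarity score against
    the reference result and return the model with the highest score."""
    scores = {model: similarity(text, reference)
              for model, text in reprocessed.items()}
    return max(scores, key=scores.get)
```

The selected model then replaces the first model for the remainder of the call's customer-side recognition.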
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM
An information processing device includes a processor configured to, in a case where a service is being used in which at least speech is exchanged among multiple users such that a conversation takes place among all of the multiple users: output a speech of a separate conversation, distinctly from the speech of the conversation taking place among all of the multiple users, to a device of a user who is engaged in the separate conversation with a specific user from among the multiple users; and output the speech of the conversation taking place among all of the multiple users, without outputting the speech of the separate conversation, to a device of a user who is not engaged in the separate conversation.
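The routing rule above reduces to: speech from inside a separate conversation is delivered only to that conversation's members (marked distinctly), while all-hands speech goes to everyone else. A minimal sketch, with group membership and channel labels as assumed representations:

```python
def route_speech(speaker, utterance, all_users, side_groups):
    """Decide which users' devices receive an utterance. Speech within a
    separate conversation reaches only that group's members; users outside
    it keep hearing only the conversation among all of the multiple users."""
    for group in side_groups:
        if speaker in group:
            # Separate conversation: deliver distinctly, and only to the group.
            return {user: ("separate", utterance)
                    for user in group if user != speaker}
    # All-hands speech goes to every other participant.
    return {user: ("main", utterance) for user in all_users if user != speaker}
```

In practice the "separate" channel label would drive distinct output (e.g. a different spatial position or volume) on the receiving device.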
SYSTEM AND METHOD FOR TRACKING AND DISPLAY OF COMPLIANCE WITH INSTRUCTIONS PROVIDED BY EMERGENCY CALL TAKER
Techniques for tracking and display of compliance with instructions provided by emergency call takers are provided. An artificial intelligence (AI) bot monitors a conversation between an emergency caller and an emergency call taker. The AI bot identifies at least one instruction issued by the emergency call taker to the emergency caller. The AI bot determines when execution of the at least one instruction has been confirmed.
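The AI bot's three steps (monitor the conversation, identify issued instructions, mark them confirmed once execution is indicated) can be sketched as a turn-by-turn state machine. The cue phrases and speaker labels below are hypothetical; a production bot would use trained intent models rather than substring matching.

```python
# Hypothetical cue phrases standing in for the AI bot's trained detectors.
INSTRUCTION_CUES = ("apply pressure", "start compressions", "unlock the door")
CONFIRMATION_CUES = ("okay i did", "done", "i'm doing it")

def monitor_turn(speaker, text, state):
    """Track instructions issued by the call taker and mark them confirmed
    when the caller's replies indicate execution."""
    lowered = text.lower()
    if speaker == "call_taker":
        for cue in INSTRUCTION_CUES:
            if cue in lowered:
                state.setdefault(cue, "pending")
    elif speaker == "caller" and any(c in lowered for c in CONFIRMATION_CUES):
        for cue, status in state.items():
            if status == "pending":
                state[cue] = "confirmed"   # confirm the earliest pending instruction
                break
```

The `state` dict is what a compliance display would render: each issued instruction alongside its pending/confirmed status.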
Captioned telephone services improvement
Internet Protocol captioned telephone service, often utilizing Automated Speech Recognition, can be used with conference calls to separate out each party's speech as text, such as with text bubbles differentiated by caller on a device of the user. Additionally, a prioritized vocabulary can be provided for each user that is not shared with the public, so that if the user utters words in their speech that are not common among the general public, those words can be more accurately identified by the telephone service. The service may learn and apply that vocabulary, and/or the user may provide words to the service.
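One simple way the private prioritized vocabulary could bias recognition is to rescore competing ASR hypotheses by how many of the user's vocabulary terms they contain. This is an assumed mechanism (hypothesis rescoring); the patent leaves the biasing method open.

```python
def rescore_with_vocabulary(hypotheses, user_vocabulary):
    """Pick the ASR hypothesis containing the most terms from the user's
    private, prioritized vocabulary (not shared across users)."""
    def vocab_hits(text):
        return sum(1 for w in text.lower().split() if w in user_vocabulary)
    return max(hypotheses, key=vocab_hits)
```

A user's medical or professional jargon, learned by the service or supplied directly, would then win out over acoustically similar everyday phrases.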
Systems and methods for prioritizing emergency calls
Systems for and methods of determining the priority of a call interaction include receiving a call interaction from a call center; validating, by a validation and transcription engine, that the call interaction is authentic; converting, by the validation and transcription engine, the call interaction into text; extracting, by a data calculation engine, organization, location, and time information from the text; calculating, by the data calculation engine, a priority of the call interaction from the extracted information and the text by determining the importance of words in the text and correlating the words to a priority class using a pre-trained algorithm that is trained on emergency-type and emergency-services-type language; determining that the call interaction should be transmitted to a queue of the call center for initial handling by a call center agent; and transmitting the call interaction, the calculated priority, and the extracted information to the call center.
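The priority-calculation step can be sketched with keyword weights mapped to priority classes. The weight table below is a hypothetical stand-in for the pre-trained algorithm trained on emergency-type language, and the thresholds are illustrative.

```python
# Hypothetical word weights standing in for the pre-trained emergency-language model.
EMERGENCY_WEIGHTS = {"fire": 3, "unconscious": 3, "bleeding": 2,
                     "accident": 2, "noise": 1}

def calculate_priority(transcript):
    """Score the importance of words against emergency-type language and
    correlate the total to a priority class."""
    score = sum(EMERGENCY_WEIGHTS.get(w.strip(".,!?"), 0)
                for w in transcript.lower().split())
    if score >= 4:
        return "high"
    if score >= 2:
        return "medium"
    return "low"
```

The resulting class would accompany the transcript and extracted organization/location/time fields when the interaction is queued for an agent.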
SYSTEMS AND METHODS FOR PRIORITIZING EMERGENCY CALLS
Systems for and methods of determining the priority of a call interaction include receiving a call interaction from a call center; validating, by a validation and transcription engine, that the call interaction is authentic; converting, by the validation and transcription engine, the call interaction into text; calculating, by a data calculation engine, a priority of the call interaction from the text and organization, location, and time information in the text by determining the importance of words in the text and correlating the words to a priority class using a pre-trained algorithm that is trained on emergency-type and emergency-services-type language; determining that the call interaction should be transmitted to the call center for initial handling by a call center agent; and transmitting the call interaction, the calculated priority, and the extracted information to the call center.