Patent classifications
H04M2201/405
SELECTIVE INTERNAL FORWARDING IN CONFERENCES WITH DISTRIBUTED MEDIA SERVERS
A computer-implemented method comprises: establishing, by media servers, a video conference for client computing devices, each media server receiving audio data and video data from a local subset of the client computing devices; selecting, by each media server, a portion of the local subset for which to send audio data to other media servers; sending, by each media server, audio data associated with the portion to other media servers; after receiving audio data from other media servers, generating, by each media server, ordered global list data that identifies each client computing device for which the media server has received audio data; and sending, by each media server to other media servers and based on the global list data, video data for each client computing device of the local subset that satisfies a threshold value.
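The selection and forwarding steps can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Audio` record, the use of a loudness `level` as the ordering key, and the reading of the threshold as a rank cutoff are all assumptions that the abstract does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Audio:
    client_id: str
    server_id: str
    level: float  # hypothetical loudness measure used for ordering


def select_portion(local: List[Audio], max_forwarded: int) -> List[Audio]:
    """Pick the loudest local clients whose audio is sent to other servers."""
    return sorted(local, key=lambda a: a.level, reverse=True)[:max_forwarded]


def global_list(local: List[Audio], from_others: List[Audio]) -> List[Audio]:
    """Order every client for which this server has audio, loudest first."""
    return sorted(local + from_others, key=lambda a: (-a.level, a.client_id))


def video_to_forward(server_id: str, ordered: List[Audio],
                     rank_threshold: int) -> List[str]:
    """Local clients ranked within the threshold get their video forwarded."""
    return [a.client_id for rank, a in enumerate(ordered)
            if rank < rank_threshold and a.server_id == server_id]


server_a = [Audio("a1", "A", 0.9), Audio("a2", "A", 0.2), Audio("a3", "A", 0.5)]
server_b = [Audio("b1", "B", 0.7), Audio("b2", "B", 0.1)]

sent_by_a = select_portion(server_a, max_forwarded=2)
sent_by_b = select_portion(server_b, max_forwarded=2)

list_at_a = global_list(server_a, sent_by_b)
list_at_b = global_list(server_b, sent_by_a)
```

With a rank threshold of 2, server A would forward video only for `a1` and server B only for `b1`, so the quieter participants consume no inter-server video bandwidth.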
Virtual Office Receptionist
Provided is an office receptionist system formed from a distributed set of system valets and a system concierge. The system valets record and pass human inquiries at various points of ingress and egress to the system concierge. The system concierge parses each inquiry, determines the type of inquiry being made, and further determines whether the inquiry provides sufficient information for the determined inquiry type. The sufficiency of the inquiry is determined from a rule set that further defines different data sources from which the system concierge obtains data elements for generating a response to the inquiry as well as the actions to perform as part of responding to the inquiry. The response is returned to the system valet originating the inquiry for playback thereon.
ELECTRONIC APPARATUS FOR RECOGNIZING KEYWORD INCLUDED IN YOUR UTTERANCE TO CHANGE TO OPERATING STATE AND CONTROLLING METHOD THEREOF
An apparatus comprises one or more processors, a communication circuit, and a memory storing instructions which, when executed, perform a method of recognizing a user utterance. The method comprises: receiving first data associated with a user utterance, performing a first determination to determine whether the user utterance includes the first data and a specified word, performing a second determination to determine whether the first data includes the specified word, transmitting the first data to an external server, receiving a text generated from the first data by the external server, performing a third determination to determine whether the received text matches the specified word, and determining whether to activate the voice-based input system based on the third determination.
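A rough sketch of the three-stage check is shown below. The detector callables, the choice to let each determination gate the next, and the token-based text match are assumptions made for illustration; the abstract states only that activation depends on the third determination.

```python
from typing import Callable


def should_activate(audio: bytes,
                    wake_word: str,
                    first_check: Callable[[bytes], bool],
                    second_check: Callable[[bytes], bool],
                    transcribe: Callable[[bytes], str]) -> bool:
    """Run two local determinations, then confirm against server text."""
    # Assumption: a failed local determination stops the cascade early.
    if not first_check(audio):
        return False
    if not second_check(audio):
        return False
    # Hypothetical stand-in for sending the data to the external server.
    text = transcribe(audio)
    # Third determination: does the returned text contain the wake word?
    return wake_word.casefold() in text.casefold().split()


def always(_: bytes) -> bool:
    return True


def never(_: bytes) -> bool:
    return False


accepted = should_activate(b"pcm", "hello", always, always, lambda _: "Hello there")
rejected_by_text = should_activate(b"pcm", "hello", always, always, lambda _: "yellow there")
rejected_locally = should_activate(b"pcm", "hello", never, always, lambda _: "hello")
```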
Systems and methods for intelligent call agent evaluations
A computer-implemented method is provided for quantitative performance evaluation of a call agent. The method comprises converting an audio recording of a call between the call agent and a customer to a text-based transcript and identifying at least one topic for categorizing the transcript. The method also includes retrieving a set of criteria associated with the topic. Each criterion correlates to a set of predefined questions for interrogating the transcript to evaluate the performance of the call agent with respect to the corresponding criterion. Each question captures a sub-criterion under the corresponding criterion. The method further includes inputting the predefined questions and the transcript into a trained large language model to obtain scores for respective ones of the predefined questions. Each score measures a degree of satisfaction of the performance of the call agent during the call with respect to the sub-criterion captured by the corresponding predefined question.
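The scoring flow might look like the sketch below. The `score_question` callable stands in for the trained large language model and is hypothetical, as is the choice to average question scores into a per-criterion score; the abstract describes only the per-question scores.

```python
from statistics import mean
from typing import Callable, Dict, List


def evaluate_call(transcript: str,
                  criteria: Dict[str, List[str]],
                  score_question: Callable[[str, str], float]) -> Dict[str, dict]:
    """Score each predefined question, then summarize per criterion."""
    report = {}
    for criterion, questions in criteria.items():
        # One score per question, each capturing a sub-criterion.
        scores = {q: score_question(q, transcript) for q in questions}
        # Assumption: the criterion score is the mean of its question scores.
        report[criterion] = {"questions": scores, "score": mean(scores.values())}
    return report


criteria = {
    "courtesy": [
        "Did the agent greet the customer?",
        "Did the agent thank the customer?",
    ],
}

# Stub in place of a model call, with fixed answers for demonstration.
canned = {
    "Did the agent greet the customer?": 1.0,
    "Did the agent thank the customer?": 0.0,
}
report = evaluate_call("Agent: Hello, how can I help?", criteria,
                       lambda question, _transcript: canned[question])
```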
Bridge for Non-Voice Communications User Interface to Voice-Enabled Interactive Voice Response System
A bridge is provided for using a non-voice-based user interface, such as a text chat interface, with a voice-enabled interactive voice response system. During a non-voice-based communication session with a client user device, the bridge receives from the client user device a non-voice entry entered by a client user into the communication session; identifies one or more elements in the non-voice entry constrained by one or more allowed responses by the voice-enabled interactive voice response system; maps the one or more elements to one or more of the allowed responses; and passes the mapped one or more identified elements to the voice-enabled interactive voice response system as an input via emulation of a voice recognition analysis response.
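The mapping step can be illustrated with a small sketch. The phrase table, the whole-word matching rule, and the shape of the emulated recognition result (`utterance` and `confidence` fields) are assumptions for demonstration and do not describe any particular IVR product.

```python
import re
from typing import Dict, List


def map_entry(entry: str, allowed: Dict[str, List[str]]) -> List[str]:
    """Return allowed IVR responses found in a chat entry, in text order."""
    text = entry.lower()
    hits = []
    for response, phrases in allowed.items():
        positions = []
        for phrase in phrases:
            # Whole-word match so "bill" does not fire inside "billboard".
            match = re.search(r"\b" + re.escape(phrase.lower()) + r"\b", text)
            if match:
                positions.append(match.start())
        if positions:
            hits.append((min(positions), response))
    return [response for _, response in sorted(hits)]


def emulate_recognition(responses: List[str]) -> List[dict]:
    """Wrap mapped responses as if a speech recognizer had produced them."""
    # Hypothetical result format; typed text is treated as fully confident.
    return [{"utterance": r, "confidence": 1.0} for r in responses]


allowed = {
    "billing": ["billing", "my bill", "invoice"],
    "support": ["support", "technical help"],
}
mapped = map_entry("I have a question about my bill", allowed)
ivr_input = emulate_recognition(mapped)
```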
Inbound calls to intelligent controlled-environment facility resident media and/or communications devices
An inbound call connection request may be received from a non-resident, directed to a controlled-environment facility resident and/or the resident's device. A determination may be made that a calling account of the resident does not have sufficient funds to pay for the inbound call, whereupon a message may be provided to the non-resident offering billing options, including at least a wireless carrier billing option, to complete the call connection. The call may be connected with the resident device in response to a determination that a calling account of the resident has sufficient funds to pay for the call or acceptance of one of the payment methods by the non-resident, along with authentication that the non-resident is associated with an address identifier (AID) of the resident device, and verification that the resident operating the device is associated with the AID of the device.
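The connection decision reduces to a payment condition combined with two identity checks, as the sketch below shows. The `InboundCall` fields are hypothetical names for the determinations listed in the abstract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InboundCall:
    resident_has_funds: bool
    accepted_payment_method: Optional[str]  # e.g. "wireless_carrier", or None
    caller_authenticated_for_aid: bool      # non-resident tied to the device AID
    resident_verified_for_aid: bool         # resident tied to the device AID


def should_connect(call: InboundCall) -> bool:
    """Connect only if the call is paid for and both parties check out."""
    paid = call.resident_has_funds or call.accepted_payment_method is not None
    return (paid
            and call.caller_authenticated_for_aid
            and call.resident_verified_for_aid)


carrier_billed = InboundCall(False, "wireless_carrier", True, True)
unpaid = InboundCall(False, None, True, True)
unverified = InboundCall(True, None, True, False)
```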
System and method for analyzing and classifying calls without transcription via keyword spotting
A facility and method for analyzing and classifying calls without transcription via keyword spotting is disclosed. The facility uses a group of calls having known outcomes to generate one or more domain- or entity-specific grammars containing keywords and related information that are indicative of a particular outcome. The facility monitors telephone calls by determining the domain or entity associated with the call, loading the appropriate grammar or grammars associated with the determined domain or entity, and tracking keywords contained in the loaded grammar or grammars that are spoken during the monitored call, along with additional information. The facility performs a statistical analysis on the tracked keywords and additional information to determine a classification for the monitored telephone call.
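A toy version of the classification step is sketched below. The grammar contents, the keyword weights, and the weighted-sum threshold are placeholders chosen for illustration; the abstract does not say which statistical analysis the facility performs, and in practice the weights would be derived from calls with known outcomes.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical per-domain grammars: keyword -> weight toward a "conversion".
GRAMMARS: Dict[str, Dict[str, float]] = {
    "dental": {"appointment": 2.0, "schedule": 1.5, "wrong number": -3.0},
}


def classify_call(domain: str, spotted: List[str], threshold: float = 2.0) -> str:
    """Classify a call from spotted keywords using the domain's grammar."""
    grammar = GRAMMARS[domain]
    # Track only keywords that belong to the loaded grammar.
    counts = Counter(k for k in spotted if k in grammar)
    score = sum(grammar[k] * n for k, n in counts.items())
    return "conversion" if score >= threshold else "no_conversion"


booked = classify_call("dental", ["appointment", "schedule", "hello"])
misdial = classify_call("dental", ["wrong number", "appointment"])
```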
System and method for tracking and display of compliance with instructions provided by emergency call taker
Techniques for tracking and display of compliance with instructions provided by emergency call takers are provided. An artificial intelligence (AI) bot monitors a conversation between an emergency caller and an emergency call taker. The AI bot identifies at least one instruction issued by the emergency call taker to the emergency caller. The AI bot determines when execution of the at least one instruction has been confirmed.
CONTROL METHOD FOR CONTROL DEVICE, CONTROL METHOD FOR APPARATUS CONTROL SYSTEM, AND CONTROL DEVICE
Provided is a control method for a control device including: acquiring a user instruction to control a control target apparatus by a user, generating control speech information in response to the user instruction, the control speech information being speech information representing content of control on the control target apparatus and including auxiliary speech information which is different information from the user instruction, and outputting the generated control speech information to a speech recognition server which executes speech recognition processing.
ACTIVE VOICE LIVENESS DETECTION SYSTEM
Disclosed are systems and methods including software processes executed by a server that detect audio-based synthetic speech (deepfakes) in a call conversation. Embodiments include systems and methods for detecting fraudulent presentation attacks using multiple functional engines that implement various fraud-detection techniques, to produce calibrated scores and/or fused scores. A computer may, for example, evaluate the audio quality of speech signals within audio signals, where speech signals contain the speech portions having speaker utterances.
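One simple way to combine per-engine scores into a fused score is a weighted average, sketched below. The engine names, the weights, and the averaging rule are assumptions; the abstract does not state how scores are calibrated or fused.

```python
from typing import Dict


def fuse_scores(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of calibrated engine scores, each in [0, 1]."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical engines: audio quality and liveness, with liveness weighted 3x.
fused = fuse_scores({"audio_quality": 0.8, "liveness": 0.4},
                    {"audio_quality": 1.0, "liveness": 3.0})
```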