H04M2201/405

INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM
20210105437 · 2021-04-08

An information processing device includes an image acquirer that acquires a captured image captured by an imager, an utterer identifier that identifies an utterer, a display target identifier that identifies a display target corresponding to the utterer identified by the utterer identifier from the captured image acquired by the image acquirer, and a display processor that displays display information corresponding to the display target identified by the display target identifier, on a first display.

DISRUPTED-SPEECH MANAGEMENT ENGINE FOR A MEETING MANAGEMENT SYSTEM
20230412734 · 2023-12-21

Methods, systems, and computer storage media are provided for a disrupted-speech assistance service associated with a disrupted-speech management engine of a meeting management system. The disrupted-speech assistance service is an accessibility service that supports accessibility operations of the disrupted-speech management engine to provide disrupted-speech assistance features in the meeting management system. In operation, meeting data comprising audio data is accessed. The audio data is analyzed to determine that it comprises disrupted speech at a threshold level of disrupted speech. Based on that determination, one or more disrupted-speech assistance operations for a meeting can be executed. The one or more disrupted-speech assistance operations comprise identifying a disrupted-speech word and determining an alternative word for the disrupted-speech word. A disrupted-speech assistance interface is generated based on the one or more disrupted-speech assistance operations. The disrupted-speech assistance interface comprises the alternative word for the disrupted-speech word.
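The threshold check and word substitution described in this abstract can be illustrated with a minimal sketch. The abstract does not say how disruption is measured, so the per-word `disrupted` flag, the `THRESHOLD` value, and the `ALTERNATIVES` table below are all hypothetical stand-ins for an upstream audio analyzer and a real vocabulary model.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    disrupted: bool  # assumed to be set by an upstream audio analyzer


# Hypothetical threshold: fraction of disrupted words that triggers assistance.
THRESHOLD = 0.2

# Hypothetical table of alternative words offered to the speaker.
ALTERNATIVES = {"statistics": "numbers", "particularly": "mainly"}


def disruption_ratio(words):
    """Fraction of words flagged as disrupted (0.0 for an empty transcript)."""
    if not words:
        return 0.0
    return sum(w.disrupted for w in words) / len(words)


def assist(words):
    """Return (disrupted word, alternative word) pairs once the threshold is met."""
    if disruption_ratio(words) < THRESHOLD:
        return []
    # Fall back to the original word when no alternative is known.
    return [(w.text, ALTERNATIVES.get(w.text, w.text)) for w in words if w.disrupted]
```

The pairs returned by `assist` are the kind of data a disrupted-speech assistance interface could display; how the interface renders them is outside this sketch.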

TARGETED GENERATIVE AI FROM MERGED COMMUNICATION TRANSCRIPTS
20230410801 · 2023-12-21

The present disclosure relates generally to systems, methods, instructions, and other aspects describing automated transcription and associated script generation. In one aspect, a method includes facilitating a voice bot segment of a two-way communication session, where the voice bot segment is between a customer device and a non-human bot agent, and transfer of the session to a human agent device as part of a human voice segment of the two-way communication session, wherein the transfer occurs following a failure of the non-human bot agent to resolve a customer issue. The method further includes accessing survey data describing the two-way communication session, wherein the survey data is associated with successful resolution of the customer issue, and automatically processing transcript data from the two-way communication session with the survey data to identify language data from the transcript associated with resolution of the customer issue. The non-human bot agent is then dynamically updated using the language data.

DOCUMENT IDENTIFICATION DEVICE, DOCUMENT IDENTIFICATION METHOD, AND PROGRAM

A document identification device that improves class identification precision of multi-stream documents is provided. The document identification device includes: a primary stream expression generation unit that generates a primary stream expression, which is a fixed-length vector of a word sequence corresponding to each speaker's speech recorded in a setting including a plurality of speakers, for each speaker; a primary multi-stream expression generation unit that generates a primary multi-stream expression obtained by integrating the primary stream expression; a secondary stream expression generation unit that generates a secondary stream expression, which is a fixed-length vector generated based on the word sequence of each speaker and the primary multi-stream expression, for each speaker; and a secondary multi-stream expression generation unit that generates a secondary multi-stream expression obtained by integrating the secondary stream expression.
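The two-pass data flow in this abstract (per-speaker vectors, integration, context-aware per-speaker vectors, integration again) can be traced with a toy sketch. The abstract does not specify the encoders or the integration function; the bucketed word-count `encode`, the element-wise mean in `integrate`, and the vector length `DIM` are hypothetical simplifications chosen only to make the flow runnable.

```python
DIM = 4  # hypothetical fixed vector length


def encode(words, context=None):
    """Toy fixed-length encoder: bucketed word counts, optionally offset by a context vector."""
    vec = [0.0] * DIM
    for w in words:
        vec[sum(map(ord, w)) % DIM] += 1.0
    if context is not None:
        vec = [v + c for v, c in zip(vec, context)]
    return vec


def integrate(vectors):
    """Integrate per-speaker vectors by element-wise mean (an assumed choice)."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def two_pass(streams):
    """streams: one word sequence per speaker. Returns the secondary multi-stream expression."""
    primary = [encode(s) for s in streams]                    # primary stream expressions
    primary_multi = integrate(primary)                        # primary multi-stream expression
    secondary = [encode(s, primary_multi) for s in streams]   # secondary stream expressions
    return integrate(secondary)                               # secondary multi-stream expression
```

In a real system the final vector would feed a classifier; here it only shows that each speaker's second-pass vector depends on every speaker's first-pass vector.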

Voice user interface for wired communications system

A system connects a device to a Public Switched Telephone Network (PSTN) using an adapter. During a telephone call using the PSTN, the adapter may receive an incoming call from the PSTN and send caller identification to remote server(s). The remote server(s) may determine an identity of a first user currently on the telephone call and determine that the incoming call is directed to a second user. Based on the caller identification, the remote server(s) may send a notification to the second user indicating the incoming call. Alternatively, the remote server(s) may interrupt the current telephone call to announce the incoming call. For example, if a parent is on the telephone when an incoming call for a child is received, the remote server(s) may send a text message to the child without interrupting the current telephone call.

Reprioritizing waitlisted callers based on real-time biometric feedback

Techniques are described for reprioritizing waitlisted callers using biometric feedback. A biometric aspect of a calling user is monitored in real time via a sensor. A digital output characterizing an emotional state of the user is generated. The digital output is encoded and transmitted to a server via multi-frequency signaling. The server decodes the digital output and reprioritizes the calling user relative to other calling users in the waitlist, in order to expedite the calling user being serviced.
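The encode, transmit, decode, and reprioritize steps can be sketched as follows, assuming the multi-frequency signaling is standard DTMF and the emotional state is reduced to a single stress digit from 0 to 9. Both assumptions, along with the function names and the "higher stress is served first" policy, are illustrative and not taken from the abstract.

```python
# Standard DTMF (low Hz, high Hz) pairs for the digits 0-9.
DTMF = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477),
    "0": (941, 1336),
}
FREQS_TO_DIGIT = {pair: digit for digit, pair in DTMF.items()}


def encode(stress_level):
    """Caller side: map a hypothetical 0-9 stress level to a DTMF frequency pair."""
    if not 0 <= stress_level <= 9:
        raise ValueError("stress level must be between 0 and 9")
    return DTMF[str(stress_level)]


def decode(pair):
    """Server side: recover the stress level from a received frequency pair."""
    return int(FREQS_TO_DIGIT[pair])


def reprioritize(waitlist, caller_id, stress_level):
    """waitlist: list of (caller_id, stress_level) in service order.

    Updates one caller's level, then orders by descending stress.
    sorted() is stable, so callers with equal stress keep their relative order.
    """
    updated = [(cid, stress_level if cid == caller_id else lvl) for cid, lvl in waitlist]
    return sorted(updated, key=lambda entry: -entry[1])
```

For example, `reprioritize([("a", 2), ("b", 2), ("c", 1)], "c", decode(encode(7)))` moves caller `c` to the front while `a` and `b` keep their relative order.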

SELECTIVE INTERNAL FORWARDING IN CONFERENCES WITH DISTRIBUTED MEDIA SERVERS
20200336519 · 2020-10-22

A computer-implemented method comprises: establishing, by media servers, a video conference for client computing devices, each media server receiving audio data and video data from a local subset of the client computing devices; selecting, by each media server, a portion of the local subset for which to send audio data to other media servers; sending, by each media server, audio data associated with the portion to other media servers; after receiving audio data from other media servers, generating, by each media server, ordered global list data that identifies each client computing device for which the media server has received audio data; and, based on the global list data, sending, by each media server to other media servers, video data for each client computing device of the local subset that satisfies a threshold value.
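A minimal sketch of the per-server logic, under stated assumptions: audio data is reduced to one loudness number per client, the local portion is the `k` loudest clients, the global list is ordered loudest first, and the threshold is a rank cutoff `n`. None of these specifics appear in the abstract, and the function names are hypothetical.

```python
def select_local(levels, k):
    """levels: dict of client -> audio level. Keep the k loudest local clients."""
    top = sorted(levels, key=lambda c: (-levels[c], c))[:k]
    return {c: levels[c] for c in top}


def global_list(local_selected, received):
    """Merge local and remote selections into one list ordered loudest first.

    received: list of dicts (client -> level), one per remote media server.
    """
    merged = dict(local_selected)
    for remote in received:
        merged.update(remote)
    return sorted(merged, key=lambda c: (-merged[c], c))


def video_to_forward(ordered, local_clients, n):
    """Local clients ranked within the top n of the global list."""
    return [c for c in ordered[:n] if c in local_clients]
```

Because every server applies the same ordering to the same merged data, each one reaches the same global ranking independently and forwards video only for its own clients that rank high enough.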

SEMIAUTOMATED RELAY METHOD AND APPARATUS

A call captioning system for captioning a hearing user's (HU's) voice signal during an ongoing call with an assisted user (AU) includes an AU communication device with a display screen and a caption service activation feature, and a first processor programmed to receive the HU's voice signal during the ongoing call. Prior to activation of the caption service via the activation feature, the processor uses an automated speech recognition (ASR) engine to generate HU voice signal captions, detects errors in the HU voice signal captions, uses the errors to train the ASR engine to the HU's voice signal to increase the accuracy of the HU captions generated by the ASR engine, and stores the trained ASR engine for subsequent use. Upon activation of the caption service during the ongoing call, the processor uses the trained ASR engine to generate HU voice signal captions and present them to the AU via the display screen.

Semiautomated relay method and apparatus

A captioning system comprises one or more processors and a memory storing software that, when executed by the one or more processors, causes the system to generate text captions from speech data by at least: receiving, from a hearing user's (HU's) device, the HU's speech data; generating, at the one or more processors, first text captions from the speech data using a speech recognition algorithm; automatically determining, at the one or more processors, whether the generated first text captions meet a first accuracy threshold; when the first text captions meet the first accuracy threshold, sending the first text captions to an assisted user's (AU's) device for display; and when the first text captions do not meet the first accuracy threshold, generating, at the one or more processors, second text captions from the speech data based on user input to the speech recognition algorithm from a call assistant, and sending the second text captions to the AU's device for display.
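The threshold-based fallback can be sketched as a single routing function. The abstract does not say how accuracy is estimated, so this sketch assumes the recognizer returns a confidence score that stands in for accuracy; the `asr` and `assistant` callables and their signatures are hypothetical.

```python
def caption(speech, asr, assistant, threshold):
    """Return (caption text, source) for one chunk of speech data.

    asr(speech) -> (text, confidence); confidence is an assumed accuracy proxy.
    assistant(speech, draft) -> corrected text produced with call-assistant input.
    """
    first_text, confidence = asr(speech)
    if confidence >= threshold:
        # First captions meet the threshold: send them as-is.
        return first_text, "asr"
    # Otherwise generate second captions with call-assistant input.
    return assistant(speech, first_text), "assisted"
```

A caller would send the returned text to the AU's device in either case; the source label is only there to make the chosen path visible.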

Selective internal forwarding in conferences with distributed media servers
10708320 · 2020-07-07

A computer-implemented method comprises: establishing, by media servers, a video conference for client computing devices, each media server receiving audio data and video data from a local subset of the client computing devices; selecting, by each media server, a portion of the local subset for which to send audio data to other media servers; sending, by each media server, audio data associated with the portion to other media servers; after receiving audio data from other media servers, generating, by each media server, ordered global list data that identifies each client computing device for which the media server has received audio data; and, based on the global list data, sending, by each media server to other media servers, video data for each client computing device of the local subset that satisfies a threshold value.