Patent classifications
G10L17/00
Automated transcript generation from multi-channel audio
Systems and methods are described for generating a transcript of a legal proceeding or other multi-speaker conversation or performance in real time or near-real time using multi-channel audio capture. Different speakers or participants in a conversation may each be assigned a separate microphone that is placed in proximity to the given speaker, where each audio channel includes audio captured by a different microphone. Filters may be applied to isolate each channel to include speech utterances of a different speaker, and these filtered channels of audio data may then be processed in parallel to generate speech-to-text results that are interleaved to form a generated transcript.
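The interleaving step described in this abstract can be sketched as a timestamp-ordered merge of per-channel speech-to-text results. This is only an illustration under stated assumptions: the `Utterance` record, its fields, and the `interleave` function are hypothetical names, and the abstract does not say how the patented system orders or formats results.

```python
import heapq
from typing import List, NamedTuple


class Utterance(NamedTuple):
    """Hypothetical speech-to-text result for one channel (not from the patent)."""
    start: float   # seconds since the start of the recording
    speaker: str   # label of the speaker assigned to this microphone
    text: str      # recognized text


def interleave(channels: List[List[Utterance]]) -> List[str]:
    """Merge per-channel results into one transcript ordered by start time.

    Assumes each channel's list is already sorted by start time, which is
    what a streaming recognizer would naturally produce.
    """
    merged = heapq.merge(*channels, key=lambda u: u.start)
    return [f"[{u.start:.1f}s] {u.speaker}: {u.text}" for u in merged]


judge = [
    Utterance(0.0, "JUDGE", "Please state your name."),
    Utterance(5.2, "JUDGE", "Thank you."),
]
witness = [Utterance(2.1, "WITNESS", "Jane Doe.")]

transcript = interleave([judge, witness])
for line in transcript:
    print(line)
```

Because each channel is processed independently, a lazy k-way merge of this kind lets the channels be transcribed in parallel and combined afterwards without a full re-sort.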
Sending messages from smart speakers and smart displays via smartphones
Techniques are described herein for using a smart device such as a standalone assistant-centric interactive speaker and/or a standalone assistant-centric interactive display with speaker(s) to send a message using a messaging application on a client device such as a smartphone. A method includes: receiving, by a first device, a request from a first user to send a message to a second user; determining that a messaging application corresponding to the request is unavailable on the first device; and in response to determining that the messaging application corresponding to the request is unavailable on the first device: selecting a second device on which the messaging application corresponding to the request is available; and sending, to the second device, a command that causes the second device to send the message from the first user to the second user using the messaging application on the second device.
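The fallback logic in this method can be sketched as follows. The `Device` class, its `apps` and `outbox` fields, and `route_message` are hypothetical names invented for illustration; appending to `outbox` merely stands in for the command that causes the second device to send the message.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Device:
    """Hypothetical device model; not an API from the patent."""
    name: str
    apps: Set[str] = field(default_factory=set)
    outbox: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def send(self, app: str, sender: str, recipient: str, body: str) -> None:
        # Stand-in for actually sending through the messaging application.
        self.outbox.append((app, sender, recipient, body))


def route_message(first: Device, others: List[Device], app: str,
                  sender: str, recipient: str, body: str) -> Optional[Device]:
    """Send from the first device if it has the app, else pick a second device."""
    if app in first.apps:
        first.send(app, sender, recipient, body)
        return first
    for candidate in others:
        if app in candidate.apps:
            # Assumption: the first match wins; the abstract does not say
            # how the second device is selected among several candidates.
            candidate.send(app, sender, recipient, body)
            return candidate
    return None  # no device can handle the request


speaker = Device("smart-speaker")
phone = Device("smartphone", apps={"chat"})
handled_by = route_message(speaker, [phone], "chat", "alice", "bob", "hi")
```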
Telephone system for the hearing impaired
A telephone system is described herein, wherein the telephone system is configured to assist a hearing-impaired person with telephone communications as well as face-to-face conversations. In telephone communication sessions, the telephone system is configured to audibly emit spoken utterances while simultaneously depicting a transcription of the spoken utterances on a display. When the telephone system is not employed in a telephone communication session, the telephone system is configured to display transcriptions of spoken utterances of people who are in proximity to the telephone system.
Digital assistant and a corresponding method for voice-based interactive communication based on detected user gaze indicating attention
A method for voice-based interactive communication using a digital assistant, wherein the method comprises: an attention detection step, in which the digital assistant detects user attention and as a result is set into a listening mode; a speaker detection step, in which the digital assistant detects the user as a current speaker; a speech sound detection step, in which the digital assistant detects and records speech uttered by the current speaker, which speech sound detection step further comprises a lip movement detection step, in which the digital assistant detects a lip movement of the current speaker; a speech analysis step, in which the digital assistant parses said recorded speech and extracts speech-based verbal informational content from said recorded speech; and a subsequent response step, in which the digital assistant provides feedback to the user based on said recorded speech.
System and method for data augmentation for multi-microphone signal processing
A method, computer program product, and computing system for receiving a signal from each microphone of a plurality of microphones, thus defining a plurality of signals. One or more inter-microphone gain-based augmentations may be performed on the plurality of signals, thus defining one or more inter-microphone gain-augmented signals.
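One possible reading of an inter-microphone gain-based augmentation is to scale each microphone's signal by its own randomly drawn gain, so that the level differences between microphones vary across training examples. The sketch below assumes that reading; the function name `gain_augment`, the uniform dB range, and the list-of-floats signal representation are all assumptions, since the abstract does not specify how the gains are chosen or applied.

```python
import random
from typing import List, Optional, Tuple


def gain_augment(signals: List[List[float]], max_db: float = 3.0,
                 rng: Optional[random.Random] = None
                 ) -> Tuple[List[List[float]], List[float]]:
    """Apply an independent random gain (in dB) to each microphone signal.

    Returns the augmented signals and the gains that were drawn, so the
    augmentation can be logged or reproduced.
    """
    rng = rng or random.Random()
    gains_db = [rng.uniform(-max_db, max_db) for _ in signals]
    augmented = [
        [sample * 10 ** (gain / 20) for sample in signal]  # dB to linear scale
        for signal, gain in zip(signals, gains_db)
    ]
    return augmented, gains_db


mics = [[1.0, -0.5, 0.25], [0.2, 0.4, -0.1]]
augmented, gains_db = gain_augment(mics, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the example reproducible; in training, a fresh draw per example would be more typical.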
System and method for quantifying meeting effectiveness using natural language processing
Systems, methods, and computer-readable storage media for quantifying meeting effectiveness for an individual. A system configured as disclosed herein uses data from multiple meetings in which a user participated to create a user profile for the user. The system then receives data related to a new meeting in which the user participated, processes the new meeting data into segments using natural language processing, tags the resulting segments based on contexts, and compares the tagged segments to the user profile to generate a meeting effectiveness score for the new meeting which is specific to the user. The system can use machine learning to iteratively improve an ability of the system to generate the tagged segments using historical meeting data and updating that historical meeting data with each iteration of scoring a meeting's effectiveness.
Terminal and Operating Method Thereof
A terminal may include: a display that is divided into at least two areas when a real-time broadcast, in which a user of the terminal is the host, starts through a broadcasting channel, one of the at least two areas being allocated to the host; an input/output interface that receives a voice of the host; a communication interface that receives, from a terminal of a certain guest among one or more guests who entered the broadcasting channel, one item selected from one or more items and a certain text; and a processor that generates a voice message in which the certain text is converted into the voice of the host or a voice of the certain guest.
Voice Biometric Authentication in a Virtual Assistant
Aspects of the disclosure relate to voice biometric authentication in a virtual assistant. In some embodiments, a computing platform may receive, from a user device, an audio file comprising a voice command to access information related to a user account. The computing platform may retrieve one or more voice biometric signatures from a voice biometric database associated with the user account, and apply a voice biometric matching algorithm to compare the voice command of the audio file to the one or more voice biometric signatures to determine if a match exists between the voice command and one of the one or more voice biometric signatures. In response to determining that a match exists, the computing platform may retrieve information associated with the user account, and then send, via the communication interface, the information associated with the user account to the user device.
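The match-then-retrieve flow in this abstract can be sketched as below. The abstract does not name the matching algorithm, so this example swaps in a common stand-in: cosine similarity between fixed-length voice embeddings with a fixed threshold. The function names, the threshold of 0.8, and the embedding vectors are all hypothetical.

```python
import math
from typing import Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_exists(probe: List[float], signatures: List[List[float]],
                 threshold: float = 0.8) -> bool:
    """True if the probe is close enough to any stored signature."""
    return any(cosine(probe, sig) >= threshold for sig in signatures)


def handle_request(probe: List[float], signatures: List[List[float]],
                   account_info: Dict[str, int]) -> Optional[Dict[str, int]]:
    """Return account information only when a biometric match exists."""
    return account_info if match_exists(probe, signatures) else None


stored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # enrolled signatures (made up)
account = {"balance": 100}                     # made-up account data

granted = handle_request([0.9, 0.1, 0.0], stored, account)
denied = handle_request([0.0, 0.0, 1.0], stored, account)
```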