G10L15/005

Digital Monitoring Badge System
20230228832 · 2023-07-20

A wearable badge for an employee that records and transmits audio from customer interactions with the employee, comprising two microphones and two microphone channels that focus one microphone on the speech of the employee and the other microphone on the speech of the customer, which simplifies diarization. The wearable badge also comprises a module that determines whether the employee is maintaining an appropriate social distance from customers.
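One way a dedicated channel per speaker can simplify diarization is sketched below. This is a minimal illustration under stated assumptions, not the method of the application: the abstract does not say how diarization is performed, and the per-frame energy comparison, the function names, and the `silence_floor` value are all hypothetical.

```python
from typing import List, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def label_frames(
    employee_ch: Sequence[float],
    customer_ch: Sequence[float],
    frame_len: int,
    silence_floor: float = 1e-4,  # assumed value, tune per device
) -> List[str]:
    """Label each frame by whichever channel carries more energy."""
    labels: List[str] = []
    n = min(len(employee_ch), len(customer_ch))
    for start in range(0, n - frame_len + 1, frame_len):
        e = frame_energy(employee_ch[start:start + frame_len])
        c = frame_energy(customer_ch[start:start + frame_len])
        if max(e, c) < silence_floor:
            labels.append("silence")
        elif e >= c:
            labels.append("employee")
        else:
            labels.append("customer")
    return labels


# Synthetic samples: employee speaks, then customer, then nobody.
employee = [0.5] * 4 + [0.05] * 4 + [0.0] * 4
customer = [0.1] * 4 + [0.6] * 4 + [0.0] * 4
labels = label_frames(employee, customer, frame_len=4)
```

A real system would likely smooth the labels over time and handle overlapping speech, which this sketch ignores.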

VIRTUAL RECEPTIONIST VIA VIDEOCONFERENCING
20230231974 · 2023-07-20

One disclosed example system includes a reception room meeting device configured for establishing a video conference with a device associated with a remote receptionist. The reception room meeting device sends a request for a video meeting with one of a plurality of candidate remote receptionists in response to receiving an activation signal triggered by a visitor to a reception area, and establishes the video meeting with a device associated with one remote receptionist selected based on the request. The system further includes a virtual receptionist system configured to access visitor data obtained by various input devices at the reception area, and determine the status of the visitor based on the visitor data. The virtual receptionist system further transmits the status of the visitor to the device associated with the selected remote receptionist to facilitate the check-in process.

System and method for identifying spoken language in telecommunications relay service
11705131 · 2023-07-18

A system for identifying spoken language in a telecommunications relay service includes a call serving entity and a plurality of automatic speech recognition groups, where each of the automatic speech recognition groups includes an associated automatic speech recognition engine that recognizes and transcribes speech to a predefined language. One of the plurality of automatic speech recognition groups is set as a default automatic speech recognition group, and the automatic speech recognition engines transcribe and convert peer voices into text packets. The text packets are scored by the automatic speech recognition engines and transmitted to the call serving entity, which determines whether the text packets meet a predetermined threshold based on their respective scores. The text packet having the highest score that meets or exceeds the predetermined threshold is transmitted to a user.
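The selection step in this abstract can be sketched as follows. This is an illustration only, not the patented implementation: the `TextPacket` type, the `select_packet` function, and the 0.0 to 1.0 score scale are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TextPacket:
    """Hypothetical container for the output of one ASR group."""
    language: str  # predefined language of the ASR group
    text: str      # transcribed text
    score: float   # engine-assigned score (assumed scale: 0.0 to 1.0)


def select_packet(
    packets: Iterable[TextPacket], threshold: float
) -> Optional[TextPacket]:
    """Return the highest-scoring packet that meets or exceeds the threshold."""
    eligible = [p for p in packets if p.score >= threshold]
    if not eligible:
        return None  # no group produced a sufficiently confident transcription
    return max(eligible, key=lambda p: p.score)


packets = [
    TextPacket("en", "hello how are you", 0.42),
    TextPacket("es", "hola como estas", 0.91),
    TextPacket("fr", "au la comment", 0.30),
]
best = select_packet(packets, threshold=0.8)
```

What the call serving entity does when no packet reaches the threshold is not stated in the abstract, so the sketch simply returns `None` in that case.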

Language-agnostic Multilingual Modeling Using Effective Script Normalization

A method includes obtaining a plurality of training data sets, each associated with a respective native language and each including a plurality of respective training data samples. For each respective training data sample of each training data set in the respective native language, the method includes transliterating the corresponding transcription in the respective native script into corresponding transliterated text representing the respective native language of the corresponding audio in a target script, and associating the corresponding transliterated text in the target script with the corresponding audio in the respective native language to generate a respective normalized training data sample. The method also includes training, using the normalized training data samples, a multilingual end-to-end speech recognition model to predict speech recognition results in the target script for corresponding speech utterances spoken in any of the different native languages associated with the plurality of training data sets.
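The normalization step described above can be sketched in a few lines. This is a toy illustration under stated assumptions: the `Sample` type, the `TRANSLIT` character tables, and the choice of Latin as the target script are hypothetical, and a real system would use a proper transliteration model rather than per-character lookup.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Sample:
    audio_id: str       # stand-in for the audio itself
    language: str       # native language of the audio
    transcription: str  # text paired with the audio


# Toy per-character tables mapping native scripts to a Latin target script.
TRANSLIT: Dict[str, Dict[str, str]] = {
    "ru": {"д": "d", "а": "a", "н": "n", "е": "e", "т": "t"},
    "el": {"ν": "n", "α": "a", "ι": "i"},
}


def transliterate(text: str, language: str) -> str:
    """Map each character to the target script, leaving unknown ones as-is."""
    table = TRANSLIT.get(language, {})
    return "".join(table.get(ch, ch) for ch in text)


def normalize(datasets: Dict[str, List[Sample]]) -> List[Sample]:
    """Pool all data sets, pairing each audio with target-script text."""
    pooled: List[Sample] = []
    for language, samples in datasets.items():
        for s in samples:
            pooled.append(
                Sample(s.audio_id, language, transliterate(s.transcription, language))
            )
    return pooled


pooled = normalize({
    "ru": [Sample("ru-001", "ru", "да"), Sample("ru-002", "ru", "нет")],
    "el": [Sample("el-001", "el", "ναι")],
})
```

The pooled samples share one output script, so a single model can be trained on all of them without per-language output vocabularies.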

Cognitive analysis for speech recognition using multi-language vector representations

A method, system and computer program product for speech recognition using multiple languages includes receiving, by one or more processors, an input from a user, the input including a sentence in a first language. The one or more processors translate the sentence to a plurality of languages different than the first language, and create vectors associated with the plurality of languages, each vector including a representation of the sentence in each of the plurality of languages. The one or more processors calculate eigenvectors for each vector associated with a language in the plurality of languages, and based on the calculated eigenvectors, a score is assigned to each of the plurality of languages according to a relevance for determining a meaning of the sentence.

Systems and methods for providing media based on a detected language being spoken

Various embodiments provide media based on a detected language being spoken. In one embodiment, the system electronically detects which language of a plurality of languages is being spoken by a user, such as during a conversation or while giving a voice command to the television. Based on which language of the plurality of languages is being spoken by the user, the system electronically presents media to the user that is in the detected language. For example, the media may be television channels and/or programs that are in the detected language and/or a program guide, such as a pop-up menu, including such media that are in the detected language.
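The presentation step can be illustrated with a short sketch. It assumes language detection has already produced a language code; the `Channel` type, the `guide_for_language` function, and the channel names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Channel:
    name: str
    language: str  # language code of the channel's programming


def guide_for_language(channels: List[Channel], detected_language: str) -> List[str]:
    """Build a program-guide listing limited to the detected language."""
    return [c.name for c in channels if c.language == detected_language]


channels = [
    Channel("Canal Uno", "es"),
    Channel("Evening News", "en"),
    Channel("Noticias 24", "es"),
]
spanish_guide = guide_for_language(channels, "es")
```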

VOICE TRANSLATION AND VIDEO MANIPULATION SYSTEM
20230009957 · 2023-01-12

A communication modification system including an audio gathering unit that gathers an audio stream, a language detection unit that converts the audio stream into text, where the language detection unit correlates portions of the text with audio portions of the audio stream, and the language detection unit determines a first and second deviation in the audio stream portion based on the text portion and audio portion gathered by the audio gathering unit.

COMPENSATING FOR HARDWARE DISPARITIES WHEN DETERMINING WHETHER TO OFFLOAD ASSISTANT-RELATED PROCESSING TASKS FROM CERTAIN CLIENT DEVICES
20230215438 · 2023-07-06

Implementations set forth herein relate to off-loading computational tasks to a separate computing device, or temporarily ceasing such off-loading, based on one or more network metrics that are not limited to signal strength. Rather, a network metric for determining whether to continue relying on a network connection with a server computing device for certain computational tasks can be based on a current, or recent, interaction with the server computing device. In this way, an application executing at a computing device that has a powerful antenna, but an otherwise limited network velocity, can determine to temporarily rely exclusively on local processing. For instance, an automated assistant can temporarily cease communicating audio data to a remote server computing device, during a dialog session, in response to determining that a network metric fails to satisfy a threshold, even though there may appear to be adequate signal strength to effectively transmit the audio data.
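A decision rule of this kind might look like the sketch below. It is an illustration under assumptions, not the implementation in the application: the `OffloadPolicy` class, the use of mean round-trip latency as the network metric, and the window and threshold values are all hypothetical. Signal strength is deliberately not an input.

```python
from collections import deque
from typing import Deque


class OffloadPolicy:
    """Hypothetical policy: offload only while recent server round trips are fast."""

    def __init__(self, max_latency_ms: float, window: int = 5) -> None:
        self.max_latency_ms = max_latency_ms
        # Keep only the most recent interactions with the server.
        self.recent_ms: Deque[float] = deque(maxlen=window)

    def record_round_trip(self, latency_ms: float) -> None:
        """Store the latency observed in an interaction with the server."""
        self.recent_ms.append(latency_ms)

    def should_offload(self) -> bool:
        """Return False when the recent-interaction metric fails the threshold."""
        if not self.recent_ms:
            return True  # no history yet, so try the server first
        mean_ms = sum(self.recent_ms) / len(self.recent_ms)
        return mean_ms <= self.max_latency_ms


policy = OffloadPolicy(max_latency_ms=200.0)
policy.record_round_trip(150.0)
policy.record_round_trip(180.0)
fast_enough = policy.should_offload()        # mean 165 ms, keep offloading
policy.record_round_trip(900.0)
still_fast_enough = policy.should_offload()  # mean 410 ms, process locally
```

Because the window is bounded, a few fast round trips after a slow period bring the mean back under the threshold, which matches the temporary nature of the fallback described in the abstract.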

Conversation assistance system
11553077 · 2023-01-10

Systems and methods for providing conversation assistance include receiving conversation information from at least one user device of a user and determining that the conversation information is associated with a conversation involving the user and a first person that is associated with first conversation assistance information in a non-transitory memory. Body measurement data of the user is retrieved from the at least one user device. A need for conversation assistance in the conversation involving the user and the first person is detected using the body measurement data. First conversation assistance information associated with the first person is retrieved from the non-transitory memory. The first conversation assistance information associated with the first person is provided through the at least one user device.

VOICE-BASED CONTROL OF SEXUAL STIMULATION DEVICES
20230210716 · 2023-07-06

A system and method for voice-based control of sexual stimulation devices. In some configurations, the system and method involve receiving voice data, analyzing the voice data to detect spoken commands, and generating control signals based on the commands. In some configurations, the system and method involve receiving voice data, analyzing the voice data for non-speech vocalizations, detecting voice stress patterns, and generating control signals based on the detected patterns. In some configurations, the analyses of the voice data are performed by machine learning algorithms which may be trained on associations between speech and non-speech vocalizations of a user while the user engages in one or more voice-based training tasks, associating speech and non-speech vocalizations with controls of the sexual stimulation device. In some configurations, machine learning algorithms are used to make the associations. In some configurations, data from other biometric sensors is included in the associations.