G10L2015/086

Systems and methods for improved audio-video conferences
11756568 · 2023-09-12

Systems and methods for efficient management of audio/video conferences are disclosed. The method includes receiving an audio question from a first user of a plurality of users connected to a conference; recording the audio question and preventing immediate transmission of the audio question to the plurality of users connected to the conference; analyzing the recorded question and a recorded portion of the conference to determine that the question has been answered during the recorded portion of the conference; and, in response to determining that the audio question has previously been answered, transmitting to the first user a relevant section of the recorded portion of the conference consisting of an answer to the audio question.

Allowing spelling of arbitrary words

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice input from a user device; generating a first recognition output; receiving a user selection of one or more terms in the first recognition output; receiving a second voice input spelling a correction of the user selection; determining a corrected recognition output for the selected portion; and providing a second recognition output that merges the first recognition output and the corrected recognition output.
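The merge step in the abstract above can be illustrated with a minimal sketch. It assumes the first recognition output is a list of terms, the user selection is a half-open index range into that list, and the second voice input has already been recognized as a sequence of spoken letters. The function names `spelled_letters_to_word` and `merge_correction` are hypothetical and do not come from the patent.

```python
def spelled_letters_to_word(spelling: str) -> str:
    """Join whitespace-separated spoken letters ("K A I") into one word ("kai")."""
    return "".join(token.lower() for token in spelling.split())


def merge_correction(first_output: list[str], start: int, end: int, spelling: str) -> list[str]:
    """Return a second recognition output in which the selected terms
    first_output[start:end] are replaced by the spelled correction.

    Assumption: the selection is a contiguous, non-empty range of terms.
    """
    if not 0 <= start < end <= len(first_output):
        raise ValueError("selection is outside the first recognition output")
    corrected = spelled_letters_to_word(spelling)
    # Keep everything outside the selection; swap in the corrected term.
    return first_output[:start] + [corrected] + first_output[end:]


first = ["call", "kite", "tomorrow"]          # first recognition output
second = merge_correction(first, 1, 2, "K A I")  # user selected "kite", then spelled K-A-I
print(" ".join(second))
```

A real system would likely also restore capitalization and handle spelling commands such as "capital K" or "space"; those are left out here to keep the merge itself visible.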

Explaining anomalous phonetic translations

A method includes: receiving, by a computing device, a digital voice stream; receiving, by the computing device, converted text that represents the digital voice stream; identifying, by the computing device, an erroneously converted portion of the converted text; selecting, by the computing device, the erroneously converted portion for explainability processing; parsing, by the computing device, the erroneously converted portion into parts based on a predetermined parsing level; collecting, by the computing device, supplementary input data related to the erroneously converted portion; and determining, by the computing device and based on the supplementary input data, a reason why the erroneously converted portion was erroneously converted.

Systems and methods to identify products from verbal utterances

Some embodiments provide retail product ordering systems comprising: a user computing device comprising an application executed by a device control circuit to: receive an audible utterance; and control a product identifier application interface to: apply a tokenizer model and obtain a set of individual search words; apply a series of featurizer models to the search words to generate features; and apply a classifier and extractor model based on the features and generate multiple requested product entities, each comprising a respective sub-set of the position labeled product terms; wherein the device control circuit is further configured to access a purchase history database, confirm an accuracy of each of the requested product entities relative to a purchase history, generate a listing of determined product identifiers corresponding to the confirmed set of the multiple requested product entities, and control a display system of the user computing device to render the listing of determined product identifiers.

Systems and methods for conversing with a user

A system comprising: an input configured to receive input speech data originating from a user; an output configured to output speech or text information; and a processor configured to: provide first input data to a character sequence determination module to determine a character sequence from the first input data, wherein determining a character sequence comprises: obtaining a first list of one or more candidate character sequences from the first input data; selecting a first candidate character sequence from the first list; generating a first confirm request to confirm the selected first candidate character sequence, wherein the first confirm request is outputted by way of the output; if second input data indicating that the first candidate character sequence is not confirmed is received, selecting a second candidate character sequence and generating a second confirm request to confirm the selected second candidate if the second candidate character sequence is different from the first candidate character sequence, wherein the second confirm request is outputted by way of the output; and if second input data indicating that the first candidate character sequence is confirmed is received, the one or more processors are further configured to: provide third input data to a dialogue module, wherein the dialogue module is configured to: determine, based on the third input data, a dialogue act that specifies speech or text information; and output, by way of the output, the speech or text information specified by the determined dialogue act.
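The confirm-and-retry logic for candidate character sequences described above can be sketched as a short loop. This is only an illustration under stated assumptions: the candidate list is already ranked, the user's yes/no reply is modelled as a callback, and a candidate is never re-asked once rejected (the abstract only requires that the second candidate differ from the first). The names `confirm_character_sequence` and `ask` are hypothetical.

```python
from typing import Callable, Iterable, Optional


def confirm_character_sequence(
    candidates: Iterable[str],
    ask: Callable[[str], bool],
) -> Optional[str]:
    """Offer candidates in order and return the first one the user confirms.

    `ask(candidate)` stands in for issuing a confirm request and reading
    the user's reply; it returns True when the candidate is confirmed.
    Returns None if every distinct candidate is rejected.
    """
    rejected = set()
    for candidate in candidates:
        if candidate in rejected:
            continue  # do not issue a confirm request for a sequence already rejected
        if ask(candidate):
            return candidate
        rejected.add(candidate)
    return None


# Simulated dialogue: the user is trying to say "A812".
asked = []

def simulated_user(candidate: str) -> bool:
    asked.append(candidate)
    return candidate == "A812"

result = confirm_character_sequence(["AB12", "AB12", "A812"], simulated_user)
print(result, asked)
```

Once a sequence is confirmed, the system in the abstract hands later input to a dialogue module; that stage is outside this sketch.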

Systems and methods for conversing with a user

A system comprising: an input configured to receive input speech data originating from a user; an output configured to output speech or text information; and a processor configured to: provide first input data to a character sequence determination module to determine a character sequence from the first input data, wherein determining a character sequence comprises: obtaining a first list of one or more candidate character sequences from the first input data; selecting a first candidate character sequence from the first list; generating a first confirm request to confirm the selected first candidate character sequence, wherein the first confirm request is outputted by way of the output; if second input data indicating that the first candidate character sequence is not confirmed is received, selecting a second candidate character sequence and generating a second confirm request to confirm the selected second candidate if the second candidate character sequence is different from the first candidate character sequence, wherein the second confirm request is outputted by way of the output; and if second input data indicating that the first candidate character sequence is confirmed is received, the one or more processors are further configured to: provide third input data to a dialogue module, wherein the dialogue module is configured to: determine, based on the third input data, a dialogue act that specifies speech or text information; and output, by way of the output, the speech or text information specified by the determined dialogue act.

Information processing device and information processing method

An information processing device includes: a first reception unit configured to receive an input of one or more characters; a second reception unit configured to receive an input of voice; and a voice recognition unit configured to recognize the voice, and output a voice recognition result beginning with the one or more characters entered into the first reception unit when the second reception unit receives the input of voice with the input of the one or more characters received by the first reception unit.
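One way to picture a recognition result "beginning with the one or more characters" is to filter an n-best list of hypotheses by the typed prefix. The abstract does not say how the constraint is applied, so the following is a sketch under that assumption; `recognize_with_prefix` and the scored hypothesis format are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def recognize_with_prefix(
    hypotheses: Sequence[Tuple[str, float]],
    typed_prefix: str,
) -> Optional[str]:
    """Return the highest-scoring hypothesis that starts with the typed characters.

    `hypotheses` is an assumed n-best list of (text, score) pairs produced
    by a speech recognizer. Matching is case-insensitive. Returns None when
    no hypothesis begins with the prefix.
    """
    prefix = typed_prefix.lower()
    matching = [(text, score) for text, score in hypotheses
                if text.lower().startswith(prefix)]
    if not matching:
        return None
    return max(matching, key=lambda pair: pair[1])[0]


n_best = [("Stanford", 0.6), ("Stamford", 0.3), ("Sanford", 0.1)]
print(recognize_with_prefix(n_best, "Stam"))
```

With an empty prefix the function simply returns the top-scoring hypothesis, which corresponds to ordinary recognition without typed input.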

Editing of word blocks generated by morphological analysis on a character string obtained by speech recognition
11238867 · 2022-02-01

An apparatus displays, on a terminal that enables a touch operation, an edit screen on which a text including word blocks is edited, where the word blocks are generated by performing morphological analysis on a character string obtained by speech recognition. Upon reception of a scroll instruction to scroll the text, the apparatus shifts each of the word blocks displayed on the edit screen in a description direction of the text, based on the scroll instruction.

Allowing spelling of arbitrary words

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice input from a user device; generating a first recognition output; receiving a user selection of one or more terms in the first recognition output; receiving a second voice input spelling a correction of the user selection; determining a corrected recognition output for the selected portion; and providing a second recognition output that merges the first recognition output and the corrected recognition output.

Allowing spelling of arbitrary words

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice input from a user device; generating a first recognition output; receiving a user selection of one or more terms in the first recognition output; receiving a second voice input spelling a correction of the user selection; determining a corrected recognition output for the selected portion; and providing a second recognition output that merges the first recognition output and the corrected recognition output.