G06F40/49

Constructing imaginary discourse trees to improve answering convergent questions
11782985 · 2023-10-10

Systems and methods for improving question-answering recall for complex, multi-sentence, convergent questions. More specifically, an autonomous agent accesses an initial answer that partly answers a question received from a user device. The agent represents the question and the initial answer as discourse trees. From the discourse trees, the agent identifies entities in the question that are not addressed by the answer. The agent forms an additional discourse tree from an additional resource such as a corpus of text. The additional discourse tree rhetorically connects a non-addressed entity with the answer. The agent designates this discourse tree as an imaginary discourse tree. When combined with the initial answer discourse tree, the imaginary discourse tree is used to generate an improved answer relative to existing solutions.
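The entity-gap step described above can be illustrated with a minimal sketch. It assumes entities have already been extracted from the question and the answer, and it replaces real discourse parsing with a crude stand-in: a corpus sentence counts as a "bridge" if it mentions both a non-addressed question entity and some answer entity. The function names `unaddressed_entities` and `find_bridging_sentences` are hypothetical and do not come from the patent.

```python
def unaddressed_entities(question_entities, answer_entities):
    """Return question entities that the answer never mentions (case-insensitive)."""
    answer_set = {e.lower() for e in answer_entities}
    return [e for e in question_entities if e.lower() not in answer_set]


def find_bridging_sentences(entity, answer_entities, corpus_sentences):
    """Assumption: a sentence 'bridges' if it contains the missing entity
    and at least one answer entity. A real system would build a discourse
    tree over the sentence and check for a rhetorical relation instead."""
    entity_l = entity.lower()
    answer_l = [a.lower() for a in answer_entities]
    bridges = []
    for sentence in corpus_sentences:
        sentence_l = sentence.lower()
        if entity_l in sentence_l and any(a in sentence_l for a in answer_l):
            bridges.append(sentence)
    return bridges
```

In this simplification, each bridging sentence plays the role that the imaginary discourse tree plays in the abstract: it supplies the missing link between the question and the initial answer.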

AGGREGATED CONTENT EDITING SERVICES (ACES), AND RELATED SYSTEMS, METHODS, AND APPARATUS

A method for distributed pod-editing may be performed by an enhanced pod editor, and may include the following steps: receiving a framework for a pod, wherein the framework identifies one or more content items already assigned to one or more slots in the pod by one or more pod editors; determining attributes of the content items already assigned to the pod in a native taxonomy of the enhanced pod editor; determining restrictions on the pod's slots based on the attributes of the content items already assigned to the pod and on the pod's editorial constraints; rejecting content items already assigned to the pod that violate the restrictions on the pod's slots (if any); identifying candidate content items that comply with the restrictions on the pod's unfilled slots (if any); and selecting candidate content items and assigning the selected content items to the pod's unfilled slots.
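The steps listed in this abstract can be sketched as a two-pass procedure. The sketch below is illustrative only: it assumes a single hypothetical editorial constraint (at most `max_per_category` items per category in a pod), and the names `ContentItem` and `edit_pod` are invented for the example rather than taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    item_id: str
    category: str  # attribute expressed in the editor's own ("native") taxonomy


def edit_pod(slots, candidates, max_per_category=1):
    """slots: list of ContentItem, with None marking an unfilled slot.
    Returns a new slot list after rejecting violators and filling gaps."""
    counts = {}
    result = []
    # Pass 1: keep assigned items that satisfy the constraint, reject the rest.
    for item in slots:
        if item is None or counts.get(item.category, 0) >= max_per_category:
            result.append(None)  # unfilled, or rejected and reopened
        else:
            counts[item.category] = counts.get(item.category, 0) + 1
            result.append(item)
    # Pass 2: fill each open slot with the first compliant candidate.
    used_ids = {item.item_id for item in result if item is not None}
    for index, item in enumerate(result):
        if item is not None:
            continue
        for candidate in candidates:
            if candidate.item_id in used_ids:
                continue
            if counts.get(candidate.category, 0) >= max_per_category:
                continue
            result[index] = candidate
            used_ids.add(candidate.item_id)
            counts[candidate.category] = counts.get(candidate.category, 0) + 1
            break
    return result
```

A slot stays `None` when no candidate complies, which mirrors the "(if any)" qualifiers in the abstract.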

Processing data utilizing a corpus

A device may receive data stored in one or more data sources associated with an organization based on utilizing one or more data discovery-related application programming interfaces (APIs) to access the data. The device may process, utilizing one or more data feature models via the one or more data discovery-related APIs, the data received from the one or more data sources to identify types of data included in the data based on a contextualization of the data. The one or more data feature models may identify a respective set of attributes expected to be included in the types of data. The device may perform multiple analyses of the data after identifying the types of data. The device may determine, based on a result of the multiple analyses, a score for the data. The device may perform one or more actions based on a respective result of the multiple analyses.

PROVIDING MULTISTREAM MACHINE TRANSLATION DURING VIRTUAL CONFERENCES

An example method includes hosting, by a conference provider, a virtual conference between a plurality of client devices exchanging audio streams; translating, by a translation process, a first transcription of a first audio stream in a first language to create a first translation; translating, by the translation process, a second transcription of a second audio stream in a second language different than the first language to create a second translation; and providing, during the virtual conference, the first translation and the second translation to a first client device and a second client device of the plurality of client devices.

Method and system for evaluating and improving live translation captioning systems

Methods, systems, and apparatus, including computer programs encoded on computer storage media for evaluating and improving live translation captioning systems. An exemplary method includes: displaying a word in a first language; receiving a first audio sequence, the first audio sequence comprising a verbal description of the word; generating a first translated text in a second language; displaying the first translated text; receiving a second audio sequence, the second audio sequence comprising a guessed word based on the first translated text; generating a second translated text in the first language; determining a matching score between the word and the second translated text; and determining a performance score of the live translation captioning system based on the matching score.
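The abstract does not say how the matching score or the performance score is computed. As one possible reading, the sketch below uses a character-level similarity ratio from Python's standard `difflib` module as the matching score and averages it over rounds to obtain the performance score. Both choices, and the names `matching_score` and `performance_score`, are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def matching_score(word, back_translated):
    """Similarity in [0, 1] between the displayed word and the guess after
    translation back into the first language (assumed metric: difflib ratio)."""
    a = word.strip().lower()
    b = back_translated.strip().lower()
    return SequenceMatcher(None, a, b).ratio()


def performance_score(rounds):
    """rounds: iterable of (word, back_translated) pairs.
    Assumption: the system's score is the mean matching score over rounds."""
    scores = [matching_score(word, guess) for word, guess in rounds]
    return sum(scores) / len(scores) if scores else 0.0
```

An exact-match or embedding-based score would fit the same interface; the ratio is used here only because it needs no external dependencies.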

DATA AUGMENTATION AND BATCH BALANCING METHODS TO ENHANCE NEGATION AND FAIRNESS

Techniques for augmentation and batch balancing of training data to enhance negation and fairness of a machine learning model. In one particular aspect, a method is provided that includes generating a list of demographic words associated with a demographic group, searching an unlabeled corpus of text to identify unlabeled examples in a target domain comprising at least one demographic word from the list of demographic words, rewriting the unlabeled examples to create one or more versions of each of the unlabeled examples and generate a fairness invariance data set, and training the machine learning model using unlabeled examples from the fairness invariance data set.
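
The search-and-rewrite steps can be sketched as follows. This is a simplified illustration under stated assumptions: rewriting is done by whole-word substitution of one demographic word for another, and the function names are hypothetical. The abstract does not specify the rewriting mechanism, and the training step is left out.

```python
import re


def find_examples(corpus, demographic_words):
    """Return corpus texts containing at least one demographic word (whole-word match)."""
    if not demographic_words:
        return []
    alternatives = "|".join(re.escape(w) for w in demographic_words)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return [text for text in corpus if pattern.search(text)]


def rewrite_versions(text, demographic_words):
    """Create versions of text by swapping each demographic word that is
    present for every other word in the list (assumed rewriting rule)."""
    versions = []
    for present in demographic_words:
        pattern = re.compile(r"\b" + re.escape(present) + r"\b", re.IGNORECASE)
        if not pattern.search(text):
            continue
        for other in demographic_words:
            if other.lower() != present.lower():
                versions.append(pattern.sub(other, text))
    return versions


def build_fairness_invariance_set(corpus, demographic_words):
    """Map each matching unlabeled example to its rewritten versions."""
    return {
        text: rewrite_versions(text, demographic_words)
        for text in find_examples(corpus, demographic_words)
    }
```

A model trained for invariance would be expected to give the same prediction for an example and each of its rewritten versions.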