G06F40/44

Narrative evaluator

A system includes a narrative repository which stores a plurality of narratives and, for each narrative, a corresponding outcome. A narrative evaluator receives the plurality of narratives and the outcome for each narrative. For each received narrative, a subset of the narrative to retain is determined based on rules. For each determined subset, an entropy matrix is determined which includes, for each word in the subset, a measure associated with whether the word is expected to appear in a sentence with another word in the subset. For each entropy matrix, a distance matrix is determined which includes, for each word in the subset, a numerical representation of a difference in meaning between the word and another word. Using one or more of the distance matrices, a first threshold distance is determined for a first word of the subset. The first word and the first threshold distance are stored as a first word-threshold pair associated with a first outcome.
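The distance-matrix and threshold steps can be sketched as follows. The cosine-distance measure, the toy 2-d embeddings, and the median-based threshold rule are all illustrative assumptions, not the specific measures recited in the abstract:

```python
import numpy as np

def distance_matrix(embeddings):
    """Pairwise cosine distances between word embeddings (hypothetical
    stand-in for the difference-in-meaning representation)."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T

def threshold_for_word(dist, idx, quantile=0.5):
    """A threshold distance for one word: here, the median distance to the
    other words in the subset (an illustrative rule, not the claimed one)."""
    others = np.delete(dist[idx], idx)
    return float(np.quantile(others, quantile))

# Toy retained subset of three words with 2-d embeddings.
words = ["merger", "acquisition", "lawsuit"]
emb = np.array([[1.0, 0.1], [0.9, 0.2], [-0.2, 1.0]])
dist = distance_matrix(emb)
pair = (words[0], threshold_for_word(dist, 0))  # first word-threshold pair
```

The resulting `pair` would then be stored keyed by the narrative's outcome, as the abstract describes.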

ABSTRACT LEARNING METHOD, ABSTRACT LEARNING APPARATUS AND PROGRAM

The efficiency of summary learning that requires an additional input parameter is improved by causing a computer to execute a first learning step and a second learning step. The first learning step learns a first model for calculating an importance value of each component in source text, using a first training data group and a second training data group: the first training data group includes source text, a query related to a summary of the source text, and summary data related to the query in the source text; the second training data group includes source text and summary data generated based on the source text. The second learning step learns a second model for generating summary data from the source text of training data, using each piece of training data in the second training data group together with a plurality of components extracted for each piece of training data, based on importance values calculated by the first model for the components of that piece's source text.
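A minimal sketch of the two-step pipeline, with trivial stand-ins for the learned models (query-token overlap as the importance model, top-k selection as the component extraction) — not the patented models themselves:

```python
def importance_model(sentence, query):
    """Stand-in for the first model: simple query-token overlap scoring."""
    q = set(query.lower().split())
    s = set(sentence.lower().split())
    return len(q & s) / max(len(q), 1)

def extract_components(source_sentences, query, k=2):
    """Keep the k components the first model deems most important."""
    ranked = sorted(source_sentences,
                    key=lambda s: importance_model(s, query), reverse=True)
    return ranked[:k]

source = [
    "The model compresses long documents.",
    "Weather was mild this week.",
    "Compression uses learned importance scores.",
]
kept = extract_components(source, query="model compression importance", k=2)
# The second model would then be trained on `kept` plus the reference summary.
```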

Dataset Refining with Machine Translation Quality Prediction

Aspects of the technology employ a machine translation quality prediction (MTQP) model to refine datasets used in training machine translation systems. This includes receiving, by the MTQP model, a sentence pair of a source sentence and a translated output (802); performing feature extraction on the sentence pair using a set of two or more feature extractors, where each feature extractor generates a corresponding feature vector (804); concatenating the corresponding feature vectors from the set of feature extractors (806); and applying the concatenated feature vectors to a feedforward neural network, which generates a machine translation quality prediction score for the translated output (808).
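The extract-concatenate-score flow can be sketched as below. The two feature extractors (length statistics and character overlap), the tiny random-weight network, and all parameter shapes are illustrative assumptions, not the extractors or architecture of the actual MTQP model:

```python
import numpy as np

rng = np.random.default_rng(0)

def length_ratio_features(src, tgt):
    """One hypothetical feature extractor: token-count statistics."""
    ns, nt = len(src.split()), len(tgt.split())
    return np.array([ns, nt, ns / max(nt, 1)], dtype=float)

def char_overlap_features(src, tgt):
    """A second hypothetical extractor: character-set Jaccard overlap."""
    a, b = set(src.lower()), set(tgt.lower())
    return np.array([len(a & b) / max(len(a | b), 1)], dtype=float)

def mtqp_score(src, tgt, w1, b1, w2, b2):
    """Concatenate the feature vectors and run a small feedforward network."""
    feats = np.concatenate([length_ratio_features(src, tgt),
                            char_overlap_features(src, tgt)])
    hidden = np.maximum(w1 @ feats + b1, 0.0)                # ReLU layer
    return float(1.0 / (1.0 + np.exp(-(w2 @ hidden + b2))))  # score in (0, 1)

w1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
w2, b2 = rng.normal(size=8), 0.0
score = mtqp_score("le chat dort", "the cat sleeps", w1, b1, w2, b2)
```

In the described system, sentence pairs whose score falls below some threshold would be filtered out of the training dataset.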

Full Attention with Sparse Computation Cost

The present disclosure is directed to machine learning model architectures which provide full attention capability in each attention head while maintaining low computation and memory complexity. Specifically, according to one aspect of the present disclosure, example attention models provided herein can treat the self-attention mechanism as a conditional expectation over embeddings at each location and approximate the conditional distribution with a structured factorization. Each location can attend to all other locations, either via direct attention, or through indirect attention to group representations, which are again conditional expectations of embeddings from corresponding local regions.
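A rough sketch of the direct-plus-indirect attention pattern: each query attends directly within its own block and indirectly to mean-pooled group representations of the other blocks. The block factorization, mean pooling, and single-head layout are simplifying assumptions for illustration, not the exact structured factorization of the disclosure:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def factored_attention(q, k, v, block=4):
    """Each location attends to its own block directly and to the other
    blocks through their group representations (here, block means)."""
    n, d = k.shape
    out = np.zeros_like(v)
    blocks = [slice(i, i + block) for i in range(0, n, block)]
    # Group representations: a conditional expectation (mean) per block.
    k_groups = np.stack([k[s].mean(axis=0) for s in blocks])
    v_groups = np.stack([v[s].mean(axis=0) for s in blocks])
    for bi, s in enumerate(blocks):
        others = [g for g in range(len(blocks)) if g != bi]
        keys = np.concatenate([k[s], k_groups[others]])
        vals = np.concatenate([v[s], v_groups[others]])
        attn = softmax(q[s] @ keys.T / np.sqrt(d))
        out[s] = attn @ vals
    return out

rng = np.random.default_rng(1)
q, k, v = (rng.normal(size=(8, 4)) for _ in range(3))
y = factored_attention(q, k, v, block=4)
```

With block size around the square root of the sequence length, each query touches far fewer keys than full attention while still receiving signal from every location.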

Inverted Projection for Robust Speech Translation
20230021824 · 2023-01-26

The technology provides an approach to train translation models that are robust to transcription errors and punctuation errors. The approach includes introducing errors from actual automatic speech recognition and automatic punctuation systems into the source side of the machine translation training data. A method for training a machine translation model includes performing automatic speech recognition on input source audio to generate a system transcript. The method aligns a human transcript of the source audio to the system transcript, including projecting system segmentation onto the human transcript. Then the method performs segment robustness training of a machine translation model according to the aligned human and system transcripts, and performs system robustness training of the machine translation model, e.g., by injecting token errors into training data.
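The token-error injection used for system robustness training can be sketched as below. The drop/swap noise model, its probabilities, and the seed are illustrative assumptions standing in for errors sampled from real ASR and punctuation systems:

```python
import random

def inject_token_errors(tokens, p_drop=0.1, p_swap=0.3, seed=0):
    """Randomly drop tokens or swap adjacent tokens to mimic transcription
    errors on the source side of the translation training data."""
    rng = random.Random(seed)
    out = [t for t in tokens if rng.random() >= p_drop]
    i = 0
    while i < len(out) - 1:
        if rng.random() < p_swap:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    return out

clean = "the quick brown fox jumps over the lazy dog".split()
noisy = inject_token_errors(clean)
```

The noisy source paired with the clean target translation then becomes a robustness-training example.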

Method and apparatus for training models in machine translation, electronic device and storage medium

A method and apparatus for training models in machine translation, an electronic device and a storage medium are disclosed, which relates to the field of natural language processing technologies and the field of deep learning technologies. An implementation includes mining similar target sentences of a group of samples based on a parallel corpus using a machine translation model and a semantic similarity model, and creating a first training sample set; training the machine translation model with the first training sample set; mining a negative sample of each sample in the group of samples based on the parallel corpus using the machine translation model and the semantic similarity model, and creating a second training sample set; and training the semantic similarity model with the second training sample set.
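The mining steps can be sketched as follows; token Jaccard overlap stands in for the semantic similarity model, and the ranking rule for positives and hard negatives is an illustrative assumption, not the patented procedure:

```python
def token_overlap(a, b):
    """Stand-in similarity model: token Jaccard overlap."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / max(len(sa | sb), 1)

def mine_targets(source, corpus_targets, similarity):
    """Rank candidate target sentences by the similarity model; the top hit
    is a mined similar target, the runner-up a hard negative sample."""
    ranked = sorted(corpus_targets,
                    key=lambda t: similarity(source, t), reverse=True)
    return ranked[0], ranked[1]

targets = ["the cat sleeps", "the cat eats", "stocks fell sharply"]
positive, hard_negative = mine_targets("the cat sleeps soundly",
                                       targets, token_overlap)
```

The mined pairs then feed the alternating training loop: positives refine the translation model, hard negatives refine the similarity model.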

Learned evaluation model for grading quality of natural language generation outputs
11704506 · 2023-07-18

Systems and methods for automatic evaluation of the quality of natural language generation (NLG) outputs. In some aspects of the technology, a learned evaluation model may be pretrained first using NLG model pretraining tasks, and then with further pretraining tasks using automatically generated synthetic sentence pairs. In some cases, following pretraining, the evaluation model may be further fine-tuned using a set of human-graded sentence pairs, so that it learns to approximate the grades allocated by the human evaluators. In some cases, following fine-tuning, the learned evaluation model may be distilled into a student model.
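The final distillation step can be sketched as fitting a student to reproduce the teacher's grades. The linear student, squared-error objective, and SGD schedule are illustrative assumptions; the abstract does not specify the student architecture or loss:

```python
def distill(features, teacher_scores, lr=0.1, steps=200):
    """Fit weights w so that w . x approximates the teacher evaluation
    model's grade for each sentence pair (plain SGD on squared error)."""
    w = [0.0] * len(features[0])
    for _ in range(steps):
        for x, y in zip(features, teacher_scores):
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

# Hypothetical 2-d features for three sentence pairs, with the grades the
# teacher evaluation model assigned to them.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
teacher = [0.9, 0.2, 1.1]
w = distill(feats, teacher)
```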

Generating questions using a resource-efficient neural network

Technology is described herein for generating questions using a neural network. The technology generates the questions in a three-step process. In the first step, the technology selects, using a first neural network, a subset of textual passages from an identified electronic document. In the second step, the technology generates, using a second neural network, one or more candidate answers for each textual passage selected by the first neural network, to produce a plurality of candidate passage-answer pairs. In the third step, the technology selects, using a third neural network, a subset of the plurality of candidate passage-answer pairs. The technology then generates an output result that includes one or more output questions chosen from the candidate passage-answer pairs selected by the third neural network. The use of the first neural network reduces the processing burden placed on the second and third neural networks. It also reduces latency.
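The three-step funnel can be sketched with cheap heuristic stand-ins for the three neural networks (length-based passage selection, capitalized-token answer candidates, shortest-answer filtering — all illustrative assumptions):

```python
def select_passages(passages, k=2):
    """Step 1 stand-in: keep the k longest passages."""
    return sorted(passages, key=len, reverse=True)[:k]

def candidate_answers(passage):
    """Step 2 stand-in: capitalized tokens as candidate answers."""
    return [w.strip(".,") for w in passage.split() if w[0].isupper()]

def filter_pairs(pairs, k=2):
    """Step 3 stand-in: keep the pairs with the shortest answers."""
    return sorted(pairs, key=lambda pa: len(pa[1]))[:k]

doc = ["Ada Lovelace wrote the first program.",
       "It ran nowhere.",
       "Charles Babbage designed the Analytical Engine."]
pairs = [(p, a) for p in select_passages(doc) for a in candidate_answers(p)]
kept = filter_pairs(pairs)
# A question generator would then turn each kept pair into an output question.
```

Because step 1 discards most passages up front, steps 2 and 3 run on a much smaller candidate set, which is the source of the claimed compute and latency savings.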

Generating neural network outputs using insertion operations

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating network outputs using insertion operations.
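The core idea — building an output by inserting tokens at chosen slots rather than appending left-to-right — can be illustrated minimally. The fixed operation list stands in for what would be a learned insertion policy:

```python
def apply_insertions(insertions):
    """Each operation is (slot_index, token); a slot is a gap between the
    tokens generated so far, so the sequence grows by insertion."""
    seq = []
    for slot, token in insertions:
        seq.insert(slot, token)
    return seq

# Build "the cat sat" out of order: "cat" first, then "the" before it,
# then "sat" at the end.
ops = [(0, "cat"), (0, "the"), (2, "sat")]
result = apply_insertions(ops)
```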