IPIQ

G06F16/3347

Unsupervised aspect-based multi-document abstractive summarization

11494564 · 2022-11-08 ·

Naver Corporation

A multi-document summarization system includes: an encoding module configured to receive multiple documents associated with a subject and to, using a first model, generate vector representations for sentences, respectively, of the documents; a grouping module configured to group first and second ones of the sentences associated with first and second aspects into first and second groups, respectively; a group representation module configured to generate a first vector representation based on the first ones of the sentences and a second vector representation based on the second ones of the sentences; a summary module configured to: using a second model: generate a first sentence regarding the first aspect based on the first vector representation; and generate a second sentence regarding the second aspect based on the second vector representation; and store a summary including the first and second sentences in memory in association with the subject.
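
As a rough illustration of the grouping and group-representation steps described in this abstract, the sketch below groups sentence vectors by aspect and pools each group into a single vector. The aspect labels, the toy two-dimensional vectors, the function names, and the choice of mean pooling are all assumptions made for illustration; the abstract does not say how sentences are assigned to aspects or how a group vector is computed.

```python
from collections import defaultdict


def group_by_aspect(sentences):
    """Collect sentence vectors under their (hypothetical) aspect label."""
    groups = defaultdict(list)
    for aspect, vector in sentences:
        groups[aspect].append(vector)
    return dict(groups)


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors (assumed pooling choice)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def group_representations(sentences):
    """One pooled vector per aspect group."""
    return {
        aspect: mean_vector(vectors)
        for aspect, vectors in group_by_aspect(sentences).items()
    }


# Toy input: (aspect label, sentence vector) pairs; values are made up.
sentences = [
    ("battery", [1.0, 0.0]),
    ("battery", [3.0, 2.0]),
    ("screen", [0.0, 4.0]),
]
reps = group_representations(sentences)
```

In the system described, each group vector would then be passed to a second model that generates one summary sentence per aspect; that generation step is not sketched here.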

DATABASE GENERATION FROM NATURAL LANGUAGE TEXT DOCUMENTS

20230037077 · 2023-02-02 ·

Some embodiments may perform operations of a process that includes obtaining a natural language text document and using a machine learning model to generate a set of attributes based on a set of machine-learning-model-generated classifications in the document. The process may include performing hierarchical data extraction operations to populate the attributes, where different machine learning models may be used in sequence. The process may include using a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model augmented with a pooling operation to determine a BERT output via a multi-channel transformer model to generate vectors on a per-sentence level or other per-text-section level. The process may include using a finer-grain model to extract quantitative or categorical values of interest, where the context of the per-sentence level may be retained for the finer-grain model.

TRAINING DATA COLLECTION SYSTEM, SIMILARITY SCORE CALCULATION SYSTEM, SIMILAR DOCUMENT RETRIEVAL SYSTEM, AND NON-TRANSITORY COMPUTER READABLE RECORDING MEDIUM STORING TRAINING DATA COLLECTION PROGRAM

20230034027 · 2023-02-02 ·

A vector generation unit derives a feature vector of a reference document and a feature vector of a population document. A feature quantity extraction unit performs a dimensionality reduction process to reduce dimensionality of the above feature vectors and sets a dimensional value obtained by the dimensionality reduction process as a first feature quantity, and derives a cosine similarity between the feature vector of the reference document and the feature vector of the population document as a second feature quantity. A retrieval range control unit extracts a specific number of population documents, starting from the population document with the shortest distance to the reference document in a feature quantity space of the first feature quantity, so as to limit a retrieval range. A training data extraction unit extracts, as training data, a specific number of documents from the extracted documents, starting from the document with the lowest cosine similarity.
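
The two-stage selection in this abstract (limit the retrieval range by distance in the reduced feature space, then pick the documents with the lowest cosine similarity) can be sketched as follows. This is a minimal sketch under stated assumptions: truncating a vector to its first k components stands in for the unspecified dimensionality reduction process, Euclidean distance is assumed for the reduced space, and all names and sample values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def reduce_dims(vector, k):
    """Assumed stand-in for dimensionality reduction: keep the first k values."""
    return vector[:k]


def select_training_docs(reference, population, k, n_near, n_train):
    """Return ids of n_train documents chosen in two stages.

    Stage 1: keep the n_near population documents closest to the reference
             in the reduced (first feature quantity) space.
    Stage 2: among those, keep the n_train documents with the lowest
             cosine similarity (second feature quantity) to the reference.
    """
    ref_low = reduce_dims(reference, k)
    by_distance = sorted(
        population,
        key=lambda item: math.dist(ref_low, reduce_dims(item[1], k)),
    )
    candidates = by_distance[:n_near]
    by_cosine = sorted(candidates, key=lambda item: cosine(reference, item[1]))
    return [doc_id for doc_id, _ in by_cosine[:n_train]]


# Toy data: (document id, feature vector); values are made up.
reference = [1.0, 0.0, 0.0]
population = [
    ("a", [1.0, 0.0, 0.0]),
    ("b", [1.0, 0.1, 5.0]),
    ("c", [9.0, 9.0, 0.0]),
    ("d", [1.1, 0.0, 1.0]),
]
selected = select_training_docs(reference, population, k=2, n_near=3, n_train=1)
```

With this toy data, document "c" falls outside the retrieval range, and "b" is selected because it is close in the reduced space yet has the lowest cosine similarity to the reference in the full space.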

UTTERANCE INTENT DETECTION

20230030870 · 2023-02-02 ·

Certain aspects of the present disclosure provide techniques for detecting sentences that are utterances by an agent indicating an intent to poach a customer. According to certain embodiments, a language model is trained using query sentences that are confirmed to be sentences used in poaching a customer, to identify semantically similar sentences in a corpus. These semantically similar sentences are then used as base sentences for comparison to sentences in a transcript. Sentences of the transcript that are found to be semantically similar to one or more base sentences are provided to a user device for review and confirmation that the similar sentence was generated by an agent in an attempt to poach a customer.

WORD PROCESSING SYSTEM AND WORD PROCESSING METHOD

20220350964 · 2022-11-03 ·

Hitachi, Ltd.

Provided is a word processing system which includes: a first generation unit which generates, based on sentence information including a plurality of sentences, hierarchy data indicating syntax tree data for each hierarchy with regard to each sentence; a second generation unit which acquires, from a plurality of hierarchy data generated by the first generation unit, hierarchy data of a second sentence similar to hierarchy data of a first sentence generated by the first generation unit, extracts a difference between the hierarchy data of the first sentence and the hierarchy data of the second sentence, and generates, as paraphrasing rule data, first expression data as a difference in the first sentence and second expression data as a difference in the second sentence; and a storage unit which stores the paraphrasing rule data generated by the second generation unit in a storage unit.

MACHINE READING COMPREHENSION APPARATUS AND METHOD

20230088411 · 2023-03-23 ·

A machine reading comprehension apparatus and method are provided. The apparatus receives a question and a text. The apparatus generates a plurality of first predicted answers and a plurality of first source sentences corresponding to each of the first predicted answers according to the question, the text, and the machine reading comprehension model. The apparatus determines a question category of the question. The apparatus extracts a plurality of special terms related to the question category and a plurality of second source sentences corresponding to each of the special terms from the text. The apparatus concatenates the question, the first source sentences, the second source sentences, the first predicted answers, and the special terms into an extended string. The apparatus generates a plurality of second predicted answers corresponding to the question according to the extended string and the micro finder model.

TECHNICAL SPECIFICATION MATCHING

20220343159 · 2022-10-27 ·

Systems and methods are provided for detail matching. The method includes training a feature classifier to identify technical features, and training a neural network model for a trained importance calculator to calculate an importance value for each identified technical feature. The method further includes receiving a specification sheet including a plurality of technical features, and receiving a plurality of descriptive sheets each including a plurality of technical features. The method further includes identifying the technical features in the specification sheet and the plurality of descriptive sheets using the trained feature classifier, and calculating an importance for each identified technical feature using the trained feature importance calculator. The method further includes calculating a matching score between the identified technical features of the specification sheet and the identified technical features of the plurality of descriptive sheets based on the importance of each identified technical feature.
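
One way to picture an importance-based matching score is sketched below. The abstract does not give a formula, so the one used here (importance of the specification features found in a descriptive sheet, divided by the total importance of the specification features) is an assumption, as are the feature names and importance values. The trained feature classifier and importance calculator are replaced by hand-written inputs.

```python
def matching_score(spec_features, sheet_features, importance):
    """Importance-weighted share of specification features found in a sheet.

    This formula is an illustrative assumption, not the patented method.
    """
    total = sum(importance[f] for f in spec_features)
    if total == 0:
        return 0.0
    matched = sum(importance[f] for f in spec_features if f in sheet_features)
    return matched / total


# Hypothetical importance values that a trained calculator might output.
importance = {"voltage": 3.0, "weight": 1.0, "color": 0.5}

# Hypothetical features identified in one specification sheet
# and in two descriptive sheets.
spec = {"voltage", "weight"}
sheets = {
    "sheet_a": {"voltage", "color"},
    "sheet_b": {"weight"},
}

scores = {name: matching_score(spec, feats, importance) for name, feats in sheets.items()}
```

Here "sheet_a" scores higher than "sheet_b" because it matches the feature with the larger importance value, even though each sheet matches exactly one specification feature.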

Method and apparatus for evaluating matching degree based on artificial intelligence, device and storage medium

11481419 · 2022-10-25 ·

Beijing Baidu Netcom Science And Technology Co., Ltd.

The present disclosure provides a method and apparatus for evaluating a matching degree based on artificial intelligence, a device and a storage medium, wherein the method comprises: respectively obtaining word expressions of words in a query and word expressions of words in a title; respectively obtaining context-based word expressions of words in the query and context-based word expressions of words in the title according to the word expressions; generating matching features according to obtained information; determining a matching degree score between the query and the title according to the matching features. The solution of the present disclosure may be applied to improve the accuracy of the evaluation result.

VISUAL DIALOG METHOD AND APPARATUS, METHOD AND APPARATUS FOR TRAINING VISUAL DIALOG MODEL, ELECTRONIC DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM

20230082605 · 2023-03-16 ·

Disclosed in this application is a visual content dialog method performed by an electronic device. The method includes: acquiring an image feature of an input image and state vectors corresponding to first n rounds of historical question answering dialog, n being a positive integer; acquiring a question feature of a current round of questioning related to the input image; performing multimodal encoding on the image feature of the input image, the state vectors corresponding to the first n rounds of historical question answering dialog, and the question feature of the current round of questioning, to obtain a state vector corresponding to the current round of questioning; and performing multimodal decoding on the state vector corresponding to the current round of questioning and the image feature of the input image, to obtain an actual output answer corresponding to the current round of questioning.

Method and apparatus for evaluating a matching degree of multi-domain information based on artificial intelligence, device and medium

11481656 · 2022-10-25 ·

Beijing Baidu Netcom Science And Technology Co., Ltd.

The present disclosure provides a method and apparatus for evaluating a matching degree of multi-domain information based on artificial intelligence, a device and a medium. The method comprises: respectively obtaining valid words in a query, and valid words in each information domain in at least two information domains in a to-be-queried document; respectively obtaining word expressions of valid words in the query and word expressions of valid words in said each information domain in at least two information domains in the to-be-queried document; based on the word expressions, respectively obtaining context-based word expressions of valid words in the query and context-based word expressions of valid words in said each information domain; generating matching features corresponding to said each information domain according to the obtained information; determining a matching degree score between the query and the to-be-queried document according to the matching features corresponding to said each information domain.
