G06F16/313

LANGUAGE PROCESSOR, LANGUAGE PROCESSING METHOD AND LANGUAGE PROCESSING PROGRAM

20220382790 · 2022-12-01 ·

Nippon Telegraph And Telephone Corporation

The present disclosure is directed to enabling acquisition of information on an argument corresponding to a case. A language processing apparatus refers to an argument emergence history database 14, which stores argument emergence patterns associated with cases and arguments of verbs for each meaning of a word or usage of a verb. The apparatus acquires from the database 14 an argument emergence pattern that matches a verb, and a case of that verb, included in a request from a user, and generates a response to the user using an argument included in the acquired pattern.

NARROWING SYNONYM DICTIONARY RESULTS USING DOCUMENT ATTRIBUTES

20220382753 · 2022-12-01 ·

In a method for improving generation and relevancy of search results, a processor receives a search query comprising a search term. A processor generates a document group based on the search query and at least one synonym related to the search term in a synonym dictionary. The synonym dictionary may include search document attributes for base words and synonyms of the base words. A processor extracts, from the document group, an extracted document having a document attribute matching a search document attribute of the at least one synonym. A processor lists the extracted document as a search result.
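The filtering step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Document` class, the `SYNONYMS` table, and the rule that direct hits on the search term are always kept are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    attribute: str  # hypothetical document attribute, e.g. a category label


# Hypothetical synonym dictionary: base word -> [(synonym, search document attribute)]
SYNONYMS = {
    "jaguar": [("panther", "wildlife"), ("jag", "automotive")],
}


def search(term, documents, synonyms=SYNONYMS):
    """Return documents matching the term, plus synonym hits whose
    document attribute matches the attribute stored with the synonym."""
    results = []
    for doc in documents:
        words = doc.text.lower().split()
        if term in words:
            results.append(doc)  # assumption: direct hits are always kept
            continue
        for synonym, required_attribute in synonyms.get(term, []):
            # A synonym hit only counts when the attributes agree.
            if synonym in words and doc.attribute == required_attribute:
                results.append(doc)
                break
    return results
```

Under these assumptions, a document that mentions "panther" is returned for the query "jaguar" only when it is tagged "wildlife", which is how the attribute narrows the synonym expansion.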

MODIFYING DATA PIPELINE BASED ON SERVICES EXECUTING ACROSS MULTIPLE TRUSTED DOMAINS

20220382852 · 2022-12-01 ·

Computing systems of a multi-tenant trusted domain collect metadata describing data stored in data sources of a set of tenant trusted domains. The computing systems of the multi-tenant trusted domain use the metadata to process natural language questions based on data stored in data sources of a tenant trusted domain. The computing systems of the multi-tenant trusted domain identify a set of data sources of the tenant trusted domain that are relevant for processing the natural language question and generate an execution plan for answering the natural language question. The computing systems of the multi-tenant trusted domain send the execution plan to one or more computing systems of the tenant trusted domain. The computing systems of the tenant trusted domain execute the execution plan and send the result of executing the execution plan to a client device that sent the natural language question.

WORDBREAK ALGORITHM WITH OFFSET MAPPING

20220382789 · 2022-12-01 ·

Microsoft Technology Licensing, LLC

A computer system is provided, including a processor coupled to a mass storage device that stores instructions, which, upon execution by the processor, cause the processor to store an original string formed of a plurality of characters, perform a wordbreak algorithm on the original string, and tokenize the original string to generate a processed string including a plurality of word tokens separated by spaces. The processor is further configured to generate an offset map between locations within the word tokens in the processed string and corresponding locations in the original string and classify a portion of the processed string as a target. The processor is further configured to identify target characters in the original string that correspond to the target using the offset map and perform a predetermined action on the target characters in the original string.
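A rough sketch of the offset-map idea follows, assuming a simple regex wordbreaker and a substring search standing in for the classifier that picks the target. The function names and the masking action are hypothetical and are not taken from the patent.

```python
import re


def tokenize_with_offsets(original):
    """Split into word and punctuation tokens joined by single spaces.

    offset_map[i] is the index in `original` of processed[i],
    or None where processed[i] is an inserted separator space.
    """
    processed_chars = []
    offset_map = []
    for match in re.finditer(r"\w+|[^\w\s]", original):
        if processed_chars:
            processed_chars.append(" ")
            offset_map.append(None)
        for i in range(match.start(), match.end()):
            processed_chars.append(original[i])
            offset_map.append(i)
    return "".join(processed_chars), offset_map


def map_span_to_original(offset_map, start, end):
    """Translate a [start, end) span of the processed string to the original."""
    positions = [p for p in offset_map[start:end] if p is not None]
    if not positions:
        raise ValueError("span covers no original characters")
    return positions[0], positions[-1] + 1


def redact(original, target):
    """Mask the original characters that correspond to `target`,
    where `target` is found in the processed (tokenized) string."""
    processed, offset_map = tokenize_with_offsets(original)
    start = processed.find(target)  # stand-in for a real classifier
    if start == -1:
        return original
    o_start, o_end = map_span_to_original(offset_map, start, start + len(target))
    return original[:o_start] + "*" * (o_end - o_start) + original[o_end:]
```

The map lets the action run on the untouched original string even though the target was located in the tokenized one, where inserted spaces have shifted every index.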

Natural language processing for entity resolution

11514096 · 2022-11-29 ·

Panjiva, Inc.

An apparatus includes a data access circuit that interprets data records, each having a number of data fields, a record parsing circuit that determines a number of n-grams from terms of each of the data records and maps the number of n-grams to a corresponding number of mathematical vectors, and a record association circuit that determines whether a similarity value between a first mathematical vector for the first data record and a second mathematical vector for the second data record is greater than a threshold similarity value, and associates the first and second data records in response to the similarity value exceeding the threshold similarity value. An example apparatus includes a reporting circuit that provides a catalog entity identifier, associates each of the first term and the second term to the catalog entity identifier, and provides a summary of activity for an entity.
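As an illustration of the record-association step, the sketch below maps each record to a vector of character trigram counts and compares two vectors with cosine similarity. The abstract does not say which n-grams, vector mapping, or similarity measure are used, so all three choices, along with the 0.7 threshold, are assumptions.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Character n-grams of the lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


def to_vector(record, n=3):
    """Map a record (dict of field name -> value) to a sparse n-gram count vector."""
    counts = Counter()
    for field in record.values():
        counts.update(char_ngrams(str(field), n))
    return counts


def cosine(a, b):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def same_entity(r1, r2, threshold=0.7):
    """Associate two records when their similarity exceeds the threshold."""
    return cosine(to_vector(r1), to_vector(r2)) > threshold
```

Character n-grams tolerate small spelling and casing differences between records, which is one plausible reason to prefer them over whole-word matching for entity resolution.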

Extracting entity relations from semi-structured information

11514091 · 2022-11-29 ·

International Business Machines Corporation

Methods and systems for processing records include extracting feature vectors from words in an unstructured portion of a record. The feature vectors are weighted based on similarity to a topic vector from a structured portion of the record associated with the unstructured portion. The weighted feature vectors are classified using a machine learning model to determine respective probability vectors that assign a probability to each of a set of possible relations for each feature vector. Relations between entities are determined within the record based on the probability vectors. An action is performed responsive to the determined relations.

DOCUMENT DISPLAY ASSISTANCE SYSTEM, DOCUMENT DISPLAY ASSISTANCE METHOD, AND PROGRAM FOR EXECUTING SAID METHOD

20220375246 · 2022-11-24 ·

Xcoo, Inc.

The present invention provides a document display assistance system which estimates and highlights significant words in a document of a specific field. The system comprises: a database in which selection-target words and non-selection-target words are registered; a word-selection model trained by machine learning to estimate whether a word is a selection-target word; a text pre-processing unit which segments words from an accepted display-target document; a word classification unit which classifies, based on the database, each word as a selection-target word, a non-selection-target word, or an indeterminate word; a text post-processing unit which generates output data by imparting a predetermined attribute to a predetermined word in the display-target document; and an output unit which outputs the output data. If the word-selection model estimates a label indicating that an indeterminate word classified by the word classification unit is a selection-target word, the model reclassifies that word as a selection-target word, and the text post-processing unit imparts the predetermined attribute to it.

SYSTEMS AND METHODS FOR HIERARCHICAL RETRIEVAL OF SEMANTIC-BASED PASSAGES IN DEEP LEARNING

20220374459 · 2022-11-24 ·

Embodiments described herein provide dense hierarchical retrieval for open-domain question answering over a corpus of documents using a document-level and a passage-level dense retrieval model. Specifically, each document is viewed as a structural collection that has sections, subsections and paragraphs. Each document may be split into short passages, where a document-level retrieval model and a passage-level retrieval model may be applied to return a smaller set of filtered texts. Top documents may be identified after encoding the question and the documents and determining document relevance scores to the encoded question. Thereafter, a set of top passages is further identified based on encoding of the passages and determining passage relevance scores to the encoded question. The document and passage relevance scores may be used in combination to determine a final retrieval ranking for the documents having the set of top passages.
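The two-stage ranking can be sketched as follows. This is a toy illustration, not the described system: a bag-of-words count vector stands in for the learned dense encoders, and the weighted sum used to combine document and passage scores (the `weight` parameter) is an assumption, since the abstract only says the scores are used in combination.

```python
from collections import Counter
from math import sqrt


def encode(text):
    """Stand-in for a learned dense encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def score(q, d):
    """Cosine similarity between two count vectors."""
    dot = sum(q[t] * d[t] for t in q.keys() & d.keys())
    norm = sqrt(sum(v * v for v in q.values())) * sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0


def hierarchical_retrieve(question, corpus, top_docs=2, top_passages=3, weight=0.5):
    """corpus maps doc_id -> list of passages.

    Stage 1 ranks whole documents; stage 2 ranks passages of the top
    documents by a weighted sum of document and passage scores.
    Returns a list of (combined_score, doc_id, passage) tuples.
    """
    q = encode(question)
    doc_scores = {
        doc_id: score(q, encode(" ".join(passages)))
        for doc_id, passages in corpus.items()
    }
    best_docs = sorted(doc_scores, key=doc_scores.get, reverse=True)[:top_docs]

    candidates = []
    for doc_id in best_docs:
        for passage in corpus[doc_id]:
            combined = weight * doc_scores[doc_id] + (1 - weight) * score(q, encode(passage))
            candidates.append((combined, doc_id, passage))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:top_passages]
```

Restricting the passage stage to the top documents keeps the number of passage comparisons small, at the cost of missing a relevant passage in a document that ranked poorly as a whole.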

System and method for generating subjective wellbeing analytics score

11593565 · 2023-02-28 ·

TSG Technologies, LLC

A system includes at least one processor to perform natural language processing on text from at least one document and assign the at least one document to at least one subjective wellbeing dimension by comparing the text from the at least one document with a subjective wellbeing dimension filter for each subjective wellbeing dimension, insert the at least one document into at least one bin, each bin associated with a particular subjective wellbeing dimension, and analyze each document in each bin associated with the particular subjective wellbeing dimension to determine a score for each subjective wellbeing dimension and an overall score that is based on each score for each subjective wellbeing dimension.

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM

20230054525 · 2023-02-23 ·

An information processing apparatus (1S) includes: a creation unit (232) configured to create support information for supporting at least one of construction and maintenance of a database on the basis of first information including a representative question and an answer sentence associated with the representative question and stored in the database stored in a storage unit and second information indicating a history of reception with respect to a user; and a processing unit (233) configured to execute processing of performing at least one of construction and maintenance of the database on the basis of input information input to the support information created by the creation unit (232).
