G06F40/247

NATURAL LANGUAGE PROCESSING COMPREHENSION AND RESPONSE SYSTEM AND METHODS
20230044048 · 2023-02-09 ·

An automatic, system-generated, multi-faceted comprehension and response capability uses Natural Language Processing to provide value-specific answers from available unstructured data, documents, and text. The system interprets questions and queries to determine the type of question and provides a response or answer based on the data or information available. If the answer is in the ingested data, the response is one of the following: a list of documents, a list of document snippets containing the answer, a formalized and templated response, or a highly relevant hand-curated response.
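
The routing between response forms described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the `CURATED` answers, the `DOCS` corpus, the stopword list, and the rule "a leading question word means snippets, anything else means a document list" are all assumptions made for illustration, and the templated response form is left out.

```python
import re

# Hypothetical hand-curated answers, keyed by the normalized question.
CURATED = {"what is the return policy?": "Items can be returned within 30 days."}

# Hypothetical ingested documents.
DOCS = {
    "warranty.txt": "The warranty lasts two years. Claims go through support.",
    "shipping.txt": "Orders ship within three days.",
}

STOPWORDS = {"what", "is", "the", "a", "an", "of", "how", "who", "when", "where"}
QUESTION_WORDS = {"who", "what", "when", "where", "which", "how", "why"}


def tokens(text):
    """Lowercased alphabetic tokens of a string, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def respond(question):
    """Return (response_kind, payload) for a question or keyword query."""
    key = question.strip().lower()
    if key in CURATED:
        return ("curated", CURATED[key])
    terms = tokens(key) - STOPWORDS
    words = key.split()
    is_question = bool(words) and words[0] in QUESTION_WORDS
    if is_question:
        # Questions get sentence-level snippets that share a query term.
        snippets = [
            (name, sentence)
            for name, text in DOCS.items()
            for sentence in re.split(r"(?<=[.!?])\s+", text)
            if terms & tokens(sentence)
        ]
        if snippets:
            return ("snippets", snippets)
    # Keyword queries, or questions with no snippet, fall back to documents.
    matches = [name for name, text in DOCS.items() if terms & tokens(text)]
    if matches:
        return ("documents", matches)
    return ("none", None)
```

Calling `respond("What is the warranty period?")` returns the matching sentence from `warranty.txt` as a snippet, while the keyword query `respond("shipping orders")` returns a document list.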

TECHNOLOGIES FOR RELATING TERMS AND ONTOLOGY CONCEPTS

This disclosure enables various technologies that can (1) learn new synonyms for a given concept without manual curation techniques, (2) relate (e.g., map) some, many, most, or all raw named entity recognition outputs (e.g., “United States”, “United States of America”) to ontological concepts (e.g., ISO-3166 country code: “USA”), (3) account for false positives from a prior named entity recognition process, or (4) aggregate some, many, most, or all named entity recognition results from machine learning or rules-based approaches to provide a best-of-breed hybrid approach (e.g., synergistic effect).
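
As a minimal sketch of items (2) through (4), raw mentions from two recognizers can be normalized, looked up in a synonym table, and counted per concept. The `SYNONYMS` table, the function names, and the choice to treat unmapped mentions as false positives are assumptions made for illustration; the disclosure does not specify them, and synonym learning (item 1) is not shown.

```python
from collections import Counter

# Hypothetical synonym table: normalized surface form -> ontology concept ID
# (ISO 3166-1 alpha-3 country codes in this example).
SYNONYMS = {
    "united states": "USA",
    "united states of america": "USA",
    "germany": "DEU",
}


def normalize(mention):
    """Lowercase and collapse whitespace so surface variants compare equal."""
    return " ".join(mention.lower().split())


def map_to_concept(mention):
    """Return the concept ID for a raw mention, or None if it is unknown."""
    return SYNONYMS.get(normalize(mention))


def aggregate(ml_mentions, rule_mentions):
    """Pool mentions from both recognizers and count them per concept.

    Mentions that map to no concept are dropped, which is one simple
    (assumed) way to discard false positives from the upstream step.
    """
    counts = Counter()
    for mention in list(ml_mentions) + list(rule_mentions):
        concept = map_to_concept(mention)
        if concept is not None:
            counts[concept] += 1
    return counts
```

With this table, "United States" from one recognizer and "united states of america" from the other both count toward the single concept `USA`.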

IMAGE PROCESSING UTILIZING AN ENTIGEN CONSTRUCT

A method performed by a computing device includes determining a set of identigens for each word of a query of a topic to produce sets of identigens. Each set of identigens represents one or more different meanings of a word of the query. The method further includes interpreting, using identigen pairing rules, the sets of identigens to determine a most likely meaning interpretation of the query and produce an excluding query entigen group with an excluding entigen. The method further includes recovering a response entigen group for the query from a knowledge database utilizing the excluding query entigen group. The response entigen group provides a response to the query.
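
The interpretation step can be pictured with a small sketch, under the assumption that an identigen behaves like a word-sense identifier and that a pairing rule is an allowed pair of adjacent senses. The `IDENTIGENS` and `PAIRING_RULES` data and the scoring scheme are hypothetical; the excluding entigen and the knowledge database lookup are not modeled.

```python
from itertools import product

# Hypothetical sense inventory: word -> candidate identigens.
IDENTIGENS = {
    "bat": ["bat#animal", "bat#club"],
    "flies": ["fly#move", "fly#insect"],
}

# Hypothetical pairing rules: adjacent identigen pairs that make sense together.
PAIRING_RULES = {("bat#animal", "fly#move")}


def interpret(words):
    """Pick the combination of identigens that satisfies the most pairing rules."""
    sets = [IDENTIGENS.get(word, [word + "#unknown"]) for word in words]
    best, best_score = None, -1
    for combo in product(*sets):
        # Score a candidate by how many adjacent pairs are allowed by the rules.
        score = sum((a, b) in PAIRING_RULES for a, b in zip(combo, combo[1:]))
        if score > best_score:
            best, best_score = combo, score
    return list(best)
```

For the query words `["bat", "flies"]`, the only combination that satisfies a rule is the animal sense followed by the motion sense, so that interpretation is selected.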

NATURAL LANGUAGE BASED PROCESSOR AND QUERY CONSTRUCTOR
20230042940 · 2023-02-09 ·

An apparatus comprising an interface and a natural language processor. The interface receives a data retrieval request formatted in a natural language and the natural language processor processes the data retrieval request. Processing the data retrieval request includes identifying database entities, database relations, or any combination thereof based on words in the data retrieval request. It can also include identifying database entity criterion, database relation criterion, or any combination thereof based on words in the data retrieval request. It also includes generating a database query based on the database entities, the database relations, the database entity criterion, the database relation criterion, or any combination thereof and causing the database query to be applied to a database. Processing the data retrieval request further includes grammatically tagging the data retrieval request using part-of-speech tagging techniques (e.g., grammatical type, grammatical context, semantic, or any combination thereof) and a database ontology.
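
A minimal sketch of the request-to-query step follows. The `LEXICON` is a hypothetical stand-in for both the part-of-speech tagger and the database ontology: it maps a word either to a database entity (a table) or to a criterion (a column whose value is the next word). The table and column names are invented for the example.

```python
import sqlite3

# Hypothetical lexicon: word -> (role, database name).
LEXICON = {
    "customers": ("entity", "customers"),
    "orders": ("entity", "orders"),
    "in": ("criterion", "city"),
    "named": ("criterion", "name"),
}


def build_query(request):
    """Translate a natural-language request into (sql, params)."""
    words = request.split()
    table, conditions, params = None, [], []
    i = 0
    while i < len(words):
        role = LEXICON.get(words[i].lower())
        if role and role[0] == "entity" and table is None:
            table = role[1]
        elif role and role[0] == "criterion" and i + 1 < len(words):
            # The word after a criterion word is taken as its value.
            conditions.append(f"{role[1]} = ?")
            params.append(words[i + 1])
            i += 1
        i += 1
    if table is None:
        raise ValueError("no database entity recognized in request")
    # Table and column names come only from LEXICON; values are bound parameters.
    sql = f"SELECT * FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


def run_request(conn, request):
    """Build the query and apply it to an open sqlite3 connection."""
    sql, params = build_query(request)
    return conn.execute(sql, params).fetchall()
```

For example, `build_query("show customers in Boston")` yields `SELECT * FROM customers WHERE city = ?` with the parameter list `["Boston"]`. Binding the value as a parameter, rather than interpolating it into the SQL string, keeps user-supplied words out of the query text.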

Preparing documents for coreference analysis

Unstructured text is identified as larger than a threshold size. Named-entity recognition analysis is executed on the unstructured text. One or more anchor entities of the unstructured text are determined that each occur more than a threshold number of times within the unstructured text. Two or more instances of the one or more anchor entities that are separated by at least a threshold amount of text of the unstructured text are identified. The unstructured text is partitioned into at least three sections. The unstructured text is partitioned at respective natural language demarcation points associated with each of the two or more instances such that each of the at least three sections is smaller than the threshold size. Separate coreference analyses are performed in parallel on each of the at least three sections.
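
The partitioning steps can be sketched as follows, assuming the named-entity recognizer has already produced `(name, character_offset)` pairs. Several choices here are illustrative assumptions: the demarcation point is taken to be the start of the sentence containing an anchor mention (found by searching backward for ". "), the gap is measured between cut points, and `coref` is any caller-supplied function rather than a specific coreference library.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def anchor_entities(mentions, min_count):
    """Names that occur more than min_count times among (name, offset) mentions."""
    counts = Counter(name for name, _ in mentions)
    return {name for name, count in counts.items() if count > min_count}


def partition(text, mentions, max_size, min_count, min_gap):
    """Cut text at sentence starts of anchor mentions at least min_gap apart."""
    anchors = anchor_entities(mentions, min_count)
    cuts, last = [], 0
    for name, offset in sorted(mentions, key=lambda m: m[1]):
        if name not in anchors:
            continue
        # Demarcation point: start of the sentence that contains the mention.
        boundary = text.rfind(". ", 0, offset)
        start = boundary + 2 if boundary != -1 else 0
        if start - last >= max(min_gap, 1):
            cuts.append(start)
            last = start
    bounds = [0] + cuts + [len(text)]
    sections = [text[a:b] for a, b in zip(bounds, bounds[1:])]
    if any(len(section) >= max_size for section in sections):
        raise ValueError("a section still exceeds the size threshold")
    return sections


def analyze_in_parallel(sections, coref):
    """Run the supplied coreference function on every section concurrently."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(coref, sections))
```

Because each section begins at a sentence that names an anchor entity, pronouns later in that section have a nearby antecedent even though the sections are analyzed independently. A thread pool is used here for brevity; a CPU-bound coreference model would more plausibly run in a process pool.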

Content extraction system

A system includes a content extraction engine comprising at least one processor. The engine is configured to receive a content page for a target product, the page including product data for the target product and noise content unrelated to the target product, identify the noise content, and remove the noise content from the content page, thereby generating a remainder content page containing target product data usable to enable product comparison between multiple sources.
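
One simple way to picture the noise-removal step is to treat certain HTML containers as noise and keep only the text outside them. The sketch below makes that assumption explicit: the `NOISE_TAGS` set and the class and function names are hypothetical, and the abstract does not say how noise content is actually identified.

```python
from html.parser import HTMLParser

# Assumed noise containers; a real system would need a richer notion of noise.
NOISE_TAGS = {"nav", "footer", "aside", "script", "style"}


class ProductTextExtractor(HTMLParser):
    """Collect text that is not nested inside any assumed noise container."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.noise_depth > 0:
            self.noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.noise_depth == 0:
            self.chunks.append(text)


def remainder_content(page_html):
    """Return the text chunks left after the noise containers are removed."""
    parser = ProductTextExtractor()
    parser.feed(page_html)
    parser.close()
    return parser.chunks
```

Running the same extractor over pages from several retailers gives comparable lists of product text, which is the kind of remainder the abstract describes as usable for product comparison.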