G06F40/49

Determining semantic content of textual clusters
11651032 · 2023-05-16

The embodiments herein provide a framework for and specific implementations of machine learning (ML) analysis of incident, online chat, knowledgebase, skills, and perhaps other types of databases. The ML techniques described herein may include various forms of semantic analysis of textual information in these databases, such as clustering, term frequency, word embedding, paragraph embedding, and potentially other techniques. Advantageously, use of ML in the specific ways described herein can provide insights into this textual information that otherwise would be impossible to determine in an accurate or concise fashion.

DATA AUGMENTATION AND BATCH BALANCING METHODS TO ENHANCE NEGATION AND FAIRNESS

Techniques for augmentation and batch balancing of training data to enhance a machine learning model's handling of negation and its fairness. In one particular aspect, a method is provided that includes: obtaining a training set of labeled examples for training a machine learning model to classify sentiment; searching the training set of labeled examples, or an unlabeled corpus of text on target domains, for sentiment examples having negation cues, sentiment-laden words, words with sentiment prefixes or suffixes, or a combination thereof; rewriting the sentiment examples to create negated versions thereof and generate a labeled negation pair data set; and training the machine learning model using labeled examples from the labeled negation pair data set.
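
The search-and-rewrite steps in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the publication: the tiny `SENTIMENT_WORDS` lexicon, the copula-based rewrite rule, and the simplification that negating an example flips its label (in practice "not bad" is not strictly "positive").

```python
import re

# Hypothetical lexicon of sentiment-laden words (illustrative only).
SENTIMENT_WORDS = {"good", "great", "bad", "terrible"}
# Simplifying assumption: negating an example flips its label.
FLIP = {"positive": "negative", "negative": "positive"}
NEGATION_CUE = re.compile(r"\b(is|was|are|were) not\b")
COPULA = re.compile(r"\b(is|was|are|were)\b")


def negate(text):
    """Return a negated rewrite of text, or None if no rule applies."""
    if NEGATION_CUE.search(text):
        # The example already has a negation cue: drop it.
        return NEGATION_CUE.sub(r"\1", text, count=1)
    if COPULA.search(text):
        # Otherwise insert "not" after the first copula.
        return COPULA.sub(r"\1 not", text, count=1)
    return None


def build_negation_pairs(examples):
    """Search labeled (text, label) examples and pair each hit with its negation."""
    pairs = []
    for text, label in examples:
        words = set(re.findall(r"[a-z']+", text.lower()))
        has_sentiment = not words.isdisjoint(SENTIMENT_WORDS)
        if not has_sentiment and not NEGATION_CUE.search(text):
            continue  # neither a sentiment-laden word nor a negation cue
        rewritten = negate(text)
        if rewritten is not None:
            pairs.append(((text, label), (rewritten, FLIP.get(label, label))))
    return pairs


if __name__ == "__main__":
    data = [
        ("The service was great", "positive"),
        ("The food is not bad", "positive"),
        ("Arrived Tuesday", "positive"),
    ]
    for original, negated in build_negation_pairs(data):
        print(original, "->", negated)
```

The resulting pairs could then be mixed into training batches. How the batches are balanced is not specified in the abstract and is left out of this sketch.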

LANGUAGE MODEL USING REVERSE TRANSLATIONS
20170371866 · 2017-12-28

Exemplary embodiments relate to techniques for improving machine translation systems. The machine translation system may apply one or more models for translating material from a source language into a destination language. The models are initially trained using training data. According to exemplary embodiments, supplemental training data is used to train the models, where the supplemental training data uses in-domain material to improve the quality of output translations. In-domain data may include data that relates to the same or similar topics as those expected to be encountered in a translation of material from the source language into the destination language. In-domain data may include material previously translated from the source language into the destination language, material similar to previous translations, and destination language material that has previously been the subject of a request for translation into the source language.

Creating line item information from free-form tabular data

The present disclosure involves systems, software, and computer-implemented methods for creating line item information from tabular data. One example method includes receiving event data values at a system. Column headers of columns in the event data values are identified. At least one column header is not included in standard line item terms used by the system. Column values of the columns in the event data values are identified. The identified column headers and the identified column values are processed using one or more models to map each column to a standard line item term used by the system. The processing includes using context determination and content recognition to identify standard line item terms. An event is created in the system, including the creation of line items from the identified column values. Each line item includes standard line item terms mapped to the columns.
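
As a rough illustration of mapping free-form columns to standard terms, the Python sketch below tries the header first and falls back to the column's values. It is a rule-based stand-in, not the models the abstract refers to. The standard terms, the `SYNONYMS` table, the similarity cutoff, and the value patterns are all hypothetical.

```python
import difflib
import re

# Hypothetical standard line item terms and header synonyms.
STANDARD_TERMS = ["quantity", "unit price", "description"]
SYNONYMS = {"qty": "quantity", "cost": "unit price", "item": "description"}


def map_by_header(header):
    """Map a header to a standard term by exact, synonym, or fuzzy match."""
    key = header.strip().lower()
    if key in STANDARD_TERMS:
        return key
    if key in SYNONYMS:
        return SYNONYMS[key]
    close = difflib.get_close_matches(key, STANDARD_TERMS, n=1, cutoff=0.8)
    return close[0] if close else None


def map_by_content(values):
    """Fallback: guess the standard term from what the column values look like."""
    if all(re.fullmatch(r"\d+", v) for v in values):
        return "quantity"
    if all(re.fullmatch(r"[$€]?\d+\.\d{2}", v) for v in values):
        return "unit price"
    return "description"


def create_line_items(headers, rows):
    """Build one dict per row, keyed by the standard term mapped to each column."""
    columns = list(zip(*rows))
    mapping = [
        map_by_header(h) or map_by_content(col)
        for h, col in zip(headers, columns)
    ]
    return [dict(zip(mapping, row)) for row in rows]


if __name__ == "__main__":
    headers = ["Qty", "Price each", "Item"]
    rows = [["2", "9.99", "Stapler"], ["10", "0.50", "Pen"]]
    for item in create_line_items(headers, rows):
        print(item)
```

Here "Price each" matches no standard term or synonym, so its column is mapped from its values instead, which loosely mirrors the case where a header is not among the system's standard terms.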

Dynamic attribute extraction systems and methods for artificial intelligence platform
11681874 · 2023-06-20

An AI platform may receive a request for information on text. The text is processed through a text mining pipeline for dynamic attribute extraction. An engine determines entities in the text and utilizes the entities to determine a relationship pattern. The engine identifies a trigger by matching one of the entities with a predefined entity in a utility authority file, locates an entity in close proximity to the trigger, identifies a value or regular expression in close proximity to the trigger in the text, and creates a triplet containing the entity, the trigger, and the value or regular expression, the triplet representing the relationship pattern. The engine applies an action to the triplet, wherein the action comprises obtaining the value from the text or translating the regular expression. The engine attaches the value or a result from the translating to the entity as a dynamic attribute of the entity.
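
The entity, trigger, value triplet described above can be sketched in a few lines of Python. This is a simplified reading under stated assumptions: the `AUTHORITY` dictionary stands in for the authority file and maps trigger words to a pattern for the expected value, entities are supplied by the caller rather than detected, "close proximity" is taken to be a fixed character window, and only the "obtain the value from the text" action is shown.

```python
import re
from typing import NamedTuple

# Hypothetical authority file: trigger word -> pattern of the value expected nearby.
AUTHORITY = {"born": r"\b\d{4}\b", "aged": r"\b\d{1,3}\b"}


class Triplet(NamedTuple):
    entity: str
    trigger: str
    value: str


def extract_triplets(text, entities, window=40):
    """Find triggers, the nearest preceding entity, and a nearby value."""
    triplets = []
    for trigger, value_pattern in AUTHORITY.items():
        for m in re.finditer(rf"\b{re.escape(trigger)}\b", text):
            # Nearest known entity within `window` characters before the trigger.
            left = text[max(0, m.start() - window):m.start()]
            candidates = [(left.rfind(e), e) for e in entities if e in left]
            if not candidates:
                continue
            entity = max(candidates)[1]
            # First value within `window` characters after the trigger.
            value = re.search(value_pattern, text[m.end():m.end() + window])
            if value:
                triplets.append(Triplet(entity, trigger, value.group()))
    return triplets


def attach_attributes(triplets):
    """Attach each extracted value to its entity as a dynamic attribute."""
    attributes = {}
    for t in triplets:
        attributes.setdefault(t.entity, {})[t.trigger] = t.value
    return attributes


if __name__ == "__main__":
    text = "Ada Lovelace was born in 1815 and Alan Turing was born in 1912."
    found = extract_triplets(text, ["Ada Lovelace", "Alan Turing"])
    print(attach_attributes(found))
```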

LOW-RESOURCE MULTILINGUAL MACHINE LEARNING FRAMEWORK
20230177281 · 2023-06-08

Systems and methods for performing machine learning on multilingual text data.