Patent classifications
G06F16/35
Word attribution prediction from subject data
A digital attribution system is described to generate predictions of word attributions from subject data, e.g., titles, subject lines of emails, and so on. To do so, attribution scores are first generated by the digital attribution system that describe the amount to which respective words in the subject data cause performance of a corresponding outcome. The attribution scores are then used by the digital attribution system to generate representations for display in a user interface for respective words in the subject data and may also be used to generate attribution recommendations of changes to be made to the subject data.
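The abstract does not state how the attribution scores are computed. As a minimal sketch, assuming a leave-one-out (occlusion) approach, each word's score can be taken as the drop in a predicted outcome when that word is removed. The names `leave_one_out_attributions` and `toy_score` are hypothetical, and `toy_score` is only a stand-in for a trained outcome model.

```python
from typing import Callable, List, Tuple


def leave_one_out_attributions(
    words: List[str],
    score: Callable[[List[str]], float],
) -> List[Tuple[str, float]]:
    """Attribute the outcome score to each word by removing it in turn.

    A positive value means the word raises the predicted outcome.
    """
    base = score(words)
    return [
        (word, base - score(words[:i] + words[i + 1:]))
        for i, word in enumerate(words)
    ]


def toy_score(words: List[str]) -> float:
    # Hypothetical stand-in for a trained model (e.g. a predicted open rate).
    weights = {"free": 0.25, "sale": 0.5}
    return sum(weights.get(w.lower(), 0.0) for w in words)


attributions = leave_one_out_attributions(["Free", "shipping", "sale"], toy_score)
```

Scores of this kind could drive a per-word highlight in a user interface, with low-scoring words flagged as candidates for replacement.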
TECHNOLOGY TREND PREDICTION METHOD AND SYSTEM
A technology trend prediction method and system are provided. The method comprises acquiring paper data and further comprises the following steps: processing the paper data to generate a candidate technology lexicon; screening the candidate technology lexicon based on mutual information; calculating an independent word forming probability of an out-of-vocabulary (OOV) word; extracting missed words in a title using a bidirectional long short-term memory network and a conditional random field (BI-LSTM+CRF) model; and predicting a technology trend. The technology trend prediction method and system analyze the relationship of technology changes in a high-dimensional space and predict the development of a technology trend over time by extracting technical features of papers through natural language processing and time sequence algorithms.
SYSTEM FOR RECOMMENDING DATA BASED ON SIMILARITY AND METHOD THEREOF
Provided are a system for recommending related data based on similarity, and a method thereof, the system including: a data collection device; an event extraction device; a data cleansing device; an event vector generation device; an artificial intelligence learning device; and a similar data recommendation device. The present disclosure is directed to providing a system for recommending related data based on similarity and a method thereof, wherein unstructured open data on a webpage is collected to automatically generate an event label for determining a similarity relation, and an artificial intelligence (AI)-based model is trained to group and recommend semantically similar related data, thereby effectively helping users including data scientists who want to see meaningful results through open data.
SYSTEM AND METHOD FOR MULTI-MODAL TRANSFORMER-BASED CATAGORIZATION
A transformer categorization architecture is applied to image and text data sets to determine a taxonomy for items in a large database of products. Aggregating recommendations from a multi-modal categorization process achieves a more accurate product classification with potentially less training. The system is implemented to support an e-commerce portal and user facilitated access to products for online purchases.
Automatic Synonyms, Abbreviations, and Acronyms Detection
A completely unsupervised solution for generating and maintaining a list of lexically similar terms for an e-commerce system is provided. Given a particular electronic collection of items in an e-commerce system, each term in a first item listing is initially paired with each term in a second item listing to form a set of token pairs. The token pairs represent possible candidates for being synonyms. For a respective token pair, an attempt is made to match the shorter token of the token pair to the longer token of the token pair, character by character. If the match is successful, the terms in the token pair are automatically labeled as synonyms for the particular electronic collection of items. Some implementations automatically filter out false positives and/or token pairs that are unrelated and not likely synonyms. The solution can be performed at the granularity of a product, category, vertical, or entire catalog.
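The character-by-character matching step can be sketched as an in-order subsequence test. This is an illustration under assumptions, not the patented procedure: the function names are hypothetical, and the requirement that both tokens share a first character is an added heuristic to cut false positives.

```python
from typing import List, Tuple


def is_abbreviation_candidate(a: str, b: str) -> bool:
    """True if the shorter token's characters occur, in order, in the longer one."""
    short, long_ = sorted((a.lower(), b.lower()), key=len)
    if not short or short == long_:
        return False
    # Assumption (not from the abstract): tokens must share their first character.
    if short[0] != long_[0]:
        return False
    remaining = iter(long_)
    # `ch in remaining` consumes the iterator, which enforces character order.
    return all(ch in remaining for ch in short)


def candidate_pairs(listing_a: str, listing_b: str) -> List[Tuple[str, str]]:
    """Pair every term of one listing with every term of the other and keep matches."""
    return [
        (s, t)
        for s in listing_a.split()
        for t in listing_b.split()
        if is_abbreviation_candidate(s, t)
    ]


pairs = candidate_pairs("blk leather sofa", "black leather couch")
```

A production system would still need the filtering step the abstract mentions, since a subsequence match alone also accepts unrelated pairs such as "cat" and "coat".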
Semantic cluster formation in deep learning intelligent assistants
Enhanced techniques and circuitry are presented herein for providing responses to questions from among digital documentation sources spanning various documentation formats, versions, and types. One example includes a method comprising receiving an indication of a question directed to a subject having a documentation corpus, determining a set of passages of the documentation corpus related to the question, ranking the set of passages according to relevance to the question, forming semantic clusters comprising sentences extracted from ranked ones of the set of passages according to sentence similarity, and providing a response to the question based at least on a selected semantic cluster.
Extraction of semantic relation
A computer-implemented method for extracting semantic relations is disclosed. In the method, a plurality of hierarchical structures that originate from a corpus of documents is obtained. Each hierarchical structure includes a plurality of elements having respective recitations included in a corresponding document. In the method, for each predetermined relationship between ancestor and descendant elements in the hierarchical structures, a first keyword list is extracted from the ancestor element and a second keyword list is extracted from the descendant element. A statistical index is calculated for each pair of first and second keywords using the first keyword lists and the second keyword lists. The index indicates a strength of association between the first and second keywords. In the method, a candidate list of keyword pairs having semantic relationships is output using the statistical index calculated for each pair.
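The abstract does not name the statistical index. As one possible choice, the sketch below uses pointwise mutual information (PMI), computed over the keyword lists gathered from each ancestor-descendant relationship. The function name `pmi_scores` and the sample keywords are hypothetical.

```python
import math
from collections import Counter
from itertools import product
from typing import Dict, List, Sequence, Tuple

KeywordLists = Tuple[Sequence[str], Sequence[str]]


def pmi_scores(relations: List[KeywordLists]) -> Dict[Tuple[str, str], float]:
    """PMI for each (ancestor keyword, descendant keyword) pair.

    `relations` holds one (ancestor_keywords, descendant_keywords) entry
    per ancestor-descendant relationship found in the hierarchies.
    """
    n = len(relations)
    first, second, joint = Counter(), Counter(), Counter()
    for ancestor_kw, descendant_kw in relations:
        a_set, d_set = set(ancestor_kw), set(descendant_kw)
        first.update(a_set)
        second.update(d_set)
        joint.update(product(a_set, d_set))
    # log2( p(a, d) / (p(a) * p(d)) ) with probabilities estimated as counts / n
    return {
        (a, d): math.log2(count * n / (first[a] * second[d]))
        for (a, d), count in joint.items()
    }


relations = [
    (["engine"], ["piston"]),
    (["engine"], ["piston"]),
    (["wheel"], ["tire"]),
    (["wheel"], ["piston"]),
]
scores = pmi_scores(relations)
# Candidate list: pairs ordered from strongest to weakest association.
candidates = sorted(scores, key=scores.get, reverse=True)
```

PMI favors rare pairs, so a practical candidate list would likely also apply a minimum co-occurrence count before ranking.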
Systems and methods for coverage analysis of textual queries
A computer-based system and method for assigning queries to topics and/or visualizing or analyzing query coverage may include, using a computer processor, searching, using a set of queries, over a set of text documents, to produce for each query a set of search results for the query. Each search result may include a subset of text from a text document of the set of text documents. For each query, a query vector may be calculated based on the set of search results for the query, and for each of a set of topics describing the set of text documents, a topic vector may be calculated. A report or visualization may be generated including the set of queries and the set of topics using the topic vectors and the query vectors.
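How the query and topic vectors are built is not specified in the abstract. The sketch below assumes simple bag-of-words count vectors and cosine similarity, and assigns each query to its closest topic. All function names and sample texts are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def text_vector(texts: List[str]) -> Counter:
    """Sum bag-of-words counts over a list of text snippets."""
    vec = Counter()
    for text in texts:
        vec.update(text.lower().split())
    return vec


def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def assign_queries(
    query_results: Dict[str, List[str]],
    topic_texts: Dict[str, List[str]],
) -> Dict[str, str]:
    """Map each query to the topic whose vector is closest to the query vector."""
    topic_vecs = {name: text_vector(texts) for name, texts in topic_texts.items()}
    return {
        query: max(topic_vecs, key=lambda name: cosine(text_vector(results), topic_vecs[name]))
        for query, results in query_results.items()
    }


assignment = assign_queries(
    {
        "refund": ["i want my money back", "refund my payment"],
        "login": ["cannot reset password", "password login fails"],
    },
    {
        "billing": ["payment refund money charge"],
        "account": ["password login account reset"],
    },
)
```

Topics that attract no queries in such an assignment indicate gaps in query coverage, which is the kind of result a coverage report could surface.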
Predictive resolutions for tickets using semi-supervised machine learning
Aspects of the subject disclosure may include, for example, a method in which a processing system collects information associated with trouble tickets each including a problem abstract and a log text. The method includes analyzing the log text to obtain a problem resolution for that ticket; defining ticket clusters according to the problem abstracts, and labeling the clusters. The processing system creates a library of the labeled clusters, each entry including a cluster label, a problem abstract for that cluster, and a resolution summary for that problem abstract, indicating a mapping of the problem abstract to the resolution summary for that cluster. The method includes training, based on the mapping, machine-learning applications for a predicted resolution summary for each cluster and for classifying a new ticket. The method includes assigning the new ticket to a cluster according to the classifying. Other embodiments are disclosed.