Patent classifications
G06F16/3347
Method and system for automatic discovery of topics and trends over time
A method and system for automatically performing a discovery of topics within temporally ordered text document collections are provided. The method includes generating a bag of words vector for each text document collection using a predefined dictionary. The method also includes iteratively calculating, based on the generated bag of words vectors, for each text document collection, a hidden topic vector representing topics of the respective text document collection using a calculated hidden state vector memorizing a hidden state of all previous text document collections.
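As a rough illustration of the first step only (the bag of words vectors), the sketch below counts dictionary words per collection. The dictionary, the sample collections, and the helper name `bag_of_words` are assumptions made for illustration; the recurrent hidden topic computation described in the abstract is not shown.

```python
from collections import Counter

def bag_of_words(text, dictionary):
    """Count how often each dictionary word occurs in text.

    Words outside the predefined dictionary are ignored.
    """
    counts = Counter(text.lower().split())
    return [counts[word] for word in dictionary]

# Hypothetical predefined dictionary and time-ordered collections.
dictionary = ["topic", "trend", "model", "data"]
collections = [
    "topic model for data",   # earlier time step
    "trend data data",        # later time step
]

# One vector per collection, kept in temporal order.
vectors = [bag_of_words(c, dictionary) for c in collections]
```

Each vector has one entry per dictionary word, so collections from different time steps remain directly comparable.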
Method, apparatus, device and medium for determining text relevance
Embodiments of the present disclosure provide a method, apparatus, device and medium for determining text relevance. The method for determining text relevance may include: identifying, from a predefined knowledge base, a first set of knowledge elements associated with a first text and a second set of knowledge elements associated with a second text. The knowledge base includes a knowledge representation consisting of knowledge elements. The method may further include: determining knowledge element relevance between the first set of knowledge elements and the second set of knowledge elements, and determining text relevance between the second text and the first text based at least on the knowledge element relevance.
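The abstract does not say how knowledge element relevance is computed. A minimal sketch, assuming a toy knowledge base that maps surface words to element identifiers and using Jaccard overlap as a stand-in relevance measure (both are assumptions, not the patented method):

```python
def knowledge_elements(text, knowledge_base):
    """Return the set of knowledge element ids whose surface word occurs in text."""
    tokens = set(text.lower().split())
    return {element for word, element in knowledge_base.items() if word in tokens}

def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical knowledge base: surface word -> knowledge element id.
knowledge_base = {"apple": "E_APPLE_INC", "iphone": "E_IPHONE", "banana": "E_BANANA"}

first = knowledge_elements("Apple released a new iPhone", knowledge_base)
second = knowledge_elements("The iPhone is made by Apple", knowledge_base)
third = knowledge_elements("banana bread recipe", knowledge_base)

relevance_same = jaccard(first, second)   # both texts mention the same elements
relevance_diff = jaccard(first, third)    # no shared elements
```

In a fuller system this element-level score would be only one signal among several feeding the final text relevance.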
Generating sentiment metrics using emoji selections
Methods, devices and systems for measuring emotions expressed by computing emoji responses to videos are described. An example method includes receiving user input corresponding to an emoji at a selected time, assigning at least one meaning-bearing word to the emoji, wherein the at least one meaning-bearing word has an intended use or meaning that is represented by the emoji, associating a corresponding vector with the at least one meaning-bearing word, wherein the corresponding vector is a vector of a plurality of vectors in a vector space, and aggregating the plurality of vectors to generate an emoji vector that corresponds to the user sentiment.
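The aggregation step can be pictured as combining the vectors of the words assigned to an emoji. The sketch below assumes element-wise averaging and uses made-up two-dimensional word vectors; the abstract specifies neither the aggregation function nor the vector space.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical word embeddings and emoji-to-word assignments.
word_vectors = {"happy": [1.0, 0.0], "joy": [0.0, 1.0]}
emoji_words = {":smile:": ["happy", "joy"]}

def emoji_vector(emoji):
    """Aggregate the vectors of the meaning-bearing words assigned to an emoji."""
    return mean_vector([word_vectors[w] for w in emoji_words[emoji]])

smile_vector = emoji_vector(":smile:")
```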
Database query generation using natural language text
Some embodiments may obtain a natural language question, determine a context of the natural language question, and generate a first vector based on the natural language question using encoder neural network layers. Some embodiments may access a data table comprising column names, generate vectors based on the column names, and determine attention scores based on the vectors. Some embodiments may update the vectors based on the attention scores, generate a second vector based on the natural language question, and determine a set of strings comprising a name of the column names and a database language operator based on the vectors. Some embodiments may determine values based on the determined database language operator and the name, using a transformer neural network model. Some embodiments may generate a query based on the set of strings and the values.
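One common way to obtain attention scores over column vectors is a softmax over dot products with the question vector. The sketch below assumes that form and uses hand-picked vectors in place of encoder outputs; the abstract does not state which attention mechanism is used.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_scores(question_vector, column_vectors):
    """Softmax of the dot product between the question and each column vector."""
    dots = [sum(q * c for q, c in zip(question_vector, column))
            for column in column_vectors]
    return softmax(dots)

# Hypothetical vectors standing in for encoder outputs.
question_vector = [1.0, 0.0]
column_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

scores = attention_scores(question_vector, column_vectors)
best_column = scores.index(max(scores))
```

The scores sum to one, so they can be read as a distribution over candidate columns for the query.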
Global-to-local memory pointer networks for task-oriented dialogue
A system and corresponding method are provided for generating responses for a dialogue between a user and a computer. The system includes a memory storing information for a dialogue history and a knowledge base. An encoder may receive a new utterance from the user and generate a global memory pointer used for filtering the knowledge base information in the memory. A decoder may generate at least one local memory pointer and a sketch response for the new utterance. The sketch response includes at least one sketch tag to be replaced by knowledge base information from the memory. The system generates the dialogue computer response using the local memory pointer to select a word from the filtered knowledge base information to replace the at least one sketch tag in the sketch response.
Centralized machine learning predictor for a remote network management platform
A remote network management platform is provided that includes an end-user computational instance dedicated to a managed network, a training computational instance, and a prediction computational instance. The training instance is configured to receive a corpus of textual records from the end-user instance and to determine therefrom a machine learning (ML) model to determine the numerical similarity between input textual records and textual records in the corpus of textual records. The prediction instance is configured to receive the ML model and an additional textual record from the end-user instance, to use the ML model to determine respective numerical similarities between the additional textual record and the textual records in the corpus of textual records, and to transmit, based on the respective numerical similarities, representations of one or more of the textual records in the corpus of textual records to the end-user computational instance.
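To make "numerical similarity between textual records" concrete, the sketch below ranks corpus records by cosine similarity of word counts. This is a simple stand-in chosen for illustration, not the ML model the abstract refers to, and the sample records are invented.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar(record, corpus, k=1):
    """Return the k corpus records most similar to the additional record."""
    query = Counter(record.lower().split())
    scored = [(cosine(query, Counter(r.lower().split())), r) for r in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored[:k]]

# Hypothetical corpus of textual records and an additional record.
corpus = ["vpn connection fails", "printer out of toner", "email not syncing"]
matches = most_similar("vpn fails to connect", corpus, k=1)
```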
Providing approximate top-k nearest neighbours using an inverted list
Various embodiments are provided for implementing an approximation nearest neighbour (ANN) search in a computing environment. An approximation nearest neighbour (ANN) of a plurality of feature vectors in hyper-planes with dynamically variable subspaces may be retrieved by searching an inverted index.
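A generic inverted-list search can be sketched as follows: each vector is filed under its nearest coarse centroid, and a query scans only the lists of the closest cells. The fixed centroids, the `nprobe` parameter, and the sample data are assumptions for illustration; the hyper-plane and variable-subspace details of the abstract are not modelled.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_inverted_lists(vectors, centroids):
    """Map each centroid index to the ids of the vectors closest to it."""
    lists = {i: [] for i in range(len(centroids))}
    for idx, vector in enumerate(vectors):
        cell = min(range(len(centroids)), key=lambda c: sq_dist(vector, centroids[c]))
        lists[cell].append(idx)
    return lists

def ann_search(query, vectors, centroids, lists, k=2, nprobe=1):
    """Approximate top-k: scan only the nprobe cells nearest to the query."""
    cells = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [i for c in cells for i in lists[c]]
    candidates.sort(key=lambda i: sq_dist(query, vectors[i]))
    return candidates[:k]

# Hypothetical feature vectors and coarse centroids.
vectors = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]]
centroids = [[0.0, 0.0], [5.0, 5.0]]

lists = build_inverted_lists(vectors, centroids)
top_k = ann_search([0.0, 0.05], vectors, centroids, lists, k=2, nprobe=1)
```

The result is approximate because vectors in unprobed cells are never compared; raising `nprobe` trades speed for recall.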
Automated metadata asset creation using machine learning models
Systems and methods are described that employ machine learning models to optimize database management. Machine learning models may be utilized to decide whether a new database record needs to be created (e.g., to avoid duplicates) and to decide what record to create. For example, candidate database records potentially matching a received database record may be identified in a local database, and a respective probability of each candidate database record matching the received record is output by a match machine learning model. A list of statistical scores is generated based on the respective probabilities and is input to an in-database machine learning model to calculate the probability that the received database record already exists in the local database.
Hybrid in-domain and out-of-domain document processing for non-vocabulary tokens of electronic documents
Techniques are described herein for training and evaluating machine learning (ML) models for document processing computing applications based on in-domain and out-of-domain characteristics. In some embodiments, an ML system is configured to form feature vectors by mapping unknown tokens to known tokens within a domain based, at least in part, on out-of-domain characteristics. In other embodiments, the ML system is configured to map the unknown tokens to an aggregate vector representation based on the out-of-domain characteristics. The ML system may use the feature vectors to train ML models and/or estimate unknown labels for the new documents.
Precomputed similarity index of files in data protection systems with neural network
Described is a system and method that provides a data protection risk assessment for the overall functioning of a backup and recovery system. Accordingly, the system may provide a single overall risk assessment score that provides an operator with an “at-a-glance” overview of the entire system. Moreover, the system may account for changes that occur over time based on leveraging statistical methods to automatically generate assessment scores for various components (e.g. application, server, network, load, etc.). In order to determine a risk assessment score, the system may utilize a predictive model based on historical data. Accordingly, residual values for newly observed data may be determined using the predictive model and the system may identify potentially anomalous or high risk indicators.