Patent classifications
G06F16/3346
Question inference device
A question inference device comprises: an input unit inputting an inquiry from a user; a storage unit storing a plurality of questions prepared in advance and one or more keywords for identifying each question in association with each other; a choice unit referring to the storage unit and choosing a question associated with a keyword contained in the inquiry inputted by the input unit from among the plurality of questions; a computation unit computing the likelihood of each of the plurality of questions for the inquiry inputted by the input unit; an inference unit inferring, based on a choice result of the choice unit and a computation result of the computation unit, the question, from among the plurality of questions, whose substance the user intends; and an output unit outputting information based on an inference result of the inference unit.
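The interplay of the choice unit and the computation unit can be illustrated with a minimal sketch. The stored questions, their keywords, the token-overlap (Jaccard) likelihood, and the additive `bonus` for keyword-chosen questions are all assumptions made for illustration; the abstract does not specify how the likelihood is computed or how the two results are combined.

```python
import re

# Hypothetical stored questions, each associated with identifying keywords.
QUESTIONS = {
    "How do I reset my password?": {"password", "reset"},
    "How do I change my email address?": {"email", "address"},
    "How do I cancel my subscription?": {"cancel", "subscription"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def choose_by_keyword(inquiry):
    """Choice unit: questions that have at least one keyword in the inquiry."""
    tokens = tokenize(inquiry)
    return {q for q, keywords in QUESTIONS.items() if keywords & tokens}


def likelihood(inquiry, question):
    """Computation unit stand-in: Jaccard overlap between token sets."""
    a, b = tokenize(inquiry), tokenize(question)
    return len(a & b) / len(a | b) if a | b else 0.0


def infer_question(inquiry, bonus=0.5):
    """Inference unit: combine keyword choice and likelihood (assumed rule)."""
    chosen = choose_by_keyword(inquiry)
    scores = {
        q: likelihood(inquiry, q) + (bonus if q in chosen else 0.0)
        for q in QUESTIONS
    }
    return max(scores, key=scores.get)


print(infer_question("I forgot my password"))
```

Adding a fixed bonus is only one possible combination rule; a real device might instead restrict the likelihood ranking to the keyword-chosen questions.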
SEMANTICS-AWARE HYBRID ENCODER FOR IMPROVED RELATED CONVERSATIONS
A method of finding relevant conversing posts online comprises: receiving, by a web server serving an online forum, a query post from an inquirer using the online forum; computing a contextual similarity score between each conversing post of a set of conversing posts and the query post, wherein the contextual similarity score is computed between the body of each of the conversing posts and the body of the query post, wherein the N1 conversing posts with the highest contextual similarity scores are selected; computing a fine-grained similarity score between the subject of the query post and the subject of each of the N1 conversing posts, wherein the N2 conversing posts with the highest fine-grained similarity scores are selected; and boosting the fine-grained similarity score of the N2 conversing posts based on relevance metrics, wherein the N3 highest-ranked conversing posts are selected as a list of conversing posts most relevant to the query post.
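The three-stage narrowing (N1 by body, N2 by subject, N3 after boosting) can be sketched as follows. This is only an illustration: Jaccard token overlap stands in for both the contextual and the fine-grained similarity, which the abstract attributes to a hybrid encoder, and the `votes` field and `boost` weight are hypothetical relevance metrics.

```python
import re
from dataclasses import dataclass


@dataclass
class Post:
    subject: str
    body: str
    votes: int = 0  # hypothetical relevance metric used for boosting


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for the encoder-based scores."""
    ta = set(re.findall(r"[a-z0-9]+", a.lower()))
    tb = set(re.findall(r"[a-z0-9]+", b.lower()))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def top(items, key, n):
    """Return the n items with the highest key value."""
    return sorted(items, key=key, reverse=True)[:n]


def related_posts(query, posts, n1=3, n2=2, n3=1, boost=0.01):
    # Stage 1: body-to-body similarity, keep N1 posts.
    stage1 = top(posts, lambda p: jaccard(query.body, p.body), n1)
    # Stage 2: subject-to-subject similarity, keep N2 posts.
    stage2 = top(stage1, lambda p: jaccard(query.subject, p.subject), n2)
    # Stage 3: boost the subject score by the relevance metric, keep N3 posts.
    return top(
        stage2,
        lambda p: jaccard(query.subject, p.subject) + boost * p.votes,
        n3,
    )


query = Post("printer offline after update",
             "my printer shows offline after the driver update")
posts = [
    Post("printer offline", "printer shows offline after update", votes=5),
    Post("printer offline after update",
         "printer offline after the update of driver", votes=0),
    Post("wifi keeps dropping",
         "router drops the wifi connection every hour", votes=50),
    Post("printer paper jam", "my printer has a paper jam", votes=1),
]
print([p.subject for p in related_posts(query, posts)])
```

With a small `boost` the exact subject match wins; raising `boost` lets the more heavily voted post overtake it, which shows how the final stage can reorder the N2 candidates.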
Data retrieving apparatus, method, and program
A data search apparatus according to an embodiment includes: an input unit; and a storage apparatus configured to store master data names managed with master data. The data search apparatus calculates edit distances between master data names stored in the storage apparatus and input data names input in the input unit, calculates degrees of similarity between the master data names and the input data names based on term frequency and inverse document frequency of the master data names and the input data names, performs processing for narrowing down candidates for the data name being searched for in the master data names based on the calculation results and adjacency information indicating adjacency relationships between the master data names and the input data names, and outputs information indicating correspondence between the master data names and the input data names based on the candidate for the data name being searched for, the candidate for the data name being obtained through the narrowing-down processing.
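The two measures named in the abstract, edit distance and a TF-IDF based degree of similarity, can be combined in a short sketch. Several details here are assumptions: data names are treated as underscore-separated terms, the IDF uses a smoothed logarithm, TF-IDF cosine similarity narrows the candidates before edit distance picks one, and the adjacency information mentioned in the abstract is left out entirely.

```python
import math
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def tfidf_vectors(names):
    """TF-IDF vectors over underscore-separated terms (assumed tokenization)."""
    docs = [Counter(name.lower().split("_")) for name in names]
    df = Counter(term for doc in docs for term in doc)
    n = len(docs)
    return [
        {t: tf * (math.log((1 + n) / (1 + df[t])) + 1) for t, tf in doc.items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def best_match(input_name, master_names, max_candidates=2):
    """Narrow candidates by TF-IDF similarity, then pick by edit distance."""
    vectors = tfidf_vectors(master_names + [input_name])
    query = vectors[-1]
    ranked = sorted(zip(master_names, vectors),
                    key=lambda pair: cosine(query, pair[1]), reverse=True)
    candidates = [name for name, _ in ranked[:max_candidates]]
    return min(candidates,
               key=lambda m: edit_distance(input_name.lower(), m.lower()))


print(best_match("cust_name", ["customer_name", "customer_id", "order_date"]))
```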
SEARCH SERVICE PROVIDING DEVICE, METHOD, AND COMPUTER PROGRAM
Provided is a search service providing method of providing search results related to a search word performed by a search service providing device, the search service providing method comprising: receiving, by the search service providing device, an initial search word; determining, by the search service providing device, one or more additional search words based on the initial search word; ranking, by the search service providing device, the one or more additional search words; selecting, by the search service providing device, at least one related search word from among the one or more additional search words based on the ranking; and providing, by the search service providing device, additional search results corresponding to the at least one related search word, the initial search word, an initial search result corresponding to the initial search word, and the one or more additional search words.
Efficient search for combinations of matching entities given constraints
Methods, systems, and computer-readable storage media for receiving a set of inference results generated by a ML model, the inference results including a set of query entities and a set of target entities, each query entity having one or more target entities matched thereto by the ML model, processing the set of inference results to generate a set of matched sub-sets of target entities by executing a search over target entities in the set of target entities based on constraints, for each problem in a set of problems, providing the problem as a tuple including an index value representative of a target entity in the set of target entities and a value associated with the query entity, the value including a constraint relative to the query entity, and executing at least one task in response to one or more matched sub-sets in the set of matched sub-sets.
DISTRIBUTED MODEL OPTIMIZER FOR CONTENT CONSUMPTION
A distributed model generation system includes a master node that estimates parameter sets for a topic classification (TC) model. The estimated parameter sets are loaded into a queue. Multiple training nodes download the estimated parameter sets from the queue for training associated TC models. The training nodes generate model performance values for the trained TC models and send the model performance values back to the master node. The master node uses the model performance values and the associated parameter sets to estimate additional TC model parameter sets. The master node estimates new parameter sets until a desired model performance value is obtained. The master node may use a Bayesian optimization to more efficiently estimate the parameter sets and may distribute the high processing demands of model training and testing operations to the training nodes.
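The master/queue/training-node loop can be sketched with threads standing in for the distributed nodes. Everything concrete here is an assumption: `train_and_score` is a toy objective in place of real topic classification training, the single `learning_rate` parameter is hypothetical, and a shrinking random search around the best result replaces the Bayesian optimization that the abstract mentions as an option.

```python
import queue
import random
import threading


def train_and_score(params):
    """Toy stand-in for training a TC model; performance peaks at 0.1."""
    return 1.0 - abs(params["learning_rate"] - 0.1)


def training_node(param_queue, result_queue):
    """Pull parameter sets from the queue and report performance values."""
    while True:
        params = param_queue.get()
        if params is None:  # sentinel: no more work
            break
        result_queue.put((train_and_score(params), params))


def master(target=0.99, batch=4, max_rounds=50, workers=2, seed=0):
    rng = random.Random(seed)
    param_queue, result_queue = queue.Queue(), queue.Queue()
    nodes = [threading.Thread(target=training_node,
                              args=(param_queue, result_queue))
             for _ in range(workers)]
    for node in nodes:
        node.start()

    best_score, best_params = float("-inf"), {"learning_rate": 0.5}
    spread = 0.5
    for _ in range(max_rounds):
        # Estimate new parameter sets near the best one seen so far.
        for _ in range(batch):
            lr = best_params["learning_rate"] + rng.uniform(-spread, spread)
            param_queue.put({"learning_rate": min(1.0, max(0.0, lr))})
        # Collect the performance values sent back by the training nodes.
        for _ in range(batch):
            score, params = result_queue.get()
            if score > best_score:
                best_score, best_params = score, params
        if best_score >= target:
            break
        spread *= 0.8  # narrow the search as rounds progress

    for _ in nodes:
        param_queue.put(None)
    for node in nodes:
        node.join()
    return best_score, best_params


score, params = master()
print(score, params)
```

The loop stops either when the desired performance value is reached or after `max_rounds`, so the returned score is the best found and is not guaranteed to meet the target.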
Predicting object identity using an ensemble of predictors
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting object identity using an ensemble of predictors. In one aspect, a method includes selecting candidate objects that likely match a received object that is to be identified, from a database of objects, and providing attributes of the received object compared with those of the candidates to an ensemble of predictors having respective properties. Based on previous training, each predictor can predict a most likely candidate. From among the most likely candidates, a previously trained support vector machine can select a potential match candidate. If a score that the support vector machine associates with the potential match candidate, which represents the likelihood that the potential match candidate matches the received object, satisfies a threshold, then the potential match candidate can be determined to be the received object.
Sentiment analysis for aspect terms extracted from documents having unstructured text data
An apparatus comprises at least one processing device configured to receive a query to perform sentiment analysis for a document, to generate, utilizing a first machine learning model, a first set of encodings classifying words of the document as being aspect or non-aspect terms, to generate, utilizing a second machine learning model, a second set of encodings classifying sentiment of the words of the document, and to determine, for a given aspect term, attention weights for a given subset of the words of the document surrounding the given aspect term. The processing device is also configured to generate, utilizing a third machine learning model, a given sentiment classification of the given aspect term based on the attention weights and a given portion of the second set of encodings for the given subset of the words, and to provide a response to the query comprising the given sentiment classification.
Context-sensitive feature score generation
Document information may define words, key groups of words, and sets of context words within a document. Word feature scores for words within the document may be generated. Key group feature scores for individual key groups of words may be generated based on aggregation of word feature scores for the words within the individual key groups of words and word feature scores for words within corresponding sets of context words. A document feature score for the document may be generated based on aggregation of word feature scores for words within the document. The key group feature scores and the document feature score may enable context-sensitive searching of words/word vectors in the document.
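The two aggregations can be written out in a few lines. The abstract does not say which aggregation is used, so the arithmetic mean below is an assumption, and the word feature scores are made-up illustrative values.

```python
from statistics import mean

# Hypothetical word feature scores for the words of one document.
WORD_SCORES = {"the": 0.0, "bank": 0.4, "raised": 0.2,
               "interest": 0.8, "rate": 0.6}


def key_group_score(group, context, word_scores):
    """Aggregate scores of the key-group words and their context words."""
    return mean(word_scores[w] for w in list(group) + list(context))


def document_score(words, word_scores):
    """Aggregate scores of every word occurrence in the document."""
    return mean(word_scores[w] for w in words)


document = ["the", "bank", "raised", "the", "interest", "rate"]
print(key_group_score(["interest", "rate"], ["bank", "raised"], WORD_SCORES))
print(document_score(document, WORD_SCORES))
```

Because the context words enter the key group score, the same key group would score differently in a document where it is surrounded by other words, which is what makes the score context-sensitive.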
Data normalization system
A data normalization system that normalizes data that includes user names is disclosed herein. The data normalization system converts, using a metaphone algorithm, a first string into a first metaphone string, and a second string into a second metaphone string. The data normalization system searches, based on the first metaphone string and the second metaphone string, a name index including a listing of metaphone strings representing common names and probabilities that the common names are either a given name or a surname. The data normalization system determines, based on searching the name index, a confidence score indicating a confidence level that the first string represents the given name and the second string represents the surname. The data normalization system determines that the confidence score meets or exceeds a threshold confidence score, and in response, determines that the first string represents the given name and the second string represents the surname.
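The lookup-and-threshold flow can be sketched as below. Note the assumptions: `phonetic_key` is a crude simplification and not the real metaphone algorithm, the name index entries and their probabilities are invented for illustration, and multiplying the two probabilities is just one plausible way to form the confidence score.

```python
def phonetic_key(name):
    """Crude phonetic key (NOT metaphone): keep the first letter, drop
    vowels and H/W/Y afterwards, and collapse repeated letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    key = [letters[0]]
    for c in letters[1:]:
        if c in "AEIOUHWY":
            continue
        if c != key[-1]:
            key.append(c)
    return "".join(key)


# Hypothetical name index: key -> (P(given name), P(surname)).
NAME_INDEX = {
    "JN": (0.95, 0.05),   # John, Jon
    "SMT": (0.02, 0.98),  # Smith, Smyth
}


def split_name(first, second, threshold=0.8):
    """Return the given-name/surname assignment if confidence is high enough."""
    p_given, _ = NAME_INDEX.get(phonetic_key(first), (0.5, 0.5))
    _, p_surname = NAME_INDEX.get(phonetic_key(second), (0.5, 0.5))
    confidence = p_given * p_surname
    if confidence >= threshold:
        return {"given": first, "surname": second, "confidence": confidence}
    return None


print(split_name("Jon", "Smyth"))
print(split_name("Smith", "John"))
```

Spelling variants such as "Jon" and "John" map to the same key, so both hit the same index entry; the reversed order "Smith", "John" yields a low confidence and is rejected.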