Patent classifications
G06F16/374
COMPUTER-IMPLEMENTED PRESENTATION OF SYNONYMS BASED ON SYNTACTIC DEPENDENCY
In an embodiment, the disclosed technologies are capable of identifying a target word within a text sequence, displaying a subset of candidate synonyms for the target word, determining a synonym selected from the subset of candidate synonyms, and replacing the target word with the selected synonym, where the subset of candidate synonyms has been created using syntactic dependency data for the target word.
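The abstract does not say how the dependency data narrows the candidate list. One possible reading, shown here as a minimal sketch, keeps only the synonyms that have been observed in the same dependency role as the target word. The function names, the `attested_dependencies` table, and the role labels are all hypothetical.

```python
def candidate_subset(candidates, target_dependency, attested_dependencies):
    """Keep candidates attested in the target word's dependency role (assumed filter)."""
    return [
        synonym
        for synonym in candidates
        if target_dependency in attested_dependencies.get(synonym, set())
    ]


def replace_target(tokens, target_index, synonym):
    """Return a new token list with the target word swapped for the chosen synonym."""
    return tokens[:target_index] + [synonym] + tokens[target_index + 1:]


# Hypothetical data: "runs" is used transitively in this sentence.
tokens = ["She", "runs", "the", "company"]
attested = {
    "sprints": {"intransitive_root"},
    "manages": {"transitive_root"},
    "operates": {"transitive_root"},
}
subset = candidate_subset(["sprints", "manages", "operates"], "transitive_root", attested)
rewritten = replace_target(tokens, 1, subset[0])
```

In a real system the role label would come from a dependency parser rather than a hand-written table.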
SYSTEM AND METHOD FOR QUERYING A DATA REPOSITORY
The present disclosure relates to methods and systems for querying data in a data repository. According to a first aspect, this disclosure describes a method of querying a database, comprising: receiving, at a computing device, a plurality of keywords; determining, by the computing device, a plurality of datasets relating to the keywords; identifying, by the computing device, metadata for the plurality of datasets indicating a relationship between the datasets by examining an ontology associated with the datasets; providing, by the computing device, one or more suggested database queries in natural language form, the one or more suggested database queries constructed based on the plurality of keywords and the metadata; receiving, by the computing device, a selection of the one or more suggested database queries; and constructing, by the computing device, an object view for the plurality of datasets based on the selected query and the metadata.
METHOD FOR GENERATING QUESTION ANSWERING ROBOT AND COMPUTER DEVICE
The present disclosure discloses a method for generating a question answering robot and relates to the field of robotics. The specific implementation includes: obtaining field information input by a user; obtaining a field-specific robot from a robot library based on the field information; obtaining a template list corresponding to the field-specific robot; providing the template list to the user, the template list including a plurality of templates; receiving the plurality of templates filled in by the user, the templates filled in by the user including at least one question and an answer corresponding to the at least one question; expanding the at least one question filled in by the user based on a question semantic database to form a combination of questions corresponding to the answer, the answer and the combination of questions forming a question-answer pair; and generating a question answering robot based on the question-answer pair.
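The question-expansion step can be sketched as follows. This is only an illustration: the "question semantic database" is modelled as a hypothetical dictionary of interchangeable phrases, and the "robot" is reduced to an exact-match lookup over the expanded questions. Neither choice is stated in the abstract.

```python
def expand_question(question, paraphrases):
    """Generate question variants by substituting phrases from a paraphrase table."""
    variants = {question}
    for phrase, alternatives in paraphrases.items():
        # Iterate over a snapshot so new variants can be added safely.
        for variant in list(variants):
            if phrase in variant:
                for alternative in alternatives:
                    variants.add(variant.replace(phrase, alternative))
    return sorted(variants)


def build_robot(filled_templates, paraphrases):
    """Map every expanded question to its answer and return a lookup function."""
    lookup = {}
    for question, answer in filled_templates:
        for variant in expand_question(question, paraphrases):
            lookup[variant.lower()] = answer
    return lambda query: lookup.get(query.lower().strip())


# Hypothetical paraphrase table and one filled-in template.
paraphrases = {"opening hours": ["business hours"], "What are": ["Tell me"]}
robot = build_robot([("What are your opening hours?", "9am to 5pm")], paraphrases)
```

Here `robot("tell me your business hours?")` returns the stored answer, while a question outside the expanded set returns `None`.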
System and method for context-based abbreviation disambiguation using machine learning on synonyms of abbreviation expansions
Disclosed is a system for context-based abbreviation disambiguation, the system comprising: an ontological databank represented in a multi-dimensional space, a synonym databank, a glossary databank, and a server arrangement. The server arrangement is configured to obtain a text comprising abbreviations and concept phrases, extract a target abbreviation from the abbreviations, obtain potential expansions for the target abbreviation, calculate a synonym match score for the potential expansions using synonyms of the potential expansions and the concept phrases, calculate a concept match score using concepts relating to the potential expansions and the concept phrases, calculate a context match score for the potential expansions using a comparison module, and determine one of the potential expansions as a valid expansion of the target abbreviation based on at least one of: the synonym match score, the concept match score, and the context match score.
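The final selection step might look like the sketch below. The abstract does not define how the three scores are computed or combined, so this example makes two assumptions: each score is a Jaccard overlap with the concept phrases found in the text, and the scores are simply summed. The function names and the sample data are hypothetical.

```python
def jaccard(first, second):
    """Jaccard overlap of two term collections; 0.0 when both are empty."""
    first, second = set(first), set(second)
    union = first | second
    return len(first & second) / len(union) if union else 0.0


def pick_expansion(concept_phrases, candidates):
    """Score each potential expansion and return the best one with all scores."""
    totals = {}
    for expansion, info in candidates.items():
        synonym_score = jaccard(info["synonyms"], concept_phrases)
        concept_score = jaccard(info["concepts"], concept_phrases)
        context_score = jaccard(info["context"], concept_phrases)
        # Assumed combination rule: an unweighted sum of the three scores.
        totals[expansion] = synonym_score + concept_score + context_score
    return max(totals, key=totals.get), totals


# Hypothetical candidates for the abbreviation "RA".
candidates = {
    "right atrium": {
        "synonyms": ["cardiac", "chamber"],
        "concepts": ["heart"],
        "context": ["patient", "heart"],
    },
    "rheumatoid arthritis": {
        "synonyms": ["joint"],
        "concepts": ["inflammation"],
        "context": ["patient", "joint"],
    },
}
best, totals = pick_expansion(["heart", "patient", "cardiac"], candidates)
```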
APPARATUS AND METHOD FOR AUTOMATED AND ASSISTED PATENT CLAIM MAPPING AND EXPENSE PLANNING
An apparatus and computer implemented method that include obtaining, into a computer, text of a patent, automatically finding and extracting, using the computer, a set of claim text from the patent text, identifying, using the computer, text of independent claims from the set of claim text, displaying in a first row on a computer monitor the text of the independent claims, automatically determining a plurality of preliminary scope-concept phrases from the text of the independent claims, displaying in a second row on the computer monitor the text of the plurality of preliminary scope-concept phrases, eliciting and receiving user input to specify a first one of the plurality of preliminary scope-concept phrases, and highlighting each occurrence of the specified first one of the plurality of preliminary scope-concept phrases in a plurality of the independent claims displayed in the first row. A scope concept builder tool is also provided.
Stream based named entity recognition
Embodiments of the present invention relate to performing entity recognition on a stream while providing ongoing training or supplementation of an entity dictionary. In one embodiment, a method of and computer program product for stream based named entity recognition is provided. A first portion of a textual input is received. A plurality of patterns is applied to the first portion to determine that a predetermined type is present in the first portion. Approval is requested of the presence of the predetermined type. An indication of approval or disapproval of the predetermined type is received. A dictionary is supplemented according to the indication. A second portion of the textual input is received. The plurality of patterns is applied to the second portion.
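The loop described above can be sketched in a few lines. The single regular expression, the `approve` callback standing in for the approval request, and the dictionary lookup performed on later portions are all assumptions made for illustration; the abstract only states that patterns are applied and that the dictionary is supplemented according to the indication.

```python
import re

# Hypothetical pattern set: a title followed by a capitalized surname.
PATTERNS = {"PERSON": re.compile(r"\b(?:Mr|Ms|Dr)\. ([A-Z][a-z]+)")}


def process_portion(portion, dictionary, approve):
    """Recognize entities in one portion of the stream and update the dictionary."""
    recognized = []
    # Entities approved in earlier portions are recognized by dictionary lookup.
    for entity_type, names in dictionary.items():
        for name in sorted(names):
            if re.search(r"\b" + re.escape(name) + r"\b", portion):
                recognized.append((name, entity_type))
    # New candidates come from the patterns and need approval before being stored.
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(portion):
            candidate = match.group(1)
            if (candidate, entity_type) in recognized:
                continue
            if approve(candidate, entity_type):
                dictionary.setdefault(entity_type, set()).add(candidate)
                recognized.append((candidate, entity_type))
    return recognized


dictionary = {}
approve = lambda candidate, entity_type: candidate != "Who"  # stand-in for a reviewer
first = process_portion("Dr. Smith met Dr. Who.", dictionary, approve)
second = process_portion("Smith arrived late.", dictionary, approve)
```

After the first portion the dictionary contains "Smith", so the second portion recognizes the bare surname even though no pattern matches it.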
METHOD AND SYSTEM FOR DETECTING A PATTERN IN COMMON IN A SET OF TEXT FILES
A method of detecting a pattern in common in two text files, each comprising an ordered sequence of words, is disclosed. The method includes generating groups of words having the same syntactic function, comprising at least one word from each text file such that each word in a group is synonymous with another word in the same group, associating each word in a text file belonging to a group of words, with a tag representative of the group, generating, for each text file, at least one dense set of words satisfying a condition of internal proximity in the text file, determining at least one pattern in common in the two text files, a pattern in common including one or more sets of words sharing the same tag and comprising at least one word from a dense set of words in each text file.
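A much simplified sketch of the tagging, dense-set, and matching steps follows. It assumes the synonym groups are already available as a hypothetical `word_to_tag` mapping, treats "internal proximity" as a maximum gap between consecutive tagged words, and reports a pattern when two dense sets share at least `min_tags` tags. None of these thresholds come from the abstract.

```python
def tag_words(words, word_to_tag):
    """Pair each grouped word's position with the tag of its synonym group."""
    return [(i, word_to_tag[w]) for i, w in enumerate(words) if w in word_to_tag]


def dense_sets(tagged, max_gap=2):
    """Split tagged words into runs whose consecutive positions are close together."""
    sets, current = [], []
    for position, tag in tagged:
        if current and position - current[-1][0] > max_gap:
            sets.append(current)
            current = []
        current.append((position, tag))
    if current:
        sets.append(current)
    return sets


def common_patterns(words_a, words_b, word_to_tag, max_gap=2, min_tags=2):
    """Return the tag sets shared by a dense set in each of the two files."""
    patterns = []
    for set_a in dense_sets(tag_words(words_a, word_to_tag), max_gap):
        tags_a = {tag for _, tag in set_a}
        for set_b in dense_sets(tag_words(words_b, word_to_tag), max_gap):
            shared = tags_a & {tag for _, tag in set_b}
            if len(shared) >= min_tags:
                patterns.append(shared)
    return patterns


# Hypothetical synonym groups G1 (verbs) and G2 (nouns).
word_to_tag = {"buy": "G1", "purchase": "G1", "car": "G2", "automobile": "G2"}
patterns = common_patterns(
    "i want to buy a car today".split(),
    "she will purchase an automobile".split(),
    word_to_tag,
)
```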
Data analytics systems and methods
Data analytics systems and methods are disclosed herein. A parser can parse reference data from various data sources to store in a data structure. An uploader can receive study data designated by a researcher and store the study data in the data structure. A matcher can compare analyte nameset data in the study data with analyte nameset data from the reference data to generate one or more links each correlating an instance of an analyte in the study data with an instance of that analyte in the reference data. Library overlays each include one or more modules to access reference data to generate organized associations of reference data. A calculation engine can receive a selection of one or more library overlay(s) and manipulate the reference data and study data according to the organized associations of the selected library overlay(s) to generate configured data stored in a collection of data caches for presentation to a researcher via a user interface.
SIMILARITY CALCULATION APPARATUS, RECORDING MEDIUM, AND SIMILARITY CALCULATION METHOD
A similarity calculation apparatus according to the present invention includes: a name acquisition unit configured to acquire a first group name to which each word belonging to a first synonym group belongs and a second group name to which each word belonging to a second synonym group belongs; a name set generation unit configured to generate a first group name set and a second group name set; and a similarity calculation unit configured to calculate similarity between the first group name set and the second group name set. Therefore, even when a plurality of synonym groups are created, terms can be effectively unified.
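The abstract does not name the similarity measure. As an illustration only, the sketch below builds the two group name sets and compares them with the Jaccard coefficient, which is an assumed choice. The `word_to_group_names` table and its contents are hypothetical.

```python
def group_name_set(synonym_group, word_to_group_names):
    """Collect the group names to which the words of one synonym group belong."""
    names = set()
    for word in synonym_group:
        names.update(word_to_group_names.get(word, ()))
    return names


def jaccard_similarity(first, second):
    """Jaccard coefficient of two sets; 0.0 when both are empty."""
    union = first | second
    return len(first & second) / len(union) if union else 0.0


# Hypothetical lookup from word to the names of groups that contain it.
word_to_group_names = {
    "car": ["vehicle"],
    "automobile": ["vehicle", "machine"],
    "truck": ["vehicle"],
    "lorry": ["vehicle", "freight"],
}
first_names = group_name_set(["car", "automobile"], word_to_group_names)
second_names = group_name_set(["truck", "lorry"], word_to_group_names)
similarity = jaccard_similarity(first_names, second_names)
```

A high similarity between two synonym groups would suggest that they are candidates for being merged under one unified term.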
REPRESENTATION OF A DATA ANALYSIS USING A FLOW GRAPH
Techniques facilitating using flow graphs to represent a data analysis program in a cloud based system for open science collaboration and discovery are provided. In an example, a system can represent a data analysis execution as a flow graph where vertices of the flow graph represent function calls made during the data analysis program and edges between the vertices represent objects passed between the functions. In another example, the flow graph can then be annotated using an annotation database to label the recognized function calls and objects. In another example, the system can then semantically label the annotated flow graph by aligning the annotated graph with a knowledge base of data analysis concepts to provide context for the operations being performed by the data analysis program.
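A minimal data structure for such a flow graph might look like this. The class, its method names, and the annotation labels are hypothetical, and the annotation database is modelled as a plain dictionary from function name to label. The semantic alignment with a knowledge base is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class FlowGraph:
    calls: dict = field(default_factory=dict)   # vertex id -> function name
    edges: list = field(default_factory=list)   # (source id, target id, object name)
    labels: dict = field(default_factory=dict)  # vertex id -> annotation label

    def add_call(self, vertex_id, function_name):
        """Record a function call as a vertex."""
        self.calls[vertex_id] = function_name

    def pass_object(self, source, target, object_name):
        """Record an object flowing from one call to another as an edge."""
        self.edges.append((source, target, object_name))

    def annotate(self, annotation_db):
        """Label every vertex whose function name is known to the annotation database."""
        for vertex_id, function_name in self.calls.items():
            if function_name in annotation_db:
                self.labels[vertex_id] = annotation_db[function_name]


# Hypothetical analysis: load a table, then fit a model on it.
graph = FlowGraph()
graph.add_call(1, "read_csv")
graph.add_call(2, "fit")
graph.add_call(3, "my_helper")
graph.pass_object(1, 2, "data_frame")
graph.annotate({"read_csv": "read-tabular-data", "fit": "fit-model"})
```

Calls that the annotation database does not recognize, such as `my_helper` here, simply remain unlabeled.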