G06F16/353

Self-evolving knowledge graph

A computer system updates a knowledge graph. A model corresponding to a set of documents is received, wherein the model comprises a plurality of entities, a plurality of entity associations, and a plurality of confidence scores corresponding to the plurality of entity associations. A relevance value is calculated for each entity of the plurality of entities that are present in the set of documents and for each entity of the plurality of entities that are present in a new document. One or more entity associations that are supported by specific portions of the new document are identified. The confidence scores for each of the identified one or more entity associations are updated based on a level of support in the new document. Embodiments of the present invention further include a method and program product for updating a knowledge graph in substantially the same manner described above.
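The confidence update described above can be sketched as follows. This is a minimal illustration, not the claimed method: the `Association` class, the sentence co-occurrence measure in `support_level`, and the `rate` parameter are all assumptions made for this example, and the relevance calculation is omitted.

```python
from dataclasses import dataclass


@dataclass
class Association:
    source: str
    target: str
    confidence: float  # assumed to lie in [0, 1]


def support_level(assoc, sentences):
    """Fraction of sentences mentioning both entities (hypothetical support measure)."""
    if not sentences:
        return 0.0
    hits = sum(
        1
        for s in sentences
        if assoc.source.lower() in s.lower() and assoc.target.lower() in s.lower()
    )
    return hits / len(sentences)


def update_confidence(assoc, sentences, rate=0.5):
    """Move confidence toward 1.0 in proportion to the support found in a new document.

    Associations with no support in the new document are left unchanged.
    """
    support = support_level(assoc, sentences)
    if support > 0:
        assoc.confidence += rate * support * (1.0 - assoc.confidence)
    return assoc.confidence
```

Scaling the increment by `1.0 - assoc.confidence` keeps the score below 1.0 however many supporting documents arrive, which is one simple way to keep repeated updates bounded.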

SOFTWARE-AIDED CONSISTENT ANALYSIS OF DOCUMENTS

The present technology pertains to a system for automatic analysis and segregation of documents. The system provides a graphical user interface for receiving inputs pertaining to a first document of a plurality of documents in a document analysis project. For example, the graphical user interface may receive a classification input classifying the first document with a first classification. The system automatically analyzes other documents in the plurality of documents to identify a subset of documents that are similar to the first document, and automatically classifies the subset of the documents that are similar to the first document with the first classification. Further, the present technology pertains to conducting a patent analysis project by a team of analysts, including presenting a detailed analysis user interface for reviewing patent-related documents, where the detailed analysis user interface includes text of a first patent-related document to be analyzed and categories and related subcategories.
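A minimal sketch of propagating one analyst classification to similar documents is shown below. The abstract does not say how similarity is measured, so Jaccard overlap of word sets and the `threshold` value are assumptions chosen for illustration, and `propagate_classification` is a hypothetical name.

```python
def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def propagate_classification(docs, seed_id, label, threshold=0.5):
    """Assign `label` to the seed document and to every sufficiently similar document.

    `docs` maps document ids to text. Returns a dict of id -> label
    containing only the documents that received the label.
    """
    seed = tokens(docs[seed_id])
    labels = {seed_id: label}
    for doc_id, text in docs.items():
        if doc_id != seed_id and jaccard(seed, tokens(text)) >= threshold:
            labels[doc_id] = label
    return labels
```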

Server and method for classifying entities of a query

A server, method, and non-transitory computer readable medium for ranking a plurality of data sources are provided. The server includes a network interface, a memory storage unit and a processor. The method involves receiving an input query, identifying entities of the input query using conditional random fields, generating a normalized query and applying a support vector machine to the normalized query. The non-transitory computer readable medium is encoded with programming instructions to direct a processor to carry out the method.

METHOD AND SYSTEM FOR DETECTION OF MISINFORMATION

A system and method for automatically detecting misinformation is disclosed. The misinformation detection system is implemented using a cross-stitch based semi-supervised end-to-end neural attention model which is configured to leverage the large amount of unlabeled data that is available. In one embodiment, the model can at least partially generalize and identify emerging misinformation as it learns from an array of relevant external knowledge. Embodiments of the proposed system rely on heterogeneous information such as a social media post's text content, user details, and activity around the post, as well as external knowledge from the web, to identify whether the content includes misinformation. The results of the model are produced via an attention mechanism.

System and method for update of data and meta data via an enumerator

A data storage system includes storage and a global enumerator. The storage stores data chunks, object level metadata associated with portions of the data chunks, and chunk level metadata associated with respective data chunks. The global enumerator obtains an update request including a metadata characteristic and update data and, in response to obtaining the update request, matches the metadata characteristic to at least one selected from a group consisting of a portion of the object level metadata and a portion of the chunk level metadata to identify an implicated metadata portion, and modifies the implicated metadata portion based on the update data.
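The match-then-modify behavior of the enumerator can be illustrated with a small sketch. It assumes that metadata records are plain dictionaries and that a metadata characteristic is a single key/value pair; both are simplifications for this example, and `apply_update` is a hypothetical name.

```python
def apply_update(object_meta, chunk_meta, characteristic, update_data):
    """Modify every metadata record that matches the characteristic.

    object_meta / chunk_meta: dicts mapping an identifier to a metadata dict.
    characteristic: (key, value) pair a record must contain to be implicated.
    update_data: dict of fields to write into each implicated record.
    Returns the implicated records as (level, identifier) pairs.
    """
    key, value = characteristic
    implicated = []
    for level, store in (("object", object_meta), ("chunk", chunk_meta)):
        for ident, record in store.items():
            if record.get(key) == value:
                record.update(update_data)
                implicated.append((level, ident))
    return implicated
```

Searching both metadata levels in one pass reflects the abstract's point that a single request may implicate object level metadata, chunk level metadata, or both.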

AUTOMATIC INDUSTRY CLASSIFICATION METHOD AND SYSTEM
20220374462 · 2022-11-24

An automatic industry classification method comprises: determining a scope of target patents; defining a target industry tree; generating marks on the target industry tree; performing a rough classification of the target patents by using the marks; and performing a fine classification of the target patents according to a result of the rough classification. The automatic industry classification method and system provided by the present invention use a transductive learning method, so that the information in a small number of annotations is fully exploited. The method and system use IPC information, so that the information dimension is enriched and the amount of calculation needed for the classification is reduced. The method and system further use hierarchical vectors generated from the abstract, the claims and the description, so that word-order information is preserved and the patent text is mined in depth.

Fault Processing Method and System

Various embodiments of the teachings herein include a fault processing method comprising: receiving two historical faults similar to a target fault; searching keywords in a description of the target fault and each historical fault, wherein the keywords are classified into N grades, and for each system component in a grade, the grade comprises at least one keyword for describing the component, wherein N is an integer no less than 2; for each of the N grades, counting a quantity of identical system components represented by the keywords in the text description of each historical fault and the target fault; and comparing a degree of similarity of each historical fault to the target fault according to the quantity of identical system components counted in each grade of the N different grades, wherein a historical fault relating to a larger number of high-grade identical system components has a higher degree of similarity to the target fault.
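The graded comparison above can be sketched as follows. This example assumes that each grade is a dictionary mapping a system component to its keywords, that grades are listed from highest to lowest, that keywords are matched as substrings, and that "higher similarity" is decided by comparing the per-grade counts lexicographically. All function names are hypothetical.

```python
def components_in(text, grade):
    """Components of one grade that have at least one keyword in the text."""
    lowered = text.lower()
    return {comp for comp, keywords in grade.items() if any(k in lowered for k in keywords)}


def similarity_key(target, historical, grades):
    """Counts of components shared with the target, one count per grade, highest grade first."""
    return tuple(
        len(components_in(target, grade) & components_in(historical, grade))
        for grade in grades
    )


def rank_faults(target, historical_faults, grades):
    """Order historical faults from most to least similar to the target fault.

    Tuples compare element by element, so a match on a higher-grade
    component outweighs any number of lower-grade matches.
    """
    return sorted(
        historical_faults,
        key=lambda fault: similarity_key(target, fault, grades),
        reverse=True,
    )
```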

Document elimination for compact and secure storage and management thereof

Documents, such as those that may or will be the subject of a litigation, may be managed by automatically determining that a document, such as an email or other communication, is privileged or producible such that superfluous documents may be removed to improve data storage and reduce the burden on storage, processing, and communication resources. Additionally, documents such as emails may comprise attached or embedded documents (e.g., attachments) which may be classified similarly to, or independently of, their associated email. After determining privilege, such as via metadata associated with a sender/receiver of an email, similarly categorized documents may be grouped for presentation and/or storage. The documents may be indexed, such as by entries within a production log, to further facilitate accurate production and management of non-privileged documents, as well as the exclusion of privileged documents. Documents not required for production may be indexed and/or purged from storage.
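A minimal sketch of metadata-based grouping is given below. It assumes, purely for illustration, that an email is treated as privileged whenever its sender or any recipient appears in a known set of attorney addresses; real privilege determination is considerably more involved. The field names and `build_production_log` are hypothetical.

```python
def build_production_log(emails, privileged_addresses):
    """Group email ids into 'privileged' and 'producible' using sender/recipient metadata.

    emails: list of dicts with 'id', 'sender', and 'recipients' keys.
    privileged_addresses: set of addresses whose correspondence is treated
    as privileged (an assumption made for this sketch).
    """
    log = {"privileged": [], "producible": []}
    for email in emails:
        parties = {email["sender"], *email["recipients"]}
        category = "privileged" if parties & privileged_addresses else "producible"
        log[category].append(email["id"])
    return log
```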

System and method for generating subjective wellbeing analytics score

A system includes at least one processor to perform natural language processing on text from at least one document and assign the at least one document to at least one subjective wellbeing dimension by comparing the text from the at least one document with a subjective wellbeing dimension filter for each subjective wellbeing dimension, insert the at least one document into at least one bin, each bin associated with a particular subjective wellbeing dimension, and analyze each document in each bin associated with the particular subjective wellbeing dimension to determine a score for each subjective wellbeing dimension and an overall score that is based on each score for each subjective wellbeing dimension.
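The filter, bin, and score pipeline can be sketched as follows. The abstract does not specify the filters or the scoring function, so the keyword-set filters, the tiny `POSITIVE` and `NEGATIVE` word lists, and the use of simple averages are all assumptions for this example.

```python
# Hypothetical sentiment word lists, used only for this sketch.
POSITIVE = {"good", "happy", "safe"}
NEGATIVE = {"bad", "worried", "unsafe"}


def doc_score(text):
    """Positive word count minus negative word count."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def wellbeing_scores(documents, filters):
    """Bin documents by dimension, then score each dimension and the whole set.

    filters: dict mapping a dimension name to a set of filter terms.
    A document enters every bin whose filter shares a word with it.
    Returns (bins, per-dimension scores, overall score).
    """
    bins = {dim: [] for dim in filters}
    for text in documents:
        words = set(text.lower().split())
        for dim, terms in filters.items():
            if words & terms:
                bins[dim].append(text)
    scores = {
        dim: (sum(doc_score(d) for d in docs) / len(docs) if docs else 0.0)
        for dim, docs in bins.items()
    }
    overall = sum(scores.values()) / len(scores) if scores else 0.0
    return bins, scores, overall
```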

Natural language processing and machine learning assisted cataloging and recommendation engine

Systems and methods for determining a solution for a real-time message are provided. Multiple messages of different types are received from multiple platforms. The messages were generated in response to errors caused by applications monitored by the platforms. For each message, a language processing system determines the content of the message and a machine learning system determines a classification of the message. A set of message candidates is generated by comparing the classification and the content of the message to historical messages. From the set of message candidates, solution messages are identified. A recommended solution is determined from the solution messages.
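The candidate and recommendation steps can be sketched as follows, assuming the classification and content of the incoming message are already available. Comparing content by word overlap, the `min_overlap` threshold, and choosing the most frequent solution are assumptions for this example; the message dictionary layout is hypothetical.

```python
from collections import Counter


def recommend_solution(classification, content, history, min_overlap=0.3):
    """Return the most common solution among similar historical messages, or None.

    history: list of dicts with 'classification', 'content', and an
    optional 'solution' key.
    """
    words = set(content.lower().split())
    candidates = []
    for msg in history:
        if msg["classification"] != classification:
            continue
        other = set(msg["content"].lower().split())
        union = words | other
        # Keep historical messages whose word overlap (Jaccard) is high enough.
        if union and len(words & other) / len(union) >= min_overlap:
            candidates.append(msg)
    # Solution messages are the candidates that carry a recorded solution.
    solutions = [m["solution"] for m in candidates if m.get("solution")]
    return Counter(solutions).most_common(1)[0][0] if solutions else None
```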