G06F16/316

Generating a graphical representation of relationships among a set of articles and information associated with the set of articles
10839013 · 2020-11-17 ·

An online system identifies articles containing factual reporting and information associated with the articles (e.g., authors, publishers, distributors, content, etc.). The online system extracts embeddings for the articles based on the information associated with the articles and generates nodes of a graph, in which each node corresponds to an article or information associated with an article. The online system then identifies relationships among the nodes using the embeddings and generates additional nodes of the graph indicating these relationships. Each of the additional nodes may correspond to any type of information that may be associated with an article. The online system may query the graph for information such as the publishers that published articles alleging a fact or the articles containing editorialized content or clickbait. It may also use the graph to identify and remove similar articles from a feed to be presented to an online system user, to highlight contradicting articles in the feed, etc.
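
As a rough illustration of the graph construction described in this abstract, the sketch below adds a relationship node for each pair of articles whose embeddings are close. The cosine-similarity measure, the 0.9 threshold, the toy two-dimensional embeddings, and every name in the code are assumptions made for illustration; the abstract does not specify how relationships are detected.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_graph(embeddings, threshold=0.9):
    """Return (nodes, edges); similar article pairs get an extra relationship node.

    The threshold is a hypothetical tuning parameter, not taken from the source.
    """
    nodes = set(embeddings)
    edges = set()
    for a, b in combinations(sorted(embeddings), 2):
        if cosine(embeddings[a], embeddings[b]) >= threshold:
            rel = f"similar:{a}|{b}"  # additional node that records the relationship
            nodes.add(rel)
            edges.add((a, rel))
            edges.add((b, rel))
    return nodes, edges

# Toy embeddings (assumed values): article_a and article_b point the same way.
embeddings = {
    "article_a": [1.0, 0.0],
    "article_b": [0.9, 0.1],
    "article_c": [0.0, 1.0],
}
nodes, edges = build_graph(embeddings)
```

Querying the graph for near-duplicate articles then amounts to looking up the `similar:` nodes and following their edges.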

EDOC UTILITY USING NON-STRUCTURED-QUERY-LANGUAGE DATABASES
20200349319 · 2020-11-05 ·

A database management system for processing large volumes of data in a key-value store database is provided. The system may be configured to receive a plurality of filled fillable request forms where each request form may include a request including a plurality of field labels and a plurality of fillable text fields corresponding to each of the plurality of the field labels. The system may be configured to extract each set of inputted data from each fillable text field. The system may be configured to store, in the key-value store database, for each request form, each of the plurality of field labels and the corresponding set of inputted data as a combination key-value pair. The combination key may be equal to a WIP ID number, form ID number and field ID number. The corresponding value may be equal to the set of data of the corresponding field ID number.
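
The combination key described here can be sketched with an in-memory dictionary standing in for the key-value store. The function name, the sample IDs, and the use of a Python tuple as the key encoding are assumptions; the abstract states only that the key combines a WIP ID number, a form ID number, and a field ID number.

```python
from typing import Dict, List, Tuple

# Hypothetical composite key: (WIP ID, form ID, field ID).
Key = Tuple[str, str, str]

def store_form(store: Dict[Key, str], wip_id: str, form_id: str,
               fields: List[Tuple[str, str]]) -> None:
    """Write one key-value pair per filled field of a request form."""
    for field_id, value in fields:
        store[(wip_id, form_id, field_id)] = value

# A plain dict stands in for the key-value store database (an assumption).
store: Dict[Key, str] = {}
store_form(store, "WIP-1", "F-7", [("name", "Ada"), ("amount", "120.00")])

# Retrieve a single field by its full combination key.
amount = store[("WIP-1", "F-7", "amount")]
```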

DYNAMIC FACETED SEARCH ON A DOCUMENT CORPUS

A query-focused faceted structure generation method, system, and computer program product for generating a query-focused faceted structure from a taxonomy for searching a document corpus, including augmenting taxonomy types with new instances where the instances comprise entities within a proximity of existing instances of taxonomy types in a local embedding of entities parsed from the document corpus, ranking each instance in the augmented taxonomy with respect to its type as a function of both a distance from an instance to a query in a global embedding vector space of the entities trained from the document corpus and a distance of an instance to a type in the local embedding, and ranking the taxonomy types using expanded instances in the document corpus for each type.

DYNAMIC PROCESS MODEL OPTIMIZATION IN DOMAINS
20200334282 · 2020-10-22 ·

A computing server may receive master data, transaction data, and one or more existing process models of a domain. The computing server may aggregate, based on domain knowledge ontology of the domain, the master data and the transaction data to generate a fact table. For example, entries in the fact table may be identified as relevant to the target process model and include attributes and facts that are extracted from master data or transaction data. The computing server may convert the entries in the fact table into vectors. The computing server may input the vectors into one or more machine learning algorithms to generate one or more algorithm outputs. The one or more algorithm outputs may correspond to one or more improved process models that are optimized compared to the existing process models. The computing server may provide an improved process model to the domain to replace one of the existing process models.

Headstart for Data Scientists

A method, system, and apparatus are provided for recommending machine learning (ML) project resources for completing a user project by generating indexed project metadata for a plurality of ML projects, generating search metadata for a search request for ML project resources to develop an ML project, and then evaluating the search metadata against the indexed project metadata for each ML project to form a relevancy assessment which is used to order trained models from the ML projects and to display one or more recommended ML project resources comprising one or more of the plurality of trained models having a relevancy assessment exceeding a relevancy threshold.
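
One way to picture the relevancy assessment is the sketch below, which scores each indexed project against the search metadata and keeps those above a threshold. The Jaccard overlap of tag sets, the 0.3 threshold, and the project names are all assumptions chosen for illustration; the abstract does not say how relevancy is computed.

```python
def relevancy(search_tags, project_tags):
    """Jaccard overlap between two tag sets (an assumed relevancy measure)."""
    union = search_tags | project_tags
    return len(search_tags & project_tags) / len(union) if union else 0.0

def recommend(search_tags, index, threshold=0.3):
    """Return project names ordered by relevancy, keeping scores above threshold."""
    scored = [(relevancy(search_tags, tags), name) for name, tags in index.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > threshold]

# Hypothetical indexed project metadata.
index = {
    "churn_model": {"tabular", "classification", "telecom"},
    "image_tagger": {"vision", "classification"},
    "sales_forecast": {"tabular", "regression", "retail"},
}
result = recommend({"tabular", "classification"}, index)
```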

FACILITATING EVENT CREATION VIA PREVIEWS

Embodiments are directed towards previewing results generated from indexing raw data before the corresponding index data is added to an index store. Raw data may be received from a preview data source. After an initial set of configuration information is established, the preview data may be submitted to an index processing pipeline. A previewing application may generate preview results based on the preview index data and the configuration information. The preview results may enable previewing how the data is being processed by the indexing application. If the preview results are not acceptable, the configuration information may be modified. The preview application enables modification of the configuration information until the generated preview results are acceptable. If the configuration information is acceptable, the preview data may be processed and indexed in one or more index stores.

INFORMATION EXTRACTION FROM OPEN-ENDED SCHEMA-LESS TABLES
20200302114 · 2020-09-24 ·

Systems and methods for generating and annotating cell documents include extracting tables from a document using a table extraction engine. Headers are extracted for each of the tables using a header detection engine. Cells are extracted from each of the tables using a cell extraction engine. A cell document is generated for each of the cells which are each correlated to corresponding portions of the headers, each cell document recording the correlation between the cells and the headers. Each cell document is annotated to generate annotated cell documents with a cell recognition model trained to perform natural language processing on the cell documents by classifying each term in each of the cell documents and extracting relationships between the terms of each of the cell documents.
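
The cell-document step can be sketched as follows, with one small record per table cell that keeps its link to the column header. The dictionary layout and field names are assumptions, and the table, header, and cell extraction engines and the annotation model are not modeled; the table is assumed to be already parsed into a header row and data rows.

```python
def cell_documents(header, rows):
    """Build one document per cell, recording which header the cell belongs to."""
    docs = []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            docs.append({
                "row": r,
                "column": c,
                "header": header[c],  # correlation between the cell and its header
                "text": value,
            })
    return docs

# Hypothetical extracted table.
header = ["Item", "Price"]
rows = [["Widget", "4.50"], ["Gadget", "9.00"]]
docs = cell_documents(header, rows)
```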

FINANCIAL DOCUMENTS EXAMINATION METHODS AND SYSTEMS

A user is able to extract financial data, particularly tables, from a document. The table is stored and the user can compare the data in this table with data from similar tables from previous documents. The user can see how financial data has changed historically by looking only at financial tables from the same type of document, for example, only balance sheet tables from annual reports for a specific public company, over many years, and see how the values have changed or whether any new categories or types of data have been added or deleted. From the time series of financial data, the user can gain real intelligence into an entity's financial health.

AUTOMATED PROCESS COLLABORATION PLATFORM IN DOMAINS
20200293564 · 2020-09-17 ·

A computing server may receive master data, transaction data, and a process model of a domain. The computing server may aggregate, based on domain knowledge ontology of the domain, the master data and the transaction data to generate a fact table. For example, entries in the fact table may be identified as relevant to the target process model and include attributes and facts that are extracted from master data or transaction data. The computing server may convert the entries in the fact table into vectors. The computing server may identify, based on the vectors, an attribute in the process model as having a statistically significant impact on the process model. For example, a regression model may be used to determine the statistical significance of an attribute for the process model. The computing server may generate an action associated with the attribute to improve the process model.
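
A minimal sketch of the regression-based significance check, assuming a simple one-variable least-squares fit per attribute and a t-statistic cutoff of 2.0. The attribute names, the data, and the cutoff are hypothetical; the abstract says only that a regression model may be used.

```python
import math

def slope_t_statistic(x, y):
    """t-statistic of the slope in a simple least-squares fit of y on x.

    Assumes at least three points, non-constant x, and a non-perfect fit.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err

# Hypothetical fact-table columns and a process outcome (e.g. cycle time).
attributes = {
    "order_size": [1, 2, 3, 4, 5, 6],
    "weekday": [1, 5, 2, 4, 3, 3],
}
outcome = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

CUTOFF = 2.0  # assumed threshold on |t|; a real analysis would use a p-value
significant = [name for name, values in attributes.items()
               if abs(slope_t_statistic(values, outcome)) > CUTOFF]
```

In this toy data only `order_size` tracks the outcome closely, so it is the attribute an action would be generated for.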

AUTOMATIC GENERATION OF SCIENTIFIC ARTICLE METADATA

Examples of the disclosure are directed to systems and methods of using natural language processing techniques to automatically assign metadata to articles as they are published. The automatically-assigned metadata can then feed into the algorithms that calculate updated causation scores for agent-outcome hypotheses, powering live visualizations of the data that update automatically as new scientific articles become available.