Patent classifications
G06F16/254
Dashboard loading from a cloud-based data warehouse cache
Dashboard loading from a cloud-based data warehouse cache, including determining that a result for a first query is stored in a cache of a cloud-based data warehouse; sending, in response to the result being stored in the cache, to the cloud-based data warehouse, a request for the result from the cache; and providing, based on the result for the first query, one or more dashboard visualizations.
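The cache-first flow in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: `FakeWarehouse`, `fetch_cached`, `execute`, `fingerprint`, and the client-side set of known cached queries are all hypothetical names and assumptions, since the abstract does not say how the cache check is performed.

```python
import hashlib


def fingerprint(query):
    """Normalize case and whitespace, then hash, so equivalent queries share a key."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class FakeWarehouse:
    """Stand-in for a cloud data warehouse with a result cache (hypothetical API)."""

    def __init__(self, results):
        # Canned results keyed by query fingerprint; a real warehouse would compute them.
        self._results = {fingerprint(q): rows for q, rows in results.items()}
        self._cache = {}
        self.executions = 0

    def execute(self, query):
        """Run the query and store its result in the warehouse-side cache."""
        self.executions += 1
        key = fingerprint(query)
        self._cache[key] = self._results[key]
        return self._cache[key]

    def fetch_cached(self, key):
        """Return a previously cached result without re-running the query."""
        return self._cache[key]


def render(rows):
    """Turn (label, value) rows into a crude text bar chart."""
    return [f"{label}: {'#' * value}" for label, value in rows]


class Dashboard:
    def __init__(self, warehouse):
        self.warehouse = warehouse
        self.known_cached = set()  # assumption: the client tracks what is cached

    def load(self, query):
        key = fingerprint(query)
        if key in self.known_cached:
            # Result is known to be in the cache: request it from the cache.
            rows = self.warehouse.fetch_cached(key)
        else:
            rows = self.warehouse.execute(query)
            self.known_cached.add(key)
        return render(rows)
```

Loading the same query twice through one `Dashboard` should execute it once and serve the second load from the cache.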
Techniques for data extraction
Computer-implemented techniques for data extraction are described. The techniques include a method and system for retrieving an extraction job specification, wherein the extraction job specification comprises a source repository identifier that identifies a source repository comprising a plurality of data records; a data recipient identifier that identifies a data recipient; and a schedule that indicates a timing of when to retrieve the plurality of data records. The method and system further include retrieving the plurality of data records from the source repository based on the schedule, creating an extraction transaction from the plurality of data records, wherein the extraction transaction comprises a subset of the plurality of data records and metadata, and sending the extraction transaction to the data recipient.
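A rough sketch of the job specification and extraction transaction described above. The field names, the fixed-interval schedule, the batch size, and the in-memory `sources` and `recipients` dictionaries are assumptions made for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class ExtractionJobSpec:
    source_id: str       # identifies the source repository
    recipient_id: str    # identifies the data recipient
    interval: timedelta  # schedule, assumed here to be a fixed interval


@dataclass
class ExtractionTransaction:
    records: list   # a subset of the retrieved records
    metadata: dict  # descriptive information about this subset


def is_due(spec, last_run, now):
    """A job is due if it has never run or the interval has elapsed."""
    return last_run is None or now - last_run >= spec.interval


def build_transactions(spec, records, batch_size, now):
    """Split the retrieved records into transactions of at most batch_size records."""
    transactions = []
    for start in range(0, len(records), batch_size):
        subset = records[start:start + batch_size]
        metadata = {
            "source_id": spec.source_id,
            "recipient_id": spec.recipient_id,
            "extracted_at": now.isoformat(),
            "record_count": len(subset),
        }
        transactions.append(ExtractionTransaction(subset, metadata))
    return transactions


def run_job(spec, sources, recipients, last_run, now, batch_size=2):
    """Retrieve records per the schedule, package them, and send them on."""
    if not is_due(spec, last_run, now):
        return []
    records = sources[spec.source_id]
    transactions = build_transactions(spec, records, batch_size, now)
    recipients[spec.recipient_id].extend(transactions)  # "send" to the recipient
    return transactions
```

Each transaction carries only part of the retrieved records, which mirrors the abstract's wording that a transaction comprises a subset of the records plus metadata.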
Automated runtime configuration for dataflows
Methods, systems and computer program products are provided for automated runtime configuration for dataflows to automatically select or adapt a runtime environment or resources to a dataflow plan prior to execution. Metadata generated for dataflows indicates dataflow information, such as numbers and types of sources, sinks and operations, and the amount of data being consumed, processed and written. Weighted dataflow plans are created from unweighted dataflow plans based on metadata. Weights that indicate operation complexity or resource consumption are generated for data operations. A runtime environment or resources to execute a dataflow plan are selected based on the weighted dataflow plan and/or a maximum flow. Preferences may be provided to influence weighting and runtime selections.
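One way to picture the weighting step is the sketch below. The complexity factors, the runtime tiers, their capacities, and the `preference` multiplier are invented for illustration, and the selection uses a simple total weight rather than the maximum-flow computation the abstract also mentions.

```python
from dataclasses import dataclass

# Hypothetical relative cost per operation type (not taken from the abstract).
COMPLEXITY = {"source": 1.0, "filter": 1.0, "aggregate": 2.0, "join": 4.0, "sink": 1.5}

# Hypothetical runtime tiers as (name, maximum total weight), smallest first.
RUNTIMES = [("small", 1_000_000), ("medium", 10_000_000), ("large", float("inf"))]


@dataclass
class Operation:
    name: str
    kind: str  # one of the COMPLEXITY keys
    rows: int  # metadata: amount of data the operation handles


def weigh(plan, preference=1.0):
    """Turn an unweighted plan into a weighted one using per-operation metadata.

    preference is a user-supplied multiplier that biases the weights upward
    or downward before a runtime is chosen.
    """
    return {op.name: COMPLEXITY[op.kind] * op.rows * preference for op in plan}


def select_runtime(weights):
    """Pick the smallest tier whose capacity covers the plan's total weight."""
    total = sum(weights.values())
    for name, capacity in RUNTIMES:
        if total <= capacity:
            return name
    return RUNTIMES[-1][0]  # unreachable while the last tier is unbounded
```

Raising `preference` pushes the same plan toward a larger tier, which is one simple reading of how preferences could influence both weighting and runtime selection.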
Editor for generating computational graphs
Techniques for generating a dataflow graph include generating a first dataflow graph with a plurality of first nodes representing first computer operations in processing data, with at least one of the first computer operations being a declarative operation that specifies one or more characteristics of one or more results of processing of data, and transforming the first dataflow graph into a second dataflow graph for processing data in accordance with the first computer operations, the second dataflow graph including a plurality of second nodes representing second computer operations, with at least one of the second nodes representing one or more imperative operations that implement the logic specified by the declarative operation, where the one or more imperative operations are unrepresented by the first nodes in the first dataflow graph.
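The transformation from a graph containing declarative nodes to one containing imperative nodes might look like the sketch below. It simplifies the graph to a linear chain of nodes, and the operation names and the `EXPANSIONS` table are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str
    declarative: bool = False  # True if the node states a desired result, not steps


# Hypothetical rules: each declarative operation maps to the imperative
# operations that implement it.
EXPANSIONS = {
    "result_sorted_by_key": ["partition_by_key", "sort_partition", "merge_partitions"],
    "result_deduplicated": ["hash_rows", "drop_repeated_hashes"],
}


def lower(first_graph):
    """Build the second graph by replacing declarative nodes with imperative ones.

    The graph is modeled as an ordered list of nodes (a linear chain), which is
    a simplification of a general dataflow graph.
    """
    second_graph = []
    for node in first_graph:
        if node.declarative:
            second_graph.extend(Node(op) for op in EXPANSIONS[node.op])
        else:
            second_graph.append(node)
    return second_graph
```

The imperative nodes in the output do not appear anywhere in the input graph, matching the abstract's statement that they are unrepresented by the first nodes.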
SYSTEM PERFORMANCE LOGGING OF COMPLEX REMOTE QUERY PROCESSOR QUERY OPERATIONS
Described are methods, systems and computer readable media for performance logging of complex query operations.
Graph embedding already-collected but not yet connected data
Systems and methods for graph embedding already-collected but not yet connected data are disclosed. A method includes extracting a first set of actor-related data, a second set of object-related data, and a third set of temporal data from a set of the already-collected but not yet connected data representative of a unit-level contribution to the target activity. The method further includes generating graph data for at least one graph having a plurality of nodes and a plurality of edges using the set of the already-collected but not yet connected data, where each of the plurality of nodes corresponds to the actor or the object, and where an attribute associated with each of the plurality of edges corresponds to a measurement associated with the target activity during a temporal dimension of interest. The method further includes converting the graph data into metric space data using a graph embedding process.
COMPUTER DATA SYSTEM DATA SOURCE REFRESHING USING AN UPDATE PROPAGATION GRAPH
Described are methods, systems and computer readable media for data source refreshing.
Artificial intelligence based smart data engine
A machine learning computing system for extracting structured data objects from electronic documents comprising unstructured text includes a first data repository storing a plurality of electronic documents including at least one text data object and an expert system computing device. The expert system computing device includes a processor and a non-transitory memory device storing instructions causing the expert system to receive a first data object comprising unstructured data identified from an electronic document stored in the first data repository; process a first set of rules to identify at least one key-value pair data object from the first data object; process, by an inference engine module, a second set of rules to identify at least one free text data object from the first data object; and store, in a non-transitory memory device, the at least one key-value pair and the at least one free text data object.
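As a loose illustration of separating key-value pairs from free text, the sketch below applies one regular-expression rule per line. The rule itself and the line-by-line treatment are assumptions; the abstract's expert system and inference engine are not specified in enough detail to reproduce.

```python
import re

# Hypothetical first-rule-set entry: a line of the form "Label: value".
KEY_VALUE_RULE = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?)\s*:\s*(\S.*?)\s*$")


def extract(text):
    """Split unstructured text into key-value pairs and free-text lines."""
    pairs = {}
    free_text = []
    for line in text.splitlines():
        match = KEY_VALUE_RULE.match(line)
        if match:
            # First rule set: the line looks like a labeled field.
            pairs[match.group(1).lower()] = match.group(2)
        elif line.strip():
            # Second rule set (simplified): any other non-empty line is free text.
            free_text.append(line.strip())
    return pairs, free_text


def store(repository, document_id, pairs, free_text):
    """Persist both kinds of extracted objects under the document's identifier."""
    repository[document_id] = {"pairs": pairs, "free_text": free_text}
```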
Data mapper tool
An apparatus includes a processor. The processor extracts a column from an external source for import into a database configured to store a set of columns including a first and second column. The processor splits the entries of the import column into a set of terms. The processor generates a first, second, and third vector based on the frequency of each term of the set of terms in the first, second, and import columns, respectively. The processor determines a first similarity measure between the first and third vectors and a second similarity measure between the second and third vectors. The first similarity measure is greater than the second. In response, the processor provides an indication to a user that the first column is a mapping candidate for the import column, such that entries of the import column may be stored in the database as additional entries in the first column.
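The column-matching idea can be sketched with term-frequency vectors and a similarity score. Cosine similarity and the word-level tokenizer are assumptions chosen for illustration; the abstract only says "similarity measure" and does not name one.

```python
import math
import re
from collections import Counter


def term_vector(entries):
    """Count how often each lowercase alphanumeric term occurs across a column."""
    counts = Counter()
    for entry in entries:
        counts.update(re.findall(r"[a-z0-9]+", str(entry).lower()))
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_mapping_candidate(columns, import_entries):
    """Return the existing column most similar to the import column, plus all scores."""
    import_vector = term_vector(import_entries)
    scores = {
        name: cosine_similarity(term_vector(entries), import_vector)
        for name, entries in columns.items()
    }
    return max(scores, key=scores.get), scores
```

The column with the highest score is the one that would be suggested to the user as the mapping candidate for the import column.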
METHODS AND SYSTEMS FOR MULTI-DYNAMIC DATA RETRIEVAL AND DATA DISBURSEMENT
A device includes circuitry configured to provide a configurable platform including a rules-based processing engine, access and manipulate a plurality of configurable databases, retrieve first data from one of the configurable databases, register the first data for task programs, authenticate the first data according to authenticity parameters, process the first data against processing rules, identify and configure each task program when the assessment of the first data satisfies predetermined criteria, measure the ETL data load flow against a predetermined performance threshold, route the ETL data load flow to a database processing engine, and output a data disbursement of results when the first data is authenticated and the predetermined performance threshold has been satisfied by the rules-based processing engine. In one aspect, such an implementation can improve data search and retrieval times and output a more thorough and accurate count of eligible end-user services.