G06F16/2453

COMPUTER DATA SYSTEM DATA SOURCE REFRESHING USING AN UPDATE PROPAGATION GRAPH

Described are methods, systems and computer readable media for data source refreshing.

Method and database system for sequentially executing a query and methods for use therein
11709834 · 2023-07-25

A database system operates by facilitating execution of a query, where each of a plurality of sequential operator execution steps includes: determining whether each operator of a plurality of operators of a query operator execution flow is currently executable; generating a plurality of priority values by calculating a priority value for each operator based on whether each operator is determined to be currently executable, and based on a position value of each operator; identifying one operator with a most favorable priority value; facilitating execution of the one operator on a queued set of data blocks to generate at least one output data block; identifying a next operator serially positioned consecutively after the one operator; and appending the at least one output data block to another queued set of data blocks of the next operator.
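
The execution steps in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the names `run_flow`, `operators`, and `queues` are hypothetical, an operator is assumed to be "currently executable" when its input queue is non-empty, and the priority rule (executable first, then later position) is an assumption.

```python
from collections import deque

def run_flow(operators, input_blocks):
    """Run a serial operator flow; each operator maps one data block to one output block."""
    n = len(operators)
    queues = [deque() for _ in range(n)]  # one queued set of data blocks per operator
    queues[0].extend(input_blocks)
    results = []
    while True:
        # Assumption: an operator is executable when it has queued blocks.
        executable = [bool(q) for q in queues]
        if not any(executable):
            break
        # Assumed priority: executable beats non-executable, then later position wins.
        priorities = [(1 if executable[i] else 0, i) for i in range(n)]
        chosen = max(range(n), key=lambda i: priorities[i])
        blocks = list(queues[chosen])
        queues[chosen].clear()
        outputs = [operators[chosen](block) for block in blocks]
        if chosen + 1 < n:
            # Append the output blocks to the queue of the next operator in the flow.
            queues[chosen + 1].extend(outputs)
        else:
            results.extend(outputs)
    return results
```

Favoring later positions drains blocks toward the end of the flow before new input is pulled in; other priority rules would fit the same loop.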

Performing geospatial-function join using implied interval join

Disclosed herein are systems and methods for performing a geospatial-function join using an implied interval join. In an embodiment, a database platform receives a query that includes a geospatial-function join, which applies a geospatial-function predicate to a first geography data object of a first relation and a second geography data object of a second relation. The database platform processes the first and second relations through an interval join that applies an interval-join predicate that is implied by the geospatial-function predicate. The database platform obtains query results at least in part by implementing a filter that applies the geospatial-function predicate to an output of the interval join, and outputs the query results.
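
A rough sketch of the idea, under simplifying assumptions: geography objects are reduced to planar `(x, y)` points, the geospatial predicate is "within distance `d`" using Euclidean distance, and the implied interval predicate is that the x-coordinates differ by at most `d`. The function name `within_distance_join` is hypothetical.

```python
import bisect
import math

def within_distance_join(left, right, d):
    """Join points from two relations that lie within distance d of each other."""
    right_sorted = sorted(right, key=lambda p: p[0])
    xs = [p[0] for p in right_sorted]
    out = []
    for a in left:
        # Interval join: keep only candidates whose x lies in [a.x - d, a.x + d].
        lo = bisect.bisect_left(xs, a[0] - d)
        hi = bisect.bisect_right(xs, a[0] + d)
        for b in right_sorted[lo:hi]:
            # Filter: apply the exact geospatial predicate to the interval-join output.
            if math.dist(a, b) <= d:
                out.append((a, b))
    return out
```

The interval step never discards a true match, because two points within distance `d` must also have x-coordinates within `d`; the filter then removes the false positives.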

Cost-based query optimization for array fields in database systems

A document-oriented database system generates an optimal query execution plan for database queries on an untyped data field included in a collection of documents. The system generates histograms for multiple types of data stored by the untyped data field and uses the histograms to assign costs to operators usable to execute the database query. The system generates the optimal query execution plan by selecting operators based on the assigned costs. In various embodiments, the untyped data field stores scalars, arrays, and objects.
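
As an illustration only, the sketch below builds one histogram per value kind (scalar, array element, object) for an untyped field and uses the estimated match count to pick between two operators. The cost constants, the operator names `index_scan` and `full_scan`, and the helper names are all hypothetical and are not taken from the abstract.

```python
import json
from collections import Counter

# Hypothetical cost constants, chosen only for illustration.
INDEX_COST_PER_MATCH = 4.0
SCAN_COST_PER_DOC = 1.0

def kind_of(value):
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "scalar"

def build_histograms(docs, field):
    """Count occurrences of each value, kept separately per kind of stored data."""
    hists = {"scalar": Counter(), "array": Counter(), "object": Counter()}
    for doc in docs:
        if field not in doc:
            continue
        value = doc[field]
        kind = kind_of(value)
        items = value if kind == "array" else [value]  # arrays are counted per element
        for item in items:
            hists[kind][json.dumps(item, sort_keys=True)] += 1
    return hists

def choose_operator(hists, doc_count, value):
    """Pick the cheaper operator for an equality predicate on the field."""
    key = json.dumps(value, sort_keys=True)
    matches = sum(h[key] for h in hists.values())
    index_cost = matches * INDEX_COST_PER_MATCH
    scan_cost = doc_count * SCAN_COST_PER_DOC
    return "index_scan" if index_cost < scan_cost else "full_scan"
```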

Distributed real-time partitioned MapReduce for a data fabric
11709843 · 2023-07-25

A system includes an interface and a processor. The interface is configured to receive an indication that a change has occurred to partition data on a first node, wherein the partition data is stored on a partition on the first node. The processor is configured to: determine whether the change to the partition data causes a change to a predetermined partition result of a set of predetermined partition results stored by the partition; and in response to a determination that the change to the partition data affects the predetermined partition result stored by the partition: determine a new value for the predetermined partition result; store the new value; and provide an indication to a service node that the new value for the predetermined partition result has been determined, wherein the service node is selected by a client application system to manage execution of a task.
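
A minimal sketch of the described behavior, assuming a partition holds key/value data plus a set of precomputed results, and that the service node is represented by a callback. The class `Partition`, its `reducers` mapping, and the `notify` callback are hypothetical names; recomputing every result on each change is a simplification.

```python
class Partition:
    def __init__(self, data, reducers, notify):
        self.data = dict(data)
        self.reducers = reducers  # result name -> function over the partition's values
        self.notify = notify      # stand-in for the indication sent to the service node
        self.results = {name: fn(self.data.values()) for name, fn in reducers.items()}

    def apply_change(self, key, value):
        """Apply a change to the partition data and refresh only the affected results."""
        self.data[key] = value
        for name, fn in self.reducers.items():
            new_value = fn(self.data.values())
            if new_value != self.results[name]:
                self.results[name] = new_value  # store the new value
                self.notify(name, new_value)    # tell the service node it changed
```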

Information processing system, information processing device, and non-transitory computer-readable storage medium
11709832 · 2023-07-25

An information processing system includes a first information processing device configured to accept an input of a query to be processed, and a second information processing device configured to execute the query for each of a plurality of tasks in parallel. The first information processing device determines whether or not an external database server contains records targeted by the query, and transmits the query and connection information for accessing the external database server to the second information processing device. The second information processing device connects to the external database server based on the connection information received from the first information processing device, acquires information indicating a storage status of the records targeted by the query among records stored in the external database server, and determines a processing target range for each of the plurality of tasks relevant to the records targeted by the query, based on the acquired information.
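
The final step, determining a processing target range per task, might look like the sketch below. It assumes the acquired storage status boils down to a count of targeted records addressed by a contiguous offset; the function name `split_ranges` is hypothetical.

```python
def split_ranges(total_records, task_count):
    """Split [0, total_records) into near-equal half-open ranges, one per task."""
    base, extra = divmod(total_records, task_count)
    ranges = []
    start = 0
    for i in range(task_count):
        # The first `extra` tasks take one additional record each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```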

PROCESSING INGESTED DATA TO IDENTIFY ANOMALIES

Systems and methods are described for processing ingested data in an asynchronous manner as the data is being ingested to detect potential anomalies. For example, one or more streaming data processors can convert data as the data is ingested into a comparable data structure, determine whether the comparable data structure should be assigned to an existing data pattern or a new data pattern, and optionally update a characteristic of the data pattern to which the comparable data structure is assigned. The streaming data processor(s) can perform these operations automatically in real-time or in periodic batches. Once one or more comparable data structures have been assigned to one or more data patterns, the streaming data processor(s) can analyze the comparable data structures assigned to a particular data pattern to determine whether any of the comparable data structures appear to be anomalous.
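
One way to picture this, as a batch-mode sketch under loud assumptions: the "comparable data structure" is a log line with its numbers masked out, a "data pattern" is the set of lines sharing the same masked template, and a line is treated as anomalous when its first number exceeds a multiple of the pattern's median. None of these choices come from the abstract, and the names are hypothetical.

```python
import re
from collections import defaultdict
from statistics import median

NUM = re.compile(r"\d+(?:\.\d+)?")

def to_comparable(line):
    """Convert a raw line into (template, numeric values)."""
    values = [float(m) for m in NUM.findall(line)]
    template = NUM.sub("<num>", line)
    return template, values

def find_anomalies(lines, factor=10.0):
    # Assign each comparable structure to an existing pattern or a new one.
    patterns = defaultdict(list)
    for line in lines:
        template, values = to_comparable(line)
        patterns[template].append((line, values))
    anomalies = []
    for members in patterns.values():
        firsts = [values[0] for _, values in members if values]
        if len(firsts) < 3:
            continue  # too few samples to judge this pattern
        mid = median(firsts)
        for line, values in members:
            # Assumed rule: flag values far above the pattern's median.
            if values and mid > 0 and values[0] > factor * mid:
                anomalies.append(line)
    return anomalies
```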

APPLYING QUERY COST DATA BASED ON AN AUTOMATICALLY GENERATED SCHEME

An analytics system is operable to receive a first plurality of query requests from a plurality of requesting entities. Query pricing scheme data is automatically generated based on the first plurality of query requests. A second plurality of query requests are received from the plurality of requesting entities. Query cost data is automatically generated for each of the second plurality of query requests by utilizing the query pricing scheme data. The query cost data for each of the second plurality of query requests is transmitted to a corresponding one of the plurality of requesting entities.

ONTOLOGY-BASED GRAPH QUERY OPTIMIZATION

Examples of the present disclosure describe systems and methods for ontology-based graph query optimization. In an example, ontology data relating to a graph or isolated collection may be collected. The ontology data may comprise uniqueness and topology information and may be used to reformulate a query in order to yield a query that is more performant than the original query when retrieving target information from a graph. In an example, reformulating a query may comprise reordering one or more parameters of the query relating to resources, relationships, and/or properties based on uniqueness information. In another example, the query may be reformulated by modifying the resource type to which the query is anchored based on the topology information. The reformulated query may then be executed to identify target information in the isolated collection, thereby identifying the same target information as the original query, but in a manner that is more performant.
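
The parameter-reordering example can be illustrated with a short sketch. It assumes a query is a list of `(property, value)` constraints and that the ontology data supplies a uniqueness score per property, where higher means more selective; `reorder_by_uniqueness` and the score values are hypothetical.

```python
def reorder_by_uniqueness(constraints, uniqueness):
    """Put the most unique (most selective) constraints first; unknown properties go last."""
    return sorted(constraints, key=lambda c: uniqueness.get(c[0], 0.0), reverse=True)

def matches(resource, constraints):
    # Evaluation stops at the first failing constraint, so order affects cost, not results.
    return all(resource.get(prop) == value for prop, value in constraints)
```

Because reordering only changes the order in which constraints are checked, the reformulated query identifies the same resources as the original.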