G06F16/24537

COMPUTER DATA SYSTEM DATA SOURCE REFRESHING USING AN UPDATE PROPAGATION GRAPH

Described are methods, systems and computer readable media for data source refreshing.

Performing geospatial-function join using implied interval join

Disclosed herein are systems and methods for performing a geospatial-function join using an implied interval join. In an embodiment, a database platform receives a query that includes a geospatial-function join, which applies a geospatial-function predicate to a first geography data object of a first relation and a second geography data object of a second relation. The database platform processes the first and second relations through an interval join that applies an interval-join predicate that is implied by the geospatial-function predicate. The database platform obtains query results at least in part by implementing a filter that applies the geospatial-function predicate to an output of the interval join, and outputs the query results.
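
As an illustration only, the idea can be sketched in Python under simplifying assumptions that are not taken from the abstract: geography objects are replaced by planar points, and the geospatial-function predicate is "Euclidean distance at most d". That predicate implies the interval predicate a.x - d <= b.x <= a.x + d, so an interval join on x narrows the candidates before the exact predicate is applied as a filter. The function name `geo_join_within` and the tuple layout are hypothetical.

```python
import bisect
import math


def geo_join_within(left, right, d):
    """Join planar points whose Euclidean distance is at most d.

    left, right: lists of (id, x, y) tuples, hypothetical stand-ins
    for geography data objects.
    """
    # Sort the right relation by x so the implied interval predicate
    # a.x - d <= b.x <= a.x + d can be answered with binary search.
    right_sorted = sorted(right, key=lambda row: row[1])
    xs = [row[1] for row in right_sorted]

    results = []
    for a_id, ax, ay in left:
        lo = bisect.bisect_left(xs, ax - d)
        hi = bisect.bisect_right(xs, ax + d)
        # Interval-join output: rows satisfying the implied predicate.
        for b_id, bx, by in right_sorted[lo:hi]:
            # Filter: apply the original geospatial predicate.
            if math.hypot(ax - bx, ay - by) <= d:
                results.append((a_id, b_id))
    return results
```

The interval join may admit false positives (points close in x but far in y), which is why the exact predicate still runs on its output.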

Systems and methods for integration of multiple programming languages within a pipelined search query

According to one embodiment, a method that supports queries deploying operators based on multiple programming languages is described. A sequence of operators associated with a query is identified, where the sequence of operators includes at least two neighboring operators including a first operator based on a first programming language and a second operator based on a second programming language that is different from the first programming language. Thereafter, a schema associated with the first operator and a schema associated with the second operator are determined, along with the compatibility between the schema of the first operator and the schema of the second operator. A query error message is generated in response to incompatibility between the first operator schema and the second operator schema. Compatibility is determined when an output generated by execution of the first operator provides machine data needed as input for execution of the second operator.
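
A minimal Python sketch of the compatibility check, assuming a schema can be modeled as a set of field names and that compatibility means every field the second operator needs appears in the first operator's output. The `Operator` class, `QueryError`, and `validate_pipeline` are hypothetical names introduced for illustration; the abstract does not specify how schemas are represented.

```python
from dataclasses import dataclass


class QueryError(Exception):
    """Raised when neighboring operators have incompatible schemas."""


@dataclass(frozen=True)
class Operator:
    name: str
    language: str             # illustrative label for the operator's language
    input_fields: frozenset   # fields the operator needs as input
    output_fields: frozenset  # fields the operator emits


def validate_pipeline(operators):
    """Check each neighboring pair; raise QueryError on the first mismatch."""
    for first, second in zip(operators, operators[1:]):
        missing = second.input_fields - first.output_fields
        if missing:
            raise QueryError(
                f"{first.name} ({first.language}) does not provide "
                f"{sorted(missing)} required by {second.name} ({second.language})"
            )
    return True
```

Checking before execution lets the error surface at query-planning time rather than partway through the pipeline.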

Zero Copy Optimization for SELECT * Queries
20230229657 · 2023-07-20

A computer-implemented method includes receiving a query specifying an operation to perform on a first table comprising a plurality of stored data blocks. Each data block in the first table includes a respective reference count indicating a number of tables referencing the data block. The method also includes determining that the operation specified by the query includes copying the plurality of data blocks in the first table into a second table and, in response, for each data block of the plurality of data blocks in the first table copied into the second table, incrementing the respective reference count associated with the data block in the first table and appending, by data processing hardware, into metadata of the second table, a reference to the corresponding data block copied into the second table.
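
The reference-counting scheme can be sketched in Python as follows. This is an illustration under assumptions, not the claimed implementation: the `BlockStore` class and its methods are hypothetical, table metadata is modeled as a list of block ids, and `drop_table` is an added assumption showing why the counts are useful (a block is freed only when no table references it).

```python
class BlockStore:
    """Toy block store where tables share blocks by reference."""

    def __init__(self):
        self.blocks = {}      # block id -> block payload
        self.ref_counts = {}  # block id -> number of tables referencing it
        self.tables = {}      # table name -> list of block ids (metadata)
        self._next_id = 0

    def create_table(self, name, chunks):
        ids = []
        for chunk in chunks:
            block_id = self._next_id
            self._next_id += 1
            self.blocks[block_id] = chunk
            self.ref_counts[block_id] = 1
            ids.append(block_id)
        self.tables[name] = ids

    def copy_table(self, src, dst):
        """Copy src into dst without duplicating any block payload."""
        dst_meta = self.tables.setdefault(dst, [])
        # Iterate over a snapshot so src == dst cannot loop forever.
        for block_id in list(self.tables[src]):
            self.ref_counts[block_id] += 1  # one more table references it
            dst_meta.append(block_id)       # append the reference to metadata

    def drop_table(self, name):
        """Assumed cleanup path: free blocks whose count reaches zero."""
        for block_id in self.tables.pop(name):
            self.ref_counts[block_id] -= 1
            if self.ref_counts[block_id] == 0:
                del self.blocks[block_id]
                del self.ref_counts[block_id]
```

Because only metadata changes, the cost of the copy is proportional to the number of blocks, not to the volume of data they hold.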

Multiplexing data operation

Embodiments of the present invention relate to a method, system, and computer program product for multiplexing data operation. In some embodiments, a method is disclosed. A query for at least one table comprising a plurality of data records is received. The query indicates a plurality of data operations to be performed on the plurality of data records. The plurality of data operations are combined into a target data operation. An intermediate result of the query is generated by performing the target data operation on the plurality of data records. A final result of the query is determined based on the intermediate result. In other embodiments, a system and a computer program product are disclosed.
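
One way to picture the combination step is the following Python sketch, which assumes the data operations are running aggregates (sum, count, maximum) over numeric records. The helper names `combine`, `run_query`, and `finalize` are hypothetical, and the choice of aggregates is an assumption; the abstract does not say which operations are combined or how.

```python
# Assumed example operations: each takes (accumulator, record).
OPERATIONS = [
    lambda acc, r: acc + r,      # running sum
    lambda acc, r: acc + 1,      # running count
    lambda acc, r: max(acc, r),  # running maximum
]
INITIAL = (0, 0, float("-inf"))


def combine(operations):
    """Fold several per-record operations into one target operation."""
    def target(state, record):
        # Each operation updates its own slot of the shared state.
        return tuple(op(acc, record) for op, acc in zip(operations, state))
    return target


def run_query(records, operations=OPERATIONS, initial=INITIAL):
    """Scan the records once and return the intermediate result."""
    target = combine(operations)
    state = tuple(initial)
    for record in records:
        state = target(state, record)
    return state


def finalize(intermediate):
    """Derive the final result, including an average, from one scan."""
    total, count, largest = intermediate
    average = total / count if count else None
    return {"sum": total, "count": count, "max": largest, "avg": average}
```

The records are read once for all three operations, and the average is derived from the intermediate result without a further scan.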

FILTER CLASS FOR QUERYING OPERATIONS
20230014435 · 2023-01-19

A data model identifying a first and second table may be stored, the first table comprising a first and second attribute, the second table comprising a third attribute. A first filter parameter of a first filter and a second filter parameter of a second filter may be obtained. A first tag value may be associated with the first and second filters. A set of filters including the first and second filters may be determined in response to a determination that the first and second filters are associated with the first tag value. An argument indicating the first and second filter parameters may be generated based on the set of filters. A call to the first table may be executed based on the argument, the execution of the call causing values of the first and second attributes to be obtained based on the first and second filter parameters.

COST-BASED SEMI-JOIN REWRITE

A method, apparatus, and computer program product for executing a relational database management system (RDBMS) in a computer system, wherein the RDBMS manages a relational database comprised of one or more tables storing data. The RDBMS executes a query with a semi-join operation comprising an inclusion join and/or an exclusion join performed against at least an outer table and an inner table, wherein the inclusion join returns a row from the outer table when there is a match with a row in the inner table, and the exclusion join returns a row from the outer table when there is no match with a row in the inner table. The RDBMS performs a rewrite of the query to avoid spooling and/or sorting of the inner table, when the inner table is larger than the outer table and a cost after the rewrite is lower than before the rewrite.
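
The inclusion and exclusion joins, and a cost-gated choice between two plans, can be sketched in Python. Everything beyond the join semantics is an assumption made for illustration: the abstract does not describe the rewrite itself, so the "rewritten" plan here (index the smaller outer table and stream the inner table once) and the linear cost model with `build_cost` and `scan_cost` are hypothetical stand-ins.

```python
def spool_inner(outer, inner, key, exclusion=False):
    """Baseline plan: materialize (spool) the inner keys, then probe."""
    inner_keys = {key(row) for row in inner}
    # Inclusion keeps matching outer rows; exclusion keeps the rest.
    return [row for row in outer if (key(row) in inner_keys) != exclusion]


def stream_inner(outer, inner, key, exclusion=False):
    """Assumed rewritten plan: index outer keys, stream the inner once."""
    outer_keys = {key(row) for row in outer}
    matched = {key(row) for row in inner if key(row) in outer_keys}
    return [row for row in outer if (key(row) in matched) != exclusion]


def semi_join(outer, inner, key, exclusion=False, build_cost=2, scan_cost=1):
    """Pick a plan with a toy cost model; returns (plan name, rows)."""
    cost_before = build_cost * len(inner) + scan_cost * len(outer)
    cost_after = build_cost * len(outer) + scan_cost * len(inner)
    # Rewrite only when the inner table is larger AND the cost drops.
    if len(inner) > len(outer) and cost_after < cost_before:
        return "rewritten", stream_inner(outer, inner, key, exclusion)
    return "baseline", spool_inner(outer, inner, key, exclusion)
```

Both plans return the same rows; only the table that gets materialized differs, which is what the cost comparison decides.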

METHOD AND SYSTEM FOR PROVIDING A CONTEXT-SENSITIVE, NON-INTRUSIVE DATA PROCESSING OPTIMIZATION FRAMEWORK

A method of performing a data search in a data source by which an operator of a data search pipeline is just-in-time optimized and compiled, using an operator optimization module which optimizes and compiles an intermediate representation of the operator, considering runtime information, and optimization rules, to produce an operator that is optimized for the data search being performed. The method can be applied with one operator or with many operators applied in any sequence or tree structure according to a query plan, as determined by runtime information and optimization rules.

Data statement chunking
11537610 · 2022-12-27

Techniques are presented for applying fine-grained client-specific rules to divide (e.g., chunk) data statements to achieve cost reduction and/or failure rate reduction associated with executing the data statements over a subject dataset. Data statements for the subject dataset are received from a client. Statement attributes derived from the data statements are processed with respect to fine-grained rules and/or other client-specific data to determine whether a data statement chunking scheme is to be applied to the data statements. If a data statement chunking scheme is to be applied, further analysis is performed to select a data statement chunking scheme. A set of data operations is generated based at least in part on the selected data statement chunking scheme. The data operations are issued for execution over the subject dataset. The results from the data operations are consolidated in accordance with the selected data statement chunking scheme and returned to the client.
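
A rough Python sketch of one possible chunking scheme, assuming the data statement is an additive aggregate over a half-open date range and the client-specific rule is a maximum number of days per chunk. The statement layout, the `max_days` rule, and the `run` callback are hypothetical; the abstract does not define the rule format or the available schemes.

```python
from datetime import date, timedelta


def chunk_range(start, end, max_days):
    """Split [start, end) into consecutive sub-ranges of at most max_days."""
    chunks = []
    current = start
    while current < end:
        nxt = min(current + timedelta(days=max_days), end)
        chunks.append((current, nxt))
        current = nxt
    return chunks


def execute_statement(statement, client_rules, run):
    """Apply the client's rule, run each chunk, consolidate by summing.

    statement: dict with "client", "start", "end" (hypothetical layout).
    client_rules: client name -> {"max_days": int} (assumed rule format).
    run: callable (start, end) -> partial numeric result for that range.
    Returns (consolidated result, number of data operations issued).
    """
    rule = client_rules.get(statement["client"])
    span = (statement["end"] - statement["start"]).days
    if rule is None or span <= rule["max_days"]:
        ranges = [(statement["start"], statement["end"])]  # no chunking
    else:
        ranges = chunk_range(statement["start"], statement["end"], rule["max_days"])
    partials = [run(start, end) for start, end in ranges]
    return sum(partials), len(ranges)
```

Summing the partial results is only valid for additive aggregates such as counts and totals; other statements would need a different consolidation step.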

Compliant entity conflation and access

The disclosed embodiments provide a system for managing data conflation. During operation, the system generates matches between a first set of entities in a first dataset from a first data provider and a second set of entities in a second dataset from a second data provider based on comparisons of fields in the first and second datasets. Next, the system modifies a join query for joining the first and second datasets to include operators representing compliance rules for the first or second datasets. The system executes the modified join query to produce a joined dataset that adheres to the compliance rules and stores data related to the joined dataset within a platform that logically isolates the data from additional datasets. During processing of queries of the data, the system modifies the queries to include additional operators that enforce access control policies for the data.