G06F16/24545

Cloning catalog objects
11573978 · 2023-02-07

Example systems and methods for cloning catalog objects are described. In one implementation, a method identifies an original catalog object associated with data and creates a duplicate copy of the original catalog object without copying the data itself. The method allows access to the data using the duplicate catalog object and supports modifying the data associated with the original catalog object independently of the duplicate catalog object. The duplicate catalog object can be deleted upon completion of modifying the data associated with the original catalog object.
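
The cloning idea can be illustrated with a minimal Python sketch. It assumes (this is not stated in the abstract) that a catalog object is just a name plus a list of references to immutable data blocks; `CatalogObject`, `clone`, `append_rows`, and `read` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogObject:
    name: str
    # References to immutable data blocks; the blocks themselves are shared.
    blocks: list = field(default_factory=list)


def clone(obj, new_name):
    # Copy only the list of references, never the block contents.
    return CatalogObject(new_name, list(obj.blocks))


def append_rows(obj, rows):
    # A write adds a new immutable block to this catalog object only.
    obj.blocks.append(tuple(rows))


def read(obj):
    return [row for block in obj.blocks for row in block]


original = CatalogObject("orders", [(1, 2), (3,)])
duplicate = clone(original, "orders_clone")
append_rows(original, [4])  # the duplicate does not see this change
```

Under these assumptions the duplicate keeps reading the data as it was at clone time, while the original can be modified freely; dropping the duplicate later only discards its list of references.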

Automatic pruning cutoff in a database system
11615095 · 2023-03-28

During a query compilation process, a query is received that is directed to a set of source tables, each source table from the set of source tables being organized into at least one micro-partition and the query including at least one pruning operation. During the query compilation process, a modification of the query is performed for adjusting the at least one pruning operation, the modification being based on a set of statistics collected for previous pruning operations on at least a portion of the set of source tables and a set of heuristics, the set of statistics indicating at least an amount of execution time for each previous query associated with each of the previous pruning operations. The query is compiled including the modification of the query. The compiled query is provided to an execution node of a database system for execution.
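
One possible shape for such a compile-time decision is sketched below in Python. The abstract only says that the statistics include execution time; the pruned-ratio statistic, the thresholds, and the function name `keep_pruning` are assumptions made for illustration, not the patented heuristics.

```python
def keep_pruning(history, max_avg_seconds=1.0, min_avg_pruned_ratio=0.1):
    """Return True to keep a pruning operation, False to cut it off.

    history: list of (seconds_spent, pruned_ratio) tuples recorded for
    earlier pruning operations on the same tables (hypothetical format).
    """
    if not history:
        return True  # no statistics yet, keep the default behavior
    avg_seconds = sum(s for s, _ in history) / len(history)
    avg_ratio = sum(r for _, r in history) / len(history)
    expensive = avg_seconds > max_avg_seconds
    ineffective = avg_ratio < min_avg_pruned_ratio
    # Cut off pruning only when it has been both slow and unproductive.
    return not (expensive and ineffective)
```

A compiler following this sketch would drop the pruning step from the plan when `keep_pruning` returns False and compile the rest of the query unchanged.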

Join optimization using multi-index augmented nested loop join method

A system and method for efficient query processing using multiple indices in a join operation are described. In one embodiment, a join query including a join operation on a first table and a second table and including a first condition and a second condition is received, wherein the first condition is based on a first index of the second table, and the second condition is based on a second index of the second table; a first result set is determined by index scanning the second table using the first index as an index key; a second result set is determined by index scanning the second table using the second index as the index key; a third result set is determined by applying a set operation to the first result set and the second result set; and the third result set is provided in response to the join query.
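
The two index scans and the set operation can be sketched in Python as follows. In-memory dictionaries stand in for real indexes, and `build_index`, `multi_index_join`, and the sample tables are hypothetical; passing `set.union` models an OR of the two conditions and `set.intersection` models an AND.

```python
from collections import defaultdict


def build_index(rows, column):
    # Map each column value to the set of row positions holding it.
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        index[row[column]].add(row_id)
    return index


def multi_index_join(outer, inner, idx_a, idx_b, key_a, key_b, combine):
    results = []
    for o in outer:
        first = idx_a.get(o[key_a], set())    # scan using the first index
        second = idx_b.get(o[key_b], set())   # scan using the second index
        for row_id in sorted(combine(first, second)):  # set operation
            results.append((o, inner[row_id]))
    return results


outer = [{"a": 1, "b": 20}]
inner = [{"x": 1, "y": 10}, {"x": 2, "y": 20}, {"x": 1, "y": 20}]
idx_x = build_index(inner, "x")
idx_y = build_index(inner, "y")

either = multi_index_join(outer, inner, idx_x, idx_y, "a", "b", set.union)
both = multi_index_join(outer, inner, idx_x, idx_y, "a", "b", set.intersection)
```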

HISTOGRAM WITH INTEGRATED DISTINCT VALUE SKETCHES
20230087753 · 2023-03-23

Provided are systems and methods for creating histograms with distinct value sketches integrated therein and for query processing based on the histograms with distinct value sketches. In one example, the method may include storing a histogram that comprises a representation of a bucket of data from a database and that includes a distinct value sketch with a distinct value attribute that identifies an estimated number of distinct values within the bucket of data, receiving a database query, generating a query execution plan for the database query based on the distinct value attribute of the bucket within the distinct value sketch embedded within the histogram, and executing the database query on the bucket of data from the database based on the generated query execution plan.
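
A rough Python sketch of how a per-bucket distinct-value estimate could feed a plan choice follows. An exact `set` stands in for a real sketch structure such as HyperLogLog, and the names `Bucket`, `build_histogram`, `estimate_equality_rows`, `choose_plan`, and the threshold are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    low: int
    high: int               # inclusive upper bound
    row_count: int
    distinct_estimate: int  # would come from a sketch in a real system


def build_histogram(values, bounds):
    buckets = []
    for low, high in bounds:
        in_bucket = [v for v in values if low <= v <= high]
        buckets.append(Bucket(low, high, len(in_bucket), len(set(in_bucket))))
    return buckets


def estimate_equality_rows(histogram, value):
    # Assume rows are spread evenly over the distinct values of a bucket.
    for b in histogram:
        if b.low <= value <= b.high and b.distinct_estimate:
            return b.row_count / b.distinct_estimate
    return 0.0


def choose_plan(histogram, value, index_threshold=10):
    rows = estimate_equality_rows(histogram, value)
    return "index_scan" if rows <= index_threshold else "full_scan"


hist = build_histogram([1, 1, 2, 3, 50, 50, 50, 50], [(0, 9), (10, 99)])
```

Here the second bucket holds four rows with one distinct value, so an equality predicate on 50 is estimated to match four rows, while a predicate on 2 in the first bucket is estimated at 4/3 rows.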

DATA PROCESSING METHOD AND DATA PROCESSING APPARATUS
20230082563 · 2023-03-16

A data processing method includes: receiving a data processing request carrying a query statement; converting the query statement into a corresponding relational algebra tree based on the data processing request; determining an operation type corresponding to the query statement based on the relational algebra tree; delivering the query statement to a first database in response to the operation type being a first type; and completing the data processing request in the first database based on the query statement.

Unified optimization of iterative analytical query processing
11604796 · 2023-03-14

Optimization of procedures for enterprise applications can take both declarative query statements and imperative logic into account in a unified optimization technique. An input procedure can implement complex analytical queries and also include iterative control flow logic such as loops. Alternative query execution plans for the procedure can be enumerated by moving queries out of and into loop boundaries via hoist and sink operations. Program correctness can be preserved by using dependency graphs to exclude some operations. Sink subgraphs can also be used. Query inlining can also be supported, resulting in synergies that produce superior execution plans. The computing execution resource demand of the respective alternatives can be considered to arrive at an optimal query execution plan that can then be used to actually implement execution of the procedure. Execution performance can thus be greatly improved by performing counterintuitive optimizations.
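
The hoist operation can be illustrated with a small Python sketch in which a query that does not depend on the loop variable is moved out of the loop. `CountingDatabase` is a hypothetical stand-in for a database that counts how often a query runs; the sketch shows only the hoist rewrite, not the plan enumeration or dependency analysis.

```python
class CountingDatabase:
    """Hypothetical stand-in for a database; counts executed queries."""

    def __init__(self, rate):
        self.rate = rate
        self.calls = 0

    def query_rate(self):
        self.calls += 1
        return self.rate


def total_in_loop(db, amounts):
    total = 0
    for amount in amounts:
        rate = db.query_rate()  # loop-invariant query, re-run every iteration
        total += amount * rate
    return total


def total_hoisted(db, amounts):
    rate = db.query_rate()      # hoisted: executed once, before the loop
    return sum(amount * rate for amount in amounts)


db_loop = CountingDatabase(2)
db_hoisted = CountingDatabase(2)
loop_result = total_in_loop(db_loop, [1, 2, 3])
hoisted_result = total_hoisted(db_hoisted, [1, 2, 3])
```

Both variants return the same total, but the hoisted one issues a single query. The rewrite is only safe when nothing inside the loop changes the data the query reads, which is the kind of condition a dependency graph would check.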

DATA TYPE BASED VISUAL PROFILING OF LARGE-SCALE DATABASE TABLES

A computer-implemented method can comprise establishing programmatic connections to a digitally stored first database comprising over one million records, each of the records comprising columns; reading a configuration file that specifies tables in the database; for each particular table, forming and submitting a plurality of queries to the database, each of the queries specifying data aggregation operations, and in response thereto, receiving result sets of records of the database; calculating metadata metrics that characterize columns of the records in the result sets and storing the metadata metrics in tables for string column statistics, numeric column statistics, and date column statistics, based upon a particular data type among different data types of the columns; and generating presentation instructions which, when rendered, cause display of one or more graphical visualizations in a graphical user interface.

Query and change propagation scheduling for heterogeneous database systems

Techniques are presented herein for efficient query processing and data change propagation at a secondary database system. The techniques involve determining execution costs for executing a query at a primary DBMS and for executing the query at an offload DBMS. The cost for executing the query at the offload DBMS includes the cost of propagating changes to database objects required by the query to the offload DBMS. Based on the execution cost, the query is sent to either the primary DBMS or the offload DBMS.
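
The routing decision can be sketched in Python as a comparison of two cost estimates. The linear propagation cost (`pending_changes * cost_per_change`), the function name `choose_dbms`, and the tie-breaking rule are assumptions for illustration; the abstract does not specify how the costs are computed.

```python
def choose_dbms(primary_cost, offload_exec_cost, pending_changes, cost_per_change):
    # Offload cost includes bringing the needed objects up to date first.
    propagation_cost = pending_changes * cost_per_change
    offload_cost = offload_exec_cost + propagation_cost
    # Ties go to the primary, which needs no propagation.
    return "offload" if offload_cost < primary_cost else "primary"
```

For example, a query costing 100 units on the primary and 20 on the offload system would still be sent to the primary if 50 pending changes at 2 units each had to be propagated first.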

DATABASE QUERY SPLITTING
20230072930 · 2023-03-09

A determination is made whether a received database query is to be processed by either a first database, a second database, or at least in part by both the first and second databases including by determining whether the query meets criteria to split the query for processing across the first and second databases. The first and second databases store shared synchronized records, the first database configured to store the records in a column-oriented format and the second database configured to store the records in a row-oriented format. In response to a determination that the query meets the criteria to split the query, a first and second component query of the database query are generated for the first and second databases, respectively, the second component query based at least in part on a result of the first component query. The execution of the first and second component queries is pipelined.
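
A minimal Python sketch of a split, pipelined query follows. It assumes the first component query filters on a single column in the column-oriented copy and the second fetches full records from the row-oriented copy; generators provide the pipelining, and `column_store_filter`, `row_store_fetch`, and the sample data are hypothetical.

```python
def column_store_filter(columns, column, predicate):
    # First component query: scan one column, yield matching row ids.
    for row_id, value in enumerate(columns[column]):
        if predicate(value):
            yield row_id


def row_store_fetch(rows, row_ids):
    # Second component query: consumes ids as they arrive (pipelined).
    for row_id in row_ids:
        yield rows[row_id]


# The same three records, stored once per layout.
rows = [
    {"id": 0, "price": 5, "name": "a"},
    {"id": 1, "price": 50, "name": "b"},
    {"id": 2, "price": 500, "name": "c"},
]
columns = {"price": [5, 50, 500]}

matching_ids = column_store_filter(columns, "price", lambda p: p > 10)
result = list(row_store_fetch(rows, matching_ids))
```

Because both stages are generators, the row store can start fetching the first record before the column scan has finished.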

Resource provisioning systems and methods

A method and apparatus for managing a set of processors for a set of queries are described. In an exemplary embodiment, a device receives a set of queries for a data warehouse, the set of queries including one or more queries to be processed by the data warehouse. The device further provisions a set of processors from a first plurality of processors, where the set of processors is to process the set of queries, and a set of storage resources to store data for the set of queries. In addition, the device monitors a utilization of the set of processors as the set of processors processes the set of queries. The device additionally updates a number of the processors in the set of processors provisioned based on the utilization. Furthermore, the device processes the set of queries using the updated set of processors.
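
The utilization-driven update step might look like the following Python sketch. The threshold values, the one-processor step size, and the name `updated_processor_count` are assumptions for illustration, not details taken from the abstract.

```python
def updated_processor_count(current, utilization, low=0.3, high=0.8,
                            minimum=1, maximum=64):
    """Return the new number of provisioned processors.

    utilization: observed fraction of busy time, between 0.0 and 1.0.
    """
    if utilization > high:
        target = current + 1   # overloaded: add a processor
    elif utilization < low:
        target = current - 1   # underused: release a processor
    else:
        target = current       # within the comfortable band
    return max(minimum, min(maximum, target))
```

Calling this function after each monitoring interval grows or shrinks the provisioned set gradually while keeping it within the configured bounds.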