Patent classifications
G06F16/24537
Sort optimization
A system and method for processing of queries including receiving a query including a set operation and a sort operation, wherein the set operation includes a first data structure and a second data structure and the sort operation requests a result set that is sorted based on a column or attribute of the first data structure and a column or attribute of the second data structure; generating a query plan in which a sort operation occurs prior to the set operation; determining a first, partial set of one or more resultant rows responsive to the query; sending the first, partial set of one or more resultant rows responsive to the query to a client; determining a second, partial set of one or more resultant rows responsive to the query; and sending the second, partial set of one or more resultant rows to the client.
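As a rough illustration of why sorting before the set operation allows partial results to be sent early, the following sketch sorts each input and then merges them lazily, yielding rows in batches. It assumes the set operation is a UNION ALL and that both inputs share one sort key; the function name, the batch size, and the sample rows are hypothetical and are not taken from the abstract.

```python
import heapq
from itertools import islice

def sorted_union_all(first, second, key, batch_size=2):
    """Yield batches of a UNION ALL result that are already in sort order."""
    # The sort is applied to each input before the set operation.
    left = sorted(first, key=key)
    right = sorted(second, key=key)
    # Merging two sorted inputs is lazy, so the first batch can be sent
    # to the client before the remaining rows have been produced.
    merged = heapq.merge(left, right, key=key)
    while True:
        batch = list(islice(merged, batch_size))
        if not batch:
            return
        yield batch

# Hypothetical rows of the form (sort_column, payload).
first_rows = [(3, "c"), (1, "a")]
second_rows = [(2, "b"), (4, "d")]
batches = list(sorted_union_all(first_rows, second_rows, key=lambda r: r[0]))
```

Each element of `batches` stands in for one partial set of resultant rows sent to the client.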
Merge small file consolidation
The subject technology receives a query plan corresponding to a query. The subject technology executes the query based at least in part on the query plan, the executing including: filtering a first set of files that are to be modified by a merge statement, performing a split operation to send information related to a second set of files to a scan set builder operation in a first portion of the query plan and a scan back operation in a second portion of the query plan, performing the scan set builder operation to remove the second set of files from the first set of files, performing a table scan operation based on a third set of files, and performing a first union all operation to combine a first set of data with a second set of data as a first set of combined data.
Predictive query improvement
The present approach relates to improving query performance in a database context. Examples of query improvement are described in the context of certain query patterns, one or more of which may be observed in a given query. When a given query pattern is observed, changes may be made to the query at the application or database level to improve performance of the respective query. Query improvements may be performed in a manner transparent to the user.
INDEX-BASED, ADAPTIVE JOIN SIZE ESTIMATION
Systems, methods, and computer media are described for index-based join size estimation. For a join operation between two tables, a filter is applied to the first table, resulting in a filter output. The filter output is then sampled. For each sample, an index for a second table is accessed and counts of records in the second table that match the sample are retrieved. Using the sample size and the retrieved counts from the index of the second table, a data size for the join operation can be efficiently and accurately estimated. Statistical confidence in the estimate can also be assessed using variance-based calculations.
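The estimation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the index on the second table is modeled as a plain mapping from join key to match count, the estimator scales the mean sampled count by the size of the filter output, and the standard error ignores any finite-population correction. All names are hypothetical.

```python
import random
import statistics
from collections import Counter

def estimate_join_size(first_table, predicate, join_key, second_index,
                       sample_size, seed=0):
    """Return (estimated join size, standard error of the estimate)."""
    # Step 1: apply the filter to the first table.
    filtered = [row for row in first_table if predicate(row)]
    if not filtered:
        return 0.0, 0.0
    # Step 2: sample the filter output.
    n = min(sample_size, len(filtered))
    sample = random.Random(seed).sample(filtered, n)
    # Step 3: for each sampled row, read the match count from the index.
    counts = [second_index.get(row[join_key], 0) for row in sample]
    # Step 4: scale the mean count up to the full filter output.
    estimate = len(filtered) * statistics.mean(counts)
    # Variance-based confidence: standard error of the scaled mean.
    std_err = len(filtered) * statistics.stdev(counts) / n ** 0.5 if n > 1 else 0.0
    return estimate, std_err

# Hypothetical data: the "index" maps a join key to its number of matches.
first_table = [{"id": i, "k": i % 3} for i in range(9)]
second_index = Counter([0, 1, 1, 2, 2, 2])
estimate, std_err = estimate_join_size(
    first_table, lambda row: row["k"] != 0, "k", second_index, sample_size=6
)
```

In this toy run the sample covers the whole filter output, so the estimate equals the true join size; with a smaller sample the standard error indicates how far the estimate may be off.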
RESOLVING INCOMPATIBLE COMPUTING SYSTEMS
Source data rendered as a string of hexadecimal data representing a set of Extended Binary Coded Decimal Interchange Code (EBCDIC) data, and a data layout description defining a record in the source data that includes a plurality of fields, are obtained. Respective hexadecimal lengths of the fields based on a source data length of each field and a source datatype of each field are determined. Hexadecimal sub-strings are extracted from the hexadecimal string based on the hexadecimal lengths and source datatypes of the fields. At least some of the hexadecimal sub-strings are converted to a target format. The sub-strings are output in the target format.
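A minimal sketch of the extraction and conversion steps, assuming only two source datatypes (EBCDIC character data and packed decimal), a layout given as `(name, length, datatype)` tuples, and code page 037 via Python's built-in `cp037` codec. The layout format, the field names, and the choice of code page are assumptions made for illustration.

```python
def hex_length(length, datatype):
    """Number of hex digits a field occupies in the source string."""
    if datatype == "char":
        return 2 * length                # one EBCDIC byte is two hex digits
    if datatype == "packed":
        return 2 * (length // 2 + 1)     # digits plus a sign nibble, in whole bytes
    raise ValueError(f"unsupported datatype: {datatype}")

def convert_field(hex_sub, datatype):
    """Convert one hexadecimal sub-string to a Python value."""
    if datatype == "char":
        return bytes.fromhex(hex_sub).decode("cp037").rstrip()
    # Packed decimal: every nibble is a digit except the last, which is the sign.
    digits, sign = hex_sub[:-1], hex_sub[-1].upper()
    value = int(digits)
    return -value if sign == "D" else value

def parse_record(hex_string, layout):
    """Split a hex string into fields according to the layout and convert each."""
    record, pos = {}, 0
    for name, length, datatype in layout:
        width = hex_length(length, datatype)
        record[name] = convert_field(hex_string[pos:pos + width], datatype)
        pos += width
    return record

# Hypothetical record: a 4-character name followed by a 3-digit packed amount.
layout = [("name", 4, "char"), ("amount", 3, "packed")]
source_hex = "ABC ".encode("cp037").hex() + "123d"
record = parse_record(source_hex, layout)
```

Here the target format is simply native Python strings and integers; a real converter would also need to handle zoned decimals, binary fields, and other code pages.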
Techniques for pushing joins into union all views
A query with a UNION ALL (UA) view is detected by a query optimizer. A query execution plan and cost for the query are obtained. The query is rewritten to push joins of the original query into the view. A query execution plan is generated for the rewritten query and a cost for executing the rewritten query is obtained. The lowest cost execution plan is selected for execution by a database engine of a database.
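The rewrite named in the title, pushing a join into the branches of a UNION ALL view, can be illustrated with a small SQLite session. The sketch only shows that the original and rewritten queries return the same rows; it does not model the optimizer's cost comparison, and the table, view, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_2023 (cust_id INTEGER, amount INTEGER);
CREATE TABLE sales_2024 (cust_id INTEGER, amount INTEGER);
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, name TEXT);
CREATE VIEW all_sales AS
    SELECT cust_id, amount FROM sales_2023
    UNION ALL
    SELECT cust_id, amount FROM sales_2024;
INSERT INTO sales_2023 VALUES (1, 10), (2, 20);
INSERT INTO sales_2024 VALUES (1, 30), (3, 40);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
""")

# Original form: the join is applied on top of the UNION ALL view.
original = """
SELECT c.name, s.amount
FROM all_sales s JOIN customers c ON c.cust_id = s.cust_id
"""

# Rewritten form: the join is pushed into each branch of the view.
rewritten = """
SELECT c.name, s.amount
FROM sales_2023 s JOIN customers c ON c.cust_id = s.cust_id
UNION ALL
SELECT c.name, s.amount
FROM sales_2024 s JOIN customers c ON c.cust_id = s.cust_id
"""

original_rows = sorted(conn.execute(original).fetchall())
rewritten_rows = sorted(conn.execute(rewritten).fetchall())
```

An optimizer following the abstract would cost both forms and execute whichever plan is cheaper.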
Runtime optimization of grouping operators
Runtime optimization of grouping operators is described. A system estimates a resource cost for each of multiple grouping operators based on values identified during query runtime, in response to receiving a query request associated with a data stream. The system selects a grouping operator during query runtime, based on a corresponding resource cost, from the multiple grouping operators. The selected grouping operator enables grouping the data stream based on the query request, and outputting a response based on the grouped data stream.
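One way to picture runtime selection between grouping operators is the sketch below, which probes the first rows of the stream, estimates a cost for a hash-based and a sort-based operator from the distinct keys seen, and runs the cheaper one. The cost model is a deliberately crude assumption (hash cost grows with distinct keys relative to a memory budget, sort cost is fixed), and every name here is hypothetical.

```python
from collections import Counter
from itertools import chain, groupby, islice

def hash_group(rows, key):
    """Count rows per key with a hash table (memory grows with distinct keys)."""
    return dict(Counter(key(r) for r in rows))

def sort_group(rows, key):
    """Count rows per key by sorting, then scanning runs of equal keys."""
    ordered = sorted(rows, key=key)
    return {k: sum(1 for _ in run) for k, run in groupby(ordered, key=key)}

OPERATORS = {"hash": hash_group, "sort": sort_group}

def estimate_costs(probe, key, hash_budget):
    """Toy cost model based on values observed at runtime."""
    distinct = len({key(r) for r in probe})
    return {"hash": distinct / hash_budget, "sort": 1.0}

def group_stream(stream, key, probe_size=100, hash_budget=10):
    """Pick a grouping operator from a runtime probe, then group the stream."""
    rows = iter(stream)
    probe = list(islice(rows, probe_size))
    costs = estimate_costs(probe, key, hash_budget)
    chosen = min(costs, key=costs.get)
    # The probed rows are chained back in so that no input is lost.
    return chosen, OPERATORS[chosen](chain(probe, rows), key)

identity = lambda v: v
few_keys = ["a", "b"] * 50
many_keys = [str(i) for i in range(200)]
chosen_few, counts_few = group_stream(few_keys, identity)
chosen_many, counts_many = group_stream(many_keys, identity)
```

Both operators return the same groups for the same input, so the choice affects only the resources used, not the response.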
QUERY PERFORMANCE
An approach is provided for improving query performance. A query is received whose execution includes a first join of tables having sets of records and includes a second join with a next table whose set of records is smaller than a set of transient records resulting from the first join. A threshold for a number of records in the next table is received. A first count of the transient records resulting from the first join is estimated. A second count of a number of records in the next table is determined. It is determined that the second count is less than the threshold. Based on the second count being less than the threshold and without using the first count, a query execution plan is generated to include a broadcast of the records in the next table to data slices without including a broadcast of the transient records.
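The decision described above reduces to a threshold check on the size of the next table. The following sketch shows that check in isolation; the plan representation is hypothetical, and the fallback branch is an assumption, since the abstract does not say what happens when the threshold is not met.

```python
def plan_second_join(next_table_count, threshold):
    """Return a toy plan fragment for the second join."""
    if next_table_count < threshold:
        # Small enough: broadcast the next table to every data slice and
        # leave the transient join result in place. The transient-record
        # estimate is not consulted on this path.
        return {"broadcast": "next_table", "consulted_transient_estimate": False}
    # Assumption: otherwise fall back to a decision that does use the estimate.
    return {"broadcast": None, "consulted_transient_estimate": True}

small_table_plan = plan_second_join(next_table_count=500, threshold=1000)
large_table_plan = plan_second_join(next_table_count=5000, threshold=1000)
```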
REMOTE DATASOURCE-BASED OPTIMIZATION OF PROCEDURE-BASED MULTI-DATASOURCE QUERIES
An approach is provided for optimizing multi-datasource queries in a networked computing environment. A procedure that contains a set of queries designed to access a specific set of data from a plurality of datasources in a virtualized hybrid storage environment (e.g., a virtualized hybrid cloud) is obtained. A set of mapped store procedures is created for the set of datasources referenced in the procedure. Each mapped store procedure includes a subset of queries that are applicable to a corresponding datasource from the set of queries in the procedure. These mapped store procedures are forwarded to the corresponding datasources for storage on those datasources. In response to a running of the procedure, execution of each mapped store procedure is commenced on the corresponding datasource on which it is stored.
System, method, and computer program for converting a natural language query to a structured database query
The present disclosure describes a system, method, and computer program for converting a natural language query to a structured database query. In response to receiving a natural language query for a database, an NLU model is applied to the query to identify an intent and entities associated with the query. The intent is mapped to a database object, and candidate query fields and operands are identified from the entities. The candidate query fields and operands are evaluated to identify any subject fields, conditional expressions, record count limit, and ordering/sorting criteria for the query. This includes matching certain query fields and operands based on query parameters, operand types, and locations of operands relative to query fields. A query plan is created based on the evaluation of the candidate query fields and operands, and a database query is generated from the query plan.