G06F16/24545

SEGMENT TREND ANALYTICS QUERY PROCESSING USING EVENT DATA
20230010139 · 2023-01-12

A method, system, and computer program product for conserving resources in segment trend analytics query processing using event data. A set of events of an entity is aggregated, sorted from earliest to latest, and sequentially processed so that a subset of the events is built up incrementally. A predicate function for determining segment membership is applied to the linear timeline of events in the subset, as represented by the time of the event being processed. A data record comprising an identification of the entity, the time, and the respective segment is generated and stored. Data records are aggregated by the segment identification and the time they contain, and at least one analytic measure over the entities identified in the aggregated records is calculated and stored. An indication of the at least one analytic measure calculated for a queried segment and time is returned, whereby determination of a trend of the segment is enabled.
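
The flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed method: the event layout `(time, amount)`, the `big_spender` predicate, and the choice of a distinct-entity count as the analytic measure are all assumptions made for the example.

```python
from collections import defaultdict


def segment_records(events_by_entity, predicates):
    """Return (entity, time, segment) records.

    events_by_entity: {entity_id: [(time, amount), ...]}  (assumed layout)
    predicates: {segment_name: function(list_of_events) -> bool}
    """
    records = []
    for entity, events in events_by_entity.items():
        subset = []  # grows by one event per step
        for event in sorted(events, key=lambda e: e[0]):  # earliest first
            subset.append(event)
            time = event[0]
            for segment, predicate in predicates.items():
                # The predicate sees the entity's timeline up to `time`.
                if predicate(subset):
                    records.append((entity, time, segment))
    return records


def count_entities(records):
    """Aggregate records by (segment, time) and count distinct entities."""
    members = defaultdict(set)
    for entity, time, segment in records:
        members[(segment, time)].add(entity)
    return {key: len(entities) for key, entities in members.items()}


def big_spender(subset):
    # Hypothetical predicate: cumulative amount of at least 100.
    return sum(amount for _, amount in subset) >= 100


events = {
    "u1": [(2, 60), (1, 50)],
    "u2": [(1, 30), (2, 20)],
    "u3": [(2, 150)],
}
records = segment_records(events, {"big_spender": big_spender})
measures = count_entities(records)
```

Looking up `measures.get(("big_spender", 2))` then answers a query for one segment at one time. Repeating the lookup over successive times gives the trend.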

QUERY GENERATION FROM EVENT-BASED PATTERN MATCHING

A set of queries from an application executing on a client computing device is obtained. A first database is searched based on the set of queries to select a set of event types. A set of predicted parameters associated with the set of event types is sent to the application. The application includes instructions to obtain a first parameter and the set of predicted parameters via a user interface of the application and to generate a message comprising the first parameter and an indicator identifying the set of predicted parameters. The first parameter and the indicator are obtained via the message. A combined query including the first parameter and the set of predicted parameters is generated in response to obtaining the indicator. A vehicle record is obtained from a vehicle database based on the combined query. Values of the vehicle record are sent to the client computing device.

GENERATING A SUBQUERY FOR AN EXTERNAL DATA SYSTEM USING A CONFIGURATION FILE
20230214386 · 2023-07-06

Systems and methods are disclosed for receiving, at a data intake and query system, a query that includes an indication to process data managed by a third-party data storage and processing system that supports a different query language than the data intake and query system. The data intake and query system identifies a third-party data storage and processing system that manages the data to be processed and generates a subquery for execution by the third-party data storage and processing system, generates instructions for one or more worker nodes to receive and process results of the subquery from the third-party data storage and processing system, and instructs the worker nodes to provide results of the processing to the data intake and query system.

COST-BASED SEMI-JOIN REWRITE

A method, apparatus, and computer program product for executing a relational database management system (RDBMS) in a computer system, wherein the RDBMS manages a relational database comprised of one or more tables storing data. The RDBMS executes a query with a semi-join operation comprising an inclusion join and/or an exclusion join performed against at least an outer table and an inner table, wherein the inclusion join returns a row from the outer table when there is a match with a row in the inner table, and the exclusion join returns a row from the outer table when there is no match with a row in the inner table. The RDBMS performs a rewrite of the query to avoid spooling and/or sorting of the inner table, when the inner table is larger than the outer table and a cost after the rewrite is lower than before the rewrite.
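
The two semi-join flavours and the rewrite condition stated in this abstract can be illustrated with a small sketch. Rows are modelled as Python dicts and the function names are hypothetical. The hash-set lookup is one simple way to avoid sorting the inner table, not necessarily the rewrite the abstract has in mind. SQL NULL semantics are ignored.

```python
def inclusion_join(outer, inner, key):
    """Return each outer row that has at least one match in inner."""
    inner_keys = {row[key] for row in inner}  # one pass, no sort needed
    return [row for row in outer if row[key] in inner_keys]


def exclusion_join(outer, inner, key):
    """Return each outer row that has no match in inner."""
    inner_keys = {row[key] for row in inner}
    return [row for row in outer if row[key] not in inner_keys]


def should_rewrite(outer_rows, inner_rows, cost_before, cost_after):
    """Gate from the abstract: inner is larger and the rewrite is cheaper."""
    return inner_rows > outer_rows and cost_after < cost_before


outer = [{"id": 1}, {"id": 2}, {"id": 3}]
inner = [{"id": 2}, {"id": 2}, {"id": 4}]

matched = inclusion_join(outer, inner, "id")
unmatched = exclusion_join(outer, inner, "id")
```

Note that the inclusion join returns the outer row with `id` 2 only once even though the inner table matches it twice, which is what distinguishes a semi-join from an ordinary inner join.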

Access path optimization

A computer-implemented method for access path optimization is provided according to embodiments of the present disclosure. In the method, a plurality of real values of an access path factor may be collected during a specified time period. One of the real values may be generated when a query is executed on a first access path. Then, at least one second access path may be generated for the query based on the plurality of real values of the access path factor. Moreover, an optimal access path for the query may be identified from the first access path and the at least one second access path.
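
One way to picture this feedback loop is the sketch below. It assumes the access path factor is a filter factor (the fraction of rows a predicate keeps), and it uses two toy cost formulas. The path names, the formulas, and the `per_row_lookup` constant are illustrative assumptions, not details from the abstract.

```python
from statistics import mean


def table_scan_cost(total_rows, filter_factor):
    # Toy model: a full scan reads every row regardless of selectivity.
    return float(total_rows)


def index_scan_cost(total_rows, filter_factor, per_row_lookup=4.0):
    # Toy model: an index scan pays a lookup for each qualifying row.
    return total_rows * filter_factor * per_row_lookup


COST_MODELS = {"table_scan": table_scan_cost, "index_scan": index_scan_cost}


def choose_access_path(first_path, total_rows, observed_filter_factors):
    """Re-cost the candidates with observed values and pick the cheapest.

    The first path is kept unless another path is strictly cheaper.
    """
    factor = mean(observed_filter_factors)  # summary of the real values
    costs = {name: cost(total_rows, factor) for name, cost in COST_MODELS.items()}
    best = min(costs, key=costs.get)
    return best if costs[best] < costs[first_path] else first_path


selective = choose_access_path("table_scan", 1000, [0.01, 0.02, 0.03])
unselective = choose_access_path("index_scan", 1000, [0.6, 0.5, 0.7])
```

Keeping the first path on ties is a design choice made here to avoid switching plans without a measurable benefit.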

Query processing in a polystore

A method may include generating, based at least on an analysis plan, a logical plan, the analysis plan specifying one or more operations performed on data stored in a polystore that includes a first database management system and a second database management system. The logical plan may include a sequence of logical operators corresponding to the operations specified by the analysis plan. The generating of the logical plan may include rewriting the sequence of logical operators by at least reordering, replacing, and/or combining one or more logical operators in the sequence of logical operators. Candidate physical plans may be generated based on the logical plan. The analysis plan may be executed based on a physical plan selected from the candidate physical plans. Related systems and articles of manufacture are also provided.
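
The rewriting step (reordering and combining logical operators) can be illustrated with a small sketch. The operator encoding as `(name, argument)` tuples and the two rules shown are assumptions chosen for the example. The abstract does not specify which rewrite rules are used, and the generation of physical plans is not covered here.

```python
def rewrite(plan):
    """Rewrite a linear logical plan given as a list of (op, arg) tuples.

    Rule 1 (combine): two adjacent filters become one filter.
    Rule 2 (reorder): a filter directly after a sort moves before it,
    so the sort handles fewer rows.
    """
    ops = list(plan)
    changed = True
    while changed:
        changed = False
        for i in range(len(ops) - 1):
            first, second = ops[i], ops[i + 1]
            if first[0] == "filter" and second[0] == "filter":
                ops[i:i + 2] = [("filter", first[1] + second[1])]
                changed = True
                break
            if first[0] == "sort" and second[0] == "filter":
                ops[i], ops[i + 1] = second, first
                changed = True
                break
    return ops


logical_plan = [
    ("scan", "orders"),
    ("sort", "date"),
    ("filter", ("amount > 10",)),
    ("filter", ("region = 'EU'",)),
]
rewritten = rewrite(logical_plan)
```

Both rules preserve the result: filtering before or after a sort yields the same ordered rows, and two consecutive filters are equivalent to one filter with both predicates.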

Managed tuning for data clouds

Implementations described herein relate to systems and methods to configure a data warehouse system. In some implementations, a method includes: obtaining, by a configuration management system, historical query workload metadata associated with a data warehouse from the data warehouse system; determining a first configuration setting associated with a configurable parameter for a first time period, wherein the first configuration setting is associated with a computing resource utilization at the data warehouse system different from a previous configuration setting; transmitting, to the data warehouse system, the first configuration setting for the configurable parameter; receiving, from the data warehouse system, during the first time period, query workload metadata; determining whether the query workload metadata meets a threshold performance; and, based on a determination that the query workload metadata does not meet the threshold performance, transmitting a backoff configuration setting for the configurable parameter to the data warehouse system.
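
The try-then-back-off cycle in this abstract can be sketched as below. `FakeWarehouse`, its latency model, the `cluster_size` parameter, and the p95 latency threshold are all hypothetical stand-ins for illustration. No real data warehouse API is implied.

```python
class FakeWarehouse:
    """Hypothetical stand-in for a data warehouse; not a real client API."""

    def __init__(self, cluster_size):
        self.cluster_size = cluster_size
        self.applied = []  # history of settings sent to the warehouse

    def apply_setting(self, cluster_size):
        self.cluster_size = cluster_size
        self.applied.append(cluster_size)

    def collect_metrics(self):
        # Toy model: latency shrinks as the cluster grows.
        return {"p95_latency_ms": 1200 / self.cluster_size}


def tune_for_period(warehouse, candidate, backoff, max_p95_latency_ms):
    """Try `candidate`; send `backoff` if the threshold is missed."""
    warehouse.apply_setting(candidate)
    metrics = warehouse.collect_metrics()  # workload metadata for the period
    if metrics["p95_latency_ms"] > max_p95_latency_ms:
        warehouse.apply_setting(backoff)
        return backoff
    return candidate


wh = FakeWarehouse(cluster_size=4)
kept = tune_for_period(wh, candidate=2, backoff=4, max_p95_latency_ms=500)
```

Here the cheaper setting (size 2) misses the 500 ms threshold in the toy model, so the sketch sends the backoff setting (size 4).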

USING QUERY LOGS TO OPTIMIZE EXECUTION OF PARAMETRIC QUERIES

The present disclosure relates to systems, methods, and computer-readable media for optimizing selection of a cached execution plan to use in processing a parametric query. For example, systems described herein involve training a plan selection model that uses machine learning to identify an execution plan from a set of pre-selected execution plans based on the predicted cost of executing a query instance in accordance with the selected execution plan (e.g., relative to predicted costs of executing the query instance using other pre-selected execution plans). This application describes features related to lowering the costs associated with selecting the execution plan in a way that becomes more accurate over time as the plan selection model is trained and refined.
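
A minimal sketch of choosing among cached plans by predicted cost follows. A per-plan least-squares line fitted to logged `(selectivity, cost)` pairs is used here as a simple stand-in for the plan selection model. The log format, the plan names, and the single selectivity feature are assumptions, not details from the abstract.

```python
from collections import defaultdict


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def train(query_log):
    """query_log: iterable of (plan_name, selectivity, observed_cost)."""
    by_plan = defaultdict(list)
    for plan, selectivity, cost in query_log:
        by_plan[plan].append((selectivity, cost))
    return {plan: fit_line(points) for plan, points in by_plan.items()}


def select_plan(models, selectivity):
    """Pick the cached plan with the lowest predicted cost."""
    def predicted(plan):
        slope, intercept = models[plan]
        return slope * selectivity + intercept
    return min(models, key=predicted)


log = [
    ("index_plan", 0.1, 20.0), ("index_plan", 0.4, 80.0),
    ("scan_plan", 0.1, 50.0), ("scan_plan", 0.4, 50.0),
]
models = train(log)
low = select_plan(models, 0.05)   # few rows qualify
high = select_plan(models, 0.9)   # most rows qualify
```

Retraining `models` as new log entries arrive is what would let such a selector improve over time. A production model would likely use more features than one selectivity value.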

DECENTRALIZED QUERY EVALUATION FOR A DISTRIBUTED GRAPH DATABASE

The disclosed technologies are capable of decentralized query evaluation for a distributed graph database. In one technique, a query is divided into first and second sets of operations. The query comprises variables and constraints that correspond to at least two nodes and at least one edge of a graph in a graph database. The first set of operations for processing the query is assigned to multiple shards. A limit is communicated to the shards. The second set of operations for processing the query is executed. A list of completed operations is received from each shard. The lists of operations received from the shards are merged into a merged set of operations, which is used to determine whether query processing is finished. If query processing is not finished, then an updated limit is communicated to the shards; otherwise, query results are provided in response to the query.

Using machine learning to estimate query resource consumption in MPPDB

Methods and apparatus are provided for using machine learning to estimate query resource consumption in a massively parallel processing database (MPPDB). In various embodiments, the machine learning may jointly perform query resource consumption estimation and resource extreme-event detection, utilize an adaptive kernel that is configured to learn the best-fitting similarity relation metric for the data from each system setting, and utilize multi-level stacking technology configured to leverage the outputs of diverse base classifier models. Advantages and benefits of the disclosed embodiments include providing faster and more reliable system performance and avoiding resource issues such as out of memory (OOM) occurrences.