G06F16/24539

Platform agnostic query acceleration

Implementations described herein relate to systems and methods to provide platform-agnostic query acceleration. In some implementations, a method includes receiving, at a processor associated with a query acceleration service, a request from a client application, wherein the request conforms to a particular wire protocol of a plurality of supported wire protocols, and wherein the request includes header data and body content data; analyzing the request to identify at least one of a query and a command in the body content data; determining, from one or more query acceleration models, an optimal matched model; rewriting the query based on the optimal matched model; transmitting the rewritten query to a query processing platform; receiving a response to the rewritten query or the query from the query processing platform; and transmitting the received response to the client application, wherein the transmission is configured based on the particular wire protocol.
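The flow described above can be sketched in a few lines: parse the query out of the request body, match it against an acceleration model, and rewrite it before forwarding. This is a minimal illustration only; the names (`AccelerationModel`, `handle_request`) and the substring-based matching are assumptions, not the patented method.

```python
# Hedged sketch of the request-handling flow: match the extracted query
# against acceleration models (here, simple materialized-view substitutions)
# and rewrite it before forwarding to the query processing platform.

from dataclasses import dataclass

@dataclass
class AccelerationModel:
    pattern: str      # query fragment the model can serve
    rewrite: str      # accelerated replacement (e.g. a materialized view)

MODELS = [
    AccelerationModel("SELECT * FROM orders", "SELECT * FROM orders_mv"),
]

def match_model(query: str):
    """Return the first model whose pattern matches the query, if any."""
    for model in MODELS:
        if model.pattern in query:
            return model
    return None

def handle_request(wire_protocol: str, body: str) -> str:
    """Rewrite the query in `body` when a model matches; else pass it through."""
    model = match_model(body)
    rewritten = body.replace(model.pattern, model.rewrite) if model else body
    # A real service would serialize the response under `wire_protocol`;
    # here we only tag the result to show protocol-specific framing.
    return f"[{wire_protocol}] {rewritten}"
```

A real implementation would parse each supported wire protocol's framing rather than treat the body as plain text.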

INTELLIGENT QUERY PLAN CACHE SIZE MANAGEMENT

A method for intelligent query plan cache size management can be implemented. During execution of a plurality of incoming queries in a database management system, the method can measure actual compilation times of generating query execution plans for the plurality of incoming queries. The database management system can have a query execution plan cache whose size allows it to store at least some of the query execution plans. The method can monitor differences between the actual compilation times and ideal compilation times of generating query execution plans for the plurality of incoming queries. The ideal compilation times can be estimated by assuming no query execution plan is evicted from the query execution plan cache. The method can adjust the size of the query execution plan cache based on the monitored differences.
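The feedback loop above can be sketched as follows: compare actual compilation times against the ideal (no-eviction) estimate, and grow the cache when evictions are forcing costly recompilation. The thresholds and the adjustment step are illustrative assumptions, not values from the abstract.

```python
# Minimal sketch of the cache-sizing feedback loop: the relative overhead of
# plan evictions drives growth or shrinkage of the plan cache.

def adjust_cache_size(cache_size: int, actual_times: list, ideal_times: list,
                      grow_threshold: float = 0.2, step: int = 16) -> int:
    """Return a new cache size based on monitored compilation-time overhead."""
    actual, ideal = sum(actual_times), sum(ideal_times)
    if ideal == 0:
        return cache_size
    overhead = (actual - ideal) / ideal   # relative cost of plan evictions
    if overhead > grow_threshold:
        return cache_size + step          # evictions hurt: grow the cache
    if overhead == 0 and cache_size > step:
        return cache_size - step          # no evictions observed: shrink
    return cache_size
```

In practice the adjustment would likely be bounded by available memory and smoothed over many monitoring windows.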

IN-MEMORY DATABASE (IMDB) ACCELERATION THROUGH NEAR DATA PROCESSING
20230027648 · 2023-01-26 ·

An accelerator is disclosed. The accelerator may include an on-chip memory to store data from a database. The on-chip memory may include a first memory bank and a second memory bank. The first memory bank may store the data, which may include a first value and a second value. A computational engine may execute, in parallel, a command on the first value in the data and the command on the second value in the data in the on-chip memory. The on-chip memory may be configured to load second data from the database into the second memory bank in parallel with the computational engine executing the command on the first value in the data and executing the command on the second value in the data.
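The core idea is classic double buffering: compute on one bank while the next batch loads into the other. The sketch below illustrates the overlap with a Python thread standing in for the DMA-style load; the bank layout and the command are assumptions for the example.

```python
# Sketch of the double-buffering pattern: while the compute engine applies a
# command to values in one bank, the next batch is loaded into the other bank.

import threading

def run_accelerator(batches, command):
    """Process batches of values with compute/load overlap across two banks."""
    banks = [batches[0], None]            # bank 0 pre-loaded with first batch
    results = []
    for i, _ in enumerate(batches):
        cur = i % 2
        nxt = (i + 1) % 2
        loader = None
        if i + 1 < len(batches):
            # Load the next batch into the other bank, in parallel with compute.
            def load(slot=nxt, data=batches[i + 1]):
                banks[slot] = data
            loader = threading.Thread(target=load)
            loader.start()
        # Stand-in for parallel command execution over the current bank.
        results.extend(command(v) for v in banks[cur])
        if loader:
            loader.join()
    return results
```

On real hardware the per-value executions would also run in parallel lanes; the thread here models only the load/compute overlap.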

Iterative data processing

Data is processed iteratively by a database system with a first cache storing key-value data which resulted from previous iterations of processing input data and a second cache storing aggregated data which resulted from previous iterations of processing key-value data stored in the first cache. In a current iteration, the database system receives further input data related to the input data of the previous iterations, transforms the further input data into further key-value data and stores the further key-value data in the first cache in addition to the stored key-value data which resulted from previous iterations. The database system further processes the further key-value data and the aggregated data stored in the second cache to form updated aggregated data, and stores the updated aggregated data in the second cache for usage in further iterations. The database system also provides the updated aggregated data to at least one client.
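The two-cache iteration above can be made concrete with a small sketch: new input is transformed into key-value pairs appended to the first cache, then folded into running aggregates kept in the second cache. Word counting stands in for the transformation and aggregation, which the abstract leaves unspecified.

```python
# Minimal sketch of iterative processing with a key-value cache (all
# iterations' transformed input) and an aggregate cache (running totals).

class IterativeProcessor:
    def __init__(self):
        self.kv_cache = []      # first cache: key-value data, all iterations
        self.agg_cache = {}     # second cache: aggregated data

    def iterate(self, input_text: str) -> dict:
        """Run one iteration over further input and return updated aggregates."""
        further_kv = [(word, 1) for word in input_text.split()]
        self.kv_cache.extend(further_kv)       # keep prior iterations' data
        for key, value in further_kv:          # fold into the aggregates
            self.agg_cache[key] = self.agg_cache.get(key, 0) + value
        return dict(self.agg_cache)            # provided to the client
```

Note that each iteration only touches the new key-value pairs; the second cache spares the system from re-aggregating the full history.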

SYSTEMS AND METHODS FOR DATA MANAGEMENT AND QUERY OPTIMIZATION
20230229658 · 2023-07-20 ·

A central node can: receive a query comprising at least one parameter comprising a time range of a dataset stored in a cloud storage system; transmit one or more of the query parameters comprising the time range to a metadata service; receive from the metadata service a list of files related to the query; and assign to each processing node of a plurality of processing nodes a subset of the files. Each processing node can: determine that the subset is not stored on a cache; retrieve the subset not stored on the cache from the cloud storage system; store the retrieved subset in a local memory; scan the subset stored in the local memory for data matching the at least one parameter to generate a subset of query results; and concurrently copy, using a thread separate from the scanning, the subset stored in the local memory to the cache.
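The split of responsibilities can be sketched as follows, with dictionaries standing in for the metadata service, cloud storage, and cache. The round-robin assignment and the predicate-based scan are illustrative assumptions.

```python
# Hedged sketch: a central node partitions the metadata service's file list
# across processing nodes; each node fills local memory from cache or cloud
# storage, scans for matches, and back-fills the cache on a separate thread.

import threading

def central_assign(files, num_nodes):
    """Round-robin the file list across processing nodes."""
    return [files[i::num_nodes] for i in range(num_nodes)]

def process_subset(subset, cloud, cache, predicate):
    """One processing node's work over its assigned file subset."""
    local = {}
    for f in subset:
        # Cache miss: retrieve from cloud storage; hit: reuse cached rows.
        local[f] = cache[f] if f in cache else cloud[f]
    # Copy to the cache on a separate thread, concurrent with the scan below.
    t = threading.Thread(target=cache.update, args=(local,))
    t.start()
    results = [row for rows in local.values() for row in rows if predicate(row)]
    t.join()
    return results
```

The separate copy thread mirrors the abstract's point that cache population should not delay result generation.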

UPDATING SHARED AND INDEPENDENT MATERIALIZED VIEWS IN A MULTI-TENANT ENVIRONMENT
20230021006 · 2023-01-19 ·

Shared materialized views are maintained during data changes to the primary data and during creation of new materialized views. Shared data stored for use by shared materialized views is distinguished from data stored by an independent materialized view. A view selector manages data updates to shared materialized views and a corresponding mapping table. The view selector directs movement of data between a shared materialized view and an independent materialized view through the lifecycle of the materialized views.
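One reading of the view selector's role is sketched below: a mapping table records whether each materialized view is backed by shared data or its own independent copy, and the selector moves a view between the two as its lifecycle dictates. The class and method names are illustrative, not from the abstract.

```python
# Hedged sketch of a view selector with a mapping table distinguishing
# shared-backed views from views holding independent copies of their data.

class ViewSelector:
    def __init__(self):
        self.mapping = {}            # view name -> "shared" | "independent"
        self.shared_data = {}        # data stored once, shared by many views
        self.independent_data = {}   # per-view copies

    def create_shared(self, view, data_key, data):
        """Register a view against a shared data entry (stored only once)."""
        self.mapping[view] = "shared"
        self.shared_data.setdefault(data_key, data)

    def make_independent(self, view, data_key):
        """Move a view off the shared data onto its own independent copy."""
        self.independent_data[view] = list(self.shared_data[data_key])
        self.mapping[view] = "independent"
```

Sharing keeps storage costs down across tenants; the independent copy path exists for views whose data diverges from the shared entry.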

Autonomic caching for in memory data grid query processing

A method, system and computer program product for autonomic caching in an IMDG is provided. A method for autonomic caching in an IMDG includes receiving from a client of the IMDG a request for a primary query in the IMDG. The method also includes associating the primary query with a previously requested sub-query related to the primary query. Finally, the method includes directing the sub-query concurrently with the directing of the primary query without waiting to receive a request for the sub-query from the client. In this way, the method can proactively predict a receipt of the request for a sub-query following a request for a primary query prior to the actual receipt of the request for the sub-query.
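The proactive pattern can be sketched in a few lines: when a primary query arrives, a learned association is consulted and the related sub-query is dispatched concurrently, before the client asks for it. The association table and the grid execution stub are simplified stand-ins.

```python
# Sketch of proactive sub-query dispatch: the sub-query associated with a
# primary query is directed in parallel, ahead of the client's request for it.

import threading

ASSOCIATIONS = {"get_order": "get_order_lines"}   # learned from past requests

def execute(query):
    return f"result:{query}"                      # stand-in for grid execution

def handle_primary(query, results):
    """Run the primary query and, concurrently, its predicted sub-query."""
    sub = ASSOCIATIONS.get(query)
    t = None
    if sub is not None:
        # Direct the sub-query without waiting for the client to request it.
        t = threading.Thread(target=lambda: results.update({sub: execute(sub)}))
        t.start()
    results[query] = execute(query)
    if t:
        t.join()
    return results
```

When the client's sub-query request does arrive, its result is already cached, which is the latency win the abstract describes.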

Data access policy management

A method for automated data access management can include creating a project that manages data access to data sources by a plurality of users, wherein each user has user attributes indicating data access policies for the data sources. The method can also include performing project equalization for the project, wherein the project equalization determines a set of user attributes shared by the users. Additionally, the method can include modifying the user attributes of each user for the project, wherein the user attributes of each user are modified to conform to the set of user attributes determined by the project equalization, and detecting a query to retrieve data from a data source. The method can include modifying the query to produce a modified query by applying the modified user attributes associated with the project to the query and retrieving the data from the data source based on the modified query.
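Project equalization and the query rewrite can be sketched as below. Set intersection is one natural reading of "a set of user attributes shared by the users"; the attribute names and the SQL-style filter are illustrative assumptions.

```python
# Sketch of project equalization (attributes shared by all members) and of
# rewriting a detected query to enforce the equalized access policy.

def equalize(project_users: dict) -> set:
    """Attributes shared by all users in the project."""
    attr_sets = [set(a) for a in project_users.values()]
    return set.intersection(*attr_sets) if attr_sets else set()

def modify_query(query: str, shared_attrs: set) -> str:
    """Apply the equalized attributes to a query as an access filter."""
    regions = sorted(shared_attrs)
    clause = " AND region IN (" + ", ".join(repr(r) for r in regions) + ")"
    return query + clause
```

Conforming every member to the shared attribute set means no query run under the project can reach data that any single member is barred from.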

CLUSTERING AND COMPACTION OF MATERIALIZED VIEWS ON A DATABASE SYSTEM

Methods, systems, and computer programs are presented for providing a cluster view method for a database to perform compaction and clustering of database objects, such as database materialized views. A cluster view system identifies a materialized view including data from one or more base tables, a portion of the data of the materialized view including stale data. The cluster view system performs an integrated task within a maintenance operation on a database, the integrated task including compacting the materialized view, the maintenance operation including clustering the materialized view, and stores the compacted and clustered materialized view in the database.
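The integrated task can be illustrated in miniature: during one maintenance pass over a materialized view, stale rows are compacted away and the surviving rows are re-clustered (sorted here by a cluster key). The row shape and the staleness flag are assumptions for the example.

```python
# Sketch of the integrated maintenance task: compaction (drop stale rows)
# and clustering (reorder by cluster key) in a single pass over the view.

def compact_and_cluster(view_rows, cluster_key):
    """One maintenance pass: drop stale rows, then cluster the remainder."""
    live = [row for row in view_rows if not row.get("stale", False)]  # compaction
    return sorted(live, key=lambda row: row[cluster_key])             # clustering
```

Folding compaction into the clustering pass avoids scanning and rewriting the view twice, which is the efficiency point of integrating the two.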

SYSTEM AND METHOD FOR TRANSFERRABLE DATA TRANSFORMATIONS
20230019634 · 2023-01-19 ·

This invention enables users to work with large datasets that are available from data producers, transforming the data into meaningful information whose derivation may later be easily comprehended. Users can build queries by applying transformation functions to the datasets. These queries can be saved and used to build further queries, and queries can be saved and visualized, creating a clear and comprehensible record of data transformations. Inferences are applied to datasets and parameters so that transformations are processed with minimal errors. Limited multiprocessing is implemented on each server on which queries are performed, increasing processing speeds. A graph database of relationships between raw data and queries is used to ensure that queries are performed on updated data. These solutions lead to greater processing efficiency even when datasets tend to be enormous and subject to frequent updates.
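A sketch in the spirit of the abstract: each saved query is a named chain of transformation functions, a saved query can seed a further query, and a dependency graph records what each query derives from, so its lineage can be traced when data updates. All of the structures and names here are illustrative.

```python
# Hedged sketch of saved, composable queries with a recorded dependency graph
# linking each query to the query or raw dataset it builds on.

SAVED = {}               # name -> list of transformation functions
DEPENDS_ON = {}          # name -> parent query or dataset (the "graph")

def save_query(name, transforms, parent="raw"):
    """Save a query as the parent's transformation chain plus new steps."""
    base = SAVED.get(parent, [])
    SAVED[name] = base + list(transforms)
    DEPENDS_ON[name] = parent

def run_query(name, dataset):
    """Apply the saved chain of transformations in order."""
    for fn in SAVED[name]:
        dataset = fn(dataset)
    return dataset
```

Because the full chain is recorded, re-running a query against refreshed raw data reproduces its derivation, which is the comprehensibility property the abstract emphasizes.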