G06F16/24539

AD HOC DATA EXPLORATION TOOL

The disclosed application relates to a tool by which a user may create a cloud workspace that includes a data memory space, as well as a tool for automatically identifying ad-hoc analyses on that data. The solution allows a user to connect to data sources using SQL or GUI tools, combine data from different data sources, prepare and clean the data, mine the data for insights, and move that data into downstream reporting tools for visualization. The system is linked to a code repository to allow data scientists to execute code from the code repository in trial data spaces, investigate that data, and prepare more in-depth analytics for downstream reporting tools.

Systems and methods for distributed architectures to provide scalable infrastructure in repricing systems

A query processing server for providing scalable architectures in repricing is provided. The query processing server includes a gateway platform for processing communications and a repricing engine. The query processing server processor is configured to receive a repricing request from a frontend server. The query processing server processor is also configured to define a set of repricing periods. The query processing server processor is further configured to transmit a creation instruction to the repricing database server to create a repricing status database and queue. The query processing server processor is also configured to determine a set of claims data queries for execution based on the set of scenario data and to query the repricing database server with the set of claims data queries. The query processing server processor is configured to perform a repricing analysis using the set of claims data responses and the set of repricing periods.

Automatic generation of materialized views

Definitions of materialized views (MVs) are automatically generated. In general, automated MV generation identifies a set of candidate MVs by examining a working set of query blocks. Once the candidates are formed, they are further evaluated to calculate the benefit of each candidate MV. An improved approach for generating a candidate set of MVs is described herein. The improved approach is referred to as the extended covering subexpression technique (ECSE). Under ECSE, various relationships between join sets other than strict equivalence are used to generate new resultant join sets. Such relationships include subset, intersection, superset, and union, which shall be described in further detail below. In some cases, relationships among resultant join sets and initial join sets are considered to generate new resultant join sets. The final resultant join sets are then used to form a candidate set of MVs.
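
The idea of deriving new join sets from non-equivalence relationships can be illustrated with a minimal Python sketch. It is not the patented ECSE procedure: it assumes join sets are modeled as sets of table names, considers only pairwise intersection and union, and uses a hypothetical function name and sample tables.

```python
from itertools import combinations


def derive_join_sets(initial):
    """Return the initial join sets plus pairwise intersections and unions.

    Illustrative assumption: a join set is a frozenset of table names.
    """
    initial = {frozenset(js) for js in initial}
    derived = set(initial)
    for a, b in combinations(initial, 2):
        common = a & b
        if len(common) >= 2:
            # A shared sub-join needs at least two tables to be a join.
            derived.add(common)
        if common:
            # Only union join sets that share at least one table.
            derived.add(a | b)
    return derived


# Hypothetical join sets taken from two query blocks.
query_join_sets = [
    {"sales", "customers", "products"},
    {"sales", "customers", "stores"},
]
candidates = derive_join_sets(query_join_sets)
```

Here the shared join of `sales` and `customers` becomes a candidate even though neither query block uses exactly that join set.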

Search time estimate in a data intake and query system

Systems and methods are described for determining a query execution time in a data intake and query system. The system parses a query to identify different portions of the query that are executed by different components of the data intake and query system. The system determines a query execution time for each of the different portions of the query based on the corresponding component. Based on the query execution times of the different portions of the query, the system determines a query execution time for the query as a whole.
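
A minimal sketch of this per-portion estimate follows. It assumes a pipe-delimited query syntax, a fixed per-component cost, and that the overall estimate is the sum of the portion estimates; the abstract specifies none of these. The component names, command table, and timings are hypothetical.

```python
# Hypothetical mapping from query command to the component that runs it.
COMPONENT_FOR_COMMAND = {
    "search": "indexer",
    "stats": "search_head",
    "sort": "search_head",
}

# Hypothetical per-portion cost, in seconds, for each component.
SECONDS_PER_PORTION = {"indexer": 2.0, "search_head": 0.5}


def estimate_query_time(query):
    """Split a query into portions and sum the per-component estimates."""
    portions = [p.strip() for p in query.split("|") if p.strip()]
    per_portion = []
    for portion in portions:
        command = portion.split()[0]
        component = COMPONENT_FOR_COMMAND[command]
        per_portion.append((portion, component, SECONDS_PER_PORTION[component]))
    total = sum(seconds for _, _, seconds in per_portion)
    return per_portion, total


breakdown, total_seconds = estimate_query_time("search error | stats count | sort count")
```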

Storage level parallel query processing

Storage level query processing may be implemented for processing database queries. Nodes that can access a database may perform parallel processing for at least a portion of a database query. An indication may be received that specifies parallel processing for the database query. The nodes can then be caused to perform that portion of the query, in place of the node that received the database query (such as a query engine node), as part of providing a result in response to the database query.

Automatic derivation of shard key values and transparent multi-shard transaction and query support

Techniques are provided for processing a database command in a sharded database. The processing of the database command may include generating or otherwise accessing a shard key expression, and evaluating the shard key expression to identify one or more target shards that contain data used to execute the database command.
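
As a minimal sketch of evaluating a shard key expression to find target shards, the example below assumes hash-based sharding over a fixed number of shards and models the shard key expression as a plain function of the command parameters. Every name here is hypothetical and not taken from the source.

```python
import hashlib

NUM_SHARDS = 4  # assumed fixed shard count


def shard_for(key_value):
    """Map one shard key value to a shard index with a stable hash."""
    digest = hashlib.sha256(str(key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def target_shards(shard_key_expression, command_params):
    """Evaluate the expression, then collect the shards holding those keys."""
    key_values = shard_key_expression(command_params)
    return {shard_for(value) for value in key_values}


def customer_key_expression(params):
    """Hypothetical expression: the shard key values are the customer ids."""
    return params["customer_ids"]


shards = target_shards(customer_key_expression, {"customer_ids": [101, 202, 303]})
```

A command that touches a single key resolves to one shard, while a command that spans several keys may resolve to a set of shards that all need to take part.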

Efficient set operation execution on streaming data using sketches
11609915 · 2023-03-21

The present disclosure relates to a method for responding to a query that requests an intersection. The method includes receiving a query referencing a first set, a second set, and a desired quantile related to the first set from among a plurality of quantiles; generating a data structure including a bottom-k sketch of user identifiers (ids) of the first set and corresponding numerical values of the first data; partitioning the data structure into a plurality of sketches to correspond to the quantiles, respectively; determining an intersection of one of the sketches associated with the desired quantile and a sketch of the second set; and responding to the query based on the intersection.
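
The steps in this abstract can be sketched in Python under several assumptions: both sketches use the same SHA-256-based hash, quantile buckets are equal-sized slices of the sketch ordered by value, and k exceeds the size of both sets so the sample result is exact. Function names and data are hypothetical, and estimating the true intersection size from truncated sketches is not shown.

```python
import hashlib


def hash_id(user_id):
    """Stable hash shared by every sketch so that they are comparable."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def bottom_k(ids, k):
    """Bottom-k sketch: keep the k ids with the smallest hash values."""
    return set(sorted(ids, key=hash_id)[:k])


def partition_by_quantile(sketch_ids, values, num_quantiles):
    """Split the sketched ids into equal-sized buckets ordered by value."""
    ordered = sorted(sketch_ids, key=lambda uid: values[uid])
    size = len(ordered)
    parts = []
    for q in range(num_quantiles):
        lo = q * size // num_quantiles
        hi = (q + 1) * size // num_quantiles
        parts.append(set(ordered[lo:hi]))
    return parts


# Hypothetical data: first set with a numerical value per user, second set of ids.
first_values = {"u1": 10, "u2": 20, "u3": 30, "u4": 40}
second_ids = {"u3", "u4", "u9"}

K = 8  # larger than both sets here, so the sketches hold every id
first_sketch = bottom_k(first_values, K)
quantile_sketches = partition_by_quantile(first_sketch, first_values, 2)
second_sketch = bottom_k(second_ids, K)

# Intersect the upper-half sketch of the first set with the second set's sketch.
answer = quantile_sketches[1] & second_sketch
```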

Automatically refreshing materialized views according to performance benefit

Materialized views for a database system may be automatically refreshed according to performance benefits. Materialized views may be ordered according to determined performance benefits for the materialized views indicating the performance benefit obtained when a materialized view is used to perform a query at the database system. Materialized views may be selected for refresh operations according to the ordering based on a capacity of the database system to perform refresh operations.
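
One way to picture benefit-ordered refresh is a greedy pass over the views. The sketch below assumes each view has a numeric benefit score and a refresh cost, and that capacity is a single budget in the same units as the cost; the source does not state how benefit or capacity is measured. All names and numbers are hypothetical.

```python
def select_for_refresh(views, capacity):
    """Pick views in descending benefit order while the capacity budget allows."""
    ordered = sorted(views, key=lambda view: view["benefit"], reverse=True)
    selected = []
    used = 0
    for view in ordered:
        if used + view["refresh_cost"] <= capacity:
            selected.append(view["name"])
            used += view["refresh_cost"]
    return selected


# Hypothetical materialized views with assumed benefit and cost scores.
views = [
    {"name": "mv_orders", "benefit": 9, "refresh_cost": 3},
    {"name": "mv_returns", "benefit": 5, "refresh_cost": 4},
    {"name": "mv_regions", "benefit": 7, "refresh_cost": 2},
]
to_refresh = select_for_refresh(views, capacity=5)
```

With a capacity of 5, the two highest-benefit views fit and the third waits for a later refresh cycle.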

SYSTEM AND METHOD FOR QUERY ACCELERATION FOR USE WITH DATA ANALYTICS ENVIRONMENTS

In accordance with an embodiment, described herein is a system and method for providing query acceleration with a computing environment such as, for example, a business intelligence environment, database, data warehouse, or other type of environment that supports data analytics. A middle layer is provided as a long-term table data storage format, and one or more acceleration formats, or acceleration tables, can be periodically regenerated from the middle layer. A determination can be made as to whether an accelerated table exists for a dataset table, and if so, the accelerated table is used to process the query.
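
The lookup described at the end of this abstract can be sketched as a simple mapping check. The sketch assumes a registry that maps dataset table names to accelerated table names and falls back to the dataset table when no accelerated table exists; the fallback, the registry, and the table names are illustrative assumptions.

```python
def resolve_table(dataset_table, accelerated_tables):
    """Return the accelerated table if one exists, else the dataset table."""
    return accelerated_tables.get(dataset_table, dataset_table)


# Hypothetical registry of accelerated tables regenerated from the middle layer.
accelerated_tables = {"sales": "sales_accel"}

table_for_sales = resolve_table("sales", accelerated_tables)
table_for_returns = resolve_table("returns", accelerated_tables)
```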

USAGE RECORD AGGREGATION
20230084078 · 2023-03-16

In an example embodiment, a solution is provided that aggregates records as they are submitted to a third party (on the write path) rather than performing a real-time aggregation when a request is processed that needs the aggregation (read path). More particularly, in an example embodiment, a caching layer is introduced that avoids having to read all usage events to compute an aggregation when a request is received for aggregated data. The caching layer maintains values for various metrics that require aggregation.
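
A minimal sketch of write-path aggregation follows, assuming usage events carry a customer, a metric name, and a quantity, and that the aggregate of interest is a running sum. The class and field names are hypothetical and do not come from the source.

```python
from collections import defaultdict


class UsageAggregateCache:
    """Keeps running totals that are updated as each usage event is written."""

    def __init__(self):
        self.events = []  # raw event log, kept for reference only
        self._totals = defaultdict(int)

    def record(self, customer, metric, quantity):
        """Write path: store the event and update the cached total."""
        self.events.append((customer, metric, quantity))
        self._totals[(customer, metric)] += quantity

    def total(self, customer, metric):
        """Read path: return the cached total without scanning the events."""
        return self._totals.get((customer, metric), 0)


cache = UsageAggregateCache()
cache.record("acme", "api_calls", 3)
cache.record("acme", "api_calls", 4)
cache.record("globex", "api_calls", 1)
acme_calls = cache.total("acme", "api_calls")
```

The read path costs one dictionary lookup regardless of how many events were recorded, which is the trade the abstract describes: a little extra work per write in exchange for cheap reads.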