G06F16/24532

SYSTEMS, METHODS, AND APPARATUSES FOR SIMULTANEOUSLY RUNNING PARALLEL DATABASES
20220374411 · 2022-11-24

Embodiments herein relate to replacing a legacy Pick environment with a modern microservice architecture. A legacy database and a modern database may be operated in parallel for data validation. During the validation process, the legacy database may be used as the master copy of the data. After verifying that the modern database satisfies the data needs of a system, the system can switch to using the new modern database as the master copy.

Techniques for pushing joins into union all views
11593366 · 2023-02-28

A query with a UNION ALL (UA) view is detected by a query optimizer. A query execution plan and cost for the query are obtained. The query is rewritten to push joins of the original query into the view. A query execution plan is generated for the rewritten query and a cost for executing the rewritten query is obtained. The lowest cost execution plan is selected for execution by a database engine of a database.

Splitting a time-range query into multiple sub-queries for serial execution

Techniques for splitting a time-range query into sub-queries for serial execution are provided. In one embodiment, a user query is received requesting items within a time range from a database. The time range is divided into a plurality of time periods within the time range. Sub-queries defining respective time periods of the plurality of time periods are generated from the user query, and a first sub-query is executed. The first sub-query defines a first time period of the plurality of time periods, where the first time period is a most-recent time period or a least-recent time period among the plurality of time periods. If it is determined that a number of items obtained from executing the first sub-query is greater than or equal to a predetermined result target, then the items obtained from executing the first sub-query are provided and subsequent sub-queries are not executed.
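
The early-exit behavior described above can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the function names, the in-memory `items` list standing in for the database, and the choice to keep accumulating results across later sub-queries when the first one falls short are all assumptions.

```python
from datetime import datetime, timedelta

def split_time_range(start, end, period):
    """Divide [start, end) into consecutive periods, most recent first."""
    periods = []
    hi = end
    while hi > start:
        lo = max(start, hi - period)
        periods.append((lo, hi))
        hi = lo
    return periods

def run_time_range_query(items, start, end, period, result_target):
    """Run sub-queries serially and stop once result_target items are found.

    `items` is a hypothetical stand-in for the database: a list of
    (timestamp, payload) tuples. Returns (results, sub_queries_executed).
    """
    results = []
    executed = 0
    for lo, hi in split_time_range(start, end, period):
        executed += 1
        results.extend(item for item in items if lo <= item[0] < hi)
        if len(results) >= result_target:
            break  # remaining sub-queries are not executed
    return results, executed
```

With hourly items over one day and six-hour periods, a target of three items is met by the most recent period alone, so only one of the four sub-queries runs.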

Systems and methods for rapidly generating security ratings
11595427 · 2023-02-28

A system for determining an entity's security rating may include a ratings engine and a security database. The security database may include a manifest and a distributed index containing security records. Each of the security records may have a key (e.g., a network identifier of a network asset) and a value (e.g., security information associated with the network asset identified by the key). The keyspace may be partitioned into multiple key ranges. The manifest may contain references to segments of the distributed index. Each segment may be associated with a key range and may index a group of security records having keys within the key range. The manifest and the segments may be stored in an object storage system. The ratings engine may determine the security rating of an entity based on security records of the entity's network assets, which may be retrieved from the database.
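
One way to picture the manifest's role is as a sorted list of key-range starts, each pointing at a segment. The sketch below assumes integer keys and string segment references for simplicity; the class name `Manifest` and the method `segment_for` are hypothetical, and the abstract does not say how the lookup is actually performed.

```python
import bisect

class Manifest:
    """Maps a key to the segment whose key range contains it (illustrative)."""

    def __init__(self, segments):
        # segments: list of (range_start_key, segment_ref), sorted by start key.
        # Each range is assumed to extend up to the next range's start key.
        self._starts = [start for start, _ in segments]
        self._refs = [ref for _, ref in segments]

    def segment_for(self, key):
        # Find the last range whose start key is <= key.
        i = bisect.bisect_right(self._starts, key) - 1
        if i < 0:
            raise KeyError(key)
        return self._refs[i]
```

A ratings engine would then fetch only the referenced segment from object storage rather than scanning the whole index.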

DYNAMIC DEGREE OF QUERY PARALLELISM OPTIMIZATION

Approaches presented herein enable dynamic optimization of a degree to which a query is parallelized for execution. More specifically, a priority associated with an obtained user query for execution is identified. A real-time metric indicating availability of one or more runtime resources is checked. An optimal degree of parallelism is calculated based on the priority associated with the obtained user query and the real-time availability metric. A plan is generated for executing the query using the calculated optimal degree of parallelism.
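
The abstract does not give a formula for the degree of parallelism. As a purely illustrative sketch, one could scale a worker budget by a priority weight and by the fraction of resources currently available; the weights, the parameter names, and the function `optimal_dop` below are all assumptions.

```python
# Hypothetical priority weights; the abstract does not specify any.
PRIORITY_WEIGHT = {"low": 0.25, "normal": 0.5, "high": 1.0}

def optimal_dop(priority, available_fraction, max_workers):
    """Pick a degree of parallelism from query priority and resource availability.

    available_fraction is an assumed real-time metric in [0, 1] describing how
    much of the runtime resources are currently free.
    """
    if not 0.0 <= available_fraction <= 1.0:
        raise ValueError("available_fraction must be between 0 and 1")
    scaled = max_workers * available_fraction * PRIORITY_WEIGHT[priority]
    # Always allow at least one worker, and never exceed the budget.
    return max(1, min(max_workers, int(scaled)))
```

The planner would call such a function just before plan generation, so that the same query can receive a different degree of parallelism depending on load.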

Maintaining an unknown purpose data block cache in a database system

A method for execution by a node of a database system includes receiving a first data block, determining that data block processing instruction data for the first data block is not indicated in previously received data blocks, and adding the first data block to an unknown purpose data block cache. Prior to elapsing of a storage time window for storage of the first data block, at least one second data block is received that indicates data block processing instruction data for the first data block. The first data block is processed by applying the data block processing instruction data. A third data block is received and is added to the unknown purpose data block cache. The third data block is removed from the unknown purpose data block cache based on elapsing of a storage time window for storage of the third data block.
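
The two outcomes described above (a block claimed before its window elapses, and a block evicted after it) can be sketched with a small time-windowed cache. The class and method names are hypothetical, and passing the current time explicitly as `now` is a simplification made for this example.

```python
class UnknownPurposeCache:
    """Holds data blocks whose processing instructions have not arrived yet."""

    def __init__(self, storage_window):
        self.storage_window = storage_window
        self._blocks = {}  # block_id -> (block, arrival_time)

    def add(self, block_id, block, now):
        self._blocks[block_id] = (block, now)

    def evict_expired(self, now):
        """Remove blocks whose storage window has elapsed; return their ids."""
        expired = [bid for bid, (_, arrived) in self._blocks.items()
                   if now - arrived >= self.storage_window]
        for bid in expired:
            del self._blocks[bid]
        return expired

    def claim(self, block_id, now):
        """Return and remove a block once its instructions arrive, or None."""
        self.evict_expired(now)
        entry = self._blocks.pop(block_id, None)
        return entry[0] if entry is not None else None
```

A node would call `claim` when a later block names the instructions for a cached block, and `evict_expired` periodically to drop blocks that were never claimed.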

Parallel scan of single file using multiple threads

Multiple execution threads process a query directed to a database organized into a plurality of files. In processing the query, a first thread downloads a file from the plurality of files. The file comprises a set of blocks. A parallel scan of the set of blocks is performed by at least the first thread and a second thread to identify data that matches the query. A response to the query is provided based in part on the parallel scan of the set of blocks.
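
A rough sketch of scanning the blocks of one already-downloaded file with several threads follows. It uses a thread pool rather than the first-thread-plus-second-thread arrangement in the abstract, and it models a file as a list of blocks and a block as a list of rows; those choices and the function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_block(block, predicate):
    """Return the rows of one block that match the query predicate."""
    return [row for row in block if predicate(row)]

def parallel_scan(blocks, predicate, num_threads=2):
    """Scan all blocks of a single file concurrently, preserving block order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map yields results in the order the blocks were submitted.
        per_block = pool.map(lambda block: scan_block(block, predicate), blocks)
        return [row for matches in per_block for row in matches]
```

For example, `parallel_scan([[1, 2, 3], [4, 5, 6], [7, 8]], lambda r: r % 2 == 0)` collects the even rows from all three blocks.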

Cloning catalog objects
11573978 · 2023-02-07

Example systems and methods for cloning catalog objects are described. In one implementation, a method identifies an original catalog object associated with data and creates a duplicate copy of the original catalog object without copying the data itself. The method allows access to the data using the duplicate catalog object and supports modifying the data associated with the original catalog object independently of the duplicate catalog object. The duplicate catalog object can be deleted upon completion of modifying the data associated with the original catalog object.
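
Cloning a catalog object without copying its data can be illustrated with a metadata object that holds references to immutable data partitions. This is only a sketch: the immutable-partition model, the class `CatalogObject`, and its methods are assumptions, not details taken from the abstract.

```python
class CatalogObject:
    """Catalog entry holding references to immutable data partitions."""

    def __init__(self, name, partitions):
        self.name = name
        # Copy the list of references, not the partitions themselves.
        self.partitions = list(partitions)

    def clone(self, new_name):
        """Create a duplicate catalog object that shares the same data."""
        return CatalogObject(new_name, self.partitions)

    def append_rows(self, rows):
        """Modify this object's data by adding a new immutable partition."""
        self.partitions.append(tuple(rows))
```

Because a modification only adds a reference on the modified object, the original and the duplicate diverge without either one copying the underlying data.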

Systems, methods, and data structures for high-speed searching or filtering of large datasets
11573941 · 2023-02-07

An inline tree data structure and one or more auxiliary data structures encode a multitude of data records of a dataset; data fields of the dataset define a tree hierarchy. The inline tree comprises one binary string for each data record; the binary strings are all the same length, are arranged in an ordered sequence that corresponds to the tree hierarchy, and each include an indicator string indicating the position in the tree hierarchy of the data record relative to an immediately adjacent data record. A search program is guided through the dataset by interrogating each indicator string in the inline tree data structure so as to reduce unnecessary interrogation of data field values.

Data partitioning and parallelism in a distributed event processing system

An event processing system for processing events in an event stream is disclosed. The system is configured for determining a stage for a continuous query language (CQL) query being processed by the event processing system and/or determining a stage type associated with the stage. The system is also configured for determining a transformation to be computed for the stage based at least in part on the stage type and/or determining a classification for the CQL query based at least in part on a plurality of rules. The system can also be configured for generating a transformation in a Directed Acyclic Graph (DAG) of a data transformation pipeline for the stage based at least in part on the partitioning criteria for the stage. In some examples, the system can also be configured for determining a partitioning of the stage based at least in part on the transformation.