G06F16/2386

SYSTEMS AND METHODS FOR EFFICIENT BULK DATA DELETION
20230064907 · 2023-03-02

Systems and methods for efficient bulk data deletion. The system comprises: 1) a deletion record set; an in-memory database representation comprising: tables and records; one or more exclusive locks for the records; and a record block index; 2) a persistent database representation comprising: record blocks; and a transaction log. The method comprises: receiving, by a processor, a deletion record set; acquiring, by the processor, an exclusive lock for one or more records in the deletion record set; deleting, by the processor, the deletion record set from an in-memory representation of the database; generating, by the processor, one or more post-delete record block sets; updating, by the processor, an in-memory record block index; writing, by the processor, the one or more post-delete record block sets to a persistent storage representation of the database; and adding, by the processor, a transaction log entry for the record block index update.
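
The method steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Database` class, its field names, the per-block version counter, and the use of plain dictionaries as stand-ins for persistent storage and the transaction log are all assumptions made for illustration.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Database:
    # In-memory representation: record id -> value, and record id -> block id.
    records: dict = field(default_factory=dict)
    record_block: dict = field(default_factory=dict)
    # In-memory record block index: block id -> current block version (assumed layout).
    block_index: dict = field(default_factory=dict)
    # Stand-ins for the persistent representation (assumptions, not real storage).
    persistent_blocks: dict = field(default_factory=dict)
    transaction_log: list = field(default_factory=list)
    locks: dict = field(default_factory=dict)

    def bulk_delete(self, deletion_set):
        ids = sorted(deletion_set)  # fixed lock order avoids deadlock between callers
        held = [self.locks.setdefault(r, threading.Lock()) for r in ids]
        for lock in held:
            lock.acquire()  # exclusive lock per record in the deletion set
        try:
            # Delete from the in-memory representation, noting affected blocks.
            touched = set()
            for r in ids:
                if r in self.records:
                    del self.records[r]
                    touched.add(self.record_block.pop(r))
            # Generate post-delete blocks: the surviving records of each touched block.
            post_delete = {}
            for block_id in sorted(touched):
                survivors = {r: self.records[r]
                             for r, b in self.record_block.items() if b == block_id}
                version = self.block_index.get(block_id, 0) + 1
                post_delete[(block_id, version)] = survivors
            # Update the in-memory index, write the blocks, then log the index update.
            index_update = {block_id: version for block_id, version in post_delete}
            self.block_index.update(index_update)
            self.persistent_blocks.update(post_delete)
            self.transaction_log.append(("index_update", index_update))
        finally:
            for lock in reversed(held):
                lock.release()
        return index_update
```

In this sketch only one transaction log entry is written per bulk delete, covering the index update rather than each deleted record, which mirrors the last step of the abstract.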

Powering Scalable Data Warehousing with Robust Query Performance

The present disclosure describes an analytical data management system (ADMS) that serves critical dashboards, applications, and internal users. The ADMS offers high scalability and high availability, achieved through replication and failover, and it supports a high user query load over large data volumes. It provides continuous ingestion and high-performance querying with tunable freshness. It further advances the idea of disaggregation by decoupling its architectural components: ingestion, indexing, and querying. As a result, the impact of an indexing slowdown on query performance is minimized by either trading off data freshness or incurring higher costs.

METHOD AND DEVICE FOR PROCESSING INFORMATION BY BATCH-STREAM FUSION, AND STORAGE MEDIUM

The present invention discloses a method and a device for processing information by batch-stream fusion, and a storage medium. The method comprises the following steps: obtaining an index based on an input query statement; extracting a pre-computed index data segment based on the index as a query result; and extracting a re-computed index data segment to update the query result. The present invention solves the technical problem that real-time data and offline data are difficult to fuse and analyze.
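
The three steps of this abstract can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function names, the toy query parser, and the idea that the pre-computed segment holds offline results while the re-computed segment holds fresher values that override them are hypothetical and are not taken from the patent.

```python
def obtain_index(query: str) -> str:
    # Hypothetical parser: treat the token after "SELECT" as the index name.
    tokens = query.split()
    return tokens[tokens.index("SELECT") + 1]


def fused_query(query: str, precomputed: dict, recomputed: dict) -> dict:
    index = obtain_index(query)
    # Step 2: the pre-computed segment for the index is the initial query result.
    result = dict(precomputed.get(index, {}))
    # Step 3: the re-computed segment updates (overrides or extends) that result.
    result.update(recomputed.get(index, {}))
    return result


# Example data (invented for illustration): a daily metric keyed by date.
precomputed = {"daily_orders": {"2024-01-01": 120, "2024-01-02": 95}}
recomputed = {"daily_orders": {"2024-01-02": 101, "2024-01-03": 14}}
fused = fused_query("SELECT daily_orders", precomputed, recomputed)
```

Here `fused` keeps the pre-computed value for the first date, takes the re-computed value for the second, and gains the third date from the re-computed segment only.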

Computer-Based Systems Involving Pipeline and/or Machine Learning Aspects Configured to Generate Predictions for Batch Automation/Processes and Methods of Use Thereof
20230153191 · 2023-05-18

Systems and methods involving provision of machine-learning-based prediction of future failure, anomaly, etc. in execution of batch processes are disclosed. In one illustrative implementation, an exemplary method may comprise obtaining historical data from prior execution of one or more batch processes, training a machine learning model to predict one or more future failure(s) and/or future flag(s) in execution of a future batch process, generating and/or collecting descriptive analytics pertinent to execution of the batch processes, and predicting a future failure and/or future flag in execution of the batch processes using the trained machine learning model and/or the descriptive analytics.

Intelligent datastore determination for microservice

A method comprises dividing a plurality of operations of a microservice between a plurality of databases, and synchronizing data corresponding to the plurality of operations between the plurality of databases. The microservice is a create, read, update, delete (CRUD) microservice, and the plurality of operations comprise creating, reading, updating and deleting the data.
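
One possible reading of this abstract is sketched below in Python. The split chosen here (create, update, and delete on one store, read on another, with an explicit synchronization step) is an assumption for illustration; the abstract does not say how the operations are divided, and the `CrudService` class and its in-memory dictionaries are hypothetical stand-ins for real databases.

```python
class CrudService:
    def __init__(self):
        self.write_db = {}  # assumed to serve create, update, and delete
        self.read_db = {}   # assumed to serve read
        self.pending = []   # changes not yet synchronized to read_db

    def create(self, key, value):
        self.write_db[key] = value
        self.pending.append(("put", key, value))

    def update(self, key, value):
        if key not in self.write_db:
            raise KeyError(key)
        self.write_db[key] = value
        self.pending.append(("put", key, value))

    def delete(self, key):
        del self.write_db[key]
        self.pending.append(("delete", key, None))

    def read(self, key):
        return self.read_db.get(key)

    def synchronize(self):
        # Replay pending changes in order so both stores converge.
        for op, key, value in self.pending:
            if op == "put":
                self.read_db[key] = value
            else:
                self.read_db.pop(key, None)
        self.pending.clear()
```

With this design a read may return stale data until `synchronize()` runs, which is the usual trade-off when operations are divided between stores.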

SYSTEMS AND METHODS FOR MATCHING ELECTRONIC ACTIVITIES DIRECTLY TO RECORD OBJECTS OF SYSTEMS OF RECORD WITH NODE PROFILES

The system described herein can automatically match, link, or otherwise associate electronic activities with one or more record objects. For an electronic activity that is eligible or qualifies to be matched with one or more record objects, the system can identify one or more sets of rules, or rule sets. Using the rule sets, the system can identify candidate record objects. The system can then rank the identified candidate record objects to select one or more record objects with which to associate the electronic activity. The system can then store an association between the electronic activity and the selected one or more record objects.

Shuffle-less Reclustering of Clustered Tables
20220374455 · 2022-11-24

A method for shuffle-less reclustering of clustered tables includes receiving a first and second group of clustered data blocks sorted by a clustering key value. A range of clustering key values of one or more of the data blocks in the second group overlaps with the range of clustering key values of a data block in the first group. The method also includes generating split points for partitioning the first and second groups of clustered data blocks into a third group. The method also includes partitioning, using the split points, the first and second groups into the third group. Each data block in the third group includes a range of clustering key values that does not overlap with that of any other data block in the third group. Each split point defines an upper limit or lower limit for the range of clustering key values of a data block in the third group.
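
The split-point idea can be illustrated with a small Python sketch. It is a simplification, not the patented method: blocks are modeled as sorted lists of clustering key values, split points are taken at every `target_rows`-th key of the merged key sequence, and the helper names are hypothetical. Merging all keys in memory is done here only for brevity.

```python
import bisect
import heapq


def generate_split_points(blocks, target_rows):
    # Each input block is already sorted, so a k-way merge yields sorted keys.
    merged = list(heapq.merge(*blocks))
    # Every target_rows-th key becomes the inclusive lower limit of the next block.
    return [merged[i] for i in range(target_rows, len(merged), target_rows)]


def partition(blocks, split_points):
    out = [[] for _ in range(len(split_points) + 1)]
    for block in blocks:
        for key in block:
            # bisect_right sends a key equal to a split point to the later block.
            out[bisect.bisect_right(split_points, key)].append(key)
    return [sorted(b) for b in out if b]


first_group = [[1, 4, 7], [10, 13, 16]]
second_group = [[3, 5, 8], [9, 14, 20]]  # ranges overlap blocks in first_group
splits = generate_split_points(first_group + second_group, target_rows=4)
third_group = partition(first_group + second_group, splits)
```

Each block in `third_group` covers a key range bounded by consecutive split points, so no two output blocks overlap.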

BATCH PROCESSING OF AUDIT RECORDS

An audited device generates, for each of a plurality of events, an audit file, and the audited device locally stores the audit files. Upon the occurrence of a trigger condition, the audited device retrieves a batch of audit files stored locally and generates an audit block for transmitting the batch of audit files to an auditing system. The audit block includes the audit files in the batch of audit files, and a digital signature generated based in part on the audit files in the batch of audit files. The audited device then sends the audit block to the auditing system. Accordingly, the amount of data for transmitting the audit files from the audited device to the auditing system may be reduced. Additionally, the computational power for authenticating the audit files to the auditing system may also be reduced.
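
A minimal Python sketch of the batching idea follows. Several assumptions are made for illustration: the trigger condition is a fixed batch size, audit files are JSON strings, and an HMAC-SHA256 tag over the whole batch stands in for the digital signature (a real deployment would more likely use an asymmetric signature scheme). The class and function names are hypothetical.

```python
import hashlib
import hmac
import json


class AuditedDevice:
    def __init__(self, key: bytes, batch_size: int):
        self.key = key
        self.batch_size = batch_size  # assumed trigger condition
        self.local_store = []         # audit files kept locally until batched

    def record_event(self, event: dict):
        # One audit file per event; sort_keys gives a stable serialization.
        self.local_store.append(json.dumps(event, sort_keys=True))
        if len(self.local_store) >= self.batch_size:
            return self.build_block()
        return None

    def build_block(self) -> dict:
        batch, self.local_store = self.local_store, []
        payload = "\n".join(batch).encode()
        # One tag for the whole batch instead of one per audit file.
        tag = hmac.new(self.key, payload, hashlib.sha256).hexdigest()
        return {"audit_files": batch, "signature": tag}


def verify_block(block: dict, key: bytes) -> bool:
    payload = "\n".join(block["audit_files"]).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, block["signature"])
```

Signing once per batch is what lets the auditing system authenticate many audit files with a single verification, which corresponds to the reduced computational cost mentioned in the abstract.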

ACCESSING BOTH REPLICATION BASED STORAGE AND REDUNDANCY CODING BASED STORAGE FOR QUERY EXECUTION

A database system is operable to determine a query for execution that requires access to a set of records stored by the database system. A first proper subset of the set of records are accessed in conjunction with executing the query by reading exactly one of a set of multiple replicas of each record of the first proper subset of the set of records from the replication-based storage system. A second proper subset of the set of records are accessed in conjunction with executing the query by reading at least one redundancy-coded segment from the redundancy-coding based storage system. A final resultant for the query is generated by performing at least one query operation on the first proper subset of the set of records and the second proper subset of the set of records in conjunction with executing the query.

Execution-Time Dynamic Range Partitioning Transformations

An example method includes receiving a data load request requesting loading and partitioning of an unknown quantity of user data for storage at a data storage system. The user data includes a partitioning key; a total data size of the user data; a plurality of rows, each row of the plurality of rows associated with a value defined by the partitioning key; and one or more columns. The method also includes identifying one or more storage constraints for the data storage system. The method further includes, after receiving the user data, determining a plurality of partitioning quantiles defining respective ranges of values of the partitioning key based on the user data and the one or more storage constraints for the data storage system; and range partitioning each row of the user data into files based on the value associated with the row defined by the partitioning key, and the respective ranges of the values of the partitioning key defined by the plurality of partitioning quantiles.
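
The quantile-based range partitioning described here can be sketched in Python as below. The sketch assumes a single storage constraint, a maximum number of rows per file, and computes quantile boundaries from the fully received data; the function names and the `max_rows_per_file` parameter are hypothetical, and heavily duplicated key values could still push a file over the limit.

```python
import bisect
import math


def partition_quantiles(keys, max_rows_per_file):
    ordered = sorted(keys)
    # The number of files follows from the data size and the storage constraint.
    n_files = max(1, math.ceil(len(ordered) / max_rows_per_file))
    # Boundary i is the key at the i/n_files quantile of the observed data.
    return [ordered[(i * len(ordered)) // n_files] for i in range(1, n_files)]


def range_partition(rows, key, boundaries):
    files = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        # Each boundary is the inclusive lower limit of the next file's range.
        files[bisect.bisect_right(boundaries, row[key])].append(row)
    return files


rows = [{"k": v} for v in [5, 1, 9, 3, 7, 2, 8, 4]]
boundaries = partition_quantiles([r["k"] for r in rows], max_rows_per_file=3)
files = range_partition(rows, "k", boundaries)
```

Because the boundaries are derived after the data arrives, the number of files adapts to the actual quantity of data rather than being fixed in advance.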