G06F16/1724

Delta compression of probabilistically clustered chunks of data
09798731 · 2017-10-24

The invention pertains to a method and Information Handling System (IHS) for performing delta compression on probabilistically clustered chunks of data. From a source of chunks, a corresponding sketch is generated to represent each chunk. Then, from the generated sketches, a subset of similar sketches is determined using a probabilistic algorithm. Finally, delta compression is performed on the chunks represented by the similar sketches in the determined subset.
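
The three steps in this abstract (sketch each chunk, find similar sketches probabilistically, delta-compress the matching chunks) can be illustrated with a minimal sketch. The abstract does not name specific algorithms, so everything here is an assumption chosen for illustration: MinHash over byte shingles as the sketch, LSH banding as the probabilistic grouping, and zlib with a preset dictionary as a stand-in for delta compression. The function names are hypothetical.

```python
import hashlib
import zlib
from collections import defaultdict


def sketch(chunk: bytes, num_hashes: int = 16, shingle: int = 8) -> tuple:
    """MinHash sketch (assumed choice): per seeded hash, keep the minimum over all shingles."""
    shingles = {chunk[i:i + shingle] for i in range(max(1, len(chunk) - shingle + 1))}
    return tuple(
        min(hashlib.blake2b(s, digest_size=8, salt=seed.to_bytes(2, "big")).digest()
            for s in shingles)
        for seed in range(num_hashes)
    )


def similar_groups(sketches: dict, rows_per_band: int = 4) -> set:
    """LSH banding (assumed choice): chunks agreeing on any whole band share a bucket."""
    buckets = defaultdict(set)
    for chunk_id, sk in sketches.items():
        for start in range(0, len(sk), rows_per_band):
            buckets[(start, sk[start:start + rows_per_band])].add(chunk_id)
    # Keep only buckets that actually group two or more chunks.
    return {frozenset(ids) for ids in buckets.values() if len(ids) > 1}


def delta_encode(base: bytes, target: bytes) -> bytes:
    """Encode target relative to base by using base as a zlib preset dictionary."""
    comp = zlib.compressobj(zdict=base)
    return comp.compress(target) + comp.flush()


def delta_decode(base: bytes, delta: bytes) -> bytes:
    """Reverse delta_encode; the same base must be supplied."""
    decomp = zlib.decompressobj(zdict=base)
    return decomp.decompress(delta) + decomp.flush()
```

Because the grouping is probabilistic, two near-identical chunks are very likely, but not guaranteed, to share a bucket. Note also that zlib only consults the last 32 KiB of a preset dictionary, so this stand-in suits small chunks.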

Database Management Systems and Methods Using Data Normalization and Defragmentation Techniques
20220044259 · 2022-02-10

Improved systems and methods for database management using data normalization and defragmentation techniques are provided. At least one exchange processor in communication with an exchange computer system receives market data from the exchange computer system, processes the market information, and transmits the market data to a master processor. The master processor receives the market data, processes the data using at least one normalization process to generate normalized data including an intra-day file and an archival file, and stores the intra-day file and the archival file in the master database. The master processor transmits the intra-day file and the archival file to the at least one regional processor. The regional processor receives a request for information from a customer computer system in communication with the regional processor, queries the intra-day file and the archival file to identify matching market data in response to the request, and transmits the matching market data to the customer computer system.

Prediction and repair of database fragmentation

Methods, information handling systems and computer readable media are disclosed for detection and repair of fragmentation in databases. In one embodiment, a method includes obtaining log data reflecting transactions in a database, where the log data is generated during operation of the database. The method continues with applying a machine learning classification model to at least a portion of the log data to obtain a first prediction, where the first prediction indicates whether defragmentation of the database should be scheduled. In this embodiment, the method also includes using a machine learning time series forecasting model to obtain a second prediction, where the second prediction identifies a future time interval of low relative database utilization, and scheduling a defragmentation procedure to be performed during that interval.

Efficient and non-disruptive online defragmentation with record locking
11204911 · 2021-12-21

Methods, systems, and computer-readable storage media are provided for online defragmentation of memory in database systems. An IX-lock is applied to each table having data stored in a marked page in a set of marked pages, and a record map of key-value pairs is generated, each pair being associated with a record location in a marked page and having its value initially set to a first value. The online defragmentation is executed iteratively to delete data from marked pages and add the data to non-sparse pages, with at least one iteration including applying a try-lock to a record in a marked page. At iterations of the online defragmentation, the record map is updated to change the value of at least one key-value pair from the first value to a second value, the second value representing that data of a marked page has been deleted from the marked page and added to a non-sparse page.
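
The record map and try-lock iteration described in this abstract can be sketched roughly as follows. This is an illustrative model only, not the patented implementation: pages are plain dictionaries, per-record locks are `threading.Lock` objects, the table-level IX-lock is not modeled, and the names `defragment`, `NOT_MOVED`, and `MOVED` are hypothetical.

```python
import threading

NOT_MOVED, MOVED = 0, 1  # stand-ins for the "first value" and "second value"


def defragment(marked_pages, dense_page, record_locks, max_iterations=10):
    """Move records out of sparse (marked) pages, skipping records that are busy.

    marked_pages: {page_id: {slot: record}}, dense_page: list receiving records,
    record_locks: {(page_id, slot): threading.Lock}.
    """
    # Record map: one key per record location, initially set to the first value.
    record_map = {(pid, slot): NOT_MOVED
                  for pid, page in marked_pages.items() for slot in page}
    for _ in range(max_iterations):
        pending = [key for key, state in record_map.items() if state == NOT_MOVED]
        if not pending:
            break
        for pid, slot in pending:
            lock = record_locks[(pid, slot)]
            # Try-lock: never block a concurrent transaction; retry next iteration.
            if not lock.acquire(blocking=False):
                continue
            try:
                dense_page.append(marked_pages[pid].pop(slot))
                record_map[(pid, slot)] = MOVED
            finally:
                lock.release()
    return record_map
```

A record whose lock is held elsewhere stays at `NOT_MOVED` in the returned map, so a later run can pick it up once the lock is released.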

STORAGE SYSTEM GARBAGE COLLECTION AND DEFRAGMENTATION
20220179828 · 2022-06-09

Metadata of each file of a group of files of a storage and chunk file metadata are analyzed to identify one or more file segment data chunks that are not referenced by the group of files of the storage. Fragmented chunk files to be combined together are identified based at least in part on the one or more identified file segment data chunks. The chunk file metadata is updated with an update that concurrently reflects the removal of at least a portion of the one or more file segment data chunks that are not referenced by the group of files and the combination of the identified fragmented chunk files.
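
As a rough illustration of the analysis this abstract describes, the following sketch derives the unreferenced chunks, selects fragmented chunk files, and builds one replacement metadata mapping that reflects both the removal and the merge. The data layout, the `min_live_ratio` threshold, and the merged-file naming are assumptions made for the example; the abstract does not specify them.

```python
def plan_gc_and_defrag(file_meta, chunk_file_meta, min_live_ratio=0.5):
    """Return (unreferenced chunks, fragmented chunk files, new chunk file metadata).

    file_meta: {file_name: [chunk_id, ...]} lists the chunks each file references.
    chunk_file_meta: {chunk_file_name: [chunk_id, ...]} lists the chunks each chunk file stores.
    """
    referenced = {c for chunks in file_meta.values() for c in chunks}
    stored = {c for chunks in chunk_file_meta.values() for c in chunks}
    unreferenced = stored - referenced

    # Assumed policy: a chunk file is fragmented when too few of its chunks are live.
    fragmented = [
        name for name, chunks in chunk_file_meta.items()
        if chunks and sum(c in referenced for c in chunks) / len(chunks) < min_live_ratio
    ]

    # Build the replacement metadata in one step, so the dead-chunk removal and
    # the merge of fragmented chunk files become visible together.
    new_meta = {n: list(c) for n, c in chunk_file_meta.items() if n not in fragmented}
    if fragmented:
        new_meta["+".join(fragmented)] = [
            c for n in fragmented for c in chunk_file_meta[n] if c in referenced
        ]
    return unreferenced, fragmented, new_meta
```

Returning a fresh mapping rather than mutating `chunk_file_meta` in place is one simple way to model a single update; a real system would need its own atomic swap mechanism.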

Optimized record placement in graph database

Methods and systems are disclosed for optimizing record placement in a graph by minimizing fragmentation when writing data. Issues with fragmented data within a graph database are addressed on the record level by placing data that is frequently accessed together contiguously within memory. For example, a dynamic rule set may be developed based on dynamically analyzing access patterns of the graph database, policies, system characteristics and/or other heuristics. Based on statistics regarding normal query patterns, the systems and methods may identify an optimal position for certain types of edges that are often traversed with respect to particular types of nodes.

Reducing database fragmentation

Techniques to reduce database fragmentation are disclosed. In various embodiments, an indication is received to store an attribute value for an entity that has a row or other entry in a first database table, wherein the first database table does not have a column for the attribute. It is determined that the value corresponds to a mapped value that is associated with not having an entry in a separate, second database table configured to store the attribute. Entries are made in the second database table only for values of the attribute other than the mapped value. Application level software code is configured to associate absence of a row in the second database table with the mapped value for the attribute.
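
The idea of letting a missing row stand for a mapped value can be shown with a small SQLite example. The table names, the `tier` attribute, and the choice of `"none"` as the mapped value are all hypothetical; the sketch only illustrates the pattern of writing rows for non-mapped values and interpreting absence in application code.

```python
import sqlite3

MAPPED_VALUE = "none"  # hypothetical value represented by the absence of a row


def make_db():
    db = sqlite3.connect(":memory:")
    # First table: has no column for the attribute.
    db.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT)")
    # Second table: holds the attribute, but only for non-mapped values.
    db.execute("CREATE TABLE entity_tier (entity_id INTEGER PRIMARY KEY, tier TEXT NOT NULL)")
    return db


def set_tier(db, entity_id, tier):
    if tier == MAPPED_VALUE:
        # Store the mapped value by making sure no row exists.
        db.execute("DELETE FROM entity_tier WHERE entity_id = ?", (entity_id,))
    else:
        db.execute(
            "INSERT OR REPLACE INTO entity_tier (entity_id, tier) VALUES (?, ?)",
            (entity_id, tier),
        )


def get_tier(db, entity_id):
    row = db.execute(
        "SELECT tier FROM entity_tier WHERE entity_id = ?", (entity_id,)
    ).fetchone()
    # Application-level rule: absence of a row means the mapped value.
    return row[0] if row else MAPPED_VALUE
```

If most entities carry the mapped value, the second table stays small, which is the effect the abstract attributes to this technique.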

INCREMENTALLY IMPROVING CLUSTERING OF CROSS PARTITION DATA IN A DISTRIBUTED DATA SYSTEM

Methods and systems are provided for improved access to rows of data in a distributed data system. Each data row is associated with a partition. Data rows are distributed in one or more files, and an impure file includes data rows associated with multiple partitions. A clustering set is generated from a plurality of impure files by selecting a candidate impure file based on file access activity metrics and one or more neighbor impure files. Data rows of the impure files included in the clustering set are sorted according to their respective associated partitions. A set of disjoint partition range files is generated based on the sorted data rows of the impure files included in the clustering set. Each file of the set of disjoint partition range files is transferred to a respective target partition.
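
The clustering flow in this abstract can be approximated with a short sketch. Several details are assumptions, since the abstract leaves them open: the access metric is a simple per-file count, a neighbor is any other impure file that shares a partition with the candidate, and each output "range" covers exactly one partition. Rows are modeled as `(partition, value)` tuples, and the function names are hypothetical.

```python
from itertools import groupby


def partitions_of(rows):
    return {partition for partition, _ in rows}


def build_clustering_set(files, access_counts):
    """Pick the most-accessed impure file plus the impure files that overlap it."""
    impure = {name: rows for name, rows in files.items() if len(partitions_of(rows)) > 1}
    if not impure:
        return []
    candidate = max(impure, key=lambda name: access_counts.get(name, 0))
    candidate_parts = partitions_of(impure[candidate])
    neighbors = [name for name in impure
                 if name != candidate and candidate_parts & partitions_of(impure[name])]
    return [candidate] + neighbors


def recluster(files, clustering_set):
    """Sort the set's rows by partition and emit one disjoint output per partition."""
    rows = sorted((row for name in clustering_set for row in files[name]),
                  key=lambda row: row[0])
    return {partition: [value for _, value in group]
            for partition, group in groupby(rows, key=lambda row: row[0])}
```

Each entry of the returned mapping could then be handed to its target partition. Repeating the process with the next candidate is one way to read the incremental improvement named in the title.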

Method, computer program and system for transferring a file

A method for transferring a digital file from an OPC UA server to an OPC UA client that is executed in a web browser as a web application, wherein an OPC UA file module is used to open the desired file on the OPC UA server, the digital data it contains are read using the OPC UA communication protocol, and the open file is subsequently closed again. From the read digital data, a file is then formed that is a copy of the file to be transferred, and the formed file is provided to the web browser of the client as a file download.

Storage system garbage collection and defragmentation
11226934 · 2022-01-18

Metadata of each file of a group of files of a storage and chunk file metadata are analyzed to identify one or more file segment data chunks that are not referenced by the group of files of the storage. Fragmented chunk files to be combined together are identified based at least in part on the one or more identified file segment data chunks. The chunk file metadata is updated with an update that concurrently reflects the removal of at least a portion of the one or more file segment data chunks that are not referenced by the group of files and the combination of the identified fragmented chunk files.