Patent classifications
H03M7/3091
IN-MEMORY DATA STORAGE WITH TRANSPARENT COMPRESSION
A storage aware memory controller and method for managing a physical storage system. A described controller includes: a system for mapping physical memory space into a memory region and a storage region; a system for applying different error protection schemes, in which a fine-grained memory fault tolerance scheme is applied to data in the memory region and a coarse-grained memory fault tolerance scheme is applied to data in the storage region; and an in-memory storage filesystem that compresses and stores individual pages of data in the storage region, wherein each page of data is compressed into a set of codewords that are codeword aligned such that no codeword shares compressed data from different pages, and wherein the in-memory storage filesystem stores a compression-aware logical block address (CA-LBA) for each page of data.
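The codeword alignment described above can be illustrated with a minimal sketch. It is not the patented controller: the 64-byte codeword size, the use of zlib, the zero padding, and the names `pack_pages` and `read_page` are all assumptions made for illustration. Each index entry plays the role of a CA-LBA by recording where a page's codewords start and how many it occupies.

```python
import zlib

CODEWORD_SIZE = 64  # bytes per codeword; an assumed value for illustration


def pack_pages(pages):
    """Compress each page and pad it to a whole number of codewords."""
    region = bytearray()
    index = []  # one (first_codeword, codeword_count, compressed_len) per page
    for page in pages:
        blob = zlib.compress(page)
        count = -(-len(blob) // CODEWORD_SIZE)  # ceiling division
        first = len(region) // CODEWORD_SIZE
        # Padding guarantees the next page starts on a codeword boundary,
        # so no codeword holds compressed data from two pages.
        region += blob.ljust(count * CODEWORD_SIZE, b"\0")
        index.append((first, count, len(blob)))
    return bytes(region), index


def read_page(region, entry):
    """Fetch and decompress one page using its index entry."""
    first, _count, length = entry
    start = first * CODEWORD_SIZE
    return zlib.decompress(region[start:start + length])
```

Padding wastes some space per page, but it lets any page be read or rewritten without touching its neighbours' codewords.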
COMPRESSION DICTIONARY SNAPSHOT SYSTEM AND METHOD
A system configured to generate a set of compression dictionary snapshots. The system can determine a subset of a set of compression dictionary definitions, the subset having a first subset comprising one or more definitions that have changed since a time of a previous snapshot and a second subset having one or more definitions associated with a predetermined portion of the dictionary. The system can further generate and store snapshots based at least in part on the determined subset of one or more definitions and determine a plurality of active snapshots from the set of snapshots such that the set of one or more definitions is included in the plurality of active snapshots.
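One way to read the snapshot scheme above is sketched below, under stated assumptions: definitions have integer ids, the "predetermined portion" is chosen by rotating over id partitions, and the active snapshots are simply the most recent `partitions` snapshots. The names `take_snapshot` and `restore` are hypothetical, and deletions are not handled.

```python
def take_snapshot(dictionary, changed_ids, seq, partitions=4):
    """Snapshot the changed definitions plus one rotating portion."""
    # First subset: definitions changed since the previous snapshot.
    changed = {i for i in changed_ids if i in dictionary}
    # Second subset: the portion of the dictionary assigned to this snapshot.
    portion = {i for i in dictionary if i % partitions == seq % partitions}
    return {i: dictionary[i] for i in changed | portion}


def restore(active_snapshots):
    """Rebuild the dictionary by replaying active snapshots oldest first."""
    rebuilt = {}
    for snapshot in active_snapshots:
        rebuilt.update(snapshot)
    return rebuilt
```

Because every partition appears once in any run of `partitions` consecutive snapshots, that run covers every definition, and later snapshots overwrite earlier ones with any changed values.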
System and method for improving data compression of a storage system in an online manner
Techniques for improving data compression of a storage system in an online manner are described herein. According to one embodiment, in response to a sequence of data to be stored, the sequence of data is partitioned into a plurality of data chunks according to a predetermined chunking algorithm. A sketch for each of the data chunks is generated based on one or more features extracted from the data chunk. Each of the data chunks of the sequence of data is associated with one of a plurality of groups based on the sketch, wherein each group is represented by a sketch. The data chunks of each group are compressed and stored in a compression region of the storage system, such that similar data chunks are compressed and stored in the same compression region.
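The chunk, sketch, group, and compress steps above might look roughly like the following. This is a sketch under assumptions, not the described embodiment: fixed-size chunking stands in for the predetermined chunking algorithm, each feature is the maximum salted CRC-32 over 8-byte windows, and the function names are hypothetical.

```python
import zlib
from collections import defaultdict


def chunk(data, size=256):
    """Fixed-size chunking (an assumed stand-in for the chunking algorithm)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def sketch(chunk_bytes, window=8, salts=(1, 2)):
    """One feature per salt: the largest salted CRC-32 over sliding windows."""
    last = max(1, len(chunk_bytes) - window + 1)
    windows = [chunk_bytes[i:i + window] for i in range(last)]
    return tuple(max(zlib.crc32(w, salt) for w in windows) for salt in salts)


def group_and_compress(data):
    """Group chunks by sketch, then compress each group as one region."""
    groups = defaultdict(list)
    for c in chunk(data):
        groups[sketch(c)].append(c)
    return {s: zlib.compress(b"".join(cs)) for s, cs in groups.items()}
```

Identical chunks always share a sketch. Chunks that differ in only a few bytes will often share one too, because the maximum-hash window is likely to fall in an unchanged part of the chunk, but that is a tendency rather than a guarantee.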
METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT FOR DATA PROCESSING
Embodiments of the present disclosure provide a method, an electronic device, and a computer program product for data processing. The method includes: determining, based on sizes of multiple data segments included in data to be processed, a first time required to perform a matching operation for each data segment, wherein the matching operation is used to determine non-duplicate data segments; determining, based on the size of each data segment and a compression level for the data to be processed, a second time required to perform a compression operation for each data segment; and determining, based on the first time, the second time, and a de-duplication rate for the data to be processed, a target mode for processing the multiple data segments from a first mode and a second mode, wherein in the first mode, a compression operation is performed only on the non-duplicate data segments in the multiple data segments, and in the second mode, a compression operation is performed on each of the multiple data segments. In this way, the data processing mode can be dynamically selected according to features of the data to be processed, thereby improving the efficiency of data processing.
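A minimal sketch of the mode decision above, assuming a simple linear cost model that the abstract does not specify: matching time and compression time grow with segment size, and compression time also grows with the compression level. The per-byte rates and the name `choose_mode` are hypothetical.

```python
def choose_mode(segment_sizes, compression_level, dedup_rate,
                match_ns_per_byte=1.0, compress_ns_per_byte_per_level=0.5):
    """Return "first" (match, then compress unique segments) or "second"."""
    # First time: matching every segment to find the non-duplicates.
    match_time = sum(s * match_ns_per_byte for s in segment_sizes)
    # Second time: compressing every segment at the given level.
    compress_time = sum(
        s * compress_ns_per_byte_per_level * compression_level
        for s in segment_sizes
    )
    # First mode pays for matching but skips compressing duplicates.
    first_mode_cost = match_time + (1.0 - dedup_rate) * compress_time
    # Second mode compresses everything and skips matching.
    second_mode_cost = compress_time
    return "first" if first_mode_cost < second_mode_cost else "second"
```

Under this model the first mode wins when the de-duplication rate is high or compression is expensive, and the second mode wins when few segments are duplicates and compression is cheap.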
METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT FOR DATA COMPRESSION
Embodiments of the present disclosure provide a method, an electronic device, and a computer program product for data compression. The method includes: determining an amount of data to be compressed in a storage system; determining, based on the amount of the data to be compressed, a target compression level for compressing the data to be compressed; and compressing the data to be compressed according to the target compression level. In this way, it is possible to compress data to be compressed using a compression level corresponding to the amount of the data to be compressed, thereby improving the efficiency of data compression in the storage system.
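The level selection above can be sketched as a threshold table. The abstract does not say which direction the mapping runs, so this example assumes that a larger backlog calls for a lower (faster) zlib level; the thresholds and function names are hypothetical.

```python
import zlib


def target_level(pending_bytes, thresholds=((1 << 20, 9), (16 << 20, 6))):
    """Map the amount of pending data to a zlib level (assumed policy)."""
    for limit, level in thresholds:
        if pending_bytes <= limit:
            return level
    return 1  # very large backlog: favour speed over ratio


def compress_pending(buffers):
    """Pick one level for the whole backlog and compress every buffer."""
    level = target_level(sum(len(b) for b in buffers))
    return level, [zlib.compress(b, level) for b in buffers]
```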
ADDITIONAL COMPRESSION FOR EXISTING COMPRESSED DATA
Techniques are provided for implementing additional compression for existing compressed data. Format information stored within a data block is evaluated to determine whether the data block is compressed or uncompressed. In response to the data block being compressed according to a first compression format, the data block is decompressed using the format information. The data block is compressed with one or more other data blocks to create compressed data having a second compression format different than the first compression format.
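As a rough illustration of the recompression flow above, the sketch below assumes a one-byte format tag at the start of each block, zlib as the first compression format, and LZMA as the second. The tag values and function names are hypothetical and not taken from the source.

```python
import lzma
import zlib

RAW, ZLIB = b"\x00", b"\x01"  # assumed one-byte format tags


def make_block(payload, compress):
    """Build a block whose first byte records its format."""
    return ZLIB + zlib.compress(payload) if compress else RAW + payload


def unpack_block(block):
    """Use the format tag to decide whether decompression is needed."""
    tag, body = block[:1], block[1:]
    return zlib.decompress(body) if tag == ZLIB else body


def recompress_group(blocks):
    """Decompress each block, then compress them together in a second format."""
    payloads = [unpack_block(b) for b in blocks]
    return [len(p) for p in payloads], lzma.compress(b"".join(payloads))


def read_group(lengths, blob):
    """Split the decompressed group back into the original payloads."""
    data, out, pos = lzma.decompress(blob), [], 0
    for n in lengths:
        out.append(data[pos:pos + n])
        pos += n
    return out
```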
LAYOUT FORMAT FOR COMPRESSED DATA
Techniques are provided for a layout format for compressed data. A first set of data blocks are grouped into a first group based upon a first frequency of access to the first set of data blocks. A second set of data blocks are grouped into a second group based upon a second frequency of access to the second set of data blocks. The first set of data blocks are compressed into a first compression group using a first compression algorithm. The second set of data blocks are compressed into a second compression group using a second compression algorithm.
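The grouping above might be sketched as follows, assuming a single access-count threshold separates the two groups, a fast zlib level for frequently accessed blocks, and LZMA for rarely accessed ones. The threshold, the algorithm choices, and the name `layout` are assumptions for illustration only.

```python
import lzma
import zlib


def layout(blocks, access_counts, hot_threshold=10):
    """Group block ids by access frequency and compress each group."""
    hot = [i for i in blocks if access_counts.get(i, 0) >= hot_threshold]
    cold = [i for i in blocks if access_counts.get(i, 0) < hot_threshold]
    return {
        # Frequently read blocks: cheap to decompress.
        "hot": (hot, zlib.compress(b"".join(blocks[i] for i in hot), 1)),
        # Rarely read blocks: spend more effort for a better ratio.
        "cold": (cold, lzma.compress(b"".join(blocks[i] for i in cold))),
    }
```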
Efficient storage and retrieval of localized software resource data
A method and system for compressing and decompressing a localized software resource are disclosed. The method may include receiving a software resource in a first language and receiving a localized software resource in a second language for compression, where the software resource in the first language is a counterpart of the localized software resource. Upon receiving the software resources, a local dictionary is created for the localized software resource based at least in part on one or more first-language words in the software resource and on data from a global dictionary, and the localized software resource is compressed based on the local dictionary.
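A minimal sketch of dictionary-assisted compression along these lines, using zlib's preset-dictionary support. It assumes the global dictionary maps first-language words to their second-language translations, which the abstract does not state; the function names are hypothetical.

```python
import zlib


def build_local_dictionary(source_resource, global_dictionary):
    """Collect translations of the first-language words found in the resource."""
    words = sorted(set(source_resource.split()))
    entries = [global_dictionary[w] for w in words if w in global_dictionary]
    return " ".join(entries).encode("utf-8")


def compress_localized(localized, zdict):
    """Compress the localized text with the local dictionary preloaded."""
    compressor = zlib.compressobj(zdict=zdict)
    return compressor.compress(localized.encode("utf-8")) + compressor.flush()


def decompress_localized(blob, zdict):
    """Decompression needs the same local dictionary."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return (decompressor.decompress(blob) + decompressor.flush()).decode("utf-8")
```

Because the local dictionary can be rebuilt from the first-language resource and the global dictionary, it does not need to be stored alongside each compressed localized resource.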
Approach to improve decompression performance by scatter gather compression units and also updating checksum mechanism
A method, apparatus, and system for decompressing data with a hardware compression/decompression accelerator is disclosed. The operations comprise: submitting compressed data from a plurality of stored compression units in a compression region to a hardware compression/decompression accelerator in a single submission for decompression, wherein each compression unit stores a checksum calculated based on corresponding uncompressed data; decompressing, at the hardware compression/decompression accelerator, the compressed data from the plurality of stored compression units, the decompressing generating combined decompressed data corresponding to the compressed data; calculating, at the hardware compression/decompression accelerator, a first combined checksum based on the combined decompressed data; calculating a second combined checksum based on individual checksums stored in the plurality of compression units; determining whether the first combined checksum matches the second combined checksum; and if the combined checksums match, forwarding the combined decompressed data to a storage device for storage as uncompressed data.
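The combined-checksum comparison above can be illustrated in software. This sketch makes several assumptions: Adler-32 is the checksum (the abstract does not name one), each compression unit also records its uncompressed length, and plain zlib calls stand in for the hardware accelerator. The helper `adler32_combine` is written out here because Python's `zlib` module does not expose one.

```python
import zlib

MOD = 65521  # Adler-32 modulus


def adler32_combine(adler1, adler2, len2):
    """Adler-32 of X+Y from adler32(X), adler32(Y), and len(Y)."""
    a1, b1 = adler1 & 0xFFFF, adler1 >> 16
    a2, b2 = adler2 & 0xFFFF, adler2 >> 16
    a = (a1 + a2 - 1) % MOD
    b = (b1 + b2 + len2 * (a1 - 1)) % MOD
    return (b << 16) | a


def make_unit(data):
    """A compression unit storing a checksum of its uncompressed data."""
    return {"payload": zlib.compress(data),
            "checksum": zlib.adler32(data),
            "length": len(data)}


def decompress_units(units):
    """Decompress all units together and verify one combined checksum."""
    combined = b"".join(zlib.decompress(u["payload"]) for u in units)
    first = zlib.adler32(combined)  # computed from the decompressed data
    second = 1  # Adler-32 of empty input
    for u in units:  # computed from stored per-unit checksums only
        second = adler32_combine(second, u["checksum"], u["length"])
    if first != second:
        raise ValueError("combined checksum mismatch")
    return combined
```

The point of the combine step is that the second checksum never touches the decompressed bytes, so a single comparison validates the whole batch instead of one check per unit.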
SYSTEMS AND METHODS TO DECREASE THE SIZE OF A COMPOUND VIRTUAL APPLIANCE FILE
An application is provided as a compound virtual appliance having components to be hosted by virtual machines. Each component includes a set of virtual machine disks. Partial versions of the components are created by removing from each component each virtual machine disk determined to be a duplicate of a virtual machine disk of another component. A compact version of the compound virtual appliance is created by packing together the partial versions of the components and a single copy of each virtual machine disk having been determined to be a duplicate. The compact compound virtual appliance is deployed to a customer site. At the customer site, a complete version of the compound virtual appliance is reconstructed by adding back the single copy of each virtual machine disk having been determined to be a duplicate into each component having had the duplicate virtual machine disk removed.
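The pack and reconstruct steps above can be sketched as follows, assuming duplicates are detected by SHA-256 digest of disk contents and that the appliance is modelled as nested dictionaries of component name to disk name to bytes. The function names and data layout are hypothetical.

```python
import hashlib
from collections import Counter


def _digest(data):
    return hashlib.sha256(data).hexdigest()


def pack(components):
    """Replace disks duplicated across components with a shared single copy."""
    seen_in = Counter()  # digest -> number of components containing it
    for disks in components.values():
        for h in {_digest(d) for d in disks.values()}:
            seen_in[h] += 1
    shared, partial = {}, {}
    for name, disks in components.items():
        partial[name] = {}
        for disk_name, data in disks.items():
            h = _digest(data)
            if seen_in[h] > 1:
                shared[h] = data  # keep one copy for all components
                partial[name][disk_name] = ("ref", h)
            else:
                partial[name][disk_name] = ("data", data)
    return partial, shared


def unpack(partial, shared):
    """Rebuild complete components by adding the shared disks back."""
    return {
        name: {dn: shared[v] if kind == "ref" else v
               for dn, (kind, v) in disks.items()}
        for name, disks in partial.items()
    }
```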