H03M7/3091

Efficient deduplication of compressed files

The present disclosure describes techniques for performing efficient deduplication of compressed source data. The techniques may reduce the storage footprint required for deduplication of compressed data. To reduce the required storage size, the system may perform additional decompression/recompression processes by identifying the particular compression algorithm used by a source storage system. Once the compression algorithm is identified, the system may decompress the data and then perform fingerprint analysis on segments of the uncompressed file. When a recovery process is initiated, the system may recompress the deduplicated data using the same compression algorithm used by the source storage system. Accordingly, the data recovery process may be performed in a manner in which the client device receives the restored data as expected and in the original compression format.
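The flow above can be sketched in Python. This is a minimal illustration, not the patented method: the magic-byte table, 4 KiB segment size, and SHA-256 fingerprints are assumptions, and only gzip is handled.

```python
import gzip
import hashlib

# Hypothetical magic-byte table for identifying the source's compression format.
MAGIC = {b"\x1f\x8b": "gzip"}

def identify_algorithm(data: bytes) -> str:
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "unknown"

def fingerprint_segments(raw: bytes, segment_size: int = 4096) -> list:
    # Fingerprint fixed-size segments of the *uncompressed* data.
    return [hashlib.sha256(raw[i:i + segment_size]).hexdigest()
            for i in range(0, len(raw), segment_size)]

def deduplicate(compressed: bytes, store: dict):
    algo = identify_algorithm(compressed)
    raw = gzip.decompress(compressed) if algo == "gzip" else compressed
    fps = fingerprint_segments(raw)
    for fp, i in zip(fps, range(0, len(raw), 4096)):
        store.setdefault(fp, raw[i:i + 4096])  # keep each unique segment once
    # Remember the algorithm so recovery can restore the original format.
    return algo, fps

def restore(algo: str, fps: list, store: dict) -> bytes:
    raw = b"".join(store[fp] for fp in fps)
    return gzip.compress(raw) if algo == "gzip" else raw
```

On restore, the client receives data recompressed with the same algorithm the source used, as the abstract describes.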

Optimizing Offline Map Data Updates

In some implementations, a system can optimize offline map data updates. For example, a server device in the system can determine a metric for identifying map data objects based on attributes of the map data objects. The server device can then generate a quadtree that stores the map data objects in nodes of the quadtree based on the metric. When processing an update to the map data stored at the server device, the server device can generate update data describing the updates for each node in the quadtree based on a binary difference algorithm and/or a semantic difference algorithm. The server device can select the algorithm based on which algorithm results in the smallest compressed size of the update data.
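The algorithm-selection step can be sketched as follows. This is an illustrative stand-in, assuming JSON-encoded map objects; the "binary diff" here just ships the new bytes, whereas a real system would use an actual binary difference algorithm such as bsdiff.

```python
import json
import zlib

def binary_diff(old: bytes, new: bytes) -> bytes:
    # Illustrative stand-in for a real binary difference algorithm;
    # here the "diff" is simply the new bytes.
    return new

def semantic_diff(old: dict, new: dict) -> bytes:
    # Describe only the attributes that changed.
    changes = {k: v for k, v in new.items() if old.get(k) != v}
    return json.dumps(changes, sort_keys=True).encode()

def pick_update(old: dict, new: dict):
    candidates = {
        "binary": zlib.compress(
            binary_diff(json.dumps(old).encode(), json.dumps(new).encode())),
        "semantic": zlib.compress(semantic_diff(old, new)),
    }
    # Select whichever algorithm yields the smallest compressed update.
    name = min(candidates, key=lambda k: len(candidates[k]))
    return name, candidates[name]
```

When only a few attributes of a node change, the semantic diff typically compresses smaller; when the object changes wholesale, the binary diff can win.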

Preservation of data during scaling of a geographically diverse data storage system

Preservation of data during scaling of a geographically diverse data storage system is disclosed. In regard to scaling-in, a first zone storage component (ZSC) can be placed in read-only (RO) mode to allow continued access to data stored on the first ZSC, completion of previously queued operations, updating of data chunks, etc. Data chunks can comprise metadata stored in directory table partitions organized in a tree data structure scheme. An updated data chunk of the first ZSC can be replicated at other ZSCs before deleting the first ZSC. A first hash function can be used to distribute portions of the updated data chunk among the other ZSCs. A second hash function can be used to distribute key data values corresponding to the distributed portions of the updated data chunk among the other ZSCs. Employing the first and second hash functions can result in more efficient use of storage space and more even distribution of key data values when compared to simple replication of a data chunk of the first ZSC by the other ZSCs.
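The two-hash-function distribution can be sketched as below. Zone names, hash choices (SHA-256 and MD5), and the key-derivation scheme are all assumptions for illustration; the point is only that chunk portions and their key data values are placed by independent hash functions.

```python
import hashlib

ZONES = ["zsc-b", "zsc-c", "zsc-d"]  # remaining zones (hypothetical names)

def h1(portion_id: str) -> str:
    # First hash function: place a portion of the updated data chunk.
    d = hashlib.sha256(portion_id.encode()).digest()
    return ZONES[int.from_bytes(d[:4], "big") % len(ZONES)]

def h2(key: str) -> str:
    # Second, independent hash function: place the corresponding key data value.
    d = hashlib.md5(key.encode()).digest()
    return ZONES[int.from_bytes(d[:4], "big") % len(ZONES)]

def scale_in(chunk_portions: dict) -> dict:
    # Distribute portions and their keys among the surviving zones
    # before the first ZSC is deleted.
    placement = {}
    for pid in chunk_portions:
        placement[pid] = {"data_zone": h1(pid), "key_zone": h2("key:" + pid)}
    return placement
```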

Advanced database compression
11050436 · 2021-06-29

A method, a system, and a computer program product for executing a database compression. A compressed string dictionary having a block size and a front coding bucket size is generated from a dataset. Front coding is applied to one or more buckets of strings in the dictionary having the front coding bucket size to generate one or more front coded buckets of strings. One or more portions of the generated front coded buckets of strings are concatenated to form one or more blocks having the block size. Each block is compressed. A set of compressed blocks is stored. The set of the compressed blocks stores all strings in the dataset.
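A toy version of the bucket/block pipeline is sketched below. The bucket size, block size, and byte encoding (length-prefixed suffixes with NUL terminators) are illustrative assumptions, not the patented layout.

```python
import zlib

BUCKET = 4   # front-coding bucket size (strings per bucket), illustrative
BLOCK = 64   # target block size in bytes, illustrative

def front_code_bucket(strings):
    # First string is stored whole; the rest as (shared-prefix length, suffix).
    head, coded, prev = strings[0], [], strings[0]
    for s in strings[1:]:
        lcp = 0
        while lcp < min(len(prev), len(s)) and prev[lcp] == s[lcp]:
            lcp += 1
        coded.append((lcp, s[lcp:]))
        prev = s
    return head, coded

def build_dictionary(dataset):
    strings = sorted(set(dataset))
    blocks, current = [], b""
    for i in range(0, len(strings), BUCKET):
        head, coded = front_code_bucket(strings[i:i + BUCKET])
        current += head.encode() + b"\x00" + b"".join(
            bytes([lcp]) + suffix.encode() + b"\x00" for lcp, suffix in coded)
        if len(current) >= BLOCK:
            blocks.append(zlib.compress(current))  # compress each full block
            current = b""
    if current:
        blocks.append(zlib.compress(current))
    return blocks  # the set of compressed blocks stores all strings
```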

METHOD FOR COMPRESSING BEHAVIOR EVENT IN COMPUTER AND COMPUTER DEVICE THEREFOR
20210194501 · 2021-06-24

A method for compressing a behavior event and a computer device therefor are provided. The method includes generating, by a processor of the computer, an event block on the basis of an event target when the behavior event occurs; updating, by the processor, the event block with input/output (I/O) information while the behavior event is occurring; and storing, by the processor, the event block when the behavior event ends.
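The generate/update/store lifecycle can be sketched as below. The fields tracked (byte counters, timestamps) are assumptions; the idea illustrated is that repeated I/O operations during one behavior event are folded into a single block rather than logged individually.

```python
import time

class EventBlock:
    """Accumulates I/O information for one behavior event (illustrative)."""
    def __init__(self, target: str):
        self.target = target        # hypothetical event target, e.g. a file path
        self.read_bytes = 0
        self.write_bytes = 0
        self.started = time.time()

    def update(self, op: str, nbytes: int):
        # Fold repeated I/O operations into counters instead of
        # recording each operation as a separate event.
        if op == "read":
            self.read_bytes += nbytes
        elif op == "write":
            self.write_bytes += nbytes

store = []

def on_event(target, ops):
    block = EventBlock(target)      # event begins: generate block for the target
    for op, nbytes in ops:
        block.update(op, nbytes)    # event in progress: update I/O information
    store.append(block)             # event ends: store the single block
    return block
```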

CROSS-PLATFORM DIGITAL CONTENT STORAGE AND SHARING SYSTEM
20210160340 · 2021-05-27

A content sharing system for peer-to-peer private sharing of digital content between users of a sharing platform is configured, using application programs executing on the users' mobile devices, so that the content can only be used (e.g., viewed, shared, etc.) by the receiver from within an instance of the application executing on an authorized receiving device, and only if the owner has set permissions allowing the receiver to perform the requested action. The owner can modify these permissions and promulgate the changes to the platform users via the network of user devices, causing the applications of impacted users to allow or deny activity, and keep or discard data, accordingly. Further, each transaction involving the shared content can be recorded to each party's private or synchronized shared ledger, creating a complete interaction timeline that essentially bestows the content with "memory" of how it has been shared.
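The permission check and dual-ledger recording can be sketched as follows. The data model (a permission table keyed by owner/receiver/content and per-user ledger lists) is an assumption for illustration only.

```python
import time

permissions = {}   # (owner, receiver, content_id) -> set of allowed actions
ledgers = {}       # user -> list of transaction records (interaction timeline)

def set_permission(owner, receiver, content_id, actions):
    permissions[(owner, receiver, content_id)] = set(actions)

def request_action(owner, receiver, content_id, action):
    # The requested action is allowed only if the owner granted it.
    allowed = action in permissions.get((owner, receiver, content_id), set())
    record = {"ts": time.time(), "content": content_id,
              "action": action, "allowed": allowed}
    # Record the transaction to both parties' ledgers, building the
    # content's "memory" of how it has been shared.
    for user in (owner, receiver):
        ledgers.setdefault(user, []).append(record)
    return allowed
```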

Method and system for compressing data
11017155 · 2021-05-25

A system, method, and non-transient computer readable medium containing program instructions for causing a computer to perform a method for compressing data, comprising the steps of: receiving a data string for compression, the data string including a plurality of data elements; creating a template based on processing the data string, the template including information common across all data elements of the data string; creating one or more entries, wherein the one or more entries include information that differs from the template; and storing the template and the one or more entries.
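A minimal sketch of the template/entries split, assuming the data elements are dict-like records; the real claims do not specify a representation.

```python
def compress_records(records):
    # Template: fields whose values are common to every data element.
    keys = set(records[0])
    template = {k: records[0][k] for k in keys
                if all(r.get(k) == records[0][k] for r in records)}
    # Entries: each keeps only the information that differs from the template.
    entries = [{k: v for k, v in r.items() if k not in template}
               for r in records]
    return template, entries

def decompress_records(template, entries):
    # Rebuild each element by overlaying its entry on the template.
    return [{**template, **e} for e in entries]
```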

LOSSLESS REDUCTION OF DATA BY USING A PRIME DATA SIEVE AND PERFORMING MULTIDIMENSIONAL SEARCH AND CONTENT-ASSOCIATIVE RETRIEVAL ON DATA THAT HAS BEEN LOSSLESSLY REDUCED USING A PRIME DATA SIEVE
20210144405 · 2021-05-13

Input data can be losslessly reduced by using a data structure that organizes prime data elements based on their contents. Alternatively, the data structure can organize prime data elements based on the contents of a name that is derived from the prime data elements. Specifically, video data can be losslessly reduced by (1) using the data structure to identify a set of prime data elements, and (2) using the set of prime data elements to losslessly reduce intra-frames. The input data can be dynamically partitioned based on the memory usage of components of the data structure. Parcels can be created based on the partitions to facilitate archiving and movement of the data. The losslessly reduced data can be stored using a set of distilled files and a set of prime data element files.
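The core idea, a content-organized store of prime data elements with distilled files holding only references, can be sketched as below. Fixed-size partitioning and SHA-256 names are simplifications; the actual sieve supports derivation of candidates from prime elements and multidimensional content-associative lookup.

```python
import hashlib

class PrimeDataSieve:
    """Content-organized store of prime data elements (simplified sketch)."""
    def __init__(self):
        self.elements = {}   # name (derived from contents) -> prime element

    def name(self, element: bytes) -> str:
        # Derive the element's name from its contents.
        return hashlib.sha256(element).hexdigest()

    def reduce(self, data: bytes, size: int = 8) -> list:
        # Partition the input; each piece is either matched against an
        # existing prime element or installed as a new one.
        distilled = []
        for i in range(0, len(data), size):
            elem = data[i:i + size]
            n = self.name(elem)
            self.elements.setdefault(n, elem)
            distilled.append(n)   # distilled file holds references only
        return distilled

    def reconstitute(self, distilled: list) -> bytes:
        return b"".join(self.elements[n] for n in distilled)
```

Repetitive input reduces to one stored element plus a list of references, mirroring the distilled-file / prime-data-element-file split the abstract describes.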

CAPACITY REDUCTION IN A STORAGE SYSTEM
20210124532 · 2021-04-29

An aspect includes implementing capacity reduction in a storage system by, for each of a candidate page and a target page in the storage system, identifying a subset of sectors having identical data or a minimum amount of non-identical data, performing a bit-wise exclusive OR (XOR) operation on sectors of the candidate page and the target page, and determining entropy from the results of the XOR operation. Upon determining that the entropy is less than or equal to a threshold value, an aspect includes building a reference page from an XOR sector containing the results of the bit-wise XOR operation, and performing a compression operation on the reference page.
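The XOR-then-entropy test can be sketched as below, operating on whole pages for simplicity. The entropy threshold is an illustrative assumption; the intuition is that when two pages are similar their XOR is mostly zeros, has low entropy, and compresses extremely well.

```python
import math
import zlib

def xor_pages(a: bytes, b: bytes) -> bytes:
    # Bit-wise XOR of corresponding bytes of the candidate and target pages.
    return bytes(x ^ y for x, y in zip(a, b))

def entropy(data: bytes) -> float:
    # Shannon entropy of the XOR result, in bits per byte.
    if not data:
        return 0.0
    counts = {}
    for byte in data:
        counts[byte] = counts.get(byte, 0) + 1
    return -sum((c / len(data)) * math.log2(c / len(data))
                for c in counts.values())

THRESHOLD = 1.0   # illustrative entropy threshold

def try_reference_page(candidate: bytes, target: bytes):
    xored = xor_pages(candidate, target)
    if entropy(xored) <= THRESHOLD:
        # Build and compress the reference page.
        return zlib.compress(xored)
    return None   # pages too dissimilar to benefit
```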

APPROACH TO IMPROVE DECOMPRESSION PERFORMANCE BY SCATTER GATHER COMPRESSION UNITS AND ALSO UPDATING CHECKSUM MECHANISM
20210109806 · 2021-04-15

A method, apparatus, and system for decompressing data with a hardware compression/decompression accelerator is disclosed. The operations comprise: submitting compressed data from a plurality of stored compression units in a compression region to a hardware compression/decompression accelerator in a single submission for decompression, wherein each compression unit stores a checksum calculated based on corresponding uncompressed data; decompressing, at the hardware compression/decompression accelerator, the compressed data from the plurality of stored compression units, the decompressing generating combined decompressed data corresponding to the compressed data; calculating, at the hardware compression/decompression accelerator, a first combined checksum based on the combined decompressed data; calculating a second combined checksum based on individual checksums stored in the plurality of compression units; determining whether the first combined checksum matches the second combined checksum; and if the combined checksums match, forwarding the combined decompressed data to a storage device for storage as uncompressed data.
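The checksum comparison can be sketched in software as below. A simple additive checksum is used for illustration because individual checksums then combine trivially by summation; a real accelerator would more likely use CRC combining, and the hardware offload itself is not modeled.

```python
import zlib

def checksum(data: bytes) -> int:
    # Simple additive checksum (illustrative; combines by summation).
    return sum(data) % (1 << 32)

def decompress_submission(units):
    # units: list of (compressed_bytes, stored_checksum) compression units,
    # submitted together as a single batch.
    parts = [zlib.decompress(c) for c, _ in units]
    combined = b"".join(parts)
    # First combined checksum: computed over the combined decompressed data.
    first = checksum(combined)
    # Second combined checksum: derived from the stored per-unit checksums.
    second = sum(s for _, s in units) % (1 << 32)
    if first != second:
        raise ValueError("combined checksum mismatch")
    return combined   # forward to storage as uncompressed data
```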