IPIQ

H03M7/3091

Additional compression for existing compressed data

11728827 · 2023-08-15 ·

NetApp, Inc.

Techniques are provided for implementing additional compression for existing compressed data. Format information stored within a data block is evaluated to determine whether the data block is compressed or uncompressed. In response to the data block being compressed according to a first compression format, the data block is decompressed using the format information. The data block is compressed with one or more other data blocks to create compressed data having a second compression format different than the first compression format.
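The flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the one-byte format tag, the choice of zlib as the first format and LZMA as the second, and all function names are assumptions made for the example.

```python
import lzma
import zlib

# Hypothetical one-byte format tags; the abstract does not specify a layout.
RAW, ZLIB = b"\x00", b"\x01"

def make_block(payload, compress):
    """Build a block whose first byte records its format."""
    return (ZLIB + zlib.compress(payload)) if compress else (RAW + payload)

def read_block(block):
    """Inspect the format tag and return the uncompressed payload."""
    tag, body = block[:1], block[1:]
    if tag == ZLIB:
        return zlib.decompress(body)
    if tag == RAW:
        return body
    raise ValueError("unknown format tag")

def recompress_group(blocks):
    """Decompress blocks where needed, then compress the whole group
    together in a second format (LZMA, chosen only for illustration)."""
    payloads = [read_block(b) for b in blocks]
    return lzma.compress(b"".join(payloads))

blocks = [make_block(b"abc" * 100, True), make_block(b"xyz" * 100, False)]
packed = recompress_group(blocks)
restored = lzma.decompress(packed)
```

Compressing several blocks together gives the second compressor a larger window to find redundancy in, which is one plausible motivation for the grouping step.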

DATA COMPRESSION APPARATUS AND DATA COMPRESSION METHOD

20220131555 · 2022-04-28 ·

Hitachi, Ltd.

A compression engine calculates replacement CRC codes, in predetermined data lengths, for DIF-in cleartext data including cleartext data and multiple CRC codes based on the cleartext data. The compression engine generates headered compressed-text data in which a header including the replacement CRC codes is added to compressed-text data in which the cleartext data is compressed, and generates code-in compressed-text data by calculating multiple CRC codes based on the headered compressed-text data to add the calculated CRC codes to the headered compressed-text data.

TECHNIQUES FOR GENERATING DATA SETS WITH SPECIFIED COMPRESSION AND DEDUPLICATION RATIOS

20220129190 · 2022-04-28 ·

EMC IP Holding Company LLC

Techniques for generating data sets may include: receiving an initial buffer that achieves a compression ratio responsive to compression processing using a compression algorithm, the initial buffer including first content located at a first position in the initial buffer and including second content located at a second position in the initial buffer; and generating a data set of buffers using the initial buffer. The data set may be expected to achieve a specified deduplication ratio responsive to deduplication processing and to achieve the compression ratio responsive to compression processing using the compression algorithm. Generating the data set may include generating a first plurality of buffers where each buffer of the first plurality is not a duplicate of another buffer in the first plurality, and generating a second plurality of duplicate buffers. Each duplicate buffer may be a duplicate of a buffer in the first plurality of buffers.
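One loose way to picture this is the sketch below. It assumes an integer deduplication ratio, builds the initial buffer from half random and half zero bytes, and makes buffers unique by stamping a counter into the first eight bytes. None of these choices come from the abstract, and the function names are hypothetical.

```python
import os

def make_initial_buffer(size=4096, random_fraction=0.5):
    """Random bytes (incompressible) followed by zeros (highly compressible)."""
    n_random = int(size * random_fraction)
    return os.urandom(n_random) + bytes(size - n_random)

def generate_data_set(initial, n_unique, dedup_ratio):
    """Return n_unique * dedup_ratio buffers, of which n_unique are distinct."""
    unique = []
    for i in range(n_unique):
        buf = bytearray(initial)
        buf[0:8] = i.to_bytes(8, "big")  # stamp a counter so each buffer differs
        unique.append(bytes(buf))
    # Every remaining buffer is an exact copy of one of the unique buffers.
    n_duplicates = n_unique * (dedup_ratio - 1)
    duplicates = [unique[i % n_unique] for i in range(n_duplicates)]
    return unique + duplicates

data_set = generate_data_set(make_initial_buffer(), n_unique=10, dedup_ratio=3)
```

Because only a few bytes change per buffer, each buffer should compress about as well as the initial one, while the ratio of total to distinct buffers fixes the deduplication ratio.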

Client-side compression

11720270 · 2023-08-08 ·

EMC IP Holding Company LLC

A method of sending blocks of data from a client to be stored at a storage server, wherein, for each block, compression and encryption are performed at the client and deduplication is performed at the server. Security is thus enhanced because the block is compressed and encrypted both when it is sent over an unsecured network and when it is stored, potentially in a third-party backup system. Provisions are made to enable the addition of new compression algorithms and the retirement of old ones, while ensuring that a client does not receive a block that was compressed using an unsupported (e.g., retired) compression algorithm. In some examples, a compression algorithm ID is tied to an encryption key version to enable refresh of blocks compressed with an old algorithm.

DATA COMPRESSION METHOD AND DATA DECOMPRESSION METHOD FOR ELECTRONIC DEVICE, AND ELECTRONIC DEVICE

20220121626 · 2022-04-21 ·

A data compression method and a data decompression method for an electronic device, and an electronic device, are provided to make compressed data smaller, so that the overheads of storing, sending, and receiving data are reduced. Each of one or more matching rules includes one or more matching entries, each matching entry is used to perform matching on one or more pieces of to-be-matched data in a to-be-matched data group, and each matching entry includes a preset field and a matching rule field. The method includes: receiving a to-be-matched data group, obtaining a target matching rule by performing matching based on the preset field and the matching rule field in each matching entry in the one or more matching rules, and performing processing based on a compression rule field in each matching entry in the target matching rule.

Selection of hash key sizes for data deduplication

11232075 · 2022-01-25 ·

EMC IP Holding Company LLC

Techniques for data processing may include: receiving a data chunk; determining a metric value denoting a degree of compressibility of the data chunk; selecting, in accordance with the metric value denoting the compressibility of the data chunk, a first size of a plurality of sizes, wherein each of the plurality of sizes denotes a different size of an amount of storage used for storing a value of said each size; and performing the data deduplication processing for the data chunk, wherein the data deduplication processing includes using a first hash value for the data chunk to determine whether the data chunk is a duplicate of another data chunk of a hash table, wherein the first hash value is stored in a storage location of the first size.
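The selection step might look something like the sketch below. The compressibility metric (zlib output size over input size), the 0.5 threshold, the two digest sizes, and the direction of the mapping are all assumptions for illustration; the abstract does not specify them. Chunks are assumed to be non-empty.

```python
import hashlib
import os
import zlib

def compressibility(chunk):
    """Compressed size divided by original size; lower is more compressible."""
    return len(zlib.compress(chunk)) / len(chunk)

def select_hash_size(metric):
    """Pick a digest size in bytes (threshold and sizes are arbitrary)."""
    return 8 if metric < 0.5 else 16

def dedupe(chunks):
    """Store each distinct chunk once, keyed by a hash of the selected size."""
    table = {}
    for chunk in chunks:
        size = select_hash_size(compressibility(chunk))
        key = hashlib.blake2b(chunk, digest_size=size).digest()
        if table.get(key) == chunk:
            continue  # duplicate of a chunk already in the table
        table[key] = chunk
    return table

chunks = [bytes(4096), os.urandom(4096), bytes(4096)]
table = dedupe(chunks)
stored = len(table)
```

Here the two all-zero chunks share one 8-byte key, while the random chunk gets a 16-byte key, so only two entries are stored.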

Information processing device, information processing method, and data structure

11222068 · 2022-01-11 ·

Fujitsu Limited

Hidetoshi Matsumura

An information processing device includes: a memory; and a processor coupled to the memory and configured to: convert target data into first data by predetermined arithmetic processing; generate second data based on the converted first data and identification information which specifies a file of the target data; and store the target data in an address of a memory corresponding to the generated second data.

OPPORTUNISTIC CONTENT DELIVERY USING DELTA CODING

20230328131 · 2023-10-12 ·

David Lerner

Systems and methods are described for avoiding redundant data transfers using delta coding techniques when reliably and opportunistically communicating data to multiple user systems. According to embodiments, user systems track received block sequences for locally stored content blocks. An intermediate server intercepts content requests between user systems and target hosts, and deterministically chunks and fingerprints content data received in response to those requests. A fingerprint of a received content block is communicated to the requesting user system, and the user system determines based on the fingerprint whether the corresponding content block matches a content block that is already locally stored. If so, the user system returns a set of fingerprints representing a sequence of next content blocks that were previously stored after the matching content block. The intermediate server can then send only those content data blocks that are not already locally stored at the user system according to the returned set of fingerprints.
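A much-simplified sketch of the fingerprint exchange is shown below. It assumes fixed-size blocks and SHA-256 fingerprints, and it collapses the back-and-forth of fingerprint sequences into a single set of fingerprints the client already holds. The function names are hypothetical and the protocol in the application is more involved.

```python
import hashlib

BLOCK_SIZE = 4  # tiny, for illustration only

def chunk(data, size=BLOCK_SIZE):
    """Deterministically split data into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def fingerprint(block):
    return hashlib.sha256(block).hexdigest()

def blocks_to_send(content, client_fingerprints):
    """Server side: pair each fingerprint with its block, or with None
    when the client reports that it already stores that block."""
    messages = []
    for block in chunk(content):
        fp = fingerprint(block)
        messages.append((fp, None if fp in client_fingerprints else block))
    return messages

def reassemble(messages, client_cache):
    """Client side: fill gaps from the local cache and store new blocks."""
    parts = []
    for fp, block in messages:
        if block is not None:
            client_cache[fp] = block
        parts.append(client_cache[fp])
    return b"".join(parts)

client_cache = {fingerprint(b): b for b in chunk(b"aaaabbbb")}
content = b"aaaaccccbbbb"
messages = blocks_to_send(content, set(client_cache))
sent = sum(1 for _, block in messages if block is not None)
result = reassemble(messages, client_cache)
```

In this run only the `cccc` block travels over the link; the other two are rebuilt from the client's cache.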

Method for compressing behavior event in computer and computer device therefor

11784661 · 2023-10-10 ·

Somma, Inc.

Yonghwan Roh

A method for compressing a behavior event and a computer device therefor are provided. The method includes generating, by a processor of the computer, an event block on the basis of an event target when the behavior event occurs; updating, by the processor, the event block with input/output (I/O) information while the behavior event occurs; and storing, by the processor, the event block when the behavior event ends.

Content-adaptive tiling solution via image similarity for efficient image compression

11776164 · 2023-10-03 ·

Adobe, Inc.

Techniques are provided herein for more efficiently storing images that have a common subject, such as product images that share the same product in the image. Each image undergoes an adaptive tiling procedure to split the image into a plurality of tiles, with each tile identifying a region of the image having pixels with the same content. The tiles across multiple images can then be clustered together and those tiles having identical content are removed. Once all duplicate tiles have been removed from the set of all tiles across the images, the tiles are once again clustered based on their encoding scheme and certain encoding parameters. Tiles within each cluster are compressed using the best compression technique for the tiles in each corresponding cluster. By removing duplicative tile content between numerous images of the same subject, the total amount of data that needs to be stored is reduced.
