H03M7/3093

Multiple overlapping hashes at variable offset in a hardware offload

A hardware offload includes a hash engine that performs hashing for a block-based storage system. The hash engine calculates multiple hash values for each input buffer provided by the storage system. The hash values may be calculated with variably offset and overlapping portions of the input buffer, wherein each portion is larger than the native block size of the storage system. The hardware offload may also include a compression engine that performs compression on the input buffer using the entire input buffer and/or chunks as compression domains.
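
The overlapping, variably offset hashing described in this abstract can be illustrated in software. The following is a minimal sketch only, not the patented hardware design: the function name `overlapping_hashes`, its parameters, and the choice of SHA-256 are assumptions made for illustration, since the abstract does not name a hash function or an interface.

```python
import hashlib


def overlapping_hashes(buf, native_block, span_blocks, offsets):
    """Return (offset, hex digest) pairs, one per portion of buf.

    Each portion is span_blocks * native_block bytes long, so it is larger
    than the native block size whenever span_blocks >= 2. Portions that
    start less than one span apart overlap. All names are hypothetical.
    """
    if span_blocks < 2:
        raise ValueError("portions must be larger than the native block size")
    span = native_block * span_blocks
    results = []
    for off in offsets:
        if off < 0:
            raise ValueError("offsets must be non-negative")
        portion = buf[off:off + span]
        if len(portion) < span:
            continue  # skip portions that would run past the end of the buffer
        # SHA-256 is an assumed stand-in for whatever the hash engine computes.
        results.append((off, hashlib.sha256(portion).hexdigest()))
    return results


# Example: a 16 KiB input buffer, 4 KiB native blocks, 8 KiB portions.
buf = bytes(i % 251 for i in range(16384))
hashes = overlapping_hashes(buf, 4096, 2, [0, 4096, 8192, 12288])
```

In this example the portions at offsets 0, 4096, and 8192 each overlap their neighbor by one native block, and the portion at offset 12288 is skipped because it would extend past the end of the buffer.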

Data transmission method and apparatus having data reuse mechanism
20220224450 · 2022-07-14

The present invention discloses a data transmission method having a data reuse mechanism that includes the steps outlined below. A driver corresponding to a communication circuit operates as a transmission terminal and analyzes data to be transmitted to generate reuse setting information, indication information, and packets comprising a complete packet and incomplete packets, which the transmission terminal transmits through a transmission interface to the communication circuit acting as a receiving terminal. The receiving terminal identifies the complete packet and the incomplete packets according to the indication information. A data location to which a reusable data section corresponds is determined according to the reuse setting information. The complete packet is outputted. A non-reusable data section of each of the incomplete packets and the reusable data section of the complete packet are reconstructed according to the data location to output reconstructed packets.

Aligning Variable Sized Compressed Data To Fixed Sized Storage Blocks
20220091768 · 2022-03-24

Preparing data for deduplication including: generating, by a storage system for a compressed data block, a padded compressed data block by padding the compressed data block to conform to a fixed block size, wherein the fixed block size is greater than a size of the compressed data block; storing, in the storage system, the padded compressed data block beginning at a block boundary of a storage device in the storage system; and performing block-based deduplication on the storage system, wherein the block-based deduplication determines whether the padded compressed data block matches one or more other padded compressed data blocks stored in the storage system.
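
The padding and block-aligned deduplication steps in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed storage system: `BLOCK_SIZE`, `pad_compressed`, `unpad`, and `BlockStore` are hypothetical names, zlib stands in for an unspecified compressor, zero bytes are an assumed padding value, and a SHA-256 digest index stands in for the unspecified block-matching step.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical fixed block size


def pad_compressed(data, block_size=BLOCK_SIZE):
    """Compress data and pad the result with zero bytes to block_size."""
    compressed = zlib.compress(data)
    if len(compressed) >= block_size:
        # The abstract requires the fixed size to exceed the compressed size.
        raise ValueError("compressed data does not fit in one block")
    return compressed + b"\x00" * (block_size - len(compressed))


def unpad(padded):
    """Recover the original data; bytes after the zlib stream are ignored."""
    return zlib.decompressobj().decompress(padded)


class BlockStore:
    """Toy store: list index i models the block at byte offset i * block_size."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = []
        self.index = {}  # digest of padded block -> block number

    def write(self, padded):
        if len(padded) != self.block_size:
            raise ValueError("block is not aligned to the fixed block size")
        digest = hashlib.sha256(padded).digest()
        if digest in self.index:
            return self.index[digest]  # duplicate: reuse the stored block
        self.blocks.append(padded)
        self.index[digest] = len(self.blocks) - 1
        return self.index[digest]


store = BlockStore()
a = store.write(pad_compressed(b"hello" * 100))
b = store.write(pad_compressed(b"hello" * 100))
c = store.write(pad_compressed(b"world" * 100))
```

Because every padded block has the same size and starts on a block boundary, identical compressed payloads produce identical blocks, so the second write above resolves to the block stored by the first.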

MULTIPLE OVERLAPPING HASHES AT VARIABLE OFFSET IN A HARDWARE OFFLOAD

A hardware offload includes a hash engine that performs hashing for a block-based storage system. The hash engine calculates multiple hash values for each input buffer provided by the storage system. The hash values may be calculated with variably offset and overlapping portions of the input buffer, wherein each portion is larger than the native block size of the storage system. The hardware offload may also include a compression engine that performs compression on the input buffer using the entire input buffer and/or chunks as compression domains.

Binary difference operations for navigational bit streams
10642824 · 2020-05-05

A computing device may identify a series of bits representative of a first binary large object (BLOB) for navigation data including road segments and road attributes. The computing device duplicates each bit of the series of bits a predetermined number of times to form a first bit string. The first bit string is larger than the series of bits by a factor of the predetermined number. The computing device performs a binary difference of the first bit string to a second bit string representative of a second BLOB. A result of the binary difference is stored in a navigation patch file.
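
The bit duplication step in this abstract is simple enough to show directly. In the sketch below, `duplicate_bits` and `xor_diff` are hypothetical names, bit strings are modeled as Python strings of "0" and "1" for readability, and a position-wise XOR is used as an assumed stand-in for the binary difference operation, which the abstract does not specify.

```python
def duplicate_bits(bits, factor):
    """Repeat every bit `factor` times, e.g. "10" with factor 2 gives "1100"."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return "".join(bit * factor for bit in bits)


def xor_diff(first, second):
    """Position-wise XOR of two equal-length bit strings (assumed diff)."""
    if len(first) != len(second):
        raise ValueError("bit strings must have equal length")
    return "".join("1" if x != y else "0" for x, y in zip(first, second))


first_bit_string = duplicate_bits("101", 3)   # "111000111"
second_bit_string = "111111000"               # stands in for the second BLOB
patch = xor_diff(first_bit_string, second_bit_string)
```

The duplicated string is longer than the original series of bits by exactly the chosen factor, which matches the relationship stated in the abstract.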

Systems and Methods for Version Chain Clustering
20200099392 · 2020-03-26

A system, a method and a computer program product for storing data, which include receiving a data stream having a plurality of transactions that include at least one portion of data, determining whether at least one portion of data within at least one transaction is substantially similar to at least another portion of data within at least one transaction, clustering together at least one portion of data and at least another portion of data within at least one transaction, selecting one of at least one portion of data and at least another portion of data as a representative of at least one portion of data and at least another portion of data in the received data stream, and storing each representative of a portion of data from each transaction in the plurality of transactions, wherein a plurality of representatives is configured to form a chain representing the received data stream.

Compression and transmission of genomic information

Systems and methods for performing genomic information compression, transmission, and decompression are provided. A system for compression, transmission, and decompression of genomic information includes a first computer associated with a first index and a second computer associated with a second index, each index containing reference permutations of nucleic acid sequence portions, each permutation associated with a reference number. The first computer uses input genomic information and the first index to produce a compressed representation of the genomic information, and transmits the compressed representation to the second computer. The second computer uses the compressed representation and the second index to assemble a data representation of the genomic information. The compressed representation comprises references to permutations, indications of locations of each permutation in the input information, indications of variations to permutations, and/or indications of sequence length.
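
The shared-index idea in this abstract can be sketched in a few lines. This is a simplified illustration, not the claimed system: `build_index`, `compress`, and `decompress` are hypothetical names, the sequence is cut into fixed-length portions, and any portion missing from the index is sent as raw text rather than as a variation relative to a reference permutation.

```python
def build_index(portions):
    """Map each reference sequence portion to a reference number."""
    return {portion: number for number, portion in enumerate(portions)}


def compress(sequence, index, k):
    """Replace each k-length portion found in the index with its number."""
    tokens = []
    for pos in range(0, len(sequence), k):
        chunk = sequence[pos:pos + k]
        if chunk in index:
            tokens.append(("ref", index[chunk]))
        else:
            tokens.append(("raw", chunk))  # simplification: no variation coding
    return tokens


def decompress(tokens, index):
    """Rebuild the sequence using an index equivalent to the sender's."""
    reverse = {number: portion for portion, number in index.items()}
    return "".join(reverse[v] if kind == "ref" else v for kind, v in tokens)


first_index = build_index(["ACGT", "TTGA"])    # held by the first computer
second_index = build_index(["ACGT", "TTGA"])   # held by the second computer
sequence = "ACGTTTGAACGTGG"
tokens = compress(sequence, first_index, 4)
restored = decompress(tokens, second_index)
```

Token order implies each portion's location here; the abstract also mentions explicit location, variation, and sequence-length indications, which this sketch leaves out.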

Systems and methods for version chain clustering

A system, a method and a computer program product for storing data, which include receiving a data stream having a plurality of transactions that include at least one portion of data, determining whether at least one portion of data within at least one transaction is substantially similar to at least another portion of data within at least one transaction, clustering together at least one portion of data and at least another portion of data within at least one transaction, selecting one of at least one portion of data and at least another portion of data as a representative of at least one portion of data and at least another portion of data in the received data stream, and storing each representative of a portion of data from each transaction in the plurality of transactions, wherein a plurality of representatives is configured to form a chain representing the received data stream.

DATA PROCESSING METHOD AND APPARATUS
20190312589 · 2019-10-10

A method of compression is disclosed in which an input sequence of bits is divided into a plurality of portions. Each portion is sub-divided into a plurality of sub-divisions. Frequency analysis is performed to determine the number of occurrences of each sub-division permutation and new values are assigned, based on the frequency analysis, to each of the sub-division permutations. For each portion a label representing the permutation of bits in that portion is assigned. The label comprises a representation of a combined value resulting from combining the new values associated with the sub-division permutations of that portion. A processed sequence of bits is generated by replacing, within the input sequence of bits, bit portions with the respective label representing the permutation of bits in that portion.
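
The frequency analysis and labeling steps in this abstract can be sketched as below. Several details are assumptions, because the abstract leaves them open: the function name `relabel` is hypothetical, new values are assigned as frequency ranks (0 for the most common sub-division pattern), and the per-portion label is formed by combining those ranks as digits of a mixed-radix number.

```python
from collections import Counter


def relabel(bits, portion_len, sub_len):
    """Return (new_value, labels) for a bit string.

    new_value maps each sub-division pattern to its frequency rank.
    labels holds one combined value per portion.
    """
    if portion_len % sub_len or len(bits) % portion_len:
        raise ValueError("lengths must divide evenly")

    def split(portion):
        return [portion[j:j + sub_len] for j in range(0, len(portion), sub_len)]

    portions = [bits[i:i + portion_len] for i in range(0, len(bits), portion_len)]
    counts = Counter(sub for portion in portions for sub in split(portion))
    # Most frequent pattern gets 0; ties are broken by the pattern itself.
    ranked = sorted(counts, key=lambda s: (-counts[s], s))
    new_value = {pattern: rank for rank, pattern in enumerate(ranked)}
    base = len(new_value)
    labels = []
    for portion in portions:
        combined = 0
        for sub in split(portion):
            combined = combined * base + new_value[sub]  # assumed combination
        labels.append(combined)
    return new_value, labels


new_value, labels = relabel("0000001100001100", portion_len=4, sub_len=2)
```

In this example the pattern "00" occurs six times and "11" twice, so "00" maps to 0 and "11" to 1, and portions built mostly from the common pattern receive the smallest labels.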

Aligning variable sized compressed data to fixed sized storage blocks
12008255 · 2024-06-11

Preparing data for deduplication including: generating, by a storage system for a compressed data block, a padded compressed data block by padding the compressed data block to conform to a fixed block size, wherein the fixed block size is greater than a size of the compressed data block; storing, in the storage system, the padded compressed data block beginning at a block boundary of a storage device in the storage system; and performing block-based deduplication on the storage system, wherein the block-based deduplication determines whether the padded compressed data block matches one or more other padded compressed data blocks stored in the storage system.