H03M7/6017

Low complexity optimal parallel Huffman encoder and decoder
12113554 · 2024-10-08

A memory device includes a memory and at least one processor configured to: obtain a symbol stream including a plurality of symbols; determine a Huffman tree corresponding to the symbol stream, wherein each symbol of the plurality of symbols is assigned a corresponding prefix code from among a plurality of prefix codes based on the Huffman tree; generate a prefix length table based on the Huffman tree, wherein the prefix length table indicates a length of the corresponding prefix code for each symbol; generate a logarithm frequency table based on the prefix length table, wherein the logarithm frequency table indicates a logarithm of a frequency count for each symbol; generate a cumulative frequency table which indicates a cumulative frequency count corresponding to each symbol; generate a compressed bitstream by iteratively applying an encoding function to the plurality of symbols based on the logarithm frequency table and the cumulative frequency table; and store the compressed bitstream in the memory.
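The table pipeline in this abstract can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not state: each symbol with prefix length `l` is given the power-of-two frequency `2**(max_len - l)`, so the logarithm frequency is `max_len - l`, and the encoding function is a range-ANS style state update on an unbounded integer with no renormalization. All function names are hypothetical, and this is not presented as the patented method.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Prefix length table: {symbol: code length} from a Huffman tree."""
    counts = Counter(symbols)
    if len(counts) < 2:
        raise ValueError("sketch assumes at least two distinct symbols")
    # Heap entries: (weight, tie_breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def build_tables(lengths):
    """Logarithm frequency table and cumulative frequency table (assumed form)."""
    max_len = max(lengths.values())
    log_freq = {s: max_len - l for s, l in lengths.items()}
    cum_freq, running = {}, 0
    for s in sorted(lengths):
        cum_freq[s] = running
        running += 1 << log_freq[s]
    # A full Huffman tree satisfies Kraft equality, so running == 2**max_len.
    return max_len, log_freq, cum_freq

def encode(symbols, max_len, log_freq, cum_freq):
    """Iteratively fold symbols into one integer state using shifts only."""
    state = 1
    for s in reversed(symbols):
        k = log_freq[s]
        state = ((state >> k) << max_len) + cum_freq[s] + (state & ((1 << k) - 1))
    return state

def decode(state, count, max_len, log_freq, cum_freq):
    """Invert encode(): recover `count` symbols in their original order."""
    out, mask = [], (1 << max_len) - 1
    for _ in range(count):
        slot = state & mask
        s = next(sym for sym in cum_freq
                 if cum_freq[sym] <= slot < cum_freq[sym] + (1 << log_freq[sym]))
        state = ((state >> max_len) << log_freq[s]) + slot - cum_freq[s]
        out.append(s)
    return out

message = "abracadabra"
lengths = huffman_code_lengths(message)
max_len, log_freq, cum_freq = build_tables(lengths)
state = encode(message, max_len, log_freq, cum_freq)
restored = "".join(decode(state, len(message), max_len, log_freq, cum_freq))
```

Because every frequency is a power of two, the division and modulo of a general range coder reduce to shifts and masks, which is one plausible reading of why a logarithm table is kept instead of raw counts.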

Hybrid bit-sliced dictionary encoding for fast index-based operations

Techniques are described herein for storing and processing codes included in dictionary-encoded data. In an embodiment, for each respective code of a plurality of codes in the dictionary-encoded data: a plurality of bits from a first portion of the respective code is contiguously stored. One or more bits from a second portion of the respective code are stored in one or more slices. Each respective slice of the one or more slices stores a bit from the one or more bits with a corresponding bit position in the respective code. In another embodiment, a bit-vector is generated based on at least one slice by loading each respective bit of the plurality of bits into different respective partitions in a register at a bit position corresponding to the at least one slice. A plurality of codes may be reconstructed by combining the bit-vector with one or more other bit-vectors.
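As an illustration of the hybrid layout, the following Python sketch keeps the high bits of each code contiguously and spreads the low bits across slices. Which portion of the code is sliced is an assumption made here, as are the function names; the register-partition loading described in the abstract is not modeled.

```python
def encode_hybrid(codes, sliced_bits):
    """Split codes into a contiguous high part and per-bit slices of the low part."""
    # First portion: high bits of each code, stored contiguously (one int per code).
    contiguous = [c >> sliced_bits for c in codes]
    # Second portion: slices[b] is an integer bit-vector whose bit i is bit b of codes[i].
    slices = []
    for b in range(sliced_bits):
        vec = 0
        for i, c in enumerate(codes):
            vec |= ((c >> b) & 1) << i
        slices.append(vec)
    return contiguous, slices

def decode_hybrid(contiguous, slices):
    """Reconstruct the codes by recombining the contiguous bits with every slice."""
    sliced_bits = len(slices)
    codes = []
    for i, high in enumerate(contiguous):
        low = 0
        for b, vec in enumerate(slices):
            low |= ((vec >> i) & 1) << b
        codes.append((high << sliced_bits) | low)
    return codes

codes = [5, 2, 7, 4]
contiguous, slices = encode_hybrid(codes, sliced_bits=2)
```

With this layout a predicate that depends on a single low bit can be evaluated against one slice for all codes at once, while the contiguous part still supports direct per-code access.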

Lossless data compression

A method of data compression includes obtaining binary sensor data having rows with multi-bit data samples. The rows are divided into data groups each including two or more samples. A precedent value is selected for the rows or respective precedent values are selected for each data group. A compressed row of compressed sensor data is generated from each row by calculating differences between the data sample and the precedent value for its associated data groups. A Compression Information Packet (CIP) is generated for each row including information for returning the binary sensor data that includes a compressed predicate indicating whether each data group is stored compressed, a data group size being a multi-bit value that stores a group size used for row compression, and a compressed word size that stores a dynamic range of the row compression. The compressed rows are stored as stored compressed data along with the CIPs.
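A rough Python sketch of the per-row scheme follows. It assumes the precedent value of each data group is the group minimum, so that every difference is non-negative, and it stores the precedents inside the CIP; the abstract specifies neither choice. The field names are hypothetical.

```python
def compress_row(row, group_size):
    """Delta-encode one row against a per-group precedent and build its CIP."""
    groups = [row[i:i + group_size] for i in range(0, len(row), group_size)]
    # Assumption: the precedent is the group minimum, so all deltas are >= 0.
    precedents = [min(g) for g in groups]
    deltas = [[x - p for x in g] for g, p in zip(groups, precedents)]
    # Compressed word size: bits needed for the largest difference in the row.
    word_size = max(max(d) for d in deltas).bit_length()
    cip = {
        "compressed": [True] * len(groups),   # per-group compressed predicate
        "group_size": group_size,
        "compressed_word_size": word_size,
        "precedents": precedents,             # assumption: kept in the CIP
    }
    return deltas, cip

def decompress_row(deltas, cip):
    """Recover the original samples from the differences and the CIP."""
    row = []
    for group, precedent in zip(deltas, cip["precedents"]):
        row.extend(d + precedent for d in group)
    return row

row = [100, 102, 101, 103, 200, 205, 201, 207]
deltas, cip = compress_row(row, group_size=4)
```

In this example the eight samples need 8 bits each uncompressed, while the differences fit in 3 bits, which is the dynamic range recorded in the CIP.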

Hybrid compression for large history compressors

A compression engine and method for optimizing the high compression of a content addressable memory (CAM) and the efficiency of a static random access memory (SRAM) by synchronizing a CAM with a relatively small near history buffer and an SRAM with a larger far history buffer. An input stream is processed in parallel through the near history and far history components and an encoder selects for the compressed output the longest matching strings from matching strings provided by each of the near history and far history components. A further optimization is enabled by selectively disabling one or the other of the two types of compressors.
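The selection step can be sketched in software, assuming a brute-force matcher stands in for both the CAM and the SRAM search. The window sizes and function names are hypothetical; the two searches run sequentially here, whereas the abstract describes them running in parallel.

```python
def longest_match(data, pos, start, end):
    """Longest match for data[pos:] whose source index lies in [start, end).

    Returns (length, distance); (0, 0) means no match was found.
    """
    best_len, best_dist = 0, 0
    for src in range(max(start, 0), min(end, pos)):
        n = 0
        while pos + n < len(data) and data[src + n] == data[pos + n]:
            n += 1
        if n > best_len:
            best_len, best_dist = n, pos - src
    return best_len, best_dist

def select_match(data, pos, near_size, far_size):
    """Query the near and far history windows, then keep the longer match."""
    near = longest_match(data, pos, pos - near_size, pos)
    far = longest_match(data, pos, pos - near_size - far_size, pos - near_size)
    # max() returns the first maximal item, so ties go to the near history.
    return max(near, far, key=lambda m: m[0])

data = b"abcdefXXXXabcYYabcdef"
near_only = longest_match(data, 15, 15 - 6, 15)
chosen = select_match(data, 15, near_size=6, far_size=16)
```

Here the near window only sees the three-byte repeat `abc`, while the far window reaches back to the full six-byte `abcdef`, so the encoder emits the far match.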

Method and system for compressing application data for operations on multi-core systems
12126367 · 2024-10-22

A system and method to compress application control data, such as weights for a layer of a convolutional neural network, is disclosed. A multi-core system for executing at least one layer of the convolutional neural network includes a storage device storing a compressed weight matrix of a set of weights of the at least one layer of the convolutional network and a decompression matrix. The compressed weight matrix is formed by matrix factorization and quantization of a floating point value of each weight to a floating point format. A decompression module is operable to obtain an approximation of the weight values by decompressing the compressed weight matrix through the decompression matrix. A plurality of cores executes the at least one layer of the convolutional neural network with the approximation of weight values to produce an inference output.
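The decompression path can be sketched in Python under several assumptions: the factorization has already been computed elsewhere, the target floating point format is IEEE 754 half precision, and decompression is a plain matrix product of the compressed weight matrix with the decompression matrix. None of these details is confirmed by the abstract, and the helper names are hypothetical.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def quantize(matrix):
    """Quantize every entry of a matrix (list of rows) to half precision."""
    return [[to_fp16(v) for v in row] for row in matrix]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def decompress(compressed, decompression):
    """Approximate the original weights from the stored factors."""
    return matmul(compressed, decompression)

# Hypothetical rank-1 factors for a 3x2 weight matrix.
compressed = quantize([[1.0], [2.0], [3.0]])   # 3 x 1, stored quantized
decompression = [[0.5, 0.25]]                  # 1 x 2
weights = decompress(compressed, decompression)
```

For an m-by-n layer factored at rank r, the stored factors hold r(m + n) values instead of mn, and quantization reduces the bits per value further.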

Speculative data decompression

A computing system includes a network interface, a processor, and a decompression circuit. In response to a compression request from the processor, the decompression circuit compresses data to produce compressed data and transmits the compressed data through the network interface. In response to a decompression request from the processor for compressed data, the decompression circuit retrieves the requested compressed data, speculatively detects codewords in each of a plurality of overlapping bit windows within the compressed data, selects valid codewords from some, but not all, of the overlapping bit windows, decodes the selected valid codewords to generate decompressed data, and provides the decompressed data to the processor.
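The speculative step can be illustrated with a small Python sketch, assuming a prefix-free code table and a bitstream held as a string of `0` and `1` characters. Every bit offset is treated as a possible codeword start; only the offsets reachable from offset 0 are then kept. The code table and function name are hypothetical.

```python
def speculative_decode(bits, code_table):
    """Decode a prefix-coded bit string by speculating at every bit offset."""
    max_len = max(len(code) for code in code_table)
    # Speculative detection: one overlapping window per bit offset.
    candidates = []
    for offset in range(len(bits)):
        window = bits[offset:offset + max_len]
        hit = None
        for code, symbol in code_table.items():
            if window.startswith(code):
                hit = (symbol, len(code))
                break  # the code is prefix-free, so at most one code matches
        candidates.append(hit)
    # Selection: follow the chain of true codeword boundaries from offset 0.
    symbols, used_offsets, offset = [], [], 0
    while offset < len(bits):
        hit = candidates[offset]
        if hit is None:
            raise ValueError(f"no valid codeword at bit offset {offset}")
        symbols.append(hit[0])
        used_offsets.append(offset)
        offset += hit[1]
    return symbols, used_offsets

table = {"0": "a", "10": "b", "11": "c"}
symbols, used_offsets = speculative_decode("010110", table)
```

In hardware the per-offset detections can all run at once, and only the cheap selection chain is sequential; the sketch runs both phases in a loop purely for clarity.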

LOSSLESS DATA COMPRESSION
20180167083 · 2018-06-14

A method of data compression includes obtaining binary sensor data having rows with multi-bit data samples. The rows are divided into data groups each including two or more samples. A precedent value is selected for the rows or respective precedent values are selected for each data group. A compressed row of compressed sensor data is generated from each row by calculating differences between the data sample and the precedent value for its associated data groups. A Compression Information Packet (CIP) is generated for each row including information for returning the binary sensor data that includes a compressed predicate indicating whether each data group is stored compressed, a data group size being a multi-bit value that stores a group size used for row compression, and a compressed word size that stores a dynamic range of the row compression. The compressed rows are stored as stored compressed data along with the CIPs.

METHOD AND APPARATUS FOR HYBRID COMPRESSION PROCESSING FOR HIGH LEVELS OF COMPRESSION

In one embodiment, an apparatus comprises a first compression engine to receive a first compressed data block from a second compression engine that is to generate the first compressed data block by compressing a first plurality of repeated instances of data that each have a length greater than or equal to a first length. The first compression engine is further to compress a second plurality of repeated instances of data of the first compressed data block that each have a length greater than or equal to a second length, the second length being shorter than the first length, wherein each compressed repeated instance of the first and second pluralities of repeated instances comprises a location and length of a data instance that is repeated. The apparatus further comprises a memory buffer to store the compressed first and second plurality of repeated instances of data.

TECHNOLOGIES FOR OFFLOADING I/O INTENSIVE OPERATIONS TO A DATA STORAGE SLED

Technologies for offloading I/O intensive workload phases to a data storage sled include a compute sled. The compute sled is to execute a workload that includes multiple phases. Each phase is indicative of a different resource utilization over a time period. Additionally, the compute sled is to identify an I/O intensive phase of the workload, wherein the amount of data to be communicated through a network path between the compute sled and the data storage sled to execute the I/O intensive phase satisfies a predefined threshold. The compute sled is also to migrate the workload to the data storage sled to execute the I/O intensive phase locally on the data storage sled. Other embodiments are also described and claimed.

TECHNOLOGIES FOR DATA DEDUPLICATION IN DISAGGREGATED ARCHITECTURES

Technologies for providing data deduplication in a disaggregated architecture include a network device. The network device is to receive, from a compute sled, a request to write a data block to one or more data storage sleds and determine, for each of one or more data sub-blocks within the data block and from deduplication data indicative of physical addresses of data sub-blocks, whether each data sub-block is already stored in a data storage device of a data storage sled. Additionally, the network device is to write, in the deduplication data and in response to a determination that a data sub-block is already stored in a data storage device, a pointer to a physical address of the already-stored data sub-block in association with a logical address of the data sub-block.
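The write path can be sketched as follows, assuming sub-blocks are identified by a SHA-256 digest and physical addresses are simply indices into an in-memory list. The abstract does not say how already-stored sub-blocks are detected, so the hashing and the class and method names are assumptions for illustration only.

```python
import hashlib

class DedupIndex:
    """Toy deduplication data: maps logical addresses to shared physical ones."""

    def __init__(self):
        self.storage = []               # physical address = list index
        self.by_digest = {}             # sub-block digest -> physical address
        self.logical_to_physical = {}   # logical address -> physical address

    def write_block(self, logical_base, block, sub_block_size):
        """Write a block, storing only sub-blocks that are not already present."""
        for i in range(0, len(block), sub_block_size):
            sub_block = block[i:i + sub_block_size]
            digest = hashlib.sha256(sub_block).digest()
            physical = self.by_digest.get(digest)
            if physical is None:
                # New content: store it and remember where it lives.
                physical = len(self.storage)
                self.storage.append(sub_block)
                self.by_digest[digest] = physical
            # Either way, record a pointer from the logical address.
            self.logical_to_physical[logical_base + i // sub_block_size] = physical

    def read(self, logical):
        """Return the sub-block stored for a logical address."""
        return self.storage[self.logical_to_physical[logical]]

index = DedupIndex()
index.write_block(0, b"AAAABBBBAAAA", sub_block_size=4)
```

After this write, logical addresses 0 and 2 point at the same physical sub-block, so only two of the three sub-blocks consume storage.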