Patent classifications
H03M7/405
SYSTEM AND METHOD FOR DATA COMPACTION WITH CODEBOOK STATISTICAL ESTIMATES
A system and method for data compaction with codebook statistical estimates to improve entropy encoding methods to account for, and efficiently handle, previously-unseen data in data to be compacted. Training data sets are analyzed to determine the frequency of occurrence of each sourceblock in the training data sets. A mismatch probability estimate is calculated comprising an estimated frequency at which any given data sourceblock received during encoding will not have a codeword in the codebook. Entropy encoding is used to generate codebooks comprising codewords for data sourceblocks based on the frequency of occurrence of each sourceblock. A mismatch codeword is inserted into the codebook based on the mismatch probability estimate to represent those cases when a block of data to be encoded does not have a codeword in the codebook. During encoding, if a mismatch occurs, a secondary encoding process is used to encode the mismatched sourceblock.
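The codebook-with-mismatch idea in this abstract can be illustrated with a minimal sketch. It assumes Huffman coding as the entropy coder and assumes that the secondary encoding simply emits the raw bits of the unmatched sourceblock after the mismatch codeword; the abstract specifies neither. The names `MISMATCH`, `build_codebook`, and `encode` are hypothetical.

```python
import heapq
from collections import Counter

MISMATCH = object()  # hypothetical sentinel standing in for the mismatch codeword


def build_codebook(training_blocks, mismatch_probability):
    """Build a Huffman codebook that reserves one codeword for mismatches.

    Assumes training_blocks is non-empty and 0 < mismatch_probability < 1.
    """
    counts = Counter(training_blocks)
    total = sum(counts.values())
    weights = {block: float(n) for block, n in counts.items()}
    # Scale the mismatch estimate into the same units as the observed counts,
    # so that it makes up mismatch_probability of the combined weight.
    weights[MISMATCH] = mismatch_probability * total / (1.0 - mismatch_probability)
    # Heap entries: (weight, unique tie breaker, {symbol: code built so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]


def encode(blocks, codebook):
    """Encode bytes blocks; unseen blocks use the mismatch codeword + raw bits."""
    out = []
    for block in blocks:
        if block in codebook:
            out.append(codebook[block])
        else:
            # Assumed secondary encoding: the block's raw bits, uncompressed.
            raw = "".join(format(byte, "08b") for byte in block)
            out.append(codebook[MISMATCH] + raw)
    return "".join(out)
```

A decoder under the same assumptions would read the mismatch codeword and then consume a fixed number of raw bits, which requires all sourceblocks to have the same length.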
CODE TABLE GENERATION DEVICE, MEMORY SYSTEM, AND CODE TABLE GENERATION METHOD
According to one embodiment, a code table generation device includes a frequency table generation unit, a frequency sorting unit, and a Huffman tree generation unit. The frequency table generation unit generates a frequency table including entries each including a symbol and a frequency of occurrence, based on a frequency of occurrence for each symbol of input symbols. The frequency sorting unit sorts the entries in the frequency table by frequency of occurrence. The Huffman tree generation unit generates a Huffman tree having leaf nodes by using a queue that includes storage areas in which the sorted entries are respectively stored as the leaf nodes in an initial state, in response to the entries having been sorted.
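The pipeline of frequency table, sort, and queue-based tree construction resembles the well-known two-queue Huffman construction, sketched below. This is an illustration of that standard technique and not necessarily the hardware design the abstract describes; `huffman_code_lengths` is a hypothetical name, and the sketch returns code lengths rather than an explicit tree.

```python
from collections import Counter, deque


def huffman_code_lengths(symbols):
    """Return {symbol: code length} using a sorted leaf queue and a merge queue."""
    # Frequency table, sorted by ascending frequency (symbol breaks ties).
    freq_table = sorted(Counter(symbols).items(), key=lambda e: (e[1], e[0]))
    # Initial state: the queue holds the sorted entries as leaf nodes.
    leaves = deque((freq, [sym]) for sym, freq in freq_table)
    # Merged nodes are produced in non-decreasing weight order, so a plain
    # FIFO keeps them sorted without any further sorting.
    internal = deque()
    lengths = {sym: 0 for sym, _ in freq_table}

    def pop_smallest():
        if not internal or (leaves and leaves[0][0] <= internal[0][0]):
            return leaves.popleft()
        return internal.popleft()

    while len(leaves) + len(internal) > 1:
        f0, s0 = pop_smallest()
        f1, s1 = pop_smallest()
        # Every symbol under the new parent moves one level deeper.
        for sym in s0 + s1:
            lengths[sym] += 1
        internal.append((f0 + f1, s0 + s1))
    return lengths
```

Because the leaves are sorted once up front, each merge step only compares the heads of the two queues. An input with a single distinct symbol yields a length of 0 in this sketch and would need special handling in a real encoder.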
Methods and Devices for Binary Entropy Coding of Point Clouds
Methods and devices for encoding a point cloud. A bit sequence signalling an occupancy pattern for sub-volumes of a volume is coded using binary entropy coding. For a given bit in the bit sequence, a context may be based on a sub-volume neighbour configuration for the sub-volume corresponding to that bit. The sub-volume neighbour configuration depends on an occupancy pattern of a group of sub-volumes of neighbouring volumes to the volume, the group of sub-volumes neighbouring the sub-volume corresponding to the given bit. The context may be further based on a partial sequence of previously-coded bits of the bit sequence.
FILE COMPRESSION SYSTEM
Examples of the disclosure describe systems and methods for implementing a file compression system. In an example method, a source string to be compressed is received. The source string comprises a plurality of characters. A first frequency is determined for each character of the plurality of characters of the source string. A first tree corresponding to the source string is determined based on the first frequencies. The source string is encoded using the first tree to generate a first encoded string. It is determined whether a total number of bits in the first encoded string is a multiple of eight. In accordance with a determination that the total number of bits in the first encoded string is not a multiple of eight, the first encoded string is appended with zeroes so that a new total number of bits in the first encoded string is a multiple of eight. In accordance with a determination that the total number of bits in the first encoded string is a multiple of eight, the method forgoes appending the first encoded string with zeroes. The first encoded string is divided into one or more eight-bit segments and a placeholder character is assigned to each eight-bit segment. A second frequency for each of the placeholder characters in the first encoded string is determined. A second tree corresponding to the first encoded string is determined based on the second frequencies. The first encoded string is encoded using the second tree to generate a second encoded string.
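The two-pass procedure in this abstract can be sketched as follows, assuming Huffman trees for both passes and a non-empty source string. The function names are hypothetical, and the placeholder character for each eight-bit segment is assumed here to be the character whose code point equals the segment's value.

```python
import heapq
from collections import Counter


def huffman_codes(symbols):
    """Return {symbol: bit string} for a non-empty sequence of symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n0, _, c0 = heapq.heappop(heap)
        n1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (n0 + n1, tie, merged))
        tie += 1
    return heap[0][2]


def pad_to_bytes(bits):
    """Append zeroes only when the bit count is not a multiple of eight."""
    remainder = len(bits) % 8
    return bits if remainder == 0 else bits + "0" * (8 - remainder)


def two_pass_encode(source):
    # First pass: tree from character frequencies, then encode and pad.
    first_tree = huffman_codes(source)
    first = pad_to_bytes("".join(first_tree[ch] for ch in source))
    # Split into eight-bit segments and assign a placeholder to each one.
    segments = [first[i:i + 8] for i in range(0, len(first), 8)]
    placeholders = [chr(int(segment, 2)) for segment in segments]
    # Second pass: tree from placeholder frequencies, then encode again.
    second_tree = huffman_codes(placeholders)
    second = "".join(second_tree[p] for p in placeholders)
    return first, second, first_tree, second_tree
```

A decoder would also need both trees and the original length of the source, since the padding zeroes are otherwise indistinguishable from encoded data.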
SYSTEM AND METHOD FOR HIGH-SPEED TRANSFER OF SMALL DATA SETS
A system and method for high-speed transfer of small data sets that provides near-instantaneous, bit-level lossless compression. It is suited to communications environments that cannot tolerate even small amounts of data corruption, that have very low latency tolerance, where data has a low entropy rate, and where every bit costs the user bandwidth, power, or time, so that deflation is worthwhile. Where some loss of data can be tolerated, the system and method can be configured for use as lossy compression.
System and method for personal health monitor data compaction using multiple encoding algorithms
A system and method for encoding personal health monitor data using a plurality of encoding libraries. Portions of the data are encoded by different encoding libraries, depending on which library provides the greatest compaction or on some other criteria for a given portion of the data. This methodology not only provides substantial improvements in data compaction over use of a single data compaction algorithm with the highest average compaction, but provides substantial additional security in that multiple decoding libraries must be used to decode the data. In some embodiments, each portion of data may further be encoded using different sourceblock sizes, providing further security enhancements as decoding requires multiple decoding libraries and knowledge of the sourceblock size used for each portion of the data. In some embodiments, encoding libraries may be randomly or pseudo-randomly rotated to provide additional security.
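Selecting an encoder per portion of the data can be sketched as below. Python's standard `zlib`, `bz2`, and `lzma` modules stand in for the abstract's encoding libraries purely as an assumption, the selection criterion is assumed to be smallest output, and `encode_portions` and `decode_portions` are hypothetical names. The sketch does not model variable sourceblock sizes or library rotation.

```python
import bz2
import lzma
import zlib

# Stand-in "encoding libraries": name -> (encode function, decode function).
ENCODERS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def encode_portions(data, portion_size=4096):
    """Encode each portion with whichever encoder gives the smallest output."""
    encoded = []
    for start in range(0, len(data), portion_size):
        portion = data[start:start + portion_size]
        candidates = ((name, enc(portion)) for name, (enc, _) in ENCODERS.items())
        name, payload = min(candidates, key=lambda item: len(item[1]))
        encoded.append((name, payload))
    return encoded


def decode_portions(encoded):
    """Decode each portion with the library recorded for it."""
    return b"".join(ENCODERS[name][1](payload) for name, payload in encoded)
```

Decoding requires knowing which library was used for each portion, which this sketch stores alongside the payload.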
METHOD AND APPARATUS WITH DATA COMPRESSION AND/OR DECOMPRESSION
A processor-implemented method including generating k sub-compressed data streams based on a compressed data stream for a plurality of symbols divided into a plurality of k blocks and count information for each of the plurality of k blocks, generating k sub-symbols by processing each of the k sub-compressed data streams using k decoding engines and metadata about the compressed data stream, and generating an output data stream corresponding to the plurality of symbols based on the k sub-symbols.
Self-balancing tree data structure compression
A data element to be inserted into a memory data structure, represented by a key and a value, is received. A target node into which the received data element is to be inserted is determined based on the key of the received data element. A determination is made whether or not the target node is already compressed. An append-write operation to insert the data element into the target node is performed when the target node is already compressed. An evaluation is performed prior to inserting the data element when the target node is not already compressed. An in-place write operation to insert the data element into the uncompressed target node is performed when the evaluation generates a first result. The target node is compressed and then an append-write operation to insert the data element into the compressed target node is performed when the evaluation generates a second result.
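The insertion decision described here can be sketched as a small function. The abstract does not say what the evaluation tests, so this sketch assumes a simple entry-count threshold; the `Node` layout, the `insert` function, and the `max_uncompressed` parameter are all hypothetical, and locating the target node by key is omitted.

```python
import json
import zlib
from dataclasses import dataclass, field


@dataclass
class Node:
    entries: dict = field(default_factory=dict)  # uncompressed key/value pairs
    blob: bytes = b""                            # compressed snapshot of entries
    log: list = field(default_factory=list)      # elements appended after compression
    compressed: bool = False


def insert(node, key, value, max_uncompressed=4):
    """Insert into the target node and report which write path was taken."""
    if node.compressed:
        # Already compressed: append without decompressing the node.
        node.log.append((key, value))
        return "append"
    # Assumed evaluation: is there still room in the uncompressed node?
    if len(node.entries) < max_uncompressed:
        node.entries[key] = value  # first result: in-place write
        return "in-place"
    # Second result: compress the node, then append the new element.
    node.blob = zlib.compress(json.dumps(node.entries).encode("utf-8"))
    node.entries = {}
    node.compressed = True
    node.log.append((key, value))
    return "compress+append"
```

Appending to a compressed node avoids a decompress-modify-recompress cycle on every write, at the cost of a log that a reader must replay.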
Data decompression device, memory system, and data decompression method
According to one embodiment, a data decompression device decodes a code included in compressed data into a symbol. The data decompression device includes a first code length generation unit and a second code length generation unit. The first code length generation unit generates a first code length of a first code included in the compressed data by arithmetic calculation. The second code length generation unit generates a second code length of a second code by using a table. The second code is included in the compressed data. The second code is subsequent to the first code. The table indicates at least the first code and the second code length that is associated with the first code.
STORAGE INFRASTRUCTURE THAT EMPLOYS A LOW COMPLEXITY ENCODER
A storage infrastructure, method and encoder device for implementing low complexity encoding. The described encoder includes: a preprocessing system that assigns a code length to each unique symbol based on the frequency without performing a sort operation and determines maximum and minimum occurrence frequencies of symbols of each given code length, and the maximum and minimum code length among all the symbols; and a post processing system that cycles through each code length, determines if a maximum occurrence frequency of a current code length, associated with a first symbol, is greater than a minimum occurrence frequency of an adjacent code length, associated with a second symbol, and if greater, swaps code lengths of the first and second symbols.