Patent classifications
H03M7/34
Methods and devices for sparse data compression through dimension coding
Methods and devices for encoding a sparse signal x to generate a compressed encoded signal. The methods employ directionless grammar-based dimension coding. Using labelled subsets and the finding of disjoint repeated subsets in order to build a directionless grammar, the non-zero positions of the sparse signal are encoded in a directionless grammar-based dimension encoder. Element values are encoded in a conditional non-zero encoder. The coding process facilitates random access.
Method and apparatus for decompression acceleration in multi-cycle decoder based platforms
In one embodiment, an apparatus comprises a decompression engine to perform a non-speculative decode operation on a first portion of a first compressed payload comprising a first plurality of codes; and perform a speculative decode operation on a second portion of the first compressed payload, wherein the non-speculative decode operation and the speculative decode operation share at least one decode path and the speculative decode operation is to utilize bandwidth of the at least one decode path that is not used by the non-speculative decode operation.
Method and apparatus for hybrid compression processing for high levels of compression
In one embodiment, an apparatus comprises a first compression engine to receive a first compressed data block from a second compression engine that is to generate the first compressed data block by compressing a first plurality of repeated instances of data that each have a length greater than or equal to a first length. The first compression engine is further to compress a second plurality of repeated instances of data of the first compressed data block that each have a length greater than or equal to a second length, the second length being shorter than the first length, wherein each compressed repeated instance of the first and second pluralities of repeated instances comprises a location and length of a data instance that is repeated. The apparatus further comprises a memory buffer to store the compressed first and second plurality of repeated instances of data.
Computer-readable recording medium, encoding device, and encoding method
The encoding device 100 extracts, when encoding a target file by using a static dictionary unit 121 and a dynamic dictionary unit 122, a registered word included in an external dictionary unit 221 from among words registered in the dynamic dictionary unit 122, in which the external dictionary associates a specific word group and a code group with each other; and registers, in the dynamic dictionary unit 122, a code of the registered word in the external dictionary unit 221 and a dynamic code assigned dynamically in association with each other.
Multi-stage data compression for time-series metric data within computer systems
The current document is directed to a multi-stage metric-data compression method and subsystem for compressing metric data collected and stored within distributed computing systems to facilitate computer-system management and administration. In a described implementation, metric data is partitioned into constant metric data, low-variability metric data, and high-variability metric data. High-variability metric data is compressed by identifying a set of basis metrics, or independent metrics, with respect to which a remaining set of dependent metrics can be expressed using coefficient multipliers. The high-variability metric data can then be stored as a set of independent metrics and set of coefficients, along with a small amount of additional data.
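The first stage described above, partitioning metrics by variability, might be sketched as follows. This is only an illustration: the use of population standard deviation as the variability measure, the `low_threshold` parameter, and the function name `partition_metrics` are assumptions and are not taken from the abstract.

```python
from statistics import pstdev
from typing import Dict, List

def partition_metrics(metrics: Dict[str, List[float]],
                      low_threshold: float) -> Dict[str, List[str]]:
    """Group metric names into constant, low-variability and high-variability sets."""
    groups: Dict[str, List[str]] = {"constant": [], "low": [], "high": []}
    for name, values in metrics.items():
        if max(values) == min(values):
            # Every sample is identical, so one stored value suffices.
            groups["constant"].append(name)
        elif pstdev(values) <= low_threshold:
            # Assumed criterion: small spread around the mean.
            groups["low"].append(name)
        else:
            groups["high"].append(name)
    return groups

groups = partition_metrics(
    {"cpu_count": [8, 8, 8], "temp": [40, 41, 40], "load": [1, 90, 30]},
    low_threshold=1.0,
)
```

Only the high-variability group would then go on to the basis-metric stage, which this sketch does not attempt.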
Selection of data compression technique based on input characteristics
A compression scheme can be selected for an input data stream based on characteristics of the input data stream. For example, when the input data stream is searched for pattern matches, input stream characteristics used to select a compression scheme can include one or more of: type and size of an input stream, a length of a pattern, a distance from a start of where the pattern is to be inserted to the beginning of where the pattern occurred previously, a gap between two pattern matches (including different or same patterns), standard deviation of a length of a pattern, standard deviation of a distance from a start of where the pattern is to be inserted to the beginning of where the pattern occurred previously, or standard deviation of a gap between two pattern matches. Criteria can be established whereby one or more characteristics are used to select a particular encoding scheme.
Lossy statistical data compression
A method performed in real-time includes receiving and storing time-based data over a specific time period and dividing the specific time period into a plurality of time windows. The method further includes determining that data associated with two or more proximate time windows are within a predetermined variance of one another and responsive to the determination: generating a mathematical function representative of the data associated with the two or more proximate time windows, deleting the data associated with the two or more proximate time windows, and generating a representation of the deleted data from the mathematical function. In certain embodiments, the data comprises empirical network telemetry data.
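A minimal sketch of the window-merging idea, under stated assumptions: windows hold a fixed number of samples, "within a predetermined variance" is approximated by comparing window means against a `tolerance`, and the generated mathematical function is simply a constant (the mean of the merged samples). None of these choices comes from the abstract, and the names `compress` and `reconstruct` are hypothetical.

```python
from statistics import mean
from typing import List, Tuple

def compress(samples: List[float], window: int, tolerance: float) -> List[Tuple]:
    """Replace runs of two or more similar adjacent windows with a constant function."""
    windows = [samples[i:i + window] for i in range(0, len(samples), window)]
    out: List[Tuple] = []
    i = 0
    while i < len(windows):
        j = i + 1
        # Extend the run while the next window's mean stays close to the first window's mean.
        while j < len(windows) and abs(mean(windows[j]) - mean(windows[i])) <= tolerance:
            j += 1
        if j - i >= 2:
            merged = [v for w in windows[i:j] for v in w]
            # The raw samples are dropped; only the count and the constant are kept.
            out.append(("const", len(merged), mean(merged)))
        else:
            out.append(("raw", windows[i]))
        i = j
    return out

def reconstruct(segments: List[Tuple]) -> List[float]:
    """Regenerate a representation of the data from the stored segments."""
    values: List[float] = []
    for segment in segments:
        if segment[0] == "const":
            values.extend([segment[2]] * segment[1])
        else:
            values.extend(segment[1])
    return values

segments = compress([10, 10, 11, 11, 50, 52, 10, 10], window=2, tolerance=1.5)
```

In this example the first two windows collapse into one constant segment, while the spike and the final window are kept as raw samples.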
Decompression of a compressed data unit
A method that may include retrieving, by a decompression processor, a compressed data unit; wherein the compressed data unit comprises a control section and a data section; wherein the control section comprises multiple decompression instructions for a retrieval of data portions from one or more sources; wherein the one or more sources comprise the data section; wherein the control section does not include any data portion; and executing, by the decompression processor, the multiple decompression instructions to provide a decompressed data unit.
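The separation of control and data sections can be illustrated with a small sketch. The instruction format `(source, offset, length)` is hypothetical, as is the second source `"out"` (previously produced output); the abstract only states that the sources include the data section.

```python
from typing import List, Tuple

# Hypothetical instruction: (source, offset, length).
# "data" reads from the data section; "out" reads from output produced so far.
Instruction = Tuple[str, int, int]

def decompress(control: List[Instruction], data: bytes) -> bytes:
    """Execute the control section's instructions to build the decompressed unit."""
    out = bytearray()
    for source, offset, length in control:
        if source == "data":
            out += data[offset:offset + length]
        elif source == "out":
            # Copy byte by byte so an overlapping copy can repeat recent output.
            for i in range(length):
                out.append(out[offset + i])
        else:
            raise ValueError(f"unknown source: {source}")
    return bytes(out)

# The control section holds no literal bytes, only retrieval instructions.
control = [("data", 0, 3), ("out", 0, 3), ("data", 3, 1)]
result = decompress(control, b"abc!")  # b"abcabc!"
```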
Detection of unknown code page indexing tokens
A method for determining an encoding used for a sequence of bytes may be provided. The method comprises providing a set of candidate code pages and transforming them into different groups of sequences of bytes, wherein each group of sequences of bytes corresponds to one of the candidate code pages. Thereby each code point is transformed by applying a transformation from one of the candidate code pages to a reference code point value relating to a reference encoding for each code point. The method comprises further separating each of the transformed sequences of bytes into groups of tokens, wherein each group of tokens relates to one candidate code page, and providing an index relating to a text corpus. Furthermore, the method comprises selecting a code page from the set of candidate code pages at least partially based on how many tokens are found in the index.
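As a rough illustration of the selection step, the sketch below decodes the byte sequence under each candidate code page, splits the result into tokens, and counts how many tokens appear in an index. It makes several simplifying assumptions: the reference encoding is Python's Unicode `str`, tokens are lower-cased and whitespace-separated, the index is a plain set of words, and the candidate with the most hits wins. The function name `pick_code_page` is hypothetical.

```python
from typing import Iterable, Optional, Set

def pick_code_page(raw: bytes, candidates: Iterable[str], index: Set[str]) -> str:
    """Return the candidate code page whose decoding yields the most indexed tokens."""
    best: Optional[str] = None
    best_hits = -1
    for code_page in candidates:
        try:
            text = raw.decode(code_page)  # transform to the reference encoding
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the byte sequence
        tokens = text.lower().split()
        hits = sum(1 for token in tokens if token in index)
        if hits > best_hits:
            best, best_hits = code_page, hits
    if best is None:
        raise ValueError("no candidate code page could decode the input")
    return best

raw = "grüße aus köln".encode("cp1252")
index = {"grüße", "aus", "köln"}
guess = pick_code_page(raw, ["cp437", "cp1251", "cp1252"], index)
```

Here the wrong code pages still decode without error but produce tokens that are absent from the index, which is what makes the hit count a usable signal.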
Hardware data compressor using dynamic hash algorithm based on input block type
A hardware data compressor compresses an input block of characters by replacing strings of characters in the input block with back pointers to matching strings earlier in the input block. A hash table is used in searching for the matching strings in the input block. A plurality of hash index generators each employs a different hashing algorithm on an initial portion of the strings of characters to be replaced to generate a respective index. The hardware data compressor also includes an indication of a type of the input block of characters. A selector selects the index generated by one of the plurality of hash index generators to index into the hash table based on the type of the input block.
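A software analogue of the selector might look like the sketch below. The two hash functions, the block-type names `"text"` and `"binary"`, and the 12-bit table size are illustrative assumptions; the abstract does not specify which hashing algorithms or block types are used.

```python
from typing import Callable, Dict

def hash_text(prefix: bytes, bits: int) -> int:
    # Multiplicative hash over the first three bytes (illustrative choice).
    value = int.from_bytes(prefix[:3], "big")
    return ((value * 2654435761) & 0xFFFFFFFF) >> (32 - bits)

def hash_binary(prefix: bytes, bits: int) -> int:
    # XOR-fold hash over the first four bytes (illustrative choice).
    value = int.from_bytes(prefix[:4], "big")
    value ^= value >> 15
    return value & ((1 << bits) - 1)

GENERATORS: Dict[str, Callable[[bytes, int], int]] = {
    "text": hash_text,
    "binary": hash_binary,
}

def select_index(block_type: str, prefix: bytes, bits: int = 12) -> int:
    """Compute every generator's index, then select one by input block type."""
    # Mirrors the hardware layout: all generators run, a selector picks one result.
    indexes = {name: generate(prefix, bits) for name, generate in GENERATORS.items()}
    return indexes[block_type]

slot = select_index("text", b"abcd")
```

In software one would normally call only the selected generator; computing all of them here is purely to mirror the parallel generators and selector that the abstract describes.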