H03M7/3088

COMPUTERIZED SYSTEMS AND METHODS OF DATA COMPRESSION
20220107919 · 2022-04-07

A computerized system and method of compressing symbolic information organized into a plurality of documents, each document having a plurality of symbols, the system and method including: (i) automatically identifying a plurality of sequential (also referred to as adjacent) and/or non-sequential (also referred to as non-adjacent) symbol pairs in an input document; (ii) counting the number of appearances of each unique symbol pair; and (iii) producing a compressed document that includes a replacement symbol at each position associated with one of the plurality of symbol pairs, at least one of which corresponds to a non-sequential symbol pair. For each non-sequential pair, the compressed document includes corresponding indicia indicating the distance between the locations of the pair's symbols in the input document.
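Steps (i) to (iii) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the `max_gap` window, the `("PAIR", distance)` token shape, and the greedy left-to-right matching are all assumptions made for illustration, and input symbols are assumed to be plain strings.

```python
from collections import Counter


def count_pairs(symbols, max_gap):
    """Count every (left, right) pair whose symbols lie 1..max_gap positions apart."""
    counts = Counter()
    for i, left in enumerate(symbols):
        last = min(i + max_gap, len(symbols) - 1)
        for j in range(i + 1, last + 1):
            counts[(left, symbols[j])] += 1
    return counts


def replace_pair(symbols, pair, max_gap):
    """Replace occurrences of `pair` with ("PAIR", distance) tokens.

    The token sits at the position of the left symbol; the right symbol is
    dropped, and the distance records where it was (hypothetical token format).
    """
    left, right = pair
    starts = {}   # position of a left symbol -> distance to its partner
    ends = set()  # positions consumed as the right symbol of some pair
    for i, sym in enumerate(symbols):
        if sym != left or i in ends:
            continue
        last = min(i + max_gap, len(symbols) - 1)
        for j in range(i + 1, last + 1):
            if symbols[j] == right and j not in ends:
                starts[i] = j - i
                ends.add(j)
                break
    out = []
    for i, sym in enumerate(symbols):
        if i in starts:
            out.append(("PAIR", starts[i]))
        elif i not in ends:
            out.append(sym)
    return out


def restore_pair(tokens, pair):
    """Invert replace_pair by re-inserting each right symbol at its recorded distance."""
    left, right = pair
    out = []
    pending = set()  # output positions reserved for a right symbol

    def flush():
        while len(out) in pending:
            pending.discard(len(out))
            out.append(right)

    for tok in tokens:
        flush()
        if isinstance(tok, tuple) and tok[0] == "PAIR":
            pending.add(len(out) + tok[1])
            out.append(left)
        else:
            out.append(tok)
    flush()
    return out
```

A distance of 1 corresponds to a sequential (adjacent) pair, and any larger distance to a non-sequential one, so a single token format covers both cases in this sketch.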

Compression device, decompression device, and method

A compression device includes a dictionary based encoder, a second buffer, a comparator, and a compression data generator. The dictionary based encoder searches for second data at least partially matching first data from a first buffer, and acquires a first match position indicating a position of the second data in the first buffer and a match length indicating a matched length of the first and second data. The second buffer stores the previously acquired second match position with an index. The compression data generator generates first compressed data that includes the index assigned to the second match position in the second buffer and the match length when the first match position matches the second match position in the second buffer.
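The idea of replacing a repeated match position with a short index can be sketched as follows. This is an illustrative sketch only: it assumes the match position is expressed as a distance back from the current position, uses a naive exhaustive search, and introduces hypothetical token names (`lit`, `match`, `rep`) and a hypothetical `cache_size` parameter that do not come from the abstract.

```python
def compress(data, cache_size=4, min_len=3):
    """Naive LZ77-style encoder with a small cache of recent match distances."""
    tokens = []
    recent = []  # "second buffer": cached distances, addressed by list index
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # "First buffer": everything already encoded, i.e. data[:i].
        for dist in range(1, i + 1):
            length = 0
            while (i + length < len(data)
                   and data[i + length - dist] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, dist
        if best_len < min_len:
            tokens.append(("lit", data[i]))
            i += 1
            continue
        if best_dist in recent:
            # The position was seen before: emit its cache index instead.
            tokens.append(("rep", recent.index(best_dist), best_len))
        else:
            tokens.append(("match", best_dist, best_len))
            recent.insert(0, best_dist)
            del recent[cache_size:]
        i += best_len
    return tokens


def decompress(tokens, cache_size=4):
    """Rebuild the data, keeping the distance cache in step with the encoder."""
    out = bytearray()
    recent = []
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
            continue
        if tok[0] == "rep":
            dist, length = recent[tok[1]], tok[2]
        else:
            dist, length = tok[1], tok[2]
            recent.insert(0, dist)
            del recent[cache_size:]
        for _ in range(length):
            out.append(out[-dist])  # byte-by-byte copy allows overlapping matches
    return bytes(out)
```

The decoder never receives the cache contents; it rebuilds them by applying the same update rule as the encoder, which is why both sides must insert and evict entries identically.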

PROVIDING CHARACTER ENCODING

Aspects of the present invention disclose a method, computer program product, and system for character encoding. The method includes one or more processors receiving a first query involving an attribute. The first query is encoded in accordance with a first encoding scheme. The method further includes one or more processors identifying a table comprising values of the attribute in compressed format. The method further includes one or more processors creating at least one dictionary, the dictionary mapping a compressed value of the attribute to a corresponding uncompressed value that is encoded in accordance with the first encoding scheme. The method further includes one or more processors storing the dictionary in a cache using a predefined cache management policy of the cache.
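A minimal sketch of a per-encoding dictionary held in a cache is shown below. The class and function names are hypothetical, and least-recently-used eviction is assumed as the cache management policy; the abstract only says the policy is predefined.

```python
from collections import OrderedDict


def build_dictionary(code_to_text, encoding):
    """Map each compressed code to its uncompressed value in the given encoding."""
    return {code: text.encode(encoding) for code, text in code_to_text.items()}


class DictionaryCache:
    """Cache of dictionaries keyed by (attribute, encoding), evicted LRU-first."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, attribute, encoding, build):
        """Return the cached dictionary, calling build(attribute, encoding) on a miss."""
        key = (attribute, encoding)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        dictionary = build(attribute, encoding)
        self._entries[key] = dictionary
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the least recently used entry
        return dictionary
```

Keying the cache by both attribute and encoding means a later query in a different encoding scheme builds its own dictionary rather than reusing bytes encoded for the first one.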

MEMORY SYSTEM

A memory system includes a storage device and a memory controller. The memory controller includes an encoder and a decoder. The encoder includes a first code table updating section configured to update the encoding code table and an encoding flow controlling section configured to control input to the first code table updating section by using a first data amount indicating a data amount of the input symbol. The first data amount is calculated based on the input symbol. The decoder includes a second code table updating section configured to update the decoding code table and a decoding flow controlling section configured to control input to the second code table updating section by using a second data amount indicating a data amount of the output symbol. The second data amount is calculated based on the output symbol in the same way as the calculation of the first data amount.

COMPRESSION OF MACHINE-GENERATED DATA
20220094767 · 2022-03-24

A pre-shared compression dictionary is received. The pre-shared compression dictionary was generated based on an analysis of sample data for use in compression of other data. A compressed version of a batch of machine-generated data is received. The batch of machine-generated data has been compressed at least in part using the pre-shared compression dictionary and a batch-specific compression dictionary. The received compressed batch is uncompressed using the batch-specific compression dictionary to determine an intermediate version. The intermediate version is uncompressed using the pre-shared compression dictionary to determine an uncompressed version of the batch of machine-generated data.
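The two-stage order of decompression can be sketched with simple token substitution. This is an assumption-laden illustration: real dictionaries would be far richer, the function names are hypothetical, and the sketch assumes that the codes of the two dictionaries never collide with each other or with literal tokens.

```python
def apply_dictionary(tokens, dictionary):
    """Replace every token that has an entry in `dictionary`; keep the rest."""
    return [dictionary.get(tok, tok) for tok in tokens]


def invert(dictionary):
    """Turn a token -> code mapping into a code -> token mapping."""
    return {code: token for token, code in dictionary.items()}


def compress_batch(tokens, pre_shared, batch_specific):
    """Apply the pre-shared dictionary first, then the batch-specific one."""
    return apply_dictionary(apply_dictionary(tokens, pre_shared), batch_specific)


def decompress_batch(tokens, pre_shared, batch_specific):
    """Undo the batch-specific stage first, then the pre-shared stage."""
    intermediate = apply_dictionary(tokens, invert(batch_specific))
    return apply_dictionary(intermediate, invert(pre_shared))
```

Only the batch-specific dictionary has to travel with each batch in this arrangement; the pre-shared dictionary is sent once and reused.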

Computerized methods of data compression and analysis
11269810 · 2022-03-08

A computerized method and apparatus compresses symbolic information, such as text. Symbolic information is compressed by recursively identifying pairs of symbols (e.g., pairs of words or characters) and replacing each pair with a respective replacement symbol. The number of times each symbol pair appears in the uncompressed text is counted, and pairs are only replaced if they appear more than a threshold number of times. In recursive passes, each replaced pair can include a previously substituted replacement symbol. The method and apparatus can achieve high compression especially for large datasets. Metadata, such as the number of times each pair appears, generated during compression of the documents can be used to analyze the documents and find similarities between two documents.
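Recursive pair replacement with a count threshold can be sketched as below. This is a minimal illustration, not the patented method: it replaces only adjacent pairs, picks the most frequent pair on each pass, and uses hypothetical `("R", n)` tuples as replacement symbols, so input symbols are assumed not to be such tuples.

```python
from collections import Counter


def compress_pairs(symbols, threshold=1):
    """Repeatedly replace the most frequent adjacent pair with a new symbol.

    A pair is replaced only if it appears more than `threshold` times.
    Returns the compressed sequence and the table of replacements.
    """
    seq = list(symbols)
    table = {}  # replacement symbol -> (left, right)
    while True:
        counts = Counter(zip(seq, seq[1:]))
        if not counts:
            break
        pair, n = counts.most_common(1)[0]
        if n <= threshold:
            break
        new_sym = ("R", len(table))
        table[new_sym] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(new_sym)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out  # later passes may pair up earlier replacement symbols
    return seq, table


def expand(seq, table):
    """Undo compress_pairs by expanding replacement symbols until none remain."""
    out = []
    stack = list(reversed(seq))
    while stack:
        sym = stack.pop()
        if sym in table:
            left, right = table[sym]
            stack.append(right)
            stack.append(left)
        else:
            out.append(sym)
    return out
```

The per-pair counts computed on each pass are the kind of metadata the abstract mentions for comparing documents; the sketch discards them, but they could be returned alongside the table.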

EFFICIENT STORAGE AND RETRIEVAL OF RESOURCE DATA
20220075940 · 2022-03-10

A method and system for compressing and decompressing a localized software resource are disclosed. The method may include receiving a software resource in a first language and receiving a localized software resource in a second language for compression, where the software resource in the first language is a counterpart of the localized software resource in the second language. Upon receiving the software resources, the method may include creating a first local dictionary for the localized software resource based at least in part on one or more first-language words in the software resource and on data from a global dictionary, and compressing the localized software resource based on the first local dictionary.

PARALLEL DECOMPRESSION OF COMPRESSED DATA STREAMS
20220069839 · 2022-03-03

In various examples, metadata may be generated corresponding to compressed data streams that are compressed according to serial compression algorithms—such as arithmetic encoding, entropy encoding, etc.—in order to allow for parallel decompression of the compressed data. As a result, modification to the compressed data stream itself may not be required, and bandwidth and storage requirements of the system may be minimally impacted. In addition, by parallelizing the decompression, the system may benefit from faster decompression times while also reducing or entirely removing the adoption cycle for systems using the metadata for parallel decompression.

DATA SEGMENT STORING IN A DATABASE SYSTEM
20210326320 · 2021-10-21

A method includes a host computing device receiving a segment group of data. The method further includes the host computing device evaluating the availability status of other computing devices in a storage cluster of computing devices. When one of the other computing devices is unavailable, the method further includes the host computing device dividing the segment group of data into a plurality of lines of data blocks. For a line of the data blocks, the method further includes the host computing device generating at least one parity block. The method further includes the host computing device sending a first data segment that includes first positioned data blocks to a first available computing device. The method further includes the host computing device sending a second data segment that includes second positioned data blocks to a second available computing device. The method further includes the host computing device storing a parity segment.
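Generating a parity block for a line of data blocks can be sketched with byte-wise XOR. The abstract does not name the parity scheme, so XOR is an assumption here, as are the function names; the sketch handles a single missing block per line.

```python
def xor_parity(blocks):
    """Return the byte-wise XOR of a line of equally sized blocks."""
    if not blocks:
        raise ValueError("a line needs at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks in a line must have the same size")
    parity = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover_block(surviving_blocks, parity):
    """Rebuild the one missing block of a line from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])
```

Because XOR is its own inverse, the same routine that builds the parity block also rebuilds a lost block, which is why `recover_block` simply calls `xor_parity` again.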

PHYSICAL MEMORY COMPRESSION
20210311881 · 2021-10-07

A memory management system includes a physical memory associated with a computing device and a memory manager. The memory manager is configured to manage a shared memory cache as part of a compression of the physical memory using a cache compression algorithm, wherein a compression block size for the compression is a single cache line size. The physical memory includes a sector translation table (STT) region and a sector memory region. The memory manager uses a memory descriptor defined by an STT entry having a cache line map and a plurality of sector pointers to load cache from the physical memory to a level 3 Cache. The cache line map contains cache line metadata including a size of each cache line, a location of the cache line in one of the sectors pointed to by the STT entry, and a plurality of flags.