Patent classifications
H03M7/3088
Advanced database compression
A method, a system, and a computer program product for executing a database compression. A compressed string dictionary having a block size and a front coding bucket size is generated from a dataset. Front coding is applied to one or more buckets of strings in the dictionary having the front coding bucket size to generate one or more front coded buckets of strings. One or more portions of the generated front coded buckets of strings are concatenated to form one or more blocks having the block size. Each block is compressed. A set of compressed blocks is stored. The set of the compressed blocks stores all strings in the dataset.
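The pipeline in this abstract (front code the buckets, concatenate into fixed-size blocks, compress each block) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, strings are assumed to be byte strings shorter than 256 bytes, and zlib stands in for the block compressor, which the abstract does not name.

```python
import zlib


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Length of the shared prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_code_bucket(strings):
    """Front code one sorted bucket: each entry is (prefix_len, suffix_len, suffix).

    The first string has prefix_len 0, so it is stored in full.
    Assumption: every string is shorter than 256 bytes.
    """
    out = bytearray()
    prev = b""
    for s in strings:
        p = common_prefix_len(prev, s)
        suffix = s[p:]
        out += bytes([p, len(suffix)]) + suffix
        prev = s
    return bytes(out)


def build_dictionary(dataset, bucket_size=4, block_size=64):
    """Return a list of compressed blocks that together hold every string."""
    strings = sorted(set(dataset))
    coded = b"".join(
        front_code_bucket(strings[i:i + bucket_size])
        for i in range(0, len(strings), bucket_size)
    )
    # Cut the concatenated front coded buckets into blocks of block_size bytes.
    blocks = [coded[i:i + block_size] for i in range(0, len(coded), block_size)]
    return [zlib.compress(b) for b in blocks]


def restore(blocks):
    """Decompress all blocks and undo the front coding."""
    data = b"".join(zlib.decompress(b) for b in blocks)
    strings, prev, i = [], b"", 0
    while i < len(data):
        p, n = data[i], data[i + 1]
        s = prev[:p] + data[i + 2:i + 2 + n]
        strings.append(s)
        prev = s
        i += 2 + n
    return strings
```

Because every bucket starts with a prefix length of 0, the decoder in this sketch does not need to know where buckets begin. A real dictionary would likely keep bucket offsets so that a single string can be looked up without decoding everything.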
INTERNET OF THINGS DATA COMPRESSION SYSTEM AND METHOD
A disclosure for lossless data compression can include: receiving a data block by a processor; performing, by the processor, a sparse transform extraction on the data block; selecting, by the processor, a transform matrix for the data block; modeling, by the processor, the selected transform matrix for the data block; selecting, by the processor, a transform coefficient model for the data block; modeling, by the processor, the selected transform coefficient model for the data block; and compressing, by the processor, the data in the data block using the selected transform matrix and the selected transform coefficient model.
DEEP LEARNING NUMERIC DATA AND SPARSE MATRIX COMPRESSION
An apparatus to facilitate deep learning numeric data and sparse matrix compression is disclosed. The apparatus includes a processor comprising a compression engine to: receive a data packet comprising a plurality of cycles of data samples, and for each cycle of the data samples: pass the data samples of the cycle to a compressor dictionary; identify, from the compressor dictionary, tags for each of the data samples, wherein the compressor dictionary comprises at least a first tag for data having a value of zero and a second tag for data having a value of one; and compress the data samples into compressed cycle data by storing the tags as compressed data, wherein the data samples identified with the first tag are compressed using the first tag and the data samples identified with the second tag are compressed using the second tag, while the values of the data samples identified with the first tag or the second tag are excluded from the compressed cycle data.
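The tagging scheme described above can be illustrated with a short sketch. The tag values, the extra literal tag for samples that are neither zero nor one, and the function names are assumptions made for illustration; the abstract only states that zero and one each get their own tag and that their values are left out of the compressed data.

```python
# Hypothetical tag values; the abstract does not specify an encoding.
TAG_ZERO, TAG_ONE, TAG_LITERAL = 0, 1, 2


def compress_cycle(samples):
    """Replace each sample with a tag; keep values only for non-0/1 samples."""
    tags, literals = [], []
    for value in samples:
        if value == 0:
            tags.append(TAG_ZERO)
        elif value == 1:
            tags.append(TAG_ONE)
        else:
            tags.append(TAG_LITERAL)
            literals.append(value)
    return tags, literals


def decompress_cycle(tags, literals):
    """Rebuild the samples from the tags and the stored literal values."""
    remaining = iter(literals)
    samples = []
    for tag in tags:
        if tag == TAG_ZERO:
            samples.append(0)
        elif tag == TAG_ONE:
            samples.append(1)
        else:
            samples.append(next(remaining))
    return samples
```

For sparse data, most samples collapse to a tag that needs only a couple of bits, and only the remaining samples carry a full value.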
ENHANCED IMAGE COMPRESSION WITH CLUSTERING AND LOOKUP PROCEDURES
An image encoder includes a processor and a memory. The memory includes instructions configured to cause the processor to perform operations. In one example implementation, the operations may include determining whether a dictionary item is available for replacing a block of an image being encoded, the determining based on a hierarchical lookup mechanism, and encoding the image along with reference information of the dictionary item in response to determining that the dictionary item is available. In another example implementation, the operations may include performing principal component analysis (PCA) on a block to generate a corresponding projected block, the block being associated with a group of images, comparing the projected block with a corresponding threshold, descending the block recursively based on the threshold until a condition is satisfied, and identifying a leftover block as a cluster once the condition is satisfied.
System, methods, and media for compressing non-relational database objects
Method, media, and systems for compressing objects, comprising: receiving a request to write a first object including a first key and a first value, wherein the first object is of a given type; receiving a request to write a second object including a second key and a second value, wherein the second object is of the given type; classifying the first object to a compression dictionary according to at least one rule based on a value of the first object and/or the key of the first object; classifying the second object to the compression dictionary according to at least one rule based on a value of the second object and/or the key of the second object; and compressing the first object and the second object based on the compression dictionary.
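One way to picture classifying objects to a shared compression dictionary is with zlib's preset dictionary support. The sketch below is an assumption-laden illustration, not the claimed method: the dictionary contents, the rule that classifies by key prefix, and all function names are hypothetical.

```python
import zlib

# Hypothetical preset dictionaries, one per class of object.
DICTIONARIES = {
    "user": b'{"name": "", "email": "", "country": ""}',
    "order": b'{"order_id": , "sku": "", "quantity": }',
}


def classify(key: str) -> str:
    """Hypothetical rule: the text before the first ':' in the key names the dictionary."""
    return key.split(":", 1)[0]


def compress_object(key: str, value: bytes):
    """Compress value with the dictionary its key is classified to."""
    name = classify(key)
    compressor = zlib.compressobj(zdict=DICTIONARIES[name])
    return name, compressor.compress(value) + compressor.flush()


def decompress_object(name: str, payload: bytes) -> bytes:
    """Decompress a payload using the same named dictionary."""
    decompressor = zlib.decompressobj(zdict=DICTIONARIES[name])
    return decompressor.decompress(payload) + decompressor.flush()
```

Two objects of the same type, such as `user:1` and `user:2`, are classified to the same dictionary, so fields they share with the dictionary do not have to be stored in full in each compressed object.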
Data dictionary with a reduced need for rebuilding
A processor receives statistical information about a data set included in a column of a data table. The processor receives additional information about the data set that indicates a data format utilized by the data set and a type of information represented by the data set. The processor generates a data dictionary for compression of the data set based, at least in part, on the statistical information and the additional information. The data dictionary is created such that the data dictionary is capable of compressing data that is statistically predicted to be received at a future point.
Sparse dictionary tree
Techniques related to a sparse dictionary tree are disclosed. In some embodiments, computing device(s) execute instructions, which are stored on non-transitory storage media, for performing a method. The method comprises storing an encoding dictionary as a token-ordered tree comprising a first node and a second node, which are adjacent nodes. The token-ordered tree maps ordered tokens to ordered codes. The ordered tokens include a first token and a second token. The ordered codes include a first code and a second code, which are non-consecutive codes. The first node maps the first token to the first code. The second node maps the second token to the second code. The encoding dictionary is updated based on inserting a third node between the first node and the second node. The third node maps a third token to a third code that is greater than the first code and less than the second code.
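The key idea, leaving gaps between codes so that a new token can be inserted without renumbering its neighbors, can be sketched as below. For brevity this uses two parallel sorted lists with `bisect` in place of the token-ordered tree the abstract describes; the class name, the gap size, and the midpoint rule for choosing the new code are assumptions.

```python
import bisect


class SparseDictionary:
    """Order-preserving token-to-code map with gaps between codes.

    A sorted list stands in for the token-ordered tree of the abstract.
    """

    def __init__(self, tokens, gap=1000):
        self.gap = gap
        self.tokens = sorted(set(tokens))
        # Non-consecutive codes: gap, 2*gap, 3*gap, ...
        self.codes = [(i + 1) * gap for i in range(len(self.tokens))]

    def encode(self, token):
        i = bisect.bisect_left(self.tokens, token)
        if i < len(self.tokens) and self.tokens[i] == token:
            return self.codes[i]
        raise KeyError(token)

    def insert(self, token):
        """Insert token between its neighbors and give it a code between theirs."""
        i = bisect.bisect_left(self.tokens, token)
        if i < len(self.tokens) and self.tokens[i] == token:
            return self.codes[i]
        lo = self.codes[i - 1] if i > 0 else 0
        hi = self.codes[i] if i < len(self.codes) else lo + 2 * self.gap
        if hi - lo < 2:
            # Assumption: when a gap is exhausted, the dictionary must be rebuilt.
            raise OverflowError("no free code between neighboring tokens")
        code = (lo + hi) // 2
        self.tokens.insert(i, token)
        self.codes.insert(i, code)
        return code
```

Because codes keep the same order as tokens, comparisons on encoded values give the same result as comparisons on the original tokens, and existing codes never change on insert.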
COMPRESSION, SEARCHING, AND DECOMPRESSION OF LOG MESSAGES
Log messages are compressed, searched, and decompressed. A dictionary is used to store non-numeric expressions found in log messages. Both numeric and non-numeric expressions found in log messages are represented by placeholders in a string of log “type” information. Another dictionary is used to store the log type information. A compressed log message contains a key to the log-type dictionary and a sequence of values that are keys to the non-numeric dictionary and/or numeric values. Searching may be performed by parsing a search query into subqueries that target the dictionaries and/or content of the compressed log messages. A dictionary may reference segments that contain a number of log messages, so that not all log messages need to be considered for some searches.
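The two-dictionary layout can be sketched as follows. This is a simplified illustration under stated assumptions: messages are split on single spaces, a token is treated as numeric only if it is a canonical integer, any other token containing a digit is treated as a non-numeric variable, and the placeholder characters and class name are hypothetical. Searching and segment references are not covered.

```python
import re

# Hypothetical placeholder characters used inside the log type string.
NUM, VAR = "\x11", "\x12"
CANONICAL_INT = re.compile(r"0|-?[1-9][0-9]*")


class LogCompressor:
    def __init__(self):
        self.var_ids, self.var_list = {}, []    # non-numeric variable dictionary
        self.type_ids, self.type_list = {}, []  # log type dictionary

    @staticmethod
    def _intern(ids, items, value):
        """Return the key for value, adding it to the dictionary if new."""
        if value not in ids:
            ids[value] = len(items)
            items.append(value)
        return ids[value]

    def compress(self, message):
        """Return (log type key, list of numeric values and variable keys)."""
        template, values = [], []
        for token in message.split(" "):
            if CANONICAL_INT.fullmatch(token):
                template.append(NUM)
                values.append(int(token))
            elif any(ch.isdigit() for ch in token):
                template.append(VAR)
                values.append(self._intern(self.var_ids, self.var_list, token))
            else:
                template.append(token)
        log_type = " ".join(template)
        return self._intern(self.type_ids, self.type_list, log_type), values

    def decompress(self, type_id, values):
        remaining = iter(values)
        out = []
        for token in self.type_list[type_id].split(" "):
            if token == NUM:
                out.append(str(next(remaining)))
            elif token == VAR:
                out.append(self.var_list[next(remaining)])
            else:
                out.append(token)
        return " ".join(out)
```

Messages that differ only in their variable parts share one log type entry, so each compressed message reduces to a type key plus a short list of values.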
Computerized data compression and analysis using potentially non-adjacent pairs
A computerized method of compressing symbolic information organized into a plurality of documents, each document having a plurality of symbols, includes: (i) automatically identifying a plurality of sequential and non-sequential symbol pairs in an input document; (ii) counting the number of appearances of each unique symbol pair; and (iii) producing a compressed document that includes a replacement symbol at each position associated with one of the plurality of symbol pairs, at least one of which corresponds to a non-sequential symbol pair. For each non-sequential pair the compressed document includes corresponding indicia indicating a distance between locations of the non-sequential symbols of the pair in the input document. In some instances the plurality of symbol pairs includes only those pairs of non-sequential symbols for which the distance between locations of the non-sequential symbols of the pair in the input document is less than a numeric distance cap.
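Steps (i) and (ii), identifying sequential and non-sequential symbol pairs and counting them under a distance cap, can be sketched as below. The choice to key each count on the distance as well as the two symbols is an assumption made for illustration, as are the function name and the default cap; the replacement step (iii) is not shown.

```python
from collections import Counter


def count_pairs(symbols, distance_cap=3):
    """Count symbol pairs whose positions are fewer than distance_cap apart.

    Distance 1 is a sequential (adjacent) pair; larger distances are
    non-sequential pairs. Keys are (first, second, distance).
    """
    counts = Counter()
    for i, first in enumerate(symbols):
        for distance in range(1, distance_cap):
            j = i + distance
            if j >= len(symbols):
                break
            counts[(first, symbols[j], distance)] += 1
    return counts
```

For example, on the symbols `a b a b a b` with a cap of 3, the adjacent pair `(a, b)` is counted three times and the non-sequential pair `(a, a)` at distance 2 is counted twice.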
Determining a state of a network
A client computing device has a storage device storing a plurality of files and a system agent. The system agent applies a hash function to binary data read from the plurality of files to generate a set of data signatures. A server computing device has a database interface to access a database representing a state of the network and storage for a set of exemplar data signatures resulting from a scan of one or more exemplar computing devices, each data signature generated by applying a hash function to binary data representing a file. The client computing device is configured to receive and compare the set of exemplar data signatures with the generated set of data signatures, and to transmit data to the server computing device based on the comparison. The server computing device is configured to obtain data received from the client computing device and update records in the database.
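The client-side comparison can be sketched as follows. The abstract does not name a hash function or say exactly what is transmitted, so SHA-256, the in-memory mapping of paths to file contents, and the decision to report paths with unknown signatures are all assumptions, as are the function names.

```python
import hashlib


def signature(data: bytes) -> str:
    """Data signature of a file's binary content (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def scan(files: dict) -> dict:
    """Map each path to the signature of its content.

    `files` maps path -> bytes and stands in for reading files from storage.
    """
    return {path: signature(content) for path, content in files.items()}


def unknown_files(local_signatures: dict, exemplar_signatures: set) -> list:
    """Paths whose signatures do not appear in the exemplar set."""
    return sorted(
        path
        for path, sig in local_signatures.items()
        if sig not in exemplar_signatures
    )
```

In this sketch the client would send only the result of `unknown_files` to the server, which keeps traffic small when most files match the exemplar devices.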