Patent classifications
H03M7/6088
Partitional data compression
A system collects statistical data for a data page, divides the data page into parts, analyzes the data page and the statistical data, based on the compression efficiency of one or more compression methods for each part of the page, to determine a compression method for each part of the page, and compresses the parts of the data page based on the analysis.
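The per-part selection described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the candidate set (`zlib`, `bz2`, `lzma`), the fixed `part_size`, and the use of trial compression as the efficiency measure are all assumptions made for illustration, and the function names are hypothetical.

```python
import bz2
import lzma
import zlib

# Hypothetical candidate set; the abstract does not name specific methods.
METHODS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def split_page(page: bytes, part_size: int) -> list[bytes]:
    """Divide a page into fixed-size parts (the last part may be shorter)."""
    return [page[i:i + part_size] for i in range(0, len(page), part_size)]


def compress_page(page: bytes, part_size: int = 4096) -> list[tuple[str, bytes]]:
    """For each part, keep whichever candidate yields the smallest output.

    Assumption: efficiency is measured by trial compression. A part that no
    candidate can shrink is stored unchanged under the label "raw".
    """
    result = []
    for part in split_page(page, part_size):
        best_name, best_blob = "raw", part
        for name, (compress, _) in METHODS.items():
            blob = compress(part)
            if len(blob) < len(best_blob):
                best_name, best_blob = name, blob
        result.append((best_name, best_blob))
    return result


def decompress_page(parts: list[tuple[str, bytes]]) -> bytes:
    """Reverse compress_page using the method label stored with each part."""
    out = bytearray()
    for name, blob in parts:
        out += blob if name == "raw" else METHODS[name][1](blob)
    return bytes(out)
```

Trial compression is the simplest possible efficiency measure; a system that collects page statistics in advance, as the abstract describes, could predict the best method without compressing each part several times.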
Methods and apparatus to compress data
Methods, apparatus, systems and articles of manufacture to compress data are disclosed. An example apparatus includes a data slicer to split a dataset into a plurality of blocks of data; a data processor to select a first compression technique for a first block of the plurality of blocks of data based on first characteristics of the first block; and select a second compression technique for a second block of the plurality of blocks of data based on second characteristics of the second block; a first compressor to compress the first block using the first compression technique to generate a first compressed block of data; a second compressor to compress the second block using the second compression technique to generate a second compressed block of data; and a header generator to generate a first header identifying the first compression technique and a second header identifying the second compression technique.
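The slicer, per-block technique selection, and per-block headers named in this abstract can be sketched as follows. This is an illustrative sketch only: the header layout, the three technique identifiers, and the distinct-byte-count heuristic with its threshold of 200 are assumptions, not details taken from the patent.

```python
import struct
import zlib

# Hypothetical header layout: technique id (1 byte) + payload length (4 bytes).
HEADER = struct.Struct(">BI")
STORED, DEFLATE, FILL = 0, 1, 2


def select_technique(block: bytes) -> int:
    """Pick a technique from a simple block characteristic (assumed heuristic)."""
    distinct = len(set(block))
    if distinct == 1:
        return FILL      # a single repeated byte value
    if distinct > 200:   # assumed threshold for "unlikely to compress well"
        return STORED
    return DEFLATE


def encode_block(block: bytes) -> bytes:
    """Compress one block and prepend a header naming the technique used."""
    technique = select_technique(block)
    if technique == FILL:
        payload = struct.pack(">BI", block[0], len(block))
    elif technique == DEFLATE:
        payload = zlib.compress(block)
    else:
        payload = block
    return HEADER.pack(technique, len(payload)) + payload


def encode(dataset: bytes, block_size: int = 1024) -> bytes:
    """Slice the dataset into blocks and encode each one independently."""
    return b"".join(
        encode_block(dataset[i:i + block_size])
        for i in range(0, len(dataset), block_size)
    )


def decode(stream: bytes) -> bytes:
    """Walk the headers and undo each block's technique."""
    out = bytearray()
    pos = 0
    while pos < len(stream):
        technique, length = HEADER.unpack_from(stream, pos)
        pos += HEADER.size
        payload = stream[pos:pos + length]
        pos += length
        if technique == FILL:
            value, count = struct.unpack(">BI", payload)
            out += bytes([value]) * count
        elif technique == DEFLATE:
            out += zlib.decompress(payload)
        else:
            out += payload
    return bytes(out)
```

Because every block carries its own header, a decoder needs no out-of-band knowledge of which technique was chosen for which block.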
SYSTEM AND METHOD FOR COMPUTER DATA TYPE IDENTIFICATION
A system and method for file type identification involving extraction of a file-print of a file, the file-print being a unique or practically-unique representation of statistical characteristics associated with the distribution of bits in the binary contents of the file, similar to a fingerprint. The file-print is then passed to a machine learning algorithm that has been trained to recognize file types from their file-prints. The machine learning algorithm returns a predicted file type and, in some cases, a probability of correctness of the prediction. The file may then be encoded using an encoding algorithm chosen based on the predicted file type.
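As a rough illustration of the file-print idea, the sketch below uses a normalized byte histogram as the statistical representation and a nearest-centroid classifier as a stand-in for the trained machine learning algorithm. Both choices are assumptions; the abstract does not specify the features or the model, and the `ENCODER_FOR_TYPE` mapping is hypothetical.

```python
import math
from collections import Counter


def file_print(data: bytes) -> list[float]:
    """Assumed file-print: the relative frequency of each of the 256 byte values."""
    counts = Counter(data)
    total = len(data) or 1
    return [counts.get(value, 0) / total for value in range(256)]


class NearestCentroid:
    """Stand-in classifier: one mean file-print (centroid) per file type."""

    def __init__(self) -> None:
        self.centroids: dict[str, list[float]] = {}

    def fit(self, samples: dict[str, list[bytes]]) -> "NearestCentroid":
        for label, blobs in samples.items():
            prints = [file_print(blob) for blob in blobs]
            self.centroids[label] = [sum(col) / len(prints) for col in zip(*prints)]
        return self

    def predict(self, data: bytes) -> str:
        fp = file_print(data)
        return min(self.centroids, key=lambda label: math.dist(fp, self.centroids[label]))


# Hypothetical mapping from predicted type to an encoding choice.
ENCODER_FOR_TYPE = {"text": "deflate", "binary": "store"}
```

A usage example under the same assumptions: train on a few labeled samples with `NearestCentroid().fit({...})`, call `predict` on new content, and look up the encoder with `ENCODER_FOR_TYPE[predicted_type]`.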
DATA COMPRESSION METHOD AND COMPUTING DEVICE
This application discloses a data compression method and a computing device. The disclosed method is applied to the computing device. The method includes receiving, by the computing device, to-be-compressed data, and identifying a data type of the to-be-compressed data. The method further includes selecting one or more data compression models based on the identified data type and compressing the to-be-compressed data based on the selected one or more data compression models.
STATISTICAL AND NEURAL NETWORK APPROACH FOR DATA CHARACTERIZATION TO REDUCE STORAGE SPACE REQUIREMENTS
A data model is trained to determine whether data is raw, compressed, and/or encrypted. The data model may also be trained to recognize which compression algorithm was used to compress data and predict compression ratios for the data using different compression algorithms. A storage system uses the data model to independently identify raw data. The raw data is grouped based on similarity of statistical features and group members are compressed with the same compression algorithm and may be encrypted after compression with the same encryption algorithm. The data model may also be used to identify sub-optimally compressed data, which may be uncompressed and grouped for compression using a different compression algorithm.
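One statistical feature commonly used to separate raw data from compressed or encrypted data is Shannon entropy of the byte distribution. The sketch below uses it in place of the trained data model described in the abstract. The 7.5 bits-per-byte threshold, the bucket width, and the function names are assumptions for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Empirical entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_raw(data: bytes, threshold: float = 7.5) -> bool:
    """Assumed rule: entropy near 8 bits/byte suggests compressed or encrypted data."""
    return shannon_entropy(data) < threshold


def group_by_entropy(chunks: list[bytes], bucket_width: float = 1.0) -> dict[int, list[bytes]]:
    """Group raw-looking chunks into entropy buckets; each group could then be
    compressed with a single algorithm chosen for that group."""
    groups: dict[int, list[bytes]] = {}
    for chunk in chunks:
        if looks_raw(chunk):
            bucket = int(shannon_entropy(chunk) // bucket_width)
            groups.setdefault(bucket, []).append(chunk)
    return groups
```

Note that the empirical entropy of a short input is bounded by the logarithm of its length, so a threshold like this is only meaningful for chunks of at least a few kilobytes.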
EMBEDDING CODEBOOKS FOR RESOURCE OPTIMIZATION
Embodiments of the present disclosure provide systems, methods, and computer storage media for optimizing computing resources generally associated with cloud-based media services. Instead of decoding digital assets on-premises to stream to a remote client device, an encoded asset can be streamed to the remote client device. A codebook employable for decoding the encoded asset can be embedded into the stream transmitted to the remote client device, so that the remote client device can extract the embedded codebook, and employ the extracted codebook to decode the encoded asset locally. In this way, not only are processing resources associated with on-premises decoding eliminated, but on-premises storage of codebooks can be significantly reduced, while expensive bandwidth is freed up by virtue of transmitting a smaller quantity of data from the cloud to the remote client device.
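The idea of embedding the codebook in the stream, so that the client can decode locally, can be sketched with a toy dictionary coder. The stream layout here (a 4-byte length prefix, a JSON-serialized codebook, then 16-bit codes) and the fixed-size word codebook are assumptions chosen for brevity; they are not the format described in the patent.

```python
import json
import struct


def build_codebook(asset: bytes, word: int = 4) -> list[bytes]:
    """Collect the distinct fixed-size words of the asset in first-seen order."""
    seen: dict[bytes, int] = {}
    for i in range(0, len(asset), word):
        seen.setdefault(asset[i:i + word], len(seen))
    return list(seen)


def encode_stream(asset: bytes, word: int = 4) -> bytes:
    """Server side: emit [codebook length][codebook][codes] as one stream.

    Assumption: at most 65536 codebook entries, so each code fits in 16 bits.
    """
    codebook = build_codebook(asset, word)
    index = {w: i for i, w in enumerate(codebook)}
    codes = [index[asset[i:i + word]] for i in range(0, len(asset), word)]
    book_blob = json.dumps([w.hex() for w in codebook]).encode()
    payload = struct.pack(f">{len(codes)}H", *codes)
    return struct.pack(">I", len(book_blob)) + book_blob + payload


def decode_stream(stream: bytes) -> bytes:
    """Client side: extract the embedded codebook, then decode the codes locally."""
    (book_len,) = struct.unpack_from(">I", stream, 0)
    codebook = [bytes.fromhex(h) for h in json.loads(stream[4:4 + book_len])]
    payload = stream[4 + book_len:]
    codes = struct.unpack(f">{len(payload) // 2}H", payload)
    return b"".join(codebook[c] for c in codes)
```

Since the codebook travels with the encoded asset, the client needs no prior copy of it and the server does not have to decode before streaming.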
SYSTEMS AND METHODS FOR COMPRESSING SENSOR DATA USING CLUSTERING AND SHAPE MATCHING IN EDGE NODES OF DISTRIBUTED COMPUTING NETWORKS
A system and method for compressing sensor data at an edge node of a distributed computing network. The method includes training the edge node with a plurality of known signal templates. Each known signal template corresponds to a respective one of a plurality of events observable by the sensor. A raw data signal is collected by a sensor of the edge node. The raw data signal is classified to one of the known signal templates based on a degree of similarity between the raw data signal and the known signal template. A compression scheme is selected based on the classification of the raw data signal. The raw data signal is compressed in accordance with the compression scheme.
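The classify-then-select flow in this abstract can be sketched as below. The templates, the Euclidean-distance similarity measure, and the three compression schemes are all hypothetical placeholders; the abstract does not specify any of them.

```python
import math

# Hypothetical templates, one per observable event.
TEMPLATES = {
    "idle": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "spike": [0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0],
    "ramp": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
}

# Hypothetical mapping from event class to compression scheme.
SCHEMES = {"idle": "drop", "spike": "residual", "ramp": "delta"}


def classify(signal: list[float]) -> str:
    """Return the template closest to the signal (smaller distance = more similar)."""
    return min(TEMPLATES, key=lambda name: math.dist(signal, TEMPLATES[name]))


def compress(signal: list[float]) -> tuple[str, list[float]]:
    """Compress the signal with the scheme selected for its class."""
    label = classify(signal)
    scheme = SCHEMES[label]
    if scheme == "drop":
        # Nothing of interest happened: send only the class label.
        return label, []
    if scheme == "delta":
        # Send the first sample followed by successive differences.
        return label, [signal[0]] + [b - a for a, b in zip(signal, signal[1:])]
    # "residual": send the difference from the matched template.
    return label, [round(s - t, 3) for s, t in zip(signal, TEMPLATES[label])]
```

In this sketch the class label doubles as the scheme identifier, so a receiver holding the same templates can reverse each scheme.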
Methods and apparatus to compress data
Methods, apparatus, systems and articles of manufacture to compress data are disclosed. An example apparatus includes an off-chip memory to store data; a data slicer to split a dataset into a plurality of blocks of data; a data processor to select a first compression technique for a first block of the plurality of blocks of data based on first characteristics of the first block; and select a second compression technique for a second block of the plurality of blocks of data based on second characteristics of the second block; a first compressor to compress the first block using the first compression technique to generate a first compressed block of data; a second compressor to compress the second block using the second compression technique to generate a second compressed block of data; a header generator to generate a first header identifying the first compression technique and a second header identifying the second compression technique; and an interface to transmit the first compressed block of data with the first header and the second compressed block of data with the second header to be stored in the off-chip memory.