Patent classifications
H03M7/6041
Compressed data verification
Embodiments of the present disclosure relate to verifying compressed data. Compressed data files can be read from a global cache for a storage device into a local buffer. A data verification level of a plurality of data verification levels can be selected to perform on the compressed data files. A number of data blocks of each data file can be decompressed based on the selected data verification level. The integrity of the compressed data files is verified using the decompressed data blocks.
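The tiered scheme can be sketched in Python; the level names, the fraction-of-blocks mapping, and the per-block CRC32 layout are illustrative assumptions, not the disclosed format:

```python
import zlib

# Hypothetical verification levels: fraction of blocks to decompress per file.
LEVELS = {"quick": 0.25, "standard": 0.5, "full": 1.0}

def verify_compressed_blocks(blocks, expected_crcs, level="quick"):
    """Decompress a level-dependent prefix of the blocks and compare each
    decompressed block's CRC32 against the stored value."""
    n = max(1, int(len(blocks) * LEVELS[level]))
    for comp, crc in zip(blocks[:n], expected_crcs[:n]):
        if zlib.crc32(zlib.decompress(comp)) != crc:
            return False
    return True

payloads = [b"block-%d" % i for i in range(4)]
blocks = [zlib.compress(p) for p in payloads]
crcs = [zlib.crc32(p) for p in payloads]
print(verify_compressed_blocks(blocks, crcs, "quick"))   # True
```

A lower level trades verification coverage for decompression work, which is the point of selecting among the plurality of levels.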
Compression of Machine-Learned Models by Vector Quantization
A computing system can include one or more processors and one or more computer-readable media storing instructions that, when executed by the one or more processors, cause the computing system to perform operations including obtaining model structure data indicative of a plurality of parameters of a machine-learned model; determining a codebook comprising a plurality of centroids, the plurality of centroids having a respective index of a plurality of indices indicative of an ordering of the codebook; determining a plurality of codes respective to the plurality of parameters, the plurality of codes respectively comprising a code index of the plurality of indices corresponding to a closest centroid of the plurality of centroids to a respective parameter of the plurality of parameters; and providing encoded data as an encoded representation of the plurality of parameters of the machine-learned model, the encoded data comprising the codebook and the plurality of codes.
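A minimal sketch of the encode/decode round trip, assuming scalar parameters and a fixed, pre-computed codebook (the disclosure would typically learn the centroids, e.g. by clustering):

```python
import numpy as np

def encode(params, codebook):
    # Each code is the index of the centroid closest to the parameter.
    dists = np.abs(params[:, None] - codebook[None, :])
    return np.argmin(dists, axis=1)

def decode(codes, codebook):
    # The encoded representation is the codebook plus the codes.
    return codebook[codes]

codebook = np.array([-1.0, 0.0, 1.0])          # centroids, ordered by index
params = np.array([0.9, -1.1, 0.05, 0.4])      # model parameters
codes = encode(params, codebook)
print(codes)                                   # [2 0 1 1]
```

Storing the small codebook once and a short integer code per parameter is what yields the compression.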
Lossless compression of client read data
A read is aligned to a reference data set. It is determined whether the read includes any identifier distinction, the determination being performed using the alignment. If so, positional data corresponding to the identifier distinction(s) are defined. Compressed read data is stored in association with a read identifier of the read. The compressed read data includes alignment information (e.g., a start and/or stop position of the alignment). When the read includes an identifier distinction, the compressed read data further includes the positional data and deviation data characterizing the distinction.
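The scheme can be sketched as follows; the exhaustive window search stands in for a real aligner, and the record layout is a hypothetical stand-in for the compressed read data:

```python
def compress_read(read, reference):
    # Naive alignment: pick the start offset with the fewest mismatches.
    best_start, best_mm = 0, len(read) + 1
    for s in range(len(reference) - len(read) + 1):
        mm = sum(1 for a, b in zip(read, reference[s:s + len(read)]) if a != b)
        if mm < best_mm:
            best_start, best_mm = s, mm
    # Positional + deviation data for each identifier distinction.
    deviations = [(i, read[i]) for i in range(len(read))
                  if read[i] != reference[best_start + i]]
    return {"start": best_start, "stop": best_start + len(read),
            "deviations": deviations}

def decompress_read(record, reference):
    bases = list(reference[record["start"]:record["stop"]])
    for i, b in record["deviations"]:
        bases[i] = b
    return "".join(bases)

ref = "ACGTACGTACGT"
read = "TACGAAC"               # matches ref[3:10] with one mismatch
rec = compress_read(read, ref)
print(rec["start"], rec["deviations"])   # 3 [(4, 'A')]
```

A read identical to the reference window compresses to just the start/stop pair, which is why the scheme is lossless yet compact.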
CONTROL SYSTEM AND CONTROL METHOD
A control system includes: a generation unit that generates a dataset for each unit section; a feature extraction unit that generates feature quantity data on the basis of the dataset; and a score calculation unit that calculates a score indicating a degree of deviation of the feature quantity data from learning data, by referring to the learning data. The feature quantity data and the score are output as compression results of the dataset. The control system includes a restoration unit that selects pattern data corresponding to a class determined according to the score contained in the compression results, and, after adjusting the pattern data using the feature quantity data contained in the compression results, restores the pattern data as a temporal change in the dataset corresponding to the compression results.
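A toy sketch of the compress/restore loop, assuming a scalar learning-data summary, a mean/amplitude feature pair, and a two-class pattern library; all of these are illustrative stand-ins for the learned data and pattern data in the disclosure:

```python
LEARN_MEAN = 0.0          # illustrative scalar summary of the learning data

# Hypothetical pattern library, one entry per deviation class.
PATTERNS = {"normal": [0.0, 1.0, 0.0, -1.0],
            "anomalous": [1.0, 1.0, 1.0, 1.0]}

def compress(section):
    mean = sum(section) / len(section)
    amp = max(abs(x - mean) for x in section)      # feature quantity data
    score = abs(mean - LEARN_MEAN)                 # degree of deviation
    return {"features": (mean, amp), "score": score}

def restore(result):
    # Select the pattern for the class determined by the score, then
    # adjust it with the features to approximate the temporal change.
    cls = "anomalous" if result["score"] > 0.5 else "normal"
    mean, amp = result["features"]
    return [mean + amp * p for p in PATTERNS[cls]]

result = compress([0.0, 2.0, 0.0, -2.0])
print(restore(result))    # [0.0, 2.0, 0.0, -2.0]
```

Only the features and the score are transmitted, so a whole unit section compresses to a few scalars.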
INFORMATION PROCESSING SYSTEM AND COMPRESSION CONTROL METHOD
A dynamic driving plan generator generates a driving plan representing a dynamic partial driving target of a compressor and a decompressor based on input data input to the compressor. The compressor is partially driven according to the driving plan to generate compressed data of the input data. The decompressor is partially driven according to the driving plan to generate reconstructed data of the compressed data. The dynamic driving plan generator has been trained based on evaluation values obtained for the driving plan. Each of the evaluation values corresponds to a respective one of a set of evaluation indexes for the driving plan, and the evaluation values are obtained when at least the compression, of the compression and the reconstruction, is executed according to the driving plan. The evaluation indexes include the execution time for one or both of the compression and the reconstruction of the data.
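A heavily simplified sketch: zlib's effort level stands in for "partially driving" a learned compressor, a size threshold stands in for the trained plan generator, and the measured execution time corresponds to one of the evaluation indexes; all three substitutions are assumptions for illustration only:

```python
import time
import zlib

def make_plan(input_data):
    # Hypothetical stand-in for the trained dynamic driving plan generator:
    # larger inputs enable the stronger (slower) stage of the compressor.
    return {"strong_stage": len(input_data) > 64}

def compress(data, plan):
    # zlib's effort level stands in for partially driving a learned compressor.
    level = 9 if plan["strong_stage"] else 1
    t0 = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - t0    # execution time: an evaluation index
    return out, elapsed

data = b"abc" * 1000
plan = make_plan(data)
compressed, exec_time = compress(data, plan)
assert zlib.decompress(compressed) == data   # partial drive stays lossless
```

The planner trades compression ratio against execution time per input, which is the optimization the evaluation indexes capture.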
Approach to improve decompression performance by scatter gather compression units and also updating checksum mechanism
A method, apparatus, and system for decompressing data with a hardware compression/decompression accelerator is disclosed. The operations comprise: submitting compressed data from a plurality of stored compression units in a compression region to a hardware compression/decompression accelerator in a single submission for decompression, wherein each compression unit stores a checksum calculated based on corresponding uncompressed data; decompressing, at the hardware compression/decompression accelerator, the compressed data from the plurality of stored compression units, the decompressing generating combined decompressed data corresponding to the compressed data; calculating, at the hardware compression/decompression accelerator, a first combined checksum based on the combined decompressed data; calculating a second combined checksum based on individual checksums stored in the plurality of compression units; determining whether the first combined checksum matches the second combined checksum; and if the combined checksums match, forwarding the combined decompressed data to a storage device for storage as uncompressed data.
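The two-path checksum comparison can be sketched with a simple additive checksum, which (unlike plain CRC32) combines across units by modular addition; this is an illustrative stand-in for the accelerator's checksum-combination mechanism:

```python
import zlib

def checksum(data):
    # Additive checksum: combines across units by modular addition.
    return sum(data) % (1 << 32)

# Each compression unit stores a checksum of its uncompressed data.
units = [zlib.compress(b) for b in (b"alpha", b"bravo", b"charlie")]
unit_sums = [checksum(zlib.decompress(u)) for u in units]

# Single-submission decompression producing combined output.
combined = b"".join(zlib.decompress(u) for u in units)
first = checksum(combined)              # from the combined decompressed data
second = sum(unit_sums) % (1 << 32)     # from the stored per-unit checksums
assert first == second                  # match => forward for storage
```

Computing the second checksum from the stored per-unit values avoids re-reading the decompressed data, which is what makes the single-submission path cheap to verify.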
Compression offloading to RAID array storage enclosure
A storage system comprises a plurality of enclosures and a storage controller. Each enclosure comprises at least one processing device and a plurality of drives configured in accordance with a redundant array of independent disks (RAID) arrangement. The storage controller obtains data pages associated with an input-output request, provides the data pages to a processing device of a given enclosure, and issues a command to the processing device to perform at least one operation based at least in part on the data pages. The processing device of the given enclosure receives the data pages from the storage controller, generates compressed data pages based at least in part on the received data pages, stores one or more of the compressed data pages on the plurality of drives according to the RAID arrangement and returns information associated with the storage of the compressed data pages to the storage controller.
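An enclosure-side sketch under strong assumptions: RAID-0-style striping with parity omitted, in-memory lists standing in for drives, and a hypothetical placement record as the information returned to the storage controller:

```python
import zlib

def offload_compress_and_store(pages, drives):
    """Compress data pages in the enclosure and stripe them across drives
    (RAID-0 style; parity omitted), returning per-page placement info."""
    info = []
    for i, page in enumerate(pages):
        comp = zlib.compress(page)
        d = i % len(drives)              # round-robin stripe placement
        drives[d].append(comp)
        info.append({"drive": d, "index": len(drives[d]) - 1,
                     "compressed_size": len(comp)})
    return info

drives = [[] for _ in range(3)]
pages = [(b"page-%d" % i) * 8 for i in range(4)]
info = offload_compress_and_store(pages, drives)
```

The controller only sees the returned placement records; the compression work itself stays on the enclosure's processing device.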
MULTIVARIATE DATA COMPRESSION SYSTEM AND METHOD THEREOF
A smart sensing architecture (100) includes smart meters (102) and processing units (104). The smart meters (102) generate and transmit multidimensional data streams to the processing units (104). A processing unit (104) determines an optimum batch size for a multidimensional data stream and generates a multidimensional batch of data. The processing unit (104) reduces dimensionality of the multidimensional batch of data using principal component analysis to generate a low-dimensional batch of data and performs compressive sampling on the low-dimensional batch of data to generate a compressed batch of data, thereby saving transmission bandwidth.
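The two reduction stages can be sketched with NumPy; the batch size, component count, and random measurement matrix are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.normal(size=(64, 16))      # assume optimum batch size found = 64

# Stage 1 -- PCA via SVD on the mean-centred batch: keep k components.
centred = batch - batch.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
k = 4
low_dim = centred @ vt[:k].T           # low-dimensional batch (64 x 4)

# Stage 2 -- compressive sampling: random projection to m < 64 measurements.
m = 16
phi = rng.normal(size=(m, 64)) / np.sqrt(m)
compressed = phi @ low_dim             # compressed batch (16 x 4)
print(batch.size, "->", compressed.size)   # 1024 -> 64
```

PCA removes cross-dimension redundancy first, so the compressive-sampling stage operates on far fewer coefficients than the raw stream.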
System and method for performing data minimization without reading data content
A system and method for performing data minimization without reading data content is disclosed. The method includes receiving a request from a user to perform data minimization and retrieving metadata associated with a plurality of datasets based on the request. The method further includes determining one or more characteristics of the retrieved metadata based on one or more data parameters and one or more derived data parameters, and generating one or more minimization parameters and one or more data sensitivity parameters for each of the plurality of datasets by using a trained data-minimization-based ML and NLP model. The method includes determining portions of the plurality of datasets based on the one or more minimization parameters, the one or more data sensitivity parameters, privacy regulations, and business requirements, and performing one or more minimizing operations on the determined portions of the plurality of datasets based on prestored rules.
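A sketch of the metadata-only decision step, with a small rule table standing in for the trained ML and NLP model; note that the content of the datasets is never read:

```python
# Hypothetical rule table: sensitivity label -> minimization operation.
RULES = {"pii": "mask", "stale": "delete", "none": "keep"}

def plan_minimization(dataset_metadata):
    """Decide a minimizing operation per dataset from metadata alone."""
    actions = {}
    for name, meta in dataset_metadata.items():
        # Simple stand-in for the model's sensitivity classification.
        if meta.get("contains_identifiers"):
            label = "pii"
        elif meta.get("age_days", 0) > 365:
            label = "stale"
        else:
            label = "none"
        actions[name] = RULES[label]
    return actions

meta = {"customers": {"contains_identifiers": True},
        "old_logs": {"age_days": 400},
        "metrics": {}}
print(plan_minimization(meta))
```

Working purely on metadata keeps the minimization pipeline from ever touching sensitive content, which is the privacy property the method claims.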
TRELLIS BASED RECONSTRUCTION ALGORITHMS AND INNER CODES FOR DNA DATA STORAGE
Techniques are disclosed for reducing the cost of the encoding and decoding operations used in DNA data storage systems, and for reducing errors in those operations while accounting for the code structure used during encoding and decoding, by constructing and using insertion-deletion-substitution (IDS) trellises for multiple traces. A DNA sequencing channel is used to randomly sample and sequence DNA strands to generate noisy traces. Trellises are independently constructed, one for each noisy trace. A forward-backward algorithm is run on each trellis to compute posterior marginal probabilities for the vertices included in that trellis. An estimate of the data message sequence is then computed.
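The forward-backward step can be illustrated on a tiny two-state trellis (an ordinary HMM trellis, not the full IDS trellis of the disclosure), computing per-vertex posterior marginals:

```python
import numpy as np

# Toy two-state trellis: transition, emission, and prior are illustrative.
T_mat = np.array([[0.9, 0.1],
                  [0.2, 0.8]])            # transition probabilities
E = np.array([[0.8, 0.2],
              [0.3, 0.7]])                # emission probabilities
pi = np.array([0.5, 0.5])                 # initial state distribution
obs = [0, 0, 1]                           # one noisy trace (observed symbols)

n = len(obs)
alpha = np.zeros((n, 2))                  # forward messages
beta = np.ones((n, 2))                    # backward messages
alpha[0] = pi * E[:, obs[0]]
for t in range(1, n):
    alpha[t] = (alpha[t - 1] @ T_mat) * E[:, obs[t]]
for t in range(n - 2, -1, -1):
    beta[t] = T_mat @ (E[:, obs[t + 1]] * beta[t + 1])

# Posterior marginal probability of each trellis vertex (state at each step).
posterior = alpha * beta
posterior /= posterior.sum(axis=1, keepdims=True)
estimate = posterior.argmax(axis=1)       # per-step symbol estimate
```

In the disclosed system the same forward-backward pass runs independently per trace, and the per-trace marginals are then combined into the estimate of the data message sequence.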