Patent classifications
G06F2211/1014
Capturing compression efficiency metrics for processing data
Provided are techniques for capturing compression efficiency metrics for processing data. In response to retrieving native data for a first operation, perform the first operation; perform a second operation to generate a compression efficiency metric from the native data based on a ratio of the native data to compressed native data; and store the compression efficiency metric persistently for subsequent use in prioritizing compression of the native data.
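The flow described above can be illustrated with a minimal sketch. It assumes zlib as the compressor, a CRC as a stand-in for the "first operation", and a JSON file as the persistent store; none of these choices come from the abstract, and the function names are hypothetical.

```python
import json
import zlib


def compression_efficiency(native, level=6):
    """Ratio of native size to compressed size (higher means more compressible)."""
    if not native:
        return 1.0  # assumption: empty data is treated as neutral
    return len(native) / len(zlib.compress(native, level))


def process_and_record(key, native, metrics):
    """Run the first operation, then record the metric while the data is in hand."""
    checksum = zlib.crc32(native)                   # stand-in for the first operation
    metrics[key] = compression_efficiency(native)   # second operation
    return checksum


def save_metrics(metrics, path):
    """Persist the metrics so a later pass can consult them (JSON is an assumption)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(metrics, fh)


def prioritize(metrics):
    """Keys ordered from most to least compressible."""
    return sorted(metrics, key=metrics.get, reverse=True)
```

A later compression pass could call `prioritize` on the loaded metrics and compress the highest-ratio items first, without re-reading the native data to estimate the benefit.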
Method of Using Common Storage of Parity Data for Unique Copy Recording
A disclosed method is performed at a fault-tolerant object-based storage system including M data storage entities, each of which is configured to store data on an object-basis. The method includes obtaining a request to store N copies of a data object and in response, storing the N copies of the data object across the M data storage entities, where the N copies are distributed across the M data storage entities. The method additionally includes generating a first parity object for a first subset of M copies of the N copies of the data object, where the first parity object is stored on a first parity storage entity separate from the M data storage entities. The method also includes generating a manifest linking the first parity object with one or more other subsets of M copies of the N copies of the data object.
ADAPTIVE PARITY ROTATION FOR REDUNDANT ARRAYS OF INDEPENDENT DISKS
A method for more efficiently utilizing storage space in a redundant array of independent disks (RAID) is disclosed. In one embodiment, such a method implements a RAID from multiple storage drives. The RAID utilizes data striping with distributed parity values to provide desired data protection/redundancy. The distributed parity values are placed on selected storage drives of the RAID in accordance with a designated parity rotation. The method further adaptively alters the parity rotation of the RAID to provide an increased concentration of parity values in certain storage drives of the RAID compared to other storage drives of the RAID. This parity rotation may be adapted based on residual storage capacity in each storage drive, consumed space in each storage drive, or the like. A corresponding system and computer program product are also disclosed.
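One way to picture an adapted parity rotation is as a repeating sequence of drive indices in which drives with more residual capacity appear more often. The sketch below uses a smooth weighted round-robin to build such a sequence; the weighting scheme and the function names are assumptions for illustration, not the method claimed in the abstract.

```python
def build_rotation(residual, length):
    """Return `length` parity placements; drive i appears in proportion to residual[i]."""
    total = sum(residual)
    if total <= 0:
        # No residual information: fall back to a conventional uniform rotation.
        return [s % len(residual) for s in range(length)]
    current = [0] * len(residual)
    rotation = []
    for _ in range(length):
        # Smooth weighted round-robin: credit every drive, pick the largest credit.
        for i, weight in enumerate(residual):
            current[i] += weight
        best = max(range(len(residual)), key=lambda i: current[i])
        current[best] -= total
        rotation.append(best)
    return rotation


def parity_drive(stripe, rotation):
    """Drive that holds the parity value for the given stripe number."""
    return rotation[stripe % len(rotation)]
```

With equal weights this reduces to the usual one-drive-per-stripe rotation, so the adaptive case is a generalization rather than a separate code path.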
EFFICIENT UTILIZATION OF STORAGE SPACE IN ARRAYS OF STORAGE DRIVES
A method for more efficiently utilizing storage space in a set of storage drives is disclosed. In one embodiment, such a method implements, in a set of storage drives, a first RAID utilizing data striping with distributed parity values. The method further implements, in a subset of the set of storage drives, a second RAID using residual storage space in storage drives belonging to the subset. Storage drives belonging to the subset may have a storage capacity that is larger than storage drives not belonging to the subset. In certain embodiments, the method adaptively alters a parity rotation of the first RAID to provide an increased concentration of parity values in certain storage drives of the first RAID compared to other storage drives of the first RAID. A corresponding system and computer program product are also disclosed.
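The capacity benefit of a second RAID in the residual space can be shown with simple arithmetic. The sketch assumes both arrays are RAID 5 (one strip of parity per stripe) and that each array uses equal-size slices of its member drives; the abstract does not fix the RAID level of the second array, so this is only one possible reading.

```python
def raid5_usable(slices):
    """Usable capacity of a RAID 5 built from equal-size slices of each member."""
    if len(slices) < 3:
        return 0  # RAID 5 needs at least three members
    return (len(slices) - 1) * min(slices)


def plan_two_raids(capacities):
    """Return (first_raid_usable, second_raid_usable) for the given drive sizes."""
    first = raid5_usable(capacities)
    base = min(capacities)
    # Residual space exists only on drives larger than the smallest drive.
    residuals = [c - base for c in capacities if c > base]
    second = raid5_usable(residuals)
    return first, second
```

For example, two 4 TB drives and three 8 TB drives give a first array of 16 TB usable and a second array of 8 TB usable from space that would otherwise sit idle.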
Efficient embedding table storage and lookup
The present disclosure provides systems, methods, and computer program products for providing efficient embedding table storage and lookup in machine-learning models. A computer-implemented method may include obtaining an embedding table comprising a plurality of embeddings respectively associated with a corresponding index of the embedding table, compressing each particular embedding of the embedding table individually, allowing each respective embedding of the embedding table to be decompressed independent of any other embedding in the embedding table, packing the embedding table comprising individually compressed embeddings with a machine-learning model, receiving an input to use for locating an embedding in the embedding table, determining a lookup value based on the input to search indexes of the embedding table, locating the embedding based on searching the indexes of the embedding table for the determined lookup value, and decompressing the located embedding independent of any other embedding in the embedding table.
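The key property above is that each embedding is compressed on its own, so a lookup touches only one compressed record. A minimal sketch, assuming float32 vectors, zlib compression, an offset list as the index, and a CRC-based lookup value (all illustrative choices, not details from the abstract):

```python
import struct
import zlib


def pack_table(embeddings):
    """Compress each embedding separately; return (offsets, blob)."""
    offsets = [0]
    chunks = []
    for vec in embeddings:
        raw = struct.pack(f"<{len(vec)}f", *vec)   # little-endian float32
        comp = zlib.compress(raw)
        chunks.append(comp)
        offsets.append(offsets[-1] + len(comp))
    return offsets, b"".join(chunks)


def lookup_value(token, table_size):
    """Map an input token to a table index (hypothetical hashing scheme)."""
    return zlib.crc32(token.encode("utf-8")) % table_size


def lookup(offsets, blob, index):
    """Decompress only the embedding stored at `index`."""
    comp = blob[offsets[index]:offsets[index + 1]]
    raw = zlib.decompress(comp)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))
```

Compressing rows individually usually yields a worse ratio than compressing the whole table at once, but it avoids inflating the full table in memory for a single lookup.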
Storage system and data write method
A storage system includes a plurality of storage devices, each including a storage medium and a compression function for data, and a storage controller coupled to the plurality of storage devices. The storage controller includes, in a write command to be transmitted to a storage device at a write destination among the plurality of storage devices, compression necessity information indicating whether the data needs to be compressed. When the compression necessity information included in the received write command indicates that compression is unnecessary, the storage device at the write destination writes the data in the storage medium without compressing the data.
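The write path above can be modeled with a flag carried in the write command. In this sketch the controller marks compression as unnecessary when the data is already compressed upstream; that policy, the class names, and the use of zlib are assumptions made for illustration.

```python
import zlib
from dataclasses import dataclass


@dataclass
class WriteCommand:
    lba: int
    data: bytes
    compress_needed: bool  # the "compression necessity information"


def build_write(lba, data, already_compressed):
    """Controller side: skip device compression for data compressed upstream (assumed policy)."""
    return WriteCommand(lba=lba, data=data, compress_needed=not already_compressed)


class StorageDevice:
    def __init__(self):
        self.medium = {}  # lba -> (is_compressed, payload)

    def write(self, cmd):
        if cmd.compress_needed:
            self.medium[cmd.lba] = (True, zlib.compress(cmd.data))
        else:
            # Flag says compression is unnecessary: store the bytes as received.
            self.medium[cmd.lba] = (False, cmd.data)

    def read(self, lba):
        is_compressed, payload = self.medium[lba]
        return zlib.decompress(payload) if is_compressed else payload
```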
Storage apparatus and data control method of storing data with an error correction code
In a storage apparatus including a storage medium that includes a plurality of pages, each page being a unit of reading and writing data, a first data block including a data block received from a higher-level device is generated; a second data block of a predetermined size including one or more undivided first data blocks is generated; a third data block, in which a correction code is added to the second data block, is generated; the third data block is stored in a page buffer; and one or more of the third data blocks stored in the page buffer are written to a page, which is a write destination, out of the pages of the storage medium.
Data reduction techniques in a flash-based key/value cluster storage
In one aspect, a method includes splitting empty RAID stripes into sub-stripes and storing pages into the sub-stripes based on a compressibility score. In another aspect, a method includes reading pages from 1-stripes, storing compressed data in a temporary location, reading multiple stripes, determining a compressibility score for each stripe, and filling stripes based on the compressibility score. In a further aspect, a method includes scanning a dirty queue in a system cache, compressing pages ready for destaging, combining compressed pages into one aggregated page, writing one aggregated page to one stripe, and storing pages with the same compressibility score in a stripe.
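The abstract does not define the compressibility score. As a sketch, one plausible definition is the number of fixed-size units a page occupies after compression, so that pages with the same score fit the same sub-stripe size. The page size, unit size, and use of zlib below are all assumptions.

```python
import zlib
from collections import defaultdict

PAGE_SIZE = 8192  # hypothetical page size in bytes
UNIT = 1024       # hypothetical sub-stripe granularity in bytes


def compressibility_score(page):
    """Number of UNIT-sized slots the compressed page needs, capped at an uncompressed page."""
    compressed_len = len(zlib.compress(page))
    units = -(-compressed_len // UNIT)  # ceiling division
    return min(units, PAGE_SIZE // UNIT)


def group_by_score(pages):
    """Group page ids by score so each group can be destaged to a matching stripe."""
    groups = defaultdict(list)
    for page_id, page in pages.items():
        groups[compressibility_score(page)].append(page_id)
    return dict(groups)
```

Capping the score at `PAGE_SIZE // UNIT` reflects that a page which does not shrink would simply be stored uncompressed.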
Raid system performance enhancement using compressed data and byte addressable storage devices
A method for operating a RAID storage system includes configuring the RAID storage devices to receive in a read or write command a byte count, receiving a first data block to write to the storage system, compressing the received first data block to generate a first compressed data block, and then storing the first compressed data block in memory. The method additionally includes executing a set of RAID operations to perform a partial stripe update, including: retrieving a second compressed data block from memory; determining a physical size of the second compressed data block; generating, based on the second compressed data block and the physical size, redundant data corresponding with the second compressed data block; and writing the second compressed data block and the redundant data by transmitting a write command including the second compressed data block, the redundant data, and the physical size to the set of RAID storage devices.
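A partial stripe update over variable-length compressed blocks can be sketched with XOR parity, where shorter blocks are treated as zero-padded. The choice of XOR parity, zlib, and the dictionary standing in for the write command are assumptions; the abstract only says that redundant data is generated from the compressed block and its physical size.

```python
import zlib


def xor_bytes(a, b):
    """XOR two byte strings, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a.ljust(n, b"\0")
    b = b.ljust(n, b"\0")
    return bytes(x ^ y for x, y in zip(a, b))


def partial_stripe_update(old_block, old_parity, new_native):
    """Replace one compressed block in a stripe and update parity without reading the others."""
    new_block = zlib.compress(new_native)
    physical_size = len(new_block)  # byte count carried in the write command
    # Read-modify-write parity: remove the old block's contribution, add the new one.
    new_parity = xor_bytes(xor_bytes(old_parity, old_block), new_block)
    return {"data": new_block, "parity": new_parity, "byte_count": physical_size}
```

Because the command carries the byte count, a byte-addressable device can store exactly `byte_count` bytes rather than rounding the compressed block up to a full sector.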