Patent classifications
G06F11/1096
TECHNIQUES FOR MEMORY ERROR CORRECTION
Methods, systems, and devices for techniques for memory error correction are described. A memory device may operate cycles associated with refresh operations and cycles associated with refresh with error correction (ECC) operations independently. For example, the memory device may include an ECC patrol block having an error correction counter which indicates a row on which to perform an error correction procedure. Additionally, the memory device may include a refresh counter which indicates a row on which to perform a refresh operation. In response to receiving a command of a first type, the memory device may modify the error correction counter and maintain the refresh counter. Alternatively, in response to receiving a command of a second type, the memory device may modify the refresh counter and maintain the error correction counter.
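The independent counters described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the class name `PatrolState`, the command names `"refresh_ecc"` and `"refresh"`, and the wrap-around at `num_rows` are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class PatrolState:
    """Two independent row counters (hypothetical model of the abstract)."""
    num_rows: int
    refresh_row: int = 0  # next row for a plain refresh operation
    ecc_row: int = 0      # next row for the error correction procedure

    def on_command(self, kind: str) -> int:
        """Advance exactly one counter and return the row that was serviced."""
        if kind == "refresh_ecc":
            # Assumed name for the first kind of command:
            # modify the error correction counter, maintain the refresh counter.
            row = self.ecc_row
            self.ecc_row = (self.ecc_row + 1) % self.num_rows
        elif kind == "refresh":
            # Assumed name for the second kind of command:
            # modify the refresh counter, maintain the error correction counter.
            row = self.refresh_row
            self.refresh_row = (self.refresh_row + 1) % self.num_rows
        else:
            raise ValueError(f"unknown command type: {kind}")
        return row
```

Keeping the two counters in separate fields means a burst of one kind of command never disturbs the position of the other patrol.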
INSTANT WRITE SCHEME WITH DRAM SUBMODULES
Provided is a memory system including a plurality of memory submodules and a controller. Each submodule comprises a plurality of memory channels, each channel having a parity bit and a redundant array of independent devices (RAID) parity channel. The controller is configured to receive a block of data for storage in the plurality of memory submodules and determine whether a level of data traffic demand for a first of the plurality of submodules is high or low. When the data traffic demand is low, the controller (i) writes a portion of the block of data in the first of the plurality of submodules and (ii) concurrently updates the parity bit and the RAID parity channel associated with the block of data. When the data traffic demand is high, the controller (i) only writes the portion of the block of data in the first of the plurality of submodules and (ii) defers updating of the parity bits and the RAID parity channel associated with the block of data.
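The traffic-dependent write path can be sketched as below. The sketch is a simplification under stated assumptions: it models only the RAID parity channel (as a bytewise XOR across channels) and omits the per-channel parity bit, and the names `Submodule`, `write`, `flush_parity`, and the `demand_high` flag are hypothetical.

```python
from functools import reduce


def xor_parity(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Submodule:
    """Hypothetical submodule with data channels and one RAID parity channel."""

    def __init__(self, channels, width):
        self.data = [bytes(width) for _ in range(channels)]
        self.raid_parity = bytes(width)
        self.stale = False  # True while a parity update has been deferred

    def write(self, channel, payload, demand_high):
        """Write data; update parity now (low demand) or defer it (high demand)."""
        self.data[channel] = payload
        if demand_high:
            self.stale = True     # data only; parity is brought up to date later
        else:
            self.flush_parity()   # parity updated together with the data

    def flush_parity(self):
        """Recompute the RAID parity channel from the current data."""
        self.raid_parity = xor_parity(self.data)
        self.stale = False
```

A controller built this way would call `flush_parity()` once demand drops, so the window in which parity is stale stays bounded by the length of the busy period.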
Changing of error correction codes based on the wear of a memory sub-system
Systems and methods are disclosed that include retrieving, by a processing device, a codeword stored at a memory sub-system, determining parity data of the codeword, generating additional parity bits based on one or more bits of the parity data of the codeword, and generating host data by decoding the codeword using the additional parity bits.
Data protection via commutative erasure coding in a geographically diverse data storage system
Commutative coding in a geographically diverse data storage system is disclosed. Commutative coding can achieve a same result as more conventional hierarchical erasure coding of data, but can be more efficient. Commutative coding can employ Galois Field (GF) based bit-matrix operations. The bit-matrix operations can employ a reduced GF order in association with expanding elements of input matrixes. A reduced GF order can perform matrix operations at a lower complexity, e.g., employing AND operations for a GF(2) in contrast to XOR operations for a GF(2^w), where w=4, 8, 16, etc. In an aspect, commutative coding can comprise generating a second-tier coding fragment based on applying a second erasure coding scheme, via bit-matrix operations, to a first-tier encoded fragment, wherein the first-tier encoded fragment is based on an input data fragment and a first erasure coding scheme.
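As a rough illustration of bit-matrix operations over GF(2), the sketch below multiplies bit-matrices using AND as the field product and XOR (sum modulo 2) as the field sum. It shows only that applying a second-tier matrix to a first-tier encoded fragment equals applying the combined matrix to the input. The matrices and function names are made up for the example, and this is not the coding scheme of the disclosure.

```python
def gf2_matvec(matrix, bits):
    """Multiply a bit-matrix by a bit-vector over GF(2)."""
    # x & y is the GF(2) product; summing modulo 2 is the GF(2) sum (XOR).
    return [sum(x & y for x, y in zip(row, bits)) % 2 for row in matrix]


def gf2_matmul(a, b):
    """Multiply two bit-matrices over GF(2)."""
    columns = list(zip(*b))
    return [[sum(x & y for x, y in zip(row, col)) % 2 for col in columns]
            for row in a]


# Hypothetical first-tier and second-tier bit-matrices and an input fragment.
FIRST_TIER = [[1, 0, 1],
              [0, 1, 1]]
SECOND_TIER = [[1, 1],
               [0, 1]]
FRAGMENT = [1, 1, 0]

first_tier_encoded = gf2_matvec(FIRST_TIER, FRAGMENT)
second_tier_encoded = gf2_matvec(SECOND_TIER, first_tier_encoded)
combined = gf2_matmul(SECOND_TIER, FIRST_TIER)
```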
Method to increase the usable word width of a memory providing an error correction scheme
Various embodiments relate to a method for storing and reading data from a memory. Data words stored in the memory may be grouped, word-specific parity information and shared parity information are generated, and the shared parity information is distributed among the group of words. During reading of a word, if more errors are detected than can be corrected with the word parity data, the shared parity data is retrieved and used to make the error corrections.
Striping based on failure domains rules
A method for striping based on evaluated rules may include: determining a compatibility, with a storage system utilization policy, of storing stripes under the evaluated rules, wherein the evaluated rules define a stripe size, a number of parity chunks per stripe, and maximal numbers of chunks within a stripe per different failure domains of different size ranges; checking whether the storing of the stripes is compatible with the storage system utilization policy; when the storing of the stripes is found not to be compatible, searching for one or more changes to one or more of the maximal numbers that yield one or more compliant maximal numbers which, once applied, result in compliance with the storage system utilization policy; applying the one or more compliant maximal numbers when they are found; and determining that the evaluated failure domain rules are non-compliant when no compliant maximal numbers are found.
Exact repair regenerating codes for distributed storage systems
A distributed storage system includes a plurality of nodes comprising a first node, wherein a total number of nodes in the distributed storage system is represented by n, wherein a file stored in the distributed storage system is recovered from a subset of a number of nodes represented by k upon a file failure on a node in the distributed storage system, and wherein a failed node in the plurality of nodes is recovered from a number of helper nodes of the plurality of nodes represented by d. Upon detecting a failure in the first node, each helper node of the number of helper nodes is configured to determine a repair-encoder matrix, multiply a content matrix by the repair-encoder matrix to obtain a repair matrix, extract each linearly independent column of the repair matrix, and send the linearly independent columns of the repair matrix to the first node.
Coexisting differing erasure codes
A method for proactively rebuilding user data in a plurality of storage nodes of a storage cluster is provided. The method includes distributing user data and metadata throughout the plurality of storage nodes such that the plurality of storage nodes can read the user data, using erasure coding, despite loss of two of the storage nodes. The method includes determining that one of the storage nodes is unreachable and determining to rebuild the user data for the one of the storage nodes that is unreachable. The method includes reading the user data across a remainder of the plurality of storage nodes, using the erasure coding and writing the user data across the remainder of the plurality of storage nodes, using the erasure coding. A plurality of storage nodes within a single chassis that can proactively rebuild the user data stored within the storage nodes is also provided.
Method, apparatus, and computer readable medium for I/O control
Techniques providing I/O control involve: in response to receiving an I/O request, detecting a first set of bits for a stripe in a RAID. The RAID is built on disk slices divided from disks. The stripe includes extents. Each bit in the first set indicates whether a disk slice where a corresponding extent in the stripe is located is in a failure state. The techniques further involve determining, from the stripe and based on the first set of bits, a first set of extents in the failure state and a second set of extents out of the failure state. The techniques further involve executing the I/O request on the second set of extents without executing the I/O request on the first set of extents. Such techniques can simplify the storage of bits in I/O control, support degraded stripe write requests for the RAID, and enhance the performance of executing the I/O control.
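The per-stripe failure bits can be pictured as a bitmask with one bit per extent. The sketch below assumes that bit `i` set means extent `i` sits on a disk slice in the failure state. The function names and the representation of a stripe as a Python list are illustrative assumptions only.

```python
def split_extents(num_extents, failure_bits):
    """Return (failed, healthy) extent indices for one stripe.

    Assumption: bit i of failure_bits is 1 when the disk slice holding
    extent i is in the failure state.
    """
    failed = [i for i in range(num_extents) if (failure_bits >> i) & 1]
    healthy = [i for i in range(num_extents) if not (failure_bits >> i) & 1]
    return failed, healthy


def execute_io(stripe, failure_bits, op):
    """Apply op only to extents out of the failure state; skip the rest."""
    _failed, healthy = split_extents(len(stripe), failure_bits)
    return {i: op(stripe[i]) for i in healthy}
```

For example, with five extents and a mask of `0b00100`, only extent 2 is skipped and the request runs on extents 0, 1, 3, and 4.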
Storage system having RAID stripe metadata
A processing device obtains a write operation which comprises first data and second data to be stored in first and second strips of a given stripe. The processing device stores the first data in the first strip and determines that the second strip is unavailable. The processing device determines a parity based on the first data and the second data and stores the parity in a parity strip. The processing device updates metadata to indicate that the second data was not stored in the second strip. In some embodiments, the updated metadata is non-persistent and the processing device may be further configured to rebuild the given stripe, update persistent metadata corresponding to a sector of stripes including the given stripe and clear the non-persistent metadata based at least in part on a completion of the rebuild.
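A degraded write of this kind can be sketched as follows. The abstract does not say how the parity is computed. XOR parity over a two-strip stripe is assumed here, and the `Stripe` class, its `missing` set (standing in for the non-persistent metadata), and the method names are hypothetical.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class Stripe:
    """Hypothetical two-strip stripe with one parity strip (XOR assumed)."""

    def __init__(self):
        self.strips = {0: None, 1: None}
        self.parity = None
        self.missing = set()  # non-persistent metadata: strips not written

    def write(self, first, second, second_available):
        """Store the first data, compute parity from both, track a skipped strip."""
        self.strips[0] = first
        # Parity covers both pieces of data even when one strip is unavailable.
        self.parity = xor_bytes(first, second)
        if second_available:
            self.strips[1] = second
            self.missing.discard(1)
        else:
            self.missing.add(1)  # record that the second data was not stored

    def rebuild(self):
        """Recover skipped strips from the other strip and parity, then clear metadata."""
        for idx in list(self.missing):
            self.strips[idx] = xor_bytes(self.strips[1 - idx], self.parity)
        self.missing.clear()
```

Because the parity already reflects the second data, the rebuild can reconstruct the skipped strip without the original write being replayed.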