Patent classifications
H03M13/154
Erasure coding in a large geographically diverse data storage system
Selectively distributing fragments of a data protection set in a geographically diverse data storage system is disclosed. The data protection set can comprise fewer fragments than there are zones comprising the geographically diverse data storage system, which can result in some zones not storing a fragment of the data protection set. Control over distribution of fragments of different data protection sets in the geographically diverse data storage system can mitigate or avoid unbalanced storage of the protection sets. The distribution can be controlled in accordance with a protection set distribution scheme (PSDS). A first PSDS can generate coding fragments from randomly selected data fragments of all zones. A second PSDS can generate coding fragments from determined unique zone combinations. A third PSDS can generate coding fragments based on affinity values from an affinity matrix. In embodiments, threshold values or rules can be employed to force generation of a protection set regardless of an applied PSDS where the PSDS excessively retards generation of sufficient protection sets.
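The idea of cycling through unique zone combinations can be illustrated with a minimal Python sketch. It is a loose illustration of the second PSDS as the abstract describes it, not the patented method: the function names, the zone labels, and the round-robin order are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations, cycle


def assign_protection_sets(zones, fragments_per_set, num_sets):
    """Assign each protection set to the next unique zone combination.

    Hypothetical helper: cycles through every combination of
    `fragments_per_set` zones so that no zone is favoured over time.
    """
    if fragments_per_set > len(zones):
        raise ValueError("a protection set cannot span more zones than exist")
    combos = cycle(combinations(zones, fragments_per_set))
    return [next(combos) for _ in range(num_sets)]


def zone_load(assignments):
    """Count how many protection-set fragments each zone ends up storing."""
    return Counter(zone for combo in assignments for zone in combo)


# Four zones, protection sets of three fragments: four unique combinations.
assignments = assign_protection_sets(["A", "B", "C", "D"], 3, 4)
load = zone_load(assignments)
```

After one full pass over the four combinations, every zone holds the same number of fragments, which is the balancing effect the abstract attributes to controlled distribution.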
Hierarchical erasure coding for multi-region storage
Described are systems and methods for storing a data object using a hierarchical erasure encoding to store a physical representation of the data object across a plurality of fault domains. A first erasure encoding is applied to the data object to generate a first set of shards of the data object. Individual shards of the set of shards may then be distributed across the fault domains for storage. Within the fault domains a second erasure encoding may be applied to the individual shards to generate a second set of shards. Finally, a manifest may be generated in order to reconstruct the data object from the first set of shards and the second set of shards.
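A two-level encoding of this kind can be sketched in Python as follows. This is a simplified illustration under stated assumptions: a single XOR parity shard stands in for the erasure code at both levels (so each level tolerates only one loss), and the function names and manifest fields are hypothetical rather than taken from the patent.

```python
from functools import reduce


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def xor_encode(data, k):
    """Split data into k zero-padded shards and append one XOR parity shard."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    shards = [padded[i * size:(i + 1) * size] for i in range(k)]
    return shards + [reduce(xor_bytes, shards)]


def xor_decode(shards, k, length):
    """Rebuild the original bytes when at most one of the k+1 shards is None."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can repair only one lost shard")
    shards = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        shards[missing[0]] = reduce(xor_bytes, present)
    return b"".join(shards[:k])[:length]


def hierarchical_encode(obj, k_outer, k_inner):
    """First encoding across fault domains, second encoding inside each one."""
    outer = xor_encode(obj, k_outer)
    manifest = {"length": len(obj), "shard_length": len(outer[0]),
                "k_outer": k_outer, "k_inner": k_inner}
    domains = [xor_encode(shard, k_inner) for shard in outer]
    return domains, manifest


def hierarchical_decode(domains, manifest):
    """Repair inside each surviving domain, then repair across domains."""
    outer = [None if inner is None else
             xor_decode(inner, manifest["k_inner"], manifest["shard_length"])
             for inner in domains]
    return xor_decode(outer, manifest["k_outer"], manifest["length"])


obj = b"hierarchical erasure coding example"
domains, manifest = hierarchical_encode(obj, k_outer=3, k_inner=2)
domains[1] = None      # an entire fault domain is lost
domains[0][1] = None   # one shard inside another domain is lost
recovered = hierarchical_decode(domains, manifest)
```

In this sketch the manifest carries only the lengths and shard counts needed to undo padding. A production system would more likely use a Reed-Solomon style code at each level and record shard locations as well.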
Preliminary data protection using composite copies of data in a data storage system
The disclosed technology generally describes a preliminary (e.g., triple mirroring) data protection scheme that operates by writing data as redundant (e.g., three) composite copies made up of copies of data fragments to different nodes of a data storage system. The data fragments are distributed such that any two nodes can fail yet a complete set of data remains among the remaining data fragments. Later, erasure encoding creates redundant coding fragments that are written to the nodes of a data storage system in a distributed manner along with one copy of the data fragments, such that any two nodes can fail but the complete data can still be recovered. Redundant data fragments are then deleted.
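The claim that any two nodes can fail without data loss can be checked with a small Python sketch. The placement rule below (three copies of each fragment on consecutive nodes) is an assumption chosen for illustration. The abstract does not specify how the composite copies are laid out, and the function names are hypothetical.

```python
from itertools import combinations


def place_copies(num_fragments, num_nodes, copies=3):
    """Place `copies` copies of each fragment on distinct, consecutive nodes."""
    if num_nodes < copies:
        raise ValueError("need at least as many nodes as copies")
    placement = {node: set() for node in range(num_nodes)}
    for frag in range(num_fragments):
        for c in range(copies):
            placement[(frag + c) % num_nodes].add(frag)
    return placement


def survives_any_failures(placement, num_fragments, failures=2):
    """Return True if every fragment survives every choice of failed nodes."""
    all_fragments = set(range(num_fragments))
    for failed in combinations(placement, failures):
        remaining = set().union(
            *(frags for node, frags in placement.items() if node not in failed))
        if remaining != all_fragments:
            return False
    return True


triple = place_copies(num_fragments=12, num_nodes=4, copies=3)
double = place_copies(num_fragments=12, num_nodes=4, copies=2)
triple_ok = survives_any_failures(triple, 12)   # three copies on distinct nodes
double_ok = survives_any_failures(double, 12)   # two copies are not enough
```

Because each fragment sits on three distinct nodes, removing any two nodes always leaves at least one copy, whereas with two copies a single unlucky pair of failures loses a fragment.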
Multilevel Load Balancing
A storage system is provided. The storage system includes a first storage cluster, the first storage cluster having a first plurality of storage nodes coupled together and a second storage cluster, the second storage cluster having a second plurality of storage nodes coupled together. The system includes an interconnect coupling the first storage cluster and the second storage cluster and a first pathway coupling the interconnect to each storage cluster. The system includes a second pathway, the second pathway coupling at least one fabric module within a chassis to each blade within the chassis.
Policy-based hierarchical data protection in distributed storage
A storage management computing device obtains an information lifecycle management (ILM) policy. A data protection scheme to be applied at a storage node computing device level is determined and a plurality of storage node computing devices are identified based on an application of the ILM policy to metadata received from one of the storage node computing devices and associated with an object ingested by the one of the storage node computing devices. The one of the storage node computing devices is instructed to generate one or more copies of the object or fragments of the object according to the data protection scheme and to distribute the object copies or one of the object fragments to one or more other of the storage node computing devices to be stored by at least the one or more other storage node computing devices on one or more disk storage devices.
Reliability coding for storage on a network
This disclosure describes a programmable device, referred to generally as a data processing unit, having multiple processing units for processing streams of information, such as network packets or storage packets. This disclosure also describes techniques that include enabling data durability coding on a network. In some examples, such techniques may involve storing data in fragments across multiple fault domains in a manner that enables efficient recovery of the data using only a subset of the data. Further, this disclosure describes techniques that include applying a unified approach to implementing a variety of durability coding schemes. In some examples, such techniques may involve implementing each of a plurality of durability coding and/or erasure coding schemes using a common matrix approach, and storing, for each durability and/or erasure coding scheme, an appropriate set of matrix coefficients.
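The common matrix approach can be illustrated with a short Python sketch in which replication, XOR parity, and a Reed-Solomon style code differ only in their stored coefficients. The field GF(2^8) with reduction polynomial 0x11d, the specific coefficient values, and all names below are assumptions for this example. The abstract does not state which field or matrices are used.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def matrix_encode(coefficients, data_fragments):
    """Produce one output fragment per matrix row from equal-length inputs."""
    length = len(data_fragments[0])
    output = []
    for row in coefficients:
        fragment = bytearray(length)
        for coef, data in zip(row, data_fragments):
            for i, byte in enumerate(data):
                fragment[i] ^= gf_mul(coef, byte)
        output.append(bytes(fragment))
    return output


# Hypothetical coefficient sets: each scheme is just a different matrix.
SCHEMES = {
    "replicate_x3": [[1], [1], [1]],
    "xor_2+1": [[1, 0], [0, 1], [1, 1]],
    "rs_like_2+2": [[1, 0], [0, 1], [1, 1], [1, 2]],
}

data = [b"\x01\x02", b"\x04\x08"]
replicated = matrix_encode(SCHEMES["replicate_x3"], data[:1])
xor_coded = matrix_encode(SCHEMES["xor_2+1"], data)
rs_coded = matrix_encode(SCHEMES["rs_like_2+2"], data)
```

Adding a new scheme in this sketch means adding a matrix to `SCHEMES`. The encoding routine itself does not change, which is the appeal of a unified approach.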
System and Method for Error Correction
A memory controller is provided for reading and writing to and from a memory module. The memory controller implements an error correction algorithm, which calculates error correction code for message data to be written to the memory module and checks the error correction code against the message data when the data is read out of the memory module. The memory controller spreads each codeword over at least four different beats sent over the interface with the memory module, with each beat comprising a symbol of error correction code. Bits of a particular symbol of message data occupy the same positions in different beats. Since the bits of the symbols occupy the same positions in different beats, the number of bits affected by a hardware error is minimised. With four symbols of error correction code available for use in the codeword.
Accelerated erasure coding system and method
An accelerated erasure coding system includes a processing core for executing computer instructions and accessing data from a main memory, and a non-volatile storage medium for storing the computer instructions. The processing core, storage medium, and computer instructions are configured to implement an erasure coding system, which includes: a data matrix for holding original data in the main memory; a check matrix for holding check data in the main memory; an encoding matrix for holding first factors in the main memory, the first factors being for encoding the original data into the check data; and a thread for executing on the processing core. The thread includes: a parallel multiplier for concurrently multiplying multiple entries of the data matrix by a single entry of the encoding matrix; and a first sequencer for ordering operations through the data matrix and the encoding matrix using the parallel multiplier to generate the check data.
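Multiplying many data entries by a single encoding factor can be approximated in Python with a per-factor lookup table applied to a whole row at once. This is only a stand-in for the parallel multiplier the abstract describes, which would typically rely on SIMD instructions. The field GF(2^8) with polynomial 0x11d, the helper names, and the example factors are assumptions.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def multiply_row(factor, row):
    """Multiply every byte of `row` by one factor using a 256-entry table."""
    table = bytes(gf_mul(factor, value) for value in range(256))
    return row.translate(table)


def encode_check(encoding_matrix, data_rows):
    """Sequence through the factors, accumulating check rows with XOR."""
    checks = []
    for factors in encoding_matrix:
        acc = bytes(len(data_rows[0]))
        for factor, row in zip(factors, data_rows):
            product = multiply_row(factor, row)
            acc = bytes(x ^ y for x, y in zip(acc, product))
        checks.append(acc)
    return checks


data_rows = [b"\x01\x02", b"\x04\x08"]
check_rows = encode_check([[1, 1], [1, 2]], data_rows)
```

The outer loops play the role of the sequencer, ordering the passes over the data and encoding matrices, while `multiply_row` handles one factor against a full row of data in a single call.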
Efficient segment cleaning employing local copying of data blocks in log-structured file systems of distributed data systems
Client data is structured as a set of data blocks. A first subset of data blocks is stored on a current segment of a plurality of disks. A second subset of data blocks is stored on a previous segment. A request to clean client data is received. The request includes a request to update the current segment to include the second subset of data blocks. The second subset of data blocks is accessed and transmitted from a lower layer to a higher layer of the system. Parity data is generated at the higher layer. The parity data is transmitted to the lower layer. The lower layer is employed to generate a local copy of the second subset of data blocks. Each local address that references the local copy of the second subset of data blocks is included in the current segment. The parity data is written in the current segment.
System and method for data protection in solid-state drives
The present disclosure relates to a system and a method for data protection. In some embodiments, an exemplary method for data encoding includes: receiving a data bulk; performing an erasure coding (EC) encoding on the data bulk to generate one or more EC codewords; distributing a plurality of portions of each EC codeword of the one or more EC codewords across a plurality of solid-state drives (SSDs); performing, at each SSD of the plurality of SSDs, an error correction coding (ECC) encoding on portions of the one or more EC codewords distributed to the SSD to generate an ECC codeword; and storing, in each SSD of the plurality of SSDs, the ECC codeword.