ERROR CORRECTION IN DATA STORAGE DEVICES
20200264953 ยท 2020-08-20
Inventors
Cpc classification
H03M13/2921
ELECTRICITY
G06F11/1076
PHYSICS
International classification
Abstract
Systems and methods are disclosed for error correction in data storage devices. In some implementations, a method is provided. The method includes obtaining configuration data indicating a logical arrangement for a set of blocks. The logical arrangement includes rows and columns of blocks. The configuration data also indicates a number of row parity blocks in a set of row parity blocks and a number of diagonal parity blocks in a set of diagonal parity blocks. The method also includes configuring a set of storage devices based on the configuration data, wherein a first number of data blocks in the set of diagonal parity blocks is less than a second number of data blocks in a column.
Claims
1. An apparatus, comprising: a set of storage devices comprising a set of blocks logically arranged in rows and columns, wherein the set of blocks comprises a set of data blocks; a set of row parity blocks; and a set of diagonal parity blocks, wherein a first number of diagonal parity blocks in the set of diagonal parity blocks is less than a second number of data blocks in a column; and a processing device coupled to the set of storage devices, the processing device configured to manage access to the set of storage devices.
2. The apparatus of claim 1, wherein each row parity block of the set of row parity blocks is obtained based on a row of data blocks.
3. The apparatus of claim 1, wherein a first number of row parity blocks in the set of row parity blocks is equal to a second number of rows.
4. The apparatus of claim 1, wherein a first number of rows of data blocks is equal to a second number of columns of data blocks.
5. The apparatus of claim 1, wherein each diagonal parity block of the set of diagonal parity blocks is obtained based on data blocks in different rows and different columns.
6. The apparatus of claim 1, wherein the first number of diagonal parity blocks in the set of diagonal parity blocks is less than a second number of rows.
7. The apparatus of claim 1, wherein a total number of row parity blocks and diagonal parity blocks is less than a first number of data blocks in two columns.
8. The apparatus of claim 1, wherein the processing device is further configured to: obtain configuration data indicating one or more of: a logical arrangement of the set of blocks; a number of row parity blocks; and a number of diagonal parity blocks; and configure the set of storage devices and set of storage blocks based on the configuration data.
9. The apparatus of claim 1, wherein the processing device is further configured to: determine whether one or more data blocks of the set of data blocks comprises one or more errors; and recovering the one or more data blocks based on one or more of: a first subset of the set of data blocks; a second subset the set of diagonal parity blocks; and a third subset of the set of row parity blocks.
10. The apparatus of claim 9, wherein a total number of blocks used to recover the one or more data blocks is less than a remaining number of blocks.
11. A method comprising: obtaining a configuration data, wherein the configuration data indicates: a logical arrangement for a set of blocks, wherein the logical arrangement comprises rows and columns of blocks; a number of row parity blocks in a set of row parity blocks; and a number of diagonal parity blocks in a set of diagonal parity blocks; and configuring a set of storage devices based on the configuration data, wherein a first number of diagonal parity blocks in the set of diagonal parity blocks is less than a second number of data blocks in a column.
12. The method of claim 11, wherein each row parity block of the set of row parity blocks is obtained based on a row of data blocks.
13. The method of claim 11, wherein a first number of row parity blocks in the set of row parity blocks is equal to a second number of rows.
14. The method of claim 11, wherein a first number of rows of data blocks is equal to a second number of columns of data blocks.
15. The method of claim 11, wherein each diagonal parity block of the set of diagonal parity blocks is obtained based on data blocks in different rows and different columns.
16. The method of claim 11, wherein the first number of diagonal parity blocks in the set of diagonal parity blocks is less than a second number data blocks in 1.
17. The method of claim 11, wherein a total number of row parity blocks and diagonal parity blocks is less than a first number of data blocks in two columns.
18. The method of claim 11, further comprising: determining that one or more data blocks of a set of data blocks comprises one or more errors; and recovering the one or more data blocks based on one or more of: a first subset of the set of data blocks; a second subset the set of diagonal parity blocks; and a third subset of the set of row parity blocks.
19. The method of claim 18, wherein a total number of blocks used to recover the one or more data blocks is less than a remaining number of blocks.
20. non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations comprising: obtaining a configuration data, wherein the configuration data indicates: a logical arrangement for a set of blocks, wherein the logical arrangement comprises rows and columns of blocks; a number of row parity blocks in a set of row parity blocks; and a number of diagonal parity blocks in a set of diagonal parity blocks; and configuring a set of storage devices based on the configuration data, wherein a first number of diagonal parity blocks in the set of diagonal parity blocks is less than a second number of data blocks in a column.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0006]
[0007]
[0008]
[0009]
[0010]
[0011]
[0012] To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements disclosed in one embodiment may be beneficially utilized on other embodiments without specific recitation.
DETAILED DESCRIPTION
[0013] In the following disclosure, reference is made to examples, implementations, and/or embodiments of the disclosure. However, it should be understood that the disclosure is not limited to specific described examples, implementations, and/or embodiments. Any combination of the features, functions, operations, components, modules, etc., disclosed herein, whether related to different embodiments or not, may be used to implement and practice the disclosure. Furthermore, although embodiments of the disclosure may provide advantages and/or benefits over other possible solutions, whether or not a particular advantage and/or benefit is achieved by a given embodiment is not limiting of the disclosure. Thus, the following aspects, features, embodiments and advantages are merely illustrative and are not considered elements or limitations of the appended claims except where explicitly recited in a claim(s). Likewise, reference to the disclosure shall not be construed as a generalization of any inventive subject matter disclosed herein and shall not be considered to be an element or limitation of the appended claims except where explicitly recited in the claim(s).
[0014] The headings provided herein are for convenience only and do not necessarily affect the scope or meaning of the claimed invention. Disclosed herein are example implementations, configurations, and/or embodiments relating to error correction for data storage devices.
[0015] Data storage devices, such as solid state drives (SSDs), hard disk drives (HDDs), hybrid drives (e.g., storage drives/devices that include both magnetic media/medium and flash memory), etc., typically include one or more controllers coupled with one or more non-volatile memory (NVM) arrays or other storage media such as rotating magnetic disks. Stored data may be subject to loss and/or corruption. For example, data may be lost, damaged, corrupted, etc., due to failure of memory cells, damage (e.g., physical damage), degradation, read/write disturbs, loss of data retention, loss of endurance, etc. Data storage devices may generally utilize one or more error correction codes (ECCs), error correction schemes, and/or error coding mechanisms to detect and/or correct errors in the data that is stored within the data storage devices (e.g., stored within the NVM arrays). For example, the data storage devices may generate codewords that encode data using an ECC. In another example, data storage devices may generate parity data that is used to protect data from loss (e.g., parity data that is used to recover, regenerate, recalculate, etc., data when data becomes inaccessible, corrupted, has errors, etc.).
[0016] Although parity data may be used to correct errors in data, using parity data may increase the amount of storage space used in a non-volatile memory to store the data (e.g., the protected data). Thus, it may be useful and/or more efficient to use codes that reduce the amount of parity data used to protect data stored on the data storage device. In addition, different codes may access a different amount of data on the data storage device to recover data. For example, if a block of data becomes corrupted, the data storage device may use all of the remaining data (e.g., the reaming blocks of data and parity data) stored on the data storage device to recover the data. This may result in an increased bandwidth usage for the data storage device. Thus, it may be useful and/or more efficient to use codes that reduce the amount of data that is accessed when recovering corrupted or inaccessible data.
[0017]
[0018] The computing device 110 also includes a network interface 115. The network interface 115 may be hardware (e.g., a network interface card), software (e.g., drivers, applications, etc.), and/or firmware that allows the computing device 110 to communicate data with the network 105. The network interface card may be used to transmit and/or receive blocks of data, packets, messages, etc. In one embodiment, network 105 may include a public network (e.g., the Internet), a private network (e.g., a local area network (LAN)), a wide area network (WAN) such as the Internet, a wired network (e.g., Ethernet network), a wireless network (e.g., an 802.11 network or a Wi-Fi network), a cellular network (e.g., a Long Term Evolution (LTE) network), routers, hubs, switches, server computers, other types of computer networks, and/or a combination thereof. The computing device 110 may communicate (e.g., transmit and/or receive) data with other devices (e.g., other computing devices, other data storage devices, etc.) via the network 105.
[0019] Each data storage device 120 may incorporate access command scheduling and/or execution in accordance with embodiments, examples, and/or implementations disclosed herein. Each data storage device 120 may be any type of data storage device, drive, module, component, system, or the like. Furthermore, the terms drive and data storage drive may be used herein in certain contexts to refer to any type of data storage device, and may be used substantially interchangeably with the term data storage device herein in connection with various embodiments and/or in various contexts. As shown, Each data storage device 120 (e.g., hybrid hard drive, solid-state drive, any storage device utilizing solid-state memory, a hard disk drive, any storage device utilizing magnetic media/medium, etc.) includes a controller 130 (e.g., control circuitry, software, firmware, or a combination thereof) and a non-volatile memory 140.
[0020] The non-volatile memory (NVM) 140 may be configured for long-term storage of data and may retain data between power on/off cycles of the data storage device 120. The non-volatile memory 140 and/or portions of the non-volatile memory 140 may also be referred to as a storage medium. In some embodiments, the non-volatile memory 140 may include solid-state memory. Solid-state memory may comprise a wide variety of technologies, such as flash integrated circuits, Phase Change Memory (PC-RAM, PCM, or PRAM), Programmable Metallization Cell RAM (PMC-RAM or PMCm), Ovonic Unified Memory (OUM), Resistance RAM (RRAM), NAND memory (e.g., single-level cell (SLC) memory, multi-level cell (MLC) memory, triple level cell (TLC) memory, X4 or quad-level cell (QLC) memory, etc.), NOR memory, EEPROM, Ferroelectric Memory (FeRAM), magnetoresistive RAM (MRAM), or other discrete solid-state memory chips. In other embodiments, the non-volatile memory 140 may include magnetic media (including shingle magnetic recording), optical disks, floppy disks, electrically programmable read only memories (EPROM), electrically erasable programmable read only memories (EEPROM), etc. Non-volatile memory that uses magnetic media/medium may include one or more magnetic platters. Each platter may contain one or more regions of one or more tracks of data. The non-volatile memory 140 may include any combination of the one or more types of memories described here. The non-volatile memory 140 may be divided logically and/or physically into arrays, planes, blocks, pages, tracks, and sectors. While non-volatile memories are used as illustrative and teaching examples in this disclosure, those skilled in the art will recognize that various embodiments are applicable to volatile memories (e.g., Dynamic Random Access Memory (DRAM)) as well, as error correction codes are also used in those memories to protect data.
[0021] The controller 130 may include one or more processors, memory devices, data and/or power transmission channels/paths, boards, or the like. In some embodiments, the controller 130 may be implemented as one or more system-on-a-chip (SoC) modules, field-programmable gate array (FPGA) modules, application-specific integrated circuit (ASIC) modules, processing devices (e.g., processors), chips, or the like. In other embodiments, one or more components of the controller 130 may be mounted on a printed circuit board (PCB). The controller 130 may be configured to receive data commands from a storage interface (e.g., a device driver) residing on the computing device 110.
[0022] The controller 130 may communicate with the computing device 110 over a host interface 160, and may receive commands via the host interface 160. These commands may be referred to as data commands, data access commands, data storage access commands, etc. Data commands may specify a block address in the data storage device 120. Data may be accessed/transferred based on such data commands. For example, the controller 130 may receive data commands (from the computing device 110) and may execute such commands on/in the non-volatile memory 140 (e.g., in one or more arrays, pages, blocks, sectors, etc.). The data commands received from computing device 110 may include read data commands, write data commands, and erase data commands. The controller 130 may be coupled to the non-volatile memory (NVM) 140 via a NVM interface 150. In one embodiment, the NVM interface 150 may include a plurality of channels (e.g., one or more lines, pines, wires, traces, etc.) and each channel may be coupled to different portions of the non-volatile memory 140 (e.g., different NVM arrays, different flash arrays, etc.).
[0023] The controller 130 may execute the received data commands to read, write, and erase data from non-volatile memory 140, via the NVM interface 150. For example, the commands may include a read command (e.g. a data read command) to read a block of data from the non-volatile memory 140. The controller 130 may read the data from the page and may transmit the data to the computing device 110 via the host interface 160. In another example, the commands may include a write command (e.g., a data write command) to write data to a page in a non-volatile memory 140. In one embodiment, write commands may include program commands (e.g., a command to write the value 1 to a location the non-volatile memory 140) and erase commands (e.g., a command to write the value 0 to a location, a page, a block, etc., in the non-volatile memory array). The controller 130 may receive the data from the computing device 110 via the host interface 160 and may write the data to the page. The host interface 160 may include hardware (e.g., wires, pins, traces, connectors, etc.), software (e.g., drivers), firmware, or a combination thereof, that allows the processing device 111 and/or the computing device 110 to communicate data with the data storage device 120. Examples of a host interface may include a peripheral component interconnect express (PCIe) bus, a serial AT attachment (SATA) bus, a non-volatile memory express (NVMe) bus, etc.
[0024] The data storage device 120 may store data received from the computing device 110 such that the data storage device 120 acts as data storage for the computing device 110. To facilitate this function, the controller 130 may implement a logical interface. The logical interface may present to the computing device memory a set of logical addresses (e.g., sequential/contiguous addresses) where data may be stored. Internally, the controller 130 may map logical addresses to various physical memory addresses in the non-volatile memory arrays and/or other memory module(s). Mapping data indicating the mapping of logical addresses to physical memory addresses may be maintained in the data storage device. For example, mapping table data may be stored in non-volatile memory 140 in order to allow for recreation of mapping tables following a power cycle.
[0025] The controller 130 may encode data when storing the data on the non-volatile memory 140. The controller 130 may encode the data to protect the data from errors, loss, corruption, etc. The controller 130 may protect the data from errors, loss, corruption, etc., using various methods, techniques, functions, operations, actions, etc. In one embodiment, the controller 130 may protect the data by generating parity data (e.g., parity bits). The parity data may allow the controller 130 to determine whether there are errors in the data (e.g., errors due to corruption, damaged cells, damaged blocks, error while reading the data, etc.). The parity data (e.g., one or more parity bits) may be generated using various algorithms, techniques, functions, operations, etc. In another embodiment, the controller 130 may use an ECC to generate codewords. The codewords may also allow the controller 130 (e.g., the decoder 132) to correct or recover from errors in the codewords.
[0026] The controller 130 may also decode data that is stored on the non-volatile memory 140. In one embodiment, the controller 130 may decode codewords which encode the data that is stored on the non-volatile memory 140. In another embodiment, the controller 130 may perform error detection to determine the integrity of data retrieved from non-volatile memory 140 (e.g., to determine whether the data has errors). For example, the controller 130 may use parity data to check the data to determine whether there is an error in the data (e.g., whether one or more bits in the data are incorrect due to corruption, damage cells, damaged blocks, etc.).
[0027] As illustrated in
[0028] In one embodiment, the processing device 111 and/or controller 130 may obtain configuration data (e.g., may read the configuration data from a file, may receive the configuration data from the network interface 115, etc.). The configuration data may indicate a logical arrangement for a set of blocks that includes rows and columns of blocks, as discussed in more detail below. The configuration data may also indicate a number of row parity blocks in a set of row parity blocks and a number of diagonal parity blocks in a set of diagonal parity blocks. The processing device 111 and/or controller 130 may configure the data storage devices 120 based on the configuration data. For example, the processing device 111 and/or controller 130 may arrange the data blocks, diagonal parity blocks, and row parity blocks into rows and columns, as discussed in more detail below. The number of diagonal parity blocks in the set of diagonal parity blocks may less than the number of data blocks in a column. The number of diagonal parity blocks may also be less than the number of rows of data blocks. The number of row parity blocks in the set of parity blocks may be equal to the number of rows. The number of rows of data blocks may be equal to the number of columns of data blocks. Each diagonal party block may be generated, obtained, calculated, etc., based on data blocks in different rows and different columns, as discussed in more detail below. The total number of row parity blocks and diagonal parity blocks may be less than the number of blocks in two columns. The total amount of parity blocks stored in the data storage devices 120 (e.g., the repair overhead) may be less than the number of blocks in two columns of blocks, as discussed in more detail below. The processing device 111 and/or controller 130 may manage access to the data storage device 120. For example, the processing device 111 and/or controller 130 may execute commands to read, write, and/or access the non-volatile memory 140.
[0029] In one embodiment, the processing device 111 and/or controller 130 may determine that one or more data blocks of the set of data blocks has errors, is inaccessible, has become corrupted, etc. The processing device 111 and/or controller 130 may recover the one or more data blocks based on one or more of a subset of the set of data blocks, a subset the set of diagonal parity blocks, and a subset of the set of row parity blocks. The total number of blocks that are accessed, used, read, transmitted, etc., to recover a set of blocks (e.g., the repair bandwidth) may be less than the remaining number of blocks in the data storage devices 120, as discussed in more detail below.
[0030] Some embodiments of the present disclosure may use a diagonal code to reduce the repair bandwidth and/or repair overhead used by error correction codes, error correction schemes, error correction mechanisms, etc. The diagonal code may allow a data storage system to have a lower repair bandwidth when compared to other codes (such as SD codes) and may allow the data storage system to have a lower repair overhead when compared to other codes (such as butterfly codes). In addition, the diagonal code may allow the repair bandwidth (e.g., the number of blocks that are accessed to recover data) and/or the repair overhead (e.g., the number of that are used to store parity blocks) to be configurable. This allows a user to configure the data storage system with error correction capabilities, while maintaining a repair overhead and/or repair bandwidth that may be acceptable to the user.
[0031]
[0032] As illustrated in
[0033] In one embodiment, the data storage system 200 may use a sector-disk (SD) code to protect the data blocks 205 from loss, corruption, errors, etc. A SD code may be a type of error correction code, error correction scheme, error correction mechanism, etc., that may allow the data storage system 200 to recover, reconstruct, recalculate, regenerate, re-obtain, etc., data blocks 205 after errors occur, after the data blocks 205 are corrupted, damaged, etc. The SD code may use row parity blocks 210 and global parity blocks 215 to recover data blocks 205. In one embodiment, a row parity block may be parity data that is generated using the blocks in corresponding row. For example, the top-right parity block 210 may be generated, obtained, calculated, etc., using the four data blocks 205 that are in the first row of blocks. In another example, the top-right parity block 210 may be used to protect the data blocks 205 that are in the first row of blocks. In one embodiment, a global parity block may be parity data that is calculated using all of the data blocks 205 that are in the data storage system 200. For example, the global parity blocks 215 may be generated, obtained, calculated, etc., using the twenty-two data blocks 205 that stored in the data storage system 200.
[0034] As discussed above, the data storage system 200 may experience errors and/or failures which may cause one or more of the data blocks 205 to become unreadable, inaccessible, to have errors, etc. Also as discussed above, the data storage system 200 uses a sector-disk (SD) code (e.g., a SD error correction code, a SD coding scheme) to protect the data blocks 205. A SD code may be able to tolerate a certain number of errors in the data blocks 205. For example, the SD code may be able to recover the data blocks 205 if there are less than a threshold number of errors in the data blocks 205. As illustrated in
[0035] As illustrated in
[0036] If the data storage system 200 recovers the data blocks 205 in a column (e.g., in the rectangle 221) using a SD code, the data storage system 200 may access or read the all of the remaining data blocks 205 (e.g., sixteen data blocks 205), the global parity blocks 215 and the row parity blocks 210. For example, the data storage system 200 may read, access, etc., a total of twenty four blocks (e.g., sixteen data blocks 205, two global parity blocks 215 and six row parity blocks 210) in order to recover the data blocks in the second column (indicated by the rectangle 221). Thus, when the data storage system uses SD codes to protect the data blocks 205, the data storage system 200 may access all of the remaining blocks that have not failed or do not have errors (e.g., data blocks 205, global parity blocks 215, and row parity blocks 210) to reconstruct, recover, etc., the data blocks 205 in the second column. The number of blocks that are used by the data storage system 200 to recover data blocks 205 may be referred to as the repair bandwidth. Thus, the SD code (illustrated in
[0037]
[0038] As illustrated in
[0039] In one embodiment, the data storage system 300 may use a butterfly code to protect the data blocks 305 from loss, corruption, errors, etc. A butterfly code may be a type of error correction code, error correction scheme, error correction mechanism, etc., that may allow the data storage system 300 to recover, reconstruct, recalculate, regenerate, re-obtain, etc., data blocks 305 after errors occur, after the data blocks 305 are corrupted, damaged, etc. The butterfly code may use row parity blocks 310 and butterfly parity blocks 315 to recover data blocks 305. In one embodiment, a row parity block may be parity data that is generated using the blocks in a corresponding row. For example, the top-right parity block 310 may be generated, obtained, calculated, etc., using the four data blocks 305 that are in the first row of blocks. In another example, the top-right parity block 310 may be used to protect the data blocks 305 that are in the first row of blocks. In one embodiment, a butterfly parity block may be parity data that is calculated using one data block 305 from different columns and rows. For example, the butterfly parity blocks 315 (8) may be generated, obtained, calculated, etc., using the data blocks labelled 305 (8) (e.g., the last data block 305 from the top in the first column, the fourth data block 305 from the top in the second column, the second data block 305 from the top in the third column, and the first data block 305 from the top in the fourth column). The data blocks labelled (X) may referred to as a stripe of data or data stripe. A data stripe may be blocks of data that may be written across the columns of data blocks (i.e., written across the four leftmost columns) in different rows. For example, the four data blocks labeled 305 (1) may form a first data stripe, the four data blocks labeled 305 (2) may form a second data stripe, the four data blocks labeled 305 (3) may form a third data stripe, and the four data blocks labeled 305 (4) may form a fourth data stripe, etc. The butterfly parity blocks 315 may be generated, obtained, calculated, etc., using the data blocks 305 in a stripe, as discussed above.
[0040] As discussed above, the data storage system 300 may experience errors and/or failures which may cause one or more of the data blocks 305 to become unreadable, inaccessible, to have errors, etc. Also as discussed above, the data storage system 300 uses a butterfly code (e.g., a butterfly error correction code, a butterfly coding scheme) to protect the data blocks 305. A butterfly code may be able to tolerate a certain number of errors in the data blocks 305. For example, the butterfly code may be able to recover a column of data blocks 305 that becomes corrupted, lost, has errors, etc. As illustrated in
[0041] As illustrated in
[0042] As illustrated in
[0043] As discussed above, the number of blocks that are used by the data storage system 300 to recover data blocks 305 may be referred to as the repair bandwidth. The data storage system 300 may use a different number of blocks to recover different columns of data blocks 305. For example, the data storage system 300 may access twenty blocks of data to recover the data blocks in the first column, and may access twenty-six blocks of data to recover the data blocks in each of the second, third, and fourth column. Thus, the average repair bandwidth for the butterfly code (illustrated in
[0044] In addition, the number of rows in the logical arrangement of blocks may increase exponentially with the number of columns of data blocks 305 when the data storage system 300 uses a butterfly code to protect the data blocks 305. For example, if there are four columns of data blocks 305, then the logical arrangement of blocks includes eight rows of blocks, as illustrated in
[0045]
[0046] As illustrated in
[0047] In one embodiment, the data storage system 400 may use a diagonal code to protect the data blocks 405 from loss, corruption, errors, etc. A diagonal code may be a type of error correction code, error correction scheme, error correction mechanism, etc., that may allow the data storage system 400 to recover, reconstruct, recalculate, regenerate, re-obtain, etc., data blocks 405 after errors occur, after the data blocks 405 are corrupted, damaged, etc. The diagonal code may use row parity blocks 410 and diagonal parity blocks 415 to recover data blocks 405. In one embodiment, a row parity block may be parity data that is generated using the blocks in corresponding row. For example, the top-right parity block 410 may be generated, obtained, calculated, etc., using the four data blocks 405 that are in the first row of blocks. In another example, the top-right parity block 410 may be used to protect the data blocks 405 that are in the first row of blocks. In one embodiment, a diagonal parity block may be parity data that is calculated using data blocks 405 from different columns and rows. For example, the diagonal parity blocks 415 (1) may be generated, obtained, calculated, etc., using the data blocks labelled 405 (1) (e.g., the first data block 405 from the top in the first column, the fourth data block 405 from the top in the second column, the third data block 405 from the top in the third column, and the second data block 405 from the top in the fourth column). The data blocks labelled (X) may referred to as a stripe of data or data stripe. A data stripe may be blocks of data that may be written across the columns of data blocks (i.e., written across the four leftmost columns). For example, the four data blocks labeled 405 (1) may form a first data stripe, the four data blocks labeled 405 (2) may form a second data stripe, the four data blocks labeled 405 (3) may form a third data stripe, and the four data blocks labeled 405 (4) may form a fourth data stripe.
[0048] In other embodiments, the number of rows and columns of data blocks 405, row parity blocks 410, and diagonal parity blocks 415 may be different. For example, there may be eight columns of data blocks each column with eight data blocks 405, one column of row parity blocks 410 with eight row parity blocks 401, and four diagonal parity blocks 415. The data blocks 405 may be arranged such that the data blocks 405 from the different stripes (e.g., the first, second, third, and fourth data stripes labeled (1), (2), (3), and (4) respectively) are in an order in the first column (e.g., a leftmost column). For a subsequent column (e.g., the next column) the order of the stripes of the data blocks 405 may be changed. For stripe of the topmost data block 405 may be moved to the bottom of the subsequent column and the stripes for the other data blocks 405 may be shifted upwards to create a new order of stripes of data blocks 405 in the subsequent column. Each subsequent column may arrange the stripes of the data blocks 405 in a new order by moving the strip of the topmost data bock 405 of the previous column to the bottom and shifting the other stripes of the data blocks 405 upwards. The diagonal parity blocks 415 may be generated, obtained, calculated, etc., using the data blocks 405 in a stripe.
[0049] As discussed above, the data storage system 400 may experience errors and/or failures which may cause one or more of the data blocks 405 to become unreadable, inaccessible, to have errors, etc. Also as discussed above, the data storage system 400 uses a diagonal code (e.g., a diagonal error correction code, a diagonal coding scheme) to protect the data blocks 405. A diagonal code may be able to tolerate a certain number of errors in the data blocks 405. For example, the diagonal code may be able to recover a column of data blocks 405 that becomes corrupted, lost, has errors, etc. As illustrated in
[0050] As illustrated in
[0051] As illustrated in
[0052] As illustrated in
[0053] As discussed above, the number of blocks that are used by the data storage system 400 to recover data blocks 405 may be referred to as the repair bandwidth. When a column of data blocks becomes inaccessible, becomes corrupted, has errors, etc., the data storage system 400 may use twelve blocks out of the remaining eighteen blocks to recover the column of data blocks. For example, if the each of the first column, second column, third column, or fourth column of data blocks becomes inaccessible, the data storage system 400 may access twelve blocks of data (e.g., twelve total of data blocks 405, row parity blocks 410, and diagonal parity blocks 415) to recover the column of data blocks 405. Thus, the repair bandwidth for the diagonal code (illustrated in
[0054] In one embodiment, the repair bandwidth for the diagonal code used by the data storage system 400 may be constant or the same regardless of which column of data blocks 405 becomes inaccessible, corrupted, has errors, etc. As discussed above, the data storage system 400 may access twelve blocks of data regardless of whether the first, second, third or fourth column of data has become inaccessible. This may allow a more predictable and/or constant usage of the data storage system 400 when recovering data blocks, when compared with other codes such as butterfly codes. For example, if the data storage system 400 determines that data blocks have become inaccessible, the data storage system 400 may be able to more accurately predict the bandwidth (e.g., the amount of data that should be accessed, read, transmitted, etc.) to recover the inaccessible data blocks.
[0055] In addition, the number of rows of data blocks 405 is equal to the number of columns of data blocks 405 in the logical arrangement of blocks. This may be similar to the numbers and rows of data blocks used by the SD code illustrated in
[0056]
[0057] The method 500 starts at block 505 where the method 500 may obtain configuration data for the data storage system. For example, the method 500 may receive the configuration data from another device (e.g., another computing device) via a network interface. In another example, the method 500 may receive the configuration data via an interface, such as a graphical user interface, a command line interface, etc. In a further example, the method 500 may obtain the configuration data from a configuration file, settings, parameters, etc., that may be stored in the data storage system or in another data storage device. The data storage system may use a diagonal code, as illustrated in
[0058] In one embodiment, the configuration data may indicate a logical arrangement of blocks for the data storage system. For example, the configuration data may indicate how many rows are in the logical arrangement, how may columns are in the logical arrangement, how many blocks are in each row, how many blocks are in each column, etc. The configuration data may also indicate how many row parity blocks are in the data storage system. For example, the configuration data may indicate the number of row parity blocks in a column. The configuration data may also indicate how many diagonal parity blocks are in the data storage system. For example, the configuration data may indicate how many diagonal parity blocks are in a column. The configuration data may also indicate which data blocks are used to generate, obtain, calculate, determine, etc., the row parity blocks and/or diagonal parity blocks. For example, configuration data may indicate that each row of data blocks is used to determine a row parity block. In another example, the configuration data may indicate which blocks from different rows and columns are used to generate a diagonal parity block.
[0059] At block 510, the method 500 may configure a set of data storage devices (e.g., one or more data storage device) of the data storage system. For example, the method 500 may store the data blocks based on the logical arrangement indicated in the configuration data. In another example, the method 500 may generate the row parity blocks and diagonal parity blocks based on the logical arrangement indicated in the configuration data. For example, the method 500 may generate a diagonal parity blocks based on the configuration data, which indicates which blocks from different rows and columns are used to generate the diagonal parity block.
[0060] In some embodiments, the configuration data allows the repair bandwidth (e.g., the number of blocks that are accessed to recover data) and/or the repair overhead (e.g., the number of that are used to store parity blocks) to be configurable. For example, a user may change the configuration data by changing the number of diagonal parity blocks that are used by the data storage system. This may change the repair bandwidths and/or the repair overhead of the data storage system and the diagonal code. The diagonal code used by the data storage system may be configurable such that the repair bandwidth is less than 1 and repair overhead is less than 2, as discussed above. This allows the user to configure the data storage system with error correction capabilities, while maintaining a repair overhead and/or repair bandwidth that may be acceptable to the user.
[0061]
[0062] The method 600 starts at block 605 where the method 600 determines whether there are errors in one or more data blocks. For example, when the data storage devices tries to access one or more data blocks, the method 600 may determine whether there are error in the data blocks. If there are no errors in the data blocks, the method 600 may access the data blocks at block 610. For example, the method 600 may read the one or more data blocks. If there is an error in the one or more data blocks, the method 600 may determine whether the one or more data blocks are recoverable at block 615. For example, the method 600 may determine whether there are errors in less than a threshold number of data blocks (e.g., whether less than a single column of data blocks has errors). If the one or more data blocks are not recoverable, the method 600 ends. For example, if there are too many columns of data blocks with errors (e.g., more than a single column of data blocks has errors), the method 600 may end because the method 600 may be unable to recover the data blocks using the diagonal code. If the one or more data blocks are recoverable, the method 600 may recover the one or more data blocks at block 620. For example, the method 600 may recover the one or more data blocks using other data blocks, row parity blocks, and/or diagonal parity blocks, as discussed above. The total number of blocks that are accessed or used to recover the one or more blocks may be less than the number of blocks remaining in the data storage system (e.g., the repair bandwidth may be less than 1), as discussed above.
[0063] Some embodiments of the present disclosure may be used to reduce the repair bandwidth and/or repair overhead used by error correction codes, error correction schemes, error correction mechanisms, etc. For example, some embodiments may use a diagonal code, a diagonal coding scheme, etc., to protect data blocks from loss. The diagonal code may allow a data storage system to have a lower repair bandwidth when compared to other codes (such as SD codes) and may allow the data storage system to have a lower repair overhead when compared to other codes (such as butterfly codes). In addition, the diagonal code may allow the repair bandwidth (e.g., the number of blocks that are accessed to recover data) and/or the repair overhead (e.g., the number of that are used to store parity blocks) to be configurable. This allows a user to configure the data storage system with error correction capabilities, while maintaining a repair overhead and/or repair bandwidth that may be acceptable to the user.
General Comments
[0064] Those skilled in the art will appreciate that in some embodiments, other types of distributed data storage systems may be implemented while remaining within the scope of the present disclosure. In addition, the actual steps taken in the processes discussed herein may differ from those described or shown in the figures. Depending on the embodiment, certain of the steps described above may be removed, others may be added.
[0065] While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of protection. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms. Furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the protection. For example, the various components illustrated in the figures may be implemented as software and/or firmware on a processor, ASIC/FPGA, or dedicated hardware. Also, the features and attributes of the specific embodiments disclosed above may be combined in different ways to form additional embodiments, all of which fall within the scope of the present disclosure. Although the present disclosure provides certain preferred embodiments and applications, other embodiments that are apparent to those of ordinary skill in the art, including embodiments which do not provide all of the features and advantages set forth herein, are also within the scope of this disclosure. Accordingly, the scope of the present disclosure is intended to be defined only by reference to the appended claims.
[0066] The words example or exemplary are used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as example or exemplary is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the words example or exemplary is intended to present concepts in a concrete fashion. As used in this disclosure, the term or is intended to mean an inclusive or rather than an exclusive or. That is, unless specified otherwise, or clear from context, X includes A or B is intended to mean any of the natural inclusive permutations. That is, if X includes A; X includes B; or X includes both A and B, then X includes A or B is satisfied under any of the foregoing instances. In addition, the articles a and an as used in this disclosure and the appended claims should generally be construed to mean one or more unless specified otherwise or clear from context to be directed to a singular form. Moreover, use of the term an embodiment or one embodiment or an implementation or one implementation throughout is not intended to mean the same embodiment or implementation unless described as such. Furthermore, the terms first, second, third, fourth, etc., as used herein are meant as labels to distinguish among different elements and may not necessarily have an ordinal meaning according to their numerical designation.
[0067] All of the processes described above may be embodied in, and fully automated via, software code modules executed by one or more general purpose or special purpose computers or processors. The code modules may be stored on any type of computer-readable medium or other computer storage device or collection of storage devices. Some or all of the methods may alternatively be embodied in specialized computer hardware.