Patent classifications
G06F11/2094
Memory error handling during and/or immediately after a virtual machine migration
According to aspects of the present disclosure, systems and methods can be provided to recover from memory errors that occur during or following a virtual machine migration. Methods, computer program products and/or systems are provided for handling memory errors by performing the following operations: (i) obtaining a memory address that triggered an uncorrected error on a first host associated with a virtual machine migration; (ii) computing a page associated with the memory address; (iii) determining if a copy of the page associated with the memory address is available on a second host associated with the virtual machine migration; (iv) obtaining data from the copy of the page on the second host; and (v) generating a new page on the first host with the data obtained from the second host.
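Operations (i) through (v) can be sketched roughly as follows. This is only an illustration: the 4096-byte page size, the dictionary model of each host's memory, and the names `page_of` and `recover_page` are assumptions, not part of the disclosure.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def page_of(address: int) -> int:
    """Return the number of the page that contains a byte address."""
    return address // PAGE_SIZE


def recover_page(fault_address: int, first_host: dict, second_host: dict) -> bool:
    """Rebuild the faulting page on first_host from second_host's copy.

    Each host is modelled (hypothetically) as a dict mapping
    page number -> page contents as bytes.
    Returns True if the page was regenerated, False if no copy exists.
    """
    page = page_of(fault_address)      # (ii) compute the page
    copy = second_host.get(page)       # (iii) is a copy available?
    if copy is None:
        return False
    first_host[page] = bytes(copy)     # (iv) + (v) fetch data, create new page
    return True
```

In this model a return value of False means the error cannot be repaired from the migration peer and some other recovery path would be needed.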
Resolving erred IO flows
A method for resolving an erred input/output (IO) flow may include (i) sending over a path a remote direct write request associated with a certain address range, wherein the path is formed between a compute node of a storage system and a storage drive of the storage system; (ii) receiving by the compute node an error message related to the remote direct write request, wherein the error message does not indicate whether an execution of the remote direct write request failed or is only temporarily delayed; (iii) responding by the compute node to the error message by (a) preventing the sending of one or more IO requests through the path, (b) preventing the sending of at least one IO request aimed at the certain address range, and (c) requesting, using a management communication link, to force an execution of pending IO requests that are related to the path; and (iv) reusing the path, by the compute node, following an indication that there are no pending IO requests that are related to the path.
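The compute node's response in steps (iii) and (iv) can be sketched as a small state tracker. The class name `PathState`, its methods, and the `force_pending` callback are hypothetical. Lifting the address-range block at the same time as the path block is an assumption, since the abstract only states that the path is reused.

```python
class PathState:
    """Tracks whether a path and address ranges may carry new IO requests."""

    def __init__(self):
        self.path_blocked = False
        self.blocked_ranges = set()

    def on_ambiguous_error(self, address_range, force_pending):
        """React to an error that does not say whether the write failed."""
        self.path_blocked = True                 # (a) stop using the path
        self.blocked_ranges.add(address_range)   # (b) stop IO to the range
        force_pending()                          # (c) hypothetical management-link request

    def may_send(self, address_range) -> bool:
        """Return True if a new IO request to address_range may be sent."""
        return not self.path_blocked and address_range not in self.blocked_ranges

    def on_pending_count(self, count: int):
        """Handle a report of how many IO requests are still pending on the path."""
        if count == 0:
            # (iv) the path has drained, so it can be reused.
            self.path_blocked = False
            self.blocked_ranges.clear()  # assumption: ranges are released too
```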
System and Method for Blockchain Based Backup and Recovery
A blockchain based system and method of data backup and recovery, for use with a conventional data store, is disclosed. The system includes a blockchain that includes one or more nodes and a storage adaptation layer. The storage adaptation layer is in data communication with the blockchain and the data store, and stores data from the data store into the blockchain. The data store may be a relational database management system or other type of data store. The system further includes a recovery adaptation layer configured to restore data in the blockchain to the data store. The recovery adaptation layer is also in data communication with the blockchain and the data store.
STORAGE SYSTEM
A first storage controller includes a first input and output controller that performs input and output processing on host data, and a first management controller. A second storage controller includes a second input and output controller that performs input and output processing on host data, and a second management controller. The first management controller is configured to verify software to be executed by the first management controller and software to be executed by the first input and output controller. The second management controller is configured to verify software to be executed by the second management controller and software to be executed by the second input and output controller. The first management controller is configured to verify the software to be executed by the second input and output controller in place of the second management controller when a failure is detected from the second management controller.
Controller for managing superblocks and operation method thereof
A method for operating a controller that controls a memory device includes: replacing a bad block of a superblock with a replacement block to form a reproduced superblock; controlling the memory device to perform a program operation on the reproduced superblock according to an interleaving scheme; moving data stored in the replacement block to a pseudo-replacement block when the program operation on the reproduced superblock is completed; and releasing the replacement block from the reproduced superblock.
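The four steps can be sketched with blocks modelled as lists of pages. The function name, the round-robin interpretation of the interleaving scheme, and the list/dict data model are all assumptions made for illustration. What occupies the released slot afterwards is not stated in the abstract, so the sketch simply returns the remaining blocks.

```python
def program_with_replacement(superblock, bad, replacement, pseudo, pages):
    """Program pages across a superblock whose bad block was replaced.

    Returns (storage, remaining_blocks), where storage maps
    block id -> list of pages held by that block.
    """
    # Replace the bad block to form the reproduced superblock.
    reproduced = [replacement if block == bad else block for block in superblock]

    # Program the pages, interleaved round-robin across the blocks (assumed scheme).
    storage = {block: [] for block in reproduced}
    for index, page in enumerate(pages):
        storage[reproduced[index % len(reproduced)]].append(page)

    # After programming completes, move the replacement block's data
    # to the pseudo-replacement block.
    storage[pseudo] = storage.pop(replacement)

    # Release the replacement block from the reproduced superblock.
    remaining = [block for block in reproduced if block != replacement]
    return storage, remaining
```

For example, with superblock `["A", "B", "C"]`, bad block `"B"`, replacement `"R"`, pseudo-replacement `"P"` and pages `0..5`, block `"P"` ends up holding pages 1 and 4, and `"R"` is free again.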
STORAGE SYSTEM AND DATA PROCESSING METHOD
In a storage system in which a plurality of pieces of control software constituting a redundancy group are distributedly arranged in a plurality of storage nodes, control software in an active state out of the plurality of pieces of control software constituting the redundancy group receives a write request from a higher-level device. The control software in the active state writes data related to the write request by mirroring into a cache memory of a storage node in which the control software in the active state is arranged and a cache memory of a storage node in which control software in an inactive state belonging to the same redundancy group is arranged. The control software in the active state sends a write completion response to the higher-level device, and redundantly stores the data written in the cache memories in a storage device.
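A minimal sketch of this write path, assuming caches and storage devices are modelled as dictionaries. The function names `handle_write` and `destage` and the `"write-complete"` response value are hypothetical.

```python
def handle_write(key, data, active_cache: dict, standby_cache: dict) -> str:
    """Mirror a write into both nodes' cache memories, then acknowledge."""
    active_cache[key] = data    # cache on the node running the active software
    standby_cache[key] = data   # cache on the node running the inactive software
    return "write-complete"     # response sent to the higher-level device


def destage(key, active_cache: dict, devices: list) -> None:
    """Redundantly store cached data by writing it to every device given."""
    for device in devices:
        device[key] = active_cache[key]
```

The point of the ordering is that the acknowledgement depends only on the two cache writes, while the redundant write to the storage device is a separate step.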
INTERCONNECT LAYER SEND QUEUE RESERVATION SYSTEM
Systems and methods for an interconnect layer send queue reservation system are provided. In one example, a method involves performing a transfer of data (e.g., an NVLog) from a storage system to a secondary storage system. A send queue having a fixed number of slots is maintained within an interconnect layer interposed between a file system and a Remote Direct Memory Access (RDMA) layer of the storage system. The interconnect layer implements an application programming interface (API) for the reservation system. A deadlock situation is avoided by, during a suspendable phase of a write transaction, making a reservation for slots within the send queue via the reservation system for the transfer of data. When the reservation is successful, the write transaction proceeds with a modify phase, during which the reservation is consumed and the interconnect layer is caused to perform an RDMA operation to carry out the transfer of data.
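The reserve-then-consume pattern can be sketched as below. The class and method names are hypothetical and do not reflect the actual API of the interconnect layer. The sketch only tracks slot counts and performs no RDMA operation.

```python
class SendQueue:
    """Fixed-size send queue with slot reservations (illustrative model)."""

    def __init__(self, slots: int):
        self.free = slots       # slots nobody holds
        self.reserved = 0       # slots promised to write transactions
        self.in_flight = 0      # slots occupied by transfers in progress

    def reserve(self, count: int) -> bool:
        """Called in the suspendable phase; fails instead of blocking."""
        if count > self.free:
            return False
        self.free -= count
        self.reserved += count
        return True

    def consume(self, count: int) -> None:
        """Called in the modify phase; uses slots that were reserved earlier."""
        if count > self.reserved:
            raise ValueError("consuming more slots than were reserved")
        self.reserved -= count
        self.in_flight += count

    def complete(self, count: int) -> None:
        """Return slots to the free pool once their transfers finish."""
        if count > self.in_flight:
            raise ValueError("completing more slots than are in flight")
        self.in_flight -= count
        self.free += count
```

Because `reserve` can fail only in the phase where the transaction may still be suspended, the modify phase never has to wait for a slot, which is how the deadlock is avoided in this model.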
Data restoration operations based on network path information
A system according to certain aspects improves the process of data restoration and application recovery operations. The system can back up primary data based on network path information associated with a client computing device. When the primary data becomes corrupted or unavailable, a previously backed up copy of the primary data may be used as the primary data to achieve instant application recovery. For example, when a portion of the primary data is requested by a user or an application, the system may identify a corresponding portion in the backed up copy of the primary data and provide the identified portion to the user or the application in a manner transparent to the user or the application. Alternatively, the application running on the client computing device may send a request for the backup copy of the primary data to the secondary storage device upon determining that the requested data is not available.
Memory system for handling program error and method thereof
A scheme for handling program errors is provided for a memory system which includes a memory device and a controller including firmware and a memory interface. The firmware issues commands for program operations to the memory interface. After detecting a failed program operation in a particular memory block, the firmware reroutes that program operation to a different location in a different memory block and takes further action to reduce the likelihood of a subsequent error occurring in the same memory block in which the failed program operation occurred.
Method and system for host-assisted data recovery assurance for data center storage device architectures
A method of error management includes, in response to a read request for first data from a first storage device of a plurality of storage devices under one or more common data protection schemes, receiving a read uncorrectable indication regarding the first data, obtaining uncorrected data and metadata of an LBA associated with the first data, and obtaining the same LBA from one or more other storage devices of the plurality. The method further includes comparing the uncorrected data with the data and metadata from the other storage devices, speculatively modifying the uncorrected data based, at least in part, on the other data to create a set of reconstructed first data codewords, and, in response to a determination that one of the reconstructed first data codewords has recovered the first data, issuing a write_raw command to rewrite the modified data and associated metadata to the first storage device.
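The compare-and-speculate step can be sketched as follows. This is a simplified stand-in, not the claimed method: it substitutes whole bytes from a single peer copy at positions where the two differ, and it uses a CRC-32 checksum in place of the device's real codeword check. The function name and the `max_positions` limit are assumptions.

```python
import zlib
from itertools import combinations


def speculative_recover(uncorrected: bytes, peer: bytes, expected_crc: int,
                        max_positions: int = 8):
    """Try peer-guided substitutions until one candidate passes the check.

    Returns the recovered bytes, or None if no candidate matches.
    """
    # Positions where the uncorrected data disagrees with the peer's data.
    diff = [i for i, (a, b) in enumerate(zip(uncorrected, peer)) if a != b]
    if len(diff) > max_positions:
        return None  # too many candidates to try exhaustively

    # Try every subset of differing positions, smallest subsets first.
    for size in range(len(diff) + 1):
        for subset in combinations(diff, size):
            candidate = bytearray(uncorrected)
            for i in subset:
                candidate[i] = peer[i]
            if zlib.crc32(candidate) == expected_crc:
                return bytes(candidate)
    return None
```

A recovered result would then be written back to the first storage device, which the abstract describes as a write_raw command.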