G06F2201/805

CRC RAID RECOVERY FROM HARD FAILURE IN MEMORY SYSTEMS

A system and method for memory error recovery in CXL components is presented. The method includes determining that a memory component has sustained a hard failure in a Cyclic Redundancy Check-Redundant Array of Independent Devices (CRC-RAID) mechanism. The method further includes determining a location of the memory component failure, wherein the CRC-RAID mechanism comprises a plurality of memory components configured as a plurality of stripes and initiates a write operation of user data to a location within a particular stripe, wherein the particular stripe contains a failed memory component. The method includes compensating for the failed memory component, wherein the compensating comprises a plurality of read operations prior to a writing of the user data.
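The read-before-write compensation described above can be sketched as follows. This is a minimal illustration only: it assumes a single XOR parity component per stripe and leaves out the CRC check entirely, and the names `xor_bytes`, `degraded_write`, and `reconstruct` are hypothetical, not taken from the abstract.

```python
from functools import reduce


def xor_bytes(*chunks):
    """XOR equally sized byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def degraded_write(stripe, parity_idx, failed_idx, target_idx, user_data):
    """Write user_data into a stripe that contains one failed component.

    stripe is a list of equally sized byte strings; the failed entry is None.
    Assumption: one XOR parity chunk protects the stripe.
    """
    if target_idx == parity_idx:
        raise ValueError("user data cannot target the parity component")
    if failed_idx == parity_idx:
        # Parity is lost, so there is nothing to keep consistent.
        stripe[target_idx] = user_data
    elif target_idx == failed_idx:
        # The data cannot be stored directly: read every surviving data
        # chunk first, then fold the new data into the parity.
        survivors = [chunk for i, chunk in enumerate(stripe)
                     if i not in (failed_idx, parity_idx)]
        stripe[parity_idx] = xor_bytes(user_data, *survivors)
    else:
        # Read old data and old parity before writing (read-modify-write).
        old_data, old_parity = stripe[target_idx], stripe[parity_idx]
        stripe[parity_idx] = xor_bytes(old_parity, old_data, user_data)
        stripe[target_idx] = user_data
    return stripe


def reconstruct(stripe, failed_idx):
    """Rebuild the failed chunk from all surviving chunks, parity included."""
    return xor_bytes(*[c for i, c in enumerate(stripe) if i != failed_idx])
```

In every branch that touches parity, the reads happen before the write, which is the ordering the abstract emphasizes.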

Firmware descriptor resiliency mechanism

An apparatus to facilitate descriptor resiliency in a computer system platform is disclosed. The apparatus comprises a non-volatile memory to store firmware for the computer system platform, wherein the firmware comprises a primary descriptor including access permission details for platform components and a secondary descriptor including a backup copy of the access permission details. The apparatus further comprises a controller, coupled to the non-volatile memory, including recovery hardware to detect a problem with the primary descriptor during a platform reset, recover the contents of the primary descriptor from the backup copy included in the secondary descriptor, and store the contents of the backup copy to the primary descriptor.
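The recovery flow can be sketched in software terms as follows. The abstract does not say how a problem with the primary descriptor is detected; this sketch assumes a CRC-32 stored alongside the permission details, and `Descriptor`, `make_descriptor`, and `recover_on_reset` are hypothetical names.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Descriptor:
    permissions: bytes  # access permission details for platform components
    checksum: int       # assumed integrity field (CRC-32 of permissions)


def make_descriptor(permissions):
    return Descriptor(permissions, zlib.crc32(permissions))


def is_valid(descriptor):
    return zlib.crc32(descriptor.permissions) == descriptor.checksum


def recover_on_reset(flash):
    """Check the primary descriptor at reset; restore it from the backup.

    flash is a dict with "primary" and "secondary" Descriptor entries.
    Returns True if the primary had to be rewritten from the backup copy.
    """
    if is_valid(flash["primary"]):
        return False
    backup = flash["secondary"]
    if not is_valid(backup):
        raise RuntimeError("primary and secondary descriptors are both corrupted")
    # Store the contents of the backup copy to the primary descriptor.
    flash["primary"] = Descriptor(backup.permissions, backup.checksum)
    return True
```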

Incremental vault to object store
11714724 · 2023-08-01

Systems and methods for managing incremental data backups on an object store. A computing device receives first data representing a changed chunk of data in a revision of a data volume on a storage device, where the changed chunk includes data having changes from previous data of a previous revision. The computing device creates a block of data representing a copy of the changed chunk on the object store, where the object store also includes a previous revision block representing previous revision data. The computing device determines a previous index stored on the object store corresponding to the previous revision, which includes entries including at least one corresponding to the previous revision block. The computing device creates a copy of at least one previous index from the object store, and a revised index that updates the corresponding entry with updated entry data representing the change block.
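A minimal sketch of the copy-and-revise index step, assuming the object store behaves like a key-value map and that index entries map chunk offsets to block keys. The key layout (`index/<rev>`, `block/<rev>/<offset>`) and the function names are hypothetical.

```python
def write_incremental(object_store, prev_rev, new_rev, changed_chunks):
    """Store changed chunks as new blocks and publish a revised index.

    object_store: dict-like mapping of keys to values (assumed interface).
    changed_chunks: {chunk_offset: bytes} for chunks that differ from prev_rev.
    """
    previous_index = object_store[f"index/{prev_rev}"]
    revised_index = dict(previous_index)  # copy; the previous index stays intact
    for offset, data in changed_chunks.items():
        block_key = f"block/{new_rev}/{offset}"
        object_store[block_key] = data        # new block for the changed chunk
        revised_index[offset] = block_key     # updated entry points at it
    object_store[f"index/{new_rev}"] = revised_index
    return revised_index


def read_chunk(object_store, rev, offset):
    """Resolve a chunk of a given revision through that revision's index."""
    return object_store[object_store[f"index/{rev}"][offset]]
```

Unchanged entries in the revised index still point at previous revision blocks, so each revision can be read in full without duplicating unchanged data.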

Node level recovery for clustered databases
11567840 · 2023-01-31

An example networked computing system for iterative node level recovery comprises a node cluster; a database; and at least one processor configured by instructions to perform operations comprising at least: identifying a failed node among existing nodes in the node cluster; identifying and initiating a replacement node as a new node for the node cluster; accessing at the database a logical backup of the node cluster; retrieving logical backup data of the node cluster and identifying specific rows of backup data to be restored to the new node; restoring the specific data rows to the new node; identifying new data written by applications to the existing nodes of the node cluster during restoration of the new node; iteratively accessing supplementary backup data to identify supplementary data rows to be restored to the new node; and iteratively restoring the supplementary data rows to the new node until the new node is synchronized with the existing nodes in the node cluster.
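The iterative catch-up loop can be sketched as below. The sketch assumes rows are key-value pairs, that an `owns` predicate decides which rows belong on the new node, and that a `fetch_supplementary` callable returns the rows written since the previous pass. All three are hypothetical stand-ins for mechanisms the abstract does not specify.

```python
def recover_node(base_backup, fetch_supplementary, owns, max_rounds=100):
    """Restore a replacement node, then iterate until it has caught up.

    base_backup: {row_key: value} from the logical backup of the cluster.
    fetch_supplementary: callable returning rows written since the last call.
    owns: predicate selecting the rows that belong on the new node.
    Returns (restored_rows, supplementary_rounds_used).
    """
    # Initial restore: only the specific rows that belong on the new node.
    new_node = {key: value for key, value in base_backup.items() if owns(key)}
    for rounds in range(1, max_rounds + 1):
        delta = fetch_supplementary()
        owned = {key: value for key, value in delta.items() if owns(key)}
        if not owned:
            # Nothing new arrived for this node: treat it as synchronized.
            return new_node, rounds
        new_node.update(owned)
    raise RuntimeError("new node did not converge within max_rounds")
```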

Systems and methods for host image transfer

Methods and systems for transferring a host image of a first machine to a second machine, such as during disaster recovery or migration, are disclosed. In one example, a first profile of a first machine of a first type is compared to a second profile of a second machine of a second type different from the first type, to which the host image is to be transferred. The first and second profiles each comprise at least one property of the first type of first machine and the second type of second machine, respectively. At least one property of a host image of the first machine is conformed to at least one corresponding property of the second machine. The conformed host image is provided to the second machine, via a network. The second machine is configured with at least one conformed property of the host image.
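As a rough illustration of the profile comparison and conforming step, the sketch below treats profiles and host image properties as flat dictionaries. That representation and the name `conform_image` are assumptions made for the example.

```python
def conform_image(image_props, source_profile, target_profile):
    """Conform host image properties to the target machine's profile.

    Any property whose value differs between the two profiles, and which the
    host image carries, is overwritten with the target machine's value.
    Returns (conformed_properties, names_of_changed_properties).
    """
    conformed = dict(image_props)
    changed = []
    for name, target_value in target_profile.items():
        if name in conformed and source_profile.get(name) != target_value:
            conformed[name] = target_value
            changed.append(name)
    return conformed, changed
```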

Optimized disaster-recovery-as-a-service system

Methods, computer program products, and systems are presented. The methods include, for instance: analyzing a dataset associated with a service provided by a data protection service provider in order to determine a policy for when and how to replicate the respective components of the dataset corresponding to the service from a source site to a target site, such that the target site may perform the service with a minimum cost.

METHOD AND SYSTEM FOR OFF-LINE REPAIRING AND SUBSEQUENT REINTEGRATION IN A SYSTEM

There are provided methods and systems for correcting an error from a memory. For example, there is provided a system for mitigating an error in a memory. The system can include a memory controller communicatively coupled to a host. The memory controller may be configured to receive information associated with a memory location. The information can indicate the error at the memory location. The controller may be configured to perform, upon receiving the information, certain operations. The operations can include copying data around the memory location and placing the copied data in a reserved area. The operations can further include outputting, to a central controller, a set of physical addresses associated with the reserved area, wherein the central controller is configured to modify the set of physical addresses to conduct a data recovery off-line.
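One possible reading of "copying data around the memory location" is sketched below. It assumes the neighborhood is a fixed window of `radius` addresses on each side of the faulty address and that the reserved area sits at the top of the same address space. Both are assumptions, as is the name `relocate_neighbors`.

```python
def relocate_neighbors(memory, bad_addr, radius, reserved_base):
    """Copy the cells around bad_addr into the reserved area.

    memory: bytearray modelling physical memory; addresses at or above
    reserved_base form the reserved area (assumed layout).
    Returns {original_address: reserved_address}, i.e. the set of physical
    addresses that would be reported to the central controller.
    """
    mapping = {}
    next_free = reserved_base
    first = max(0, bad_addr - radius)
    last = min(reserved_base, bad_addr + radius + 1)
    for addr in range(first, last):
        if addr == bad_addr:
            continue  # the faulty cell itself is not copied
        memory[next_free] = memory[addr]
        mapping[addr] = next_free
        next_free += 1
    return mapping
```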

System, method, and computer program for a microservice lifecycle operator

As described herein, a system, method, and computer program are provided for a microservice lifecycle operator. In use, at least one specification for a microservice is identified. Further, a lifecycle of the microservice is managed, using a lifecycle operator and the at least one specification.

Refresh management for DRAM

A memory controller interfaces with a dynamic random access memory (DRAM). The memory controller selectively places memory commands in a memory interface queue, and transmits the commands from the memory interface queue to a memory channel connected to at least one DRAM. The transmitted commands are stored in a replay queue. A number of activate commands to a memory region of the DRAM is counted. Based on this count, a refresh control circuit signals that an urgent refresh command should be sent to the memory region. In response to detecting a designated type of error, a recovery sequence initiates to re-transmit memory commands from the replay queue. Designated error conditions can cause the recovery sequence to restart. If an urgent refresh command is pending when such a restart occurs, the recovery sequence is interrupted to allow the urgent refresh command to be sent.
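The interaction between activate counting and the recovery sequence can be modelled roughly as follows. The threshold-based trigger, the class `RefreshManager`, and the function `run_recovery` are illustrative assumptions. The abstract only states that the count drives the urgent refresh signal and that a pending urgent refresh interrupts a restarted recovery sequence.

```python
class RefreshManager:
    """Counts activate commands per memory region (simplified model)."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed trigger level for urgent refresh
        self.activates = {}

    def record_activate(self, region):
        self.activates[region] = self.activates.get(region, 0) + 1

    def urgent_regions(self):
        """Regions whose activate count calls for an urgent refresh."""
        return [r for r, n in self.activates.items() if n >= self.threshold]

    def refreshed(self, region):
        self.activates[region] = 0


def run_recovery(replay_queue, manager, restarted):
    """Return the commands sent on the channel during one recovery pass.

    If the pass is a restart and urgent refreshes are pending, they are sent
    first; the commands in the replay queue are then re-transmitted in order.
    """
    sent = []
    if restarted:
        for region in manager.urgent_regions():
            sent.append(("REFRESH", region))
            manager.refreshed(region)
    sent.extend(replay_queue)
    return sent
```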

Using a track format code in a cache control block for a track in a cache to process read and write requests to the track in the cache

Provided are a computer program product, system, and method for using a track format code in a cache control block for a track in a cache to process read and write requests to the track in the cache. A track format table associates track format codes with track format metadata. A determination is made as to whether the track format table has track format metadata matching track format metadata of a track staged into the cache. A determination is made as to whether a track format code from the track format table for the track format metadata in the track format table matches the track format metadata of the track staged. A cache control block for the track being added to the cache is generated including the determined track format code when the track format table has the matching track format metadata.
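The table lookup described above can be sketched as follows, assuming the track format table is a mapping from codes to metadata records and the cache control block is a plain dictionary. The field names and the function `build_cache_control_block` are hypothetical.

```python
def build_cache_control_block(track_id, track_metadata, track_format_table):
    """Create a cache control block for a track staged into the cache.

    track_format_table: {track_format_code: track_format_metadata}.
    If the table holds metadata matching the staged track, the block records
    the short code; otherwise the code is left unset.
    """
    for code, metadata in track_format_table.items():
        if metadata == track_metadata:
            return {"track": track_id, "format_code": code, "code_valid": True}
    return {"track": track_id, "format_code": None, "code_valid": False}
```

Storing the code rather than the full metadata keeps the control block small; a later read or write request can resolve the code back through the same table.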