Patent classifications
G06F11/2043
Method and apparatus for performing node information exchange management of all flash array server
A method and apparatus for performing node information exchange management of an all flash array (AFA) server are provided. The method may include: utilizing a hardware manager module among multiple program modules running on any node of multiple nodes of the AFA server to control multiple hardware components in a hardware layer of the any node, for establishing a Board Management Controller (BMC) path between the any node and a remote node among the multiple nodes; utilizing at least two communications paths to exchange respective node information of the any node and the remote node, to control a high availability (HA) architecture of the AFA server according to the respective node information of the any node and the remote node, for continuously providing a service to a user of the AFA server; and in response to malfunction of any communications path, utilizing remaining communications path(s) to exchange the node information.
MEDIATOR ASSISTED SWITCHOVER BETWEEN CLUSTERS
Techniques are provided for metadata management for enabling automated switchover. An initial quorum vote may be performed before a node executes an operation associated with metadata comprising operational information and switchover information. After the initial quorum vote is performed, the node executes the operation upon one or more mailbox storage devices. Once the operation has executed, a final quorum vote is performed. The final quorum vote and the initial quorum vote are compared to determine whether the operation is to be designated as successful or failed, and whether any additional actions are to be performed.
High availability for persistent memory
Techniques for implementing high availability for persistent memory are provided. In one embodiment, a first computer system can detect an alternating current (AC) power loss/cycle event and, in response to the event, can save data in a persistent memory of the first computer system to a memory or storage device that is remote from the first computer system and is accessible by a second computer system. The first computer system can then generate a signal for the second computer system subsequently to initiating or completing the save process, thereby allowing the second computer system to restore the saved data from the memory or storage device into its own persistent memory.
System and method for reducing failover times in a redundant management module configuration
While the management module of an information handling system is set as a standby module, an enclosure controller provides first requests for attribute data of the information handling system, and receives and stores first response data for attribute data associated with a first subset of the first requests in a local memory of the enclosure controller. The enclosure controller receives request failure responses associated with a second subset of the first requests directed to a subset of the attributes data for the information handling system stored in a shared memory. While the management module is set as an active module, the management module is granted access to the shared memory. The enclosure controller provides retry requests for attributes associated with the request failure responses, and receives and stores second response data associated with the retry requests in the local memory.
Incremental file system backup using a pseudo-virtual disk
Methods and systems for backing up and restoring sets of electronic files using sets of pseudo-virtual disks are described. The sets of electronic files may be sourced from or be stored using one or more different data sources including one or more real machines and/or one or more virtual machines. A first snapshot of the sets of electronic files may be aggregated from the different data sources and stored using a first pseudo-virtual disk. A second snapshot of the sets of electronic files may be aggregated from the different data sources subsequent to the generation of the first pseudo-virtual disk and stored using the first pseudo-virtual disk or a second pseudo-virtual disk different from the first pseudo-virtual disk.
CPU hot-swapping
There is disclosed in one example a multi-core computing system configured to provide a hot-swappable CPU0, including: a first CPU in a first CPU socket and a second CPU in a second CPU socket; a switch including a first media interface to the first CPU socket and a second media interface to the second CPU socket; and one or more mediums including non-transitory instructions to detect a hot swap event of the first CPU, designate the second CPU as CPU0, determine that a new CPU has replaced the first CPU, operate the switch to communicatively couple the new CPU to a backup initialization code store via the first media interface, initialize the new CPU, and designate the new CPU as CPUN, wherein N≠0.
CPU-GPU LOCKSTEP SYSTEM
A lockstep controller operates a lockstep system of three or more CPU-GPU pairs, comparing the outputs from the CPU-GPU pairs and, by way of a majority vote, provides the output for the lockstep system. Based on comparing the outputs, if one of the CPU-GPU pairs provides outputs that disagree with the majority outputs, it can be switched out of the lockstep system. The removed CPU is replaced by a backup CPU. So that the backup CPU can be part of a CPU-GPU pair, a portion of the address space from the GPU of one of the other CPU-GPU pairs is assigned to the backup CPU to operate as a replacement CPU-GPU pair, while the CPU already associated with this GPU retains another portion of the GPU's address space to continue operating as a CPU-GPU pair.
SERVER RECOVERY FROM A CHANGE IN STORAGE CONTROL CHIP
A method comprises configuring an address-to-SC unit (A2SU) of each of a plurality of CPU chips based on a number of valid SC chips in the computer system. Each of the plurality of CPU chips is coupled to each of the SC chips in a leaf-spine topology. The A2SU is configured to correlate each of a plurality of memory addresses with a respective one of the valid SC chips. The method further comprises, in response to detecting a change in the number of valid SC chips, pausing operation of the computer system including operation of a cache of each of the plurality of CPU chips; while operation of the computer system is paused, reconfiguring the A2SU in each of the plurality of CPU chips based on the change in the number of valid SC chips; and in response to reconfiguring the A2SU, resuming operation of the computer system.
Multi-processor system with distributed mailbox architecture and communication method thereof
A multi-processor system with a distributed mailbox architecture and a communication method thereof are provided. The multi-processor system comprises a plurality of processors, each of the processors is correspondingly configured with an exclusive mailbox and an exclusive channel, and the communication method comprises the following steps. When a first processor of the processors needs to communicate with a second processor, the first processor writes data into the exclusive mailbox of the second processor through a public bus; and when the writing of the data has completed, the exclusive mailbox of the second processor sends an interrupt signal to the second processor, after receiving the interrupt signal, the second processor reads the data in the exclusive mailbox through the corresponding exclusive channel.
SYNCHRONIZED MEMORY REPAIR SYSTEM (SRS)
A system, program product, and method for processing synchronized memory repairs. The method includes identifying a faulty memory row from a plurality of functioning memory rows in a memory array. The method also includes executing memory row repair operations directed toward the faulty memory row and identifying a repair row to operationally replace the faulty memory row. The method also includes creating a multiple hot state within a memory decoder. The memory decoder includes logic circuitry for executing operation of the plurality of functioning memory rows. The method further includes activating a wordline of the identified repair row through the multiple hot state, and executing one or more memory operations on the identified repair row though the memory decoder. Accordingly, the embodiments disclosed herein facilitate synchronization of the repair row and functioning memory rows within the memory array, as well as any associated peripheral signals.