G06F11/181

MANAGEMENT COMPUTER AND RESOURCE MANAGEMENT METHOD

A management computer has a memory that stores management information and management programs, and a CPU that refers to the management information and executes the management programs. The management information includes storage management information for determining whether a plurality of storage resources can be paired in a redundant configuration, and couplable configuration management information for determining whether the plurality of storage resources and a plurality of server resources can be connected to each other. When the CPU deploys a virtual machine, it first determines, by reference to the storage management information, the storage resources to be paired in a redundant configuration. It then selects, by reference to the couplable configuration management information, server resources each of which can be connected to a respective one of the paired storage resources, and pairs the selected server resources in the redundant configuration.
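The storage-first ordering described above can be sketched as follows. This is only an illustration: the function name, the data shapes (`pairable`, `connectable`), and the requirement that the two servers be distinct are assumptions, not details from the abstract.

```python
def plan_redundant_deployment(pairable, connectable):
    """Pick a storage pair first, then servers that can reach each storage.

    pairable:    iterable of 2-element frozensets of storage ids that may be
                 paired redundantly (hypothetical storage management info).
    connectable: dict mapping storage id -> list of server ids that can be
                 connected to it (hypothetical couplable configuration info).
    Returns a dict describing the plan, or None if no plan exists.
    """
    # Step 1: consider storage pairs in a deterministic order.
    for s1, s2 in sorted(tuple(sorted(p)) for p in pairable):
        # Step 2: select one server per storage resource in the pair.
        for srv1 in connectable.get(s1, []):
            for srv2 in connectable.get(s2, []):
                # Assumption: a redundant server pair needs two distinct servers.
                if srv1 != srv2:
                    return {"storage_pair": (s1, s2),
                            "server_pair": (srv1, srv2)}
    return None
```

Choosing the storage pair before the servers means the server search is restricted to servers that can actually reach the chosen storage, which is the point of the ordering in the abstract.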

Redundancy control device for aircraft
11552640 · 2023-01-10

The redundancy control device includes three controllers that output status signals, a majority voting circuit to which a first voltage or a second voltage is supplied as an output signal through an output line of each controller, a switch provided in each output line, a voltage supply unit provided for each output line to supply the second voltage to the output line when the first voltage is lost, a latch circuit provided for each output line to latch the second voltage when the second voltage is supplied thereto and continue to output the second voltage, a comparison circuit provided for each controller to output a comparison signal based on a comparison of the status signals, and a switch control unit provided for each switch to output a switch signal to the switch in response to the comparison signal from the comparison circuit.

Updating Counters Distributed Across a Plurality of Nodes
20220413953 · 2022-12-29

Techniques are disclosed relating to methods that include initializing, by a computer in a computer system, an event counter that includes a plurality of sub-counter groups, each sub-counter group including at least two sub-counters located on different nodes of a plurality of nodes in the computer system. In response to an occurrence of an event associated with the event counter, the method may include the computer selecting a particular sub-counter group of the plurality of sub-counter groups to update, and sending, to sub-counters corresponding to the particular sub-counter group, a request to update a sub-counter value for the particular sub-counter group. In response to a request for a current count value of the event counter, the method may include outputting, by the computer, a sum of the sub-counter values for the plurality of sub-counter groups as the current count value.
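A minimal in-memory sketch of such a counter is given below. It assumes that the sub-counters within a group are replicas of one value and that the group to update is chosen at random; neither detail is stated in the abstract, and the class and method names are hypothetical.

```python
import random


class EventCounter:
    """Event counter split into sub-counter groups (illustrative sketch)."""

    def __init__(self, num_groups, nodes_per_group=2, seed=None):
        if nodes_per_group < 2:
            raise ValueError("each group needs sub-counters on at least two nodes")
        # groups[g][r] models the sub-counter of group g held on its r-th node.
        self.groups = [[0] * nodes_per_group for _ in range(num_groups)]
        self._rng = random.Random(seed)

    def record_event(self):
        # Assumption: the group to update is picked uniformly at random.
        g = self._rng.randrange(len(self.groups))
        # Model "sending an update request" to every sub-counter in the group.
        for r in range(len(self.groups[g])):
            self.groups[g][r] += 1
        return g

    def current_count(self):
        # One value per group; max() tolerates a lagging replica in this model.
        return sum(max(replicas) for replicas in self.groups)
```

Spreading updates across groups reduces contention on any single sub-counter, while the sum over groups still yields the total number of events.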

Parallel processing system runtime state reload
11526409 · 2022-12-13

A parallel processing system includes at least three processors operating in parallel, state monitoring circuitry, and state reload circuitry. The state monitoring circuitry couples to the at least three parallel processors and is configured to monitor runtime states of the at least three parallel processors and identify a first processor of the at least three parallel processors having at least one runtime state error. The state reload circuitry couples to the at least three parallel processors and is configured to select a second processor of the at least three parallel processors for state reload, access a runtime state of the second processor, and load the runtime state of the second processor into the first processor. Monitoring and reload may be performed only on sub-systems of the at least three parallel processors. During reload, clocks and supply voltages of the processors may be altered. The state reload may relate to sub-systems.
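The identify-then-reload flow can be modeled in software as shown below. The abstract does not say how a runtime state error is detected; this sketch assumes a simple majority comparison across processors, and the function name and the tuple representation of state are hypothetical.

```python
from collections import Counter


def reload_faulty(states):
    """Return (new_states, donor_index, repaired_indices).

    states: list with one hashable runtime state per processor (three or more).
    Assumption: a processor whose state differs from the majority has an error.
    """
    if len(states) < 3:
        raise ValueError("at least three processors are required")
    majority_state, votes = Counter(states).most_common(1)[0]
    if votes <= len(states) // 2:
        raise RuntimeError("no majority state; cannot choose a donor")
    # The donor is the first processor holding the majority state.
    donor = states.index(majority_state)
    # Every processor that disagrees gets the donor's state loaded into it.
    repaired = [i for i, s in enumerate(states) if s != majority_state]
    new_states = [majority_state] * len(states)
    return new_states, donor, repaired
```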

MEDIATOR ASSISTED SWITCHOVER BETWEEN CLUSTERS

Techniques are provided for metadata management for enabling automated switchover. An initial quorum vote may be performed before a node executes an operation associated with metadata comprising operational information and switchover information. After the initial quorum vote is performed, the node executes the operation upon one or more mailbox storage devices. Once the operation has executed, a final quorum vote is performed. The final quorum vote and the initial quorum vote are compared to determine whether the operation is to be designated as successful or failed, and whether any additional actions are to be performed.

PARALLEL PROCESSING SYSTEM RUNTIME STATE RELOAD
20230102197 · 2023-03-30

A parallel processing system includes at least three parallel processors, state monitoring circuitry, and state reload circuitry. The state monitoring circuitry couples to the at least three parallel processors and is configured to monitor runtime states of the at least three parallel processors and identify a first processor of the at least three parallel processors having at least one runtime state error. The state reload circuitry couples to the at least three parallel processors and is configured to select a second processor of the at least three parallel processors for state reload, access a runtime state of the second processor, and load the runtime state of the second processor into the first processor. Monitoring and reload may be performed only on sub-systems of the at least three parallel processors. During reload, clocks and supply voltages of the processors may be altered. The state reload may relate to sub-systems.

METHOD, ELECTRONIC DEVICE, AND PROGRAM PRODUCT FOR FAILURE HANDLING
20230086852 · 2023-03-23

Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for failure handling. This failure handling method includes determining a sector set failure type associated with at least one failed sector set of a disk; if the sector set failure type indicates that the number of failed sector sets in the at least one failed sector set is greater than a first threshold number, generating an instruction for replacing the disk; and otherwise performing at least one of the following: migrating data from a failed sector set in which the number of failed sectors is greater than a second threshold number to a spare sector set, and performing a failure recovery for a failed sector set in which the number of failed sectors is less than or equal to the second threshold number.
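The two-threshold decision can be sketched as a small function. The function name, the action labels, and the dictionary input are illustrative assumptions; only the threshold comparisons come from the abstract.

```python
def handle_disk_failure(failed_sector_sets, max_failed_sets, max_failed_sectors):
    """Decide how to handle failed sector sets on one disk.

    failed_sector_sets: dict mapping sector-set id -> number of failed sectors.
    max_failed_sets:    first threshold (number of failed sector sets).
    max_failed_sectors: second threshold (failed sectors within one set).
    Returns a list of (action, sector_set_id) tuples.
    """
    # Too many failed sector sets: replace the whole disk.
    if len(failed_sector_sets) > max_failed_sets:
        return [("replace_disk", None)]
    actions = []
    for set_id, failed in sorted(failed_sector_sets.items()):
        if failed > max_failed_sectors:
            # Heavily damaged set: move its data to a spare sector set.
            actions.append(("migrate_to_spare", set_id))
        else:
            # Lightly damaged set: attempt failure recovery.
            actions.append(("recover_in_place", set_id))
    return actions
```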

Node Failure Detection and Resolution in Distributed Databases
20230078926 · 2023-03-16

Methods and systems to detect and resolve failure in a distributed database system are described herein. A first node in the distributed database system can detect an interruption in communication with at least one other node in the distributed database system. This indicates a network failure. In response to detection of this failure, the first node starts a failure resolution protocol. This invokes coordinated broadcasts of respective lists of suspicious nodes among neighbor nodes. Each node compares its own list of suspicious nodes with its neighbors' lists of suspicious nodes to determine which nodes are still directly connected to each other. Each node determines the largest group of these directly connected nodes and whether or not it is in that group. If a node isn't in that group, it fails itself to resolve the network failure.
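The final step, finding the largest group of directly connected nodes, can be illustrated with a brute-force search that suits small clusters. The abstract does not say how the group is computed or how ties are broken; the exhaustive search, the alphabetical tie-break, and the function names are assumptions of this sketch.

```python
from itertools import combinations


def surviving_group(suspects):
    """Return the largest set of nodes that are all directly connected.

    suspects: dict mapping node -> set of nodes it considers suspicious,
              as collected from the broadcast lists.
    """
    nodes = sorted(suspects)

    def connected(a, b):
        # Two nodes are directly connected if neither suspects the other.
        return b not in suspects[a] and a not in suspects[b]

    # Try group sizes from largest to smallest; the first hit wins.
    for size in range(len(nodes), 0, -1):
        for group in combinations(nodes, size):
            if all(connected(a, b) for a, b in combinations(group, 2)):
                return set(group)
    return set()


def should_fail(node, suspects):
    # A node outside the surviving group fails itself.
    return node not in surviving_group(suspects)
```

Because every node evaluates the same lists with the same deterministic rule, all nodes reach the same conclusion about which group survives.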

Database system with designated leader and methods for use therewith

A networked database management system (DBMS) is disclosed. In particular, the disclosed DBMS includes a plurality of nodes, one of which is elected as a designated leader. The designated leader is elected using a consensus algorithm, such as tabulated random votes, Raft, or Paxos. The designated leader is responsible for managing open coding lines and determining when to close an open coding line.

Multiprocessor system
09846666 · 2017-12-19

The present invention realizes functional safety of a multiprocessor system without tightly coupling processor elements. When a plurality of processor elements are caused to execute the same data processing in order to realize functional safety of the processor elements, a bus interface unit is adopted that performs safety measure processing when non-coincidence of the access requests issued from the processor elements has been determined, and starts access processing in response to the access request when these access requests coincide with one another.