G06F11/1645

Providing failover control on a control system

Systems and methods for providing failover control in a control system are provided. For instance, data streams from a plurality of computing nodes in a computing system can be monitored. A first subset of computing nodes can be selected based on the data streams. Control grant signals can be generated for each computing node of the first subset. An output to one or more computing nodes of the first subset can be activated based at least in part on a number of control grant signals generated for each computing node of the first subset. Control authority can then be granted to the one or more computing nodes of the first subset.
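
The grant-counting step can be illustrated with a minimal sketch. It is not the claimed implementation: the heartbeat-age criterion, the per-monitor views, the `quorum` parameter, and the function names are all assumptions introduced for illustration.

```python
from collections import Counter

def select_healthy(stream, max_age):
    """Select nodes whose latest heartbeat is fresh enough (assumed criterion)."""
    return {node for node, age in stream.items() if age <= max_age}

def grant_control(monitor_views, max_age, quorum):
    """Return the nodes whose output would be activated.

    monitor_views: one dict per monitor, mapping node name -> heartbeat age.
    Each monitor emits one control grant signal per node it considers healthy;
    a node is activated once it has collected at least `quorum` grants.
    """
    grants = Counter()
    for view in monitor_views:
        for node in select_healthy(view, max_age):
            grants[node] += 1
    return sorted(node for node, count in grants.items() if count >= quorum)
```

With three monitors and a quorum of two, a node that only one monitor sees as healthy is not granted control authority.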

METHODS FOR MANAGING COMMUNICATIONS INVOLVING A LOCKSTEP PROCESSING SYSTEM
20180336157 · 2018-11-22

A method for managing communications involving a lockstep processing system comprising at least a first processor and a second processor can include receiving, at a data synchronizer, a first signal from a first device. The method can also include receiving, at the data synchronizer, a second signal from a second device. In addition, the method can include determining, by the data synchronizer, whether the first signal is equal to the second signal. When the first signal is equal to the second signal, the method can include transmitting, by the data synchronizer, the first signal to the first processor and the second signal to the second processor. Specifically, in example embodiments, transmitting the first signal to the first processor can occur synchronously with transmitting the second signal to the second processor.
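
A minimal sketch of the compare-then-forward behavior follows. The class name, the list-based processor inboxes, and the choice to withhold both signals on a mismatch are assumptions; the abstract only describes the case where the signals are equal.

```python
class DataSynchronizer:
    """Forwards a pair of signals to two lockstep processors only when they match."""

    def __init__(self, first_inbox, second_inbox):
        # The inboxes are plain lists standing in for the processors' inputs.
        self.first_inbox = first_inbox
        self.second_inbox = second_inbox

    def receive(self, first_signal, second_signal):
        """Return True if the signals were equal and delivered, else False."""
        if first_signal != second_signal:
            return False  # assumed policy: deliver nothing on a mismatch
        # Both deliveries happen in the same step, modelling synchronous transmission.
        self.first_inbox.append(first_signal)
        self.second_inbox.append(second_signal)
        return True
```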

PERIODIC NON-INTRUSIVE DIAGNOSIS OF LOCKSTEP SYSTEMS
20180203778 · 2018-07-19

Aspects disclosed herein relate to periodic non-intrusive diagnosis of lockstep systems. An exemplary method includes comparing execution of a program on a first processing system of a plurality of processing systems and execution of the program on a second processing system of the plurality of processing systems using a first comparator circuit, comparing the execution of the program on the first processing system and the execution of the program on the second processing system using a second comparator circuit, and running a diagnosis program on the second comparator circuit while the comparing using the first comparator circuit is ongoing.

Service Takeover Method, Storage Device, And Service Takeover Apparatus
20180143887 · 2018-05-24

The present disclosure describes example service takeover methods, storage devices, and service takeover apparatuses. In one example, when a communication fault occurs between two storage devices in a storage system, the two storage devices respectively obtain running statuses of the two storage devices. A running status can reflect current usage of one or more system resources of a particular storage device. Then, a delay duration is determined according to the running statuses, where the delay duration is a duration for which the storage device waits before sending an arbitration request to a quorum server. The two storage devices respectively send, after the delay duration, arbitration requests to the quorum server to request to take over a service. The quorum server then can select a storage device in a relatively better running status to take over a host service.
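
One way to picture the delay calculation is the sketch below. The specific resources (CPU and bandwidth usage as fractions), the linear mapping, and the `max_delay_s` parameter are assumptions; the abstract says only that the delay is determined from the running status.

```python
def arbitration_delay(cpu_usage, bandwidth_usage, max_delay_s=5.0):
    """Map resource usage (fractions in [0, 1]) to a wait time in seconds.

    A more heavily loaded device waits longer, so the healthier device's
    arbitration request tends to reach the quorum server first.
    """
    load = max(cpu_usage, bandwidth_usage)
    load = min(max(load, 0.0), 1.0)  # clamp out-of-range readings
    return load * max_delay_s

def first_requester(devices):
    """Return the device name with the shortest delay.

    devices: dict mapping name -> (cpu_usage, bandwidth_usage).
    """
    return min(devices, key=lambda name: arbitration_delay(*devices[name]))
```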

ERROR DETECTION
20180129573 · 2018-05-10

An apparatus 2 comprises at least three processing circuits 4 to perform redundant processing of a common thread of program instructions. Error detection circuitry 16 is provided comprising a number of comparators 22 for detecting a mismatch between signals on corresponding signal nodes 20 in the processing circuits 4. When a comparator 22 detects a mismatch, this triggers a recovery process. The error detection circuitry 16 generates an unresolvable error signal 36 indicating that a detected error is unresolvable by the recovery process when, during the recovery process, a mismatch is detected by one of a proper subset 34 of the comparators 22. By considering fewer comparators 22 during the recovery process than during normal operation, the chances of unrecoverable errors being detected can be reduced, increasing system availability.
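
The difference between normal operation and recovery can be sketched in software as follows. This is only a behavioral model of hardware comparators; the node names, the string return values, and the function names are assumptions.

```python
def mismatched_nodes(node_values):
    """Return the signal nodes whose values differ across the processing circuits.

    node_values: dict mapping node name -> tuple of values, one per circuit.
    """
    return {node for node, values in node_values.items() if len(set(values)) > 1}

def check(node_values, in_recovery, recovery_subset):
    """Classify one comparison cycle.

    Normal operation: a mismatch on any node triggers recovery.
    During recovery: only nodes in `recovery_subset` (a proper subset of the
    compared nodes) are considered, and a mismatch there is unresolvable.
    """
    bad = mismatched_nodes(node_values)
    if in_recovery:
        return "unresolvable" if bad & recovery_subset else "continue"
    return "recover" if bad else "ok"
```

A mismatch on a node outside the subset is ignored while recovery is in progress, which is what reduces the chance of flagging an unrecoverable error.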

A VEHICLE SAFETY ELECTRONIC CONTROL SYSTEM
20180105183 · 2018-04-19

A vehicle safety electronic control system includes a first microcontroller having a lockstep architecture with a lockstep core and a second microcontroller having at least two processing cores. The lockstep core of the first microcontroller is configured to monitor and control outputs of said at least two cores of the second microcontroller.

Asynchronous remote copy system and storage control method
09880910 · 2018-01-30

In a previous storage apparatus, differential journals (JNLs) are reflected to its data volumes in order of their sequential numbers. If a first storage apparatus is suspended, it is determined which is newer: the sequential number of the journal most recently reflected in a second storage apparatus or the sequential number of the journal most recently reflected in a third storage apparatus. In the storage apparatus having the newer sequential number, it is determined whether one or more JNLs exist in the range from the journal whose sequential number follows the older sequential number up to the journal having the newer sequential number. If the result of the determination is positive, those one or more differential JNLs are copied from the newer storage apparatus to the other of the second and third storage apparatuses.

SEMICONDUCTOR DEVICE
20170308445 · 2017-10-26

Conventional semiconductor devices are problematic in that an operation cannot be continued in the event of a failure of one of the CPU cores performing a lock step operation and, as a result, reliability cannot be improved. The semiconductor device according to the present invention includes a computing unit including a first CPU core and a second CPU core that perform a lock step operation, wherein the first CPU core and the second CPU core each diagnose failures of their internal logic circuits, and a sequence control circuit switches, based on the diagnosis results, which CPU core in the computing unit outputs data to a shared resource.

Redundant array of independent disk (RAID) storage recovery

In one embodiment, a system includes a storage subsystem having an array of storage devices; a receiving component for receiving an error message; a determining component for determining that the error message indicates that a storage device has failed; a collecting component for collecting an array record having storage device characteristics of the failed storage device; a collating component for collating a candidate record having a plurality of candidate entries; a comparing component for comparing storage device characteristics of the failed storage device of the array record with the storage device characteristics of each of the candidate entries; and an identifying component for identifying a first candidate storage device having storage device characteristics that match the storage device characteristics of the failed storage device or a second candidate storage device having storage device characteristics most similar to the storage device characteristics of the failed storage device.
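
The comparison and identification steps might look like the sketch below. The characteristic names and the similarity measure (a count of matching characteristics) are assumptions; the abstract does not specify how similarity is computed.

```python
def pick_replacement(failed, candidates):
    """Choose a spare for a failed storage device.

    failed: dict of characteristics of the failed device (from the array record).
    candidates: list of (name, characteristics) entries from the candidate record.
    Returns the name of an exact match if one exists, otherwise the name of the
    candidate sharing the most characteristics, or None if there are no candidates.
    """
    def score(entry):
        _, characteristics = entry
        return sum(1 for key, value in failed.items()
                   if characteristics.get(key) == value)

    if not candidates:
        return None
    # An exact match scores len(failed), the maximum, so it always wins.
    best_name, _ = max(candidates, key=score)
    return best_name
```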

INTEGRITY CHECKING
20250377989 · 2025-12-11

An apparatus has processing circuitry to execute instructions. The processing circuitry has calculation circuitry which is responsive to one or more instructions requiring a calculation to be performed to compute the result of the calculation and approximation circuitry which is responsive to said one or more instructions to calculate an approximate result of the calculation independently of the calculation circuitry. The processing circuitry also has integrity checking circuitry to perform an integrity check by comparing the result of the calculation performed by the calculation circuitry and the approximate result of the calculation performed by the approximation circuitry. The integrity checking circuitry detects an error in the processing circuitry if it is determined that a difference between the result of the calculation and the approximate result of the calculation is greater than a deviation threshold.
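
The threshold comparison can be illustrated for integer multiplication. The truncation-based approximation, the threshold formula, and all function names are assumptions chosen for this sketch, not details from the abstract.

```python
def approximate_product(a, b, drop_bits=4):
    """Cheap estimate of a * b for non-negative ints: ignore each operand's low bits."""
    return ((a >> drop_bits) * (b >> drop_bits)) << (2 * drop_bits)

def error_detected(result, approximate, deviation_threshold):
    """Integrity check: flag an error when the two results differ by too much."""
    return abs(result - approximate) > deviation_threshold

def check_product(a, b, result, drop_bits=4):
    """Check a claimed product of non-negative ints against the approximation.

    Truncating the operands underestimates a * b by less than
    (a + b) * 2**drop_bits, so that bound serves as the deviation threshold
    and a correct result is never flagged.
    """
    threshold = (a + b) << drop_bits
    return error_detected(result, approximate_product(a, b, drop_bits), threshold)
```

A correct product passes, while a result with a high-order bit flipped exceeds the threshold. Errors smaller than the threshold go undetected, which is the trade-off of checking against an approximation.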