G06F11/184

SYSTEM RECOVERY USING A FAILOVER PROCESSOR

Techniques for system recovery using a failover processor are disclosed. A first processor, with a first instruction set, is configured to execute operations of a first type; and a second processor, with a second instruction set different from the first instruction set, is configured to execute operations of a second type. A determination is made that the second processor has failed to execute at least one operation of the second type within a particular period of time. Responsive to determining that the second processor has failed to execute at least one operation of the second type within the particular period of time, the first processor is configured to execute both the operations of the first type and the operations of the second type.
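
The timeout-driven takeover described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `FailoverMonitor`, the `timeout_s` parameter, and the "first"/"second" labels are hypothetical, and the sketch assumes that failover, once triggered, stays in effect.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """Tracks when the second processor last completed a second-type operation."""

    timeout_s: float                # hypothetical stand-in for the "particular period of time"
    last_completion_s: float = 0.0  # time of the most recent second-type completion
    failed_over: bool = False       # assumption: failover is sticky once triggered

    def record_completion(self, now_s: float) -> None:
        # Called whenever the second processor finishes a second-type operation.
        self.last_completion_s = now_s

    def check(self, now_s: float) -> bool:
        # Declare failure when no completion was seen within the timeout.
        if now_s - self.last_completion_s > self.timeout_s:
            self.failed_over = True
        return self.failed_over

    def types_for_first_processor(self) -> set:
        # After failover the first processor handles both operation types.
        return {"first", "second"} if self.failed_over else {"first"}
```

A caller would invoke `check` periodically with the current time and route second-type operations to the first processor once it returns `True`.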

DIVERSE REDUNDANT PROCESSING MODULES FOR ERROR DETECTION
20200019477 · 2020-01-16

In one embodiment, a system has an integrated circuit (IC) device. The IC device includes a first processing unit having a first functional block that has a diversifiable sub-circuit and a result output, and a second processing unit having a second functional block, substantially identical to the first functional block, that includes a corresponding diversifiable sub-circuit and a corresponding result output. The IC device includes a comparator adapted to compare the result output of the first functional block to the result output of the second functional block. The diversifiable sub-circuit of the first functional block operates using a first set of operating parameters. The diversifiable sub-circuit of the second functional block operates using a second set of operating parameters different from the first set of operating parameters.

Disaster recovery for split storage cluster

A method, computer program product and/or computer system assigns access to a quorum disk in a split-storage cluster environment when a communication link between storage systems fails. Access to the quorum disk is based on storage system I/O performance. Priority is given to the storage system that had the higher I/O performance before the link failure. When the communication link fails, both storage systems attempt to access the quorum disk. If the system that first attempts to access the quorum disk is the non-priority storage system, a timer is started. If the priority system attempts to access the quorum disk within a predetermined time interval, the priority system locks the quorum disk and forms the cluster. If the priority system does not attempt to access the quorum disk within the predetermined time interval, the non-priority system locks the quorum disk and forms the cluster.
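
The arbitration rule in this abstract can be sketched as a small decision function. This is an illustration under stated assumptions, not the claimed method: the function name, the `grace_s` parameter, and the use of arrival times measured from the link failure are hypothetical. The abstract does not state what happens when the priority system arrives first, so the sketch assumes it locks the disk immediately, and it treats an arrival exactly at the end of the interval as within it.

```python
def quorum_winner(priority_arrival_s, non_priority_arrival_s, grace_s):
    """Return which storage system locks the quorum disk.

    Arrival times are seconds after the link failure; None means the
    system never attempts access. Returns "priority", "non_priority",
    or None when neither system attempts access.
    """
    if priority_arrival_s is None and non_priority_arrival_s is None:
        return None
    if non_priority_arrival_s is None:
        return "priority"
    if priority_arrival_s is None:
        return "non_priority"
    if priority_arrival_s <= non_priority_arrival_s:
        # Assumption: a priority system that arrives first locks at once.
        return "priority"
    # The non-priority system arrived first, so the timer starts at its arrival.
    waited_s = priority_arrival_s - non_priority_arrival_s
    return "priority" if waited_s <= grace_s else "non_priority"
```

For example, with `grace_s=0.5` a priority system that arrives 0.1 s after the non-priority system still wins, while one that arrives 0.9 s later does not.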

Mission-critical computing architecture

Operational faults, including transient faults, are detected within computing hardware for mission-critical applications. Operational requests received from a requestor node are to be processed by shared agents to produce corresponding responses. A first request is duplicated to be redundantly processed independently and asynchronously by distinct shared agents to produce redundant counterpart responses including a first redundant response and a second redundant response. The first redundant response is compared against the second redundant response. In response to a match, the redundant responses are merged to produce a single final response to the first request to be read by the requestor node. In response to a non-match, an exception response is performed.
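
The duplicate, compare, and merge flow above can be sketched as follows. This is a minimal illustration, not the disclosed architecture: `process_redundantly`, `ResponseMismatch`, and the callable agents are hypothetical names, a thread pool stands in for independent asynchronous agents, and raising an exception stands in for the unspecified "exception response".

```python
from concurrent.futures import ThreadPoolExecutor


class ResponseMismatch(Exception):
    """Raised when the two redundant responses do not match."""


def process_redundantly(request, agent_a, agent_b):
    """Send one request to two distinct agents and compare their responses."""
    # Duplicate the request and let both agents work on it concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(agent_a, request)
        future_b = pool.submit(agent_b, request)
        first, second = future_a.result(), future_b.result()
    if first == second:
        # Match: the redundant responses collapse into one final response.
        return first
    # Non-match: assumption that the exception response is a raised error.
    raise ResponseMismatch(f"redundant responses differ: {first!r} != {second!r}")
```

A transient fault in either agent then surfaces as a `ResponseMismatch` instead of reaching the requestor as a silently wrong response.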

Debug trace streams for core synchronization
11934295 · 2024-03-19

The present disclosure provides for synchronization of multi-core systems by monitoring a plurality of debug trace data streams for a redundantly operating system that includes a corresponding plurality of cores performing a task in parallel and, in response to detecting a state difference on one debug trace data stream relative to the other debug trace data streams, marking the core associated with that debug trace data stream as an affected core and restarting the affected core.
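
One way to picture the detection step is a comparison of the state each core reports on its trace stream at the same point in the task. The sketch below is an assumption-laden illustration, not the disclosed mechanism: the function names and the `restart` callback are hypothetical, each stream is reduced to one hashable state value, and a strict majority vote decides which stream is the odd one out.

```python
from collections import Counter


def find_affected_cores(states):
    """Return the cores whose trace state differs from the majority state.

    `states` maps a core id to the state seen on that core's debug trace
    stream. Assumption: a strict majority of cores defines the reference.
    """
    if not states:
        return []
    majority_state, majority_count = Counter(states.values()).most_common(1)[0]
    if majority_count * 2 <= len(states):
        raise ValueError("no strict majority, cannot attribute the difference")
    return sorted(core for core, state in states.items() if state != majority_state)


def resynchronize(states, restart):
    """Mark the affected cores and restart each through the given callback."""
    affected = find_affected_cores(states)
    for core in affected:
        restart(core)
    return affected
```

With three cores reporting `0xA1`, `0xA1`, and `0xFF`, only the third core is marked and restarted.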

DISTRIBUTED COMPUTING IN A PROCESS CONTROL ENVIRONMENT

High availability and data migration in a distributed process control computing environment. Allocation algorithms distribute data and applications among available compute nodes, such as controllers in a process control system. In the process control system, an input/output device, such as a fieldbus module, can be used by any controller. Databases store critical execution information for immediate takeover by a backup compute element. The compute nodes are configured to execute algorithms for mitigating dead time in the distributed computing environment.

REDUNDANT PROCESSOR ARCHITECTURE
20190361764 · 2019-11-28

The present disclosure relates to an assembly including a first processor having a first core, a second core, and a controller, and a second processor having a first core, wherein the first core and the second core of the first processor and the first core of the second processor are configured to execute a first procedure. The controller of the first processor is configured to compare a first result from executing the first procedure on the first core of the first processor with a second result from executing the first procedure on the second core of the first processor and, if the first and second results differ from one another, to compare each of the first and second results with a third result from executing the first procedure on the first core of the second processor.
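
The controller's two-stage comparison can be sketched as below. The abstract stops at the comparison itself, so what the sketch does with the outcome (return the agreed value and name the suspect core) is an assumption, and the function name and labels are hypothetical.

```python
def check_results(first, second, third):
    """Two-stage comparison of results from three cores.

    `first` and `second` come from the two cores of the first processor,
    `third` from the core of the second processor. Returns a tuple
    (agreed_value, suspect), where suspect names the disagreeing core.
    """
    if first == second:
        # Stage 1: the two local cores agree, so the third result is not consulted.
        return first, None
    # Stage 2: the local results differ, so compare each with the third result.
    if first == third:
        return first, "second_core"
    if second == third:
        return second, "first_core"
    # Assumption: with no two results agreeing, the fault cannot be attributed.
    return None, "unresolved"
```

The third result acts as a tiebreaker only on a mismatch, which keeps the common case to a single comparison.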

Multiplexing system, multiplexing method, and computer program product

According to an embodiment, a multiplexing system includes servers. Each server includes a memory, a processing unit, a decision controller, and a restoring unit. The memory is configured to store internal data. The processing unit is configured to output, as first data, either deterministic data or nondeterministic data. The deterministic data is uniquely determined by an operation based on input data and the internal data that has not yet been processed. When the processing unit cannot determine a deterministic output, it outputs the nondeterministic data. The decision controller is configured to select either the first data output from the server itself or the first data output from another server, and to decide the selected first data as second data. The restoring unit is configured to, when the second data is the nondeterministic data, restore the internal data to the state it had before being processed.
