G06F11/1687

Heartbeat in failover cluster

This disclosure is directed to heartbeat messages for a high-availability redundant distributed computing environment (e.g., a quorum data storage implementation). Heartbeat messages may be sent from a node to other members of a cluster as an indication of hardware state, network connectivity, and possibly application status on the sending node. If a member of a cluster (or quorum) goes silent (e.g., misses heartbeats), other members of the cluster (or quorum) may consider that member (or the node that hosts that member) to be non-functional and may initiate a recovery action. Techniques are disclosed for using low-latency non-persistent storage for some heartbeat messages (referred to herein as non-persistent heartbeat messages) to replace a portion of typical persistent (e.g., disk-based) heartbeat messages to reduce overall processing for periodic heartbeat messages. Further, implementations that aggregate multiple heartbeat messages from a node into a smaller number of heartbeat messages are disclosed.
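The silent-member check described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `HeartbeatMonitor` class, the one-second interval, and the three-missed-beats limit are all assumptions chosen for illustration, and the sketch does not distinguish persistent from non-persistent heartbeats.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each cluster member (hypothetical helper)."""

    interval_s: float = 1.0   # assumed heartbeat period, not from the source
    missed_limit: int = 3     # assumed number of missed beats before a member is suspect
    last_seen: dict = field(default_factory=dict)

    def record(self, member: str, now: float) -> None:
        # Called whenever a heartbeat from `member` arrives.
        self.last_seen[member] = now

    def silent_members(self, now: float) -> list:
        # A member counts as silent once more than `missed_limit` intervals
        # have elapsed since its last recorded heartbeat.
        deadline = self.interval_s * self.missed_limit
        return sorted(m for m, t in self.last_seen.items() if now - t > deadline)


monitor = HeartbeatMonitor()
monitor.record("node-a", now=0.0)
monitor.record("node-b", now=0.0)
monitor.record("node-a", now=3.0)        # node-a keeps reporting, node-b goes quiet
print(monitor.silent_members(now=4.0))   # members that would trigger a recovery action
```

Passing `now` explicitly instead of reading a clock keeps the sketch deterministic; a real cluster member would use a monotonic clock.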

Lockstep processing systems and methods
10725873 · 2020-07-28

The present techniques generally relate to a method of monitoring for a fault event in a lockstep processing system having a plurality of cores configured to operate in lockstep, the method having: power gating, for a period of time, a subset of cores of the plurality of cores from a first power source and providing power to the subset of cores from a second power source for the period of time; processing, at each of the cores of the plurality of cores, one or more instructions; providing an output from each core of the plurality of cores to error detection circuitry to monitor for the fault event, the output from each core based on or in response to processing the one or more instructions during the period of time.

Methods of guaranteed reception of common signals in an avionics system comprising a plurality of electronic computers
10715279 · 2020-07-14

Methods of guaranteed reception and of processing of a digital signal in an avionics system comprising a plurality of computers, each computer comprising processing electronics and a software layer which, on receipt of an event, carries out the following steps: at a first instant, sending to each of the other computers a first signal (ACK) of reception of the event; at a second instant termed TimeOut ACK, if the electronic computer has not received one of the first signals emanating from one of the other computers, sending a second failure signal (FAIL) to each of the other computers; at a third instant termed TimeOut GARANTEED, if a second failure signal has been received by the computer, the event is not taken into account by the computer, and if no failure signal has been received by the computer, the event is taken into account by the data processing electronics of the computer.
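The three steps in this abstract can be sketched as a small decision function. This is an illustration under stated assumptions, not the claimed method: the function name `decide`, the `acks` mapping, reliable delivery of FAIL signals, and the rule that a computer which sends FAIL also discards the event are all assumptions that the abstract does not state.

```python
def decide(computers, acks):
    """Return, per computer, whether the event is taken into account.

    acks[c] is the set of peers whose ACK reached computer c before TimeOut ACK.
    """
    # At TimeOut ACK: a computer missing an ACK from any peer sends FAIL to all peers.
    fail_senders = {c for c in computers if (set(computers) - {c}) - acks[c]}

    decisions = {}
    for c in computers:
        # At TimeOut GARANTEED: FAIL delivery is assumed reliable in this sketch.
        received_fail = bool(fail_senders - {c})
        # Assumption: a computer that sent FAIL also discards the event.
        sent_fail = c in fail_senders
        decisions[c] = not (received_fail or sent_fail)
    return decisions


computers = ["A", "B", "C"]
all_acked = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
one_lost = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A"}}  # C never saw B's ACK

print(decide(computers, all_acked))
print(decide(computers, one_lost))
```

Under these assumptions every computer reaches the same decision, which is the property the two timeouts are meant to provide.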

METHODS, APPARATUSES AND SYSTEMS FOR CLOUD-BASED DISASTER RECOVERY

Methods, apparatuses and systems for cloud-based disaster recovery are provided. The method, for example, includes receiving, at a cloud-based computing platform, backup information associated with backup vendors used by a client machine; storing, at the cloud-based computing platform, the backup information associated with the backup vendors; periodically updating, at the cloud-based computing platform, the backup information associated with each of the backup vendors at a predetermined polling interval for each of the backup vendors; receiving, at the cloud-based computing platform from the client machine, a failure indication for a server associated with at least one of the backup vendors; and restoring the server using the stored backup information at the cloud-based computing platform.

Method for processing data for a driving function of a vehicle

A method for processing data for a driving function of a vehicle is described, a predefined quantity of computation units being provided; the computation units supplying data, in particular redundant data, to a decision unit; the decision unit deciding, based on a comparison of the data delivered by the computation units, whether the data are correct; a synchronization unit being provided; the synchronization unit synchronizing the computation units in such a way that the computation units deliver the data to the decision unit in a specified time period; and the synchronization unit informing the decision unit as to when the data are transmitted by the computation units, so that the decision unit can specify which data of the computation units are used for a check of the data.
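As a rough illustration of the decision unit's role, the sketch below keeps only the data delivered inside the time period announced by the synchronization unit and then compares them. The abstract says only that the decision is based on a comparison; the strict majority vote used here, the function name `check_data`, and the `(timestamp, value)` sample format are assumptions.

```python
from collections import Counter


def check_data(samples, window_start, window_end):
    """Return the agreed value, or None if the in-window data do not agree.

    samples is a list of (timestamp, value) pairs, one per computation unit.
    """
    # Use only data that arrived within the period reported by the synchronization unit.
    in_window = [value for (t, value) in samples if window_start <= t <= window_end]
    if not in_window:
        return None
    # Assumed comparison rule: accept a value only if a strict majority delivered it.
    value, count = Counter(in_window).most_common(1)[0]
    return value if count > len(in_window) / 2 else None


samples = [(0.010, 42), (0.012, 42), (0.030, 41)]
print(check_data(samples, 0.0, 0.020))  # late sample is excluded from the check
print(check_data(samples, 0.0, 0.050))  # all three samples are compared
```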

PROCESSOR FOR DETECTING AND PREVENTING RECOGNITION ERROR
20200167245 · 2020-05-28

Provided is an image recognition processor. The image recognition processor includes a plurality of nano cores each configured to perform a pattern recognition operation and arranged in rows and columns, an instruction memory configured to provide instructions to the plurality of nano cores in a row unit, a feature memory configured to provide input features to the plurality of nano cores in a row unit, a kernel memory configured to provide a kernel coefficient to the plurality of nano cores in a column unit, and a difference checker configured to receive a result of the pattern recognition operation of each of the plurality of nano cores, detect whether there is an error by referring to the received result, and provide a fault tolerance function that allows an error below a predefined level.

HEARTBEAT IN FAILOVER CLUSTER

This disclosure is directed to heartbeat messages for a high-availability redundant distributed computing environment (e.g., a quorum data storage implementation). Heartbeat messages may be sent from a node to other members of a cluster as an indication of hardware state, network connectivity, and possibly application status on the sending node. If a member of a cluster (or quorum) goes silent (e.g., misses heartbeats), other members of the cluster (or quorum) may consider that member (or the node that hosts that member) to be non-functional and may initiate a recovery action. Techniques are disclosed for using low-latency non-persistent storage for some heartbeat messages (referred to herein as non-persistent heartbeat messages) to replace a portion of typical persistent (e.g., disk-based) heartbeat messages to reduce overall processing for periodic heartbeat messages. Further, implementations that aggregate multiple heartbeat messages from a node into a smaller number of heartbeat messages are disclosed.

Assigning a control authorization to a computer

The invention relates to a system (1), comprising at least two asynchronous computers (2-i), on each of which at least one application (A) is executed, which provides control data (SD) for at least one actuation system (3), wherein the provided control data (SD) are transmitted by a control-authorized computer (2-i) that assumes a master computer status (M-RS) to the actuation system (3) for the control thereof, wherein the computers (2-i) of the system (1) cyclically exchange state data (ZD) and performance data (LD) with each other by means of a data interface in a data exchange (DAS), wherein the computers (2-i) each determine, on the basis of the state and performance data (ZD_opp, LD_opp) received from other computers (2-j) and on the basis of the computer's own state and performance data (ZD_own, LD_own), in a master/slave selection (MSA) performed on the computer (2-i), a computer status (RS) as a control-authorized or non-control-authorized computer (2-i) to be assumed by the particular computer (2-i) itself.

SYSTEM RECOVERY USING A FAILOVER PROCESSOR

Techniques for system recovery using a failover processor are disclosed. A first processor, with a first instruction set, is configured to execute operations of a first type; and a second processor, with a second instruction set different from the first instruction set, is configured to execute operations of a second type. A determination is made that the second processor has failed to execute at least one operation of the second type within a particular period of time. Responsive to determining that the second processor has failed to execute at least one operation of the second type within the particular period of time, the first processor is configured to execute both the operations of the first type and the operations of the second type.
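The takeover condition in this abstract can be sketched as a simple timeout check. The function name `assign_operations`, the processor labels, and the use of a last-completion timestamp to detect the failure are assumptions for illustration; the abstract does not say how the determination is made.

```python
def assign_operations(now: float, last_type2_done: float, period: float) -> dict:
    """Return which operation types each processor should run (hypothetical names)."""
    # The second processor is considered failed if it has not completed an
    # operation of the second type within the allowed period.
    second_failed = (now - last_type2_done) > period
    if second_failed:
        # The first processor takes over both types of operations.
        return {"first": ["type1", "type2"], "second": []}
    return {"first": ["type1"], "second": ["type2"]}


print(assign_operations(now=10.0, last_type2_done=9.5, period=1.0))
print(assign_operations(now=10.0, last_type2_done=8.0, period=1.0))
```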

LOCKSTEP PROCESSING SYSTEMS AND METHODS
20190370130 · 2019-12-05

The present techniques generally relate to a method of monitoring for a fault event in a lockstep processing system having a plurality of cores configured to operate in lockstep, the method having: power gating, for a period of time, a subset of cores of the plurality of cores from a first power source and providing power to the subset of cores from a second power source for the period of time; processing, at each of the cores of the plurality of cores, one or more instructions; providing an output from each core of the plurality of cores to error detection circuitry to monitor for the fault event, the output from each core based on or in response to processing the one or more instructions during the period of time.