G06F11/165

COMPUTING WITH UNRELIABLE PROCESSOR CORES

A computer system has two or more processing engines (PEs), each capable of performing one or more operations on one or more operands, where one or more of the PEs performs the operations unreliably. The initial result of each operation is debiased to create a debiased result, which the system uses instead of the initial result. The debiased result has an expected value equal to the correct output, that is, the result the respective operation would have produced had it been performed reliably.
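The expected-value property can be illustrated with a minimal sketch. The fault model here is an assumption made for illustration and is not taken from the abstract: a hypothetical PE returns the correct product with probability 1 - p_fail and returns 0 otherwise, with p_fail known in advance. Under that assumption, dividing the initial result by 1 - p_fail gives a value whose expectation equals the correct product.

```python
import random

def unreliable_multiply(a, b, p_fail, rng):
    """Hypothetical fault model (assumption): the PE drops its result to 0
    with probability p_fail and is correct otherwise."""
    return 0.0 if rng.random() < p_fail else a * b

def debiased_multiply(a, b, p_fail, rng):
    """Rescale the initial result so that its expected value is a * b."""
    return unreliable_multiply(a, b, p_fail, rng) / (1.0 - p_fail)

# Averaging many debiased results should approach the correct product 12.0.
rng = random.Random(0)
samples = [debiased_multiply(3.0, 4.0, 0.2, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
```

Any single debiased result may still be wrong; only the average over many operations converges to the correct output, which is why this kind of scheme suits workloads that tolerate noise.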

SYSTEMS AND METHODS FOR MONITORING AND IDENTIFYING FAILURE IN DUAL FLIGHT MANAGEMENT SYSTEMS
20200320884 · 2020-10-08

Systems and methods may be used for monitoring and identifying failure in flight management systems. For example, a method may include: calculating, using a first flight management system, a first value of a guidance command for controlling an aircraft for an RNP AP procedure; receiving a second value of the guidance command from a second flight management system; comparing the first value with the second value to determine whether the first value matches the second value; upon determining that the first value does not match the second value, using a flight management system monitor to determine, from the first flight management system and the second flight management system, a flight management system that has computed a correct value of the guidance command; and generating a message indicating that the determined flight management system is to be used to guide the aircraft.
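The compare-then-arbitrate flow can be sketched as follows. The abstract does not say how the monitor decides which system is correct, so this sketch assumes the monitor holds an independently computed reference value and picks the system closer to it. The function name, the labels `FMS1` and `FMS2`, and the tolerance are all hypothetical.

```python
def select_fms(first_value, second_value, reference_value, tolerance=1e-3):
    """Return None when the two guidance-command values match; otherwise
    return a message naming the system the monitor considers correct.

    Assumption: the monitor judges correctness by distance to its own
    independently computed reference_value.
    """
    if abs(first_value - second_value) <= tolerance:
        return None  # values match, no arbitration needed
    first_err = abs(first_value - reference_value)
    second_err = abs(second_value - reference_value)
    chosen = "FMS1" if first_err <= second_err else "FMS2"
    return f"Use {chosen} to guide the aircraft"
```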

METHODS AND APPARATUS FOR VERIFYING PROCESSING RESULTS AND/OR TAKING CORRECTIVE ACTIONS IN RESPONSE TO A DETECTED INVALID RESULT
20200310929 · 2020-10-01

Methods and apparatus for detecting that a processing node, in a network including a plurality of processing nodes, is reporting invalid results and for taking corrective actions in response to the detection are described.

Non-Stop Internet-of-Things (IoT) Controllers
20200310786 · 2020-10-01

Internet-of-Things (IoT) controllers built using hardened industrial technologies which improve functionality and reliability, such as a fixed-loop model in which a loop is repeated with configured time periodicity where sensors are queried, sensor responses are read, configured calculations are performed, and logic rules are evaluated resulting in decisions made and outputs activated. A variety of redundancy techniques are utilized to provide continuous non-stop operation of IoT controllers to compensate for possible hardware and software failures. Robust IoT controller redundancy also allows periodic maintenance, software updates and security patch installation without shutting down the IoT controllers.
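The fixed-loop model described above can be sketched as a periodic loop. This is a minimal illustration only: the callables `read_sensors`, `compute`, `rules`, and `activate` are hypothetical placeholders, and the redundancy techniques mentioned in the abstract are not modeled.

```python
import time

def run_fixed_loop(read_sensors, compute, rules, activate, period_s, cycles):
    """Repeat a query/compute/evaluate/activate cycle with a fixed period."""
    for _ in range(cycles):
        start = time.monotonic()
        readings = read_sensors()      # query sensors and read their responses
        values = compute(readings)     # perform the configured calculations
        for rule in rules:             # evaluate logic rules
            output = rule(values)
            if output is not None:
                activate(output)       # activate the decided output
        # Sleep for whatever is left of the period to keep the cycle time fixed.
        remaining = period_s - (time.monotonic() - start)
        if remaining > 0:
            time.sleep(remaining)
```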

Active-active architecture for distributed ISCSI target in hyper-converged storage

A method is provided for a hyper-converged storage-compute system to implement an active-active failover architecture for providing Internet Small Computer System Interface (iSCSI) target service. The method intelligently selects multiple hosts to become storage nodes that process iSCSI input/output (I/O) for a target. The method further enables iSCSI persistent reservation (PR) to handle iSCSI I/Os from multiple initiators.

METHOD AND SYSTEM FOR A GEOGRAPHICAL HOT REDUNDANCY
20200287845 · 2020-09-10

A geographical hot redundancy method includes: a first master computer transmitting to a second slave computer first input data items and a first execution context for the nth execution cycle of an application, first and second replicas being respectively executed on the first and second computers; execution of the first replica, updating the first execution context at the nth cycle end and transmission to the second computer; recovering the first input data items and the first execution context for the nth cycle as the second input data items and second execution context for the nth cycle; executing the second replica in the second execution context for the nth cycle, on the second input data items of the nth cycle, and updating the second execution context at the end of the nth cycle; and checking and verifying consistency by comparing first and second execution contexts at the nth cycle end.
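One cycle of this scheme can be sketched in a single process. The application step, the context fields, and the use of plain copies in place of network transmission are assumptions made for illustration; the abstract does not specify any of them.

```python
def step(context, inputs):
    """Hypothetical deterministic application cycle: returns the updated context."""
    return {"cycle": context["cycle"] + 1, "total": context["total"] + sum(inputs)}

def run_cycle(master_context, inputs, slave_step=step):
    """Run cycle n on both replicas and compare their end-of-cycle contexts."""
    # The master transmits the cycle's inputs and starting context to the
    # slave (modeled here as copies rather than a network transfer).
    slave_inputs, slave_context = list(inputs), dict(master_context)
    master_after = step(master_context, inputs)            # first replica
    slave_after = slave_step(slave_context, slave_inputs)  # second replica
    consistent = master_after == slave_after               # consistency check
    return master_after, slave_after, consistent
```

Because both replicas start cycle n from the same context and inputs, any difference in the end-of-cycle contexts points to a fault in one of them.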

METHOD AND APPARATUS TO NEUTRALIZE REPLICATION ERROR AND RETAIN PRIMARY AND SECONDARY SYNCHRONIZATION DURING SYNCHRONOUS REPLICATION
20200278984 · 2020-09-03

Techniques are provided for neutralizing replication errors. An operation is executed upon a first storage object and is replicated as a replicated operation for execution upon a second storage object. A first error may be received for the replicated operation. Instead of transitioning to an out of sync state and aborting the operation, a wait is performed until a result of the attempted execution of the operation is received. If the first error is the same as a second error returned for the operation, then the operation and replicated operation are considered successful and a synchronous replication relationship is kept in sync. If the first error and the second error are different errors, then an error response is returned for the operation and the synchronous replication relationship is transitioned to out of sync.
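The decision rule can be sketched as a small function. The error codes and return shape are hypothetical, and the case where the operation itself succeeds while the replicated operation fails is not covered by the abstract; treating it as a mismatch is an assumption of this sketch.

```python
def reconcile(replica_error, primary_error):
    """Decide the outcome once the primary operation's result has arrived.

    replica_error: error returned for the replicated operation.
    primary_error: error returned for the operation, or None if it succeeded.
    Returns (return_error_response, relationship_state).
    """
    if primary_error is not None and primary_error == replica_error:
        # Both sides failed the same way, so the storage objects still agree.
        return (False, "in_sync")
    # Different outcomes (assumption: this includes a primary success).
    return (True, "out_of_sync")
```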

Multi-channel network-on-a-chip

In at least one embodiment of the disclosure, a method includes detecting an error in a local memory shared by redundant computing modules executing in delayed lockstep. The method includes pausing execution in the redundant computing modules and handling the error of the local memory. The method includes resuming execution in delayed lockstep of the redundant computing modules in response to the handling of the error.

Service Takeover Method, Storage Device, And Service Takeover Apparatus
20200250055 · 2020-08-06

The present disclosure describes example service takeover methods, storage devices, and service takeover apparatuses. In one example method, when a communication fault occurs between two storage devices in a storage system, the two storage devices respectively obtain running statuses of the two storage devices. A running status can reflect current usage of one or more system resources of a particular storage device. Then, a delay duration is determined according to the running statuses, where the delay duration is a duration for which the storage device waits before sending an arbitration request to a quorum server. The two storage devices respectively send, after the delay duration, arbitration requests to the quorum server to request to take over a service. The quorum server then can select a storage device in a relatively better running status to take over a host service.
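The delayed-arbitration idea can be sketched as follows. The abstract does not give the mapping from running status to delay or the quorum server's selection rule, so this sketch assumes a single utilization figure in [0, 1] per device, a linear delay, and a quorum server that grants the first request to arrive. All names and the 5-second ceiling are hypothetical.

```python
def delay_for(utilization, max_delay_s=5.0):
    """Hypothetical mapping: a busier device waits longer before requesting."""
    clamped = min(max(utilization, 0.0), 1.0)
    return max_delay_s * clamped

def arbitrate(statuses):
    """statuses maps device name to utilization in [0, 1].

    Assumption: the quorum server grants the takeover to whichever
    arbitration request arrives first, i.e. the shortest delay.
    """
    arrivals = sorted((delay_for(u), name) for name, u in statuses.items())
    return arrivals[0][1]
```

Under these assumptions the less loaded device reaches the quorum server first, so it tends to win the takeover without the server needing to inspect running statuses itself.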
