G06F11/1616

Monitoring device, fault-tolerant system, and control method
10360115 · 2019-07-23

A monitoring device is mounted in each of a plurality of operational systems constituting a fault-tolerant system. The plurality of operational systems have an identical configuration including a processor system. The monitoring device includes a processor. The processor executes instructions to read data from a predetermined storage area in a memory of an accessory device to be monitored, which is connected to the processor system. The processor further executes instructions to compare the read data with reference data held in advance. The processor further executes instructions to separate the processor system connected to the monitored accessory device from the fault-tolerant system when the read data differs from the reference data.
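
The read, compare, and separate steps in this abstract can be illustrated with a minimal sketch. The `OperationalSystem` class, the `monitor` function, and the byte-offset addressing are hypothetical names and assumptions introduced here for illustration; the abstract does not specify how the storage area is addressed or how separation is carried out.

```python
from dataclasses import dataclass


@dataclass
class OperationalSystem:
    """Hypothetical model of one operational system in the fault-tolerant system."""
    name: str
    accessory_memory: bytes   # memory of the monitored accessory device
    attached: bool = True     # True while the processor system is part of the FT system


def monitor(system: OperationalSystem, offset: int, reference: bytes) -> bool:
    """Read the predetermined area, compare it with the reference data,
    and separate the processor system on a mismatch.

    Returns True if the system remains attached.
    """
    read = system.accessory_memory[offset:offset + len(reference)]
    if read != reference:
        # Assumption: separation is modeled as clearing a flag.
        system.attached = False
    return system.attached
```

In this sketch a system whose accessory memory matches the reference stays attached, while any difference in the monitored bytes detaches it.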

REPLACEMENT OF STORAGE DEVICE WITHIN IOV REPLICATION CLUSTER CONNECTED TO PCI-E SWITCH

Storage devices are connected to a Peripheral Component Interconnect Express (PCIe) switch and form an input/output virtualization (IOV) replication cluster that can be exposed to a host processor via hardware root complex interconnecting the PCIe switch to the host processor. When a failed storage device is replaced with a new storage device, the new storage device can initiate a virtual root complex that connects to those storage devices containing data that was replicated on the failed storage device, to receive and copy the data on the new storage device. This replication process does not have to involve the hardware root complex or the host processor.

Redundancy device, redundancy system, and redundancy method

A redundancy device is configured to communicate with an opposite redundancy device and to perform a redundancy execution. The redundancy device includes receivers configured to individually receive HB signals transmitted from the opposite redundancy device, a calculator configured to calculate the number of normal communication paths among the communication paths of the HB signals based on a reception result of the receivers, a comparator configured to compare a calculation result of the calculator with a predetermined threshold value, and a changer configured to change the redundancy device from a standby state to an operating state, or from the standby state to a not-standby state in which the redundancy execution is released, based on the calculation result of the calculator and a comparison result of the comparator.
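
The calculator, comparator, and changer described above might be sketched as follows. The abstract does not state the decision rule, so the rule used here is an assumption: no normal path means the opposite device is presumed down and the standby device takes over, while a non-zero count below the threshold releases redundancy. `State` and `next_state` are hypothetical names, and "HB" is assumed to mean heartbeat.

```python
from enum import Enum


class State(Enum):
    STANDBY = "standby"
    OPERATING = "operating"
    NOT_STANDBY = "not-standby"


def next_state(hb_received: list[bool], threshold: int) -> State:
    """Decide the next state of a device that is currently in standby.

    hb_received holds one flag per communication path: True if an HB
    signal arrived on that path during the last check interval.
    """
    normal_paths = sum(hb_received)      # calculator: count normal paths
    if normal_paths == 0:
        # Assumed rule: peer presumed failed, so take over.
        return State.OPERATING
    if normal_paths < threshold:         # comparator: check against threshold
        # Assumed rule: paths too degraded to trust, release redundancy.
        return State.NOT_STANDBY
    return State.STANDBY
```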

GENERATING A HEALTH CONDITION MESSAGE ON A HEALTH CONDITION DETECTED AT A SERVER TO SEND TO A HOST SYSTEM ACCESSING THE SERVER
20190179719 · 2019-06-13

Provided are a computer program product, system, and method for generating a health condition message on a health condition detected at a first server to send to a host system accessing the first server. A determination is made of a health condition with respect to access to a first storage. A determination is made of an estimated Input/Output (I/O) delay to access the first storage resulting from the determined health condition. A health condition message is generated indicating the estimated I/O delay. The health condition message is transmitted to the host system, wherein the host system uses the estimated I/O delay to determine whether to perform a swap operation to redirect host I/O requests to data from the first server to a second server.
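
A rough sketch of the message flow follows, assuming the host swaps when the estimated delay exceeds a tolerance it chooses. The condition names, the delay table, and the functions `build_message` and `host_should_swap` are hypothetical; the abstract does not say how the delay is estimated or what criterion the host applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthConditionMessage:
    condition: str
    estimated_io_delay_ms: float


# Hypothetical lookup of estimated I/O delay per health condition.
ESTIMATED_DELAY_MS = {
    "cache_rebuild": 200.0,
    "controller_failover": 30000.0,
}


def build_message(condition: str) -> HealthConditionMessage:
    """Server side: generate the message carrying the estimated I/O delay."""
    return HealthConditionMessage(condition, ESTIMATED_DELAY_MS[condition])


def host_should_swap(msg: HealthConditionMessage, max_tolerated_delay_ms: float) -> bool:
    """Host side (assumed policy): redirect I/O to the second server
    only if the estimated delay exceeds what the host can tolerate."""
    return msg.estimated_io_delay_ms > max_tolerated_delay_ms
```

Under this assumed policy a short delay is ridden out on the first server, and only a long one triggers the swap.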

Data Transmission Between Computation Units Having Safe Signaling Technology
20190171535 · 2019-06-06

An input and output module transmits and receives data via a data line. The input and output module includes a protocol machine for a security protocol for data transfer and a clock. The protocol machine and instructions for clock processing are stored as sequence control in a read-only memory of the input and output module.

Device and system including adaptive repair circuit

A device, system, and/or method includes an internal circuit configured to perform at least one function, an input-output terminal set and a repair circuit. The input-output terminal set includes a plurality of normal input-output terminals connected to an external device via a plurality of normal signal paths and at least one repair input-output terminal selectively connected to the external device via at least one repair signal path. The repair circuit repairs at least one failed signal path included in the normal signal paths based on a mode signal and fail information signal, where the mode signal represents whether to use the repair signal path and the fail information signal represents fail information on the normal signal paths. Using the repair circuit, various systems adopting different repair schemes may be repaired and cost of designing and manufacturing the various systems may be reduced.
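
The role of the mode signal and the fail information signal can be illustrated with a small routing sketch. It assumes a simple substitution scheme with one repair path, in which the failed normal path is replaced directly by the repair path. The abstract mentions that different repair schemes exist and does not commit to this one; `route_signals` and the path labels are hypothetical.

```python
from typing import Optional


def route_signals(num_normal: int, use_repair: bool,
                  failed_path: Optional[int]) -> list[str]:
    """Return the physical path used by each logical signal.

    use_repair  models the mode signal (whether the repair path is used).
    failed_path models the fail information signal (index of the failed
                normal path, or None if no path has failed).
    """
    routes = [f"normal{i}" for i in range(num_normal)]
    if use_repair and failed_path is not None:
        # Assumed scheme: the repair path stands in for the failed path.
        routes[failed_path] = "repair0"
    return routes
```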

Signal Pairing for Module Expansion of a Failsafe Computing System
20190079823 · 2019-03-14

A system includes a central processing unit (CPU), a first input/output (I/O) module, and a second I/O module. The first I/O module includes a first module health controller operatively connected to the CPU. The second I/O module includes a second module health controller operatively connected to the first module health controller and the CPU. One of the first module health controller and the second module health controller is configured to assert a paired module health signal to the CPU indicating that the first I/O module and the second I/O module are healthy.
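
As a minimal sketch, the paired module health signal can be modeled as asserted only when both linked controllers report healthy. The class and method names are hypothetical, and the real signaling between the controllers is hardware-level and not described in the abstract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModuleHealthController:
    healthy: bool
    peer: Optional["ModuleHealthController"] = None

    def paired_health_signal(self) -> bool:
        """Signal presented to the CPU: asserted only if this module
        and its paired module are both healthy."""
        return self.healthy and self.peer is not None and self.peer.healthy


def pair(first: ModuleHealthController, second: ModuleHealthController) -> None:
    """Link two controllers so that each can see the other's health."""
    first.peer = second
    second.peer = first
```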

Generating a health condition message on a health condition detected at a server to send to a host system accessing the server

Provided are a computer program product, system, and method for generating a health condition message on a health condition detected at a first server to send to a host system accessing the first server. A determination is made of a health condition with respect to access to a first storage. A determination is made of an estimated Input/Output (I/O) delay to access the first storage resulting from the determined health condition. A health condition message is generated indicating the estimated I/O delay. The health condition message is transmitted to the host system, wherein the host system uses the estimated I/O delay to determine whether to perform a swap operation to redirect host I/O requests to data from the first server to a second server.

Method and apparatus of a profiling algorithm to quickly detect faulty disks/HBA to avoid application disruptions and higher latencies

One embodiment is related to a method for determining a faulty hardware component within a data storage system, comprising: collecting data relating to a plurality of input/output (IO) errors associated with a first storage processor within the data storage system; compiling IO error statistics based on the data relating to the plurality of IO errors; and determining a faulty hardware component based on the IO error statistics, wherein the determining of the faulty hardware component comprises utilizing a second storage processor of the data storage system independent from the first storage processor.
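
The collect, compile, and determine steps might look like the sketch below, which is meant to run on the second storage processor against error records gathered from the first. The record format (component, error code), the fixed count threshold, and the name `find_faulty_components` are assumptions; the abstract does not describe the statistics or the decision criterion.

```python
from collections import Counter


def find_faulty_components(io_errors: list[tuple[str, str]],
                           min_errors: int = 3) -> list[str]:
    """Compile per-component IO error counts and flag likely faulty hardware.

    io_errors is a list of (component, error_code) records collected for
    the first storage processor, e.g. ("disk3", "timeout").
    """
    stats = Counter(component for component, _code in io_errors)
    # Assumed criterion: a component is suspect once it reaches min_errors.
    return sorted(c for c, n in stats.items() if n >= min_errors)
```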

Hardware control path redundancy for functional safety of peripherals

A circuit includes a primary register region and a primary shadow register; a secondary register region and a secondary shadow register; and a safety controller having multiple states. The safety controller transitions to a first write state when a first write signal to write a first value to the primary register region is detected, and copies the first value written to the primary register region to the primary shadow register; transitions to a second write state when a second write signal to write a second value to the secondary register region is detected within a set amount of time of detection of the first write signal, and in the second write state, copies the second value written to the secondary register region to the secondary shadow register; transitions to a compare state to receive a comparison signal indicating whether the first value is the same as the second value; and transitions to an update state when the first value is the same as the second value.
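
The state sequence of the safety controller can be sketched in software as below. The abstract does not say what happens on a late second write or on a mismatch; returning to an idle state in both cases is an assumption, as are the `IDLE` state, the class name, and the use of explicit timestamps in place of a hardware timer.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    FIRST_WRITE = auto()
    SECOND_WRITE = auto()
    COMPARE = auto()
    UPDATE = auto()


class SafetyController:
    def __init__(self, window: float):
        self.window = window              # allowed time between the two writes
        self.state = State.IDLE
        self.primary_shadow = None
        self.secondary_shadow = None
        self._first_time = None

    def write_primary(self, value, now: float) -> State:
        """First write detected: copy the value to the primary shadow register."""
        self.state = State.FIRST_WRITE
        self.primary_shadow = value
        self._first_time = now
        return self.state

    def write_secondary(self, value, now: float) -> State:
        """Second write: accept it only within the window, then compare."""
        if self.state is not State.FIRST_WRITE or now - self._first_time > self.window:
            self.state = State.IDLE       # assumption: late or unexpected write resets
            return self.state
        self.state = State.SECOND_WRITE
        self.secondary_shadow = value
        self.state = State.COMPARE
        match = self.primary_shadow == self.secondary_shadow
        # Assumption: a mismatch returns the controller to IDLE.
        self.state = State.UPDATE if match else State.IDLE
        return self.state
```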