Patent classifications
G06F11/181
ERROR CORRECTION IN A REDUNDANT PROCESSING SYSTEM
A processing system encompasses several processing devices and a comparison device. A method for controlling the processing system encompasses: processing of identical information items by the processing devices using associated processing processes; furnishing a characteristic value of each processing process, respectively as a function of the processing that has occurred; and comparing the characteristic values by way of the comparison device and determining a defectively operating processing process on the basis of the comparison. The defectively operating processing process is replaced by a processing process restarted on the same processing device.
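The comparison step in this abstract can be illustrated with a minimal sketch. It assumes the characteristic value is something like a checksum of each process's output and that the comparison device uses a strict majority vote. The abstract does not specify either point, and the function name and process ids are hypothetical.

```python
from collections import Counter


def find_defective(values):
    """Return the ids of processes whose characteristic value deviates.

    `values` maps a (hypothetical) process id to its characteristic value.
    Assumption: the value shared by a strict majority is treated as correct.
    """
    if not values:
        return []
    counts = Counter(values.values())
    majority_value, majority_count = counts.most_common(1)[0]
    if majority_count * 2 <= len(values):
        # No strict majority: no process can be singled out as defective.
        raise ValueError("no majority among characteristic values")
    return sorted(pid for pid, v in values.items() if v != majority_value)
```

Each process returned here would then be replaced by a process restarted on the same processing device, as the abstract describes.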
MANAGING JOURNALING RESOURCES WITH COPIES STORED IN MULTIPLE LOCATIONS
A storage system in one embodiment comprises a storage controller and a plurality of storage devices comprising a plurality of memory portions. The storage controller is configured to monitor a plurality of servers for a failure event. The servers store a plurality of copies of the memory portions. The storage controller is further configured to mark as invalid a copy of a memory portion on a failed server, search for and identify a location on an operational server for storing a new version of the copy, and communicate the copy invalidity and the identified location to a client device using the memory portion. The client device is configured to generate the new version of the copy for storage on the operational server, and the storage controller receives a notification from the client device regarding whether the new version of the copy was generated and stored on the operational server.
TWO DIE SYSTEM ON CHIP (SOC) FOR PROVIDING HARDWARE FAULT TOLERANCE (HFT) FOR A PAIRED SOC
Apparatuses of systems that provide Safety Integrity Levels (SILs) and Hardware Fault Tolerance (HFT) include a first die, the first die including first processing logic connected to a first connection and the first connection connected to second processing logic of a second die. The first die may further include a second connection to an input/output (I/O) channel where the second connection is coupled to the first processing logic. The apparatuses may further include a second die, the second die including second processing logic and a third connection from a secondary device coupled to the second processing logic. The secondary device is outside the system. The second processing logic is configured to select among three configurations based on signals from the second processing logic and the secondary device: sending first output data on the I/O channel, sending second output data on the I/O channel, or de-energizing the I/O channel.
Debug trace streams for core synchronization
The present disclosure provides for synchronization of multi-core systems by monitoring a plurality of debug trace data streams for a redundantly operating system including a corresponding plurality of cores performing a task in parallel; in response to detecting a state difference on one debug trace data stream of the plurality of debug trace data streams relative to other debug trace data streams of the plurality of debug trace data streams: marking a given core associated with the one debug trace data stream as an affected core; and restarting the affected core.
Node Failure Detection and Resolution in Distributed Databases
Methods and systems to detect and resolve failure in a distributed database system are described herein. A first node in the distributed database system can detect an interruption in communication with at least one other node in the distributed database system. This indicates a network failure. In response to detection of this failure, the first node starts a failure resolution protocol. This invokes coordinated broadcasts of respective lists of suspicious nodes among neighbor nodes. Each node compares its own list of suspicious nodes with its neighbors' lists of suspicious nodes to determine which nodes are still directly connected to each other. Each node determines the largest group of these directly connected nodes and whether or not it is in that group. If a node is not in that group, it fails itself to resolve the network failure.
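The group-selection step can be sketched as follows. This is an illustration under stated assumptions, not the patented protocol: two nodes count as directly connected when neither lists the other as suspicious, the search is brute force (suitable only for small clusters), and ties between equally large groups are broken by node-id order. None of these details come from the abstract, and all names are hypothetical.

```python
from itertools import combinations


def largest_connected_group(suspects):
    """Return the largest set of nodes that are all directly connected.

    `suspects` maps each node id to the set of node ids it lists as suspicious.
    Assumption: a and b are directly connected if neither suspects the other.
    """
    nodes = sorted(suspects)

    def connected(a, b):
        return b not in suspects[a] and a not in suspects[b]

    # Try the largest group sizes first; the first fully connected group wins.
    for size in range(len(nodes), 0, -1):
        for group in combinations(nodes, size):
            if all(connected(a, b) for a, b in combinations(group, 2)):
                return set(group)
    return set()


def should_fail_itself(node, suspects):
    """A node outside the largest group fails itself."""
    return node not in largest_connected_group(suspects)
```

For example, if nodes A, B and C can all reach each other but none can reach D, the largest group is {A, B, C} and only D fails itself.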
Method, electronic device, and program product for failure handling
Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for failure handling. This failure handling method includes determining a sector set failure type associated with at least one failed sector set of a disk; if the sector set failure type indicates that the number of failed sector sets in the at least one failed sector set is greater than a first threshold number, generating an instruction for replacing the disk; and otherwise performing at least one of the following: migrating data from a failed sector set in which the number of failed sectors is greater than a second threshold number to a spare sector set, and performing a failure recovery for a failed sector set in which the number of failed sectors is less than or equal to the second threshold number.
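The two-threshold decision described above can be sketched as a small planning function. It assumes the failure type reduces to a count of failed sector sets plus a per-set count of failed sectors. The function name, the dictionary layout, and the returned keys are hypothetical.

```python
def plan_failure_handling(failed_sector_sets, first_threshold, second_threshold):
    """Decide how to handle failed sector sets on one disk.

    `failed_sector_sets` maps a sector-set id to its number of failed sectors.
    """
    # Too many failed sector sets: replace the whole disk.
    if len(failed_sector_sets) > first_threshold:
        return {"replace_disk": True, "migrate": [], "recover": []}

    # Otherwise handle each failed sector set individually.
    migrate = sorted(s for s, n in failed_sector_sets.items() if n > second_threshold)
    recover = sorted(s for s, n in failed_sector_sets.items() if n <= second_threshold)
    return {"replace_disk": False, "migrate": migrate, "recover": recover}
```

Sets listed under `migrate` would have their data moved to a spare sector set, while sets under `recover` would go through failure recovery in place.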
REDUNDANT PROCESSOR ARCHITECTURE
The present disclosure relates to an assembly including a first processor having a first core, a second core and a controller, and a second processor having a first core, wherein the first core and the second core of the first processor and the first core of the second processor are configured to execute a first procedure. The controller of the first processor is configured to compare a first result from executing the first procedure on the first core of the first processor with a second result from executing the first procedure on the second core of the first processor and, if the first and second results differ from one another, to compare each of the first and second results with a third result from executing the first procedure on the first core of the second processor.
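The controller's comparison order can be sketched as below. The abstract states only which comparisons are made. Returning the result that agrees with the third core, and raising an error when no two results agree, are assumptions added for illustration, as is the function name.

```python
def select_result(first, second, third):
    """Pick a trusted result from three redundant executions.

    `first` and `second` come from the two cores of the first processor;
    `third` comes from the first core of the second processor.
    """
    # Normal case: the two local cores agree, so the third result is not needed.
    if first == second:
        return first
    # Disagreement: compare each local result with the third result.
    if first == third:
        return first
    if second == third:
        return second
    # Assumption: with three different results, nothing can be trusted.
    raise RuntimeError("no two results agree")
```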
DATABASE SYSTEM WITH CODING CLUSTER AND METHODS FOR USE THEREWITH
A networked database management system (DBMS) is disclosed. In particular, the disclosed DBMS includes a plurality of nodes, one of which is elected as a designated leader. The designated leader is elected using a consensus algorithm, such as tabulated random votes, RAFT or PAXOS. The designated leader is responsible for managing open coding lines, and determining when to close an open coding line.
Computer architecture for mitigating transistor faults due to radiation
A transmitting computer for a vehicle is disclosed, and includes a command circuit, a monitor circuit, and a master circuit. The command circuit receives a real-time signal and executes a first set of instructions to analyze the real-time signal, and generates a plurality of command signals based on executing the first set of instructions. The monitor circuit receives the command signals and the real-time signal. The monitor circuit executes a second set of instructions to analyze the real-time signal and generates a plurality of replica signals based on executing the second set of instructions. The monitor circuit generates an initial reset command in response to determining an initial miscompare between one of the plurality of command signals and the plurality of replica signals. The master circuit is in communication with both the command circuit and the monitor circuit and receives an indication that the initial reset command is generated.
PARALLEL PROCESSING SYSTEM RUNTIME STATE RELOAD
A parallel processing system includes at least three processors operating in parallel, state monitoring circuitry, and state reload circuitry. The state monitoring circuitry couples to the at least three parallel processors and is configured to monitor runtime states of the at least three parallel processors and identify a first processor of the at least three parallel processors having at least one runtime state error. The state reload circuitry couples to the at least three parallel processors and is configured to select a second processor of the at least three parallel processors for state reload, access a runtime state of the second processor, and load the runtime state of the second processor into the first processor. Monitoring and reload may be performed only on sub-systems of the at least three parallel processors. During reload, clocks and supply voltages of the processors may be altered. The state reload may relate to sub-systems.
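A software analogy of the monitor-and-reload cycle might look like the sketch below. It assumes each runtime state can be represented as a dictionary, that a processor is in error when its state differs from the state held by a strict majority, and that the donor is the first processor holding the majority state. The abstract describes circuitry and does not specify these choices, and all names are hypothetical.

```python
import copy


def reload_faulty_states(states):
    """Overwrite deviating runtime states with a copy of the majority state.

    `states` is a list with one state dict per processor (at least three).
    Returns the indices of the processors that were reloaded.
    """
    # Select a donor: the first processor whose state a strict majority shares.
    donor = None
    for i, state in enumerate(states):
        if sum(1 for other in states if other == state) * 2 > len(states):
            donor = i
            break
    if donor is None:
        raise ValueError("no majority runtime state to reload from")

    reloaded = []
    for j in range(len(states)):
        if states[j] != states[donor]:
            # Load an independent copy of the donor state into the faulty processor.
            states[j] = copy.deepcopy(states[donor])
            reloaded.append(j)
    return reloaded
```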