Patent classifications
G06F11/1407
OPERATION OF A MULTI-SLICE PROCESSOR IMPLEMENTING ADAPTIVE FAILURE STATE CAPTURE
Operation of a multi-slice processor that includes a plurality of execution slices and a plurality of load/store slices, where the load/store slices are coupled to the execution slices via a results bus. Operation of such a multi-slice processor includes: capturing first state information corresponding to a first set of control signals; monitoring state information of a plurality of logical components of the multi-slice processor; selecting, in dependence upon one or more selection criteria and upon the monitored state information, a second set of control signals; and capturing second state information corresponding to the second set of control signals, wherein the first set of control signals is different than the second set of control signals.
Configuration of weighted address pools for component design verification
A system for testing a design of a computing component includes an input device configured to receive a request to perform a test of a component, and a testing unit including a simulation of the component. The simulation is configured to output a result indicative of a response to a set of instruction addresses. The set of instruction addresses is acquired from a plurality of addresses, the plurality of addresses including a plurality of address groups, where each address group is associated with a respective group identifier. The system also includes a plurality of requestors configured to apply the set of instruction addresses to the simulation, where a requestor of the plurality of requestors is configured to select an address for application to the simulation based on a received group identifier and a variably configurable weight value assigned to the received group identifier and the requestor.
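The per-requestor weighted selection described in this abstract might look roughly like the sketch below. It is an illustration only, not the patented method: the pool contents, the group identifiers "hot" and "cold", the Requestor class, and the use of a seeded random generator are all assumptions made for the example.

```python
import random

# Hypothetical address pools, keyed by group identifier (assumed values).
ADDRESS_GROUPS = {
    "hot": [0x1000, 0x1004],
    "cold": [0x8000, 0x8004, 0x8008],
}


class Requestor:
    """A hypothetical requestor holding its own weight per group identifier."""

    def __init__(self, name, weights, seed=0):
        self.name = name
        self.weights = dict(weights)   # group identifier -> weight for this requestor
        self.rng = random.Random(seed)  # seeded so a test run is repeatable

    def select_address(self):
        # Pick a group in proportion to this requestor's weights,
        # then pick one address uniformly from that group.
        groups = list(self.weights)
        group = self.rng.choices(
            groups, weights=[self.weights[g] for g in groups], k=1
        )[0]
        return group, self.rng.choice(ADDRESS_GROUPS[group])


# Two requestors share the same pools but are configured with different weights.
biased = Requestor("req0", {"hot": 9, "cold": 1}, seed=1)
hot_only = Requestor("req1", {"hot": 1, "cold": 0}, seed=2)
```

Changing a weight in a requestor's table shifts how often that requestor draws from a group without touching the pools themselves.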
Data processing system and method for reading instruction data of instruction from memory including a comparison stage for preventing execution of wrong instruction data
In the disclosure, a data processing system includes a microprocessor and a memory. The integrity of data read from the memory by the microprocessor may be checked. When an instruction address is transmitted from the microprocessor to the memory for reading the instruction data corresponding to the instruction address, predetermined dummy data is also read from the memory while the instruction data is read. The integrity of the instruction data may be checked by comparing the predetermined dummy data to hardwired data that is not stored in the memory. If the dummy data matches the hardwired data, the instruction data read from the memory is determined to be correct.
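The comparison stage can be pictured with a short software model. This is a sketch under stated assumptions: the pattern value 0xA5, the Memory class, and the fetch_checked function are hypothetical names chosen for illustration, and real hardware would perform the comparison in logic rather than in code.

```python
# Assumed stand-in for the hardwired data, which is not stored in the memory.
HARDWIRED_PATTERN = 0xA5


class Memory:
    """Hypothetical memory that returns dummy data alongside each instruction."""

    def __init__(self, instructions, dummy=HARDWIRED_PATTERN):
        self.instructions = instructions  # address -> instruction data
        self.dummy = dummy                # predetermined dummy data held in memory

    def read(self, address):
        # The dummy data is read in the same access as the instruction data.
        return self.instructions[address], self.dummy


def fetch_checked(memory, address):
    """Return the instruction data only if the dummy data matches the pattern."""
    data, dummy = memory.read(address)
    if dummy != HARDWIRED_PATTERN:
        raise RuntimeError(f"integrity check failed at address {address:#x}")
    return data


good = Memory({0x0: 0x13, 0x4: 0x6F})
faulty = Memory({0x0: 0x13}, dummy=0x00)  # simulates a corrupted read path
```

A mismatch raises before the instruction data reaches the caller, which mirrors the goal of preventing execution of wrong instruction data.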
Method and system for providing coordinated checkpointing to a group of independent computer applications
A system and method thereof for performing loss-less migration of an application group. In an exemplary embodiment, the system may include a high-availability services module structured for execution in conjunction with an operating system, and one or more computer nodes of a distributed system upon which at least one independent application can be executed. The high-availability services module may be structured to be executable on the one or more computer nodes for loss-less migration of the one or more independent applications, and is operable to perform checkpointing of all state in a transport connection.
Checkpointing
A system comprising: a first subsystem comprising at least one first processor, and a second subsystem comprising one or more second processors. A first program is arranged to run on the at least one first processor, the first program being configured to send data from the first subsystem to the second subsystem. A second program is arranged to run on the one or more second processors, the second program being configured to operate on the data content from the first subsystem. The first program is configured to set a checkpoint at successive points in time. At each checkpoint it records in memory of the first subsystem i) a program state of the second program, comprising a state of one or more registers on each of the second processors at the time of the checkpoint, and ii) a copy of the data content sent to the second subsystem since the respective checkpoint.
System and Method for Coordinating Use of Multiple Coprocessors
An interface software layer is interposed between at least one application and a plurality of coprocessors. A data and command stream issued by the application(s) to an API of an intended one of the coprocessors is intercepted by the layer, which also acquires and stores the execution state information for the intended coprocessor at a coprocessor synchronization boundary. At least a portion of the intercepted data and command stream data is stored in a replay log associated with the intended coprocessor. The replay log associated with the intended coprocessor is then read out, along with the stored execution state information, and is submitted to and serviced by at least one different one of the coprocessors other than the intended coprocessor.
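The intercept, log, and replay flow in this abstract can be illustrated with a toy model. Everything named here is an assumption for the example: the Coprocessor and InterposerLayer classes, the integer "state", and the addition used as a stand-in for command execution do not come from the patent.

```python
class Coprocessor:
    """Toy coprocessor whose execution state is a single integer."""

    def __init__(self, name):
        self.name = name
        self.state = 0

    def execute(self, command):
        # Stand-in for servicing a command: fold it into the state.
        self.state += command


class InterposerLayer:
    """Hypothetical layer sitting between applications and coprocessors."""

    def __init__(self):
        self.saved_state = {}  # coprocessor name -> state at last sync boundary
        self.replay_log = {}   # coprocessor name -> commands since that boundary

    def synchronize(self, coprocessor):
        # At a synchronization boundary, store the state and start a fresh log.
        self.saved_state[coprocessor.name] = coprocessor.state
        self.replay_log[coprocessor.name] = []

    def submit(self, coprocessor, command):
        # Intercept the command, log it, then pass it to the intended coprocessor.
        self.replay_log.setdefault(coprocessor.name, []).append(command)
        coprocessor.execute(command)

    def replay_on(self, intended, other):
        # Restore the stored state on a different coprocessor and replay the log.
        other.state = self.saved_state.get(intended.name, 0)
        for command in self.replay_log.get(intended.name, []):
            other.execute(command)


layer = InterposerLayer()
gpu0, gpu1 = Coprocessor("gpu0"), Coprocessor("gpu1")
layer.submit(gpu0, 1)
layer.submit(gpu0, 2)
layer.synchronize(gpu0)   # state 3 is stored and the log is reset
layer.submit(gpu0, 3)
layer.submit(gpu0, 4)
layer.replay_on(gpu0, gpu1)
```

After the replay, the second coprocessor holds the same state as the intended one, because it started from the stored state and serviced the same logged commands.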
SYSTEM-ON-CHIP FOR SPECULATIVE EXECUTION EVENT COUNTER CHECKPOINTING AND RESTORING
An example system for speculative execution event counter checkpointing and restoring may include a plurality of symmetric cores, at least one of the symmetric cores to simultaneously process a plurality of threads and to perform out-of-order instruction processing for the plurality of threads; at least one shared cache circuit to be shared among two or more of the symmetric cores. The system may further include a memory controller to couple the symmetric cores to a system memory and a data communication interface to couple one or more of the cores to input/output devices. The system may further include event counter circuitry comprising: a plurality of event counters including programmable event counters and fixed event counters and one or more configuration registers to store configuration data to specify an event type to be counted by the programmable event counters, wherein at least one of the one or more configuration registers is to store configuration data for a plurality of the programmable event counters. The system may further include transactional memory circuitry to process transactional memory operations including load operations and store operations, the transactional memory circuitry to process a transaction begin instruction to indicate a start of a transactional execution region of a program, a transaction end instruction to indicate an end of the transactional execution region, and a transaction abort instruction to abort processing of the transactional execution region. The system may further include transaction checkpoint circuitry to store a processor state at the start of the transactional execution region of the program, the processor state including values of one or more of the event counters.
The system may further include lock elision circuitry to cause critical sections of the program to execute as transactions on multiple threads without acquiring a lock, the lock elision circuitry to cause the critical sections to be re-executed non-speculatively using one or more locks in response to detecting a transaction failure.
Operating system-based systems and method of achieving fault tolerance
A method and apparatus of performing fault tolerance in a fault tolerant computer system comprising: a primary node having a primary node processor; a secondary node having a secondary node processor, each node further comprising a respective memory; a respective checkpoint shim; each of the primary and secondary node further comprising: a respective non-virtual operating system (OS), the non-virtual OS comprising a respective: network driver; storage driver; and checkpoint engine; the method comprising the steps of: acting upon a request from a client by the respective OS of the primary and the secondary node, comparing the result obtained by the OS of the primary node and the secondary node by the network driver of the primary node for similarity, and if the comparison indicates similarity less than a predetermined amount, the primary node network driver informs the primary node checkpoint engine to begin a checkpoint process.
Generating and using checkpoints in a virtual computer system
To generate a checkpoint for a virtual machine (VM), first, while the VM is still running, a copy-on-write (COW) disk file is created pointing to a parent disk file that the VM is using. Next, the VM is stopped, the VM's memory is marked COW, the device state of the VM is saved to memory, the VM is switched to use the COW disk file, and the VM begins running again for substantially the remainder of the checkpoint generation. Next, the device state that was stored in memory and the unmodified VM memory pages are saved to a checkpoint file. Also, a copy may be made of the parent disk file for retention as part of the checkpoint, or the original parent disk file may be retained as part of the checkpoint. If a copy of the parent disk file was made, then the COW disk file may be committed to the original parent disk file.
Ordered Event Stream Event Retention
Retention of events of an ordered event stream is disclosed. Expiration of events stored in a segment of an ordered event stream (OES) can be desirable. New events are added to a head of an OES segment, and pruning events from a tail of the OES segment can be valuable. Processing applications can register a processing scheme for a segment, e.g., at-least-once processing, exactly-once processing, etc., and can generate checkpoints indicating a degree of advancement in processing events of the segment. The ordered event stream can determine a cut point indicative of a progress point before which events of an OES can be marked as ready for expiration. However, events that are marked for expiration can be retained to allow processing based on a checkpoint, e.g., expiration of the event can be refused until there is an assurance the event was read by the processing application.
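One way to picture checkpoint-aware pruning is the sketch below. It assumes, as an illustration only, that the safe limit is the minimum checkpoint position across registered readers; the abstract does not state that rule, and the Segment class and its method names are hypothetical.

```python
from collections import deque


class Segment:
    """Toy OES segment: oldest events at the tail (left), newest at the head (right)."""

    def __init__(self):
        self.events = deque()
        self.base = 0          # stream position of the oldest retained event
        self.checkpoints = {}  # reader name -> position processed up to (exclusive)

    def append(self, event):
        self.events.append(event)  # new events are added at the head

    def checkpoint(self, reader, position):
        self.checkpoints[reader] = position

    def prune(self, cut_point):
        # Assumption: an event may expire only if every registered reader has
        # checkpointed past it. With no checkpoints, nothing is expired.
        safe = min(self.checkpoints.values(), default=self.base)
        limit = min(cut_point, safe)
        removed = 0
        while self.base < limit and self.events:
            self.events.popleft()  # expire from the tail
            self.base += 1
            removed += 1
        return removed


segment = Segment()
for i in range(5):
    segment.append(f"event-{i}")
segment.checkpoint("reader-a", 3)
segment.checkpoint("reader-b", 2)
removed = segment.prune(cut_point=4)
```

Here the cut point marks four events as ready for expiration, but only the two that both readers have checkpointed past are actually removed; the rest are retained.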