G06F9/3861

System recovery using a failover processor

Techniques for system recovery using a failover processor are disclosed. A first processor, with a first instruction set, is configured to execute operations of a first type; and a second processor, with a second instruction set different from the first instruction set, is configured to execute operations of a second type. A determination is made that the second processor has failed to execute at least one operation of the second type within a particular period of time. Responsive to determining that the second processor has failed to execute at least one operation of the second type within the particular period of time, the first processor is configured to execute both the operations of the first type and the operations of the second type.
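The timeout-driven failover described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `Processor`, `FailoverMonitor`, `record_completion`, and `check` are hypothetical, time is passed in explicitly as a number, and the difference between the two instruction sets is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    # Hypothetical model: a processor is a name plus the operation types it runs.
    name: str
    op_types: set = field(default_factory=set)

    def can_run(self, op_type: str) -> bool:
        return op_type in self.op_types


class FailoverMonitor:
    """Hands the second processor's operation type to the first on a missed deadline."""

    def __init__(self, first, second, second_type, timeout):
        self.first = first
        self.second = second
        self.second_type = second_type
        self.timeout = timeout          # the "particular period of time"
        self.last_completion = 0.0      # time the second processor last finished an operation
        self.failed_over = False

    def record_completion(self, now):
        # Called whenever the second processor completes an operation of its type.
        self.last_completion = now

    def check(self, now):
        # If nothing completed within the timeout, the first processor takes over
        # the second type in addition to its own.
        if not self.failed_over and now - self.last_completion > self.timeout:
            self.first.op_types.add(self.second_type)
            self.failed_over = True
        return self.failed_over
```

A caller would invoke `check` periodically; once it returns `True`, operations of both types are routed to the first processor.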

Watcher: precise and fully-automatic on-site failure diagnosis

The techniques described herein may provide precise and fully automatic on-site software failure diagnosis that overcomes the issues of existing systems and the general challenges of in-production software failure diagnosis. Embodiments of the present systems and methods may provide a tool capable of automatically pinpointing the fault propagation chain of a program failure, with explicit symptoms. Binary analysis, in-situ/identical replay, and debugging registers may be used together to automatically simulate the debugging procedure of a programmer. The overhead, privacy, transparency, convenience, and completeness challenges of in-production failure analysis are addressed, making the approach suitable for deployment.

Flushing a fetch queue using predecode circuitry and prediction information

A data processing apparatus is provided. It includes control flow detection prediction circuitry that performs a presence prediction of whether a block of instructions contains a control flow instruction. A fetch queue stores, in association with prediction information, a queue of indications of the instructions; the prediction information comprises the presence prediction. An instruction cache stores fetched instructions that have been fetched according to the fetch queue. Post-fetch correction circuitry receives the fetched instructions before they are received by decode circuitry. The post-fetch correction circuitry includes analysis circuitry that causes the fetch queue to be at least partly flushed in dependence on a type of a given fetched instruction and the prediction information associated with the given fetched instruction.

Fault Isolation and Recovery of CPU Cores for Failed Secondary Asymmetric Multiprocessing Instance
20230118408 · 2023-04-20

According to certain embodiments, a system includes one or more processors and one or more computer-readable non-transitory storage media comprising instructions that, when executed by the one or more processors, cause one or more components to perform operations including: executing a software process of a secondary instance, the secondary instance running in parallel with a primary instance and associated with a plurality of cores including a bootstrap core; registering a non-maskable interrupt for the bootstrap core in the secondary instance; determining whether the secondary instance is in a fault state; and, if the secondary instance is in the fault state, halting the plurality of cores associated with the secondary instance, without impact to the primary instance, and recovering the bootstrap core by switching a context of the bootstrap core from the secondary instance to the primary instance via the non-maskable interrupt.

RESPONDING TO BRANCH MISPREDICTION FOR PREDICATED-LOOP-TERMINATING BRANCH INSTRUCTION

A predicated-loop-terminating branch instruction controls, based on whether a loop termination condition is satisfied, whether the processing circuitry should process a further iteration of a predicated loop body or process a following instruction. If at least one unnecessary iteration of the predicated loop body is processed following a mispredicted-non-termination branch misprediction when the loop termination condition is mispredicted as unsatisfied for a given iteration when it should have been satisfied, processing of the at least one unnecessary iteration of the predicated loop body is predicated to suppress an effect of the at least one unnecessary iteration. When the mispredicted-non-termination branch misprediction is detected for the given iteration of the predicated-loop-terminating branch instruction, in response to determining that a flush suppressing condition is satisfied, flushing of the at least one unnecessary iteration of the predicated loop body is suppressed as a response to the mispredicted-non-termination branch misprediction.
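To see why an unnecessary iteration can be harmless, consider a minimal sketch of a predicated loop body, assuming a copy loop that processes up to `vector_len` elements per iteration. The function name, the list-based memory model, and the lane count are illustrative assumptions; the abstract does not specify the flush suppressing condition, and none is modeled here.

```python
def predicated_copy_iteration(dst, src, offset, remaining, vector_len=4):
    """Run one iteration of a predicated copy loop body.

    Only the lanes whose predicate is set (at most `remaining` of them) are
    copied, so an iteration entered with remaining == 0 changes nothing.
    """
    active = max(0, min(remaining, vector_len))  # number of enabled lanes
    dst[offset:offset + active] = src[offset:offset + active]
    return offset + active, remaining - active


src = list(range(6))
dst = [None] * 6
offset, remaining = 0, 6

# Two necessary iterations copy 4 elements, then 2.
offset, remaining = predicated_copy_iteration(dst, src, offset, remaining)
offset, remaining = predicated_copy_iteration(dst, src, offset, remaining)

# A third iteration, as might follow a mispredicted non-termination,
# runs with every lane disabled and leaves dst untouched.
snapshot = list(dst)
extra_offset, extra_remaining = predicated_copy_iteration(dst, src, offset, remaining)
```

Because the extra iteration has no architectural effect in this model, leaving it in the pipeline instead of flushing it does not change the result.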

CACHE COHERENCE VALIDATION USING DELAYED FULFILLMENT OF L2 REQUESTS
20230122466 · 2023-04-20

Methods and systems for validating cache coherence in a data processing system are described. A processing element may detect a load instruction requesting the processing element to transfer data from a global memory location to a local memory location. In response to detecting the load instruction, the processing element may apply a delay to the transfer of the data from the global memory location to the local memory location. The processing element may execute the load instruction and transfer the data from the global memory location to the local memory location with the applied delay. In response to executing the load instruction and transferring the data with the applied delay, the processing element may validate a cache coherence of the data processing system.

METHODS AND APPARATUS FOR PREDICTING INSTRUCTIONS FOR EXECUTION

Aspects of the present disclosure relate to an apparatus comprising prediction circuitry having a plurality of hierarchical prediction units to perform respective hierarchical predictions of instructions for execution, wherein predictions higher in the hierarchy have a higher expected accuracy than predictions lower in the hierarchy. Responsive to a given prediction higher in the hierarchy being different to a corresponding prediction lower in the hierarchy, the corresponding prediction lower in the hierarchy is corrected. A prediction correction metric determination unit determines a prediction correction metric indicative of an incidence of uncorrected predictions performed by the prediction circuitry. Fetch circuitry fetches instructions predicted by at least one of said plurality of hierarchical predictions, and delays said fetching based on the prediction correction metric indicating an incidence of uncorrected predictions below a threshold.
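The interplay between correction and fetch delay might be sketched as below. This is one reading of the abstract, not the claimed circuit: it assumes a two-level hierarchy, a metric defined as the fraction of lower-level predictions that the higher level left uncorrected, and hypothetical names (`HierarchicalPredictor`, `resolve`, `should_delay_fetch`).

```python
class HierarchicalPredictor:
    """Two-level predictor that tracks how often the lower level goes uncorrected."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed threshold on the uncorrected rate
        self.total = 0
        self.uncorrected = 0

    def resolve(self, low_pred, high_pred):
        # The higher-level prediction is expected to be more accurate, so it
        # overrides (corrects) a disagreeing lower-level prediction.
        self.total += 1
        if low_pred == high_pred:
            self.uncorrected += 1
        return high_pred

    def uncorrected_rate(self):
        # With no history yet, treat the lower level as fully trusted.
        return self.uncorrected / self.total if self.total else 1.0

    def should_delay_fetch(self):
        # Few uncorrected predictions means the lower level is often wrong,
        # so waiting for the higher-level prediction before fetching pays off.
        return self.uncorrected_rate() < self.threshold
```

In this model, a fetch unit would consult `should_delay_fetch()` before acting on a lower-level prediction and fetch immediately only while that level is rarely corrected.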

System and method for physically separating, across different processing units, software for handling exception causing events from executing program code
11630673 · 2023-04-18

A first processor for executing program code has a control interface mapped to the memory address space of a second processor and provides the second processor with direct mapped access to state information of the first processor. The first processor responds to an exception causing event to enter a halted mode stopping execution of the program code and issuing a trigger event. The second processor responds to the trigger to execute an exception handling routine during which the second processor accesses and modifies the state information via the control interface as required by the exception handling routine. On completion of the exception handling routine, the second processor causes the first processor to exit the halted mode and resume execution of the program code. Thus, the program code is physically separated from the software used to perform the exception handling routine to improve security.
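The halt, trigger, handle, and resume sequence can be modeled with a small sketch. All names here (`HaltableCore`, `HandlerCore`, the `state` fields) are hypothetical, the memory-mapped control interface is reduced to a shared dictionary, and the handler's action (advancing the program counter by 4) is an assumption chosen only for illustration.

```python
class HaltableCore:
    """First processor: runs program code and exposes its state to the handler."""

    def __init__(self):
        # Stand-in for state reachable through the mapped control interface.
        self.state = {"pc": 0x1000, "fault_cause": None}
        self.halted = False

    def raise_exception(self, cause, handler):
        # Enter halted mode, then issue the trigger to the second processor.
        self.halted = True
        self.state["fault_cause"] = cause
        handler.on_trigger(self)


class HandlerCore:
    """Second processor: runs the exception handling routine."""

    def on_trigger(self, core):
        # Access and modify the first processor's state via the interface.
        core.state["pc"] += 4             # assumption: skip the faulting instruction
        core.state["fault_cause"] = None
        core.halted = False               # cause the first processor to resume


core = HaltableCore()
core.raise_exception("illegal_op", HandlerCore())
```

The point of the arrangement is that the handler code lives only in `HandlerCore`; the first processor never executes it.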

SPECULATIVE RESOLUTION OF LAST BRANCH-ON-COUNT AT FETCH

A computer processor includes an instruction pipeline configured to dispatch a plurality of branch-to-count (BCNT) instructions and an instruction fetch unit (IFU). The IFU is configured to execute an instruction loop for fetching a targeted number of BCNT instructions from the instruction pipeline and to monitor a loop counter that counts the number of BCNT instructions actually fetched from the instruction pipeline in response to executing the instruction loop. The IFU resolves a final BCNT instruction included in the instruction loop in response to the number of fetched BCNT instructions reaching a target loop count value.
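The counting step can be illustrated with a very small sketch, assuming a hypothetical `FetchLoopCounter` class in which the fetch unit is told about each fetched BCNT instruction. How the final instruction is resolved (for example, as a loop exit) is not stated in the abstract and is not modeled.

```python
class FetchLoopCounter:
    """Counts fetched BCNT instructions and flags the one that reaches the target."""

    def __init__(self, target_count):
        self.target_count = target_count  # target loop count value
        self.fetched = 0                  # BCNT instructions actually fetched so far

    def on_bcnt_fetched(self):
        # Returns True when this fetch brings the count to the target, which is
        # the point at which the final BCNT could be resolved at fetch time.
        self.fetched += 1
        return self.fetched == self.target_count


counter = FetchLoopCounter(target_count=3)
results = [counter.on_bcnt_fetched() for _ in range(3)]
```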

Instruction Cache for Hardware Multi-Thread Microprocessor
20230066662 · 2023-03-02

Embodiments are provided for an instruction cache system for a hardware multi-thread microprocessor. In some embodiments, a cache controller device includes multiple interfaces connected to a hardware multi-thread microprocessor. A first interface of the multiple interfaces can receive a fetch request from a first execution thread during a first clock cycle. A second interface of the multiple interfaces can receive a fetch request from a second execution thread during a second clock cycle after the first clock cycle. The cache controller device also includes a multiplexer to send first response signals in response to the fetch request from the first execution thread, and to send second response signals in response to the fetch request from the second execution thread.