G06F9/3861

Way predictor and enable logic for instruction tightly-coupled memory and instruction cache
11687342 · 2023-06-27 · ·

Disclosed herein are systems and methods for instruction tightly-coupled memory (iTIM) and instruction cache (iCache) access prediction. A processor may use a predictor to enable access to the iTIM or the iCache and a particular way (a memory structure) based on a location state and program counter value. The predictor may determine whether to stay in an enabled memory structure, move to and enable a different memory structure, or move to and enable both memory structures. Stay and move predictions may be based on whether a memory structure boundary crossing has occurred due to sequential instruction processing, branch or jump instruction processing, branch resolution, and cache miss processing. The program counter and a location state indicator may use feedback and be updated each instruction-fetch cycle to determine which memory structure(s) need to be enabled for the next instruction fetch.
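The stay/move decision described above can be sketched as follows. This is a minimal illustration only: the address range, the `predict` function, and the `resolved` flag are hypothetical, and way selection within the iCache is left out.

```python
from enum import Enum


class Enable(Enum):
    ITIM = "iTIM"
    ICACHE = "iCache"
    BOTH = "both"


# Hypothetical address map: the iTIM occupies [ITIM_BASE, ITIM_END).
ITIM_BASE = 0x0800_0000
ITIM_END = 0x0800_4000


def in_itim(pc):
    """True when the program counter falls inside the assumed iTIM range."""
    return ITIM_BASE <= pc < ITIM_END


def predict(location, next_pc, resolved=True):
    """Return (decision, structures to enable) for the next fetch.

    location: Enable.ITIM or Enable.ICACHE, the current location state.
    next_pc:  program counter of the next instruction fetch.
    resolved: False when the next PC is not yet known (assumed to model
              an unresolved branch), in which case both are enabled.
    """
    if not resolved:
        return "move", Enable.BOTH
    wanted = Enable.ITIM if in_itim(next_pc) else Enable.ICACHE
    decision = "stay" if wanted == location else "move"
    return decision, wanted


# Sequential fetch inside the iTIM stays; crossing the boundary moves.
print(predict(Enable.ITIM, 0x0800_0004))
print(predict(Enable.ITIM, 0x0800_4000))
```

The returned structure would be fed back as the location state for the following fetch cycle.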

Livelock Recovery Circuit
20170364363 · 2017-12-21 ·

Livelock recovery circuits are configured to detect livelock in a processor and cause the processor to transition to a known safe state when livelock is detected. The livelock recovery circuits include detection logic configured to detect that the processor is in livelock when the processor has illegally repeated an instruction; and transition logic configured to cause the processor to transition to a safe state when livelock has been detected by the detection logic.
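A minimal software sketch of the detection and transition logic follows. It assumes that "illegally repeated" means the same instruction address is issued more than a fixed number of times in a row; the threshold, the `safe_pc` value, and all names are hypothetical.

```python
class LivelockDetector:
    """Counts consecutive repeats of the same instruction address."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats  # assumed threshold for "illegal" repetition
        self.last_pc = None
        self.repeats = 0

    def observe(self, pc):
        """Record an issued instruction address; return True on livelock."""
        if pc == self.last_pc:
            self.repeats += 1
        else:
            self.last_pc = pc
            self.repeats = 0
        return self.repeats >= self.max_repeats


def run(trace, safe_pc=0x0, max_repeats=3):
    """Replay a trace of issued addresses.

    Returns (index where livelock was detected, safe-state PC),
    or None if no livelock was detected.
    """
    detector = LivelockDetector(max_repeats)
    for i, pc in enumerate(trace):
        if detector.observe(pc):
            return i, safe_pc  # transition logic: jump to the safe state
    return None
```

For example, `run([0x10, 0x14, 0x14, 0x14, 0x14, 0x18])` reports detection at index 4, the third consecutive repeat of `0x14`.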

Data processing apparatus and method for providing candidate prediction entries

A data processing apparatus and a method are disclosed. The data processing apparatus comprises: a prediction cache to store a plurality of prediction entries, each defining an association between a prediction cache lookup address and a predicted behaviour; prediction circuitry to select a prediction entry based on a prediction cache lookup of the prediction cache based on a given prediction cache lookup address and to determine the predicted behaviour associated with the given prediction cache lookup address based on the selected prediction entry; and a candidate prediction buffer to store a plurality of candidate predictions each indicative of a candidate prediction entry to be selected for inclusion in a subsequent prediction cache lookup, wherein the candidate prediction entry is selected in response to a candidate prediction lookup based on a candidate lookup address different to a candidate prediction cache lookup address indicated as associated with a candidate predicted behaviour in the candidate prediction entry.
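One way to read the relationship between the prediction cache and the candidate prediction buffer is sketched below. This is an interpretation, not the claimed design: the dictionaries, the `trigger_addr` name, and the rule of promoting a candidate when its trigger address is looked up are all assumptions.

```python
class Predictor:
    def __init__(self):
        # prediction cache: lookup address -> predicted behaviour
        self.prediction_cache = {}
        # candidate buffer: trigger address -> (lookup address, behaviour)
        self.candidates = {}

    def add_candidate(self, trigger_addr, lookup_addr, behaviour):
        """Store a candidate entry under an address other than its own."""
        if trigger_addr == lookup_addr:
            raise ValueError("candidate lookup address must differ")
        self.candidates[trigger_addr] = (lookup_addr, behaviour)

    def lookup(self, addr):
        """Look up a prediction, then promote any candidate keyed by addr."""
        prediction = self.prediction_cache.get(addr)
        candidate = self.candidates.get(addr)
        if candidate is not None:
            lookup_addr, behaviour = candidate
            # The candidate becomes available to subsequent lookups.
            self.prediction_cache[lookup_addr] = behaviour
        return prediction
```

In this sketch, a lookup at the trigger address makes the entry for a different address available before that address is itself looked up.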

NESTED QUANTUM ANNEALING CORRECTION

Systems and methods of processing using a quantum processor are described. A method includes obtaining a problem Hamiltonian and defining a nested Hamiltonian with a plurality of logical qubits by embedding a logical K_N representing the problem Hamiltonian into a larger K_{C×N}, where N represents a number of the logical qubits and C represents a nesting level defining the amount of hardware resources for the nested Hamiltonian. The method also includes encoding the nested Hamiltonian into the plurality of physical qubits of the quantum processor; and performing a quantum annealing process with the quantum processor after the encoding.
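The embedding of K_N into K_{C×N} can be illustrated with a small sketch. The abstract does not give the construction, so the rules used here are assumptions: each logical qubit is replaced by C copies, every logical coupling is repeated between all copies of the two qubits, copies of the same qubit are tied together with a penalty `-gamma`, and local fields are scaled by C.

```python
def nest_hamiltonian(J, h, C, gamma):
    """Build couplings and fields on C*N qubits from a problem on N qubits.

    J: N x N symmetric coupling matrix (list of lists), zero diagonal.
    h: length-N list of local fields.
    C: nesting level (copies per logical qubit).
    gamma: assumed penalty strength coupling copies of the same qubit.
    """
    N = len(h)
    size = C * N

    def idx(i, c):
        # Position of copy c of logical qubit i in the nested problem.
        return i * C + c

    J_nested = [[0.0] * size for _ in range(size)]
    h_nested = [0.0] * size
    for i in range(N):
        for c in range(C):
            h_nested[idx(i, c)] = C * h[i]
            for j in range(N):
                for d in range(C):
                    if i != j:
                        J_nested[idx(i, c)][idx(j, d)] = J[i][j]
                    elif c != d:
                        J_nested[idx(i, c)][idx(j, d)] = -gamma
    return J_nested, h_nested
```

With N = 2 and C = 2 the result is a 4-qubit problem in which every pair of qubits is coupled, matching the complete-graph structure named in the abstract.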

HYPERVISOR BACKDOOR INTERFACE

A method of providing a backdoor interface between software executing in a virtual machine and a hypervisor executing on a computing system that supports the virtual machine includes trapping, at the hypervisor, an exception generated in response to execution of a debug instruction on a central processing unit (CPU) by the software; identifying, by an exception handler of the hypervisor handling the exception, an equivalence between an immediate operand of the debug instruction and a predefined value; and invoking, in response to the equivalence, a backdoor service of the hypervisor using state of at least one register of the CPU as parametric input, the state being set by the software prior to executing the debug instruction.
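The trap-compare-dispatch flow can be modelled in a few lines. Everything concrete here is hypothetical: the immediate value `0x86`, the register names `x0` to `x2`, and the single example service are placeholders chosen for illustration.

```python
BACKDOOR_IMM = 0x86  # hypothetical predefined immediate value


def add_service(regs):
    """Example backdoor service: uses guest register state as input."""
    return regs["x1"] + regs["x2"]


# Hypothetical service table, indexed by the value the guest put in x0.
SERVICES = {0: add_service}


def handle_debug_exception(imm, regs):
    """Model of the hypervisor's exception handler.

    imm:  immediate operand of the trapped debug instruction.
    regs: guest register state set before the instruction executed.
    Returns (handled_as_backdoor, result).
    """
    if imm != BACKDOOR_IMM:
        return False, None  # ordinary debug exception, not a backdoor call
    service = SERVICES.get(regs["x0"])
    if service is None:
        return True, None  # backdoor call with an unknown service number
    return True, service(regs)
```

A guest would set its registers, execute the debug instruction with the agreed immediate, and read the result on return.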

GRAPHIC RENDERING QUALITY IMPROVEMENTS THROUGH AUTOMATED DATA TYPE PRECISION CONTROL
20170358129 · 2017-12-14 ·

One or more systems, apparatuses, methods, and computer-readable media are described for automated data type precision control capable of improving rendering quality on a graphics processor. Perceptible rendering quality depends at least in part on the number format precision (e.g., FP16 or FP32) employed for shader program variables. In accordance with embodiments, shader variables implemented in lower-precision data formats are tracked during shader compile to identify those that might trigger a floating-point overflow and/or underflow exception. For shaders including one or more such variables, resources are provided to automatically monitor overflow and/or underflow exceptions during shader execution. In further embodiments, shader code is automatically re-generated based, at least in part, upon occurrences of such exceptions, and an increased number format precision is specified for one or more of the tracked shader variables.
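The monitor-then-promote idea can be sketched on the CPU using Python's `struct` module, whose `e` format packs IEEE 754 half-precision values. This is only an analogy for what a shader compiler and GPU would do; the function names and the sample data are hypothetical.

```python
import struct


def to_fp16(x):
    """Round x to FP16 and report 'overflow', 'underflow', or None."""
    try:
        y = struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values too large for half precision.
        return None, "overflow"
    if x != 0.0 and y == 0.0:
        # A non-zero value that rounds to zero has underflowed.
        return y, "underflow"
    return y, None


def choose_precisions(samples):
    """Pick a format per tracked variable from values seen at run time.

    samples: variable name -> list of observed values.
    Returns: variable name -> "FP16" or "FP32".
    """
    plan = {}
    for name, values in samples.items():
        flagged = any(to_fp16(v)[1] for v in values)
        plan[name] = "FP32" if flagged else "FP16"
    return plan
```

Variables that never trigger an exception stay in FP16, so only the flagged ones pay for the wider format.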

Executing Memory Requests Out of Order
20170357512 · 2017-12-14 ·

An on-chip cache is described which receives memory requests and in the event of a cache miss, the cache generates memory requests to a lower level in the memory hierarchy (e.g. to a lower level cache or an external memory). Data returned to the on-chip cache in response to the generated memory requests may be received out-of-order. An instruction scheduler in the on-chip cache stores pending received memory requests and effects the re-ordering by selecting a sequence of pending memory requests for execution such that pending requests relating to an identical cache line are executed in age order and pending requests relating to different cache lines are executed in an order dependent upon when data relating to the different cache lines is returned. The memory requests which are received may be received from another, lower level on-chip cache or from registers.
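The ordering rule (age order within a cache line, data-return order across lines) can be sketched as follows. The request identifiers, line labels, and the `schedule` function are hypothetical.

```python
from collections import defaultdict, deque


def schedule(requests, return_order):
    """Compute an execution order for pending memory requests.

    requests:     list of (request id, cache line) in arrival (age) order.
    return_order: cache lines in the order their data comes back.
    """
    pending = defaultdict(deque)
    for req_id, line in requests:
        pending[line].append(req_id)  # per-line queues preserve age order

    executed = []
    for line in return_order:
        # All requests waiting on this line can now run, oldest first.
        while pending[line]:
            executed.append(pending[line].popleft())
    return executed


order = schedule(
    [("r0", "A"), ("r1", "B"), ("r2", "A"), ("r3", "C")],
    ["B", "A", "C"],
)
print(order)  # ['r1', 'r0', 'r2', 'r3']
```

Here `r1` runs first because line B returns first, while `r0` still precedes `r2` because both target line A.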

GEOMETRY-BASED COMPRESSION FOR QUANTUM COMPUTING DEVICES

A quantum computing device comprises a surface code lattice that includes l logical qubits, where l is a positive integer. The surface code lattice is partitioned into two or more regions based on lattice geometry. A compression engine is coupled to each logical qubit of the l logical qubits. Each compression engine is configured to compress syndrome data generated by the surface code lattice using a geometry-based compression scheme. A decompression engine is coupled to each compression engine. Each decompression engine is configured to receive compressed syndrome data, decompress the received compressed syndrome data, and route the decompressed syndrome data to a decoder block.

PREDICTING UPCOMING CONTROL FLOW

An apparatus has a fetch queue to identify a sequence of instructions to be fetched for execution and prediction circuitry to predict upcoming control flow and to control which instructions are identified in the fetch queue in dependence on the prediction. The prediction circuitry predicts multi-taken sequences which are sequences of instructions in which control flow is diverted by a first control flow changing instruction to a series of instructions terminating in a second control flow changing instruction that diverts control flow to a target address. The apparatus also has prediction confidence calculation circuitry to calculate confidence levels for respective multi-taken sequences. Each confidence level is indicative of a confidence in an accuracy of prediction of its respective multi-taken sequence. When the confidence level for a particular multi-taken sequence satisfies a prediction confidence condition, the prediction confidence tracking circuitry allows the particular multi-taken sequence to be predicted by the prediction circuitry. The prediction circuitry causes the series of instructions and the target instruction for the particular multi-taken sequence to be identified in the fetch queue when the prediction circuitry predicts the particular multi-taken sequence and further predictions to be made starting from the target address for the particular multi-taken sequence.
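The confidence gating can be illustrated with a per-sequence counter. The abstract does not say how confidence is calculated, so the saturating counter, the reset on a misprediction, the threshold, and the key format are all assumptions.

```python
class MultiTakenConfidence:
    """Tracks a confidence level per multi-taken sequence."""

    def __init__(self, threshold=2, max_conf=3):
        self.threshold = threshold  # assumed prediction confidence condition
        self.max_conf = max_conf    # assumed saturation value
        self.conf = {}              # sequence key -> confidence level

    def update(self, seq, correct):
        """Raise confidence after a correct outcome, reset it otherwise."""
        level = self.conf.get(seq, 0)
        self.conf[seq] = min(level + 1, self.max_conf) if correct else 0

    def may_predict(self, seq):
        """True when the sequence may be predicted as a multi-taken sequence."""
        return self.conf.get(seq, 0) >= self.threshold
```

A sequence could be keyed, for instance, by the address of its first control flow changing instruction and its final target address.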

BEHAVIORAL IMPLEMENTATION OF A DOUBLE FAULT STACK IN A COMPUTER SYSTEM

An example method of exception handling in a computer system is described. The computer system includes a physical central processing unit (PCPU) and a system memory, the system memory storing a first stack, a second stack, and a double fault stack associated with the PCPU. The method includes: storing, by an exception handler executing in the computer system, an exception frame on the double fault stack in response to a stack overflow condition of the first stack; switching, by the exception handler, a first stack pointer of the PCPU from pointing to the first stack to pointing to the double fault stack; setting a current stack pointer of the PCPU to the first stack pointer; and executing software on the PCPU with the current stack pointer pointing to the double fault stack.
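The pointer switching in this method can be modelled with a small simulation. Stacks are represented as Python lists and stack pointers as the name of the stack they reference; the class, field names, and frame contents are hypothetical.

```python
class PCPU:
    """Toy model of a CPU with two stack pointers and three stacks."""

    def __init__(self):
        self.stacks = {"first": [], "second": [], "double_fault": []}
        self.first_sp = "first"    # stack referenced by the first stack pointer
        self.second_sp = "second"  # stack referenced by the second stack pointer
        self.current_sp = self.first_sp


def handle_stack_overflow(cpu, frame):
    """Exception handler steps for an overflow of the first stack."""
    # 1. Store the exception frame on the double fault stack.
    cpu.stacks["double_fault"].append(frame)
    # 2. Re-point the first stack pointer at the double fault stack.
    cpu.first_sp = "double_fault"
    # 3. Make the first stack pointer the current stack pointer.
    cpu.current_sp = cpu.first_sp
    return cpu
```

After the handler runs, software continues with the current stack pointer referring to the double fault stack, and the overflowed first stack is no longer used.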