Patent classifications
G06F9/3842
Instruction sequence buffer to store branches having reliably predictable instruction sequences
A method for outputting reliably predictable instruction sequences. The method includes tracking repetitive hits to determine a set of frequently hit instruction sequences for a microprocessor, and out of that set, identifying a branch instruction having a series of subsequent frequently executed branch instructions that form a reliably predictable instruction sequence. The reliably predictable instruction sequence is stored into a buffer. On a subsequent hit to the branch instruction, the reliably predictable instruction sequence is output from the buffer.
Secure predictors for speculative execution
Systems and methods are disclosed for secure predictors for speculative execution. Some implementations may eliminate or mitigate side-channel attacks, such as the Spectre-class of attacks, in a processor. For example, an integrated circuit (e.g., a processor) for executing instructions includes a predictor circuit that, when operating in a first mode, uses data stored in a set of predictor entries to generate predictions. For example, the integrated circuit may be configured to: detect a security domain transition for software being executed by the integrated circuit; responsive to the security domain transition, change a mode of the predictor circuit from the first mode to a second mode and invoke a reset of the set of predictor entries, wherein the second mode prevents the use of a first subset of the predictor entries of the set of predictor entries; and, after completion of the reset, change the mode back to the first mode.
Speculative execution of correlated memory access instruction methods, apparatuses and systems
A processor core, a processor, an apparatus, and an instruction processing method are disclosed. The processor core includes: an instruction fetch unit, where the instruction fetch unit includes a speculative execution predictor and the speculative execution predictor compares a program counter of a memory access instruction with a table entry stored in the speculative execution predictor and marks the memory access instruction; a scheduler unit adapted to adjust a send order of marked memory access instructions and send the marked memory access instructions according to the send order; an execution unit adapted to execute the memory access instructions according to the send order. In the instruction fetch unit, a memory access instruction is marked according to a speculative execution prediction result. In the scheduler unit, a send order of memory access instructions is determined according to the marked memory access instruction and the memory access instructions are sent. In the execution unit, the memory access instructions are executed according to the send order. This helps avoiding re-execution of a memory access instruction due to an address correlation of the memory access instruction. Consequently, this eliminates the need of adding an idle cycle in an instruction pipeline and the need of refreshing the pipeline to clear a memory access instruction that is incorrectly speculated.
System for providing trace data in a data processor having a pipelined architecture
The invention is a method and system for providing trace data in a pipelined data processor. Aspects of the invention include providing a trace pipeline in parallel to the execution pipeline, providing trace information on whether conditional instructions complete or not, providing trace information on the interrupt status of the processor, replacing instructions in the processor with functionally equivalent instructions that also produce trace information and modifying the scheduling of instructions in the processor based on the occupancy of a trace output buffer.
System and method for an asynchronous processor with assisted token
Embodiments are provided for an asynchronous processor using master and assisted tokens. In an embodiment, an apparatus for an asynchronous processor comprises a memory to cache a plurality of instructions, a feedback engine to decode the instructions from the memory, and a plurality of XUs coupled to the feedback engine and arranged in a token ring architecture. Each one of the XUs is configured to receive an instruction of the instructions form the feedback engine, and receive a master token associated with a resource and further receive an assisted token for the master token. Upon determining that the assisted token and the master token are received in an abnormal order, the XU is configured to detect an operation status for the instruction in association with the assisted token, and upon determining a needed action in accordance with the operation status and the assisted token, perform the needed action.
Apparatus, method, and system for providing a decision mechanism for conditional commits in an atomic region
An apparatus and method is described herein for conditionally committing and/or speculative checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to more aggressively optimize code. And the conditional commit enables efficient execution of the dynamic optimization code, while attempting to prevent transactions from running out of hardware resources. While the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. And processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.
Method for a delayed branch implementation by using a front end track table
A method for a delayed branch implementation by using a front end track table. The method includes receiving an incoming instruction sequence using a global front end, wherein the instruction sequence includes at least one branch, creating a delayed branch in response to receiving the one branch, and using a front end track table to track both the delayed branch the one branch.
APPARATUS WITH SHARED TRANSACTIONAL PROCESSING RESOURCE, AND DATA PROCESSING METHOD
An apparatus (2) with multiple processing elements (4, 6, 8) has shared transactional processing resources (10, 50, 75) for supporting processing of transactions, which comprise operations performed speculatively following a transaction start event whose results are committed following a transaction end event. The transactional processing resources may have a significant overhead and sharing these between the processing elements helps reduce energy consumption and circuit area.
READ AND WRITE SETS FOR RANGES OF INSTRUCTIONS OF TRANSACTIONS
Transactional memory accesses are tracked using read and write sets based on actual program flow. A read and write set is associated with a range of instructions of a transaction. When execution follows a predicted branch, loads and stores are marked as being of selected read and write sets. Then, when a misprediction is processed, and execution is rewound, speculatively added read and write set indications are removed from the read and write sets.
Transaction abort processing
A transaction executing within a computing environment ends prior to completion; i.e., execution is aborted. Pursuant to aborting execution, a hardware transactional execution CPU mode is exited, and one or more of the following is performed: restoring selected registers; committing nontransactional stores on abort; branching to a transaction abort program status word specified location; setting a condition code and/or abort code; and/or preserving diagnostic information.