Patent classifications
G06F9/3857
Method for replenishing a thread queue with a target instruction of a jump instruction
Methods and electronic circuits for executing instructions in a central processing unit (CPU) are provided. One of the methods includes forming an instruction block by sequentially fetching, from a current thread queue, one or more instructions including one jump instruction, wherein the jump instruction is the last instruction in the instruction block; transmitting the instruction block to a CPU execution unit for execution; replenishing the current thread queue with at least one instruction to form a thread queue to be executed; determining a target instruction of the jump instruction according to an execution result of the CPU execution unit; determining whether the target instruction is contained in the thread queue to be executed; and if not, flushing the thread queue to be executed, obtaining the target instruction and adding the target instruction to the thread queue to be executed.
Writeback Hazard Elimination
A processor includes a processing pipeline, a plurality of result-storage elements, and writeback logic. The processing pipeline is configured to process program operations and to write, to a result storage, up to a predefined maximal number of results of the processed program operations per clock cycle. The result-storage elements are configured to store respective ones of the results. The writeback logic is configured to (i) detect a writeback conflict event in which the processing pipeline produces simultaneous results that exceed the predefined maximal number of results, for writing to the result storage, in a same clock cycle, (ii) in response to detecting the writeback conflict event, to temporarily store at least a given result, from among the simultaneous results, in a given result-storage element, and (iii) to subsequently write the temporarily-stored given result from the given result-storage element to the result storage.
ADVANCED PROCESSOR ARCHITECTURE
The invention relates to a method for processing instructions out-of-order on a processor comprising an arrangement of execution units. The inventive method comprises: 1) looking up operand sources in a Register Positioning Table and setting operand input references of the instruction to be issued accordingly; 2) checking for an Execution Unit (EXU) available for receiving a new instruction; and 3) issuing the instruction to the available Execution Unit and enter a reference of the result register addressed by the instruction to be issued to the Execution Unit into the Register Positioning Table (RPT).
METHOD FOR EXECUTING MULTITHREADED INSTRUCTIONS GROUPED INTO BLOCKS
A method for executing multithreaded instructions grouped into blocks. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks, wherein the instructions of the instruction blocks are interleaved with multiple threads; scheduling the instructions of the instruction block to execute in accordance with the multiple threads; and tracking execution of the multiple threads to enforce fairness in an execution pipeline.
Computing System and controller thereof
Computing system and controller thereof are disclosed for ensuring the correct logical relationship between multiple instructions during their parallel execution. The computing system comprises: a plurality of functional modules each performing a respective function in response to an instruction for the given functional module; and a controller for determining whether or not to send an instruction to a corresponding functional module according to dependency relationship between the plurality of instructions.
Inferring future value for speculative branch resolution
Aspects of the invention include includes determining a first instruction in a processing pipeline, wherein the first instruction includes a compare instruction, determining a second instruction in the processing pipeline, wherein the second instruction includes a conditional branch instruction relying on the compare instruction, determining a predicted result of the compare instruction, and completing the conditional branch instruction using the predicted result prior to executing the compare instruction.
Subscription to Sync Zones
A set of configurable sync groupings (which may be referred to as sync zones) are defined. Any of the processors may belong to any of the sync zones. Each of the processor comprises a register indicating to which of the sync zones it belongs. If a processor does not belong to a sync zone, it continually asserts a sync request for that sync zone to the sync controller. If a processor does belong to a sync zone, it will only assert its sync request for that sync zone upon arriving at a synchronisation point for that sync zone indicated in its compiled code set.
Systems, methods, and apparatuses for heterogeneous computing
- Rajesh M. Sankaran ,
- Gilbert Neiger ,
- Narayan Ranganathan ,
- Stephen R. Van Doren ,
- Joseph Nuzman ,
- Niall D. McDonnell ,
- Michael A. O'Hanlon ,
- Lokpraveen B. Mosur ,
- Tracy Garrett Drysdale ,
- Eriko Nurvitadhi ,
- Asit K. Mishra ,
- Ganesh Venkatesh ,
- Deborah T. Marr ,
- Nicholas P. Carter ,
- Jonathan D. Pearce ,
- Edward T. Grochowski ,
- Richard J. Greco ,
- Robert Valentine ,
- Jesus Corbal ,
- Thomas D. Fletcher ,
- Dennis R. Bradford ,
- Dwight P. Manley ,
- Mark J. Charney ,
- Jeffrey J. Cook ,
- Paul Caprioli ,
- Koichi Yamada ,
- Kent D. Glossop ,
- David B. Sheffield
Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
Branch target buffer arrangement with preferential storage for unconditional branch instructions
A branch target buffer, BTB, is provided to store at least one BTB entry corresponding to a respective branch in a control flow in a sequence of machine-readable instructions of a computer program. The BTB has a tag field to compare with a program counter of a fetch address generator and at least one further field to store information characteristic of the branch instruction identified by the corresponding tag field and allowing a conditional branch to be distinguished from an unconditional branch instruction. The BTB has a predetermined storage capacity and is utilized such that unconditional branch instructions are preferentially allocated storage space in the BTB relative to conditional branch instructions.
Apparatus and method for store pairing with reduced hardware requirements
An apparatus and method for pairing store operations. For example, one embodiment of a processor comprises: a grouping eligibility checker to evaluate a plurality of store instructions based on a set of grouping rules to determine whether two or more of the plurality of store instructions are eligible for grouping; and a dispatcher to simultaneously dispatch a first group of store instructions of the plurality of store instructions determined to be eligible for grouping by the grouping eligibility checker.