G06F9/3848

BRANCH DENSITY DETECTION FOR PREFETCHER
20220137974 · 2022-05-05 ·

In one embodiment, a microprocessor, comprising: first logic configured to dynamically adjust a maximum prefetch count based on a total count of predicted taken branches over a predetermined quantity of cache lines; and second logic configured to prefetch instructions based on the adjusted maximum prefetch count.

DUAL BRANCH FORMAT
20220129278 · 2022-04-28 ·

In one embodiment, a branch processing method, comprising: assigning plural branch instructions for a given clock cycle to primary branch information and secondary branch information; routing the primary branch information along a first path having adder logic and the secondary branch information along a second path having no adder logic; and writing the primary branch information including a displacement branch target address to a branch order table (BOT) and the secondary branch information without a target address to the BOT.

DUAL BRANCH EXECUTE AND TABLE UPDATE
20220129277 · 2022-04-28 ·

In one embodiment, a branch processing method comprising receiving information from at least two branch execution units; writing two updates per clock cycle to respective first and second write queues based on the information; and writing from the first write queue up to two updates per clock cycle into plural tables of a first predictor and a single update for the single clock cycle when there is an expected write collision, the first predictor comprising a single write or read/write port.

QUICK PREDICTOR OVERRIDE
20220121446 · 2022-04-21 ·

In one embodiment, a microprocessor, comprising: an instruction cache comprising programming instructions at a plurality of cache addresses; a quick predictor configured to receive information associated with a first branch instruction corresponding to a first cache address of the instruction cache and, based on a match at the quick predictor, provide a first branch prediction during a first stage; and a branch target address cache (BTAC) operating according to a second set of stages subsequent to the first stage, the BTAC configured to receive the information associated with the first branch instruction and determine a second branch prediction, the BTAC configured to override the first branch prediction and update the quick predictor by writing a branch target address associated with the second branch prediction to the quick predictor at a time corresponding to the second set of stages.

Conditional Instructions Distribution and Execution

A processor may include an instruction distribution circuit and a plurality of execution pipelines. The instruction distribution circuit may distribute a conditional instruction to a first execution pipeline for execution when the conditional instruction is associated with a prediction of a high confidence level, or to a second execution pipeline for execution when the conditional instruction is associated with a prediction of a low confidence level. The second execution pipeline, not the first execution pipeline, may directly instruct the processor to obtain an instruction from a target address for execution, when the conditional instruction is mispredicted. Thus, when the conditional instruction is distributed to the first execution pipeline for execution and determined to be mispredicted, the first execution pipeline may cause the conditional instruction to be re-executed in the second execution pipeline to cause the instruction from the correct target address to be obtained for execution.

Conditional Instructions Prediction

A processor may include a bias prediction circuit and an instruction prediction circuit to provide respective predictions for a conditional instruction. The bias prediction circuit may provide a bias prediction whether a condition of the conditional instruction is biased true or biased false. The instruction prediction circuit may provide an instruction prediction whether the condition of the conditional instruction is true of false. Responsive to a bias prediction that the condition of the conditional instruction is biased true or biased false, the processor may use the bias prediction from the bias prediction circuit to speculatively process the conditional instruction. Otherwise, the processor may use the instruction prediction from the instruction prediction circuit to speculatively process the conditional instruction.

REGISTER SCOREBOARD FOR A MICROPROCESSOR WITH A TIME COUNTER FOR STATICALLY DISPATCHING INSTRUCTIONS
20230244493 · 2023-08-03 · ·

A processor includes a time counter and a register scoreboard and operates to statically dispatch instructions with preset execution times based on a write time of a register in the register scoreboard and a time count of the time counter provided to an execution pipeline.

Virtual 3-way decoupled prediction and fetch

A unified queue configured to perform decoupled prediction and fetch operations, and related apparatuses, systems, methods, and computer-readable media, is disclosed. The unified queue has a plurality of entries, where each entry is configured to store information associated with at least one instruction, and where the information comprises an identifier portion, a prediction information portion, and a tag information portion. The unified queue is configured to update the prediction information portion of each entry responsive to a prediction block, and to update the tag information portion of each entry responsive to a tag and TLB block. The prediction information may be updated more than once, and the unified queue is configured to take corrective action where a later prediction conflicts with an earlier prediction.

Transactional recovery storage for branch history and return addresses to partially or fully restore the return stack and branch history register on transaction aborts

An apparatus and method are provided for handling prediction information. The apparatus has processing circuitry for performing data processing operations in response to instructions, the processing circuitry comprising transactional memory support circuitry to support execution of a transaction comprising a sequence of instructions. Prediction circuitry is used to generate predictions in relation to instruction flow changing instructions, and prediction storage is provided to store a plurality of items of prediction information that are referenced by the prediction circuitry when generating the predictions. The items of prediction information maintained by the prediction storage change based on the instructions being executed by the processing circuitry. A recovery storage is activated by the transactional memory support circuitry at a transaction start point to store a restore pointer identifying a chosen location in the prediction storage. Between the transaction start point and the transaction end point, the recovery storage receives any item of prediction information removed from the prediction storage that was present in the prediction storage at the transaction start point. In response to the transaction being aborted, the restore pointer is used in order to discard from the prediction storage any items of prediction information added to the prediction storage after the transaction start point, and in addition any items of prediction information stored in the recovery storage are stored back into the prediction storage. This can significantly improve prediction accuracy in systems that may need to retry transactions due to a transaction abort, without requiring the entire prediction storage state to be captured at the transaction start point.

Loop exit predictor

A processor includes a prediction engine coupled to a training engine. The prediction engine includes a loop exit predictor. The training engine includes a loop exit branch monitor coupled to a loop detector. Based on at least one of a plurality of call return levels, the loop detector of the processor takes a snapshot of a retired predicted block during a first retirement time, compares the snapshot to a subsequent retired predicted block at a second retirement time, and based on the comparison, identifies a loop and loop exit branches within the loop for use by the loop exit branch monitor and the loop exit predictor to determine whether to override a general purpose conditional prediction.