Patent classifications
G06F9/3834
Maintaining sequentiality for media management of a memory sub-system
Methods, systems, and devices for maintaining sequentiality for media management of a memory sub-system are described. A plurality of read commands in connection with a set of media management operations for a plurality of transfer units are issued according to a read sequence. A plurality of entries associated with the set of media management operations are stored. A plurality of write commands in connection with the set of media management operations are issued based on the plurality of entries of the read sequence.
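The ordering idea in this abstract can be illustrated with a small sketch. It assumes that reads may complete out of order and that the stored entries act as a queue kept in read-issue order. The class name `SequencedRelocator` and its methods are hypothetical and do not come from the patent.

```python
from collections import OrderedDict

_PENDING = object()  # marks a transfer unit whose read has not completed yet


class SequencedRelocator:
    """Hypothetical model: writes follow the order in which reads were issued."""

    def __init__(self):
        self._entries = OrderedDict()  # transfer unit id -> data or _PENDING
        self.writes = []               # write commands, in issue order

    def issue_read(self, tu_id):
        # Storing the entry at issue time records the read sequence.
        self._entries[tu_id] = _PENDING

    def complete_read(self, tu_id, data):
        self._entries[tu_id] = data
        # Issue writes only from the head, so a late read holds back younger ones.
        while self._entries:
            head_id, head_data = next(iter(self._entries.items()))
            if head_data is _PENDING:
                break
            self._entries.popitem(last=False)
            self.writes.append((head_id, head_data))


relocator = SequencedRelocator()
for tu in (0, 1, 2):
    relocator.issue_read(tu)
relocator.complete_read(2, "c")  # completes early, must wait for 0 and 1
relocator.complete_read(0, "a")
relocator.complete_read(1, "b")
print(relocator.writes)  # [(0, 'a'), (1, 'b'), (2, 'c')]
```

In this sketch the write for transfer unit 2 is held until the reads for units 0 and 1 have completed, so the write order matches the read sequence.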
Restoring speculative history used for making speculative predictions for instructions processed in a processor employing control independence techniques
Restoring speculative history used for making speculative predictions for instructions processed in a processor. The processor can be configured to speculatively predict an outcome of a condition or predicate of a conditional control instruction before its condition is fully evaluated in execution. Predictions are made by the processor based on a history that is updated based on outcomes of past predictions. If a conditional control instruction is mispredicted in execution, the processor can perform a misprediction recovery by stalling the instruction pipeline, flushing younger instructions in the instruction pipeline back to the mispredicted conditional control instruction, and then re-fetching instructions in the correct instruction flow path for execution. The processor can be configured to restore entries of the speculative history associated with younger control independent (CI) conditional control instructions, so that younger fetched instructions that follow non-re-fetched CI instructions in misprediction recovery will use a more accurate speculative history.
Inhibiting load instruction execution based on reserving a resource of a load and store queue but failing to reserve a resource of a store data queue
A calculation processing apparatus includes a decoder that decodes memory access instructions including a store instruction and a load instruction; a first queue that stores the decoded memory access instructions; a second queue that stores store data related to the store instruction; a storage circuit that stores target address information of the store instruction for which the first queue is reserved but the second queue is not reserved; and an inhibitor that inhibits execution of the load instruction when address information matching target address information of the load instruction is stored in the storage circuit when the load instruction is being processed. This configuration inhibits switching of the order of a store instruction and a load instruction.
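A minimal software model of the inhibit check described above might look as follows. The storage circuit is modeled as a counter keyed by address. The names `LoadInhibitor`, `store_reserved_first_queue_only`, `store_reserved_second_queue`, and `load_may_execute` are illustrative assumptions, not terms from the patent.

```python
from collections import Counter


class LoadInhibitor:
    """Hypothetical model of the storage circuit plus inhibitor."""

    def __init__(self):
        # Addresses of stores that hold a first-queue slot but no
        # second-queue (store data) slot yet. A counter handles
        # several such stores to the same address.
        self._waiting = Counter()

    def store_reserved_first_queue_only(self, addr):
        self._waiting[addr] += 1

    def store_reserved_second_queue(self, addr):
        if self._waiting[addr] > 0:
            self._waiting[addr] -= 1

    def load_may_execute(self, addr):
        # A matching address means an older store is still incomplete,
        # so the load is inhibited to keep store/load order.
        return self._waiting[addr] == 0


inhibitor = LoadInhibitor()
inhibitor.store_reserved_first_queue_only(0x40)
print(inhibitor.load_may_execute(0x40))  # False: load is inhibited
print(inhibitor.load_may_execute(0x80))  # True: no matching store
inhibitor.store_reserved_second_queue(0x40)
print(inhibitor.load_may_execute(0x40))  # True: store now has both queues
```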
Determining a restart point in out-of-order execution
There is provided a data processing apparatus comprising decode circuitry responsive to receipt of a block of instructions to generate control signals indicative of each of the block of instructions, and to analyse the block of instructions to detect a potential hazard instruction. The data processing apparatus is provided with decode circuitry to encode information indicative of a clean restart point into the control signals associated with the potential hazard instruction. The data processing apparatus is provided with data processing circuitry to perform out-of-order execution of at least some of the block of instructions, and control circuitry responsive to a determination, at execution of the potential hazard instruction, that data values used as operands for the potential hazard instruction have been modified by out-of-order execution of a subsequent instruction, to restart execution from the clean restart point and to flush held data values from the data processing circuitry.
Systems, methods, and apparatuses for heterogeneous computing
- Rajesh M. Sankaran
- Gilbert Neiger
- Narayan Ranganathan
- Stephen R. Van Doren
- Joseph Nuzman
- Niall D. McDonnell
- Michael A. O'Hanlon
- Lokpraveen B. Mosur
- Tracy Garrett Drysdale
- Eriko Nurvitadhi
- Asit K. Mishra
- Ganesh Venkatesh
- Deborah T. Marr
- Nicholas P. Carter
- Jonathan D. Pearce
- Edward T. Grochowski
- Richard J. Greco
- Robert Valentine
- Jesus Corbal
- Thomas D. Fletcher
- Dennis R. Bradford
- Dwight P. Manley
- Mark J. Charney
- Jeffrey J. Cook
- Paul Caprioli
- Koichi Yamada
- Kent D. Glossop
- David B. Sheffield
Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
Detecting execution hazards in offloaded operations
Detecting execution hazards in offloaded operations is disclosed. A second offload operation is compared to a first offload operation that precedes the second offload operation. It is determined whether the second offload operation creates an execution hazard on an offload target device based on the comparison of the second offload operation to the first offload operation. If the execution hazard is detected, an error handling operation may be performed. In some examples, the offload operations are processing-in-memory operations.
Execution elision of intermediate instruction by processor
A method for operation of a processor core is provided. First instruction data is consulted to determine whether a second instruction has execution data that matches the first instruction data. The first instruction data is from a first instruction. In response to determining that the second instruction has execution data that matches the first instruction data, prior data is copied into the second instruction. The first instruction depends on the prior data. After receiving an availability indication of the prior data, both the first instruction and the second instruction are woken for execution, without requiring execution of the first instruction before waking of the second instruction. The second instruction is executed by using the prior data as a skip of the first instruction. A computer system and a processor core configured to operate according to the method are also disclosed herein.
Adjusting store gather window duration in a data processing system supporting simultaneous multithreading
In at least some embodiments, a store-type operation is received and buffered within a store queue entry of a store queue associated with a cache memory of a processor core capable of executing multiple simultaneous hardware threads. A thread identifier indicating a particular hardware thread among the multiple hardware threads that issued the store-type operation is recorded. An indication of whether the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the hardware thread is also maintained. While the indication indicates the store queue entry is a most recently allocated store queue entry for buffering store-type operations of the particular hardware thread, the store queue extends a duration of a store gathering window applicable to the store queue entry. For example, the duration may be extended by decreasing a rate at which the store gathering window applicable to the store queue entry ends.
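The example in the last sentence, extending the window by slowing the rate at which it ends, can be sketched as a per-entry countdown. This is only an illustration: the window length of 4, the `slow_divisor` of 2, and all class and field names are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class StoreQueueEntry:
    thread_id: int
    remaining: int                  # gather-window budget left, in ticks
    latest_for_thread: bool = True  # most recently allocated for its thread


class StoreQueue:
    """Hypothetical model of a store queue with per-entry gather windows."""

    def __init__(self, window=4, slow_divisor=2):
        self.window = window
        self.slow_divisor = slow_divisor
        self.entries = []
        self.cycle = 0

    def allocate(self, thread_id):
        # A new entry replaces the thread's previous "most recent" entry.
        for entry in self.entries:
            if entry.thread_id == thread_id:
                entry.latest_for_thread = False
        entry = StoreQueueEntry(thread_id, self.window)
        self.entries.append(entry)
        return entry

    def tick(self):
        self.cycle += 1
        for entry in self.entries:
            if entry.remaining == 0:
                continue  # window already closed
            if entry.latest_for_thread and self.cycle % self.slow_divisor != 0:
                continue  # most recent entry counts down at a reduced rate
            entry.remaining -= 1


queue = StoreQueue()
older = queue.allocate(thread_id=0)
newer = queue.allocate(thread_id=0)  # 'older' is no longer the latest
other = queue.allocate(thread_id=1)
for _ in range(4):
    queue.tick()
print(older.remaining, newer.remaining, other.remaining)  # 0 2 2
```

After four ticks the older entry's window has closed, while the most recent entry of each thread still has half of its budget left and can keep gathering stores.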
Processors employing memory data bypassing in memory data dependent instructions as a store data forwarding mechanism, and related methods
Processors employing memory bypassing in memory data dependent instructions as a store data forwarding mechanism, and related methods. To reduce stalls of memory data dependent, load-based instructions, a memory data dependency detection circuit is configured to detect a memory hazard between a store-based instruction and a load-based instruction based on their opcodes and destination/source operands. Some store-based and load-based instructions have opcodes identifying these instructions as having respective store and load address operand types that can be compared without resolution of their respective store and load addresses. For these detected types of instructions, the memory data dependency detection circuit is configured to determine if a source operand of a load-based instruction matches a target operand of a store-based instruction to detect a memory hazard earlier in the instruction pipeline. Identifying memory hazards earlier in an instruction pipeline can allow memory dependent instructions to be processed with avoided or reduced stalls.
Apparatus and method for handling memory load requests
When load requests are generated to support data processing operations, the load requests are buffered in pending load buffer circuitry prior to being carried out. Coalescing circuitry determines for a first load request whether a set of one or more subsequent load requests buffered in the pending load buffer circuitry satisfies an address proximity condition. The address proximity condition is satisfied when all data items identified by the set of one or more subsequent load requests are comprised within a series of data items which will be retrieved from the memory system in response to the first load request. When the address proximity condition is satisfied, forwarding of the set of one or more subsequent load requests is suppressed.
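The address proximity condition can be sketched in a few lines. This version assumes that the "series of data items" retrieved for a load is one aligned 64-byte block and that requests are plain byte addresses. Both are assumptions made for the example, and the `coalesce` function is hypothetical.

```python
def coalesce(pending, line_bytes=64):
    """Split buffered load addresses into forwarded and suppressed requests.

    A later request is suppressed when an earlier forwarded request will
    already retrieve the aligned block that contains its address.
    """
    forwarded, suppressed = [], []
    covered_lines = set()
    for addr in pending:
        line = addr // line_bytes
        if line in covered_lines:
            suppressed.append(addr)  # data arrives with the earlier request
        else:
            covered_lines.add(line)
            forwarded.append(addr)
    return forwarded, suppressed


forwarded, suppressed = coalesce([0x100, 0x108, 0x200, 0x13F, 0x140])
print([hex(a) for a in forwarded])   # ['0x100', '0x200', '0x140']
print([hex(a) for a in suppressed])  # ['0x108', '0x13f']
```

Here 0x108 and 0x13F fall in the same 64-byte block as 0x100, so only the first request for that block is forwarded to the memory system.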