Patent classifications
G06F9/3858
EVICTING AND RESTORING INFORMATION USING A SINGLE PORT OF A LOGICAL REGISTER MAPPER AND HISTORY BUFFER IN A MICROPROCESSOR COMPRISING MULTIPLE MAIN REGISTER FILE ENTRIES MAPPED TO ONE ACCUMULATOR REGISTER FILE ENTRY
A computer system, processor, programming instructions and/or method of processing data that includes a main register file having a plurality of entries for storing data; an accumulator register file having a plurality of entries for storing data wherein multiple main register file entries are mapped to one accumulator register file entry in the at least one accumulator register file; a logical register mapper to track and map logical registers to main register file entries, and a history buffer. Processing wide data width instructions includes evicting and restoring information from a single primary entry in the logical register mapper through a single read or write port in the logical register mapper without evicting or restoring the remaining other multiple main register file entries mapped in the accumulator register.
PHYSICAL ADDRESS PROXY REUSE MANAGEMENT
Each load/store queue entry holds a load/store physical address proxy (PAP) for use as a proxy for a load/store physical memory line address (PMLA). The load/store PAP comprises a set index and a way that uniquely identifies an L2 cache entry holding a memory line at the load/store PMLA when an L1 cache provides the load/store PAP during the load/store instruction execution. The microprocessor removes a line at a removal PMLA from an L2 entry, forms a removal PAP as a proxy for the removal PMLA that comprises a set index and a way, snoops the load/store queue with the removal PAP to determine whether the removal PAP is being used as a proxy for the removal PMLA, fills the removed entry with a line at a fill PMLA, and prevents the removal PAP from being used as a proxy for the removal PMLA and the fill PMLA concurrently.
MICROPROCESSOR THAT PREVENTS SAME ADDRESS LOAD-LOAD ORDERING VIOLATIONS
A microprocessor prevents same address load-load ordering violations. Each load queue entry holds a load physical memory line address (PMLA) and an indication of whether a load instruction has completed execution. The microprocessor fills a line specified by a fill PMLA into a cache entry and snoops the load queue with the fill PMLA, either before the fill or in an atomic manner with the fill with respect to ability of the filled entry to be hit upon by any load instruction, to determine whether the fill PMLA matches load PMLAs in load queue entries associated with load instructions that have completed execution and there are other load instructions in the load queue that have not completed execution. The microprocessor, if the condition is true, flushes at least the other load instructions in the load queue that have not completed execution.
User mode event handling
A method includes asserting a field of an event flag mask register configured to inhibit an event handler. The method also includes, responsive to an event that corresponds to the field of the event flag mask register being triggered: asserting a field of an event flag register associated with the event; and based the field in the event flag register being asserted, taking an action by a task being executed by the data processor core.
GATHERING PAYLOAD FROM ARBITRARY REGISTERS FOR SEND MESSAGES IN A GRAPHICS ENVIRONMENT
An apparatus to facilitate gathering payload from arbitrary registers for send messages in a graphics environment is disclosed. The apparatus includes processing resources comprising execution circuitry to receive a send gather message instruction identifying a number of registers to access for a send message and identifying IDs of a plurality of individual registers corresponding to the number of registers; decode a first phase of the send gather message instruction; based on decoding the first phase, cause a second phase of the send gather message instruction to bypass an instruction decode stage; and dispatch the first phase subsequently followed by dispatch of the second phase to a send pipeline. The apparatus can also perform an immediate move of the IDs of the plurality of individual registers to an architectural register of the execution circuitry and include a pointer to the architectural register in the send gather message instruction.
Methods and systems for utilizing a master-shadow physical register file based on verified activation
A processor in a data processing system includes a master-shadow physical register file and a renaming unit. The master-shadow physical register file has a master storage coupled to shadow storage. The renaming unit is coupled to the master-shadow physical register file. Based on an occurrence of shadow transfer activation conditions verified by the renaming unit, data in the master storage is transferred from the master storage to the shadow storage for storage. Data is transferred from the shadow storage back to the master storage based on the occurrence of a shadow-to-master transfer event, which includes, for example, a flush of the master storage by the processor.
Ordered event stream event retention
Retention of events of an ordered event stream is disclosed. Expiration of events stored in a segment of an ordered event stream (OES) can be desirable. New events are added to a head of an OES segment, and pruning events from a tail of the OES segment can be valuable. Processing applications can register a processing scheme for a segment, e.g., at-least-once processing, exactly-once processing, etc., and can generate checkpoints indicating a degree of advancement in processing events of the segment. The ordered event stream can determine a cut point indicative of a progress point, that before which, events of an OES can be marked as ready for expiration. However, events that are marked for expiration can be retained to allow processing based on a checkpoint, e.g., expiration of the event can be refused until there is an assurance the event was read by the processing application.
ROUTING INSTRUCTION RESULTS TO A REGISTER BLOCK OF A SUBDIVIDED REGISTER FILE BASED ON REGISTER BLOCK UTILIZATION RATE
A system, processor, programming product and/or method for assigning instructions to destination register file blocks, and/or routing instructions, includes: providing a processing pipeline having two or more execution units configured to process instructions; providing a register file having register file entries configured to hold data, where the register file is subdivided into a plurality of register blocks and each register block has two or more register file entries; calculating a utilization rate for one or more register blocks; and assigning and/or routing an instruction to write its results to a register block based upon the utilization rate for that register block. Preferably the execution unit is configured to write its results to a single specific destination (rename) register block.
DEPENDENCY SKIPPING EXECUTION
A computer processor includes a dispatch stage and a dependency skipping execution unit. The dispatch stage is configured to dispatch a plurality of instructions that include a general purpose instruction configured to produce first data, a dependent instruction configured to produce second data, and an indirect dependent instruction configured to produce third data. The dependency skipping execution unit is configured to monitor the plurality of instructions and to process the indirect dependent instruction in response to the general purpose instruction producing the first data. The indirect dependent instruction is issued independently from the second data produced by the indirect dependent instruction.
System and method of VLIW instruction processing using reduced-width VLIW processor
Very long instruction word (VLIW) instruction processing using a reduced-width processor is disclosed. In a particular embodiment, a VLIW processor includes a control circuit configured to receive a VLIW packet that includes a first number of instructions and to distribute the instructions to a second number of instruction execution paths. The first number is greater than the second number. The VLIW processor also includes physical registers configured to store results of executing the instructions and a register renaming circuit that is coupled to the control circuit.