Patent classifications
G06F9/3828
System and method of populating an instruction word
A methodology for populating an instruction word for simultaneous execution of instruction operations by a plurality of ALUs in a data path is provided. The methodology includes: creating a dependency graph of instruction nodes, each instruction node including at least one instruction operation; first selecting a first available instruction node from the dependency graph; first assigning the selected first available instruction node to the instruction word; second selecting any available dependent instruction nodes that are dependent upon a result of the selected first available instruction node and do not violate any predetermined rule; second assigning to the instruction word the selected any available dependent instruction nodes; and updating the dependency graph to remove any instruction nodes assigned during the first and second assigning from further consideration for assignment.
Block-based processor core composition register
Systems, apparatuses, and methods related to a block-based processor core composition register are disclosed. In one example of the disclosed technology, a processor can include a plurality of block-based processor cores for executing a program including a plurality of instruction blocks. A respective block-based processor core can include one or more sharable resources and a programmable composition control register. The programmable composition control register can be used to configure which resources of the one or more sharable resources are shared with other processor cores of the plurality of processor cores.
Advanced processor architecture
The invention relates to a method for processing instructions out-of-order on a processor comprising an arrangement of execution units. The inventive method comprises looking up operand sources in a Register Positioning Table and setting operand input references of the instruction to be issued accordingly, checking for an Execution Unit (EXU) available for receiving a new instruction, and issuing the instruction to the available Execution Unit and entering a reference of the result register addressed by the instruction to be issued to the Execution Unit into the Register Positioning Table (RPT).
System, apparatus and method for dynamic pipeline stage control of data path dominant circuitry of an integrated circuit
In an embodiment, a data path circuit includes: a plurality of pipeline stages coupled between an input of the data path circuit and an output of the data path circuit; and a first selection circuit coupled between a first pipeline stage and a second pipeline stage, the first selection circuit having a first input to receive an input to the first pipeline stage and a second input to receive an output of the first pipeline stage and controllable to output one of the input to the first pipeline stage and the output of the first pipeline stage. A bypass controller coupled to the data path circuit may control the first selection circuit based at least in part on an operating frequency of the data path circuit. Other embodiments are described and claimed.
QUIESCE RECONFIGURABLE DATA PROCESSOR
A reconfigurable data processor comprises an array of configurable units configurable to allocate a plurality of sets of configurable units in the array to implement respective execution fragments of the data processing operation. Quiesce logic is coupled to configurable units in the array, configurable to respond to a quiesce control signal to quiesce the sets of configurable units in the array on quiesce boundaries of the respective execution fragments, and to forward quiesce ready signals for the respective execution fragments when the corresponding sets of processing units are ready. An array quiesce controller distributes the quiesce control signal to configurable units in the array, and receives quiesce ready signals for the respective execution fragments from the quiesce logic.
SIMD operand permutation with selection from among multiple registers
Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline. In some embodiments, the routing circuitry may support a shift and fill instruction that facilitates storage of an arbitrary portion of a graphics frame in one or more registers.
SYSTEM, APPARATUS AND METHOD FOR DYNAMIC PIPELINE STAGE CONTROL OF DATA PATH DOMINANT CIRCUITRY OF AN INTEGRATED CIRCUIT
In an embodiment, a data path circuit includes: a plurality of pipeline stages coupled between an input of the data path circuit and an output of the data path circuit; and a first selection circuit coupled between a first pipeline stage and a second pipeline stage, the first selection circuit having a first input to receive an input to the first pipeline stage and a second input to receive an output of the first pipeline stage and controllable to output one of the input to the first pipeline stage and the output of the first pipeline stage. A bypass controller coupled to the data path circuit may control the first selection circuit based at least in part on an operating frequency of the data path circuit. Other embodiments are described and claimed.
Distinct system registers for logical processors
Distinct system registers for logical processors are disclosed. In one example of the disclosed technology, a processor includes a plurality of block-based physical processor cores for executing a program comprising a plurality of instruction blocks. The processor also includes a thread scheduler configured to schedule a thread of the program for execution, the thread using the one or more instruction blocks. The processor further includes at least one system register. The at least one system register stores data indicating a number and placement of the plurality of physical processor cores to form a logical processor. The logical processor executes the scheduled thread. The logical processor is configured to execute the thread in a continuous instruction window.
SIMD Operand Permutation with Selection from among Multiple Registers
Techniques are disclosed relating to operand routing among SIMD pipelines. In some embodiments, an apparatus includes a set of multiple hardware pipelines configured to execute a single-instruction multiple-data (SIMD) instruction for multiple threads in parallel, wherein the instruction specifies first and second architectural registers. In some embodiments, the pipelines include execution circuitry configured to perform operations using one or more pipeline stages of the pipeline. In some embodiments, the pipelines include routing circuitry configured to select, based on the instruction, a first input operand for the execution circuitry from among: a value from the first architectural register from thread-specific storage for another pipeline and a value from the second architectural register from thread-specific storage for a thread assigned to another pipeline. In some embodiments, the routing circuitry may support a shift and fill instruction that facilitates storage of an arbitrary portion of a graphics frame in one or more registers.
Datapath Circuitry for Math Operations using SIMD Pipelines
Techniques are disclosed relating to sharing operands among SIMD threads for a larger arithmetic operation. In some embodiments, a set of multiple hardware pipelines is configured to execute single-instruction multiple-data (SIMD) instructions for multiple threads in parallel, where ones of the hardware pipelines include execution circuitry configured to perform floating-point operations using one or more pipeline stages of the pipeline and first routing circuitry configured to select, from among thread-specific operands stored for the hardware pipeline and from one or more other pipelines in the set, a first input operand for an operation by the execution circuitry. In some embodiments, a device is configured to perform a mathematical operation on source input data structures stored across thread-specific storage for the set of hardware pipelines, by executing multiple SIMD floating-point operations using the execution circuitry and the first routing circuitry. This may improve performance and reduce power consumption for matrix multiply and reduction operations, for example.