G06F2209/507

RECONFIGURABLE SYSTEM-ON-CHIP AND RELATED METHODS

A circuit includes combinational circuit and sequential circuit elements coupled thereto. The circuit includes a multiplexor coupled to the combinational and sequential circuit elements, and a system register is coupled to the multiplexor. At least one portion of the combinational and sequential circuit elements is configured to selectively switch to operate as a random access memory.

System and method for out-of-order resource allocation and deallocation in a threaded machine
09690625 · 2017-06-27 · ·

A system and method for managing the dynamic sharing of processor resources between threads in a multi-threaded processor are disclosed. Out-of-order allocation and deallocation may be employed to efficiently use the various resources of the processor. Each element of an allocate vector may indicate whether a corresponding resource is available for allocation. A search of the allocate vector may be performed to identify resources available for allocation. Upon allocation of a resource, a thread identifier associated with the thread to which the resource is allocated may be associated with the allocate vector entry corresponding to the allocated resource. Multiple instances of a particular resource type may be allocated or deallocated in a single processor execution cycle. Each element of a deallocate vector may indicate whether a corresponding resource is ready for deallocation. Examples of resources that may be dynamically shared between threads are reorder buffers, load buffers and store buffers.

Multiple-Patch SIMD Dispatch Mode for Domain Shaders

To use SIMD lanes efficiently for domain shader execution, domain point data from different domain shader patches may be packed together into a single SIMD thread. To generate an efficient code sequence, each domain point occupies one SIMD lane and all attributes for the domain point reside in their own partition of General Register File (GRF) space. This technique is called the multiple-patch SIMD dispatch mode.

TASK EXECUTION IN A SIMD PROCESSING UNIT WITH PARALLEL GROUPS OF PROCESSING LANES
20250061536 · 2025-02-20 ·

A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.

Selecting Processor Micro-Threading Mode
20170132049 · 2017-05-11 ·

An approach is provided to dynamically select a micro-threading (MT) mode of each core of a processor based on a load on each of the respective cores while the processor is running a hypervisor. The approach sets a core's micro-threading mode to a whole-core mode (MT1) in response to identifying that the load on the selected core is at a light load level, sets the core's micro-threading mode to a two-way micro-threading mode (MT2) in response to identifying that the load on the selected core has increased above the light load level, and sets the selected core's micro-threading mode to a four-way micro-threading mode (MT4) in response to identifying that the load on the selected core is at a high load level.

Selecting Processor Micro-Threading Mode
20170132050 · 2017-05-11 ·

An approach is provided to dynamically select a micro-threading (MT) mode of each core of a processor based on a load on each of the respective cores while the processor is running a hypervisor. The approach sets a core's micro-threading mode to a whole-core mode (MT1) in response to identifying that the load on the selected core is at a light load level, sets the core's micro-threading mode to a two-way micro-threading mode (MT2) in response to identifying that the load on the selected core has increased above the light load level, and sets the selected core's micro-threading mode to a four-way micro-threading mode (MT4) in response to identifying that the load on the selected core is at a high load level.

System and Method for Hardware Multithreading to Improve VLIW DSP Performance and Efficiency
20170132003 · 2017-05-11 ·

A system and method of hardware multithreading in VLIW DSPs includes an instruction fetch and dispatch unit, a plurality of program control units coupled to the instruction fetch and dispatch unit, a plurality of function units coupled to the plurality of program control units, and a mode control unit coupled to the function units and the program control units, the mode control unit configured to dynamically organize the plurality of function units and the plurality of program control units into one or more threads, each thread comprising a program control of the plurality of program control units and a subset of the plurality of function units.

Intelligent bandwidth shifting mechanism

In an approach for sharing memory bandwidth in one or more processors, a processor receives a first set of monitored usage information for one or more processors executing one or more threads. A processor calculates impact of hardware data prefetching for each thread of the one or more threads, based on the first set of monitored usage information. A processor adjusts prefetch settings for the one or more threads, based on the calculated impact of hardware data prefetching for each thread of the one or more threads.

Resource Sharing Using Process Delay
20170123853 · 2017-05-04 ·

Methods and systems that reduce the number of instance of a shared resource needed for a processor to perform an operation and/or execute a process without impacting function are provided. a method of processing in a processor is provided. Aspects include determining that an operation to be performed by the processor will require the use of a shared resource. A command can be issued to cause a second operation to not use the shared resources N cycles later. The shared resource can then be used for a first aspect of the operation at cycle X and then used for a second aspect of the operation at cycle X+N. The second operation may be rescheduled according to embodiments.

Reconfigurable system-on-chip and related methods

A circuit includes combinational circuit and sequential circuit elements coupled thereto. The circuit includes a multiplexor coupled to the combinational and sequential circuit elements, and a system register is coupled to the multiplexor. At least one portion of the combinational and sequential circuit elements is configured to selectively switch to operate as a random access memory.