Patent classifications
G06F2209/507
APPARATUS, METHOD, AND SYSTEM FOR ENSURING QUALITY OF SERVICE FOR MULTI-THREADING PROCESSOR CORES
A simultaneous multi-threading (SMT) processor core capable of thread-based biasing with respect to execution resources. The SMT processor includes priority controller circuitry to determine a thread priority value for each of a plurality of threads to be executed by the SMT processor core and to generate a priority vector comprising the thread priority value of each of the plurality of threads. The SMT processor further includes thread selector circuitry to make execution cycle assignments of a pipeline by assigning to each of the plurality of threads a portion of the pipeline's execution cycles based on each thread's priority value in the priority vector. The thread selector circuitry is further to select, from the plurality of threads, tasks to be processed by the pipeline based on the execution cycle assignments.
TASK EXECUTION IN A SIMD PROCESSING UNIT WITH PARALLEL GROUPS OF PROCESSING LANES
A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
Allocation of Memory
Methods of memory allocation map registers referenced by different groups of instances of the same task to individual logical memories. Other example methods describe the mapping of registers referenced by a task to different banks within a single logical memory and in various examples this mapping may take into consideration which bank is likely to be the dominant bank for the particular task and the allocation for one or more other tasks.
Allocation of Memory
Methods of memory allocation in which registers referenced by different groups of instances of the same task are mapped to individual logical memories. Other example methods describe the mapping of registers referenced by a task to different banks within a single logical memory and in various examples this mapping may take into consideration which bank is likely to be the dominant bank for the particular task and the allocation for one or more other tasks.
Task execution in a SIMD processing unit with parallel groups of processing lanes
A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
MEMORY FRAGMENTS FOR SUPPORTING CODE BLOCK EXECUTION BY USING VIRTUAL CORES INSTANTIATED BY PARTITIONABLE ENGINES
A global front end scheduler to schedule instruction sequences to a plurality of virtual cores implemented via a plurality of partitionable engines. The global front end scheduler includes a thread allocation array to store a set of allocation thread pointers to point to a set of buckets in a bucket buffer in which execution blocks for respective threads are placed, a bucket buffer to provide a matrix of buckets, the bucket buffer including storage for the execution blocks, and a bucket retirement array to store a set of retirement thread pointers that track a next execution block to retire for a thread.
Providing interrupt service routine (ISR) prefetching in multicore processor-based systems
Providing interrupt service routine (ISR) prefetching in multicore processor-based systems is disclosed. In one aspect, a multicore processor-based system provides an ISR prefetch control circuit communicatively coupled to an interrupt controller and a plurality of instruction fetch units (IFUs) of a corresponding plurality of processor elements (PEs). Upon receiving an interrupt directed to a target PE of the plurality of PEs, the interrupt controller provides an interrupt request (IRQ) identifier to the ISR prefetch control circuit. Based on the IRQ identifier, the ISR prefetch control circuit fetches an ISR pointer to an ISR corresponding to the IRQ identifier. The ISR prefetch control circuit next selects a prefetch PE of the plurality of PEs to perform a prefetch operation to retrieve the ISR on behalf of the target PE, and provides an ISR prefetch request, including the ISR pointer, to an IFU of the prefetch PE.
Apparatus, system and method for proxy coupling management
An apparatus, system, and method are disclosed for proxy coupling management. A proxy template module transparently offloads a task from a native data processing system to an equivalent task on a remote data processing system. A proxy generation module fills in the proxy template with information, such as specification of an offload task that is a remote equivalent of the native task, to generate a proxy of the native task. An offload agent module receives a request from the proxy to perform the offload task on the remote data processing system.
Multi-thread processor and controlling method thereof
A multi-thread processor and a method of controlling a multi-thread processor are provided. The multi-thread processor includes at least one functional unit; a mode register; and a controller configured to control the mode register to store thread mode information corresponding to a task to be processed among a plurality of thread modes, wherein the plurality of thread modes are divided based on a size and a number of at least one thread that is concurrently processed in one of the at least one functional unit, allocate at least one thread included in the task to the at least one functional unit based on the thread mode information stored in the mode register and control the at least one functional unit to process the at least one thread.
Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines
A global front end scheduler to schedule instruction sequences to a plurality of virtual cores implemented via a plurality of partitionable engines. The global front end scheduler includes a thread allocation array to store a set of allocation thread pointers to point to a set of buckets in a bucket buffer in which execution blocks for respective threads are placed, a bucket buffer to provide a matrix of buckets, the bucket buffer including storage for the execution blocks, and a bucket retirement array to store a set of retirement thread pointers that track a next execution block to retire for a thread.