G06F2209/521

Low latency and highly programmable interrupt controller unit

A graph processing core includes a plurality of processing pipelines and an interrupt controller unit. Each processing pipeline executes one or more threads and includes, for each thread, a register indicating a currently executing program counter vector and another register indicating an interrupt or exception handler vector. The interrupt controller unit may receive interrupt or exception notifications from the processing pipelines, determine a handler vector based on the notification and a set of registers of the interrupt controller unit, and transmit the handler vector to the processing pipeline that issued the interrupt or exception notification. Further, the issuing pipeline may receive the handler vector from the interrupt controller unit, write a value in the first register into the second register, write the handler vector into the first register, and invoke an interrupt or exception hander based on the value written into the first register.

HARDWARE ACCELERATED SYNCHRONIZATION WITH ASYNCHRONOUS TRANSACTION SUPPORT

A new transaction barrier synchronization primitive enables executing threads and asynchronous transactions to synchronize across parallel processors. The asynchronous transactions may include transactions resulting from, for example, hardware data movement units such as direct memory units, etc. A hardware synchronization circuit may provide for the synchronization primitive to be stored in a cache memory so that barrier operations may be accelerated by the circuit. A new wait mechanism reduces software overhead associated with waiting on a barrier.