G06F9/462

SCHEDULING TASKS IN A MULTI-THREADED PROCESSOR
20190121668 · 2019-04-25

A processor comprising: an execution unit for executing a respective thread in each of a repeating sequence of time slots; and a plurality of context register sets, each comprising a respective set of registers for representing a state of a respective thread. The context register sets comprise a respective worker context register set for each of the number of time slots the execution unit is operable to interleave, and at least one extra context register set. The worker context register sets represent the respective states of worker threads, and the extra context register set represents the state of a supervisor thread. The processor is configured to begin running the supervisor thread in each of the time slots, and to enable the supervisor thread to then individually relinquish each of the time slots in which it is running to a respective one of the worker threads.
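
The slot hand-off described in this abstract can be sketched as a toy model. This is only an illustration under stated assumptions: the class `BarrelScheduler`, its methods, and the worker names are hypothetical and do not come from the patent, and real hardware would hold register state rather than thread names.

```python
from dataclasses import dataclass, field

SUPERVISOR = "supervisor"  # hypothetical label for the supervisor thread


@dataclass
class BarrelScheduler:
    """Toy model: one worker context per time slot plus one supervisor context."""
    num_slots: int
    slots: list = field(init=False)

    def __post_init__(self):
        # The supervisor begins by running in every time slot.
        self.slots = [SUPERVISOR] * self.num_slots

    def relinquish(self, slot, worker):
        """The supervisor hands one of the slots it holds to a worker thread."""
        if self.slots[slot] != SUPERVISOR:
            raise ValueError(f"slot {slot} is not held by the supervisor")
        self.slots[slot] = worker

    def run_round(self):
        """One pass of the repeating sequence: the thread issued in each slot."""
        return list(self.slots)


sched = BarrelScheduler(num_slots=4)
initial_round = sched.run_round()
sched.relinquish(1, "worker_a")
sched.relinquish(3, "worker_b")
later_round = sched.run_round()
```

After the two hand-offs, slots 0 and 2 still run the supervisor while slots 1 and 3 run workers, which is the "individually relinquish" behaviour the abstract describes.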

APPARATUS AND METHOD FOR PROCESSING THREAD GROUPS

An apparatus and method are provided for processing thread groups, where each thread group has associated program code and comprises one or more threads. Scheduling circuitry is used to select thread groups from a plurality of thread groups, and then thread processing circuitry is responsive to the scheduling circuitry to process one or more threads of a selected thread group by executing instructions of the associated program code. The associated program code comprises a plurality of regions that each require access to an associated plurality of registers providing operand values for the instructions of that region. An operand staging unit is provided that has a plurality of storage elements that are dynamically allocated to provide the associated plurality of registers for one or more of the regions. Capacity management circuitry is arranged, for a thread group having a region of the associated program code that is ready to be executed, to perform an operand setup process to reserve sufficient storage elements within the operand staging unit to provide the associated plurality of registers, and to cause the operand value for any input register to be preloaded into a reserved storage element allocated for that input register, an input register being a register whose operand value is required before the region can be executed. The scheduling circuitry selects a thread group for which the capacity management circuitry has performed the operand setup process in respect of the region to be executed, and the thread processing circuitry then executes the instructions of the region of the selected thread group with reference to the registers as provided by the operand staging unit. This provides an area- and energy-efficient mechanism for providing the required registers.
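
The operand setup process and the scheduler's readiness check can be sketched as follows. This is a minimal illustration, assuming a simple counter of free storage elements; `OperandStagingUnit`, `setup`, and `select_ready` are hypothetical names, and the patent does not specify this bookkeeping.

```python
class OperandStagingUnit:
    """Toy pool of storage elements that are allocated per region on demand."""

    def __init__(self, capacity):
        self.free = capacity
        self.reserved = {}  # thread group id -> {register name: value or None}

    def setup(self, group_id, registers, inputs):
        """Reserve one element per register and preload the input registers.

        Returns False (and reserves nothing) when capacity is insufficient.
        """
        if len(registers) > self.free:
            return False
        self.free -= len(registers)
        # Input registers get their operand value; the others start empty.
        self.reserved[group_id] = {r: inputs.get(r) for r in registers}
        return True


def select_ready(group_ids, staging):
    """Pick the first thread group whose operand setup has completed."""
    for group_id in group_ids:
        if group_id in staging.reserved:
            return group_id
    return None


staging = OperandStagingUnit(capacity=4)
g0_ok = staging.setup("g0", ["r0", "r1", "r2"], {"r0": 7})
g1_ok = staging.setup("g1", ["r0", "r1"], {"r0": 1})  # only 1 element left
chosen = select_ready(["g1", "g0"], staging)
```

Here `g1` cannot be set up because too few elements remain, so the scheduler skips it and selects `g0`.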

APPLICATION RESTORE TIME FROM CLOUD GATEWAY OPTIMIZATION USING STORLETS
20190095244 · 2019-03-28

A method, computer system, and a computer program product for designing and executing at least one storlet is provided. The present invention may include receiving a plurality of restore operations based on a plurality of data. The present invention may also include identifying a plurality of blocks corresponding to the received plurality of restore operations from the plurality of data. The present invention may then include identifying a plurality of grain packs corresponding with the identified plurality of blocks. The present invention may further include generating a plurality of grain pack index identifications corresponding with the identified plurality of grain packs. The present invention may also include generating at least one storlet based on the generated plurality of grain pack index identifications. The present invention may then include returning a plurality of consolidated objects by executing the generated storlet.

Data processing apparatus and method using secure domain and less secure domain

A data processing apparatus has processing circuitry which has a secure domain and a less secure domain of operation. When operating in the secure domain the processing circuitry has access to data that is not accessible in the less secure domain. In response to a control flow altering instruction, processing switches to a program instruction at a target address. Domain selection is performed to determine a selected domain in which the processing circuitry is to operate for the instruction at the target address. Domain checking can be performed to check which domains are allowed to be the selected domain determined in the domain selection. A domain check error is triggered if the selected domain in the domain selection is not an allowed selected domain.
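
The selection-then-check sequence can be sketched as below. The abstract does not say how the domain is selected; choosing it from the address range of the target is an assumption made for this example, and `select_domain`, `take_branch`, and `DomainCheckError` are hypothetical names.

```python
SECURE = "secure"
LESS_SECURE = "less_secure"


class DomainCheckError(Exception):
    """Raised when the selected domain is not an allowed selected domain."""


def select_domain(target_address, secure_ranges):
    """Assumed selection rule: secure if the target lies in a secure range."""
    for start, end in secure_ranges:
        if start <= target_address < end:
            return SECURE
    return LESS_SECURE


def take_branch(target_address, allowed_domains, secure_ranges):
    """Domain selection followed by domain checking for one branch."""
    selected = select_domain(target_address, secure_ranges)
    if selected not in allowed_domains:
        raise DomainCheckError(
            f"{selected} not allowed at target {target_address:#x}"
        )
    return selected


SECURE_RANGES = [(0x1000, 0x2000)]  # illustrative address map
ok_domain = take_branch(0x1800, {SECURE}, SECURE_RANGES)
try:
    take_branch(0x3000, {SECURE}, SECURE_RANGES)
    error_raised = False
except DomainCheckError:
    error_raised = True
```

The second branch targets an address outside the secure range while only the secure domain is allowed, so the check fails and the error is raised.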

Maintaining secure data isolated from non-secure access when switching between domains

A data processing apparatus includes circuitry for performing data processing, a plurality of registers, and a data store including regions having different secure levels: at least one secure region (for storing sensitive data accessible by the data processing circuitry operating in the secure domain and not accessible by the data processing circuitry operating in a less secure domain) and a less secure region (for storing less secure data). The circuitry is configured to determine which stack to store data to, or load data from, in response to the storage location of the program code being executed. In response to program code stored in a first region calling a function to be executed, the function code being stored in a second region, the second region having a different secure level to the first region, the data processing circuitry is configured to determine which of the first and second regions has a lower secure level.

Context switching method and system for swapping contexts between register sets based on thread halt
12073221 · 2024-08-27

A context switching system includes a processor and a scheduler. The processor is configured to execute a first thread. A first context associated with the first thread is stored in a register set of the processor. While the first thread is being executed, the scheduler is configured to select a second thread from a set of threads, and receive and store a second context associated with the second thread in a register set of the scheduler. The second thread is to be scheduled for execution after the first thread. The scheduler is further configured to swap the first and second contexts when the execution of the first thread is halted, thereby executing the context switching. Further, the processor is configured to execute the second thread based on the second context. While the second thread is being executed, the first context is stored in the data memory.
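
The preload, swap, and spill steps can be sketched as follows. This is a simplified model under stated assumptions: the `Context` fields, the method names, and the use of a dictionary as the data memory are all hypothetical and stand in for hardware register sets.

```python
from dataclasses import dataclass


@dataclass
class Context:
    thread_id: str
    pc: int  # illustrative field; a real context holds the full register state


class Processor:
    def __init__(self, context):
        self.register_set = context  # context of the thread being executed


class Scheduler:
    def __init__(self):
        self.register_set = None

    def preload(self, context):
        """While the first thread runs, stage the next thread's context."""
        self.register_set = context

    def swap_on_halt(self, processor):
        """Swap the two register sets when the running thread halts."""
        processor.register_set, self.register_set = (
            self.register_set,
            processor.register_set,
        )

    def spill(self, data_memory):
        """While the new thread runs, store the old context to data memory."""
        ctx = self.register_set
        data_memory[ctx.thread_id] = ctx
        self.register_set = None


data_memory = {}
processor = Processor(Context("T1", pc=0x40))
scheduler = Scheduler()
scheduler.preload(Context("T2", pc=0x80))
scheduler.swap_on_halt(processor)  # T1 halts, T2 becomes current
scheduler.spill(data_memory)       # T1's context goes to memory
```

Because the second context is already staged when the first thread halts, the switch itself reduces to a register-set swap, and the slower memory write happens off the critical path.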

ALLOCATION OF RESOURCES TO TASKS
20240273804 · 2024-08-15

A method of managing resources in a graphics processing pipeline includes, in response to selecting a task for execution within a texture/shading unit, allocating to the task both a static allocation of temporary registers for the entire task and a dynamic allocation of temporary registers. The dynamic allocation comprises temporary registers used by a first phase of the task only and the static allocation of temporary registers comprises any temporary registers that are used by the program and are live at a boundary between two phases. When the task subsequently reaches a boundary between two phases, the dynamic allocation of temporary registers is freed and a new dynamic allocation of temporary registers for a next phase of the task is allocated to the task.
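
The split between static and dynamic allocations can be sketched as below. The liveness rule used here is an assumption for illustration: a register is treated as live at a boundary if it is used in some phase before the boundary and some phase after it. The function name `split_allocations` and the register names are hypothetical.

```python
def split_allocations(phase_uses):
    """Split temporaries into one static set and one dynamic set per phase.

    phase_uses: list of sets, the temporaries each phase touches.
    Assumed liveness rule: a register is live at a boundary when it is
    used both before and after that boundary.
    """
    static = set()
    for boundary in range(1, len(phase_uses)):
        before = set().union(*phase_uses[:boundary])
        after = set().union(*phase_uses[boundary:])
        static |= before & after
    # Everything else is needed by a single phase and can be freed
    # at the next phase boundary.
    dynamic = [set(uses) - static for uses in phase_uses]
    return static, dynamic


phase_uses = [{"t0", "t1", "t2"}, {"t1", "t3", "t5"}, {"t3", "t4"}]
static, dynamic = split_allocations(phase_uses)
```

For these three phases, `t1` and `t3` cross a boundary and form the static allocation, while each phase's remaining temporaries form a dynamic allocation that is released when the phase ends.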

PROCESSOR AND OPERATING METHOD THEREOF

A processor includes a register file, a context controller that, in response to a target interrupt occurring, is configured to determine a target register that stores new data acquired through each of commands for executing an interrupt service routine (ISR) among the plurality of registers, a write buffer configured to transmit pre-data stored in the target register to a memory, and a flag register configured to store set data including set values indicating whether the new data is stored in each of the registers. The context controller is configured to determine whether to transfer the pre-data to the memory through the write buffer based on the set data.
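
The flag-guided save can be sketched as follows: a register's pre-data is transferred to memory only on the first ISR write to it. The class and method names are hypothetical, a dictionary stands in for the memory reached through the write buffer, and the restore step in `isr_return` is an assumption that the abstract does not describe.

```python
class LazyIsrContext:
    """Toy register file that saves pre-data only for registers the ISR writes."""

    def __init__(self, registers):
        self.registers = list(registers)
        self.flags = [False] * len(self.registers)  # flag register (set data)
        self.memory = {}  # stand-in for memory behind the write buffer

    def isr_write(self, index, new_data):
        """Write new data during the ISR, saving pre-data on the first write."""
        if not self.flags[index]:
            self.memory[index] = self.registers[index]  # transfer pre-data
            self.flags[index] = True
        self.registers[index] = new_data

    def isr_return(self):
        """Assumed restore step: put saved pre-data back and clear the flags."""
        for index, pre_data in self.memory.items():
            self.registers[index] = pre_data
        self.memory.clear()
        self.flags = [False] * len(self.registers)


ctx = LazyIsrContext([10, 20, 30])
ctx.isr_write(1, 99)
ctx.isr_write(1, 100)  # flag already set, so no second save
saved_during_isr = dict(ctx.memory)
flags_during_isr = list(ctx.flags)
ctx.isr_return()
```

Only register 1 is saved, and only once, which is the saving in memory traffic that the flag register makes possible.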

HUMAN-MACHINE-INTERFACE SYSTEM

A human-machine-interface system comprising: register-file-memory configured to store input-data; a first-processing-element-slice; a second-processing-element-slice; and a controller. Each of the processing-slices comprises: a register configured to store register-data; and a processing-element configured to apply an arithmetic and logic operation on the register-data in order to provide convolution-output-data. The controller is configured to: load input-data from the register-file-memory into the first-register as the first-register-data; and load: (i) input-data from the register-file-memory, or (ii) the first-register-data from the first-register, into the second-register as the second-register-data.

Intelligent context management

Intelligent context management for thread switching is achieved by determining that a register bank has not been used by a thread for a predetermined number of dispatches, and responsively disabling the register bank for use by that thread. A counter is incremented each time the thread is dispatched but the register bank goes unused. Usage or non-usage of the register bank is inferred by comparing a previous checksum for the register bank to a current checksum. If the previous and current checksums match, the system concludes that the register bank has not been used. If a thread attempts to access a disabled bank, the processor takes an interrupt, enables the bank, and resets the corresponding counter. For a system utilizing transactional memory, it is preferable to enable all of the register banks when thread processing begins to avoid aborted transactions from register banks disabled by lazy context management techniques.
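
The counter and checksum logic can be sketched as below. CRC-32 is used only as an example checksum, and resetting the counter when the checksum changes is an assumption; the abstract states only that the counter is reset when a disabled bank is accessed. `BankTracker` and its methods are hypothetical names.

```python
import zlib


class BankTracker:
    """Tracks one register bank for one thread."""

    def __init__(self, threshold):
        self.threshold = threshold  # dispatches without use before disabling
        self.enabled = True
        self.unused_count = 0
        self.prev_checksum = None

    def on_dispatch(self, bank_bytes):
        """Compare checksums after a dispatch to infer whether the bank was used."""
        checksum = zlib.crc32(bank_bytes)
        if checksum == self.prev_checksum:
            # Matching checksums: conclude the bank went unused.
            self.unused_count += 1
            if self.unused_count >= self.threshold:
                self.enabled = False
        else:
            self.unused_count = 0  # assumption: any use restarts the count
        self.prev_checksum = checksum

    def on_access_while_disabled(self):
        """Interrupt path: re-enable the bank and reset its counter."""
        self.enabled = True
        self.unused_count = 0


tracker = BankTracker(threshold=2)
bank = bytes([1, 2, 3, 4])
tracker.on_dispatch(bank)  # first checksum, nothing to compare against
tracker.on_dispatch(bank)  # unchanged, count becomes 1
tracker.on_dispatch(bank)  # unchanged, count becomes 2 and bank is disabled
disabled_after_three = not tracker.enabled
tracker.on_access_while_disabled()
```

Note that a checksum comparison infers non-usage rather than proving it: a thread that writes back identical values would look idle under this scheme.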