G06F9/462

Allocation of resources to tasks

A method of managing resources in a graphics processing pipeline includes, in response to selecting a task for execution within a texture/shading unit, allocating to the task both a static allocation of temporary registers for the entire task and a dynamic allocation of temporary registers. The dynamic allocation comprises temporary registers used by a first phase of the task only, and the static allocation comprises any temporary registers that are used by the program and are live at a boundary between two phases. When the task subsequently reaches a boundary between two phases, the dynamic allocation of temporary registers is freed and a new dynamic allocation of temporary registers for a next phase of the task is allocated to the task.
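
The allocation scheme in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `RegisterPool` and `Task`, and the idea of tracking registers by count only, are assumptions made for illustration.

```python
class RegisterPool:
    """Hypothetical pool of temporary registers, tracked by count only."""

    def __init__(self, total):
        self.free = total

    def allocate(self, count):
        if count > self.free:
            raise RuntimeError("not enough free temporary registers")
        self.free -= count

    def release(self, count):
        self.free += count


class Task:
    """A task with one static allocation and one dynamic allocation per phase."""

    def __init__(self, pool, static_regs, dynamic_per_phase):
        self.pool = pool
        self.static_regs = static_regs              # live across phase boundaries
        self.dynamic_per_phase = dynamic_per_phase  # used within one phase only
        self.phase = 0
        # On selection for execution: allocate the static registers for the
        # whole task plus the dynamic registers for the first phase.
        pool.allocate(static_regs)
        pool.allocate(dynamic_per_phase[0])

    def cross_phase_boundary(self):
        # Free the current phase's dynamic allocation, then allocate the
        # dynamic registers needed by the next phase.
        self.pool.release(self.dynamic_per_phase[self.phase])
        self.phase += 1
        self.pool.allocate(self.dynamic_per_phase[self.phase])

    def finish(self):
        # Release everything the task still holds.
        self.pool.release(self.dynamic_per_phase[self.phase])
        self.pool.release(self.static_regs)
```

In this sketch the static allocation is held for the task's lifetime, while the pool regains the dynamic registers at every phase boundary, so phases with small register needs leave more registers available to other tasks.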

METHOD FOR IMPLEMENTING A LINE SPEED INTERCONNECT STRUCTURE
20190235877 · 2019-08-01

A method and apparatus including a cache controller coupled to a cache memory, wherein the cache controller receives a plurality of cache access requests, performs a pre-sorting of the plurality of cache access requests by a first stage of the cache controller to order the plurality of cache access requests, wherein the first stage functions by performing a pre-sorting and pre-clustering process on the plurality of cache access requests in parallel to map the plurality of cache access requests from a first position to a second position corresponding to ports or banks of the cache memory, performs combining and splitting of the plurality of cache access requests by a second stage of the cache controller, and applies the plurality of cache access requests to the cache memory at line speed.

Context Switch Optimization
20190220417 · 2019-07-18

In an embodiment, a processor may include a register file including one or more sets of registers for one or more data types specified by the ISA implemented by the processor. The processor may have a processor mode in which the context is reduced, as compared to the full context. For example, for at least one of the data types, the registers included in the reduced context exclude one or more of the registers defined in the ISA for that data type. In an embodiment, one half or more of the registers for the data type may be excluded. When the processor is operating in a reduced context mode, the processor may detect instructions that use excluded registers, and may signal an exception for such instructions to prevent use of the excluded registers.
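
A minimal sketch of the reduced-context check follows. The register counts (32 architected registers, 16 kept in the reduced context), the choice to exclude the upper half, and the names `RegisterFile` and `ExcludedRegisterError` are assumptions for illustration; the abstract only states that one or more registers are excluded and that use of them raises an exception.

```python
class ExcludedRegisterError(Exception):
    """Models the exception signalled when an excluded register is used."""


class RegisterFile:
    def __init__(self, num_regs=32, reduced_count=16):
        self.num_regs = num_regs            # registers defined by the ISA (assumed)
        self.reduced_count = reduced_count  # registers kept in reduced mode (assumed)
        self.reduced_mode = False
        self.regs = [0] * num_regs

    def _limit(self):
        return self.reduced_count if self.reduced_mode else self.num_regs

    def check_instruction(self, reg_indices):
        """Raise if the instruction names a register outside the active context."""
        limit = self._limit()
        for index in reg_indices:
            if not 0 <= index < limit:
                raise ExcludedRegisterError(f"register {index} is excluded")

    def saved_context(self):
        """Registers that a context switch has to save in the current mode."""
        return self.regs[: self._limit()]
```

The point of the sketch is that the context to be saved shrinks with the mode, and the exception keeps code running in reduced mode from depending on state that would not be saved.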

Apparatus and method for invocation of a multi threaded accelerator

A processor is described having logic circuitry of a general purpose CPU core to save multiple copies of context of a thread of the general purpose CPU core to prepare multiple micro-threads of a multi-threaded accelerator for execution to accelerate operations for the thread through parallel execution of the micro-threads.

Processor and controlling method thereof to process an interrupt

A processor and a control method thereof are provided. The processor includes an instruction fetch module configured to receive a first instruction of an interrupt service routine without backup of data stored in a register in response to processing of the interrupt service routine being requested, a detecting module configured to analyze the received first instruction to determine whether the data stored in the register needs to be changed, an instruction generating module configured to generate a second instruction for storing data in a temporary memory when the stored data is initially changed, an instruction selecting module configured to sequentially select the generated second instruction and the first instruction, and a control module configured to perform the second instruction and the first instruction.
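
The lazy-backup idea can be sketched as a pass over the interrupt service routine's instruction stream. The instruction encoding `(opcode, dest, sources)` and the `save` pseudo-instruction are hypothetical; the abstract describes hardware modules, not this software form.

```python
def insert_lazy_saves(isr_instructions):
    """Insert a save before the first write to each register.

    Each instruction is a tuple (opcode, dest, sources), where dest is a
    register name or None when the instruction writes no register.
    """
    saved = set()
    output = []
    for opcode, dest, sources in isr_instructions:
        # Detect: does this instruction change a register not yet backed up?
        if dest is not None and dest not in saved:
            # Generate the "second instruction": store the old value to
            # temporary memory before it is overwritten.
            output.append(("save", None, (dest,)))
            saved.add(dest)
        # Then select the original "first instruction".
        output.append((opcode, dest, sources))
    return output
```

Under these assumptions, only registers the routine actually modifies are backed up, and each is backed up once, immediately before its first modification.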

ELECTRONIC DEVICE CAPABLE OF PERFORMING MULTI-CAMERA INTELLIGENT SWITCHING AND MULTI-CAMERA INTELLIGENT SWITCHING METHOD THEREOF

An electronic device capable of performing multi-camera intelligent switching and a multi-camera intelligent switching method thereof are provided. The electronic device includes a plurality of camera device media foundation transform (camera DMFT) units, an integrated DMFT unit and a mix camera agent. Each of the camera DMFT units is connected to one of a plurality of cameras. The integrated DMFT unit is serially connected to one of the camera DMFT units. The mix camera agent is connected to the cameras. The mix camera agent is used for obtaining a switching notification signal. The integrated DMFT unit switches a serial path between the integrated DMFT unit and one of the camera DMFT units according to the switching notification signal.

CONTEXT SWITCH BY CHANGING MEMORY POINTERS
20190146832 · 2019-05-16

Context switch by changing memory pointers. A determination is made that a context switch is to be performed from a first context to a second context. Data of the first context is stored in one or more configuration state registers stored at least in part in a first memory unit and data of the second context is stored in one or more configuration state registers stored at least in part in a second memory unit. The context switch is performed by changing a pointer from the first memory unit to the second memory unit.
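
As a rough illustration, the pointer-based switch can be modeled with memory units represented as dictionaries. The class name `Core`, the method names, and the register name `csr0` used in the usage lines are hypothetical.

```python
class Core:
    """Model of a core whose configuration state registers live in memory."""

    def __init__(self, memory_units):
        self.memory_units = memory_units  # one dict of register values per unit
        self.pointer = 0                  # index of the active memory unit

    def read_csr(self, name):
        return self.memory_units[self.pointer][name]

    def write_csr(self, name, value):
        self.memory_units[self.pointer][name] = value

    def context_switch(self, target_unit):
        # No register contents are copied; only the pointer changes.
        self.pointer = target_unit


# Usage: two contexts, each backed by its own memory unit.
core = Core([{"csr0": 1}, {"csr0": 2}])
core.context_switch(1)
core.write_csr("csr0", 5)
core.context_switch(0)
```

In this model the cost of a switch does not depend on how many configuration state registers a context has, because each context's values stay in place in its own memory unit.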

Efficient preemption for graphics processors

Systems and methods may provide for inserting one or more preemption instructions while compiling a computer program. Inserting the one or more preemption instructions within a preemption window in the computer program reduces the number of live registers at each preemption instruction position. Further, each preemption instruction indicates which registers are to be saved at a particular program position, typically the registers that are live at that program position. The compiled program may be run in an execution unit. A preemption request may be made to the execution unit and executed at a next available preemption instruction in the program being run in the execution unit.
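
One way a compiler could choose such a position is sketched below for straight-line code: compute the live registers before each instruction with a backward scan, then pick the position in the window with the fewest live registers. The `(defs, uses)` instruction representation and both function names are assumptions; the abstract does not say how the position is chosen.

```python
def live_before(instructions, live_out=frozenset()):
    """Return, for each instruction, the set of registers live just before it.

    Each instruction is a pair (defs, uses) of register-name tuples.
    Straight-line code only; branches are not handled in this sketch.
    """
    live = set(live_out)
    result = [frozenset()] * len(instructions)
    for i in range(len(instructions) - 1, -1, -1):
        defs, uses = instructions[i]
        live = (live - set(defs)) | set(uses)
        result[i] = frozenset(live)
    return result


def pick_preemption_point(instructions, window):
    """Pick the index in [start, end) with the fewest live registers.

    Returns (index, registers_to_save); ties go to the earliest index.
    """
    start, end = window
    live = live_before(instructions)
    best = min(range(start, end), key=lambda i: len(live[i]))
    return best, live[best]
```

The returned register set plays the role of the save list carried by the preemption instruction: only registers live at that position need to be preserved when a preemption request is honoured there.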

COMBINING STATES OF MULTIPLE THREADS IN A MULTI-THREADED PROCESSOR
20190121638 · 2019-04-25

A processor comprising: an execution unit, multiple context register sets, a scheduler arranged to control the execution unit to provide a repeating sequence of temporally interleaved time slots, thereby enabling at least one respective worker thread to be allocated for execution in each respective one of some or all of the time slots, wherein a program state of the respective worker thread currently executing in each time slot is maintained in a respective one of the context register sets; and an exit state register arranged to store an aggregated exit state of the worker threads. The instruction set comprises an exit instruction for inclusion in each worker thread, the exit instruction taking an individual exit state of the respective thread as an operand. The exit instruction terminates the respective worker thread and also causes the individual exit state specified in the operand to contribute to the aggregated exit state.
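
The exit instruction's two effects can be sketched as follows. The abstract does not specify how individual exit states are combined; this sketch assumes Boolean exit states combined with a logical AND, and the class and method names are hypothetical.

```python
class ExitStateAggregator:
    """Models an exit state register shared by a set of worker threads."""

    def __init__(self):
        # True is the identity for AND, so the register starts at True
        # (the AND aggregation itself is an assumption of this sketch).
        self.aggregated_exit_state = True
        self.running = set()

    def launch(self, worker_id):
        self.running.add(worker_id)

    def exit(self, worker_id, exit_state):
        """Model of the exit instruction: terminate and contribute a state."""
        self.running.remove(worker_id)
        self.aggregated_exit_state = self.aggregated_exit_state and exit_state
```

With AND aggregation, the register ends up True only if every worker exited with True, which lets supervising code test one value instead of inspecting each worker.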

SYNCHRONIZATION IN A MULTI-TILE PROCESSING ARRANGEMENT

A processing system comprising multiple tiles and an interconnect between the tiles. The interconnect is used to communicate between a group of some or all of the tiles according to a bulk synchronous parallel scheme, whereby each tile in the group performs an on-tile compute phase followed by an inter-tile exchange phase, with the exchange phase being held back until all tiles in the group have completed the compute phase. Each tile in the group has a local exit state upon completion of the compute phase. The instruction set comprises a synchronization instruction for execution by each tile upon completion of its compute phase to signal a sync request to logic in the interconnect. In response to receiving the sync request from all the tiles in the group, the logic releases the next exchange phase and also makes available an aggregated state of all the tiles in the group.
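
The barrier behaviour of the sync logic can be sketched sequentially. The class name `SyncLogic`, the use of `None` to mean "exchange phase still held back", and the AND aggregation of Boolean local exit states are assumptions for illustration.

```python
class SyncLogic:
    """Models interconnect logic that gathers sync requests from a tile group."""

    def __init__(self, group):
        self.group = frozenset(group)
        self.pending = {}  # tile id -> local exit state

    def sync_request(self, tile_id, local_exit_state):
        """Record a tile's sync request.

        Returns None while the exchange phase is held back, or the
        aggregated state once every tile in the group has requested sync.
        """
        if tile_id not in self.group:
            raise ValueError(f"tile {tile_id} is not in the sync group")
        self.pending[tile_id] = local_exit_state
        if set(self.pending) != self.group:
            return None  # some tiles are still in their compute phase
        # All tiles have arrived: aggregate (AND is an assumption of this
        # sketch) and reset for the next compute phase.
        aggregated = all(self.pending.values())
        self.pending = {}
        return aggregated
```

A real interconnect would signal all tiles at once rather than return a value to the last requester; the return value here only stands in for "exchange phase released, aggregated state available".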