G06F9/26

Distinct system registers for logical processors

Distinct system registers for logical processors are disclosed. In one example of the disclosed technology, a processor includes a plurality of block-based physical processor cores for executing a program comprising a plurality of instruction blocks. The processor also includes a thread scheduler configured to schedule a thread of the program for execution, the thread using the one or more instruction blocks. The processor further includes at least one system register. The at least one system register stores data indicating a number and placement of the plurality of physical processor cores to form a logical processor. The logical processor executes the scheduled thread. The logical processor is configured to execute the thread in a continuous instruction window.

Scheduler for amp architecture using a closed loop performance and thermal controller

Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.

Dense read encoding for dataflow ISA

Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using an instruction decoder that decodes instructions having variable numbers of target operands. In one example of the disclosed technology, a block-based processor core includes an instruction decoder configured to decode target operands for an instruction in an instruction block, the instruction being encoded to allow for a variable number of target operands and a control unit configured to send data for at least one of the decoded target operands for an operation performed by the at least one of the cores. In some examples, the instruction indicates target instructions with a vector encoding. In other examples, a variable length format allows for the indication of one or more targets.

Memory device for updating micro-code, memory system including the memory device, and method for operating the memory device
10930350 · 2021-02-23 · ·

Provided herein may be a memory device which is capable of easily performing an update operation of a micro-code stored in the memory device. The memory device may include a first CAM block and a second CAM block, in which a micro-code is stored; and a control logic configured to control the first and second CAM blocks such that the stored micro-code is updated with a new micro-code in a micro-code update operation.

Processing system and heterogeneous processor acceleration method

A processing system includes a core, at least one accelerator function unit (AFU) and an accelerator interface. The core is utilized to develop at least one task. The AFU is utilized to execute the task. The accelerator interface is arranged between the core and the AFU to receive an accelerator interface instruction transmitted by the processing core and instruct the AFU to execute the task according to the accelerator interface instruction.

Processing system and heterogeneous processor acceleration method

A processing system includes a core, at least one accelerator function unit (AFU) and an accelerator interface. The core is utilized to develop at least one task. The AFU is utilized to execute the task. The accelerator interface is arranged between the core and the AFU to receive an accelerator interface instruction transmitted by the processing core and instruct the AFU to execute the task according to the accelerator interface instruction.

LIVE FIRMWARE UPDATE SWITCHOVER
20230418592 · 2023-12-28 ·

A method includes receiving, by a microcontroller, a live firmware update (LFU) command from an external host; and downloading, by the microcontroller, an image of a new version of firmware responsive to the LFU command. During a first time period, the method includes initializing only variables contained in the new version that are not contained in an old version of firmware. During a second time period, the method includes updating one or more of an interrupt vector table, a function pointer, and/or a stack pointer responsive to the new version. The second time period begins responsive to completing initialization of the variables.

Apparatus and methods related to microcode instructions indicating instruction types

The present disclosure includes apparatuses and methods related to microcode instructions. One example apparatus comprises a memory storing a set of microcode instructions. Each microcode instruction of the set can comprise a first field comprising a number of control data units, and a second field comprising a number of type select data units. Each microcode instruction of the set can have a particular instruction type defined by a value of the number of type select data units, and particular functions corresponding to the number of control data units are variable based on the particular instruction type.

Scheduler for AMP architecture with closed loop performance controller using static and dynamic thread grouping

Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.

Register read/write ordering

Apparatus and methods are disclosed for controlling execution of register access instructions in a block-based processor architecture using a hardware structure that indicates a relative ordering of register access instruction in an instruction block. In one example of the disclosed technology, a method of operating a processor includes selecting a register access instruction of the plurality of instructions to execute based at least in part on dependencies encoded within a previous block of instructions and on stored data indicating which of the register write instructions have executed for the previous block, and executing the selected instruction. In some examples, one or more of a write mask, a read mask, a register write vector register, or a counter are used to determine register read/write dependences. Based on the encoded dependencies and the masked write vector, the next instruction block can issue when its register dependencies are available.