G06F9/3836

MICROPROCESSOR AND METHOD FOR ISSUING LOAD/STORE INSTRUCTION
20220382547 · 2022-12-01

A microprocessor and a method for issuing a load/store instruction are introduced. The microprocessor includes a decode/issue unit, a load/store queue, a scoreboard, and a load/store unit. The scoreboard includes a plurality of scoreboard entries, in which each scoreboard entry includes an unknown bit value and a count value, wherein the unknown bit value or the count value is set when the instructions are issued. The decode/issue unit checks the scoreboard for WAR, WAW, and RAW data dependencies and dispatches the load/store instructions to the load/store queue with the recorded scoreboard values. The load/store queue is configured to resolve the data dependencies and dispatch the load/store instructions to the load/store unit for execution.
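
The scoreboard check described in this abstract can be illustrated with a minimal sketch. It is not the patented design: the class names, the split into separate read-side and write-side entries, and the rule that an unknown latency sets the unknown bit while a known latency sets the count are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    unknown: bool = False  # completion time not yet known (assumed: e.g. a load in flight)
    count: int = 0         # cycles until the pending access completes, when known

    def busy(self):
        return self.unknown or self.count > 0


class Scoreboard:
    """Hypothetical per-register scoreboard with read-side and write-side entries."""

    def __init__(self, num_regs):
        self.writes = [Entry() for _ in range(num_regs)]  # pending writers
        self.reads = [Entry() for _ in range(num_regs)]   # pending readers

    def hazards(self, srcs, dst):
        """Return the set of dependency types a new instruction would hit."""
        found = set()
        if any(self.writes[r].busy() for r in srcs):
            found.add("RAW")
        if dst is not None and self.writes[dst].busy():
            found.add("WAW")
        if dst is not None and self.reads[dst].busy():
            found.add("WAR")
        return found

    def issue(self, srcs, dst, latency=None):
        """Record an issued instruction; latency=None means the latency is unknown.

        Returns the hazards seen at issue time, which stand in for the
        'recorded scoreboard values' handed to the load/store queue.
        """
        snapshot = self.hazards(srcs, dst)
        targets = [self.reads[r] for r in srcs]
        if dst is not None:
            targets.append(self.writes[dst])
        for entry in targets:
            if latency is None:
                entry.unknown = True
            else:
                entry.count = max(entry.count, latency)
        return snapshot

    def tick(self):
        """Advance one cycle: known-latency counts run down toward zero."""
        for entry in self.writes + self.reads:
            if entry.count > 0:
                entry.count -= 1


sb = Scoreboard(8)
sb.issue(srcs=[1], dst=2, latency=None)   # load r2 <- [r1], latency unknown
print(sb.hazards(srcs=[2], dst=3))        # a reader of r2 sees a RAW dependency
```

In this sketch the unknown bit is never cleared; a fuller model would clear it when the load/store unit reports completion.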

Memory request size management in a multi-threaded, self-scheduling processor
11513839 · 2022-11-29

Representative apparatus, method, and system embodiments are disclosed for a self-scheduling processor which also provides additional functionality. Representative embodiments include a self-scheduling processor, comprising: a processor core adapted to execute a received instruction; and a core control circuit adapted to automatically schedule an instruction for execution by the processor core in response to a received work descriptor data packet. In another embodiment, the core control circuit is also adapted to schedule a fiber create instruction for execution by the processor core, to reserve a predetermined amount of memory space in a thread control memory to store return arguments, and to generate one or more work descriptor data packets to another processor or hybrid threading fabric circuit for execution of a corresponding plurality of execution threads. Event processing, data path management, system calls, memory requests, and other new instructions are also disclosed.

Thread creation on local or remote compute elements by a multi-threaded, self-scheduling processor
11513840 · 2022-11-29

Representative apparatus, method, and system embodiments are disclosed for a self-scheduling processor which also provides additional functionality. Representative embodiments include a self-scheduling processor, comprising: a processor core adapted to execute a received instruction; and a core control circuit adapted to automatically schedule an instruction for execution by the processor core in response to a received work descriptor data packet. In another embodiment, the core control circuit is also adapted to schedule a fiber create instruction for execution by the processor core, to reserve a predetermined amount of memory space in a thread control memory to store return arguments, and to generate one or more work descriptor data packets to another processor or hybrid threading fabric circuit for execution of a corresponding plurality of execution threads. Event processing, data path management, system calls, memory requests, and other new instructions are also disclosed.

Thread state monitoring in a system having a multi-threaded, self-scheduling processor
11513838 · 2022-11-29

Representative apparatus, method, and system embodiments are disclosed for a self-scheduling processor which also provides additional functionality. Representative embodiments include a self-scheduling processor, comprising: a processor core adapted to execute a received instruction; and a core control circuit adapted to automatically schedule an instruction for execution by the processor core in response to a received work descriptor data packet. In another embodiment, the core control circuit is also adapted to schedule a fiber create instruction for execution by the processor core, to reserve a predetermined amount of memory space in a thread control memory to store return arguments, and to generate one or more work descriptor data packets to another processor or hybrid threading fabric circuit for execution of a corresponding plurality of execution threads. Event processing, data path management, system calls, memory requests, and other new instructions are also disclosed.

Thread commencement and completion using work descriptor packets in a system having a self-scheduling processor and a hybrid threading fabric
11513837 · 2022-11-29

Representative apparatus, method, and system embodiments are disclosed for a self-scheduling processor which also provides additional functionality. Representative embodiments include a self-scheduling processor, comprising: a processor core adapted to execute a received instruction; and a core control circuit adapted to automatically schedule an instruction for execution by the processor core in response to a received work descriptor data packet. In another embodiment, the core control circuit is also adapted to schedule a fiber create instruction for execution by the processor core, to reserve a predetermined amount of memory space in a thread control memory to store return arguments, and to generate one or more work descriptor data packets to another processor or hybrid threading fabric circuit for execution of a corresponding plurality of execution threads. Event processing, data path management, system calls, memory requests, and other new instructions are also disclosed.

APPARATUS AND METHOD FOR IDENTIFYING AND PRIORITIZING CERTAIN INSTRUCTIONS IN A MICROPROCESSOR INSTRUCTION PIPELINE
20220374237 · 2022-11-24

A microprocessor improves Memory Level Parallelism (MLP) with minimal added complexity and without requiring segregated storage or management of instructions, by marking memory instructions and related instructions as urgent, and dispatching marked and unmarked instructions into common queuing circuitry for scheduled execution within scheduling circuitry that is configured to prioritize the execution of marked instructions. Instruction marking may be limited to the span of the renaming stage or may be extended to the span of the reorder buffer for additional gains in MLP.
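
The marking-and-prioritizing idea can be sketched as follows. This is an illustrative model, not the claimed circuitry: the `Instr` fields, the reading of "related instructions" as the producers of a memory instruction's source operands, and the tie-break on program order are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    seq: int               # program order
    name: str
    dst: str = None        # destination register, if any
    srcs: tuple = ()       # source registers
    is_mem: bool = False
    ready: bool = True
    urgent: bool = False


def mark_urgent(window):
    """Mark memory instructions and, walking backward, the instructions
    that produce their operands (assumed meaning of 'related')."""
    needed = set()
    for ins in reversed(window):
        if ins.is_mem or ins.dst in needed:
            ins.urgent = True
            needed.discard(ins.dst)
            needed.update(ins.srcs)


def select(queue, width):
    """Pick up to `width` ready instructions from one common queue,
    marked ones first, then oldest first."""
    ready = [ins for ins in queue if ins.ready]
    ready.sort(key=lambda ins: (not ins.urgent, ins.seq))
    return ready[:width]


window = [
    Instr(0, "add", dst="r1", srcs=("r2", "r3")),
    Instr(1, "mul", dst="r4", srcs=("r5", "r6")),
    Instr(2, "load", dst="r7", srcs=("r1",), is_mem=True),
]
mark_urgent(window)
print([ins.name for ins in select(window, width=2)])  # ['add', 'load']
```

All instructions share a single queue; only the selection key changes, which mirrors the abstract's point that no segregated storage is needed.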

SYMMETRY-PROTECTED QUANTUM COMPUTATION

In a quantum-computation method, quantum-computer code is received for execution on a quantum computer. The quantum computer includes a plurality of qubits associated with a corresponding plurality of particles, and the plurality of particles define a quantum state. The quantum-computer code is decomposed into a sequence of operations including a total spin-state measurement on particles corresponding to two or more of the qubits. Then the sequence of operations is applied on the plurality of particles to thereby transform the quantum state according to the quantum-computer code initially received.

DATA PROCESSING

Data processing circuitry comprises out-of-order instruction execution circuitry; register mapping circuitry to map zero or more architectural processor registers relating to execution of a program instruction to respective ones of a set of physical processor registers; commit circuitry to commit, in a program code order, the results of executed program instructions, the commit circuitry being configured to access a data store which stores register tag data to indicate which physical registers mapped by the register mapping circuitry relate to a given program instruction; fault detection circuitry to detect a memory access fault in respect of a vector memory access operation and to generate fault indication data indicative of an element earliest in the element order for which a memory access fault was detected; a fault indication register to store the fault indication data, in which the register mapping circuitry is configured to generate a register mapping for a program instruction for any architectural processor registers relating to execution of that program instruction other than the fault indication register; and control circuitry to encode the fault indication data, applicable to a program instruction not yet committed by the commit circuitry, to register tag data associated with that program instruction.

METHOD AND SYSTEM FOR OPTIMIZING ADDRESS CALCULATIONS
20220374236 · 2022-11-24

The disclosed systems, structures, and methods are directed to optimizing address calculations in a computer. This is achieved in a compiler that identifies an address calculation in code that is being compiled and transforms the code by splitting the address calculation into a first portion, in which an offset is determined, and a second portion, in which the offset is combined with a base pointer to generate an address. The address and the base pointer have a first bit-length, and the offset has a second bit-length shorter than the first bit-length. The offset is determined using an operation performed at the second bit-length. In some implementations, the first bit-length is 64 bits and the second bit-length is 32 bits.
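
The split can be made concrete with a small arithmetic model. This is a sketch under assumptions, not the compiler transformation itself: the function names, the `index * stride + disp` form of the offset, and the choice to sign-extend the 32-bit offset before adding it to the 64-bit base are all illustrative.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def offset32(index, stride, disp):
    """First portion: compute the offset entirely in 32-bit arithmetic."""
    return (index * stride + disp) & MASK32


def sign_extend32(value):
    """Widen a 32-bit two's-complement value to a Python int (assumed signed offset)."""
    return value - (1 << 32) if value & 0x80000000 else value


def address(base, index, stride, disp):
    """Second portion: combine the short offset with the 64-bit base pointer."""
    return (base + sign_extend32(offset32(index, stride, disp))) & MASK64


def address_full_width(base, index, stride, disp):
    """Reference: the unsplit calculation done at 64 bits throughout."""
    return (base + index * stride + disp) & MASK64


base = 0x7FFF00001000
print(hex(address(base, 3, 8, 16)))   # 0x7fff00001028
print(hex(address(0x1000, -1, 8, 0))) # 0xff8
```

The two versions agree whenever the true offset fits in a signed 32-bit value; outside that range the 32-bit operation wraps, so a compiler applying this split would presumably need to establish that bound first.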

ELECTRONIC DEVICE AND METHOD FOR MANAGING CACHE DATA

A non-transitory computer-readable medium according to various embodiments stores one or more programs including instructions that cause a first processor of a first electronic device to: transmit, in response to identification of first instructions executable by a second processor of the first electronic device, at least a portion of information associated with the identified first instructions to a second electronic device; receive a signal associated with the first instructions from the second electronic device; store, in response to identification from the received signal of second instructions distinct from the first instructions and executable by the second processor, the second instructions in a memory; and transmit, in response to identification from the received signal of a command indicating upload of the first instructions, the first instructions executable by the second processor to the second electronic device.