G06F9/3017

Neural network training mechanism

An apparatus to facilitate neural network (NN) training is disclosed. The apparatus includes training logic to receive one or more network constraints and train the NN by automatically determining a best network layout and parameters based on the network constraints.

ADVANCED PROCESSOR ARCHITECTURE
20180004530 · 2018-01-04

The invention relates to a method for processing instructions out-of-order on a processor comprising an arrangement of execution units. The inventive method comprises: 1) looking up operand sources in a Register Positioning Table (RPT) and setting operand input references of the instruction to be issued accordingly; 2) checking for an Execution Unit (EXU) available for receiving a new instruction; and 3) issuing the instruction to the available Execution Unit and entering into the Register Positioning Table a reference to the result register addressed by the instruction issued to the Execution Unit.
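
The three steps can be illustrated with a minimal Python sketch. It is not the patented design: the function name `issue`, the dictionary used as the table, the `free_exus` list, and the tuple encoding of operand references are all hypothetical, and reading unmapped registers from a register file is an assumption.

```python
def issue(instr, rpt, free_exus):
    """Issue one instruction; return the EXU id used, or None if it stalls.

    instr     -- dict with 'dest' (result register) and 'srcs' (source registers)
    rpt       -- Register Positioning Table: register name -> EXU id producing it
    free_exus -- list of EXU ids able to accept a new instruction
    """
    # Step 1: look up operand sources in the RPT and set the input references.
    # Assumption: a register without an RPT entry is read from the register file.
    instr["operand_refs"] = [
        ("exu", rpt[r]) if r in rpt else ("regfile", r) for r in instr["srcs"]
    ]
    # Step 2: check for an available execution unit.
    if not free_exus:
        return None
    exu = free_exus.pop(0)
    # Step 3: issue, then record in the RPT where the result register is produced.
    rpt[instr["dest"]] = exu
    return exu


rpt, free = {}, [0, 1]
first = {"dest": "r1", "srcs": ["r2", "r3"]}
second = {"dest": "r4", "srcs": ["r1", "r2"]}  # depends on the result of `first`
issue(first, rpt, free)
issue(second, rpt, free)
stalled = issue({"dest": "r5", "srcs": []}, rpt, free)  # no EXU left
```

In this sketch the second instruction picks up its `r1` operand as a reference to EXU 0 instead of a register-file read, which is the dependency tracking the table provides.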

PROCESSOR AND CONTROL METHOD OF PROCESSOR

A processor includes: an address generating unit that, when an instruction decoded by a decoding unit is an instruction to execute arithmetic processing a plurality of times in parallel on a plurality of operand sets, each including a plurality of operands that are objects of the arithmetic processing, generates, for each time, an address set corresponding to each of the operand sets, based on a certain address displacement with respect to the plurality of operands included in each of the operand sets; a plurality of instruction queues that hold the generated address sets corresponding to the respective operand sets, in correspondence to respective processing units; and a plurality of processing units that perform the arithmetic processing in parallel on the operand sets obtained based on the respective address sets outputted by the plurality of instruction queues.
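
A rough Python sketch of the address generation and queueing described above follows. The function names, the single uniform displacement applied to every operand, and the round-robin assignment of address sets to queues are assumptions made for illustration; the abstract does not specify them.

```python
from collections import deque


def generate_address_sets(base_addrs, displacement, count):
    """Return `count` address sets, each shifted by a fixed displacement.

    base_addrs holds one base address per operand of the first operand set.
    """
    return [tuple(b + i * displacement for b in base_addrs) for i in range(count)]


def fill_queues(address_sets, num_units):
    """Distribute address sets over one queue per processing unit (round-robin)."""
    queues = [deque() for _ in range(num_units)]
    for i, address_set in enumerate(address_sets):
        queues[i % num_units].append(address_set)
    return queues


# Two operands per set, four repetitions, two processing units.
sets = generate_address_sets((0x100, 0x200), displacement=8, count=4)
queues = fill_queues(sets, num_units=2)
```

Each processing unit would then drain its own queue and fetch the operands named by each address set.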

Electronic device including main processor and systolic array processor and operating method of electronic device

Disclosed is an electronic device which includes a main processor and a systolic array processor. The systolic array processor includes processing elements, a kernel data memory that provides a kernel data set to the processing elements, a data memory that provides an input data set to the processing elements, and a controller that provides commands to the processing elements. The main processor translates source codes associated with the systolic array processor into commands of the systolic array processor, calculates a switching activity value based on the commands, and stores the translated commands and the switching activity value to a machine learning module, which is based on the systolic array processor.
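
The abstract does not say how the switching activity value is calculated. One common proxy, used here purely as an assumption, is the number of bits that toggle between consecutive command words. The function name and the integer encoding of commands are hypothetical.

```python
def switching_activity(commands):
    """Count bit toggles between consecutive command words (Hamming distance)."""
    return sum(bin(a ^ b).count("1") for a, b in zip(commands, commands[1:]))


# Hypothetical 4-bit command words produced by translating source code.
cmds = [0b1010, 0b1001, 0b1001, 0b0110]
activity = switching_activity(cmds)  # 2 + 0 + 4 toggles
```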

Memory controller and memory system for generating instruction set based on non-interleaving block group information
11567773 · 2023-01-31

Embodiments of the present invention include a memory controller including a buffer memory configured to store program data, an instruction set configurator configured to configure an instruction set describing a procedure for programming the program data stored in the buffer memory to target memory blocks, an instruction set performer configured to sequentially perform instructions in the instruction set and generate an interrupt at a time of completion of performance of a last instruction among the instructions, and a central processing unit configured to erase the program data stored in the buffer memory when the interrupt is received from the instruction set performer. The instruction set configurator may configure the instruction set differently according to whether a non-interleaving block group exists among the target memory blocks.
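
The flow from configuration to interrupt can be sketched in Python as below. This is only an illustration under stated assumptions: the instruction names `PROGRAM_INTERLEAVED` and `PROGRAM_SINGLE`, the rule that non-interleaving blocks are programmed one at a time, and the dictionary standing in for the buffer memory are all hypothetical.

```python
def configure_instruction_set(target_blocks, non_interleaving_blocks):
    """Build the program procedure; its shape depends on non-interleaving blocks."""
    interleaved = [b for b in target_blocks if b not in non_interleaving_blocks]
    serial = [b for b in target_blocks if b in non_interleaving_blocks]
    instructions = []
    if interleaved:
        instructions.append(("PROGRAM_INTERLEAVED", tuple(interleaved)))
    for block in serial:
        # Assumption: a non-interleaving block is programmed on its own.
        instructions.append(("PROGRAM_SINGLE", (block,)))
    return instructions


def perform(instructions, raise_interrupt):
    """Perform instructions in order; interrupt once the last one completes."""
    performed = []
    for index, instruction in enumerate(instructions):
        performed.append(instruction)  # stand-in for the real program operation
        if index == len(instructions) - 1:
            raise_interrupt()
    return performed


buffer_memory = {"program_data": b"\x01\x02\x03"}


def cpu_interrupt_handler():
    # The CPU erases the buffered program data when the interrupt arrives.
    buffer_memory["program_data"] = b""


instruction_set = configure_instruction_set([0, 1, 2, 3], non_interleaving_blocks={2})
performed = perform(instruction_set, cpu_interrupt_handler)
```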

Apparatus and method for injecting spin echo micro-operations in a quantum processor
11704588 · 2023-07-18

Apparatus and method for injecting spin echo sequences in a quantum processor. For example, one embodiment of a processor includes a decoder to decode quantum instructions to generate quantum microoperations (uops) and to decode non-quantum instructions to generate non-quantum uops, execution circuitry to execute the quantum uops and non-quantum uops, and a corrective sequence data structure to identify and/or store corrective sets of uops for one or more of the quantum instructions. The decoder is to query the corrective sequence data structure upon receiving a first quantum instruction to determine if one or more corrective uops exist, and if the one or more corrective uops exist, the decoder is to submit the one or more corrective uops for execution by the execution circuitry.
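
The decoder's lookup can be sketched as a table query at decode time. The opcode names, the table contents, and the choice to submit corrective uops after the instruction's own uop are assumptions for illustration only.

```python
# Hypothetical corrective sequence data structure: opcode -> corrective uops.
CORRECTIVE_SEQUENCES = {"QWAIT": ["SPIN_ECHO_PULSE", "SPIN_ECHO_PULSE"]}


def decode(instruction, corrective_table):
    """Return the uops to submit for one (kind, opcode) instruction."""
    kind, opcode = instruction
    if kind != "quantum":
        return [("uop", opcode)]
    uops = [("quantum_uop", opcode)]
    # Query the table; submit corrective uops only if an entry exists.
    for corrective in corrective_table.get(opcode, []):
        uops.append(("quantum_uop", corrective))
    return uops


with_fix = decode(("quantum", "QWAIT"), CORRECTIVE_SEQUENCES)
without_fix = decode(("quantum", "QROTX"), CORRECTIVE_SEQUENCES)
classical = decode(("classical", "ADD"), CORRECTIVE_SEQUENCES)
```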

Transparent interpretation and integration of layered software architecture event streams

A computerized method includes analyzing program code, including a control flow graph, of one or more applications that are executable by an operating system of a computing device to determine event-logging functions of the program code that generate event logs; extracting, by a processing device based on the event-logging functions, log message strings from the program code that describe event-logging statements; identifying, by the processing device, via control flow analysis, possible control flow paths of the log message strings through the control flow graph; storing, in a database accessible by the processing device, the possible control flow paths; and inputting, by the processing device into a log parser, the possible control flow paths of the log message strings to facilitate interpretation of application events during runtime execution of the one or more applications.

HANDLING OF SINGLE-COPY-ATOMIC LOAD/STORE INSTRUCTION
20230017802 · 2023-01-19

In response to a single-copy-atomic load/store instruction for requesting an atomic transfer of a target block of data between the memory system and the registers, where the target block has a given size greater than a maximum data size supported for a single load/store micro-operation by a load/store data path, instruction decoding circuitry maps the single-copy-atomic load/store instruction to two or more mapped load/store micro-operations each for requesting transfer of a respective portion of the target block of data. In response to the mapped load/store micro-operations, load/store circuitry triggers issuing of a shared memory access request to the memory system to request the atomic transfer of the target block of data of said given size to or from the memory system, and triggers separate transfers of respective portions of the target block of data over the load/store data path.
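
A small Python sketch of the mapping follows. The function name, the tuple encoding of micro-operations, and the 64-byte block over a 16-byte data path are hypothetical values chosen for illustration.

```python
def map_single_copy_atomic(addr, size, max_uop_bytes):
    """Split one oversized atomic access into per-portion micro-operations.

    Returns the single shared memory request covering the whole block and the
    list of (address, length) micro-operations that move its portions.
    """
    if size <= max_uop_bytes:
        raise ValueError("block fits in one micro-operation; no mapping needed")
    micro_ops = [
        (addr + offset, min(max_uop_bytes, size - offset))
        for offset in range(0, size, max_uop_bytes)
    ]
    shared_request = (addr, size)  # one request keeps the transfer atomic
    return shared_request, micro_ops


# A 64-byte target block over a data path limited to 16 bytes per micro-op.
request, uops = map_single_copy_atomic(0x1000, 64, 16)
```

The memory system sees one request for the full block, while the data path carries it in separate portions.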

MEMORY CONTROLLER WITH ARITHMETIC LOGIC UNIT AND/OR FLOATING POINT UNIT
20230221958 · 2023-07-13

Techniques for performing an operation at a memory controller are described. An example includes decoder circuitry to decode a single instruction, the single instruction to include one or more fields for an opcode to indicate an arithmetic or Boolean operation to be performed by a memory controller, and one or more fields to identify at least one source location; and execution circuitry of the memory controller to execute the decoded instruction according to the opcode.
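
As an illustration, the decode and execute stages might look like the Python sketch below. The 32-bit encoding (8-bit opcode, two 12-bit source locations), the opcode numbering, and the list standing in for memory are assumptions, not details taken from the abstract.

```python
import operator

# Hypothetical opcode map covering arithmetic and Boolean operations.
OPERATIONS = {
    0: operator.add,
    1: operator.sub,
    2: operator.and_,
    3: operator.or_,
    4: operator.xor,
}


def decode(word):
    """Split an assumed 32-bit instruction word into its fields."""
    return {
        "opcode": (word >> 24) & 0xFF,
        "src0": (word >> 12) & 0xFFF,
        "src1": word & 0xFFF,
    }


def execute(decoded, memory):
    """Run the operation at the memory controller on the two source locations."""
    operation = OPERATIONS[decoded["opcode"]]
    return operation(memory[decoded["src0"]], memory[decoded["src1"]])


memory = [0] * 16
memory[3], memory[5] = 0b1100, 0b1010
word = (2 << 24) | (3 << 12) | 5  # AND of locations 3 and 5
result = execute(decode(word), memory)
```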

Systems, media, and methods for identifying loops of or implementing loops for a unit of computation

Systems, media, and methods may identify loops of a unit of computation for performing operations associated with the loops. The systems, media, and methods may receive textual program code that includes a unit of computation that comprises a loop (e.g., explicit/implicit loop). The unit of computation may be identified by an identifier (e.g., variable name within the textual program code, text string embedded in the unit of computation, and/or syntactical pattern that is unique within the unit of computation). A code portion and/or a section thereof may include an identifier referring to the unit of computation, where the code portion and the unit of computation may be at locations independent of each other. The systems, media, and methods may semantically identify a loop that corresponds to the identifier and perform operations on the textual program code using the code portion and/or section.