G06F9/3016

USER-LEVEL INTERPROCESSOR INTERRUPTS

Processors, methods, and systems for user-level interprocessor interrupts are described. In an embodiment, a processing system includes a memory and a processing core. The memory is to store an interrupt control data structure associated with a first application being executed by the processing system. The processing core includes an instruction decoder to decode a first instruction, invoked by a second application, to send an interprocessor interrupt to the first application; and, in response to the decoded instruction, is to determine that an identifier of the interprocessor interrupt matches a notification interrupt vector associated with the first application; set, in the interrupt control data structure, a pending interrupt flag corresponding to an identifier of the interprocessor interrupt; and invoke an interrupt handler for the interprocessor interrupt identified by the interrupt control data structure.

SYNCHRONIZING SCHEDULING TASKS WITH ATOMIC ALU
20230033355 · 2023-02-02 ·

A method of synchronizing a group of scheduled tasks within a parallel processing unit into a known state is described. The method uses a synchronization instruction in a scheduled task which triggers, in response to decoding of the instruction, an instruction decoder to place the scheduled task into a non-active state and forward the decoded synchronization instruction to an atomic ALU for execution. When the atomic ALU executes the decoded synchronization instruction, the atomic ALU performs an operation and check on data assigned to the group ID of the scheduled task and if the check is passed, all scheduled tasks having the particular group ID are removed from the non-active state.

Systems and methods for performing 16-bit floating-point matrix dot product instructions

Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.

PROCESSOR, PROCESSING METHOD, AND RELATED DEVICE
20230093393 · 2023-03-23 ·

This application discloses a processor, a processing method, and a related device. The processor includes a processor core. The processor core includes an instruction dispatching unit and a graph flow unit and at least one general-purpose operation unit that are connected to the instruction dispatching unit. The instruction dispatching unit is configured to: allocate a general-purpose calculation instruction in a decoded to-be-executed instruction to the at least one general-purpose calculation unit, and allocate a graph calculation control instruction in the decoded to-be-executed instruction to the graph calculation unit, where the general-purpose calculation instruction is used to instruct to execute a general-purpose calculation task, and the graph calculation control instruction is used to instruct to execute a graph calculation task. The at least one general-purpose operation unit is configured to execute the general-purpose calculation instruction. The graph flow unit is configured to execute the graph calculation control instruction.

INFORMATION COMMUNICATION SYSTEM, STANDALONE DATA TRANSMISSION SYSTEM, DATA TRANSMISSION SYSTEM, APPARATUS, PROCESS, AND METHODS OF USE
20230088417 · 2023-03-23 ·

An information communication system, a standalone data transmission system, a data relay system, apparatus for performing transmission, process of sending datum over distances, and various methods of use are presented. The present disclosure provides for instantaneous information transmission without the need for a data connection, broadband, internet, wifi, or the like.

Inline data inspection for workload simplification

A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.

METHOD AND APPARATUS FOR IMPLIED BIT HANDLING IN FLOATING POINT MULTIPLICATION
20230085048 · 2023-03-16 ·

A method is provided that includes performing, by a processor in response to a floating point multiply instruction, multiplication of floating point numbers, wherein determination of values of implied bits of leading bit encoded mantissas of the floating point numbers is performed in parallel with multiplication of the encoded mantissas, and storing, by the processor, a result of the floating point multiply instruction in a storage location indicated by the floating point multiply instruction.

CAPABILITY-GENERATING ADDRESS CALCULATING INSTRUCTION
20230085143 · 2023-03-16 ·

An apparatus has processing circuitry, an instruction decoder, and capability registers, each capability register to store a capability comprising a pointer and constraint metadata for constraining valid use of the pointer/capability. In response to a capability-generating address calculating instruction specifying an offset value, a reference capability register is selected as one of a program counter capability register and a further capability register. A result capability is generated for which the pointer of the result capability indicates a window address identifying a selected window within an address space, the selected window being offset from a reference window by a number of windows determined based on the offset value of the capability-generating address calculating instruction. The reference window comprises the window comprising an address indicated by the pointer of the reference capability register.

Apparatus and method for performing operations on capability metadata
11481384 · 2022-10-25 · ·

An apparatus is provided comprising storage elements to store data blocks, where each data block has capability metadata associated therewith identifying whether the data block specifies a capability, at least one capability type being a bounded pointer. Processing circuitry is then arranged to be responsive to a bulk capability metadata operation identifying a plurality of the storage elements, to perform an operation on the capability metadata associated with each data block stored in the plurality of storage elements. Via a single specified operation, this hence enables query and/or modification operations to be performed on multiple items of capability metadata, hence providing more efficient access to such capability metadata.

System and method enabling one-hot neural networks on a machine learning compute platform
11481218 · 2022-10-25 · ·

One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising instruction decode logic to decode a single instruction including multiple operands into a single decoded instruction, the multiple operands including a first operand and a second operand, the first operand including vector of one-hot coded weights and the second operand including a vector of input data; and a general-purpose graphics compute unit including a first logic unit, the general-purpose graphics compute unit to execute the single decoded instruction, wherein to execute the single decoded instruction includes to perform multiple operations on the first set of operands and the second set of operands.