G06F9/30163

Multiply-accumulation in a data processing apparatus
11513796 · 2022-11-29 · ·

A data processing apparatus, a method of operating a data processing apparatus, a non-transitory computer readable storage medium, and an instruction are provided. The instruction specifies a first source register, a second source register, and a set of N accumulation registers. In response to the instruction control signals are generated, causing processing circuitry to extract N data elements from content of the first source register, perform a multiplication of each of the N data elements by content of the second source register, and apply a result of each multiplication to content of a respective target register of the set of N accumulation registers. As a result plural (N) multiplications are performed in a manner that effectively provides a multiplier N times the register width, but without requiring the register file to be made N times larger.

Apparatus and method for performing operations on capability metadata
11481384 · 2022-10-25 · ·

An apparatus is provided comprising storage elements to store data blocks, where each data block has capability metadata associated therewith identifying whether the data block specifies a capability, at least one capability type being a bounded pointer. Processing circuitry is then arranged to be responsive to a bulk capability metadata operation identifying a plurality of the storage elements, to perform an operation on the capability metadata associated with each data block stored in the plurality of storage elements. Via a single specified operation, this hence enables query and/or modification operations to be performed on multiple items of capability metadata, hence providing more efficient access to such capability metadata.

Coalescing adjacent gather/scatter operations

According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Reusing an operand received from a first-in-first-out (FIFO) buffer according to an operand specifier value specified in a predefined field of an instruction

Various embodiments are provided reusing an operand in an instruction set architecture (ISA) by one or more processors in a computing system. An instruction may specify that an operand register for a selected operand retain operand data used by a previous instruction. The operand data in the operand register may be reused by the instruction.

COALESCING ADJACENT GATHER/SCATTER OPERATIONS

According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Graph Instruction Processing Method and Apparatus
20230205530 · 2023-06-29 ·

This application provides a graph instruction processing method and apparatus. The method is applied to a processor, and includes: detecting whether a first input and a second input of a first graph instruction are in a ready-to-complete state, where the first input and/or the second input are or is a dynamic data input or dynamic data inputs of the first graph instruction.

RANDOM DATA USAGE
20230205531 · 2023-06-29 ·

Techniques for prefetching random data and instructions using implicitly reference random number data are described. An example includes decode circuitry to decode a single instruction at least having a field for an opcode, the opcode to indicate execution circuitry is to perform an operation using implicitly referenced random data; and execution circuitry to execute the decoded single instruction according to the opcode.

Enhanced Low Cost Microcontroller

An 8-bit microprocessor has a program memory having a 16-bit instruction word size and a data memory having an 8-bit data size. An instruction word has a payload size for an address of up to 12 bits. The microprocessor furthermore has a central processing unit coupled with the program memory and the data memory, a bank select register configured to select one of up to 64 memory banks, and an indirect addressing register operable to address up to 16 KB of data memory. The CPU is configured to execute a first move instruction having two instruction words and being configured to only access the lower 4 KB of the data memory and a second move instruction having three instruction words and configured to access the entire data memory.

Multiple register memory access instructions, processors, methods, and systems
09786338 · 2017-10-10 · ·

A processor includes N-bit registers and a decode unit to receive a multiple register memory access instruction. The multiple register memory access instruction is to indicate a memory location and a register. The processor includes a memory access unit coupled with the decode unit and with the N-bit registers. The memory access unit is to perform a multiple register memory access operation in response to the multiple register memory access instruction. The operation is to involve N-bit data, in each of the N-bit registers comprising the indicated register. The operation is also to involve different corresponding N-bit portions of an M×N-bit line of memory corresponding to the indicated memory location. A total number of bits of the N-bit data in the N-bit registers to be involved in the multiple register memory access operation is to amount to at least half of the M×N-bits of the line of memory.

COALESCING ADJACENT GATHER/SCATTER OPERATIONS

According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.