G06F7/762

SYSTEMS, METHODS, AND APPARATUSES FOR TILE MATRIX MULTIPLICATION AND ACCUMULATION

Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments, decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier for a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, store a result of the addition in the identified source/destination matrix operand, and zero unconfigured columns of the identified source/destination matrix operand are detailed.
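The multiply accumulate behavior summarized in this abstract can be illustrated with a minimal software sketch. It is not the claimed circuitry: the function name `tile_multiply_accumulate` and the `configured_cols` parameter are hypothetical, and tiles are modeled as plain Python lists of rows. The negated variant is not modeled because the abstract does not define it.

```python
def tile_multiply_accumulate(a, b, c, configured_cols):
    """Return a new tile holding c + a*b in the configured columns, zero elsewhere.

    a is M x K, b is K x N, c is M x W with W >= N (lists of rows).
    configured_cols (N) is a hypothetical stand-in for the tile configuration.
    """
    inner = len(b)          # shared dimension K
    total_cols = len(c[0])  # physical width W of the source/destination tile
    result = []
    for i in range(len(a)):
        out_row = []
        for j in range(total_cols):
            if j < configured_cols:
                # Dot product of row i of a with column j of b, added to c.
                product = sum(a[i][k] * b[k][j] for k in range(inner))
                out_row.append(c[i][j] + product)
            else:
                # Unconfigured columns of the destination are zeroed.
                out_row.append(0)
        result.append(out_row)
    return result


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1, 9], [1, 1, 9]]   # third column is treated as unconfigured
print(tile_multiply_accumulate(a, b, c, configured_cols=2))
# [[20, 23, 0], [44, 51, 0]]
```

The sketch returns a new tile rather than updating `c` in place, purely to keep the example easy to check.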

Tiled Switch Matrix Data Permutation Circuit
20200073638 · 2020-03-05

Embodiments of the present disclosure pertain to a switch matrix circuit including a data permutation circuit. In one embodiment, the switch matrix comprises a plurality of adjacent switching blocks configured along a first axis, wherein the plurality of adjacent switching blocks each receive data and switch control settings along a second axis. The switch matrix includes a permutation circuit comprising, in each switching block, a plurality of switching stages spanning a plurality of adjacent switching blocks and at least one switching stage that does not span to adjacent switching blocks. The permutation circuit receives data in a first pattern and outputs the data in a second pattern. The data permutation performed by the switching stages is based on the particular switch control settings received in the adjacent switching blocks along the second axis.

SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX OPERATIONS

Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile is a set of 2-dimensional registers, are discussed.

Data-transfer test mode

Apparatuses and techniques for implementing a data-transfer test mode are described. The data-transfer test mode refers to a mode in which the transfer of data from an interface die to a linked die can be tested prior to connecting the interface die to the linked die. In particular, the data-transfer test mode enables the interface die to perform aspects of a write operation and output a portion of write data that is intended for the linked die. With the data-transfer test mode, testing (or debugging) of the interface die can be performed during an earlier stage in the manufacturing process before integrating the interface die into an interconnected die architecture. For example, this type of testing can be performed at a wafer level or at a single-die-package (SDP) level. In general, the data-transfer test mode can be executed independent of whether the interface die is connected to the linked die.

SYSTEMS, METHODS, AND APPARATUS FOR TILE CONFIGURATION

Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile is a set of 2-dimensional registers, are discussed.
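As a rough illustration of setting a tile configuration from a description held in memory, the sketch below reads per-tile dimensions from a byte buffer. The abstract does not specify the description's layout, so everything here is an assumption made for the example: the two-bytes-per-tile format (rows, then columns), the `TileShape` type, and the `load_tile_config` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TileShape:
    rows: int
    cols: int


def load_tile_config(memory, address, num_tiles):
    """Read a tile configuration from a description stored at `address`.

    Hypothetical layout (not from the source): for each tile, one byte giving
    the row count followed by one byte giving the column count.
    """
    config = []
    for tile_index in range(num_tiles):
        offset = address + 2 * tile_index
        config.append(TileShape(rows=memory[offset], cols=memory[offset + 1]))
    return config


# Two bytes of unrelated data, then descriptions for two tiles.
memory = bytes([0, 0, 16, 64, 8, 32])
print(load_tile_config(memory, address=2, num_tiles=2))
# [TileShape(rows=16, cols=64), TileShape(rows=8, cols=32)]
```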

SYSTEMS, METHODS, AND APPARATUSES FOR TILE TRANSPOSE

Embodiments detailed herein relate to matrix operations. In particular, support for a matrix transpose instruction is detailed. In some embodiments, decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to transpose each row of elements of the identified source matrix operand into a corresponding column of the identified destination matrix operand are detailed.
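The row-to-column mapping described for the transpose instruction can be sketched in software as follows. This is only an illustrative model with tiles as Python lists of rows; the function name `tile_transpose` is hypothetical.

```python
def tile_transpose(src):
    """Write each row of src into the corresponding column of a new tile."""
    rows, cols = len(src), len(src[0])
    dst = [[0] * rows for _ in range(cols)]
    for r in range(rows):
        for c in range(cols):
            # Element c of source row r becomes element r of destination row c.
            dst[c][r] = src[r][c]
    return dst


print(tile_transpose([[1, 2, 3], [4, 5, 6]]))
# [[1, 4], [2, 5], [3, 6]]
```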

SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX ADD, SUBTRACT, AND MULTIPLY

Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication is detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.
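The element-by-element addition described here amounts to the following software sketch, with tiles modeled as Python lists of rows. The function name `tile_add` is hypothetical; subtraction and multiplication would follow the same pattern with a different operator.

```python
def tile_add(src1, src2):
    """Add corresponding elements of two equally sized tiles into a new tile."""
    if len(src1) != len(src2) or any(len(r1) != len(r2) for r1, r2 in zip(src1, src2)):
        raise ValueError("source tiles must have the same dimensions")
    # For each element position, add the two source values and store the sum
    # at the same position of the destination.
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(src1, src2)]


print(tile_add([[1, 2], [3, 4]], [[10, 20], [30, 40]]))
# [[11, 22], [33, 44]]
```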

SYSTEMS, METHODS, AND APPARATUSES FOR TILE DIAGONAL

Embodiments detailed herein relate to matrix operations. In particular, tile diagonal support is described. For example, a processor is detailed having decode circuitry to decode an instruction having fields for an opcode, a source operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to write the identified source operand to each element along a main diagonal of the identified destination matrix operand.
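A minimal software sketch of the diagonal write described above is shown below. The function name `tile_diagonal` is hypothetical, and the sketch assumes that off-diagonal elements of the destination are left unchanged, which the abstract does not state either way.

```python
def tile_diagonal(value, dst):
    """Return a copy of dst with `value` written along its main diagonal."""
    result = [row[:] for row in dst]
    # The main diagonal of a non-square tile has min(rows, cols) elements.
    for i in range(min(len(result), len(result[0]))):
        result[i][i] = value
    return result


print(tile_diagonal(7, [[0, 0, 0], [0, 0, 0]]))
# [[7, 0, 0], [0, 7, 0]]
```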

SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX ADD, SUBTRACT, AND MULTIPLY

Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication is detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.

Systems, methods, and apparatuses for tile store

Embodiments detailed herein relate to matrix operations. In particular, the storing of a matrix (tile) to memory is described. For example, support for a store instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.
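Storing the configured rows of a tile to memory can be sketched as follows. The abstract does not say what the destination memory information contains, so this example assumes a base address plus a per-row stride; those parameters, the flat-list memory model, and the function name `tile_store` are all hypothetical.

```python
def tile_store(memory, base, stride, tile, configured_rows):
    """Copy the first `configured_rows` rows of `tile` into `memory` in place.

    Row r is written starting at index base + r * stride (assumed addressing).
    """
    for r in range(configured_rows):
        row = tile[r]
        start = base + r * stride
        if start + len(row) > len(memory):
            raise ValueError("row would be written past the end of memory")
        # Same-length slice assignment overwrites exactly len(row) elements.
        memory[start:start + len(row)] = row


memory = [0] * 10
tile = [[1, 2], [3, 4], [5, 6]]   # third row is treated as unconfigured
tile_store(memory, base=1, stride=4, tile=tile, configured_rows=2)
print(memory)
# [0, 1, 2, 0, 0, 3, 4, 0, 0, 0]
```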