Patent classifications
G06F7/762
SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX OPERATIONS
Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.
Systems, methods, and apparatus for matrix move
Detailed herein are embodiment systems, processors, and methods for matrix move. For example, a processor comprising decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to move each data element of the identified source matrix operand to corresponding data element position of the identified destination matrix operand is described.
SYSTEMS, METHODS, AND APPARATUSES FOR TILE STORE
Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.
Multi-packet processing with ordering rule enforcement
A system includes an input/output adapter operable to receive a plurality of packets in a single clock cycle. The system includes a controller operatively connected to the input/output adapter. The controller is operable to receive a first packet at a data link layer and determine a state of a first output indicator to maintain packet ordering. Based on determining that a first receiver formatting interface is selected by the first output indicator, the controller performs an alignment adjustment and output of the first packet by the first receiver formatting interface. Based on determining that a second receiver formatting interface is selected by the first output indicator, the controller performs the alignment adjustment and output of the first packet by the second receiver formatting interface.
SYSTEMS, METHODS, AND APPPARATUS FOR MATRIX MOVE
Detailed herein are embodiment systems, processors, and methods for matrix move. For example, a processor comprising decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to move each data element of the identified source matrix operand to corresponding data element position of the identified destination matrix operand is described.
Systems, methods, and apparatuses for matrix operations
Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.
Systems, methods, and apparatuses for tile transpose
Embodiments detailed herein relate to matrix operations. In particular, support for a matrix transpose instruction is detailed. In some embodiments, decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to transpose each row of elements of the identified source matrix operand into a corresponding column of the identified destination matrix operand are detailed.
Systems, methods, and apparatuses for tile matrix multiplication and accumulation
Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.
MULTI-PACKET PROCESSING WITH ORDERING RULE ENFORCEMENT
A system includes an input/output adapter operable to receive a plurality of packets in a single clock cycle. The system includes a controller operatively connected to the input/output adapter. The controller is operable to receive a first packet at a data link layer and determine a state of a first output indicator to maintain packet ordering. Based on determining that a first receiver formatting interface is selected by the first output indicator, the controller performs an alignment adjustment and output of the first packet by the first receiver formatting interface. Based on determining that a second receiver formatting interface is selected by the first output indicator, the controller performs the alignment adjustment and output of the first packet by the second receiver formatting interface.
Systems, methods, and apparatuses for tile load, multiplication and accumulation
Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand to memory.