Patent classifications
G06F9/3818
CHANNEL-PARALLEL COMPRESSION WITH RANDOM MEMORY ACCESS
A data compressor includes a zero-value remover, a zero bit mask generator, a non-zero values packer, and a row-pointer generator. The zero-value remover receives 2^N bit streams of values and outputs 2^N non-zero-value bit streams having zero values removed from each respective bit stream. The zero bit mask generator receives the 2^N bit streams of values and generates a zero bit mask for a predetermined number of values of each bit stream, in which each zero bit mask indicates a location of a zero value in the predetermined number of values corresponding to the zero bit mask. The non-zero values packer receives the 2^N non-zero-value bit streams and forms a group of packed non-zero values. The row-pointer generator generates a row-pointer for each group of packed non-zero values.
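The compression flow in this abstract can be illustrated with a minimal software sketch. It is not the patented hardware design: the function names, the group size, the bit ordering of the mask, and the choice of one row pointer per stream are all assumptions made for illustration.

```python
def compress_stream(values, group_size=8):
    """Split one stream into zero bit masks and packed non-zero values.

    Assumption: bit i of a mask is set when value i of its group is zero.
    """
    masks = []
    packed = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        mask = 0
        for i, v in enumerate(group):
            if v == 0:
                mask |= 1 << i      # record the zero's position
            else:
                packed.append(v)    # keep only non-zero values
        masks.append(mask)
    return masks, packed


def compress_channels(streams, group_size=8):
    """Compress several streams into one packed buffer.

    Assumption: one row pointer per stream, holding the offset of that
    stream's first non-zero value in the shared packed buffer.
    """
    all_masks, all_packed, row_pointers = [], [], []
    for stream in streams:
        row_pointers.append(len(all_packed))
        masks, packed = compress_stream(stream, group_size)
        all_masks.append(masks)
        all_packed.extend(packed)
    return all_masks, all_packed, row_pointers


def decompress_stream(masks, packed, length, group_size=8):
    """Rebuild one stream from its masks and its packed non-zero values."""
    out = []
    it = iter(packed)
    for mask in masks:
        for i in range(group_size):
            if len(out) == length:
                break
            out.append(0 if (mask >> i) & 1 else next(it))
    return out
```

In this sketch the row pointers are what allow random access: slicing the packed buffer from `row_pointers[k]` yields the non-zero values of stream `k` without decoding the earlier streams.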
METHOD AND SYSTEM FOR VEHICLE ENGAGEMENT CONTROL
A method includes receiving, by machine-learning logic, observations indicative of states associated with a first and a second group of vehicles arranged within an engagement zone during a first interval of an engagement between the first and the second group of vehicles. The machine-learning logic determines actions based on the observations that, when taken simultaneously by the first group of vehicles during the first interval, are predicted by the machine-learning logic to result in removal of one or more vehicles of the second group of vehicles from the engagement zone during the engagement. The machine-learning logic is trained using a reinforcement learning technique and on simulated engagements between the first and second group of vehicles to determine sequences of actions that are predicted to result in one or more vehicles of the second group being removed from the engagement zone. The machine-learning logic communicates the actions to the first group of vehicles.
Systems, methods, and apparatuses for matrix add, subtract, and multiply
Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication is detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.
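The element-by-element semantics described above can be modeled in software roughly as follows. This is a sketch under stated assumptions: the opcode names `TADD`, `TSUB`, and `TMUL`, the tuple encoding of an instruction, and the dictionary used as a tile register file are hypothetical and are not taken from the abstract.

```python
import operator

# Hypothetical opcode names mapped to the element-wise operation they perform.
ELEMENTWISE_OPS = {
    "TADD": operator.add,
    "TSUB": operator.sub,
    "TMUL": operator.mul,
}


def execute_tile_instruction(instruction, tiles):
    """Decode (opcode, src1, src2, dst) and apply the operation per element.

    `tiles` is a dict standing in for the tile register file; each tile is
    a list of equally sized rows.
    """
    opcode, src1, src2, dst = instruction
    op = ELEMENTWISE_OPS[opcode]
    a, b = tiles[src1], tiles[src2]
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("source tiles must have identical shapes")
    # For each element position, combine the two source values and store
    # the result at the same position of the destination tile.
    tiles[dst] = [[op(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
```

For example, with `tiles = {"t1": [[1, 2], [3, 4]], "t2": [[10, 20], [30, 40]]}`, executing `("TADD", "t1", "t2", "t0")` stores `[[11, 22], [33, 44]]` in `tiles["t0"]`.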
Controlling the operation of a decoupled access-execute processor
Data processing apparatuses, methods of data processing, instructions, and simulator computer programs for providing a corresponding instruction execution environment are disclosed. Decode circuitry is responsive to an instance of a predetermined instruction type to cause issue circuitry to issue at least one subsequent instruction for execution to one of first and second instruction execution circuitry which support decoupled access-execute instruction execution. The predetermined instruction type is thus a steering instruction for at least one subsequent instruction and the programmer is provided with a mechanism for determining which program instructions are treated as access instructions and which are treated as execute instructions.
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR LOADING A TILE OF A MATRIX OPERATIONS ACCELERATOR
Systems, methods, and apparatuses relating to one or more instructions for loading a tile of a matrix operations accelerator are described. In one embodiment, a system includes a matrix operations accelerator circuit comprising a two-dimensional grid of processing elements, a plurality of registers that represents a two-dimensional matrix coupled to the two-dimensional grid of processing elements, and a coupling to a cache; and a hardware processor core coupled to the matrix operations accelerator circuit and comprising a vector register, a decoder circuit to decode a single instruction into a decoded instruction, the single instruction including a first field that identifies the two-dimensional matrix, a second field that identifies a location in the cache, and a third field that identifies the vector register, and an opcode that indicates an execution circuit of the hardware processor core is to load elements into the plurality of registers that represents the two-dimensional matrix from the location in the cache by the coupling to the cache, and load one or more elements from the vector register into the plurality of registers that represents the two-dimensional matrix by a coupling of the hardware processor core to the matrix operations accelerator circuit that is separate from the coupling to the cache, and the execution circuit of the hardware processor core to execute the decoded instruction according to the opcode.
HARDENING LOAD HARDWARE AGAINST SPECULATION VULNERABILITIES
Embodiments for dynamically mitigating speculation vulnerabilities are disclosed. In an embodiment, an apparatus includes decode circuitry and load circuitry coupled to the decode circuitry. The decode circuitry is to decode a load hardening instruction to mitigate vulnerability to a speculative execution attack. The load circuitry is to be hardened in response to the load hardening instruction.
HARDENING BRANCH HARDWARE AGAINST SPECULATION VULNERABILITIES
Embodiments for dynamically mitigating speculation vulnerabilities are disclosed. In an embodiment, an apparatus includes decode circuitry and branch circuitry coupled to the decode circuitry. The decode circuitry is to decode a branch hardening instruction to mitigate vulnerability to a speculative execution attack. The branch circuitry is to be hardened in response to the branch hardening instruction.
SYSTEM FOR EXECUTING NEW INSTRUCTIONS AND METHOD FOR EXECUTING NEW INSTRUCTIONS
A method for executing new instructions includes the following steps: receiving an instruction and determining whether the received instruction is a new instruction; and, when the received instruction is a new instruction, entering a system management mode and simulating the execution of the received instruction by executing at least one old instruction in the system management mode.
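The idea of simulating a new instruction with older ones can be sketched as a toy interpreter. Everything named here is an assumption for illustration: the `ADD` and `MUL` old instructions, the `MULADD` new instruction, the `tmp` scratch register, and the trace strings that stand in for entering and leaving system management mode.

```python
def run_old_instruction(regs, opcode, dst, a, b):
    """Execute one instruction that the (hypothetical) core supports natively."""
    if opcode == "ADD":
        regs[dst] = regs[a] + regs[b]
    elif opcode == "MUL":
        regs[dst] = regs[a] * regs[b]
    else:
        raise ValueError(f"unsupported opcode: {opcode}")


def emulate_muladd(regs, dst, a, b, c):
    """Simulate the hypothetical new MULADD (dst = a * b + c) with old instructions."""
    run_old_instruction(regs, "MUL", "tmp", a, b)
    run_old_instruction(regs, "ADD", dst, "tmp", c)


# Table of instructions treated as "new", with their emulation routines.
NEW_INSTRUCTIONS = {"MULADD": emulate_muladd}


def execute(regs, instruction, trace):
    """Run natively when possible; otherwise switch modes and emulate."""
    opcode, *operands = instruction
    if opcode in NEW_INSTRUCTIONS:
        trace.append("enter_smm")   # stands in for entering system management mode
        NEW_INSTRUCTIONS[opcode](regs, *operands)
        trace.append("exit_smm")
    else:
        run_old_instruction(regs, opcode, *operands)
```

The sketch clobbers a scratch register for simplicity; a real implementation would need to save and restore any state that the emulation routine touches.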
Systems and methods for performing instructions to convert to 16-bit floating-point format
Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
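The abstract does not say which 16-bit floating-point format is targeted. As one possible illustration, the sketch below assumes bfloat16 (the upper 16 bits of an IEEE 754 single-precision value) and round-to-nearest-even; the function names and the NaN handling are likewise assumptions.

```python
import struct


def float32_to_bf16_bits(x):
    """Convert a Python float to a bfloat16 bit pattern via float32.

    Assumptions: bfloat16 target format, round-to-nearest-even, and NaN
    inputs mapped to a quiet NaN.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add a bias so that truncating the low 16 bits rounds to nearest,
    # with ties going to the value whose lowest kept bit is zero.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def convert_vector(source):
    """Convert each single-precision element into the matching destination slot."""
    return [float32_to_bf16_bits(x) for x in source]
```

For instance, `float32_to_bf16_bits(1.0)` yields `0x3F80`, and a value exactly halfway between two representable results, such as `1.00390625`, rounds to the neighbor with an even low bit (`0x3F80`).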
ISA OPCODE PARAMETERIZATION AND OPCODE SPACE LAYOUT RANDOMIZATION
An embodiment of an apparatus may comprise a memory to store configuration information, an instruction decoder to decode an instruction having one or more fields including an opcode field, and circuitry communicatively coupled to the instruction decoder and the memory, the circuitry to determine if an opcode value in the opcode field of the instruction corresponds to an altered opcode value in the stored configuration information that correlates one or more altered opcode values with respective original opcode values, and, if so determined, decode the instruction based on one of the original opcode values correlated to the altered opcode value in the stored configuration information. Other embodiments are disclosed and claimed.
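The lookup described above, from an altered opcode value back to its original, might be modeled like this. The opcode values, mnemonics, and function names are hypothetical, and passing unaltered opcodes straight through is an assumption the abstract does not confirm.

```python
# Hypothetical original opcode values and the operations they denote.
ORIGINAL_OPCODES = {0x01: "ADD", 0x02: "SUB"}


def decode(opcode_value, altered_to_original):
    """Decode an opcode, consulting the stored configuration first.

    `altered_to_original` stands in for the stored configuration
    information that correlates altered opcode values with originals.
    """
    if opcode_value in altered_to_original:
        # The instruction uses an altered opcode: decode it as the original.
        opcode_value = altered_to_original[opcode_value]
    if opcode_value not in ORIGINAL_OPCODES:
        raise ValueError(f"undefined opcode: {opcode_value:#x}")
    return ORIGINAL_OPCODES[opcode_value]
```

With a configuration of `{0x7A: 0x01}`, an instruction carrying opcode `0x7A` decodes as `ADD`, while `0x02` still decodes as `SUB`.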