Patent classifications
G06F9/30167
Instructions for vector operations with constant values
Disclosed embodiments relate to instructions for vector operations with immediate values. In one example, a system includes a memory and a processor that includes: fetch circuitry to fetch an instruction from a code storage, the instruction including an opcode, a destination identifier to specify a destination vector register, a first immediate, and a write mask identifier to specify a write mask register, the write mask register including at least one bit corresponding to each destination vector register element, the at least one bit to specify whether the destination vector register element is masked or unmasked; decode circuitry to decode the fetched instruction; and execution circuitry to execute the decoded instruction to use the write mask register to determine unmasked elements of the destination vector register and, when the opcode specifies to broadcast, broadcast the first immediate to one or more unmasked vector elements of the destination vector register.
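The masked broadcast described in this abstract can be illustrated with a minimal software sketch. The function name `masked_broadcast`, the use of a Python list as the vector register, and the choice that a set mask bit means "unmasked" while masked elements keep their old value are all assumptions made for illustration; the abstract does not specify them.

```python
def masked_broadcast(dest, imm, write_mask):
    """Return a copy of dest with imm written to every unmasked element.

    Assumptions (not stated in the abstract): bit i of write_mask set to 1
    marks element i as unmasked, and masked elements keep their old value.
    """
    return [imm if (write_mask >> i) & 1 else old
            for i, old in enumerate(dest)]


# Elements 0 and 2 are unmasked, so only they receive the immediate.
result = masked_broadcast([10, 11, 12, 13], imm=7, write_mask=0b0101)
print(result)  # [7, 11, 7, 13]
```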
Enabling removal and reconstruction of flag operations in a processor
In one embodiment, a processor includes fetch logic to fetch instructions, decode logic to decode the fetched instructions, and execution logic to execute at least some of the instructions. The decode logic may determine whether a flag portion of a first instruction to be folded is to be performed, and if not, accumulate a first immediate value of the first instruction with a folded immediate value obtained from an entry of an immediate buffer.
Look-up table initialize
A digital data processor includes an instruction memory storing instructions specifying a data processing operation and a data operand field, an instruction decoder coupled to the instruction memory for recalling instructions from the instruction memory and determining the operation and the data operand, and an operational unit coupled to a data register file and to the instruction decoder to perform a data processing operation upon an operand corresponding to an instruction decoded by the instruction decoder and storing results of the data processing operation. The operational unit is configured to perform a table write in response to a look up table initialization instruction by duplicating at least one data element from a source data register to create duplicated data elements, and writing the duplicated data elements to a specified location in a specified number of at least one table and a corresponding location in at least one other table.
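One possible reading of the table write can be sketched in software as follows. The function `lut_init`, its parameters `row` and `copies`, and the mapping of one duplicated element per table are hypothetical choices for illustration, not details taken from the abstract.

```python
def lut_init(tables, source, row, copies):
    """Duplicate each source element and write one duplicate per table.

    Assumption: every table receives its value at the same row, which stands
    in for the "specified location" and the "corresponding location" in the
    other tables.
    """
    duplicated = [elem for elem in source for _ in range(copies)]
    if len(duplicated) > len(tables):
        raise ValueError("more duplicated elements than tables")
    for table, value in zip(tables, duplicated):
        table[row] = value
    return duplicated


tables = [[0] * 4 for _ in range(4)]
lut_init(tables, source=[5, 9], row=2, copies=2)
print([t[2] for t in tables])  # [5, 5, 9, 9]
```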
Inferring future value for speculative branch resolution in a microprocessor
A system, processor, programming product and/or method including: an instruction dispatch unit configured to dispatch instructions of a compare immediate-conditional branch instruction sequence; and a compare register having at least one entry to hold information in a plurality of fields. Operations include: writing information from a first instruction of the compare immediate-conditional branch instruction sequence into one or more of the plurality of fields in an entry in the compare register; writing an immediate field and the ITAG of a compare immediate instruction into the entry in the compare register; writing, in response to dispatching a conditional branch instruction, an inferred compare result value into the entry in the compare register; comparing a computed compare result value to the inferred compare result value stored in the entry in the compare register; and not executing the compare immediate instruction or the conditional branch instruction.
Instruction decoding using hash tables
Systems and methods for instruction decoding using hash tables. An example method of constructing a decoding tree comprises: generating an aggregated vector of differentiating bit scores representing at least a subset of a set of processor instructions; identifying, based on the aggregated vector of differentiating bit scores, one or more opcode bit positions; and constructing a hash table implementing a current level of a decoding tree representing the subset of the set of processor instructions, wherein the hash table is indexed by one or more opcode bits identified by the one or more opcode bit positions.
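A single level of such a decoding tree might be sketched as below. The abstract does not define how differentiating bit scores are computed, so the score used here (how evenly a bit position splits the opcodes) is a hypothetical stand-in, as are the names `differentiating_scores` and `build_level` and the toy 4-bit opcodes.

```python
def differentiating_scores(opcodes, width):
    """Hypothetical score per bit: how evenly that bit splits the opcodes."""
    scores = []
    for bit in range(width):
        ones = sum((op >> bit) & 1 for op in opcodes)
        scores.append(min(ones, len(opcodes) - ones))
    return scores


def build_level(instructions, width, num_bits):
    """Build one hash-table level keyed by the highest-scoring opcode bits.

    instructions maps a mnemonic to its opcode (toy values, for illustration).
    """
    scores = differentiating_scores(list(instructions.values()), width)
    best = sorted(range(width), key=lambda b: -scores[b])[:num_bits]
    positions = sorted(best)
    table = {}
    for name, op in instructions.items():
        key = tuple((op >> b) & 1 for b in positions)
        table.setdefault(key, []).append(name)
    return positions, table


positions, table = build_level(
    {"add": 0b0000, "sub": 0b0001, "and": 0b0010, "or": 0b0011},
    width=4, num_bits=2)
print(positions)  # [0, 1]
```

A bucket that still holds more than one instruction would be resolved by building another level over just those instructions.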
Look-up table read
A digital data processor includes an instruction memory storing instructions specifying data processing operations and a data operand field, an instruction decoder coupled to the instruction memory for recalling instructions from the instruction memory and determining the operation and the data operand, and an operational unit coupled to a data register file and to the instruction decoder to perform an operation upon an operand corresponding to an instruction decoded by the instruction decoder and storing results of the operation. The operational unit is configured to perform a table recall in response to a look up table read instruction by recalling data elements from a specified location and an adjacent location to the specified location, in a specified number of at least one table, and storing the recalled data elements in successive slots in a destination register. Recalled data elements include at least one interpolated data element in the adjacent location.
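The table recall can be approximated in software as a read of two neighboring entries per table. In this sketch the function `lut_read`, the treatment of "adjacent" as the next row, and the list used as the destination register are assumptions; the interpolated element mentioned in the abstract is not modeled.

```python
def lut_read(tables, row, num_tables):
    """Read table[row] and table[row + 1] from each of the first num_tables
    tables and pack the values into successive destination slots.

    Assumption: the adjacent location is the next row of the same table.
    """
    dest = []
    for table in tables[:num_tables]:
        dest.append(table[row])      # specified location
        dest.append(table[row + 1])  # adjacent location
    return dest


tables = [[1, 2, 3], [10, 20, 30]]
print(lut_read(tables, row=1, num_tables=2))  # [2, 3, 20, 30]
```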
Accelerator circuit for mathematical operations with immediate values table
Embodiments of the present disclosure relate to an accelerator circuit with a dynamic immediate values table (IVT). The accelerator circuit includes an instruction memory, a data memory, and a vector circuit with the IVT storing multiple immediate values at multiple entries. The vector circuit reads a subset of instructions from the instruction memory, each instruction including at least one corresponding pointer to at least one corresponding entry in the IVT. The vector circuit further receives a subset of input data from the data memory corresponding to the subset of instructions. The vector circuit performs a respective operation in accordance with each instruction from the subset of instructions using a corresponding data vector of the received subset of input data identified in each instruction and at least one corresponding immediate value from the IVT pointed to by the at least one corresponding pointer to generate corresponding output data.
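The indirection through the IVT can be shown with a small interpreter sketch. The instruction tuple layout `(op, vector_id, ivt_pointer)`, the two operations `add` and `mul`, and the function name `run` are hypothetical; the abstract only states that each instruction carries a pointer into the table.

```python
# Hypothetical operation set for illustration only.
OPS = {
    "add": lambda x, c: x + c,
    "mul": lambda x, c: x * c,
}


def run(instructions, data, ivt):
    """Execute each instruction on its data vector, fetching the immediate
    from the IVT entry the instruction points to."""
    outputs = []
    for op, vector_id, ivt_pointer in instructions:
        imm = ivt[ivt_pointer]  # the immediate is looked up, not encoded inline
        outputs.append([OPS[op](x, imm) for x in data[vector_id]])
    return outputs


ivt = [2, 10]
data = {"v0": [1, 2, 3]}
print(run([("add", "v0", 1), ("mul", "v0", 0)], data, ivt))
# [[11, 12, 13], [2, 4, 6]]
```

Because the instructions hold only pointers, changing an IVT entry changes the constant used by every instruction that references it without rewriting the instruction memory.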
Apparatus and method for executing Boolean functions via forming indexes to an immediate value from source register bits
An apparatus and method are described for performing efficient Boolean operations in a pipelined processor which, in one embodiment, does not natively support three operand instructions. For example, in one embodiment, a processor comprises: a set of registers for storing packed operands; Boolean operation logic to execute a single instruction which uses three or more source operands packed in the set of registers, the Boolean operation logic to read at least three source operands and an immediate value to perform a Boolean operation on the three source operands, wherein the Boolean operation comprises: combining a bit read from each of the three operands to form an index to the immediate value, the index identifying a bit position within the immediate value; reading the bit from the identified bit position of the immediate value; and storing the bit from the identified bit position of the immediate value in a destination register.
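The index-into-immediate technique amounts to treating the immediate as an 8-entry truth table. The sketch below assumes a bit ordering in which the first operand supplies the most significant index bit and assumes an 8-bit operand width; the name `ternary_logic` and both of these choices are illustrative, not taken from the abstract.

```python
def ternary_logic(a, b, c, imm8, width=8):
    """Apply the 3-input Boolean function encoded in imm8 to each bit lane.

    For every bit position, one bit from each operand is combined into a
    3-bit index, and the bit of imm8 at that index becomes the result bit.
    Assumption: a supplies index bit 2, b bit 1, c bit 0.
    """
    result = 0
    for i in range(width):
        index = (((a >> i) & 1) << 2) | (((b >> i) & 1) << 1) | ((c >> i) & 1)
        result |= ((imm8 >> index) & 1) << i
    return result


# 0x96 encodes three-way XOR and 0x80 encodes three-way AND under this ordering.
print(ternary_logic(0b1100, 0b1010, 0b1001, 0x96))  # 15
print(ternary_logic(0b1100, 0b1010, 0b1001, 0x80))  # 8
```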
Providing code sections for matrix of arithmetic logic units in a processor
The present invention relates to a processor having a trace cache and a plurality of ALUs arranged in a matrix, comprising an analyser unit located between the trace cache and the ALUs, wherein the analyser unit analyses the code in the trace cache, detects loops, transforms the code, and issues to the ALUs sections of the code combined into blocks for joint execution for a plurality of clock cycles.
Method for forming constant extensions in the same execute packet in a VLIW processor
In a very long instruction word (VLIW) central processing unit, instructions are grouped into execute packets that execute in parallel. A constant may be specified or extended by bits in a constant extension instruction in the same execute packet. If an instruction includes an indication of constant extension, the decoder employs bits of a constant extension instruction to extend the constant of an immediate field. Two or more constant extension slots are permitted in each execute packet, each extending constants for a different predetermined subset of functional unit instructions. In an alternative embodiment, more than one functional unit may have constants extended from the same constant extension instruction employing the same extended bits. A long extended constant may be formed using the extension bits of two constant extension instructions.
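The effect of constant extension on a single instruction can be sketched as simple bit concatenation. The function `extend_constant`, the placement of the extension bits above the immediate field, and the field widths in the example are assumptions for illustration; the abstract does not give the bit layout.

```python
def extend_constant(imm_field, imm_bits, ext_value, extended):
    """Form the constant for one instruction in an execute packet.

    Assumption: when the instruction signals constant extension, the bits
    from the constant extension slot become the upper bits above the
    instruction's own imm_bits-wide immediate field.
    """
    if not extended:
        return imm_field
    return (ext_value << imm_bits) | imm_field


# A 5-bit immediate extended by 10 bits from the extension slot.
print(hex(extend_constant(0b10101, 5, 0x3FF, extended=True)))   # 0x7ff5
print(hex(extend_constant(0b10101, 5, 0x3FF, extended=False)))  # 0x15
```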