Patent classifications
G06F9/3853
SUPPORTING EVEN INSTRUCTION TAG ('ITAG') REQUIREMENTS IN A MULTI-SLICE PROCESSOR USING NULL INTERNAL OPERATIONS (IOPS)
Supporting even instruction tag (‘ITAG’) requirements in a multi-slice processor with null internal operations (IOPs) includes: receiving an IOP with an even ITAG requirement; determining that the IOP is to be assigned an odd ITAG; and inserting a null IOP into an instruction lane ahead of the IOP, wherein the null IOP is assigned the odd ITAG, and the IOP is assigned an even ITAG.
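The padding step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the `Iop` record, the `assign_itags` function, and the assumption that ITAGs are handed out sequentially starting from `next_itag` are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Iop:
    """Hypothetical internal operation record (names are illustrative)."""
    name: str
    needs_even_itag: bool = False
    itag: int = -1          # -1 means "not yet assigned"
    is_null: bool = False


def assign_itags(iops: List[Iop], next_itag: int = 0) -> List[Iop]:
    """Assign sequential ITAGs, inserting a null IOP ahead of any IOP
    that requires an even ITAG but would otherwise receive an odd one."""
    lane: List[Iop] = []
    for iop in iops:
        if iop.needs_even_itag and next_itag % 2 == 1:
            # The null IOP absorbs the odd ITAG.
            lane.append(Iop("null", itag=next_itag, is_null=True))
            next_itag += 1
        lane.append(replace(iop, itag=next_itag))
        next_itag += 1
    return lane


lane = assign_itags([Iop("add"), Iop("ld_pair", needs_even_itag=True), Iop("sub")])
print([(i.name, i.itag) for i in lane])
# [('add', 0), ('null', 1), ('ld_pair', 2), ('sub', 3)]
```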
Apparatus and method for compressing instruction for VLIW processor, and apparatus and method for fetching instruction
Provided are an instruction compression apparatus and method for a very long instruction word (VLIW) processor, and an instruction fetching apparatus and method. The instruction compression apparatus includes: an indicator generator configured to generate an indicator code that indicates an issue width of an instruction bundle to be executed in the VLIW processor, and a number of No-Operation (NOP) instruction bundles following the instruction bundle; an instruction compressor configured to compress the instruction bundle by removing at least one of NOP instructions from the instruction bundle and the NOP instruction bundles following the instruction bundle; and an instruction converter configured to include the generated indicator code in the compressed instruction bundle.
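As a rough illustration of the compression idea, the sketch below drops NOP instructions from each bundle, folds following all-NOP bundles into a count, and attaches an indicator pair `(issue_width, following_nop_bundles)` to each kept bundle. The string representation of instructions, the tuple encoding of the indicator, and the handling of a leading all-NOP bundle are assumptions made for the example. The abstract does not specify a bit-level encoding, and slot positions are not tracked here.

```python
from typing import List, Tuple

NOP = "nop"
Bundle = List[str]
Indicator = Tuple[int, int]  # (issue width, number of NOP bundles that follow)


def compress(bundles: List[Bundle]) -> List[Tuple[Indicator, Bundle]]:
    """Remove NOPs and record an indicator code per surviving bundle."""
    out: List[Tuple[Indicator, Bundle]] = []
    for bundle in bundles:
        ops = [op for op in bundle if op != NOP]
        if ops:
            out.append(((len(ops), 0), ops))
        elif out:
            # All-NOP bundle: fold it into the previous bundle's indicator.
            (width, nops), kept = out[-1]
            out[-1] = ((width, nops + 1), kept)
        else:
            # Assumption: a leading all-NOP bundle is kept as a width-0 entry.
            out.append(((0, 0), []))
    return out


program = [
    ["add", "nop", "mul", "nop"],
    ["nop", "nop", "nop", "nop"],
    ["nop", "nop", "nop", "nop"],
    ["ld", "st", "nop", "nop"],
]
print(compress(program))
# [((2, 2), ['add', 'mul']), ((2, 0), ['ld', 'st'])]
```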
VLIW processor including a state register for inter-slot data transfer and extended bits operations
A very long instruction word (VLIW) processor that performs efficient processing including extended bits operations is provided. The VLIW processor includes an instruction control unit, a register file unit, and an instruction execution unit. The instruction execution unit includes a plurality of slots, and a state register arranged between the second slot and the third slot to transfer N-bit data between the second and third slots. The VLIW processor stores data output from the third slot into the state register and uses the data, and thus achieves efficient processing including bit-expanded operations, such as processing performed in response to instructions commonly used in image processing, image recognition, and other processing, while preventing scaling up of the circuit.
EVENT HANDLING IN PIPELINE EXECUTE STAGES
A method includes receiving an execute packet that includes a first instruction and a second instruction and executing the first instruction and the second instruction using a pipeline. Executing the first and second instructions includes storing a result of the first instruction in a holding register; determining whether an event that interrupts execution of the execute packet occurs prior to completion of the executing of the second instruction; and based on the event not occurring, committing the result of the first instruction after completion of the executing of the second instruction.
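The deferred-commit behaviour can be modelled with a small Python sketch. Everything here is hypothetical: instructions are modelled as `(destination, function)` pairs, the register file is a dictionary, and `event_pending` stands in for whatever mechanism signals an interrupting event. Discarding both results when an event occurs is an assumption; the abstract only states what happens when no event occurs.

```python
from typing import Callable, Dict, Tuple

Regs = Dict[str, int]
Instr = Tuple[str, Callable[[Regs], int]]  # (destination register, operation)


def execute_packet(regs: Regs, first: Instr, second: Instr,
                   event_pending: Callable[[], bool]) -> bool:
    """Run a two-instruction execute packet; commit only if no event occurs."""
    dest1, op1 = first
    dest2, op2 = second

    holding = op1(regs)   # first result waits in a holding register
    result2 = op2(regs)   # second instruction still sees the old register state

    if event_pending():
        # Assumption: on an event, nothing from the packet is committed.
        return False

    regs[dest1] = holding  # commit the first result after the second completes
    regs[dest2] = result2
    return True


regs = {"a": 1, "b": 2}
ok = execute_packet(regs,
                    ("a", lambda r: r["a"] + r["b"]),
                    ("b", lambda r: r["a"] * 10),
                    lambda: False)
print(ok, regs)  # True {'a': 3, 'b': 10}
```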
Instruction fusion after register rename
Embodiments of the present invention include methods, systems, and computer program products for implementing instruction fusion after register rename. A computer-implemented method includes receiving, by a processor, a plurality of instructions at an instruction pipeline. The processor can further perform a register rename within the instruction pipeline in response to the received plurality of instructions. The processor can further determine that two or more of the plurality of instructions can be fused after the register rename. The processor can further fuse the two or more instructions that can be fused based on the determination to create one or more fused instructions. The processor can further perform an execution stage within the instruction pipeline to execute the plurality of instructions, including the one or more fused instructions.
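The ordering of rename followed by fusion can be sketched as below. This is a toy model under stated assumptions: the `Instr` record, the `p0, p1, ...` physical register naming, and the fusion criterion (an adjacent producer/consumer pair whose intermediate result is read by no later instruction in the window) are all hypothetical, since the abstract does not say which instructions qualify for fusion.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Instr:
    op: str
    dest: str
    srcs: Tuple[str, ...]


def rename(instrs: List[Instr]) -> List[Instr]:
    """Give every destination a fresh physical register (p0, p1, ...)."""
    mapping: Dict[str, str] = {}
    out: List[Instr] = []
    for n, ins in enumerate(instrs):
        srcs = tuple(mapping.get(s, s) for s in ins.srcs)
        mapping[ins.dest] = f"p{n}"
        out.append(Instr(ins.op, f"p{n}", srcs))
    return out


def fuse(instrs: List[Instr]) -> List[Instr]:
    """Fuse adjacent producer/consumer pairs (assumed fusion criterion)."""
    out: List[Instr] = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        # After rename, each physical register has exactly one producer,
        # so a simple membership test identifies the dependency.
        unused_later = all(cur.dest not in later.srcs for later in instrs[i + 2:])
        if nxt is not None and cur.dest in nxt.srcs and unused_later:
            srcs = cur.srcs + tuple(s for s in nxt.srcs if s != cur.dest)
            out.append(Instr(f"{cur.op}+{nxt.op}", nxt.dest, srcs))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


program = [
    Instr("shl", "r1", ("r2",)),
    Instr("add", "r3", ("r1", "r4")),
    Instr("sub", "r5", ("r3", "r6")),
]
for ins in fuse(rename(program)):
    print(ins)
```

In this example the `shl` and `add` are fused into one operation writing `p1`, and the `sub` is left as-is because it has no following consumer to pair with.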
Method of system for generating a cluster instruction set
A system for generating a cluster combination instruction set using machine learning, the system comprising a computing device configured to generate, as a function of a received cluster, a plurality of physical transfer paths from a distinct plurality of initiation points to a single locale, wherein the cluster comprises a cluster of a plurality of alimentary elements, determine, as a function of the plurality of physical transfer paths, a physical transfer pattern, generate an objective function of the plurality of physical transfer paths as a function of a plurality of constraints, select a physical transfer path that minimizes the objective function, determine a cluster combination instruction set for the physical transfer pattern to the single destination, and generate a representation of the cluster combination instruction set via a graphical user interface to at least a physical transfer apparatus and the plurality of alimentary element originators.
BIT SHUFFLE PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS
A processor includes packed data registers and a decode unit to decode an instruction. The instruction is to indicate a first source operand having at least one lane of bits, and a second source packed data operand having a number of sub-lane sized bit selection elements. An execution unit is coupled with the packed data registers and the decode unit. The execution unit, in response to the instruction, stores a result operand in a destination storage location. The result operand includes a different corresponding bit for each of the number of sub-lane sized bit selection elements. A value of each bit of the result operand corresponding to a sub-lane sized bit selection element is that of a bit of a corresponding lane of bits, of the at least one lane of bits of the first source operand, which is indicated by the corresponding sub-lane sized bit selection element.
Processor Core, Processor And Method For Executing A Composite Scalar-Vector Very Large Instruction Word (VLIW) Instruction
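The selection rule in this abstract can be written out as a short Python sketch. The lane width of 64 bits, the eight selection elements per lane, and the choice to reduce each selector modulo the lane width are assumptions for illustration; the abstract does not fix these sizes.

```python
from typing import List


def bit_shuffle(src_lanes: List[int], selectors: List[int],
                lane_bits: int = 64, elems_per_lane: int = 8) -> int:
    """Build a result with one bit per selector: bit i of the result is the
    bit of the corresponding source lane that selector i points at."""
    if len(selectors) != len(src_lanes) * elems_per_lane:
        raise ValueError("expected elems_per_lane selectors for each lane")
    result = 0
    for i, sel in enumerate(selectors):
        lane = src_lanes[i // elems_per_lane]   # lane this selector belongs to
        bit_index = sel % lane_bits             # assumption: high bits ignored
        result |= ((lane >> bit_index) & 1) << i
    return result


# One lane holding 0b1010, eight selectors choosing bits 1, 0, 3, 2, 1, 1, 0, 0.
print(bin(bit_shuffle([0b1010], [1, 0, 3, 2, 1, 1, 0, 0])))  # 0b110101
```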
A processor core includes a storage device which stores a composite very large instruction word (VLIW) instruction, an instruction unit which obtains the composite VLIW instruction from the storage device and decodes the composite VLIW instruction to determine an operation to perform, and a composite VLIW instruction execution unit which executes the decoded composite VLIW instruction to perform the operation.
Method for implementing a line speed interconnect structure
A method for line speed interconnect processing. The method includes receiving initial inputs from an input communications path, performing a pre-sorting of the initial inputs by using a first stage interconnect parallel processor to create intermediate inputs, and performing the final combining and splitting of the intermediate inputs by using a second stage interconnect parallel processor to create resulting outputs. The method further includes transmitting the resulting outputs out of the second stage at line speed.
Techniques for instruction group formation for decode-time instruction optimization based on feedback
A technique of processing instructions for execution by a processor includes determining whether a first property of a first instruction and a second property of a second instruction are compatible. The first instruction and the second instruction are grouped in an instruction group in response to the first and second properties being compatible and a feedback value generated by a feedback function indicating the instruction group has been historically beneficial with respect to a benefit metric of the processor. Group formation for the first and second instructions is performed according to another criterion, in response to the first and second properties being incompatible or the feedback value indicating the grouping of the first and second instructions has not been historically beneficial.
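The grouping decision can be sketched as follows. This is only an illustration under assumptions: instructions are modelled as `(opcode, property)` pairs, compatibility is taken to mean equal properties, the feedback function is a lookup table of signed scores keyed by opcode pair, and the fallback criterion is simply to place each instruction in its own group. None of these specifics come from the abstract.

```python
from typing import Dict, List, Tuple

Instr = Tuple[str, str]              # (opcode, property), both hypothetical
Feedback = Dict[Tuple[str, str], int]


def form_groups(first: Instr, second: Instr, feedback: Feedback) -> List[List[Instr]]:
    """Group two instructions only if they are compatible and the recorded
    feedback says grouping them has been beneficial in the past."""
    compatible = first[1] == second[1]
    beneficial = feedback.get((first[0], second[0]), 0) > 0
    if compatible and beneficial:
        return [[first, second]]
    # Assumed fallback criterion: issue each instruction in its own group.
    return [[first], [second]]


history = {("addis", "ld"): 3, ("mul", "add"): -1}
print(form_groups(("addis", "int"), ("ld", "int"), history))
# [[('addis', 'int'), ('ld', 'int')]]
print(form_groups(("mul", "int"), ("add", "int"), history))
# [[('mul', 'int')], [('add', 'int')]]
```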