Patent classifications
G06F9/30152
NON-TRANSITORY COMPUTER-READABLE MEDIUM, ANALYSIS DEVICE, AND ANALYSIS METHOD
The present disclosure relates to a non-transitory computer-readable recording medium storing an analysis program that causes a computer to execute a process. The process includes sampling an instruction address of one of instructions included in a program during execution of the program, identifying a first function that includes the sampled instruction address in an address range, rewriting mark information associated with the identified first function, identifying first information corresponding to the instruction address of the first function among a plurality of first information based on the rewritten mark information, identifying second information corresponding to the instruction address of the first function among a plurality of second information based on the rewritten mark information, storing the first information and the second information in a memory, and analyzing performance of the program based on the first information and the second information stored in the memory.
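The sample-to-function lookup described above can be sketched as follows. This is a minimal illustration, not the patented method: the symbol table, the per-function mark flags, and the two lookup tables FIRST_INFO and SECOND_INFO are hypothetical stand-ins, since the abstract does not say what the first and second information contain.

```python
import bisect

# Hypothetical symbol table: (start, end_exclusive, name), sorted by start address.
FUNCTIONS = [
    (0x1000, 0x1040, "parse"),
    (0x1040, 0x10C0, "compute"),
    (0x2000, 0x2010, "flush"),
]
STARTS = [start for start, _, _ in FUNCTIONS]

# Hypothetical "first" and "second" information tables, keyed by function name.
FIRST_INFO = {"parse": "parse.c", "compute": "compute.c", "flush": "io.c"}
SECOND_INFO = {"parse": 12, "compute": 40, "flush": 7}


def find_function(address):
    """Return the name of the function whose address range contains address, or None."""
    i = bisect.bisect_right(STARTS, address) - 1
    if i < 0:
        return None
    start, end, name = FUNCTIONS[i]
    return name if start <= address < end else None


def collect(samples):
    """Mark every function hit by a sample, then store info only for marked functions."""
    marks = {name: False for _, _, name in FUNCTIONS}
    for address in samples:
        name = find_function(address)
        if name is not None:
            marks[name] = True  # "rewriting mark information" (assumed to be a flag)
    return {
        name: (FIRST_INFO[name], SECOND_INFO[name])
        for name, marked in marks.items()
        if marked
    }
```

Calling collect([0x1004, 0x1050, 0x1FFF]) stores entries for parse and compute only; the address 0x1FFF falls in the gap between function ranges and is ignored.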
Instruction packing scheme for VLIW CPU architecture
A processor is provided and includes a core that is configured to perform a decode operation on a multi-instruction packet comprising multiple instructions. The decode operation includes receiving the multi-instruction packet that includes first and second instructions. The first instruction includes a primary portion at a fixed first location and a secondary portion. The second instruction includes a primary portion at a fixed second location between the primary portion of the first instruction and the secondary portion of the first instruction. An operational code portion of the primary portion of each of the first and second instructions is accessed and decoded. A first instruction packet including the primary and secondary portions of the first instruction is created, and a second instruction packet including the primary portion of the second instruction is created. The first and second instruction packets are dispatched to respective first and second functional units.
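A minimal sketch of such a decode step, under an assumed layout that does not come from the abstract: a 64-bit packet whose bits 0-15 hold the primary portion of the first instruction, bits 16-31 the primary portion of the second instruction, and bits 32-63 the secondary portion of the first instruction. The 4-bit opcode field is also hypothetical.

```python
PRIMARY_MASK = 0xFFFF        # assumed 16-bit primary portions
SECONDARY_MASK = 0xFFFFFFFF  # assumed 32-bit secondary portion
OPCODE_MASK = 0xF            # assumed 4-bit opcode in the low bits of each primary


def decode_packet(packet):
    """Split one multi-instruction packet into two per-unit instruction packets."""
    primary1 = packet & PRIMARY_MASK              # fixed first location
    primary2 = (packet >> 16) & PRIMARY_MASK      # fixed second location, in between
    secondary1 = (packet >> 32) & SECONDARY_MASK  # belongs to the first instruction

    first = {
        "opcode": primary1 & OPCODE_MASK,
        "primary": primary1,
        "secondary": secondary1,
    }
    second = {"opcode": primary2 & OPCODE_MASK, "primary": primary2}
    return first, second
```

Because both primary portions sit at fixed bit positions, both opcodes can be read without first decoding the other instruction, which is what makes the fixed placement attractive for a decoder.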
System, device, and method for obtaining instructions from a variable-length instruction set
An instruction processing device and an instruction processing method are disclosed. The instruction processing device includes: an instruction boundary prediction unit including circuitry configured to acquire an instruction packet of a variable-length instruction set and to add instruction prediction information to a plurality of instruction meta-fields in the instruction packet; and an instruction pipeline structure comprising an instruction fetch unit including an instruction boundary determination unit including circuitry configured to determine instruction boundary information according to the instruction prediction information to obtain one or more instructions in the instruction packet.
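The two stages can be modeled roughly as below. The encoding is an assumption made for illustration only: a packet is a list of 16-bit parcels, and a parcel whose low two bits are 0b11 begins a two-parcel instruction. The prediction information is modeled as one end-of-instruction flag per parcel.

```python
def predict_ends(parcels):
    """Hypothetical boundary predictor: flag the last parcel of each instruction.

    Assumed encoding: low two bits 0b11 start a two-parcel instruction;
    any other value is a one-parcel instruction.
    """
    ends = [False] * len(parcels)
    i = 0
    while i < len(parcels):
        length = 2 if (parcels[i] & 0b11) == 0b11 else 1
        last = i + length - 1
        if last < len(parcels):  # an instruction straddling the packet gets no end flag
            ends[last] = True
        i += length
    return ends


def split_instructions(parcels, ends):
    """Boundary determination: cut the packet wherever an end flag is set."""
    instructions, current = [], []
    for parcel, is_end in zip(parcels, ends):
        current.append(parcel)
        if is_end:
            instructions.append(tuple(current))
            current = []
    return instructions
```

For the parcels [0x0001, 0x0003, 0xABCD, 0x0002], the predictor flags parcels 0, 2, and 3, and the splitter returns three instructions: (0x0001,), (0x0003, 0xABCD), and (0x0002,).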
SYSTEMS AND METHODS FOR PROCESSING OUT-OF-ORDER EVENTS
The present disclosure provides systems and methods for processing out-of-order events. In an example, a computer-implemented method includes obtaining data and committing the obtained data to a fixed-size storage pool that includes a plurality of slots and a pool index comprising a fixed-length array. Committing the data involves acquiring a slot in the plurality of slots, locking the acquired slot, storing the obtained data in the acquired slot, updating the pool index by updating the array element corresponding to the acquired slot so that the element stores an indication of the obtained data, and unlocking the acquired slot. The method then transmits an indication that the data is available.
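The commit sequence can be sketched as a small class. The round-robin slot acquisition, the class and method names, and the use of the returned slot number as the "data available" indication are all assumptions; the abstract does not specify them.

```python
import threading


class StoragePool:
    """Fixed-size pool: a list of slots plus a fixed-length index array."""

    def __init__(self, size):
        self.slots = [None] * size
        self.index = [None] * size  # one index element per slot
        self.locks = [threading.Lock() for _ in range(size)]
        self._next = 0
        self._next_lock = threading.Lock()

    def _acquire_slot(self):
        # Assumed policy: round-robin, overwriting the oldest slot when full.
        with self._next_lock:
            slot = self._next
            self._next = (self._next + 1) % len(self.slots)
        return slot

    def commit(self, key, data):
        """Store data in a slot and record key in the matching index element."""
        slot = self._acquire_slot()
        with self.locks[slot]:       # lock the acquired slot
            self.slots[slot] = data  # store the obtained data
            self.index[slot] = key   # update the pool index element
        # Lock released here; the slot number serves as the availability signal.
        return slot
```

Since the pool and its index never grow, a reader can locate data by scanning the fixed-length index array rather than a resizable structure.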
Method and apparatus for executing vector instructions with merging behavior
A processor includes a register file and control logic that detects multiple different sets of sequential zero bits of a register in the register file, wherein each of the multiple different sets has a bit length that corresponds to a partial instruction width. The control logic operates with the register file at a first partial instruction width or a second partial instruction width depending on the number of sets of zero bits detected in the register. In certain examples, the control logic causes operation at the first instruction width, which avoids merging of a first bit length of data in the register, and operation at the second instruction width, which avoids merging of a second bit length of data in the register. In some examples, a register rename map table includes multiple zero bits that identify the detected multiple different sets of bits of sequential zeros.
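A rough software model of the zero-detection idea, under assumptions not stated in the abstract: a 128-bit register is viewed as four 32-bit chunks, one zero flag is kept per chunk, and an operation can skip merging at a given width when every chunk above that width is already zero.

```python
REGISTER_BITS = 128  # assumed register width
CHUNK_BITS = 32      # assumed granularity of the zero-detection flags


def zero_flags(value):
    """Return one flag per chunk, lowest chunk first; True means the chunk is all zeros."""
    mask = (1 << CHUNK_BITS) - 1
    count = REGISTER_BITS // CHUNK_BITS
    return [((value >> (i * CHUNK_BITS)) & mask) == 0 for i in range(count)]


def mergeless_width(value):
    """Smallest operating width (in bits) for which all higher chunks are zero."""
    flags = zero_flags(value)
    width = REGISTER_BITS
    # Walk down from the top chunk while the upper chunks are known to be zero.
    for i in range(len(flags) - 1, 0, -1):
        if not flags[i]:
            break
        width = i * CHUNK_BITS
    return width
```

In this model a register holding the value 1 can be operated on at 32 bits without merging, while a register with bit 40 set needs 64 bits.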
PROCESSOR AND INSTRUCTION SET
A processor includes a register file having a plurality of register file addresses, a processing unit, configured to perform processing in accordance with a configuration defined by information stored in the register file, and an instruction sequencer. The instruction sequencer is configured to control the processing unit by retrieving a sequence of instructions from a memory, in which each instruction includes an opcode, and a subset of the instructions includes a data portion. For each instruction in the sequence of instructions, the instruction sequencer performs an action defined by the opcode. The action for the subset of the opcodes includes writing the data portion to a register file address defined by the opcode. The sequence of instructions includes variable length instructions.
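A toy interpreter illustrates how such a sequencer might walk a variable-length instruction stream. The encoding here is invented for the example: an opcode byte with the top bit set carries a register file address in its low seven bits and is followed by a two-byte little-endian data portion; the one-byte opcode 0x01 stands for "run the processing unit with the current configuration".

```python
RUN_OPCODE = 0x01  # hypothetical one-byte instruction with no data portion


def run(program, register_count=16):
    """Execute a byte program; return the final register file and a log of runs."""
    registers = [0] * register_count
    runs = []
    pc = 0
    while pc < len(program):
        opcode = program[pc]
        if opcode & 0x80:
            # Three-byte instruction: the opcode itself names the register address.
            address = opcode & 0x7F
            if address >= register_count or pc + 3 > len(program):
                raise ValueError(f"bad write instruction at offset {pc}")
            registers[address] = int.from_bytes(program[pc + 1:pc + 3], "little")
            pc += 3
        elif opcode == RUN_OPCODE:
            # One-byte instruction: snapshot the configuration the unit would use.
            runs.append(list(registers))
            pc += 1
        else:
            raise ValueError(f"unknown opcode {opcode:#x} at offset {pc}")
    return registers, runs
```

Folding the register address into the opcode keeps configuration writes compact, and instructions without a data portion cost only a single byte.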
AI synaptic coprocessor
A synaptic coprocessor may include a memory configured to store a plurality of Very Long Data Words (VLDWs), each serving as a test VLDW having a length in the range of about one thousand bits to one million or more bits and containing encoded information that is distributed across the length of the VLDW. A processor generates search terms, and a processing logic unit receives a test VLDW from the memory, receives a search term from the processor, and computes a Boolean inner product between the search term and the test VLDW that is indicative of a measure of similarity between the two. Optionally, buffers within logic circuits of processing pipelines may receive the test VLDWs.
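One plausible reading of the similarity computation is a bitwise AND followed by a count of the set bits. The sketch below assumes that reading and represents each VLDW as a Python integer, which can hold arbitrarily many bits; the function names are illustrative.

```python
def overlap(search_term, vldw):
    """Count the bit positions that are set in both the search term and the VLDW."""
    return bin(search_term & vldw).count("1")


def best_match(search_term, memory):
    """Return (index, score) of the stored VLDW with the largest overlap."""
    scores = [overlap(search_term, word) for word in memory]
    best = max(range(len(memory)), key=scores.__getitem__)
    return best, scores[best]
```

With memory = [0b1010, 0b1111, 0b0001] and the search term 0b1011, the overlaps are 2, 3, and 1, so best_match returns (1, 3).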
PARALLEL INSTRUCTION EXTRACTION METHOD AND READABLE STORAGE MEDIUM
The invention relates to the technical field of processors, in particular to a method for extracting instructions in parallel and a readable storage medium. The method generates a valid vector of fetched instructions according to the instruction end position vector s_mark_end and, through logical “AND” and logical “OR” operations, performs instruction decoding, instruction address calculation, and branch target address calculation at each position in parallel. Multiple instructions are thereby fetched in parallel. Because there is no serial dependence between the instructions, timing converges easily, so a higher main frequency can be obtained.
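The core observation can be sketched with plain integer bit operations. Assumptions made for this example: bit i of s_mark_end is set when parcel i is the last parcel of an instruction, the fetch group begins on an instruction boundary, and instructions span at most two parcels. Only the name s_mark_end comes from the abstract.

```python
def start_vector(s_mark_end, width):
    """A parcel starts an instruction if it is parcel 0 or follows an end parcel."""
    mask = (1 << width) - 1
    return ((s_mark_end << 1) | 1) & mask


def extract(parcels, s_mark_end):
    """Pick out instructions; each position is examined independently of the others."""
    width = len(parcels)
    starts = start_vector(s_mark_end, width)
    instructions = []
    for i in range(width):  # in hardware, every position would be evaluated at once
        is_start = (starts >> i) & 1
        is_end = (s_mark_end >> i) & 1
        if is_start and is_end:
            instructions.append((parcels[i],))
        elif is_start and i + 1 < width and (s_mark_end >> (i + 1)) & 1:
            instructions.append((parcels[i], parcels[i + 1]))
    return instructions
```

For four parcels with s_mark_end = 0b1101, the start vector is 0b1011, so positions 0, 1, and 3 begin instructions of one, two, and one parcels respectively. No position needs the decoded length of an earlier instruction, which is the property the abstract credits for easier timing closure.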
Wavefront selection and execution
Techniques are provided for executing wavefronts. The techniques include at a first time for issuing instructions for execution, performing first identifying, including identifying that sufficient processing resources exist to execute a first set of instructions together within a processing lane; in response to the first identifying, executing the first set of instructions together; at a second time for issuing instructions for execution, performing second identifying, including identifying that no instructions are available for which sufficient processing resources exist for execution together within the processing lane; and in response to the second identifying, executing an instruction independently of any other instruction.
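The issue decision at each time step can be modeled with a simple resource check. The cost model is hypothetical: each ready instruction has a numeric resource cost, a lane has a fixed capacity, at most two instructions are paired, and any single instruction is assumed to fit on its own.

```python
from itertools import combinations


def select(ready, capacity):
    """Choose what to issue this cycle.

    ready: list of (name, cost) pairs in priority order.
    Returns two names if some pair fits within capacity, otherwise the first
    ready instruction alone, or an empty list when nothing is ready.
    """
    for first, second in combinations(ready, 2):
        if first[1] + second[1] <= capacity:
            return [first[0], second[0]]  # execute together in the lane
    return [ready[0][0]] if ready else []  # execute independently
```

With a capacity of 4, the candidates v_add (cost 2) and v_mul (cost 2) issue together, whereas two instructions of cost 3 each cannot be paired and the first one issues alone.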