Patent classifications
G06F9/30076
COMPILER DEVICE, INSTRUCTION GENERATION METHOD, PROGRAM, COMPILING METHOD, AND COMPILER PROGRAM
A compiler device for generating an instruction sequence to be executed by an arithmetic processing device includes at least one memory and at least one processor. The at least one processor is configured to receive a first instruction sequence for a first process and a second instruction sequence for a second process to be executed after the first process; generate third instructions, each third instruction being generated by merging a first instruction included in the first instruction sequence and a second instruction included in the second instruction sequence; and generate a third instruction sequence by concatenating the third instructions, the instructions in the first instruction sequence that are not merged into the third instructions, and the instructions in the second instruction sequence that are not merged into the third instructions.
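The merge-then-concatenate step described above can be illustrated with a minimal Python sketch. It is not the claimed method: the `Instr` type, the `unit` field, and the rule that two instructions can be merged when they occupy different execution units are all assumptions made for illustration, and data dependencies are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str    # hypothetical mnemonic
    unit: str  # hypothetical execution unit the instruction occupies


def merge_sequences(first, second):
    """Return a third sequence: merged pairs, then unmerged instructions
    from `first`, then unmerged instructions from `second`.

    Assumed rule (not from the abstract): two instructions can be merged
    when they use different execution units.
    """
    merged = []
    leftover_first = []
    used_second = set()
    for a in first:
        # Find the earliest unused instruction in `second` that can pair with `a`.
        partner = next(
            (j for j, b in enumerate(second)
             if j not in used_second and a.unit != b.unit),
            None,
        )
        if partner is None:
            leftover_first.append((a,))
        else:
            used_second.add(partner)
            merged.append((a, second[partner]))
    leftover_second = [(b,) for j, b in enumerate(second) if j not in used_second]
    return merged + leftover_first + leftover_second
```

Each element of the result is a tuple: a two-element tuple stands for a merged (third) instruction, and a one-element tuple for an instruction carried over unmerged.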
SUPPORTING LARGE-WORD OPERATIONS IN A REDUCED INSTRUCTION SET COMPUTER ("RISC") PROCESSOR
A Reduced Instruction Set Computer (“RISC”) supporting large-word operations in a computing environment is disclosed. In one implementation, in response to receiving one or more control signals from a central processing unit (“CPU”), a set of operations are executed on a state of a special purpose execution unit (“SPU”) having a plurality of SPU registers, the SPU being associated with the CPU and the state of the SPU having word widths of one or more of the plurality of registers being greater in size than word widths of a plurality of CPU registers of a computing system and a set of state-master bits to synchronize the state of the SPU and a state of the CPU. The results of the set of operations are stored in the plurality of CPU registers or an alternative set of the plurality of SPU registers.
TILE-BASED RESULT BUFFERING IN MEMORY-COMPUTE SYSTEMS
A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. A first tile in a first node can include a processor with a processor output and a first register network configured to receive information from the processor output and information from one or more of the multiple other tiles in the first node. In response to an output instruction and a delay instruction, the register network can provide an output signal to one of the multiple other tiles in the first node. Based on the output instruction, the output signal can include one or the other of the information from the processor output and the information from one or more of the multiple other tiles in the first node. A timing characteristic of the output signal can depend on the delay instruction.
IMPLEMENTATION METHOD AND SYSTEM OF RISC_V VECTOR INSTRUCTION SET VSETVLI INSTRUCTION
The invention relates to the technical field of CPUs, in particular to a method and system for implementing the vsetvli instruction of the RISC-V vector instruction set. When the CPU executes out of order, the rename module allocates vectag[n:0] information and determines whether each instruction is vsetvli. If the instruction is vsetvli, vectag is incremented by 1; if it is a non-vsetvli instruction, vectag remains unchanged. The instructions are then sent to the execution unit: the vsetvli instruction is distributed to the csr module, and the corresponding other vector instructions are distributed to the vpu module. With the present invention, non-vsetvli{i} vector instructions execute efficiently. Data is selected by mask, which reduces power consumption, execution cycles, and latency, and the invention has strong market application prospects.
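The tagging and dispatch behaviour described above can be sketched in a few lines of Python. This is only a software model under stated assumptions: the function name, the `tag_bits` parameter, the wrap-around of the tag, and the choice to give a vsetvli its own incremented tag are all hypothetical and are not taken from the abstract.

```python
def rename_and_dispatch(instructions, tag_bits=3):
    """Model of the rename stage: tag each instruction with a vectag.

    Assumptions (not from the abstract): the tag wraps at 2**tag_bits,
    and a vsetvli carries the tag value after its own increment.
    """
    mask = (1 << tag_bits) - 1
    vectag = 0
    csr_queue = []  # vsetvli instructions go to the csr module
    vpu_queue = []  # all other vector instructions go to the vpu module
    for op in instructions:
        if op == "vsetvli":
            vectag = (vectag + 1) & mask  # vectag + 1 on every vsetvli
            csr_queue.append((op, vectag))
        else:
            vpu_queue.append((op, vectag))  # vectag unchanged
    return csr_queue, vpu_queue
```

In this model, a vector instruction in the vpu queue can be matched to the vsetvli that configured it by comparing tags, which is one plausible reason for carrying the tag through out-of-order execution.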
Speculatively executing instructions that follow a status updating instruction
A data processing apparatus is provided that comprises fetch circuitry to fetch an instruction stream comprising a plurality of instructions, including a status updating instruction, from storage circuitry. Status storage circuitry stores a status value. Execution circuitry executes the instructions, wherein at least some of the instructions are executed in an order other than in the instruction stream. For the status updating instruction, the execution circuitry is adapted to update the status value based on execution of the status updating instruction. Flush circuitry flushes, when the status storage circuitry is updated, following instructions that appear after the status updating instruction in the instruction stream.
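As a rough illustration of the flush behaviour described above, the following Python sketch models a stream executed out of order. The mnemonic `set_status`, the function name, and the way the status value is updated are assumptions; re-execution of the flushed instructions is deliberately left out.

```python
def execute_with_flush(stream, exec_order):
    """Execute `stream` (program order) in the order given by `exec_order`.

    Returns (status, completed, flushed), where `completed` and `flushed`
    are lists of program-order indices. "set_status" is a hypothetical
    status updating instruction.
    """
    status = 0
    completed = []
    flushed = []
    for idx in exec_order:
        completed.append(idx)
        if stream[idx] == "set_status":
            status += 1  # hypothetical update of the status value
            # Flush instructions that follow the status update in program
            # order but were already executed speculatively.
            flushed.extend(i for i in completed if i > idx)
            completed = [i for i in completed if i <= idx]
    return status, completed, flushed
```

For the stream `["add", "set_status", "mul", "sub"]` executed in the order `[0, 2, 1, 3]`, the `mul` at index 2 runs ahead of the status update and is flushed, while `sub` executes afterwards and is kept.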
Supporting large-word operations in a reduced instruction set computer (“RISC”) processor
A Reduced Instruction Set Computer (“RISC”) supporting large-word operations in a computing environment is disclosed. In one implementation, in response to receiving one or more control signals from a central processing unit (“CPU”), a set of operations are executed on a state of a special purpose execution unit (“SPU”) having a plurality of SPU registers, the SPU being associated with the CPU and the state of the SPU having word widths of one or more of the plurality of registers being greater in size than word widths of a plurality of CPU registers of a computing system and a set of state-master bits to synchronize the state of the SPU and a state of the CPU. The results of the set of operations are stored in the plurality of CPU registers or an alternative set of the plurality of SPU registers.
INSTRUCTION EXECUTION METHOD AND INSTRUCTION EXECUTION DEVICE
An instruction configuration and execution method includes the following steps. A target instruction is received through an instruction cache. The target instruction is decoded by an instruction translator. It is determined whether the target instruction has the authority to read or write the model specific register in an unprivileged state. It is determined whether the model specific register index of the specific instruction corresponds to a specific model specific register, so as to order the microprocessor to perform an instruction serialization operation.
INSTRUCTION EXECUTION METHOD AND INSTRUCTION EXECUTION DEVICE
An instruction execution method for a microprocessor is provided. The microprocessor includes a model specific register (MSR). The instruction execution method includes the following steps. A target instruction is received using an instruction cache. The target instruction is decoded using an instruction translator to determine whether the target instruction is a specific instruction. When the target instruction is the specific instruction, a model specific register index of the target instruction is obtained to directly read or write the model specific register.
Processors, methods, systems, and instructions to protect shadow stacks
A processor of an aspect includes a decode unit to decode an instruction. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to determine that an attempted change due to the instruction, to a shadow stack pointer of a shadow stack, would cause the shadow stack pointer to exceed an allowed range. The execution unit is also to take an exception in response to determining that the attempted change to the shadow stack pointer would cause the shadow stack pointer to exceed the allowed range. Other processors, methods, systems, and instructions are disclosed.
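The range check described above can be sketched as follows. This is a software analogy only, not the processor's implementation: the exception class, the function name, and the `base`/`limit` parameters that define the allowed range are hypothetical.

```python
class ShadowStackRangeError(Exception):
    """Hypothetical stand-in for the exception taken by the execution unit."""


def update_ssp(ssp, delta, base, limit):
    """Apply `delta` to the shadow stack pointer `ssp`.

    Raises ShadowStackRangeError if the new pointer would fall outside
    the allowed range [base, limit] (bounds are assumed inclusive).
    """
    new_ssp = ssp + delta
    if not (base <= new_ssp <= limit):
        raise ShadowStackRangeError(
            f"shadow stack pointer {new_ssp:#x} outside [{base:#x}, {limit:#x}]"
        )
    return new_ssp
```

The check is made before the pointer is committed, so an out-of-range change never becomes visible; the caller sees either the new pointer or the exception.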
DETERMINISTIC REPLAY OF A MULTI-THREADED TRACE ON A MULTI-THREADED PROCESSOR
At least one computer-readable storage medium comprising instructions for execution by at least one graphics processing unit (GPU) that, when executed, cause the at least one GPU to: obtain program code for tracing, the program code including a plurality of instructions; identify from the plurality of instructions of the program code events to be synchronized; instrument the program code corresponding to one or more of the events identified, by inserting instructions that support monitoring code; execute the instrumented program code on at least a plurality of hardware threads of the GPU and generate trace data; replay the identified events according to an order of occurrence of the events identified; and report a GPU state indicating a utilization of the GPU based; and wherein to report the GPU state includes to indicate when the GPU executes non-graphics related tasks.