IMPLEMENTATION METHOD AND SYSTEM OF RISC_V VECTOR INSTRUCTION SET VSETVLI INSTRUCTION
20230068290 · 2023-03-02
Assignee
Inventors
Cpc classification
G06F9/3856
PHYSICS
G06F9/3836
PHYSICS
G06F9/30036
PHYSICS
Y02D10/00
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
International classification
Abstract
The invention relates to the technical field of CPUs, in particular to a method and system for implementing the vsetvli instruction of the risc_v vector instruction set. When the CPU executes out of order, vectag[n:0] information is allocated in the rename module and each instruction is checked to determine whether it is vsetvli. If the instruction is vsetvli, vectag is incremented by 1; if it is a non-vsetvli instruction, vectag remains unchanged. Instructions are then sent toward the execution units: vsetvli instructions are distributed to the csr module, and the other vector instructions are distributed to the vpu module. The non-vsetvl{i} vector instructions of the present invention execute with high efficiency. Data is selected by mask, which reduces power consumption, execution cycles and latency, and the invention has strong market application prospects.
Claims
1-7. (canceled)
8. A method for realizing vsetvli instructions in a risc_v vector instruction set, the method comprising the following steps: step S1: when a CPU is executed out of order, allocating vectag [n:0] information in a rename module to determine whether an instruction is a vsetvli instruction; Step S2: if the instruction is vsetvli instruction, performing vectag+1, if the instruction is not vsetvli instructions, keeping vectag unchanged; Step S3: distributing one or more vsetvli instructions to a csr module, and distributing one or more other vector instructions to a vpu module; Step S4: when the vectag information of one or more instructions is determined to be consistent with a vectag broadcast by an ROB module, transmitting the one or more instructions from a reserve station to an execution unit; and Step S5: completing execution of the one or more instructions, in the ROB module, graduating in order, and updating a register vectag when graduating.
9. The method according to claim 8, wherein each cycle emits 0-5 instructions.
10. The method according to claim 9, wherein, if the vsetvli instructions are accepted, a cycle only transmits the vsetvli instructions, each cycle allocates one vectag, and other instructions are not transmitted until the next cycle.
11. The method according to claim 8, wherein: active element is transmitted to the execution unit, and unactive element is not transmitted to the execution unit.
12. The method according to claim 8, further comprising: comparing the vectag information of an instruction in the reserve station with the register vectag, and only if the vectag information is consistent with the register vectag, transmitting the instruction comprising the vectag information to the execution unit.
13. The method according to claim 8, wherein vectag [n:0] is allocated in the rename module as a condition for other vector instructions to be transmitted to the execution unit, so that a pipeline is not refreshed when the vsetvli instructions are executed.
14. A system for realizing risc_v vector instruction set vsetvli instructions, the system comprising a rename module, a dispatch module, a vpu module and an ROB module; wherein the system is used for implementing risc_v vector instruction set vsetvli instructions by performing methods comprising steps of: step S1: when a CPU is executed out of order, allocating vectag [n:0] information in a rename module to determine whether an instruction is a vsetvli instruction; Step S2: if the instruction is vsetvli instruction, performing vectag+1, if the instruction is not vsetvli instructions, keeping vectag unchanged; Step S3: distributing one or more vsetvli instructions to a csr module, and distributing one or more other vector instructions to a vpu module; Step S4: when the vectag information of one or more instructions is determined to be consistent with a vectag broadcast by an ROB module, transmitting the one or more instructions from a reserve station to an execution unit; and Step S5: completing execution of the one or more instructions, in the ROB module, graduating in order, and updating a register vectag when graduating.
15. The system according to claim 14, wherein each cycle emits 0-5 instructions.
16. The system according to claim 15, wherein, if the vsetvli instructions are accepted, a cycle only transmits the vsetvli instructions, each cycle allocates one vectag, and other instructions are not transmitted until the next cycle.
17. The system according to claim 14, wherein: active element is transmitted to the execution unit, and unactive element is not transmitted to the execution unit.
18. The system according to claim 14, wherein the steps further comprise: comparing the vectag information of an instruction in the reserve station with the register vectag, and only if the vectag information is consistent with the register vectag, transmitting the instruction comprising the vectag information to the execution unit.
19. The system according to claim 14, wherein vectag [n:0] is allocated in the rename module as a condition for other vector instructions to be transmitted to the execution unit, so that a pipeline is not refreshed when the vsetvli instructions are executed.
Description
DESCRIPTION OF DRAWINGS
[0021] In order to illustrate the technical scheme in the embodiments of the invention or in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention, and those of ordinary skill in the art can obtain other drawings from them without creative work.
[0022]
[0023]
[0024]
DETAILED DESCRIPTION
[0025] In order to make the purpose, technical scheme and advantages of the embodiment of the invention more clear, the technical scheme in the embodiment of the invention will be described clearly and completely in combination with the drawings in the embodiment of the invention. Obviously, the described embodiments are some embodiments of the invention, not all embodiments. Based on the embodiments of the invention, all other embodiments obtained by ordinary technicians in the field without creative work fall within the scope of the protection of the invention.
Embodiment 1
[0026] The present embodiment discloses a method for implementing the vsetvli instruction of the risc_v vector instruction set, as shown in
[0027] S1: When the CPU executes out of order, vectag[n:0] information is allocated in the rename module, and it is determined whether the instruction is vsetvli.
[0028] S2: If the instruction is vsetvli, vectag is incremented by 1 (vectag+1); if it is not a vsetvli instruction, vectag remains unchanged.
[0029] S3: Instructions are sent toward the execution units: vsetvli instructions are distributed to the csr module, and other vector instructions are distributed to the vpu module.
[0030] S4: When the vectag of an instruction is determined to be consistent with the vectag broadcast by the ROB, the instruction is transmitted from the reservation station to the execution unit.
[0031] S5: Once execution of the instruction is completed, it graduates in order in the ROB module, and the register vectag is updated at graduation; the execution then ends.
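Steps S1 and S2 can be illustrated with a minimal Python sketch of the tag allocation performed at rename. The tag width TAG_BITS, the wrap-around on overflow, the opcode strings, and the choice that a vsetvli carries the already-incremented tag are assumptions made for illustration; the embodiment does not specify them.

```python
from dataclasses import dataclass

TAG_BITS = 3  # assumed width of vectag[n:0] (here n = 2); not given in the text


@dataclass
class Renamed:
    opcode: str
    vectag: int


def rename(opcodes, start_tag=0):
    """Attach a vectag to every instruction in program order.

    A vsetvli increments the running tag (S2); any other instruction
    inherits the current tag unchanged. Wrap-around is an assumption.
    """
    tag = start_tag
    renamed = []
    for opcode in opcodes:
        if opcode == "vsetvli":
            tag = (tag + 1) % (1 << TAG_BITS)
        renamed.append(Renamed(opcode, tag))
    return renamed


program = ["vadd.vv", "vsetvli", "vmul.vv", "vsetvli", "vle32.v"]
tags = [r.vectag for r in rename(program)]
```

In this sketch, every vector instruction that follows a vsetvli shares that vsetvli's tag, which is what later allows the reservation station to hold it back until the matching tag has been committed.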
[0032] In the present embodiment, each cycle emits 0-5 instructions. If a vsetvli instruction is accepted, that cycle transmits only the vsetvli, so each cycle allocates at most one vectag, and the other instructions are not sent until the next cycle.
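The grouping rule above can be sketched as follows. This is an illustrative model only: the function name group_into_cycles is hypothetical, and the decision to close any partially filled group before a vsetvli is an assumption, since the text only states that the vsetvli is transmitted alone.

```python
MAX_PER_CYCLE = 5  # "each cycle emits 0-5 instructions"


def group_into_cycles(opcodes):
    """Split a program-order instruction stream into per-cycle groups.

    A vsetvli always occupies a cycle by itself, so at most one new
    vectag is allocated per cycle.
    """
    cycles = []
    current = []
    for opcode in opcodes:
        if opcode == "vsetvli":
            if current:               # assumed: flush the pending group first
                cycles.append(current)
                current = []
            cycles.append([opcode])   # vsetvli is sent alone
        else:
            current.append(opcode)
            if len(current) == MAX_PER_CYCLE:
                cycles.append(current)
                current = []
    if current:
        cycles.append(current)
    return cycles


groups = group_into_cycles(["a", "b", "vsetvli", "c", "d", "e", "f", "g", "h"])
```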
[0033] In the present embodiment, if the inactive elements are transmitted to the execution unit, execution completes in 2n cycles; if the inactive elements are not transmitted to the execution unit, execution completes in n cycles.
[0034] In the present embodiment, the vectag of an instruction in the reservation station of the vpu module needs to be compared with the register vectag, and only if the two are consistent can the instruction be transmitted to the execution unit.
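The comparison described above amounts to a filter over the reservation-station entries. A minimal sketch follows; the RsEntry type and the operands_ready flag are hypothetical additions, since the text names the vectag match as a condition for issue but does not enumerate the others.

```python
from dataclasses import dataclass


@dataclass
class RsEntry:
    opcode: str
    vectag: int
    operands_ready: bool  # assumed extra issue condition, not from the text


def select_issuable(entries, register_vectag):
    """Return the entries that may be transmitted to the execution unit.

    An entry is eligible only when its vectag equals the committed
    register vectag; otherwise it keeps waiting in the station.
    """
    return [
        entry
        for entry in entries
        if entry.operands_ready and entry.vectag == register_vectag
    ]


station = [
    RsEntry("vadd.vv", vectag=1, operands_ready=True),
    RsEntry("vmul.vv", vectag=2, operands_ready=True),
    RsEntry("vsub.vv", vectag=1, operands_ready=False),
]
ready_now = [entry.opcode for entry in select_issuable(station, register_vectag=1)]
```

Entries whose tag does not match are neither squashed nor refetched in this model; they stay in place until a later graduation changes the register vectag.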
[0035] In the present embodiment, vectag[n:0] is allocated in the rename module and serves as the condition for a vpu instruction to be transmitted to the execution unit, so the pipeline does not need to be refreshed when the vsetvli instruction is executed.
[0036] The vsetvli instruction of the present embodiment does not need to refresh the pipeline when graduating, and the inactive element part does not need to be transmitted to the execution unit for execution, which reduces power consumption and execution cycles.
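The idea of transmitting only active elements can be sketched in Python as below. The helper names, the use of an element-wise add as the operation, and the policy of leaving inactive and tail elements of the destination unchanged are assumptions for illustration, not details taken from the embodiment.

```python
def active_elements(values, mask, vl):
    """Yield (index, value) only for elements that are active.

    An element is treated as active when its index is below vl and its
    mask bit is set; everything else is never sent for execution.
    """
    for i in range(min(vl, len(values))):
        if mask[i]:
            yield i, values[i]


def masked_add(a, b, mask, vl, dest):
    """Element-wise add that touches only active elements.

    Inactive and tail elements keep their previous destination value
    (an assumed policy for this sketch).
    """
    result = list(dest)
    for i, x in active_elements(a, mask, vl):
        result[i] = x + b[i]
    return result


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
mask = [1, 0, 1, 1]
out = masked_add(a, b, mask, vl=3, dest=[0, 0, 0, 0])
sent = len(list(active_elements(a, mask, vl=3)))
```

Here only two of the four elements reach the adder, which is the source of the cycle and power savings the embodiment claims.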
Embodiment 2
[0037] The present embodiment concerns an out-of-order CPU, whose basic framework is shown in
[0038] The rename module of the present embodiment allocates vectag[n:0] information. If the instruction is vsetvli, vectag is incremented by 1 (vectag+1); for a non-vsetvli instruction, vectag remains unchanged. An instruction executed by the vpu unit can therefore be transmitted to the execution unit only if the vectag of the instruction in the reservation station is consistent with the vectag broadcast by the csr.
[0039] The dispatch module of the present embodiment distributes instructions to different datapaths according to instruction type: vsetvli instructions go to the csr module, and other vector instructions go to the vpu module. Up to five instructions can be sent each cycle. If a vsetvli instruction is encountered, that cycle launches only the vsetvli, and the other instructions wait until the next cycle, so only one vectag needs to be allocated per cycle.
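The routing performed by the dispatch module can be sketched as a simple classification by instruction type. The function names, the "other" datapath for non-vector instructions, and the rule that any remaining mnemonic starting with "v" is a vector instruction are simplifying assumptions for this illustration.

```python
from collections import defaultdict


def datapath_for(opcode):
    """Pick the datapath for one instruction (toy classification)."""
    if opcode == "vsetvli":
        return "csr"
    if opcode.startswith("v"):  # assumed: other 'v*' mnemonics are vector ops
        return "vpu"
    return "other"              # hypothetical bucket for non-vector instructions


def dispatch(opcodes):
    """Distribute instructions to per-datapath queues in program order."""
    queues = defaultdict(list)
    for opcode in opcodes:
        queues[datapath_for(opcode)].append(opcode)
    return dict(queues)


queues = dispatch(["vsetvli", "vadd.vv", "add", "vle32.v"])
```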
[0040] In the vpu module of the present embodiment, which is the vector instruction datapath, an important condition for an instruction to be transmitted from the reservation station to the execution unit is that the vectag of the instruction in the entry must be consistent with the vectag broadcast by the ROB. As shown in
[0041] In the ROB module of the present embodiment, after each instruction is executed, it must graduate in order, and the register vectag is updated at the same time.
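In-order graduation with the accompanying vectag update can be modeled with a queue, as in the sketch below. The class and field names are hypothetical, and writing the graduating instruction's own tag into the register vectag is an assumed interpretation of "update the register vectag".

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class RobEntry:
    opcode: str
    vectag: int
    done: bool = False


class Rob:
    def __init__(self, register_vectag=0):
        self.entries = deque()
        self.register_vectag = register_vectag  # committed (broadcast) tag

    def allocate(self, opcode, vectag):
        entry = RobEntry(opcode, vectag)
        self.entries.append(entry)
        return entry

    def graduate(self):
        """Retire finished instructions strictly from the head.

        Each retiring instruction writes its tag to the register vectag,
        so a graduating vsetvli advances the committed tag.
        """
        retired = []
        while self.entries and self.entries[0].done:
            entry = self.entries.popleft()
            self.register_vectag = entry.vectag
            retired.append(entry.opcode)
        return retired


rob = Rob(register_vectag=0)
older = rob.allocate("vadd.vv", vectag=0)
setvl = rob.allocate("vsetvli", vectag=1)
setvl.done = True                 # finished early, but not at the head
first = rob.graduate()
tag_after_first = rob.register_vectag
older.done = True
second = rob.graduate()
tag_after_second = rob.register_vectag
```

The vsetvli in this example finishes before the older instruction, yet the register vectag only changes once everything ahead of it has graduated.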
[0042] The timeline of vectag allocation, vectag register update, and the conditions under which the vector instructions can be issued is as follows:
TABLE-US-00001
Time | cycle1 | cycle2 | cycle3 | . . . | cycle_m | . . . | cycle_n
Instruction 1 | vsetvli | vec_instr0 | | | | |
Instruction 2 | | vec_instr1 | | | | |
Allocated vectag | n | n + 1 | n + 1 | | | |
Graduating instruction and vectag register update | | | | | vsetvli | | vec_instr0 and vec_instr1
Issue condition after the vectag update | | | | | vec_instr0 and vec_instr1 can be issued | |
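The timeline can be reproduced with a small cycle-by-cycle model. This is a sketch under stated assumptions: the function simulate and its vsetvli_latency parameter are hypothetical, concrete tag values 0 and 1 stand in for the table's n and n + 1, and the waiting instructions are assumed to issue in the same cycle in which the vsetvli graduates.

```python
def simulate(vsetvli_latency=3):
    """Return (cycle, event) pairs for one vsetvli followed by two vector ops.

    The vsetvli enters in cycle 1 and graduates in cycle
    1 + vsetvli_latency. The two vector instructions wait in the
    reservation station until the register vectag matches their tag.
    """
    if vsetvli_latency < 1:
        raise ValueError("vsetvli_latency must be at least 1")
    register_vectag = 0                            # committed tag before the vsetvli
    vsetvli_tag = 1                                # tag allocated at rename
    waiting = {"vec_instr0": 1, "vec_instr1": 1}   # name -> vectag
    graduate_cycle = 1 + vsetvli_latency
    events = []
    cycle = 1
    while waiting:
        cycle += 1
        if cycle == graduate_cycle:
            register_vectag = vsetvli_tag          # update at graduation
            events.append((cycle, "graduate vsetvli"))
        issuable = sorted(
            name for name, tag in waiting.items() if tag == register_vectag
        )
        for name in issuable:
            del waiting[name]
            events.append((cycle, "issue " + name))
    return events


timeline = simulate(vsetvli_latency=3)
```

No instruction is discarded or refetched in this model; the two vector instructions simply remain in the reservation station for the cycles in which their tag does not yet match.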
[0043] In summary, a non-vsetvl{i} vector instruction of the invention only needs to wait, before entering the execution unit, until the youngest of the older vsetvl{i} instructions has been executed, which is much more efficient than the current approach of refreshing the pipeline. Refreshing the pipeline requires restarting from instruction fetch, whereas here the instruction simply waits in the reservation station until the youngest of the older vsetvl{i} instructions has been executed.
[0044] The vector instruction of the invention also executes the inactive element part in the execution unit and finally selects the data by means of a mask, which reduces power consumption, execution cycles and latency.
[0045] The above embodiments are only used to illustrate the technical scheme of the invention, not to limit it. Although the invention is described in detail with reference to the aforementioned embodiments, those of ordinary skill in the art should understand that they can still modify the technical scheme recorded in the above-mentioned embodiments, or make equivalent replacements of some of the technical features. Such modifications or replacements do not cause the essence of the corresponding technical scheme to depart from the spirit and scope of the technical scheme of the embodiments of the present invention.