PROCESSOR EMBEDDED WITH SMALL INSTRUCTION SET
20220326956 · 2022-10-13
Inventors
Cpc classification
G06F9/30032
PHYSICS
International classification
Abstract
Provided is a processor that is used for limited purposes such as preprocessing of raw data and that has a small circuit scale and high program processing efficiency, wherein an instruction block includes a 2-bit opcode. The processor can move to a branch destination or perform an operation by using an immediate bit accompanying the instruction block, by assigning a branch flag or an immediate instruction determination bit corresponding to the opcode.
Claims
1. A processor in which an instruction block includes a 2-bit opcode, the processor being capable of moving to a branch destination or performing an operation by using an immediate bit accompanying the instruction block, by assigning a branch flag or an immediate instruction determination bit corresponding to the opcode.
2. The processor according to claim 1, wherein a subtraction instruction, a logical AND instruction, a left-right shift instruction, and a memory access instruction are assigned to the 2-bit opcode.
3. The processor according to claim 2, wherein a constant is specifiable as an operand in the instruction block of the subtraction instruction and the logical AND instruction.
4. The processor according to claim 2, wherein the immediate bit accompanies the instruction block when the immediate instruction determination bit is a predetermined value in the instruction block of the subtraction instruction and the logical AND instruction.
5. The processor according to claim 4, wherein a branch block that determines a branch condition and the branch destination accompany the instruction block when the branch flag is a predetermined value in the instruction block of the subtraction instruction and the logical AND instruction.
6. The processor according to claim 2, wherein the number of shift amounts to be specified by the shift instruction varies between left shifting and right shifting.
7. The processor according to claim 5, wherein the subtraction instruction, the logical AND instruction, the left-right shift instruction, and the memory access instruction
Description
BRIEF DESCRIPTION OF DRAWINGS
[0020]
[0021]
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
[0032]
DESCRIPTION OF EMBODIMENTS
[0033] A processor (hereinafter, also referred to as “SubRISC+”) of an embodiment is a 32-bit processor that includes 16 registers and that can perform a three-stage pipeline process, and has an instruction set formed of four types of instructions of subtraction (sub, subi), logical AND (and, andi), shift (shr, shl, sht), and memory access (mr, mw). This instruction set formed of instruction blocks with formats shown in
[0034] The processor of the embodiment has an instruction set formed of four types of instructions, far fewer than in a general-purpose processor. Instructions that a general-purpose processor uses for complex arithmetic calculation and the like are omitted, so the instruction set of the embodiment includes only the relatively simple minimum instructions necessary for limited purposes such as preprocessing of data, and it is provided with functions for improving the processing efficiency of a program.
[0035] Two bits of the fourteenth and fifteenth bits of a main block in each of the instructions shown in
<Subtraction and Logical AND>
[0036]
[0037] The two bits of the fourteenth and fifteenth bits of the main block are an opcode indicating subtraction (sub) or logical AND (and). When the opcode is “00”, the opcode indicates the operation instruction of subtraction and, when the opcode is “01”, the opcode indicates the operation instruction of logical AND.
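The opcode extraction described above can be sketched as follows. This is a minimal illustration, not the circuit of the embodiment: it assumes the main block is a 16-bit word whose zeroth bit is the least significant bit, so that the fifteenth bit is the left digit of the opcode string. The names `opcode_bits`, `KNOWN_OPCODES`, and `describe` are hypothetical, and only the two encodings stated in this paragraph are mapped.

```python
def opcode_bits(main_block: int) -> int:
    """Return the 2-bit opcode held in bits 15 and 14 of a 16-bit main block.

    Assumption: bit 0 is the least significant bit of the main block.
    """
    if not 0 <= main_block <= 0xFFFF:
        raise ValueError("main block must fit in 16 bits")
    return (main_block >> 14) & 0b11


# Only the encodings stated in the text are listed; others are left open.
KNOWN_OPCODES = {0b00: "sub", 0b01: "and"}


def describe(main_block: int) -> str:
    """Name the operation for the opcodes given in this paragraph."""
    return KNOWN_OPCODES.get(opcode_bits(main_block), "not specified in this passage")
```

For example, under these assumptions `describe(0x4123)` yields "and" because the top two bits of 0x4123 are "01".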
[0038] “Register number of operand A” is a 4-bit code, as shown in Table 1. It indicates either a code corresponding to a constant 0, 1, or −1 (value expressed in 32 bits) to be set as the operand A (hereinafter, also referred to as “A”) or the number of the register in which the operand A, a 32-bit value, is stored. Any of the 12 register numbers from “0100” to “1111” can be specified as the number of the register. When the “register number of operand A” is “0011”, the operand A is an immediate; this is the case where the operation of “subtraction or logical AND handling an immediate” to be described later is performed. In an instruction that performs an operation handling only constants and values stored in registers, the “register number of operand A” is never “0011”.
TABLE 1
“Register number of operand A” | Operand A
0000 | 0
0001 | 1
0010 | −1
0011 | Immediate
0100 to 1111 | Value stored in the register with that register number
[0039] “Register number of operand B” is a 5-bit code, as shown in Table 2. It indicates either the number of the register in which the operand B (hereinafter, also referred to as “B”), a 32-bit value, is stored or a constant 0, 1, or −1 (value expressed in 32 bits) corresponding to the operand B. Any of the 16 numbers from “00000” to “01111” can be specified as the number of the register. When the “register number of operand B” is “10000” to “10010”, the operand B is a constant. The operand B can also be an immediate; this is the case where the operation of “subtraction or logical AND handling an immediate” to be described later is performed, and the “register number of operand B” is then “10100” or “11000”. In an instruction that performs an operation handling only constants and values stored in registers, the “register number of operand B” is never “10100” or “11000”.
TABLE 2
“Register number of operand B” | Operand B
00000 to 01111 | Value stored in the register with that register number
10000 | 0
10001 | 1
10010 | −1
10100 | Immediate subjected to zero extension
11000 | Immediate subjected to sign extension
[0040] It is possible to specify 0, 1, and −1 that are constants with relatively high usage frequency as the operand A and the operand B. The processor of the embodiment can thereby achieve a shorter program and higher processing speed.
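How the operand codes of Tables 1 and 2 select a constant or a register can be sketched as below. This is an illustrative model only. It assumes that the code value itself is the register number (for example “0100” selects register 4) and that −1 is held as the 32-bit pattern 0xFFFFFFFF. The function names and the use of `ValueError` for the immediate codes are hypothetical choices for the sketch.

```python
from typing import List

MASK32 = 0xFFFFFFFF  # operands are 32-bit values


def operand_a(code: int, regs: List[int]) -> int:
    """Resolve the 4-bit operand A code of Table 1 (immediate case excluded)."""
    if not 0 <= code <= 0b1111:
        raise ValueError("operand A code must be 4 bits")
    constants = {0b0000: 0, 0b0001: 1, 0b0010: -1}
    if code in constants:
        return constants[code] & MASK32
    if code == 0b0011:
        raise ValueError("0011 marks an immediate, handled by the immediate format")
    return regs[code] & MASK32  # 0100 to 1111: assumed to be the register number


def operand_b(code: int, regs: List[int]) -> int:
    """Resolve the 5-bit operand B code of Table 2 (immediate cases excluded)."""
    if 0 <= code <= 0b01111:
        return regs[code] & MASK32
    constants = {0b10000: 0, 0b10001: 1, 0b10010: -1}
    if code in constants:
        return constants[code] & MASK32
    if code in (0b10100, 0b11000):
        raise ValueError("immediate codes are handled by the immediate format")
    raise ValueError("code not listed in Table 2")
```

Because the constants are encoded in the operand fields, an instruction such as a subtraction of 1 needs neither an extra register nor an immediate block in this model.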
[0041] “Register number of operand D” indicates the number of a register in which an operand D (hereinafter, also referred to as “D”) being a 32-bit value is stored. A value obtained by an operation or the like is stored in this register.
[0042] When subtraction (sub) by the instruction with the format shown in
[0043]
[0044] “Relative branch destination” formed of thirteen bits from the third bit to the fifteenth bit in the branch instruction block in
[0045] When the main block is subtraction (sub), the branching is performed in the case of B−A<0 or |B|−|A|≤0. When the main block is logical AND (and), the branching is performed in the case where the least significant bit of a logical AND result value is “0”.
<Shift>
[0046]
[0047]
[0048]
[0049] The shift amount is a number of bits expressed by (shift amount)=8b+n (b and n are integers, 0≤b≤3, 0≤n≤3). In this case, b=arg[3:2] (sixth and seventh bits in the main block) and n=arg[1:0] (fourth and fifth bits in the main block).
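The formula above can be checked with a short sketch. It assumes the main block is a 16-bit word with the zeroth bit as the least significant bit, so that arg occupies the fourth to seventh bits. The function name `fixed_shift_amount` is hypothetical.

```python
def fixed_shift_amount(main_block: int) -> int:
    """Compute (shift amount) = 8b + n from the arg field of a main block.

    Assumption: arg is bits 7..4 of the main block, with bit 0 as the LSB.
    """
    arg = (main_block >> 4) & 0xF
    b = (arg >> 2) & 0b11  # arg[3:2]: sixth and seventh bits of the main block
    n = arg & 0b11         # arg[1:0]: fourth and fifth bits of the main block
    return 8 * b + n


# All amounts reachable with a 4-bit arg field.
ALL_AMOUNTS = sorted({fixed_shift_amount(arg << 4) for arg in range(16)})
```

With b and n each limited to 0 through 3, the reachable amounts are 0 to 3, 8 to 11, 16 to 19, and 24 to 27 bits, so a 4-bit field covers shifts in byte-sized steps plus a small remainder.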
[0050] In the case of the right shift instruction (shr) (
[0051]
[0052] When value[4] is “0”, right shifting is performed and, when value[4] is “1”, left shifting is performed. The shift amount is determined by value[3:0].
[0053] As in the fixed amount shifting, the shift amount is the number of bits expressed by (shift amount)=8b+n (b and n are integers, 0≤b≤3, 0≤n≤3). In this case, b=value[3:2] and n=value[1:0].
[0054] In the case of the right shift instruction, there is no further limitation for b and n. Meanwhile, in the case of the left shift instruction, limitations of 1≤b and n=0 (“00”) are added and the number of available shift amounts is smaller.
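The direction bit and the left-shift limitation can be sketched as follows. The sketch treats `value` as an integer whose low five bits are used. Raising `ValueError` for a left shift outside the stated limits is an assumption made for illustration; the text does not say how the hardware treats such an encoding. The function names are hypothetical.

```python
from typing import List, Tuple


def decode_variable_shift(value: int) -> Tuple[str, int]:
    """Decode direction and amount from value[4:0]."""
    direction = "left" if (value >> 4) & 1 else "right"
    b = (value >> 2) & 0b11  # value[3:2]
    n = value & 0b11         # value[1:0]
    if direction == "left" and (b < 1 or n != 0):
        # Assumption: encodings outside 1 <= b, n == 0 are rejected here.
        raise ValueError("left shift requires 1 <= b and n == 0")
    return direction, 8 * b + n


def available_amounts(direction: str) -> List[int]:
    """List the shift amounts permitted for one direction."""
    amounts = []
    for low4 in range(16):
        b, n = low4 >> 2, low4 & 0b11
        if direction == "left" and (b < 1 or n != 0):
            continue
        amounts.append(8 * b + n)
    return amounts
```

Under these rules right shifting offers 16 amounts while left shifting offers only 8, 16, and 24 bits, which is the asymmetry the paragraph describes.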
[0055] The shift instruction in the instruction set of the processor in the invention of the present application uses shifting by a fixed amount and a set of shift amounts that is asymmetric between the left and right directions, with the left shift amounts limited, to achieve high speed and a reduced circuit scale.
<Memory Access>
[0056]
[0057] When the memory read (mr) is executed, the reference address is taken from the register with the “register number of reference address (five bits)”. The value stored at the memory address that is offset from this reference address by the “address offset (four bits)” is stored as the operand D in the register with the “register number of operand D” (zeroth to third bits).
[0058] When the memory write (mw) is executed, the operand A (32 bits) stored in the register whose number is given by the zeroth to third bits is written to the memory address that is offset from the reference address of the memory by the “address offset (four bits)”.
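The address calculation for mr and mw can be sketched with a small software model. Several points are assumptions for illustration and are not stated in the text: the offset is treated as an unsigned 4-bit value added directly to the reference address, memory is modelled as a dictionary from address to 32-bit value, unwritten addresses read as 0, and the reference register is one of the 16 registers. The function names are hypothetical.

```python
from typing import Dict, List

MASK32 = 0xFFFFFFFF


def effective_address(regs: List[int], base: int, offset: int) -> int:
    """Reference address from register `base` plus a 4-bit offset (assumed unsigned)."""
    if not 0 <= offset <= 0xF:
        raise ValueError("address offset must fit in four bits")
    return (regs[base] + offset) & MASK32


def memory_read(regs: List[int], memory: Dict[int, int], d: int, base: int, offset: int) -> None:
    """mr: load the addressed value into register d as operand D."""
    regs[d] = memory.get(effective_address(regs, base, offset), 0) & MASK32


def memory_write(regs: List[int], memory: Dict[int, int], a: int, base: int, offset: int) -> None:
    """mw: store operand A, held in register a, at the addressed location."""
    memory[effective_address(regs, base, offset)] = regs[a] & MASK32
```

A write followed by a read with the same reference register and offset returns the stored value in this model, which is the behaviour the two paragraphs describe.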
<Subtraction and Logical AND Handling Immediate>
[0059]
[0060] In the operations of these instruction formats, an operation on the operand A and the operand B is performed and the operand D obtained as a result is stored in the register with the “register number of the operand D” as in the instruction format of
[0061] In the operation instruction that is shown in
[0062] In this case, the operand A is a 32-bit value that is a combination of 16 bits (zeroth bit to fifteenth bit) expressed by the immediate block and 16 bits (sixteenth bit to thirty-first bit) obtained by repeating 16 times the bit value of the “seventeenth bit of the immediate” held in the thirteenth bit of the main block. Specifically, when the “seventeenth bit of the immediate” in the thirteenth bit of the main block is “0”, the 16 bits from the sixteenth bit to the thirty-first bit are all set to “0” and, when the “seventeenth bit of the immediate” is “1”, the 16 bits from the sixteenth bit to the thirty-first bit are all set to “1”.
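The construction of the operand A immediate can be sketched as follows. The sketch assumes both blocks are 16-bit words with the zeroth bit as the least significant bit, and it reads the “seventeenth bit of the immediate” from the thirteenth bit of the main block. The function name `operand_a_immediate` is hypothetical.

```python
def operand_a_immediate(main_block: int, immediate_block: int) -> int:
    """Build the 32-bit operand A from the immediate block and main block bit 13."""
    low16 = immediate_block & 0xFFFF          # zeroth to fifteenth bits
    bit17 = (main_block >> 13) & 1            # "seventeenth bit of the immediate"
    high16 = 0xFFFF if bit17 else 0x0000      # that bit repeated 16 times
    return (high16 << 16) | low16
```

In effect the immediate behaves as a 17-bit value whose top bit is replicated into the upper half, so `operand_a_immediate(1 << 13, 0x1234)` gives 0xFFFF1234 in this model.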
[0063] In the operation instruction that is shown in
[0064] When the five bits from the fourth bit to the eighth bit in the main block are “10100”, the operand B is set to a 32-bit value obtained by zero-extending the 16-bit immediate in the immediate block. In this case, the 16 bits from the sixteenth bit to the thirty-first bit of the operand B are all “0”.
[0065] When the five bits from the fourth bit to the eighth bit in the main block are “11000”, the operand B is set to a 32-bit value obtained by sign-extending the 16-bit value in the immediate block. In this case, the 16 bits from the sixteenth bit to the thirty-first bit of the operand B all take the value of the fifteenth bit (sign bit) of the immediate, so they are all “1” when the immediate is negative and all “0” otherwise.
[0066] Which one of the extension processes of the zero extension and the sign extension is to be performed on the operand B is selected for each program.
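The two extension processes for the operand B can be compared in a short sketch, using the codes listed in Table 2 (“10100” for zero extension and “11000” for sign extension). The constant and function names are hypothetical, and the immediate block is assumed to be a 16-bit word.

```python
ZERO_EXTEND = 0b10100  # Table 2: immediate subjected to zero extension
SIGN_EXTEND = 0b11000  # Table 2: immediate subjected to sign extension


def operand_b_immediate(code: int, immediate_block: int) -> int:
    """Extend the 16-bit immediate to the 32-bit operand B."""
    imm16 = immediate_block & 0xFFFF
    if code == ZERO_EXTEND:
        return imm16                      # upper 16 bits are all 0
    if code == SIGN_EXTEND:
        sign = (imm16 >> 15) & 1          # fifteenth bit of the immediate
        return (0xFFFF0000 | imm16) if sign else imm16
    raise ValueError("code does not select an immediate for operand B")
```

The same 16-bit pattern 0x8001 therefore becomes 0x00008001 under zero extension and 0xFFFF8001 under sign extension, which is why the choice matters for masks versus signed offsets.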
[0067] Unlike the SubRISC of the publicly known technique, the processor of the embodiment can perform an operation handling an immediate. This can make a program to be executed shorter and improve the processing speed.
[0068] Effects of the processor of the embodiment are described below.
[0069] A performance of a prototype processor SubRISC+ of the embodiment is described.
[0070] First, the circuit scale of the prototype processor is described. A comparison of circuit scale (area in μm² and the number of gates) between the SubRISC+ and processors of conventional techniques is shown in Table 3. The circuit area (μm²) is the result of designing each processor assuming a power supply voltage of 0.75 V and a frequency of 50 MHz in Renesas SOTB 45 nm technology, and the number of gates is the value obtained by dividing the total area of the processor core by the area of a 2-input NAND gate. The design tool used is Synopsys Design Compiler-F2011.09-SP2. The circuit scale correlates with the number of types of processable instructions. Accordingly, simplifying the instruction set and reducing the number of processable instructions can reduce the circuit area.
[0071] As can be seen from Table 3, the SubRISC of the publicly known technique and the processor SubRISC+ of the embodiment can have smaller circuit scales than the conventional general-purpose processors as a result of reducing the number of instructions and reducing the number of gates.
TABLE 3
Processor | Number of instructions | Length of instructions | Register | Pipeline | Circuit area (μm²) | Number of gates
CORTEX-M0 (Non-patent Literature 1) | 60 | 16/32 | 32 entries | 3 | 619.9k | 17.6k
MICRO-RIPCY (Non-patent Literature 2) | 45 | 16 | 16 entries | 2 | 553.0k | 15.7k
SubRISC | 4 | 16 | 16 entries | 2 | 275.5k | 7.8k
SubRISC+ | 4 | 16/32 | 16 entries | 3 | 311.0k | 8.9k
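The comparison in Table 3 can be expressed as ratios with a short calculation. The figures are copied from the table in the units printed there (the “k” suffix is kept as is and cancels in the ratios). The dictionary and function names are hypothetical.

```python
from typing import Tuple

# (circuit area, number of gates), both in the "k" units printed in Table 3.
TABLE3 = {
    "CORTEX-M0": (619.9, 17.6),
    "MICRO-RIPCY": (553.0, 15.7),
    "SubRISC": (275.5, 7.8),
    "SubRISC+": (311.0, 8.9),
}


def relative_to(name: str, baseline: str = "CORTEX-M0") -> Tuple[float, float]:
    """Return (area ratio, gate ratio) of `name` against `baseline`."""
    area, gates = TABLE3[name]
    base_area, base_gates = TABLE3[baseline]
    return area / base_area, gates / base_gates


if __name__ == "__main__":
    for processor in TABLE3:
        area_ratio, gate_ratio = relative_to(processor)
        print(f"{processor}: area x{area_ratio:.2f}, gates x{gate_ratio:.2f}")
```

By this calculation the SubRISC+ occupies roughly half the area and gate count of the CORTEX-M0 entry, while being somewhat larger than the SubRISC.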
[0072] Next, processing performance is described. Each of the SubRISC+ and the processors of the conventional techniques is made to perform the following five types of processes A to E, and the processing time of each process is measured.
A. A process of arranging 5000 integer values in order with a quick sort algorithm.
B. A process of detecting 8×8 blocks that do not match from two 128×128 gray scale images.
C. A process of applying two-dimensional DCT conversion to a 48×48 gray scale image.
D. A process of creating a histogram of brightness values of pixels from a 64×64 gray scale image.
E. A process of applying a Laplacian contour detection filter to a 64×64 gray scale image.
[0073] The results are shown in Table 4. In all five processes, the processor SubRISC+ of the embodiment has a shorter processing time than the CORTEX-M0 used for general purpose, and in processes A and B it is also faster than the SubRISC of the publicly known technique, for which no values are given for processes C to E. This effect is due to higher program processing efficiency of the instruction set in the processor of the embodiment.
TABLE 4
Processor | A | B | C | D | E
CORTEX-M0 (Non-patent Literature 1) | 1.9 | 0.19 | 0.11 | 0.12 | 0.36
SubRISC (Non-patent Literature 5) | 1.5 | 0.17 | N/A | N/A | N/A
SubRISC+ | 1.2 | 0.14 | 0.09 | 0.06 | 0.34
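The processing times in Table 4 can be turned into ratios against the SubRISC+ with a short calculation. The values are copied from the table, with N/A entries stored as `None`; the table does not state a unit, and none is assumed here because it cancels in the ratio. The dictionary and function names are hypothetical.

```python
from typing import Dict, Optional

TABLE4: Dict[str, Dict[str, Optional[float]]] = {
    "CORTEX-M0": {"A": 1.9, "B": 0.19, "C": 0.11, "D": 0.12, "E": 0.36},
    "SubRISC": {"A": 1.5, "B": 0.17, "C": None, "D": None, "E": None},
    "SubRISC+": {"A": 1.2, "B": 0.14, "C": 0.09, "D": 0.06, "E": 0.34},
}


def time_ratio(baseline: str, process: str, target: str = "SubRISC+") -> Optional[float]:
    """Return baseline time divided by target time, or None when a value is N/A."""
    base = TABLE4[baseline][process]
    ours = TABLE4[target][process]
    if base is None or ours is None:
        return None
    return base / ours


if __name__ == "__main__":
    for process in "ABCDE":
        print(process, time_ratio("CORTEX-M0", process), time_ratio("SubRISC", process))
```

A ratio above 1 means the SubRISC+ finished sooner. Against the CORTEX-M0 the ratio ranges from about 1.06 for process E to 2.0 for process D.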
[0074] The embodiment and the conditional expressions described in the present description are all given for the purpose of teaching the disclosed contents of the present description and the concepts of the invention by which the inventors of the present application have advanced the conventional technique, in such a manner that a reader can easily understand these contents and concepts. The invention of the present application should not be interpreted as being limited to this embodiment and these conditions. Although the embodiment of the present description is described in detail, various changes, alternatives, and modifications can be made to the embodiment without departing from the technical scope of the invention of the present application.