Transfer triggered microcontroller with orthogonal instruction set
09582283 · 2017-02-28
Assignee
Inventors
- Jeffrey D. Owens (Frisco, TX, US)
- Edward Tangkwai Ma (Plano, TX)
- Donald W. Loomis, III (Coppel, TX, US)
- Thomas Augustus Chenot (Parker, TX, US)
Cpc classification
G06F9/30185
PHYSICS
G06F9/30145
PHYSICS
G06F9/30032
PHYSICS
International classification
G06F15/00
PHYSICS
G06F9/30
PHYSICS
G06F9/38
PHYSICS
Abstract
A microcontroller includes a program memory, data memory, central processing unit, at least one register module, a memory management unit, and a transport network. Instructions are executed in one clock cycle via an instruction word. The instruction word indicates the source module from which data is to be retrieved and the destination module to which data is to be stored. The address/data capability of an instruction word may be extended via a prefix module. If an operation is performed on the data, the source module or the destination module may perform the operation during the same clock cycle in which the data is transferred.
Claims
1. A microprocessor core comprising: a plurality of registers in which commands are received and processed, the plurality of registers being scalable and configurable by a user; a point-to-point transport network coupled to the plurality of registers, the point-to-point transport network interfacing directly with each register within the plurality of registers and providing point-to-point communication between a routine source register and a routine destination register; a decoder, coupled to the point-to-point transport network and program memory, the decoder configured to decode data fetched from the program memory and transmit this decoded data to at least one register within the plurality of registers; and a memory management unit, coupled to the program memory, data memory and at least one register within the plurality of registers, the memory management unit creating a virtual memory map by combining at least a portion of the program memory and at least a portion of the data memory.
2. The microprocessor core of claim 1 wherein the plurality of registers comprises a set of configurable special purpose registers that support processing and system control functions within the microprocessor core.
3. The microprocessor core of claim 2 wherein the set of configurable special purpose registers includes at least one module selected from a group consisting of accumulators, instruction pointers, stack pointers, data pointers, loop counters and status registers.
4. The microprocessor core of claim 1 wherein the plurality of register modules comprises a set of configurable special function registers that support peripheral or user functions external to the microprocessor.
5. The microprocessor core of claim 1 wherein the plurality of registers comprises a configurable accumulator module that provides a plurality of accumulators of which one of the plurality of accumulators may be designated as active.
6. The microprocessor core of claim 5 wherein the configurable accumulator module is scalable and configurable.
7. The microprocessor core of claim 5 wherein the configurable accumulator module is directly coupled to the point-to-point transport network.
8. The microprocessor core of claim 1 wherein the point-to-point transport network supports at least one external interface through an input/output register and a memory register.
9. The microprocessor core of claim 1 wherein the memory management unit maps a plurality of memory segments within the data memory to a plurality of memory segments within the program memory, at least a portion of the plurality of memory segments within the program memory being inaccessible until specifically activated.
10. The microprocessor core of claim 9 wherein the at least a portion of the plurality of memory segments within the program memory are activated using a single register bit.
11. The microprocessor core of claim 9 wherein direct data transfer is supported between the program memory and the data memory.
12. The microprocessor core of claim 1 wherein the plurality of registers operate pursuant to a register map that defines both single and multi-cycle clock executions.
13. The microprocessor core of claim 12 wherein the register map supports a single word instruction having a first portion that defines a source register and a second portion that defines a destination register.
14. The microprocessor core of claim 13 wherein the single word instruction further comprises a format bit having a value indicating whether an instruction is an immediate source instruction or a register source instruction.
15. The microprocessor core of claim 13 wherein the register map supports extension addressing capability such that a first subset of registers is accessible within a single clock window and a second subset of registers is accessible within a time window longer than the single clock window.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) A more complete understanding of principles of the present invention may be obtained by reference to the following Detailed Description when taken in conjunction with the accompanying Drawings wherein:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
(10) In accordance with an embodiment of the present invention, a microcontroller utilizes single-cycle instruction execution. Single-cycle instruction execution permits higher performance and/or reduced power consumption. Although the microcontroller in this embodiment is illustrated as performing most operations in a single clock cycle, it will be understood by those skilled in the art that some instructions, such as long jump/long call and/or various extended register accesses, may be executed in more than one clock cycle.
(11) Referring now to
(12) The size of the on-chip data memory 104 available for the user application is dependent on the actual chip implementation. The data memory 104 may be accessed via indirect register addressing through a Data Pointer (@DP) or Frame Pointer (@BP[Offs]). The Data Pointer is used as one of the operands in a move instruction. If the Data Pointer is used as a source, the microcontroller 100 performs a Load operation which reads data from the data memory location addressed by the Data Pointer. If the Data Pointer is used as a destination, the microcontroller 100 executes a Store operation that writes data to the data memory location addressed by the Data Pointer. If two data pointers are used, one as a source and another as the destination, a direct memory-to-memory transfer occurs. In addition, the Data Pointer may be used as a pre-increment/decrement pointer by a move instruction for a memory write or post-increment/decrement pointers by a move instruction for a memory read.
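The load, store, and pre/post-modify behavior of the Data Pointer described above can be illustrated with a small Python model. This is only a sketch under stated assumptions: the class name `DataPointerModel`, the 16-word memory size, the wrap-around of the pointer at the end of memory, and the method names are all hypothetical and are not taken from the described microcontroller.

```python
class DataPointerModel:
    """Toy model of indirect data-memory access through a Data Pointer.

    Hypothetical illustration: the names, the memory size, and the
    pointer wrap-around are assumptions made for this sketch.
    """

    def __init__(self, size=16):
        self.mem = [0] * size   # stand-in for the data memory
        self.dp = 0             # Data Pointer register

    def load(self, post=0):
        """@DP used as a source: read, then optionally post-inc/dec DP."""
        value = self.mem[self.dp]
        self.dp = (self.dp + post) % len(self.mem)
        return value

    def store(self, value, pre=0):
        """@DP used as a destination: optionally pre-inc/dec DP, then write."""
        self.dp = (self.dp + pre) % len(self.mem)
        self.mem[self.dp] = value


m = DataPointerModel()
m.dp = 3
m.store(0x55)              # pointer as destination: Store to address 3
m.store(0x66, pre=+1)      # pre-increment write lands at address 4
m.dp = 3
first = m.load(post=+1)    # pointer as source: Load from 3, then DP becomes 4
second = m.load()          # Load from address 4
```

In this model a memory-to-memory transfer would simply be a `load` on one pointer feeding a `store` on another.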
(13) The microcontroller 100 ideally also includes at least one register module 106. The use of register modules 106 lends reconfigurability to the microcontroller 100. The register modules 106 (e.g., serial ports, A/D converters, or any I/O or processing device) may be plugged into or unplugged from the microcontroller 100 as a user deems desirable. Because the register modules 106 are reconfigurable, the microcontroller 100 is flexible and may be tailored to fit a number of different applications. The register modules 106 also enable register-to-register communication/data transfer, allowing an instruction to perform meaningful work directly. The register modules 106 may be accessible by the user program, so registers need not be wasted and intermediate transactions may be unnecessary.
(14) A register module 106 may be identified by a 4-bit specifier (shown in
(15) The CPU 108 controls the operation of the microcontroller 100 through the execution of user code stored in the program memory 102. The CPU 108 controls the program memory address and data buses, the data memory address and data buses, and stack operation. An instruction is fetched from the program memory 102 and sent to the instruction register of a decoder 110. The CPU 108 decodes, via the decoder 110, the instruction and performs necessary operations as defined by the instruction. The decoder 110 determines the destination and source for an instruction. Detailed operational decoding is closely associated with destination and source modules. This approach limits switching activities to the necessary data path and minimizes on-chip power dissipation.
(16) Many of the instructions require operations to be performed on data. The main execution unit for the CPU 108 is the Arithmetic Logic Unit (ALU) 112. The ALU 112, for example, performs addition, subtraction, comparison, shift and logical operations. Instruction decoding prepares the ALU 112 and provides the appropriate data. The ALU 112 primarily uses an accumulator module and any of the on-chip registers/memory or an immediate value embedded in the instruction as the source for operations. The accumulator module is ideally incorporated in a modulo fashion with specific hardware support. Each of the registers in the accumulator module may be accessed explicitly by an instruction. Instructions related to arithmetic and logical operations are associated with the active accumulator (acc). The active accumulator may be selected by the user program via an accumulator point (AP) register. The AP register is used to select one of the available registers in the module as the active accumulator. Through an Accumulator Point Control (APC) register, the AP register may be programmed to automatically increment or decrement the selection of the active accumulator in a modulo fashion after the execution of an ALU 112 operation. The APC register provides a user option to enable the AP's post-increment/decrement function and to select the modulus for the modulo operation.
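The modulo stepping of the active accumulator can be sketched as follows. This is a minimal illustration, not the described hardware: the class name `AccumulatorBank`, the 16-register bank, the 16-bit width, and the `step`/`modulus` attributes (which stand in for whatever the APC register actually encodes) are assumptions made for the example.

```python
class AccumulatorBank:
    """Toy accumulator module with an auto-stepping active accumulator.

    Hypothetical sketch: `step` and `modulus` stand in for the APC
    register's settings; the real encoding is not modelled here.
    """

    def __init__(self, count=16, width=16):
        self.regs = [0] * count
        self.mask = (1 << width) - 1
        self.ap = 0            # AP register: index of the active accumulator
        self.step = 0          # +1, -1, or 0 (auto-stepping disabled)
        self.modulus = count   # AP wraps within 0..modulus-1

    def add(self, src):
        """ALU operation on the active accumulator, then step AP."""
        self.regs[self.ap] = (self.regs[self.ap] + src) & self.mask
        if self.step:
            self.ap = (self.ap + self.step) % self.modulus


bank = AccumulatorBank()
bank.step = +1      # enable post-increment of AP
bank.modulus = 4    # cycle through accumulators 0..3
for value in (1, 2, 3, 4, 5):
    bank.add(value)
```

After the five additions in this usage example, the fifth value has wrapped around onto accumulator 0, which is the kind of round-robin accumulation the modulo selection makes possible without extra pointer-management instructions.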
(17) The data path of the microcontroller 100 is ideally implemented as a point-to-point transport network 114. By utilizing a transport network 114, there is no internal system bus. The transport network 114 allows a fast, point-to-point interconnection among the microcontroller 100, register modules 106, and memories 102, 104. The transport network 114 also allows power dissipation to be localized in only the active functional units, and switching activity may be limited to only that circuitry. By reducing switching activity, noise may be reduced and efficiency may be increased. The transport network 114 may be implemented as multiplexers, switches, routers, etc., depending on the required system throughput.
(18) The microcontroller 100 may also include a memory management unit (MMU) 116. The MMU 116 may be capable of supporting two of the memory architectures for microprocessors in one design. The microcontroller 100 provides a programmable method to merge different physical memories in different memory spaces (program and data) into one linear memory space on-demand and on-the-fly. With the MMU 116, the microcontroller 100 is capable of supporting in-application programming and in-system programming directly. A memory can be used as program memory, a data memory, or both data and program memories. The MMU 116 creates a large virtual memory map for both program and data space. In addition, data transfers between different physical memories may be handled by a simple MOVE instruction.
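One way to picture the merged linear memory space is a table of segments that resolves a linear address to a physical memory and an offset. The Python sketch below is illustrative only: the class `LinearMap`, its `attach`/`read`/`write`/`move` methods, and the back-to-back placement of segments are assumptions for this example and do not describe how the MMU 116 is actually built.

```python
class LinearMap:
    """Toy linear address space built from separate physical memories.

    Hypothetical sketch: segments are placed back to back in the order
    they are attached.
    """

    def __init__(self):
        self._segments = []   # list of (base address, backing list)
        self._size = 0

    def attach(self, memory):
        """Map a physical memory into the linear space; return its base."""
        base = self._size
        self._segments.append((base, memory))
        self._size += len(memory)
        return base

    def _locate(self, addr):
        for base, memory in self._segments:
            if base <= addr < base + len(memory):
                return memory, addr - base
        raise IndexError(f"address {addr:#x} is not mapped")

    def read(self, addr):
        memory, offset = self._locate(addr)
        return memory[offset]

    def write(self, addr, value):
        memory, offset = self._locate(addr)
        memory[offset] = value

    def move(self, dst, src):
        """A single move between any two mapped locations."""
        self.write(dst, self.read(src))


program = [0x10, 0x11, 0x12, 0x13]   # stand-in for program memory
data = [0, 0, 0, 0]                  # stand-in for data memory
vm = LinearMap()
p_base = vm.attach(program)
d_base = vm.attach(data)
vm.move(d_base + 1, p_base + 2)      # program word copied into data memory
```

The point of the sketch is that once both memories are visible in one address space, a transfer between them needs no special instruction, only a move with two addresses.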
(19) As shown in
(20) Another register module 106 of the microcontroller 100 of
(21) A prefix function is activated by a move instruction that specifies the prefix module as its destination register. The prefix module may be realized by a 20-bit register with synchronous clear as illustrated by
(22) To access multi-cycle registers, the prefix register is used to activate the targeted index bits of the source and/or destination of the next instruction for one cycle by supplying the prefix index N (Destination Index [2:0]) in the form of dds, where s is the extended index bit 4 for the source of the succeeding instruction and dd is the extended index bits 4 and 3 for the destination of the succeeding instruction. These bits together form a control prefix field, which is separate from its 16-bit data field.
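The way the dds field widens the index fields of the following instruction can be sketched in a few lines of Python. This is an illustration under assumptions: the function name `extend_indices` is hypothetical, and the packing of dds with dd in the two upper bits and s in the lowest bit is assumed from the order in which the letters are written.

```python
def extend_indices(dds, src_index, dst_index):
    """Combine a 3-bit dds prefix field with the next instruction's indices.

    Assumed layout: dds = d d s, with s in bit 0. The source index is
    4 bits wide (bit 4 comes from s); the destination index is 3 bits
    wide (bits 4 and 3 come from dd).
    """
    s = dds & 0b1
    dd = (dds >> 1) & 0b11
    ext_src = (s << 4) | (src_index & 0xF)    # 5-bit source index
    ext_dst = (dd << 3) | (dst_index & 0x7)   # 5-bit destination index
    return ext_src, ext_dst


ext_src, ext_dst = extend_indices(0b101, src_index=0b0011, dst_index=0b010)
```

With a dds value of zero the extended indices equal the original ones, which matches the idea that registers reachable without a prefix are the single-cycle subset.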
(23) To implement single clock cycle execution, the instruction set designates a source register module and a destination register module without specifying an operation. Access to register modules may be explicit or implicit as part of the execution of an instruction. Some register modules may be accessed implicitly or explicitly. In accordance with an embodiment of the present invention, a source module may execute the requested operation as the data is leaving the source module, or the destination module may execute the requested operation as the data is received. In this manner, a single clock cycle is utilized to move the data and perform the requested operation.
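The idea that the transfer itself triggers the operation can be modelled with a destination module whose behavior depends on which of its ports is written. The following Python sketch is hypothetical throughout: the class `TriggeredAccumulator`, the port names, and the 16-bit width are assumptions chosen for illustration, not the encoding used by the described microcontroller.

```python
class TriggeredAccumulator:
    """Hypothetical destination module: the port written selects the operation."""

    def __init__(self):
        self.acc = 0

    def receive(self, port, data):
        # The operation is performed as the data arrives, so the transfer
        # and the computation share one modelled "cycle".
        if port == "ACC":
            self.acc = data & 0xFFFF
        elif port == "ADD":
            self.acc = (self.acc + data) & 0xFFFF
        elif port == "SUB":
            self.acc = (self.acc - data) & 0xFFFF
        else:
            raise ValueError(f"unknown port {port!r}")


def move(dst_module, dst_port, src_value):
    """One transfer: the instruction names only a source and a destination."""
    dst_module.receive(dst_port, src_value)


alu = TriggeredAccumulator()
move(alu, "ACC", 10)   # plain transfer
move(alu, "ADD", 5)    # transfer that triggers an addition
move(alu, "SUB", 3)    # transfer that triggers a subtraction
result = alu.acc
```

Note that `move` contains no opcode dispatch of its own; all of the behavior lives in the module that receives the data.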
(24) In one aspect of the present invention, as illustrated in
(25) The source operand 202 may be divided into two portions. In this case, the latter four bits 208 may designate a specific source module from which data is to be retrieved. The first four bits 210 may indicate either an index of the source module or an operation to be performed on the data. The destination operand 204 may be divided into two portions similar to that of the source operand 202. The latter four bits 212 of the destination operand 204 refer to the specific destination module to which data is to be transferred. The first three bits 214 refer to either an index of the destination module or an operation to be performed on the data.
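The field split described above can be made concrete with a small decoder. The sketch below assumes the 16-bit layout suggested by the `fddd dddd ssss ssss` pattern used in Table 1: a format bit in the top position, a 7-bit destination operand (3-bit index, 4-bit module), and an 8-bit source operand (4-bit index, 4-bit module). The names `Decoded` and `decode` are hypothetical.

```python
from typing import NamedTuple


class Decoded(NamedTuple):
    fmt: int          # format bit f
    dst_index: int    # first three bits of the destination operand
    dst_module: int   # latter four bits of the destination operand
    src_index: int    # first four bits of the source operand
    src_module: int   # latter four bits of the source operand


def decode(word):
    """Split a 16-bit instruction word into its fields (assumed layout)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction word must fit in 16 bits")
    return Decoded(
        fmt=(word >> 15) & 0x1,
        dst_index=(word >> 12) & 0x7,
        dst_module=(word >> 8) & 0xF,
        src_index=(word >> 4) & 0xF,
        src_module=word & 0xF,
    )


# 1100 1010 0011 0101: f=1, destination 100 1010, source 0011 0101
d = decode(0b1100_1010_0011_0101)
```

When the format bit indicates an immediate source, the two source fields would instead be read together as one 8-bit literal, `(d.src_index << 4) | d.src_module`.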
(26) To further expand the functionality and addressing capability of a selected instruction word length, the instruction bus may be implemented as an 18-bit bus with three additional bits supplied from the dds control field of the prefix module as previously described.
(27) As illustrated above, the source and destination operands 202, 204 may be utilized to select physical device registers. However, the source and destination operands 202, 204 are not rigidly associated with physical registers and may instead designate specific operations to be performed on a particular piece of data. For example, the source and destination operands 202, 204 may be utilized to perform an indirect memory access. Specific source and/or destination operands 202, 204 may be identified as indirect access portals to physical memories such as a stack, accumulator array, or the data memory. The indirect memory access portals utilize physical pointer registers to define the respective memory address locations for access. For example, one way that the data memory can be accessed indirectly is using a @DP[0] operand. This operand, when used as a source or destination, triggers an indirect read or write access to the data memory location addressed by the Data Pointer 0 (DP[0]) register.
(28) In addition, specific source and/or destination operands 202, 204 may be utilized to trigger underlying hardware operations. The trigger mechanism serves as the basis for creating instructions that are implicitly linked to specific resources. For example, math operations (i.e., ADD, SUB, ADDC, and SUBB) are implemented as special destination encodings that implicitly target one of the working accumulators, with only the source operand supplied by the user. Conditional jumps implicitly target an instruction pointer (IP) for modification and are implemented as separate destination encodings for each status condition that can be evaluated. The indirect memory access and underlying hardware operation triggers are combined whenever possible to create new source/destination operands 202, 204 which give dual benefits to the user. For instance, when reading from the data memory 104 with, e.g., Data Pointer 0, the user may optionally increment or decrement the pointer following the read operation using a @DP[0]++ or @DP[0]-- source operand, respectively.
(29) Table 1 below lists an exemplary instruction set utilizing the above-described structure. An entry may list an entire instruction word explicitly, including the source format bit, or may list only portions of the instruction word, such as the destination operand, explicitly. Although Table 1 illustrates specific functions as being performed by specific instruction words, it will be understood by one skilled in the art that various instruction words may be utilized to perform a specific function.
(30) TABLE 1
Instruction Code | Description | Flags
f001 1010 ssss ssss | (Acc)=(Acc) AND src; f=0: src=#literal, f=1: src=(register) | S, Z
f010 1010 ssss ssss | (Acc)=(Acc) OR src; f=0: src=#literal, f=1: src=(register) | S, Z
f011 1010 ssss ssss | (Acc)=(Acc) XOR src; f=0: src=#literal, f=1: src=(register) | S, Z
f100 1010 ssss ssss | (Acc)=(Acc) + src; f=0: src=#literal, f=1: src=(register) | C, S, Z, OV
f101 1010 ssss ssss | (Acc)=(Acc) - src; f=0: src=#literal, f=1: src=(register) | C, S, Z, OV
f110 1010 ssss ssss | (Acc)=(Acc) + src + (C); f=0: src=#literal, f=1: src=(register) | C, S, Z, OV
f111 1010 ssss ssss | (Acc)=(Acc) - src - (C); f=0: src=#literal, f=1: src=(register) | C, S, Z, OV
1000 1010 0001 1010 | (A)=~(A) | S, Z
1000 1010 0010 1010 | (A7-0)=(A6-0),0 and (C)=(A7) (for MaxQ10Core); (A15-0)=(A14-0),0 and (C)=(A15) (for MaxQ20Core) | C, S, Z
1000 1010 0011 1010 | (A7-0)=(A5-0),0,0 and (C)=(A6) (for MaxQ10Core); (A15-0)=(A13-0),0,0 and (C)=(A14) (for MaxQ20Core) | C, S, Z
1000 1010 0110 1010 | (A7-0)=(A3-0),0,0,0,0 and (C)=(A4) (for MaxQ10Core); (A15-0)=(A11-0),0,0,0,0 and (C)=(A12) (for MaxQ20Core) | C, S, Z
1000 1010 0100 1010 | (A7-0)=(A6-0,7) (for MaxQ10Core); (A15-0)=(A14-0,15) (for MaxQ20Core) | S
1000 1010 0101 1010 | (A7-0)=(A6-0),(C) and (C)=(A7) (for MaxQ10Core); (A15-0)=(A14-0),(C) and (C)=(A15) (for MaxQ20Core) | C, S, Z
1000 1010 1001 1010 | (A)=~(A)+1 | S
1000 1010 1010 1010 | (A7-0)=0,(A7-1) and (C)=(A0) (for MaxQ10Core); (A15-0)=0,(A15-1) and (C)=(A0) (for MaxQ20Core) | C, S, Z
1000 1010 1100 1010 | (A7-0)=(A0,7-1) (for MaxQ10Core); (A15-0)=(A0,15-1) (for MaxQ20Core) | S
1000 1010 1101 1010 | (A7-0)=(C),(A7-1) and (C)=(A0) (for MaxQ10Core); (A15-0)=(C),(A15-1) and (C)=(A0) (for MaxQ20Core) | C, S, Z
1000 1010 1111 1010 | (A7-0)=(A7),(A7-1) and (C)=(A0) (for MaxQ10Core); (A15-0)=(A15),(A15-1) and (C)=(A0) (for MaxQ20Core) | C, Z
1000 1010 1110 1010 | (A7-0)=(A7),(A7),(A7-2) and (C)=(A1) (for MaxQ10Core); (A15-0)=(A15),(A15),(A15-2) and (C)=(A1) (for MaxQ20Core) | C, Z
1000 1010 1011 1010 | (A7-0)=(A7),(A7),(A7),(A7),(A7-4) and (C)=(A3) (for MaxQ10Core); (A15-0)=(A15),(A15),(A15),(A15),(A15-4) and (C)=(A3) (for MaxQ20Core) | C, Z
f111 1000 ssss ssss | If (Acc)=src, then (E)=1; else, (E)=0 | E
fddd dddd ssss ssss | (dst)=src; f=0: src=#literal, f=1: src=(register) | S, Z, C, E
1000 1010 0111 1010 | (A7-0)=(A3-0,A7-4) (for MaxQ10Core); (A15-0)=(A11-8,15-12,3-0,7-4) (for MaxQ20Core) | S
1000 1010 1000 1010 | (A15-0)=(A7-0,15-8) (for MaxQ20Core) | S
f000 1101 ssss ssss | (SP)=(SP)+1, ((SP))=src; f=0: src=#literal, f=1: src=(source specifier). This is equivalent to MOVE @++SP, src. |
1ddd dddd 0000 1101 | (dst)=((SP)), (SP)=(SP)-1. Note: This is equivalent to MOVE dst, @SP--. | S, Z, C, E
1ddd dddd 1000 1101 | (dst)=((SP)), (SP)=(SP)-1. Note: This is equivalent to MOVE dst, @SPI--. It also clears the INS bit. | S, Z, C, E
1001 1010 bbbb 1010 | (C)=(C) AND (Acc.b). For a selected bit in the Active Accumulator where b=0:15 as selected by the source index bbbb. For MaxQ10Core, Acc is 8-bit only and selecting one of the eight high-order bits results in no operation. | C
1010 1010 bbbb 1010 | (C)=(C) OR (Acc.b). For a selected bit in the Active Accumulator where b=0:15 as selected by the source index bbbb. For MaxQ10Core, Acc is 8-bit only and selecting one of the eight high-order bits results in no operation. | C
1011 1010 bbbb 1010 | (C)=(C) XOR (Acc.b). For a selected bit in the Active Accumulator where b=0:15 as selected by the source index bbbb. For MaxQ10Core, Acc is 8-bit only and selecting one of the eight high-order bits results in no operation. | C
1101 1010 0000 1010 | (C)=0 | C
1101 1010 0001 1010 | (C)=1 | C
1101 1010 0010 1010 | (C)=~(C) | C
1110 1010 bbbb 1010 | (C)=(Acc.b). For a selected bit in the Active Accumulator where b=0:15 as selected by the source index bbbb. For MaxQ10Core, Acc is 8-bit only and selecting one of the eight high-order bits results in no operation. | C
1111 1010 bbbb 1010 | (Acc.b)=(C). For a selected bit in the Active Accumulator where b=0:15 as selected by the source index bbbb. For MaxQ10Core, Acc is 8-bit only and selecting one of the eight high-order bits results in no operation. | S, Z
1ddd dddd 0bbb 0111 | (dst.b)=0. For a selected bit in the destination register where b=0:7 as selected by the source index bbb. | S, Z, C, E
1ddd dddd 1bbb 0111 | (dst.b)=1. For a selected bit in the destination register where b=0:7 as selected by the source index bbb. | S, Z, C, E
fbbb 0111 ssss ssss | (C)=src.b. For a selected bit in the source where b=0:7 as selected by the destination index bbb. For f=0: src=#literal, f=1: src=(register). | C
f000 1100 ssss ssss | If f=1: 16-bit register operand, (IP)=(src); 8-bit register operand, (IP)=(PFX);(src). If f=0: PFX not activated, (IP)=(IP)+1+Immediate data (immediate data is 2's complement, sign-extended); PFX activated, (IP)=(PFX);Immediate data. |
f001 1100 ssss ssss | If Z=1, then: if f=1: 16-bit register operand, (IP)=(src); 8-bit register operand, (IP)=(PFX);(src). If f=0: PFX not activated, (IP)=(IP)+1+Immediate data (immediate data is 2's complement, sign-extended); PFX activated, (IP)=(PFX);Immediate data. Else, (IP)=(IP)+1. |
f010 1100 ssss ssss | If C=1, then: if f=1: 16-bit register operand, (IP)=(src); 8-bit register operand, (IP)=(PFX);(src). If f=0: PFX not activated, (IP)=(IP)+1+Immediate data (immediate data is 2's complement, sign-extended); PFX activated, (IP)=(PFX);Immediate data. Else, (IP)=(IP)+1. |
0011 1100 ssss ssss | If E=1, then: PFX not activated, (IP)=(IP)+1+Immediate data (immediate data is 2's complement, sign-extended); PFX activated, (IP)=(PFX);Immediate data. Else, (IP)=(IP)+1. |
f100 1100 ssss ssss | If S=1, then: if f=1: 16-bit register operand, (IP)=(src); 8-bit register operand, (IP)=(PFX);(src). If f=0: PFX not activated, (IP)=(IP)+1+Immediate data (immediate data is 2's complement, sign-extended); PFX activated, (IP)=(PFX);Immediate data. Else, (IP)=(IP)+1. |
f101 1100 ssss ssss | If Z=0, then: if f=1: 16-bit register operand, (IP)=(src); 8-bit register operand, (IP)=(PFX);(src). If f=0: PFX not activated, (IP)=(IP)+1+Immediate data (immediate data is 2's complement, sign-extended); PFX activated, (IP)=(PFX);Immediate data. Else, (IP)=(IP)+1. |
f110 1100 ssss ssss | If C=0, then: if f=1: 16-bit register operand, (IP)=(src); 8-bit register operand, (IP)=(PFX);(src). If f=0: PFX not activated, (IP)=(IP)+1+Immediate data (immediate data is 2's complement, sign-extended); PFX activated, (IP)=(PFX);Immediate data. Else, (IP)=(IP)+1. |
0111 1100 ssss ssss | If E=0, then: PFX not activated, (IP)=(IP)+1+Immediate data (immediate data is 2's complement, sign-extended); PFX activated, (IP)=(PFX);Immediate data. Else, (IP)=(IP)+1. |
f10n 1101 ssss ssss | (LC[n])=(LC[n])-1. If LC[n]<>0, then: if f=1: 16-bit register operand, (IP)=(src); 8-bit register operand, (IP)=(PFX);(src). If f=0: PFX not activated, (IP)=(IP)+1+Immediate data (immediate data is 2's complement, sign-extended, in the range of +127 to -128 in decimal); PFX activated, (IP)=(PFX);Immediate data. Else, (IP)=(IP)+1. For n=0:1 as selected by the destination index n. |
f011 1101 ssss ssss | (IP)=(IP)+1, (SP)=(SP)+1, ((SP))=(IP). If f=1: 16-bit register operand, (IP)=(src); 8-bit register operand, (IP)=(PFX);(src). If f=0: PFX not activated, (IP)=(IP)+1+Immediate data (immediate data is 2's complement, sign-extended); PFX activated, (IP)=(PFX);Immediate data. |
1000 1100 0000 1101 | (IP)=((SP)), (SP)=(SP)-1. |
1000 1100 1000 1101 | (IP)=((SP)), (SP)=(SP)-1. |
1001 1100 0000 1101 | If Z=1, then (IP)=((SP)), (SP)=(SP)-1; else, (IP)=(IP)+1. Note: This instruction also clears the INS bit. |
1001 1100 1000 1101 | If Z=1, then (IP)=((SP)), (SP)=(SP)-1; else, (IP)=(IP)+1. Note: This instruction also clears the INS bit if the return is taken. |
1010 1100 0000 1101 | If C=1, then (IP)=((SP)), (SP)=(SP)-1; else, (IP)=(IP)+1. |
1010 1100 1000 1101 | If C=1, then (IP)=((SP)), (SP)=(SP)-1; else, (IP)=(IP)+1. Note: This instruction also clears the INS bit if the return is taken. |
1100 1100 0000 1101 | If S=1, then (IP)=((SP)), (SP)=(SP)-1; else, (IP)=(IP)+1. |
1100 1100 1000 1101 | If S=1, then (IP)=((SP)), (SP)=(SP)-1; else, (IP)=(IP)+1. Note: This instruction also clears the INS bit if the return is taken. |
1101 1100 0000 1101 | If Z=0, then (IP)=((SP)), (SP)=(SP)-1; else, (IP)=(IP)+1. |
1101 1100 1000 1101 | If Z=0, then (IP)=((SP)), (SP)=(SP)-1; else, (IP)=(IP)+1. Note: This instruction also clears the INS bit if the return is taken. |
1110 1100 0000 1101 | If C=0, then (IP)=((SP)), (SP)=(SP)-1; else, (IP)=(IP)+1. |
1110 1100 1000 1101 | If C=0, then (IP)=((SP)), (SP)=(SP)-1; else, (IP)=(IP)+1. Note: This instruction also clears the INS bit if the return is taken. |
1101 1010 0011 1010 | (IP)=(IP)+1. |
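The relative-jump rule that recurs in the table, (IP)=(IP)+1+Immediate data with a sign-extended 2's complement immediate, can be checked with a short calculation. The Python sketch below assumes an 8-bit immediate and a 16-bit instruction pointer that wraps on overflow; the function names `sign_extend8` and `next_ip_relative` are hypothetical, and the prefix-activated (absolute) case is not modelled.

```python
def sign_extend8(value):
    """Interpret the low 8 bits of value as a 2's complement number."""
    value &= 0xFF
    return (value & 0x7F) - (value & 0x80)


def next_ip_relative(ip, imm8, taken=True):
    """Next instruction pointer for a relative jump without a prefix.

    Assumes a 16-bit IP. If the jump condition is not met, execution
    simply falls through to IP + 1.
    """
    if not taken:
        return (ip + 1) & 0xFFFF
    return (ip + 1 + sign_extend8(imm8)) & 0xFFFF


forward = next_ip_relative(0x0100, 0x05)                # 0x0100 + 1 + 5
backward = next_ip_relative(0x0100, 0xFE)               # 0x0100 + 1 - 2
skipped = next_ip_relative(0x0100, 0x05, taken=False)   # fall through
```

An immediate of 0xFF (that is, -1) jumps back to the jump instruction itself, which shows why the offset is measured from the address following the jump.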
(31) Referring now to
(32) Referring now to
(33) Referring now to
(34) Referring now to
(35) The previous description is of a preferred embodiment for implementing the invention, and the scope of the invention should not necessarily be limited by this description. The scope of the present invention is instead defined by the following claims.