Stack processor using a ferroelectric random access memory (F-RAM) for code space and a portion of the stack memory space having an instruction set optimized to minimize processor stack accesses
09588881 · 2017-03-07
Assignee
Inventors
Cpc classification
G06F12/0223
PHYSICS
Y02D10/00
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
International classification
Abstract
A stack processor and method implemented using a ferroelectric random access memory (F-RAM) for code and a portion of the stack memory space having an instruction set optimized to minimize processor stack accesses and thus minimize program execution time. This is particularly advantageous in low power applications and those in which the power supply is only available for a finite period of time such as RFID implementations. Disclosed herein is a relatively small but complete set of instructions enabling a multitude of possible applications to be supported with a program execution time that is not too long.
Claims
1. A method for operating a stack processor comprising: a data POP wherein data is transferred from a bottom portion of a data stack to a top portion of the data stack in a first cycle of the stack processor, wherein the top portion of the data stack is in a volatile memory in a Forth core in the stack processor and the bottom portion of the data stack is in a ferroelectric random access memory in the stack processor, separate from the Forth core and coupled to the Forth core by a single code/data bus extending between the Forth core and ferroelectric random access memory only, and wherein data is transferred from the bottom portion of the data stack to the top portion of the data stack through the single code/data bus; transferring data from the top portion of the data stack to a top portion of a return stack in a second cycle of the stack processor, wherein the top portion of the return stack is a single register in the volatile memory, and wherein transferring data between the top portion of the data stack and the top portion of the return stack comprises copying data in the top portion of the data stack to the top portion of the return stack; and a return PUSH wherein data is transferred from the top portion of the return stack to a bottom portion of the return stack in a third cycle of the stack processor, wherein the bottom portion of the return stack is in the ferroelectric random access memory, and wherein data is transferred from the top portion of the return stack to the bottom portion of the return stack through the single code/data bus, wherein the first, second and third clock cycles are sequential, and wherein the data POP, return PUSH and the transferring data are performed in response to a stack-based computer programming language.
2. The method of claim 1 wherein the stack-based computer programming language comprises 64 possible instructions based upon a 16 bit word, wherein each of the instructions in the 16 bit word comprises 3 five bit instructions and a 16th bit applicable to each of the 3 five bit instructions.
3. A method for operating a stack processor comprising: in a first cycle of the stack processor transferring data from a bottom portion of a data stack to a top portion of the data stack, wherein the bottom portion of the data stack is in a ferroelectric random access memory (F-RAM) and the top portion of the data stack is in a volatile memory in a Forth core in the stack processor, and the data is transferred through a single code/data bus coupled between the F-RAM and the Forth core only; in a second cycle of the stack processor copying the data from the top portion of the data stack to a top portion of a return stack, wherein the top portion of the return stack comprises a single register in the volatile memory; and in a third cycle of the stack processor transferring the data from the top portion of the return stack to a bottom portion of the return stack, wherein the bottom portion of the return stack is in F-RAM and the data is transferred through the single code/data bus.
4. The method of claim 3 wherein the transferring and the copying of data are performed in response to a stack-based computer programming language supporting Forth core instructions.
5. The method of claim 4 wherein the Forth core instructions comprise 64 possible instructions based upon a 16 bit word, wherein each of the instructions in the 16 bit word comprises 3 five bit instructions and a 16th bit applicable to each of the 3 five bit instructions.
6. A method for operating a stack processor comprising: transferring data from a bottom portion of a data stack to a top portion of the data stack, wherein the bottom portion of the data stack is in a ferroelectric random access memory (F-RAM) and the top portion of the data stack is in a volatile memory in a Forth core in the stack processor and the data is transferred through a single code/data bus coupled between the F-RAM and the Forth core only; copying data from the top portion of the data stack to a top portion of a return stack, wherein the top portion of the return stack comprises a single register in the volatile memory; and transferring the data from the top portion of the return stack to a bottom portion of the return stack, wherein the bottom portion of the return stack is in F-RAM and the data is transferred through the single code/data bus, wherein the transferring and the copying of data are performed in three sequential clock cycles of the stack processor in response to Forth core instructions.
7. The method of claim 6 wherein the Forth core instructions comprise 64 possible instructions based upon a 16 bit word, wherein each of the instructions in the 16 bit word comprises 3 five bit instructions and a 16th bit applicable to each of the 3 five bit instructions.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The aforementioned and other features and objects of the present invention and the manner of attaining them will become more apparent and the invention itself will be best understood by reference to the following description of a preferred embodiment taken in conjunction with the accompanying drawings, wherein:
(2)
(3)
(4)
(5)
(6)
(7)
DESCRIPTION OF A REPRESENTATIVE EMBODIMENT
(8) With reference now to
(9) An associated interrupt controller 108 forms a portion of the processor 104 which also operates in conjunction with a clock reset circuit 110 as shown. In the representative embodiment illustrated, eight modules (Module 1 through Module 8) respectively labeled as 112₁ through 112₈ inclusive are associated with the stack processor 100. A single code/data bus couples the Forth core 106 to the F-RAM memory array 102 and comprises memory read and write lines (mem_rd, mem_wr) and 16 bit memory address, memory write data and memory read data buses (mem_address, mem_wr_data, mem_rd_data) as illustrated.
(10) The clock reset circuit 110 provides a clock interrupt signal (clk_int) to the interrupt controller 108 as well as a reset n (rst_n) signal to both the Forth core 106 and the interrupt controller 108. The clock reset circuit 110 also provides a core clock signal (clk_core) to the Forth core 106 and receives a core_need_clock signal therefrom. The interrupt controller 108 provides an interrupt signal (int) and a 3 bit int_nb signal to the Forth core 106 and receives interrupt clear (int_clr) and ongoing interrupt (int_ongoing) signals therefrom.
(11) With reference additionally now to
(12) This figure shows the organization of an embodiment of the stack processor 200. The top part of the data stack 210 is held in the processor core 204, in this case in volatile CMOS registers, while the bottom of the data stack 206 is in the nonvolatile F-RAM memory array 202. In the same manner, the top of the return stack 212 is held in volatile CMOS registers in the processor core 204 and the bottom of the return stack 208 is in the F-RAM memory array 202. Therefore, if a program only needs to modify the top of the data stack, with no push or pop to the portion of the stack in the F-RAM memory array 202, only CMOS register accesses are involved, which results in lower overall power consumption. Should a power down occur, only the contents of a very limited number of CMOS registers would need to be saved before power is lost.
(13) It should be noted that a portion of the performance of this particular stack processor 200 implementation is due to the instruction set. However, some of it also results from the fact that the program is relatively small and can be written to derive benefit from the top of the stacks being in CMOS registers.
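The split-stack organization described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name SplitStack, the reg_depth parameter, and the spill and refill policy are assumptions made for illustration. It only shows that operations confined to the register-held top of the stack incur no F-RAM accesses, while a push or pop that crosses the register/F-RAM boundary incurs one.

```python
class SplitStack:
    """Stack whose top cells live in registers and whose bottom is in F-RAM.

    reg_depth is a hypothetical parameter; the description only says that
    the top part is held in a very limited number of CMOS registers.
    """

    def __init__(self, reg_depth):
        self.reg_depth = reg_depth
        self.regs = []          # volatile top portion, newest item last
        self.fram = []          # nonvolatile bottom portion, newest item last
        self.fram_accesses = 0  # counts reads and writes to the F-RAM part

    def push(self, value):
        if len(self.regs) == self.reg_depth:
            # Registers are full: spill the oldest register cell to F-RAM.
            self.fram.append(self.regs.pop(0))
            self.fram_accesses += 1
        self.regs.append(value)

    def pop(self):
        value = self.regs.pop()
        if self.fram:
            # Refill the freed register cell from the F-RAM portion.
            self.regs.insert(0, self.fram.pop())
            self.fram_accesses += 1
        return value

    def replace_top(self, fn):
        # Modify the top of stack in place: register access only.
        self.regs[-1] = fn(self.regs[-1])


stack = SplitStack(reg_depth=2)
stack.push(1)
stack.push(2)
stack.replace_top(lambda x: x + 10)
accesses_register_only = stack.fram_accesses   # still 0 at this point
stack.push(3)                                  # crosses the boundary once
accesses_after_spill = stack.fram_accesses
```

In this sketch a program that works only on the top cells never touches the F-RAM counter, which mirrors the power argument made in the description.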
(14) With reference additionally now to
(15) The proposed stack access method illustrated keeps track of where the stack pointer is relative to the boundary between the CMOS and F-RAM portions of the stack. As illustrated, a data transfer between the data and return stacks can result in four possible initial configurations. The shaded boxes indicate that there is valid data in this stack memory address.
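As a rough illustration of why the position of the stack pointer relative to the CMOS/F-RAM boundary matters, the following Python sketch counts F-RAM accesses for a transfer from the data stack to the return stack. The four configurations modeled here (data stack with or without valid data in F-RAM, return-stack register full or empty) are an assumption and may not match the four configurations shown in the figure. The function names and the two-register data stack top are likewise hypothetical, and cycle timing is not modeled.

```python
from itertools import product


def to_r(data_regs, data_fram, ret_reg, ret_fram):
    """Move the top of the data stack to the return stack.

    data_regs: register part of the data stack (list, newest last)
    data_fram: F-RAM part of the data stack (list, newest last)
    ret_reg:   single register holding the return-stack top, or None
    ret_fram:  F-RAM part of the return stack (list, newest last)
    Returns (new return-stack top, number of F-RAM accesses).
    """
    accesses = 0
    value = data_regs.pop()
    if data_fram:
        # Data POP: refill the register part from F-RAM.
        data_regs.insert(0, data_fram.pop())
        accesses += 1
    if ret_reg is not None:
        # Return PUSH: the old return-stack top moves down into F-RAM.
        ret_fram.append(ret_reg)
        accesses += 1
    return value, accesses


def count_accesses(data_in_fram, ret_reg_full):
    # Build one assumed initial configuration and run the transfer.
    data_regs = [1, 2]
    data_fram = [0] if data_in_fram else []
    ret_reg = 9 if ret_reg_full else None
    ret_fram = []
    _, accesses = to_r(data_regs, data_fram, ret_reg, ret_fram)
    return accesses


# F-RAM accesses for each (data_in_fram, ret_reg_full) combination.
access_table = {cfg: count_accesses(*cfg)
                for cfg in product((False, True), repeat=2)}
```

Under these assumptions the transfer costs zero, one, or two F-RAM accesses depending on the starting configuration, which is the kind of saving a boundary-aware access method is meant to capture.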
(16) In
(17) As can be seen, according to the specific configuration, the proposed algorithm performs or does not perform a stack pop/push and for the specific instruction outlined above, the algorithm would do the following:
(18) As shown in
(19) With reference additionally now to
(20) With reference additionally now to
(21) In this particular implementation of the present invention, the disclosed instruction set is based on a 16 bit memory code space. In order to determine the appropriate instruction set, the following possibilities may be examined: an instruction set of 4 bits or fewer would allow at most 16 instructions and so would not be sufficient, as the program would take too long to execute with too many F-RAM memory array fetches; an instruction set of 7 bits or more would provide 128 instructions or more, so too much leakage/dynamic power would be required and the logic would be excessively large; a 6 bit instruction set would provide 64 instructions, which while sufficient, is nonetheless very difficult to map into a 16-bit word without wasting too many bits; a 5 bit instruction set would provide 32 instructions and would appear to be a bit too limiting.
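The packing trade-off in the preceding paragraph can be checked with simple arithmetic. The short Python sketch below (the function name packing is hypothetical) computes, for a given instruction width, how many whole instructions fit in a 16 bit word and how many bits are left unused when instructions are packed naively with no shared bits.

```python
def packing(width, word_bits=16):
    """Return (instructions per word, unused bits) for naive packing."""
    per_word = word_bits // width
    unused = word_bits - per_word * width
    return per_word, unused


# Instruction widths discussed in the description.
summary = {width: packing(width) for width in (4, 5, 6, 7)}
```

With naive packing, a 6 bit instruction fits only twice per word and leaves 4 bits unused, whereas a 5 bit instruction fits three times and leaves a single bit unused.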
(22) As can be seen in the exemplary instruction set illustrated, 64 instructions can be provided while minimizing the waste of memory bits. The MSB of the 16 bit word is shared by each of the three 5 bit instructions in the word, so each instruction is effectively 6 bits wide (which gives a total of 64 instructions maximum). As a consequence, the full 16 bits of code are used.
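The encoding described above, in which three 5 bit instructions share the MSB of a 16 bit word, can be sketched in Python as follows. The placement of the three fields within the low 15 bits and the use of the shared bit as the high bit of each 6 bit opcode are assumptions; the source states only that the MSB is used by each instruction. The names decode_word and encode_word are hypothetical.

```python
def decode_word(word):
    """Split a 16 bit code word into three 6 bit opcodes (0..63)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    shared = (word >> 15) & 1
    # Assumed layout: bits 14..10, 9..5 and 4..0 hold the three fields.
    fields = [(word >> shift) & 0x1F for shift in (10, 5, 0)]
    # The shared MSB extends each 5 bit field to a 6 bit opcode.
    return [(shared << 5) | field for field in fields]


def encode_word(shared, fields):
    """Pack one shared bit and three 5 bit fields into a 16 bit word."""
    if shared not in (0, 1) or len(fields) != 3:
        raise ValueError("need one shared bit and exactly three fields")
    word = shared << 15
    for shift, field in zip((10, 5, 0), fields):
        if not 0 <= field <= 0x1F:
            raise ValueError("each field must fit in 5 bits")
        word |= field << shift
    return word


example_word = encode_word(1, [1, 2, 3])
example_opcodes = decode_word(example_word)
```

One consequence of this reading is that the three instructions in a given word are all drawn from the same half of the 64 opcode space, since they share the same sixteenth bit.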
(23) With reference additionally now to
(24) While there have been described above the principles of the present invention in conjunction with specific circuitry and technology, it is to be clearly understood that the foregoing description is made only by way of example and not as a limitation to the scope of the invention. Particularly, it is recognized that the teachings of the foregoing disclosure will suggest other modifications to those persons skilled in the relevant art. Such modifications may involve other features which are already known per se and which may be used instead of or in addition to features already described herein. Although claims have been formulated in this application to particular combinations of features, it should be understood that the scope of the disclosure herein also includes any novel feature or any novel combination of features disclosed either explicitly or implicitly or any generalization or modification thereof which would be apparent to persons skilled in the relevant art, whether or not such relates to the same invention as presently claimed in any claim and whether or not it mitigates any or all of the same technical problems as confronted by the present invention. The applicants hereby reserve the right to formulate new claims to such features and/or combinations of such features during the prosecution of the present application or of any further application derived therefrom.
(25) As used herein, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a recitation of certain elements does not necessarily include only those elements but may include other elements not expressly recited or inherent to such process, method, article or apparatus. None of the description in the present application should be read as implying that any particular element, step, or function is an essential element which must be included in the claim scope and THE SCOPE OF THE PATENTED SUBJECT MATTER IS DEFINED ONLY BY THE CLAIMS AS ALLOWED. Moreover, none of the appended claims are intended to invoke paragraph six of 35 U.S.C. Sect. 112 unless the exact phrase "means for" is employed and is followed by a participle.