Multiple-digit binary in-memory multiplier devices

11461074 · 2022-10-04

Assignee

FLASHSILICON INCORPORATION (Diamond Bar, CA, US)

Inventors

Lee Wang (Diamond Bar, CA, US)

Abstract

Multi-digit binary in-memory multiplication devices are disclosed. The multi-digit binary in-memory multiplication devices of the invention can dramatically reduce the number of operational steps in comparison with a conventional binary multiplier device. In one embodiment, at the expense of more hardware, the in-memory multiplication device can complete the multiplication in a single operational step. Consequently, the multi-digit binary in-memory multiplication device can improve computation efficiency and save computation power by eliminating data transport between the Arithmetic Logic Unit (ALU), registers, and memory units.

Claims

1. An in-memory multiplication device for performing multiplication on a multiplicand and a multiplier and generating a final product, comprising: a number P of in-memory multiplier units arranged in a parallel configuration, each comparing a number 2.sup.2n of hardwired 2n-bit operand symbols with a first n-bit digit and a second n-bit digit respectively selected from the multiplicand and the multiplier to output one of a number 2.sup.2n of hardwired 2n-bit response symbols as a 2n-bit product code, wherein all the 2n-bit product codes from the number P of in-memory multiplier units form first coefficients of m first polynomials in base 2.sup.2n and the first coefficients of each first polynomial in base 2.sup.n are associated with multiplication of the multiplicand with a corresponding digit of the multiplier, wherein each of the multiplicand and the multiplier has m digits in base 2.sup.n; zero or a number Q of binary adder devices arranged in a parallel configuration for converting the 2n-bit first coefficients of the m first polynomials in base 2.sup.n into n-bit second coefficients of m second polynomials in base 2.sup.n in parallel; and zero or a number (m−1) of polynomial adders arranged in sequential order and sequentially adding the n-bit second coefficients of the m second polynomials in base 2.sup.n in ascending degrees such that like terms of the m second polynomials in base 2.sup.n are lined up and added to generate third coefficients of a third polynomial in base 2.sup.n; wherein the third coefficients form the final product having 2m digits in base 2.sup.n; and wherein P, Q, n and m are integers greater than 0.

2. The in-memory multiplication device according to claim 1, wherein the number 2.sup.2n of hardwired 2n-bit operand symbols and the number 2.sup.2n of hardwired 2n-bit response symbols define an n-bit by n-bit multiplication table.

3. The in-memory multiplication device according to claim 1, wherein a number of terms in each first polynomial is m and a highest degree of the m first polynomials is (2m−2), wherein a number of terms in each second polynomial is (m+1) and a highest degree of the m second polynomials is (2m−1), and wherein a number of terms in the third polynomial is 2m and a highest degree of the third polynomial is (2m−1).

4. The in-memory multiplication device according to claim 1, further comprising: a first register unit coupled to the number (m−1) of polynomial adders for storing the final product, wherein a constant term of the m second polynomials is stored in the first register unit as a least significant digit of the final product.

5. The in-memory multiplication device according to claim 4, wherein the number (m−1) of polynomial adders comprises a least significant polynomial adder, zero or (m−3) intermediate polynomial adders and zero or one most significant polynomial adder, wherein the least significant polynomial adder lines up and adds the n-bit second coefficients of m larger degree terms of the second polynomial of degree m and all the n-bit second coefficients of a second polynomial of degree (m+1) to obtain sum coefficients of a sum polynomial of degree (m+1) and propagates the sum coefficient of a smallest degree term of the sum polynomial of degree (m+1) to the first register unit, wherein when m>=4, each of the (m−3) intermediate polynomial adders lines up and adds the sum coefficients of m larger degree terms of the sum polynomial of degree j and all the n-bit second coefficients of the second polynomial of degree (j+1) to obtain sum coefficients of a sum polynomial of degree (j+1) and propagates the sum coefficient of a smallest degree term of the sum polynomial of degree (j+1) to the first register unit, where j is increased from (m+1) to (2m−3), wherein the most significant polynomial adder lines up and adds the sum coefficients of m larger degree terms of the sum polynomial of degree (2m−2) and all the n-bit second coefficients of the second polynomial of degree (2m−1) to obtain the sum coefficients of a sum polynomial of degree (2m−1) and wherein all the sum coefficients of the sum polynomial of degree (2m−1) are propagated to the first register unit.

6. The in-memory multiplication device according to claim 1, wherein each binary adder device comprises (m−1) n-bit adders and n half adders in a carry-chained configuration, wherein a least significant digit of the first coefficient of the smallest degree term in the first polynomial of degree k is assigned to the n-bit second coefficient of a smallest degree term in a corresponding second polynomial of degree (k+1), where k is increased from (m−1) to (2m−2) and m>=2.

7. The in-memory multiplication device according to claim 6, wherein a least significant n-bit adder of the (m−1) n-bit adders adds a least significant digit of the first coefficient of a second smallest degree term and a most significant digit of the first coefficient of the smallest degree term in the first polynomial of degree k to produce a carry digit and the n-bit second coefficient of the second smallest degree term in the corresponding second polynomial of degree (k+1), wherein a corresponding n-bit adder (i) of the (m−1) n-bit adders adds a carry digit from a less significant n-bit adder, the least significant digit of the first coefficient of a target term (i.sup.th) and the most significant digit of the first coefficient in its immediately-previous-degree term ((i−1).sup.th) in the first polynomial of degree k to produce a carry digit and the n-bit second coefficient of the corresponding term (i.sup.th) in the corresponding second polynomial of degree (k+1), where i is increased from 2 to (m−1), and wherein the n half adders add a carry digit from a most significant n-bit adder and a most significant digit of the first coefficient of a largest degree term in the first polynomial of degree k to produce the n-bit second coefficient of a largest degree term in the corresponding second polynomial of degree (k+1).

8. The in-memory multiplication device according to claim 1, wherein each in-memory multiplier unit comprises: a first read-only-memory (ROM) array comprising 2.sup.2n rows by 2n columns of first memory cells for parallel comparing the first n-bit digit and the second n-bit digit with the number 2.sup.2n of 2n-bit operand symbols hardwired in the 2.sup.2n rows of first memory cells, wherein each row of the first memory cells generates an indication signal indicative of whether the first n-bit digit and the second n-bit digit match its hardwired 2n-bit operand symbol; a detection circuit for respectively applying a number 2.sup.2n of switching signals to a number 2.sup.2n of wordlines of a second ROM array in response to a number 2.sup.2n of indication signals; and the second ROM array comprising 2.sup.2n rows by 2n columns of second memory cells, wherein the number 2.sup.2n of 2n-bit response symbols are respectively hardwired in the 2.sup.2n rows of second memory cells; wherein while receiving an activated switching signal, a row of second memory cells is switched on to output its hardwired 2n-bit response symbol as the 2n-bit product code.

9. The in-memory multiplication device according to claim 1, further comprising: a number m of second register units coupled between the number P of in-memory multiplier units and the number Q of binary adder devices for respectively storing the first coefficients of the m first polynomials in base 2.sup.n, wherein P=1 and Q=m.

10. The in-memory multiplication device according to claim 1, further comprising: a number m of second register units coupled between the number Q of binary adder devices and the number (m−1) of polynomial adders for respectively storing the n-bit second coefficients of the m second polynomials in base 2.sup.n, wherein P=m and Q=1.

11. The in-memory multiplication device according to claim 1, wherein P=m.sup.2 and Q=m.

12. An operating method of an in-memory multiplication device that performs multiplication on a multiplicand and a multiplier to generate a final product, the in-memory multiplication device comprising a single in-memory multiplier unit, a first register unit, a second register unit, a number m of binary adder devices and zero or a number (m−1) of polynomial adders, the method comprising the steps of: comparing a first n-bit digit and a second n-bit digit respectively selected from the multiplicand and the multiplier with a number 2.sup.2n of 2n-bit operand symbols hardwired in a first read-only-memory (ROM) array to output one of a number 2.sup.2n of 2n-bit response symbols hardwired in a second ROM array as a 2n-bit product code and store the 2n-bit product code in the first register unit by the single in-memory multiplier unit comprising the first ROM array and the second ROM array; repeating the step of comparing until all digits of the multiplicand and the multiplier are processed and a number m.sup.2 of 2n-bit product codes are stored in the first register unit, wherein the number m.sup.2 of 2n-bit product codes serve as first coefficients of m first polynomials in base 2.sup.n, wherein the first coefficients of each first polynomial in base 2.sup.n are associated with multiplication of the multiplicand with a corresponding digit of the multiplier; when m>=2, sequentially adding a most significant digit of the first coefficient of a less degree term and a least significant digit of the first coefficient of a larger degree term adjacent to the less degree term for each first polynomial from the first register unit in ascending degree by each binary adder device comprising (m−1) n-bit adders and n half adders in a carry-chained configuration so that the 2n-bit first coefficients of the m first polynomials in base 2.sup.n are converted into n-bit second coefficients of m second polynomials in base 2.sup.n in parallel, wherein the number m of binary adder devices are arranged in a parallel configuration; and when m>=2, sequentially adding the m second polynomials in base 2.sup.n in ascending degrees by the number (m−1) of polynomial adders arranged in sequential order such that like terms of the m second polynomials in base 2.sup.n are lined up and added to generate and store a final product having 2m digits in base 2.sup.n in the second register unit, wherein each polynomial adder comprises a (m×n)-bit adder and n half adders in a carry-chained configuration; wherein each of the multiplicand and the multiplier has m digits in base 2.sup.n and both n and m are integers greater than 0.

13. The operating method according to claim 12, wherein the number 2.sup.2n of 2n-bit operand symbols hardwired in the first ROM array and the number 2.sup.2n of 2n-bit response symbols hardwired in the second ROM array define an n-bit by n-bit multiplication table.

14. The operating method according to claim 12, wherein a number of terms in each first polynomial is m and a highest degree for the m first polynomials is (2m−2), wherein a number of terms in each second polynomial is (m+1) and a highest degree for the m second polynomials is (2m−1).

15. The operating method according to claim 12, wherein the step of sequentially adding the most significant digit comprises: at a binary adder device (k−3) of the number m of binary adder devices, (1) adding a least significant digit of the first coefficient of a second smallest degree term and a most significant digit of the first coefficient of a smallest degree term in the first polynomial of degree k from the first register unit to produce a carry digit and the n-bit second coefficient of the second smallest degree term in the corresponding second polynomial of degree (k+1) by a least significant n-bit adder of the (m−1) n-bit adders; (2) adding a carry digit from its less significant n-bit adder, the least significant digit of the first coefficient of a target term (i.sup.th) and the most significant digit of the first coefficient of its immediately-previous term ((i−1).sup.th) in the first polynomial of degree k from the first register unit to produce a carry digit and the n-bit second coefficient of a corresponding term (i.sup.th) in its corresponding second polynomial of degree (k+1) by a corresponding n-bit adder of the (m−1) n-bit adders; (3) repeating step (2) until the (m−1) n-bit adders are completed, where i is increased from 2 to (m−1); and (4) adding a carry digit from a most significant n-bit adder and a most significant digit of the first coefficient of a largest degree term in the first polynomial of degree k to produce the n-bit second coefficient of a largest degree term in the corresponding second polynomial of degree (k+1) by the n half adders, where k ranges from (m−1) to (2m−2).

16. The operating method according to claim 12, wherein the step of sequentially adding the m second polynomials comprises: (a) storing a constant term of the second polynomial of degree m as a least significant digit of the final product in the second register unit; (b) lining up and adding the n-bit second coefficients of m larger degree terms of the second polynomial of degree m and all the n-bit second coefficients of a second polynomial of degree (m+1) by a least significant polynomial adder of the number (m−1) of polynomial adders to obtain sum coefficients of a sum polynomial of degree (m+1) and store the sum coefficient of a smallest degree term of the sum polynomial of degree (m+1) as the second least significant digit of the final product in the second register unit; (c) when m>=4, lining up and adding the sum coefficients of m larger degree terms of the sum polynomial of degree j and all the n-bit second coefficients of the second polynomial of degree (j+1) by a corresponding polynomial adder of the number (m−1) of polynomial adders to obtain sum coefficients of a sum polynomial of degree (j+1) and store the sum coefficient of a smallest degree term of the sum polynomial of degree (j+1) as a corresponding digit of the final product in the second register unit; (d) when m>=4, repeating step (c) until a total of (m−2) polynomial adders out of the number (m−1) of polynomial adders are completed, where j is increased from (m+1) to (2m−3); and (e) when m>=3, lining up and adding the sum coefficients of m larger degree terms of the sum polynomial of degree (2m−2) and all the n-bit second coefficients of the second polynomial of degree (2m−1) by a most significant polynomial adder of the number (m−1) of polynomial adders to obtain and store all the sum coefficients of a sum polynomial of degree (2m−1) as the (m+1) most significant digits of the final product in the second register unit.

17. The operating method according to claim 12, wherein the step of comparing comprises: parallel comparing the first n-bit digit and the second n-bit digit with the number 2.sup.2n of 2n-bit operand symbols hardwired in the first ROM array comprising 2.sup.2n rows by 2n columns of first memory cells so that each row of first memory cells generates an indication signal indicative of whether the first n-bit digit and the second n-bit digit match its hardwired 2n-bit operand symbol; respectively applying a number 2.sup.2n of switching signals to a number 2.sup.2n of wordlines in the second ROM array comprising 2.sup.2n rows by 2n columns of second memory cells according to a number 2.sup.2n of indication signals, wherein the number 2.sup.2n of 2n-bit response symbols are hardwired in the 2.sup.2n rows of second memory cells; and switching on a row of second memory cells to output its hardwired 2n-bit response symbol as the 2n-bit product code in response to a received activated switching signal.

18. An operating method of an in-memory multiplication device that performs multiplication on a multiplicand and a multiplier to generate a final product, the in-memory multiplication device comprising a number m of in-memory multiplier units in a parallel configuration, a first register unit, a second register unit, a binary adder device and zero or a number (m−1) of polynomial adders, the method comprising the steps of: comparing a first n-bit digit and a second n-bit digit respectively selected from the multiplicand and the multiplier with a number 2.sup.2n of 2n-bit operand symbols hardwired in a first read-only-memory (ROM) array to output one of a number 2.sup.2n of 2n-bit response symbols hardwired in a second read-only-memory (ROM) array as a 2n-bit product code by each in-memory multiplier unit comprising the first ROM array and the second ROM array so that a number m of 2n-bit product codes are outputted in parallel from the number m of in-memory multiplier units, wherein the number m of 2n-bit product codes serve as 2n-bit first coefficients of one of m first polynomials in base 2.sup.n and are associated with multiplication of the multiplicand with the second n-bit digit of the multiplier; when m>=2, sequentially adding a least significant digit of the first coefficient of a less degree term and a most significant digit of the first coefficient of a larger degree term adjacent to the less degree term in the one first polynomial in ascending degree by the binary adder device comprising (m−1) n-bit adders and n half adders in a carry-chained configuration to convert the 2n-bit first coefficients of the one first polynomial in base 2.sup.n into n-bit second coefficients of a corresponding second polynomial in base 2.sup.n and store the n-bit second coefficients of the corresponding second polynomial in base 2.sup.n in the first register unit; repeating steps of comparing and converting until all the digits of the multiplier are selected and all the n-bit second coefficients of m second polynomials in base 2.sup.n are stored in the first register unit; and when m>=2, sequentially adding the m second polynomials in base 2.sup.n in ascending degrees by the number (m−1) of polynomial adders arranged in sequential order such that like terms of the m second polynomials in base 2.sup.n are lined up and added to generate and store the final product having 2m digits in base 2.sup.n in the second register unit; wherein each polynomial adder comprises an m-bit adder and n half adders in a carry-chained configuration; and wherein each of the multiplicand and the multiplier has m digits in base 2.sup.n and both n and m are integers greater than 0.

19. The operating method according to claim 18, wherein the number 2.sup.2n of 2n-bit operand symbols hardwired in the first ROM array and the number 2.sup.2n of 2n-bit response symbols hardwired in the second ROM array define an n-bit by n-bit multiplication table.

20. The operating method according to claim 18, wherein a number of terms in each first polynomial is m and a highest degree for the m first polynomials is (2m−2), wherein a number of terms in each second polynomial is (m+1) and a highest degree for the m second polynomials is (2m−1).

21. The operating method according to claim 18, wherein the step of sequentially adding the least significant digit comprises: (1) storing a least significant digit of the first coefficient of a smallest degree term in the one first polynomial of degree k as the n-bit second coefficient of a smallest degree term in a corresponding second polynomial of degree (k+1) in the first register unit; (2) adding a least significant digit of the first coefficient of a second smallest degree term and a most significant digit of the first coefficient of the smallest degree term in the one first polynomial of degree k to produce a carry digit and store the n-bit second coefficient of the second smallest degree term in the corresponding second polynomial of degree (k+1) in the first register unit by a least significant n-bit adder of the (m−1) n-bit adders; (3) adding a carry digit from its less significant n-bit adder, the least significant digit of the first coefficient of a target term (i.sup.th) and the most significant digit of the first coefficient of its immediately-previous term ((i−1).sup.th) in the one first polynomial of degree k to produce a carry digit and store the n-bit second coefficient of a corresponding term (i.sup.th) in its corresponding second polynomial of degree (k+1) in the first register unit by a corresponding n-bit adder of the (m−1) n-bit adders; (4) repeating step (3) until the (m−1) n-bit adders are completed, where i is increased from 2 to (m−1); and (5) adding a carry digit from a most significant n-bit adder and the most significant digit of the first coefficient of a largest degree term in the one first polynomial of degree k to produce and store the n-bit second coefficient of a largest degree term in the corresponding second polynomial of degree (k+1) in the first register unit by the n half adders, where k ranges from (m−1) to (2m−2).

22. The operating method according to claim 18, wherein the step of sequentially adding the m second polynomials comprises: (a) storing a constant term of the second polynomial of degree m as a least significant digit of the final product in the second register unit; (b) lining up and adding the n-bit second coefficients of m larger degree terms of the second polynomial of degree m and all the n-bit second coefficients of a second polynomial of degree (m+1) by a least significant polynomial adder of the number (m−1) of polynomial adders to obtain sum coefficients of a sum polynomial of degree (m+1) and store the sum coefficient of a smallest degree term of the sum polynomial of degree (m+1) as the second least significant digit of the final product in the second register unit; (c) when m>=4, lining up and adding the sum coefficients of m larger degree terms of the sum polynomial of degree j and all the n-bit second coefficients of the second polynomial of degree (j+1) by a corresponding polynomial adder of the number (m−1) of polynomial adders to obtain sum coefficients of a sum polynomial of degree (j+1) and store the sum coefficient of a smallest degree term of the sum polynomial of degree (j+1) as a corresponding digit of the final product in the second register unit; (d) when m>=4, repeating step (c) until a total of (m−2) polynomial adders out of the number (m−1) of polynomial adders are completed, where j is increased from (m+1) to (2m−3); and (e) when m>=3, lining up and adding the sum coefficients of m larger degree terms of the sum polynomial of degree (2m−2) and all the n-bit second coefficients of the second polynomial of degree (2m−1) by a most significant polynomial adder of the number (m−1) of polynomial adders to obtain and store all the sum coefficients of a sum polynomial of degree (2m−1) as the (m+1) most significant digits of the final product in the second register unit.

23. The operating method according to claim 18, wherein the step of comparing comprises: parallel comparing the first n-bit digit and the second n-bit digit with the number 2.sup.2n of 2n-bit operand symbols hardwired in the first ROM array comprising 2.sup.2n rows by 2n columns of first memory cells so that each row of first memory cells generates an indication signal indicative of whether the first n-bit digit and the second n-bit digit match its hardwired 2n-bit operand symbol; respectively applying a number 2.sup.2n of switching signals to a number 2.sup.2n of wordlines in the second ROM array comprising 2.sup.2n rows by 2n columns of second memory cells according to a number 2.sup.2n of indication signals, wherein the number 2.sup.2n of 2n-bit response symbols are hardwired in the 2.sup.2n rows of second memory cells; and switching on a row of second memory cells to output its hardwired 2n-bit response symbol as a 2n-bit product code in response to a received activated switching signal.

24. An operating method of an in-memory multiplication device that performs multiplication on a multiplicand and a multiplier to generate a final product, the in-memory multiplication device comprising a number m.sup.2 of in-memory multiplier units, a number m of binary adder devices, a register unit and zero or a number (m−1) of polynomial adders, the method comprising the steps of: comparing a first n-bit digit and a second n-bit digit respectively selected from the multiplicand and the multiplier with a number 2.sup.2n of 2n-bit operand symbols hardwired in a first read-only-memory (ROM) array to output one of a number 2.sup.2n of 2n-bit response symbols hardwired in a second ROM array as a 2n-bit product code by each in-memory multiplier unit comprising the first ROM array and the second ROM array so that a number m.sup.2 of 2n-bit product codes are outputted in parallel from the number m.sup.2 of in-memory multiplier units, wherein the number m.sup.2 of 2n-bit product codes serve as first coefficients of a number m of first polynomials in base 2.sup.n and the first coefficients of each first polynomial in base 2.sup.n are associated with multiplication of the multiplicand with a corresponding digit of the multiplier; when m>=2, sequentially adding a most significant digit of the first coefficient of a less degree term and a least significant digit of the first coefficient of a larger degree term adjacent to the less degree term for each first polynomial in ascending degree by each binary adder device comprising (m−1) n-bit adders and n half adders in a carry-chained configuration so that the 2n-bit first coefficients of the m first polynomials in base 2.sup.n are converted into n-bit second coefficients of m second polynomials in base 2.sup.n in parallel by the m binary adder devices; and when m>=2, sequentially adding the m second polynomials in base 2.sup.n in ascending degrees by the number (m−1) of polynomial adders arranged in sequential order such that like terms of the m second polynomials in base 2.sup.n are lined up and added to generate and store a final product having 2m digits in base 2.sup.n in the register unit, wherein each polynomial adder comprises a (m×n)-bit adder and n half adders in a carry-chained configuration; wherein the number m.sup.2 of in-memory multiplier units and the number m of binary adder devices are respectively arranged in a parallel configuration and each of the multiplicand and the multiplier has m digits in base 2.sup.n and both n and m are integers greater than 0.

25. The operating method according to claim 24, wherein the number 2.sup.2n of 2n-bit operand symbols hardwired in the first ROM array and the number 2.sup.2n of 2n-bit response symbols hardwired in the second ROM array define an n-bit by n-bit multiplication table.

26. The operating method according to claim 24, wherein a number of terms in each first polynomial is m and a highest degree for the m first polynomials is (2m−2), wherein a number of terms in each second polynomial is (m+1) and a highest degree for the m second polynomials is (2m−1).

27. The operating method according to claim 24, wherein the step of sequentially adding the most significant digit comprises: at a binary adder device (k−3) of the number m of binary adder devices, (1) adding a least significant digit of the first coefficient of a second smallest degree term and a most significant digit of the first coefficient of a smallest degree term in the first polynomial of degree k to produce a carry digit and the n-bit second coefficient of the second smallest degree term in the corresponding second polynomial of degree (k+1) by a least significant n-bit adder of the (m−1) n-bit adders; (2) adding a carry digit from its less significant n-bit adder, the least significant digit of the first coefficient of a target term (i.sup.th) and the most significant digit of the first coefficient of its immediately-previous term ((i−1).sup.th) in the first polynomial of degree k to produce a carry digit and the n-bit second coefficient of a corresponding term (i.sup.th) in its corresponding second polynomial of degree (k+1) by a corresponding n-bit adder of the (m−1) n-bit adders; (3) repeating step (2) until the (m−1) n-bit adders are completed, where i is increased from 2 to (m−1); and (4) adding a carry digit from a most significant n-bit adder and a most significant digit of the first coefficient of a largest degree term in the first polynomial of degree k to produce the n-bit second coefficient of a largest degree term in the corresponding second polynomial of degree (k+1) by the n half adders, where k ranges from (m−1) to (2m−2).

28. The operating method according to claim 24, wherein the step of sequentially adding the m second polynomials comprises: (a) storing a constant term of the second polynomial of degree m as a least significant digit of the final product in the register unit; (b) lining up and adding the n-bit second coefficients of m larger degree terms of the second polynomial of degree m and all the n-bit second coefficients of a second polynomial of degree (m+1) by a least significant polynomial adder of the number (m−1) of polynomial adders to obtain sum coefficients of a sum polynomial of degree (m+1) and store the sum coefficient of a smallest degree term of the sum polynomial of degree (m+1) as the second least significant digit of the final product in the register unit; (c) when m>=4, lining up and adding the sum coefficients of m larger degree terms of the sum polynomial of degree j and all the n-bit second coefficients of the second polynomial of degree (j+1) by a corresponding polynomial adder of the number (m−1) of polynomial adders to obtain sum coefficients of a sum polynomial of degree (j+1) and store the sum coefficient of a smallest degree term of the sum polynomial of degree (j+1) as a corresponding digit of the final product in the register unit; (d) when m>=4, repeating step (c) until a total of (m−2) polynomial adders out of the number (m−1) of polynomial adders are completed, where j is increased from (m+1) to (2m−3); and (e) when m>=3, lining up and adding the sum coefficients of m larger degree terms of the sum polynomial of degree (2m−2) and all the n-bit second coefficients of the second polynomial of degree (2m−1) by a most significant polynomial adder of the number (m−1) of polynomial adders to obtain and store all the sum coefficients of a sum polynomial of degree (2m−1) as the (m+1) most significant digits of the final product in the register unit.

29. The operating method according to claim 24, wherein the step of comparing comprises: parallel comparing the first n-bit digit and the second n-bit digit with the number 2.sup.2n of 2n-bit operand symbols hardwired in the first ROM array comprising 2.sup.2n rows by 2n columns of first memory cells so that each row of first memory cells generates an indication signal indicative of whether the first n-bit digit and the second n-bit digit match its hardwired 2n-bit operand symbol; respectively applying a number 2.sup.2n of switching signals to a number 2.sup.2n of wordlines in the second ROM array comprising 2.sup.2n rows by 2n columns of second memory cells according to a number 2.sup.2n of indication signals, wherein the number 2.sup.2n of 2n-bit response symbols are hardwired in the 2.sup.2n rows of second memory cells; and switching on a row of second memory cells to output its hardwired 2n-bit response symbol as a 2n-bit product code in response to a received activated switching signal.
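
As an illustrative aid only, and not as part of the claims, the data flow recited above (table lookup of 2n-bit product codes, conversion of the 2n-bit first coefficients into n-bit second coefficients, and sequential addition of the lined-up second polynomials) can be modeled in software. The Python sketch below is a minimal model under stated assumptions: the function names (build_table, first_polynomial, binary_adder, add_polynomials, multiply) are hypothetical and do not appear in the patent, a dictionary stands in for the two hardwired ROM arrays, and ordinary integer arithmetic stands in for the carry-chained n-bit adders and half adders. It does not model the parallelism or timing of any particular embodiment.

```python
def build_table(n):
    """Model of the n-bit by n-bit multiplication table.

    Key: 2n-bit operand symbol (first digit in the high n bits, second
    digit in the low n bits). Value: 2n-bit response symbol (the product).
    The table has 2**(2n) rows.
    """
    base = 1 << n
    return {(a << n) | b: a * b for a in range(base) for b in range(base)}


def to_digits(x, n, m):
    """Split x into m base-2**n digits, least significant digit first."""
    mask = (1 << n) - 1
    return [(x >> (n * i)) & mask for i in range(m)]


def first_polynomial(a_digits, b_digit, table, n):
    """Look up the m 2n-bit product codes for one multiplier digit."""
    return [table[(a << n) | b_digit] for a in a_digits]


def binary_adder(first_coeffs, n):
    """Convert m 2n-bit first coefficients into (m+1) n-bit second coefficients."""
    mask = (1 << n) - 1
    second = [first_coeffs[0] & mask]      # low digit of the smallest degree term
    carry = 0
    for i in range(1, len(first_coeffs)):
        # low digit of term i + high digit of term i-1 + incoming carry
        s = (first_coeffs[i] & mask) + (first_coeffs[i - 1] >> n) + carry
        second.append(s & mask)
        carry = s >> n
    # largest degree term: high digit of the last coefficient plus the last carry
    second.append((first_coeffs[-1] >> n) + carry)
    return second


def add_polynomials(second_polys, n):
    """Sequentially add the lined-up second polynomials; return 2m product digits."""
    mask = (1 << n) - 1
    running = second_polys[0]
    product = []
    for nxt in second_polys[1:]:
        product.append(running[0])         # smallest degree term is already final
        upper = running[1:]                # the m larger degree terms
        summed, carry = [], 0
        for i, c in enumerate(nxt):
            s = c + (upper[i] if i < len(upper) else 0) + carry
            summed.append(s & mask)
            carry = s >> n
        assert carry == 0                  # the sum always fits in m+1 digits
        running = summed
    product.extend(running)                # last sum supplies the top m+1 digits
    return product


def multiply(x, y, n, m):
    """Multiply two m-digit base-2**n integers using the modeled data flow."""
    table = build_table(n)
    a, b = to_digits(x, n, m), to_digits(y, n, m)
    seconds = [binary_adder(first_polynomial(a, bj, table, n), n) for bj in b]
    digits = add_polynomials(seconds, n)
    return sum(d << (n * i) for i, d in enumerate(digits))
```

For example, under these assumptions, multiply(0xBEEF, 0xCAFE, 4, 4) models a multiplication of two 4-digit hexadecimal operands (n=4, m=4) and returns the same value as 0xBEEF * 0xCAFE. With m=1 no polynomial adder is exercised, which corresponds to the "zero ... polynomial adders" case recited in the claims.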

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) For a better understanding of the present invention and to show how it may be carried into effect, reference will now be made to the following drawings, which show the preferred embodiment of the present invention, in which:

(2) FIG. 1 shows the conventional Von-Neumann computing architecture for a typical Central Processing Unit (CPU).

(3) FIG. 2 shows the n-bit by n-bit multiplication table for two n-bit binary integer number operands.

(4) FIG. 3 shows the multiplication operations of two m-digit base-2.sup.n operands in the form of digit-multiply polynomials and polynomial additions according to the invention.

(5) FIG. 4 shows the schematics for generating the digit-multiply polynomial according to the invention.

(6) FIG. 5 shows the schematic of a Perpetual Digital Perceptron (PDP) base-2.sup.n in-memory multiplier unit for the digit-by-digit multiplication based on the n-bit-by-n-bit multiplication table in FIG. 2.

(7) FIG. 6 shows the schematic of the Input Buffer and Driver Unit 510 according to the PDP in-memory multiplier unit in FIG. 5.

(8) FIG. 7 shows the schematic of 2n-bit by (2.sup.2n)-row CROM array 520 according to the PDP in-memory multiplier unit in FIG. 5.

(9) FIG. 8 shows the schematics of Match-Detector Unit 530 according to the PDP in-memory multiplier unit in FIG. 5.

(10) FIG. 9 shows the schematic of the 2n-bit by (2.sup.2n)-row RROM array 540 according to the PDP in-memory multiplier unit in FIG. 5.

(11) FIG. 10 shows the schematic of a carry-chained binary adder device for the digit/multi-digit multiply polynomial generation according to an embodiment of the invention.

(12) FIG. 11 shows the schematics of digit/multi-digit multiply polynomial additions using (m−1) polynomial adders according to the invention.

(13) FIG. 12a shows the schematic of the first significant polynomial adder 110(1) with inputs connected with the most significant (m*n)-bit outputs of the first polynomial register unit 440(0) and the (m*n+n)-bit outputs of the second polynomial register unit 440(1) according to an embodiment of the invention.

(14) FIG. 12b shows the schematic of the intermediate polynomial adder 110(j) with inputs connected with the most significant (m*n)-bit outputs of the polynomial adder 110(j−1) and the (m*n+n)-bit outputs of the polynomial register unit 440(j) according to an embodiment of the invention.

(15) FIG. 12c shows the schematic of the last polynomial adder 110(m−1) with the inputs connected to the most significant (m*n)-bit outputs of the polynomial adder 110(m−2) and the (m*n+n)-bit outputs of the most significant polynomial register unit 440(m−1), and with the (m*n+n)-bit outputs connected to the most significant (m*n+n)-bit registers in the resultant multiplication register unit 120 according to an embodiment of the invention.

(16) FIG. 13 shows the binary codes of the 4-bit by 4-bit multiplication table stored in the PDP in-memory multiplier unit 141 according to an embodiment of the invention.

(17) FIG. 14 shows the schematic of a four-digit base-2.sup.n in-memory multiplication device with sixteen operational steps according to an embodiment of the invention.

(18) FIG. 15 shows the schematic of a four-digit base-2.sup.n in-memory multiplication device with four operational steps for two 16-bit operands according to an embodiment of the invention.

(19) FIG. 16 shows the schematic of a four-digit base-2.sup.n in-memory multiplication device with one operational step according to an embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

(20) The following detailed description is meant to be illustrative only and not limiting. It is to be understood that other embodiments may be utilized and element changes may be made without departing from the scope of the present invention. Also, it is to be understood that the phraseology and terminology used herein are for the purpose of description and should not be regarded as limiting. Those of ordinary skill in the art will immediately realize that the embodiments of the present invention described herein in the context of methods and schematics are illustrative only and are not intended to be in any way limiting. Other embodiments of the present invention will readily suggest themselves to such skilled persons having the benefits of this disclosure.

(21) To illustrate the idea of m-digit base-2.sup.n in-memory multiplication devices for two m-digit base-2.sup.n integer operands, the embodiments apply 4-digit (m=4) base-2.sup.4 (n=4) in-memory multiplication devices to two 16-bit binary operands (16-bit by 16-bit multiplication). The embodiments are for illustration purposes and shall not be limited to specific values of m and n, which depend on the optimized design environment for the IC chips. For purposes of clarity and ease of description, in the following examples and embodiments, the same components and/or components with the same function are designated with the same reference numerals.
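
The digit decomposition assumed by these embodiments can be sketched in Python. This is an illustration only and not part of the disclosure; the function names to_digits and from_digits are hypothetical. It splits a 16-bit operand into m=4 digits of n=4 bits each, least significant digit first.

```python
def to_digits(value, m=4, n=4):
    """Split an (m*n)-bit integer into m base-2**n digits, least significant first."""
    mask = (1 << n) - 1
    return [(value >> (n * i)) & mask for i in range(m)]


def from_digits(digits, n=4):
    """Recombine base-2**n digits (least significant first) into an integer."""
    return sum(d << (n * i) for i, d in enumerate(digits))


# A 16-bit operand A viewed as the hexadecimal digits A0, A1, A2, A3.
a_digits = to_digits(0xBEEF)
```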

(22) FIG. 13 shows the 4-bit by 4-bit multiplication table, where the first column and first row of the table cells are filled with the 4-bit integer numbers: [0 (0000)b], [1 (0001)b], [2 (0010)b], . . . , [14 (1110)b], [15 (1111)b]. Every other cell is filled with the 8-bit binary code of the product of the number (i−1) in the “i.sup.th”-column and the number (j−1) in the “j.sup.th”-row. For example, the cell (3.sup.rd-column and 7.sup.th-row) for [2 (0010)b]*[6 (0110)b] is filled with the number [12 (00001100)b], the cell (8.sup.th-column and 10.sup.th-row) for [7 (0111)b]*[9 (1001)b] is filled with the number [63 (00111111)b], and so forth. To apply the 4-bit PDP in-memory multiplier unit 141, we store the binary codes of the two input 4-bit integers (the cells in the first row and first column of the multiplication table in FIG. 13) into the 256 rows of the CROM array 520 and correspondingly store the 256 8-bit product binary codes into the 256 rows of the RROM array 540 according to the 4-bit by 4-bit multiplication table in FIG. 13. Given any two 4-bit binary integers as inputs, the 4-bit PDP in-memory multiplier unit 141 outputs the 8-bit binary code of their product.
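
A minimal behavioral sketch of this table lookup, in Python, is given here for illustration only. The names CROM, RROM and pdp_multiply are hypothetical, and the bit order of the 8-bit operand symbol (first digit in the upper 4 bits) is an assumption rather than something the disclosure specifies. Each of the 256 rows is compared with the input pair, and the single matching row selects the hardwired 8-bit product.

```python
N = 4  # bits per digit

# Row r holds an 8-bit operand symbol (assumed layout: a in the upper 4 bits,
# b in the lower 4 bits) and, in the second array, the 8-bit product a*b.
CROM = [(a << N) | b for a in range(1 << N) for b in range(1 << N)]
RROM = [a * b for a in range(1 << N) for b in range(1 << N)]


def pdp_multiply(a, b):
    """Return the 8-bit product code for two 4-bit inputs by row matching."""
    key = (a << N) | b
    # One indication signal per row: 1 where the hardwired symbol matches.
    match = [int(symbol == key) for symbol in CROM]
    row = match.index(1)  # exactly one row matches a valid input pair
    return RROM[row]


code = format(pdp_multiply(7, 9), "08b")  # the 8-bit product code for 7*9
```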

(23) In one embodiment, the 4-digit base-2.sup.4 (hexadecimal) in-memory multiplication device 140 shown in FIG. 14 comprises a single PDP base-2.sup.4 in-memory multiplier unit 141 for obtaining the 8-bit binary multiplication code of two inputted 4-bit operands, an “8 to 128” multiplexer 142 for selecting one of the sixteen 8-bit digit-digit multiply register units 143 as its output, sixteen digit-digit multiply register units 143 for storing the sixteen sets of digit-digit multiply binary codes, four binary adder devices 144(0), 144(1), 144(2), and 144(3) for the generation of four digit/multi-digit polynomials, three polynomial adders 110(1), 110(2), and 110(3) for the polynomial additions, and one 32-bit resultant multiplication register unit 146 for storing the multiplication resultant code (i.e., the final binary product). Each binary adder device 144(j), for j=0, 1, 2, 3, consists of a 4-bit carry-chained binary adder unit 410, two 4-bit carry-chained binary adder units 420 and a 4-bit carry-chained binary adder unit 430.

(24) The 4-digit base-2.sup.4 (hexadecimal) in-memory multiplication device 140 operates as follows: the “8 to 128” multiplexer 142 is set to connect the 8-bit outputs of the PDP base-2.sup.4 in-memory multiplier unit 141 to the designated 8-bit registers in the digit-digit multiply register unit 143 for the inputted digit multiplication A.sub.i*B.sub.j, in one operational step for each i, j=0, 1, 2, 3. The process takes sixteen operational steps to fill the entire 128-bit registers in the digit-digit multiply register unit 143 with the binary codes of the sixteen digit multiplications. Meanwhile, the data voltage signals of the 128-bit registers in the register unit 143 propagate to the four binary adder devices 144(0), 144(1), 144(2) and 144(3) for generating the digit/multi-digit polynomial codes, which are sent to the inputs of the polynomial adders 110(1), 110(2), and 110(3), with the least significant 4 bits sent to the least significant 4-bit registers [m.sub.3:m.sub.0] in the 32-bit resultant multiplication register unit 146. The operation of a first binary adder device 144(0) is mathematically equivalent to converting 8-bit first coefficients of a first polynomial of degree 3 (i.e., A.sub.3*B.sub.0X.sup.3+A.sub.2*B.sub.0X.sup.2+A.sub.1*B.sub.0X.sup.1+A.sub.0*B.sub.0X.sup.0) into 4-bit second coefficients of a second polynomial of degree 4 (i.e., C.sub.4X.sup.4+C.sub.3X.sup.3+C.sub.2X.sup.2+C.sub.1X.sup.1+C.sub.0X.sup.0); the operation of a second binary adder device 144(1) is equivalent to converting 8-bit first coefficients of a first polynomial of degree 4 (A.sub.3*B.sub.1X.sup.4+A.sub.2*B.sub.1X.sup.3+A.sub.1*B.sub.1X.sup.2+A.sub.0*B.sub.1X.sup.1) into 4-bit second coefficients of a second polynomial of degree 5 (C.sub.9X.sup.5+C.sub.8X.sup.4+C.sub.7X.sup.3+C.sub.6X.sup.2+C.sub.5X.sup.1); the operation of a third binary adder device 144(2) is equivalent to converting 8-bit first coefficients of a first polynomial of degree 5 (A.sub.3*B.sub.2X.sup.5+A.sub.2*B.sub.2X.sup.4+A.sub.1*B.sub.2X.sup.3+A.sub.0*B.sub.2X.sup.2) into 4-bit second coefficients of a second polynomial of degree 6 (C.sub.14X.sup.6+C.sub.13X.sup.5+C.sub.12X.sup.4+C.sub.11X.sup.3+C.sub.10X.sup.2); and the operation of a fourth binary adder device 144(3) is equivalent to converting 8-bit first coefficients of a first polynomial of degree 6 (A.sub.3*B.sub.3X.sup.6+A.sub.2*B.sub.3X.sup.5+A.sub.1*B.sub.3X.sup.4+A.sub.0*B.sub.3X.sup.3) into 4-bit second coefficients of a second polynomial of degree 7 (C.sub.19X.sup.7+C.sub.18X.sup.6+C.sub.17X.sup.5+C.sub.16X.sup.4+C.sub.15X.sup.3), where X=2.sup.4. The voltage signals of the digit/multi-digit polynomial codes continue to propagate to the inputs of the three polynomial adders 110(1), 110(2), and 110(3).
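
The coefficient conversion performed by one binary adder device can be sketched behaviorally in Python. This illustrates the arithmetic only, not the gate-level arrangement of the adder units 410, 420 and 430, and the name second_coefficients is hypothetical. Each 8-bit first coefficient A.sub.i*B.sub.j is added to the carry from the lower digit; the low 4 bits become a second coefficient and the high bits carry into the next digit.

```python
DIGIT_BITS = 4
DIGIT_MASK = (1 << DIGIT_BITS) - 1


def second_coefficients(a_digits, b_digit):
    """Convert the four 8-bit products A_i * B_j into five 4-bit coefficients.

    a_digits lists A0..A3 (least significant first). The result lists the five
    second coefficients of one second polynomial, least significant first.
    """
    coeffs = []
    carry = 0
    for a in a_digits:
        total = a * b_digit + carry        # 8-bit first coefficient plus carry
        coeffs.append(total & DIGIT_MASK)  # low 4 bits: one second coefficient
        carry = total >> DIGIT_BITS        # high bits move to the next digit
    coeffs.append(carry)                   # highest-degree second coefficient
    return coeffs


# Second coefficients for A = 0xBEEF (digits F, E, E, B) and B_j = 0xF.
c = second_coefficients([0xF, 0xE, 0xE, 0xB], 0xF)
```

Read as a base-16 number, the five coefficients equal A*B.sub.j, which is why five 4-bit coefficients are always sufficient.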

(25) Meanwhile, with the voltage signals of the 4-bit outputs [p.sub.31:p.sub.01] from the first polynomial adder 110(1) sent to the 4-bit registers [m.sub.7:m.sub.4] in the final 32-bit resultant multiplication register unit 146, the voltage signals of the 16-bit outputs [p.sub.(19)1:p.sub.41] from the first polynomial adder 110(1) propagate to the inputs of the second polynomial adder 110(2). With the voltage signals of the least significant 4-bit outputs [p.sub.32:p.sub.02] from the second polynomial adder 110(2) sent to the 4-bit registers [m.sub.11:m.sub.8] in the final 32-bit resultant multiplication register unit 146, the voltage signals of the 16-bit outputs [p.sub.(19)2:p.sub.42] from the second polynomial adder 110(2) propagate to the inputs of the third polynomial adder 110(3). Finally, the voltage signals of the 20-bit outputs [p.sub.(19)3:p.sub.03] from the third polynomial adder 110(3) reach the 20-bit registers [m.sub.31:m.sub.12] in the final 32-bit resultant multiplication register unit 146. The operations of the polynomial adders 110(1)˜110(3) are mathematically equivalent to lining up and adding like terms of the above second polynomials, whose degrees range from 4 to 7, to obtain third coefficients of a third polynomial of degree 7. Here, the third polynomial has eight terms. After the voltage signals of the entire 32-bit registers have settled, the 32-bit multiplication code for the two 16-bit (4-digit hexadecimal) operands A and B is stored in the final 32-bit resultant multiplication register unit 146. The operation takes the 16 processing steps needed to obtain the sixteen digit-digit products with one single PDP in-memory multiplier unit 141.
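
The chain of three polynomial adders can likewise be sketched in Python as a behavioral model. The name polynomial_adder_chain is hypothetical, and each 20-bit input is taken to be the packed code of one second polynomial, that is, the value A*B.sub.j. Each adder sends its least significant 4 bits to the product register and forwards its upper 16 bits to the next adder, and the last adder sends all 20 bits to [m.sub.31:m.sub.12].

```python
def polynomial_adder_chain(s):
    """Combine four 20-bit polynomial codes S_0..S_3 into a 32-bit product.

    s[j] is assumed to hold the packed second coefficients of the j-th second
    polynomial, i.e. the value A * B_j.
    """
    product = s[0] & 0xF                 # constant term goes to [m3:m0]
    upper = s[0] >> 4                    # most significant 16 bits of S_0
    for j in (1, 2):
        p = upper + s[j]                 # adders 110(1) and 110(2), 20-bit sum
        product |= (p & 0xF) << (4 * j)  # low 4 bits go to the next digit
        upper = p >> 4                   # upper 16 bits go to the next adder
    p = upper + s[3]                     # adder 110(3)
    product |= p << 12                   # all 20 bits go to [m31:m12]
    return product


a, b = 0xBEEF, 0xCAFE
codes = [a * ((b >> (4 * j)) & 0xF) for j in range(4)]
result = polynomial_adder_chain(codes)
```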

(26) In one embodiment, the 4-digit base-2.sup.4 (hexadecimal) in-memory multiplication device 150 shown in FIG. 15 comprises four PDP base-2.sup.4 in-memory multiplier units 141 for obtaining four 8-bit binary multiplication/product codes, a binary adder device 144 for the generation of digit/multi-digit polynomials, a “20 to 80” multiplexer 152 for selecting one of the digit/multi-digit multiply polynomial register units 153, four digit/multi-digit multiply polynomial register units 153(0)˜153(3) for storing 80-bit codes (i.e., the second coefficients C.sub.0˜C.sub.19 of the second polynomials, each second coefficient having 4 bits) of four digit/multi-digit multiply polynomials, three polynomial adders 110(1), 110(2) and 110(3) for the polynomial additions, and one 32-bit resultant multiplication register unit 146 for storing the final multiplication code.

(27) The 4-digit base-2.sup.4 (hexadecimal) in-memory multiplication device 150 operates as follows: the “20 to 80” multiplexer 152 is set to connect the 20-bit outputs of the binary adder device 144, whose inputs come from the four PDP base-2.sup.4 in-memory multiplier units 141, to the inputs of the 20-bit register unit 153(j), where the 20-bit register unit 153(j) stores the second coefficients of the second polynomial C.sub.(4+5*j)X.sup.(j+4)+C.sub.(3+5*j)X.sup.(j+3)+C.sub.(2+5*j)X.sup.(j+2)+C.sub.(1+5*j)X.sup.(j+1)+C.sub.(0+5*j)X.sup.j for j=0, 1, 2, 3. The process takes four operational steps to fill the entire 80-bit registers with the binary codes of four digit/multi-digit multiply polynomials (or the second coefficients (C.sub.0˜C.sub.19) of the four second polynomials) shown in blocks 153(0)˜153(3). The data voltage signals of the 80-bit digit/multi-digit polynomial codes (or the twenty second coefficients (C.sub.0˜C.sub.19)) in the four polynomial register units 153(0)˜153(3) are sent to the inputs of the three polynomial adders 110(1), 110(2), and 110(3), and to the least significant 4-bit inputs of registers [m.sub.3:m.sub.0] in the 32-bit resultant multiplication register unit 146, respectively. Meanwhile, the data voltage signals of the most significant 16 bits (i.e., C.sub.1˜C.sub.4) of the first digit/multi-digit polynomial register unit 153(0) are sent to the 16-bit inputs of the first polynomial adder 110(1), along with the least significant 4 bits (i.e., C.sub.0) sent to the least significant 4-bit registers [m.sub.3:m.sub.0] in the 32-bit resultant multiplication register unit 146. With the voltage signals of the 4-bit outputs [p.sub.31:p.sub.01] from the first polynomial adder 110(1) sent to the 4-bit registers [m.sub.7:m.sub.4] in the final 32-bit resultant multiplication register unit 146, the voltage signals of the 16-bit outputs [p.sub.(19)1:p.sub.41] propagate into the inputs of the second polynomial adder 110(2). With the voltage signals of the 4-bit outputs [p.sub.32:p.sub.02] from the second polynomial adder 110(2) sent to the 4-bit registers [m.sub.11:m.sub.8] in the final 32-bit resultant multiplication register unit 146, the voltage signals of the 16-bit outputs [p.sub.(19)2:p.sub.42] propagate into the inputs of the third polynomial adder 110(3). Finally, the voltage signals of the 20-bit outputs [p.sub.(19)3:p.sub.03] from the third polynomial adder 110(3) reach the 20-bit registers [m.sub.31:m.sub.12] in the final 32-bit resultant multiplication register unit 146. After the voltage signals of the entire 32-bit registers have settled, the 32-bit multiplication code for the two 16-bit (4-digit hexadecimal) operands A and B is stored in the final 32-bit resultant multiplication register unit 146. The operation takes the 4 processing steps needed to obtain the four digit/multi-digit multiply polynomials with four PDP in-memory multiplier units 141.
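
The four-step schedule of this embodiment can be sketched in Python as follows. This is a behavioral illustration only; the name multiply_four_steps and the list standing in for the register units 153(0)˜153(3) are hypothetical. In step j the four multiplier units produce A.sub.i*B.sub.j for i=0 to 3 in parallel, the binary adder device packs them into one 20-bit code, and the multiplexer routes that code to register unit 153(j). The final summation stands in for the polynomial adders, which line up like terms by shifting the j-th code by j digits.

```python
def multiply_four_steps(a, b):
    """Model the four-step device: returns (32-bit product, steps used)."""
    a_digits = [(a >> (4 * i)) & 0xF for i in range(4)]
    registers = [0] * 4  # stands in for the 20-bit register units 153(0)..153(3)
    steps = 0
    for j in range(4):
        b_digit = (b >> (4 * j)) & 0xF
        # Four multiplier units work in parallel on A0*Bj .. A3*Bj.
        products = [a_i * b_digit for a_i in a_digits]
        # Binary adder device: four 8-bit products -> one 20-bit polynomial code.
        registers[j] = sum(p << (4 * i) for i, p in enumerate(products))
        steps += 1
    # Polynomial adders: align like terms (shift by j digits) and add.
    result = sum(code << (4 * j) for j, code in enumerate(registers))
    return result, steps


product, steps = multiply_four_steps(0xBEEF, 0xCAFE)
```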

(28) In one embodiment, the 4-digit base-2.sup.4 (hexadecimal) in-memory multiplication device 160 shown in FIG. 16 comprises sixteen PDP base-2.sup.4 in-memory multiplier units 141 for simultaneously obtaining the 128-bit digit-digit multiply code, four binary adder devices 144(0), 144(1), 144(2), and 144(3) for the generation of four digit/multi-digit polynomials, three polynomial adders 110(1), 110(2) and 110(3) for the polynomial additions, and one 32-bit resultant multiplication register unit 146 for storing the final multiplication code.

(29) The 4-digit base-2.sup.4 (hexadecimal) in-memory multiplication device 160 operates in one step as follows: the voltage signals of the 128-bit digit-digit multiply code are simultaneously generated by the sixteen PDP in-memory multiplier units 141. With the voltage signals of the least significant 4 bits of the digit-digit multiply code (or the second coefficient (C.sub.0) of the second polynomials) sent to the 4-bit registers [m.sub.3:m.sub.0] in the 32-bit resultant multiplication register unit 146, the voltage signals of the most significant 124 bits of the digit-digit multiply code are sent to the inputs of the four binary adder devices 144(0), 144(1), 144(2), and 144(3) for generating the polynomial codes. The voltage signals of the four digit/multi-digit polynomials (or the second coefficients (C.sub.1˜C.sub.19) of the second polynomials) generated by the four binary adder devices 144(0), 144(1), 144(2), and 144(3) then propagate to the inputs of the three polynomial adders 110(1), 110(2), and 110(3). Meanwhile, with the voltage signals of the 4-bit outputs [p.sub.31:p.sub.01] from the first polynomial adder 110(1) sent to the 4-bit registers [m.sub.7:m.sub.4] in the final 32-bit resultant multiplication register unit 146, the voltage signals of the 16-bit outputs [p.sub.(19)1:p.sub.41] from the first polynomial adder 110(1) continue to propagate into the inputs of the second polynomial adder 110(2). With the voltage signals of the 4-bit outputs [p.sub.32:p.sub.02] from the second polynomial adder 110(2) sent to the 4-bit registers [m.sub.11:m.sub.8] in the final 32-bit resultant multiplication register unit 146, the voltage signals of the 16-bit outputs [p.sub.(19)2:p.sub.42] continue to propagate into the inputs of the third polynomial adder 110(3). Finally, the voltage signals of the 20-bit outputs [p.sub.(19)3:p.sub.03] from the third polynomial adder 110(3) reach the 20-bit registers [m.sub.31:m.sub.12] in the final 32-bit resultant multiplication register unit 146. After the voltage signals of the entire 32-bit registers have settled, the 32-bit multiplication code for the two 16-bit (4-digit hexadecimal) operands A and B is stored in the final 32-bit resultant multiplication register unit 146. The operation takes the one processing step needed to obtain the 128-bit digit-digit multiply code from the sixteen PDP in-memory multiplier units 141.
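
The three embodiments trade multiplier hardware against operational steps: one unit with sixteen steps (FIG. 14), four units with four steps (FIG. 15), and sixteen units with one step (FIG. 16). The Python sketch below generalizes this under an assumption that the disclosure does not state as a formula, namely that the m*m digit-digit products are divided evenly among the available units. The name operational_steps is hypothetical.

```python
def operational_steps(m, units):
    """Steps needed when m*m digit products are shared evenly by `units` multipliers.

    Assumption (not stated as a formula in the disclosure): every step keeps
    all units busy, so the unit count must divide m*m.
    """
    total_products = m * m
    if units <= 0 or total_products % units != 0:
        raise ValueError("units must be a positive divisor of m*m")
    return total_products // units


# The three 4-digit embodiments: devices 140, 150 and 160.
steps_by_units = {units: operational_steps(4, units) for units in (1, 4, 16)}
```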

(30) Please note that the above carry-chained binary adder devices/units (100, 410, 420 and 430) are presented as embodiments and not limitations of the invention. In actual implementations, they can be replaced with any other type of binary adder device/unit, such as a carry-save adder or a carry-lookahead adder, and this also falls within the scope of the invention. Please also note that the above CROM array 520 and RROM array 540 are presented as embodiments and not limitations of the invention. In actual implementations, the CROM array 520 and the RROM array 540 can be replaced with any other type of memory array or equivalent logic components, and this also falls within the scope of the invention.

(31) The aforementioned description of the preferred embodiments of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form or to the exemplary embodiments disclosed. Accordingly, the description should be regarded as illustrative rather than restrictive. The embodiment is chosen and described in order to best explain the principles of the invention and its best mode practical application, thereby to enable persons skilled in the art to understand the invention for various embodiments and with various modifications as are suited to the particular use or implementation contemplated. It is intended that the scope of the invention be defined by the claims appended hereto and their equivalents in which all terms are meant in their broadest reasonable sense unless otherwise indicated. The abstract of the disclosure is provided to comply with the rules requiring an abstract, which will allow a searcher to quickly ascertain the subject matter of the technical disclosure of any patent issued from this disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. Any advantages and benefits described may not apply to all embodiments of the invention. It should be appreciated that variations may be made in the embodiments described by persons skilled in the art without departing from the scope of the present invention as defined by the following claims. Moreover, no element and component in the present disclosure is intended to be dedicated to the public regardless of whether the element or component is explicitly recited in the following claims.

Cpc classification

G06F7/523

PHYSICS

G06F7/405

PHYSICS

International classification

G06F7/40

PHYSICS

G06F7/523

PHYSICS
