INPUT SEQUENCE RE-ORDERING METHOD AND INPUT SEQUENCE RE-ORDERING UNIT WITH MULTI INPUT-PRECISION RECONFIGURABLE SCHEME AND PIPELINE SCHEME FOR COMPUTING-IN-MEMORY MACRO IN CONVOLUTIONAL NEURAL NETWORK APPLICATION

Abstract

An input sequence re-ordering method with a multi input-precision reconfigurable scheme and a pipeline scheme for a computing-in-memory macro in a convolutional neural network application is configured to re-order a plurality of multi-bit input signals and includes performing a scanning step and a re-ordering step. The scanning step includes driving a scanner to scan one group of the multi-bit input signals to determine whether an initial value of a plurality of flag signals in one of a plurality of multi-bit section flags is changed to an inverted initial value according to a plurality of bit numbers of the one group of the multi-bit input signals. The re-ordering step includes driving a sorter to select a part of the one group of the multi-bit input signals corresponding to a plurality of the inverted initial values of the flag signals in the one of the multi-bit section flags.

Claims

1. An input sequence re-ordering method with a multi input-precision reconfigurable scheme and a pipeline scheme for a computing-in-memory macro in a convolutional neural network application, which is configured to re-order a plurality of multi-bit input signals, and the input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application comprising: performing a scanning step, wherein the scanning step comprises driving a scanner to scan one group of the multi-bit input signals to determine whether an initial value of one of a plurality of flag signals in one of a plurality of multi-bit section flags is changed to an inverted initial value according to a plurality of bit numbers of the one group of the multi-bit input signals, and the initial value is different from the inverted initial value; and performing a re-ordering step, wherein the re-ordering step comprises driving a sorter to select a part of the one group of the multi-bit input signals corresponding to a plurality of the inverted initial values of the flag signals in the one of the multi-bit section flags, and then transmit the part of the one group of the multi-bit input signals to the computing-in-memory macro.

2. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 1, further comprising: performing an initializing step, wherein the initializing step comprises driving the sorter to initialize each of the flag signals in each of the multi-bit section flags to the initial value, and a number of the flag signals is equal to a number of the multi-bit input signals; wherein the initializing step, the scanning step and the re-ordering step are performed in sequence.

3. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 1, wherein each of the multi-bit input signals has eight bits, the initial value is equal to 0, the inverted initial value is equal to 1, the multi-bit input signals comprise a first input sub-group, a second input sub-group and a third input sub-group, and the first input sub-group and the second input sub-group have six bits and eight bits, respectively.

4. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 1, wherein the multi-bit section flags comprise: a first multi-bit section flag configured to label at least one N-bit input signal of the multi-bit input signals, wherein the first multi-bit section flag comprises a plurality of first flag signals, a number of the first flag signals is equal to a number of the multi-bit input signals, the first flag signals are corresponding to the multi-bit input signals, respectively, the at least one N-bit input signal is represented by an N bit value, and N is a positive integer; and a second multi-bit section flag configured to label at least one M-bit input signal of the multi-bit input signals, wherein the second multi-bit section flag comprises a plurality of second flag signals, a number of the second flag signals is equal to the number of the multi-bit input signals, the second flag signals are corresponding to the multi-bit input signals, respectively, the at least one M-bit input signal is represented by an M bit value, and M is a positive integer greater than N.

5. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 4, wherein the scanning step further comprises: performing a first scanning sub-step, wherein the first scanning sub-step comprises driving the scanner to scan the one group of the multi-bit input signals to obtain the bit numbers of the one group of the multi-bit input signals; and performing a second scanning sub-step, wherein the second scanning sub-step comprises driving the sorter to determine whether the initial value of one of the first flag signals of the first multi-bit section flag and the second flag signals of the second multi-bit section flag is changed to the inverted initial value according to one of the bit numbers of the one group of the multi-bit input signals.

6. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 5, wherein in the second scanning sub-step, in response to determining that the one of the bit numbers of the one group of the multi-bit input signals is equal to N, the initial value of one of the first flag signals of the first multi-bit section flag is changed to the inverted initial value, the initial value of one of the second flag signals of the second multi-bit section flag is not changed, and a first count number of a first counter of the sorter is added by 1; and in response to determining that the one of the bit numbers of the one group of the multi-bit input signals is equal to M, the initial value of the one of the first flag signals of the first multi-bit section flag is not changed, the initial value of the one of the second flag signals of the second multi-bit section flag is changed to the inverted initial value, and a second count number of a second counter of the sorter is added by 1; wherein the first count number represents a number of a plurality of the first flag signals which are all equal to the inverted initial value in the first multi-bit section flag, and the second count number represents a number of a plurality of the second flag signals which are all equal to the inverted initial value in the second multi-bit section flag.

7. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 6, wherein the re-ordering step further comprises: in response to determining that the first count number reaches a predetermined count number, setting the part of the one group of the multi-bit input signals to a plurality of the multi-bit input signals corresponding to the first flag signals which are all equal to the inverted initial value; and in response to determining that the second count number reaches the predetermined count number, setting the part of the one group of the multi-bit input signals to a plurality of the multi-bit input signals corresponding to the second flag signals which are all equal to the inverted initial value.

8. An input sequence re-ordering method with a multi input-precision reconfigurable scheme and a pipeline scheme for a computing-in-memory macro in a convolutional neural network application, which is configured to re-order a plurality of multi-bit input signals, and the input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application comprising: performing a scanning step, wherein the scanning step comprises driving a scanner to scan one group of the multi-bit input signals to determine whether an initial value of one of a plurality of flag signals in one of a plurality of multi-bit section flags is changed to an inverted initial value according to a plurality of bit numbers of the one group of the multi-bit input signals, and the initial value is different from the inverted initial value; performing a re-ordering step, wherein the re-ordering step comprises driving a sorter to select a part of the one group of the multi-bit input signals corresponding to a plurality of the inverted initial values of the flag signals in the one of the multi-bit section flags, and then transmit the part of the one group of the multi-bit input signals to the computing-in-memory macro; and performing a pipeline step, wherein the pipeline step comprises driving the computing-in-memory macro to perform a multiply-and-accumulate calculation according to the one of the multi-bit section flags and the part of the one group of the multi-bit input signals, and driving the scanner to scan a next group of the multi-bit input signals, and the computing-in-memory macro and the scanner are driven by using the pipeline scheme.

9. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 8, further comprising: performing an initializing step, wherein the initializing step comprises driving the sorter to initialize each of the flag signals in each of the multi-bit section flags to the initial value, and a number of the flag signals is equal to a number of the multi-bit input signals; wherein the initializing step, the scanning step, the re-ordering step and the pipeline step are performed in sequence.

10. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 8, wherein each of the multi-bit input signals has eight bits, the initial value is equal to 0, the inverted initial value is equal to 1, the multi-bit input signals comprise a first input sub-group, a second input sub-group and a third input sub-group, and the first input sub-group and the second input sub-group have six bits and eight bits, respectively.

11. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 8, wherein the multi-bit section flags comprise: a first multi-bit section flag configured to label at least one N-bit input signal of the multi-bit input signals, wherein the first multi-bit section flag comprises a plurality of first flag signals, a number of the first flag signals is equal to a number of the multi-bit input signals, the first flag signals are corresponding to the multi-bit input signals, respectively, the at least one N-bit input signal is represented by an N bit value, and N is a positive integer; and a second multi-bit section flag configured to label at least one M-bit input signal of the multi-bit input signals, wherein the second multi-bit section flag comprises a plurality of second flag signals, a number of the second flag signals is equal to the number of the multi-bit input signals, the second flag signals are corresponding to the multi-bit input signals, respectively, the at least one M-bit input signal is represented by an M bit value, and M is a positive integer greater than N.

12. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 11, wherein the scanning step further comprises: performing a first scanning sub-step, wherein the first scanning sub-step comprises driving the scanner to scan the one group of the multi-bit input signals to obtain the bit numbers of the one group of the multi-bit input signals; and performing a second scanning sub-step, wherein the second scanning sub-step comprises driving the sorter to determine whether the initial value of one of the first flag signals of the first multi-bit section flag and the second flag signals of the second multi-bit section flag is changed to the inverted initial value according to one of the bit numbers of the one group of the multi-bit input signals.

13. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 12, wherein in the second scanning sub-step, in response to determining that the one of the bit numbers of the one group of the multi-bit input signals is equal to N, the initial value of one of the first flag signals of the first multi-bit section flag is changed to the inverted initial value, the initial value of one of the second flag signals of the second multi-bit section flag is not changed, and a first count number of a first counter of the sorter is added by 1; and in response to determining that the one of the bit numbers of the one group of the multi-bit input signals is equal to M, the initial value of the one of the first flag signals of the first multi-bit section flag is not changed, the initial value of the one of the second flag signals of the second multi-bit section flag is changed to the inverted initial value, and a second count number of a second counter of the sorter is added by 1; wherein the first count number represents a number of a plurality of the first flag signals which are all equal to the inverted initial value in the first multi-bit section flag, and the second count number represents a number of a plurality of the second flag signals which are all equal to the inverted initial value in the second multi-bit section flag.

14. The input sequence re-ordering method with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 13, wherein the re-ordering step further comprises: in response to determining that the first count number reaches a predetermined count number, setting the part of the one group of the multi-bit input signals to a plurality of the multi-bit input signals corresponding to the first flag signals which are all equal to the inverted initial value; and in response to determining that the second count number reaches the predetermined count number, setting the part of the one group of the multi-bit input signals to a plurality of the multi-bit input signals corresponding to the second flag signals which are all equal to the inverted initial value.

15. An input sequence re-ordering unit with a multi input-precision reconfigurable scheme and a pipeline scheme for a computing-in-memory macro in a convolutional neural network application, which is configured to re-order a plurality of multi-bit input signals, and the input sequence re-ordering unit with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application comprising: a sorter electrically connected to the computing-in-memory macro and comprising: a plurality of multi-bit section flags comprising a plurality of flag signals; and a scanner electrically connected to the multi-bit section flags, wherein the scanner is configured to scan one group of the multi-bit input signals to determine whether an initial value of one of the flag signals in one of the multi-bit section flags is changed to an inverted initial value according to a plurality of bit numbers of the one group of the multi-bit input signals, and the initial value is different from the inverted initial value; wherein the sorter is configured to select a part of the one group of the multi-bit input signals corresponding to a plurality of the inverted initial values of the flag signals in the one of the multi-bit section flags, and transmit the part of the one group of the multi-bit input signals to the computing-in-memory macro.

16. The input sequence re-ordering unit with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 15, wherein the sorter further comprises: an input register array electrically connected to the sorter, wherein the input register array is configured to arrange the multi-bit input signals in an array form; wherein the scanner is configured to scan the multi-bit input signals in sequence.

17. The input sequence re-ordering unit with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 15, wherein the multi-bit section flags further comprise: a first multi-bit section flag configured to label at least one N-bit input signal of the multi-bit input signals, wherein the first multi-bit section flag comprises a plurality of first flag signals, a number of the first flag signals is equal to a number of the multi-bit input signals, the first flag signals are corresponding to the multi-bit input signals, respectively, the at least one N-bit input signal is represented by an N bit value, and N is a positive integer; and a second multi-bit section flag configured to label at least one M-bit input signal of the multi-bit input signals, wherein the second multi-bit section flag comprises a plurality of second flag signals, a number of the second flag signals is equal to the number of the multi-bit input signals, the second flag signals are corresponding to the multi-bit input signals, respectively, the at least one M-bit input signal is represented by an M bit value, and M is a positive integer greater than N.

18. The input sequence re-ordering unit with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 17, wherein the scanner is configured to scan the one group of the multi-bit input signals to obtain the bit numbers of the one group of the multi-bit input signals; and the sorter is configured to determine whether the initial value of one of the first flag signals of the first multi-bit section flag and the second flag signals of the second multi-bit section flag is changed to the inverted initial value according to one of the bit numbers of the one group of the multi-bit input signals.

19. The input sequence re-ordering unit with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 18, wherein in the sorter, in response to determining that the one of the bit numbers of the one group of the multi-bit input signals is equal to N, the initial value of one of the first flag signals of the first multi-bit section flag is changed to the inverted initial value, the initial value of one of the second flag signals of the second multi-bit section flag is not changed, and a first count number of a first counter of the sorter is added by 1; and in response to determining that the one of the bit numbers of the one group of the multi-bit input signals is equal to M, the initial value of the one of the first flag signals of the first multi-bit section flag is not changed, the initial value of the one of the second flag signals of the second multi-bit section flag is changed to the inverted initial value, and a second count number of a second counter of the sorter is added by 1; wherein the first count number represents a number of a plurality of the first flag signals which are all equal to the inverted initial value in the first multi-bit section flag, and the second count number represents a number of a plurality of the second flag signals which are all equal to the inverted initial value in the second multi-bit section flag.

20. The input sequence re-ordering unit with the multi input-precision reconfigurable scheme and the pipeline scheme for the computing-in-memory macro in the convolutional neural network application of claim 19, wherein in the sorter, in response to determining that the first count number reaches a predetermined count number, the part of the one group of the multi-bit input signals is set to a plurality of the multi-bit input signals corresponding to the first flag signals which are all equal to the inverted initial value; and in response to determining that the second count number reaches the predetermined count number, the part of the one group of the multi-bit input signals is set to a plurality of the multi-bit input signals corresponding to the second flag signals which are all equal to the inverted initial value.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] The present disclosure can be more fully understood by reading the following detailed description of the embodiment, with reference made to the accompanying drawings as follows:

[0009] FIG. 1 shows a flow chart of an input sequence re-ordering method with a multi input-precision reconfigurable scheme and a pipeline scheme for a computing-in-memory (CIM) macro in a convolutional neural network (CNN) application according to a first embodiment of the present disclosure.

[0010] FIG. 2 shows a flow chart of an input sequence re-ordering method with a multi input-precision reconfigurable scheme and a pipeline scheme for a CIM macro in a CNN application according to a second embodiment of the present disclosure.

[0011] FIG. 3 shows a schematic view of a scanner scanning a first multi-bit input signal of one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0012] FIG. 4 shows a schematic view of the scanner scanning a second multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0013] FIG. 5 shows a schematic view of the scanner scanning a third multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0014] FIG. 6 shows a schematic view of the scanner scanning a fourth multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0015] FIG. 7 shows a schematic view of the scanner scanning a fifth multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0016] FIG. 8 shows a schematic view of the scanner scanning a sixth multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0017] FIG. 9 shows a schematic view of the scanner scanning a seventh multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0018] FIG. 10 shows a schematic view of the scanner scanning an eighth multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0019] FIG. 11 shows a schematic view of the scanner scanning a ninth multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0020] FIG. 12 shows a schematic view of the scanner scanning a tenth multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0021] FIG. 13 shows a schematic view of the scanner scanning an eleventh multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0022] FIG. 14 shows a schematic view of the scanner scanning a twelfth multi-bit input signal of the one group of the multi-bit input signals in the input sequence re-ordering method of FIG. 2.

[0023] FIG. 15 shows a flow chart of an input sequence re-ordering method with a multi input-precision reconfigurable scheme and a pipeline scheme for a CIM macro in a CNN application according to a third embodiment of the present disclosure.

[0024] FIG. 16 shows a schematic view of a beginning of a first pipeline stage of a pipeline step of the input sequence re-ordering method of FIG. 15.

[0025] FIG. 17 shows a schematic view of an end of the first pipeline stage of the pipeline step of the input sequence re-ordering method of FIG. 15.

[0026] FIG. 18 shows a schematic view of an end of a second pipeline stage of the pipeline step of the input sequence re-ordering method of FIG. 15.

[0027] FIG. 19 shows a schematic view of an end of a third pipeline stage of the pipeline step of the input sequence re-ordering method of FIG. 15.

[0028] FIG. 20 shows a schematic view of an end of a fourth pipeline stage of the pipeline step of the input sequence re-ordering method of FIG. 15.

[0029] FIG. 21 shows a block diagram of an input sequence re-ordering unit with a multi input-precision reconfigurable scheme and a pipeline scheme for a CIM macro in a CNN application according to a fourth embodiment of the present disclosure.

[0030] FIG. 22 shows a schematic view of the input sequence re-ordering unit, a word line driver and the CIM macro.

[0031] FIG. 23 shows a schematic view of one of a plurality of word line driving units of the word line driver of FIG. 22.

[0032] FIG. 24 shows a comparison result of throughput between the conventional method and the input sequence re-ordering method of the present disclosure.

DETAILED DESCRIPTION

[0033] The embodiments will be described with reference to the drawings. For clarity, some practical details are described below. However, it should be noted that the present disclosure is not limited by these practical details; that is, in some embodiments, the practical details are unnecessary. In addition, to simplify the drawings, some conventional structures and elements are illustrated schematically, and repeated elements may be represented by the same labels.

[0034] It will be understood that when an element (or device) is referred to as being “connected to” another element, it can be directly connected to the other element, or it can be indirectly connected to the other element, that is, intervening elements may be present. In contrast, when an element is referred to as being “directly connected to” another element, there are no intervening elements present. In addition, although the terms first, second, third, etc. are used herein to describe various elements or components, these elements or components should not be limited by these terms. Consequently, a first element or component discussed below could be termed a second element or component.

[0035] Before describing any embodiments in detail, some terms used in the following are described. A voltage level of “1” represents that the voltage is equal to a power supply voltage VDD. The voltage level of “0” represents that the voltage is equal to a ground voltage GND. A PMOS transistor and an NMOS transistor represent a P-type MOS transistor and an N-type MOS transistor, respectively. Each transistor has a source, a drain and a gate.

[0036] FIG. 1 shows a flow chart of an input sequence re-ordering method 100 with a multi input-precision reconfigurable scheme and a pipeline scheme for a computing-in-memory (CIM) macro in a convolutional neural network (CNN) application according to a first embodiment of the present disclosure. The input sequence re-ordering method 100 with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro in the CNN application is configured to re-order a plurality of multi-bit input signals. The multi input-precision reconfigurable scheme represents re-ordering the multi-bit input signals to generate a part of one group of the multi-bit input signals sent to the CIM macro, so that the part of the one group of the multi-bit input signals sent to the CIM macro can be processed with a same bit precision. The pipeline scheme represents performing a multiply-and-accumulate (MAC) calculation in the CIM macro while simultaneously scanning a next group of the multi-bit input signals in an input sequence re-ordering unit. The input sequence re-ordering method 100 with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro in the CNN application includes performing a scanning step S02 and a re-ordering step S04.

[0037] The scanning step S02 includes driving a scanner to scan one group of the multi-bit input signals to determine whether an initial value of one of a plurality of flag signals in one of a plurality of multi-bit section flags is changed to an inverted initial value according to a plurality of bit numbers of the one group of the multi-bit input signals, and the initial value is different from the inverted initial value. The re-ordering step S04 includes driving a sorter to select a part of the one group of the multi-bit input signals corresponding to a plurality of the inverted initial values of the flag signals in the one of the multi-bit section flags, and then transmit the part of the one group of the multi-bit input signals to the CIM macro. Accordingly, the input sequence re-ordering method 100 with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro in the CNN application of the present disclosure can re-group the multi-bit input signals of a certain precision before feeding them into the CIM macro, so that the part of the one group of the multi-bit input signals sent to the CIM macro can be processed with the same bit precision to maximize utilization. In addition, the input sequence re-ordering method 100 of the present disclosure may use the scanner to preprocess the multi-bit input signals sent to the CIM macro, thereby shortening the total time of the entire operation process and relieving the burden on the CIM macro in processing multi-bit operations.

[0038] FIG. 2 shows a flow chart of an input sequence re-ordering method 100a with a multi input-precision reconfigurable scheme and a pipeline scheme for a CIM macro in a CNN application according to a second embodiment of the present disclosure. The input sequence re-ordering method 100a with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro in the CNN application is configured to re-order a plurality of multi-bit input signals and includes performing an initializing step S12, a scanning step S14 and a re-ordering step S16. The initializing step S12, the scanning step S14 and the re-ordering step S16 are performed in sequence.

[0039] The initializing step S12 includes driving a sorter to initialize each of a plurality of flag signals in each of a plurality of multi-bit section flags to an initial value, and a number of the flag signals is equal to a number of the multi-bit input signals. A first count number of a first counter and a second count number of a second counter of the sorter are set to 0. A predetermined count number is set to 8. The multi-bit section flags include a first multi-bit section flag and a second multi-bit section flag. The first multi-bit section flag is configured to label at least one N-bit input signal of the multi-bit input signals. The first multi-bit section flag includes a plurality of first flag signals. The number of the first flag signals is equal to the number of the multi-bit input signals, and the first flag signals correspond to the multi-bit input signals, respectively. The at least one N-bit input signal is represented by an N-bit value, and N is a positive integer. The second multi-bit section flag is configured to label at least one M-bit input signal of the multi-bit input signals. The second multi-bit section flag includes a plurality of second flag signals. The number of the second flag signals is equal to the number of the multi-bit input signals, and the second flag signals correspond to the multi-bit input signals, respectively. The at least one M-bit input signal is represented by an M-bit value, and M is a positive integer greater than N.
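For illustration only, the state set up by the initializing step S12 may be modeled in software as follows. This is a minimal sketch and not the disclosed circuit: the class name SorterState, its field names, and the choice of twelve input signals are hypothetical, while the initial value of 0, the two counters reset to 0, and the predetermined count number of 8 follow the paragraph above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SorterState:
    """Hypothetical software model of the sorter state after step S12."""
    num_inputs: int                 # number of multi-bit input signals in the group
    initial_value: int = 0          # value every flag signal starts with
    predetermined_count: int = 8    # count that triggers the re-ordering step
    first_flags: List[int] = field(default_factory=list)   # labels N-bit inputs
    second_flags: List[int] = field(default_factory=list)  # labels M-bit inputs
    first_count: int = 0
    second_count: int = 0

    def initialize(self) -> None:
        # One flag signal per input signal in each multi-bit section flag.
        self.first_flags = [self.initial_value] * self.num_inputs
        self.second_flags = [self.initial_value] * self.num_inputs
        # Both counters start from 0.
        self.first_count = 0
        self.second_count = 0


state = SorterState(num_inputs=12)  # 12 inputs is an assumed group size
state.initialize()
```

Keeping one flag list per precision, each as long as the input group, mirrors the statement that the number of flag signals equals the number of multi-bit input signals.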

[0040] The scanning step S14 includes performing a first scanning sub-step S142 and a second scanning sub-step S144. The first scanning sub-step S142 includes driving a scanner to scan one group of the multi-bit input signals to obtain a plurality of bit numbers of the one group of the multi-bit input signals. The second scanning sub-step S144 includes driving the sorter to determine whether the initial value of one of the first flag signals of the first multi-bit section flag and the second flag signals of the second multi-bit section flag is changed to the inverted initial value according to one of the bit numbers of the one group of the multi-bit input signals. In the second scanning sub-step S144, in response to determining that the one of the bit numbers of the one group of the multi-bit input signals is equal to N, the initial value of one of the first flag signals of the first multi-bit section flag is changed to the inverted initial value, the first count number of the first counter of the sorter is added by 1, and the initial value of one of the second flag signals of the second multi-bit section flag is not changed. In response to determining that the one of the bit numbers of the one group of the multi-bit input signals is equal to M, the initial value of the one of the first flag signals of the first multi-bit section flag is not changed, the initial value of the one of the second flag signals of the second multi-bit section flag is changed to the inverted initial value, and the second count number of the second counter of the sorter is added by 1. The first count number represents a number of a plurality of the first flag signals which are all equal to the inverted initial value in the first multi-bit section flag, and the second count number represents a number of a plurality of the second flag signals which are all equal to the inverted initial value in the second multi-bit section flag.

[0041] The re-ordering step S16 includes driving the sorter to select a part of the one group of the multi-bit input signals corresponding to a plurality of the inverted initial values of the flag signals in the one of the multi-bit section flags, and then transmit the part of the one group of the multi-bit input signals to the CIM macro. The re-ordering step S16 further includes in response to determining that the first count number reaches a predetermined count number, setting the part of the one group of the multi-bit input signals to a plurality of the multi-bit input signals corresponding to the first flag signals which are all equal to the inverted initial value. The re-ordering step S16 further includes in response to determining that the second count number reaches the predetermined count number, setting the part of the one group of the multi-bit input signals to a plurality of the multi-bit input signals corresponding to the second flag signals which are all equal to the inverted initial value.
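
The initializing step S12, the scanning step S14 and the re-ordering step S16 can be sketched in software as follows. This is a minimal behavioral sketch, not the disclosed circuit. The function name `reorder_one_group`, its parameters, the use of Python's `int.bit_length()` as the bit number of a signal, and the ranges N=1-6 and M=7-8 (borrowed from the example of FIGS. 3-14) are assumptions made for illustration only.

```python
def reorder_one_group(inputs, start=0, predetermined_count=8, n_max=6, m_max=8):
    """Behavioral sketch of steps S12, S14 and S16 (all names are illustrative).

    inputs: list of unsigned integers standing in for the multi-bit input signals.
    Returns (section, selected, next_start): section is "N", "M" or None,
    selected holds the 0-based indices of the flagged signals, and next_start
    is the index at which a later scan would resume.
    """
    # Initializing step S12: every flag gets the initial value 0, counters are 0.
    first_flags = [0] * len(inputs)
    second_flags = [0] * len(inputs)
    first_count = 0
    second_count = 0

    for i in range(start, len(inputs)):
        # First scanning sub-step S142: obtain the bit number of the signal.
        # Assumption: bit number = position of the most significant set bit.
        bits = inputs[i].bit_length()

        # Second scanning sub-step S144: invert at most one flag per signal.
        if 1 <= bits <= n_max:
            first_flags[i] = 1
            first_count += 1
        elif n_max < bits <= m_max:
            second_flags[i] = 1
            second_count += 1
        # A zero-valued signal leaves both flags at the initial value.

        # Re-ordering step S16: once a counter reaches the predetermined count,
        # the signals whose flags were inverted form the part sent to the macro.
        if first_count == predetermined_count:
            return "N", [j for j, f in enumerate(first_flags) if f], i + 1
        if second_count == predetermined_count:
            return "M", [j for j, f in enumerate(second_flags) if f], i + 1

    # Neither counter reached the predetermined count within the inputs.
    return None, [], len(inputs)
```

In this sketch the scan stops as soon as either counter reaches the predetermined count, so at most one section flag is ever used to select the part of the group.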

[0042] FIGS. 3-14 show schematic views of the scanner scanning a first multi-bit input signal IN.sub.1 through a twelfth multi-bit input signal IN.sub.12, respectively, of the one group (e.g., IN.sub.1-IN.sub.12) of the multi-bit input signals IN.sub.1-IN.sub.256 in the input sequence re-ordering method 100a of FIG. 2. In other words, FIG. 3 corresponds to the first multi-bit input signal IN.sub.1, FIG. 4 corresponds to the second multi-bit input signal IN.sub.2, and so on in order, until FIG. 14 corresponds to the twelfth multi-bit input signal IN.sub.12.

[0043] In FIGS. 3-14, each of the multi-bit input signals IN.sub.1-IN.sub.256 in an input register array 500 has eight bits. The initial value is equal to 0, and the inverted initial value is equal to 1. The multi-bit input signals IN.sub.1-IN.sub.256 include a first input sub-group (i.e., all N-bit input signals in the one group where N=1-6), a second input sub-group (i.e., all M-bit input signals in the one group where M=7-8) and a third input sub-group (i.e., input signals=0). The first input sub-group and the second input sub-group have six bits and eight bits, respectively. In other words, the first input sub-group, the second input sub-group and the third input sub-group are regarded as a low-byte group (6-bit group), a high-byte group (8-bit group) and a zero group, respectively. The first input sub-group includes the multi-bit input signals IN.sub.2, IN.sub.4, IN.sub.6-IN.sub.10, IN.sub.12. The second input sub-group includes the multi-bit input signals IN.sub.1, IN.sub.5. The third input sub-group includes the multi-bit input signals IN.sub.3, IN.sub.11. In addition, the multi-bit section flags 200 include a first multi-bit section flag 210 and a second multi-bit section flag 220. The first multi-bit section flag 210 is represented by “6-bit section flag” and configured to label at least one N-bit input signal of the multi-bit input signals IN.sub.1-IN.sub.256. The second multi-bit section flag 220 is represented by “8-bit section flag” and configured to label at least one M-bit input signal of the multi-bit input signals IN.sub.1-IN.sub.256. The first multi-bit section flag 210 includes a plurality of first flag signals F1.sub.1-F1.sub.256. The number of the first flag signals F1.sub.1-F1.sub.256 is equal to a number of the multi-bit input signals IN.sub.1-IN.sub.256. The first flag signals F1.sub.1-F1.sub.256 are corresponding to the multi-bit input signals IN.sub.1-IN.sub.256, respectively. The at least one N-bit input signal is represented by an N bit value, and N is a positive integer. The second multi-bit section flag 220 includes a plurality of second flag signals F2.sub.1-F2.sub.256. The number of the second flag signals F2.sub.1-F2.sub.256 is equal to the number of the multi-bit input signals IN.sub.1-IN.sub.256, and the second flag signals F2.sub.1-F2.sub.256 are corresponding to the multi-bit input signals IN.sub.1-IN.sub.256, respectively. The at least one M-bit input signal is represented by an M bit value, and M is a positive integer greater than N. In one embodiment, N is equal to 1, 2, 3, 4, 5 or 6. M is equal to 7 or 8. Each of the first flag signals F1.sub.1-F1.sub.256 and the second flag signals F2.sub.1-F2.sub.256 has one bit, i.e., 1-bit flag. Each of the first multi-bit section flag 210 and the second multi-bit section flag 220 has 256×1 bit.

[0044] In FIGS. 2 and 3, the scanner is driven to scan the first multi-bit input signal IN.sub.1 to obtain a bit number of the first multi-bit input signal IN.sub.1 (i.e., 8 bit), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.1 of the first multi-bit section flag 210 and the second flag signal F2.sub.1 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the first multi-bit input signal IN.sub.1. Because the bit number of the first multi-bit input signal IN.sub.1 is equal to M, the initial value of the first flag signal F1.sub.1 of the first multi-bit section flag 210 is not changed, and the initial value of the second flag signal F2.sub.1 of the second multi-bit section flag 220 is changed to the inverted initial value. The second count number Counter8 of the second counter of the sorter is added by 1 (i.e., Counter8=1).

[0045] In FIGS. 2 and 4, the scanner is driven to scan the second multi-bit input signal IN.sub.2 to obtain a bit number of the second multi-bit input signal IN.sub.2 (i.e., 2 bit), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.2 of the first multi-bit section flag 210 and the initial value of the second flag signal F2.sub.2 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the second multi-bit input signal IN.sub.2. Because the bit number of the second multi-bit input signal IN.sub.2 is equal to N, the initial value of the first flag signal F1.sub.2 of the first multi-bit section flag 210 is changed to the inverted initial value, and the initial value of one of the second flag signals F2.sub.2 of the second multi-bit section flag 220 is not changed. The first count number Counter6 of the first counter of the sorter is added by 1 (i.e., Counter6=1).

[0046] In FIGS. 2 and 5, the scanner is driven to scan the third multi-bit input signal IN.sub.3 to obtain a bit number of the third multi-bit input signal IN.sub.3 (i.e., 0), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.3 of the first multi-bit section flag 210 and the initial value of the second flag signal F2.sub.3 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the third multi-bit input signal IN.sub.3. Because the third multi-bit input signal IN.sub.3 is equal to 0, the initial value of the first flag signal F1.sub.3 of the first multi-bit section flag 210 and the initial value of one of the second flag signals F2.sub.3 of the second multi-bit section flag 220 are not changed. The first count number Counter6 of the first counter and the second count number Counter8 of the second counter of the sorter are still maintained (i.e., Counter6=1, and Counter8=1).

[0047] In FIGS. 2 and 6, the scanner is driven to scan the fourth multi-bit input signal IN.sub.4 to obtain a bit number of the fourth multi-bit input signal IN.sub.4 (i.e., 3 bit), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.4 of the first multi-bit section flag 210 and the initial value of the second flag signal F2.sub.4 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the fourth multi-bit input signal IN.sub.4. Because the bit number of the fourth multi-bit input signal IN.sub.4 is equal to N, the initial value of the first flag signal F1.sub.4 of the first multi-bit section flag 210 is changed to the inverted initial value, and the initial value of one of the second flag signals F2.sub.4 of the second multi-bit section flag 220 is not changed. The first count number Counter6 of the first counter of the sorter is added by 1 (i.e., Counter6=2).

[0048] In FIGS. 2 and 7, the scanner is driven to scan the fifth multi-bit input signal IN.sub.5 to obtain a bit number of the fifth multi-bit input signal IN.sub.5 (i.e., 7 bit), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.5 of the first multi-bit section flag 210 and the second flag signal F2.sub.5 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the fifth multi-bit input signal IN.sub.5. Because the bit number of the fifth multi-bit input signal IN.sub.5 is equal to M, the initial value of the first flag signal F1.sub.5 of the first multi-bit section flag 210 is not changed, and the initial value of the second flag signal F2.sub.5 of the second multi-bit section flag 220 is changed to the inverted initial value. The second count number Counter8 of the second counter of the sorter is added by 1 (i.e., Counter8=2).

[0049] In FIGS. 2 and 8, the scanner is driven to scan the sixth multi-bit input signal IN.sub.6 to obtain a bit number of the sixth multi-bit input signal IN.sub.6 (i.e., 2 bit), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.6 of the first multi-bit section flag 210 and the initial value of the second flag signal F2.sub.6 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the sixth multi-bit input signal IN.sub.6. Because the bit number of the sixth multi-bit input signal IN.sub.6 is equal to N, the initial value of the first flag signal F1.sub.6 of the first multi-bit section flag 210 is changed to the inverted initial value, and the initial value of one of the second flag signals F2.sub.6 of the second multi-bit section flag 220 is not changed. The first count number Counter6 of the first counter of the sorter is added by 1 (i.e., Counter6=3).

[0050] In FIGS. 2 and 9, the scanner is driven to scan the seventh multi-bit input signal IN.sub.7 to obtain a bit number of the seventh multi-bit input signal IN.sub.7 (i.e., 6 bit), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.7 of the first multi-bit section flag 210 and the initial value of the second flag signal F2.sub.7 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the seventh multi-bit input signal IN.sub.7. Because the bit number of the seventh multi-bit input signal IN.sub.7 is equal to N, the initial value of the first flag signal F1.sub.7 of the first multi-bit section flag 210 is changed to the inverted initial value, and the initial value of one of the second flag signals F2.sub.7 of the second multi-bit section flag 220 is not changed. The first count number Counter6 of the first counter of the sorter is added by 1 (i.e., Counter6=4).

[0051] In FIGS. 2 and 10, the scanner is driven to scan the eighth multi-bit input signal IN.sub.8 to obtain a bit number of the eighth multi-bit input signal IN.sub.8 (i.e., 5 bit), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.8 of the first multi-bit section flag 210 and the initial value of the second flag signal F2.sub.8 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the eighth multi-bit input signal IN.sub.8. Because the bit number of the eighth multi-bit input signal IN.sub.8 is equal to N, the initial value of the first flag signal F1.sub.8 of the first multi-bit section flag 210 is changed to the inverted initial value, and the initial value of one of the second flag signals F2.sub.8 of the second multi-bit section flag 220 is not changed. The first count number Counter6 of the first counter of the sorter is added by 1 (i.e., Counter6=5).

[0052] In FIGS. 2 and 11, the scanner is driven to scan the ninth multi-bit input signal IN.sub.9 to obtain a bit number of the ninth multi-bit input signal IN.sub.9 (i.e., 2 bit), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.9 of the first multi-bit section flag 210 and the initial value of the second flag signal F2.sub.9 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the ninth multi-bit input signal IN.sub.9. Because the bit number of the ninth multi-bit input signal IN.sub.9 is equal to N, the initial value of the first flag signal F1.sub.9 of the first multi-bit section flag 210 is changed to the inverted initial value, and the initial value of one of the second flag signals F2.sub.9 of the second multi-bit section flag 220 is not changed. The first count number Counter6 of the first counter of the sorter is added by 1 (i.e., Counter6=6).

[0053] In FIGS. 2 and 12, the scanner is driven to scan the tenth multi-bit input signal IN.sub.10 to obtain a bit number of the tenth multi-bit input signal IN.sub.10 (i.e., 1 bit), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.10 of the first multi-bit section flag 210 and the initial value of the second flag signal F2.sub.10 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the tenth multi-bit input signal IN.sub.10. Because the bit number of the tenth multi-bit input signal IN.sub.10 is equal to N, the initial value of the first flag signal F1.sub.10 of the first multi-bit section flag 210 is changed to the inverted initial value, and the initial value of one of the second flag signals F2.sub.10 of the second multi-bit section flag 220 is not changed. The first count number Counter6 of the first counter of the sorter is added by 1 (i.e., Counter6=7).

[0054] In FIGS. 2 and 13, the scanner is driven to scan the eleventh multi-bit input signal IN.sub.11 to obtain a bit number of the eleventh multi-bit input signal IN.sub.11 (i.e., 0), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.11 of the first multi-bit section flag 210 and the initial value of the second flag signal F2.sub.11 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the eleventh multi-bit input signal IN.sub.11. Because the eleventh multi-bit input signal IN.sub.11 is equal to 0, the initial value of the first flag signal F1.sub.11 of the first multi-bit section flag 210 and the initial value of one of the second flag signals F2.sub.11 of the second multi-bit section flag 220 are not changed. The first count number Counter6 of the first counter and the second count number Counter8 of the second counter of the sorter are still maintained (i.e., Counter6=7, and Counter8=2).

[0055] In FIGS. 2 and 14, the scanner is driven to scan the twelfth multi-bit input signal IN.sub.12 to obtain a bit number of the twelfth multi-bit input signal IN.sub.12 (i.e., 4 bit), and the sorter is driven to determine whether the initial value of one of the first flag signal F1.sub.12 of the first multi-bit section flag 210 and the initial value of the second flag signal F2.sub.12 of the second multi-bit section flag 220 is changed to the inverted initial value according to the bit number of the twelfth multi-bit input signal IN.sub.12. Because the bit number of the twelfth multi-bit input signal IN.sub.12 is equal to N, the initial value of the first flag signal F1.sub.12 of the first multi-bit section flag 210 is changed to the inverted initial value, and the initial value of one of the second flag signals F2.sub.12 of the second multi-bit section flag 220 is not changed. The first count number Counter6 of the first counter of the sorter is added by 1 (i.e., Counter6=8).

[0056] In response to determining that the first count number Counter6 reaches the predetermined count number (e.g., 8), the sorter is driven to set the part of the one group of the multi-bit input signals IN.sub.1-IN.sub.256 to a plurality of the multi-bit input signals (e.g., IN.sub.2, IN.sub.4, IN.sub.6-IN.sub.10, IN.sub.12) corresponding to the first flag signals which are all equal to the inverted initial value.
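
The walk-through of FIGS. 3-14 can be replayed as a short script to check the counter values and the selected signals. The bit numbers below are those described for IN.sub.1-IN.sub.12, with IN.sub.7 treated as a 6-bit signal, consistent with its membership of the first input sub-group. The function `replay` and its variable names are illustrative assumptions and not part of the disclosure.

```python
# Bit numbers of IN_1..IN_12 as described for FIGS. 3-14 (0 means the signal is zero).
BIT_NUMBERS = [8, 2, 0, 3, 7, 2, 6, 5, 2, 1, 0, 4]
PREDETERMINED_COUNT = 8


def replay(bit_numbers):
    """Return the (Counter6, Counter8) history and the selected signal numbers."""
    counter6 = 0
    counter8 = 0
    flagged6 = []  # 1-based numbers of signals whose 6-bit section flag became 1
    flagged8 = []  # 1-based numbers of signals whose 8-bit section flag became 1
    history = []
    for number, bits in enumerate(bit_numbers, start=1):
        if 1 <= bits <= 6:
            flagged6.append(number)
            counter6 += 1
        elif 7 <= bits <= 8:
            flagged8.append(number)
            counter8 += 1
        # A zero signal changes neither flag nor counter.
        history.append((counter6, counter8))
        if counter6 == PREDETERMINED_COUNT:
            return history, flagged6
        if counter8 == PREDETERMINED_COUNT:
            return history, flagged8
    return history, []


history, selected = replay(BIT_NUMBERS)
print(history[-1])  # (8, 2)
print(selected)     # [2, 4, 6, 7, 8, 9, 10, 12]
```

The final history entry matches Counter6=8 and Counter8=2 after the twelfth signal, and the selected numbers match IN.sub.2, IN.sub.4, IN.sub.6-IN.sub.10, IN.sub.12.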

[0057] Therefore, the input sequence re-ordering method 100a with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro in the CNN application of the present disclosure can re-group certain precision of the multi-bit input signals IN.sub.1-IN.sub.256 before feeding into the CIM macro, so that the part of the one group of the multi-bit input signals IN.sub.1-IN.sub.256 sent to the CIM macro can utilize the processing principle with the same bit to maximize utilization. In addition, the input sequence re-ordering method 100a of the present disclosure may use the scanner to preprocess the multi-bit input signals IN.sub.1-IN.sub.256 sent to the CIM macro to speed up the total time of the entire operation process and mainly solve the burden of the CIM macro in processing multi-bit operations.

[0058] FIG. 15 shows a flow chart of an input sequence re-ordering method 100b with a multi input-precision reconfigurable scheme and a pipeline scheme for a CIM macro in a CNN application according to a third embodiment of the present disclosure. FIG. 16 shows a schematic view of a beginning (IN.sub.1) of a first pipeline stage (INRE=IN.sub.1-IN.sub.12) of a pipeline step S28 of the input sequence re-ordering method 100b of FIG. 15. FIG. 17 shows a schematic view of an end (IN.sub.12) of the first pipeline stage of the pipeline step S28 of the input sequence re-ordering method 100b of FIG. 15. FIG. 18 shows a schematic view of an end (IN.sub.30) of a second pipeline stage (CIM Macro=IN.sub.1-IN.sub.12, INRE=IN.sub.13-IN.sub.30) of the pipeline step S28 of the input sequence re-ordering method 100b of FIG. 15. FIG. 19 shows a schematic view of an end (IN.sub.X) of a third pipeline stage (CIM Macro=IN.sub.13-IN.sub.30, INRE=IN.sub.31-IN.sub.X) of the pipeline step S28 of the input sequence re-ordering method 100b of FIG. 15. FIG. 20 shows a schematic view of an end (IN.sub.Y) of a fourth pipeline stage (CIM Macro=IN.sub.31-IN.sub.X, INRE=IN.sub.X+1-IN.sub.Y) of the pipeline step S28 of the input sequence re-ordering method 100b of FIG. 15. X and Y are positive values. X is greater than 31, and Y is greater than X+1. In FIGS. 2-20, the input sequence re-ordering method 100b with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro in the CNN application is configured to re-order a plurality of multi-bit input signals IN.sub.1-IN.sub.256 and includes performing an initializing step S22, a scanning step S24, a re-ordering step S26 and the pipeline step S28. The initializing step S22, the scanning step S24, the re-ordering step S26 and the pipeline step S28 are performed in sequence.

[0059] In FIG. 15, the detail of the initializing step S22, the scanning step S24 and the re-ordering step S26 is the same as the embodiments of the initializing step S12, the scanning step S14 and the re-ordering step S16 of FIG. 2, and will not be described again herein. In FIG. 15, the input sequence re-ordering method 100b further includes the pipeline step S28. The pipeline step S28 includes driving the CIM macro to perform a MAC calculation according to the one of the multi-bit section flags 200 and the part of the one group of the multi-bit input signals IN.sub.1-IN.sub.256, and driving the scanner to scan a next group (e.g., IN.sub.13-IN.sub.30) of the multi-bit input signals IN.sub.1-IN.sub.256, and the CIM macro and the scanner are driven by using the pipeline scheme. The pipeline scheme represents performing the MAC calculation in the CIM macro and scanning the next group of the multi-bit input signals IN.sub.1-IN.sub.256 in an input sequence re-ordering unit at the same time. In addition, the scanner does not label all of the multi-bit input signals IN.sub.1-IN.sub.256 at the beginning, but divides the labor with the CIM macro in the pipeline scheme. When the CIM macro performs calculations, the scanner starts to scan and gathers the next group of the multi-bit input signals IN.sub.1-IN.sub.256 to be sent to the CIM macro. When there is one group of the multi-bit input signals IN.sub.1-IN.sub.256 ready to be sent to the CIM macro, the scanner stops operating to hide a scanning time period of the input sequence re-ordering unit into a CIM calculation time period of the CIM macro. For example, in FIG. 18, when the CIM macro performs calculations via a first instruction (i.e., “Instr. No.”=1) at the beginning of a second clock cycle (i.e., “Clock Cycle”=2) in a pipeline stage, the scanner starts to scan and gathers the next group (e.g., IN.sub.13-IN.sub.30) of the multi-bit input signals IN.sub.1-IN.sub.256 to be sent to the CIM macro via a second instruction (i.e., “Instr. No.”=2). When the next group (e.g., IN.sub.13-IN.sub.30) of the multi-bit input signals IN.sub.1-IN.sub.256 is ready to be sent to the CIM macro, the scanner stops operating to hide the scanning time period of the input sequence re-ordering unit into the CIM calculation time period of the CIM macro. The scanning time period of the input sequence re-ordering unit is shorter than the CIM calculation time period of the CIM macro. Therefore, the input sequence re-ordering method 100b with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro in the CNN application of the present disclosure can re-group certain precision of the multi-bit input signals IN.sub.1-IN.sub.256 before feeding into the CIM macro, so that the part of the one group of the multi-bit input signals IN.sub.1-IN.sub.256 sent to the CIM macro can utilize the processing principle with the same bit to maximize utilization. In addition, the input sequence re-ordering method 100b of the present disclosure may use the scanner to preprocess the multi-bit input signals IN.sub.1-IN.sub.256 sent to the CIM macro to speed up the total time of the entire operation process and mainly solve the burden of the CIM macro in processing multi-bit operations.
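
The effect of hiding the scanning time period inside the CIM calculation time period can be illustrated with a simple timing model. This is a sketch under stated assumptions: the function `total_time`, the per-group durations and the time unit are all hypothetical, and the model captures only the overlap between scanning the next group and computing the current group, not the actual hardware timing.

```python
def total_time(scan_times, cim_times, pipelined=True):
    """Total time to process all groups under a simple overlap model.

    scan_times[k]: assumed time for the INRE to scan and gather group k.
    cim_times[k]:  assumed time for the CIM macro to compute group k.
    """
    if len(scan_times) != len(cim_times):
        raise ValueError("one scan time and one CIM time per group are expected")
    if not scan_times:
        return 0
    if not pipelined:
        # Scanning and calculation strictly alternate.
        return sum(scan_times) + sum(cim_times)
    # First pipeline stage: only the INRE works, the CIM macro has no data yet.
    total = scan_times[0]
    for k, cim in enumerate(cim_times):
        # While group k is computed, group k+1 (if any) is scanned in parallel.
        next_scan = scan_times[k + 1] if k + 1 < len(scan_times) else 0
        total += max(cim, next_scan)
    return total


scan = [12, 18, 20]  # hypothetical scan durations for three groups
cim = [40, 40, 40]   # hypothetical CIM calculation durations
print(total_time(scan, cim, pipelined=False))  # 170
print(total_time(scan, cim, pipelined=True))   # 132
```

With every scan shorter than the corresponding CIM calculation, as in these hypothetical numbers, only the first scan adds to the total in the pipelined case.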

[0060] FIG. 21 shows a block diagram of an input sequence re-ordering unit 300 with a multi input-precision reconfigurable scheme and a pipeline scheme for a CIM macro 700 in a CNN application according to a fourth embodiment of the present disclosure. FIG. 22 shows a schematic view of the input sequence re-ordering unit 300, a word line driver 600 and the CIM macro 700. FIG. 23 shows a schematic view of one of a plurality of word line driving units 610 of the word line driver 600 of FIG. 22. In FIGS. 3-14 and 21-23, the input sequence re-ordering unit 300 with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro 700 in the CNN application is configured to re-order a plurality of multi-bit input signals IN.sub.1-IN.sub.256 and is represented by “INRE”. The input sequence re-ordering unit 300 includes a sorter 400 and an input register array 500.

[0061] The sorter 400 is electrically connected to the CIM macro 700 via the word line driver 600. The sorter 400 includes a plurality of multi-bit section flags 200, a scanner 410, a first counter 420 and a second counter 430. The multi-bit section flags 200 include a plurality of flag signals and are electrically connected to the scanner 410. In FIG. 21, the detail of the multi-bit section flags 200 is the same as the embodiment of the multi-bit section flags 200 of FIGS. 3-14, and will not be described again herein. The scanner 410 can be implemented by a finite state machine and represented by “FSM”. The scanner 410 is configured to scan the multi-bit input signals IN.sub.1-IN.sub.256 in sequence. The first counter 420 may generate a first count number Counter6, and the second counter 430 may generate a second count number Counter8. The scanner 410 is configured to scan one group of the multi-bit input signals IN.sub.1-IN.sub.256 to determine whether an initial value of one of the flag signals in one of the multi-bit section flags 200 is changed to an inverted initial value according to a plurality of bit numbers of the one group of the multi-bit input signals IN.sub.1-IN.sub.256, and the initial value is different from the inverted initial value. The scanner 410 is configured to select a part (e.g., IN.sub.2, IN.sub.4, IN.sub.6-IN.sub.10, IN.sub.12) of the one group (e.g., IN.sub.1-IN.sub.12) of the multi-bit input signals IN.sub.1-IN.sub.256 corresponding to a plurality of the inverted initial values of the flag signals in the one of the multi-bit section flags 200, and transmit the part of the one group of the multi-bit input signals IN.sub.1-IN.sub.256 to the CIM macro 700.

[0062] The input register array 500 may be regarded as a buffer. The input register array 500 has 256×8 bits to access the multi-bit input signals IN.sub.1-IN.sub.256. The input register array 500 is electrically connected to the sorter 400. The input register array 500 is configured to arrange the multi-bit input signals IN.sub.1-IN.sub.256 in an array form.

[0063] The input sequence re-ordering unit 300 receives an input signal IN, a register valid signal In_valid_reg, a sorter valid signal In_valid_sorter and a CIM valid signal Busy_cancel. The input signal IN has eight bits and is written into the input register array 500. Each of the register valid signal In_valid_reg, the sorter valid signal In_valid_sorter and the CIM valid signal Busy_cancel has one bit and is transmitted to the input register array 500. The register valid signal In_valid_reg is configured to notify that the input signal IN is a valid signal. The sorter valid signal In_valid_sorter represents the start of scanning. The CIM valid signal Busy_cancel represents the end of calculation in the CIM macro 700. In addition, the input sequence re-ordering unit 300 outputs the multi-bit input signals IN.sub.1-IN.sub.256, a word line enable signal WL_EN, a bit selecting signal Sel_8, an output valid signal Out_valid and a finish signal Done. The word line enable signal WL_EN has 256 bits and is configured to select eight of the multi-bit input signals IN.sub.1-IN.sub.256 which have been labeled. The bits of the word line enable signal WL_EN are corresponding to the multi-bit input signals IN.sub.1-IN.sub.256, respectively. Each of the bit selecting signal Sel_8, the output valid signal Out_valid and the finish signal Done has one bit. The bit selecting signal Sel_8 represents whether the eight of the multi-bit input signals IN.sub.1-IN.sub.256 which have been labeled are represented by 8 bits or 6 bits, e.g., “1” for 8 bits, and “0” for 6 bits. The output valid signal Out_valid is configured to notify the CIM macro 700 that the eight of the multi-bit input signals IN.sub.1-IN.sub.256 which have been labeled can be used to perform a MAC calculation. The finish signal Done is configured to notify that all of the multi-bit input signals IN.sub.1-IN.sub.256 are calculated.
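
The relation between the section flags and the output signals WL_EN, Sel_8 and Out_valid can be sketched as follows. This is a behavioral illustration only: the function `make_outputs`, its arguments and the dictionary layout are hypothetical, and the sketch assumes that the word line enable signal is a copy of whichever section flag has reached the predetermined count.

```python
def make_outputs(first_flags, second_flags, counter6, counter8, predetermined_count=8):
    """Derive WL_EN, Sel_8 and Out_valid from the section flags (behavioral sketch).

    first_flags / second_flags: lists of 0/1 values, one entry per input signal,
    standing in for the 6-bit and 8-bit section flags.
    """
    if counter8 >= predetermined_count:
        # The labeled signals are handled as 8-bit inputs: Sel_8 = 1.
        return {"WL_EN": list(second_flags), "Sel_8": 1, "Out_valid": 1}
    if counter6 >= predetermined_count:
        # The labeled signals are handled as 6-bit inputs: Sel_8 = 0.
        return {"WL_EN": list(first_flags), "Sel_8": 0, "Out_valid": 1}
    # No group is ready yet: nothing is enabled and the output is not valid.
    return {"WL_EN": [0] * len(first_flags), "Sel_8": 0, "Out_valid": 0}


# Example: the 6-bit section flag labels IN_2, IN_4, IN_6-IN_10 and IN_12.
flags6 = [0] * 256
for number in (2, 4, 6, 7, 8, 9, 10, 12):
    flags6[number - 1] = 1
flags8 = [0] * 256
flags8[0] = flags8[4] = 1  # IN_1 and IN_5

out = make_outputs(flags6, flags8, counter6=8, counter8=2)
print(out["Sel_8"], out["Out_valid"], sum(out["WL_EN"]))  # 0 1 8
```

Because the scan stops once one counter reaches the predetermined count, the order in which the two counters are checked does not matter in practice; it is fixed here only to keep the sketch deterministic.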

[0064] The word line driver 600 is represented by “WLD” and includes the word line driving units 610. Each of the word line driving units 610 is corresponding to a word line WL.sub.j and includes an AND gate 612, a level shifter 614, a first inverter 616 and a second inverter 618. The AND gate 612 is electrically connected between the input sequence re-ordering unit 300 and the level shifter 614. The AND gate 612 receives the multi-bit input signal IN.sub.j and one bit (i.e., WL_EN.sub.j) of the word line enable signal WL_EN. When the one bit (i.e., WL_EN.sub.j) of the word line enable signal WL_EN is equal to 1, the word line driving unit 610 transmits the multi-bit input signal IN.sub.j to the word line WL.sub.j. When the one bit (i.e., WL_EN.sub.j) of the word line enable signal WL_EN is equal to 0, the word line WL.sub.j is maintained at 0. In one embodiment, j may be a positive integer from 1 to 256. The level shifter 614 is configured to shift a voltage level generated by the AND gate 612. The first inverter 616 is electrically connected between the level shifter 614 and the second inverter 618.

[0065] The CIM macro 700 includes a plurality of memory cells 710. The memory cells 710 store a plurality of weights and are controlled by the word lines WL.sub.j to generate a plurality of memory cell currents for the MAC calculation. When the part of the one group of the multi-bit input signals IN.sub.1-IN.sub.256 includes the multi-bit input signals IN.sub.2, IN.sub.4, IN.sub.6-IN.sub.10, IN.sub.12, the word lines WL.sub.2, WL.sub.4, WL.sub.6-WL.sub.10, WL.sub.12 transmit the multi-bit input signals IN.sub.2, IN.sub.4, IN.sub.6-IN.sub.10, IN.sub.12, respectively, as shown in FIG. 22.
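The MAC calculation for the selection in this example can be sketched numerically as follows. The sketch is an idealized digital stand-in for the analog accumulation of memory cell currents along a single column; the function name mac_column and the weight and input values are hypothetical and are not taken from the disclosure.

```python
from typing import List


def mac_column(weights: List[int], wl_values: List[int]) -> int:
    """Idealized multiply-accumulate over one column of memory cells.

    weights[j - 1] is the weight stored on word line WL_j and
    wl_values[j - 1] is the value driven on WL_j (0 when not enabled).
    """
    if len(weights) != len(wl_values):
        raise ValueError("weights and wl_values must have the same length")
    return sum(w * x for w, x in zip(weights, wl_values))


# Selection from the example: IN_2, IN_4, IN_6..IN_10, IN_12.
selected = [2, 4, 6, 7, 8, 9, 10, 12]
# Illustrative values only: IN_j = j, and every stored weight is 1.
wl_values = [j if j in selected else 0 for j in range(1, 257)]
weights = [1] * 256
result = mac_column(weights, wl_values)
```

Only the eight enabled word lines contribute to the sum; every other word line is at 0 and adds nothing.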

[0066] Therefore, the input sequence re-ordering unit 300 with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro 700 in the CNN application of the present disclosure can re-group the multi-bit input signals IN.sub.1-IN.sub.256 according to their precision before feeding them into the CIM macro 700, so that the part of the one group of the multi-bit input signals IN.sub.1-IN.sub.256 sent to the CIM macro 700 can be processed with the same number of bits, thereby maximizing utilization. In addition, the input sequence re-ordering unit 300 of the present disclosure may use the scanner 410 to preprocess the multi-bit input signals IN.sub.1-IN.sub.256 sent to the CIM macro 700, which shortens the total time of the entire operation process and relieves the burden on the CIM macro 700 of processing multi-bit operations.
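The re-grouping by precision summarized above can be illustrated with the following sketch. It is not the scanner 410 or the sorter of the disclosure: the function names are hypothetical, and the rule that an input below 2**6 counts as a 6-bit input while any other 8-bit value counts as an 8-bit input is an assumption chosen only to make the example concrete.

```python
from typing import List, Tuple

GROUP_SIZE = 8  # eight inputs are sent to the CIM macro at a time


def needed_bits(value: int) -> int:
    """Assumed precision rule: values below 2**6 are 6-bit, others 8-bit."""
    if not 0 <= value < 256:
        raise ValueError("input must fit in eight bits")
    return 6 if value < 64 else 8


def regroup_by_precision(inputs: List[int]) -> List[Tuple[int, List[int]]]:
    """Return (precision, 1-based indices) batches of same-precision inputs.

    Each batch holds at most GROUP_SIZE indices, so every batch sent
    onward uses a single bit width.
    """
    batches: List[Tuple[int, List[int]]] = []
    for precision in (6, 8):
        idx = [j for j, v in enumerate(inputs, start=1)
               if needed_bits(v) == precision]
        for k in range(0, len(idx), GROUP_SIZE):
            batches.append((precision, idx[k:k + GROUP_SIZE]))
    return batches
```

Under this assumed rule, the inputs [3, 200, 10, 255] would be split into one 6-bit batch holding indices 1 and 3 and one 8-bit batch holding indices 2 and 4.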

[0067] FIG. 24 shows a comparison result of throughput between the conventional method (i.e., without input re-ordering (w/o INRE)) and the input sequence re-ordering method 100b (i.e., with input re-ordering (w/ INRE)) of the present disclosure. The input sequence re-ordering method 100b of the present disclosure can improve the throughput (giga operations per second per watt (GOPS/W)) of the CIM macro 700 by 2.41× compared to the conventional method.

[0068] According to the aforementioned embodiments and examples, the advantages of the present disclosure are described as follows.

[0069] 1. The input sequence re-ordering method and the input sequence re-ordering unit with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro in the CNN application of the present disclosure can re-group the multi-bit input signals according to their precision before feeding them into the CIM macro, so that the part of the one group of the multi-bit input signals sent to the CIM macro can be processed with the same number of bits, thereby maximizing utilization.

[0070] 2. The input sequence re-ordering method and the input sequence re-ordering unit with the multi input-precision reconfigurable scheme and the pipeline scheme for the CIM macro in the CNN application of the present disclosure can use the scanner to preprocess the multi-bit input signals sent to the CIM macro, which shortens the total time of the entire operation process and relieves the burden on the CIM macro of processing multi-bit operations.

[0071] 3. The input sequence re-ordering method of the present disclosure can improve the throughput of the CIM macro by 2.41× compared to the conventional method.

[0072] Although the present disclosure has been described in considerable detail with reference to certain embodiments thereof, other embodiments are possible. Therefore, the spirit and scope of the appended claims should not be limited to the description of the embodiments contained herein.

[0073] It will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the present disclosure without departing from the scope or spirit of the disclosure. In view of the foregoing, it is intended that the present disclosure cover modifications and variations of this disclosure provided they fall within the scope of the following claims.

CPC classification: G06F7/76; G06N3/0464; G06F7/08; G06N3/08; G06N3/045; G06N3/063

International classification: G06N3/08; G06F7/08; G06F7/76; G06N3/04
