MULTIPLY-ACCUMULATE SUCCESSIVE APPROXIMATION DEVICES AND METHODS
20240223207 · 2024-07-04
Inventors
Cpc classification
G11C11/413
PHYSICS
H03M1/462
ELECTRICITY
International classification
H03M1/46
ELECTRICITY
G11C11/413
PHYSICS
G06F9/30
PHYSICS
Abstract
A multiply-accumulate successive approximation (MASAR) column is provided. The MASAR column includes a plurality of MASAR cells, each including a multiplier configured to perform digital multiplication between an input activation received to an input and an operand to compute a result, and a unit capacitor configured to store the result as analog charge. The MASAR column further includes digital logic configured to perform analog summation of the analog charge of the unit capacitors of the plurality of MASAR cells to determine a digital output of the multiplication.
Claims
1. A multiply-accumulate successive approximation (MASAR) column, comprising: a plurality of MASAR cells, each including a multiplier configured to perform digital multiplication between an input activation received to an input and an operand to compute a result, and a unit capacitor configured to store the result as analog charge; and digital logic configured to perform analog summation of the analog charge of the unit capacitors of the plurality of MASAR cells to determine a digital output of the multiplication by configuring the unit capacitors as a capacitive digital to analog converter (CDAC) in a successive approximation register (SAR) analog to digital converter (ADC).
2. The MASAR column of claim 1, wherein the operands specify weights or biases of a neural network.
3. The MASAR column of claim 1, wherein each of the plurality of MASAR cells further includes a memory configured to maintain the operand.
4. The MASAR column of claim 3, wherein each memory has a memory input and a memory output, and each of the plurality of MASAR cells further includes a column select control line connected to the memory, wherein the memory is configured to utilize the value on the column select control line to switch between (i) saving the value on the memory input to the memory as the operand and (ii) applying the value in the memory from the memory output to the multiplier.
5. The MASAR column of claim 1, wherein the plurality of MASAR cells are configured to receive the operands from operand inputs separate from the input activation inputs.
6. The MASAR column of claim 1, wherein each of the plurality of MASAR cells further includes a multiplexer (MUX) having at least first and second MUX inputs and a MUX output, wherein the MUX is configured to receive the result on the first MUX input, to receive a bit-guess input from the digital logic on the second MUX input, and to apply the MUX output to the unit capacitor, wherein the MUX is further configured to be controlled by an enable MAC control line to select between (i) storing the result to the unit capacitor and (ii) utilizing the unit capacitor to determine the analog summation of the charge.
7. The MASAR column of claim 6, where the digital logic is further configured to utilize SAR to convert the analog charge to a digital result, by controlling the individual MASAR cell unit capacitances via the bit-guess input to form the CDAC.
8. The MASAR column of claim 7, wherein the SAR includes guessing a plurality of bits of the digital output of the multiplication from most significant bit to least significant bit.
9. The MASAR column of claim 1, further comprising: a comparator having a comparator input and a comparator output, wherein each of the unit capacitors is connected to the comparator input via a common bit line, and the digital logic is configured to receive the comparator output, wherein the common bit line is connected to a switch controllable by a RESET line, where, when the RESET line is set the common bit line is connected to a reference voltage, and when the RESET line is unset the common bit line is disconnected from the reference voltage, and wherein the RESET line is set when performing the digital multiplication, and the RESET line is unset when performing the analog summation of the analog charge.
10. The MASAR column of claim 1, wherein the digital output utilizes N+1 bits for signed integer arithmetic and N bits for two's complement arithmetic.
11. The MASAR column of claim 1, wherein the digital output is an N.sub.BG bit value, and the plurality of MASAR cells includes at least 2.sup.N.sub.BG MASAR cells.
12. The MASAR column of claim 11, wherein the digital logic is configured to control the individual MASAR cell unit capacitances via a bit-guess input to form the capacitive digital to analog converter (CDAC).
13. The MASAR column of claim 12, wherein the bit-guess input is N.sub.BG bits wide from M=0: N.sub.BG-1, and each bit guess line M is connected to 2.sup.M of the plurality of MASAR cells.
14. The MASAR column of claim 12, wherein the bit-guess input is M<MAX(N.sub.BG) bits wide from M=0: MAX(N.sub.BG)-X and each bit guess line M is connected to 2.sup.M+X of the plurality of MASAR cells, thereby providing a coarse-precision mapping of the analog charge of the unit capacitors to determine the digital output.
15. The MASAR column of claim 12, wherein each bit guess line is connected to a spatially randomized set of the plurality of MASAR cells across the MASAR column.
16. The MASAR column of claim 12, wherein a first subset of the plurality of MASAR cells are connected to the bit-guess input for ADC conversion, and a second subset of the MASAR cells are connected to a reference voltage to perform a conversion range shift.
17. The MASAR column of claim 16, wherein the first subset of the plurality of MASAR cells include least significant bits (LSBs) of the digital output, and the second subset of the MASAR cells include most significant bits (MSBs) of the digital output, thereby providing range-shifted full-resolution mapping of a subset of the range of values of the MASAR column.
18. The MASAR column of claim 16, wherein the first subset of the plurality of MASAR cells include MSBs of the digital output, and the second subset of the MASAR cells include LSBs of the digital output, thereby providing a coarse-resolution mapping of the full range of values of the MASAR column.
19. The MASAR column of claim 16, wherein the first subset of the plurality of MASAR cells include MSBs and LSBs of the digital output, and the second subset of the MASAR cells include the remaining bits of the digital output, thereby providing a range-shifted coarse-resolution mapping of a subset of the range of values of the MASAR column.
20. A MASAR column, comprising: a plurality of MASAR cells, each including: a multiplier configured to perform digital multiplication between an input activation received to an input and an operand to compute a result, a unit capacitor configured to store the result as analog charge, and a multiplexer (MUX) having at least first and second inputs and an output, wherein the MUX is configured to receive the result on the first input, to receive a bit-guess input from digital logic on the second input, and to apply the output to the unit capacitor; the digital logic configured to utilize a SAR to perform analog summation of the analog charge of the unit capacitors of the plurality of MASAR cells to determine a digital output of a MAC, by controlling the individual MASAR cell unit capacitances via the bit-guess input to form a CDAC; and a comparator having a comparator input and a comparator output, wherein each of the unit capacitors is connected to the comparator input via a common bit line, and the digital logic is configured to receive the comparator output, wherein the common bit line is connected to a RESET switch controllable by a RESET line, wherein the MUX is further configured to be controlled by an enable MAC control line to select between (i) storing the result to the unit capacitor and (ii) utilizing the unit capacitor to determine the analog summation of the charge, and wherein the RESET switch is further configured to be controlled to select between (i) connecting the common bit line to a reference voltage, and (ii) disconnecting the common bit line from the reference voltage.
21. The MASAR column of claim 20, wherein: in a store charge operation of a MAC mode, the enable MAC control line is set to store the result to the unit capacitors and the RESET switch is set to connect the unit capacitor to the reference voltage, in a sum charge operation of the MAC mode, the enable MAC control line is set to store the result to the unit capacitors and the RESET switch is unset to disconnect the unit capacitors from the reference voltage, and in an ADC conversion mode, the enable MAC control line is set to connect the bit-guess input of the digital logic to the unit capacitor and the RESET switch is unset to disconnect the unit capacitors from the reference voltage.
22. The MASAR column of claim 21, wherein each of the plurality of MASAR cells further includes a memory having a memory input and a memory output, the memory configured to maintain the operand and a column select control line connected to the memory, wherein the memory is configured to utilize the value on the column select control line to switch between (i) saving the value on the memory input to the memory as the operand and (ii) applying the value in the memory from the memory output to the multiplier.
23. A method of performing multiplication and multiply-accumulate functions using a plurality of MASAR cells and digital logic, comprising: performing digital multiplication, utilizing multipliers of each of the plurality of MASAR cells, between an input activation received to an input of the respective MASAR cell and an operand to compute a result; storing the result of the digital multiplication as analog charge in unit capacitors of the respective MASAR cells; and performing analog summation of the analog charge of the unit capacitors of the plurality of MASAR cells, under control of digital logic, to determine a digital output of the multiplication by configuring the unit capacitors as a capacitive digital to analog converter (CDAC) in a successive approximation register (SAR) analog to digital converter (ADC).
24. The method of claim 23, further comprising controlling a MUX via an enable MAC control line to select between (i) storing the result to the unit capacitor and (ii) utilizing the unit capacitors to determine the analog summation of the charge.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION
[0065] Embodiments of the present disclosure are described herein. It is to be understood, however, that the disclosed embodiments are merely examples and other embodiments can take various and alternative forms. The figures are not necessarily to scale; some features could be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the embodiments. As those of ordinary skill in the art will understand, various features illustrated and described with reference to any one of the figures can be combined with features illustrated in one or more other figures to produce embodiments that are not explicitly illustrated or described. The combinations of features illustrated provide representative embodiments for typical applications. Various combinations and modifications of the features consistent with the teachings of this disclosure, however, could be desired for particular applications.
[0066] The computational workload of convolutional neural networks (CNNs) may be dominated by multiply and accumulate or MAC operations (also known as dot products). These operations are essentially sums of products between input activations, A.sub.i, and weights, W.sub.ij, of the CNNs. Hence there is interest in hardware (HW) based building blocks that can accelerate MAC operations while improving performance metrics such as energy/MAC, area/MAC, and clock cycles/MAC.
[0067] Aspects of the disclosure relate to a new building block for implementing MAC functions in HW using both digital and analog circuit techniques. These approaches may enable possible architectures going from a single multiplier to large scale MAC arrays that enable parallel multiply and accumulation of data and weights for artificial intelligence (AI)/ML applications. Such architectures may be adapted from 1 bit to multi bit (4 bit, 8 bit) computational precision of the weights and activations.
[0068] An array may be made up of modular processing elements or cells. These processing elements may be configured such that a column of cells can perform a digital input to digital output MAC computation without the need of an additional analog to digital converter (ADC). A column of such cells may perform a mixed signal MAC calculation which results in an analog charge proportional to the MAC computation result. The analog result is then converted to digital using the same column of cells configured as a SAR ADC. These cells may be referred to as MAC+SAR, or MASAR cells. The MASAR cells are the processing elements that enable all functions for MAC calculation and analog to digital conversion. Thus, the proposed approach uses the same processing element array for digital multiplication and charge summation as well as for the ADC conversion.
[0069] The MASAR modular processing elements may be used to implement multibit precision multiplications and MAC computations, which is useful for ML/AI HW acceleration. Each MASAR cell uses a unit capacitance to store the results of the 1-bit multiplications in charge. A column of MASAR cells can be used to sum the 1-bit products in charge using charge redistribution. The column of MASAR cells also provides the ability to convert the charge to a digital value by configuring the column of MASAR cells into a SAR ADC. The SAR is used to convert the sum of products back to a digital representation. These new building blocks (MASAR cells) enable both MAC and SAR functions when used in columns (MASAR columns). MASAR columns can be placed in parallel to form MASAR arrays which can perform multibit precision MAC computations.
[0070]
[0071] In a weights programming mode, weights w.sub.ij may be applied to the digital inputs 106 of the MASAR cells 104. These weights may be stored in the MASAR cells 104 and used in a two-mode runtime approach to perform digital-in to digital-out MAC computations. Weights may be stored in each MASAR cell or only in some or not at all. In some examples, the weights may be stored outside of the MASAR columns. The weight programming mode may be specific to when the weights are stored in the MASAR columns or cells. In this case it may be advantageous to use the same inputs (e.g., wires) to the cells for programming weights and applying input activations. It should be noted that more than one weight may be stored in each MASAR cell in some examples (e.g., each MASAR cell may contain multiple memory cells), which may be advantageous for computation of ML algorithms.
[0072] This two-mode approach includes a MAC mode (multiply+charge summation) followed by a SAR mode (charge to digital). It should be noted that memory can be in every MASAR cell 104 or only certain rows in the MASAR column 102 may have memory. For cases where not all MASAR cells 104 have memory, there are different options for how memory can be distributed in a MASAR column 102. Some examples are discussed herein. Also, programming of the memory can be done in multiple ways. One way would be programming the weights w.sub.ij one column j at a time. In this case a vector of weights w.sub.ij may be applied to the inputs of the MASAR column 102. Other options may include programming one row i at a time, programming individual MASAR cell 104 memories one at a time, or programming multiple MASAR cell 104 memories all at once through an entire MASAR array.
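By way of non-limiting illustration, the column-at-a-time programming option described above may be sketched in software as follows. The sketch assumes 1-bit weight memories held in a nested list; the name `program_column` and the list-of-lists layout are hypothetical and are not part of the disclosure.

```python
from typing import List


def program_column(weights: List[List[int]], j: int, row_inputs: List[int]) -> None:
    """Write one weight vector into column j (models selecting only column j)."""
    if len(row_inputs) != len(weights):
        raise ValueError("one input value is needed per row")
    for i, value in enumerate(row_inputs):
        # Each cell keeps a single weight bit taken from its row input.
        weights[i][j] = value & 1


# Hypothetical 3-row by 2-column array of weight memories, initially cleared.
weights = [[0, 0] for _ in range(3)]
program_column(weights, 0, [1, 0, 1])  # program column 0
program_column(weights, 1, [0, 1, 1])  # then column 1
```

Programming one row at a time, or individual cells, would follow the same pattern with a different loop order.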
[0073] In the MAC mode, input activations di, or input biases bi, may be applied to each of the cells. These values may be applied to the digital inputs 106. In a first aspect (MAC step 1), multiple 1-bit digital multiplications are performed digitally. In a second aspect (MAC step 2), the multiplication results are stored in charge. In a third aspect (MAC step 3), the results of the multiplications are summed using charge sharing/redistribution on the MASAR column 102. The total charge stored on the MASAR column 102 as unit capacitances represents the analog value of the result of the MAC computation.
[0074] In the SAR mode, conversion of the charge back to digital is accomplished by configuring the unit capacitors of the MASAR cells 104 in the column as a capacitive digital to analog converter (CDAC) 115. The column is used with a single comparator to perform a successive approximation analog to digital conversion of the stored charge in the column. In the SAR mode, ADC guess bits BG.sub.i[0: N-1] may be utilized to facilitate the conversion back to digital.
[0075] The MASAR column 102 may produce digital outputs 108 representative of multiplication of the input i with the stored weights w.sub.ij. These digital outputs 108 may provide a single bit B[N] result, or, in other examples, may include a full output B[0: N-1]. The MASAR columns 102 may further include a bit line driver 110, a zero input cell 112, a comparator 114, and digital logic 116. These components are discussed in further detail below.
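As a simplified, non-limiting sketch, the three MAC steps may be modeled with idealized integer arithmetic in which each unit capacitor holds either zero or one unit of charge. The function name `mac_column` and the normalization to unit charge are assumptions made for illustration; no circuit non-idealities are modeled.

```python
from typing import List


def mac_column(activations: List[int], weights: List[int]) -> int:
    """Idealized 1-bit MAC for one column, in units of one unit charge."""
    if len(activations) != len(weights):
        raise ValueError("one weight is needed per activation")
    # MAC step 1: 1-bit digital multiplication (a logical AND) in each cell.
    products = [(a & 1) & (w & 1) for a, w in zip(activations, weights)]
    # MAC step 2: each product is held as 0 or 1 unit of charge.
    charges = products
    # MAC step 3: charge sharing on the column sums the unit charges.
    return sum(charges)


total = mac_column([1, 1, 0], [1, 1, 1])
```

Here `total` is 2, corresponding to two cells whose activation and weight are both 1.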
[0076]
[0077] The MASAR cell 104 can be configured to calculate a 1-bit multiplication between an input (I.sub.i=a.sub.i in MAC mode) and the weight w.sub.ij, and to store the result as a charge on the unit capacitor 206, which has unit capacitance C.sub.u. The output of the MASAR cell 104 may be provided to BL.sub.j, the j.sup.th bit line, for charge summation. By setting the EM signal, the multiplication result may be stored to the capacitor 206, and by resetting the EM signal the capacitor 206 may be reset. Additionally, the EM signal may be used to select between the MAC mode, in which the charge on the capacitor 206 is determined by the multiplication result, and the SAR mode, in which the collective capacitance across the MASAR cells 104 is measured.
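For illustration only, the selection performed by the EM signal may be sketched as a small behavioral model of one cell. The class name `MasarCell` and its methods are hypothetical, the multiplier is assumed to be a 1-bit AND, and the analog behavior of the unit capacitor is not modeled.

```python
from dataclasses import dataclass


@dataclass
class MasarCell:
    """Behavioral sketch of one 1-bit cell: weight memory, multiplier, and MUX."""

    weight: int = 0  # 1-bit weight memory

    def program(self, i_in: int) -> None:
        # Weight programming: the row input is saved as the operand.
        self.weight = i_in & 1

    def cell_node(self, i_in: int, bg: int, em: int) -> int:
        # Multiplier: 1-bit product of the input activation and stored weight.
        product = (i_in & 1) & self.weight
        # MUX: EM=1 applies the product to the capacitor (MAC mode),
        # EM=0 applies the bit guess instead (SAR mode).
        return product if em else (bg & 1)


cell = MasarCell()
cell.program(1)
mac_value = cell.cell_node(i_in=1, bg=0, em=1)
sar_value = cell.cell_node(i_in=1, bg=0, em=0)
```

With a stored weight of 1, `mac_value` follows the product (1), while `sar_value` follows the bit guess (0).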
[0078]
[0079]
[0080] It should be noted that single bit computation does not require sign bits, as 1-bit or single bit multiplication does not include a sign. For signed integer multiplication a sign bit may be utilized. Note that a sign bit is not required in cases that do not use signed integer computation, such as binary coded decimal.
[0081]
[0082]
[0083] Table 1 illustrates a description of the signaling shown in
TABLE 1: 1-bit and Multibit MASAR Cell Control Signal Definitions
Signal Name | Description
I.sub.i | Is an input to the MASAR cell 104 used to supply a value for W.sub.ij while programming the weight memory in the MASAR cell 104. It is also used to supply the input activation, a.sub.i. I.sub.i is routed to the MASAR cells 104 on the same row (i.sup.th row) of multiple MASAR columns 102 on what is referred to herein as a word line.
BG.sub.i | Is an input to the MASAR cell 104 used to apply the Bit Guess. BG.sub.i comes from the SAR ADC control logic and is needed to enable conversion of the stored charge on the MASAR Column 102 bit line(s) to a digital value.
P.sub.j | Serves as a column select for programming of the memory in each cell. Setting P.sub.j high selects the j.sup.th MASAR Column 102 for programming. Typically, only one column is programmed at a time. When high the weight memory is set, W.sub.i,j = I.sub.i. When low, the value applied to I.sub.i is used as input to the multiplier 204.
EM | Enable MAC Control bit. When high (or 1) it enables the storage of the multiplication product, A.sub.i · W.sub.i,j, as a charge, Q.sub.i, on the unit capacitor 206, C.sub.u. When low, it allows for operation in the SAR A/D conversion mode.
BL.sub.j,
[0084] Table 2 illustrates values of the signaling with respect to the different modes and operations that are performed by the MASAR column 102 and MASAR cells 104.
TABLE 2: Modes of the MASAR Cell/Columns with signal values
Cell Mode | P.sub.j | EM | I.sub.i | RESET | Description
MAC (Store Charge) | 0 | 1 | a.sub.i | 1 | 1-bit multiplication a.sub.i · w.sub.ij stored as charge on the unit capacitor 206
MAC (Sum Charge) | 0 | 1 | a.sub.i | 0 | Stored charge redistributed on all capacitors 206 in a MASAR column 102
SAR ADC (ADC conversion) | 0 | 0 | X = don't care | 0 | Unit capacitors 206 in MASAR column 102 used as a SAR ADC capacitor 206 digital-to-analog converter (DAC). Controlled by ADC guess bits, BG.sub.i
Weight Memory Programming (WMP) | 1 | 0 | New w.sub.ij | 1 | SRAM 202 weight bits of the j.sup.th MASAR column 102 are updated to the new value placed on the row input, I.sub.i.
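As a non-limiting sketch, the control signal values of Table 2 may be captured in a lookup structure so that a controller model can select or identify a mode. The names `Signals`, `MODES`, and `mode_of` are hypothetical; the don't-care I.sub.i column is omitted because it does not distinguish the modes.

```python
from typing import Dict, NamedTuple, Optional


class Signals(NamedTuple):
    p_j: int    # column select for weight programming
    em: int     # enable MAC control bit
    reset: int  # ties the bit line to the reference voltage when 1


# Control values per mode, transcribed from Table 2.
MODES: Dict[str, Signals] = {
    "MAC_STORE_CHARGE": Signals(p_j=0, em=1, reset=1),
    "MAC_SUM_CHARGE": Signals(p_j=0, em=1, reset=0),
    "SAR_ADC": Signals(p_j=0, em=0, reset=0),
    "WEIGHT_PROGRAMMING": Signals(p_j=1, em=0, reset=1),
}


def mode_of(p_j: int, em: int, reset: int) -> Optional[str]:
    """Return the mode name matching the control values, or None if undefined."""
    wanted = Signals(p_j, em, reset)
    for name, signals in MODES.items():
        if signals == wanted:
            return name
    return None
```

Combinations that Table 2 does not list, such as P.sub.j=1 with EM=1, return `None` in this sketch.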
[0085] Table 3 illustrates further definitions of terms with respect to the MASAR column 102.
TABLE 3: MASAR Column Definitions
Name | Description
i, j | i = 1 . . . N.sub.y is the MASAR (column/array) row index. j = 1 . . . N.sub.x is the MASAR array column index. Both are integers.
N.sub.r | The number of rows in the MASAR columns 102. N.sub.r is assumed to be a power of 2, i.e., N.sub.r = 2.sup.k, where k is an integer.
N.sub.x | Number of MASAR columns 102 in a MASAR array.
N.sub.y | Number of MASAR cells 104 in the column that process inputs. MAX(N.sub.y) = N.sub.r - 1
N.sub.BG | Number of output bits, B, in a MASAR column 102. This may also refer to the SAR ADC resolution in bits. Note the number of output bits depends on the arithmetic used. N.sub.BG ≤ log.sub.2(N.sub.r) for 2's complement number representation and N.sub.BG ≤ log.sub.2(N.sub.r) + 1 for signed integer number representation.
a.sub.i | 1-bit input activations to the MASAR column 102.
w.sub.ij | 1-bit weight stored in the i.sup.th row and j.sup.th column MASAR cell 104. NOTE: Weight storage is not required in all MASAR cells 104.
V.sub.CO,j | Is the one-bit output of the j.sup.th MASAR column 102 comparator 114.
BG.sub.j[0: N.sub.BG - 1] | ADC guess bits for the j.sup.th MASAR column 102. Used for SAR ADC conversion of MAC result.
B.sub.j[0: N.sub.BG - 1] | Digital output of the j.sup.th MASAR column 102. This value can be one bit, B.sub.j[x], where x is the current bit that has been resolved by the SAR ADC algorithm. Or, this value can be all bits of the MAC result: B.sub.j[0: N.sub.BG - 1]. This is dependent on the location of the SAR digital logic 116. NOTE: The full SAR digital logic 116 does not have to be implemented within the digital logic 116 of the MASAR column 102.
[0086]
[0087] Here, the MASAR column 102 is shown in the first portion of the MAC. In this first portion, all products, a.sub.i·w.sub.ij, are being applied to the MASAR column 102 for computation by the multiplier 204. At this point, the ADC guess bits are set to zero (BG.sub.0=BG.sub.1=0). Additionally, the signal EM is set such that EM=1 (and its complement equals 0), thereby forcing the products on the cell side of the unit capacitors 206. The signal RESET is set such that RESET=1, forcing the bit line side of the unit capacitors 206 to the reference voltage, V.sub.S. Further aspects of the signaling of the MASAR column 102 are illustrated in Table 4.
TABLE 4: Simplified MASAR Column Signal Definitions
Signal Name | Description
RESET | When RESET = 1, it forces the bit line voltage to reference voltage, V.sub.S.
V.sub.BL | The voltage on the MASAR column 102 bit line
V.sub.S | Reference voltage for the MASAR array
V.sub.CI = V.sub.BL - V.sub.S | Input to the comparator 114 in the MASAR column 102
V.sub.CO | Output of the comparator 114
B.sub.i | Digital outputs of the SAR logic. These represent the digitized output of the MAC operation performed by MASAR column 102.
[0088] Referring more specifically to the MAC step 1 aspect, the 1-bit products, a.sub.i·w.sub.ij, are stored as a charge, Q.sub.i, (Eq. 1) on the unit capacitors 206 of the MASAR cell 104. Here, EM=1 (and its complement equals 0). The charge is thereby stored by applying the product, a.sub.i·w.sub.ij, to the cell side of each unit capacitor 206 or node V.sub.xi in the cell. At the same time, the common side of the unit capacitors 206 or bit line is forced to the reference voltage, V.sub.S, by setting RESET=1. Eq. 2 is the total charge, Q.sub.tot, stored in the MASAR column 102 after MAC step 1.
[0089] Note that for this MASAR column 102 with 2.sup.N cells there is a maximum of N.sub.y inputs where N.sub.y=2.sup.N-1. One MASAR cell 104 (the zero input cell 112) has zero input and N.sub.y cells have inputs. This is to ensure full analog to digital conversion of the MAC result on the MASAR column 102. The total capacitance, C.sub.TOT, of the MASAR column 102 is given in Eq. 3. It should again be noted that a zero input cell 112 is not needed if the MASAR column 102 is not performing a full resolution conversion, i.e., where the output of the MASAR column has less than log.sub.2(N.sub.y) bits.
[0090]
[0091]
[0092]
[0093] For this example, the bit line voltage is defined by Eq. 4 and input to the comparator 114 by Eq. 5. Finally, the comparator 114 computes Eq. 6. For purposes of showing how the SAR algorithm operates, the expected result of the MAC is defined by Eq. 7. In this case, the output of the MASAR column 102, for this example, is 2 or B[1]=1, B[0]=0.
[0094] Referring more specifically to the SAR conversion, the first portion (SAR step 1) starts with the SAR logic guessing the most significant bit, BG[1]=1, while keeping the least significant bit, BG[0]=0. This results in a comparator 114 input of zero, as shown in Eq., and a comparator 114 output of zero as shown in Eq. Here, the SAR logic assigns B[1]=1.
[0095]
[0096] This portion of the SAR conversion starts with the SAR logic guessing the least significant bit, BG[0]=1. As the most significant bit, BG[1], has been already determined to be 1, that value is not changed. Setting BG[0]=1 results in a comparator 114 input that is greater than zero as shown in Eq. 10. Therefore, the comparator 114 output is one as shown in Eq. 11. The SAR logic assigns B[0]=0. This is the last portion of the SAR computation for this example. The final output of the MASAR column 102 is therefore B[1]=1, B[0]=0, which matches the expected MAC result of 2.
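The two SAR steps of this example may be reproduced, purely as an illustrative sketch, with an idealized integer model. It is assumed here that the comparator output is 1 when the guess exceeds the stored MAC sum and 0 otherwise, which matches the comparator outputs described for this example; the function name `sar_convert` and the trace format are hypothetical, and voltages and capacitances are not modeled.

```python
from typing import List, Tuple


def sar_convert(mac_sum: int, n_bits: int) -> Tuple[int, List[Tuple[int, int, int]]]:
    """Idealized SAR search from most significant bit to least significant bit."""
    result = 0
    trace = []
    for bit in range(n_bits - 1, -1, -1):
        # Guess the current bit as 1 while keeping already-resolved bits.
        guess = result | (1 << bit)
        # Assumed comparator behavior: 1 if the guess is above the stored sum.
        comparator_out = 1 if guess > mac_sum else 0
        if comparator_out == 0:
            result = guess  # the guessed bit is kept as 1
        trace.append((bit, guess, comparator_out))
    return result, trace


value, trace = sar_convert(mac_sum=2, n_bits=2)
```

For a stored sum of 2, the trace is `[(1, 2, 0), (0, 3, 1)]`: the MSB guess is kept (B[1]=1), the LSB guess is rejected (B[0]=0), and `value` is 2.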
[0097] Thus, a simplified 4-cell MASAR column 102 (N.sub.BG=2,N.sub.y=3) may accomplish a MAC computation and a SAR analog to digital conversion using the same capacitor 206 array. Note this was done for 1-bit computations which do not require a sign for each MAC product. However, this approach may be extended to signed operations for multibit MACs.
[0098] While the aforementioned example utilizes four cells, the MAC mode may be extended to a MASAR column 102 comprised of N.sub.r rows. In this case an N.sub.r row MASAR column 102 may perform N.sub.y=N.sub.r-1 one-bit MAC calculations. Thus, the maximum digital value of the MAC output for a MASAR column 102 using N.sub.r-1 rows as inputs is B.sub.MAX, as shown in Eq. 12.
[0099] The total charge stored on the capacitance of the MASAR column 102 is given earlier by Eq. The bit line voltage (extending the example to the general case) is given by Eq. 13. For the general case the output of the j.sup.th MASAR column 102 is a digital output as given by Eq. 14.
[0100] As a variation, the addition of an input bias and calibration in MASAR columns 102 may be performed. In some cases, it may be of interest to add input biases, b.sub.j, to the MAC calculation. In this case the desired output of the MASAR column 102 is given by Eq. 15:
[0101] To add these biases, N.sub.b rows in the MASAR column 102 can be dedicated to the bias input. Since the number of inputs is fixed at N.sub.y, this reduces the number of possible input activations to N.sub.a=N.sub.y-N.sub.b. Similarly, if it is desired to calibrate the SAR ADC, additional N.sub.c rows may be dedicated to the addition of calibration of the ADC output. If desired, this may further reduce the quantity of inputs for the MAC, as shown in Eq. 16. It should be noted that while adding bias in MASAR cells 104 may be performed in some approaches, in other approaches the biases may be added to the outputs after the MASAR column 102. This may occur in the digital summation stages, for example (as shown in the FIGS. herein).
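As a non-limiting sketch, the row budget described above may be computed as follows. The sketch assumes that one row is reserved for the zero input cell and that calibration rows subtract from the input count in the same way as bias rows; the function name `activation_rows` is hypothetical.

```python
def activation_rows(n_r: int, n_bias: int = 0, n_cal: int = 0) -> int:
    """Rows left for input activations after bias and calibration rows are reserved."""
    n_y = n_r - 1  # one row is assumed to hold the zero input cell
    n_a = n_y - n_bias - n_cal
    if n_a < 0:
        raise ValueError("more rows reserved than the column provides")
    return n_a


rows_left = activation_rows(n_r=64, n_bias=4, n_cal=2)
```

For a hypothetical 64-row column with 4 bias rows and 2 calibration rows, 57 rows remain for input activations.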
[0102] Similarly, while the aforementioned example utilizes four cells, the SAR mode may be extended to a MASAR column 102 comprised of N.sub.r rows. Here, the ADC guess bits can be distributed to form an N bit CDAC 115.
[0103]
[0104]
[0105]
[0106] Accordingly, a MASAR column 102 may be configured as an SAR ADC with a maximum ADC resolution or max number of bits, N.sub.BG=k (for 2's complement) and, N.sub.BG=k+1 for signed integer computation. While optional, it is assumed one row in all MASAR columns 102 is the row of zero input cells 112. Doing so ensures the SAR ADC including the rows of MASAR cells 104 can perform a full resolution conversion of the MAC result.
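For illustration, the maximum resolution stated above may be computed from the row count as sketched below. The function name `max_resolution_bits` is hypothetical, and N.sub.r is taken to be a power of 2 as assumed elsewhere in this description.

```python
def max_resolution_bits(n_r: int, signed_integer: bool = False) -> int:
    """Maximum N_BG for an n_r-row column: k, or k + 1 for signed integer."""
    if n_r < 2 or n_r & (n_r - 1):
        raise ValueError("n_r must be a power of 2 and at least 2")
    k = n_r.bit_length() - 1  # n_r == 2**k
    return k + 1 if signed_integer else k
```

For example, a 64-row column yields 6 bits for 2's complement and 7 bits for signed integer computation.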
[0107] One key aspect of the SAR computation and MASAR concept is the routing of the ADC guess bits, BG[n], to the MASAR cells 104. An example MASAR column 102 may be used to demonstrate different options for setting the SAR ADC conversion resolution by changing how we configure the MASAR column 102 ADC guess bits.
[0108]
[0109]
[0110]
[0111]
[0112] It should be noted that these SAR ADC conversion modes assume MASAR columns 102 configured as a binary CDAC 115. In other words, the capacitors are sized such that they are binarily weighted, as shown in Eq. 17. The choice of binary weighting may dictate how the ADC guess bits are distributed to control the individual MASAR cells 104 in the previous sections. However, there are alternatives to binary weighting. For example, a SAR ADC may be developed with non-binary split-capacitor arrays, which can be implemented as well in the MASAR columns 102. Use of a MASAR column 102 for such applications may provide even more compact architectures and/or lower energy implementations as compared to other designs of SAR ADC.
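One way to picture the binary-weighted distribution of the ADC guess bits is sketched below, in which guess line M drives 2.sup.M unit-capacitor cells. The sketch assumes cell index 0 is the zero input cell and offers an optional shuffle to model a spatially randomized assignment; the function name `route_guess_lines` and the indexing scheme are hypothetical.

```python
import random
from typing import Dict, List, Optional


def route_guess_lines(n_bg: int, seed: Optional[int] = None) -> Dict[int, List[int]]:
    """Assign 2**M cells to each guess line M; cell 0 is left as the zero input cell."""
    cells = list(range(1, 2 ** n_bg))  # 2**n_bg - 1 cells take part in the CDAC
    if seed is not None:
        # Optional spatially randomized assignment of cells to guess lines.
        random.Random(seed).shuffle(cells)
    routing: Dict[int, List[int]] = {}
    start = 0
    for m in range(n_bg):
        count = 2 ** m
        routing[m] = cells[start:start + count]
        start += count
    return routing


routing = route_guess_lines(3)
```

With three guess lines, the lines drive 1, 2, and 4 cells respectively, covering all seven non-zero-input cells exactly once.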
[0113]
[0114]
[0115] For some applications, however, it may be desirable to utilize a subset of possible values that may be available through use of the MASAR column 102. For instance, in some cases less precision may be desired. In such a case the LSB may not be used. Or, in other cases conversion may be desired for a subset of ranges of the values, with values below the range of interest being set to a minimum and values above the range of interest being set to a maximum. As discussed above abstractly with respect to
[0116]
[0117] Referring more specifically to
[0118]
Accordingly, the resultant mapping is from a low value of 28 to a high value of 28+7 or 35, based on the values of the 3 LSBs.
[0119]
[0120]
Accordingly, the resultant mapping is from a low value of 4 to a high value of 4+7=11, based on the values of the 3 LSBs.
[0121]
[0122]
[0123]
[0124]
Accordingly, the resultant mapping is from a low value of 24 to a high value of 24+14=38, with a step size of 2, based on the values of the 3 utilized bits (the 8-, 4- and 2-bits).
[0125]
[0126] Thus, by configuring the mapping of the unit capacitors 206 to the SAR DAC, configurable output mappings of the resolution and offset range of values may be performed. It should also be noted that the number of inputs is not limited to 2.sup.N-1. Indeed, any number of inputs N>2.sup.M may be possible with approximate conversion.
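As a simplified sketch of such configurable mappings, the set of digital values reachable by a configuration may be enumerated from an offset and the weights of the guess lines in use. The offset is taken here as the number of unit capacitors tied to the reference voltage, which is an assumption of this idealized model; the function name `representable_values` is hypothetical.

```python
from itertools import product
from typing import List, Sequence


def representable_values(offset_cells: int, line_weights: Sequence[int]) -> List[int]:
    """Enumerate the output codes reachable for a given range shift and set of guess lines."""
    values = set()
    for bits in product((0, 1), repeat=len(line_weights)):
        # Each guess line contributes its weight (number of cells) when its bit is 1.
        values.add(offset_cells + sum(b * w for b, w in zip(bits, line_weights)))
    return sorted(values)


fine = representable_values(28, (1, 2, 4))   # range-shifted, full resolution
coarse = representable_values(0, (2, 4, 8))  # unshifted, step size of 2
```

In this sketch `fine` covers 28 through 35 in steps of 1, and `coarse` covers 0 through 14 in steps of 2.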
[0127]
[0128] Serial and parallel SAR architectures for the MASAR columns 102 and MASAR arrays 150 may be utilized.
[0129] For a serial SAR MASAR array 150 the MAC calculation occurs in parallel; however, the SAR ADC conversion of the MAC results occurs in a serial fashion. The ADC conversion occurs in each MASAR column 102 one at a time. The advantage of this architecture is that the SAR logic can be global and does not need to be in each MASAR column 102. This results in an area savings for the MASAR array 150. The disadvantage is that throughput or the speed of the MAC calculation is reduced. However, for some applications the tradeoff between area and speed is advantageous.
[0130]
[0131] Additionally, the global digital logic 154 may provide control signals for the different modes of the MASAR array 150. These modes are described in Table 2. For instance, the digital logic 116 may apply input activations, a.sub.i, to the row driver 152 in the MAC mode and weight values, w.sub.ij, for programming the SRAM 202 weight memories in the weight programming mode.
[0132]
[0133]
[0134] The global digital logic 154 may be used to orchestrate top level functions of the parallel array and to provide control signals for the different modes of the MASAR array 150, as discussed in Table 2. For example, the global digital logic 154 may apply input activations, a.sub.i, to the row drivers 152 in the MAC mode, and may provide weight memory values, w.sub.ij, for programming the SRAM 202 in the weight programming mode. The digital logic 116 may also control the timing of the array signals.
[0135] Unlike the serial MASAR array 150, however, the global digital logic 154 in the parallel MASAR array 150 may not apply the ADC guess signals, B.sub.Gj[0: N.sub.BG−1], to the row driver 152 during the SAR modes. Instead, this may be done by local SAR logic 156 in each MASAR column 102, which is routed through the MASAR column 102 to each MASAR cell 104.
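The speed side of the serial versus parallel tradeoff can be illustrated with a rough count of sequential SAR bit decisions. This is a simplified model and not part of the disclosure: it assumes one bit decision per output bit per conversion and ignores the MAC and sampling phases; the function name and parameters are hypothetical.

```python
def sar_bit_decisions(n_columns, n_bits, parallel):
    """Count sequential SAR bit decisions needed to convert every column.

    Assumes one decision per output bit (simplified model, not from the source).
    """
    if parallel:
        # Local SAR logic in each column: all columns convert at once.
        return n_bits
    # Global SAR logic: columns are converted one at a time.
    return n_columns * n_bits


print(sar_bit_decisions(5, 8, parallel=False))  # serial array
print(sar_bit_decisions(5, 8, parallel=True))   # parallel array
```

Under these assumptions a 5-column, 8-bit serial array needs 40 sequential decisions where the parallel array needs 8, in exchange for SAR logic in every column.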
[0136]
[0137] Thus, MASAR columns 102 and MASAR arrays 150 which perform 1-bit MAC computations may be utilized in serial or parallel configurations. These computations may include a summation of products of 1-bit weights and activations. Additionally, MASAR columns 102 and MASAR arrays 150 may be used to perform multi-bit MAC computations. In such examples, the weights and activations can be >1-bit in precision.
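A 1-bit MAC followed by a successive approximation conversion may be sketched behaviorally as follows. The sketch is purely numerical and makes assumptions not stated in the disclosure: the 1-bit multiply is modeled as a logical AND of values in {0, 1}, and the charge summed on the unit capacitors is modeled as an integer count. The function names are hypothetical.

```python
def mac_1bit(activations, weights):
    """Sum of 1-bit products; AND models a {0,1} x {0,1} multiply (assumption)."""
    return sum(a & w for a, w in zip(activations, weights))


def sar_convert(analog_value, n_bits):
    """Binary-search an integer in [0, 2**n_bits - 1], MSB first."""
    code = 0
    for bit in reversed(range(n_bits)):
        guess = code | (1 << bit)
        # Comparator decision: keep the bit if the guess does not overshoot.
        if analog_value >= guess:
            code = guess
    return code


activations = [1, 0, 1, 1, 0, 1, 1, 1]
weights = [1, 1, 0, 1, 0, 1, 0, 1]
total = mac_1bit(activations, weights)
print(total, sar_convert(total, n_bits=3))
```

Each pass through the loop corresponds to one guess and one comparator decision, so an n-bit result takes n decisions; sums larger than the code range saturate at the maximum code.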
[0138] Multibit digital multiplications may be decomposed into individual units, which may be implemented using MASAR columns 102. A product of N.sub.p bit precision weights and activations may accordingly be accomplished. An example of 4-bit signed integer (N.sub.p=4-bit) activations and weights is defined as shown in Eq. 19 and Eq. 20. The multibit activations and weights may be represented by single bits having different significance, l. For instance, A.sub.i can be represented by the 1-bit values a.sub.il, and W.sub.i by the 1-bit values w.sub.il. The most significant bits are the sign bits, a.sub.i3 and w.sub.i3. These may be used to calculate the sign bit for the overall product, as given by Eq. 21. Note that, for simplicity of notation, the column index j for the weights is omitted in these examples.
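The decomposition may be illustrated with a short sketch. Because Eq. 19 through Eq. 21 are not reproduced here, the sketch assumes a sign-magnitude encoding (one sign bit plus N.sub.p−1 magnitude bits) and assumes that the product sign is the exclusive OR of the two sign bits; both are assumptions rather than statements of the disclosed equations, and all function names are hypothetical.

```python
def to_sign_magnitude_bits(value, n_p=4):
    """Split a signed integer into (sign_bit, magnitude bits, LSB first)."""
    n_mag = n_p - 1
    if abs(value) >= 2 ** n_mag:
        raise ValueError("value does not fit in n_p bits")
    sign = 1 if value < 0 else 0
    mag = abs(value)
    return sign, [(mag >> l) & 1 for l in range(n_mag)]


def multibit_product(a_value, w_value, n_p=4):
    """Multiply via 1-bit partial products grouped by significance."""
    a_sign, a_bits = to_sign_magnitude_bits(a_value, n_p)
    w_sign, w_bits = to_sign_magnitude_bits(w_value, n_p)
    n_mag = n_p - 1
    # One accumulator per significance l + m, from 0 to 2 * (n_mag - 1).
    columns = [0] * (2 * n_mag - 1)
    for l, a_bit in enumerate(a_bits):
        for m, w_bit in enumerate(w_bits):
            columns[l + m] += a_bit & w_bit
    magnitude = sum(count << s for s, count in enumerate(columns))
    sign = a_sign ^ w_sign  # assumed form of the product sign bit
    return -magnitude if sign else magnitude


print(multibit_product(5, -3))
```

For N.sub.p=4 the partial products fall into five significance groups (0 through 4), each of which is a sum of 1-bit products of the kind a single column can accumulate.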
[0139]
[0144]
[0145] Each cell in
[0146]
[0147]
[0148] The architecture shown in
[0149]
[0150]
[0151] While previous examples have been with N.sub.p=4-bit signed integers, this architecture can be scaled to precisions that are larger or smaller than 4-bits. This involves scaling the number of rows, N.sub.pr, and columns, N.sub.pc, of the product cells, as shown in Eq. 23 and Eq. 24.
[0152] The relationship between the total number of rows, N.sub.r, in the MASAR column 102, the number of MACs, N.sub.M, and the number of zero input rows, N.sub.Z, can be determined with Eq. and Eq.
[0153] It is assumed here that the number of rows, N.sub.r, must be a power of 2 to enable use of a binary weighted capacitor DAC in each MASAR column 102. To give an example, let N.sub.r=2.sup.k=256 (k=8) and N.sub.p=4 bits. In this case, N.sub.pr=3, N.sub.pc=5, N.sub.Z=1, and N.sub.M=85 can be calculated from the equations above. For this example, a 256-row by 5-column MASAR array 150 can compute 85 parallel MACs with 4-bit precision. Note that zero input rows are added to ensure there are 2.sup.k MASAR cells 104 in each MASAR column 102. This is required since each MASAR column 102 is also a k=8-bit SAR ADC. In another example, a 256-row by 13-column, 8-bit precision MASAR array 150 can compute N.sub.M=36 MACs. For the 8-bit case: N.sub.pr=7, N.sub.pc=13, and N.sub.Z=4.
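The two worked examples can be checked numerically. The relationships used below (N.sub.pr=N.sub.p−1, N.sub.pc=2N.sub.p−3, N.sub.M as the integer quotient of N.sub.r by N.sub.pr, and N.sub.Z as the remaining rows) are inferred from the numbers in those examples and are an assumption; they are not necessarily the form of the referenced equations. The function name is hypothetical.

```python
def masar_sizing(n_r, n_p):
    """Return (n_pr, n_pc, n_m, n_z) for n_r rows and n_p-bit precision.

    Formulas are inferred from the worked examples (assumption).
    """
    n_pr = n_p - 1          # product-cell rows per MAC
    n_pc = 2 * n_p - 3      # product-cell columns
    n_m = n_r // n_pr       # whole MACs that fit in the column
    n_z = n_r - n_m * n_pr  # leftover rows driven with zero inputs
    return n_pr, n_pc, n_m, n_z


print(masar_sizing(256, 4))
print(masar_sizing(256, 8))
```

Under these inferred relationships the 4-bit case gives 3 rows, 5 columns, 85 MACs and 1 zero input row, and the 8-bit case gives 7 rows, 13 columns, 36 MACs and 4 zero input rows, matching the values stated above.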
[0154]
[0155] While parallel multibit architectures improve speed of computation, serial architectures are more compact. In this section we describe how to decompose multibit digital multiplications into serial computations which enable smaller multibit MASAR array 150 accelerators.
[0156]
[0157]
[0158] Referring back to
[0159]
[0160] The architecture shown in
[0161]
[0162] It should also be noted that serial MASAR accelerators can be extended to higher or lower bit precision. For instance, an N.sub.p=16-bit precision accelerator that calculates N.sub.M 16-bit MACs can be implemented with a 16 column by (N.sub.M+1) row serial MASAR accelerator.
[0163] It should be noted that while many of the examples above are discussed in terms of signed integer values, the MASAR columns 102 and MASAR arrays 150 may also be used to perform two's-complement computations. Like unsigned numbers, N-bit two's complement numbers represent one of 2.sup.N possible values, although with a different range. Thus, for two's complement computations, 2.sup.N rows may be used for an N-bit output. However, for signed values, 2.sup.N+1 rows may be required for an N-bit output to account for the sign.
[0164]
[0165]
[0166]
[0167]
[0168]
[0169]
[0170] While exemplary embodiments are described above, it is not intended that these embodiments describe all possible forms encompassed by the claims. The words used in the specification are words of description rather than limitation, and it is understood that various changes can be made without departing from the spirit and scope of the disclosure. As previously described, the features of various embodiments can be combined to form further embodiments of the disclosure that may not be explicitly described or illustrated. While various embodiments could have been described as providing advantages or being preferred over other embodiments or prior art implementations with respect to one or more desired characteristics, those of ordinary skill in the art recognize that one or more features or characteristics can be compromised to achieve desired overall system attributes, which depend on the specific application and implementation. These attributes can include, but are not limited to cost, strength, durability, life cycle cost, marketability, appearance, packaging, size, serviceability, weight, manufacturability, ease of assembly, etc. As such, to the extent any embodiments are described as less desirable than other embodiments or prior art implementations with respect to one or more characteristics, these embodiments are not outside the scope of the disclosure and can be desirable for particular applications.