ULTRA-PRECISE TUNING OF ANALOG NEURAL MEMORY CELLS IN A DEEP LEARNING ARTIFICIAL NEURAL NETWORK

Abstract

Embodiments for ultra-precise tuning of a selected memory cell are disclosed. The selected memory cell optionally is first programmed using coarse programming and fine programming methods. The selected memory cell then undergoes ultra-precise programming through the programming of an adjacent memory cell. As the adjacent memory cell is programmed, capacitive coupling between the floating gate of the adjacent memory cell and the floating gate of the selected memory cell will cause the voltage of the floating gate of the selected memory cell to increase, but in smaller increments than could be achieved by programming the selected memory cell directly. In this manner, the selected memory cell can be programmed with ultra-precise gradations.

Claims

1. A method of programming a selected memory cell in a neural memory to a target value, comprising: programming a floating gate of the selected memory cell to a first voltage by applying a first sequence of voltages to terminals of the selected memory cell; and programming the floating gate of the selected memory cell to a second voltage through capacitive coupling between the floating gate of the selected memory cell and a floating gate of an adjacent tuning cell by applying a second sequence of voltages to terminals of the adjacent tuning cell, wherein the second voltage corresponds to the target value.

2. The method of claim 1, wherein the terminals of the selected memory cell comprise a bit line terminal coupled to a bit line, a source line terminal coupled to a source line, and a word line terminal coupled to a word line.

3. The method of claim 2, wherein the terminals of the selected memory cell further comprise a control gate terminal coupled to a control gate line.

4. The method of claim 3, wherein the control gate terminal of the selected memory cell is connected to a control gate line, and wherein the control gate line is connected to control gate terminals of a column of cells containing the selected memory cell and an adjacent column of cells.

5. The method of claim 3, wherein the terminals of the selected memory cell further comprise an erase gate terminal coupled to an erase gate line.

6. The method of claim 5, wherein the control gate line is orthogonal to the erase gate line.

7. The method of claim 5, wherein the control gate line is orthogonal to the source line.

8. The method of claim 2, wherein the selected memory cell is a split-gate memory cell.

9. The method of claim 2, wherein the selected memory cell is a stacked-gate memory cell.

10. The method of claim 1, wherein a charge equivalent to a sub-single electron is added to the floating gate of the selected memory cell during each programming pulse in the second sequence of voltages.

11. The method of claim 1, wherein the selected memory cell and the adjacent tuning cell are contained within a row comprising a plurality of pairs of adjacent data cells and tuning cells.

12. The method of claim 1, wherein the selected memory cell and the adjacent tuning cell are contained within an array comprising a plurality of pairs of adjacent data cells and tuning cells.

13. The method of claim 11, wherein a distance between adjacent data cells is greater than a distance between an adjacent data cell and tuning cell.

14. The method of claim 11, wherein half of the data cells in the row store a W+ value and half of the data cells in the row store a W− value.

15. The method of claim 14, wherein the half of the data cells storing a W+ value are used as tuning cells.

16. The method of claim 14, wherein half of the data cells storing a W− value are used as tuning cells.

17. The method of claim 11, wherein during a read operation, an adjacent pair of a data bitline and a tuning bitline are coupled to a sense amplifier.

18. The method of claim 17, wherein the data bitline and the tuning bitline are interchangeable.

19. The method of claim 1, wherein the step of programming a floating gate of the selected memory cell to a first voltage comprises coarse programming.

20. The method of claim 1, wherein the step of programming a floating gate of the selected memory cell to a first voltage comprises coarse programming and precision programming.

21. The method of claim 1, wherein if, during the step of programming a floating gate of the selected memory cell, the selected memory cell is over-programmed to a voltage exceeding the first voltage, the method further comprises erasing the selected memory cell.

22. The method of claim 1, wherein the floating gate of the selected memory cell and the floating gate of the adjacent tuning cell are partially overlapping.

23. A method of programming a first memory cell in a neural memory to a target value, comprising: programming a second memory cell by applying programming voltages to terminals of the second memory cell; and determining if an output of the first memory cell has reached the target value.

24. The method of claim 23, wherein the second memory cell is adjacent to the first memory cell.

25. The method of claim 23, wherein the first memory cell is coupled to a data bitline and the second memory cell is coupled to a tuning bitline.

26. The method of claim 25, further comprising: prior to the programming step, determining which bitline in a pair of adjacent bitlines contains greater noise and designating that bitline as the tuning bitline and designating the other bitline in the pair as the data bitline.

27. The method of claim 23, wherein the first memory cell and the second memory cell are contained in an analog neural memory array.

28. The method of claim 23, further comprising: if the output of the first memory cell has not reached the target value, repeating the programming and determining steps until the output of the first memory cell has reached the target value.

29. The method of claim 28, wherein, for each repeating step, one or more of the programming voltages applied to the terminals of the second memory cell is increased.

30. The method of claim 28, wherein the same programming voltages are used during each repeating step.

31. The method of claim 23, wherein the terminals of the second memory cell comprise a bit line terminal, a source line terminal, and a word line terminal.

32. The method of claim 31, wherein the terminals of the second memory cell further comprise a control gate terminal.

33. The method of claim 32, wherein the control gate terminal of the first memory cell is connected to a control gate line; and wherein the control gate line is connected to control gate terminals of a column of cells containing the first memory cell and an adjacent column of cells containing the second memory cell.

34. The method of claim 32, wherein the terminals of the second memory cell further comprise an erase gate terminal.

35. The method of claim 34, wherein the control gate is orthogonal to the erase gate.

36. The method of claim 34, wherein the control gate is orthogonal to the source line.

37. The method of claim 23, wherein the first memory cell and the second memory cell are split-gate memory cells.

38. The method of claim 23, wherein the first memory cell and the second memory cell are stacked-gate memory cells.

39. The method of claim 23, wherein a charge equivalent to a sub-single electron is added to the floating gate of the first memory cell during the programming step.

40. The method of claim 23, wherein the first memory cell and the second memory cell are contained within an array comprising a plurality of pairs of adjacent data cells and tuning cells.

41. The method of claim 40, wherein a distance between adjacent data cells is greater than a distance between an adjacent data cell and tuning cell.

42. The method of claim 23, wherein during the determining step, a bit line coupled to the first memory cell and a bit line coupled to the second memory cell are both coupled to a sense amplifier.

43. The method of claim 23, further comprising: performing coarse programming on the first memory cell.

44. The method of claim 23, further comprising: performing coarse programming on the first memory cell; and performing precision programming on the first memory cell.

45. The method of claim 23, wherein the floating gate of the first memory cell and the floating gate of the second memory cell are partially overlapping.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0045] FIG. 1 depicts a prior art artificial neural network.

[0046] FIG. 2 depicts a prior art split gate flash memory cell.

[0047] FIG. 3 depicts another prior art split gate flash memory cell.

[0048] FIG. 4 depicts another prior art split gate flash memory cell.

[0049] FIG. 5 depicts another prior art split gate flash memory cell.

[0050] FIG. 6 depicts another prior art split gate flash memory cell.

[0051] FIG. 7 depicts a prior art stacked gate flash memory cell.

[0052] FIG. 8 depicts a prior art twin split gate flash memory cell.

[0053] FIG. 9 depicts the different levels of an exemplary artificial neural network utilizing one or more non-volatile memory arrays.

[0054] FIG. 10 depicts a vector-by-matrix multiplication system.

[0055] FIG. 11 depicts an exemplary artificial neural network utilizing one or more vector-by-matrix multiplication systems.

[0056] FIG. 12 depicts an embodiment of a VMM system.

[0057] FIG. 13A depicts an embodiment of a method of programming a non-volatile memory cell.

[0058] FIG. 13B depicts another embodiment of a method of programming a non-volatile memory cell.

[0059] FIG. 14 depicts an embodiment of a coarse programming method.

[0060] FIG. 15 depicts exemplary pulses used in the programming of a non-volatile memory cell.

[0061] FIG. 16 depicts exemplary pulses used in the programming of a non-volatile memory cell.

[0062] FIG. 17 depicts a calibration algorithm for the programming of a non-volatile memory cell that adjusts the programming parameters based on slope characteristics of the cell.

[0063] FIG. 18 depicts a calibration algorithm for the programming of a non-volatile memory cell.

[0064] FIG. 19 depicts a calibration algorithm for the programming of a non-volatile memory cell.

[0065] FIG. 20 depicts the floating gate voltage of a selected memory cell during a sequence of programming pulses.

[0066] FIG. 21 depicts a VMM array that is capable of ultra-precise programming.

[0067] FIG. 22 depicts a cell layout for the VMM array of FIG. 21.

[0068] FIG. 23 depicts a VMM array that is capable of ultra-precise programming, where certain columns contain positive values (W+) and certain columns contain negative values (W−).

[0069] FIG. 24 depicts the floating gate voltage of a selected memory cell during a sequence of programming pulses.

[0070] FIG. 25A depicts a VMM array that is capable of ultra-precise programming, where adjacent cells are read together to reduce noise.

[0071] FIG. 25B depicts a schematic of adjacent cells read together by a sense amplifier.

[0072] FIG. 26 depicts an ultra-precise programming method.

[0073] FIG. 27 depicts another ultra-precise programming method.

[0074] FIG. 28 depicts the floating gate voltage of a selected memory cell and an adjacent tuning cell during a sequence of programming pulses.

DETAILED DESCRIPTION OF THE INVENTION

[0075] FIG. 12 depicts a block diagram of VMM system 1200. VMM system 1200 comprises VMM array 1201, row decoders 1202, high voltage decoders 1203, column decoders 1204, bit line drivers 1205, input circuit 1206, output circuit 1207, control logic 1208, and bias generator 1209. VMM system 1200 further comprises high voltage generation block 1210, which comprises charge pump 1211, charge pump regulator 1212, and high voltage level generator 1213. VMM system 1200 further comprises algorithm controller 1214, analog circuitry 1215, control logic 1216, and test control logic 1217. The systems and methods described below can be implemented in VMM system 1200.

[0076] Various levels of precision can be achieved during the programming process using coarse programming, precision programming, and ultra-precision programming.

[0077] As described herein for neural networks, the non-volatile memory cells of VMM array 1201, i.e., the flash memory of VMM array 1201, are preferably configured to operate in a sub-threshold region.

[0078] The non-volatile reference memory cells and the non-volatile memory cells described herein are biased in sub-threshold region:

$$I_{ds} = I_o \, e^{(V_g - V_{th})/(n V_t)} = w \, I_o \, e^{V_g/(n V_t)}, \qquad w = e^{-V_{th}/(n V_t)}$$

[0079] where Ids is the drain-to-source current; Vg is the gate voltage on the memory cell; Vth is the threshold voltage of the memory cell; Vt is the thermal voltage = k*T/q, with k being the Boltzmann constant, T the temperature in Kelvin, and q the electronic charge; n is a slope factor = 1+(Cdep/Cox), with Cdep the capacitance of the depletion layer and Cox the capacitance of the gate oxide layer; and Io is the memory cell current at a gate voltage equal to the threshold voltage. Io is proportional to (Wt/L)*u*Cox*(n−1)*Vt², where u is the carrier mobility and Wt and L are the width and length, respectively, of the memory cell.

[0080] For an I-to-V log converter using a memory cell (such as a reference memory cell or a peripheral memory cell) or a transistor to convert input current Ids, into an input voltage, Vg:

$$V_g = n \, V_t \, \ln\!\left[ \frac{I_{ds}}{w_p \, I_o} \right]$$

Here, wp is w of a reference or peripheral memory cell.

[0083] For a memory array used as a vector matrix multiplier VMM array, the output current is:

$$I_{out} = w_a \, I_o \, e^{V_g/(n V_t)}$$

namely

$$I_{out} = (w_a / w_p) \, I_{in} = W \, I_{in}$$

$$W = e^{(V_{thp} - V_{tha})/(n V_t)}$$

$$I_{in} = w_p \, I_o \, e^{V_g/(n V_t)}$$

[0084] Here, wa = w of each memory cell in the memory array.
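By way of a non-limiting illustration, the sub-threshold relationships above can be checked numerically. The following Python sketch is not part of the disclosed embodiments; the function names and the numeric values for n, Io, Vthp, and Vtha are assumptions chosen only for illustration. It converts an input current to a gate voltage using a peripheral cell, applies that voltage to an array cell, and confirms that the output current equals W*Iin.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin=300.0):
    """Vt = k*T/q."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def subthreshold_current(vg, vth, io, n, vt):
    """Ids = Io * exp((Vg - Vth) / (n * Vt))."""
    return io * math.exp((vg - vth) / (n * vt))


def log_converter_voltage(i_in, vth_p, io, n, vt):
    """Vg = n * Vt * ln(Iin / (wp * Io)), with wp = exp(-Vthp / (n * Vt))."""
    wp = math.exp(-vth_p / (n * vt))
    return n * vt * math.log(i_in / (wp * io))


def weight(vth_p, vth_a, n, vt):
    """W = exp((Vthp - Vtha) / (n * Vt))."""
    return math.exp((vth_p - vth_a) / (n * vt))


# Illustrative (assumed) values, not taken from the disclosure.
vt = thermal_voltage(300.0)
n, io = 1.5, 1e-7            # slope factor and Io in amperes
vth_p, vth_a = 1.00, 1.05    # peripheral and array cell thresholds in volts
i_in = 1e-9                  # input current in amperes

vg = log_converter_voltage(i_in, vth_p, io, n, vt)
i_out = subthreshold_current(vg, vth_a, io, n, vt)
w = weight(vth_p, vth_a, n, vt)
```

With these assumed values the array cell threshold is higher than the peripheral cell threshold, so W is less than 1 and the output current is a scaled-down copy of the input current.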

[0085] A wordline or control gate can be used as the input for the memory cell for the input voltage.

[0086] Alternatively, the non-volatile memory cells of VMM arrays described herein can be configured to operate in the linear region:

$$I_{ds} = \beta \, (V_{gs} - V_{th}) \, V_{ds}, \qquad \beta = u \, C_{ox} \, W_t / L$$

$$W \propto (V_{gs} - V_{th})$$

[0087] meaning that weight W in the linear region is proportional to (Vgs−Vth).

[0088] A wordline or control gate or bitline or sourceline can be used as the input for the memory cell operated in the linear region. The bitline or sourceline can be used as the output for the memory cell.

[0089] For an I-to-V linear converter, a memory cell (such as a reference memory cell or a peripheral memory cell) or a transistor operating in the linear region or a resistor can be used to linearly convert an input/output current into an input/output voltage.

[0090] Alternatively, the memory cells of VMM arrays described herein can be configured to operate in the saturation region:

$$I_{ds} = \tfrac{1}{2} \, \beta \, (V_{gs} - V_{th})^2, \qquad \beta = u \, C_{ox} \, W_t / L$$

$$W \propto (V_{gs} - V_{th})^2$$

meaning that weight W is proportional to (Vgs−Vth)².

[0091] A wordline, control gate, or erase gate can be used as the input for the memory cell operated in the saturation region. The bitline or sourceline can be used as the output for the output neuron.
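By way of a non-limiting illustration, the linear-region and saturation-region current expressions can be compared with a short Python sketch. The function names and the device values used to compute beta are assumptions for illustration only. The sketch shows that doubling the overdrive (Vgs−Vth) doubles the linear-region current and quadruples the saturation-region current.

```python
def beta(u, cox, wt, length):
    """beta = u * Cox * Wt / L."""
    return u * cox * wt / length


def linear_current(vgs, vth, vds, b):
    """Linear region: Ids = beta * (Vgs - Vth) * Vds."""
    return b * (vgs - vth) * vds


def saturation_current(vgs, vth, b):
    """Saturation region: Ids = 0.5 * beta * (Vgs - Vth)**2."""
    return 0.5 * b * (vgs - vth) ** 2


# Assumed device values, chosen only to make the example concrete.
b = beta(u=0.04, cox=5e-3, wt=1e-7, length=1e-7)

i_lin_small = linear_current(1.2, 1.0, 0.1, b)
i_lin_large = linear_current(1.4, 1.0, 0.1, b)
i_sat_small = saturation_current(1.2, 1.0, b)
i_sat_large = saturation_current(1.4, 1.0, b)
```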

[0092] Alternatively, the memory cells of VMM arrays described herein can be used in all regions or a combination thereof (sub threshold, linear, or saturation) for each layer or multi layers of a neural network.

[0093] Embodiments for Coarse Programming and Precision Programming of Cells in a VMM

[0094] FIG. 13A depicts programming method 1300 that utilizes coarse programming and precision programming. First, the method starts (step 1301), which typically occurs in response to a program command being received. Next, a mass program operation programs all cells to a ‘0’ state (step 1302). Then a soft erase operation erases all cells to an intermediate weakly erased level such that each cell would draw current of, for example, approximately 3-5 μA during a read operation (step 1303). This is in contrast to a deeply erased level where each cell would draw current of approximately 20-30 μA during a read operation. Then, a hard program is performed on all un-used cells to a very deep programmed state to add electrons to the floating gates of the cells (step 1304) to ensure that those cells are really “off,” meaning that those cells will draw a negligible amount of current during a read operation.

[0095] A coarse programming method (to get the cell much closer to the target, for example 1.2×-100× the target) is then performed on the selected cells (step 1305), followed by a precision programming method on the selected cells (step 1306) to program the precise value desired for each selected cell.

[0096] FIG. 13B depicts another programming method 1310, which is similar to programming method 1300 and also utilizes coarse programming and precision programming. However, instead of a program operation to program all cells to a ‘0’ state as in step 1302 of FIG. 13A, after the method starts (step 1301), an erase operation is used to erase all cells to a ‘1’ state (step 1312). Then a soft (weak) program operation (step 1313) is used to program all cells to an intermediate state (level) such that each cell would draw current of approximately 0.2-5 μA (e.g., 2×-100× the target) during a read operation. Afterward, the coarse and precision programming methods follow as in FIG. 13A. A variation of the embodiment of FIG. 13B would remove the soft programming method (step 1313) altogether.

[0097] FIG. 14 depicts a first embodiment of coarse programming method 1305, which is search and execute method 1400. First, a lookup table or a function search is performed to determine a coarse target current value (I.sub.CT) for the selected cell based on the value that is intended to be stored in that selected cell (step 1401). This table or function is, for example, created by silicon characterization or from calibration from wafer testing. It is assumed that the selected cell can be programmed to store one of N possible values (e.g., 128, 64, 32, without limitation). Each of the N values would correspond to a different desired current value (I.sub.D) that is drawn by the selected cell during a read operation. In one embodiment, a look-up table might contain M possible current values to use as the coarse target current value I.sub.CT for the selected cell during search and execute method 1400, where M is an integer less than N. For example, if N is 8, then M might be 4, meaning that there are 8 possible values that the selected cell can store, and one of 4 coarse target current values will be selected as the coarse target for search and execute method 1400. That is, search and execute method 1400 (which again is an embodiment of coarse programming method 1305) is intended to quickly program the selected cell to a coarse target current value (I.sub.CT) that is somewhat close to the desired current value (I.sub.D), and then the precision programming method 1306 is intended to more precisely program the selected cell to be extremely close to the desired current value (I.sub.D).

[0098] Examples of cell values, desired current values, and coarse target current values are depicted in Tables 9 and 10 for the simple example of N=8 and M=4:

TABLE NO. 9: Example of N Desired Current Values for N = 8

Value Stored in Selected Cell    Desired Current Value (I.sub.D)
000                              100 pA
001                              200 pA
010                              300 pA
011                              400 pA
100                              500 pA
101                              600 pA
110                              700 pA
111                              800 pA

TABLE NO. 10: Example of M Target Current Values for M = 4

Coarse Target Current Value (I.sub.CT)    Associated Cell Values
800 pA + I.sub.CTOFFSET1                  000, 001
1600 pA + I.sub.CTOFFSET2                 010, 011
2400 pA + I.sub.CTOFFSET3                 100, 101
3200 pA + I.sub.CTOFFSET4                 110, 111
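By way of a non-limiting illustration, the lookup of step 1401 can be sketched in Python using the values of Tables 9 and 10. The table contents are taken from the tables; the offset values in `CT_OFFSETS_PA` and all identifiers are hypothetical, since the disclosure does not give numeric offsets.

```python
# Table 9: value stored in the selected cell -> desired current I_D (pA).
DESIRED_CURRENT_PA = {
    0b000: 100, 0b001: 200, 0b010: 300, 0b011: 400,
    0b100: 500, 0b101: 600, 0b110: 700, 0b111: 800,
}

# Table 10: base coarse target current (pA) for each group of two cell values.
COARSE_BASE_PA = [800, 1600, 2400, 3200]

# Hypothetical offsets I_CTOFFSET1..4 (pA); assumed for illustration only.
CT_OFFSETS_PA = [40, 40, 40, 40]


def coarse_target_pa(value):
    """Return I_CT for a 3-bit cell value (N = 8 values, M = 4 targets)."""
    group = value // 2  # two adjacent cell values share one coarse target
    return COARSE_BASE_PA[group] + CT_OFFSETS_PA[group]
```

For example, `coarse_target_pa(0b011)` returns the Table 10 entry for cell values 010 and 011 plus the assumed offset.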

[0099] The offset values I.sub.CTOFFSETx are used to prevent overshooting the desired current value during coarse tuning. Once the coarse target current value I.sub.CT is selected, the selected cell is programmed by applying the voltage v.sub.0 to the appropriate terminal of selected cell based on the cell architecture type of the selected cell (e.g., memory cells 210, 310, 410, or 510) (step 1402). If the selected cell is of type memory cell 310 in FIG. 3, then the voltage v.sub.0 will be applied to control gate terminal 28 (and/or source line 14), and v.sub.0 might be for example 5-8V depending on coarse target current value I.sub.CT. The value of v.sub.0 optionally can be determined from a voltage look up table that stores v.sub.0 vs. coarse target current value I.sub.CT.

[0100] Next, the selected cell is programmed by applying the voltage v.sub.i=v.sub.i-1+v.sub.increment, where i starts at 1 and increments each time this step is repeated, and where v.sub.increment is a small, fine voltage that will cause a degree of programming that is appropriate for the granularity of change desired (step 1403). Thus, the first time step 1403 is performed, i=1, and v.sub.1 will be v.sub.0+v.sub.increment. Then a verify operation occurs (step 1404), wherein a read operation is performed on the selected cell and the current drawn through the selected cell (I.sub.cell) is measured. If I.sub.cell is less than or equal to I.sub.CT (which here is a first threshold value), then search and execute method 1400 is complete and precision programming method 1306 can begin. If I.sub.cell is not less than or equal to coarse target current value I.sub.CT, then step 1403 is repeated, and i is incremented.
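By way of a non-limiting illustration, steps 1402-1404 can be sketched as a program-and-verify loop in Python. The `ToyCell` class is a hypothetical stand-in for a real memory cell: it assumes that read current falls exponentially with the highest programming voltage applied, which is an assumption made only so that the loop can be run. The function and parameter names are likewise illustrative.

```python
import math


class ToyCell:
    """Hypothetical cell model (not from the disclosure)."""

    def __init__(self, erased_current_pa=5_000_000.0, onset_v=4.0,
                 volts_per_efold=0.25):
        self.erased_current_pa = erased_current_pa
        self.onset_v = onset_v
        self.volts_per_efold = volts_per_efold
        self.current_pa = erased_current_pa

    def program(self, voltage):
        # Programming only ever lowers the read current in this toy model.
        if voltage > self.onset_v:
            level = self.erased_current_pa * math.exp(
                -(voltage - self.onset_v) / self.volts_per_efold)
            self.current_pa = min(self.current_pa, level)

    def read(self):
        return self.current_pa


def search_and_execute(cell, i_ct_pa, v0, v_increment, max_pulses=1000):
    """Coarse-program `cell` until its read current is <= i_ct_pa."""
    v = v0
    cell.program(v)                    # step 1402: apply v_0
    for i in range(1, max_pulses + 1):
        v += v_increment               # step 1403: v_i = v_(i-1) + v_increment
        cell.program(v)
        if cell.read() <= i_ct_pa:     # step 1404: verify against I_CT
            return v, i
    raise RuntimeError("coarse target not reached")


cell = ToyCell()
last_voltage, pulses = search_and_execute(cell, i_ct_pa=900.0, v0=5.0,
                                          v_increment=0.05)
```

The returned voltage corresponds to the last programming voltage v.sub.i, which a subsequent precision programming stage could use as its starting point.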

[0101] Thus, at the point when coarse programming method 1305 ends and precision programming method 1306 begins, the voltage v.sub.i will be the last voltage used to program the selected cell, and the selected cell will be storing a value associated with the coarse target current value I.sub.CT. The goal of precision programming method 1306 is to program the selected cell to the point where during a read operation it draws a current I.sub.D (plus or minus an acceptable amount of deviation, such as +/−50 pA or +/−30% or less), which is the desired current value that is associated with the value that is intended to be stored in the selected cell.

[0102] FIG. 15 depicts examples of different voltage progressions that can be applied to the control gate of a selected memory cell during coarse programming method 1305 and/or precision programming method 1306.

[0103] Under a first approach, increasing voltages are applied in progression to the control gate to further program the selected memory cell. The starting point is v.sub.i, which is approximately the last voltage (+ or − some delta voltage as desired or depending on target current) applied during coarse programming method 1305. An increment of v.sub.p1 is added to v.sub.i and the voltage v.sub.i+v.sub.p1 is then used to program the selected cell (indicated by the second pulse from the left in progression 1501). v.sub.p1 is an increment that is smaller than v.sub.increment (the voltage increment used during coarse programming method 1305). After each programming voltage is applied, a verify step (similar to step 1404) is performed, where a determination is made if I.sub.cell is less than or equal to I.sub.PT1 (which is the first precision target current value and here is a second threshold value), where I.sub.PT1=I.sub.D+I.sub.PT1OFFSET, and where I.sub.PT1OFFSET is an offset value added to prevent program overshoot. If it is not, then another increment v.sub.p1 is added to the previously-applied programming voltage, and the process is repeated. At the point where I.sub.cell is less than or equal to I.sub.PT1, this portion of the programming sequence stops. Optionally, if I.sub.PT1 is equal to I.sub.D, or almost equal to I.sub.D with sufficient precision, then the selected memory cell has been successfully programmed.

[0104] If I.sub.PT1 is not close enough to I.sub.D, then further programming of a smaller granularity can occur. Here, progression 1502 is now used. The starting point for progression 1502 is approximately the last voltage (+ or − some delta voltage as desired or depending on target current) used for programming under progression 1501. An increment of v.sub.p2 (which is smaller than v.sub.p1) is added to that voltage, and the combined voltage is applied to program the selected memory cell. After each programming voltage is applied, a verify step (similar to step 1404) is performed, where a determination is made if I.sub.cell is less than or equal to I.sub.PT2 (which is the second precision target current value and here is a third threshold value), where I.sub.PT2=I.sub.D+I.sub.PT2OFFSET, and where I.sub.PT2OFFSET is an offset value added to prevent program overshoot. If it is not, then another increment v.sub.p2 is added to the previously-applied programming voltage, and the process is repeated. Typically, I.sub.PT2OFFSET<I.sub.PT1OFFSET, since the programming steps become smaller and more precise with each round. At the point where I.sub.cell is less than or equal to I.sub.PT2, this portion of the programming sequence stops. Here, it is assumed that I.sub.PT2 is equal to I.sub.D or close enough to I.sub.D that the programming can stop, since the target value has been achieved with sufficient precision. One of ordinary skill in the art can appreciate that additional progressions can be applied with smaller and smaller programming increments used. For example, in FIG. 16, three progressions (1601, 1602, and 1603) are applied instead of just two.
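By way of a non-limiting illustration, successive voltage progressions with decreasing increments and decreasing offsets can be sketched in Python. The `ToyCell` class is a hypothetical cell model that assumes read current falls exponentially with the highest programming voltage applied; the stage values and all identifiers are assumptions for illustration. Unlike the described progressions, this sketch verifies before applying the first pulse of each stage, a simplification that skips a stage whose target has already been met.

```python
import math


class ToyCell:
    """Hypothetical cell model (not from the disclosure)."""

    def __init__(self, erased_current_pa=5_000_000.0, onset_v=4.0,
                 volts_per_efold=0.25):
        self.erased_current_pa = erased_current_pa
        self.onset_v = onset_v
        self.volts_per_efold = volts_per_efold
        self.current_pa = erased_current_pa

    def program(self, voltage):
        if voltage > self.onset_v:
            level = self.erased_current_pa * math.exp(
                -(voltage - self.onset_v) / self.volts_per_efold)
            self.current_pa = min(self.current_pa, level)

    def read(self):
        return self.current_pa


def precision_program(cell, v_start, i_d_pa, stages, max_pulses=10_000):
    """Run each (v_step, offset_pa) stage until I_cell <= I_D + offset."""
    v = v_start
    for v_step, offset_pa in stages:
        target_pa = i_d_pa + offset_pa   # I_PTk = I_D + I_PTkOFFSET
        pulses = 0
        while cell.read() > target_pa:   # verify, similar to step 1404
            v += v_step                  # add increment v_pk
            cell.program(v)
            pulses += 1
            if pulses > max_pulses:
                raise RuntimeError("precision target not reached")
    return v


cell = ToyCell()
cell.program(6.15)  # stand-in for the end state of coarse programming
final_v = precision_program(cell, v_start=6.15, i_d_pa=500.0,
                            stages=[(0.01, 60.0), (0.002, 10.0)])
```

In this sketch the second stage uses both a smaller increment and a smaller offset than the first, so the final read current lands just below I.sub.D plus the second offset without passing below I.sub.D.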

[0105] A second approach is shown in progression 1503 in FIG. 15 and progression 1604 in FIG. 16. Here, instead of increasing the voltage applied during the programming of the selected memory cell, the same voltage is applied for pulses of increasing duration. Instead of adding an incremental voltage such as v.sub.p1 in progression 1501 and v.sub.p2 in progression 1502, an additional increment of time is added to the programming pulse such that each applied pulse is longer than the previously-applied pulse by that increment. After each programming pulse is applied, a verify step (similar to step 1404) is performed. Optionally, additional progressions can be applied where the additional increment of time added to the programming pulse is of a smaller duration than in the previous progression used. Although only one temporal progression is shown, one of ordinary skill in the art will appreciate that any number of different temporal progressions can be applied.

[0106] Alternatively, the duration of each pulse can be the same for pulse progressions 1503 and 1604, and the system can rely on the number of pulses to perform additional programming.
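By way of a non-limiting illustration, the two temporal approaches (pulses of increasing duration, and pulses of constant duration counted until verify passes) can be sketched as pulse-width schedules in Python. All identifiers, the nanosecond units, and the numeric values are hypothetical; `apply_pulse` and `verify` are caller-supplied callbacks standing in for hardware operations.

```python
def increasing_width_schedule(initial_ns, increment_ns, count):
    """Each pulse is longer than the previous one by a fixed increment."""
    return [initial_ns + k * increment_ns for k in range(count)]


def constant_width_schedule(width_ns, count):
    """Every pulse has the same duration; only the pulse count matters."""
    return [width_ns] * count


def program_with_schedule(apply_pulse, verify, voltage, widths_ns):
    """Apply same-voltage pulses until verify() passes.

    Returns the number of pulses used, or None if the schedule is exhausted.
    """
    for count, width_ns in enumerate(widths_ns, start=1):
        apply_pulse(voltage, width_ns)
        if verify():   # verify after each pulse, similar to step 1404
            return count
    return None


widths = increasing_width_schedule(initial_ns=1000, increment_ns=100, count=5)
```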

[0107] Additional detail will now be provided for three additional embodiments of coarse programming method 1305.

[0108] FIG. 17 depicts a first embodiment of coarse programming method 1305, which is adaptive calibration method 1700. The method starts (step 1701). The cell is programmed at a default start value v.sub.0 (step 1702). Unlike in search and execute method 1400, here v.sub.0 is not derived from a lookup table, and instead can be a pre-determined relatively small initial value. The control gate voltage of the cell is measured at a first current value IR1 (e.g., 100 nA) and a second current value IR2 (e.g., 10 nA), and a sub-threshold slope is determined based on those measurements (e.g., 360 mV/dec) and stored (step 1703).

[0109] A new desired voltage, v.sub.i, is determined. The first time this step is performed, i=1, and v.sub.1 is determined based on the stored sub-threshold slope value and a current target and offset value using a sub-threshold equation, such as the following:

$$v_i = v_{i-1} + v_{increment},$$

where v.sub.increment is proportional to the slope of Vg:

$$V_g = n \, V_t \, \ln\!\left[ \frac{I_{ds}}{w_a \, I_o} \right]$$

Here, wa is w of a memory cell, and Ids is the current target plus the offset value.

[0110] If the stored sub-threshold slope value is relatively steep, then a relatively small current offset value can be used. If the stored sub-threshold slope value is relatively flat, then a relatively high current offset value can be used. Thus, determining the sub-threshold slope value will allow for a current offset value to be selected that is customized for the particular cell in question. This ultimately will make the programming process shorter. When this step is repeated, i is incremented, and v.sub.i=v.sub.i-1+v.sub.increment. The cell is then programmed using v.sub.i. v.sub.increment can be determined, for example, from a lookup table storing values of v.sub.increment vs. desired current value (I.sub.D).

[0111] Next, a verify operation is performed, wherein a read operation is performed on the selected cell and the current drawn through the selected cell (I.sub.cell) is measured (step 1705). If I.sub.cell is less than or equal to coarse target current value I.sub.CT, where I.sub.CT is set to I.sub.D+I.sub.CTOFFSET and I.sub.CTOFFSET is an offset value added to prevent program overshoot, then adaptive calibration method 1700 is complete and precision programming method 1306 can begin. If I.sub.cell is not less than or equal to coarse target current value I.sub.CT, then steps 1704-1705 are repeated, and i is incremented.

[0112] FIG. 18 depicts a second embodiment of coarse programming method 1305, which is adaptive calibration method 1800. The method starts (step 1801). The cell is programmed at a default start value v.sub.0 (step 1802). v.sub.0 is derived from a lookup table such as created from silicon characterization, where the table value includes an offset so as not to overshoot the programmed target.

[0113] In step 1803, a slope parameter is created, which is used in predicting the next programming voltage. A first control gate read voltage, V.sub.CGR1, is applied to the selected cell, and the resulting cell current, IR.sub.1, is measured. Then a second control gate read voltage, V.sub.CGR2, is applied to the selected cell, and the resulting cell current, IR.sub.2, is measured. A slope is determined based on those measurements and stored (step 1803), for example according to the following equation for a cell operating in the sub-threshold region:

$$slope = \frac{V_{CGR1} - V_{CGR2}}{\log(IR_1) - \log(IR_2)}$$

Examples of values for V.sub.CGR1 and V.sub.CGR2 are 1.5V and 1.3V, respectively.
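By way of a non-limiting illustration, the slope calculation of step 1803 can be sketched in Python. The sketch assumes that LOG denotes a base-10 logarithm, so that the slope is expressed in volts per decade of current; the function name and the example currents are hypothetical, while the read voltages of 1.5V and 1.3V are the example values given above.

```python
import math


def subthreshold_slope(v_cgr1, i_r1, v_cgr2, i_r2):
    """slope = (V_CGR1 - V_CGR2) / (log10(IR1) - log10(IR2)), in V/decade."""
    return (v_cgr1 - v_cgr2) / (math.log10(i_r1) - math.log10(i_r2))


# Assumed measurements: 100 nA at 1.5 V and 10 nA at 1.3 V.
slope = subthreshold_slope(1.5, 100e-9, 1.3, 10e-9)
```

With these assumed measurements the current changes by one decade over 0.2 V, so the computed slope is approximately 0.2 V per decade.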

[0114] Determining the slope information allows for a v.sub.increment value to be selected that is customized for the particular cell in question. This ultimately will make the programming process shorter.

[0115] When step 1804 is repeated, i is incremented, and a new desired programming voltage, v.sub.i, is determined based on the stored slope value and a current target I.sub.CT and offset value using an equation such as the following:

$$v_i = v_{i-1} + v_{increment},$$

[0116] where for i−1,

$$v_{increment} = \alpha \cdot slope \cdot (\log(IR_1) - \log(I_{CT})),$$

where I.sub.CT is the coarse target current and alpha is a pre-determined constant <1 (a programming offset value) used to prevent overshoot, e.g., 0.9.
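By way of a non-limiting illustration, the increment calculation can be sketched in Python. The sketch assumes that LOG denotes a base-10 logarithm and that the slope is expressed in volts per decade; the function names and example values other than alpha = 0.9 are hypothetical.

```python
import math


def voltage_increment(slope, i_r1, i_ct, alpha=0.9):
    """v_increment = alpha * slope * (log10(IR1) - log10(I_CT))."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha is expected to be a constant below 1")
    return alpha * slope * (math.log10(i_r1) - math.log10(i_ct))


def next_program_voltage(v_prev, slope, i_r1, i_ct, alpha=0.9):
    """v_i = v_(i-1) + v_increment."""
    return v_prev + voltage_increment(slope, i_r1, i_ct, alpha)


# Assumed values: slope of 0.2 V/decade, 100 nA measured, 1 nA coarse target.
v_inc = voltage_increment(slope=0.2, i_r1=100e-9, i_ct=1e-9)
v_next = next_program_voltage(6.0, slope=0.2, i_r1=100e-9, i_ct=1e-9)
```

Here the cell current must fall by two decades, which at 0.2 V per decade would suggest 0.4 V; scaling by alpha = 0.9 gives a deliberately smaller step of about 0.36 V to reduce the risk of overshoot.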

[0117] The cell is then programmed using v.sub.i (step 1805). Here, v.sub.i can be applied to the source line terminal, control gate terminal, or erase gate terminal of the selected cell, depending on the programming scheme used.

[0118] Next, a verify operation occurs, wherein a read operation is performed on the selected cell and the current drawn through the selected cell (I.sub.cell) is measured (step 1806). If I.sub.cell is less than or equal to coarse target threshold value I.sub.CT, where coarse target threshold value I.sub.CT is set to I.sub.D+I.sub.CTOFFSET and I.sub.CTOFFSET is an offset value added to prevent program overshoot, then the process proceeds to step 1807. If not, then the process returns to step 1804 and i is incremented.

[0119] In step 1807, I.sub.cell is compared against a threshold value, I.sub.CT2, that is smaller than coarse target threshold value I.sub.CT. The purpose of this is to see if an overshoot has occurred. That is, although the goal is for I.sub.cell to be below coarse target threshold value I.sub.CT, if it falls too far below coarse target threshold value I.sub.CT, then an overshoot has occurred and the stored value may actually correspond to the wrong value. If I.sub.cell is not less than or equal to I.sub.CT2, then no overshoot has occurred, and adaptive calibration method 1800 has completed, at which point the process progresses to precision programming method 1306. If I.sub.cell is less than or equal to I.sub.CT2, then an overshoot has occurred. The selected cells are then erased (step 1808), and the programming process starts over at step 1802 with an adjusted v.sub.increment, such as a smaller value chosen depending on the amount of overshoot. Optionally, if step 1808 is performed more than a predetermined number of times, the selected cell can be deemed a bad cell that should not be used.
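By way of a non-limiting illustration, the two-threshold decision of steps 1806-1808 can be sketched in Python. The function names, the returned labels, and the erase limit of 3 are hypothetical; the disclosure refers only to "a predetermined number of times".

```python
def check_coarse_result(i_cell, i_ct, i_ct2):
    """Classify a verify result using thresholds I_CT and I_CT2 (< I_CT)."""
    if i_ct2 >= i_ct:
        raise ValueError("I_CT2 must be smaller than I_CT")
    if i_cell > i_ct:
        return "program_again"   # step 1806 not satisfied: back to step 1804
    if i_cell <= i_ct2:
        return "overshoot"       # step 1807: erase (step 1808), restart at 1802
    return "coarse_done"         # proceed to precision programming


def is_bad_cell(erase_count, max_erases=3):
    """Deem the cell bad if step 1808 ran more than an assumed limit."""
    return erase_count > max_erases
```

For example, with I.sub.CT = 900 pA and I.sub.CT2 = 700 pA, a read current of 800 pA would be classified as complete, while 650 pA would be classified as an overshoot.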

[0120] The precision program method 1306 consists of multiple verify and program cycles, in which either the program voltage is incremented by a constant fine voltage with a fixed pulse width, or the program voltage is fixed and the program pulse width is varied or held constant for subsequent pulses, as described above in relation to FIG. 15.

[0121] Optionally, the step of determining if the current through the selected non-volatile memory cell during a read or verify operation is less than or equal to the first threshold current value, I.sub.CT, can be performed by applying a fixed bias to a terminal of the non-volatile memory cell, measuring and digitizing the current drawn by the selected non-volatile memory cell to generate digital output bits, and comparing the digital output bits to digital bits representing the first threshold current value, I.sub.CT.

[0122] Optionally, the step of determining if the current through the selected non-volatile memory cell during a read or verify operation is less than or equal to the first threshold current value, I.sub.CT, can be performed by applying an input to a terminal of the non-volatile memory cell, modulating the current drawn by the non-volatile memory cell with an input pulse to generate a modulated output, digitizing the modulated output to generate digital output bits, and comparing the digital output bits to digital bits representing the first threshold current, I.sub.CT.

[0123] FIG. 19 depicts a third embodiment of programming method 1305, which is absolute calibration method 1900. The method starts (step 1901). The cell is programmed at a default starting value v.sub.0 (step 1902). The control gate voltage of the cell (VCGRx) is measured at a current target value Itarget and stored (step 1903). A new desired voltage, v.sub.1, is determined based on the stored control gate voltage and the current target value Itarget and an offset value, Itarget+Ioffset (step 1904). For example, the new desired voltage, v.sub.1, can be calculated as follows: v.sub.1=v.sub.0+theta*(VCGBIAS−stored VCGR), where theta is about 1, VCGBIAS is the default read control gate voltage at a maximum target current, typically ˜1.5V, and stored VCGR is the measured read control gate voltage of step 1903. In short, the updated program voltage is adjusted based on the difference between the measured control gate voltage and the target control gate voltage.

[0124] The cell is then programmed using v.sub.i (step 1905). When i=1, the voltage v.sub.1 from step 1904 is used. When i>=2, the voltage v.sub.i=v.sub.i-1+V.sub.increment is used. V.sub.increment can be determined from a lookup table storing values of V.sub.increment vs. target current value, I.sub.CT. Next, a verify operation occurs, wherein a read operation is performed on the selected cell and the current drawn through the selected cell (I.sub.cell) is measured (step 1906). If I.sub.cell is less than or equal to coarse target current value I.sub.CT, then absolute calibration method 1900 is complete and precision programming method 1306 can begin. If I.sub.cell is not less than or equal to coarse target current value I.sub.CT, then steps 1905-1906 are repeated, and i is incremented.
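
The voltage selection of absolute calibration method 1900 can be sketched as follows. The formula for v.sub.1 is taken from the text above; the lookup table contents, its organization by upper current bounds, and all function names are hypothetical and serve only to show the shape of the calculation.

```python
import bisect

def first_voltage(v0, vcg_bias, vcgr_stored, theta=1.0):
    """Step 1904: v1 = v0 + theta * (VCGBIAS - stored VCGR)."""
    return v0 + theta * (vcg_bias - vcgr_stored)

# Hypothetical lookup table of V_increment versus target current I_CT.
# I_CT_BOUNDS holds upper bounds in amperes; V_INCREMENTS has one more entry
# so that targets above the last bound still map to a value.
I_CT_BOUNDS = [1e-9, 10e-9, 100e-9]
V_INCREMENTS = [0.05, 0.10, 0.20, 0.30]      # volts (made-up values)

def lookup_increment(i_ct):
    """Return the tabulated V_increment for the given target current."""
    return V_INCREMENTS[bisect.bisect_left(I_CT_BOUNDS, i_ct)]

def nth_voltage(i, v1, v_prev, i_ct):
    """Program voltage for pulse i: v1 when i == 1, else v_(i-1) + V_increment."""
    return v1 if i == 1 else v_prev + lookup_increment(i_ct)

# Hypothetical example: v0 = 5.0 V, VCGBIAS = 1.5 V, stored VCGR = 1.3 V.
v1 = first_voltage(5.0, 1.5, 1.3)
v2 = nth_voltage(2, v1, v1, i_ct=5e-9)
```

A sorted table searched with bisect is only one possible organization; the specification says only that the increment is looked up against the target current.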

[0125] Alternatively, coarse and/or fine program methods may comprise incrementing the program voltage on one terminal (e.g., CG) and decreasing the voltage on another terminal (e.g., EG or SL) for more precise control of charge coupling into the floating gate.

[0126] The coarse and precision programming methods described thus far will be precise enough for most systems. However, even these methods have a limit on their precision. Ultimate precision can be understood to be one electron precision. FIG. 20 depicts data collected by Applicant for a working embodiment of the coarse and precision programming methods described above. FIG. 20 depicts the floating gate voltage against the number of programming pulses executed in an attempt to program floating gate voltage 2002 as close to target 2001 as possible. As can be seen, floating gate voltage 2002 is able to approximate target 2001 within +/−4 mV, which is equivalent to the charge of a single electron added to the floating gate. This might not be sufficiently precise for certain systems. For example, if N (the number of different values that can be held on any floating gate) is large (e.g., 512), then greater precision than +/−4 mV may be required, meaning that sub-electron (fractional electron) precision programming may be required; that is, a mechanism is needed where the floating gate voltage can be adjusted in increments smaller than +/−4 mV, as if a fraction of an electron were added to or subtracted from the floating gate. Also, it can be seen that the increments in floating gate voltage during the programming process are not uniform and predictable, which means that the system will not always achieve a given target voltage with complete precision. This may be due to the statistical nature of the programming physics.
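
The single-electron figure can be put in perspective with a short calculation. The sketch below reads the statement above as meaning that one electron shifts the floating gate voltage by about 4 mV and uses the relation delta V = q / C to back out the implied total floating gate capacitance; that interpretation and the function names are assumptions, not statements from the specification.

```python
ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs (exact SI value)

def implied_capacitance(delta_v_per_electron):
    """Capacitance for which one electron shifts the voltage by delta_v."""
    return ELEMENTARY_CHARGE / delta_v_per_electron

def voltage_step(electron_fraction, capacitance):
    """Voltage change equivalent to a given fraction of one electron."""
    return electron_fraction * ELEMENTARY_CHARGE / capacitance

c_fg = implied_capacitance(4e-3)      # about 4.0e-17 F, roughly 40 aF
quarter = voltage_step(0.25, c_fg)    # about 1 mV
```

Under this reading, an adjustment of about 1 mV corresponds to the effect of a quarter of an electron, which cannot be obtained by adding whole electrons to the selected floating gate directly.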

[0127] Embodiments for Ultra-Precision Programming of Cells in a VMM

[0128] FIGS. 21-28 depict embodiments of an ultra-precision programming method and system that allow for more precise programming than can be achieved through coarse programming method 1305 and precision programming method 1306 alone. The ultra-precision programming method and system enable the voltage of the floating gate of a selected memory cell to increase by the charge equivalent to a fraction of a single electron (a sub-single electron) being added to the floating gate for each programming pulse, which is the finest programming increment that is physically possible.

[0129] FIG. 21 depicts VMM array 2100, which comprises rows and columns of non-volatile memory cells. Here, part of one row of memory cells is shown, specifically, data memory cell 2101, data memory cell 2102, tuning memory cell 2103, tuning memory cell 2104, data memory cell 2105, and data memory cell 2106, which are coupled to bit lines 2121, 2122, 2123 (Tuning BL), 2124 (Tuning BL), 2125, and 2126, respectively, and to a control gate line and optionally to a word line, erase line, and/or source line, depending on which type of memory cell architecture is used (e.g., memory cells 210, 310, 410, 510, 610, 710, or 810). Data memory cell 2101 comprises floating gate 2111, data memory cell 2102 comprises floating gate 2112, tuning memory cell 2103 comprises floating gate 2113, tuning memory cell 2104 comprises floating gate 2114, data memory cell 2105 comprises floating gate 2115, and data memory cell 2106 comprises floating gate 2116. Tuning BL 2123 and 2124, respectively, are the bitlines used to perform ultra-precision tuning for an adjacent bitline.

[0130] As is typically the case, there is a capacitive coupling between adjacent floating gates in the same row as shown in FIG. 21. VMM array 2100 utilizes this phenomenon to achieve ultra-precise programmability. In this example, data memory cell 2102 is used to store data, but tuning memory cell 2103 is not used to store data and instead is used solely to assist in programming cell 2102 through capacitive coupling. Similarly, data memory cell 2105 is used to store data but tuning memory cell 2104 is used solely to assist in programming cell 2105 through capacitive coupling. Data memory cells 2101 and 2106 are used to store data and their programming is assisted by adjacent tuning cells not shown.

[0131] In one embodiment, notably, when data memory cells are adjacent to one another, they are separated by a distance d.sub.2, as shown for data memory cells 2101 and 2102, and for data memory cells 2105 and 2106. However, when a data memory cell is adjacent to a tuning memory cell, they are separated by a distance d.sub.1, as shown for data memory cell 2102 and adjacent tuning memory cell 2103, and data memory cell 2105 and adjacent tuning memory cell 2104. When a tuning memory cell is adjacent to another tuning memory cell, they can be separated by a distance d.sub.1 or d.sub.2, as shown for tuning memory cells 2103 and 2104. By design, d.sub.2>d.sub.1. As a result, the capacitive coupling between cells that are separated by a distance d.sub.2 has a capacitance of C.sub.2, while the capacitive coupling between cells that are separated by a distance d.sub.1 has a capacitance of C.sub.1, where C.sub.1>C.sub.2. That is, the capacitance is greater between cells that are closer to one another. Further, the distance d.sub.1 may be designed to achieve a desired value of C.sub.1 to optimize the effect of the tuning memory cell on the data memory cell and thus the final programming precision.
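
As a first-order illustration only, the relation between spacing and coupling capacitance can be written out by idealizing the facing floating gate sidewalls as a parallel-plate capacitor. This model is an assumption (the specification gives no capacitance model, and fringing fields are ignored); the symbols for the dielectric permittivity, \varepsilon, and the facing sidewall area, A, are hypothetical.

```latex
C_1 \approx \frac{\varepsilon A}{d_1}, \qquad
C_2 \approx \frac{\varepsilon A}{d_2}, \qquad
\frac{C_1}{C_2} \approx \frac{d_2}{d_1} > 1 \quad \text{since } d_2 > d_1 .
```

Under this idealization, halving the spacing between a data memory cell and its tuning memory cell would roughly double the coupling capacitance between their floating gates.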

[0132] If data memory cell 2102 is the selected cell and it is desired to program data memory cell 2102 to a certain target value, data memory cell 2102 can be programmed to a certain degree using the coarse and precision programming methods described previously with reference to FIGS. 13-20. However, before the target value is achieved or exceeded, the coarse programming and precision programming are stopped, and an ultra-precise programming method is implemented instead.

[0133] Specifically, tuning memory cell 2103 is programmed using coarse and precision programming methods. Due to capacitive coupling, as tuning memory cell 2103 is programmed, the charge in floating gate 2113 will cause the charge on floating gate 2112 to also increase, but by a lesser amount than the increase in charge of floating gate 2113. Through this mechanism, floating gate 2112 will increase by a finer increment than occurs in floating gate 2113 or which could be achieved by programming cell 2102 directly using coarse and precision programming methods 1305 and 1306. In this case, programming is performed on the tuning memory cell 2103 but a verify operation only needs to be performed on data memory cell 2102. Once the target value is achieved in data memory cell 2102, floating gate 2113 is maintained in its state of charge so that floating gate 2112 remains at the target value.
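
The program-the-neighbor, verify-the-data-cell loop can be sketched with a simple linear coupling model. This is a minimal sketch under stated assumptions: the coupling ratio k (the fraction of each voltage step on floating gate 2113 that appears on floating gate 2112), the uniform 4 mV step per pulse, and the function name are hypothetical, and real programming steps are not uniform.

```python
def ultra_precision_tune(v_data, v_tune, v_target,
                         step=4e-3, k=0.1, max_pulses=1000):
    """Pulse the tuning cell until the data cell reaches the target.

    v_data, v_tune: floating gate voltages of the data and tuning cells.
    step: voltage added to the tuning floating gate per pulse (assumed).
    k: hypothetical coupling ratio, 0 < k < 1.
    """
    pulses = 0
    while v_data < v_target and pulses < max_pulses:   # verify data cell only
        v_tune += step              # program pulse applied to the tuning cell
        v_data += k * step          # coupled, smaller shift on the data cell
        pulses += 1
    return v_data, v_tune, pulses

# Hypothetical example: the data cell starts 1.9 mV below its target.
v_data, v_tune, pulses = ultra_precision_tune(1.0000, 1.0000, 1.0019)
```

In this toy example each pulse moves the data cell by 0.4 mV rather than 4 mV, so the final error is bounded by the smaller coupled step instead of the direct programming step.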

[0134] FIG. 22 depicts exemplary layout 2200 for data memory cell 2102, tuning memory cell 2103, tuning memory cell 2104, and data memory cell 2105 for a system with bi-directional tuning (meaning both programming and erasing can be used for tuning purposes due to the erase gate being horizontal and the control gate being vertical, meaning the control gate is orthogonal to the erase gate; similarly, the control gate or word line can be orthogonal to the source line). Here, CG gates are shared for two adjacent columns vertically. In the case of a cell which may utilize bi-directional tuning, the ultra-precision tuning may be accomplished by erasing or programming the tuning cell or both.

[0135] FIG. 23 depicts an alternative embodiment that utilizes VMM array 2100. Here, memory cell 2102 is used to store a positive value (W+), and memory cell 2105 is used to store a negative value (W−), and these together store the value W, where W=(W+)−(W−), which can be achieved by a subtraction circuit during a read or verify operation. In this embodiment, a tuning cell, such as tuning cells 2103 and 2104, is programmed with a weight used to tune the adjacent data cell to an opposite weight. Accordingly, for example, bit line 2123 programs tuning memory cell 2103 to cause capacitive coupling to decrease the voltage of floating gate 2112 in cell 2102, and bit line 2124 programs tuning memory cell 2104 to cause capacitive coupling to decrease the voltage of floating gate 2115 in memory cell 2105.
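
The differential representation W=(W+)−(W−) can be illustrated with a few lines of code. The read_weight function mirrors the subtraction described above; the split_weight helper, which places a signed weight on only one of the two cells, is a hypothetical convention that the specification does not prescribe.

```python
def split_weight(w):
    """Split a signed weight into non-negative (w_plus, w_minus) parts.

    Assumption: only one of the two cells carries a nonzero value.
    """
    return (w, 0) if w >= 0 else (0, -w)

def read_weight(w_plus, w_minus):
    """Recover W = (W+) - (W-), as a subtraction circuit would on read."""
    return w_plus - w_minus

# Hypothetical example: store -3 across the W+ cell and the W- cell.
w_plus, w_minus = split_weight(-3)
w = read_weight(w_plus, w_minus)
```

Because only the difference is read out, each cell stores a non-negative value even when the represented weight is negative.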

[0136] FIG. 24 depicts the effect of a tuning memory cell on an adjacent data memory cell, for instance, tuning memory cell 2103 and data memory cell 2102 as in FIG. 21. Initially, data memory cell 2102 and tuning memory cell 2103 are both programmed near target voltage 2401, but still below it. Then, tuning memory cell 2103 is programmed such that the voltage of floating gate 2113 may exceed target voltage 2401. The voltage on floating gate 2112 is verified (through a read verify operation), and tuning memory cell 2103 continues to be programmed until data memory cell 2102 achieves the exact target voltage 2401. This process results in charge equivalent to a sub-single electron potential being added to floating gate 2112 during a programming pulse, which is the finest increment of programming that is physically possible.

[0137] FIG. 25A depicts another alternative embodiment that utilizes VMM array 2100. Here, the tuning bitline and data bitline (constituting a bitline pair) are interchangeable. For example, the bitline within a bitline pair that has the greater amount of noise (such as random telegraph noise, RTN) can be designated as the tuning bitline within the bitline pair. For example, if memory cell 2102 has greater RTN noise than memory cell 2103, bitline 2122 can be designated as a tuning bitline, i.e., memory cell 2102 is designated as a tuning memory cell, and bitline 2123 as a data bitline, i.e., memory cell 2103 is designated as a data memory cell. FIG. 25B shows schematically how this is done. Both bit lines 2122 and 2123 are fed as an input into sense amplifier 2503 through bitline read transistors 2501 and 2502, respectively. The bitline that has less RTN noise than the other bitline will be designated as a data bit line, i.e., the associated memory cell is designated as a data memory cell. The cells in the bit line pairs are separated by distance d.sub.1, with inherent capacitance C.sub.1 between their respective floating gates. Cells that are not used as pairs may be separated from adjacent cells by distance d.sub.2, which as indicated above is greater than distance d.sub.1, with a resultant capacitance of C.sub.2, which is less than C.sub.1.
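
The role assignment within a bitline pair can be sketched as a comparison of measured noise. The sketch below uses the standard deviation of repeated read currents as a stand-in for RTN; that metric, the function name, and the sample readings are assumptions, since the specification does not say how the noise is quantified.

```python
import statistics

def assign_roles(reads_a, reads_b, name_a, name_b):
    """Designate the noisier bitline of a pair as the tuning bitline.

    reads_a, reads_b: repeated read currents for each bitline (any unit).
    Ties are resolved arbitrarily in favor of name_a as the data bitline.
    """
    noise_a = statistics.pstdev(reads_a)
    noise_b = statistics.pstdev(reads_b)
    if noise_a > noise_b:
        return {"tuning": name_a, "data": name_b}
    return {"tuning": name_b, "data": name_a}

# Hypothetical read currents in nA for bitlines 2122 and 2123.
roles = assign_roles([10, 12, 9, 13], [10, 10, 11, 10], "2122", "2123")
```

With these made-up readings bitline 2122 shows the larger spread, so it becomes the tuning bitline and bitline 2123 carries the data, matching the example in the text.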

[0138] FIG. 26 depicts ultra-precision programming method 2600. The method starts (step 2601). The selected data memory cell and its adjacent tuning memory cell are erased (step 2602). Un-selected memory cells are deeply programmed (step 2603). Coarse programming is performed on selected data memory cells (step 2604). Precision programming using increments and/or decrements is performed on selected data memory cells (step 2605), and/or precision programming using a constant voltage is performed on selected data memory cells (step 2606). Then ultra-precision programming is performed using the capacitive coupling between the adjacent tuning memory cell and the selected data memory cell (step 2607). Once the target value is achieved, the method ends (step 2608).

[0139] FIG. 27 depicts ultra-precision programming method 2700. The method starts (step 2701). The entire VMM array is erased (step 2702). Un-selected cells are deeply programmed (step 2703). All cells are programmed to an intermediate value (e.g., ˜0.5-1 μA) using coarse programming (step 2704). Precision programming using increments is performed on selected data memory cell(s) (step 2705), and/or precision programming using constant voltages is performed on selected data memory cell(s) (step 2706). Then ultra-precision programming is performed using the capacitive coupling between the adjacent tuning memory cell and the selected data memory cell (step 2707). Once the target value is achieved, the method ends (step 2708).

[0140] In the embodiments described above, when a selected data memory cell is read or verified, its associated tuning memory cell also must be read or verified, as the capacitive coupling must be active at the time the data memory cell is read or verified. One way to do this is to couple the data bitline and the tuning bitline to the sense amplifier during a read or verify operation.

[0141] The end result of ultra-precision programming is shown in FIG. 28, which depicts data collected by Applicant for a working embodiment of the ultra-precision programming methods and systems described herein. FIG. 28 depicts the floating gate voltage of a data memory cell (e.g. 2102) and an adjacent tuning memory cell (e.g. 2103) against the number of programming pulses executed in an attempt to program the voltage of floating gate 2112 exactly to target 2801, by programming floating gate 2113 in the adjacent tuning memory cell 2103. The selected data memory cell 2102 is first programmed up to just under 90 pulses, and the balance of the programming is done by only providing programming pulses to adjacent tuning memory cell 2103. As can be seen, ultra-precision programming is much more precise than coarse programming and fine programming methods alone (depicted in FIG. 20), and the increment in voltage to floating gate 2112 due to capacitive coupling between floating gates 2112 and 2113 actually corresponds to less than one electron being added per programming pulse to floating gate 2112.

[0142] Another embodiment for ultra-precision programming uses vertical floating gate to floating gate coupling instead of horizontal floating gate to floating gate coupling such as described above with reference to the tuning bitline. In this case, an adjacent row (the Tuning Row) is used for coupling purposes. This is particularly suited for memory cells 210, 310, 510, and 710, in which case there is no physical barrier (erase gate) between the top FG and bottom FG.

[0143] Another embodiment for ultra-precision programming uses overlapping floating gate to floating gate coupling, such as where a tuning cell floating gate overlaps a target cell floating gate. For example, one floating gate can lie partially on top of another floating gate.

[0144] It should be noted that, as used herein, the terms “over” and “on” both inclusively include “directly on” (no intermediate materials, elements or space disposed therebetween) and “indirectly on” (intermediate materials, elements or space disposed therebetween). Likewise, the term “adjacent” includes “directly adjacent” (no intermediate materials, elements or space disposed therebetween) and “indirectly adjacent” (intermediate materials, elements or space disposed therebetween), “mounted to” includes “directly mounted to” (no intermediate materials, elements or space disposed therebetween) and “indirectly mounted to” (intermediate materials, elements or space disposed therebetween), and “electrically coupled” includes “directly electrically coupled to” (no intermediate materials or elements therebetween that electrically connect the elements together) and “indirectly electrically coupled to” (intermediate materials or elements therebetween that electrically connect the elements together). For example, forming an element “over a substrate” can include forming the element directly on the substrate with no intermediate materials/elements therebetween, as well as forming the element indirectly on the substrate with one or more intermediate materials/elements therebetween.

CPC classification (all section G, PHYSICS): G11C16/26, G11C2216/08, G11C16/107, G06N3/045, G11C16/12, G11C16/3404, G11C16/3459, G06N3/048, G06N3/063, G11C16/0408, G11C11/5628, G11C16/10, G11C11/54, G06N3/065, G11C16/14

International classification (all section G, PHYSICS): G11C16/10, G06N3/063, G11C16/14, G11C16/26
