OUTPUT CIRCUITRY FOR ANALOG NEURAL MEMORY IN A DEEP LEARNING ARTIFICIAL NEURAL NETWORK

20230049032 · 2023-02-16

    Abstract

    Numerous embodiments of output circuitry for an analog neural memory in a deep learning artificial neural network are disclosed. In some embodiments, a common mode circuit is used with differential cells, W+ and W−, that together store a weight, W. The common mode circuit can utilize current sources, variable resistors, or transistors as part of the structure for introducing a common mode voltage bias.

    Claims

    1. An output current neuron circuit, comprising: a first bit line coupled to a W+ cell in a memory array to draw a first current during a read operation; a second bit line coupled to a W− cell in the memory array to draw a second current during a read operation, wherein a difference between a value stored in the W+ cell and a value stored in the W− cell is a weight value, W; a bias circuit to generate a common mode bias voltage; a first variable current source to apply a first common mode bias current to the first bit line in response to the common mode bias voltage to generate a first output; and a second variable current source to apply a second common mode bias current to the second bit line in response to the common mode bias voltage to generate a second output, wherein the first and second common mode bias currents are identical; wherein the first output equals the common mode bias current minus the first current and the second output equals the common mode bias current minus the second current.

    2. The output current neuron circuit of claim 1, wherein the first variable current source comprises a first PMOS transistor.

    3. The output current neuron circuit of claim 2, wherein the second variable current source comprises a second PMOS transistor.

    4. An output current neuron circuit, comprising: a current source; a bias circuit to apply a control voltage to the current source; a first variable resistor comprising a first end and a second end, the first end coupled to the current source; a second variable resistor comprising a third end and a fourth end, the third end coupled to the current source, the current source to provide a bias current to the first variable resistor and the second variable resistor so as to generate a common mode voltage; a first bit line coupled to a W+ cell during a read operation; a second bit line coupled to a W− cell during the read operation, wherein a difference between a value stored in the W+ cell and a value stored in the W− cell is a weight value, W; a first output coupled to the second end of the first variable resistor and the first bit line to provide a first output current; and a second output coupled to the fourth end of the second variable resistor and the second bit line to provide a second output current, the first output and the second output forming a common mode, differential current signal.

    5. The circuit of claim 4, wherein the first variable resistor comprises an NMOS transistor, where a voltage applied to a gate of the NMOS transistor determines a resistance of the NMOS transistor.

    6. The circuit of claim 5, wherein the second variable resistor comprises a second NMOS transistor, where a voltage applied to a gate of the second NMOS transistor determines the resistance of the second NMOS transistor.

    7. An output current neuron circuit, comprising: a first output node to receive a first current from a memory array; a second output node to receive a second current from a memory array; a bias circuit to generate a bias current; a first device to generate a first output current equal to the first current subtracted from the bias current; and a second device to generate a second output current equal to the second current subtracted from the bias current.

    8. The output current neuron circuit of claim 7, wherein the first current is generated from a read operation of a bit line coupled to one or more W+ cells.

    9. The output current neuron circuit of claim 8, wherein the second current is generated from a read operation of a bit line coupled to one or more W− cells.

    10. An output current neuron circuit, comprising: a first output node to receive a first current from a memory array; a second output node to receive a second current from a memory array; a bias circuit to generate a bias voltage at a bias node; a first variable resistor coupled between the bias node and the first output node; and a second variable resistor coupled between the bias node and the second output node.

    11. A current-to-voltage converter comprising: a first bit line to receive a first current generated during a read operation of a W+ cell; a second bit line to receive a second current generated during a read operation of a W− cell, wherein a difference between a value stored in the W+ cell and a value stored in the W− cell is a weight value, W; and a differential amplifier to receive the first current and the second current and to generate a differential output voltage comprising a first voltage output and a second voltage output responsive to the first current and the second current.

    12. An output block, comprising: a plurality of current-to-voltage converters, each to receive a respective bit line differential pair and to generate a respective differential voltage output; and a plurality of differential input analog-to-digital converters, each to receive the respective differential voltage output from one of the plurality of current-to-voltage converters and to generate a respective set of digital output bits responsive to the received respective differential voltage output.

    13. An output block, comprising: a plurality of current-to-voltage converters, each to receive a respective bit line differential pair and to generate a respective voltage output; and a plurality of differential input analog-to-digital converters, each to receive the respective voltage output from one of the plurality of current-to-voltage converters and to generate a respective set of digital output bits.

    14. An output block, comprising: a current-to-voltage converter to receive a bit line differential pair, the current-to-voltage converter comprising: a differential operational amplifier comprising a first input and a second input and a first output and a second output, the first input and the second input coupled to the bit line differential pair; a first variable resistor coupled between the first input and the first output; a second variable resistor coupled between the second input and the second output; and a common mode input circuit coupled between the first input and the second input; and a differential input analog-to-digital converter to receive the first output and the second output and to generate a set of digital output bits.

    15. The output block of claim 14, wherein the common mode input circuit comprises a first variable current source coupled to the first input and a second variable current source coupled to the second input, the first variable current source and the second variable current source to generate equal currents.

    16. An output current neuron circuit, comprising: a first bit line coupled to a W+ cell in a memory array to draw a first current during a read operation; a second bit line coupled to a W− cell in the memory array to draw a second current; a first bias current coupled to the first bit line; and a second bias current coupled to the second bit line, wherein the first bias current and the second bias current have the same value.

    17. An output block, comprising: an output current neuron circuit, comprising: a first bit line coupled to a W+ cell in a memory array to draw a first current during a read operation; and a second bit line coupled to a W− cell in the memory array to draw a second current during the read operation; a first bias current coupled to the first bit line; and a first output current which is proportional to a difference between the first and second current.

    18. The output block of claim 17, wherein the first output current is equal to half of the difference between the first and second current.

    19. The output block of claim 17, further comprising a second output current that is complementary to the first output current.

    20. An offset calibration method for an output block, the method comprising: applying nominal biases to input nodes of a sub-circuit block of the output block; and applying an increased or decreased offset trim setting to the sub circuit block of the output block until an output of the output block is within a target value of an expected output value.

    21. The method of claim 20, wherein the sub circuit block is a current-to-voltage circuit.

    22. The method of claim 20, wherein the sub circuit block is an analog-to-digital converter circuit.

    23. The method of claim 20, further comprising: providing, by the output block, an output from a neuron.

    24. The method of claim 23, wherein the neuron is a portion of a neural memory array in a neural network.

    25. An offset calibration method for an output block, the method comprising: measuring a new trimmed output of the output block in response to an increased offset trim setting; comparing the new trimmed output and a nominal bias output, wherein: when the new trimmed output is equal to the nominal bias output, repeating the measuring and comparing steps; and when the new trimmed output is different than the nominal bias output, storing the new trimmed output as trim value; and applying the trim value to a sub circuit block within the output block during operation.

    26. The method of claim 25, further comprising: providing, by the output block, an output from a neuron.

    27. The method of claim 26, wherein the neuron is a portion of a neural memory array in a neural network.

    28. An offset calibration method for an output block, the method comprising: applying nominal biases to input nodes of a sub-circuit block of the output block; measuring a nominal bias output of the output block in response to the nominal biases; applying a decreased offset trim setting to the input nodes; measuring a new trimmed output of the output block in response to the decreased offset trim setting; comparing the measured new trimmed output and the measured nominal bias output, wherein: when the measured new trimmed output is equal to the measured nominal bias output, repeating the applying, measuring, and comparing steps; and when the measured new trimmed output is different than the measured nominal bias output, storing the new trimmed output as trim value; and applying the trim value to the sub-circuit block of the output block during operation.

    29. The method of claim 28, further comprising: providing, by the output block, an output from a neuron.

    30. The method of claim 29, wherein the neuron is a portion of a neural memory array in a neural network.

    31. An offset calibration method for an output block, the method comprising: applying an input value to input nodes of a sub-circuit block of the output block; measuring an output value of the output block in response to the input value; comparing the output value to a target offset value, wherein: when the output value exceeds the target offset value, repeating the applying, measuring, and comparing steps with a next input value; and when the output value is less than or equal to the target offset value, storing the input value as a trim value; and applying the trim value to the sub-circuit block of the output block during operation of the output block.

    32. The method of claim 31, further comprising: providing, by the output block, an output from a neuron.

    33. The method of claim 32, wherein the neuron is a portion of a neural memory array in a neural network.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0096] FIG. 1 is a diagram that illustrates an artificial neural network.

    [0097] FIG. 2 depicts a prior art split gate flash memory cell.

    [0098] FIG. 3 depicts another prior art split gate flash memory cell.

    [0099] FIG. 4 depicts another prior art split gate flash memory cell.

    [0100] FIG. 5 depicts another prior art split gate flash memory cell.

    [0101] FIG. 6 is a diagram illustrating the different levels of an exemplary artificial neural network utilizing one or more non-volatile memory arrays.

    [0102] FIG. 7 is a block diagram illustrating a vector-by-matrix multiplication system.

    [0103] FIG. 8 is a block diagram that illustrates an exemplary artificial neural network utilizing one or more vector-by-matrix multiplication systems.

    [0104] FIG. 9 depicts another embodiment of a vector-by-matrix multiplication system.

    [0105] FIG. 10 depicts another embodiment of a vector-by-matrix multiplication system.

    [0106] FIG. 11 depicts another embodiment of a vector-by-matrix multiplication system.

    [0107] FIG. 12 depicts another embodiment of a vector-by-matrix multiplication system.

    [0108] FIG. 13 depicts another embodiment of a vector-by-matrix multiplication system.

    [0109] FIG. 14 depicts a prior art long short-term memory system.

    [0110] FIG. 15 depicts an exemplary cell for use in a long short-term memory system.

    [0111] FIG. 16 depicts an embodiment of the exemplary cell of FIG. 15.

    [0112] FIG. 17 depicts another embodiment of the exemplary cell of FIG. 15.

    [0113] FIG. 18 depicts a prior art gated recurrent unit system.

    [0114] FIG. 19 depicts an exemplary cell for use in a gated recurrent unit system.

    [0115] FIG. 20 depicts an embodiment of the exemplary cell of FIG. 19.

    [0116] FIG. 21 depicts another embodiment of the exemplary cell of FIG. 19.

    [0117] FIG. 22 depicts another embodiment of a vector-by-matrix multiplication system.

    [0118] FIG. 23 depicts another embodiment of a vector-by-matrix multiplication system.

    [0119] FIG. 24 depicts another embodiment of a vector-by-matrix multiplication system.

    [0120] FIG. 25 depicts another embodiment of a vector-by-matrix multiplication system.

    [0121] FIG. 26 depicts another embodiment of a vector-by-matrix multiplication system.

    [0122] FIG. 27 depicts another embodiment of a vector-by-matrix multiplication system.

    [0123] FIG. 28 depicts another embodiment of a vector-by-matrix multiplication system.

    [0124] FIG. 29 depicts another embodiment of a vector-by-matrix multiplication system.

    [0125] FIG. 30 depicts another embodiment of a vector-by-matrix multiplication system.

    [0126] FIG. 31 depicts another embodiment of a vector-by-matrix multiplication system.

    [0127] FIG. 32 depicts another embodiment of a vector-by-matrix multiplication system.

    [0128] FIG. 33 depicts another embodiment of a vector-by-matrix multiplication system.

    [0129] FIG. 34 depicts another embodiment of a vector-by-matrix multiplication system.

    [0130] FIGS. 35A, 35B, 35C, 35D, 35E, and 35F depict embodiments of an output block.

    [0131] FIG. 36 depicts another embodiment of an output block.

    [0132] FIGS. 37A and 37B depict other embodiments of an output block.

    [0133] FIGS. 38A and 38B depict other embodiments of an output block.

    [0134] FIG. 39 depicts a variable resistor replica.

    [0135] FIG. 40 depicts an embodiment of a current-to-voltage converter.

    [0136] FIG. 41 depicts a differential output amplifier.

    [0137] FIG. 42 depicts an offset calibration method.

    [0138] FIG. 43 depicts another offset calibration method.

    DETAILED DESCRIPTION OF THE INVENTION

    [0139] The artificial neural networks of the present invention utilize a combination of CMOS technology and non-volatile memory arrays.

    [0140] VMM System Overview

    [0141] FIG. 34 depicts a block diagram of VMM system 3400. VMM system 3400 comprises VMM array 3401, row decoder 3402, high voltage decoder 3403, column decoder 3404, bit line drivers 3405, input circuit 3406, output circuit 3407, control logic 3408, and bias generator 3409. VMM system 3400 further comprises high voltage generation block 3410, which comprises charge pump 3411, charge pump regulator 3412, and high voltage analog precision level generator 3413. VMM system 3400 further comprises (program/erase, or weight tuning) algorithm controller 3414, analog circuitry 3415, control engine 3416 (that may include special functions such as arithmetic functions, activation functions, embedded microcontroller logic, without limitation), and test control logic 3417. The systems and methods described below can be implemented in VMM system 3400.

    [0142] Input circuit 3406 may include circuits such as a DAC (digital to analog converter), DPC (digital to pulses converter, digital to time modulated pulse converter), AAC (analog to analog converter, such as a current to voltage converter, logarithmic converter), PAC (pulse to analog level converter), or any other type of converters. The input circuit 3406 may implement normalization, linear or non-linear up/down scaling functions, or arithmetic functions. The input circuit 3406 may implement a temperature compensation function for input levels. The input circuit 3406 may implement an activation function such as ReLU or sigmoid. The output circuit 3407 may include circuits such as an ADC (analog to digital converter, to convert neuron analog output to digital bits), AAC (analog to analog converter, such as a current to voltage converter, logarithmic converter), APC (analog to pulse(s) converter, analog to time modulated pulse converter), or any other type of converters.

    [0143] Output circuit 3407 may implement an activation function such as rectified linear activation function (ReLU) or sigmoid. The output circuit 3407 may implement statistical normalization, regularization, up/down scaling/gain functions, statistical rounding, or arithmetic functions (e.g., add, subtract, divide, multiply, shift, log) for neuron outputs. Output circuit 3407 may implement a temperature compensation function for neuron outputs or array outputs (such as bitline output) so as to keep power consumption of the array approximately constant or to improve precision of the array (neuron) outputs such as by keeping the IV slope approximately the same.

    [0144] FIG. 35A depicts output block 3500. Output block 3500 comprises current-to-voltage converters (ITV, with differential inputs and differential outputs) 3501-1 through 3501-i, where i is the number of bit line W+ and W− pairs received by output block 3500; multiplexor 3502; sample and hold circuits 3503-1 through 3503-k; channel multiplexor 3504; and differential input analog-to-digital converter (ADC) 3505. Output block 3500 receives differential weight outputs W+ and W− from bit line pairs in the array, and ultimately generates a digital output, DOUTx, from ADC 3505 (an ADC with differential inputs), representing the output of one of the bit line pairs (e.g., W+ and W− lines).

    [0145] Current-to-voltage (ITV) converters 3501-1 through 3501-i each receive analog bit line current signals BLw+ and BLw− (which are bit line outputs generated in response to inputs and stored W+ and W− weights, respectively) and convert them into respective differential voltages ITVO+ and ITVO−.

    [0146] Differential voltages ITVO+ and ITVO− are then received by multiplexor 3502, which time-multiplexes the outputs from current-to-voltage converters 3501-1 through 3501-i to the sample and hold (S/H) circuits 3503-1 to 3503-k, where k can be the same as or different than i.

    [0147] S/H circuits 3503-1 to 3503-k each sample their received differential voltages and hold them as a differential output.

    [0148] Channel multiplexor 3504 receives a control signal to select one of the bit line W+ and W− channels, i.e., one of the bit line pairs, and outputs the differential voltages held by the respective sample and hold circuit 3503 to ADC 3505, which converts the analog differential voltages that are output by the respective sample and hold circuit 3503 into a set of digital bits, DOUTx. A single S/H 3503 can be shared across the multiple ITV converters 3501. The ADC 3505 can operate on multiple ITV converters in a time-multiplexed manner. Each S/H 3503 can be just a capacitor or a capacitor followed by a buffer (e.g., operational amplifier).

    [0149] The ITV converters 3501 can comprise output current neuron circuit 3700, 3750, 3800, or 3820 from FIGS. 37A, 37B, 38A, and 38B, respectively, combined with current-to-voltage converter 4000 in FIG. 40. In such an instance, the inputs to ITV converters 3501 would be two current inputs (such as BLW+ and BLW− in FIGS. 35A to 35E, 37A, 37B, 38A, or 38B) and the outputs of ITV converters 3501 would be differential outputs (such as VOP and VON in FIG. 40 or ITVO+ and ITVO− in FIGS. 35A to 35D).

    [0150] ADC 3505 can be of a hybrid ADC architecture, meaning it has more than one ADC architecture to perform conversion. For example, if DOUTx is an 8-bit output, ADC 3505 can comprise an ADC sub-architecture to generate bits B7-B4 and another ADC sub-architecture to generate bits B3-B0 from the differential inputs ITVSH+ and ITVSH−. That is, ADC circuit 3505 can include multiple ADC sub-architectures.
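    As a purely behavioral illustration of the hybrid arrangement, the following Python sketch splits an 8-bit conversion into a coarse stage that resolves bits B7-B4 and a fine stage that resolves bits B3-B0 from the residue. The function names, the ideal quantizers, the unipolar input range, and the full-scale value v_fs are assumptions made for this sketch; the ADC sub-architectures actually used are not specified here.

```python
def quantize(value, step, bits):
    """Ideal quantizer: floor(value / step), clamped to the available codes."""
    max_code = (1 << bits) - 1
    return min(max(int(value / step), 0), max_code)


def hybrid_adc(itvsh_p, itvsh_n, v_fs=1.0):
    """Convert a differential input into an 8-bit code using two 4-bit stages.

    Assumption: the differential voltage (ITVSH+ minus ITVSH-) lies in [0, v_fs).
    """
    v_diff = itvsh_p - itvsh_n

    # Coarse sub-architecture: resolves B7-B4.
    coarse_step = v_fs / 16
    upper = quantize(v_diff, coarse_step, bits=4)

    # Fine sub-architecture: resolves B3-B0 from what the coarse stage left over.
    residue = v_diff - upper * coarse_step
    lower = quantize(residue, coarse_step / 16, bits=4)

    return (upper << 4) | lower


if __name__ == "__main__":
    print(hybrid_adc(0.5, 0.0))          # mid-scale input
    print(hybrid_adc(0.51171875, 0.0))   # 131/256 of full scale
```

    In this sketch the coarse stage could be shared among channels while the fine stage is replicated per channel, or vice versa; which split is preferable depends on the sub-architectures chosen.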

    [0151] Optionally, an ADC sub-architecture can be shared among all channels while another ADC sub-architecture is not shared among all channels.

    [0152] In another embodiment, channel multiplexor 3504 and ADC 3505 can be removed, and the output instead can be analog differential voltages from a S/H 3503, which can be buffered by an operational amplifier. For example, the use of an analog voltage can be implemented in an all-analog neural network (i.e., one where a digital output or digital input is not needed for the neural memory array).

    [0153] FIG. 35B depicts output block 3550. Output block 3550 comprises current-to-voltage converters (ITV) 3551-1 through 3551-i, where i is the number of bit line W+ and W− pairs received by output block 3550; multiplexor 3552; differential to single-ended converters (Diff-to-S converters) 3553-1 through 3553-k; sample and hold circuits 3554-1 through 3554-k (where k is the same as or different than i); channel multiplexor 3555; and analog-to-digital converter (ADC) 3556. Diff-to-S converter 3553 converts the differential outputs from ITV 3551 provided by multiplexor 3552, i.e., ITVOMX+ and ITVOMX−, into a single-ended output ITVSOMX+. The single-ended output ITVSOMX+ is then input to S/H 3554, multiplexor 3555, and ADC 3556.

    [0154] FIG. 35C depicts output block 3560. Output block 3560 comprises current-to-voltage converters (ITV) 3561-1 through 3561-i, where i is the number of bit line W+ and W− pairs received by output block 3560, and differential input analog-to-digital converters (ADC) 3566-1 through 3566-i.

    [0155] FIG. 35D depicts output block 3570. Output block 3570 comprises current-to-voltage converters (ITV) 3571-1 through 3571-i, where i is the number of bit line W+ and W− pairs received by output block 3570, and single input analog-to-digital converters (ADC) 3576-1 through 3576-i. In this case, only one output of the differential-output ITV is used; that is, the ITV is used with differential inputs and a single output.

    [0156] FIG. 35E depicts output block 3580. Output block 3580 comprises current-to-voltage converters (ITV) 3581-1 through 3581-i, where i is the number of bit line W+ and W− pairs received by output block 3580, and differential input analog-to-digital converters (ADC) 3586-1 through 3586-i. ITV blocks 3581-1 through 3581-i comprise common mode input circuits 3582-1 through 3582-i, respectively, and differential operational amplifiers 3583-1 through 3583-i, respectively, with feedback provided by variable resistors 3584-1 through 3584-i, respectively, and 3585-1 through 3585-i, respectively.

    [0157] FIG. 35F depicts output block 3590, which could be used for common mode input circuits 3582-1 through 3582-i in FIG. 35E. Output block 3590 comprises two equal variable current sources, Ibias+ and Ibias−, connected to two current inputs, BLw+ and BLw−.

    [0158] FIG. 36 depicts output block 3600. Output block 3600 comprises summation circuits 3601-1 through 3601-i (such as a current mirror circuit), where i is the number of bit line BLw+ and BLw− pairs received by output block 3600; current-to-voltage converter circuits (ITV) 3602-1 through 3602-i; multiplexor 3603; sample and hold circuits 3604-1 through 3604-k (where k is the same as or different than i); channel multiplexor 3605; and ADC 3606. Output block 3600 receives differential weight outputs BLw+ and BLw− from bit line pairs in the array, and ultimately generates a digital output from ADC 3606, DOUTx, representing the output of one of the bit line pairs at a time.

    [0159] Current summation circuits 3601-1 through 3601-i each receive current from a pair of bit lines and subtract the BLw− value from the BLw+ value and output the result as a summation current IWO.

    [0160] Current-to-voltage converters 3602-1 through 3602-i receive the output summation current IWO and convert the respective summation current into differential voltages ITVO+ and ITVO−, which are then received by multiplexor 3603 and selectively provided to sample-and-hold circuits 3604-1 through 3604-k. The differential voltages are to be digitized (converted into digital output bits) by a differential input ADC (block 3606), which has various advantages such as input noise reduction (such as from clock feed-through) and a more accurate comparison operation (as in a SAR ADC).

    [0161] Each sample and hold circuit 3604 receives differential voltages ITVOMX+ and ITVOMX−, samples the received differential voltages, and holds them as a differential voltage output, OSH+ and OSH−.

    [0162] Channel multiplexor 3605 receives a control signal to select one of the bit line pairs, i.e., channels, BLw+ and BLw− and outputs the voltage held by the respective sample and hold circuit 3604 to differential input ADC 3606, which converts the voltage into a set of digital bits as DOUTx.

    [0163] FIG. 37A depicts output current neuron circuit 3700, which optionally can be included in output block 3500 of FIG. 35A or output block 3600 of FIG. 36.

    [0164] Output current neuron circuit 3700 comprises first variable current source 3701, second variable current source 3702, and bias circuit 3703. Bias circuit 3703 generates a control voltage, Vbias, based on a comparison of BLW+ and VREF or BLW− and VREF. First variable current source 3701 generates an output current, Ibias+, that is varied by a control voltage, Vbias, (i.e. the amount of output current Ibias+ is responsive to the value of Vbias) and is coupled to a first bit line, BLW+. Second variable current source 3702 generates an output current, Ibias−, that is varied by Vbias (i.e. the amount of output current Ibias− is responsive to the value of Vbias) and is coupled to a second bit line, BLW−. BLW+ is selected by a column decoder (not shown) and receives a first current from cells storing W+ values during a read operation, and BLW− is selected by the column decoder and receives a second current from cells storing W− values during the read operation. A W+ value and associated W− value represent a weight value, W. The outputs, Ibias+ and Ibias−, of current sources 3701 and 3702 are identical at any given time.

    [0165] VREF is applied as an input common mode voltage to generate the Vbias voltage, which controls variable current sources 3701 and 3702 to impose a common mode voltage on BLW+ and BLW−, where the input common mode voltage acts as a reference read voltage on the bitlines during a read operation. The output of output current neuron circuitry 3700 is Iout+ and Iout−, which form a differential signal. Iout+ is the output current from bit line BLW+ after Vbias has been applied to generate Ibias+, and Iout− is the output current from bit line BLW− after Vbias has been applied to generate Ibias−, where Iout+ = (Ibias+) − (IBLW+) and Iout− = (Ibias−) − (IBLW−).
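    The relationship between the bit line currents, the common mode bias currents, and the outputs can be sketched numerically as follows. This is a minimal behavioral model rather than a circuit simulation; the function name and the choice of microamps as the unit are assumptions made for illustration. It shows that, because Ibias+ and Ibias− are identical, the difference between Iout+ and Iout− depends only on the two bit line currents and not on the bias value.

```python
def output_current_neuron(i_blw_p, i_blw_n, i_bias):
    """Return (Iout+, Iout-) when Ibias+ and Ibias- both equal i_bias.

    Implements Iout+ = Ibias+ - IBLW+ and Iout- = Ibias- - IBLW-.
    Currents are expressed in microamps (an arbitrary choice for this sketch).
    """
    i_out_p = i_bias - i_blw_p
    i_out_n = i_bias - i_blw_n
    return i_out_p, i_out_n


if __name__ == "__main__":
    # Two different common mode bias values, same bit line currents.
    for i_bias in (16.0, 20.0):
        i_out_p, i_out_n = output_current_neuron(1.0, 31.0, i_bias)
        # The differential term i_out_p - i_out_n is the same in both cases.
        print(i_bias, i_out_p, i_out_n, i_out_p - i_out_n)
```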

    [0166] FIG. 37B depicts output current neuron circuit 3750 which depicts an embodiment of variable current sources 3701 and 3702 using PMOS transistors 3711 and 3712.

    [0167] FIG. 38A depicts output current neuron circuit 3800, which optionally can be included in output block 3500 of FIG. 35A, output block 3550 of FIG. 35B, or output block 3600 of FIG. 36.

    [0168] Output current neuron circuit 3800 comprises a first variable resistor 3801 (a first device) comprising a first end and a second end, the second end coupled to bit line BLW+ that is selected during a read operation; a second variable resistor 3802 (a second device) comprising a third end and a fourth end, the fourth end coupled to bit line BLW− that is selected during a read operation, where BLW+ is connected to cells in a memory array storing W+ values and BLW− is connected to cells in the memory array storing associated W− values; variable current source 3803; and bias circuit op amp 3804 that generates a bias voltage, Vbias, whose value represents the difference between BLW+ (or, alternatively, BLW−) and VREF. The first end of first variable resistor 3801 and the third end of second variable resistor 3802 are coupled to variable current source 3803.

    [0169] VREF is used to generate the Vbias voltage that is applied to variable current source 3803 to impose an input common mode voltage on bit lines BLW+ and BLW−, where the input common mode voltage acts as a read reference voltage on the bitlines during a read operation. The output of output current neuron circuitry 3800 is Iout+ (a first output current) from first variable resistor 3801 and Iout− (a second output current) from second variable resistor 3802, which form a differential current signal. Iout+ is the output current from bit line BLW+ after Vbias has been applied to generate Ibias, and Iout− is the output current from bit line BLW− after Vbias has been applied to generate Ibias, according to the following: Iout+ = Ibias − (IBLW+) and Iout− = Ibias − (IBLW−).

    [0170] FIG. 38B depicts output current neuron circuit 3820, which optionally can be included in output block 3500 of FIG. 35A, output block 3550 of FIG. 35B, or output block 3600 of FIG. 36. The circuit is similar to the circuit in FIG. 38A, except that the output of op amp 3804 directly drives the first end of variable resistor 3801 and the third end of variable resistor 3802.

    [0171] FIG. 39 depicts variable resistor replica 3900, which optionally can be used in place of variable resistor 3801 and/or variable resistor 3802 in FIGS. 38A and 38B. Variable resistor replica 3900 comprises NMOS transistor 3901. One terminal of NMOS transistor 3901 is coupled to bias circuit 3804. Another terminal of NMOS transistor 3901 is coupled to either BLW+ or BLW−. The gate of NMOS transistor 3901 is coupled to comparator 3902, which generates a control signal, VGC, that adjusts the resistance provided by NMOS transistor 3901. Hence, the resistance of NMOS transistor 3901 is equal to VREF/IBIAS. By changing VREF or IBIAS, the equivalent resistance of NMOS transistor 3901 can be changed.
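    The dependence of the equivalent resistance on VREF and IBIAS can be illustrated with a short calculation. The sketch below assumes an idealized replica in which the control loop settles exactly, so that the resistance is simply VREF divided by IBIAS; the function name and the example values (0.6 V, 10 μA, 5 μA) are assumptions chosen only for illustration.

```python
def replica_resistance(v_ref, i_bias):
    """Equivalent resistance, in ohms, of an idealized resistor replica.

    v_ref is in volts and i_bias is in amps; the result is v_ref / i_bias.
    """
    if i_bias <= 0:
        raise ValueError("i_bias must be positive")
    return v_ref / i_bias


if __name__ == "__main__":
    r_nominal = replica_resistance(0.6, 10e-6)  # about 60 kilohms
    r_doubled = replica_resistance(0.6, 5e-6)   # halving IBIAS doubles R
    print(r_nominal, r_doubled)
```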

    [0172] FIG. 40 depicts current-to-voltage converter 4000, which can be used for current-to-voltage converters 3501 in FIG. 35A, current-to-voltage converters 3551 in FIG. 35B, or current-to-voltage converters 3602 in FIG. 36.

    [0173] Current-to-voltage converter 4000 comprises differential amplifier 4001; variable integrating resistors 4002 and 4003; controlled switches 4004, 4005, 4006, and 4007; and variable sample and hold capacitors 4008 and 4009, configured as shown.

    [0174] Current-to-voltage converter 4000 receives differential currents IOUT+ and IOUT− and outputs voltages VOP and VON. The output voltage VOP=IOUT+*R and the output voltage VON=IOUT−*R, with resistors 4002 and 4003 each having a value equal to R. The scaling of the output neuron is provided by varying the values of resistors 4002 and 4003. For example, resistors 4002 and 4003 can each be provided by the resistor replica circuit 3900. Capacitors 4008 and 4009 serve as S/H holding capacitors, holding the output voltage once resistors 4002 and 4003 and the input currents are shut off. A control circuit (not shown) controls the opening and closing of switches 4004, 4005, 4006, and 4007 to provide an integration time.

    [0175] In another mode of operation, variable capacitors 4008 and 4009 are used to integrate the differential output currents IOUT+ and IOUT−. In this case, resistors 4002 and 4003 are disabled (not used). The output voltage VOP is therefore proportional to IOUT+*Time/C and the output voltage VON is therefore proportional to IOUT−*Time/C. The value Time is controlled by the pulse width T of pulse 4010. The value C is provided by capacitors 4008 and 4009. In this example, the scaling of the output neuron values is provided by varying the pulse width T or the capacitance values of capacitors 4008 and 4009.
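    The integrating mode can be sketched in the same way. The text states only that the outputs are proportional to IOUT*Time/C; the sketch below assumes an ideal integrator with a proportionality constant of 1 and a capacitor that starts discharged. The function name and numeric values are hypothetical.

```python
from typing import Tuple


def itv_integrating(iout_p: float, iout_n: float,
                    pulse_width_s: float, cap_farads: float) -> Tuple[float, float]:
    """Idealized integrating mode: V = I * T / C (unity proportionality assumed)."""
    if cap_farads <= 0:
        raise ValueError("capacitance must be positive")
    gain = pulse_width_s / cap_farads  # volts per ampere
    return iout_p * gain, iout_n * gain


# Assumed example: +/-15 uA, T = 100 ns, C = 1 pF.
vop, von = itv_integrating(15e-6, -15e-6, 100e-9, 1e-12)

# Two equivalent ways to double the scaling: double T or halve C.
vop_long_pulse, _ = itv_integrating(15e-6, -15e-6, 200e-9, 1e-12)
vop_small_cap, _ = itv_integrating(15e-6, -15e-6, 100e-9, 0.5e-12)
```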

    [0176] The differential currents IOUT+ and IOUT− are derived from first bit line current BLW+ and second bit line current BLW−. IOUT+ and IOUT− have complementary values (one positive and the other negative, with the same magnitude). The value of IOUT+=((current of BLW−)−(current of BLW+))/2, and IOUT−=((current of BLW+)−(current of BLW−))/2. For example, if the current of BLW+ is 1 μA and the current of BLW− is 31 μA, then IOUT+=(31 μA−1 μA)/2=15 μA and IOUT−=−15 μA.
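    The derivation of IOUT+ and IOUT− from the two bit line currents can be checked numerically with a short sketch. The function name is hypothetical; the formulas and the 1 μA / 31 μA example are the ones given in the preceding paragraph.

```python
from typing import Tuple


def differential_currents(i_blw_plus: float, i_blw_minus: float) -> Tuple[float, float]:
    """Return (IOUT+, IOUT-) from the BLW+ and BLW- bit line currents."""
    iout_p = (i_blw_minus - i_blw_plus) / 2
    iout_n = (i_blw_plus - i_blw_minus) / 2
    return iout_p, iout_n


# Example from the text: BLW+ carries 1 uA and BLW- carries 31 uA.
iout_p, iout_n = differential_currents(1e-6, 31e-6)
```

    In this model the two outputs always sum to zero, which reflects the complementary property noted in the text.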

    [0177] FIG. 41 depicts differential amplifier 4100, which optionally can be included in output block 3500 of FIG. 35A, output block 3550 of FIG. 35B or output block 3600 of FIG. 36. Differential output amplifier 4100 comprises PMOS transistors 4101, 4102, 4103, 4104, 4105, 4106, 4107, and 4108, and NMOS transistors 4109, 4110, 4111, 4112, and 4113, configured as shown. Differential output amplifier 4100 receives inputs VINP and VINN and generates outputs VOUTP and VOUTN. VPBIAS is applied to the gates of PMOS transistors 4102, 4104, 4106, and 4108, and VNBIAS is applied to the gates of NMOS transistors 4111 and 4113. If VINP>VINN, then VOUTP will be high and VOUTN will be low. If VINP<VINN, then VOUTP will be low and VOUTN will be high. A common mode feedback circuit for the output common mode is not shown.

    [0178] FIG. 42 depicts offset calibration method 4200 for an output block such as output blocks 3500, 3550, 3560, 3570, 3580, 3590, or 3600 described above. The method can be performed within a sub-circuit block of the output block, such as an ITV block or an ADC block.

    [0179] First, nominal biases are applied to input nodes. The nominal biases can be a mid-point offset trim setting, such as 0 value or an average value (such as an average of a target input range, for input for BLw+ and BLw−) (step 4202).

    [0180] Second, an increased offset trim setting is applied to one of the sub-circuit blocks of the output block (such as the ITV or ADC) (step 4203).

    [0181] Third, the new trimmed output value of the entire output block is measured and compared against the expected output value to determine whether it is within the target value of the expected output value (step 4204). If so, the method proceeds to step 4207. If not, steps 4203 and 4204 are repeated, with the offset trim setting applied to the sub-circuit block being increased each time, until the new trimmed output value of the entire output block is within the target value of the expected output value, at which point the method proceeds to step 4207.

    [0182] After a certain number of tries (set by a threshold T), if the new trimmed output value of the entire output block is still not within the target value of the expected output value, the offset trim setting is returned to the nominal offset trim setting and is then decreased from the nominal setting (step 4205).

    [0183] The new trimmed output value of the entire output block is measured and compared against the expected output value of the entire output block to determine whether it is within the target value of the expected output value (step 4206). If so, the method proceeds to step 4207. If not, steps 4205 and 4206 are repeated, with the offset trim setting applied to the input nodes being decreased each time, until the new trimmed output value is within the target value of the expected output value, at which point the method proceeds to step 4207.

    [0184] In step 4207, the trim value that caused the output value to be within the target value of the expected output value is stored as the stored trim value. That is the trim value that will result in the smallest offset by the output block.

    [0185] In step 4208, optionally, the stored trimmed value is added as a bias to the sub-circuit block of the output block during every operation.

    [0186] Thus, offset calibration method 4200 performs a trim operation on the entire output block by trimming a sub-circuit block of the output block.
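    The flow of offset calibration method 4200 can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the callable measure_output is a hypothetical stand-in for measuring the entire output block with a given trim setting, trims are modeled as integers with a fixed step, and the decreasing search is also bounded by max_tries (the text bounds only the increasing search by threshold T).

```python
from typing import Callable, Optional


def calibrate_offset_bidirectional(
    measure_output: Callable[[int], float],  # hypothetical measurement hook
    expected: float,
    tolerance: float,
    nominal_trim: int = 0,
    step: int = 1,
    max_tries: int = 8,  # plays the role of threshold T
) -> Optional[int]:
    """Return the trim setting to store (step 4207), or None if none is found."""
    # Steps 4203/4204: increase the trim from nominal and check the output.
    trim = nominal_trim
    for _ in range(max_tries):
        trim += step
        if abs(measure_output(trim) - expected) <= tolerance:
            return trim
    # Step 4205: return to nominal, then steps 4205/4206: decrease and check.
    trim = nominal_trim
    for _ in range(max_tries):
        trim -= step
        if abs(measure_output(trim) - expected) <= tolerance:
            return trim
    return None  # assumption: report failure if neither direction converges


# Toy output-block models (assumed): output changes by 2 units per trim step.
found_up = calibrate_offset_bidirectional(lambda t: -6 + 2 * t, 0.0, 0.5)
found_down = calibrate_offset_bidirectional(lambda t: 10 + 2 * t, 4.0, 0.5)
not_found = calibrate_offset_bidirectional(lambda t: 100.0, 0.0, 0.5)
```

    In the second toy model the increasing search moves the output away from the expected value, so the sketch falls back to the decreasing search after max_tries attempts, mirroring steps 4205 and 4206.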

    [0187] FIG. 43 depicts offset calibration method 4300 for an output block, such as output blocks 3500, 3550, 3560, 3570, 3580, 3590, or 3600 described above. The method can be performed within a sub-circuit block such as an ITV block or an ADC block.

    [0188] First, reference biases are applied to the input nodes (such as input for BLw+ and BLw−) of a sub-circuit block of the output block (step 4301).

    [0189] Next, the output value of the output block is measured and compared against a target offset value (step 4302).

    [0190] If the measured output value is greater than the target offset value, then the next offset trim value in a sequence of offset trim values is applied (step 4303), and step 4302 is repeated. The offset trim is applied to one of the sub-circuit blocks of the output block (such as the ITV or ADC).

    [0191] Steps 4303 and 4302 are repeated until the measured output value is less than or equal to the target offset value, at which point the offset trim value is stored (step 4304). That is the trim value that results in an acceptable level of offset.

    [0192] Optionally, the stored offset trim value is applied as a bias to the sub-circuit block of the output block during every operation (step 4305).
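    Offset calibration method 4300 can be sketched in a similar way. The callable measure_output and the argument names are hypothetical; the sketch assumes that the measured value is directly comparable to the target offset value and that the search stops with a failure indication if the sequence of trim values is exhausted, a case the text does not address.

```python
from typing import Callable, Iterable, Optional


def calibrate_offset_sequential(
    measure_output: Callable[[int], float],  # hypothetical measurement hook
    target_offset: float,
    trim_sequence: Iterable[int],
    initial_trim: int = 0,
) -> Optional[int]:
    """Return the trim value to store (step 4304), or None if none qualifies."""
    trim = initial_trim
    remaining = iter(trim_sequence)
    # Step 4302: measure and compare against the target offset value.
    while measure_output(trim) > target_offset:
        # Step 4303: apply the next offset trim value in the sequence.
        next_trim = next(remaining, None)
        if next_trim is None:
            return None  # assumption: sequence exhausted without success
        trim = next_trim
    return trim


# Toy model (assumed): residual offset shrinks as the trim approaches 5.
stored = calibrate_offset_sequential(lambda t: abs(5 - t), 1.0, range(1, 10))
already_ok = calibrate_offset_sequential(lambda t: 0.0, 1.0, range(1, 10))
exhausted = calibrate_offset_sequential(lambda t: 10.0, 1.0, [1, 2])
```

    Unlike the sketch of method 4200, this search moves through a predetermined sequence in one direction and stops at the first trim value that yields an acceptable offset.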

    [0193] In alternative embodiments, the variable resistors in FIG. 35E or 40B are not equal in resistance. In this case, the output voltages or currents from the ITV are proportional to the resistance values. For example, in FIG. 35E, if resistor 3585-1 is very large, then most of the current from the two bit lines (IBLw+−IBLw−) will flow through resistor 3584-1. In another example in FIG. 35E, if resistor 3585-1 is disconnected, then all of the current from the two bit lines (IBLw+−IBLw−) will flow through resistor 3584-1.

    [0194] It should be noted that, as used herein, the terms “over” and “on” both inclusively include “directly on” (no intermediate materials, elements or space disposed therebetween) and “indirectly on” (intermediate materials, elements or space disposed therebetween). Likewise, the term “adjacent” includes “directly adjacent” (no intermediate materials, elements or space disposed therebetween) and “indirectly adjacent” (intermediate materials, elements or space disposed therebetween), “mounted to” includes “directly mounted to” (no intermediate materials, elements or space disposed therebetween) and “indirectly mounted to” (intermediate materials, elements or space disposed therebetween), and “electrically coupled” includes “directly electrically coupled to” (no intermediate materials or elements therebetween that electrically connect the elements together) and “indirectly electrically coupled to” (intermediate materials or elements therebetween that electrically connect the elements together). For example, forming an element “over a substrate” can include forming the element directly on the substrate with no intermediate materials/elements therebetween, as well as forming the element indirectly on the substrate with one or more intermediate materials/elements therebetween.