HYBRID MEMORY SYSTEM CONFIGURABLE TO STORE NEURAL MEMORY WEIGHT DATA IN ANALOG FORM OR DIGITAL FORM

20230053608 · 2023-02-23

    Abstract

    Numerous embodiments of a hybrid memory system are disclosed. The hybrid memory can store weight data in an array in analog form when used in an analog neural memory system or in digital form when used in a digital neural memory system. Input circuitry and output circuitry are capable of supporting both forms of weight data.

    Claims

    1. A system comprising: an array of non-volatile memory cells arranged into rows and columns; configurable input circuitry coupled to the array to provide an input to the array; and configurable output circuitry coupled to the array to provide an output received from the array in response to the input; wherein in a first mode, the configurable output circuitry provides digital data from the array; and wherein in a second mode, the configurable output circuitry provides analog data from the array.

    2. The system of claim 1, wherein the digital data comprises digital weight data and the analog data comprises analog weight data.

    3. The system of claim 1, wherein the configurable input circuitry comprises: a row register and a digital-to-analog converter block for use in the first mode; and a row decoder block for use in the second mode.

    4. The system of claim 3, wherein the configurable output circuitry comprises: a current-to-voltage converter and analog-to-digital converter block for use in the first mode; and a multi-state sense amplifier block for use in the second mode.

    5. The system of claim 1, wherein the configurable output circuitry comprises: a current-to-voltage converter and analog-to-digital converter block for use in the first mode; and a multi-state sense amplifier block for use in the second mode.

    6. The system of claim 1, wherein the non-volatile memory cells are stacked-gate flash memory cells.

    7. The system of claim 1, wherein the non-volatile memory cells are split-gate flash memory cells.

    8. The system of claim 1, wherein the system is an analog neural memory system.

    9. A system comprising: an array of non-volatile memory cells arranged into rows and columns; input circuitry coupled to the array to provide an input to the array; and output circuitry coupled to the array to provide an output received from the array; wherein the input circuitry provides a digital input to the array in a first mode or an analog input to the array in a second mode.

    10. The system of claim 9, wherein in the first mode, the output circuitry provides digital data from the array.

    11. The system of claim 10, wherein the digital data comprises digital weight data.

    12. The system of claim 10, wherein in the second mode, the output circuitry provides analog data from the array.

    13. The system of claim 12, wherein the analog data comprises analog weight data.

    14. The system of claim 9, wherein the input circuitry comprises: a row register and digital-to-analog converter block for use in the first mode; and a row decoder block for use in the second mode.

    15. The system of claim 14, wherein the output circuitry comprises: a current-to-voltage converter and analog-to-digital converter block for use in the first mode; and a multi-state sense amplifier block for use in the second mode.

    16. The system of claim 9, wherein the output circuitry comprises: a current-to-voltage converter and analog-to-digital converter block for use in the first mode; and a multi-state sense amplifier block for use in the second mode.

    17. The system of claim 9, wherein the non-volatile memory cells are stacked-gate flash memory cells.

    18. The system of claim 9, wherein the non-volatile memory cells are split-gate flash memory cells.

    19. The system of claim 9, wherein the system is an analog neural memory system.

    20. A system comprising: an array of non-volatile memory cells arranged into rows and columns; input circuitry coupled to the array to provide an input to the array; and output circuitry coupled to the array to provide an output received from the array; wherein the output circuitry provides a digital bit output from the array in a first mode or an analog output from the array in a second mode.

    21. The system of claim 20, wherein the input circuitry comprises: a row register and digital-to-analog converter block for use in the first mode; and a row decoder block for use in the second mode.

    22. The system of claim 21, wherein the output circuitry comprises: a current-to-voltage converter and an analog-to-digital converter block for use in the first mode; and a multi-state sense amplifier block for use in the second mode.

    23. The system of claim 20, wherein the output circuitry comprises: a current-to-voltage converter and an analog-to-digital converter block for use in the first mode; and a multi-state sense amplifier block for use in the second mode.

    24. The system of claim 20, wherein the non-volatile memory cells are stacked-gate flash memory cells.

    25. The system of claim 20, wherein the non-volatile memory cells are split-gate flash memory cells.

    26. The system of claim 20, wherein the system is an analog neural memory system.

    27. A reconfigurable output block, comprising: an operational amplifier comprising a noninverting input, an inverting input, and an output, the noninverting input receiving a reference voltage; and a variable current source coupled to a selected memory cell and the inverting input and controlled by logic in response to the output of the operational amplifier.

    28. The reconfigurable output block of claim 27, wherein the selected memory cell is a stacked-gate flash memory cell.

    29. The reconfigurable output block of claim 27, wherein the selected memory cell is a split-gate flash memory cell.

    30. The reconfigurable output block of claim 27, wherein the selected memory cell is a portion of an analog neural memory system.

    31. A reconfigurable output block, comprising: an output circuit configurable to operate on stored digital data and configurable to operate on stored analog data.

    32. The reconfigurable output block of claim 31, wherein the digital data comprises digital weight data and the analog data comprises analog weight data.

    33. The reconfigurable output block of claim 31, wherein the digital data and the analog data are stored in stacked-gate flash memory cells.

    34. The reconfigurable output block of claim 31, wherein the digital data and the analog data are stored in split-gate flash memory cells.

    35. (canceled)

    36. A reconfigurable input block, comprising: an input circuit configurable to store and retrieve digital data and configurable to store and retrieve analog data.

    37. The reconfigurable input block of claim 36, wherein the digital data comprises digital weight data and the analog data comprises analog weight data.

    38. The reconfigurable input block of claim 36, wherein the digital data and the analog data are stored in stacked-gate flash memory cells.

    39. The reconfigurable input block of claim 36, wherein the digital data and the analog data are stored in split-gate flash memory cells.

    40. (canceled)

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0097] FIG. 1 is a diagram that illustrates an artificial neural network.

    [0098] FIG. 2 depicts a prior art split gate flash memory cell.

    [0099] FIG. 3 depicts another prior art split gate flash memory cell.

    [0100] FIG. 4 depicts another prior art split gate flash memory cell.

    [0101] FIG. 5 depicts another prior art split gate flash memory cell.

    [0102] FIG. 6 is a diagram illustrating the different levels of an exemplary artificial neural network utilizing one or more non-volatile memory arrays.

    [0103] FIG. 7 is a block diagram illustrating a vector-by-matrix multiplication system.

    [0104] FIG. 8 is a block diagram illustrating an exemplary artificial neural network utilizing one or more vector-by-matrix multiplication systems.

    [0105] FIG. 9 depicts another embodiment of a vector-by-matrix multiplication system.

    [0106] FIG. 10 depicts another embodiment of a vector-by-matrix multiplication system.

    [0107] FIG. 11 depicts another embodiment of a vector-by-matrix multiplication system.

    [0108] FIG. 12 depicts another embodiment of a vector-by-matrix multiplication system.

    [0109] FIG. 13 depicts another embodiment of a vector-by-matrix multiplication system.

    [0110] FIG. 14 depicts a prior art long short-term memory system.

    [0111] FIG. 15 depicts an exemplary cell for use in a long short-term memory system.

    [0112] FIG. 16 depicts an embodiment of the exemplary cell of FIG. 15.

    [0113] FIG. 17 depicts another embodiment of the exemplary cell of FIG. 15.

    [0114] FIG. 18 depicts a prior art gated recurrent unit system.

    [0115] FIG. 19 depicts an exemplary cell for use in a gated recurrent unit system.

    [0116] FIG. 20 depicts an embodiment of the exemplary cell of FIG. 19.

    [0117] FIG. 21 depicts another embodiment of the exemplary cell of FIG. 19.

    [0118] FIG. 22 depicts another embodiment of a vector-by-matrix multiplication system.

    [0119] FIG. 23 depicts another embodiment of a vector-by-matrix multiplication system.

    [0120] FIG. 24 depicts another embodiment of a vector-by-matrix multiplication system.

    [0121] FIG. 25 depicts another embodiment of a vector-by-matrix multiplication system.

    [0122] FIG. 26 depicts another embodiment of a vector-by-matrix multiplication system.

    [0123] FIG. 27 depicts another embodiment of a vector-by-matrix multiplication system.

    [0124] FIG. 28 depicts another embodiment of a vector-by-matrix multiplication system.

    [0125] FIG. 29 depicts another embodiment of a vector-by-matrix multiplication system.

    [0126] FIG. 30 depicts another embodiment of a vector-by-matrix multiplication system.

    [0127] FIG. 31 depicts another embodiment of a vector-by-matrix multiplication system.

    [0128] FIG. 32 depicts another embodiment of a vector-by-matrix multiplication system.

    [0129] FIG. 33 depicts another embodiment of a vector-by-matrix multiplication system.

    [0130] FIG. 34 depicts another embodiment of a vector-by-matrix multiplication system.

    [0131] FIG. 35A depicts a hybrid memory system.

    [0132] FIG. 35B depicts another hybrid memory system.

    [0133] FIG. 36 depicts a hybrid memory operation method.

    [0134] FIG. 37 depicts configurable macro circuitry for use with a hybrid memory system.

    [0135] FIG. 38 depicts a system comprising a plurality of hybrid array tiles.

    [0136] FIG. 39 depicts a reconfigurable current-to-voltage and analog-to-digital converter circuit.

    DETAILED DESCRIPTION OF THE INVENTION

    [0137] The artificial neural networks of the present invention utilize a combination of CMOS technology and non-volatile memory arrays.

    [0138] VMM System Overview

    [0139] FIG. 34 depicts a block diagram of VMM system 3400. VMM system 3400 comprises VMM array 3401, row decoder 3402, high voltage decoder 3403, column decoder 3404, bit line drivers 3405, input circuit 3406, output circuit 3407, control logic 3408, and bias generator 3409. VMM system 3400 further comprises high voltage generation block 3410, which comprises charge pump 3411, charge pump regulator 3412, and high voltage analog precision level generator 3413. VMM system 3400 further comprises (program/erase, or weight tuning) algorithm controller 3414, analog circuitry 3415, control engine 3416 (that may include special functions such as arithmetic functions, activation functions, embedded microcontroller logic, without limitation), and test control logic 3417. The systems and methods described below can be implemented in VMM system 3400.

    [0140] Input circuit 3406 may include circuits such as a DAC (digital to analog converter), DPC (digital to pulses converter, digital to time modulated pulse converter), AAC (analog to analog converter, such as a current to voltage converter, logarithmic converter), PAC (pulse to analog level converter), or any other type of converter. The input circuit 3406 may implement normalization, linear or non-linear up/down scaling functions, or arithmetic functions. The input circuit 3406 may implement a temperature compensation function for input levels. The input circuit 3406 may implement an activation function such as ReLU or sigmoid. The output circuit 3407 may include circuits such as an ADC (analog to digital converter, to convert neuron analog output to digital bits), AAC (analog to analog converter, such as a current to voltage converter, logarithmic converter), APC (analog to pulse(s) converter, analog to time modulated pulse converter), or any other type of converter.

    [0141] Output circuit 3407 may implement an activation function such as rectified linear activation function (ReLU) or sigmoid. The output circuit 3407 may implement statistical normalization, regularization, up/down scaling/gain functions, statistical rounding, or arithmetic functions (e.g., add, subtract, divide, multiply, shift, log) for neuron outputs. Output circuit 3407 may implement a temperature compensation function for neuron outputs or array outputs (such as bitline output) so as to keep power consumption of the array approximately constant or to improve precision of the array (neuron) outputs such as by keeping the IV slope approximately the same.

    [0142] FIGS. 35A and 35B depict hybrid memory systems 3500 and 3550, respectively. Hybrid memory systems 3500 and 3550 are each capable of operating as a multi-level digital neural memory system to obtain digital weight data from the array in a first mode or as a multi-level analog neural memory system to obtain analog weight data from the array in a second mode.

    [0143] In FIG. 35A, hybrid memory system 3500 comprises hybrid array 3501 comprising an array of non-volatile memory cells arranged into rows and columns; configurable input circuitry 3502; and configurable output circuitry 3503.

    [0144] Configurable input circuitry 3502 provides an input to hybrid array 3501 and comprises row register and digital-to-analog (DAC) block 3505 for use in the first mode and row decoder block 3504 for use in the second mode.

    [0145] Configurable output circuitry 3503 provides an output responsive to signals received from hybrid array 3501 and comprises current-to-voltage converter (ITV) and analog-to-digital converter (ADC) block 3506 for use in the first mode and multi-state sense amplifier (MS SA) block 3507 for use in the second mode. The ITV+ADC block 3506 comprises multiple ITV circuits and multiple ADC circuits. The MS SA block 3507 comprises multiple MS SA circuits.
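    As an illustration only, the pairing of modes and blocks described for FIG. 35A can be written as a small lookup. The names Mode, BLOCKS, and select_blocks are hypothetical and do not appear in the disclosure; the block names and reference numerals are taken from the description above.

```python
from enum import Enum


class Mode(Enum):
    FIRST = 1   # digital weight data is obtained from the array
    SECOND = 2  # analog weight data is obtained from the array


# Block names and reference numerals follow the description of FIG. 35A.
# The dictionary layout itself is an assumption made for illustration.
BLOCKS = {
    Mode.FIRST: {
        "input": "row register and DAC block 3505",
        "output": "ITV and ADC block 3506",
    },
    Mode.SECOND: {
        "input": "row decoder block 3504",
        "output": "MS SA block 3507",
    },
}


def select_blocks(mode: Mode) -> dict:
    """Return the input and output blocks used in the given mode."""
    return BLOCKS[mode]


if __name__ == "__main__":
    for mode in Mode:
        print(mode.name, select_blocks(mode))
```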

    [0146] In the first mode, hybrid array 3501 operates as non-volatile memory storage to store or retrieve weight data in multi-bit digital form (digital multilevel form, meaning one physical memory cell can store one of multiple discrete levels, such as 4, 8, 16, or 32 levels, so that the output of one cell would be equivalent to 2, 3, 4, or 5 digital bits, respectively). For example, if each cell can store 8 different values (a 3-bit or 3b cell), the digital weight data can vary from 000 to 111. As another example, if each cell can store 2 different values, as in a binary memory cell (a 1-bit cell), the digital weight data can vary from 0 to 1.
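    The relationship between the number of discrete levels per cell and the equivalent number of digital bits can be checked with a short sketch. The function names bits_per_cell and weight_codes are hypothetical, and the restriction to power-of-two level counts is an assumption that matches the examples given above.

```python
def bits_per_cell(levels: int) -> int:
    """Number of digital bits represented by a cell with `levels` discrete levels."""
    # Assumption: level counts are powers of two (2, 4, 8, 16, 32, ...).
    if levels < 2 or levels & (levels - 1):
        raise ValueError("levels must be a power of two and at least 2")
    return levels.bit_length() - 1


def weight_codes(levels: int) -> list:
    """All digital weight codes a cell can hold, as fixed-width bit strings."""
    width = bits_per_cell(levels)
    return [format(value, f"0{width}b") for value in range(levels)]


if __name__ == "__main__":
    for levels in (2, 4, 8, 16, 32):
        codes = weight_codes(levels)
        print(levels, "levels:", bits_per_cell(levels), "bits,", codes[0], "to", codes[-1])
```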

    [0147] In the first mode, row register and digital-to-analog (DAC) block 3505 generates an analog input signal to read one or more rows in hybrid array 3501 in response to a received digital signal. Digital MLC (multilevel cell) read mode reads only one row at a time, whereas neural read mode reads more than one row at a time, typically tens or hundreds of rows. ITV+ADC block 3506 receives analog (current) outputs from a plurality or all of the columns of hybrid array 3501 to generate digital outputs representing a neural read (reading multiple rows and multiple columns at a time) of the majority of hybrid array 3501. One ITV circuit is used to read one bitline at a time to output an analog value, which could include multiple cells on the same bitline. The ITV is typically used to convert the array output current into a voltage. One ADC circuit is typically used to read one bitline at a time to output digital bits, which could include multiple cells on the same bitline. The ADC circuit is typically used to convert a voltage into digital output bits. In one embodiment, the ADC circuit can be used to convert the array current into digital output bits directly. For example, a SAR ADC that would otherwise use voltage references can instead use current references for the operation.
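    A minimal numerical sketch of the neural read described above follows, assuming an idealized model in which each bitline current is the sum of the enabled cells' contributions, the ITV is a linear transresistance, and the ADC is a uniform quantizer. The function names, the feedback resistance r_feedback, the reference voltage v_ref, and the 8-bit resolution are all assumptions chosen for illustration and are not taken from the disclosure.

```python
def bitline_currents(inputs, cell_currents):
    """Sum the contribution of every enabled row on each bitline (column).

    inputs: one activation per row (0 disables the row).
    cell_currents: rows x columns list of cell read currents in amperes.
    """
    n_cols = len(cell_currents[0])
    return [
        sum(x * row[c] for x, row in zip(inputs, cell_currents))
        for c in range(n_cols)
    ]


def itv(current, r_feedback=1e5):
    """Idealized current-to-voltage conversion (assumed linear, in volts)."""
    return current * r_feedback


def adc(voltage, v_ref=1.0, bits=8):
    """Idealized uniform ADC that clamps to the range 0..2**bits - 1."""
    full_scale = 2 ** bits - 1
    code = int(voltage / v_ref * full_scale + 0.5)
    return max(0, min(full_scale, code))


def neural_read(inputs, cell_currents):
    """One ITV and one ADC conversion per bitline, as in the text above."""
    return [adc(itv(i)) for i in bitline_currents(inputs, cell_currents)]


if __name__ == "__main__":
    cells = [[1e-6, 2e-6], [3e-6, 4e-6], [5e-6, 6e-6]]
    print(neural_read([1, 0, 1], cells))  # rows 0 and 2 enabled, row 1 disabled
```

    Setting exactly one entry of inputs to 1 reduces the same model to a read of a single row, which corresponds to the one-row-at-a-time digital MLC read mentioned above.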

    [0148] In the second mode, hybrid array 3501 operates as a VMM in an analog neural memory to store weight data in analog multi-level form, meaning each cell stores analog multilevels that have continuous analog values between levels. For example, a digital multi-level cell of 8 levels has distinct levels 1, 2, 3, 4, . . . , 8. An analog multi-level cell of 8 levels has continuous values between levels; for example, between levels 1 and 2 there exist analog values of 1.001, 1.002, . . . , 1.01, . . . , 1.1, 1.2, . . . , 1.999, 2.0. The analog multi-levels are needed for vector matrix multiplier (VMM) applications in neural memory arrays.
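    The difference between the two storage forms can be illustrated with a toy model, assuming levels numbered 1 through 8 as in the example above. The functions store_digital and store_analog are hypothetical and ignore programming noise and tuning precision.

```python
def store_digital(target: float, levels: int = 8) -> int:
    """Digital multi-level cell: snap the target to the nearest level in 1..levels."""
    return min(levels, max(1, round(target)))


def store_analog(target: float, levels: int = 8) -> float:
    """Analog multi-level cell: keep any value in the continuous range 1..levels."""
    return min(float(levels), max(1.0, target))


if __name__ == "__main__":
    for target in (1.001, 1.4, 1.999, 7.3):
        print(target, "digital:", store_digital(target), "analog:", store_analog(target))
```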

    [0149] In the second mode, row decoder block 3504 is used to select (enable) one row in hybrid array 3501 for a read, program, or erase operation. During a read or program operation, MS SA block 3507 is used to read or verify one or more cells in one or more columns in hybrid array 3501. One MS SA circuit is used to read one cell at a time.

    [0150] Thus, hybrid memory system 3500 can operate as a multi-level digital neural memory system to obtain digital weight data from the array in a first mode or as a multi-level analog neural memory system to obtain analog weight data from the array in a second mode.

    [0151] In FIG. 35B, hybrid memory system 3550 comprises hybrid array 3551 comprising an array of non-volatile memory cells arranged into rows and columns; configurable input circuitry 3552; and configurable output circuitry 3553.

    [0152] Configurable input circuitry 3552 provides an input to hybrid array 3551 and comprises a row decoder, a row register, and a digital-to-analog block 3554. That is, blocks 3504 and 3505 of FIG. 35A are consolidated into a single block 3554. Configurable output circuitry 3553 provides an output responsive to signals received from hybrid array 3551 and comprises a current-to-voltage converter, an analog-to-digital converter, and a sense amplifier block 3555. That is, blocks 3506 and 3507 of FIG. 35A are consolidated into a single block 3555.

    [0153] In a first mode, hybrid array 3551 operates as non-volatile memory storage to store weight data in multilevel digital form. Block 3554 generates an analog input signal to read one or more rows in hybrid array 3551 in response to a received digital signal. Block 3555 receives analog (current) outputs from some or all of the columns of hybrid array 3551 to generate a digital output representing a neural read of at least a majority of the cells in hybrid array 3551.

    [0154] In a second mode, hybrid array 3551 operates as a VMM in an analog neural memory to store weight data in multi-level analog form. Block 3554 is used to select one row in hybrid array 3551 for a read, program, or erase operation by acting as a row decoder. Block 3555 is used to read or verify one or more cells in one or more columns in hybrid array 3551 by acting as a multi-state sense amplifier. Each MS SA circuit operates on one cell at a time (i.e., one bitline with one cell enabled).

    [0155] Thus, hybrid memory system 3550 can operate as a digital neural memory system to obtain digital weight data from hybrid array 3551 in a first mode or as an analog neural memory system to obtain analog weight data from hybrid array 3551 in a second mode.

    [0156] FIG. 36 depicts hybrid memory operation method 3600, which can be performed by hybrid memory system 3500 of FIG. 35A or hybrid memory system 3550 of FIG. 35B.

    [0157] In step 3601, the system determines if a VMM analog neural memory operation is to be performed. If yes, the system proceeds to step 3602. If no, the system proceeds to step 3609.

    [0158] In step 3602, a VMM analog neural operation begins. In step 3603, an input is provided by a digital-to-analog converter, and a resulting output is provided by an analog-to-digital converter. The DAC can be a 1-bit DAC.

    [0159] In step 3604, a plurality of rows are enabled.

    [0160] In step 3605, a plurality of columns are enabled.

    [0161] In step 3606, an output from the hybrid memory array is converted into a different form such as digital output bits (analog weight data).

    [0162] In step 3607, a partial sum storage is performed.

    [0163] In step 3608, the actions of summation, activation, and/or pooling are performed to generate a neural output.

    [0164] In step 3609, a digital non-volatile memory operation is to be performed.

    [0165] In step 3610, an input is provided by a row decoder, and an output is provided by a multi-state sense amplifier.

    [0166] In step 3611, a row is enabled.

    [0167] In step 3612, a column is enabled.

    [0168] In step 3613, an output from the hybrid memory array is converted into a different form such as digital output bits (digital weight data).

    [0169] In step 3614, the output from step 3613 is stored in a buffer memory such as an SRAM memory.

    [0170] In step 3615, the system determines if all target rows have been operated upon. If yes, the system proceeds to step 3616. If no, the system returns to step 3611 and performs the steps described above.

    [0171] In step 3616, the actions of summation, activation, and/or pooling are performed to generate an output.
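    The two branches of method 3600 can be sketched as ordinary control flow. This is a behavioral illustration under stated assumptions, not the disclosed implementation: the function name method_3600, the use of ReLU as the activation, the dictionary standing in for the SRAM buffer, and the multiply-accumulate performed on the buffered rows in step 3616 are all assumptions, and the array is modeled as an ideal list of numbers.

```python
def relu(x):
    return max(0, x)


def method_3600(cells, inputs, neural_op, target_rows=None):
    """Return one output per column for either branch of the flow.

    cells: rows x columns list of stored weights.
    inputs: one input value per row.
    neural_op: the decision made in step 3601.
    target_rows: rows to read in the digital branch.
    """
    n_cols = len(cells[0])
    if neural_op:  # steps 3602 and 3603: VMM analog neural operation
        rows = range(len(cells))  # step 3604: a plurality of rows enabled
        cols = range(n_cols)      # step 3605: a plurality of columns enabled
        # Steps 3606 and 3607: each column output is the sum over the
        # enabled rows, converted to a number and kept as a partial sum.
        partial = [sum(inputs[r] * cells[r][c] for r in rows) for c in cols]
        return [relu(p) for p in partial]  # step 3608

    buffer = {}  # stands in for the buffer memory of step 3614
    for r in target_rows:  # step 3611, repeated through step 3615
        # Steps 3612 to 3614: read the row cell by cell and store it.
        buffer[r] = [cells[r][c] for c in range(n_cols)]
    # Step 3616: summation and activation on the buffered digital data.
    sums = [
        sum(inputs[r] * buffer[r][c] for r in target_rows)
        for c in range(n_cols)
    ]
    return [relu(s) for s in sums]


if __name__ == "__main__":
    cells = [[1, -2], [3, 4]]
    print(method_3600(cells, [1, 1], neural_op=True))
    print(method_3600(cells, [1, 1], neural_op=False, target_rows=[0, 1]))
```

    In this idealized model both branches return the same values; the difference lies in where the summation happens, on the bitlines in the neural branch and after buffering in the digital branch.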

    [0172] FIG. 37 depicts configurable memory system 3700, which comprises hybrid memory system 3500 or 3550, as well as configurable macro circuitry 3701. Configurable macro circuitry 3701 can be configured to operate in conjunction with a first mode or a second mode of hybrid memory system 3500 or 3550. This configuration can occur during start-up or in real-time during operation. Configurable macro circuitry 3701 optionally comprises SRAM 3702, SIMD (single instruction, multiple data instruction processing) module 3703, interconnect matrix 3704 (for connecting configurable macro circuitry 3701 to hybrid memory system 3500 or 3550), and eMCU (control unit) 3705.

    [0173] FIG. 38 depicts system 3800, which comprises a plurality of hybrid array tiles 3801 (each of which can comprise hybrid memory system 3500 or 3550 or 3700), interconnect 3802, system-level SIMD module 3803, eMCUsys (system-level controller) 3804, system-level memory 3805, and system-level interface IFTC 3806 (which is a high-speed interface such as OctoSPI, PCIe, Internet, etc.).

    [0174] FIG. 39 depicts re-configurable ITV+ADC circuit 3900. Re-configurable ITV+ADC circuit 3900 comprises adjustable current source 3901, current source 3902 (which is the selected memory cell), comparator 3903, and logic 3904. Re-configurable ITV+ADC circuit 3900 can operate as a current SAR ADC with current references provided by adjustable current source (IDAC) 3901. For example, for an 8-bit current SAR ADC, the IDAC 3901 will provide 15 levels for 8 bits. The circuit is reconfigured for a digital weight read or an analog weight neural read by adjusting the IDAC reference values (e.g., for a neural read, the IDAC reference values would be larger, depending on how many rows are enabled and what the maximum bitline current is for the digital weight read or the analog weight neural read).
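    A behavioral sketch of a successive-approximation conversion with current references follows. It assumes an ideal comparator and an ideal IDAC whose reference for a trial code is that code multiplied by one least significant bit of the full-scale current. The function name current_sar_adc and its parameters are hypothetical, and the bit width is left as a parameter rather than taken from the disclosure.

```python
def current_sar_adc(i_cell, i_full_scale, bits):
    """Binary-search the code whose IDAC reference best matches i_cell.

    i_cell: current drawn by the selected cell or bitline, in amperes.
    i_full_scale: IDAC full-scale current; scaling this value is how the
        sketch models reconfiguration between a single-cell read and a
        neural read with many rows enabled.
    bits: resolution of the conversion.
    """
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)                   # logic proposes the next bit
        i_ref = trial * i_full_scale / (1 << bits)  # IDAC reference for the trial code
        if i_cell >= i_ref:                         # comparator decision
            code = trial                            # keep the bit
    return code


if __name__ == "__main__":
    # Single-cell read with a 16 uA full scale.
    print(current_sar_adc(5.2e-6, 16e-6, bits=4))
    # Neural read with ten times the bitline current and a ten times larger reference.
    print(current_sar_adc(52e-6, 160e-6, bits=4))
```

    Scaling i_full_scale together with the expected maximum bitline current keeps the output code range the same in both cases, which is one way to read the statement that the IDAC reference values would be larger for a neural read.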

    [0175] It should be noted that, as used herein, the terms “over” and “on” both inclusively include “directly on” (no intermediate materials, elements or space disposed therebetween) and “indirectly on” (intermediate materials, elements or space disposed therebetween). Likewise, the term “adjacent” includes “directly adjacent” (no intermediate materials, elements or space disposed therebetween) and “indirectly adjacent” (intermediate materials, elements or space disposed therebetween), “mounted to” includes “directly mounted to” (no intermediate materials, elements or space disposed therebetween) and “indirectly mounted to” (intermediate materials, elements or space disposed therebetween), and “electrically coupled” includes “directly electrically coupled to” (no intermediate materials or elements therebetween that electrically connect the elements together) and “indirectly electrically coupled to” (intermediate materials or elements therebetween that electrically connect the elements together). For example, forming an element “over a substrate” can include forming the element directly on the substrate with no intermediate materials/elements therebetween, as well as forming the element indirectly on the substrate with one or more intermediate materials/elements therebetween.