SPLIT ARRAY ARCHITECTURE FOR ANALOG NEURAL MEMORY IN A DEEP LEARNING ARTIFICIAL NEURAL NETWORK

20230229903 · 2023-07-20

    Abstract

    Numerous embodiments are disclosed for splitting a physical array into multiple arrays for separate vector-by-matrix multiplication (VMM) operations. In one example, a system comprises an array of non-volatile memory cells arranged into rows and columns; and a plurality of sets of output lines, where each column contains a set of output lines; wherein each row is coupled to only one output line in the set of output lines for each column.

    Claims

    1. A system comprising: an array of non-volatile memory cells arranged into rows and columns; and a plurality of sets of two or more output lines, where each column contains a set of two or more output lines; wherein each row is coupled to only one output line in the set of two or more output lines for each column.

    2. The system of claim 1, wherein the output lines are bit lines.

    3. The system of claim 1, wherein each set of two or more output lines in the plurality of sets of two or more output lines comprises two output lines.

    4. The system of claim 1, wherein each set of two or more output lines in the plurality of sets of two or more output lines comprises four output lines.

    5. The system of claim 1, wherein the non-volatile memory cells are split-gate flash memory cells.

    6. The system of claim 1, wherein the non-volatile memory cells are stacked-gate flash memory cells.

    7. The system of claim 1 comprising: an output driver coupled to the plurality of sets of two or more output lines.

    8. The system of claim 1 comprising: a first output driver coupled to a first line in each set of two or more output lines in the plurality of sets of two or more output lines; and a second output driver coupled to a second line in each set of two or more output lines in the plurality of sets of two or more output lines.

    9. The system of claim 8 comprising: a third output driver coupled to a third line in each set of two or more output lines in the plurality of sets of two or more output lines; and a fourth output driver coupled to a fourth line in each set of two or more output lines in the plurality of sets of two or more output lines.

    10. The system of claim 8 comprising: a high voltage decoder for providing high voltages to the array during program and erase operations.

    11. The system of claim 8 comprising: bit line drivers coupled to the plurality of sets of two or more output lines.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0096] FIG. 1 is a diagram that illustrates an artificial neural network.

    [0097] FIG. 2 depicts a prior art split gate flash memory cell.

    [0098] FIG. 3 depicts another prior art split gate flash memory cell.

    [0099] FIG. 4 depicts another prior art split gate flash memory cell.

    [0100] FIG. 5 depicts another prior art split gate flash memory cell.

    [0101] FIG. 6 is a diagram illustrating the different levels of an exemplary artificial neural network utilizing one or more non-volatile memory arrays.

    [0102] FIG. 7 is a block diagram illustrating a vector-by-matrix multiplication system.

    [0103] FIG. 8 is a block diagram illustrating an exemplary artificial neural network utilizing one or more vector-by-matrix multiplication systems.

    [0104] FIG. 9 depicts another embodiment of a vector-by-matrix multiplication system.

    [0105] FIG. 10 depicts another embodiment of a vector-by-matrix multiplication system.

    [0106] FIG. 11 depicts another embodiment of a vector-by-matrix multiplication system.

    [0107] FIG. 12 depicts another embodiment of a vector-by-matrix multiplication system.

    [0108] FIG. 13 depicts another embodiment of a vector-by-matrix multiplication system.

    [0109] FIG. 14 depicts a prior art long short-term memory system.

    [0110] FIG. 15 depicts an exemplary cell for use in a long short-term memory system.

    [0111] FIG. 16 depicts an embodiment of the exemplary cell of FIG. 15.

    [0112] FIG. 17 depicts another embodiment of the exemplary cell of FIG. 15.

    [0113] FIG. 18 depicts a prior art gated recurrent unit system.

    [0114] FIG. 19 depicts an exemplary cell for use in a gated recurrent unit system.

    [0115] FIG. 20 depicts an embodiment of the exemplary cell of FIG. 19.

    [0116] FIG. 21 depicts another embodiment of the exemplary cell of FIG. 19.

    [0117] FIG. 22 depicts another embodiment of a vector-by-matrix multiplication system.

    [0118] FIG. 23 depicts another embodiment of a vector-by-matrix multiplication system.

    [0119] FIG. 24 depicts another embodiment of a vector-by-matrix multiplication system.

    [0120] FIG. 25 depicts another embodiment of a vector-by-matrix multiplication system.

    [0121] FIG. 26A depicts another embodiment of a vector-by-matrix multiplication system.

    [0122] FIG. 26B depicts another embodiment of a vector-by-matrix multiplication system.

    [0123] FIG. 26C depicts another embodiment of a vector-by-matrix multiplication system.

    [0124] FIG. 27 depicts another embodiment of a vector-by-matrix multiplication system.

    [0125] FIG. 28 depicts another embodiment of a vector-by-matrix multiplication system.

    [0126] FIG. 29 depicts another embodiment of a vector-by-matrix multiplication system.

    [0127] FIG. 30 depicts another embodiment of a vector-by-matrix multiplication system.

    [0128] FIG. 31 depicts a vector-by-matrix multiplication system.

    [0129] FIG. 32 depicts an embodiment of a split vector-by-matrix multiplication system.

    [0130] FIG. 33 depicts an embodiment of a split array vector-by-matrix multiplication system.

    [0131] FIG. 34 depicts another embodiment of a split array vector-by-matrix multiplication system.

    [0132] FIG. 35 depicts another embodiment of a split array vector-by-matrix multiplication system.

    [0133] FIG. 36 depicts another embodiment of a split array vector-by-matrix multiplication system.

    [0134] FIG. 37 depicts an embodiment of a split array in a vector-by-matrix multiplication system.

    [0135] FIG. 38 depicts another embodiment of a split array in a vector-by-matrix multiplication system.

    [0136] FIG. 39 depicts exemplary layouts of a single array and a split array in vector-by-matrix multiplication systems.

    [0137] FIG. 40 depicts an example of a split array vector-by-matrix multiplication system with multiple output lines per column.

    [0138] FIG. 41 depicts an example of a split array vector-by-matrix multiplication system with multiple output lines per column.

    [0139] FIG. 42 depicts an example of an array with multiple output lines per column.

    [0140] FIG. 43 depicts an example of a split array vector-by-matrix multiplication system with multiple output lines per column.

    [0141] FIG. 44 depicts an example of an array that comprises a plurality of the arrays depicted in FIG. 42.

    DETAILED DESCRIPTION OF THE INVENTION

    [0142] The artificial neural networks of the present invention utilize a combination of CMOS technology and non-volatile memory arrays.

    [0143] VMM System Overview

    [0144] FIG. 31 depicts a block diagram of VMM system 3100. VMM system 3100 comprises VMM array 3101, row decoder 3102, high voltage decoder 3103, column decoder 3104, bit line drivers 3105, input circuit 3106, output circuit 3107, control logic 3108, and bias generator 3109. VMM system 3100 further comprises high voltage generation block 3110, which comprises charge pump 3111, charge pump regulator 3112, and high voltage level generator 3113. VMM system 3100 further comprises algorithm controller 3114 (for program/erase, also known as weight tuning), analog circuitry 3115, control engine 3116 (that may include special functions such as arithmetic functions, activation functions, embedded microcontroller logic, etc.), and test control logic 3117. The systems and methods described below can be implemented in VMM system 3100.

    [0145] The input circuit 3106 may include circuits such as a DAC (digital to analog converter), DPC (digital to pulses converter, digital to time modulated pulse converter), AAC (analog to analog converter, such as a current to voltage converter, logarithmic converter), PAC (pulse to analog level converter), or any other type of converter. The input circuit 3106 may implement normalization, linear or non-linear up/down scaling functions, or arithmetic functions. The input circuit 3106 may implement a temperature compensation function for input levels. The input circuit 3106 may implement an activation function such as ReLU or sigmoid. The output circuit 3107 may include circuits such as an ADC (analog to digital converter, to convert neuron analog output to digital bits), AAC (analog to analog converter, such as a current to voltage converter, logarithmic converter), APC (analog to pulse(s) converter, analog to time modulated pulse converter), a current-to-voltage converter, or any other type of converter. The output circuit 3107 may implement an activation function such as ReLU or sigmoid. The output circuit 3107 may implement statistical normalization, regularization, up/down scaling/gain functions, statistical rounding, or arithmetic functions (e.g., add, subtract, divide, multiply, shift, log) for neuron outputs. The output circuit 3107 may implement a temperature compensation function for neuron outputs or array outputs (such as bitline output) so as to keep power consumption of the array approximately constant or to improve precision of the array (neuron) outputs such as by keeping the IV slope approximately the same.

    [0146] FIGS. 32-36 depict embodiments of VMM systems that contain some commonality with VMM system 3100 but also some modifications.

    [0147] FIG. 32 depicts VMM system 3200. VMM system 3200 comprises array 3201, shared row decoder 3202, shared high voltage decoder 3203, column decoders 3204 and 3205, (row) input circuit 3220, output circuits 3206 and 3207, and shared bit line drivers 3208. Shared row decoder 3202 is coupled to all rows in array 3201 and applies a voltage to a selected row. Shared high voltage decoder 3203 can be selectively coupled to all rows in array 3201. Shared high voltage decoder 3203 optionally comprises control gate high voltage decoder 3231 that can be selectively coupled to all rows in the array and shared erase gate high voltage decoder 3232 that can be selectively coupled to all rows in the array. The input circuit 3220 is, for example, similar to the input circuit 3106 of FIG. 31. The circuits and functions of the output circuits 3206 and 3207 are, for example, each similar to the circuits and functions of the output circuit 3107 of FIG. 31. Unlike in VMM system 3100, in VMM system 3200 certain operations are split between different sets of circuitry. Specifically, half of the columns (for example, all odd columns) in array 3201 are operated upon by column decoder 3204 and output circuit 3206, and the other half of the columns (for example, all even columns) in array 3201 are operated upon by column decoder 3205 and output circuit 3207. Thus, output circuit 3206 is coupled to column decoder 3204 for generating a first output from one or more columns in a first half of the columns during a read operation, and output circuit 3207 is coupled to column decoder 3205 for generating a second output from one or more columns in a second half of the columns during a read operation. In this embodiment, all columns are coupled to shared bit line drivers 3208 during program or erase operations. This allows multiple bitlines to be read concurrently, meaning that bitlines coupled to the column decoder 3204 and the output circuit 3206 and bitlines coupled to the column decoder 3205 and the output circuit 3207 are enabled at the same time, by shared bit line drivers 3208, for read operations. Hence, this increases the throughput for reading the array 3201. Alternatively, the read operations need not be concurrent.
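
As a rough illustration of the column split described for FIG. 32, the following Python sketch models one read in which the odd columns and the even columns are collected by separate output paths. The function name, the integer weights, and the ideal multiply-and-sum cell model are assumptions made for illustration only; they are not taken from the disclosure.

```python
# Hypothetical model of the FIG. 32 column split; all names are illustrative.

def read_split_columns(weights, inputs):
    """Return (odd_outputs, even_outputs) for one row-input vector.

    weights[r][c] stands in for the cell current per unit input at row r,
    column c. Columns are numbered from 1, so list index 0 is an odd column.
    """
    n_rows = len(weights)
    n_cols = len(weights[0])
    # Each bit line sums the contributions of every driven row in its column.
    column_sums = [
        sum(inputs[r] * weights[r][c] for r in range(n_rows))
        for c in range(n_cols)
    ]
    # Odd columns: path through column decoder 3204 and output circuit 3206.
    odd_outputs = column_sums[0::2]
    # Even columns: path through column decoder 3205 and output circuit 3207.
    even_outputs = column_sums[1::2]
    return odd_outputs, even_outputs


weights = [[1, 2, 3, 4],
           [5, 6, 7, 8]]
inputs = [1, 2]
odd, even = read_split_columns(weights, inputs)
print(odd, even)
```

Because the two halves use separate column decoders and output circuits, both lists could be produced in the same read cycle, which is the throughput benefit the paragraph describes.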

    [0148] Optionally, with further reference to FIG. 39, continuous diffusion can be implemented between the top half and the bottom half of the array.

    [0149] FIG. 33 depicts VMM system 3300. VMM system 3300 comprises arrays 3301a and 3301b, row decoder 3302, shared high voltage decoder 3303, column decoders 3304 and 3305, input circuit 3320, current-to-voltage converter circuits 3306 and 3307, shared analog-to-digital converter (ADC) 3308, and shared bit line drivers 3309. The current-to-voltage converter circuits 3306 or 3307 and the shared ADC circuit 3308 are parts of the output circuit 3207 in FIG. 32.

    [0150] Unlike in VMM system 3100, in VMM system 3300 certain operations are split between different sets of circuitry. Specifically, array 3301a is operated upon by column decoder 3304 and current-to-voltage converter 3306, and array 3301b is operated upon by column decoder 3305 and current-to-voltage converter 3307. This allows multiple read and/or program operations to be performed simultaneously, where read or program operations can be performed concurrently on one or more cells in array 3301a and one or more cells in array 3301b.

    [0151] Current-to-voltage converter circuits 3306 and 3307 are both coupled to shared analog-to-digital converter 3308, which is used in a time multiplexing fashion during read operations, and to shared bit line drivers 3309, which are used during program and erase operations. For example, in a read operation, the array 3301a is enabled and is coupled to the column decoder 3304 and to the current-to-voltage converter circuit 3306 while the array 3301b is enabled and is coupled to the column decoder 3305 and the current-to-voltage converter circuit 3307 at the same time. The output voltages from the current-to-voltage converter circuits 3306 and 3307 are sampled and held (S/H), e.g., by S/H capacitors inside the shared ADC 3308, and these array output voltages are digitized (converted) by the time multiplexed shared ADC 3308 (since it is shared between the current-to-voltage converter circuits 3306 and 3307). For example, for one ADC shared between two current-to-voltage converter circuits, two sets of S/H capacitors are used. In another embodiment, one ADC can be used for N current-to-voltage converter circuits, and in this case N sets of S/H capacitors are used.
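
The sample-and-hold and time-multiplexed conversion sequence described above can be sketched as follows. This is a minimal behavioral model, assuming an ideal ADC with a hypothetical 8-bit resolution and a 1.0 V reference; the function names and parameters are illustrative and do not come from the disclosure.

```python
# Hypothetical sketch of one ADC shared by N current-to-voltage converters.

def quantize(voltage, v_ref=1.0, bits=8):
    """Ideal ADC: map a voltage in [0, v_ref] to an integer code."""
    clamped = min(max(voltage, 0.0), v_ref)
    return int(clamped / v_ref * (2 ** bits - 1))


def shared_adc_read(converter_voltages, v_ref=1.0, bits=8):
    """Return a list of (time_slot, code) pairs, one per converter."""
    # Step 1: all converter outputs are captured at the same instant,
    # one sample-and-hold capacitor set per converter (N sets for N converters).
    held = list(converter_voltages)
    # Step 2: the single ADC digitizes the held values one after another.
    codes = []
    for slot, voltage in enumerate(held):
        codes.append((slot, quantize(voltage, v_ref, bits)))
    return codes


# Two converters, as with circuits 3306 and 3307 sharing ADC 3308.
codes = shared_adc_read([0.5, 1.0])
print(codes)
```

The held copies are what allow both arrays to be read at the same time while the conversion itself proceeds serially.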

    [0152] The use of a shared ADC between two current-to-voltage converter circuits can be applied to the systems of FIGS. 34, 35, and 36 as well.

    [0153] FIG. 34 depicts VMM system 3400. VMM system 3400 comprises arrays 3401a and 3401b, shared row decoder 3402, shared high voltage decoder 3403, column decoders 3404 and 3405, input circuit 3420, output circuits 3406 and 3407, and shared bit line drivers 3408. Unlike in VMM system 3100, in VMM system 3400 certain operations are split between different sets of circuitry. Specifically, array 3401a is operated upon by column decoder 3404 and output circuit 3406, and array 3401b is operated upon by column decoder 3405 and output circuit 3407. This allows multiple read and/or program operations to be performed simultaneously, where read or program operations can be performed concurrently on one or more cells in array 3401a and one or more cells in array 3401b. Arrays 3401a and 3401b are both coupled to shared bit line drivers 3408, which are used during program and erase operations.

    [0154] FIG. 35 depicts VMM system 3500. VMM system 3500 comprises arrays 3501a, 3501b, 3501c, and 3501d; row decoders 3502 and 3503; shared high voltage decoder 3504; column decoders 3505, 3506, 3507, and 3508; input circuit 3520, output circuits 3509, 3510, 3511, and 3512; and shared bit line drivers 3513 and 3514. Shared high voltage decoder 3504 can be selectively coupled to all rows in arrays 3501a, 3501b, 3501c, and 3501d. Row decoder 3502 is shared by arrays 3501a and 3501b and is coupled to all rows in those arrays and applies a voltage to a selected row, and row decoder 3503 is shared by arrays 3501c and 3501d and is coupled to all rows in those arrays and applies a voltage to a selected row.

    [0155] In VMM system 3500, certain operations are split between different sets of circuitry. Specifically, array 3501a is operated upon by column decoder 3505 and output circuit 3509; array 3501b is operated upon by column decoder 3507 and output circuit 3511; array 3501c is operated upon by column decoder 3506 and output circuit 3510; and array 3501d is operated upon by column decoder 3508 and output circuit 3512. This allows multiple read and/or program operations to be performed simultaneously in all four arrays at once, where read or program operations can be performed concurrently on one or more cells in array 3501a, one or more cells in array 3501b, one or more cells in array 3501c, and one or more cells in array 3501d. Arrays 3501a and 3501b are both selectively coupled to shared bit line drivers 3513 during program and erase operations. Arrays 3501c and 3501d are both selectively coupled to shared bit line drivers 3514 during program and erase operations.

    [0156] For example, a first read operation can be performed where column decoder 3505 and output circuit 3509 generate a first output from one or more rows in array 3501a, a second read operation can be performed where column decoder 3506 and output circuit 3510 generate a second output from one or more rows in array 3501c, a third read operation can be performed where column decoder 3507 and output circuit 3511 generate a third output from one or more rows in array 3501b, and a fourth read operation can be performed where column decoder 3508 and output circuit 3512 generate a fourth output from one or more rows in array 3501d. Optionally, the first and third read operations can occur concurrently. Optionally, the second and fourth read operations can occur concurrently.
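
The assignment of circuitry to arrays in VMM system 3500 can be summarized as a lookup table. The sketch below records the reference numerals given above and adds a hypothetical conflict check; the rule that two reads conflict only when they need the same column decoder or output circuit is an assumption made for illustration, not a statement from the disclosure.

```python
# Routing of FIG. 35 arrays to their circuitry, using the reference numerals
# from the description. The dictionary layout itself is illustrative.
FIG35_ROUTING = {
    "3501a": {"row_decoder": 3502, "column_decoder": 3505,
              "output_circuit": 3509, "bit_line_drivers": 3513},
    "3501b": {"row_decoder": 3502, "column_decoder": 3507,
              "output_circuit": 3511, "bit_line_drivers": 3513},
    "3501c": {"row_decoder": 3503, "column_decoder": 3506,
              "output_circuit": 3510, "bit_line_drivers": 3514},
    "3501d": {"row_decoder": 3503, "column_decoder": 3508,
              "output_circuit": 3512, "bit_line_drivers": 3514},
}


def reads_conflict(array_a, array_b):
    """Assumed rule: reads conflict if they share a column decoder or output circuit."""
    a, b = FIG35_ROUTING[array_a], FIG35_ROUTING[array_b]
    return (a["column_decoder"] == b["column_decoder"]
            or a["output_circuit"] == b["output_circuit"])


names = sorted(FIG35_ROUTING)
conflicts = [(x, y) for i, x in enumerate(names)
             for y in names[i + 1:] if reads_conflict(x, y)]
print(conflicts)
```

Under that assumed rule, no pair of distinct arrays contends for read circuitry, which is consistent with the statement that reads can proceed in all four arrays at once.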

    [0157] FIG. 36 depicts VMM system 3600. VMM system 3600 comprises arrays 3601a, 3601b, 3601c, and 3601d; row decoder 3621; control gate decoders 3602 and 3603; shared high voltage decoder 3604; column decoders 3605, 3606, 3607, and 3608; output circuits 3609, 3610, 3611, and 3612; and shared bit line drivers 3613 and 3614. In VMM system 3600, certain operations are split between different sets of circuitry. Specifically, array 3601a is operated upon by column decoder 3605 and output circuit 3609; array 3601b is operated upon by column decoder 3607 and output circuit 3611; array 3601c is operated upon by column decoder 3606 and output circuit 3610; and array 3601d is operated upon by column decoder 3608 and output circuit 3612. This allows multiple read and/or program operations to be performed simultaneously in all four arrays at once, where read or program operations can be performed concurrently on one or more cells in array 3601a, one or more cells in array 3601b, one or more cells in array 3601c, and one or more cells in array 3601d. Arrays 3601a and 3601b are both selectively coupled to shared bit line drivers 3613 during program and erase operations. Arrays 3601c and 3601d are both selectively coupled to shared bit line drivers 3614 during program and erase operations.

    [0158] FIGS. 32-36 show reading performed with the row input on the control gates. Alternatively, the row input can be applied on the word lines or erase gates. The input circuits 3220 in FIG. 32, 3320 in FIG. 33, 3420 in FIG. 34, 3520 in FIG. 35, and 3620 in FIG. 36 are similar to the input circuit 3106 of FIG. 31. The output circuits 3206/3207 in FIG. 32, 3406/3407 in FIG. 34, 3509/3510/3511/3512 in FIG. 35, and 3609/3610/3611/3612 in FIG. 36 are similar to the output circuit 3107 of FIG. 31.

    [0159] FIG. 37 depicts a portion of VMM array 3700. VMM array 3700 comprises rows 3701, 3702, 3703, 3704, 3705, 3706, 3707, and 3708. Rows 3701, 3702, 3705, and 3706 share an erase gate line (EG0) and a source line (SL0); rows 3703, 3704, 3707, and 3708 share an erase gate line (EG1) and a source line (SL1). In addition, rows 3701 and 3703 share a control gate line (CG0/CG2); rows 3702 and 3704 share a control gate line (CG1/CG3); rows 3705 and 3707 share a control gate line (CG4/CG6); and rows 3706 and 3708 share a control gate line (CG5/CG7). These couplings allow different rows to share decoder circuitry. The array terminals are shared such that program or erase disturb is reduced by having a reduced amount of erase or program voltage stress on un-selected cells.

    [0160] In the arrays of FIGS. 37 and 38 (described below), the row input for VMM arrays 3700 and 3800 for neural read operations (in which multiple rows and multiple bitlines are on at the same time) is on the word lines. If the input for neural read is on the control gates, the control gates cannot be shared across multiple rows in the same sub-array or array bank.

    [0161] FIG. 38 depicts a portion of array 3800. Array 3800 comprises sectors 3809 and 3819. Sector 3809 comprises rows 3801, 3802, 3803, 3804, 3805, 3806, 3807, and 3808. Sector 3819 comprises rows 3811, 3812, 3813, 3814, 3815, 3816, 3817, and 3818.

    [0162] Rows 3801 (a first row) and 3811 (a second row) share a control gate line (CG0) (meaning that the control gate terminal of each cell in those rows is coupled to the same control gate line); rows 3802 and 3812 share a control gate line (CG1) (meaning that the control gate terminal of each cell in those rows is coupled to the same control gate line); rows 3803 and 3813 share a control gate line (CG2) (meaning that the control gate terminal of each cell in those rows is coupled to the same control gate line); rows 3804 and 3814 share a control gate line (CG3) (meaning that the control gate terminal of each cell in those rows is coupled to the same control gate line); rows 3805 and 3815 share a control gate line (CG4) (meaning that the control gate terminal of each cell in those rows is coupled to the same control gate line); rows 3806 and 3816 share a control gate line (CG5) (meaning that the control gate terminal of each cell in those rows is coupled to the same control gate line); rows 3807 and 3817 share a control gate line (CG6) (meaning that the control gate terminal of each cell in those rows is coupled to the same control gate line); and rows 3808 and 3818 share a control gate line (CG7) (meaning that the control gate terminal of each cell in those rows is coupled to the same control gate line). This means that the control gates are shared across the sectors. These couplings allow different rows to share decoder circuitry. The array terminals are shared such that the program or erase disturb is reduced by having a reduced amount of erase or program voltage stress on un-selected cells.

    [0163] Rows 3801 (a first row), 3802 (a third row), 3805, and 3806 share an erase gate line (EG0) (meaning that the erase gate terminal of each cell in those rows is coupled to the same erase gate line) and a source line (SL0) (meaning that the source line terminal of each cell in those rows is coupled to the same source line); rows 3803, 3804, 3807, and 3808 share an erase gate line (EG1) (meaning that the erase gate terminal of each cell in those rows is coupled to the same erase gate line) and a source line (SL1) (meaning that the source line terminal of each cell in those rows is coupled to the same source line); rows 3811, 3812, 3815, and 3816 share an erase gate line (EG0) (meaning that the erase gate terminal of each cell in those rows is coupled to the same erase gate line) and a source line (SL0) (meaning that the source line terminal of each cell in those rows is coupled to the same source line); and rows 3813, 3814, 3817, and 3818 share an erase gate line (EG1) (meaning that the erase gate terminal of each cell in those rows is coupled to the same erase gate line) and a source line (SL1) (meaning that the source line terminal of each cell in those rows is coupled to the same source line).
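
The line sharing described for FIG. 38 follows a regular pattern that can be written as a small mapping. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the erase gate and source line names are local to each sector (the description lists them separately for sectors 3809 and 3819), while the control gate lines are shared across sectors.

```python
# Hypothetical mapping from a FIG. 38 row reference numeral to its shared lines.

def lines_for_row(ref):
    """Return the sector and the CG/EG/SL line names for row 3801-3808 or 3811-3818."""
    if 3801 <= ref <= 3808:
        sector, pos = 3809, ref - 3801
    elif 3811 <= ref <= 3818:
        sector, pos = 3819, ref - 3811
    else:
        raise ValueError("row reference numeral is not in FIG. 38")
    # Positions 0, 1, 4, 5 use EG0/SL0; positions 2, 3, 6, 7 use EG1/SL1.
    group = (pos // 2) % 2
    return {
        "sector": sector,
        "CG": f"CG{pos}",    # shared with the same position in the other sector
        "EG": f"EG{group}",  # assumed to be local to the sector
        "SL": f"SL{group}",  # assumed to be local to the sector
    }


print(lines_for_row(3801))
print(lines_for_row(3811))
```

Rows 3801 and 3811 resolve to the same control gate line (CG0) but to different sectors, which matches the statement that control gates are shared across the sectors.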

    [0164] FIG. 39 depicts exemplary layouts of a portion of a single array 3901 (such as array 3101 in FIG. 31 and array 3201 in FIG. 32) and a split array 3902 (such as arrays 3301a and 3301b in FIG. 33, arrays 3401a and 3401b in FIG. 34, arrays 3501a, 3501b, 3501c, and 3501d in FIG. 35, and arrays 3601a, 3601b, 3601c, and 3601d in FIG. 36). Split array 3902 follows the same design as array 3901 except that certain contacts and metal connection 3904 are removed (or not formed), creating sub-arrays 3903a and 3903b. The few dummy rows at the interface are disabled, such as by grounding the word lines and control gates. This maintains process uniformity because the front-end layers (i.e., continuous column diffusion within columns and continuous row diffusion within source lines) and polysilicon are continuous and uniform between the two arrays of non-volatile memory cells (between electrically separate arrays). This also results in reduced area overhead as compared to physical separation of different arrays.

    [0165] Array with Multiple Output Lines Per Column

    [0166] FIG. 40 depicts VMM system 4000. VMM system 4000 comprises arrays 4001a and 4001b, shared row decoder 4002, shared high voltage decoder 4003, column decoder 4005, input circuit 4020, output circuits 4006 and 4007, and shared bit line drivers 4008 (which control programming, such as by providing program current or inhibiting the bitline during programming). Unlike in VMM system 3100, in VMM system 4000 certain operations are split between different sets of circuitry. Specifically, although array 4001a and array 4001b can be formed of the same physical array, by design there are multiple output lines per memory column (a column of memory cells). In this example, the output lines are bit lines, meaning that there are multiple bitlines per column, with one set of bit lines being coupled to output circuit 4006 and another set of bit lines being coupled to output circuit 4007. This allows read and/or program operations to be performed simultaneously on array 4001a and array 4001b, where read or program operations can be performed concurrently on one or more cells in array 4001a and one or more cells in array 4001b. Arrays 4001a and 4001b are both coupled to shared bit line drivers 4008, which are used during program and erase operations. Alternatively, the read or program operations can be performed independently at different time periods.

    [0167] FIG. 41 depicts VMM system 4100. VMM system 4100 comprises arrays 4101a and 4101b, shared row decoder 4102, shared high voltage decoder 4103, column decoder 4105, input circuit 4120, shared output circuit 4106, and shared bit line drivers 4107. VMM system 4100 is similar to VMM system 4000 except that VMM system 4100 contains one output circuit 4106 instead of two output circuits 4006 and 4007. As in VMM system 4000, in VMM system 4100 certain operations are split between different sets of circuitry. Specifically, although array 4101a and array 4101b can be formed of the same physical array, by design there are multiple output lines per column. In this example, the output lines are bit lines, meaning that there are multiple bitlines per column, with all bit lines coupled to output circuit 4106. This allows read and/or program operations to be performed simultaneously on array 4101a and array 4101b, where read or program operations can be performed concurrently on one or more cells in array 4101a and one or more cells in array 4101b. Arrays 4101a and 4101b are both coupled to shared bit line drivers 4107, which are used during program and erase operations. Alternatively, the read or program operations can be performed independently at different time periods.

    [0168] FIG. 42 depicts a portion of array 4200. Array 4200 comprises a plurality of non-volatile memory cells arranged into rows and columns 4201, 4202, 4203, and 4204. It is to be understood that array 4200 can comprise many additional rows and columns that are not shown but that the same principles shown will also apply to those additional rows and columns. In this example, each non-volatile memory cell is contained in a row and a column and has a bit line terminal, word line terminal, control gate terminal, erase gate terminal, and source line terminal as in memory cell 310 of FIG. 3.

    [0169] Unlike in the prior art, each column contains two output lines (two bit lines). Alternatively, each column can contain more than two output lines. For example, column 4201 comprises bit lines BLB0 and BLT0, column 4202 comprises bit lines BLB1 and BLT1, column 4203 comprises bit lines BLB2 and BLT2, and column 4204 comprises bit lines BLB3 and BLT3. The rows are grouped into block arrays 4210 and 4220. The cells in array 4210 are coupled to bit lines BLT0, BLT1, BLT2, and BLT3 but are not coupled to bit lines BLB0, BLB1, BLB2, and BLB3, and the cells in array 4220 are coupled to bit lines BLB0, BLB1, BLB2, and BLB3 but are not coupled to bit lines BLT0, BLT1, BLT2, and BLT3.

    [0170] Due to this configuration, the rows in array 4210 can be read or programmed concurrently with the rows in array 4220 without any contention. This provides extreme efficiency when, for example, array 4210 is used in a first VMM operation and array 4220 is used in a second, separate VMM operation. This allows the same physical array to enable two separate VMM arrays and operations to be operated upon concurrently, which improves the speed of the VMM system compared to the prior art.
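
To make the two-bit-lines-per-column arrangement concrete, the following Python sketch computes two independent vector-by-matrix products from one physical array. It is a simplified behavioral model: the function name, the integer weights, and the assumption that the first rows form array 4210 (coupled to the BLT lines) and the remaining rows form array 4220 (coupled to the BLB lines) are illustrative choices rather than details from the disclosure.

```python
# Hypothetical behavioral model of an array with two bit lines per column.

def dual_vmm(weights, inputs, top_rows):
    """Rows [0, top_rows) drive the BLT lines; the remaining rows drive the BLB lines.

    Returns (blt, blb), each holding one summed value per column.
    """
    n_cols = len(weights[0])
    blt = [0] * n_cols
    blb = [0] * n_cols
    for r, row in enumerate(weights):
        # Each row is coupled to exactly one output line in each column.
        target = blt if r < top_rows else blb
        for c, w in enumerate(row):
            target[c] += inputs[r] * w
    return blt, blb


weights = [[1, 2],   # rows of array 4210 (BLT side)
           [3, 4],
           [5, 6],   # rows of array 4220 (BLB side)
           [7, 8]]
inputs = [1, 1, 2, 2]
blt, blb = dual_vmm(weights, inputs, top_rows=2)
print(blt, blb)
```

Because no row contributes to both lines of a column, the two sums never mix, which is why the two VMM operations can run concurrently without contention.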

    [0171] FIG. 43 depicts VMM system 4300. VMM system 4300 comprises arrays 4301a, 4301b, 4301c, and 4301d, shared row decoder 4302, shared high voltage decoder 4303, column decoders 4305 and 4310, shared input circuit 4320, output circuits 4306, 4307, 4311, and 4312, and bit line drivers 4308 and 4313. Unlike in VMM system 3100, in VMM system 4300 certain operations are split between different sets of circuitry. Specifically, although arrays 4301a, 4301b, 4301c, and 4301d can be formed of the same physical array, by design there are multiple output lines per column. In this example, the output lines are bit lines, and there are four output lines (bit lines) per memory column, with a first set of bit lines being coupled to output circuit 4306, a second set of bit lines being coupled to output circuit 4307, a third set of bit lines being coupled to output circuit 4311, and a fourth set of bit lines being coupled to output circuit 4312. This allows read and/or program operations to be performed simultaneously on arrays 4301a, 4301b, 4301c, and 4301d, where read or program operations can be performed concurrently on one or more cells in array 4301a, one or more cells in array 4301b, one or more cells in array 4301c, and one or more cells in array 4301d. Arrays 4301a and 4301b are both coupled to bit line drivers 4313, which are used during program and erase operations involving those arrays, and arrays 4301c and 4301d are coupled to bit line drivers 4308, which are used during program and erase operations involving those arrays. Alternatively, the read or program operations can be performed independently at different time periods.

    [0172] FIG. 44 depicts VMM system 4400, which comprises a plurality of sets of arrays, where each set comprises array 4401a and array 4401b, where each array 4401a is an instance of array 4210 in FIG. 42 and each array 4401b is an instance of array 4220 in FIG. 42.

    [0173] It should be noted that, as used herein, the terms “over” and “on” both inclusively include “directly on” (no intermediate materials, elements or space disposed therebetween) and “indirectly on” (intermediate materials, elements or space disposed therebetween). Likewise, the term “adjacent” includes “directly adjacent” (no intermediate materials, elements or space disposed therebetween) and “indirectly adjacent” (intermediate materials, elements or space disposed therebetween), “mounted to” includes “directly mounted to” (no intermediate materials, elements or space disposed therebetween) and “indirectly mounted to” (intermediate materials, elements or space disposed therebetween), and “electrically coupled” includes “directly electrically coupled to” (no intermediate materials or elements therebetween that electrically connect the elements together) and “indirectly electrically coupled to” (intermediate materials or elements therebetween that electrically connect the elements together). For example, forming an element “over a substrate” can include forming the element directly on the substrate with no intermediate materials/elements therebetween, as well as forming the element indirectly on the substrate with one or more intermediate materials/elements therebetween.