STATIC RANDOM-ACCESS MEMORY FOR DEEP NEURAL NETWORKS
20220309330 · 2022-09-29
Inventors
- Jae-sun Seo (Tempe, AZ, US)
- Shihui Yin (Mesa, AZ, US)
- Zhewei Jiang (New York, NY, US)
- Mingoo Seok (New York, NY, US)
Cpc classification
G11C27/005
PHYSICS
G06N3/082
PHYSICS
G11C11/413
PHYSICS
G11C7/16
PHYSICS
G11C7/1006
PHYSICS
International classification
G11C11/413
PHYSICS
G11C27/00
PHYSICS
G11C7/10
PHYSICS
Abstract
A static random-access memory (SRAM) system includes SRAM cells configured to perform exclusive NOR operations between a stored binary weight value and a provided binary input value. In some embodiments, SRAM cells are configured to perform exclusive NOR operations between a stored binary weight value and a provided ternary input value. The SRAM cells are suitable for the efficient implementation of emerging deep neural network technologies such as binary neural networks and XNOR neural networks.
Claims
1. (canceled)
2. A static random-access memory (SRAM) system comprising a SRAM cell, the SRAM cell comprising: a write word line; a first write bit line and a second write bit line; a read bit line; a first read word line, a second read word line, a third read word line, and a fourth read word line; a first inverter comprising an input coupled to a first intermediate node, an output coupled to a second intermediate node, a first voltage input node coupled to a supply voltage, and a second voltage input node coupled to a fixed voltage; a second inverter comprising an input coupled to the second intermediate node, an output coupled to the first intermediate node, a first voltage input node coupled to the supply voltage, and a second voltage input node coupled to a fixed voltage; a third inverter comprising an input coupled to the first intermediate node, an output coupled to the read bit line, a first voltage input node coupled to the first read word line, and a second voltage input node coupled to the second read word line; a fourth inverter comprising an input coupled to the second intermediate node, an output coupled to the first intermediate node, a first voltage input node coupled to the third read word line, and a second voltage input node coupled to the fourth read word line; a first switching element comprising a control node coupled to the write word line, a first switching node coupled to the first write bit line, and a second switching node coupled to the first intermediate node; and a second switching element comprising a control node coupled to the write word line, a first switching node coupled to the second write bit line, and a second switching node coupled to the second intermediate node.
3. The SRAM system of claim 2, further comprising memory control circuitry coupled to the SRAM cell and configured to: write a binary weight value to the SRAM cell; and provide signals at the first read word line, the second read word line, the third read word line, and the fourth read word line, wherein the signals are indicative of a ternary input value and in response to the signals at the first read word line, the second read word line, the third read word line, and the fourth read word line the SRAM cell is configured to provide a signal at the read bit line indicative of a ternary output value at the read bit line.
4. The SRAM system of claim 3, wherein writing the binary weight value to the SRAM cell comprises: providing a signal indicative of the binary weight value at the first write bit line; and providing an activation signal at the write word line.
5. The SRAM system of claim 4, further comprising analog-to-digital converter (ADC) circuitry coupled to the read bit line and configured to receive the signal representative of the ternary output value and provide a digital output signal representative of the ternary output value.
6. The SRAM system of claim 5, wherein the first switching element and the second switching element are transistors.
7. A static random-access memory (SRAM) system comprising a plurality of SRAM cells, each SRAM cell comprising: a write word line; a first write bit line and a second write bit line; a read bit line; a first read word line, a second read word line, a third read word line, and a fourth read word line; a first inverter comprising an input coupled to a first intermediate node, an output coupled to a second intermediate node, a first voltage input node coupled to a supply voltage, and a second voltage input node coupled to a fixed voltage; a second inverter comprising an input coupled to the second intermediate node, an output coupled to the first intermediate node, a first voltage input node coupled to the supply voltage, and a second voltage input node coupled to a fixed voltage; a third inverter comprising an input coupled to the first intermediate node, an output coupled to the read bit line, a first voltage input node coupled to the first read word line, and a second voltage input node coupled to the second read word line; a fourth inverter comprising an input coupled to the second intermediate node, an output coupled to the first intermediate node, a first voltage input node coupled to the third read word line, and a second voltage input node coupled to the fourth read word line; a first switching element comprising a control node coupled to the write word line, a first switching node coupled to the first write bit line, and a second switching node coupled to the first intermediate node; and a second switching element comprising a control node coupled to the write word line, a first switching node coupled to the second write bit line, and a second switching node coupled to the second intermediate node.
8. The SRAM system of claim 7, further comprising memory control circuitry coupled to each of the plurality of SRAM cells and configured to: write a binary weight value to each one of the plurality of SRAM cells; provide signals at the first read word line, the second read word line, the third read word line, and the fourth read word line, wherein the signals are indicative of a ternary input value and in response to the signals at the first read word line, the second read word line, the third read word line, and the fourth read word line each one of the plurality of SRAM cells is configured to provide a signal at the read bit line indicative of a ternary output value at the read bit line.
9. The SRAM system of claim 8, wherein writing the binary weight value to each one of the plurality of SRAM cells comprises: providing a signal indicative of the binary weight value at the first write bit line; and providing an activation signal at the write word line.
10. The SRAM system of claim 8, further comprising analog-to-digital converter (ADC) circuitry coupled to the read bit line and configured to receive the signal representative of the ternary output value from each one of the plurality of SRAM cells and provide a digital output signal representative of a bitwise count of the ternary output values of the plurality of SRAM cells.
11. The SRAM system of claim 7, wherein the first switching element and the second switching element are transistors.
Description
BRIEF DESCRIPTION OF THE DRAWING FIGURES
[0010] The accompanying drawing figures incorporated in and forming a part of this specification illustrate several aspects of the disclosure, and together with the description serve to explain the principles of the disclosure.
[0011]
[0012]
[0013]
[0014]
DETAILED DESCRIPTION
[0015] The embodiments set forth below represent the necessary information to enable those skilled in the art to practice the embodiments and illustrate the best mode of practicing the embodiments. Upon reading the following description in light of the accompanying drawing figures, those skilled in the art will understand the concepts of the disclosure and will recognize applications of these concepts not particularly addressed herein. It should be understood that these concepts and applications fall within the scope of the disclosure and the accompanying claims.
[0016] It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present disclosure. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
[0017] It will be understood that when an element such as a layer, region, or substrate is referred to as being “on” or extending “onto” another element, it can be directly on or extend directly onto the other element or intervening elements may also be present. In contrast, when an element is referred to as being “directly on” or extending “directly onto” another element, there are no intervening elements present. Likewise, it will be understood that when an element such as a layer, region, or substrate is referred to as being “over” or extending “over” another element, it can be directly over or extend directly over the other element or intervening elements may also be present. In contrast, when an element is referred to as being “directly over” or extending “directly over” another element, there are no intervening elements present. It will also be understood that when an element is referred to as being “connected” or “coupled” to another element, it can be directly connected or coupled to the other element or intervening elements may be present. In contrast, when an element is referred to as being “directly connected” or “directly coupled” to another element, there are no intervening elements present.
[0018] Relative terms such as “below” or “above” or “upper” or “lower” or “horizontal” or “vertical” may be used herein to describe a relationship of one element, layer, or region to another element, layer, or region as illustrated in the Figures. It will be understood that these terms and those discussed above are intended to encompass different orientations of the device in addition to the orientation depicted in the Figures.
[0019] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes,” and/or “including” when used herein specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
[0020] Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. It will be further understood that terms used herein should be interpreted as having a meaning that is consistent with their meaning in the context of this specification and the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
[0021]
[0022] Notably, while a select number of SRAM cells 12 are shown for illustration, those skilled in the art will appreciate that the SRAM system 10 may include any number of SRAM cells 12 without departing from the principles of the present disclosure. Further, while the memory control circuitry 16 is shown as a single block in
[0023] In a conventional memory architecture, each one of the SRAM cells 12 is configured to store a binary value (a single binary bit). These binary values may be written to and read from the SRAM cells 12 only in specific sub-groups (i.e., pages) thereof. Further, no operations are performed on the binary values written to and read from the SRAM cells 12. The foregoing limitations make conventional memory architectures highly inefficient for use with emerging deep neural network technologies such as binary neural networks or XNOR neural networks as discussed above.
[0024] To address these shortcomings of conventional SRAM systems,
[0025] In operation, the memory control circuitry 16 writes a binary weight value to the SRAM cell 12 by providing a signal at the first word line 18A sufficient to activate the first switching element 22A (i.e., cause the first switching element 22A to couple the first bit line 20A to the first intermediate node IN.sub.1) and providing a signal at the first bit line 20A representative of the binary weight value. As discussed herein, a binary high value is represented as the supply voltage V.sub.dd, while a binary low value is represented as ground. However, those skilled in the art will readily appreciate that binary values may be represented by a variety of different signals, all of which are contemplated herein. In some embodiments, the memory control circuitry 16 may write the binary weight value to the SRAM cell 12 by providing a signal at the first word line 18A sufficient to activate the first switching element 22A and the second switching element 22B (i.e., cause the first switching element 22A to couple the first bit line 20A to the first intermediate node IN.sub.1 and cause the second switching element 22B to couple the second bit line 20B to the second intermediate node IN.sub.2; this signal may be equal to the supply voltage V.sub.dd in some embodiments), providing a signal at the first bit line 20A representative of the binary weight value, and providing a signal at the second bit line 20B indicative of a complement of the binary weight value. Writing the binary weight value to the SRAM cell 12 in this manner may reduce write times.
[0026] When a signal below a threshold value of the inverters 24 is provided at the input IN thereof, the inverters 24 provide the voltage at the first voltage input node V.sub.in1 (in this case, the supply voltage V.sub.dd) at the output OUT thereof. When a signal above the threshold value of the inverters 24 is provided at the input IN thereof, the inverters 24 provide the voltage at the second voltage input node V.sub.in2 (in this case, ground) at the output thereof. Once the SRAM cell 12 is written to, the inverters 24 continue to invert the signal provided at the input IN thereof in a circular fashion, thereby storing the binary weight value as long as the supply voltage V.sub.dd continues to be provided.
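The inverter behavior and the circular storage described above can be illustrated with a small behavioral model. This is a minimal sketch, not the disclosed circuit: the helper names `inverter` and `settle`, the normalized voltages, and the threshold of half the supply voltage are assumptions made purely for illustration.

```python
VDD = 1.0            # normalized supply voltage (assumed value)
GND = 0.0
THRESHOLD = VDD / 2  # assumed inverter threshold


def inverter(v_in, v_in1=VDD, v_in2=GND):
    """Return V_in1 when the input is below threshold, otherwise V_in2."""
    return v_in1 if v_in < THRESHOLD else v_in2


def settle(in1, cycles=4):
    """Let two cross-coupled inverters re-drive each other starting from IN1.

    The first inverter drives IN2 from IN1 and the second drives IN1 from IN2.
    Returns the (IN1, IN2) pair after the given number of passes.
    """
    in2 = inverter(in1)
    for _ in range(cycles):
        in1 = inverter(in2)
        in2 = inverter(in1)
    return in1, in2


# A written high value and a written low value are both held by the loop.
held_high = settle(VDD)
held_low = settle(GND)
```

In this model the pair of nodes always ends up holding complementary values, which is the sense in which the loop keeps the written weight for as long as the supply is present.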
[0027] The memory control circuitry 16 may facilitate a read operation by providing a signal indicative of a binary input value at the first word line 18A and providing a signal indicative of a complement of the binary input value at the second word line 18B. A binary high value of the binary input value will activate (i.e., couple the first switching node SW.sub.1 to the second switching node SW.sub.2) the switching elements 22 coupled to the word line 18 on which it is provided, while a binary low value will cause the switching elements 22 coupled to the word line 18 on which it is provided to remain deactivated (i.e., the first switching node SW.sub.1 remains decoupled from the second switching node SW.sub.2). In response to the signal indicative of the binary input value at the first word line 18A and the signal indicative of the complement of the binary input value at the second word line 18B, the SRAM cell 12 provides a signal at the first bit line 20A indicative of a binary output value, where the binary output value is equal to an exclusive NOR of the binary input value and the binary weight value, and provides a signal at the second bit line 20B indicative of a complement of the binary output value. The ADC circuitry 14 may receive the signals at the first bit line 20A and the second bit line 20B.
[0028] As a first example, when the binary weight value is a binary low value, the first intermediate node IN.sub.1 is coupled to ground and the supply voltage V.sub.dd is provided at the second intermediate node IN.sub.2. When the binary input value is also a binary low value, the first word line 18A is coupled to ground and the supply voltage V.sub.dd is provided at the second word line 18B. This causes the first switching element 22A and the second switching element 22B to remain deactivated (i.e., the first switching node SW.sub.1 remains decoupled from the second switching node SW.sub.2) and causes the third switching element 22C and the fourth switching element 22D to activate (i.e., couple the first switching node SW.sub.1 to the second switching node SW.sub.2). Accordingly, the supply voltage at the second intermediate node IN.sub.2 is provided to the first bit line 20A and the second bit line 20B is coupled to ground. As discussed above, the signal provided at the first bit line 20A is indicative of a binary output value, which is equal to an exclusive NOR of the binary input value and the binary weight value. In this example, the binary input value is a binary low value and the binary weight value is a binary low value, resulting in a binary output value that is a binary high value, which is consistent with an exclusive NOR of the binary input value and the binary weight value.
[0029] As a second example, when the binary weight value is a binary low value, once again the first intermediate node IN.sub.1 is coupled to ground and the supply voltage V.sub.dd is provided at the second intermediate node IN.sub.2. When the binary input value is a binary high value, the supply voltage V.sub.dd is provided at the first word line 18A and the second word line 18B is coupled to ground. This causes the first switching element 22A and the second switching element 22B to activate (i.e., couple the first switching node SW.sub.1 to the second switching node SW.sub.2) and the third switching element 22C and the fourth switching element 22D to remain deactivated (i.e., the first switching node SW.sub.1 remains decoupled from the second switching node SW.sub.2). Accordingly, the first bit line 20A is coupled to ground and the supply voltage V.sub.dd is provided to the second bit line 20B. As discussed above, the signal provided at the first bit line 20A is indicative of a binary output value, which is equal to an exclusive NOR of the binary input value and the binary weight value. In this example, the binary input value is a binary high value and the binary weight value is a binary low value, resulting in a binary output value that is a binary low value, which is again consistent with an exclusive NOR of the binary input value and the binary weight value. Those skilled in the art will readily appreciate the operating result when the binary input value is a binary high value and the binary weight value is a binary high value (a binary high value) and when the binary input value is a binary low value and the binary weight value is a binary high value (a binary low value).
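The two examples above, together with the two remaining cases, can be tabulated with a short behavioral model. This is a sketch under stated assumptions: the function name `read_cell` is hypothetical, logic levels are modeled as 0 and 1, and the routing of the third and fourth switching elements (second intermediate node to the first bit line, first intermediate node to the second bit line) is inferred from the first example rather than taken from a circuit description.

```python
def read_cell(weight, x):
    """Behavioral model of one read of the binary-input SRAM cell.

    weight and x are 0 (binary low) or 1 (binary high).
    Returns (first_bit_line, second_bit_line) as 0/1 logic levels.
    """
    in1 = weight      # first intermediate node holds the weight
    in2 = 1 - weight  # second intermediate node holds its complement
    wl_a = x          # first word line carries the input value

    if wl_a:
        # First and second switching elements are active.
        bl_a, bl_b = in1, in2
    else:
        # Second word line is high, so the third and fourth switching
        # elements are active (assumed cross routing).
        bl_a, bl_b = in2, in1
    return bl_a, bl_b


for w in (0, 1):
    for x in (0, 1):
        bl_a, bl_b = read_cell(w, x)
        print(f"weight={w} input={x} -> BL_A={bl_a} BL_B={bl_b}")
```

In all four cases the first bit line carries the exclusive NOR of the input and the weight, and the second bit line carries its complement.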
[0030] The result of a read operation on the SRAM cell 12 can be thought of in two ways. If considering a binary high value as 1 and a binary low value as 0, the result of a read operation on the SRAM cell 12 can be thought of as an exclusive NOR between the binary input value and the binary weight value. If considering a binary high value as a +1 and a binary low value as a −1, the result of a read operation on the SRAM cell 12 can be thought of as a multiplication of the binary input value and the binary weight value. Those skilled in the art will readily appreciate that the emerging deep neural network technologies discussed above often represent binary values as +1 and −1 rather than 0 and 1 as a means to replace costly multiplication operations with more economical bitwise operations. The SRAM system 10 accordingly allows for the efficient implementation of these emerging deep neural network technologies, as it is capable of representing binary numbers in this way.
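The equivalence between the two views can be checked exhaustively, and it is what allows a sum of products to be replaced by a count of XNOR results. The following is a minimal sketch; the function names and the example bit vectors are made up for illustration and are not part of the disclosure.

```python
def xnor(a, b):
    """Exclusive NOR of two bits encoded as 0/1."""
    return 1 - (a ^ b)


def to_signed(bit):
    """Map a 0/1 bit to the -1/+1 encoding."""
    return 2 * bit - 1


def dot_signed(x_bits, w_bits):
    """Dot product using ordinary multiplication on -1/+1 values."""
    return sum(to_signed(x) * to_signed(w) for x, w in zip(x_bits, w_bits))


def dot_xnor(x_bits, w_bits):
    """Same dot product computed from a count of XNOR results."""
    ones = sum(xnor(x, w) for x, w in zip(x_bits, w_bits))
    return 2 * ones - len(x_bits)


x_bits = [1, 0, 1, 1, 0, 0, 1, 0]  # example inputs (made up)
w_bits = [1, 1, 0, 1, 0, 1, 1, 0]  # example weights (made up)

# Both routes give the same result for these vectors.
result_signed = dot_signed(x_bits, w_bits)
result_xnor = dot_xnor(x_bits, w_bits)
```

With N bits and k XNOR results equal to 1, the signed dot product is 2k − N, so only a count is needed in place of N multiplications.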
[0031] As shown in
[0032] By operating the SRAM cell 12 as discussed above, XNOR/multiplication operations may be performed between binary input values and binary weight values in a highly efficient manner. Further, the above read process may be performed simultaneously for all SRAM cells 12 located in a column in the SRAM system 10 shown in
[0033] In some situations, it may be beneficial to quantize the binary input values into ternary rather than binary values such that the exclusive NOR operation also results in a ternary output value. Accordingly,
[0034] As discussed above, each one of the inverters 36 is configured to receive a signal at the input IN thereof. If the signal provided at the input IN is below a threshold level of the inverter 36, the inverter 36 is configured to provide the voltage at the first voltage input node V.sub.in1 at the output OUT thereof. If the signal provided at the input IN is above a threshold level of the inverter 36, the inverter 36 is configured to provide the voltage at the second voltage input node V.sub.in2 at the output thereof.
[0035] In operation, the memory control circuitry 16 writes a binary weight value to the ternary input SRAM cell 26 by providing a signal at the write word line 28 sufficient to activate the first switching element 34A (i.e., cause the first switching element 34A to couple the first write bit line 30A to the first intermediate node IN.sub.1) and providing a signal to the first write bit line 30A representative of the binary weight value. As discussed herein, a binary high value is represented as the supply voltage V.sub.dd, while a binary low value is represented as ground. However, those skilled in the art will readily appreciate that binary values may be represented by a variety of different signals, all of which are contemplated herein. In some embodiments, the memory control circuitry 16 may write the binary weight value to the ternary input SRAM cell 26 by providing a signal at the write word line 28 sufficient to activate the first switching element 34A and the second switching element 34B (i.e., cause the first switching element 34A to couple the first write bit line 30A to the first intermediate node IN.sub.1 and cause the second switching element 34B to couple the second write bit line 30B to the second intermediate node IN.sub.2), providing a signal at the first write bit line 30A representative of the binary weight value, and providing a signal at the second write bit line 30B representative of a complement of the binary weight value. Writing the binary weight value to the ternary input SRAM cell 26 in this manner may reduce write times.
[0036] The memory control circuitry 16 may facilitate a read operation by providing signals indicative of a ternary input signal at the first read word line 38A, the second read word line 38B, the third read word line 38C, and the fourth read word line 38D. The ternary states may be represented as described below in Table 1:
TABLE 1
Read word line                 High       Low        Other
First read word line 38A       V.sub.dd   Ground     V.sub.dd
Second read word line 38B      Ground     V.sub.dd   Ground
Third read word line 38C       Ground     V.sub.dd   V.sub.dd
Fourth read word line 38D      V.sub.dd   Ground     Ground
[0037] In response to providing the signals at the first read word line 38A, the second read word line 38B, the third read word line 38C, and the fourth read word line 38D, the third inverter 36C and the fourth inverter 36D provide a signal at the read bit line 32 representative of a ternary output value, wherein the ternary output value is equal to the ternary input value multiplied by the binary weight value.
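The word line encoding of Table 1 and the multiplication it feeds can be written down as a small lookup. This is an illustrative sketch only: the names `READ_WORD_LINES`, `drive_read_word_lines`, and `multiply` are hypothetical, and the signed encoding (ternary high as +1, ternary low as −1, ternary other as 0, binary low as −1, binary high as +1) is assumed here for the arithmetic.

```python
VDD, GND = "Vdd", "Ground"

# Table 1: levels on read word lines (38A, 38B, 38C, 38D) per ternary state.
READ_WORD_LINES = {
    "high":  (VDD, GND, GND, VDD),
    "low":   (GND, VDD, VDD, GND),
    "other": (VDD, GND, VDD, GND),
}

# Assumed signed encoding of the ternary states.
TERNARY_VALUE = {"high": +1, "low": -1, "other": 0}
TERNARY_NAME = {value: name for name, value in TERNARY_VALUE.items()}


def drive_read_word_lines(state):
    """Return the four read word line levels for a ternary input state."""
    return READ_WORD_LINES[state]


def multiply(state, weight_bit):
    """Ternary input times binary weight (0 -> -1, 1 -> +1), as a state name."""
    weight = 2 * weight_bit - 1
    return TERNARY_NAME[TERNARY_VALUE[state] * weight]
```

For example, `multiply("high", 0)` evaluates to `"low"`: a ternary high input combined with a binary low weight gives a ternary low output, while a ternary other input gives a ternary other output for either weight.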
[0038] As a first example, when the ternary input value is a ternary high value, the supply voltage V.sub.dd is provided at the first read word line 38A and the fourth read word line 38D while the second read word line 38B and the third read word line 38C are coupled to ground. Further, when the binary weight value is a binary low value, the first intermediate node IN.sub.1 is coupled to ground while the supply voltage V.sub.dd is provided at the second intermediate node IN.sub.2. Accordingly, the third inverter 36C and the fourth inverter 36D provide the supply voltage V.sub.dd at the read bit line 32 as discussed in detail below.
[0039] To further illustrate details of this configuration,
[0040] In the first example discussed above, the PMOS transistor Q.sub.p of the third inverter 36C is strongly activated while the NMOS transistor Q.sub.n of the third inverter 36C remains deactivated. Further, the NMOS transistor Q.sub.n of the fourth inverter 36D is weakly activated while the PMOS transistor Q.sub.p of the fourth inverter 36D remains deactivated. Accordingly, there is a strong pull-up path to the supply voltage V.sub.dd through the third inverter 36C and a weak pull-up path to the supply voltage V.sub.dd through the fourth inverter 36D at the read bit line 32. This state reflects a ternary low value.
[0041] As a second example, when the ternary input value is a ternary low value, the supply voltage V.sub.dd is provided at the second read word line 38B and the third read word line 38C while the first read word line 38A and the fourth read word line 38D are coupled to ground. Further, when the binary weight value is a binary low value, the first intermediate node IN.sub.1 is coupled to ground while the supply voltage V.sub.dd is provided at the second intermediate node IN.sub.2. Accordingly, the third inverter 36C and the fourth inverter 36D couple the read bit line 32 to ground as discussed in detail below.
[0042] Referring once again to
[0043] As a third example, when the ternary input value is a ternary other value, the supply voltage V.sub.dd is provided at the first read word line 38A and the third read word line 38C while the second read word line 38B and the fourth read word line 38D are coupled to ground. Further, when the binary weight value is a binary low value, the first intermediate node IN.sub.1 is coupled to ground while the supply voltage V.sub.dd is provided at the second intermediate node IN.sub.2. Accordingly, the third inverter 36C provides a portion of the supply voltage V.sub.dd to the read bit line 32 while the fourth inverter 36D partially couples the read bit line 32 to ground as discussed below.
[0044] Referring once again to
[0045] Those skilled in the art will readily appreciate the operating result when the ternary input value is a ternary high value and the binary weight value is a binary high value (a ternary high value), when the ternary input value is a ternary low value and the binary weight value is a binary high value (a ternary low value), and when the ternary input value is a ternary other value and the binary weight value is a binary high value (a ternary other value).
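The six input and weight combinations can be enumerated with a behavioral sketch of which transistor conducts in each read inverter and which word line it connects to the read bit line. Several things here are assumptions rather than statements of the disclosure: the function names are hypothetical, a path to the supply voltage is labeled a pull-up and a path to ground a pull-down, and the strength labels rely on the usual CMOS behavior in which a PMOS transistor passes the supply voltage well and ground poorly while an NMOS transistor does the reverse.

```python
VDD, GND = 1, 0

# Read word line levels (38A, 38B, 38C, 38D) per ternary input, per Table 1.
READ_WORD_LINES = {
    "high":  (VDD, GND, GND, VDD),
    "low":   (GND, VDD, VDD, GND),
    "other": (VDD, GND, VDD, GND),
}


def inverter_path(v_in, rail_p, rail_n):
    """Describe the path one read inverter presents to the read bit line.

    A low input turns on the PMOS device (connects to rail_p); a high input
    turns on the NMOS device (connects to rail_n). Assumed device behavior:
    PMOS passes Vdd strongly and ground weakly, NMOS the reverse.
    """
    if v_in == GND:
        rail = rail_p
        strength = "strong" if rail == VDD else "weak"
    else:
        rail = rail_n
        strength = "strong" if rail == GND else "weak"
    direction = "pull-up" if rail == VDD else "pull-down"
    return f"{strength} {direction}"


def read_paths(ternary_input, weight_bit):
    """Paths at the read bit line from the third and fourth inverters."""
    rwl_a, rwl_b, rwl_c, rwl_d = READ_WORD_LINES[ternary_input]
    in1 = weight_bit      # first intermediate node
    in2 = 1 - weight_bit  # second intermediate node
    third = inverter_path(in1, rwl_a, rwl_b)
    fourth = inverter_path(in2, rwl_c, rwl_d)
    return third, fourth


for state in ("high", "low", "other"):
    for weight_bit in (0, 1):
        print(state, weight_bit, read_paths(state, weight_bit))
```

Under these assumptions, a ternary high or low input always yields one strong and one weak path toward the same rail, while a ternary other input yields one strong pull-up and one strong pull-down that oppose each other.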
[0046] The result of a read operation of the ternary input SRAM cell 26 can be thought of in two ways. If considering a binary high value as 1, a binary low value as 0, a ternary high value as 1, a ternary low value as 0, and a ternary other value as 2, the result of a read operation on the ternary input SRAM cell 26 can be thought of as an exclusive NOR between the ternary input value and the binary weight value. If considering a binary high value as a +1, a binary low value as a −1, a ternary high value as a +1, a ternary low value as a −1, and a ternary other value as a 0, the result of a read operation on the ternary input SRAM cell 26 can be thought of as a multiplication of the ternary input value and the binary weight value. Those skilled in the art will readily appreciate that the emerging deep neural network technologies discussed above often represent binary numbers as +1 and −1 rather than as 0 and 1 as a means to replace costly multiplication operations with more economical bitwise operations. The SRAM system 10 accordingly allows for the efficient implementation of these emerging deep neural network technologies, as it is capable of representing binary numbers in this way.
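Under the multiplication view, summing the products of several cells, as a neural network dot product would, reduces to counting how many products are +1 and how many are −1, since ternary other inputs contribute nothing. The sketch below illustrates that arithmetic only; the function names and example vectors are made up, and nothing here models the analog behavior of the read bit line.

```python
def cell_products(inputs, weight_bits):
    """Per-cell products of ternary inputs (-1, 0, +1) and binary weights.

    Weight bits are 0/1 and are mapped to -1/+1 before multiplying.
    """
    return [x * (2 * w - 1) for x, w in zip(inputs, weight_bits)]


def ternary_dot(inputs, weight_bits):
    """Sum of the per-cell products."""
    return sum(cell_products(inputs, weight_bits))


def count_products(inputs, weight_bits):
    """Count how many per-cell products are +1, 0, and -1."""
    products = cell_products(inputs, weight_bits)
    return products.count(1), products.count(0), products.count(-1)


inputs = [+1, 0, -1, +1, 0, -1]   # example ternary inputs (made up)
weight_bits = [1, 1, 0, 1, 0, 1]  # example stored weights (made up)

plus, zero, minus = count_products(inputs, weight_bits)
total = ternary_dot(inputs, weight_bits)
```

For these example vectors the sum equals the number of +1 products minus the number of −1 products.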
[0047] As shown in
[0048] By operating the ternary input SRAM cell 26 as discussed above, XNOR/multiplication operations may be performed between ternary input values and binary weight values in a highly efficient manner. Further, the above read process may be performed simultaneously for all ternary input SRAM cells 26 located in a column in the SRAM system 10 shown in
[0049] In the ideal scenario, a ternary other value is represented as being exactly half-way between a ternary high value and a ternary low value. Referring to
[0050] Referring back to
[0051] Those skilled in the art will recognize improvements and modifications to the preferred embodiments of the present disclosure. All such improvements and modifications are considered within the scope of the concepts disclosed herein and the claims that follow.