Method for the Analogue Multiplication and/or Calculation of a Scalar Product with a Circuit Assembly, in Particular for Artificial Neural Networks
20230185530 · 2023-06-15
Abstract
The present invention relates to a method for the analogue multiplication and/or calculation of a scalar product, with a circuit assembly, which has a series circuit comprising a first FET and a second FET, or FET array, serving as a current source, a charging device, and a capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuit of the first FET and the second FET, or FET array. The capacitance is initially precharged for the multiplication of a first value by a second value. The first value, encoded as the pulse width of a voltage pulse, is applied to the gate of the first FET, and the second value, encoded as the voltage amplitude, is applied to the gate of the second FET. By this means the capacitance is discharged, for the period of time of the voltage pulse, with a discharge current, which is specified by the voltage amplitude applied to the second FET. The result of the multiplication can then be determined from the residual charge or residual voltage of the capacitance. The method operates very energy-efficiently and can advantageously be used for the execution of calculations in neurons of an artificial neural network.
Claims
1. Method for the analogue multiplication with a circuit assembly, which has a series circuit comprising a first FET and a second FET, or FET array comprising a plurality of parallel-connected second FETs, serving as a current source, a charging device, and at least one capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuit comprising the first FET and the second FET, or FET array, in which the capacitance is precharged for the execution of a multiplication of a first value by a second value, the first value, encoded as a pulse width of a voltage pulse, is applied to the gate of the first FET, and the second value, encoded as a voltage amplitude, is applied to the gate of the second FET, or, encoded as binary voltage amplitudes, is applied to the gates of the parallel-connected second FETs, so that the capacitance is discharged for a period of time, which is specified by the pulse width of the voltage pulse applied to the gate of the first FET, with a discharge current, which is specified by the voltage amplitude(s) applied to the gate of the second FET, or to the gates of the parallel-connected second FETs, and a result of the multiplication can be determined from a residual charge or voltage of the capacitance, or from a voltage difference or charge difference between the latter and a further capacitance.
2. Method for the analogue calculation of a scalar product, which is formed by the multiplication of a first value by a second value of a respective value pair, and the summation of results of the multiplications for a plurality of value pairs, with a circuit assembly, which has a plurality of parallel-connected series circuits comprising a first FET and a second FET, or FET array comprising a plurality of parallel-connected second FETs, serving as a current source, a charging device, and at least one capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuits comprising the first FET and the second FET, or FET array, wherein each of the value pairs is associated with one of the series circuits, the capacitance is precharged for the calculation of the scalar product for each of the value pairs, the first value, encoded as a pulse width of a voltage pulse, is applied to the gate of the first FET of the associated series circuit, and the second value, encoded as a voltage amplitude, is applied to the gate of the second FET, or, encoded as binary voltage amplitudes, to the gates of the parallel-connected second FETs of the associated series circuit, such that in each case the capacitance is at least partially discharged for a period of time, which is specified by the pulse width of the voltage pulse applied to the gate of the first FET of the respective series circuit, with a discharge current, which is specified by the voltage amplitude(s) applied to the gate of the second FET, or to the gates of the parallel-connected second FETs of the respective series circuit, and a result of the calculation of the scalar product can be determined from a residual charge or voltage of the capacitance, or from a voltage or charge difference between the latter and a further capacitance.
3. Method according to claim 2 in an artificial neural network, in which the circuit assembly represents an artificial neuron, and each value pair is respectively formed by a weight factor and an input value of the artificial neuron.
4. Method according to claim 3, characterised in that the weight factor is selected as the first value of each value pair, and the input value is selected as the second value.
5. Method according to claim 3, characterised in that the input value is selected as the first value of each value pair, and the weight factor is selected as the second value.
6. Method according to claim 4, characterised in that the weight factor is provided as a binary digit sequence, wherein each digit of the digit sequence controls the pulse width at the gate of the first FET by way of a digital-time converter.
7. Method according to claim 5, characterised in that the weight factor is provided as a binary digit sequence, wherein each digit of the digit sequence, encoded as a voltage amplitude, controls a second FET of the parallel-connected second FETs.
8. Method according to claim 3, characterised in that the parallel-connected series circuits, comprising a first FET and a second FET, or an FET array comprising a plurality of parallel-connected second FETs, serving as a current source, are used in a matrix-like manner at crossing points between horizontal connections for an input vector, and vertical connections for an output vector, in a layer of the artificial neural network, so as to execute calculations of a layer of the artificial neural network.
9. Method according to claim 2, characterised in that the circuit assembly for processing signed first values in each of the series circuits comprises two parallel circuit branches, which are serially connected to the second FET, or FET array, and in each case comprise a first FET, wherein a first of the two circuit branches is connected to the capacitance, and a second of the two circuit branches is connected to a second capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuit comprising the first FET of the second circuit branch and the second FET, or FET array, wherein the respective first value, encoded as the pulse width of a voltage pulse, is applied, depending on its sign, either to the gate of the first FET of the first circuit branch, or to the gate of the first FET of the second circuit branch, and a result of the multiplication or calculation of the scalar product can be determined from a voltage difference or charge difference between the two capacitors.
10. Neural network with one or more layers of artificial neurons, in which the neurons of at least one of the layers in each case comprise a circuit assembly comprising: a plurality of parallel-connected series circuits comprising a first FET and a second FET, serving as a current source, a charging device, and a capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuits comprising the first FET and the second FET, wherein components of weight vectors, encoded as pulse widths of a voltage pulse, are applied to gates of the first FETs, and components of input vectors, encoded as voltage amplitudes, are applied to gates of the second FETs.
11. Neural network with one or more layers of artificial neurons, wherein the neurons of at least one of the layers in each case have a circuit array, which comprises: a plurality of parallel-connected series circuits comprising a first FET, and a second FET, or an FET array comprising a plurality of parallel-connected second FETs, serving as a current source, a charging device, and a capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuits of the first FET and the second FET, or FET array, wherein components of input vectors, encoded as pulse widths of a voltage pulse, are applied to gates of the first FETs, and components of weight vectors, encoded as voltage amplitudes, are applied to gates of the second FETs, or, encoded as binary voltage amplitudes, are applied to the gates of the parallel-connected second FETs of the series circuits.
12. Neural network according to claim 10, characterised in that transfer circuits are designed between the circuit assemblies of successive layers of the neural network, for the transfer of a charge deficit of the capacitance of the respective circuit assembly of the preceding layer to gates of the second FETs of the circuit assemblies of the following layer.
13. Neural network according to claim 10, characterised in that a circuit, for the conversion of digital values into pulse widths of a voltage pulse, is arranged upstream of each circuit assembly.
14. Neural network according to claim 10, characterised in that the circuit assembly for the processing of signed components of the weight vectors in each of the series circuits has two parallel circuit branches, which are connected to the second FET, or FET array, and in each case have a first FET, wherein a first of the two circuit branches is connected to the capacitance, and a second of the two circuit branches is connected to a second capacitance, which can be precharged by way of the charging means, and can be discharged by way of the series connection of the first FET of the second circuit branch and the second FET, or FET array, wherein the respective component, encoded as the pulse width of a voltage pulse, is applied, depending on its sign, by the control device either to the gate of the first FET of the first circuit branch, or to the gate of the first FET of the second circuit branch.
15. Method according to claim 1, characterised in that the circuit assembly for processing signed first values in each of the series circuits comprises two parallel circuit branches, which are serially connected to the second FET, or FET array, and in each case comprise a first FET, wherein a first of the two circuit branches is connected to the capacitance, and a second of the two circuit branches is connected to a second capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuit comprising the first FET of the second circuit branch and the second FET, or FET array, wherein the respective first value, encoded as the pulse width of a voltage pulse, is applied, depending on its sign, either to the gate of the first FET of the first circuit branch, or to the gate of the first FET of the second circuit branch, and a result of the multiplication or calculation of the scalar product can be determined from a voltage difference or charge difference between the two capacitors.
16. Neural network according to claim 11, characterised in that transfer circuits are designed between the circuit assemblies of successive layers of the neural network, for the transfer of a charge deficit of the capacitance of the respective circuit assembly of the preceding layer to gates of the second FETs of the circuit assemblies of the following layer.
17. Neural network according to claim 11, characterised in that a circuit, for the conversion of digital values into pulse widths of a voltage pulse, is arranged upstream of each circuit assembly.
18. Neural network according to claim 11, characterised in that the circuit assembly for the processing of signed components of the weight vectors in each of the series circuits has two parallel circuit branches, which are connected to the second FET, or FET array, and in each case have a first FET, wherein a first of the two circuit branches is connected to the capacitance, and a second of the two circuit branches is connected to a second capacitance, which can be precharged by way of the charging means, and can be discharged by way of the series connection of the first FET of the second circuit branch and the second FET, or FET array, wherein the respective component, encoded as the pulse width of a voltage pulse, is applied, depending on its sign, by the control device either to the gate of the first FET of the first circuit branch, or to the gate of the first FET of the second circuit branch.
Description
BRIEF DESCRIPTION OF THE FIGURES
[0019] The proposed methods, in conjunction with an artificial neural network, are explained in more detail below by means of examples of embodiment, with reference to the figures.
PATHS TO THE EXECUTION OF THE INVENTION
[0028] In the following examples, the proposed method, with the associated circuit assembly, is used to calculate scalar products in an artificial neural network.
[0029] With the proposed method, the calculation of the scalar product that takes place in a neuron is executed in an energy-efficient manner.
[0030] In the preferred configuration, the multiplication result is evaluated as follows. The lower MOSFET N.sub.x operates as a current source transistor, which is controlled by its analogue gate-source voltage u.sub.GS,Nx=u.sub.x, which is provided by way of an input value x, the output of the previous neuron layer. The voltage u.sub.x controls the drain current i.sub.x by way of the nonlinear transfer function i.sub.x(u.sub.x) in accordance with the current equation of the MOSFET. This nonlinearity is a part of the nonlinear transfer function φ of the preceding neuron layer. Since the n-channel MOSFET in the enhancement mode has a threshold voltage greater than 0, a soft rectifier-like transfer function is implemented.
[0031] The drain current i.sub.x is then drawn from the upper pole of the capacitor C only if the stacked MOSFET N.sub.w is also conducting. By setting its gate voltage to U.sub.DD for a period of time T.sub.W corresponding to the weight factor w, the upper MOSFET N.sub.W is switched on. The charge Q.sub.XW drawn from the output node and the corresponding output voltage U.sub.C are given by Q.sub.XW=i.sub.x·T.sub.W and U.sub.C=U.sub.C(0)−Q.sub.XW/C, where U.sub.C(0) is the precharge voltage of the capacitor C.
[0032] The result of the multiplication thus corresponds to the amount of charge Q.sub.XW that flows through the series circuit of these two MOSFETs. The temporal relationships of the voltages and currents in this circuit assembly are shown in the left-hand part of
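The charge-domain multiplication described in the paragraphs above can be sketched as a behavioral model. This is an illustrative simulation only, not the patented circuit: the function name, the ideal constant-current assumption during the pulse, and the default parameter values (0.6 fF, 0.8 V, taken from the energy discussion below) are choices made for the sketch.

```python
def ams_multiply(i_x, t_w, c=0.6e-15, v_pre=0.8):
    """Behavioral model of one analogue multiplier cell.

    i_x   : drain current set by the input amplitude u_x (A)
    t_w   : pulse width T_W encoding the weight factor (s)
    c     : capacitance being discharged (F)
    v_pre : precharge voltage of the capacitor (V)
    Returns the residual capacitor voltage after the pulse.
    """
    q_xw = i_x * t_w          # charge drawn: Q_XW = i_x * T_W
    v_c = v_pre - q_xw / c    # residual voltage after discharge
    return max(v_c, 0.0)      # a real capacitor cannot discharge below 0 V

# Example: a 100 nA input current gated for 1 ns removes 0.1 fC,
# i.e. about 0.167 V from a 0.6 fF capacitor precharged to 0.8 V.
```

The product i_x·T_W thus appears as a voltage deficit on the capacitor, which is the quantity read out in the circuit assembly.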
[0034] The artificial neuron function, i.e. a scalar product followed by a non-linear transfer function, is mapped according to simple electrical network principles (i.e. Kirchhoff's Laws) in conjunction with established FET device physics (I.sub.DS=f(U.sub.GS, U.sub.DS)). A neuron output activation is implemented along a single line with a series of multipliers.
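The neuron function named above, a scalar product followed by a nonlinear transfer function, can be written out as a short sketch. The softplus-style rectifier below is only a stand-in for the soft rectifier-like behaviour caused by the FET threshold voltage; the threshold value and the 25 mV scale factor are illustrative assumptions.

```python
import math

def soft_rectifier(u, v_th=0.3, v_t=0.025):
    """Illustrative soft rectifier, loosely mimicking an enhancement-mode
    n-FET with threshold v_th: near zero below threshold, ~linear above."""
    return v_t * math.log1p(math.exp((u - v_th) / v_t))

def neuron(weights, inputs):
    """Artificial neuron: scalar product, then nonlinear transfer."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return soft_rectifier(s)
```

Below the threshold the output is effectively zero; well above it the transfer is approximately linear, matching the soft rectifier behaviour described for the preceding neuron layer.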
[0035] Analogue multiplication is implemented by the use of only two small MOSFETs. The total capacitance to be charged or discharged during the multiplication process can be limited to values of only 0.6 fF for 300 nm wide MOSFETs N.sub.x and N.sub.w in 22 nm CMOS. This results in an energy consumption of the multiplication of 0.5 fJ at a supply voltage of 0.8 V. In contrast, the estimated operating energy of an 8-bit×8-bit array multiplier in 28 nm CMOS technology is 8×30 fJ=240 fJ (based on 30 fJ for a single 8-bit adder), resulting in an approximately 500-fold increase in energy efficiency for the proposed AMS.
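The quoted energy figures can be verified with back-of-envelope arithmetic. The variable names below are illustrative; the numbers are taken directly from the paragraph above.

```python
# Back-of-envelope check of the quoted energy figures.
c_total = 0.6e-15        # total switched capacitance (F)
v_dd = 0.8               # supply voltage (V)
e_analog = 0.5e-15       # quoted energy per analogue multiplication (J)
e_digital = 8 * 30e-15   # 8-bit x 8-bit multiplier as eight 8-bit adds (J)

gain = e_digital / e_analog   # 240 fJ / 0.5 fJ = 480, i.e. ~500-fold
```

Note that C·V_DD² = 0.6 fF × (0.8 V)² ≈ 0.38 fJ, so the quoted 0.5 fJ is consistent in order of magnitude with a full charge/discharge cycle of the stated capacitance.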
[0036] In the above preferred configuration, the neuron input weight factors w.sub.i are represented by the temporal width T.sub.wi of current pulses, wherein the current amplitude I.sub.xi represents the input activations, that is to say, the input values x.sub.i (cf.
[0038] In an alternative form of embodiment, the weight and activation inputs, and thus the roles of the lower and upper MOSFETs in the multiplier evaluation path(s) of
[0039] The advantage of the alternative form of embodiment of
[0040] The AMS multiplier circuits according to
[0041] In artificial neural networks, the activation value range is often limited to positive values. However, the weights can be positive or negative.
[0044] To implement both signed weights and signed input activations, that is to say, input values, the circuit topologies of
[0045] For the alternative configuration (
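The differential handling of signed weights described above (two circuit branches, two precharged capacitors, pulse routed by sign) can be modelled behaviorally as follows. The function name, sign convention, and parameter defaults are illustrative assumptions of this sketch.

```python
def signed_multiply(i_x, t_w, sign, c=0.6e-15, v_pre=0.8):
    """Differential multiplier cell for signed weights.

    The weight pulse of width t_w discharges the first capacitor for a
    positive sign and the second capacitor for a negative sign; the
    result is read as the voltage difference between the two.
    """
    q = i_x * t_w                # charge drawn during the pulse
    v_first, v_second = v_pre, v_pre
    if sign >= 0:
        v_first -= q / c         # pulse applied to the first branch
    else:
        v_second -= q / c        # pulse applied to the second branch
    return v_second - v_first    # proportional to sign * i_x * t_w
```

Because both capacitors start from the same precharge voltage, common-mode effects cancel and the sign of the result falls out of the difference directly.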
[0046] A single neural layer can be implemented by a matrix-like arrangement of a plurality of AMS multiplication cells, or an arrangement of a plurality of scalar product cells next to each other, as is exemplified in
[0047] The connection of the AMS multiplication cell to a horizontal and a vertical line, and the connection to a local weight memory (+digital-time converter (DTC)) is shown in
[0048] The matrix arrangement of AMS multiplication cells shown in
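The matrix arrangement of multiplication cells can be sketched as a behavioral layer evaluation: each vertical line (column) shares one capacitance, and every cell on that line draws its own charge from it, so the summation of the scalar product happens for free on the shared node. The shared-capacitance value and the clipping at 0 V are illustrative assumptions.

```python
def layer_forward(t_w, x, c=10e-15, v_pre=0.8):
    """Behavioral model of one matrix of AMS multiplication cells.

    t_w[i][j] : pulse width of cell (i, j), from weight memory + DTC (s)
    x[i]      : input current amplitude on horizontal line i (A)
    Each column j shares one capacitance c, precharged to v_pre;
    the scalar product appears as the total charge drawn per column.
    """
    n_cols = len(t_w[0])
    y = []
    for j in range(n_cols):
        q = sum(x[i] * t_w[i][j] for i in range(len(x)))  # Kirchhoff sum
        y.append(max(v_pre - q / c, 0.0))                  # residual voltage
    return y
```

Each output y[j] is the residual voltage of column j, i.e. an affine function of the scalar product of the input vector with column j of the weight matrix.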
[0049] A very efficient method for transferring the analogue amplitude domain signals from the outputs back to the inputs is charge transfer. An example of a corresponding circuit for the charge transfer (transfer of a charge deficit) is shown in
[0050] Alternatively, the charge transfer can also take place by means of analogue voltage signal transfer through linear analogue buffer amplifiers, i.e. based on operational amplifiers with resistors and/or switched capacitors. Digital signal transfer by the interposition of A/D and D/A converters, preferably implemented in terms of energy-efficient SC-based conversion principles such as SAR, and supplemented by means for the processing of large neural layers and the implementation of artificial transfer functions, is also possible. This can be done, for example, by way of digital memories and blocks for digital signal processing.
[0051] In the alternative configuration of the proposed method, the output signals y.sub.i are signals in the charge (Q) or voltage (Q/C) amplitude domain, while the input signals x.sub.i are signals in the pulse width domain. The signal transfer from the matrix outputs y.sub.i to the matrix inputs x.sub.i therefore requires a charge-to-pulse width converter, as described in one of the preceding sections.
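A linear charge-to-pulse-width conversion of the kind required here can be modelled very simply: a constant reference current would need a time t = q / i_ref to move the charge deficit q, so t is proportional to q. The function, the reference current value, and the clamping range are hypothetical choices of this sketch, not taken from the source.

```python
def charge_to_pulse_width(q_deficit, i_ref=100e-9, t_max=10e-9):
    """Behavioral charge-to-pulse-width converter: the output pulse width
    is proportional to the charge deficit q_deficit, clamped to [0, t_max].
    i_ref is an assumed constant reference current (A)."""
    t = q_deficit / i_ref
    return min(max(t, 0.0), t_max)
```

Feeding such a pulse width back into the first FETs of the next matrix closes the loop between the charge/voltage amplitude domain at the outputs and the pulse width domain at the inputs.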
[0053] A stack of functional blocks, which are required to preload and write to the analogue horizontal lines, and to read the analogue vertical lines of the multiplication and addition matrix, is located on the west and south sides of the matrix respectively (blocks: preload and bias injection).
[0054] Neural network layers, which have more neurons than the matrix row and column numbers n and m respectively, can be supported by analogue charge transfer memory units at the southern output and/or western input edge with additional means for analogue charge addition, as represented by the blocks “transfer gate bank” and “capacitor bank” in
[0055] Energy-efficient charge transfer from the neuron layer activation outputs on the south side to the inputs of the next neuron layer on the west side can be implemented by maintaining the analogue charge domain using the analogue charge transfer circuits described above. In addition, power-efficient SC-based A/D converters can be connected to the south activation output edge, and D/A converters can be connected to the west activation input edge to enable a hybrid evaluation of the neural network, i.e. parts requiring low precision in the analogue path, and parts requiring high precision in an additional digital path. This additional digital path can also be used for the application of more specialised activation transfer functions.