Method for the Analogue Multiplication and/or Calculation of a Scalar Product with a Circuit Assembly, in Particular for Artificial Neural Networks

20230185530 · 2023-06-15

    Abstract

    The present invention relates to a method for the analogue multiplication and/or calculation of a scalar product, with a circuit assembly, which has a series circuit comprising a first FET and a second FET, or FET array, serving as a current source, a charging device, and a capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuit of the first FET and the second FET, or FET array. The capacitance is initially precharged for the multiplication of a first value by a second value. The first value, encoded as the pulse width of a voltage pulse, is applied to the gate of the first FET, and the second value, encoded as the voltage amplitude, is applied to the gate of the second FET. By this means the capacitance is discharged, for the period of time of the voltage pulse, with a discharge current, which is specified by the voltage amplitude applied to the second FET. The result of the multiplication can then be determined from the residual charge or residual voltage of the capacitance. The method operates very energy-efficiently and can advantageously be used for the execution of calculations in neurons of an artificial neural network.

    Claims

    1. Method for the analogue multiplication with a circuit assembly, which has a series circuit comprising a first FET and a second FET, or FET array comprising a plurality of parallel-connected second FETs, serving as a current source, a charging device, and at least one capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuit comprising the first FET and the second FET, or FET array, in which the capacitance is precharged for the execution of a multiplication of a first value by a second value, the first value, encoded as a pulse width of a voltage pulse, is applied to the gate of the first FET, and the second value, encoded as a voltage amplitude, is applied to the gate of the second FET, or, encoded as binary voltage amplitudes, is applied to the gates of the parallel-connected second FETs, so that the capacitance is discharged for a period of time, which is specified by the pulse width of the voltage pulse applied to the gate of the first FET, with a discharge current, which is specified by the voltage amplitude(s) applied to the gate of the second FET, or to the gates of the parallel-connected second FETs, and a result of the multiplication can be determined from a residual charge or voltage of the capacitance, or from a voltage difference or charge difference between the latter and a further capacitance.

    2. Method for the analogue calculation of a scalar product, which is formed by the multiplication of a first value by a second value of a respective value pair, and the summation of results of the multiplications for a plurality of value pairs, with a circuit assembly, which has a plurality of parallel-connected series circuits comprising a first FET and a second FET, or FET array comprising a plurality of parallel-connected second FETs, serving as a current source, a charging device, and at least one capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuits comprising the first FET and the second FET, or FET array, wherein each of the value pairs is associated with one of the series circuits, the capacitance is precharged for the calculation of the scalar product for each of the value pairs, the first value, encoded as a pulse width of a voltage pulse, is applied to the gate of the first FET of the associated series circuit, and the second value, encoded as a voltage amplitude, is applied to the gate of the second FET, or, encoded as binary voltage amplitudes, to the gates of the parallel-connected second FETs of the associated series circuit, such that in each case the capacitance is at least partially discharged for a period of time, which is specified by the pulse width of the voltage pulse applied to the gate of the first FET of the respective series circuit, with a discharge current, which is specified by the voltage amplitude(s) applied to the gate of the second FET, or to the gates of the parallel-connected second FETs of the respective series circuit, and a result of the calculation of the scalar product can be determined from a residual charge or voltage of the capacitance, or from a voltage or charge difference between the latter and a further capacitance.

    3. Method according to claim 2 in an artificial neural network, in which the circuit assembly represents an artificial neuron, and each value pair is respectively formed by a weight factor and an input value of the artificial neuron.

    4. Method according to claim 3, characterised in that the weight factor is selected as the first value of each value pair, and the input value is selected as the second value.

    5. Method according to claim 3, characterised in that the input value is selected as the first value of each value pair, and the weight factor is selected as the second value.

    6. Method according to claim 4, characterised in that the weight factor is provided as a binary digit sequence, wherein each digit of the digit sequence controls the pulse width at the gate of the first FET by way of a digital-time converter.

    7. Method according to claim 5, characterised in that the weight factor is provided as a binary digit sequence, wherein each digit of the digit sequence, encoded as a voltage amplitude, controls a second FET of the parallel-connected second FETs.

    8. Method according to claim 3, characterised in that the parallel-connected series circuits, comprising a first FET and a second FET, or an FET array comprising a plurality of parallel-connected second FETs, serving as a current source, are used in a matrix-like manner at crossing points between horizontal connections for an input vector, and vertical connections for an output vector, in a layer of the artificial neural network, so as to execute calculations of a layer of the artificial neural network.

    9. Method according to claim 2, characterised in that the circuit assembly for processing signed first values in each of the series circuits comprises two parallel circuit branches, which are serially connected to the second FET, or FET array, and in each case comprise a first FET, wherein a first of the two circuit branches is connected to the capacitance, and a second of the two circuit branches is connected to a second capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuit comprising the first FET of the second circuit branch and the second FET, or FET array, wherein the respective first value, encoded as the pulse width of a voltage pulse, is applied, depending on its sign, either to the gate of the first FET of the first circuit branch, or to the gate of the first FET of the second circuit branch, and a result of the multiplication or calculation of the scalar product can be determined from a voltage difference or charge difference between the two capacitors.

    10. Neural network with one or more layers of artificial neurons, in which the neurons of at least one of the layers in each case comprise a circuit assembly comprising: a plurality of parallel-connected series circuits comprising a first FET and a second FET, serving as a current source, a charging device, and a capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuits comprising the first FET and the second FET, wherein components of weight vectors, encoded as pulse widths of a voltage pulse, are applied to gates of the first FETs, and components of input vectors, encoded as voltage amplitudes, are applied to gates of the second FETs.

    11. Neural network with one or more layers of artificial neurons, wherein the neurons of at least one of the layers in each case have a circuit array, which comprises: a plurality of parallel-connected series circuits comprising a first FET, and a second FET, or an FET array comprising a plurality of parallel-connected second FETs, serving as a current source, a charging device, and a capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuits of the first FET and the second FET, or FET array, wherein components of input vectors, encoded as pulse widths of a voltage pulse, are applied to gates of the first FETs, and components of weight vectors, encoded as voltage amplitudes, are applied to gates of the second FETs, or, encoded as binary voltage amplitudes, are applied to the gates of the parallel-connected second FETs of the series circuits.

    12. Neural network according to claim 10, characterised in that transfer circuits are designed between the circuit assemblies of successive layers of the neural network, for the transfer of a charge deficit of the capacitance of the respective circuit assembly of the preceding layer to gates of the second FETs of the circuit assemblies of the following layer.

    13. Neural network according to claim 10, characterised in that a circuit, for the conversion of digital values into pulse widths of a voltage pulse, is arranged upstream of each circuit assembly.

    14. Neural network according to claim 10, characterised in that the circuit assembly for the processing of signed components of the weight vectors in each of the series circuits has two parallel circuit branches, which are connected to the second FET, or FET array, and in each case have a first FET, wherein a first of the two circuit branches is connected to the capacitance, and a second of the two circuit branches is connected to a second capacitance, which can be precharged by way of the charging means, and can be discharged by way of the series connection of the first FET of the second circuit branch and the second FET, or FET array, wherein the respective component, encoded as the pulse width of a voltage pulse, is applied, depending on its sign, by the control device either to the gate of the first FET of the first circuit branch, or to the gate of the first FET of the second circuit branch.

    15. Method according to claim 1, characterised in that the circuit assembly for processing signed first values in each of the series circuits comprises two parallel circuit branches, which are serially connected to the second FET, or FET array, and in each case comprise a first FET, wherein a first of the two circuit branches is connected to the capacitance, and a second of the two circuit branches is connected to a second capacitance, which can be precharged by way of the charging device, and can be discharged by way of the series circuit comprising the first FET of the second circuit branch and the second FET, or FET array, wherein the respective first value, encoded as the pulse width of a voltage pulse, is applied, depending on its sign, either to the gate of the first FET of the first circuit branch, or to the gate of the first FET of the second circuit branch, and a result of the multiplication or calculation of the scalar product can be determined from a voltage difference or charge difference between the two capacitors.

    16. Neural network according to claim 11, characterised in that transfer circuits are designed between the circuit assemblies of successive layers of the neural network, for the transfer of a charge deficit of the capacitance of the respective circuit assembly of the preceding layer to gates of the second FETs of the circuit assemblies of the following layer.

    17. Neural network according to claim 11, characterised in that a circuit, for the conversion of digital values into pulse widths of a voltage pulse, is arranged upstream of each circuit assembly.

    18. Neural network according to claim 11, characterised in that the circuit assembly for the processing of signed components of the weight vectors in each of the series circuits has two parallel circuit branches, which are connected to the second FET, or FET array, and in each case have a first FET, wherein a first of the two circuit branches is connected to the capacitance, and a second of the two circuit branches is connected to a second capacitance, which can be precharged by way of the charging means, and can be discharged by way of the series connection of the first FET of the second circuit branch and the second FET, or FET array, wherein the respective component, encoded as the pulse width of a voltage pulse, is applied, depending on its sign, by the control device either to the gate of the first FET of the first circuit branch, or to the gate of the first FET of the second circuit branch.

    Description

    BRIEF DESCRIPTION OF THE FIGURES

    [0019] The proposed methods, in conjunction with an artificial neural network, are explained once again in more detail below by means of examples of embodiment, in conjunction with the figures. Here:

    [0020] FIG. 1 shows a detail of three layers from an artificial neural network (sub-figure a)), together with the structure of an artificial neuron (sub-figure b));

    [0021] FIG. 2 shows an example of the circuit assembly for multiplication (sub-figure a)), together with the circuit assembly for the calculation of a scalar product (sub-figure b)), in accordance with the present invention;

    [0022] FIG. 3 shows an example of a DTC circuit (sub-figure a)), together with the development of the voltages within the circuit over time (sub-figure b));

    [0023] FIG. 4 shows an example of an alternative circuit assembly for multiplication in accordance with the present invention;

    [0024] FIG. 5 shows examples of the implementation of signed weight factors in the proposed circuit assembly;

    [0025] FIG. 6 shows an example of the matrix-like arrangement of a plurality of the proposed multiplication circuit assemblies for the implementation of a neural layer;

    [0026] FIG. 7 shows an example of a charge transfer circuit for the transfer of a charge deficit from the output of one neuron to the input of the next neuron; and

    [0027] FIG. 8 shows an example of the configuration of a neural network in accordance with the present invention.

    MODES FOR CARRYING OUT THE INVENTION

    [0028] In the following examples, the proposed method, with the associated circuit assembly, is used to calculate scalar products in an artificial neural network. To this end FIG. 1 shows in sub-figure a) a detail with three layers of an artificial neural network. In sub-figure b), the basic structure of an artificial neuron is shown, here for the j.sup.th neuron in layer y of the neural network. The input values x.sub.1 to x.sub.n—that is to say, the activations from the previous layer x—are multiplied by the corresponding weight factors or weights w.sub.j1 to w.sub.jn, and the multiplication results are added, together with a constant value b.sub.j=x.sub.0·w.sub.j0. The resulting sum S.sub.j corresponds to the scalar product of the activation vector {right arrow over (X)} of the layer x of the neural network and the weight vector {right arrow over (W.sub.j)}, which represents the synaptic weights of the input signals to neuron y.sub.j. Furthermore, the sum S.sub.j represents the argument of the transfer function φ(S.sub.j), which generates the final neuron activation y.sub.j. Each multiplication x.sub.i·w.sub.ji corresponds to a single synaptic operation.

    [0029] With the proposed method, the calculation of the scalar product that takes place in a neuron is executed in an energy-efficient manner. FIG. 2a illustrates the core schematic of the proposed circuit assembly, i.e. the AMS multiplication cell (AMS: analogue mixed signal multiplier), which is based on two stacked FETs, here MOSFETs, and a capacitor, utilised here as capacitance. Initialisation takes place by precharging the capacitor C to the positive supply voltage U.sub.DD. The circuit principle is similar to the precharge and evaluation function in CMOS domino logic. FIG. 2a shows the two series-connected MOSFETs N.sub.w, N.sub.x, the capacitor C, and the precharging device connected to the supply voltage U.sub.DD, here in the form of a switch. The basic schematic of this AMS multiplication cell is used in two different forms of embodiment of the proposed method.

    [0030] In the preferred configuration, the multiplication result is evaluated as follows. The lower MOSFET N.sub.x operates as a current-source transistor, which is controlled by its analogue gate-source voltage u.sub.GS,Nx=u.sub.x, provided by way of an input value x, the output of the previous neuron layer. The voltage u.sub.x controls the drain current i.sub.x by way of the nonlinear transfer function I.sub.x(U.sub.x) in accordance with the current equation of the MOSFET. This nonlinearity forms part of the nonlinear transfer function φ of the preceding neuron layer. Since the n-channel enhancement-mode MOSFET has a threshold voltage greater than 0, a soft rectifier-like transfer function is implemented.

    [0031] The drain current i.sub.x is then drawn from the upper pole of the capacitor C only if the stacked MOSFET N.sub.w is also conducting. By setting its gate voltage to U.sub.DD for a period of time T.sub.w corresponding to the weight factor w, the upper MOSFET N.sub.w is switched on. The charge Q.sub.xw drawn from the output node and the corresponding output voltage U.sub.C are given by:

    [00001] Q.sub.xw=T.sub.w·I.sub.x(U.sub.x), U.sub.C=U.sub.DD−T.sub.w·I.sub.x(U.sub.x)/C.

    [0032] The result of the multiplication thus corresponds to the amount of charge Q.sub.xw that flows through the series circuit of these two MOSFETs. The temporal relationships of the voltages and currents in this circuit assembly are shown in the left-hand part of FIG. 2a.
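    The charge relationship above can be illustrated with a short behavioural sketch in Python. The square-law saturation model and all numerical parameter values (K, U_TH, C, U_DD) are illustrative assumptions for the sketch only, not values taken from the description:

```python
# Behavioural sketch of one AMS multiplication cell (cf. FIG. 2a).
# The square-law device model and all parameters are assumed values.

U_DD = 0.8      # supply / precharge voltage in volts (assumed)
C = 0.6e-15     # cell capacitance in farads (assumed)
K = 1e-4        # transconductance parameter in A/V^2 (assumed)
U_TH = 0.3      # threshold voltage of N_x in volts (assumed)

def i_x(u_x):
    """Drain current of the current-source MOSFET N_x (saturation, square law)."""
    return 0.0 if u_x <= U_TH else K * (u_x - U_TH) ** 2

def ams_multiply(t_w, u_x):
    """Q_xw = T_w * I_x(U_x); the residual voltage U_C encodes the product."""
    q = t_w * i_x(u_x)    # charge drawn from C during the pulse of width t_w
    u_c = U_DD - q / C    # residual capacitor voltage after the pulse
    return q, u_c
```

    For a gate voltage below the threshold voltage no charge is drawn, which reproduces the soft rectifier-like transfer behaviour noted above.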

    [0033] FIG. 2b shows the implementation of the proposed circuit assembly as an AMS scalar-product cell, by applying Kirchhoff's current law to a common output node, in which all output currents of the AMS multiplication cells are accumulated. The corresponding parallel connection of a plurality of series circuits of two MOSFETs, together with the capacitor C and the associated charging device, is shown schematically in FIG. 2b. The output voltage U.sub.yj is given by:

    [00002] U.sub.yj=U.sub.DD−Q.sub.yj/C.sub.tot,j=U.sub.DD−(1/C.sub.tot,j)·Σ.sub.i=0.sup.n Q.sub.xw,ji=U.sub.DD−(1/C.sub.tot,j)·Σ.sub.i=0.sup.n T.sub.wji·I.sub.xi(U.sub.xi).

    [0034] The artificial neuron function, i.e. a scalar product followed by a non-linear transfer function, is mapped according to simple electrical network principles (i.e. Kirchhoff's Laws) in conjunction with established FET device physics (I.sub.DS=f(U.sub.GS, U.sub.DS)). A neuron output activation is implemented along a single line with a series of multipliers.
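    The accumulation on the shared output node can be sketched in the same simplified behavioural style. C_tot and the square-law device parameters are again assumed values, not figures from the description:

```python
# Behavioural sketch of the AMS scalar-product cell (cf. FIG. 2b):
# every multiplication cell discharges one shared capacitance C_tot.
# All parameter values are assumed.

U_DD, C_TOT = 0.8, 6.0e-15   # supply voltage and node capacitance (assumed)
K, U_TH = 1e-4, 0.3          # square-law model parameters (assumed)

def i_x(u):
    """Current-source transistor model (saturation, square law)."""
    return 0.0 if u <= U_TH else K * (u - U_TH) ** 2

def scalar_product_cell(t_w, u_x):
    """U_yj = U_DD - (1/C_tot) * sum_i T_wji * I_xi(U_xi)."""
    q = sum(t * i_x(u) for t, u in zip(t_w, u_x))  # total charge drawn
    return U_DD - q / C_TOT                        # residual node voltage
```

    The residual node voltage thus falls linearly with the sum of the pulse-width/current products, i.e. with the scalar product.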

    [0035] Analogue multiplication is implemented by the use of only two small MOSFETs. The total capacitance to be charged or discharged during the multiplication process can be limited to values of only 0.6 fF for 300 nm wide MOSFETs N.sub.x and N.sub.w in 22 nm CMOS. This results in an energy consumption of 0.5 fJ per multiplication at a supply voltage of 0.8 V. In contrast, the estimated operating energy of an 8-bit×8-bit array multiplier in 28 nm CMOS technology is 8×30 fJ=240 fJ (based on 30 fJ for a single 8-bit adder), resulting in an approximately 500-fold increase in energy efficiency for the proposed AMS multiplier.

    [0036] In the above preferred configuration, the neuron input weight factors w.sub.i are represented by the temporal width T.sub.wi of current pulses, wherein the current amplitude I.sub.xi represents the input activations, that is to say, the input values x.sub.i (cf. FIG. 1). In order to minimise energy consumption, the individual weight factors are preferably stored locally, directly next to the corresponding AMS multiplier cells. In a standard CMOS process—the target technology for the implementation of circuits in accordance with the present invention—the most efficient and easy-to-use memory implementation is formed by sets of static 6-MOSFET memory cells, which represent binary words or digits. A conversion from the digital binary memory words into temporal pulse widths, that is to say, a digital-time converter (DTC), is therefore required.

    [0037] FIG. 3a shows a circuit that executes this conversion, based on the discharge of a parasitic circuit node capacitance C.sub.node by a programmable current I.sub.dis. The input binary word is represented by the binary signals W.sub.0 to W.sub.k, which are delivered by the binary memory cells. These binary signals control the discharge rate of the precharged node U.sub.out1. The different discharge currents in the different paths are set by MOSFETs with different threshold voltages, connected in series with the switch MOSFETs N.sub.slvt, at whose gates the binary signals W.sub.0 to W.sub.k are applied; the threshold-voltage variants are indicated by the abbreviations uhvt (ultra-high threshold voltage), llhvt (low-leakage high threshold voltage), hvt (high threshold voltage) and rvt (regular threshold voltage). Since the path currents are determined by the threshold voltage and not by the channel width, all MOSFETs in this circuit can have a minimum channel width, resulting in very low dynamic power consumption. Two further precharged and cascaded dynamic amplifier stages with output nodes U.sub.out2 and U.sub.out3 provide amplification and binary signal level regeneration when moved into the evaluation mode by the signals U.sub.rst and U.sub.rst2 respectively. The two reset/preload signals U.sub.rst and U.sub.rst2 and the evaluation signal U.sub.evl are offset in time, as shown in FIG. 3b.
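    The DTC principle, a precharged node discharged by a word-programmable current until a trip point is reached, can be sketched as follows. The per-path currents (fixed in hardware by the different threshold voltages), the node capacitance, and the trip voltage are all assumed values, and the simple linear-discharge timing model is an illustrative simplification:

```python
# Behavioural sketch of the digital-time converter (DTC, cf. FIG. 3a):
# the node capacitance C_node, precharged to U_DD, is discharged by the
# sum of the path currents enabled by the bits W_0..W_k; the time to
# reach the trip point of the next stage is the output pulse width.
# All parameter values are assumed.

C_NODE = 2.0e-15                        # parasitic node capacitance (assumed)
U_DD, U_TRIP = 0.8, 0.4                 # precharge and trip voltages (assumed)
I_PATH = [10e-9, 20e-9, 40e-9, 80e-9]   # per-bit path currents W_0..W_3 (assumed)

def dtc_pulse_width(word_bits):
    """Pulse width = C_node * (U_DD - U_trip) / I_dis for the enabled paths."""
    i_dis = sum(i for bit, i in zip(word_bits, I_PATH) if bit)
    if i_dis == 0.0:
        return float("inf")             # no path enabled: node never trips
    return C_NODE * (U_DD - U_TRIP) / i_dis
```

    In this sketch a larger binary word enables more discharge paths and therefore yields a shorter pulse; the actual mapping from memory word to pulse width is fixed by the circuit design.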

    [0038] In an alternative form of embodiment, the weight and activation inputs, and thus the roles of the lower and upper MOSFETs in the multiplier evaluation path(s) of FIG. 2, are reversed, as shown in FIG. 4. The weight is now represented by a constant source current I.sub.w (cf. FIG. 4a), which is delivered either by a single lower MOSFET, or by a programmable set of lower MOSFETs N.sub.wk with parallel-connected drains and sources, as the current source N.sub.w. FIG. 4b shows an example of the implementation in circuit form of the current source N.sub.w, controlled by the digital word W, as an array of parallel lower MOSFETs N.sub.wk. The two-level activation input u.sub.x, which is applied to the gate of the upper MOSFET N.sub.x, now controls the temporal pulse width T.sub.x of the discharge current i.sub.w(t). In the case of a set of a plurality of MOSFETs N.sub.wk, the source current I.sub.w is again controlled by a local binary weight memory, which supplies the binary signals W.sub.0 to W.sub.k.

    [0039] The advantage of the alternative form of embodiment of FIG. 4 over the preferred form of embodiment of FIG. 2 is that no digital-time converters, or digital-to-pulse-width converters, are required for the weight factors at each position of the mixed-signal multiplier. The disadvantage compared to the preferred form of embodiment is that charge-to-pulse-width or voltage-to-pulse-width converters are required between the activation outputs (signal domain: analogue voltage or charge) of one neural network layer and the activation inputs of the next neural layer (signal domain: pulse width). Such a charge-to-pulse-width converter can be implemented after evaluation by recharging the capacitance C with a constant current I.sub.charge, starting at a predefined time t.sub.0. A trigger circuit detects the time t.sub.1 of the complete recharge of capacitor C. Between the times t.sub.0 and t.sub.1, a positive voltage u.sub.y=U.sub.DD is output for the duration t.sub.y=t.sub.1−t.sub.0, wherein t.sub.y=Q.sub.y/I.sub.charge is proportional to the charge Q.sub.y that is drawn from C by the AMS circuit.
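    The conversion t.sub.y=Q.sub.y/I.sub.charge described above amounts to a single division; a minimal sketch, with I_charge as an assumed value, is:

```python
# Sketch of the charge-to-pulse-width conversion described above: after
# evaluation, C is recharged with a constant current I_charge, and the
# recharge time t_y = Q_y / I_charge re-encodes the drawn charge Q_y as
# a pulse width.  I_charge is an assumed value.

I_CHARGE = 100e-9   # constant recharge current in amperes (assumed)

def charge_to_pulse_width(q_y):
    """t_y = t_1 - t_0 = Q_y / I_charge, proportional to the drawn charge."""
    return q_y / I_CHARGE
```

    The linearity of this mapping is what allows the alternative configuration to pass activations between layers without distorting the scalar-product result.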

    [0040] The AMS multiplier circuits according to FIG. 2a and FIG. 4a operate only with unsigned signals. In the charge equation, both the currents I.sub.x (preferred configuration) or I.sub.w (alternative configuration), and also the pulse widths T.sub.w (preferred configuration) or T.sub.x (alternative configuration), are positive, resulting in a positive charge Q=I·T drawn from the precharged capacitor C in both configurations. Extensions of the circuit topology based on the AMS multipliers shown in FIG. 2a and FIG. 4a enable the use of signed signals.

    [0041] In artificial neural networks, the activation value range is often limited to positive values. However, the weights can be positive or negative. FIG. 5a shows the block diagram for signed weight factors using two signals at the activation output of the neuron. For a positive weight factor w.sub.ji, the two weight components are set to w.sub.jip=w.sub.ji and w.sub.jin=0. With a negative weight factor w.sub.ji, the two weight components are set to w.sub.jip=0 and w.sub.jin=−w.sub.ji.
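    The sign-handling rule just described, splitting a signed weight into two non-negative components so that w.sub.ji=w.sub.jip−w.sub.jin, can be stated as a one-line helper:

```python
# Sketch of the signed-weight decomposition of FIG. 5a: a signed weight
# w is split into non-negative components (w_p, w_n) with w = w_p - w_n,
# driving the positive and negative circuit branches respectively.

def split_weight(w):
    """Return (w_p, w_n) with w = w_p - w_n and both components >= 0."""
    return (w, 0.0) if w >= 0 else (0.0, -w)
```

    Exactly one of the two components is non-zero, so for each weight only one of the two circuit branches ever conducts.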

    [0042] FIG. 5b shows the circuit topology for the implementation of a signed weight for the preferred configuration. A pair of MOSFETs N.sub.wp and N.sub.wn is used, wherein their common source node is connected to the drain of N.sub.x, and the pair N.sub.wp and N.sub.wn replaces the single MOSFET N.sub.w. The drains of N.sub.wp and N.sub.wn are connected to two output voltage lines u.sub.cp and u.sub.cn respectively, which carry the precharged capacitors C.sub.n=C.sub.p. The final output signal is the voltage difference u.sub.cD=u.sub.cp−u.sub.cn, or the charge difference Q.sub.D=Q.sub.p−Q.sub.n. A selector connects the output signal U.sub.out3 of the DTC of FIG. 3a to the corresponding input u.sub.wp or u.sub.wn of the differential pair N.sub.wp/n of FIG. 5b, depending on the sign of the weight factor. The other input of the differential pair is connected to ground.

    [0043] FIG. 5c shows the circuit topology for the implementation of a signed weight for the alternative configuration. A pair of MOSFETs N.sub.xp and N.sub.xn is used, wherein their common source node is in turn connected to the drain of N.sub.w, and the pair N.sub.xp and N.sub.xn replaces the single MOSFET N.sub.x. The drains of N.sub.xp and N.sub.xn are connected to two output voltage lines u.sub.cp and u.sub.cn respectively, which carry the precharged capacitors C.sub.n=C.sub.p. The final output signal is again the voltage difference u.sub.cD=u.sub.cp−u.sub.cn, or the charge difference Q.sub.D=Q.sub.p−Q.sub.n. A selector connects the activation input signal u.sub.x (pulse width domain) from FIG. 3 to the corresponding input u.sub.xp or u.sub.xn of the differential pair N.sub.xp/n of FIG. 5c, depending on the sign of the weight factor. The other input of the differential pair is connected to ground. Here too, N.sub.w can be implemented as a programmable current source as shown in FIG. 4b.

    [0044] To implement both signed weights and signed input activations, that is to say, input values, the circuit topologies of FIGS. 5b and 5c, which each represent a single differential topology, must be extended to double differential topologies, with their outputs cross-connected to u.sub.cp and u.sub.cn. For the preferred configuration (FIG. 5b), the single differential pair N.sub.x+(N.sub.wp−N.sub.wn) is doubled to form the double differential topology (N.sub.xp+(N.sub.wp−N.sub.wn).sub.p)−(N.sub.xn+(N.sub.wp−N.sub.wn).sub.n), as shown in FIG. 5d. There is a u.sub.xp and u.sub.xn input for a signed differential input activation signal (voltage domain), which is connected to the gates of the two current-source MOSFETs N.sub.xp and N.sub.xn. The weight u.sub.wp (pulse width domain) is connected to both N.sub.wp,p and N.sub.wp,n, and the weight u.sub.wn is connected to both N.sub.wn,p and N.sub.wn,n.

    [0045] For the alternative configuration (FIG. 5c), the single differential pair N.sub.w+(N.sub.xp−N.sub.xn) is doubled to form the double differential topology (N.sub.wp+(N.sub.xp−N.sub.xn).sub.p)−(N.sub.wn+(N.sub.xp−N.sub.xn).sub.n), as is shown in FIG. 5e. There is a u.sub.xp and u.sub.xn differential input for a signed differential input activation signal (pulse width domain); u.sub.xp is connected to both N.sub.xp,p and N.sub.xp,n, and u.sub.xn is connected to both N.sub.xn,p and N.sub.xn,n. N.sub.wp is an active current source for positive weight factors, and N.sub.wn is an active current source for negative weight factors.

    [0046] A single neural layer can be implemented by a matrix-like arrangement of a plurality of AMS multiplication cells, or an arrangement of a plurality of scalar product cells next to each other, as is exemplified in FIG. 6. FIG. 6a shows an arrangement of a plurality of AMS scalar product cells arranged next to each other, with n horizontal lines for the input activation vector {right arrow over (X)}, m vertical lines for the output activation vector {right arrow over (Y)}, and multiplication cells that are positioned at each intersection. Such an arrangement can evaluate one layer of a neural network (cf. FIG. 1a).
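    The matrix evaluation of one layer, n input lines carrying the activation-controlled currents and m output lines each accumulating the charge of one scalar-product cell, can be sketched behaviourally as follows; the square-law model and all parameter values are assumptions for illustration:

```python
# Behavioural sketch of the matrix arrangement of FIG. 6a: each of the m
# columns holds one scalar-product cell whose pulse widths encode one
# weight vector; all columns share the n input activation voltages.
# All parameter values are assumed.

U_DD, C_TOT = 0.8, 6.0e-15   # supply voltage and per-column capacitance (assumed)
K, U_TH = 1e-4, 0.3          # square-law model parameters (assumed)

def layer(t_w_matrix, u_x):
    """Evaluate one neural layer: one output voltage U_yj per column."""
    def i_x(u):
        # current-source transistor model (saturation, square law)
        return 0.0 if u <= U_TH else K * (u - U_TH) ** 2
    currents = [i_x(u) for u in u_x]          # one current per input line
    return [U_DD - sum(t * i for t, i in zip(col, currents)) / C_TOT
            for col in t_w_matrix]            # residual voltage per column
```

    Because every column sees the same input currents, the m scalar products of one layer are evaluated concurrently, which is the source of the arrangement's throughput.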

    [0047] The connection of the AMS multiplication cell to a horizontal and a vertical line, and the connection to a local weight memory (+digital-time converter (DTC)) is shown in FIG. 6b. A second overlay grid of horizontal and vertical lines is used to write weight data from the north and east sides of the matrix array to the local weight memory. A signal flow diagram representation of the circuit from FIG. 6b is shown in FIG. 6c.

    [0048] The matrix arrangement of AMS multiplication cells shown in FIG. 6 is capable of evaluating a single layer in an artificial neural network. For the calculations of a complete artificial neural network, a plurality of the operations executed by the matrix arrangement must be cascaded. This can be done by retransferring the evaluated matrix output signals y.sub.i to the matrix inputs x.sub.i, or by transferring the evaluated matrix output signals y.sub.i to the matrix inputs x.sub.i of another (different) matrix. For the preferred configuration, the output and input signals y.sub.i and x.sub.i are analogue charge (Q) or voltage (Q/C) amplitude domain signals.

    [0049] A very efficient method for transferring the analogue amplitude domain signals from the outputs back to the inputs is charge transfer. An example of a corresponding circuit for the charge transfer (transfer of a charge deficit) is shown in FIG. 7. The advantage of this circuit is that no Class A linear amplifiers with static power consumption are used. The charge transfer takes place exclusively by way of clocked common gate circuits with dynamic power consumption.

    [0050] Alternatively, the charge transfer can also take place by means of analogue voltage signal transfer through linear analogue buffer amplifiers, i.e. based on operational amplifiers with resistors and/or switched capacitors. Digital signal transfer by the interposition of A/D and D/A converters, preferably implemented in terms of energy-efficient SC-based conversion principles such as SAR, and supplemented by means for the processing of large neural layers and the implementation of artificial transfer functions, is also possible. This can be done, for example, by way of digital memories and blocks for digital signal processing.

    [0051] In the alternative configuration of the proposed method, the output signals y.sub.i are signals in the charge (Q) or voltage (Q/C) amplitude domain, while the input signals x.sub.i are signals in the pulse width domain. The signal transfer from the matrix outputs y.sub.i to the matrix inputs x.sub.i therefore requires a charge-to-pulse width converter, as described in one of the preceding sections.

    [0052] FIG. 8 shows an exemplary implementation of the proposed method in an overall architecture for an integrated neural network coprocessor, which is based on the AMS principles described above (black solid line blocks), supplemented by a parallel digital signal processing path (grey solid line blocks) and an additional external unit for learning purposes (dashed lines). The central part is an n×m AMS multiplication and addition matrix, as has already been explained in connection with FIG. 6a. A forward multiplication and addition unit (AMS multiplication cell, cf. FIGS. 6b and 6c) is located at each crossing point, which enables the matrix to evaluate the neuron layers continuously. Control units for the distributed weight memory are located at the north and east corners of the matrix.

    [0053] A stack of functional blocks, which are required to preload and write to the analogue horizontal lines, and to read the analogue vertical lines of the multiplication and addition matrix, is located on the west and south sides of the matrix respectively (blocks: preload and bias injection).

    [0054] Neural network layers, which have more neurons than the matrix row and column numbers n and m respectively, can be supported by analogue charge transfer memory units at the southern output and/or western input edge, with additional means for analogue charge addition, as represented by the blocks “transfer gate bank” and “capacitor bank” in FIG. 8.

    [0055] Energy-efficient charge transfer from the neuron layer activation outputs on the south side to the inputs of the next neuron layer on the west side can be implemented by maintaining the analogue charge domain using the analogue charge transfer circuits described above. In addition, power-efficient SC-based A/D converters can be connected to the south activation output edge, and D/A converters can be connected to the west activation input edge to enable a hybrid evaluation of the neural network, i.e. parts requiring low precision in the analogue path, and parts requiring high precision in an additional digital path. This additional digital path can also be used for the application of more specialised activation transfer functions.