CIRCUIT FOR CALCULATING WEIGHT ADJUSTMENTS OF AN ARTIFICIAL NEURAL NETWORK, AND A MODULE IMPLEMENTING A LONG SHORT-TERM ARTIFICIAL NEURAL NETWORK
20210097379 · 2021-04-01
Cpc classification
G11C2213/77
PHYSICS
G11C2013/0042
PHYSICS
G11C13/0007
PHYSICS
Abstract
A circuit structure for implementing a multilayer artificial neural network, the circuit comprising: a plurality of memristors implementing a synaptic grid array, the memristors storing weights of the network; and a calculation and control module configured to calculate the value of weight adjustments within the network.
Claims
1. A circuit structure for implementing a multilayer artificial neural network, the circuit structure comprising: a plurality of memristors implementing a synaptic grid array, the memristors storing weights of the network; and a calculation controller configured to calculate the value of weight adjustments within the network.
2. The circuit structure according to claim 1, wherein the synaptic grid arrays comprise memristor synapse circuits each having a memristor for storing a weight, a MOS tube comprising a PMOS transistor for inputting positive voltage signals quantized by samples to the memristor, and a NMOS transistor for inputting negative voltage signals to the memristor, having the same absolute value as PMOS; and a control signal input for controlling the on-off state of the PMOS and NMOS transistors.
3. The circuit structure according to claim 1, wherein the calculation controller is configured to generate control signals by: initiating a read process for reading the weight stored in a memristor and calculating the output of the network; and initiating a write process for adjusting the weights of memristors.
4. The circuit structure according to claim 3, configured such that: during a first half of the read process the control signal e=V.sub.DD, in which state the NMOS transistor is turned on and the PMOS transistor is off, the input voltage is −V.sub.in, such that the current flows from the negative pole of the memristor to its positive electrode, and the value of the memristor increases with time, and during a second half of the read process the control signal e=−V.sub.DD, the PMOS transistor is on, the NMOS transistor is off, and the input voltage is −V.sub.in, such that the current then flows from the positive pole of the memristor to its negative electrode, such that the resistance of the memristor decreases by the same amount as it increased by during the first half of the read operation, thus returning it to its original state.
5. The circuit structure according to claim 4, wherein the input voltage pulses, and the memristor is multiply operated, to calculate the output of the weight stored by the memristor.
6. The circuit structure according to claim 2, wherein the calculation controller is configured such that the value of the control signal is initially sign(error)V.sub.DD, the conduction of the MOS tube depends on the sign of the error, the voltage signal from the MOS tube and its duration T.sub.update determines the correction quantity of the memristor, such that after time T.sub.update, the value of the memristor is no longer changed until the writing process completes.
7. The circuit structure according to claim 1, wherein the calculation controller further comprises: one or more local gradient computation configured to calculate local gradients in the process of reverse propagation; one or more momentum computation configured to add a momentum adjustment to the weight correction and speeding up the convergence of the circuit; and an adaptive learning rate configured to adjust the learning rate by speeding up convergence of the circuit.
8. The circuit structure according to claim 7, wherein the local gradient computation comprises: a δ.sub.last calculation for calculating the local gradient of the output layer; and a δ.sub.front calculation for calculating the local gradient of the hidden layers.
9. The circuit structure according to claim 8, wherein the δ.sub.last calculation satisfies the mean square error function and/or wherein the δ.sub.last calculation satisfies the cross entropy error function.
10. The circuit structure according to claim 8, wherein the δ.sub.front calculation comprises: a derivation of transfer configured to calculate the derivation of the transfer function at the corresponding input; and a synapse grid array for calculating vector products of weights at each layer, and for determining the local gradient δ.sub.front+1 of next layer of the network.
11. The circuit structure according to claim 7, wherein the momentum computation comprises a sample and hold, an adder and a multiplier, and preferably also comprises a comparator, an amplifier and a constant.
12. A memristor-based LSTM neural network system, comprising: an internal loop control layer providing a data memory and an LSTM cell, wherein the data memory is configured to store data of an input layer of the network, and to store data after feature extraction; and an external classified output layer providing an external memristor crossbar and a voltage comparator, the external memristor crossbar being configured to classify features extracted by the internal loop control layer, and the voltage comparator being configured to compare the analog voltages output by the external memristor crossbar to obtain a comparison result of the analog voltage; wherein a classification result is output based on the achieved comparison result.
13. The memristor-based LSTM neural network system according to claim 12, wherein the internal memristor crossbar includes voltage input ports, threshold memristors, voltage inverters, operational amplifiers and multipliers, wherein for each voltage input port connected to the threshold memristor there exists another one of the voltage input ports connected to the threshold memristor through a voltage inverter, and the operational amplifier is connected in parallel so that one end of the voltage inverter is connected with the operational amplifier where the output is connected, and the other end is connected to the input of the multiplier.
14. The memristor-based LSTM neural network system according to claim 13, wherein the or each operational amplifier is connected in parallel with a threshold memristor so as to provide the operation function of a sigmoid activation function, and so as to transform current signal into voltage signal.
15. The memristor-based LSTM neural network system according to claim 14, wherein the external memristor crossbar comprises voltage input ports, threshold memristors and voltage inverters, so that between two voltage input ports, a voltage input port is connected to a threshold memristor through a voltage inverter, and the other port is directly connected to a threshold memristor.
Description
BRIEF DESCRIPTION
[0040] We now describe features of embodiments of the invention, by way of example only, with reference to the accompanying drawings.
DETAILED DESCRIPTION
[0051] A circuit structure 10 of a multilayer neural network incorporating memristors is presented in
[0052]
[0053] Each control step comprises a read process for reading the weight stored by the memristor and calculating the output of the network; and a write process for adjusting the weight of the memristor.
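The read process can be illustrated with a minimal sketch of the two-phase read characterised in claim 4: the memristance rises during the first half of the read and falls by the same amount during the second half, so the stored weight is left unchanged. The linear drift model dM/dt = -k*v, the constant k, and the names read_cycle, v_in, t_read and steps are assumptions made for illustration only; they are not taken from the specification.

```python
def read_cycle(m0, v_in, t_read, k=1.0e3, steps=1000):
    """Apply a two-phase read pulse and return (trace, final memristance).

    Assumed device model (hypothetical): dM/dt = -k * v, where v is the
    voltage applied across the memristor.
    """
    dt = t_read / steps
    m = m0
    trace = [m]
    for i in range(steps):
        # First half: -v_in is applied, so the memristance increases.
        # Second half: +v_in is applied, so the memristance decreases.
        v = -v_in if i < steps // 2 else v_in
        m += -k * v * dt
        trace.append(m)
    return trace, m


trace, m_final = read_cycle(m0=1000.0, v_in=0.1, t_read=1.0e-4)
print(max(trace) > 1000.0, abs(m_final - 1000.0) < 1e-9)
```

Under these assumptions the peak of the trace sits at the midpoint of the read, and the final value matches the initial value to within floating-point rounding.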
[0054] According to an embodiment herein, and as illustrated in
[0055] According to an embodiment herein, the duration of the write process is T.sub.write. The value of the control signal is initially sign(error)V.sub.DD and the duration is T.sub.update. The conduction of the MOS tube depends on the sign of the error. The voltage signal from the MOS tube and its duration determines the correction quantity of the memristor. After T.sub.update, the value of the memristor is no longer changed until the writing process is over. The variation of the memristor value during the weight updating process is as shown in
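As a hedged illustration of the write process, the sketch below drives an assumed linear memristor model with a signal whose polarity follows sign(error) for a duration T.sub.update and then holds the value until T.sub.write has elapsed. The drift constant k, the drive amplitude v_in, the direction of change for a given error sign, and the function names are all assumptions; the specification states only that the voltage signal and its duration determine the correction quantity.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def write_cycle(m0, error, t_update, t_write, v_in=0.1, k=1.0e3, steps=1000):
    """Return the memristance after one write process.

    Assumed model (hypothetical): the memristance changes at the rate
    k * sign(error) * v_in while the update signal is active, and is
    held constant for the remainder of the write window.
    """
    if not 0.0 <= t_update <= t_write:
        raise ValueError("t_update must lie within [0, t_write]")
    dt = t_write / steps
    n_update = round(steps * t_update / t_write)  # steps with active drive
    m = m0
    for i in range(steps):
        drive = sign(error) * v_in if i < n_update else 0.0
        m += k * drive * dt
    return m


print(write_cycle(1000.0, error=0.3, t_update=0.5e-4, t_write=1.0e-4))
print(write_cycle(1000.0, error=-0.3, t_update=0.5e-4, t_write=1.0e-4))
```

In this model the size of the correction is k * v_in * T.sub.update, so lengthening T.sub.update enlarges the weight change while the sign of the error selects its direction.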
[0056] According to an embodiment herein, the calculation and control module comprises one or more local gradient computation modules 18, wherein the local gradient computation modules 18 are designed for calculating local gradients in the process of reverse propagation. The calculation and control module further includes one or more momentum modules 20, wherein the momentum modules 20 are designed for adding a momentum to the weight correction and speeding up the convergence of the circuit. The circuit further includes an adaptive learning rate module 22, wherein the adaptive learning rate module 22 is designed for adjusting the learning rate and speeding up the convergence of the circuit, as is known in the art.
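The role of the momentum and adaptive learning rate modules can be sketched in software as follows. The momentum rule shown is the conventional one, in which a fraction alpha of the previous weight correction is added to the current gradient term. The learning rate rule (grow the rate when the error falls, shrink it when the error rises) is only one commonly used heuristic and is an assumption here, since the specification does not state which rule the module applies. All names and constants are illustrative.

```python
def momentum_update(delta, x, prev_dw, lr, alpha=0.9):
    """Weight correction with momentum: dw = lr * delta * x + alpha * prev_dw.

    delta   : local gradient of the receiving neuron
    x       : input carried by the synapse
    prev_dw : weight correction applied in the previous iteration
    """
    return lr * delta * x + alpha * prev_dw


def adapt_learning_rate(lr, err, prev_err, up=1.05, down=0.7):
    """Assumed heuristic: raise lr if the error decreased, lower it otherwise."""
    return lr * up if err < prev_err else lr * down


dw = momentum_update(delta=0.5, x=2.0, prev_dw=0.1, lr=0.1)
lr = adapt_learning_rate(lr=0.1, err=0.2, prev_err=0.3)
print(dw, lr)
```

The previous correction prev_dw would have to be stored between iterations, which is consistent with the sample and hold, adder and multiplier recited for the momentum computation in claim 11.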
[0057] The local gradient computation modules 18 may, for example, comprise: a δ.sub.last calculation module designed for calculating the local gradient of the output layer; and a δ.sub.front calculation module designed for calculating the local gradient of the hidden layers. According to an embodiment herein, the δ.sub.last calculation module satisfies the mean square error function; alternatively, the δ.sub.last calculation module may satisfy the cross entropy error function.
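For orientation, the two output-layer alternatives can be written out in a short sketch using the textbook backpropagation expressions. For the mean square error, the local gradient is the output error scaled by the derivative of the transfer function. For the cross entropy error paired with a sigmoid or softmax output and 0/1 targets, the derivative factor cancels and the local gradient reduces to the output error. These formulas, the pairing of cross entropy with a sigmoid or softmax output, and the function names are assumptions, as the specification does not give the expressions.

```python
def delta_last_mse(target, output, dphi):
    """Output-layer local gradient for the mean square error.

    dphi[k] is the derivative of the transfer function at the input of
    output neuron k.
    """
    return [(d - y) * g for d, y, g in zip(target, output, dphi)]


def delta_last_cross_entropy(target, output):
    """Output-layer local gradient for cross entropy.

    Assumes a sigmoid or softmax output with 0/1 targets, in which case
    the transfer-function derivative cancels.
    """
    return [d - y for d, y in zip(target, output)]


print(delta_last_mse([1.0, -1.0], [0.5, -0.5], [0.75, 0.75]))
print(delta_last_cross_entropy([1.0, 0.0], [0.8, 0.3]))
```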
[0058] According to an embodiment herein, the δ.sub.front calculation module comprises: a derivation of transfer function module designed for calculating the derivative of the transfer function at the corresponding input; and a synapse grid array module designed for calculating the vector product of the weights and the local gradient δ.sub.front+1 of the next layer.
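The hidden-layer computation described above corresponds to the usual backpropagation recursion, sketched here in software: each hidden neuron multiplies the derivative of its transfer function by the weighted sum of the local gradients δ.sub.front+1 of the next layer. The weight indexing convention and the function name are assumptions made for the example.

```python
def delta_front(dphi, weights, delta_next):
    """Local gradients of a hidden layer.

    dphi[j]       : derivative of the transfer function at hidden neuron j
    weights[k][j] : weight from hidden neuron j to next-layer neuron k
    delta_next[k] : local gradient of next-layer neuron k
    """
    return [
        dphi[j] * sum(weights[k][j] * delta_next[k]
                      for k in range(len(delta_next)))
        for j in range(len(dphi))
    ]


print(delta_front(dphi=[0.5, 1.0],
                  weights=[[1.0, 2.0], [3.0, 4.0]],
                  delta_next=[0.1, 0.2]))
```

The inner sum is a vector product of one column of the weight matrix with the next layer's local gradients, which is the operation the specification assigns to the synapse grid array.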
[0059] A suitable momentum module 20 is set out in
[0060] An adaptive learning rate module 22 suitable for use in the system of the present invention is shown in
[0061] In an example implementation, the network was trained on the Iris data set. The data set has a sample size of 150 and is divided into two equal halves, so that there are 75 training samples and 75 test samples. The dimensionality of the data set is 4. There are three class labels: (1, −1, −1) represents Setosa, (−1, 1, −1) represents Versicolour, and (−1, −1, 1) represents Virginica. A pulse electrical signal is derived from the Iris data set. The iteration period of the neuromorphic computing circuit is 0.1 s. Because the input signal of the circuit is in pulse form, each input sample must be processed first. In the scheme, 0.1 s is divided into 1000 pulse signals; that is, the circuit interface receives an input pulse vector x.sub.in, a control pulse e and a target output vector d every 10.sup.−4 seconds.
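The label coding and pulse timing of this example can be checked with a few lines of code. The sketch below encodes the three classes as the target vectors given above (the third class being the species Iris virginica) and computes the pulse interval from the 0.1 s iteration period and the 1000 pulses per iteration. The names are illustrative, and how each sample is converted into pulse amplitudes is not specified in the text, so it is not modelled.

```python
ITERATION_PERIOD_S = 0.1      # iteration period of the circuit
PULSES_PER_ITERATION = 1000   # pulses per iteration period

# Target output vectors d for the three Iris classes.
CLASS_CODES = {
    "Setosa": (1, -1, -1),
    "Versicolour": (-1, 1, -1),
    "Virginica": (-1, -1, 1),
}


def encode_label(name):
    """Return the target vector d for a class name."""
    return CLASS_CODES[name]


def pulse_interval_s():
    """Time between successive input pulses, in seconds."""
    return ITERATION_PERIOD_S / PULSES_PER_ITERATION


def pulse_times():
    """Start time of each pulse within one iteration period."""
    dt = pulse_interval_s()
    return [i * dt for i in range(PULSES_PER_ITERATION)]


print(encode_label("Versicolour"), pulse_interval_s(), len(pulse_times()))
```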
[0062] Turning now to
[0063] The LSTM cell 30 comprises n units. In the example shown in
[0064] Similarly, 64 hidden state values h.sub.t−1[1, . . . , 64] are also received from the data memory 28, as input to each of the LSTM units. The input 32 to the system at time t, x.sub.t, is also input to each LSTM unit. In the example given, the input 32 comprises 50 values x[1, . . . , 50]. Of course it should be understood that the same general structure and principles apply where n is any other number (and not necessarily 64), and where the input comprises more or fewer than 50 values.
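To make the data flow concrete, the following is a minimal software sketch of one time step of a conventional LSTM cell with 50 inputs and 64 units. It uses the standard gate equations (forget, input and output gates with sigmoid activations, a candidate value with a hyperbolic tangent activation); whether the memristor circuit realises exactly this formulation is an assumption, and the parameter layout and names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def zero_params(n_in, n_hidden):
    """All-zero weights and biases for the four gates (illustration only)."""
    width = n_in + n_hidden
    return {gate: ([[0.0] * width for _ in range(n_hidden)], [0.0] * n_hidden)
            for gate in ("f", "i", "o", "g")}


def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard LSTM cell; returns (h_t, c_t).

    params[gate] = (W, b), where each row of W acts on the concatenation
    of the input x_t and the previous hidden state h_(t-1).
    """
    z = list(x) + list(h_prev)

    def affine(gate):
        w, b = params[gate]
        return [sum(wi * zi for wi, zi in zip(row, z)) + bi
                for row, bi in zip(w, b)]

    f = [sigmoid(a) for a in affine("f")]    # forget gate
    i = [sigmoid(a) for a in affine("i")]    # input gate
    o = [sigmoid(a) for a in affine("o")]    # output gate
    g = [math.tanh(a) for a in affine("g")]  # candidate cell value
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


h_t, c_t = lstm_step([0.0] * 50, [0.0] * 64, [1.0] * 64, zero_params(50, 64))
print(len(h_t), c_t[0], h_t[0])
```

With all-zero parameters every gate evaluates to 0.5 and the candidate to 0, so the example simply halves the previous cell state; trained weights would replace zero_params in any real use.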
[0065]
[0066]
[0067] A threshold memristor can only express a positive weight. Accordingly, for each pair of adjacent voltage input ports of the present invention, one voltage input port is connected to a threshold memristor through a voltage inverter (56, for example), and the other voltage input port is connected to a threshold memristor directly, so that the two adjacent threshold memristors together are configured to express a positive or a negative weight. For example, looking at
[0068] In the same manner, the following rows receiving voltage inputs V.sub.h1 to V.sub.hn (i.e. to V.sub.h64 in the given example) are replicated in the next set of rows, each via a voltage inverter 56.
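The effect of pairing each directly connected row with an inverted copy can be illustrated numerically. In the sketch below, each input voltage drives one memristor directly and a second memristor through an inverter, so that the summed column current equals the input multiplied by the difference of the two conductances, which may be positive or negative. The ideal ohmic summation and the names are assumptions made for illustration.

```python
def crossbar_column_current(voltages, g_direct, g_inverted):
    """Current summed on one crossbar column.

    voltages   : input voltages on the directly connected rows
    g_direct   : conductances (>= 0) of the memristors on those rows
    g_inverted : conductances (>= 0) of the memristors on the replicated
                 rows, which receive the inverted voltages
    """
    rows = list(voltages) + [-v for v in voltages]
    conductances = list(g_direct) + list(g_inverted)
    return sum(v * g for v, g in zip(rows, conductances))


def effective_weights(g_direct, g_inverted):
    """Signed weights expressed by each pair of positive conductances."""
    return [gp - gn for gp, gn in zip(g_direct, g_inverted)]


v = [0.5, -0.2]
g_pos = [2.0e-3, 1.0e-3]
g_neg = [1.0e-3, 3.0e-3]
print(effective_weights(g_pos, g_neg))
print(crossbar_column_current(v, g_pos, g_neg))
```

Although every individual conductance is non-negative, the second pair in the example yields a negative effective weight, which is the purpose of the inverter rows.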
[0069] Both M1 and M2 are threshold memristors, V.sub.s1+ is a 1 V DC voltage, V.sub.s1− is ground, V.sub.s2+ is a 1 V DC voltage, V.sub.s2− is a −1 V DC voltage, and the values of resistors R.sub.1-R.sub.5 should be identical (between 1 kΩ and 10 kΩ).
[0070] The operational amplifiers 46 are connected in parallel with the threshold memristors M1, as shown in
[0071] A further operational amplifier 48 is connected in parallel with the threshold memristor M2 to implement a hyperbolic tangent activation function and convert the current signal into a voltage signal. One end of the voltage inverter is connected to the output of the operational amplifier, and the other end is connected to the input of the multiplier for converting the direction of the voltage.
[0072] According to an embodiment herein, the external classified output layer includes an external memristive crossbar circuit 34 and an auxiliary circuit, wherein the external memristive crossbar circuit 34 includes voltage input ports (labelled V.sub.h1 to V.sub.hn in
[0073] Embodiments of the subject matter and the functional operations described herein can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
[0074] Some embodiments are implemented using one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, a data processing apparatus. The computer-readable medium can be a manufactured product, such as a hard drive in a computer system or an embedded system. The computer-readable medium can be acquired separately and later encoded with the one or more modules of computer program instructions, such as by delivery of the one or more modules of computer program instructions over a wired or wireless network. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, or a combination of one or more of them. As used herein, in some embodiments the term module comprises a memory and/or a processor configured to control at least one process of a system or a circuit structure. The memory stores executable instructions which, when executed by the processor, cause the processor to provide an output to perform the at least one process. Embodiments of the memory include non-transitory computer readable media.
[0075] The terms “computing device” and “data processing apparatus” encompass all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, a runtime environment, or a combination of one or more of them. In addition, the apparatus can employ various different computing model infrastructures, such as web services, distributed computing and grid computing infrastructures.
[0076] The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output.
[0077] Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Devices suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
[0078] To provide for interaction with a user, some embodiments are implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
[0079] The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (“LAN”) and a wide area network (“WAN”), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).
[0080] Therefore, while the embodiments herein have been described in terms of preferred embodiments, those skilled in the art will recognise that the embodiments herein can be practiced with modification within the spirit and scope of the appended claims.
[0081] When used in this specification and claims, the terms “comprises” and “comprising” and variations thereof mean that the specified features, steps or integers are included. The terms are not to be interpreted to exclude the presence of other features, steps or components.
[0082] The features disclosed in the foregoing description, or the following claims, or the accompanying drawings, expressed in their specific forms or in terms of a means for performing the disclosed function, or a method or process for attaining the disclosed result, as appropriate, may, separately, or in any combination of such features, be utilised for realising the invention in diverse forms thereof.
[0083] Although certain example embodiments of the invention have been described, the scope of the appended claims is not intended to be limited solely to these embodiments. The claims are to be construed literally, purposively, and/or to encompass equivalents.