METHOD AND MACHINE READABLE STORAGE MEDIUM OF CLASSIFYING A NEAR SUN SKY IMAGE
20210166065 · 2021-06-03
Inventors
- Ti-chiun Chang (Princeton Junction, NJ)
- Patrick Reeb (Adelsdorf, DE)
- Andrei Szabo (Ottobrunn, DE)
- Joachim Bamberger (Stockdorf, DE)
CPC classification
G06N3/082
PHYSICS
International classification
Abstract
A method of classifying a near sun sky image includes at least one of the following steps: providing a recurrent neural network in the form of a long short-term memory cell, the memory cell having at least an input gate, a neuron with a self-recurrent connection, a forget gate, and an output gate; and providing a convolutional neural network, which includes, in the cited order, at least an input layer, one or more convolutional layers, an average pooling layer, and an output layer.
Claims
1-11. (canceled)
12. A method of classifying a near sun sky image, the method comprising at least one of the steps of: providing a recurrent neural network in the structure of a gated recurrent unit or a long short-term memory cell, which memory cell comprises at least an input gate, a neuron with a self-recurrent connection, a forget gate, and an output gate; and providing a convolutional neural network, which network comprises, in this order, at least an input layer, one or more convolutional layers, an average pooling layer, and an output layer; and using at least one of the recurrent neural network or the convolutional neural network to classify a near sun sky image.
13. The method according to claim 12, which comprises using the recurrent neural network to classify a near sun sky image and thereby: inputting a sequence of images of the sky near the sun into the input gate of the memory cell; processing the sequence of images in the neuron; and outputting a classification of the sequence of images of the sky near the sun from the output gate.
14. The method according to claim 12, which comprises using the convolutional neural network to classify a near sun sky image and thereby: inputting an image of the sky near the sun into the input layer of the convolutional neural network; processing the image in the convolutional neural network; and outputting a classification of the image of the sky near the sun from the output layer.
15. The method according to claim 12, which comprises using the recurrent neural network and the convolutional neural network to classify a near sun sky image and thereby: inputting a sequence of an output from the output layer of the convolutional neural network into the input gate of the recurrent neural network.
16. The method according to claim 12, which comprises using the recurrent neural network and the convolutional neural network and thereby: with the recurrent neural network: inputting a sequence of images of the sky near the sun into the input gate of the memory cell; processing the sequence of images in the neuron; and outputting a classification of the sequence of images of the sky near the sun from the output gate; with the convolutional neural network: inputting an image of the sky near the sun into the input layer of the convolutional neural network; processing the image in the convolutional neural network; and outputting a classification of the image of the sky near the sun from the output layer; and inputting a sequence of an output from the output layer of the convolutional neural network into the input gate of the recurrent neural network.
17. The method according to claim 12, wherein the convolutional neural network further comprises, between the average pooling layer and the output layer, at least one of a dropout layer, a flatten layer, and a dense layer.
18. A machine-readable storage medium containing non-transitory stored program code which, when executed on a computer, causes the computer to perform a near sun sky image classification by accessing at least one of: a recurrent neural network having a structure of a gated recurrent unit or a long short-term memory cell, the memory cell having at least an input gate, a neuron with a self-recurrent connection, a forget gate, and an output gate; and a convolutional neural network, which includes, in the following order: at least an input layer, one or more convolutional layers, an average pooling layer, and an output layer.
19. The machine-readable storage medium according to claim 18, wherein the computer is prompted to perform the near sun sky image classification by accessing the recurrent neural network, and the stored program code, when executed on the computer, causes the computer to: input a sequence of images of the sky near the sun into the input gate of the memory cell; process the sequence of images in the neuron; and output a classification of the sequence of images of the sky near the sun from the output gate.
20. The machine-readable storage medium according to claim 18, wherein the computer is prompted to perform the near sun sky image classification by accessing the convolutional neural network, and the stored program code, when executed on the computer, causes the computer to: input an image of the sky near the sun into the input layer of the convolutional neural network; process the image in the convolutional neural network; and output a classification of the image of the sky near the sun from the output layer.
21. The machine-readable storage medium according to claim 18, wherein the computer is prompted to perform the near sun sky image classification by accessing both the convolutional neural network and the recurrent neural network, and the stored program code, when executed on the computer, causes the computer to input a sequence of an output from the output layer of the convolutional neural network into the input gate of the recurrent neural network.
22. The machine-readable storage medium according to claim 18, wherein the computer is prompted to perform the near sun sky image classification by accessing both the convolutional neural network and the recurrent neural network, and: upon accessing the recurrent neural network, the stored program code causes the computer to: input a sequence of images of the sky near the sun into the input gate of the memory cell; process the sequence of images in the neuron; and output a classification of the sequence of images of the sky near the sun from the output gate; upon accessing the convolutional neural network, the stored program code causes the computer to: input an image of the sky near the sun into the input layer of the convolutional neural network; process the image in the convolutional neural network; and output a classification of the image of the sky near the sun from the output layer; and the stored program code further causes the computer to input a sequence of an output from the output layer of the convolutional neural network into the input gate of the recurrent neural network.
23. The machine-readable storage medium according to claim 18, wherein the convolutional neural network further comprises, between the average pooling layer and the output layer, at least one of a dropout layer, a flatten layer, and a dense layer.
24. An electric power system, comprising: a power grid; a photovoltaic power plant, which is electrically connected to said power grid for supplying electric power to said power grid; at least one further power plant electrically connected to said power grid, for supplying electric power to said power grid, and/or at least one electric consumer connected to said power grid, for receiving electric power from said power grid; a control device for controlling an electric power flow between said at least one further power plant and said power grid and/or between said power grid and said at least one electric consumer; a prediction device configured to classify near sun sky images and to generate a prediction signal therefrom; said prediction device being communicatively connected to said control device; and said control device being configured to control, based on the prediction signal, a future electric power flow.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0025] The aspects defined above and further aspects of the present invention are apparent from the examples of embodiment described hereinafter and are explained with reference to these examples. The invention will be described in more detail hereinafter with reference to examples of embodiment, to which, however, the invention is not limited.
[0026]
[0027]
[0028]
DETAILED DESCRIPTION
[0029] The illustrations in the drawings are schematic. It is noted that, in different figures, similar or identical elements are provided with the same reference signs.
[0030]
[0031] The convolutional neural network 1 is used for digit recognition. It is assumed that the convolutional neural network 1 has been suitably trained beforehand. The training of convolutional neural networks is sufficiently known in the state of the art and need not be described further. An example of a conventional CNN is known from Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition", Proceedings of the IEEE, November 1998.
[0032] The convolutional neural network 1 comprises, in this order, an input layer 2, two convolutional layers 3, 4, an average pooling layer 5, a dropout layer 6, a flatten layer 7, a dense layer 8, a dropout layer 9, a dense layer 10, and an output layer (not shown).
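As a non-limiting sketch, the layer order of paragraph [0032] can be traced in plain Python by propagating feature-map shapes through the network; the input patch size, kernel sizes, and single-channel assumption are illustrative and not taken from this disclosure:

```python
def conv2d_shape(h, w, k):
    """Output height/width of a 'valid' convolution with a k x k kernel."""
    return h - k + 1, w - k + 1

def pool2d_shape(h, w, s):
    """Output height/width of non-overlapping s x s pooling."""
    return h // s, w // s

h, w = 32, 32                  # assumed input patch size (input layer 2)
h, w = conv2d_shape(h, w, 3)   # convolutional layer 3 (assumed 3x3 kernel)
h, w = conv2d_shape(h, w, 3)   # convolutional layer 4 (assumed 3x3 kernel)
h, w = pool2d_shape(h, w, 2)   # average pooling layer 5 (assumed 2x2 window)
flat = h * w                   # flatten layer 7 (single channel assumed)
print(h, w, flat)              # 14 14 196
```

The dropout, dense, and output layers that follow do not change the spatial shape further; the flattened vector feeds the perceptron layers 6 to 10.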
[0033] The convolutional layers 3, 4 comprise learnable filters which have a small receptive field but extend through the full depth of the input volume.
[0034] In the dropout layers 6, 9, a regularization is performed during the network training with the aim of reducing the network's complexity in order to prevent overfitting. For example, certain units (neurons) in a layer can be randomly deactivated (dropped) with a certain probability p, for example drawn from a Bernoulli distribution (typically 50% of the activations in a given layer are set to zero, while the remaining ones are scaled up by a factor of 2). If half of the activations of a layer are set to zero, the neural network cannot rely on particular activations in a given feed-forward pass during training. Consequently, the neural network will learn different, redundant representations. At the end of the training, those units which do not provide a substantial benefit are permanently dropped from the network. Finally, once the training has finished, the complete network is usually tested with the dropout probability set to 0. Advantageously, the dropout layers 6, 9 make training faster.
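The dropout scheme described above (drop with probability p, scale the survivors by a factor of 2 for p = 0.5) can be sketched in plain Python; the inverted-dropout formulation shown here is one common way to realize it, and the input activations are illustrative toy values:

```python
import random

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: zero each unit with probability p during training
    and scale the survivors by 1/(1-p); at test time the activations pass
    through unchanged (equivalent to a dropout probability of 0)."""
    if not training:
        return list(activations)
    out = []
    for a in activations:
        if random.random() < p:
            out.append(0.0)               # unit dropped for this pass
        else:
            out.append(a / (1.0 - p))     # survivor scaled up (factor 2 for p=0.5)
    return out

random.seed(0)
print(dropout([1.0, 2.0, 3.0, 4.0]))      # [2.0, 4.0, 0.0, 0.0]
```

Because different units drop out in every feed-forward pass, the network cannot rely on any particular activation, which is exactly the redundancy effect described in paragraph [0034].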
[0035] The dense layers 8, 10 and the flatten layer 7 are classifiers. In contrast to the dropout layers 6, 9, the dense layer 8 is simply a layer in which each unit or neuron is connected to each neuron in the next layer. Like every classifier, the dense layer 8 needs individual features, i.e. a feature vector. For this purpose, the multidimensional output must be converted into a one-dimensional vector, which is done by the flatten layer 7.
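A minimal sketch of the flatten and dense operations described above, in plain Python; the feature-map contents, weights, and layer sizes are illustrative assumptions:

```python
def flatten(feature_maps):
    """Flatten layer: convert a multidimensional output (here a list of
    2-D feature maps) into a one-dimensional feature vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]

def dense(x, weights, biases):
    """Dense (fully connected) layer: every input unit feeds every output
    unit through a weighted sum plus a bias."""
    return [sum(w * xi for w, xi in zip(ws, x)) + b
            for ws, b in zip(weights, biases)]

# Toy numbers: one 2x2 feature map flattened to 4 features,
# then a dense layer with 2 output units.
x = flatten([[[1.0, 2.0], [3.0, 4.0]]])           # -> [1.0, 2.0, 3.0, 4.0]
w = [[0.25, 0.25, 0.25, 0.25],                    # output unit 1 weights
     [1.0, 0.0, 0.0, 0.0]]                        # output unit 2 weights
b = [0.0, 0.5]
print(dense(x, w, b))                             # -> [2.5, 1.5]
```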
[0036] A particularity of the first embodiment is the use of the average pooling layer 5 instead of a maximum pooling layer. Maximum pooling is by far the most widespread method, whereby, from a submatrix of neurons of the convolutional layer, only the activity of the most active (hence "max") neuron is retained for further calculation steps, while the activity of the remaining neurons is discarded.
[0037] In contrast thereto, the first embodiment of the present invention uses the average pooling layer 5, whereby the average activity in a submatrix of neurons of the convolutional layer is retained for further calculation steps. The inventors of the present patent application found that, since image patches including the near sun area do not show as strong a contrast as, for example, the digit patches of other images, the average pooling layer 5 is preferred: it is able to capture more subtle and likely smooth contrast.
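The difference between the two pooling schemes can be sketched for a single pooling window; the window values are illustrative toy numbers, not data from this disclosure:

```python
def pool(patch, reduce):
    """Apply a pooling reduction to one submatrix (pooling window)."""
    values = [v for row in patch for v in row]
    return reduce(values)

def max_pool(patch):
    """Maximum pooling: keep only the most active neuron in the window."""
    return pool(patch, max)

def avg_pool(patch):
    """Average pooling: keep the mean activity of the window, which
    preserves subtle, smooth contrast better than the maximum alone."""
    return pool(patch, lambda v: sum(v) / len(v))

# One 2x2 window of neuron activities:
window = [[0.5, 0.75], [0.25, 0.5]]
print(max_pool(window))   # 0.75 -- only the peak survives
print(avg_pool(window))   # 0.5  -- all four activities contribute
```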
[0038] In a nutshell, the CNN 1 functions as an automatic filter design based on convolution operations (thus the name convolutional neural network) in the first layers 3, 4 following the input layer 2, followed by layers of perceptrons 6 to 10, the last of which outputs class probabilities. Examples of conventional perceptrons can be found in F. Rosenblatt, "The Perceptron: a perceiving and recognizing automaton", Report 85-460-1, Cornell Aeronautical Laboratory, 1957.
[0039]
[0040] Instead of a single neural function, the LSTM contains four modules that interact with each other in a particular way: the input gate, the output gate, the forget gate, and an inner cell in the form of a neuron. In short, the input gate controls the extent to which a new value flows into the cell, the forget gate controls the extent to which a value remains in the cell or is forgotten, and the output gate controls the extent to which the value in the cell is used for the calculation in the next module. These network elements are connected via sigmoid neural functions and various vector and matrix operations. The associated equations for each gate and the way this network works are known in the state of the art, so that no detailed description is needed here. The equations for each gate and why this network can be powerful are also explained in S. Hochreiter and J. Schmidhuber (1997), "Long short-term memory", Neural Computation 9 (8): 1735-1780, doi:10.1162/neco.1997.9.8.1735.
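A minimal, single-unit sketch of one LSTM time step following the standard Hochreiter and Schmidhuber formulation; the scalar weights and the input sequence are illustrative assumptions, not values from this disclosure:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM cell (scalar weights for
    clarity; w holds per-gate input weights, recurrent weights, and
    biases). The three sigmoid gates act as described above:
      i -- input gate:  how much of the new candidate enters the cell
      f -- forget gate: how much of the old cell state is kept
      o -- output gate: how much of the cell state is emitted
    """
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])
    c = f * c_prev + i * g          # self-recurrent cell state update
    h = o * math.tanh(c)            # gated output
    return h, c

# Feed a short sequence through the cell (all weights 1, all biases 0):
w = {k: 1.0 for k in ("wi", "ui", "wf", "uf", "wo", "uo", "wg", "ug")}
w.update(bi=0.0, bf=0.0, bo=0.0, bg=0.0)
h, c = 0.0, 0.0
for x in (1.0, 0.5, -0.5):
    h, c = lstm_step(x, h, c, w)
print(round(h, 3), round(c, 3))
```

In a real classifier, x, h, and c are vectors and the weights are matrices; the gating logic is identical.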
[0041] The memory cell 11 can forget its state or not at each time step. For example, if a cloud's development is analyzed and it is determined that this development is not relevant for whatever reason, the memory cell 11 can be set to zero before the net ingests the first element of the next analysis.
[0042] The inventors found that such an LSTM offers the capability to capture dynamic features such as motion. Advantageously, additional features can be extracted from the image dynamics to provide better classification accuracy. The inputs to the LSTM are a sequence of images, and the outputs are a (delayed) sequence of class probabilities at the corresponding time instants.
[0043] Advantageously, a convenient annotation mechanism is realized to perform supervised training based on training that is robust to noisy labels. This avoids time-consuming human labor while still achieving high classification accuracy. Examples of such robust training are given in D. Rolnick, A. Veit, S. Belongie, N. Shavit, "Deep Learning is Robust to Massive Label Noise", https://arxiv.org/abs/1705.10694; D. Flatow and D. Penner, "On the Robustness of ConvNets to Training on Noisy Labels", http://cs231n.stanford.edu/reports/flatow_penner_report.pdf, 2017; and A. Vahdat, "Toward Robustness against Label Noise in Training Deep Discriminative Neural Networks", https://arxiv.org/abs/1706.00038.
[0044] The irradiance measurements can be made by a pyranometer, for example.
[0045] If, in a period of time of, for example, 30 minutes, the irradiance follows the predicted clear sky index, then there is a good chance of clear sky in the middle of the 30 minutes, i.e., at the 15th minute if a time counter is started from 0 each time. This is because there is a time correspondence between the image patches and the measured irradiance. However, this alone might not guarantee the condition, because clouds can move through and near the sun without covering it, thus resulting in no irradiance drop.
[0046] Therefore, the cloud segmentation algorithms of the present invention can be used as a supplementary criterion. A high threshold can be set to make sure that there is no identified cloud in the image patch before labeling it as "clear" (vs. cloudy). Combining the two criteria, nearly correct annotations can be achieved among all the labeled data. Optionally, schemes can be adopted to deal with noisy labels and thus improve the training accuracy. Examples of such schemes are known from S. E. Reed, H. Lee, D. Anguelov, C. Szegedy, D. Erhan, and A. Rabinovich, "Training Deep Neural Networks on Noisy Labels with Bootstrapping", workshop contribution at ICLR 2015; and A. J. Bekker and J. Goldberger, "Training deep neural-networks based on unreliable labels", Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on, 20-25 Mar. 2016, Shanghai, China.
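The two annotation criteria described in paragraphs [0045] and [0046] can be sketched as a simple labeling function; the tolerance, cloud threshold, and irradiance values are illustrative assumptions, not values from this disclosure:

```python
def label_clear(irradiance, clear_sky_index, cloud_fraction,
                tolerance=0.05, cloud_threshold=0.01):
    """Combine the two annotation criteria: (1) the measured irradiance
    must follow the predicted clear-sky curve over the whole window, and
    (2) the cloud segmentation must find (almost) no cloud pixels in the
    image patch. Returns the label for the midpoint of the window."""
    follows_clear_sky = all(
        abs(measured - predicted) <= tolerance * predicted
        for measured, predicted in zip(irradiance, clear_sky_index)
    )
    # High bar: clouds near the sun without an irradiance drop still
    # disqualify the "clear" label.
    no_cloud = cloud_fraction <= cloud_threshold
    return "clear" if follows_clear_sky and no_cloud else "cloudy"

# A 30-minute window sampled once per minute; the midpoint (15th minute)
# receives the resulting label.
predicted = [800.0] * 30               # predicted clear-sky irradiance (W/m^2)
measured = [798.0] * 30                # tracks the prediction closely
print(label_clear(measured, predicted, cloud_fraction=0.0))   # clear
print(label_clear(measured, predicted, cloud_fraction=0.2))   # cloudy
```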
[0048] The term "image near the sun" as used herein covers an image including image parts whose characteristics (for example brightness, contrast, color, etc.) are affected by the sun. The term particularly covers images in which the sun is included.
[0049] The CNN and the LSTM are usually realized in a computer-implemented manner, i.e. their layers and/or modules are implemented in software and stored in a machine readable storage medium. Training of the CNN and the LSTM is likewise performed in a computer-implemented manner. It is to be noted that the CNN and the LSTM need not be physically implemented, for example by means of mechanical or structural devices.
[0050] The invention may be realized by means of a computer program, i.e. software. However, the invention may also be realized by means of one or more specific electronic circuits, i.e. hardware. Furthermore, the invention may also be realized in a hybrid form, i.e. in a combination of software modules and hardware modules.
[0051] The invention described in this document may also be realized in connection with a “CLOUD” network which provides the necessary virtual memory spaces and the necessary virtual computational power.
[0052] Comparing the CNN and the LSTM: the CNN is a type of feed-forward artificial neural network with variations of multilayer perceptrons designed to use minimal amounts of preprocessing. The CNN uses a connectivity pattern between its neurons. The CNN usually does not have a memory.
[0053] The LSTM does not follow the strict feed-forward structure: it has an internal memory with which it can process arbitrary sequences of inputs and remember previously learned features, and it can handle arbitrary input/output lengths. The LSTM uses recurrent time-series information, i.e. an output will impact the next input.
[0054]
[0055] The prediction device comprises a camera 24 for capturing near sun sky images. The near sun sky images are forwarded to the data processor, which processes the corresponding image data in the manner described above.
[0056] The prediction device further comprises a machine readable storage medium which contains stored program code that, when executed on a computer, causes the computer to perform the near sun sky image classification according to the present invention as described above. The prediction device is communicatively connected to the control device 25, and the control device 25 is configured to control, based on the prediction signal, the future electric power flow.
[0057] The described electric power system is based on the idea that, with a valid and precise prediction of the intensity of sun radiation that can be captured by the photovoltaic power plant in the (near) future, the power which can be supplied from the photovoltaic power plant to the power grid can be predicted in a precise and reliable manner. This makes it possible to control the operation of the at least one further power plant and/or of the at least one electric consumer in such a manner that the power flow to and from the power grid is at least approximately balanced. Hence, the stability of the power grid and, as a consequence, the stability of the entire electric power system can be increased.
[0058] It should be noted that the term "comprising" does not exclude other elements or steps and that "a" or "an" does not exclude a plurality. Also, elements described in association with different embodiments may be combined. It should further be noted that reference signs in the claims should not be construed as limiting the scope of the claims.
LIST OF REFERENCE SIGNS
[0059] 1 convolutional neural network (CNN)
[0060] 2 input layer
[0061] 3 convolutional layer
[0062] 4 convolutional layer
[0063] 5 average pooling layer
[0064] 6 dropout layer
[0065] 7 flatten layer
[0066] 8 dense layer
[0067] 9 dropout layer
[0068] 10 dense layer
[0069] 11 long short-term memory cell (LSTM)
[0070] 12 input gate
[0071] 13 neuron
[0072] 14 self-recurrent connection
[0073] 15 forget gate
[0074] 16 output gate
[0075] 20 power grid
[0076] 21 photovoltaic plant
[0077] 22 conventional power plant
[0078] 23 hydroelectric power plant
[0079] 24 camera
[0080] 25 control unit
[0081] 26 factory
[0082] 27 house