Regularised Training of Neural Networks

20230237323 · 2023-07-27

Inventors

Jens Eric Markus Mehnert (Malmsheim, DE)

Abstract

Training an artificial neural network, ANN, which translates one or more input variables into one or more output variables, using learning data sets including learning input variable values having measurement data, and associated learning output variable values, by: mapping learning input variable values from a learning data set onto output variable values using the ANN; processing deviations of the output variable values from the respective learning output variable values using a cost function to form a measure of the error of the ANN when processing the learning input variable values; determining from the error, by backpropagation, changes in parameters, the execution of which, when learning input variable values are further processed by the ANN, improve the evaluation of the obtained output variable values by the cost function, and applying said changes to the ANN; wherein a subset of the output variable values is excluded from consideration in the backpropagation.

Claims

1. A method for training an artificial neural network (ANN), which translates one or more input variables into one or more output variables using learning data sets comprising learning input variable values having measurement data and associated learning output variable values, wherein the measurement data were obtained using a physical measuring operation, and/or using a partial or complete simulation of such a measuring operation, and/or using a partial or complete simulation of a technical system capable of being monitored by such a measuring operation, wherein the behavior of the ANN is characterized by parameters, comprising: mapping learning input variable values from at least one learning data set onto output variable values using the ANN; processing deviations of the output variable values from the respective learning output variable values in accordance with a cost function to form a measure of the error of the ANN when processing the learning input variable values; determining from the error, by backpropagation, changes in the parameters, the execution of which, when learning input variable values are further processed by the ANN, improve the evaluation of the thus obtained output variable values by the cost function; and applying said changes to the ANN, wherein a subset of the output variable values is excluded at least from consideration in the backpropagation.

2. The method according to claim 1, wherein, for at least one learning data set, a portion of at least 40% and at most 60% of the output variable values generated from the learning input variable values of said learning data set is excluded from consideration in the backpropagation.

3. The method according to claim 1, wherein the input variables are pixel values assigned to the pixels of an image arranged in a two-dimensional grid.

4. The method according to claim 3, wherein: output variable values obtained by processing at least one learning data set that are excluded from consideration in the backpropagation correspond to square blocks in the grid of pixels; and the output variables in each case assign a semantic meaning to the pixels.

5. The method according to claim 4, wherein the square blocks have an edge length of between 16 and 256 pixels.

6. The method according to claim 1, wherein the output variables are probabilities and/or confidences with which an ANN used as a classifier assigns the input variables to one or more classes of a predetermined classification.

7. The method according to claim 1, wherein a frequency distribution of the output variable values that are excluded from consideration in the backpropagation over the learning output variable values that these output variable values in each case aim for corresponds to a frequency distribution of the different learning output variable values in the learning data sets used.

8. The method according to claim 1, wherein output variable values that are excluded from consideration in the backpropagation are also excluded from the evaluation by the cost function.

9. The method according to claim 1, wherein: the output variable values are ordered according to their deviations from the respective learning output variable values; and only a fixed portion of the output variable values having the greatest deviations is included in the backpropagation.

10. The method according to claim 1, wherein: the output variable values are ordered according to their uncertainties; and only a fixed portion of the output variable values having the greatest uncertainties is included in the backpropagation.

11. The method according to claim 1, wherein, during training, neurons and/or other processing units of the ANN, and/or connections between such neurons and/or other processing units, are randomly temporarily deactivated in accordance with a predetermined distribution.

12. The method according to claim 1, wherein, in the process of changing the parameters, the learning rate is reduced in proportion to the increase in the portion of output variable values excluded from the backpropagation.

13. A method comprising the steps of: training an artificial neural network (ANN), using the method according to claim 1; operating the ANN by supplying input variables thereto and mapping said input variables onto output variables, wherein the input variables comprise measurement data obtained using a physical measuring operation, and/or using a partial or complete simulation of such a measuring operation, and/or using a partial or complete simulation of a technical system capable of being monitored by such a measuring operation; forming a control signal from the output variables provided by the ANN; and controlling a vehicle, and/or a system for quality control of products manufactured in series, using the control signal.

14. A computer program containing machine-readable instructions that, when executed on one or more computers, cause the computer or computers to carry out the method according to claim 1.

15. A machine-readable data carrier and/or download product comprising the computer program according to claim 14.

16. A computer equipped with the computer program according to claim 14.

17. A computer equipped with the machine-readable data carrier and/or download product according to claim 15.

18. The method according to claim 1, wherein, for at least one learning data set, a portion of at least 45% and at most 55% of the output variable values generated from the learning input variable values of said learning data set is excluded from consideration in the backpropagation.

Description

EMBODIMENTS

[0043] In the drawings:

[0044] FIG. 1 shows an embodiment of the method 100 for training the ANN 1;

[0045] FIG. 2 shows examples of parts of a semantic segmentation that can be excluded from the backpropagation;

[0046] FIG. 3 shows an embodiment of the method 200 having a complete chain of action.

[0047] FIG. 1 is a schematic flow chart of an embodiment of the method 100 for training the ANN 1. In step 110, learning input variable values 11a from at least one learning data set 2 used for the training are mapped onto output variable values 13 by means of the ANN 1. The behavior of the ANN is characterized by parameters 12.

[0048] In step 120, these output variable values 13 are compared with the learning output variable values 13a from the associated learning data set 2. The result of this comparison is processed in accordance with a cost function 14 to form a measure of the error 14a of the ANN 1 during processing of the learning input variable values 11a.

[0049] Based on this error 14a, changes in the parameters 12 are determined in step 130 using backpropagation. These are changes whose execution, when learning input variable values 11a are further processed by the ANN 1, is likely to improve the evaluation of the thus obtained output variable values 13 by the cost function 14. In this case, according to block 131, a subset 13* of the output variable values 13 is excluded from consideration in the backpropagation.
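The exclusion in block 131 can be illustrated with a minimal sketch. It assumes a classifier trained with softmax cross-entropy, which the description does not prescribe; the function name `masked_logit_gradient` and the boolean array `exclude` are hypothetical names chosen for this illustration. The excluded outputs are still computed in the forward pass, but their gradient is set to zero so they contribute nothing to the parameter changes.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def masked_logit_gradient(logits, targets, exclude):
    """Gradient of the mean cross-entropy over the kept outputs w.r.t. the logits.

    logits:  (N, C) raw outputs, one row per output variable value
    targets: (N,) integer labels (the learning output variable values)
    exclude: (N,) boolean, True marks the subset left out of backpropagation
    """
    probs = softmax(logits)
    one_hot = np.eye(logits.shape[1])[targets]
    grad = probs - one_hot              # standard softmax cross-entropy gradient
    grad[exclude] = 0.0                 # assumed reading of block 131: no gradient here
    kept = max(int((~exclude).sum()), 1)
    return grad / kept                  # average over the outputs that remain

rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 3))
targets = np.array([0, 1, 2, 0, 1, 2])
exclude = np.array([True, False, True, False, True, False])
grad = masked_logit_gradient(logits, targets, exclude)
```

Multiplying this gradient into the earlier layers in the usual way propagates the exclusion to every parameter, since the zeroed rows carry no signal backwards.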

[0050] In this case, according to block 131a, the frequency distribution of the output variable values 13 that are excluded from consideration in the backpropagation over the learning output variable values 13a that these output variable values 13 in each case aim for can in particular correspond to the frequency distribution of the different learning output variable values 13a in the learning data sets 2 used.
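One way to obtain the matching frequency distribution of block 131a is to draw the excluded outputs separately within each class. The sketch below is an assumption about how this could be done, not the method of the application; `stratified_exclusion_mask` and its parameters are hypothetical names.

```python
import numpy as np

def stratified_exclusion_mask(targets, fraction, rng):
    """Exclude roughly `fraction` of the outputs within every class, so the
    excluded subset has about the same class frequencies as the full set."""
    flat = np.asarray(targets).ravel()
    exclude = np.zeros(flat.size, dtype=bool)
    for cls in np.unique(flat):
        idx = np.flatnonzero(flat == cls)          # positions of this class
        n_drop = int(round(fraction * idx.size))   # same share in every class
        dropped = rng.choice(idx, size=n_drop, replace=False)
        exclude[dropped] = True
    return exclude.reshape(np.shape(targets))

rng = np.random.default_rng(0)
targets = np.array([0] * 80 + [1] * 20)            # imbalanced labels, 80:20
exclude = stratified_exclusion_mask(targets, 0.5, rng)
excluded_per_class = np.bincount(targets[exclude], minlength=2)
```

In this example the excluded subset keeps the 80:20 ratio of the labels, which a uniform draw over all outputs would only match on average.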

[0051] The output variable values 13* excluded from the backpropagation can optionally also already be excluded from the determination of the error 14a, according to block 121.

[0052] According to block 132, for at least one learning data set 2, a portion 13* of at least 40% and at most 60%, preferably of at least 45% and at most 55% and very particularly preferably of 50%, of the output variable values 13 generated from the learning input variable values 11a of said learning data set 2 can be excluded from consideration in the backpropagation.
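For image inputs, claims 4 and 5 describe the excluded outputs as square blocks with an edge length of between 16 and 256 pixels. The following sketch combines that with the 50% portion named here. Tiling the image into a regular, non-overlapping grid of blocks is an assumption made for simplicity, and `square_block_mask` is a hypothetical helper.

```python
import numpy as np

def square_block_mask(height, width, edge, fraction, rng):
    """Boolean mask (True = excluded) built from edge x edge blocks.

    The image is tiled into a regular grid of blocks (an assumption of this
    sketch) and `fraction` of the blocks is chosen at random for exclusion.
    """
    rows = -(-height // edge)                      # ceil division: block rows
    cols = -(-width // edge)                       # ceil division: block columns
    chosen = np.zeros(rows * cols, dtype=bool)
    n_drop = int(round(fraction * chosen.size))
    chosen[rng.choice(chosen.size, size=n_drop, replace=False)] = True
    block_grid = chosen.reshape(rows, cols)
    # Expand each block decision to pixel resolution, then crop to the image.
    mask = np.repeat(np.repeat(block_grid, edge, axis=0), edge, axis=1)
    return mask[:height, :width]

rng = np.random.default_rng(0)
mask = square_block_mask(256, 512, 32, 0.5, rng)
excluded_share = mask.mean()
```

When the image size is a multiple of the edge length, as here, the excluded share equals the requested fraction up to rounding of the block count; otherwise the cropped border blocks shift it slightly.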

[0053] According to block 133a, output variable values 13 can be ordered according to their deviations from the respective learning output variable values 13a. According to block 133b, only a defined portion of the output variable values 13 having the greatest deviations can then be included in the backpropagation.
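Blocks 133a and 133b amount to ranking the outputs by deviation and keeping only the top portion. A minimal sketch, assuming the deviations are already available as one non-negative number per output (for example a per-pixel loss); `keep_largest` and `keep_fraction` are hypothetical names.

```python
import numpy as np

def keep_largest(scores, keep_fraction):
    """Boolean mask selecting the `keep_fraction` of entries with the largest scores."""
    flat = np.asarray(scores).ravel()
    n_keep = max(int(round(keep_fraction * flat.size)), 1)
    order = np.argsort(flat)[::-1]                 # indices, largest score first
    keep = np.zeros(flat.size, dtype=bool)
    keep[order[:n_keep]] = True
    return keep.reshape(np.shape(scores))

deviations = np.array([0.1, 2.0, 0.5, 1.5, 0.05, 0.9])
keep = keep_largest(deviations, 0.5)               # True = included in backpropagation
```

The complement `~keep` then plays the role of the excluded subset 13*.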

[0054] According to block 134a, output variable values 13 can be ordered according to their uncertainties. According to block 134b, only a defined portion of the output variable values 13 having the greatest uncertainties can then be included in the backpropagation.
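The description does not state how the uncertainties are measured. The sketch below assumes the entropy of the predicted class probabilities as the uncertainty measure, which is one common choice among several; `predictive_entropy` and `most_uncertain` are hypothetical names.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    # Entropy of each row of class probabilities; higher means less certain.
    return -(probs * np.log(probs + eps)).sum(axis=-1)

def most_uncertain(probs, keep_fraction):
    """Boolean mask over rows of `probs`, True for the most uncertain rows."""
    entropy = predictive_entropy(probs)
    n_keep = max(int(round(keep_fraction * entropy.size)), 1)
    keep = np.zeros(entropy.size, dtype=bool)
    keep[np.argsort(entropy)[-n_keep:]] = True     # largest entropies
    return keep

probs = np.array([[0.98, 0.01, 0.01],
                  [0.34, 0.33, 0.33],
                  [0.70, 0.20, 0.10],
                  [0.50, 0.50, 0.00]])
keep = most_uncertain(probs, 0.5)                  # True = included in backpropagation
```

Here the near-uniform second row and the spread-out third row are kept, while the confident first row and the two-way tie in the fourth row are excluded.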

[0055] In step 140, the changed parameters 12 are applied to the ANN 1. Subsequently, learning input variable values 11a can be fed back to the ANN in step 110, such that a check can be carried out in a feedback loop to determine whether the success sought with the change to the parameters 12 has occurred with the error 14a. If any abort criterion is reached, the training can be ended and the state 12* of the parameters 12 that is then achieved can be output as the final state.
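Steps 110 to 140 and the feedback loop can be sketched end to end on a toy problem. Everything specific in this sketch is an assumption for illustration: a linear softmax classifier stands in for the ANN 1, the excluded subset is drawn uniformly at random in each step, and the abort criterion is a small change in the error between two iterations. The function `train` and its parameters are hypothetical.

```python
import numpy as np

def train(inputs, targets, exclude_fraction=0.5, lr=0.1, max_steps=200,
          tol=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    n, d = inputs.shape
    n_classes = int(targets.max()) + 1
    weights = np.zeros((d, n_classes))             # parameters of the model
    one_hot = np.eye(n_classes)[targets]
    history = []
    for _ in range(max_steps):
        # Step 110: map learning inputs onto output values.
        logits = inputs @ weights
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Step 120: error over all outputs (cross-entropy, an assumed cost function).
        error = -np.log(probs[np.arange(n), targets] + 1e-12).mean()
        history.append(error)
        # Assumed abort criterion: the error has stopped changing.
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break
        # Step 130 with block 131: zero the gradient of the excluded outputs.
        exclude = rng.random(n) < exclude_fraction
        grad_logits = probs - one_hot
        grad_logits[exclude] = 0.0
        kept = max(int((~exclude).sum()), 1)
        grad_w = inputs.T @ grad_logits / kept
        # Step 140: apply the changes to the parameters.
        weights -= lr * grad_w
    return weights, history

rng = np.random.default_rng(1)
features = np.vstack([rng.normal(-2.0, 1.0, (100, 2)),
                      rng.normal(2.0, 1.0, (100, 2))])
inputs = np.hstack([features, np.ones((200, 1))])  # append a bias column
targets = np.repeat([0, 1], 100)
weights, history = train(inputs, targets)
accuracy = ((inputs @ weights).argmax(axis=1) == targets).mean()
```

The returned `weights` correspond to the final state 12* of the parameters, and `history` records the error 14a seen in each pass through the loop.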

[0056] Optionally, according to block 141, during training, neurons and/or other processing units of the ANN 1, and/or connections between such neurons and/or other processing units, can be randomly temporarily deactivated in accordance with a predetermined distribution.
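The random temporary deactivation of block 141 corresponds to what is commonly called dropout. A minimal sketch, assuming a Bernoulli distribution as the predetermined distribution and the usual "inverted" rescaling of the surviving units; `dropout` and `drop_prob` are hypothetical names.

```python
import numpy as np

def dropout(activations, drop_prob, rng):
    """Zero each unit with probability `drop_prob` and rescale the survivors
    so that the expected activation is unchanged (inverted dropout)."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must lie in [0, 1)")
    keep = rng.random(activations.shape) >= drop_prob   # Bernoulli keep mask
    return activations * keep / (1.0 - drop_prob), keep

rng = np.random.default_rng(0)
activations = np.ones(1000)
dropped, keep = dropout(activations, 0.5, rng)
```

This deactivates units inside the network, whereas block 131 removes outputs from the backpropagation; the two can be combined in the same training run.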

[0057] Optionally, according to block 142, in the process of changing the parameters 12, the learning rate can be reduced in proportion to the increase in the portion of output variable values excluded from the backpropagation.
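The description does not give a formula for this reduction. One possible reading, shown here purely as an assumption, is a linear scaling in which the learning rate shrinks by the same share as the portion of excluded outputs; `scaled_learning_rate` is a hypothetical helper.

```python
def scaled_learning_rate(base_lr, excluded_fraction):
    """Linear scaling (an assumed interpretation): excluding a share f of the
    outputs lowers the learning rate from base_lr to base_lr * (1 - f)."""
    if not 0.0 <= excluded_fraction < 1.0:
        raise ValueError("excluded_fraction must lie in [0, 1)")
    return base_lr * (1.0 - excluded_fraction)

lr_none = scaled_learning_rate(0.01, 0.0)   # nothing excluded, rate unchanged
lr_half = scaled_learning_rate(0.01, 0.5)   # half excluded, rate halved
```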

[0058] FIG. 2 shows an example of semantic segmentation of a learning image showing a traffic situation. In this traffic situation, a vehicle 50 waits at a “yield” sign 55. The semantic segmentation summarizes the output variable values 13 that the ANN 1 generated from the learning image.

[0059] The vehicle 50 has mirrors 51 and wheels 52. The mirrors 51 and wheels 52 can expediently be defined as the subset 13* of the output variable values 13 that is excluded from the backpropagation 130. As explained above, the mirrors 51 and the wheels 52 serve more to identify a particular vehicle type than to recognize vehicles in general.

[0060] The road sign 55 has a pole 55a that carries a sign 55b. The pole 55a can in turn expediently be defined as the subset 13* of the output variable values 13 that is excluded from the backpropagation 130.

[0061] All traffic signs have such a pole 55a, and it therefore does not contribute to the important recognition of precisely which traffic sign is present.

[0062] FIG. 3 is a schematic flow chart of an embodiment of the method 200 with the complete chain of action up to the control of technical systems 50, 60.

[0063] In step 210, an ANN 1 is trained using the previously described method 100. In step 220, the ANN 1 is operated by supplying input variables 11 thereto and mapping said input variables onto output variables 13. In step 230, a control signal 230a is formed from said output variables 13. In step 240, said control signal 230a is used to control a vehicle 50, and/or a system 60 for quality control of products manufactured in series.

Cpc classification

G06N3/082 (PHYSICS)
G06N3/084 (PHYSICS)
G06N3/08 (PHYSICS)

International classification

G06N3/08 (PHYSICS)