TRAINING OF MACHINE LEARNING SYSTEMS FOR IMAGE PROCESSING

20220230416 · 2022-07-21

    Abstract

    A computer-implemented method for training a machine learning system, including: initializing parameters of the machine learning system and a metaparameter; repeatedly carrying out the following as a loop: providing a batch of training data points and manipulating the provided training data points, a training method for optimizing the parameters of the machine learning system, or a structure of the machine learning system based on the metaparameter; ascertaining a cost function as a function of instantaneous parameters of the machine learning system and of the instantaneous metaparameter; adapting the instantaneous parameters as a function of an ascertained first gradient, which has been ascertained with respect to the instantaneous parameters via the ascertained cost function for the training data points; and adapting the metaparameter as a function of a second gradient, which has been ascertained with respect to the metaparameter used in the preceding step via the ascertained cost function.

    Claims

    1-11. (canceled)

    12. A computer-implemented method for training a machine learning system for image processing, comprising the following steps: initializing parameters of the machine learning system and a metaparameter; repeatedly carrying out the following steps as a loop with a predefined number of iterations or until a convergence criterion with respect to a training progress is met: providing a batch of training data points; manipulating the provided training data points or a training method for optimizing the parameters of the machine learning system or a structure of the machine learning system based on the metaparameter; ascertaining a cost function as a function of instantaneous parameters of the machine learning system and of an instantaneous metaparameter; adapting the instantaneous parameters as a function of a first gradient, which has been ascertained with respect to the instantaneous parameters via the ascertained cost function for the training data points; and when the step of manipulating has been carried out more than once, adapting the metaparameter as a function of a second gradient, which has been ascertained via the ascertained cost function with respect to the metaparameter used in a preceding step.

    13. The method as recited in claim 12, wherein the second gradient with respect to the metaparameter used in the preceding step is ascertained via the ascertained cost function as a function of a scalar product between the first gradient with respect to the instantaneous parameter and the second gradient with respect to a preceding metaparameter via the ascertained cost function of an immediately preceding loop cycle.

    14. The method as recited in claim 12, wherein the second gradient with respect to the metaparameter used in the preceding step is ascertained via the cost function of the training data points of the selected batch for a current step as a function of a scalar product between the first gradient with respect to the parameter of the machine learning system used in the preceding step for a first training data point and a third gradient with respect to the instantaneous parameter of the machine learning system of an averaged sum, the scalar product serving as weighting of the second gradient.

    15. The method as recited in claim 14, wherein the second gradient is ascertained as a function of a sum via a plurality of the training data points of an instantaneous batch of the training points over the scalar product for a plurality of the training data points of the instantaneous batch multiplied in each case by a gradient with respect to the instantaneous metaparameter via a logarithm of a distribution used for the respective first training data point for the scalar product, the distribution characterizing a probability distribution of optimal values of the metaparameter.

    16. The method as recited in claim 12, wherein the first gradient with respect to the instantaneous parameters of the machine learning system is ascertained using a gradient descent method, the gradient descent method being a stochastic gradient descent method.

    17. The method as recited in claim 12, wherein the manipulation of the training data is carried out as a function of the instantaneous metaparameter only for each m-th loop cycle, m being a predefined number.

    18. The method as recited in claim 17, wherein m≥2.

    19. The method as recited in claim 12, wherein the metaparameter characterizes a training data point augmentation and, during manipulation of the training data points, the training data points are augmented based on the metaparameter.

    20. The method as recited in claim 12, wherein the training data points are images, and the machine learning system is trained as an image classifier.

    21. A device configured to train a machine learning system for image processing, the device configured to: initialize parameters of the machine learning system and a metaparameter; repeatedly carry out the following steps as a loop with a predefined number of iterations or until a convergence criterion with respect to a training progress is met: providing a batch of training data points; manipulating the provided training data points or a training method for optimizing the parameters of the machine learning system or a structure of the machine learning system based on the metaparameter; ascertaining a cost function as a function of instantaneous parameters of the machine learning system and of an instantaneous metaparameter; adapting the instantaneous parameters as a function of a first gradient, which has been ascertained with respect to the instantaneous parameters via the ascertained cost function for the training data points; and when the step of manipulating has been carried out more than once, adapting the metaparameter as a function of a second gradient, which has been ascertained via the ascertained cost function with respect to the metaparameter used in a preceding step.

    22. A non-transitory machine-readable memory medium on which is stored a computer program for training a machine learning system for image processing, the computer program, when executed by a computer, causing the computer to perform the following steps: initializing parameters of the machine learning system and a metaparameter; repeatedly carrying out the following steps as a loop with a predefined number of iterations or until a convergence criterion with respect to a training progress is met: providing a batch of training data points; manipulating the provided training data points or a training method for optimizing the parameters of the machine learning system or a structure of the machine learning system based on the metaparameter; ascertaining a cost function as a function of instantaneous parameters of the machine learning system and of an instantaneous metaparameter; adapting the instantaneous parameters as a function of a first gradient, which has been ascertained with respect to the instantaneous parameters via the ascertained cost function for the training data points; and when the step of manipulating has been carried out more than once, adapting the metaparameter as a function of a second gradient, which has been ascertained via the ascertained cost function with respect to the metaparameter used in a preceding step.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0023] Specific embodiments of the present invention are explained below with reference to the figures.

    [0024] FIG. 1 schematically shows a flowchart of one specific embodiment of the present invention.

    [0025] FIG. 2 schematically shows a representation of temporal dependencies when ascertaining gradients.

    [0026] FIG. 3 schematically shows one exemplary embodiment for controlling an at least semi-autonomous robot.

    [0027] FIG. 4 schematically shows one exemplary embodiment for controlling a manufacturing system.

    [0028] FIG. 5 schematically shows one exemplary embodiment for controlling an access system.

    [0029] FIG. 6 schematically shows one exemplary embodiment for controlling a monitoring system.

    [0030] FIG. 7 schematically shows one exemplary embodiment for controlling a personal assistant.

    [0031] FIG. 8 schematically shows one exemplary embodiment for controlling a medical imaging system.

    DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

    [0032] Machine learning systems, in particular, neural networks, are usually trained with the aid of a so-called gradient descent method. Gradient descent methods are characterized in that parameters, in particular, weights of the machine learning system, are iteratively updated in every training step as a function of a calculated gradient. In this case, the gradients are ascertained by differentiating a cost function l, the cost function being evaluated on training data and differentiated with respect to the parameters of the machine learning system. For the usual gradient descent method, cost function l(θ) is a function of parameters θ of the machine learning system, as well as of ascertained output variables of the machine learning system and provided target output variables, in particular, labels.
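    The plain gradient-descent update described above can be sketched as follows for a linear model under a mean-squared-error cost; the model, cost, and all names are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def sgd_step(theta, batch_x, batch_y, lr=0.1):
    """One gradient-descent update of parameters theta: evaluate the
    cost on a batch of training data, differentiate with respect to
    theta, and step against the gradient (illustrative sketch)."""
    pred = batch_x @ theta
    # derivative of the mean squared error with respect to theta
    grad = 2 * batch_x.T @ (pred - batch_y) / len(batch_y)
    return theta - lr * grad
```

    Repeating this step over successive batches is the baseline training loop that the method below extends with a metaparameter.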

    [0033] The present invention starts from this training method with gradient descent and supplements it as explained below and as schematically represented in FIG. 1.

    [0034] At start S1 of the training method, metaparameter ϕ, in addition to parameters θ of the machine learning system, is also initialized. It is noted that two successive metaparameters may be initialized here for the first training steps, for example, ϕ_1 and ϕ_2 := ϕ_1.

    [0035] Metaparameter ϕ parameterizes, for example, a data augmentation of the training data, for example, a distribution via distortions of the images or via rotations.
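    A metaparameter-controlled augmentation of this kind can be sketched as follows; the choice of 90-degree rotations and the meaning given to ϕ are hypothetical simplifications of the rotation/distortion distribution mentioned above:

```python
import numpy as np

def augment(images, phi, rng):
    """Rotate each image by a random number of 90-degree turns, with
    the metaparameter phi bounding how strongly the training data may
    be augmented (illustrative stand-in for a learned distribution)."""
    out = []
    for img in images:
        k = rng.integers(0, int(phi) + 1)  # 0..phi quarter turns
        out.append(np.rot90(img, k))
    return np.stack(out)
```

    With ϕ = 0 the data pass through unchanged; larger ϕ admits stronger augmentations, so adapting ϕ during training adapts the augmentation strength.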

    [0036] Cost function l(θ, ϕ) is also expanded in such a way that the cost function is now also a function of metaparameter ϕ.

    [0037] Actual training step S2 of the machine learning system, in which parameters θ are updated as a function of the gradient, remains unchanged. This means a gradient ∇_θ l(θ, ϕ) with respect to parameters θ is calculated via cost function l(θ, ϕ), the cost function being evaluated using the instantaneous parameters of instantaneous iteration step t on the respectively used training data: l(θ_t, ϕ_t).

    [0038] This is followed, in contrast to the usual training method, by an additional optimization step S3. In this step, metaparameter ϕ is optimized via an additional gradient descent method. For this purpose, a gradient ∇_ϕ_t−1 l with respect to the metaparameter is calculated as a function of the cost function, the cost function being evaluated using metaparameter ϕ_t−1 of the immediately preceding training step t−1: l(θ_t, ϕ_t−1). This means instantaneous metaparameter ϕ_t is updated as a function of the value of preceding metaparameter ϕ_t−1.

    [0039] This adaptation of the metaparameter between two training iterations t−1, t effectively means that immediately preceding metaparameter ϕ_t−1 is evaluated on the instantaneously used training data, which have been used for ascertaining the cost function with the instantaneous parameters of the machine learning system. This generates a dependency between successive steps, as opposed to the usual training method. Via this further dependency, additional optimization step S3 for optimizing metaparameter ϕ_t results in the metaparameter being optimized in such a way that, when used in the next training step, it further minimizes the cost function. As a result, a more rapid convergence is achieved by this newly introduced dependency, since the metaparameter advantageously influences the optimization of the cost function usually carried out.

    [0040] Once metaparameter ϕ_t+1 has been set for the next training step, ϕ_t+1 ← ϕ_t − β·∇_ϕ_t−1 l, and the parameters of the machine learning system for the next training step have also been set, θ_t+1 ← θ_t − α·∇_θ_t l, training steps S2 and S3 just described are carried out again, in particular, multiple times in succession, until a predefined abort criterion is met. It is noted that parameters α, β represent weightings of the gradients. These parameters preferably have values in the range 0≤α, β<1.
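    The loop over steps S2 and S3 with the two update rules above can be sketched as follows; the scalar quadratic cost, the finite-difference gradients, and all names are illustrative assumptions standing in for the patent's general l(θ, ϕ):

```python
import numpy as np

def cost(theta, phi, batch):
    # toy stand-in for l(theta, phi): fit theta to the batch mean,
    # shifted by the augmentation strength phi
    return float(np.mean((batch + phi - theta) ** 2))

def grad(f, x, eps=1e-6):
    # central finite difference, standing in for the analytic gradient
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def train(batches, alpha=0.1, beta=0.01):
    theta = 0.0            # S1: initialize parameters
    phi_prev = phi = 0.0   # S1: initialize phi_1 and phi_2 := phi_1
    for t, batch in enumerate(batches):
        # S2: first gradient, with respect to the instantaneous theta_t
        g_theta = grad(lambda th: cost(th, phi, batch), theta)
        # S3: second gradient, with respect to the metaparameter of the
        #     preceding step; only once the loop has run more than once
        g_phi = grad(lambda ph: cost(theta, ph, batch), phi_prev) if t > 0 else 0.0
        phi_prev, phi = phi, phi - beta * g_phi   # phi update
        theta = theta - alpha * g_theta           # theta update
    return theta, phi
```

    The key point mirrored from the text is that g_phi is evaluated at the metaparameter of the preceding step, creating the dependency between successive iterations.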

    [0041] It is noted that in the subsequent training steps before carrying out step S2, the training data are augmented in each case as a function of the set metaparameter. It has been found in experiments, however, that the augmentation of the training data has resulted in significant performance improvements only in every n-th training step. Preferably, n=2 is selected here. In one further exemplary embodiment, the gradient descent method for the machine learning system or a structure of the machine learning system may alternatively or in addition be changed after step S2 as a function of the metaparameter.

    [0042] If the training has been completed by a multiple sequential repetition of steps S2 and S3, step S4 may follow, in which the machine learning system just trained is output.

    [0043] In a subsequent step S5, the output machine learning system may then be used, for example, to control an actuator. In this case, the machine learning system is able to process data provided to it and the actuator is then activated as a function of the ascertained result of the machine learning system.

    [0044] In one preferred exemplary embodiment, the machine learning system is trained using images in order to classify/segment objects in the images.

    [0045] In order to further improve the training method, gradient ∇_ϕ_t−1 l is determined using the REINFORCE trick. This measure has the advantage that non-differentiable metaparameters ϕ become optimizable, for example, because they are not continuous or because they are characterized by a non-continuous probability distribution p.

    [0046] For example, distribution p may be a function of metaparameter ϕ and may output a value a_i ∼ p(·; ϕ) for training data point i. For example, a_i may characterize a value of a hyperparameter of the machine learning system (for example, a dropout rate) or a training data point selection strategy. Distribution p(·; ϕ) may, for example, be a softmax distribution, which is parameterized by ϕ.

    [0047] For the measure just cited, a scalar product is used, which connects two successive batches of training data. The scalar product is ascertained as follows for the i-th training data point:


    r_t,i = ⟨∇_θ l(θ_t−1, ϕ_t)_i, ∇_θ L(θ_t)⟩   (Equation 2)

    with l(·)_i being the cost function for the i-th training data point, in particular, from the respectively considered batch of n training data points, and

    [00001] L(θ_t) = (1/n)·Σ_j=1…n l(θ_t, ϕ_t)_j

    being the cost function of the entire batch of immediately following step t, ⟨·,·⟩ denoting the scalar product.

    [0048] It is provided that scalar product r_t,i is to be interpreted as a reward and the REINFORCE trick is to be applied thereto. Thus, gradient ∇_ϕ_t−1 l_t may now be approximated as follows:

    [0049] [00002] ∇_ϕ_t−1 l_t ≈ Σ_i=1…n r_t,i · ∇_ϕ_t log p(a_i; ϕ_t)   (Equation 3)
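    Under the assumption of a softmax distribution p(·; ϕ) over a discrete set of augmentation choices, the reward of Equation 2 and the estimator of Equation 3 can be sketched as follows; all function and variable names are illustrative:

```python
import numpy as np

def softmax(phi):
    e = np.exp(phi - phi.max())
    return e / e.sum()

def reward(per_point_grad, batch_grad):
    # Equation 2: scalar product of the per-data-point gradient of the
    # preceding step with the batch gradient of the current step
    return float(np.dot(per_point_grad, batch_grad))

def meta_gradient(rewards, actions, phi):
    """Equation 3, REINFORCE estimate: sum of r_t,i times the gradient
    of log p(a_i; phi) for a softmax distribution over choices a_i."""
    p = softmax(phi)
    g = np.zeros_like(phi)
    for r_i, a_i in zip(rewards, actions):
        one_hot = np.eye(len(phi))[a_i]
        g += r_i * (one_hot - p)  # grad of log softmax(phi)[a_i]
    return g
```

    A positively rewarded choice a_i has its probability increased, so augmentation choices whose per-point gradients align with the batch gradient of the following step are favored.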

    [0050] FIG. 2 illustrates the temporal dependencies for ascertaining Equation 2 by way of example for successive steps t=1,2, . . . ,4.

    [0051] FIG. 3 schematically shows an actuator 10 in its surroundings in interaction with a control system 40. The surroundings are detected at preferably regular temporal intervals in a sensor 30, in particular, in an imaging sensor such as a video sensor, which may also be provided by a plurality of sensors, for example, a stereo camera. Other imaging sensors are also possible such as, for example, radar, ultrasound or LIDAR. An infrared camera is also possible. Sensor signal S—or in the case of multiple sensors one sensor signal S each—of sensor 30 is transferred to control system 40. Thus, control system 40 receives a sequence of sensor signals S. Control system 40 ascertains activation signals A therefrom, which are transferred to actuator 10.

    [0052] Control system 40 receives the sequence of sensor signals S of sensor 30 in an optional receiving unit 50, which converts the sequence of sensor signals S into a sequence of input images x (alternatively, each sensor signal S may also be directly adopted as an input image x). Input image x may, for example, be a section or a further processing of sensor signal S. Input image x includes individual frames of a video recording. In other words, input image x is ascertained as a function of sensor signal S. The sequence of input images x is fed to a machine learning system, in the exemplary embodiment, the output machine learning system 60 from step S4.

    [0053] Machine learning system 60 ascertains output variables y from input images x. These output variables y may include, in particular, a classification and/or a semantic segmentation of input images x. Output variables y are fed to an optional forming unit, which ascertains therefrom activation signals A, which are fed to actuator 10 in order to activate actuator 10 accordingly. Output variable y includes pieces of information about objects detected by sensor 30.

    [0054] Actuator 10 receives activation signals A, is activated accordingly and carries out a corresponding action. Actuator 10 in this case may include a (not necessarily structurally integrated) control logic, which ascertains from activation signal A a second activation signal, with which actuator 10 is then activated.

    [0055] In one further specific embodiment, control system 40 includes sensor 30. In still further specific embodiments, control system 40 also includes alternatively or in addition actuator 10.

    [0056] In further preferred specific embodiments, control system 40 includes a single or a plurality of processors 45 and at least one machine-readable memory medium 46, on which the instructions are stored which, when they are carried out on processors 45, then prompt control system 40 to carry out the method according to the present invention.

    [0057] A display unit 10a alternatively or in addition to actuator 10 is provided in alternative specific embodiments.

    [0058] In one further exemplary embodiment, control system 40 is used for controlling an at least semi-autonomous robot, here, an at least semi-autonomous motor vehicle 100. Sensor 30 may, for example, be a video sensor situated preferably in motor vehicle 100.

    [0059] Machine learning system 60 is preferably configured for the purpose of reliably identifying objects from input images x. Machine learning system 60 may be a neural network.

    [0060] Actuator 10 situated preferably in motor vehicle 100 may, for example, be a brake, a drive or a steering system of motor vehicle 100. Activation signal A may then be ascertained in such a way that the actuator or actuators 10 is/are activated in such a way that motor vehicle 100 prevents, for example, a collision with objects reliably identified by artificial neural network 60, in particular, when objects of particular classes, for example, pedestrians, are involved.

    [0061] Alternatively, the at least semi-autonomous robot may also be another mobile robot (not depicted), for example, one which moves by flying, floating, diving or pacing. The mobile robot may, for example, also be an at least semi-autonomous lawn mower or an at least semi-autonomous cleaning robot. In these cases as well, activation signal A may be ascertained in such a way that the drive and/or the steering system of the mobile robot is/are activated in such a way that the at least semi-autonomous robot prevents, for example, a collision with objects identified by artificial neural network 60.

    [0062] Alternatively or in addition, display unit 10a may be activated with activation signal A and, for example, the ascertained safe areas may be displayed. It is also possible, for example, in a motor vehicle 100 including a non-automated steering system that display unit 10a is activated with activation signal A in such a way that it outputs a visual or acoustic warning signal when it is ascertained that motor vehicle 100 threatens to collide with one of the reliably identified objects.

    [0063] FIG. 4 shows one exemplary embodiment, in which control system 40 is used for activating a manufacturing machine 11 of a manufacturing system 200 by activating an actuator 10 that controls this manufacturing machine 11. Manufacturing machine 11 may, for example, be a machine for stamping, sawing, drilling and/or cutting.

    [0064] Sensor 30 may then, for example, be a visual sensor, which detects, for example, properties of manufacturing products 12a, 12b. It is possible that these manufacturing products 12a, 12b, are movable. It is possible that actuator 10 controlling manufacturing machine 11 is activated as a function of an assignment of detected manufacturing products 12a, 12b, so that manufacturing machine 11 correspondingly carries out a subsequent processing step of the correct one of manufacturing products 12a, 12b. It is also possible that by identifying the correct properties of the same one of manufacturing products 12a, 12b (i.e., without a misclassification), manufacturing machine 11 correspondingly adapts the same manufacturing step for a processing of a subsequent manufacturing product.

    [0065] FIG. 5 shows one exemplary embodiment, in which control system 40 is used for controlling an access system 300. Access system 300 may include a physical access control, for example, a door 401. Video sensor 30 is configured to detect a person. This detected image may be interpreted with the aid of object identification system 60. If multiple persons are detected simultaneously, the identity of the persons, for example, may be particularly reliably ascertained by an assignment of the persons (i.e., of the objects) relative to one another, for example, by an analysis of their movements. Actuator 10 may be a lock, which blocks or does not block the access control, as a function of activation signal A, for example, opens or does not open door 401. For this purpose, activation signal A may be selected as a function of the interpretation of object identification system 60, for example, as a function of the ascertained identity of the person. Instead of the physical access control, a logical access control may also be provided.

    [0066] FIG. 6 shows one exemplary embodiment, in which control system 40 is used for controlling a monitoring system 400. This exemplary embodiment differs from the exemplary embodiment shown in FIG. 5 in that instead of actuator 10, display unit 10a is provided, which is activated by control system 40. For example, an identity of the objects recorded by video sensor 30 may be reliably ascertained by artificial neural network 60 in order, for example, to deduce which of them are suspicious, and activation signal A may then be selected in such a way that this object is displayed in a color-highlighted manner by display unit 10a.

    [0067] FIG. 7 shows one exemplary embodiment, in which control system 40 is used for controlling a personal assistant 250. Sensor 30 is preferably a visual sensor, which receives images of a gesture of a user 249.

    [0068] Control system 40 ascertains, as a function of the signals of sensor 30, an activation signal A of personal assistant 250, for example, by the neural network carrying out a gesture recognition. This ascertained activation signal A is then conveyed to personal assistant 250, which is thus activated accordingly. The ascertained activation signal A may, in particular, be selected in such a way that it corresponds to an assumed desired activation by user 249. This assumed desired activation may be ascertained as a function of the gesture recognized by artificial neural network 60. Control system 40 may then select activation signal A for conveyance to personal assistant 250 as a function of the assumed desired activation.

    [0069] This corresponding activation may, for example, include that personal assistant 250 retrieves pieces of information from a database and reproduces them in an apprehensible manner for user 249.

    [0070] Instead of personal assistant 250, a household appliance (not depicted), in particular, a washing machine, a stove, an oven, a microwave or a dishwasher may also be provided in order to be activated accordingly.

    [0071] FIG. 8 shows one exemplary embodiment, in which control system 40 is used for controlling a medical imaging system 500, for example, an MRI, an x-ray device or an ultrasound device. Sensor 30 may, for example, be provided in the form of an imaging sensor, display unit 10a being activated by control system 40. For example, it may be ascertained by neural network 60 whether an area recorded by the imaging sensor is conspicuous, and activation signal A may then be selected in such a way that this area is displayed in a color-highlighted manner by display unit 10a.