Device and method for training a neuronal network
12248878 · 2025-03-11
Assignee
Inventors
- Jorn Peters (Amsterdam, NL)
- Thomas Andy Keller (Amsterdam, NL)
- Anna Khoreva (Stuttgart, DE)
- Max Welling (Amsterdam, NL)
- Priyank Jaini (Amsterdam, NL)
CPC classification
- G06N3/082
- G06F18/2414
- G06N3/008
- G05D1/0088
International classification
- G06N3/082
- G05D1/00
- G06N3/008
Abstract
A method for training a neural network. The neural network comprises a first layer which includes a plurality of filters to provide a first layer output comprising a plurality of feature maps. Training of the neural network includes: receiving, from a preceding layer, a first layer input in the first layer, wherein the first layer input is based on an input signal; determining the first layer output based on the first layer input and a plurality of parameters of the first layer; determining a first layer loss value based on the first layer output, wherein the first layer loss value characterizes a degree of dependency between the feature maps, the first layer loss value being obtained in an unsupervised fashion; and training the neural network. The training includes an adaption of the parameters of the first layer, the adaption being based on the first layer loss value.
Claims
1. A method for training a neural network, the neural network including a first layer, the first layer including a plurality of filters to provide a first layer output, the first layer output including a plurality of feature maps, the method comprising the following steps: receiving, from a preceding layer, a first layer input in the first layer, wherein the first layer input is based on an input signal; determining the first layer output based on the first layer input and a plurality of parameters of the first layer; determining a first layer loss value based on the first layer output, wherein the first layer loss value characterizes a degree of dependency between the feature maps of the first layer output, the first layer loss value being obtained in an unsupervised fashion; and training the neural network, including adapting the parameters of the first layer, the adaption being based on the first layer loss value.
2. The method according to claim 1, the method further comprising the following steps: determining an output signal of the neural network depending on the first layer output, wherein the output signal characterizes a classification of the input signal; and determining a classification loss value based on the output signal; wherein the training includes adapting the parameters of the first layer based on the classification loss value.
3. The method according to claim 1, wherein the first layer loss value characterizes a factoring of a distribution of the first layer output between the feature maps.
4. The method according to claim 1, wherein the first layer loss value is determined according to the following formula:
5. The method according to claim 4, wherein A.sub.i and/or b.sub.i are adapted during training of the neural network.
6. The method according to claim 1, wherein the determining of the first layer output includes padding the first layer input such that a size of the first layer output matches a size of the first layer input along a height and/or width and/or depth.
7. The method as recited in claim 1, the method further comprising the following step: pruning the filters of the first layer.
8. A computer-implemented method for controlling an actuator using a neural network, the neural network including a first layer, the first layer including a plurality of filters to provide a first layer output including a plurality of feature maps, the neural network having been trained by receiving, from a preceding layer, a first layer input in the first layer, wherein the first layer input is based on an input signal, determining the first layer output based on the first layer input and a plurality of parameters of the first layer, determining a first layer loss value based on the first layer output, wherein the first layer loss value characterizes a degree of dependency between the feature maps of the first layer output, the first layer loss value being obtained in an unsupervised fashion, and training the neural network, including adapting the parameters of the first layer, the adaption being based on the first layer loss value, the method comprising: providing a second input signal to the trained neural network based on a sensor signal including data from a sensor; and providing an actuator control signal for controlling the actuator based on a second output signal of the trained neural network.
9. The method according to claim 8, wherein, based on the actuator control signal, the actuator controls an at least partially autonomous robot and/or a manufacturing machine and/or an access control system.
10. A non-transitory machine-readable storage medium on which is stored a computer program for training a neural network, the neural network including a first layer, the first layer including a plurality of filters to provide a first layer output, the first layer output including a plurality of feature maps, the computer program, when executed by a computer, causing the computer to perform the following steps: receiving, from a preceding layer, a first layer input in the first layer, wherein the first layer input is based on an input signal; determining the first layer output based on the first layer input and a plurality of parameters of the first layer; determining a first layer loss value based on the first layer output, wherein the first layer loss value characterizes a degree of dependency between the feature maps of the first layer output, the first layer loss value being obtained in an unsupervised fashion; and training the neural network, including adapting the parameters of the first layer, the adaption being based on the first layer loss value.
11. A training system configured to train a neural network, the neural network including a first layer, the first layer including a plurality of filters to provide a first layer output, the first layer output including a plurality of feature maps, the training system configured to: receive, from a preceding layer, a first layer input in the first layer, wherein the first layer input is based on an input signal; determine the first layer output based on the first layer input and a plurality of parameters of the first layer; determine a first layer loss value based on the first layer output, wherein the first layer loss value characterizes a degree of dependency between the feature maps of the first layer output, the first layer loss value being obtained in an unsupervised fashion; and train the neural network, including adapting the parameters of the first layer, the adaption being based on the first layer loss value.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS
(10) Shown in the figures is an embodiment of a control system (40) for controlling an actuator (10) in its environment (20) based on signals of a sensor (30).
(11) Thereby, the control system (40) receives a stream of sensor signals (S). It then computes a series of actuator control commands (A) depending on the stream of sensor signals (S), which are then transmitted to the actuator (10).
(12) Control system (40) receives the stream of sensor signals (S) of sensor (30) in an optional receiving unit (50). Receiving unit (50) transforms the sensor signals (S) into input signals (x). Alternatively, in case of no receiving unit (50), each sensor signal (S) may directly be taken as an input signal (x). Input signal (x) may, for example, be given as an excerpt from sensor signal (S). Alternatively, the sensor signal (S) may be processed to yield input signal (x). Input signal (x) comprises image data corresponding to an image recorded by the sensor (30). In other words, the input signal (x) is provided in accordance with the sensor signal (S).
(13) The input signal (x) is then passed on to a neural network (60).
(14) The neural network (60) is parametrized by parameters (ϕ), which are stored in and provided by a parameter storage (St.sub.1).
(15) The neural network (60) determines an output signal (y) from the input signal (x). The output signal (y) comprises information that assigns one or more labels to the input signal (x). The output signal (y) is transmitted to an optional conversion unit (80), which converts the output signal (y) into actuator control commands (A). The actuator control commands (A) are then transmitted to the actuator (10) for controlling the actuator (10) accordingly. Alternatively, the output signal (y) may directly be taken as actuator control commands (A).
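By way of illustration only, a minimal Python sketch of such a conversion unit (80) follows; the label set, the command table and the function name `convert` are hypothetical assumptions, not taken from the patent:

```python
import torch

# Hypothetical conversion-unit (80) sketch: the output signal (y)
# assigns scores to a set of labels; the highest-scoring label is
# mapped to an actuator control command (A). The label set and the
# command table below are illustrative assumptions.
COMMANDS = {0: "brake", 1: "steer_left", 2: "steer_right", 3: "hold_lane"}

def convert(y: torch.Tensor) -> str:
    label = int(y.argmax())   # most likely label for the input signal (x)
    return COMMANDS[label]    # actuator control command (A)

A = convert(torch.tensor([0.1, 0.2, 0.1, 0.9]))   # -> "hold_lane"
```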
(16) The actuator (10) receives actuator control commands (A), is controlled accordingly and carries out an action corresponding to the actuator control commands (A). The actuator (10) may comprise a control logic which transforms an actuator control command (A) into a further control command, which is then used to control actuator (10).
(17) In further embodiments, the control system (40) may comprise a sensor (30). In even further embodiments, the control system (40) alternatively or additionally may comprise an actuator (10).
(18) In one embodiment, the neural network (60) may be designed to identify lanes on a road ahead, e.g. by classifying a road surface and markings on said road, and identifying lanes as patches of road surface between said markings. Based on an output of a navigation system, a suitable lane for pursuing a chosen path can then be selected, and depending on a present lane and said target lane, it may then be decided whether a vehicle (100) is to switch lanes or stay in the present lane. The actuator control command (A) may then be computed by e.g. retrieving a predefined motion pattern from a database corresponding to the identified action.
(19) Likewise, upon identifying road signs or traffic lights, depending on an identified type of road sign or an identified state of said traffic lights, corresponding constraints on possible motion patterns of the vehicle (100) may then be retrieved from e.g. a database, a future path of the vehicle (100) commensurate with the constraints may be computed, and the actuator control command (A) may be computed to steer the vehicle (100) such as to execute said trajectory.
(20) Likewise, upon identifying pedestrians and/or vehicles, a projected future behavior of said pedestrians and/or vehicles may be estimated, and based on the estimated future behavior, a trajectory may then be selected such as to avoid collision with the identified pedestrians and/or vehicles, and the actuator control command (A) may be computed to steer the vehicle (100) such as to execute said trajectory.
(21) In still further embodiments, it can be envisioned that the control system (40) controls a display (10a) instead of an actuator (10).
(22) Furthermore, the control system (40) may comprise a processor (45) (or a plurality of processors) and at least one machine-readable storage medium (46) on which instructions are stored which, if carried out, cause the control system (40) to carry out a method according to one aspect of the invention.
(23) Shown in the figures is an embodiment in which the control system (40) is used to control an at least partially autonomous robot, e.g., an at least partially autonomous vehicle (100).
(24) The sensor (30) may comprise one or more video sensors and/or one or more radar sensors and/or one or more ultrasonic sensors and/or one or more LiDAR sensors and/or one or more position sensors (e.g., GPS). Some or all of these sensors are preferably, but not necessarily, integrated in the vehicle (100).
(25) Alternatively or additionally, the sensor (30) may comprise an information system for determining a state of the actuator system. One example for such an information system is a weather information system which determines a present or future state of the weather in the environment (20).
(26) Using the input signal (x), the neural network (60) may, for example, detect objects in the vicinity of the at least partially autonomous robot. The output signal (y) may comprise information which characterizes where objects are located in the vicinity of the at least partially autonomous robot. The actuator control command (A) may then be determined in accordance with this information, for example to avoid collisions with the detected objects.
(27) The actuator (10), which is preferably integrated in the vehicle (100), may be given by a brake, a propulsion system, an engine, a drivetrain, or a steering of the vehicle (100). Actuator control commands (A) may be determined such that the actuator (or actuators) (10) is/are controlled such that the vehicle (100) avoids collisions with the detected objects. Detected objects may also be classified according to what the classifier (60) deems them most likely to be, e.g., pedestrians or trees, and the actuator control commands (A) may be determined depending on the classification.
(28) In further embodiments, the at least partially autonomous robot may be given by another mobile robot (not shown), which may, for example, move by flying, swimming, diving or stepping. The mobile robot may, inter alia, be an at least partially autonomous lawn mower or an at least partially autonomous cleaning robot. In all of the above embodiments, the actuator control command (A) may be determined such that the propulsion unit and/or steering and/or brake of the mobile robot are controlled such that the mobile robot may avoid collisions with said identified objects.
(29) In a further embodiment, the at least partially autonomous robot may be given by a gardening robot (not shown), which uses the sensor (30), preferably an optical sensor, to determine a state of plants in the environment (20). The actuator (10) may control a nozzle for spraying liquids and/or a cutting device, e.g., a blade. Depending on an identified species and/or an identified state of the plants, an actuator control command (A) may be determined to cause the actuator (10) to spray the plants with a suitable quantity of suitable liquids and/or cut the plants.
(30) In even further embodiments, the at least partially autonomous robot may be given by a domestic appliance (not shown), like e.g. a washing machine, a stove, an oven, a microwave, or a dishwasher. The sensor (30), e.g. an optical sensor, may detect a state of an object which is to undergo processing by the domestic appliance. For example, in the case of the domestic appliance being a washing machine, the sensor (30) may detect a state of the laundry inside the washing machine. The actuator control command (A) may then be determined depending on a detected material of the laundry.
(31) Shown in the figures is an embodiment in which the control system (40) is used to control a manufacturing machine (11).
(32) The sensor (30) may be given by an optical sensor which captures properties of, e.g., a manufactured product (12). The classifier (60) may determine a state of the manufactured product (12) from these captured properties. The actuator (10) which controls the manufacturing machine (11) may then be controlled depending on the determined state of the manufactured product (12) for a subsequent manufacturing step of the manufactured product (12). Alternatively, it may be envisioned that the actuator (10) is controlled during manufacturing of a subsequent manufactured product (12) depending on the determined state of the manufactured product (12).
(33) Shown in the figures is an embodiment in which the control system (40) is used to control an automated personal assistant (250).
(34) The sensor signal (S) of the sensor (30) is transmitted to the control system (40). The control system (40) then determines actuator control commands (A) for controlling the automated personal assistant (250), the actuator control commands (A) being determined in accordance with the sensor signal (S). For example, the neural network (60) may be configured to carry out a gesture recognition algorithm to identify a gesture made by the user (249). The control system (40) may then determine an actuator control command (A) and transmit it to the automated personal assistant (250).
(35) For example, the actuator control command (A) may be determined in accordance with the identified user gesture recognized by the classifier (60). It may then comprise information that causes the automated personal assistant (250) to retrieve information from a database and output this retrieved information in a form suitable for reception by the user (249).
(36) In further embodiments, it may be envisioned that, instead of the automated personal assistant (250), the control system (40) controls a domestic appliance (not shown) in accordance with the identified user gesture. The domestic appliance may be a washing machine, a stove, an oven, a microwave or a dishwasher.
(37) Shown in
(38) Shown in
(39) Shown in
(40) Shown in the figures is the processing performed by a first layer (L.sub.1) of the neural network (60), which determines a first layer output (z) from a first layer input (i).
(41) If the first layer input (i) comprises a height, width and depth dimension, the first layer output (z) can be obtained through a convolution operation (C), wherein the first layer input (i) is discretely convolved with a predefined number of filters. The filters may be parametrized by a set of weights (w). Alternatively, if the first layer input (i) does not comprise a width dimension, the first layer output (z) may be obtained by a matrix multiplication of the first layer input (i) with a matrix (not shown), which is parametrized by a set of weights (w). This matrix multiplication may be envisioned as similar to processing a first layer input (i) with a fully connected layer in order to obtain a first layer output (z).
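By way of illustration, the following is a minimal Python sketch of the convolutional variant, assuming a PyTorch layer; the channel counts and kernel size are example assumptions, and `padding="same"` corresponds to padding the first layer input such that the size of the first layer output matches it along height and width:

```python
import torch
import torch.nn as nn

# Illustrative sketch of the first layer (L1); channel counts and kernel
# size are assumptions for the example. padding="same" keeps the size of
# the first layer output (z) equal to that of the first layer input (i)
# along height and width, as in the optional padding step.
first_layer = nn.Conv2d(
    in_channels=3,     # depth of the first layer input (i)
    out_channels=16,   # C_out: number of filters, i.e. feature maps
    kernel_size=3,
    padding="same",
)

i = torch.randn(8, 3, 32, 32)    # batch of first layer inputs
z = first_layer(i)               # first layer output: 16 feature maps
assert z.shape == (8, 16, 32, 32)
```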
(42) Having obtained the first layer output (z), a first layer loss value (l) is obtained based on the feature maps comprised in the first layer output (z). The first layer loss value (l) may be obtained according to the formula:
(43)
wherein C.sub.out is the number of feature maps in the first layer output (z), A.sub.j and b.sub.j are predefined values, and z.sub.j is the j-th feature map.
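The formula itself is not reproduced in this text. Purely as an illustration of a first layer loss that characterizes a degree of dependency between the feature maps and can be obtained in an unsupervised fashion, the following sketch uses a simple decorrelation penalty; this is an assumed stand-in, not the claimed formula with A.sub.j and b.sub.j:

```python
import torch

# Illustrative stand-in for the first layer loss value (l): a simple
# decorrelation penalty over the C_out feature maps of the first layer
# output (z). This is NOT the patent's formula (which uses predefined
# values A_j and b_j); it only illustrates an unsupervised loss that
# characterizes a degree of dependency between the feature maps.
def first_layer_loss(z: torch.Tensor) -> torch.Tensor:
    b, c_out, h, w = z.shape
    # Treat every spatial location of every sample as one observation
    # of the C_out feature-map activations.
    x = z.permute(0, 2, 3, 1).reshape(-1, c_out)
    x = x - x.mean(dim=0, keepdim=True)
    cov = (x.T @ x) / (x.shape[0] - 1)
    std = cov.diagonal().clamp_min(1e-8).sqrt()
    corr = cov / (std[:, None] * std[None, :])
    # Squared off-diagonal correlations vanish iff the feature maps
    # are (linearly) uncorrelated.
    off_diag = corr - torch.diag(corr.diagonal())
    return (off_diag ** 2).sum() / (c_out * (c_out - 1))

l = first_layer_loss(torch.randn(8, 16, 32, 32))   # first layer loss value
```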
(44) The first layer output (z) may then be provided (81) to one or multiple other layers of the neural network (60). Alternatively, it may be used as output signal (y) of the neural network (60).
(45) In a first step (901), a first layer input (i) may be received by the first layer (L.sub.1), the first layer input (i) being based on an input signal (x).
(46) In a second step (902), a first layer output (z) may be determined by the first layer (L.sub.1) based on the first layer input (i) and the weights (w), e.g., as described above.
(47) In a third step (903), a first layer loss value (l) may be obtained based on the first layer output (z), e.g., according to the formula given above.
(48) In a fourth step (904), the weights (w) of the first layer (L.sub.1) are updated by, e.g., obtaining a gradient of the first layer loss value (l) with respect to the weights (w) and adapting the weights (w) according to the negative gradient.
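A hedged sketch of steps two (902) to four (904), reusing the illustrative `first_layer` and `first_layer_loss` objects defined above:

```python
import torch

# Hypothetical sketch of steps 902-904, reusing `first_layer` and
# `first_layer_loss` from the sketches above: one unsupervised update
# of the weights (w) along the negative gradient of the first layer
# loss value (l).
optimizer = torch.optim.SGD(first_layer.parameters(), lr=1e-2)

i = torch.randn(8, 3, 32, 32)   # first layer input, based on x
z = first_layer(i)              # step 902: first layer output (z)
l = first_layer_loss(z)         # step 903: first layer loss value (l)
optimizer.zero_grad()
l.backward()                    # gradients w.r.t. the weights (w)
optimizer.step()                # step 904: adapt along -gradient
```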
(49) In a fifth step (905), the output signal (y) of the neural network (60) is obtained based on the first layer output (z). This may be done by providing the first layer output (z) to a following layer which provides a following output. This following output may be forwarded through the rest of the neural network (60) as is common for feed-forward neural networks. Alternatively, it can be envisioned that the first layer output (z) is used as output signal (y) directly.
(50) In a sixth step (906), a classification loss value is obtained based on the output signal (y). This may be achieved by, e.g., applying the softmax function to the output signal (y) in order to obtain a set of probabilities. From these probabilities, a cross entropy value with respect to a desired classification may be computed. The cross entropy loss value may then be used as the classification loss value.
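A minimal sketch of this step; note that PyTorch's `F.cross_entropy` fuses the softmax and the cross entropy computation described above (shapes are illustrative):

```python
import torch
import torch.nn.functional as F

# Step-906 sketch: F.cross_entropy fuses the softmax over the output
# signal (y) with the cross entropy against the desired classification,
# matching the two sub-steps described above. Shapes are illustrative.
y = torch.randn(8, 10)                 # output signal: scores for 10 labels
desired = torch.randint(0, 10, (8,))   # desired classification
classification_loss = F.cross_entropy(y, desired)
```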
(51) In a seventh step (907), the weights of the classifier (preferably including the weights (w) of the first layer (L.sub.1)) may be updated such that the classification loss value decreases. This may be achieved by computing the gradient of the classification loss value with respect to the weights (preferably including the weights (w) of the first layer (L.sub.1)) and updating the weights according to the negative gradient. Steps one (901) to seven (907) may then be repeated in an iterative fashion until a desired number of iterations has passed. Alternatively, the steps may be repeated until the classification loss value becomes smaller than a predefined classification loss threshold.
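A hedged sketch of this joint update, reusing `first_layer` from above; `rest_of_network` is a stand-in for the remaining layers of the neural network (60), and the data in the loop are placeholders:

```python
import torch
import torch.nn.functional as F

# Hypothetical step-907 loop, reusing `first_layer` from above. All
# weights, including the weights (w) of the first layer (L1), follow
# the negative gradient of the classification loss value.
rest_of_network = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(16 * 32 * 32, 10),
)
params = list(first_layer.parameters()) + list(rest_of_network.parameters())
optimizer = torch.optim.SGD(params, lr=1e-2)

for _ in range(100):                       # desired number of iterations
    i = torch.randn(8, 3, 32, 32)          # stand-in for input signals (x)
    desired = torch.randint(0, 10, (8,))   # stand-in desired classes
    z = first_layer(i)                     # first layer output
    y = rest_of_network(z)                 # output signal (y)
    loss = F.cross_entropy(y, desired)     # classification loss value
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```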
(52) In an eighth step (908), the filters of the first layer (L.sub.1) may be pruned. This may, for example, be achieved by determining the filter corresponding to the feature map with the smallest norm, e.g., L.sub.2-norm. This filter may then be discarded and the classifier may be fine-tuned in order to account for the change in network architecture. This can be achieved by, e.g., minimizing the classification loss value from the previous step using a training data set of input signals (x) until the classification loss value has become smaller than the classification loss threshold. The eighth step (908) may be repeated in order to prune a predefined number of filters from the first layer (L.sub.1).
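A minimal sketch of one pruning round under the same assumptions as above: select the filter whose feature map has the smallest L.sub.2 norm and rebuild the layer without it.

```python
import torch

# Step-908 sketch, reusing `first_layer` and `z` from above: discard
# the filter whose feature map has the smallest L2 norm and rebuild
# the layer without it. The classifier would then be fine-tuned as
# described in the text.
with torch.no_grad():
    norms = z.flatten(2).norm(dim=2).mean(dim=0)   # per-feature-map L2 norm
    weakest = int(norms.argmin())                  # filter to discard
    keep = [j for j in range(first_layer.out_channels) if j != weakest]
    pruned = torch.nn.Conv2d(
        first_layer.in_channels, len(keep), kernel_size=3, padding="same"
    )
    pruned.weight.copy_(first_layer.weight[keep])
    pruned.bias.copy_(first_layer.bias[keep])
```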
(53) In a further embodiment (not shown), it may be envisioned that the first (901), second (902) and third (903) steps are repeated iteratively with different input signals (x) until a predefined number of iterations has been reached. Alternatively, the iterations may be repeated until the first layer loss value (l) falls below a certain first layer loss threshold.
(54) In both alternatives of this embodiment, steps four (904) to eight (908) may then be executed afterwards. It can also be envisioned that, after the iterative execution of steps one (901) to three (903) as explained above, the training proceeds based only on the classification loss value, without further updates based on the first layer loss value (l).