Device and method for training a classifier
11960991 · 2024-04-16
Assignee
Inventors
- Rizal Fathony (Pittsburgh, PA, US)
- Frank Schmidt (Leonberg, DE)
- Jeremy Zieg Kolter (Pittsburgh, PA, US)
Cpc classification
G05B19/4155
PHYSICS
G06F18/21
PHYSICS
G05B2219/40496
PHYSICS
International classification
A63F13/67
HUMAN NECESSITIES
G05B19/4155
PHYSICS
G07C3/00
PHYSICS
Abstract
A computer-implemented method for training a classifier, particularly a binary classifier, for classifying input signals to optimize performance according to a non-decomposable metric that measures an alignment between classifications corresponding to input signals of a set of training data and corresponding predicted classifications of the input signals obtained from the classifier. The method includes providing weighting factors that characterize how the non-decomposable metric depends on a plurality of terms from a confusion matrix of the classifications and the predicted classifications, and training the classifier depending on the provided weighting factors.
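As an illustration only, and not as the formula recited in the claims: the F-beta score is a familiar metric that is computed from confusion-matrix terms and does not decompose into a sum over individual samples. The sketch below assumes labels in {0, 1}; the function names and the convention of returning 0.0 for an empty denominator are hypothetical choices. The coefficients on the confusion-matrix terms loosely play the role of weighting factors.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f_beta(y_true, y_pred, beta=1.0):
    """F-beta score written as a weighted ratio of confusion-matrix terms."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    b2 = beta ** 2
    denominator = (1.0 + b2) * tp + b2 * fn + fp
    # Assumed convention: score 0.0 when there are no positives at all.
    return (1.0 + b2) * tp / denominator if denominator else 0.0


y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]

# Score on the whole set versus the mean of the scores on two halves.
whole = f_beta(y_true, y_pred)
halves = (f_beta(y_true[:2], y_pred[:2]) + f_beta(y_true[2:], y_pred[2:])) / 2.0
```

Here the score on the whole set differs from the average of the scores on its two halves, which is what makes such a metric non-decomposable.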
Claims
1. A computer-implemented method for training a classifier for classifying input signals to optimize performance according to a non-decomposable metric that measures an alignment between classifications corresponding to input signals of a set of training data and corresponding predicted classifications of the input signals obtained from the classifier, the method comprising the following steps: providing weighting factors that characterize how the non-decomposable metric depends on a plurality of terms from a confusion matrix of the classifications and the predicted classifications; and training the classifier depending on the provided weighting factors; wherein the non-decomposable metric is given by the formula
2. The method according to claim 1, wherein the classifier is a binary classifier, and the optimization is carried out by finding an optimum value of a Lagrangian multiplier corresponding to the moment-matching constraint, and wherein trained parameters of a fully-connected layer of the binary classifier are set equal to the optimum value of the Lagrangian multiplier.
3. The method according to claim 1, wherein the optimization includes solving the two-player game by solving a linear program in only one of those two players.
4. The method according to claim 1, wherein the optimization of the performance according to the non-decomposable metric is further subject to an inequality constraint of an expected value of a second metric that measures an alignment between the classifications and the predicted classifications.
5. A computer-implemented method for using a classifier for classifying sensor signals, wherein the classifier is trained to optimize performance according to a non-decomposable metric that measures an alignment between classifications corresponding to input signals of a set of training data and corresponding predicted classifications of the input signals obtained from the classifier, the training including providing weighting factors that characterize how the non-decomposable metric depends on a plurality of terms from a confusion matrix of the classifications and the predicted classifications, and training the classifier depending on the provided weighting factors, the method comprising the following steps: receiving a sensor signal including data from a sensor; determining a first input signal which depends on the sensor signal; and feeding the first input signal into the classifier to obtain an output signal that characterizes a classification of the first input signal; wherein the non-decomposable metric is given by the formula
6. A computer-implemented method for using a classifier trained for providing an actuator control signal for controlling an actuator, wherein the classifier is trained to optimize performance according to a non-decomposable metric that measures an alignment between classifications corresponding to input signals of a set of training data and corresponding predicted classifications of the input signals obtained from the classifier, the training including providing weighting factors that characterize how the non-decomposable metric depends on a plurality of terms from a confusion matrix of the classifications and the predicted classifications, and training the classifier depending on the provided weighting factors, the method comprising the following steps: receiving a sensor signal including data from a sensor; determining a first input signal which depends on the sensor signal; feeding the first input signal into the classifier to obtain an output signal that characterizes a classification of the first input signal; and determining the actuator control signal depending on the output signal; wherein the non-decomposable metric is given by the formula
7. The method according to claim 6, wherein the actuator controls an at least partially autonomous robot, and/or a manufacturing machine, and/or an access control system.
8. A non-transitory machine readable storage medium on which is stored a computer program for training a binary classifier for classifying input signals to optimize performance according to a non-decomposable metric that measures an alignment between classifications corresponding to input signals of a set of training data and corresponding predicted classifications of the input signals obtained from the classifier, the computer program when executed by a computer, causing the computer to perform the following steps: providing weighting factors that characterize how the non-decomposable metric depends on a plurality of terms from a confusion matrix of the classifications and the predicted classifications; and training the classifier depending on the provided weighting factors; wherein the non-decomposable metric is given by the formula
9. A control system for operating an actuator, the control system comprising: a classifier trained for classifying input signals to optimize performance according to a non-decomposable metric that measures an alignment between classifications corresponding to input signals of a set of training data and corresponding predicted classifications of the input signals obtained from the classifier, the classifier being trained by providing weighting factors that characterize how the non-decomposable metric depends on a plurality of terms from a confusion matrix of the classifications and the predicted classifications, and training the classifier depending on the provided weighting factors; wherein the control system is configured to operate the actuator in accordance with an output of the classifier; wherein the non-decomposable metric is given by the formula
10. A control system that is configured to use a classifier for classifying sensor signals, wherein the classifier is trained to optimize performance according to a non-decomposable metric that measures an alignment between classifications corresponding to input signals of a set of training data and corresponding predicted classifications of the input signals obtained from the classifier, the training including providing weighting factors that characterize how the non-decomposable metric depends on a plurality of terms from a confusion matrix of the classifications and the predicted classifications, and training the classifier depending on the provided weighting factors, the control system configured to: receive a sensor signal including data from a sensor; determine a first input signal which depends on the sensor signal; and feed the first input signal into the classifier to obtain an output signal that characterizes a classification of the first input signal; wherein the non-decomposable metric is given by the formula
11. A training system configured to train a classifier for classifying input signals to optimize performance according to a non-decomposable metric that measures an alignment between classifications corresponding to input signals of a set of training data and corresponding predicted classifications of the input signals obtained from the classifier, the training system configured to: provide weighting factors that characterize how the non-decomposable metric depends on a plurality of terms from a confusion matrix of the classifications and the predicted classifications; and train the classifier depending on the provided weighting factors; wherein the non-decomposable metric is given by the formula
12. The method as recited in claim 1, wherein the classifier is a binary classifier.
13. The method as recited in claim 1, wherein the marginal probabilities represent marginal probabilities of a classification of a given input value being equal to a predefined classification and a sum of all classifications being equal to a predefined sum value.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS
(12) Shown in
(13) Thereby, control system 40 receives a stream of sensor signals S. It then computes a series of actuator control commands A depending on the stream of sensor signals S, which are then transmitted to actuator 10.
(14) Control system 40 receives the stream of sensor signals S of sensor 30 in an optional receiving unit 50. Receiving unit 50 transforms the sensor signals S into input signals x. Alternatively, in case of no receiving unit 50, each sensor signal S may directly be taken as an input signal x. Input signal x may, for example, be given as an excerpt from sensor signal S. Alternatively, sensor signal S may be processed to yield input signal x. Input signal x may comprise image data corresponding to an image recorded by sensor 30, or it may comprise audio data, for example if sensor 30 is an audio sensor. In other words, input signal x may be provided in accordance with sensor signal S.
(15) Input signal x is then passed on to a classifier 60, for example an image classifier, which may, for example, be given by an artificial neural network.
(16) Classifier 60 is parametrized by parameters ϕ, which are stored in and provided by parameter storage St.sub.1.
(17) Classifier 60 determines output signals y from input signals x. The output signal y comprises information that assigns one or more labels to the input signal x. Output signals y are transmitted to an optional conversion unit 80, which converts the output signals y into the control commands A. Actuator control commands A are then transmitted to actuator 10 for controlling actuator 10 accordingly. Alternatively, output signals y may directly be taken as control commands A.
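The signal flow of paragraphs (13) to (17) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of control system 40: the function names, the threshold classifier, and the command strings are hypothetical. The optional receiving unit and the optional conversion unit are modeled as optional callables.

```python
def control_step(sensor_signal, classifier, receive=None, convert=None):
    """Map one sensor signal S to one actuator control command A."""
    # Optional receiving unit: S -> x. Without it, S is used directly as x.
    x = receive(sensor_signal) if receive is not None else sensor_signal
    # Classifier: x -> y.
    y = classifier(x)
    # Optional conversion unit: y -> A. Without it, y is used directly as A.
    return convert(y) if convert is not None else y


def mean_excerpt(samples):
    """Hypothetical receiving unit: average a short excerpt of the signal."""
    return sum(samples) / len(samples)


def classify(x):
    """Hypothetical stand-in for a trained classifier."""
    return "obstacle" if x > 0.5 else "clear"


def to_command(y):
    """Hypothetical conversion unit."""
    return "brake" if y == "obstacle" else "continue"


command = control_step([0.9, 0.7], classify, receive=mean_excerpt, convert=to_command)
direct = control_step(0.2, classify)
```

In the second call neither optional unit is present, so the sensor signal is taken as the input signal and the output signal is taken as the command.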
(18) Actuator 10 receives actuator control commands A, is controlled accordingly and carries out an action corresponding to actuator control commands A. Actuator 10 may comprise a control logic, which transforms actuator control command A into a further control command, which is then used to control actuator 10.
(19) In further embodiments, control system 40 may comprise sensor 30. In even further embodiments, control system 40 alternatively or additionally may comprise actuator 10.
(20) In still further embodiments, it may be envisioned that control system 40 controls a display 10a instead of an actuator 10. Furthermore, control system 40 may comprise a processor 45 (or a plurality of processors) and at least one machine-readable storage medium 46 on which instructions are stored which, if carried out, cause control system 40 to carry out a method according to one aspect of the present invention.
(21)
(22) Sensor 30 may comprise one or more video sensors and/or one or more radar sensors and/or one or more ultrasonic sensors and/or one or more LiDAR sensors and/or one or more position sensors (like, e.g., GPS). Some or all of these sensors are preferably but not necessarily integrated in vehicle 100.
(23) Alternatively or additionally sensor 30 may comprise an information system for determining a state of the actuator system. One example for such an information system is a weather information system that determines a present or future state of the weather in environment 20.
(24) For example, using input signal x, the classifier 60 may detect objects in the vicinity of the at least partially autonomous robot. Output signal y may comprise information that characterizes where objects are located in the vicinity of the at least partially autonomous robot. Control command A may then be determined in accordance with this information, for example to avoid collisions with the detected objects.
(25) Actuator 10, which is preferably integrated in vehicle 100, may be given by a brake, a propulsion system, an engine, a drivetrain, or a steering of vehicle 100. Actuator control commands A may be determined such that actuator (or actuators) 10 is/are controlled such that vehicle 100 avoids collisions with the detected objects. Detected objects may also be classified according to what the classifier 60 deems them most likely to be, e.g., pedestrians or trees, and actuator control commands A may be determined depending on the classification.
(26) In one embodiment, classifier 60 may be designed to identify lanes on a road ahead, e.g., by classifying a road surface and markings on the road, and identifying lanes as patches of road surface between the markings. Based on an output of a navigation system, a suitable target lane for pursuing a chosen path can then be selected, and depending on a present lane and the target lane, it may then be decided whether vehicle 100 is to switch lanes or stay in the present lane. Control command A may then be computed by, e.g., retrieving a predefined motion pattern from a database corresponding to the identified action.
(27) Likewise, upon identifying road signs or traffic lights, depending on an identified type of road sign or an identified state of the traffic lights, corresponding constraints on possible motion patterns of vehicle 100 may then be retrieved from, e.g., a database, a future path of vehicle 100 commensurate with the constraints may be computed, and the actuator control command A may be computed to steer the vehicle such as to execute the trajectory.
(28) Likewise, upon identifying pedestrians and/or vehicles, a projected future behavior of the pedestrians and/or vehicles may be estimated, and based on the estimated future behavior, a trajectory may then be selected such as to avoid collision with the pedestrian and/or the vehicle, and the actuator control command A may be computed to steer the vehicle such as to execute the trajectory.
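The lane-selection logic of paragraph (26) can be sketched as a lookup. The sketch assumes that lanes are indexed from left to right and that the database of predefined motion patterns can be represented as a dictionary; both assumptions, and all names and pattern values, are hypothetical.

```python
# Hypothetical stand-in for the database of predefined motion patterns.
MOTION_PATTERNS = {
    "stay": {"steering": 0.0},
    "switch_left": {"steering": -0.2},
    "switch_right": {"steering": 0.2},
}


def select_action(present_lane, target_lane):
    """Decide whether to stay in the present lane or switch lanes."""
    if target_lane == present_lane:
        return "stay"
    # Assumed indexing: a smaller lane index lies further to the left.
    return "switch_left" if target_lane < present_lane else "switch_right"


def control_command(present_lane, target_lane, patterns=MOTION_PATTERNS):
    """Retrieve the motion pattern that corresponds to the identified action."""
    return patterns[select_action(present_lane, target_lane)]


command = control_command(present_lane=1, target_lane=2)
```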
(29) In further embodiments, the at least partially autonomous robot may be given by another mobile robot (not shown), which may, for example, move by flying, swimming, diving or stepping. The mobile robot may, inter alia, be an at least partially autonomous lawn mower, or an at least partially autonomous cleaning robot. In all of the above embodiments, actuator control command A may be determined such that propulsion unit and/or steering and/or brake of the mobile robot are controlled such that the mobile robot may avoid collisions with the identified objects.
(30) In a further embodiment, the at least partially autonomous robot may be given by a gardening robot (not shown), which uses sensor 30, preferably an optical sensor, to determine a state of plants in the environment 20. Actuator 10 may be a nozzle for spraying chemicals. Depending on an identified species and/or an identified state of the plants, an actuator control command A may be determined to cause actuator 10 to spray the plants with a suitable quantity of suitable chemicals.
(31) In even further embodiments, the at least partially autonomous robot may be given by a domestic appliance (not shown), like, e.g., a washing machine, a stove, an oven, a microwave, or a dishwasher. Sensor 30, e.g., an optical sensor, may detect a state of an object that is to undergo processing by the domestic appliance. For example, in the case of the domestic appliance being a washing machine, sensor 30 may detect a state of the laundry inside the washing machine based on an image. Actuator control signal A may then be determined depending on a detected material of the laundry.
(32) Shown in
(33) Sensor 30 may be given by an optical sensor that captures properties of, e.g., a manufactured product 12. Classifier 60 may determine a state of the manufactured product 12 from these captured properties, e.g., whether the product 12 is faulty or not. Actuator 10 which controls manufacturing machine 11 may then be controlled depending on the determined state of the manufactured product 12 for a subsequent manufacturing step of manufactured product 12. Alternatively, it may be envisioned that actuator 10 is controlled during manufacturing of a subsequent manufactured product 12 depending on the determined state of the manufactured product 12. For example, actuator 10 may be controlled to select a product 12 that has been identified by classifier 60 as faulty and sort it into a designated bin, where it may be re-checked before being discarded.
(34) Shown in
(35) Control system 40 then determines actuator control commands A for controlling the automated personal assistant 250. The actuator control commands A are determined in accordance with sensor signal S of sensor 30. Sensor signal S is transmitted to the control system 40. For example, classifier 60 may be configured to, e.g., carry out a gesture recognition algorithm to identify a gesture made by user 249. Control system 40 may then determine an actuator control command A for transmission to the automated personal assistant 250. It then transmits the actuator control command A to the automated personal assistant 250.
(36) For example, actuator control command A may be determined in accordance with the identified user gesture recognized by classifier 60. It may then comprise information that causes the automated personal assistant 250 to retrieve information from a database and output this retrieved information in a form suitable for reception by user 249.
(37) In further embodiments, it may be envisioned that instead of the automated personal assistant 250, control system 40 controls a domestic appliance (not shown) controlled in accordance with the identified user gesture. The domestic appliance may be a washing machine, a stove, an oven, a microwave or a dishwasher.
(38) Shown in
(39) Shown in
(40) Shown in
(41) Shown in
(42) Classifier 60 is configured to compute output signals ŷ.sub.i from input signals x.sub.i. These output signals ŷ.sub.i are also passed on to assessment unit 180.
(43) A modification unit 160 determines updated parameters ϕ depending on input from assessment unit 180. Updated parameters ϕ are transmitted to parameter storage St.sub.1 to replace present parameters ϕ.
(44) Furthermore, training system 140 may comprise a processor 145 (or a plurality of processors) and at least one machine-readable storage medium 146 on which instructions are stored which, if carried out, cause training system 140 to carry out a method according to one aspect of the invention.
(45) Shown in
(46) Shown in
(47) Then (1010), optimum values Q* for the optimization problem stated as the inner minimax problem in equation (5) (or (7), in case constraints are provided) are computed. In addition, a matrix Ψ is computed. Details of this computation are discussed in connection with
(48) Next (1020), an increment dθ=−Ψ(Q*.sup.T1−y.sub.T) is computed, with y.sub.T=(y.sub.1, . . . , y.sub.n).sup.T being the vector with the classifications of the training data set.
(49) Then (1030), it is checked whether the method has converged, e.g., by checking whether an absolute value of increment dθ is less than a predefined threshold.
(50) If the method has converged, the algorithm is stopped and training is complete (1060).
(51) If not, in optional step (1040), the increment dθ is taken as an increment to parameters θ.sub.f of fully-connected layer 64 and backpropagated through the remaining network, i.e., through layers 63, 62 and 61, to obtain an increment dw to parameters w, and the method continues with step (1050). Alternatively, parameters w can remain fixed and the method branches directly from step (1030) to step (1050).
(52) In step (1050), parameters θ, θ.sub.f, and w are updated as θ←θ+dθ, w←w+dw, θ.sub.f←θ.
(53) Then, the method continues with step (1010) and iterates until the method is concluded in step (1060).
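The iteration of steps (1010) to (1060) can be sketched as a plain loop. This is a minimal sketch under several assumptions, not the patented procedure: the increment is read as the negative product of the feature matrix with the difference between the column sums of Q* and the training labels; the feature matrix `psi` is stored with one column per sample; a step size is added; parameters w stay fixed (optional step 1040 is skipped); and `toy_inner` is a hypothetical stand-in for the linear-program solver, which is not reproduced here.

```python
import math


def column_sums(q):
    """Return Q^T 1, i.e. the column sums of the square matrix Q."""
    n = len(q)
    return [sum(q[i][k] for i in range(n)) for k in range(n)]


def train_multiplier(psi, y, solve_inner, step=0.1, tol=1e-6, max_iter=1000):
    """Iterate the multiplier until the increment falls below a threshold."""
    m, n = len(psi), len(y)
    theta = [0.0] * m
    norms = []
    for _ in range(max_iter):
        q_star = solve_inner(psi, theta)                      # step 1010
        residual = [s - t for s, t in zip(column_sums(q_star), y)]
        d_theta = [-sum(psi[j][i] * residual[i] for i in range(n))
                   for j in range(m)]                         # step 1020
        norms.append(math.sqrt(sum(d * d for d in d_theta)))
        if norms[-1] < tol:                                   # step 1030
            break                                             # step 1060
        theta = [t + step * d for t, d in zip(theta, d_theta)]  # step 1050
    return theta, norms


def toy_inner(psi, theta):
    """Hypothetical inner solver: a diagonal Q holding sigmoid scores."""
    m, n = len(psi), len(psi[0])
    scores = [sum(psi[j][i] * theta[j] for j in range(m)) for i in range(n)]
    p = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    return [[p[i] if i == k else 0.0 for k in range(n)] for i in range(n)]


# Two features (a constant and a scalar input), four samples, not separable.
psi = [[1.0, 1.0, 1.0, 1.0],
       [-1.0, -0.5, 0.5, 1.0]]
y = [0.0, 1.0, 0.0, 1.0]
theta, norms = train_multiplier(psi, y, toy_inner)
```

With this stand-in solver the loop reduces to gradient ascent for logistic regression, which is used here only because it makes the convergence check easy to observe.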
(54) Shown in
(55) First (2010), based n×n matrices D, E, F are provided as
(56)
(57) Then (2020), Z(Q) is provided as a symbolic expression as
(58)
(59) Next (2030), a linearly transformed expression Z(Q) is provided from Z(Q) via Z(Q)=Z(Q)·diag(1, . . . , n).
(60) Furthermore, c(Q) is computed as c(Q)=0 in case the special cases as defined in equations (S1) and (S2) do not need to be enforced. If (S1) is to be enforced, Z(Q) is increased by
(61)
and c(Q) becomes
(62)
with Id being an n×n-dimensional identity matrix.
(63) If (S2) is to be enforced, Z(Q) is increased by an n×n-dimensional matrix E that is 0 everywhere, except at position (n, n) where it is set to Q.sub.nn.
(64) Now (2040), all input signals x.sub.i in dataset T are propagated through classifier (60) to yield feature vectors ϕ.sub.1(x.sub.i). An n×m matrix Ψ (with n being the number of data samples in dataset T and m being the number of features) is formed, the columns of which denote the features of each sample as
Ψ.sub.:,i=ϕ.sub.1(x.sub.i)
and a matrix W is computed as
W=Ψ.sup.Tθ1.sup.T.
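The construction of matrix W in step (2040) can be sketched as follows, assuming that W is the product of the transposed feature matrix, a multiplier vector, and an all-ones row vector, and assuming that the feature matrix is stored with one column per sample. Both readings are assumptions, and the names `psi`, `theta` and `weight_matrix` are hypothetical.

```python
def weight_matrix(psi, theta):
    """Build W as (Psi^T theta) times an all-ones row vector.

    psi is a list of m rows with n entries each (one column per sample);
    theta is a list of m multipliers. The result is an n-by-n matrix whose
    i-th row repeats the score of sample i.
    """
    m, n = len(psi), len(psi[0])
    # Psi^T theta: one scalar score per sample.
    scores = [sum(psi[j][i] * theta[j] for j in range(m)) for i in range(n)]
    # Outer product with the all-ones vector: repeat each score n times.
    return [[scores[i]] * n for i in range(n)]


psi = [[1.0, 2.0, 3.0],
       [0.0, 1.0, 0.0]]
theta = [2.0, -1.0]
W = weight_matrix(psi, theta)
```

Under this reading W has the same n×n shape as Q, so that the two can be combined entry by entry.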
(65) In case equation (7) is to be solved, the resulting output values of classifier (60) are also stored as ŷ.sub.i.
(66) Next (2050), in case equation (5) is to be solved, Q* is computed as the optimum value of the linear program
(67)
(68) In case equation (7) is to be solved, matrices B.sup.(i) and scalars μ.sub.i are defined for each constraint of equation (7) by computing E.sub.{tilde over (P)}(X,Y);P(Ŷ)[metric.sup.i(Ŷ,Y)]=:⟨B.sup.(i),P⟩+μ.sub.i
(69) This is done by defining for each constraint i the vectors D.sub.k.sup.i=
(70)
for l=Σ.sub.i y.sub.i and setting:
(71)
(72) If neither (S1) nor (S2) are enforced for any i.
(73) If (S1) is enforced, the above-mentioned expression remains the same as long as l=Σ.sub.i y.sub.i>0. If l=0, the above variables are set as
(74)
(75) If (S2) is enforced, the above-mentioned expression (prior to the S1 special case) remains the same as long as l=Σ.sub.i y.sub.i<n. If l=n, we choose μ.sub.i=0 and B.sup.(i) as an n×n-dimensional matrix that is 0 everywhere except at position (n, n) where it is 1.
(76) Then, Q* is obtained as the optimum value by solving the linear program
(77)
(78) This concludes the method.
(79) The term computer covers any device for the processing of predefined calculation instructions. These calculation instructions can be in the form of software, or in the form of hardware, or also in a mixed form of software and hardware.
(80) It is further understood that the procedures can be implemented not only completely in software as described, but also in hardware, or in a mixed form of software and hardware.