Method for generating a model ensemble for calibrating a control device
11203962 · 2021-12-21
Inventors
- Amra Suljanovic (Graz, AT)
- Hans-Michael Kögeler (Graz, AT)
- Stefan Jakubek (Vienna, AT)
- Nico Didcock (Graz, AT)
CPC classification
F01N9/005
MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
F02D2041/1437
MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
F02D41/2432
MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
F02D41/1461
MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
G06Q10/04
PHYSICS
International classification
F01N9/00
MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
F02D41/24
MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
G06Q10/06
PHYSICS
Abstract
A method for generating a model ensemble that estimates at least one output variable of a physical process as a function of at least one input variable, the model ensemble being formed from a sum of model outputs from a plurality of models that have been weighted with a weighting factor.
Claims
1. A method for calibrating a technical system controllable by control variables with a model ensemble that estimates at least one output variable (y) of the technical system as a function of at least one input variable (u), comprising: forming the model ensemble from a sum of model outputs (ŷ.sub.j) from a plurality (j) of models (M.sub.j) that have been weighted with a weighting factor (w.sub.j); determining for each model (M.sub.j) an empirical complexity measurement (c.sub.j), which evaluates the deviation of the model output variable (ŷ.sub.j) from the output variable (y) of the actual physical process over a specified input variable range (U), and a model error (E.sub.j), wherein the empirical complexity measurement (c.sub.j) is weighted with a complexity aversion parameter (α.sub.K); forming a surface information criterion (SIC.sub.j, SIC) from the empirical complexity measurement (c.sub.j) and the model error (E.sub.j), from which the weighting factor (w.sub.j) for the model ensemble is determined; and calibrating the technical system using the model ensemble by setting the control variables of the technical system to ensure an optimized at least one output variable (y) of the technical system during operation.
2. The method according to claim 1, wherein the mean square error (MSE.sub.j) between the output variables (y) of the physical process measured at N input variables (u) and the model output variables calculated at these N input variables is used as model error (E.sub.j) of a model (M.sub.j) according to the relationship MSE.sub.j=1/N.Math.Σ.sub.i=1.sup.N(y(u.sub.i)−ŷ.sub.j(u.sub.i)).sup.2.
3. The method according to claim 1, wherein the empirical complexity measurement (c.sub.j) of a model (M.sub.j) is calculated using the formula
4. The method according to claim 1, wherein the weighting factors (w.sub.j) of each model (M.sub.j) of the model ensemble are calculated using the surface information criterion (SIC.sub.j).
5. The method according to claim 1, wherein for the model ensemble the surface information criterion (SIC) is formed from an error matrix (F) that includes the model error (E.sub.j) of the models (M.sub.j) and a complexity measurement matrix (C) that includes the empirical complexity measurement (c.sub.j) of the models (M.sub.j), wherein, according to the formula SIC={w.sup.TFw+w.sup.TCw}, the error matrix (F) and the complexity measurement matrix (C) are each weighted twice using a weighting vector (w) that includes the weighting factors (w.sub.j) of the models (M.sub.j), and the surface information criterion (SIC) of the model ensemble is minimized with respect to the weighting factors (w.sub.j).
6. The method according to claim 5, wherein the error matrix (F) is calculated as the matrix product of a matrix (E) with itself according to F=E.sup.TE, wherein the matrix (E) is calculated using the formula E=(y(u.sub.i)−ŷ.sub.j(u.sub.i)).
7. The method according to claim 5, wherein the complexity measurement matrix (C) is weighted using a complexity aversion parameter (α.sub.K).
8. The method according to claim 7, wherein the weighting factors (w.sub.j) are calculated for different complexity aversion parameters (a.sub.K) and the weighting vector (w.sub.a.sub.
9. The method according to claim 7, wherein weighting vectors (w.sub.a.sub.
10. The method according to claim 6, wherein the complexity measurement matrix (C) is weighted using a complexity aversion parameter (α.sub.K).
11. A combustion engine comprising a control device calibrated by a model ensemble generated by the method of claim 1.
12. The method according to claim 1, wherein the output variable comprises an emission variable.
13. A method of calibrating a technical system controllable by control variables using a model ensemble that estimates an output variable of the technical system as a function of an input variable, comprising: forming the model ensemble from a sum of model outputs of a plurality of models that have been weighted with a weighting factor; determining an empirical complexity measurement for each of the plurality of models that evaluates a deviation of the model output from the output variable over a specified input variable range, wherein the empirical complexity measurement is weighted with a complexity aversion parameter; determining a model error for each of the plurality of models; forming a surface information criterion from the empirical complexity measurement and the model error from which the weighting factor is determined; calibrating the technical system using the model ensemble by setting the control variables of the technical system to ensure an optimized output variable of the technical system during operation.
14. A combustion engine comprising a control device calibrated by the method of claim 13.
15. The method according to claim 13, wherein the output variable comprises an emission variable.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The present invention is explained below with reference to the accompanying drawings.
(2)
(3)
(4)
(5)
DETAILED DESCRIPTION
(6) A model ensemble 1, as illustrated in
(7) Using model ensemble 1, or the models M.sub.j included therein, an output variable of a physical process, e.g. an emission or consumption variable of a combustion engine such as the NOx emission, the CO or CO.sub.2 emission or the fuel consumption, is estimated as output variable ŷ of model ensemble 1, or as model output variable ŷ.sub.j of a model M.sub.j. In the following description, for the sake of simplicity, a single output variable y is assumed without limitation of general applicability; an output variable vector y made up of a plurality of output variables y is of course also possible.
(8) In model ensemble 1, each model output variable ŷ.sub.j is weighted with a weighting factor w.sub.j and output variable ŷ of model ensemble 1 is the weighted sum of model output variables ŷ.sub.j of individual models M.sub.j in the form
(9) ŷ(u)=Σ.sub.jw.sub.j.Math.ŷ.sub.j(u)
In the description, for simplicity's sake, ŷ and ŷ.sub.j are also used, respectively, instead of the correct notation ŷ(u) and ŷ.sub.j(u). With respect to weighting factors w.sub.j, the boundary conditions w.sub.j∈[0,1] and
(10) Σ.sub.jw.sub.j=1
are preferably to be taken into consideration. The problem is thus presented of how to best determine weighting factors w.sub.j so that output variables y of the physical process are approximated by model ensemble 1 or by its output variable ŷ, as best as possible. The goal here, of course, is for model ensemble 1 to estimate output variable y of the physical process over the complete input variable range U, or the range of interest, better than the best model M.sub.j of model ensemble 1.
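The weighted-sum construction above can be sketched in a few lines of Python. The individual models M.sub.j are stood in for by arbitrary hypothetical functions and the weights are illustrative only; nothing here is taken from the patent itself.

```python
# Model ensemble output: y_hat(u) = sum_j w_j * y_hat_j(u),
# subject to the boundary conditions w_j in [0, 1] and sum_j w_j = 1.

def ensemble_output(models, weights, u):
    """Weighted sum of the individual model outputs at input u."""
    return sum(w * m(u) for m, w in zip(models, weights))

# Hypothetical stand-in models M_1..M_3 (placeholders, not from the patent).
models = [lambda u: 2.0 * u, lambda u: u ** 2, lambda u: 1.0 + u]
weights = [0.5, 0.3, 0.2]  # satisfy the boundary conditions above

print(round(ensemble_output(models, weights, 2.0), 2))  # 0.5*4 + 0.3*4 + 0.2*3 = 3.8
```

The ensemble itself needs no knowledge of the internal structure of the individual models; it only combines their outputs, which is exactly what makes the weighting factors the central quantity to determine.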
(11)
(12) This basic relationship is illustrated in
(13) In order to evaluate the complexity of the jth model M.sub.j, an empirical complexity measurement c.sub.j is used according to the invention that does not evaluate the model structure as in the prior art, but instead evaluates the deviation of model output variable ŷ.sub.j from the output variable y of the physical process over a specified input variable range U. In contrast to a model error E, which relates to the deviation between model M.sub.j and the physical process at specific measured data points, empirical complexity measurement c.sub.j evaluates the deviation over a complete input variable range U, thus specifically also between the measured data points. Different approaches are available for an evaluation of this sort.
(14) In a first approach, the surface of the model output variable ŷ.sub.j over the input variable range U is used for evaluation. The inventive idea behind this can also be explained in reference to
(15)
(16) In this, ∇ is the known Nabla operator with respect to the input variables in the input variable vector u, therefore
(17) ∇=(∂/∂u.sub.1, . . . , ∂/∂u.sub.n).sup.T
The integral is determined over a specified input variable range u∈U, preferably over the whole range. This integral increases monotonically with the surface of model output variable ŷ.sub.j. The surface of model output variable ŷ.sub.j over input variable range U is thus evaluated here as empirical complexity measurement c.sub.j.
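The idea of rating a model by the surface of its output over the input range can be illustrated with a crude one-dimensional sketch: the integral of the gradient magnitude is approximated by summed absolute first differences on a grid. This is an illustrative stand-in under that discretization assumption, not the patent's exact integral; the two example models are invented.

```python
import math

def surface_complexity(model, u_grid):
    """Approximate the integral of |d y_hat/du| over the input range by
    summing absolute first differences on a discretization grid; a
    wigglier (more complex) model surface yields a larger value."""
    ys = [model(u) for u in u_grid]
    return sum(abs(b - a) for a, b in zip(ys, ys[1:]))

grid = [i / 100.0 for i in range(101)]               # u in [0, 1]
smooth = lambda u: u                                 # low-complexity model
wiggly = lambda u: u + 0.2 * math.sin(20.0 * u)      # high-complexity model

print(surface_complexity(smooth, grid) < surface_complexity(wiggly, grid))  # True
```

Because the grid covers the whole input range, the measurement also penalizes oscillations between the measured data points, which is precisely what distinguishes it from a classical model error evaluated only at the data points.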
(18) As an alternative empirical complexity measurement c.sub.j, which evaluates the deviation of model M.sub.j or of model output variable ŷ.sub.j from output variable y of the physical process, the variance of the model output variables ŷ.sub.j can be employed. The variance (also designated as the second central moment of a random variable) is, as is well known, the expected square deviation of a random variable from its expected value. Applied to the present invention, the model output variable ŷ.sub.j at the available N data points is compared, using the variance, to the model output variable ŷ.sub.j between these data points, which is designated here as variability. The idea behind this is that a model M.sub.j having an increased variability generally predicts the underlying physical process over input variable range U worse than a model M.sub.j having a lower variability. The reason is that the better model M.sub.j approximates the measured data points, i.e. the more complex model M.sub.j becomes, the greater the probability of an increased variability becomes; if the variability becomes too large, the risk of overfitting for model M.sub.j therefore also increases. The typical behavior of such an overfitted or too-complex model M.sub.j is a strongly varying model output variable ŷ.sub.j over input variable range U, which in turn can lead to a larger deviation between actual output variable y and model output variable ŷ.sub.j. This variability based on the variance can be mapped onto empirical complexity measurement c.sub.j if empirical complexity measurement c.sub.j is calculated according to the following formula.
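A minimal sketch of the variance idea, assuming a simple sample variance over a dense evaluation grid as a variability proxy (the patent's exact variance-based formula is not reproduced here; grid and models are invented):

```python
def variance_complexity(model, u_grid):
    """Crude variability proxy: sample (population) variance of the model
    output evaluated on a dense grid over the input range. A strongly
    varying model surface scores higher than a flat one."""
    ys = [model(u) for u in u_grid]
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)

grid = [i / 100.0 for i in range(101)]                   # u in [0, 1]
print(variance_complexity(lambda u: 3.0, grid))          # flat model: 0.0
print(round(variance_complexity(lambda u: u, grid), 6))  # 0.085
```

A constant model has zero variability; any oscillation of the model surface between the measured data points increases the measurement, in line with the overfitting argument above.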
(19)
(20) It is clear that there are additional possibilities for evaluating the deviation between model M.sub.j and the physical process, or output variable y of the process and the model output variable ŷ.sub.j. The basic idea remains unaltered, namely, the idea that the larger the empirical complexity measurement c.sub.j, the more complex basic model M.sub.j is. Empirical complexity measurement c.sub.j therefore also evaluates the complexity of model M.sub.j.
(21) According to the invention, a surface information criterion SIC.sub.j of the jth model M.sub.j is derived from empirical complexity measurement c.sub.j, which, analogous to the Akaike Information Criterion AIC of the prior art, is again formed from model error E.sub.j of model M.sub.j and empirical complexity measurement c.sub.j, therefore SIC.sub.j=E.sub.j+α.sub.K.Math.c.sub.j. The mean square error
(22) MSE.sub.j=1/N.Math.Σ.sub.i=1.sup.N(y(u.sub.i)−ŷ.sub.j(u.sub.i)).sup.2
for example, can again be used as model error E.sub.j, wherein any other model error E.sub.j, such as the mean absolute deviation, could obviously also be used.
(23) The parameter α.sub.K∈[0,∞) preferably used in surface information criterion SIC.sub.j serves as a complexity aversion parameter. It represents the only degree of freedom with which the complexity of a model M.sub.j of model ensemble 1 can be further penalized. The larger the complexity aversion parameter α.sub.K becomes, the more heavily complexity enters into the surface information criterion SIC.sub.j. Small complexity aversion parameters α.sub.K therefore favor more complex models M.sub.j, meaning models M.sub.j having more degrees of freedom (number of model parameters p.sub.j).
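The role of the complexity aversion parameter can be illustrated with a minimal Python sketch of SIC.sub.j=E.sub.j+α.sub.K.Math.c.sub.j; the error and complexity values of the two hypothetical models are invented for illustration only.

```python
def sic(model_error, complexity, alpha_k):
    """Surface information criterion of a single model:
    SIC_j = E_j + alpha_K * c_j."""
    return model_error + alpha_k * complexity

# Two hypothetical models: a simple one with a larger model error and a
# complex one that fits the measured data better (numbers are invented).
simple_model = {"error": 0.40, "complexity": 1.0}
complex_model = {"error": 0.10, "complexity": 5.0}

# A small alpha_K favors the complex model, a large one the simple model.
for alpha in (0.01, 0.2):
    best = min((simple_model, complex_model),
               key=lambda m: sic(m["error"], m["complexity"], alpha))
    print(alpha, "simple" if best is simple_model else "complex")
    # 0.01 -> complex, 0.2 -> simple
```

This reproduces the trade-off described above: the single tuning knob α.sub.K shifts the criterion between rewarding fit and penalizing complexity.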
(24) Analogous to the known Akaike Information Criterion, weighting factors w.sub.j can again be determined from
(25)
wherein w.sub.j∈[0,1] and
(26) Σ.sub.jw.sub.j=1
can preferably be considered as boundary conditions. Although a model ensemble 1 formed in this way already approximates the actual process better under the given conditions, meaning with a smaller error, than a model formed using the Akaike Information Criterion AIC, the quality of model ensemble 1 can be further improved according to the invention, using the approach explained below.
(27) It can be shown that the mean square model error MSE and the empirical complexity measurement c of model ensemble 1 with respect to a weighting vector w, which includes weighting factors w.sub.j of the j models M.sub.j, can each be represented as a quadratic function of model error E.sub.j and empirical complexity measurement c.sub.j of models M.sub.j, in the form SIC={w.sup.TFw+α.sub.Kw.sup.TCw}. Within this, the optional complexity aversion parameter α.sub.K represents a degree of freedom in the determination of weighting factors w.sub.j of the j models M.sub.j.
(28) In this context, F designates an error matrix that includes model error E.sub.j of models M.sub.j, and C a complexity measurement matrix that includes empirical complexity measurement c.sub.j of models M.sub.j. In the case of mean square error MSE as model error E.sub.j and with a matrix E=(y(u.sub.i)−ŷ.sub.j(u.sub.i)), for all i∈N data points and all j models, error matrix F results as the product of matrix E with itself, according to F=E.sup.TE. Depending upon the empirical complexity measurement c.sub.j chosen, complexity measurement matrix C results, for example, in
(29)
each having model output variable vector ŷ.sub.a, which contains model output variables ŷ.sub.j of j models, thus ŷ.sub.a={ŷ.sub.1 . . . ŷ.sub.j}. Matrices F and C can thus be calculated in advance and, above all, without knowledge of models M.sub.j or their model structure or the number of model parameters p.sub.j.
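The construction of the error matrix F=E.sup.TE and the evaluation of the quadratic form w.sup.TFw+α.sub.K.Math.w.sup.TCw can be sketched as follows. The residual entries of E and the complexity measurement matrix C are invented placeholders, since the concrete C depends on which empirical complexity measurement is chosen.

```python
def mat_t(m):           # transpose of a matrix given as list of rows
    return [list(r) for r in zip(*m)]

def mat_mul(a, b):      # matrix product a @ b
    bt = mat_t(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def quad_form(w, m):    # w^T M w
    n = len(w)
    return sum(w[i] * m[i][j] * w[j] for i in range(n) for j in range(n))

# Residual matrix E with entries y(u_i) - y_hat_j(u_i):
# N = 3 data points, j = 2 models (illustrative numbers only).
E = [[0.1, -0.3],
     [0.0,  0.2],
     [-0.1, 0.1]]
F = mat_mul(mat_t(E), E)         # error matrix F = E^T E (2 x 2, symmetric)

C = [[1.0, 0.0], [0.0, 4.0]]     # hypothetical complexity measurement matrix
w = [0.7, 0.3]                   # weighting vector, sums to 1
alpha_k = 0.1

sic_ensemble = quad_form(w, F) + alpha_k * quad_form(w, C)
print(round(sic_ensemble, 6))    # 0.0906
```

Note that, as stated above, F and C can be computed entirely from measured data and model outputs, without any knowledge of the model structures or parameter counts.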
(30) For determining weighting factors w.sub.j (or, analogously, weighting vector w), surface information criterion SIC of model ensemble 1 for a specified complexity aversion parameter α.sub.K can be optimized with regard to weighting factors w.sub.j, in particular minimized. An optimization problem in the form
(31) min.sub.w{w.sup.TFw+α.sub.K.Math.w.sup.TCw}
can be derived from this.
(32) As can easily be recognized, this is a quadratic optimization problem that can be solved quickly and efficiently for a predetermined complexity aversion parameter α.sub.K using available standard solution algorithms. The conditions w.sub.j∈[0,1] and
(33) Σ.sub.jw.sub.j=1
preferably apply as boundary conditions for optimization. Any initial weighting vector w can be specified.
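As a stand-in for the standard quadratic-programming solvers mentioned above, the constrained minimization can be illustrated for the two-model case by a brute-force search over the weight simplex; the matrices F and C are illustrative placeholders, and a real implementation would use a proper QP solver instead.

```python
def sic_value(w, F, C, alpha_k):
    """Ensemble criterion w^T F w + alpha_K * w^T C w."""
    n = len(w)
    qf = sum(w[i] * F[i][j] * w[j] for i in range(n) for j in range(n))
    qc = sum(w[i] * C[i][j] * w[j] for i in range(n) for j in range(n))
    return qf + alpha_k * qc

def minimize_on_simplex(F, C, alpha_k, steps=100):
    """Coarse grid search over w1 in [0, 1] with w2 = 1 - w1 (two models),
    so both boundary conditions hold by construction."""
    best_w, best_val = None, float("inf")
    for i in range(steps + 1):
        w = [i / steps, 1 - i / steps]
        val = sic_value(w, F, C, alpha_k)
        if val < best_val:
            best_w, best_val = w, val
    return best_w, best_val

F = [[0.02, -0.04], [-0.04, 0.14]]   # illustrative error matrix
C = [[1.0, 0.0], [0.0, 4.0]]         # illustrative complexity matrix
w_opt, sic_min = minimize_on_simplex(F, C, alpha_k=0.1)
print(w_opt, round(sic_min, 4))      # w ~ [0.78, 0.22]
```

Because the objective is a quadratic form in w and the constraints are linear, any standard QP routine finds the same minimizer far more efficiently than this grid sketch.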
(34) The result of the optimization of Surface Information Criterion SIC of model ensemble 1 for determination of weighting factors w.sub.j is described in reference to
(35) In
(36) It can also be deduced from the diagram on the right in
(37) To accomplish this, the associated weighting vectors w.sub.α.sub.
(38)
(39) Using the known Mallows equation, that complexity aversion parameter α.sub.K is chosen as optimum complexity aversion parameter α.sub.K,opt which solves the following optimization problem
(40) α.sub.K,opt=argmin.sub.α.sub.K{w.sub.α.sub.K.sup.TFw.sub.α.sub.K+2σ.sup.2.Math.p.sup.Tw.sub.α.sub.K}
(41) Within this, F is again the error matrix (F=E.sup.TE) and σ is the standard deviation of the available data points, which is generally not known. There are, however, known methods (as described in Hansen, B. E., "Least squares model averaging," Econometrica, 75(4), 2007, pp. 1175-1189, for example) to estimate the standard deviation σ from the available data points. Vector p again includes the number of model parameters p.sub.j for all j models M.sub.j. Knowledge of models M.sub.j or their model structures is therefore required for this step.
(42) This optimization is not, however, solved directly, but with respect to the initially determined set of weighting vectors
(43)
This means that the weighting vector w associated with a specific complexity aversion parameter α.sub.K is selected as optimum weighting vector w.sub.opt, namely the one which yields the minimal expression
(44)
(45) In
(46) A model ensemble determined according to the invention is used, for example, for calibrating a technical system, such as a combustion engine. In the calibration, in order to optimize at least one output variable of the technical system, the control variables by which the technical system is controlled are varied in a specified operational state of the technical system that is defined by state variables or a state variable vector. The optimization of output variables by variation of the control variables is generally formulated and solved as an optimization problem, for which sufficient known methods exist. The control variables determined in this manner are stored as a function of the respective operational state, for example in the form of characteristic maps or tables. This relationship can then be used to control the technical system as a function of the actual operational state, which is measured or otherwise determined, for example estimated. This means that the stored control variables for the relevant operational state are read out from the stored relationship and used to control the technical process. In the case of a combustion engine as the technical system, the operational state is often described using measurable variables such as speed and torque, wherein other variables such as engine coolant temperature, ambient temperature, etc., can also be used. In a combustion engine, the position of a variable-turbine-geometry turbocharger, the position of an exhaust-gas recirculation system or the injection timing are often used as control variables. The output variable to be optimized in a combustion engine is typically a consumption and/or emission variable (for example, NOx, CO, CO.sub.2, etc.). Calibration of a combustion engine thus ensures, by setting correct control variables, that consumption and/or emissions during operation are minimal.
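The read-out of stored control variables described above can be sketched with a toy characteristic map. All operating points, control-variable names, and values below are invented for illustration; a production ECU would interpolate between map nodes rather than pick the nearest one.

```python
# Characteristic map: optimized control variables stored per operating
# point (speed [rpm], torque [Nm]) -- all numbers are illustrative.
calibration_map = {
    (1500,  50): {"egr_position": 0.35, "injection_timing_deg": -4.0},
    (1500, 100): {"egr_position": 0.25, "injection_timing_deg": -6.0},
    (3000, 100): {"egr_position": 0.15, "injection_timing_deg": -9.0},
}

def lookup_controls(speed, torque):
    """Read out the stored control variables for the operating point
    closest to the measured state (nearest-neighbour lookup as a
    simplification of the map interpolation used in practice)."""
    key = min(calibration_map,
              key=lambda k: (k[0] - speed) ** 2 + (k[1] - torque) ** 2)
    return calibration_map[key]

print(lookup_controls(1450, 95))   # controls stored for (1500, 100)
```

The stored values themselves would come from solving the optimization problem per operating point with the model ensemble standing in for the real engine.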