METHOD AND DEVICE FOR TESTING A TECHNICAL SYSTEM

Abstract

A method for testing a technical system. The method includes: carrying out tests with the aid of a simulation of the system; evaluating the tests with respect to a fulfillment measure of a quantitative requirement on the system and an error measure of the simulation; classifying the tests as either reliable or unreliable on the basis of the fulfillment measure and the error measure; and improving a test database on the basis of the classification.

Claims

1. A method for testing a technical system, comprising the following steps: carrying out tests using a simulation of the system; evaluating the tests with respect to a fulfillment measure of a quantitative requirement on the system and an error measure of the simulation; based on the fulfillment measure and the error measure, carrying out a classification of the tests as either reliable or unreliable; and improving a test database based on the classification.

2. The method as recited in claim 1, wherein the technical system is an at least semi-autonomous robot or vehicle.

3. The method as recited in claim 1, wherein: the classification is carried out by a classifier based on a feature vector; and the fulfillment measure and the error measure form components of the feature vector.

4. The method as recited in claim 3, wherein: the classifier maps the feature vector on one of multiple classes; and the classification takes place within predefined decision limits between the classes.

5. The method as recited in claim 4, wherein: in a preparation phase, the simulation is confirmed by an experimental measurement on the system; and the decision limits are drawn in such a way that the fulfillment measure taken, on the one hand, in the simulation and, on the other hand, in the measurement deviates as little as possible.

6. The method as recited in claim 5, wherein further tests to be carried out in the preparation phase are selected automatically.

7. The method as recited in claim 4, wherein: the classifier is defined by solving an equation system; and the equation system includes definition equations of the fulfillment measure and the error measure.

8. The method as recited in claim 1, wherein the evaluation is carried out in such a way that the fulfillment measure is positive when the system meets the requirement, and negative when the system fails the requirement.

9. The method as recited in claim 1, wherein: for specific parameters of each of the tests, the fulfillment measure and the error measure are each represented in a feature space spanned by the parameters; and after the evaluation, the classification is visualized in the feature space.

10. The method as recited in claim 1, wherein an automatic improvement of errors of the system recognized by the testing takes place.

11. A non-transitory machine-readable storage medium on which is stored a computer program for testing a technical system, the computer program, when executed by a computer, causing the computer to perform the following steps: carrying out tests using a simulation of the system; evaluating the tests with respect to a fulfillment measure of a quantitative requirement on the system and an error measure of the simulation; based on the fulfillment measure and the error measure, carrying out a classification of the tests as either reliable or unreliable; and improving a test database based on the classification.

12. A device configured to test a technical system, the device configured to: carry out tests using a simulation of the system; evaluate the tests with respect to a fulfillment measure of a quantitative requirement on the system and an error measure of the simulation; based on the fulfillment measure and the error measure, carry out a classification of the tests as either reliable or unreliable; and improve a test database based on the classification.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] Exemplary embodiments of the present invention are illustrated in the figures and explained in greater detail in the description below.

[0018] FIG. 1 shows a virtual test classifier, in accordance with an example embodiment of the present invention.

[0019] FIG. 2 shows a first approach for generating the decision limit of the classifier on the basis of data, in accordance with an example embodiment of the present invention.

[0020] FIG. 3 shows a second approach for generating the decision limit of the classifier on the basis of a formal solution, in accordance with an example embodiment of the present invention.

[0021] FIG. 4 shows the input space of a test and valid parameter ranges and regions for each class, in accordance with an example embodiment of the present invention.

[0022] FIG. 5 shows the use of the classifier to improve the simulation test database, in accordance with an example embodiment of the present invention.

[0023] FIG. 6 shows additional tests along the decision limit between the classes, in accordance with an example embodiment of the present invention.

[0024] FIG. 7 shows the additional tests in the classifier space, in accordance with an example embodiment of the present invention.

[0025] FIG. 8 shows the use of the classifier to improve the database of actual tests, in accordance with an example embodiment of the present invention.

[0026] FIG. 9 shows the visualization of a classification result in a feature space spanned by the test parameters, in accordance with an example embodiment of the present invention.

[0027] FIG. 10 schematically shows a control unit, in accordance with an example embodiment of the present invention.

DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

[0028] According to the present invention, in the context of a test X, which may be taken as a test case from a test catalog or obtained as an instance of a parametric test, simulation model error SMerrorX is evaluated and quantitative specification QSpec is assessed on the basis of a simulation of the system under test (SUT). The virtual test classifier uses SMerrorX and QSpec as inputs and makes a binary decision as to whether or not the test result based on the simulation is trustworthy.

[0029] In accordance with the usage typical in information technology, and in pattern recognition in particular, a classifier is to be understood here as any algorithm or mathematical function that maps a feature space onto a set of classes which were formed and delimited from one another in the course of a classification. To decide into which class an object is to be categorized (colloquially also "classified"), the classifier uses so-called class limits or decision limits. Where the distinction between method and instance is not important, the term "classifier" is used in technical language, and sometimes in the following, as equivalent to "classification."

[0030] FIG. 1 illustrates such a classification in the present exemplary application. In this case, each point corresponds to a test which was carried out in the course of the simulation and for which fulfillment measure 13 of the requirement QSpec and error measure 14 SMerrorX were calculated. QSpec is defined in this case so that it assumes a positive value if it may be inferred from the tests that the system meets the respective requirement (reference numeral 24), and negative if the system fails the requirement (reference numeral 25).

[0031] As may be seen from the figure, decision limit 19 of classifier 18 divides the space into four classes A, B, C and D. Tests of class A were passed by the system with high reliability. For tests of classes B and C, the simulation supplies only unreliable results; such tests are therefore to be carried out on the real system. Tests of class D were failed by the system with high reliability.

[0032] This virtual test classifier 18 is based on the consideration that a requirement which is only barely met in the simulation may only replace the testing of the real system if at most a marginal model error 14 is to be presumed. On the other hand, in the event of a high absolute value of fulfillment measure 13 of quantitative requirement “QSpec,” i.e., in the case of a specification which is greatly exceeded or clearly failed, a certain deviation of the simulation results from corresponding experimental measurements may be accepted.
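The four-way split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `classify_test` and `linear_limit`, the factor 2.0, and the assignment of class B to "passed in simulation but unreliable" and class C to "failed in simulation but unreliable" are hypothetical choices, since the text only states that B and C are the unreliable classes.

```python
from typing import Callable


def classify_test(fulfillment: float, model_error: float,
                  limit: Callable[[float], float]) -> str:
    """Return 'A', 'B', 'C' or 'D' for one simulated test.

    fulfillment: value of the fulfillment measure (QSpec) in the simulation.
    model_error: value of the error measure (SMerrorX) for this test.
    limit:       hypothetical decision limit; maps a model error to the
                 minimum |fulfillment| needed to trust the simulation.
    """
    reliable = abs(fulfillment) > limit(model_error)
    passed = fulfillment > 0
    if reliable:
        return "A" if passed else "D"
    # Assumed labeling: B = barely passed, C = barely failed.
    return "B" if passed else "C"


def linear_limit(model_error: float) -> float:
    # Assumed straight decision limit through the origin.
    return 2.0 * model_error


points = [(5.0, 1.0), (1.0, 1.0), (-1.0, 1.0), (-5.0, 1.0)]
labels = [classify_test(q, e, linear_limit) for q, e in points]
print(labels)
```

A clearly met or clearly failed requirement tolerates the same model error that makes a marginal result untrustworthy, which is the consideration the classifier is built on.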

[0033] Since this way of viewing things presumes the knowledge of model error SMerrorX of the simulation model, it is presumed that the latter was subjected to a verification and validation before the use of virtual test classifier 18. Within the scope of this validation—for example on the basis of a Gaussian process or in another way by machine learning—a generalized model is to be formed which supplies SMerrorX for a given X. It is to be noted that the trustworthiness of the simulation is decisively dependent on the correctness of this generalized model.

[0034] FIG. 2 illustrates one possible approach for generating decision limit 19 (FIG. 1) of classifier 18 on the basis of data. In the simplest case, limit 19 extends along a straight line through the origin. The slope of the straight line is preferably selected so that all points at which fulfillment measure 13 of quantitative requirement QSpec differs in sign between simulation 11 and real measurement 21, that is, essentially all tests 12 in which the simulation model fails, lie in areas C and B, and so that these areas are moreover as small as possible.
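A minimal sketch of this data-driven choice of slope follows, assuming that a test counts as reliable when |QSpec| exceeds slope times SMerrorX. The function names, the tuple layout, and the sample values are hypothetical. Picking the smallest slope that still covers every sign mismatch keeps areas B and C as small as a line through the origin allows.

```python
def fit_slope(tests):
    """Smallest slope that puts every sign mismatch in the unreliable area.

    tests: iterable of (qspec_sim, qspec_real, sm_error) tuples, one per
           test that was both simulated and measured (assumed layout).
    """
    slope = 0.0
    for q_sim, q_real, err in tests:
        sign_differs = (q_sim > 0) != (q_real > 0)
        if not sign_differs:
            continue
        if err <= 0:
            # A line through the origin cannot cover a mismatch at zero error.
            raise ValueError("sign mismatch with zero model error")
        slope = max(slope, abs(q_sim) / err)
    return slope


def is_reliable(q_sim, err, slope):
    """Reliable means strictly outside the wedge |q| <= slope * err."""
    return abs(q_sim) > slope * err


# Made-up validation data: (QSpec in simulation, QSpec measured, SMerrorX).
validation = [(4.0, 3.5, 1.0), (0.5, -0.2, 1.0),
              (-1.5, 0.3, 0.5), (-6.0, -5.0, 1.5)]
slope = fit_slope(validation)
flags = [is_reliable(q, e, slope) for q, _, e in validation]
print(slope, flags)
```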

[0035] Furthermore, a more general decision limit 19, for example a polynomial one, comes into consideration, whose function curve is fitted with the aid of linear programming in such a way that it meets the criterion of a virtual test classifier (VTC) 18. In this case as well, all points at which fulfillment measure 13 of quantitative requirement QSpec differs in sign between simulation 11 and real measurement 21, that is, essentially all tests 12 in which the simulation model fails, lie in areas C and B.

[0036] FIG. 3 illustrates the alternative approach of defining classifier 18 by solving 23 a formal equation system that is based on the definition equations of fulfillment measure 13 and error measure 14. The resulting function, which assigns a truth value to feature vector 13, 14 formed from these two measures, may be specified either deterministically or stochastically.

[0037] For the purposes of the following statements, $I$ is the set of inputs, $O$ is the set of outputs (under certain circumstances also including inputs), and $m_1, m_2 : I \to O$ are the system model and the real system as functions, which may only be observed for a finite number of inputs by simulation 11 or experimental measurement 21, respectively. Furthermore, $q : O \times O \to \mathbb{R}$ is simulation model error SMerrorX, i.e., distance or error measure 14 of two outputs corresponding to one another. Finally, $I_{\epsilon} := \{\, i \mid q(m_1(i), m_2(i)) = \epsilon \,\}$ is the set of all inputs for which error measure 14 assumes the value $\epsilon$.

[0038] Starting from these definitions, the deviation of fulfillment measure 13 of a requirement (denoted $p$ in the formulas) may be bounded from above, for each input $i \in I_{\epsilon}$, by a term which depends neither on $m_1$ nor on $m_2$:

[00001] $\forall \epsilon \;\, \forall i \in I_{\epsilon} : \; |p(m_{1}(i)) - p(m_{2}(i))| \;\leq\; \sup_{j \in I_{\epsilon}} |p(m_{1}(j)) - p(m_{2}(j))| \;\leq\; \sup_{(o_{1}, o_{2}) \in q^{-1}(\epsilon)} |p(o_{1}) - p(o_{2})| \qquad \text{(Formula 1)}$

[0039] For a value $\epsilon$ of error measure 14 and a value $\delta$ of fulfillment measure 13, classifier 18 therefore results as

[00002] $VTC(\epsilon, \delta) = \begin{cases} W & \text{if } |\delta| > \sup_{(o_{1}, o_{2}) \in q^{-1}(\epsilon)} |p(o_{1}) - p(o_{2})| \\ F & \text{otherwise} \end{cases} \qquad \text{(Formula 2)}$

[0040] Here, the simulation model is classified as reliable in the case of $VTC(\epsilon, \delta) = W$, in the sense that $m_1$ and $m_2$ agree with respect to $p$. It is to be noted that classifier 18 requires the inversion of $q$.
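As a worked illustration of the deterministic classifier, consider the special case (an assumption made here, not part of the description) in which the fulfillment measure is Lipschitz continuous with respect to the error measure, i.e. the deviation of the fulfillment measure between two outputs is at most a constant times their error measure. The supremum over the preimage of q is then at most that constant times the error value, so comparing the absolute fulfillment value against this product is a conservative stand-in for the exact supremum: whenever the sketch returns W, the exact classifier would return W as well. The names `vtc` and `lipschitz` are hypothetical.

```python
def vtc(eps: float, delta: float, lipschitz: float) -> str:
    """Conservative deterministic virtual test classifier.

    eps:       value of the error measure q for the test.
    delta:     value of the fulfillment measure p in the simulation.
    lipschitz: assumed constant L with |p(o1) - p(o2)| <= L * q(o1, o2).
    """
    # Upper bound on the supremum of |p(o1) - p(o2)| over all output
    # pairs whose error measure equals eps.
    bound = lipschitz * eps
    return "W" if abs(delta) > bound else "F"


# Example: scalar outputs, q(o1, o2) = |o1 - o2| and p(o) = 10.0 - o,
# so that L = 1. A simulated margin of 1.2 with model error 0.3 is
# trustworthy; a margin of 0.2 with the same model error is not.
print(vtc(eps=0.3, delta=1.2, lipschitz=1.0))
print(vtc(eps=0.3, delta=0.2, lipschitz=1.0))
```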

[0041] One main advantage of this representation is that virtual test classifier 18 may be formulated independently of $m_1$ and $m_2$, since it depends only on fulfillment measure 13 of the quantitative requirement and on error measure 14. Proceeding from a single error measure 14 and a plurality $n$ of quantitative requirements, $n$ virtual test classifiers 18 may thus be calculated, namely one for each requirement. The model therefore needs to be validated only once with respect to error measure 14, and not, for example, with regard to each individual requirement.

[0042] This observation may be generalized in a simple way to a plurality $m$ of error measures and a plurality $n$ of quantitative requirements, $m$ typically being very small and $n$ being large. In this case, $n \cdot m$ virtual test classifiers 18 may be calculated. If one of these classifiers 18 supplies value W, the simulation results may be considered to be reliable. This enables a more precise classification, since some error measures 14 may be more suitable for certain requirements than others.
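The combination rule can be sketched as follows, under one possible reading: for a given requirement, each error measure contributes its own classifier, and a single W among them is enough. The function name and the two example thresholds are hypothetical.

```python
def reliable_for_requirement(delta, error_values, bounds):
    """True if at least one per-error-measure classifier returns W.

    delta:        fulfillment measure of one requirement in the simulation.
    error_values: values of the m error measures for this test.
    bounds:       m functions (assumed given), each mapping an error value
                  to the threshold that |delta| must exceed.
    """
    return any(abs(delta) > bound(eps)
               for eps, bound in zip(error_values, bounds))


# Two hypothetical error measures; the second is better suited to this
# requirement and therefore yields a tighter threshold.
bounds = [lambda eps: 2.0 * eps, lambda eps: 0.5 * eps]
print(reliable_for_requirement(1.0, [1.0, 1.0], bounds))
print(reliable_for_requirement(0.4, [1.0, 1.0], bounds))
```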

[0043] Alternatively, virtual test classifier 18 may be defined in a stochastic framework in which the inputs are assumed to be randomly distributed according to an arbitrary probability density function. For this purpose, $F_{\epsilon}(\delta) := P(\,|p(m_1(i)) - p(m_2(i))| \leq \delta \mid q(m_1(i), m_2(i)) = \epsilon\,)$ denotes the conditional cumulative distribution function of the deviation of fulfillment measure 13 under the assumption that error measure 14 assumes value $\epsilon$. For a threshold value $\tau \in (0,1)$ of the probability that classifier 18 makes the correct decision (value $\tau$ is therefore typically close to 1), virtual test classifier 18 may be defined as follows:

[00003] $VTC(\epsilon, \delta) = \begin{cases} W & \text{if } |\delta| > \inf F_{\epsilon}^{-1}(\tau) \\ F & \text{otherwise} \end{cases} \qquad \text{(Formula 3)}$
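In practice the conditional distribution function is not known in closed form. One way to approximate the stochastic classifier, sketched here as an assumption rather than as the method of the description, is to replace it with the empirical distribution of deviations observed in validation tests whose error measure is at or near the given error value. The threshold is then the smallest observed deviation at which the empirical distribution function reaches the chosen probability. Function names and sample data are hypothetical.

```python
import math


def empirical_quantile(deviations, tau):
    """Smallest d among the samples with empirical CDF(d) >= tau."""
    ordered = sorted(deviations)
    # Number of samples that must be <= d for the CDF to reach tau.
    needed = max(math.ceil(tau * len(ordered)), 1)
    return ordered[needed - 1]


def vtc_stochastic(delta, deviations, tau):
    """Empirical stand-in for the stochastic virtual test classifier.

    delta:      fulfillment measure of the test in the simulation.
    deviations: observed |p(m1(i)) - p(m2(i))| for validation inputs whose
                error measure is (approximately) the error value of the test.
    tau:        required probability of a correct decision, in (0, 1).
    """
    return "W" if abs(delta) > empirical_quantile(deviations, tau) else "F"


samples = [0.1, 0.4, 0.2, 0.9, 0.3]  # made-up validation deviations
threshold = empirical_quantile(samples, 0.8)
print(threshold, vtc_stochastic(0.5, samples, 0.8),
      vtc_stochastic(-0.3, samples, 0.8))
```

With few samples the estimate is coarse; a larger validation set, or a fitted distribution, would give a smoother threshold.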

[0044] The approach according to the present invention will now be explained from an application viewpoint. One aspect of this approach is based on the use of classifier 18 to expand the test database and parameter ranges of the simulation tests. For illustration, FIG. 4 shows the test cases classified according to FIG. 1 in an input space spanned by features X1 and X2 and the classifier applied thereto. The non-observed areas of the test space between the classes are apparent from this image.

[0045] FIG. 5 illustrates a method 10 for defining additional tests 12 closer to decision limits 19 (FIG. 4), for updating and in particular expanding the parameter ranges, and for determining parameter regions for which tests 12 may be considered to be reliable. The following assumptions are made:

[0046] Classifier 18 was defined according to a method as described with respect to FIGS. 2 and 3.

[0047] A set of tests 12 is given together with a definition of their input parameters, for example in the form of a test database, a test catalog, or parameterized test cases. These input parameters are restricted to those ranges, corresponding to classes A and D according to FIG. 1, in which simulation 11 is considered to be reliable.

[0048] Furthermore, a simulation model 11 is given.

[0049] Requirements QSpec are quantifiable and predefined and are implemented within the scope of a monitoring system which evaluates tests 12 with respect to fulfillment measure 13 of these requirements. In the figure, both fulfillment measures 13 relate to the same requirement QSpec, but evaluated once on the basis of simulation 11 and once in the course of experimental measurement 21 on the system.

[0050] SMerrorX is an error measure 14 which was defined beforehand. For some test inputs, simulation 11 and measurement 21 were thus already carried out, and error measure 14 generalizes the corresponding tests 12 to new experiments which have not yet been carried out, with a certain reliability that is determined, for example, by an upper and lower limit for error measure 14. For classifier 18 (FIGS. 1 through 3), only the most unfavorable, i.e., the highest, error measure 14 was used. It is to be noted that classifier 18 may be used to further refine error measure 14.

[0051] Under these assumptions, method 10 may be designed as follows:

[0052] 1. Tests 12 from the database or the catalog are carried out with the aid of simulation 11, output signals being generated.

[0053] 2. The output signals are evaluated with respect to fulfillment measure 13 of requirements QSpec and error measure 14 of simulation 11 according to the SMerrorX error model.

[0054] 3. Fulfillment measure 13 and error measure 14 are supplied 15 to classifier 18.

[0055] 4. For each test 12, distance 31 to decision limit 19 of classifier 18 is ascertained, and the classifier carries out a classification 15 into one of the following classes A, B, C, D (FIG. 1): test 12 was successful in simulation 11 and its result is reliable; test 12 failed in simulation 11 and its result is reliable; or the result of simulation 11 is unreliable.

[0056] 5. A test generation technique such as so-called search-based testing (SBT) is used to make a selection 32 of new tests 12 with the following optimization goal: new tests 12 are, as shown in FIG. 6, firstly distributed closer to decision limit 19 of classifier 18 and secondly distributed along that limit 19.

[0057] 6. Based on classification 15 of new tests 12 with respect to decision limit 19 of classifier 18, the parameter areas or regions corresponding to classes A, D (FIG. 7) are enlarged in the input space, and new tests 12 are added to the database.

[0058] 7. According to described method 10, a classification 15 into reliable and unreliable test inputs is carried out. Each of these tests 12 corresponds to a point marked in FIGS. 6 and 7; those regions are sought in which simulation 11 is considered to be reliable. The knowledge of these regions is deepened by step-by-step selection 32 and execution of new tests, for example in the course of SBT. The test database for simulation 11 is improved in this way.
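Steps 4 and 5 can be sketched as follows for the simple case of a straight decision limit through the origin. This is a greedy ranking of candidate parameter values by their distance to the limit, used here as a simple stand-in for search-based testing. It addresses only the first optimization goal (closeness to the limit), not the spread along it. The simulation `toy_simulation`, the slope, and all names are hypothetical.

```python
import math


def distance_to_limit(q, err, slope):
    """Euclidean distance of point (err, q) to the nearer of the two
    lines q = +slope * err and q = -slope * err (for err >= 0)."""
    return abs(abs(q) - slope * err) / math.hypot(1.0, slope)


def select_new_tests(candidates, simulate, slope, count):
    """Return the `count` candidates whose simulated results lie closest
    to the decision limit.

    simulate: function mapping a test parameter to the pair
              (fulfillment measure, error measure); assumed to be given.
    """
    scored = []
    for x in candidates:
        q, err = simulate(x)
        scored.append((distance_to_limit(q, err, slope), x))
    scored.sort()
    return [x for _, x in scored[:count]]


def toy_simulation(x):
    # Made-up system: the requirement margin shrinks linearly with x,
    # and the model error is constant.
    return 1.0 - x, 0.2


candidates = [i / 10 for i in range(21)]  # x = 0.0, 0.1, ..., 2.0
new_tests = select_new_tests(candidates, toy_simulation, slope=1.0, count=2)
print(new_tests)
```

In this toy setting the limit is crossed where the margin equals the model error in magnitude, so the two selected parameter values bracket the unreliable region around x = 1.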

[0059] FIG. 8 illustrates a method 10 which outputs an updated database 36 or an updated catalog on the basis of a database or a catalog of actual tests. If no optional new tests are given, this ultimately corresponds to a technique for test selection or for thinning out test suites. The following assumptions are made:

[0060] Classifier 18 was defined according to a method described with respect to FIGS. 2 and 3.

[0061] A test database 33 or a catalog of actual test cases and corresponding test cases 34 for simulation 11 is given.

[0062] New tests 12 are optionally given for simulation 11.

[0063] Furthermore, a simulation model is given.

[0064] Requirements QSpec are quantifiable and predefined and are implemented within the scope of a monitoring system which evaluates tests 12 with respect to fulfillment measure 13 of these requirements. In the figure, both fulfillment measures 13 relate to the same requirement QSpec, but evaluated once on the basis of simulation 11 and once in the course of experimental measurement 21 on the system.

[0065] SMerrorX is an error measure 14 which was defined beforehand. For some test inputs, simulation 11 and measurement 21 were thus already carried out, and error measure 14 generalizes the corresponding tests 12 to new experiments which have not yet been carried out, with a certain reliability that is determined, for example, by an upper and lower limit for error measure 14. For classifier 18 (FIGS. 1 through 3), only the most unfavorable, i.e., the highest, error measure 14 was used. It is to be noted that classifier 18 may be used to further refine error measure 14.

[0066] Under these assumptions, method 10 may be designed as follows:

[0067] 1. Possible new tests 12 corresponding 34 to database 33 or to the catalog are carried out with the aid of simulation 11, output signals being generated.

[0068] 2. The output signals are evaluated with respect to fulfillment measure 13 of requirements QSpec and error measure 14 of simulation 11 according to the SMerrorX error model.

[0069] 3. Fulfillment measure 13 and error measure 14 are supplied 15 to classifier 18.

[0070] 4. For each test 12, classifier 18 carries out a classification 15 into one of the following classes A, B, C, D (FIGS. 1 and 7): test 12 was successful in simulation 11 and its result is reliable 16; test 12 failed in simulation 11 and its result is reliable 16; or the result of simulation 11 is unreliable 17.

[0071] 5. Tests classified as reliable 16 are left in database 33 and tests classified as unreliable 17 are removed 35.

[0072] 6. If new tests 12 were given for simulation 11 and classification 15 showed that their result is reliable 16, the corresponding actual test cases from database 33 are added 35.

[0073] FIG. 9 outlines the possible visualization of a classification result in a feature space spanned by the test parameters (in the following: "parameter space"). For certain parameters 26, 27 of a test 12 (in the figure, by way of example, distance 26 and mass 27 of a vehicle merging into the ego lane), fulfillment measure 13 and error measure 14 are each represented as points in the parameter space. In a virtual test environment 29, visualization 28 of classification 15 of test 12 by classifier 18 is carried out in the parameter space.

[0074] This method 10 may be implemented, for example, in software or hardware or in a mixed form of hardware and software, for example in a workstation 30, as the schematic view of FIG. 10 illustrates.

CPC classification: G06F11/263; G06F11/2257; G06F18/24765; G06F18/2413; G06F11/2268; G06F11/261; G06F18/2193; G06F11/0772

International classification: G06F11/263; G06F11/07; G06F11/22; G06K9/62
