MONITORING OF FAILURE TOLERANCE FOR AN AUTOMATION INSTALLATION
20170082998 · 2017-03-23
Abstract
A method for monitoring failure tolerance for an automation installation is disclosed. The automation installation operates a process via a controlled system. At least two control apparatuses alternately regulate the controlled system in a control mode by outputting control outputs and failure of the currently regulating control apparatus prompts changeover to another of the control apparatuses. During the changeover, the controlled system continues to be operated in controller-less fashion for a down time. At least one operating point for the controlled system that is possible in control mode is ascertained. Controller-less operation is respectively simulated for each operating point for the duration of the down time. A state trajectory setting out from the operating point is ascertained for the controlled system and a check is performed to determine whether the state trajectory fails to meet a predetermined safety criterion. A predetermined protective measure is initiated to avoid the operating point.
Claims
1.-15. (canceled)
16. A method for monitoring a failure tolerance for an automation installation, comprising: providing a controlled system and at least two control apparatuses, said at least two control apparatuses alternately controlling the controlled system during a normal operation by outputting control outputs, said automation installation operating a process via the controlled system; prompting a changeover between the at least two control apparatuses at a failure; continuously operating the controlled system during the changeover in a controller-less operation for a down time; ascertaining a possible operating point for the controlled system during the normal operation; simulating a controller-less operation for each operating point for a duration of the down time to thereby ascertain a state trajectory starting out from the operating point for the controlled system; checking whether the state trajectory fails to meet a predetermined safety criterion; and if affirmative initiating a predetermined protective measure to avoid the at least one operating point.
17. The method of claim 16, wherein the controller-less operation is simulated by a simulation starting out from the at least one operating point using a model of the controlled system by temporally and successively computing the at least one operating point and the at least one computed operating point is combined to produce the state trajectory.
18. The method of claim 16, wherein the safety criterion includes whether the state trajectory comprises the at least one operating point that lies outside a predetermined admissible operating range, and/or whether a dynamic transition between two operating points of the state trajectory is greater than a predetermined maximum admissible dynamic range.
19. The method of claim 16, wherein a protective measure comprises outputting a warning to a user of the automation installation.
20. The method of claim 19, wherein during the controller-less operation a constant control output is transmitted to the controlled system and the protective measure comprising the constant control output is ascertained that reveals for a respective operating point a safe trajectory for continued operation of the controlled system and the ascertained constant control output is assigned to the at least one operating point.
21. The method of claim 20, wherein the respective operating point is assigned a safety control output that is output at the operating point in an event of the changeover and interrupts an operation of the controlled system.
22. The method of claim 19, wherein the protective measure comprises engineering data from the automation installation being taken as a basis for ascertaining an installation component causing the greatest proportion of the down time, and/or adopting an inadmissible operating point in line with the state trajectory.
23. The method of claim 19, wherein the control apparatuses use a synchronization connection to interchange synchronization data for aligning controller states and the protective measure comprises a rate of the synchronization connection being increased.
24. The method of claim 20, wherein the protective measure comprises the respective operating point being excluded from the normal operation and controller parameters of the control apparatuses being adjusted.
25. The method of claim 17, wherein the simulation includes an assumption about a maximum absolute value of a disturbance variable acting in the controlled system for the simulation and the protective measure includes the maximum absolute value being decreased and a new simulation being performed and the disturbance variable being indicated if the safety criterion is met for the new simulation.
26. The method of claim 19, wherein a new monitoring of the failure tolerance is iteratively performed after the protective measure is performed.
27. The method of claim 16, wherein the at least one operating point is ascertained by ascertaining an intended operating range on a basis of a configuration of the automation installation.
28. The method of claim 27, wherein the at least one possible operating point is ascertained by taking into consideration only extreme values for manipulated variable restrictions of installation components.
29. An engineering system for designing and/or configuring an automation installation having at least two control apparatuses for controlling a controlled system, said engineering system comprising an analysis device configured to take a present topology model of the automation installation and a process model of a process to be performed via the automation installation as a basis for ascertaining the resultant controlled system and a down time caused by a changeover between the control apparatuses, said analysis device configured to: provide a controlled system and at least two control apparatuses, said at least two control apparatuses alternately controlling the controlled system during a normal operation by outputting control outputs, said automation installation operating a process via the controlled system; prompt a changeover between the at least two control apparatuses at a failure; continuously operate the controlled system during the changeover in a controller-less operation for a down time; ascertain a possible operating point for the controlled system during the normal operation; simulate a controller-less operation for each operating point for a duration of the down time to thereby ascertain a state trajectory starting out from the operating point for the controlled system; check whether the state trajectory fails to meet a predetermined safety criterion; and if affirmative initiate a predetermined protective measure to avoid the at least one operating point.
30. An automation installation having a controlled system for operating a process and having at least two control apparatuses for failsafe and alternate control of the controlled system, said automation installation being configured to monitor a failure tolerance during operation by: providing a controlled system and at least two control apparatuses, said at least two control apparatuses alternately controlling the controlled system during a normal operation by outputting control outputs, said automation installation operating a process via the controlled system; prompting a changeover between the at least two control apparatuses at a failure; continuously operating the controlled system during the changeover in a controller-less operation for a down time; ascertaining a possible operating point for the controlled system during the normal operation; simulating a controller-less operation for each operating point for a duration of the down time to thereby ascertain a state trajectory starting out from the operating point for the controlled system; checking whether the state trajectory fails to meet a predetermined safety criterion; and if affirmative initiating a predetermined protective measure to avoid the at least one operating point.
Description
[0031] An exemplary embodiment of the invention is described below. In this regard:
[0032]
[0033]
[0034]
[0035]
[0036] The exemplary embodiment explained below is a preferred embodiment of the invention. In the case of the exemplary embodiment, however, the described components of the embodiment are each individual features of the invention that are intended to be considered independently of one another and that each also develop the invention independently of one another and hence can also be regarded as part of the invention individually or in a combination other than that shown. Furthermore, the described embodiment is also augmentable by further instances of the features of the invention that have already been described.
[0037]
[0038] By way of example, the automation system S can comprise two control apparatuses 20, 22 that may each have a PLC, for example. There may also be further control apparatuses (not shown) provided. Each control apparatus 20, 22 may be designed to use a control system R, R to regulate the controlled system 32 to a nominal value preset W. In this case, the control apparatuses 20, 22 control the controlled system 32 not simultaneously but rather alternately, a change being able to take place whenever the currently controlling control apparatus 20, 22 fails.
[0039]
[0040] The automation system S has a high level of availability as a result of the redundant design with at least two control apparatuses 20, 22. The peripheral components 14, 16 connected to the automation system S can be controlled by both control apparatuses 20, 22 in principle. So that both control apparatuses 20, 22 can operate in sync, they can be synchronized via a synchronization connection 24 at prescribed intervals of time. The synchronization connection 24 may be a direct connection (as represented in
[0041] The changeover action lasts for a down time T during which neither the control apparatus 20 nor the control apparatus 22 outputs its control outputs U to the controlled system 32. During this time, a steady control output Ustat is output to the peripheral components 14, 16. By way of example, this can be achieved by virtue of the communication network 18 involving timeslot-based communication in which the values transmitted for the individual timeslots are not erased, so that they continue to be output to the peripheral components 14, 16 even when the communication cycle is repeated.
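The hold-last-value behavior described in this paragraph can be illustrated with a minimal Python sketch. The class `TimeslotFrame`, its methods and the numeric values are hypothetical names and data introduced purely for illustration; they are not part of the described communication network 18.

```python
class TimeslotFrame:
    """Hypothetical cyclic frame: one stored value per timeslot, never erased."""

    def __init__(self, n_slots, initial=0.0):
        self.slots = [initial] * n_slots

    def write(self, slot, value):
        # Called by the currently coupled control apparatus.
        self.slots[slot] = value

    def cycle(self):
        # Each communication cycle outputs whatever the slots currently hold.
        # If no controller has written a new value, the old one is repeated.
        return list(self.slots)


frame = TimeslotFrame(n_slots=2)
frame.write(0, 0.7)  # last control outputs before the failure (assumed values)
frame.write(1, 0.2)

# Three cycles during the down time: nobody writes, so Ustat is held.
during_down_time = [frame.cycle() for _ in range(3)]
print(during_down_time)
```

Because the slots are never cleared, the peripheral components keep receiving the last written values for as long as no control apparatus is coupled.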
[0042] The control apparatuses 20, 22 can be configured by an engineering system E in the automation installation 10. The engineering system E can also be used to plan a topology of the automation installation 10, as is needed in order to operate the process 12 in a desired manner.
[0043] The automation installation 10 is intended to ensure that one of the control apparatuses 20, 22 can fail at any time and the controlled system 32 can then continue to be operated. That is to say that the flow of the process 12 can be maintained without the process 12 reaching an undesirable state during the down time T, an undesirable state being one in which an operating point of the controlled system 32 is situated outside a predetermined set of admissible operating points.
[0044] The two controller systems R, R can involve a controller algorithm that is known per se, for example a proportional controller, integral controller, differential controller or a hybrid form thereof, such as a PID controller, for example. The control systems R, R can particularly also comprise an observer, as is represented by way of example in
[0045] The model 30 can be used to simulate or predict the effect of a down time as arises between the time of the control apparatus 22 being decoupled and the control apparatus 20 being coupled.
[0046] In the example, the model 30 has been able to be taken from a control-engineering application, that is to say the engineering data for the installation 10, as are available in the engineering system E, particularly without additional complexity. When the installation 10 is engineered to configure or design control of the process 12 by means of a respective one of the control apparatuses 20, 22, it may be that some state variables of the process 12, that is to say temperatures or other physical variables, for example, have to be ascertained indirectly because they cannot be measured directly or can be measured only with an undesirably high level of complexity and therefore need to be estimated. By way of example, this can be accomplished by using an observer method, such as a Luenberger observer 34, for example. The observer 34 shown by way of example in
[0047] The model 30 is now advantageously also used to compute the response of the controlled system 32 in the changeover situation. The changeover situation is characterized in that neither the input data Y nor the output data U, U to the peripheral components 14, 16 can be updated for the duration of the down time T. During the down time T, the controlled system 32 is thus decoupled from the controller system R that is currently still active, so that it can be influenced neither by that controller system R nor by the controller system R that is not yet coupled.
[0048] In this regard,
[0049]
[0050] By way of example, the peripheral outputs can maintain their last value during the changeover phase, so that the last output vector is applied to the controlled system 32 during the down time T. Said output vector results in a trajectory for the state variables of the controlled system 32. Depending on the system parameters, the state variables of the controlled system 32, for example a boiler temperature, may in the undesirable case change such that they reach a value that is critical for the process 12. In such a case, the failover down time of the automation system S used would be too long for the process 12 that is to be controlled. The down time T that can be expected is a characteristic variable of the high-availability automation system S used. It is also influenced, however, by the planning and design of the automation installation 10, that is to say the quantities therein and the network topologies used for the peripheral connections, and can accordingly be ascertained and adjusted for the specifically used automation system S.
[0051] In the case of the present automation installation 10, the user is assisted in this by the engineering system E.
[0052] If the down time T that can be expected is known, it is possible to check, for example in the manner described below, whether particular state variables reach a critical value during failover. However, since control failures and the failovers caused thereby arise spontaneously and in unplanned fashion, the operating state X0 of the automation installation 10 at the time t0 of failure of a control apparatus 20, 22 is unknown and cannot be planned. Therefore, the set of operating points in which the process 12 can reside during operation of the automation installation 10, known as the admissible operating range, is ascertained first of all. Exceptions in this case may be the startup and shutdown responses, for example. In addition, this set may also have safety margins from dangerous, that is to say undesirable, operating ranges.
[0053] The set of undesirable or dangerous operating points forms the set V of prohibited operating states, which may be defined as polytopes or polyhedra, for example. The set of admissible operating points forms the operating range B, which may likewise be defined as a polytope or polyhedron, for example. Physical manipulated variable restrictions Umax and Umin of the actuators among the peripheral components 14, 16 in the process 12 can likewise be ascertained, for example a smallest and largest valve opening, a maximum pump power or a maximum heating power. Maximum absolute values can also be assumed for disturbance variables acting on the process 12. The model 30 for the controlled process 12 may be a linear or nonlinear model. For example, the differential equation below can be used as the basis for describing the controlled system 32:
d(X(t))/dt = f(X(t), U(t), D(t)), X(t0) = X0,
where d()/dt represents the derivative with respect to time t, f() is a linear or nonlinear function that represents the dynamic response of the controlled system 32 in reaction to the current operating state X, the control output U and the disturbance variable D, and X0 represents the operating point at the time t0.
[0054] It is now possible to perform a reachability analysis on the basis of the model 30, the set of initial conditions to be examined, that is to say the operating range B that the controlled system 32 can adopt in line with expectations, all possible input values U in the range of the manipulated variable restrictions Umax, Umin and all possible disturbance variables D that are to be examined.
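The inputs to such an analysis can be represented in many ways. The following Python sketch is one possibility, assuming that a polytope is stored in half-space form (all x with A x <= b) and that the manipulated variable restrictions form a box whose corners are the extreme values. The function names `in_polytope` and `box_corners`, the two-dimensional state (temperature, pressure) and all numbers are assumptions made for illustration only.

```python
from itertools import product


def in_polytope(x, A, b, tol=1e-9):
    """Return True if every row of A x <= b holds (half-space representation)."""
    return all(
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i + tol
        for row, b_i in zip(A, b)
    )


def box_corners(lower, upper):
    """Return all corner points of the box [lower, upper], e.g. Umin/Umax extremes."""
    return list(product(*zip(lower, upper)))


# Hypothetical prohibited set V: temperature >= 95 at any pressure,
# written as -1 * temperature + 0 * pressure <= -95.
A_V = [[-1.0, 0.0]]
b_V = [-95.0]

print(in_polytope((96.0, 2.0), A_V, b_V))   # state inside V
print(in_polytope((90.0, 2.0), A_V, b_V))   # state outside V
print(box_corners([0.0, 0.1], [5.0, 0.9]))  # extreme control outputs
```

Enumerating only the corners of the input box keeps the number of cases small, at the cost of being exact only for models in which the extremes of the inputs produce the extremes of the states.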
[0055] The result of the reachability analysis, for each future time t, is a set E(t) of reachable states, as arise when a control apparatus fails and the peripheral components 14, 16 thus have the steady control output Ustat applied in the manner described. In other words, starting from a failure time t0 that is to be examined and assuming a steady control output, the model is operated as follows:
d(X(t))/dt = f(X(t), Ustat, D(t)), X(t0) = X0.
[0056] The points obtained therefrom for the subsequent times t>t0 together form a state trajectory that describes the progression or the response of the controlled system 32 during the down time T.
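As a minimal sketch of how such a state trajectory might be computed numerically, the following Python code integrates the model with a forward-Euler scheme while the control output is held at Ustat. The integration scheme, the function name `simulate_controllerless`, the scalar first-order `boiler` model and all parameter values are assumptions chosen for illustration; the source does not prescribe a particular model or solver.

```python
def simulate_controllerless(f, x0, u_stat, d, t_down, dt=0.01):
    """Integrate dX/dt = f(X, Ustat, D) over the down time with forward Euler.

    Returns a list of (elapsed time since t0, state) pairs: the state trajectory.
    """
    n_steps = int(round(t_down / dt))
    x = x0
    trajectory = [(0.0, x)]
    for k in range(1, n_steps + 1):
        x = x + dt * f(x, u_stat, d)  # control output stays fixed at u_stat
        trajectory.append((k * dt, x))
    return trajectory


def boiler(x, u, d, a=0.1, b=2.0):
    # Hypothetical boiler temperature model:
    # heat loss -a*x, heating b*u, additive disturbance d.
    return -a * x + b * u + d


traj = simulate_controllerless(boiler, x0=80.0, u_stat=5.0, d=0.5, t_down=1.0)
print(traj[-1])  # state reached at the end of the down time
```

For a stiff or strongly nonlinear model, a smaller step size or an adaptive solver would be more appropriate than the fixed-step scheme used here.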
[0057] It is now possible to ascertain the first time tv at which the intersection of E(tv) with V is no longer empty. Each preceding time t<tv, at which the intersection of E(t) with V is an empty set, determines an admissible time horizon tvo for safe operation. This time horizon tvo can also be shortened still further by a buffer time for safety reasons. The time tv is the longest tolerable changeover time for failover.
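The search for tv can be sketched in Python for a deliberately simple case. The sketch assumes a scalar model that increases monotonically with the control output and the disturbance, and a prohibited set V of the form x >= x_crit. Under these assumptions the upper edge of E(t) is the trajectory that starts at the largest admissible state with the largest control output and the largest disturbance, so only that one trajectory needs to be followed. The names `first_violation_time` and `boiler` and all numbers are hypothetical; a general model would require a proper reachability tool.

```python
def boiler(x, u, d, a=0.1, b=2.0):
    # Hypothetical scalar model, increasing in u and d.
    return -a * x + b * u + d


def first_violation_time(f, x_max, u_max, d_max, x_crit, horizon, dt=0.01):
    """Return the first time tv at which the upper edge of E(t) reaches
    V = {x >= x_crit}, or None if that does not happen within the horizon."""
    x_hi = x_max
    if x_hi >= x_crit:
        return 0.0
    for k in range(1, int(round(horizon / dt)) + 1):
        x_hi = x_hi + dt * f(x_hi, u_max, d_max)  # worst-case trajectory
        if x_hi >= x_crit:
            return k * dt
    return None


# Full actuator range, a slightly narrowed range and a strongly narrowed range.
tv_full = first_violation_time(boiler, 90.0, 5.0, 0.5, 95.0, horizon=10.0)
tv_narrow = first_violation_time(boiler, 90.0, 4.9, 0.5, 95.0, horizon=10.0)
tv_none = first_violation_time(boiler, 90.0, 4.0, 0.5, 95.0, horizon=10.0)
print(tv_full, tv_narrow, tv_none)
```

In this example, restricting the largest control output lengthens tv, and a sufficiently strong restriction means that V is not reached at all within the examined horizon.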
[0058] In the engineering system E, said time can be used for selecting the components for the automation installation 10. If, during the actual engineering, that is to say the design and configuration of the automation installation 10, it is known, as a result of knowledge of the control algorithms of the control systems R, R, that the limits Umax, Umin of the manipulated variables are not fully utilized, then it is also possible to stipulate a range of the control outputs U, U that is narrowed down accordingly. This likewise increases the acceptable latency for changeover.
[0059] It is advantageous if the engineering system E indicates these adjustment options to the user so that the user does not select an excessively expensive alternative installation component at an early stage. If the down time T continues to be too long, then it is likewise possible to shorten the down time T in the event of a failover by changing the installation topology. The user can likewise check this using the engineering system E. As part of an iterative procedure, the user can thereby tailor the installation topology to the requirements of the process 12 that is to be automated.
[0060] By way of example, the reachability analysis 42 can be performed by an analysis device of the engineering system E, for example a program module of the engineering system E, and in this context can be based on a process model 44 of the process 12 to be operated and also on a topology model 46 of the automation installation 10, as currently stipulated by the user. From the process model 44, which describes the physical actions in the process 12, and the topology model 46, it is possible for the model 30 of the controlled system 32 to be ascertained in a manner that is known per se according to the principles of control engineering. Additionally, the topology model 46 yields a value for the down time T.
[0061] The reachability analysis can ascertain the state trajectory for different operating points of the operating range B in a step S10 and, in a step S12, can check a safety criterion 48 for each state trajectory, that is to say whether the respective state trajectory reaches the set V, for example. If this is the case, symbolized by a plus sign (+) in
[0062] Hence, the exemplary embodiment as a whole describes a method for model-based determination of the effects of a failover in a high-availability automation system on a process that is to be controlled.
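The overall check, simulating controller-less operation for each operating point and testing each trajectory against the safety criterion, could be sketched in Python as follows. The sketch assumes a scalar model, a forward-Euler simulation, a worst-case constant disturbance and a warning as the protective measure. The names `monitor_failure_tolerance`, `boiler` and `is_prohibited`, as well as all numbers, are hypothetical and serve only as an illustration.

```python
def boiler(x, u, d, a=0.1, b=2.0):
    # Hypothetical scalar model of the controlled system.
    return -a * x + b * u + d


def monitor_failure_tolerance(f, operating_points, u_stat, d_max, t_down,
                              is_prohibited, dt=0.01):
    """Return the operating points whose controller-less trajectory
    reaches a prohibited state within the down time."""
    unsafe = []
    n_steps = int(round(t_down / dt))
    for x0 in operating_points:
        x = x0
        for _ in range(n_steps):          # simulate the state trajectory
            x = x + dt * f(x, u_stat, d_max)
            if is_prohibited(x):          # check the safety criterion
                unsafe.append(x0)
                break
    return unsafe


def is_prohibited(x):
    return x >= 95.0  # assumed prohibited set V


points = [60.0, 70.0, 80.0, 90.0]
unsafe_short = monitor_failure_tolerance(boiler, points, 5.0, 0.5, 2.0, is_prohibited)
unsafe_long = monitor_failure_tolerance(boiler, points, 5.0, 0.5, 5.0, is_prohibited)

for x0 in unsafe_long:
    # Protective measure in this sketch: warn the user.
    print(f"Warning: operating point {x0} is not failover-safe for T = 5.0")
```

With the shorter down time every operating point passes, whereas the longer down time flags the operating point closest to the prohibited set, which illustrates why shortening T or excluding operating points are both possible remedies.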
LIST OF REFERENCE SYMBOLS
[0063] 10 Automation installation
[0064] 12 Process
[0065] 14, 16 Peripheral component
[0066] 18 Communication network
[0067] 20, 22 Control apparatus
[0068] 24 Synchronization connection
[0069] 26 Control connection
[0070] 28 Decoupled control connection
[0071] 30 Controlled system model
[0072] 32 Controlled system
[0073] 34 Observer
[0074] 36, 38 Subtraction point
[0075] 40 Integrator
[0076] 42 Reachability analysis
[0077] 44 Process model
[0078] 46 Topology model
[0079] 48 Safety criterion
[0080] E Engineering system
[0081] U, U Control output
[0082] R, R Control system
[0083] W Nominal value preset
[0084] T Down time
[0085] S10-S16 Method step
[0086] Ustat Steady control output