Method, computer program and system for controlling a plurality of robots, and computer-readable storage medium

10486305 ยท 2019-11-26

Assignee

Inventors

Cpc classification

International classification

Abstract

A method for controlling a plurality of agents to complete a mission, including deriving a decomposition set of decomposition states in a set of possible states of an automaton, wherein the automaton characterizes the mission, deriving a sequence of actions to be carried out by the plurality of agents depending on the decomposition set, where each action is to be carried out by at most one of the plurality of agents.

Claims

1. A method for controlling a plurality of agents to complete a mission, comprising: deriving a decomposition set of decomposition states in a set of possible states of an automaton, wherein the automaton represents a specification of the mission including all tasks of the mission, the automaton being defined by a tuple including: (i) the set of possible states, (ii) a set of initial states, the set of initial states being a subset of the set of possible states, (iii) a set of Boolean formulas over a set of atomic propositions, (iv) transition conditions, and (v) a set of accepting states, the set of accepting states being a subset of the set of possible states; deriving a sequence of actions to be carried out by the plurality of agents depending on the decomposition set, where each of the actions is to be carried out by at most one of the plurality of agents; and providing a control signal for controlling the plurality of agents in accordance with the derived sequence of actions.

2. The method according to claim 1, further comprising: controlling the plurality of agents in accordance with the derived sequence of actions.

3. The method according to claim 1, further comprising: generating the decomposition set by exploring an essential sequence of an accepting run through one or more candidate decomposition states.

4. The method according to claim 3, further comprising: adding the one or more candidate decomposition state to the decomposition set depending on whether a complementary sequence to the explored essential sequence around the respective one or more candidate decomposition state is accepting.

5. The method according to claim 4, wherein in which the decomposition set includes all those states in the set of possible states of the automaton, for which the complementary sequence to the explored essential sequence around the respective state is accepting.

6. The method according to claim 1, further comprising: generating a team model based on the automaton that represents a specification of the mission and based on automata that each model the capabilities of one of the plurality of agents.

7. The method according to claim 6, wherein the team model comprises a set of actions that comprises switch transitions which change the acting agent from one of the plurality of agents to another one of the plurality of agents.

8. The method according to claim 7, wherein the switch transitions are configured to each change the acting agent from one of the plurality of agents to a next one of the plurality of agents.

9. The method according to claim 8, wherein the switch transitions are configured such as to act only if the automaton that characterizes the mission is in a decomposition state.

10. The method according to claim 6, further comprising: deriving the sequence of actions to be carried out by the plurality of agents by a label-setting algorithm in which each state of a set of states of the team model is associated with labels that are characterized by a sequence of action leading to the respective state.

11. The method according to claim 10, further comprising: constructing a reachable set of temporary labels for each state and a set of permanent labels.

12. The method according to claim 11, further comprising: constructing, for each selected label, a set of consecutive labels by extending an action sequence associated with the selected label by all available actions and adding the resulting labels to the reachable set of temporary labels.

13. The method according to claim 12, wherein each label comprises at least one component that characterizes a cost under the corresponding sequence of actions.

14. The method according to claim 13, wherein the derived sequence of actions to be carried out by the plurality of agents is the one out of all actions that satisfy a characterization of the mission that minimizes a team cost which depends on the component that characterizes the cost.

15. The method according to claim 14, wherein only actions resulting in Pareto-optimal labels at their target state are added to the reachable set of temporary labels.

16. The method according to claim 13, wherein the component that characterizes the cost under the corresponding sequence of actions is depending on costs associated with each of these actions with one component each for each one of the agents.

17. The method according to claim 16, wherein the component that characterizes the cost under the corresponding sequence of actions is stored in memory by way of a data structure that comprises at least one component that characterizes costs associated with a selected one of the agents and at least one component that characterizes the costs associated with a group of agents that precede the selected one of the agents.

18. The method according to claim 17, wherein each label comprises at least one component that characterizes a resource status at the respective state under the corresponding sequence of actions.

19. The method according to claim 18, wherein the characterization of the mission comprises an inequality constraint that restricts the at least one component that characterizes a resource status to a predefined region.

20. A non-transitory machine-readable storage medium on which is stored a computer program for controlling a plurality of agents to complete a mission, the computer program, when executed by a computer, causing the computer to perform: deriving a decomposition set of decomposition states in a set of possible states of an automaton, wherein the automaton represents a specification of the mission including all tasks of the mission, the automaton being defined by a tuple including: (i) the set of possible states, (ii) a set of initial states, the set of initial states being a subset of the set of the possible states, (iii) a set of Boolean formulas over a set of atomic propositions, (iv) transition conditions, and (v) a set of accepting states, the set of accepting states being a subset of the set of possible states; deriving a sequence of actions to be carried out by the plurality of agents depending on the decomposition set, where each of the actions is to be carried out by at most one of the plurality of agents; and controlling the plurality of agents in accordance with the derived sequence of actions.

21. A system for controlling a plurality of agents to complete a mission, which is configured to: derive a decomposition set of decomposition states in a set of possible states of an automaton, wherein the automaton represents a specification of the mission including all tasks of the mission, the automaton being defined by a tuple including: (i) the set of possible states, (ii) a set of initial states, the set of initial states being a subset of the set of possible states, (iii) a set of Boolean formulas over a set of atomic propositions, (iv) transition conditions, and (v) a set of accepting states, the set of accepting states being a subset of the set of possible states; derive a sequence of actions to be carried out by the plurality of agents depending on the decomposition set, where each of the actions is to be carried out by at most one of the plurality of agents; and control the plurality of agents in accordance with the derived sequence of actions.

22. The system according to claim 21, further comprising at least one of the plurality of agents.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) The present invention is explained in more detail below with reference to figures.

(2) FIG. 1 shows a robot control system according to a first aspect of the present invention.

(3) FIG. 2 shows a flow-chart diagram that illustrates a preferred method according to a further aspect of the present invention.

(4) FIG. 3 shows a flow-chart diagram which relates to a preferred algorithm to determine the decomposition set custom character.

(5) FIG. 4 shows a flow-chart diagram which relates to a preferred method to construct the team model custom character.

(6) FIG. 5 shows a flow-chart diagram which relates to a preferred method to plan an optimal action sequence.

(7) FIG. 6 illustrates an example of the structure of the team model.

DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

(8) FIG. 1 shows a robot control system 10 that is configured to plan and allocate tasks to a plurality of robots 11, 12, 13 such that the plurality of agents, preferably robots 11, 12, 13, by fulfillment of their respective tasks, jointly achieve a common goal, thus achieving a predefinable mission custom character. Robot control system 10 is equipped with communication means (not shown), e.g., a wireless networking transceiver, to communicate with each of the robots 11, 12, 13 via a communication link 22, 23, 24. Similarly, each of the robots 11, 12, 13 is equipped with corresponding communication means.

(9) In a preferred embodiment, the robot control system 10 comprises a computer 20 with memory 21, on which a computer program is stored, said computer program comprising instructions that are configured to carry out the method according to aspects of the present invention described below if the computer program is executed on the computer 20.

(10) In further aspects of the preferred embodiment, the robots 11, 12, 13 comprise a computer 11b, 12b, 13b each, said computer being equipped with a computer memory each (not shown) on which a computer program is stored, said computer program comprising instructions that are configured to carry out some or all of the method according to further aspects of the invention described below if the computer program is executed on the computer 11b and/or 12b and/or 13b. Preferably, the robots 11, 12, 13 each comprise actuators 11a, 12a, 13a that enable each of the robots to physically interact with an environment in which the robots are placed.

(11) FIG. 2 shows a flow-chart diagram that illustrates a preferred method according to a further aspect of the present invention. In a first step 1000 the method receives an agent model custom character.sub.1,custom character.sub.2,custom character.sub.3, . . . each for each of the agents or robots 11, 12, 13. The agent models can for example be read from a dedicated location in computer memory 21.

(12) These agent models custom character.sub.1,custom character.sub.2,custom character.sub.3, . . . are preferably each given as an automaton custom character=(custom character,custom character,custom character,,) consisting of a set of states custom character that the corresponding agent or robot can be in an initial state custom charactercustom character a set of possible actions custom character.Math.custom charactercustom character that the corresponding agent or robot can carry out a set of propositions a labeling function :custom character.fwdarw.2.sup..

(13) Modeling the agent models custom character.sub.1,custom character.sub.2,custom character.sub.3, . . . as an automaton as described is convenient because it is intuitive to model the internal state and the actions of the agents as a state machine. Furthermore, it is convenient to model an abstraction of places in the environment as a topological map.

(14) Independently of step 1000, the method receives a specification of the mission custom character in step 1100. Preferably, this mission specification custom character is an LTL specification, e.g. a set of tasks {custom character, . . . , custom character}. In a following step 1200, this mission specification custom character is converted into a nondeterministic finite automaton custom character. Note that steps 1100 and 1200 are optional. Alternatively, the method may directly receive the mission specification custom character as the nondeterministic finite automaton custom character. Then, in step 1300, the method determines the decomposition set custom character depending on the automaton custom character. A preferred embodiment of this determination procedure is explained in detail in FIG. 3.

(15) Following steps 1000 and 1300, the method constructs a team model custom character depending on the automaton custom character, the decomposition set custom character and the agent models custom character.sub.1,custom character.sub.2,custom character.sub.3 . . . in step 1400. A preferred embodiment of this construction procedure is explained in detail in FIG. 4.

(16) In the following step 2000, the method carries out a procedure of planning an optimal action sequence .sub.fin based on the team model custom character which is explained in detail in FIG. 5.

(17) In step 3000, the optimal action sequence .sub.fin is translated into executable commands for the agents or robots 11, 12, 13, for example by means of a lookup table that may be stored in computer memory 21. The executable commands are each associated with one of the agents or robots 11, 12, 13 and distributed to the respective agent or robot 11, 12, 13 via one of the communication links 22, 23, 24. The respective agent or robot 11, 12, 13 then executes this command and preferably upon completion of the command sends a confirmation message to the robot control system 10. In case that the execution of this command is not possible, the respective agent or robot 11, 12, 13 may send an error notification to the robot control system 10, which may react accordingly. In case it receives a confirmation message that a command has been executed, it may send a next command to a next respective agent or robot 11, 12, 13. In the case it receives an error notification, it may enter a dedicated mode, e.g., a shut-down mode of all agents or robots 11, 12, 13.

(18) FIG. 3 shows a flow-chart diagram that depicts a further aspect of the present invention, which relates to a preferred algorithm to determine the decomposition set custom character. This method starts in step 1310 with a step of reading the aforementioned automaton custom character. An index i that will be used to label all the states {q.sub.i}=Q of the set of states Q that is associated with the automaton custom character. This index is initialized to an initial value, e.g. i=1. The decomposition set custom character is initialized as custom character=.

(19) Then, in step 1320, the method constructs an accepting run .sub.i that passes through state q.sub.i corresponding to the present value of the index i. State q.sub.i is the candidate decomposition state. Such an accepting run .sub.i may for example be constructed by exploring the graph defined by the transition conditions associated with the automaton custom character and constructing a first partial run .sub.f from state q.sub.i to initial state q.sub.0 associated with automaton custom character while considering inverted transitions and a second partial run .sub.l from state q.sub.i to a final state fF associated with automaton custom character. The accepting run .sub.i that passes through q.sub.i may then be constructed by concatenating the inverted first partial run .sub.f and the second partial run .sub.l.

(20) In the following step 1330, the method generates an essential sequence .sub.e associated with the accepting run .sub.i.

(21) A sequence is called essential for nondeterministic finite automaton custom character and associated with a run if and only if it describes the run in custom character and (t)\{}custom character((t1), (t)) for all t and propositions (t), i.e., contains only the required propositions.

(22) For example, the essential sequence .sub.e may be generated from the accepting run .sub.i by converting all propositions of corresponding transition conditions (.sub.i(t+1),.sub.i(t)) of the accepting run .sub.i to their respective disjunctive normal form and successively adding all propositions of each one conjunctive clause to the essential sequence .sub.e for all t.

(23) In the following step 1340, the method generates a complementary sequences {circumflex over ()}.sub.e of the essential sequence .sub.e. To this end partial sequences .sub.1 and .sub.2 are generated with .sub.1 being the part of essential sequence .sub.e from its initial state to state q.sub.i and .sub.2 being the remaining part of essential sequence .sub.e from state q.sub.i to its final state, i.e., essential sequence .sub.e=.sub.1.sub.2 is a concatenation of these two partial sequences .sub.1 and .sub.2. The complementary sequence {circumflex over ()}.sub.e is then generated by reversing the order of these two partial sequences .sub.1 and .sub.2, i.e., {circumflex over ()}.sub.e=.sub.2.sub.1.

(24) Next follows step 1350, in which it is checked whether or not the complementary sequence {circumflex over ()}.sub.e corresponds to an accepting run (which amounts to a simple iterative check whether the propositions of the complementary sequence {circumflex over ()}.sub.e satisfy the transition conditions ). If it is determined that the complementary sequence {circumflex over ()}.sub.e is an accepting sequence, the method branches to step 1360 in which state q.sub.i is added to the decomposition set custom character, after which the method continues with the execution of step 1370. If it is determined that the complementary sequence {circumflex over ()}.sub.e is not an accepting sequence, the method skips directly to step 1370.

(25) In step 1370, it is checked whether index i has already iterated over all states q.sub.i of set Q, preferably by checking whether i=Q. If not, the method branches to step 1380 in which index i is incremented by an increment of 1 and the method continues with a next iteration in step 1320. If, however, it is determined that the index i has already iterated over all states q.sub.i of set Q, the method branches to step 1390, in which this part of the algorithm for determining the decomposition set custom character ends.

(26) FIG. 4 shows a flow-chart diagram that describes a preferred embodiment of a still further aspect of the present invention. This aspect relates a method for constructing the team model custom character from the nondeterministic finite automaton custom character, the decomposition set custom character and the agent models custom character.sub.1,custom character.sub.2,custom character.sub.3 . . . .

(27) The method starts in step 1410, in which the nondeterministic finite automaton custom character, the decomposition set custom character and the agent models custom character.sub.1,custom character.sub.2,custom character.sub.3, . . . , custom character.sub.n. are received. For ease of notation, in the context of the discussion of FIG. 4 the models will be labelled with a generic labelling superscript (r), i.e the agent models will be denoted custom character.sup.(r).

(28) Next, in step 1420, a corresponding product model P.sup.(r) will be created for every agent model custom character.sup.(r). By combining the agent model automaton custom character.sup.(r) with the nondeterministic finite automaton custom character of the mission custom character, the product model custom character.sup.(r) can be constructed to capture both the agent capabilities encoded in the agent model automaton custom character.sup.(r) and the specification of the mission custom character encoded in the nondeterministic finite automaton custom character. Dropping the superscript (r) for ease of notation, conveniently, the product model custom character may be given by custom character=custom character.Math.custom character=(custom character,custom character, custom character) comprising a set of states custom character=Qcustom character a set of initial states custom character=Q.sub.0{custom character} a set of actions A.sub.p={((q.sub.s,s.sub.s),(q.sub.t,s.sub.t))custom charactercustom character:(s.sub.s,s.sub.t)custom character(s.sub.s)custom character(q.sub.s,q.sub.t))}.

(29) For a plurality of agents, especially a plurality of robots, the respective agent models may differ from each other, each representing the capabilities of the respective agent, while the nondeterministic finite automaton custom character Y is determined by a particular specification of the mission custom character. As such, the product model custom character may be constructed separately for each of the agents. It describes for each of the different agent how the mission custom character can be executed by the agent to which the agent model custom character.sup.(r) corresponds.

(30) Therefore, in a preferred embodiment for each r{1, . . . , N} the corresponding product model custom character.sup.(r) is constructed as custom character.sup.(r)=custom character.Math.custom character.sup.(r) as defined above.

(31) In order to combine a plurality of agents it is possible to construct a team model automaton custom character from the individual product models custom character.sup.(r). This is done in step 1430.

(32) The team model automaton custom character is conveniently constructed as a union of all the local product models custom character.sup.(r) with r{1, . . . , N} as follows: The team model automaton custom character is constructed as custom character=(custom character,custom character,custom character,custom character), comprising a set of states custom character={(r,q,s):r{1, . . . , N}, (q,s)custom character} a set of initial states custom character={(r,q,s)custom character:r=1,(q,s)custom character} a set of final states custom character={(r,q,s)custom character:qF} a set of actions custom character=.sub.r custom character.

(33) In following step 1440 a set of switch transitions custom charactercustom charactercustom characteris determined. The set of switch transitions custom character is defined as the set of all those transitions custom character=((r.sub.s,q.sub.s,s.sub.s),(r.sub.t,q.sub.t,s.sub.t)) between a starting state (r.sub.s,q.sub.s,s.sub.s)custom character and a terminal state (r.sub.t,q.sub.t,s.sub.t)custom character which connect different agents, i.e. r.sub.sr.sub.t, preserve the progress in the nondeterministic finite automaton custom character, i.e. q.sub.s=q.sub.t, point to the next agent, i.e. r.sub.t=r.sub.s+1 point to an initial agent state, i.e. s.sub.t=custom character, and start in the decomposition set custom character, i.e. q.sub.scustom character.

(34) Conveniently, the set of switch transitions custom character may be constructed by traversing all states q.sub.s in the decomposition set custom character and all starting agent indices r.sub.s={1, . . . , N1}. For this choice of state q.sub.s and starting agent index r.sub.s, traversing all states s.sub.s for which (q.sub.s,s.sub.s)custom character fixes r.sub.t,q.sub.t,s.sub.t and thus yields the set of switch transitions custom character.

(35) An example of the structure of the team model custom character is depicted in FIG. 6, which shows an example of a system comprising three agents. The team model custom character has an initial state (bottom left corner) and three final states (right side). Between the agent automata, directed switch transitions custom character to the next agent connect states of the decomposition set custom character.

(36) In step 1450 following step 1440, the set of switch transitions custom character is added to the set of actions custom character, i.e. custom character.fwdarw.custom charactercustom character. This concludes the algorithm shown in FIG. 4. FIG. 5 shows a flow-chart diagram that describes a preferred embodiment of an even further aspect of the invention. This even further aspect of the invention relates to derive an action sequence .sub.fin which minimize a team cost for given agent models custom character.sup.(r) of the team of agents 11, 12, 13, a cost function C, initial resources .sub.00 and the specification of the mission custom character such that the specification is satisfied. An action sequence is called satisfying if the associated state sequence satisfies the specification .

(37) Generally, an action sequence is preferably defined as =s.sub.0a.sub.1s.sub.1 . . . a.sub.ns.sub.n which is a run in custom character with s.sub.icustom character and a.sub.icustom character. In order to distribute among the involved agents, custom character for agent r is preferably obtained by projecting onto custom character.sup.(r).

(38) Conveniently, the cost function C may be defined as follows. Each action of the team model custom character is assigned a non-negative cost, i.e. C:custom character.fwdarw.custom character.sub.0. For switch transitions custom character, preferably the associated cost C(custom character) is chosen as zero to reflect the fact that switch transitions custom character are purely virtual and will not appear in the action sequence .sup.(r) executed by the agents 11, 12, 13.

(39) For modelling the multi-agent character of a cost, it is convenient to extend the cost C(a) associated with an action acustom character to a vector of the same dimensionality N as the number of agents 11, 12, 13, i.e. C(a)custom character.sub.0.sup.N where each agent r=1, . . . , N represents one dimension.

(40) To reflect the fact that each action a with non-zero cost c.sub.a=C(a) is associated with a particular agent by the fact that custom character\custom character=.sub.rcustom character, it is convenient to define

(41) c a , i = { C ( a ) , if i = r 0 , otherwise
and custom character=0. Consequently, the costs c.sub. associated with an action sequence can be computed as c.sub.=.sub.ac.sub.a.

(42) Given a set of action sequences, a Pareto front of all cost vectors c.sub. for satisfying action sequences then forms a set of potentially optimal solutions. In order to prioritize these solution, in a preferred embodiment one may compute an overall team cost as (c.sub.)=(1)c.sub..sub.+c.sub..sub.1, where (0,1] may be chosen fixed but freely. This conventiently reflects an objective to minimize the maximal agent cost c.sub..sub., e.g. minimizing a completion time of mission custom character, and an objective to avoid unnecessary actions of the agents 11, 12 13 via a regularization term c.sub..sub.1.

(43) To save memory requirements for storing the cost vector c.sub., preferably the cost vector c.sub. is stored as a compressed cost vector .sub. which is three-dimensional, independent of the number of agents, by recursively choosing

(44) c ^ = ( || ( c , 1 , .Math. , c , r - 1 ) T || || ( c , 1 , .Math. , c , r - 1 ) T || 1 c , r ) . ( 1 )

(45) This definition exploits the mathematical truth discovered as part of the work leading to the invention that given a fixed but arbitrary agent r, the team cost of the action sequence can already be evaluated for all agents r<r since no action associated with any of these agents r will occur in a continuation of .

(46) This makes it possible to simplify the computation of the team cost by instead computing a compressed team cost
{circumflex over ()}(.sub.)(1)(.sub.,1,.sub.,3).sup.T.sub.+(.sub.,2,.sub.,3).sup.T.sub.1,(2)
with .sub.,i denoting the i-th component of the compressed cost vector .sub.. This representation not only removes a dependency of the team cost c.sub. on the team size N, it also a more efficient representation during planning. The reason for this efficiency gain is that additional cost vectors are Pareto-dominated as will be discussed below in the discussion of step 2100, and can thus be eliminated from the set of potential solutions much earlier in the planning process.

(47) Furthermore, in addition to the specification which allows to model discrete constraints, in an optional further development is possible to consider constraints of the agents in continuous domains, like for example constraints on resources . A change of resources may be modeled by a resource function :custom character.fwdarw.custom character.sup.M where M indicates the number of resource dimensions that models the change of resources under a given action acustom character. Conveniently, the resource function can take both negative and positive values to reflect the fact that resources can be modified in both directions.

(48) For the action sequence , the resulting status of resources .sub. is given by .sub.=.sub.0+.sub.a(a). The set of satisfying action sequences is constrained to sequences =s.sub.0a.sub.1s.sub.1 . . . a.sub.ns.sub.n such that at any state s.sub.x and a truncation of sequence until this state s.sub.x, i.e. =s.sub.0a.sub.1 . . . a.sub.xs.sub.x it holds that .sub.,i>0 for each component i=1, . . . , M. In other words, the action sequences are constrained such that the inequality constraint of the resources .sub. holds at any time during the execution of the action sequence .

(49) Note that it is also possible to express constraints of the from .sub.,i0 within this framework by choosing a fixed offset smaller than the minimal change .sub.,i of the resource component .sub.,i under an exchange of any one action a.sub.jcustom character for any other action a.sub.kcustom character, i.e. .sub.,i=min.sub.(a.sub.j.sub.,a.sub.k.sub.)|(a.sub.j).sub.i(a.sub.k).sub.i|. The constraint .sub.,i0 can then be modeled as an equivalent inequality constraint .sub.,i+>0.

(50) While it would be possible to capture interval constraints of the form .sub.,iI=(I.sub.l,I.sub.u) by a set of two inequality constraints, a more preferred solution that introduces a smaller number of Pareto optimal labels as explained below is to remodel the interval constraint as

(51) , I - I u - I l 2 > 0
where

(52) , I = || I u - I l 2 + I l - , i ||
denotes a distance measure of .sub.,i from the center of the interval I.

(53) The actual algorithm for the planning problem discussed above is based on a label-setting approach which can be thought of as a multi-criteria generalization of the Dijkstra shortest path search. Instead of operating on states with associated costs, the label-setting algorithm constructs a set of labels for each state. For each state scustom character, a label l will be given as l=(.sub.,.sub.,v,i.sub.v) which depends on the action sequence that led to state s, .sub. is the associated compressed cost and .sub. the associated resource status, vcustom character is the state that precedes state s in action sequence and i.sub.v is the respective predecessor label.

(54) In other words, the construction of such a multi-dimensional label l fore each state s is an extension of the team-model state space custom character to a higher-dimensional, infinitely large label space custom character, in which each label lcustom charactercustom character of state s instantiates one possible continuous resource configuration and transitions between the labels are described by their predecessor relations. custom character denotes the set of instantiated, i.e., feasible, labels at state s and custom character=custom charactercustom charactercustom characterdenotes the set of all feasible labels.

(55) It is possible to model a resource constraint as a proposition .sub.i, e.g., .sub.i:=(.sub.,i>0). Whether or not .sub.i is true would, in the state space custom character, depend on a full action sequence . However, in label space custom character, .sub.i is either true or false for each element of the label space custom character since it is possible to associate a single label lcustom character with a specific .sub.l,i=.sub.,i as its second component. In a preferred embodiment, the resource constraints are indeed modeled in this way and denote the corresponding set of resource constraint propositions with .sub..

(56) The actual algorithm which is illustrated in FIG. 5 starts with an initialization in step 2010. A set of temporary labels L.sub.t,v is initialized as L.sub.t,v={0,.sub.0, , } for each initial state vcustom character. For each other state scustom character\custom character, a set of temporary labels L.sub.t,s is initialized as L.sub.t,s=. Furthermore, for each state scustom character a set of permanent labels L.sub.p,s is initialized as L.sub.p,s=.

(57) In the following step 2020, it is checked whether the set of temporary labels L.sub.t,s is empty for each state s. If this is the case, no final state f is reachable and the algorithm stops with an error indication in step 2021, which may result in a controlling agents 11, 12, 13 accordingly, e.g. by transitioning the control system 10 into a safe state.

(58) If, however, it is determined that the set of temporary labels L.sub.t,s is not empty for at least one state s, the method proceeds to step 2030. In step 2030, the compressed cost vector compressed cost vector .sub. is computed according to equation (1) and the compressed team cost {circumflex over ()}(.sub..sup.(l)) is computed according to equation (2). This is possible since each label l specifies its predecessor label, and the action sequence leading to label l can be reconstructed. Then, a minimizing state s* and a minimizing label l* from the corresponding set of temporary labels L.sub.t,s* is determined such that they minimize the compressed team cost {circumflex over ()}, i.e.

(59) ( s * , l * ) = argmin s S �� , l L t , s ^ ( c ^ ( l ) ) .

(60) In the next step 2040, the minimizing label l* is removed from the set of temporary labels L.sub.t,s, corresponding to the minimizing state s*, i.e. L.sub.t,s*L.sub.t,s*\{l*} and added to the corresponding set of permanent labels L.sub.p,s*, i.e. L.sub.p,s*L.sub.p,s*, {l*}.

(61) In the following step 2050, it is checked whether the minimizing state s* is an accepting state. If this is the case, the method continues with step 2060, if not, it continues with step 2080.

(62) In step 2060, a final label l.sub.fin is set to label l*. As outlined above, the corresponding final action sequence .sub.fin is reconstructed iteratively from the predecessor labels. The final action sequence .sub.fin is the selected action sequence with minimal compressed team costs {circumflex over ()} and hence minimal team costs . This concludes the algorithm.

(63) In step 2080, a set V of all neighboring states v of minimizing state s* and a corresponding set L.sub.v of corresponding neighboring labels is determined. For example, V may be determined by intitializing V=, L.sub.v=, exploring each state v=(r.sub.v,q.sub.v,s.sub.v)custom character and adding state v to set V if and only if there is an action a that links the minimizing state s* to state v, i.e. a=(s*,v)custom character. If state vcustom character is added to set V, the corresponding new costs .sub.new are computed depending on action a via

(64) c ^ new = { ( || c ^ 1 ( l ) , c ^ 3 ( l ) || || c ^ 2 ( l ) , c ^ 3 ( l ) || 1 0 ) if a c ^ ( l ) + ( 0 , 0 , C ( a ) ) T otherwise .

(65) Similarly, corresponding new resources .sub.new are computed depending on action a via

(66) new = { ( global ( l ) 0 , r v ) if a ( l ) + ( a ) otherwise .

(67) In this formula, .sub.global.sup.(l) denotes the part of the resources .sup.(l) that is global, i.e. independent of the agent, and .sub.0,r.sub.v denote the initial resources of agent r.sub.v. A corresponding new label l.sub.v=(.sub.new, .sub.new,s*,i.sub.s*) is generated, with i.sub.s*=card(L.sub.p,s*). This corresponding new label l.sub.v is then added to the set of neighboring labels L.sub.V. After exploration of all states v is completed, the method continues with step 2090.

(68) In the next step 2090, it is checked it is checked for each neighboring label lL.sub.v whether the corresponding new resource status .sub.new satisfies all constraints. For this purpose, an extended transition function :custom charactercustom character.sup.M.fwdarw.{custom character,custom character} which is an extension of the transition function of the nondeterministic finite automaton custom character is defined as :(a=((r.sub.s,q.sub.s,s.sub.s),(r.sub.t,q.sub.t,s.sub.t)),)custom character((s.sub.s).sub.)custom character(q.sub.s, q.sub.t). The action a.sub.l associated with neighboring label l is determined and it is checked whether (a.sub.l,.sub.new) is true. If it is not true, the method branches back to step 2020. If it is true, however, it is also checked whether the neighboring label l is non-dominated in the Pareto sense.

(69) For ease of notation, an operator <.sub.P denotes a less than-relation in the Pareto sense, i.e. (a.sub.1, . . . , a.sub.n).sup.T<.sub.P(b.sub.1, . . . , b.sub.n).sup.Taba.sub.ib.sub.ii{1, . . . , n}. An operator .sub.P relaxes this relation and also allows a=b. A label is non-dominated in the Pareto sense if there does not exist another label custom character in either the set of temporary labels L.sub.t,s or the set of permanent label L.sub.p,s at the same state v such that (.sup.(custom character.sup.),.sup.(custom character.sup.)).sub.P(.sup.(l),.sup.(l)).

(70) If it is found that no such label custom character exists, it is deemed that the neighboring label l is non-dominated in the Pareto sense and the method continues with step 2100. If, however, such a label custom character exists, the method skips back to step 2020.

(71) In step 2100, all labels custom character which are dominated by any neighboring label lL.sub.v where said neighboring label found to satisfy all constraints and be non-dominated in the Pareto sense by another label are removed from the set of temporary labels L.sub.t,v at the same state v, i.e., L.sub.t,vL.sub.t,v\{custom characterL.sub.t,v:l<.sub.Pcustom character}.

(72) Next, in step 2110, all said aforementioned neighboring labels l are added to the set or temporary labels L.sub.t,v, i.t. L.sub.t,vL.sub.t,v {l}. The method then continues with step 2020.