Unmanned aerial vehicle (UAV) task cooperation method based on overlapping coalition formation (OCF) game
11567512 · 2023-01-31
Assignee
Inventors
- Nan Qi (Nanjing, CN)
- Zanqi Huang (Nanjing, CN)
- Diliao Ye (Nanjing, CN)
- Luliang Jia (Nanjing, CN)
- Yueyue Su (Nanjing, CN)
- Kewei Wang (Nanjing, CN)
- Wei Wang (Nanjing, CN)
- Yijia Liu (Nanjing, CN)
Cpc classification
B64U2201/102
PERFORMING OPERATIONS; TRANSPORTING
B64C39/024
PERFORMING OPERATIONS; TRANSPORTING
International classification
G05D1/10
PHYSICS
Abstract
An unmanned aerial vehicle (UAV) task cooperation method based on an overlapping coalition formation (OCF) game includes: constructing a sequential OCF game model for a UAV multi-task cooperation problem; using a bilateral mutual benefit transfer (BMBT) order that is biased toward the utility of a whole coalition to evaluate a preference of a UAV for a coalitional structure; optimizing task resource allocation of the UAV under an overlapping coalitional structure by using a preference gravity-guided Tabu Search algorithm to form a stable coalitional structure; and optimizing a transmission strategy based on the current coalitional structure, an updated status of a task resource allocation scheme of the UAV, and a current fading environment, so as to maximize task execution utility of a UAV network. The method quantifies characteristics of resource properties of the UAV and a task, and optimizes the task resource allocation of the UAV under the overlapping coalitional structure.
Claims
1. An unmanned aerial vehicle (UAV) task cooperation method based on an overlapping coalition formation (OCF) game, comprising: step 1: considering an overlapping and complementary relationship between resource properties of a UAV and a task and a task priority, quantifying characteristics of the resource properties of the UAV and the task, optimizing task resource allocation of the UAV under an overlapping coalitional structure, and constructing a sequential OCF game model for a UAV multi-task cooperation problem; step 2: using a bilateral mutual benefit transfer (BMBT) order that is biased toward utility of a whole coalition to evaluate a preference of the UAV for the overlapping coalitional structure, wherein all coalition members cooperate with each other to achieve mutual benefits and further improve total task execution utility of a whole network; step 3: optimizing the task resource allocation of the UAV under the overlapping coalitional structure by using a preference gravity-guided Tabu Search algorithm based on a preference relationship between the UAV and tasks with a same type of resource to form a stable coalitional structure; step 4: optimizing a transmission strategy based on a current coalitional structure, an updated status of a task resource allocation scheme of the UAV, and a current fading environment, to maximize the task execution utility of the UAV network; and step 5: performing, by the UAV, the task according to the optimized task resource allocation and the optimized transmission strategy; wherein in step 1, the sequential OCF game model is constructed for the UAV multi-task cooperation problem; and in the sequential OCF game model, the UAV serves as a player and is assumed to allocate a resource and form an overlapping coalition to cooperatively complete the task, wherein a quantity of coalitions is equal to a quantity of tasks; a task cooperation model based on the OCF game is defined as ={N, U.sub.m, SC, X}, wherein N represents a UAV 
player; U.sub.m represents a utility function of a task point coalition m; SC={A.sub.1, . . . A.sub.m}, Mem(A.sub.m)∈N represents the overlapping coalitional structure; Mem(A.sub.m) represents a coalition member set of UAVs that allocate resources to an m.sup.th task point and is expressed as Mem(A.sub.m)={n∈N|A.sub.m.sup.(n)≠Ø}; and X={x.sub.1, . . . , x.sub.n, . . . , x.sub.N} represents a UAV decision-making vector for determining the task resource allocation, and a resource allocation vector of each UAV is defined as X.sub.n=[A.sub.1.sup.(n), . . . , A.sub.m.sup.(n), . . . , A.sub.M.sup.(n)]; and each UAV gets a share of a revenue from a coalition that the UAV joins, a revenue sharing problem is resolved according to basic proportional fairness of a Shapley value, and utility of a UAV n is expressed as follows:
.sub.n.sup.(m) represents a proportion of UAV resource allocation for the task point coalition m to guarantee that a utility back deserved by the UAV from the coalition increases as the task resources of the UAV allocated to the coalition increases, and
.sub.n.sup.(m) is expressed as follows:
.sub.n SC.sub.P; SC.sub.Q
.sub.n SC.sub.P indicates that the UAV player n prefers to allocate the task resources by using the coalitional structure SC.sub.Q instead of the coalitional structure SC.sub.P; a coalitional structure SC.sub.P={A.sub.l.sup.(p), . . . , A.sub.m.sup.(p)} is considered, wherein for some resources δ.sub.n.sup.(z.sup.
.sub.n SC.sub.P; wherein the BMBT order in step 2 is specifically as follows: for any UAV n∈N and any two coalitional structures SC.sub.P and SC.sub.Q that are generated through a switch operation, wherein A(n)={A.sub.n∈SC|A.sub.m.sup.(n)≠Ø, m∈M} represents a set of other task point coalitions to which the UAV n allocates resources,
Γ(k+1)=Γ(k)+k(Γ.sub.max−Γ(k))/K.sub.max; c) randomly selecting some consumable UAV resources δ.sub.n.sup.(z.sup.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)
(2)
(3)
(4)
DETAILED DESCRIPTION OF THE EMBODIMENTS
(5) The embodiments of the present invention are further described in detail below with reference to the accompanying drawings.
(6) The embodiments are established based on a system model shown in
(7) As shown in
(8) Step 1: Consider an overlapping and complementary relationship between resource properties of a UAV and a task and a task priority, quantify characteristics of the resource properties of the UAV and the task, optimize task resource allocation of the UAV under an overlapping coalitional structure, and construct a sequential OCF game model for a UAV multi-task cooperation problem.
(9) (1) The quantifying characteristics of the resource properties of the UAV and the task is specifically as follows:
(10) A cluster network consisting of N heterogeneous UAVs is considered, where a set of the UAVs is expressed as N={1, . . . n . . . , N}. The UAVs need to complete M tasks randomly distributed in the network, and a set of the tasks is expressed as M={1, . . . m . . . , M}. It is assumed that there are Z types of task resources. A set of sub-task types is T={TB.sub.1, . . . , TB.sub.z.sub.
(11) (2) The optimizing task resource allocation of the UAV under an overlapping coalitional structure is specifically as follows:
(12) A satisfaction function is introduced to measure a satisfaction degree of the task. A utility function of the m.sup.th task point may be expressed as follows:
(13)
(14) where C.sub.m.sup.(req) represents a service completion requirement of the task point, and C.sub.m(A.sub.m) represents a service revenue of the task point, which comprehensively considers a task completion status and an energy loss and is defined as follows:
(15)
(16) where D represents a constant to ensure that C.sub.m>0; ω.sub.1, ω.sub.2, and ω.sub.3 are weight coefficients to evaluate proportions of impact of a task revenue, waiting time, and UAV energy consumption on network utility; r(A.sub.m) represents a completion degree of the m.sup.th task point; t.sub.m.sup.(wait) represents waiting time of the m.sup.th task point; and e.sub.m.sup.(n) represents a flight loss of the UAV n for task execution at the m.sup.th task point, which is calculated based on a proportion of a quantity of resources allocated by the UAV to the task point to a total quantity of resources allocated to the task point, and is expressed as follows:
(17)
(18) where E.sub.n represents total propulsion energy consumption of the UAV n. These performance indicators are defined as follows:
(19) 1) Task completion degree: In a resource allocation process of the UAV, quality and a quantity of completed task types need to be considered. The task completion degree represents a proportion of a resource actually allocated to the m.sup.th task point to a resource demand of the task point. When a total quantity of resources allocated by a UAV coalition exceeds the demand of the task point, the task completion degree reaches 100%. Otherwise, the task completion degree is less than 100%. An average task completion degree r(A.sub.m) of the m.sup.th task point is defined as follows:
(20)
(21) where
(22)
and
(23)
represent proportions of allocated consumable and non-consumable resources to the resource demand of the m.sup.th task point respectively,
(24)
represents a total quantity of a z.sub.b.sub.
(25)
represents a total quantity of a z.sub.c.sub.
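As a minimal sketch of the task completion degree described in 1), assume that the completion ratio of each resource type is the allocated quantity divided by the demand, capped at 1, and that r(A.sub.m) is the average of these ratios over the resource types that the task point demands. The function name, the dictionary layout, and the averaging rule are assumptions made for illustration and are not taken from the formulas of the embodiment.

```python
def completion_degree(allocated, demand):
    """Average capped ratio of allocated to demanded resources for one task point.

    allocated, demand: dicts mapping a resource type to a quantity
    (hypothetical layout, chosen only for this sketch).
    """
    ratios = []
    for rtype, need in demand.items():
        if need <= 0:
            continue  # the task point does not require this resource type
        got = allocated.get(rtype, 0.0)
        # Allocation beyond the demand does not raise the ratio above 100%.
        ratios.append(min(got / need, 1.0))
    # A task point with no demand is treated as fully completed (assumption).
    return sum(ratios) / len(ratios) if ratios else 1.0
```

For example, a task point that demands 4 units of each of two resource types and receives 5 units and 2 units has ratios 1.0 and 0.5, so its average completion degree under this assumption is 0.75.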
(26) 2) Waiting time of the task: Time consumed by the UAV n for task execution is decomposed into total flight duration t.sub.n.sup.(fly) and total hovering duration t.sub.n.sup.(hover). The UAV sorts, based on task priorities, task points to which resources are allocated, and generates a task execution sequence, such that a position sequence of the tasks executed by the UAV n is obtained based on the current coalitional structure, namely, Task.sub.n.sup.(UAV)={task.sub.n.sup.(1), . . . , task.sub.n.sup.(i), . . . , task.sub.n.sup.(ζ)}, task.sub.n.sup.(i)∈M, where ζ represents a length of the task execution sequence. In a task execution process, because each UAV has a different position status and flight path, resulting in inconsistent time of arriving at a task position, and task start time of each coalition is determined by a UAV that arrives last, t.sub.task.sub.
(27)
where d.sub.task.sub.
(28)
(29) where loc.sub.n represents the initial position of the UAV n; the task execution duration of the UAV at the m.sup.th task point (approximately the hovering duration of the UAV) is defined as t.sub.m.sup.(hover)=t.sub.m.sup.(com)+t.sub.m.sup.(tran), where t.sub.m.sup.(com) and t.sub.m.sup.(tran) represent the durations of executing a consumable task and a non-consumable task by the UAV, respectively; and based on total communication capacity σ.sub.m.sup.(z.sup.
(30)
(31) In conclusion, the total hovering duration of the UAV n is obtained according to the following formula:
(32)
(33) After the sorting by task priority, an execution sequence before the m.sup.th task point is defined as follows:
Task.sub.m.sup.(point)={task.sub.m.sup.(1), . . . , task.sub.m.sup.(j), . . . , task.sub.m.sup.(J)}, task.sub.m.sup.(j)∈M, task.sub.m.sup.(J)=m
(34) where J represents a length of the task execution sequence before the m.sup.th task point, and the waiting time of the m.sup.th task point is defined as follows:
(35)
(36) 3) Flight energy consumption of the UAV: Because propulsion energy consumption is far greater than communication energy consumption, communication power consumption is usually ignored compared with the propulsion power consumption. Therefore, when a speed of the UAV is V, propulsion power of the UAV is expressed as follows:
(37)
(38) where P.sub.0 and P.sub.1 represent blade profile power and induced power in a hovering state respectively, U.sub.tip and v.sub.0 represent a tip speed of a rotor and a mean rotor velocity in the hovering state respectively, f.sub.0 and η represent a fuselage drag ratio and rotor solidity respectively, and ρ and s.sub.0 represent air density and disc area of the rotor respectively; and therefore, the total propulsion energy consumption of the UAV is as follows:
(39)
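The propulsion model can be sketched in code under a stated assumption: the symbols defined above match the widely used rotary-wing propulsion power model, which consists of a blade profile term, an induced term, and a parasite term. The exact form of the expressions of the embodiment is not reproduced here, so the formula in the sketch, the function names, and the split of energy into a flight part and a hovering part are assumptions. The numerical values in the usage are illustrative only and are not the simulation parameters.

```python
import math

def propulsion_power(V, P0, P1, U_tip, v0, f0, rho, eta, s0):
    """Propulsion power at speed V under the assumed rotary-wing model."""
    # Blade profile power grows with the square of the speed.
    blade = P0 * (1.0 + 3.0 * V**2 / U_tip**2)
    # Induced power decreases as the speed increases.
    inner = math.sqrt(1.0 + V**4 / (4.0 * v0**4)) - V**2 / (2.0 * v0**2)
    induced = P1 * math.sqrt(max(inner, 0.0))  # guard against rounding below 0
    # Parasite power grows with the cube of the speed.
    parasite = 0.5 * f0 * rho * eta * s0 * V**3
    return blade + induced + parasite

def propulsion_energy(V, t_fly, t_hover, **params):
    """Energy for flying t_fly seconds at speed V and hovering t_hover seconds (assumed split)."""
    return (propulsion_power(V, **params) * t_fly
            + propulsion_power(0.0, **params) * t_hover)

def within_energy_budget(energy, threshold):
    """Energy constraint E_n <= E_n^(threshold)."""
    return energy <= threshold

# Illustrative values (assumptions, not taken from Table 1).
params = dict(P0=80.0, P1=90.0, U_tip=120.0, v0=4.0,
              f0=0.6, rho=1.225, eta=0.05, s0=0.5)
hover_power = propulsion_power(0.0, **params)  # reduces to P0 + P1 when hovering
```

In the hovering state the speed-dependent terms vanish, so the sketch returns P.sub.0+P.sub.1, which agrees with the description of P.sub.0 and P.sub.1 as the blade profile power and the induced power in the hovering state.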
(40) In addition, the consumable resources carried by the UAV are limited. After the consumable resources are exhausted, only the non-consumable task can be performed. Furthermore, fuel oil carried by the UAV is also limited. When a remaining fuel capacity of the UAV reaches a threshold, the UAV has to exit task allocation and return. Therefore, navigation of the UAV needs to satisfy the following energy constraint:
E.sub.n≤E.sub.n.sup.(threshold),n∈N.
(41) In conclusion, network optimization is performed to maximize the utility of the whole network by forming a better overlapping coalitional structure for resource allocation. Therefore, an optimization formula is as follows:
(42)
(43) (3) A task resource allocation problem of a heterogeneous coalition-based UAV network is modeled as an overlapping coalition game model with transferable utility. In the model, the UAV acts as a player. Assuming that the UAV allocates a resource and forms an overlapping coalition to cooperatively complete the task, a quantity of coalitions is equal to a quantity of tasks. A task cooperation model based on the OCF game is defined as ={N, U.sub.m, SC, X}, where N represents a UAV player; U.sub.m represents a utility function of a task point coalition m; SC={A.sub.1, . . . , A.sub.m}, Mem(A.sub.m)∈N represents the overlapping coalitional structure; Mem(A.sub.m) represents the coalition member set of the UAVs that allocate the resources to the m.sup.th task point and is expressed as Mem(A.sub.m)={n∈N|A.sub.m.sup.(n)≠Ø}; and X={x.sub.1, . . . , x.sub.n, . . . , x.sub.N} represents a UAV decision-making vector for determining the task resource allocation, and a resource allocation vector of each UAV is defined as X.sub.n=[A.sub.1.sup.(n), . . . , A.sub.m.sup.(n), . . . , A.sub.M.sup.(n)]. In addition, each UAV gets a share of a revenue from a coalition that the UAV joins, and a revenue sharing problem is resolved according to basic proportional fairness of a Shapley value. Utility of the UAV n may be expressed as follows:
(44)
(45) where .sub.n.sup.(m) represents the proportion of UAV resource allocation for the task point coalition m, which guarantees that the utility returned to the UAV by the coalition increases as the UAV allocates more task resources to the coalition. A corresponding expression is as follows:
(46)
(47) where O.sub.n.sup.m represents the amount of resources allocated by the UAV n to the task point coalition m, and is expressed as follows:
(48)
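The proportional revenue sharing described above can be illustrated with a short sketch. It assumes that the share of the UAV n in the coalition m is the amount of resources that the UAV n allocates to the coalition divided by the total amount allocated by all members, and that the utility of the UAV n is the sum of its shares over the coalitions it joins. The function name and the nested dictionary layout are hypothetical, and the expressions of the embodiment may differ in detail.

```python
def uav_utility(n, allocation, coalition_utility):
    """Utility of UAV n under an assumed proportional sharing rule.

    allocation[m][n]: amount of resources UAV n allocates to task point m.
    coalition_utility[m]: utility U_m of the task point coalition m.
    """
    total = 0.0
    for m, by_uav in allocation.items():
        pool = sum(by_uav.values())      # all resources allocated to coalition m
        mine = by_uav.get(n, 0.0)        # resources contributed by UAV n
        if pool > 0 and mine > 0:
            total += mine / pool * coalition_utility[m]
    return total

# Two coalitions; UAV "a" overlaps both, UAV "b" joins only coalition 1.
allocation = {1: {"a": 2.0, "b": 2.0}, 2: {"a": 1.0}}
coalition_utility = {1: 10.0, 2: 4.0}
u_a = uav_utility("a", allocation, coalition_utility)
u_b = uav_utility("b", allocation, coalition_utility)
```

In this example the UAV a receives half of the utility of coalition 1 and all of the utility of coalition 2, and the individual utilities add up to the total utility of the two coalitions.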
(49) Two coalitional structures SC.sub.Q and SC.sub.P are given. That the coalitional structure SC.sub.Q is superior to the coalitional structure SC.sub.P is expressed as SC.sub.Q ≻.sub.n SC.sub.P, which indicates that the UAV player n prefers to allocate the task resources by using the coalitional structure SC.sub.Q instead of the coalitional structure SC.sub.P. A coalitional structure SC.sub.P={A.sub.1.sup.(p), . . . , A.sub.m.sup.(p)} is considered. For some resources δ.sub.n.sup.(z.sup.
(50) For a non-consumable resource μ.sub.n.sup.(z.sup..sub.n SC.sub.P.
(51) Step 2: Propose a BMBT order to evaluate the preferences of the UAV n for the two coalitional structures, so as to avoid becoming trapped in a local optimum and to increase the total network utility.
(52) The BMBT order is defined as follows: For any UAV n∈N and any two coalitional structures SC.sub.P and SC.sub.Q that are generated through the switch operation, where A(n)={A.sub.n∈SC|A.sub.m.sup.(n)≠Ø, m∈M} represents a set of other task point coalitions to which the UAV n allocates resources,
(53)
and
(54) when the UAV n performs a resource switch operation, a proposed preference decision indicates that total coalitional utility containing utility of the UAV n and utility of a resource transfer coalition is greater than that before resource transfer. When this condition is satisfied, the switch operation is successful; otherwise, the switch operation fails.
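The acceptance test described for the BMBT order can be sketched as a comparison of two sums. The sketch assumes that the compared quantity is the utility of the UAV n plus the utilities of the coalitions involved in the resource transfer, evaluated before and after the switch operation. The function name and the argument layout are hypothetical, and the sketch paraphrases the verbal description rather than the formal definition.

```python
def bmbt_prefers(u_n_before, u_n_after, transfer_before, transfer_after):
    """Return True if the switch operation succeeds under the assumed BMBT test.

    u_n_before, u_n_after: utility of UAV n before and after the switch.
    transfer_before, transfer_after: utilities of the coalitions involved in
    the resource transfer, before and after the switch.
    """
    before = u_n_before + sum(transfer_before)
    after = u_n_after + sum(transfer_after)
    # The switch is accepted only if the combined utility strictly increases.
    return after > before

# UAV n loses 0.5, but the two coalitions involved gain 1.0 in total.
accepted = bmbt_prefers(3.0, 2.5, [10.0, 4.0], [9.0, 6.0])
# The combined utility is unchanged, so the switch fails.
rejected = bmbt_prefers(3.0, 3.0, [10.0, 4.0], [9.0, 5.0])
```

The first call shows why the order is biased toward the utility of the whole coalition: a switch can succeed even when the utility of the UAV n itself decreases, provided that the combined utility increases.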
(55) Step 3: Optimize the task resource allocation of the UAV under the overlapping coalitional structure by using a preference gravity-guided Tabu Search algorithm based on a preference relationship between the UAV and tasks with a same type of resource, to form a stable coalitional structure.
(56) Firstly, a resource allocation vector of the UAV n for the m.sup.th task point in a k.sup.th iteration is defined as A.sub.m.sup.(n)={τ.sub.n,m.sup.(1)(k), . . . , τ.sub.n,m.sup.(z.sup.
(57)
(58) Le.sub.m.sup.(z.sup.
(59)
(60) The probability vector for allocating the z.sup.th type of resource δ.sub.n.sup.(z) of the n.sup.th UAV to each task is defined as P.sub.n.sup.(z)(k)=[p.sub.n,1.sup.(z)(k), . . . , p.sub.n,m.sup.(z)(k), . . . , p.sub.n,M.sup.(z)(k)], and is expressed as follows:
(61)
(62) where Γ(k) represents the Boltzmann coefficient. The UAV selects a switch operation according to a selection probability established based on the preference gravity. If the proposed BMBT order is satisfied, the UAV performs the switch operation of resource allocation to improve the total task execution utility of the network; otherwise, the UAV maintains the original coalitional structure under the resource allocation. A specific algorithm process is as follows:
(63) a) Initialization: Set a quantity k of iterations to 0, namely, k=0. A coalitional structure under resource allocation in each iteration is denoted as SC.sup.(k)={A.sub.1.sup.(k), . . . , A.sub.m.sup.(k)}, a Tabu list Tabu.sub.SC={SC.sup.(k-L.sup.
(64) b) Set k=k+1. A vector L.sub.m.sup.(less)(k) and the Tabu list Tabu.sub.SC are updated based on a coalitional structure SC.sup.(k-1). The Boltzmann coefficient is updated according to a rule Γ(k+1)=Γ(k)+k(Γ.sub.max−Γ(k))/K.sub.max.
(65) c) Some consumable UAV resources δ.sub.n.sup.(z.sup.
(66) d) The UAV n updates the coalitional structure according to the following order:
(67)
(68) e) End the process when the utility is still not improved after K.sub.stable iterations are performed or a total quantity of iterations reaches K.sub.max, to obtain a final convergence structure SC.sup.(*).
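Steps a) to e) can be sketched on a toy problem under several assumptions. Each resource unit is assigned to exactly one task. The selection probability is taken to be a softmax of the preference gravity scaled by Γ(k). Γ(0) is set to 0. A switch is accepted when the total utility strictly increases, which stands in for the BMBT order. The Tabu list is used only to skip recently evaluated structures. All function names, default values, and the toy utility are hypothetical; only the update rule for Γ(k) and the two stopping criteria follow the text above.

```python
import math
import random

def boltzmann_update(gamma, k, gamma_max, k_max):
    """Rule from step b): Γ(k+1) = Γ(k) + k(Γ_max − Γ(k)) / K_max."""
    return gamma + k * (gamma_max - gamma) / k_max

def selection_probabilities(gravity, gamma):
    """Softmax over preference gravities (assumed form of the selection rule)."""
    weights = [math.exp(gamma * g) for g in gravity]
    total = sum(weights)
    return [w / total for w in weights]

def tabu_search(initial, utility, gravity, n_tasks, gamma_max=2.0,
                k_max=200, k_stable=30, tabu_len=10, seed=0):
    rng = random.Random(seed)
    current = tuple(initial)
    current_u = utility(current)
    tabu = [current]            # recently evaluated structures
    gamma, stale = 0.0, 0       # Γ(0) = 0 is an assumption
    for k in range(1, k_max + 1):
        gamma = boltzmann_update(gamma, k, gamma_max, k_max)
        unit = rng.randrange(len(current))          # pick a resource unit
        probs = selection_probabilities(gravity(unit, current), gamma)
        task = rng.choices(range(n_tasks), weights=probs)[0]
        candidate = current[:unit] + (task,) + current[unit + 1:]
        if candidate not in tabu:
            tabu = (tabu + [candidate])[-tabu_len:]
            cand_u = utility(candidate)
            if cand_u > current_u:                  # stand-in for the BMBT order
                current, current_u, stale = candidate, cand_u, 0
                continue
        stale += 1
        if stale >= k_stable:                       # no improvement for K_stable iterations
            break
    return current, current_u

# Toy problem: four resource units, two tasks that each demand two units.
demand = [2, 2]

def toy_utility(assign):
    return -sum(abs(assign.count(m) - d) for m, d in enumerate(demand))

def toy_gravity(unit, assign):
    # Tasks with more unmet demand attract resources more strongly.
    return [max(d - assign.count(m), 0) for m, d in enumerate(demand)]

best, best_u = tabu_search((0, 0, 0, 0), toy_utility, toy_gravity, n_tasks=2)
```

Because a switch is accepted only when the utility strictly increases, the returned structure is never worse than the initial one. As Γ(k) grows toward Γ.sub.max, the selection concentrates on the tasks with the largest preference gravity.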
(69) Simulation analysis is performed by using simulation parameters in Table 1.
(70) TABLE 1 Simulation parameters
Parameter | Value
Required task completion time | t.sub.m.sup.(com) ∈ (50 − 120), ∀m ∈ M
Maximum flight speed of a UAV (m/s) | v.sub.n ∈ (6 − 22), ∀n ∈ N
Time consumption weight coefficient | ω.sub.2/ω.sub.1 = (0.01 − 0.04)
Energy consumption weight coefficient | ω.sub.3/ω.sub.1 = (0.01 − 0.04)
Boltzmann coefficient | Γ.sub.max = 2
Blade profile power | p.sub.n = 0.2 W, ∀n ∈ N
Fuselage drag ratio | f.sub.0 = 0.3
Air density | ρ = 1.125
Mean rotor velocity | v.sub.0 = 200 m/s
Tip speed of a rotor | U.sub.tip = 7.3 m/s
(71) As shown in
(72)
(73) To sum up, the existing CF game model usually assumes that different UAVs execute tasks separately, does not consider a cooperation relationship between heterogeneous UAVs, and only optimizes a composition structure of UAVs in a coalition.
(74) In view of this, considering the overlapping and complementary relationship between the resource properties of the UAV and the task and the task priority, the present invention provides the sequential OCF game model to quantify the characteristics of the resource properties of the UAV and the task, and to optimize the task resource allocation of the UAV under the overlapping coalitional structure. In addition, the present invention provides a BMBT order maximizing a revenue of a UAV, and it is further proved that the OCF game under the BMBT order is an exact potential game. The stability of the overlapping coalitional structure is then ensured through a Nash equilibrium (NE). Based on the preference relationship between the UAV and the tasks with the same type of resource, the present invention provides the preference gravity-guided Tabu Search algorithm to obtain the stable coalitional structure. The proposed OCF game scheme based on the preference gravity-guided Tabu Search algorithm in the present invention is superior to a non-overlapping CF game scheme. In addition, the proposed BMBT order is superior to other criteria.
(75) The above are only preferred implementations of the present invention, and the protection scope of the present invention is not limited to the above embodiments. All technical solutions based on the idea of the present invention should fall within the protection scope of the present invention. It should be noted that several modifications and adaptations made by those of ordinary skill in the art without departing from the principle of the present invention should fall within the protection scope of the present invention.