Optimization method for UAV-based wireless information and energy transmission

11641591 · 2023-05-02

Abstract

An optimization method for UAV-based wireless information and energy transmission includes the following steps: S1: reporting, by a wireless device, an energy state of the wireless device to a UAV; S2: detecting, by the UAV, a channel state between the UAV and the wireless device; and S3: selecting, by the UAV, an optimal action based on estimated revenue maximization according to an electric quantity of the UAV, an electric quantity of the wireless device, and the channel state. Using wireless devices can reduce wiring costs, improve the appearance of the space, and allow a smaller size and lower power consumption. Applying the UAV to information and energy transmission for the wireless devices improves the data transmission rate and the energy conversion efficiency of the network.

Claims

1. An optimization method for unmanned aerial vehicle (UAV)-based wireless information and energy transmission, comprising steps of: S1: reporting, by a wireless device, an energy state B(t) of the wireless device to a UAV, where B(t)∈[0,B.sub.max], where B.sub.max denotes a maximum energy state of the wireless device; S2: detecting, by the UAV, a channel state γ(t) between the UAV and the wireless device; and S3: selecting, by the UAV, an optimal action based on an optimal design for wireless information and energy transmission of the UAV that is based on an estimated revenue maximization according to an electric quantity of the UAV, an electric quantity of the wireless device, and the channel state; wherein the optimal design for wireless information and energy transmission of the UAV is modeled as a restrictive Markov decision process within limited time (MDP), a state space of the MDP is S={(γ(t), B(t), E.sub.r(t)): γ(t)∈[0, +∞), B(t)∈[0, B.sub.max], E.sub.r(t)∈[0, E.sub.p]}, an action space is
$$A = \{0\ (\text{keep silence}),\ 1\ (\text{transmit energy to the wireless device}),\ 2\ (\text{transmit information to the wireless device})\},$$
the revenue is expressed as
$$R_t(S(t), a(t)) = \log_2\left(1 + \frac{P(t)\gamma(t)}{P_0}\right) I(B(t) \ge E_d)\, I(E_r(t) \ge P_f)\, I(a(t) = 2),$$
a state transition function of the UAV and a state transition function of the wireless device is respectively expressed as
$$B(t+1) = \begin{cases} B(t), & a(t) = 0 \\ B(t) + P_f(t)\gamma(t), & a(t) = 1 \\ B(t) - E_d, & a(t) = 2 \end{cases}$$
and
$$E_r(t+1) = \begin{cases} E_r(t), & a(t) = 0 \\ E_r(t) - P_f, & a(t) = 1 \text{ or } a(t) = 2, \end{cases}$$
an objective function is expressed as
$$J(\pi) = \max_{\pi} \sum_{t=1}^{T} R_t(S(t), a(t)),$$
where π represents an action strategy function, input is S(t) and output is a(t), J(π) represents a total revenue under a strategy π, E.sub.r(t) represents a remaining service energy of the UAV, P.sub.0 represents a noise power, I(•) represents an indicator function, P(t) represents an operating power of the UAV in time slot t, γ(t) represents independently identically distribution in time slot t, γ(t) represents a total signal fading, E.sub.d represents the energy required by the wireless device for each time decoding, E.sub.p represents a maximum value of the remaining service energy of the UAV, a(t) represents each action of the time slot t, T represents the UAV serves T time slots for the wireless device, P.sub.f represents a transmission power of the UAV, R.sub.t represents an instantaneous revenue of the time slot t.

2. The optimization method according to claim 1, wherein the energy state B(t) of the wireless device is classified into a first scarcity state, a first medium state, and a first sufficiency state, respectively, the first scarcity state corresponding to B(t)<E.sub.d, the first medium state corresponds to E.sub.d≤B(t)<(1+T−t)E.sub.d, and the first sufficiency state corresponds to B(t)≥(1+T−t)E.sub.d; when B(t)<E.sub.d, the wireless device fails to decode, and the UAV does not transmit the information to the wireless device; and when B(t)≥(1+T−t)E.sub.d, a current electric quantity of the wireless device is enough to support decoding of all current and future time slots, and the UAV does not need to determine to charge the wireless device.

3. The optimization method according to claim 2, wherein an energy state of the UAV is classified into a second scarcity state, a second medium state, and a second sufficiency state, respectively, the second scarcity state corresponds to P.sub.f≤E.sub.r(t)<2P.sub.f, the second medium state corresponds to 2P.sub.f≤E.sub.r(t)<(1+T−t)P.sub.f, and the second sufficiency state corresponds to E.sub.r(t)≥(1+T−t)P.sub.f; when P.sub.f≤E.sub.r(t)<2P.sub.f, the UAV does not determine to charge the wireless device, and when E.sub.r(t)≥(1+T−t)P.sub.f, a current electric quantity of the UAV is enough to support information transmission in all current and future time slots, and the UAV does not need to determine to keep silence.

4. The optimization method according to claim 3, wherein the UAV needs to determine the action space in different states, and when there is more than one action in the action space, a value needs to be calculated for each action of the more than one action, an action with a maximum value is selected, and the value of the each action is defined as Q.sub.t(S(t),a(t))≙R.sub.t(S(t), a(t))+F.sub.t(B(t+1), E.sub.r(t+1)); F.sub.t(B(t+1), E.sub.r(t+1)) represents an estimated future revenue after a time slot t; Q.sub.t represents a total revenue of an instantaneous revenue plus the estimated future revenue corresponding to the each action a(t) in a state S(t), S(t) represents a system state of the time slot t, a(t) represents the each action of the time slot t, and R.sub.t represents the instantaneous revenue of the time slot t.

5. The optimization method according to claim 4, wherein when the electric quantity of the UAV is in different states, there are different calculation methods for the estimated future revenue, and the each action of the time slot of the UAV is expressed as
$$a(t) = \arg\max_{a_t} Q_t(S(t), a(t)).$$

6. The optimization method according to claim 5, wherein when the UAV is in shortage of energy,
$$F_t(B(t+1), E_r(t+1)) = \begin{cases} V_n, & a(t) = 0 \\ 0, & a(t) = 2, \end{cases}$$
wherein V.sub.n represents an expected revenue of a next time slot and is expressed as
$$V_n = \log_2\left(1 + \frac{P_f E\{\gamma(t+1)\}}{P_0}\right);$$
P.sub.f represents a transmission power of the UAV, E represents a mathematical expectation symbol, γ represents the channel state, and P.sub.0 represents a noise power.

7. The optimization method according to claim 6, wherein when the UAV has a medium energy, a number of times the UAV charges the wireless device in a future is estimated as
$$n_c = \arg\min_n \left| \left\lfloor \frac{E_r(t+1)}{P_f} \right\rfloor - n - \left\lfloor \frac{B(t+1) + n P_f E\{\gamma(t)\}}{E_d} \right\rfloor \right|,$$
a number of times the UAV transmits information to the wireless device in the future is estimated as
$$n_m = \min\left\{ \left\lfloor \frac{E_r(t+1)}{P_f} \right\rfloor - n_c,\ \left\lfloor \frac{B(t+1) + n_c P_f E\{\gamma(t)\}}{E_d} \right\rfloor \right\},$$
and the estimated future revenue is expressed as
$$F_t(B(t+1), E_r(t+1)) = n_m \cdot \log_2\left(1 + \frac{P_f E\{\gamma(t)\}}{P_0}\right).$$

8. The optimization method according to claim 7, wherein when the UAV has sufficient energy, the number of times the UAV charges the wireless device in the future is estimated as
$$n_c = \arg\min_n \left| \lfloor T - t \rfloor - n - \left\lfloor \frac{B(t+1) + n P_f E\{\gamma(t)\}}{E_d} \right\rfloor \right|,$$
the number of times the UAV transmits information to the wireless device is estimated as
$$n_m = \min\left\{ \lfloor T - t \rfloor - n_c,\ \left\lfloor \frac{B(t+1) + n_c P_f E\{\gamma(t)\}}{E_d} \right\rfloor \right\},$$
and the estimated future revenue is expressed as
$$F_t(B(t+1), E_r(t+1)) = n_m \cdot \log_2\left(1 + \frac{P_f E\{\gamma(t)\}}{P_0}\right),$$
where n is a variable representing the number of times that the UAV performs wireless energy transfer.

9. The optimization method according to claim 8, wherein a signal transmitted from the UAV to the wireless device is classified into a direct signal and an indirect signal according to different propagation paths.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 is a flowchart of an optimization method for UAV-based wireless information and energy transmission according to an embodiment of the present disclosure;

(2) FIG. 2 is a schematic diagram of a system model according to an embodiment of the present disclosure;

(3) FIG. 3 is a schematic diagram of a forward algorithm for searching an optimal action according to an embodiment of the present disclosure;

(4) FIG. 4 is a schematic diagram showing revenue comparison of three strategies under different T according to an embodiment of the present disclosure; and

(5) FIG. 5 is a schematic diagram showing revenue of a two-element control strategy under different heights according to an embodiment of the present disclosure.

DETAILED DESCRIPTION OF THE EMBODIMENTS

(6) FIG. 1 shows a flowchart of an optimization method for UAV-based wireless information and energy transmission according to an embodiment of the present disclosure, which is described in detail below.

(7) Description of System Model

(8) A UAV-based downlink wireless information and energy transmission system is considered. In this system, both the UAV and the wireless device are provided with batteries. When the UAV transmits energy to the wireless device, the wireless device stores the energy in its own battery. When the UAV transmits information to the wireless device, the wireless device uses the energy of its battery to receive the signal from the UAV and decode it.

(9) As shown in FIG. 2, this model includes one UAV and a plurality of wireless devices. The UAV has limited energy, so to save energy it adjusts only its height and does not move horizontally during the whole working period. The UAV allocates a period of time and a part of its energy to serve each wireless device separately. Therefore, it is only necessary to study the process of the UAV serving one specific wireless device. The UAV serves the wireless device for T time slots. At the beginning of the t.sup.th time slot, the wireless device reports its energy state B(t) to the UAV, and the UAV detects the channel state γ(t) between itself and the wireless device. Next, the UAV chooses to keep silence, charge the wireless device, or transmit information to the wireless device according to the electric quantity of the UAV, the electric quantity of the wireless device, and the channel state. The horizontal distance and the vertical distance from the wireless device to the UAV are represented by L and H, respectively. The action of the UAV is represented by

(10) $$a(t) \in \{0\ (\text{keep silence}),\ 1\ (\text{transmit energy to the wireless device}),\ 2\ (\text{transmit information to the wireless device})\},$$
the remaining service energy is represented by E.sub.r(t), and the transmission power is represented by

(11) $$P(t) = \begin{cases} 0, & a(t) = 0 \\ P_f, & a(t) = 1 \text{ or } a(t) = 2, \end{cases} \tag{1}$$
where P.sub.f represents the operating power of the UAV. The energy that the wireless device requires for each decoding is represented by E.sub.d. The system state is expressed as S(t)≙(γ(t), B(t), E.sub.r(t)). The process is a Markov decision process because the system state of the current time slot depends only on the system state of the previous time slot and the action of the UAV in the previous time slot.

(12) Channel Model

A signal transmitted from the UAV to the wireless device may be classified as a direct signal or an indirect signal according to its propagation path. The proportion of the direct signal depends on factors such as the height and density of the surrounding buildings, the height of the UAV, and the horizontal angle between the UAV and the wireless device, and is expressed as

(13) $$p_L = \frac{1}{1 + a \exp(-b(\theta - a))}, \tag{2}$$
where a and b represent parameters related to the environment, and θ represents the horizontal angle between the UAV and the wireless device, calculated as

(14) $$\theta = \frac{180}{\pi} \arctan\frac{H}{L}.$$
The proportion of the indirect signals is p.sub.N=1−p.sub.L. In the t.sup.th slot, fading of the direct signal and fading of the indirect signal are respectively as below:
$$\gamma_L(t) = |h_L(t)|^2 \left(\sqrt{L^2 + H^2}\right)^{-\alpha_L} \tag{3}$$
and
$$\gamma_N(t) = |h_N(t)|^2 \left(\sqrt{L^2 + H^2}\right)^{-\alpha_N}, \tag{4}$$
where α.sub.L and α.sub.N represent the path fading coefficient of the direct signal and that of the indirect signal, respectively. h.sub.L(t) and h.sub.N(t) represent the multipath fading coefficient of the direct signal and that of the indirect signal in the t.sup.th time slot, respectively, and both follow a Nakagami-m distribution. In this case, the probability density functions of |h.sub.L(t)|.sup.2 and |h.sub.N(t)|.sup.2 are as below:

(15) $$f_{|h_L|^2}(x) = \frac{m_L^{m_L} x^{m_L - 1}}{\Omega_L^{m_L} \Gamma(m_L)} \exp\left(-\frac{m_L x}{\Omega_L}\right) \tag{5}$$
and
$$f_{|h_N|^2}(x) = \frac{m_N^{m_N} x^{m_N - 1}}{\Omega_N^{m_N} \Gamma(m_N)} \exp\left(-\frac{m_N x}{\Omega_N}\right), \tag{6}$$
where m.sub.L and m.sub.N represent the Nakagami parameter of the direct signal and that of the indirect signal, respectively. Ω.sub.L=E{|h.sub.L(t)|.sup.2} and Ω.sub.N=E{|h.sub.N(t)|.sup.2} represent the multipath fading power of the direct signal and that of the indirect signal, respectively. Γ(•) represents the Gamma function. The total signal fading is expressed as
γ(t)=p.sub.Lγ.sub.L(t)+p.sub.Nγ.sub.N(t)  (7).
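By way of illustration only, the channel model of Formulas (2) to (7) may be sampled as in the following minimal Python sketch. It relies on the fact that the densities in Formulas (5) and (6) are Gamma densities with shape m and scale Ω/m. The function names, the use of `atan2` in place of arctan(H/L), and the path fading coefficients used in the usage line are assumptions made for this sketch and are not values given in the present disclosure.

```python
import math
import random


def los_proportion(H, L, a, b):
    """Formula (2): proportion of the direct signal for heights/distances H and L."""
    # atan2(H, L) equals arctan(H / L) for L > 0 and also tolerates L == 0 (assumption).
    theta = 180.0 / math.pi * math.atan2(H, L)
    return 1.0 / (1.0 + a * math.exp(-b * (theta - a)))


def sample_fading(H, L, a, b, m_L, m_N, omega_L, omega_N, alpha_L, alpha_N, rng=random):
    """Draw one total fading value gamma(t) following Formulas (3) to (7)."""
    # |h|^2 of a Nakagami-m amplitude is Gamma(shape=m, scale=omega/m),
    # which is the density written in Formulas (5) and (6).
    h2_L = rng.gammavariate(m_L, omega_L / m_L)
    h2_N = rng.gammavariate(m_N, omega_N / m_N)
    distance = math.hypot(L, H)
    gamma_L = h2_L * distance ** (-alpha_L)  # Formula (3)
    gamma_N = h2_N * distance ** (-alpha_N)  # Formula (4)
    p_L = los_proportion(H, L, a, b)
    return p_L * gamma_L + (1.0 - p_L) * gamma_N  # Formula (7)


# Usage with hypothetical path fading coefficients alpha_L=2.0 and alpha_N=3.0.
gamma_t = sample_fading(100.0, 200.0, 8.5, 0.33, 3, 2, 1.0, 1.0, 2.0, 3.0,
                        rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps a simulation run repeatable; the module-level generator is used when no generator is supplied.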

(16) State, Action and Revenue of an MDP Model

The optimal design for wireless information and energy transmission of the UAV may be modeled as a constrained Markov decision process (MDP) over a finite time horizon. The state space of this MDP is S={(γ(t),B(t),E.sub.r(t)):γ(t)∈[0,+∞),B(t)∈[0,B.sub.max],E.sub.r(t)∈[0,E.sub.p]}. The action space is:

(17) $$A = \{0\ (\text{keep silence}),\ 1\ (\text{transmit energy to the wireless device}),\ 2\ (\text{transmit information to the wireless device})\}.$$
The revenue is an information rate, which is expressed as

(18) $$R_t(S(t), a(t)) = \log_2\left(1 + \frac{P(t)\gamma(t)}{P_0}\right) I(B(t) \ge E_d)\, I(E_r(t) \ge P_f)\, I(a(t) = 2), \tag{8}$$
where P.sub.0 represents a noise power, and I(•) represents an indicator function.
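As a minimal sketch, Formula (8) may be evaluated as follows. The function name and argument order are assumptions of this sketch; the three indicator functions are written as an early return.

```python
import math


def revenue(a, gamma, B, E_r, P_f, P_0, E_d):
    """Formula (8): instantaneous revenue R_t for action a in state (gamma, B, E_r)."""
    # All three indicators must equal 1: a(t) = 2, B(t) >= E_d and E_r(t) >= P_f.
    if a != 2 or B < E_d or E_r < P_f:
        return 0.0
    # When a(t) = 2 the transmission power P(t) equals P_f (Formula (1)).
    return math.log2(1.0 + P_f * gamma / P_0)
```

For example, with P.sub.f=P.sub.0=1 and γ(t)=3, transmitting information yields log.sub.2(4)=2, while any other action yields 0.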

(19) State Transition

If the UAV does not have enough energy to transmit a signal, it keeps silence. Therefore, the strategy design only needs to consider the case in which the UAV has enough energy to transmit a signal, i.e., E.sub.r(t)≥P.sub.f. The state transition function of the wireless device and the state transition function of the UAV may be respectively expressed as

(20) $$B(t+1) = \begin{cases} B(t), & a(t) = 0 \\ B(t) + P_f(t)\gamma(t), & a(t) = 1 \\ B(t) - E_d, & a(t) = 2 \end{cases} \tag{9}$$
and
$$E_r(t+1) = \begin{cases} E_r(t), & a(t) = 0 \\ E_r(t) - P_f, & a(t) = 1 \text{ or } a(t) = 2, \end{cases} \tag{10}$$
where γ(t) is independent and identically distributed across time slots.
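The state transitions of Formulas (9) and (10) may be sketched in Python as below. The function name is hypothetical, and capping the device energy at B.sub.max is an assumption derived from the state space B(t)∈[0,B.sub.max]; Formula (9) itself does not state a cap.

```python
def step(a, gamma, B, E_r, P_f, E_d, B_max=float("inf")):
    """Return (B(t+1), E_r(t+1)) for action a, following Formulas (9) and (10)."""
    if a == 0:   # keep silence: neither battery changes
        return B, E_r
    if a == 1:   # transmit energy: the device harvests P_f * gamma, the UAV spends P_f
        # Capping at B_max is an assumption of this sketch.
        return min(B + P_f * gamma, B_max), E_r - P_f
    if a == 2:   # transmit information: the device spends E_d to decode, the UAV spends P_f
        return B - E_d, E_r - P_f
    raise ValueError("action must be 0, 1 or 2")
```

The caller is expected to offer only feasible actions, that is, E.sub.r(t)≥P.sub.f for actions 1 and 2, and B(t)≥E.sub.d for action 2.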

(21) Objective Function and Constraint

The objective function is expressed as

(22) $$J(\pi) = \max_{\pi} \sum_{t=1}^{T} R_t(S(t), a(t)), \tag{11}$$
where π represents an action strategy function whose input is S(t) and whose output is a(t), and J(π) represents the total revenue under the strategy π. The UAV has limited energy, so the constraint of the model is

(23) $$\sum_{t=1}^{T} P(t) \le E_r(1), \tag{12}$$
where E.sub.r(1) represents the total energy available for the UAV to serve the wireless device.

(24) Action Selection Strategy

Three strategies are provided: a greedy strategy, a two-element control strategy, and a God strategy.

(25) Greedy Strategy

The first strategy is the simplest one, the greedy strategy, in which the action of the UAV in the t.sup.th time slot is

(26) $$a(t) = \begin{cases} 1, & B(t) < E_d \\ 2, & B(t) \ge E_d. \end{cases} \tag{13}$$
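A minimal sketch of the greedy strategy of Formula (13), rolled out over a given sequence of channel states, might look as follows. The function names are hypothetical, and the check that forces silence when E.sub.r(t)<P.sub.f reflects the statement that a UAV without enough energy keeps silence.

```python
import math


def greedy_action(B, E_r, P_f, E_d):
    """Formula (13), preceded by the rule that a UAV without enough energy keeps silence."""
    if E_r < P_f:
        return 0
    return 1 if B < E_d else 2


def run_greedy(gammas, B, E_r, P_f, P_0, E_d):
    """Total revenue of the greedy strategy over the channel states in gammas."""
    total = 0.0
    for gamma in gammas:
        a = greedy_action(B, E_r, P_f, E_d)
        if a == 1:    # charge the wireless device, Formulas (9) and (10)
            B, E_r = B + P_f * gamma, E_r - P_f
        elif a == 2:  # transmit information and collect the rate of Formula (8)
            total += math.log2(1.0 + P_f * gamma / P_0)
            B, E_r = B - E_d, E_r - P_f
    return total
```

With unit parameters, an empty device battery, and UAV energy for three transmissions, the rollout charges, transmits, and charges again, for a total revenue of log.sub.2(2)=1 over three slots.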

(27) Two-Element Control Strategy

Because γ(t) and B(t) are continuous, the state S is also continuous, and a Markov decision process with a continuous state space is particularly difficult to solve. Thus, a sub-optimal solution is provided. The energy state of the wireless device is classified into a scarcity state, a medium state, and a sufficiency state, corresponding to B(t)<E.sub.d, E.sub.d≤B(t)<(1+T−t)E.sub.d, and B(t)≥(1+T−t)E.sub.d, respectively. When B(t)<E.sub.d, the wireless device cannot decode, so the UAV does not transmit information to the wireless device. When B(t)≥(1+T−t)E.sub.d, the current electric quantity of the wireless device is enough to support decoding in the current and all future time slots, so the UAV does not need to consider charging the wireless device. The energy state of the UAV is likewise classified into a scarcity state, a medium state, and a sufficiency state, corresponding to P.sub.f≤E.sub.r(t)<2P.sub.f, 2P.sub.f≤E.sub.r(t)<(1+T−t)P.sub.f, and E.sub.r(t)≥(1+T−t)P.sub.f, respectively. When P.sub.f≤E.sub.r(t)<2P.sub.f, the UAV does not consider charging the wireless device, because after charging it could do nothing but keep silence in all subsequent time slots. When E.sub.r(t)≥(1+T−t)P.sub.f, the current electric quantity of the UAV is enough to support signal transmission in the current and all future time slots, so the UAV does not need to consider keeping silence.

(28) TABLE 1. Action space at the current time (rows: energy state B(t) of the wireless device; columns: energy state E.sub.r(t) of the UAV)

B(t) \ E.sub.r(t)        the scarcity state    the medium state      the sufficiency state
the scarcity state       a(t) ∈ {0}            a(t) ∈ {0, 1}         a(t) ∈ {1}
the medium state         a(t) ∈ {0, 2}         a(t) ∈ {0, 1, 2}      a(t) ∈ {1, 2}
the sufficiency state    a(t) ∈ {0, 2}         a(t) ∈ {0, 2}         a(t) ∈ {2}
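Table 1 may be expressed as a lookup keyed by the two energy levels, as in the following sketch. The function and constant names are hypothetical. In the last time slot (t=T) the scarcity and sufficiency ranges of the UAV overlap; testing for sufficiency first is an assumption of this sketch, since the present disclosure does not state a precedence.

```python
def device_level(B, E_d, t, T):
    """Classify the wireless device energy B(t) into scarcity, medium or sufficiency."""
    if B < E_d:
        return "scarcity"
    if B >= (1 + T - t) * E_d:
        return "sufficiency"
    return "medium"


def uav_level(E_r, P_f, t, T):
    """Classify the UAV energy E_r(t); the caller ensures E_r >= P_f."""
    if E_r >= (1 + T - t) * P_f:  # tested first (assumption, matters only when t == T)
        return "sufficiency"
    if E_r < 2 * P_f:
        return "scarcity"
    return "medium"


# Keys are (device level, UAV level); values are the allowed actions of Table 1.
ACTION_SPACE = {
    ("scarcity", "scarcity"): (0,),
    ("scarcity", "medium"): (0, 1),
    ("scarcity", "sufficiency"): (1,),
    ("medium", "scarcity"): (0, 2),
    ("medium", "medium"): (0, 1, 2),
    ("medium", "sufficiency"): (1, 2),
    ("sufficiency", "scarcity"): (0, 2),
    ("sufficiency", "medium"): (0, 2),
    ("sufficiency", "sufficiency"): (2,),
}


def action_space(B, E_r, E_d, P_f, t, T):
    """Return the tuple of actions that Table 1 allows in slot t of T."""
    return ACTION_SPACE[(device_level(B, E_d, t, T), uav_level(E_r, P_f, t, T))]
```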

(29) Table 1 lists the action space from which the UAV chooses in each state. When the action space contains more than one action, a value is calculated for each action, and the action with the greatest value is selected. In the t.sup.th time slot, the value of an action is defined as
Q.sub.t(S(t),a(t))≙R.sub.t(S(t),a(t))+F.sub.t(B(t+1),E.sub.r(t+1))  (14),
where F.sub.t(B(t+1),E.sub.r(t+1)) represents the estimated future revenue after the time slot t. When the electric quantity of the UAV is in different states, there are different calculation methods provided for F.sub.t(B(t+1),E.sub.r(t+1)).

(30) When the UAV is in shortage of energy, F.sub.t(B(t+1),E.sub.r(t+1)) is expressed as

(31) $$F_t(B(t+1), E_r(t+1)) = \begin{cases} V_n, & a(t) = 0 \\ 0, & a(t) = 2, \end{cases} \tag{15}$$
where V.sub.n represents an expected revenue of a next time slot and is expressed as

(32) $$V_n = \log_2\left(1 + \frac{P_f E\{\gamma(t+1)\}}{P_0}\right). \tag{16}$$

When the UAV has medium energy, the number of times the UAV will charge the wireless device in the future is estimated as

(33) $$n_c = \arg\min_n \left| \left\lfloor \frac{E_r(t+1)}{P_f} \right\rfloor - n - \left\lfloor \frac{B(t+1) + n P_f E\{\gamma(t)\}}{E_d} \right\rfloor \right|. \tag{17}$$

The number of times the UAV will transmit information to the wireless device in the future is estimated as

(34) $$n_m = \min\left\{ \left\lfloor \frac{E_r(t+1)}{P_f} \right\rfloor - n_c,\ \left\lfloor \frac{B(t+1) + n_c P_f E\{\gamma(t)\}}{E_d} \right\rfloor \right\}. \tag{18}$$

The future revenue is estimated as

(35) $$F_t(B(t+1), E_r(t+1)) = n_m \cdot \log_2\left(1 + \frac{P_f E\{\gamma(t)\}}{P_0}\right). \tag{19}$$

When the UAV has sufficient energy, the number of times the UAV will charge the wireless device in the future is estimated as

(36) $$n_c = \arg\min_n \left| \lfloor T - t \rfloor - n - \left\lfloor \frac{B(t+1) + n P_f E\{\gamma(t)\}}{E_d} \right\rfloor \right|. \tag{20}$$

The number of times the UAV will transmit information to the wireless device in the future is estimated as

(37) $$n_m = \min\left\{ \lfloor T - t \rfloor - n_c,\ \left\lfloor \frac{B(t+1) + n_c P_f E\{\gamma(t)\}}{E_d} \right\rfloor \right\}. \tag{21}$$

The future revenue is estimated as

(38) $$F_t(B(t+1), E_r(t+1)) = n_m \cdot \log_2\left(1 + \frac{P_f E\{\gamma(t)\}}{P_0}\right), \tag{22}$$
and finally the action of the t.sup.th time slot may be expressed as

(39) $$a(t) = \arg\max_{a_t} Q_t(S(t), a_t). \tag{23}$$
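The value calculation of Formulas (14) to (23) may be sketched as follows. This is an illustrative reading rather than a definitive implementation: the function names are hypothetical, `mean_gamma` stands for E{γ(t)}, the bracketed terms are read as floor operations, the search range for n (from 0 to the transmission budget) is an assumption, and the allowed actions and the UAV energy level are supplied by the caller according to Table 1.

```python
import math


def rate(gamma, P_f, P_0):
    """Information rate log2(1 + P_f * gamma / P_0)."""
    return math.log2(1.0 + P_f * gamma / P_0)


def future_revenue(a, B_next, E_next, slots_left, level, mean_gamma, P_f, P_0, E_d):
    """Estimated future revenue F_t, Formulas (15) to (22)."""
    if level == "scarcity":
        # Formula (15): V_n after keeping silence, nothing after transmitting.
        return rate(mean_gamma, P_f, P_0) if a == 0 else 0.0
    # Transmissions the UAV can still afford: floor(E_r/P_f) (medium) or T - t (sufficiency).
    budget = math.floor(E_next / P_f) if level == "medium" else slots_left

    def decodes(n):
        # Decodings the device can afford after being charged n more times.
        return math.floor((B_next + n * P_f * mean_gamma) / E_d)

    # Formulas (17)/(20); searching n in 0..budget is an assumption of this sketch.
    n_c = min(range(budget + 1), key=lambda n: abs(budget - n - decodes(n)))
    n_m = min(budget - n_c, decodes(n_c))        # Formulas (18)/(21)
    return n_m * rate(mean_gamma, P_f, P_0)      # Formulas (19)/(22)


def select_action(actions, level, gamma, B, E_r, t, T, mean_gamma, P_f, P_0, E_d):
    """Formula (23): choose the action in `actions` with the largest Q_t (Formula (14))."""
    def q(a):
        if a == 0:
            B_next, E_next, r = B, E_r, 0.0
        elif a == 1:
            B_next, E_next, r = B + P_f * gamma, E_r - P_f, 0.0
        else:
            B_next, E_next, r = B - E_d, E_r - P_f, rate(gamma, P_f, P_0)
        return r + future_revenue(a, B_next, E_next, T - t, level,
                                  mean_gamma, P_f, P_0, E_d)
    # On ties, max() keeps the first action listed in `actions`.
    return max(actions, key=q)
```

For instance, when the UAV is in the scarcity state with unit parameters and E{γ(t)}=1, the sketch transmits immediately if the current channel state is 3 (value 2 against 1 for waiting) and waits if the current channel state is 0.5.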

(40) God Strategy

(41) Because the state space is continuous, it is difficult to obtain an optimal solution of this Markov decision process by backward recursion. However, if all future channel states are known in advance, the optimal solution can be obtained by forward search. This method requires knowledge of the future that is unavailable in practice (hence the name God strategy) and has a high time complexity, so it cannot be put into practical application. It can, however, serve as a benchmark for the other strategies. As shown in FIG. 3, in the t.sup.th time slot, the total revenue of every action combination from the t.sup.th time slot to the T.sup.th time slot is calculated, and the a(t) of the path with the maximum total revenue is selected, which may be expressed as

(42) $$a(t) = \arg\max_{a_t} \max_{a_{t+1}, \ldots, a_T} \sum_{\tau = t}^{T} R_\tau(S(\tau), a_\tau). \tag{24}$$
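The forward search of Formula (24) may be sketched as an exhaustive enumeration, assuming the channel states of all remaining slots are known. The function name is hypothetical. Skipping infeasible sequences is a design choice of this sketch: replacing an infeasible action by silence gives the same revenue and spends no energy, so the optimum is preserved.

```python
import math
from itertools import product


def best_first_action(gammas, B, E_r, P_f, P_0, E_d):
    """Return (a(t), best total revenue) over all action sequences, Formula (24)."""
    best_total, best_action = 0.0, 0  # the all-silence sequence is always feasible
    for seq in product((0, 1, 2), repeat=len(gammas)):
        b, e, total = B, E_r, 0.0
        for a, g in zip(seq, gammas):
            # Infeasible: transmitting without UAV energy, or decoding without device energy.
            if (a != 0 and e < P_f) or (a == 2 and b < E_d):
                break
            if a == 1:
                b, e = b + P_f * g, e - P_f
            elif a == 2:
                total += math.log2(1.0 + P_f * g / P_0)
                b, e = b - E_d, e - P_f
        else:
            # Reached only when the whole sequence was feasible.
            if total > best_total:
                best_total, best_action = total, seq[0]
    return best_action, best_total
```

The loop visits every one of the 3.sup.n sequences for n remaining slots, which is why the approach is usable only as a benchmark for short horizons.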
The time complexity of this forward algorithm is O(3.sup.T).

Two simulation experiments are conducted: one compares the performance of the three strategies, and the other is a one-dimensional search for the optimal height of the UAV. In the first experiment, the parameters are set as: L=200 m, P.sub.f=100 mW, P.sub.0=−100 dBm, Ω.sub.L=Ω.sub.N=12 mW, m.sub.L=3, m.sub.N=2, a=8.5, b=0.33, E.sub.d=4 μW·s, Δt=0.1 s, E.sub.total=40 mW·s, and B(1)=4 μW·s. The total number T of time slots is increased from 1 to 16, and for each T, 1000 rounds are run for each strategy and the average revenue is calculated. FIG. 4 shows the revenues of the different strategies. The revenue of the greedy strategy and that of the two-element control strategy differ little when T is less than or equal to 4, because the energy of the UAV is always in the sufficiency state. As T approaches 16, the advantage of the two-element control strategy over the greedy strategy keeps growing. In the end, the performance of the two-element control strategy is 26.05% higher than that of the greedy strategy, while the performance of the God strategy is only 3.84% higher than that of the two-element control strategy.

(43) In the second simulation experiment, the parameter is set as T=16, and the height H is increased from 10 m to 200 m. FIG. 5 shows the relation between the revenue of the two-element control strategy and the height of the UAV. As can be seen from FIG. 5, the best height is 89 m.

(44) The foregoing descriptions are merely preferred embodiments of the present disclosure, and are not intended to limit the present disclosure. Any modification, equivalent replacement and improvement made within the spirit and principle of the present disclosure shall fall into the protection scope of the present disclosure.