Optimization method for UAV-based wireless information and energy transmission
11641591 · 2023-05-02
Assignee
Inventors
- Yueling CHE (Shenzhen, CN)
- Yabin Lai (Shenzhen, CN)
- Sheng Luo (ShenZhen, CN)
- Jie Ouyang (Shenzhen, CN)
- Kaishun Wu (Shenzhen, CN)
Cpc classification
B64U2101/20
PERFORMING OPERATIONS; TRANSPORTING
H04W24/10
ELECTRICITY
B64U2101/00
PERFORMING OPERATIONS; TRANSPORTING
H02J50/80
ELECTRICITY
B64C39/024
PERFORMING OPERATIONS; TRANSPORTING
H04B7/18506
ELECTRICITY
International classification
H04B7/185
ELECTRICITY
Abstract
An optimization method for UAV-based wireless information and energy transmission includes the following steps: S1: reporting, by a wireless device, an energy state of the wireless device to a UAV; S2: detecting, by the UAV, a channel state between the UAV and the wireless device; and S3: selecting, by the UAV, an optimal action based on estimated revenue maximization according to an electric quantity of the UAV, an electric quantity of the wireless device, and the channel state. The use of wireless devices can reduce wiring costs, beautify the space, and allow a smaller size and lower power consumption. By applying the UAV to information and energy transmission for the wireless devices, the data transmission rate and the energy conversion efficiency of networks are improved.
Claims
1. An optimization method for unmanned aerial vehicle (UAV)-based wireless information and energy transmission, comprising steps of: S1: reporting, by a wireless device, an energy state B(t) of the wireless device to a UAV, where B(t)∈[0,B.sub.max], where B.sub.max denotes a maximum energy state of the wireless device; S2: detecting, by the UAV, a channel state γ(t) between the UAV and the wireless device; and S3: selecting, by the UAV, an optimal action based on an optimal design for wireless information and energy transmission of the UAV that is based on an estimated revenue maximization according to an electric quantity of the UAV, an electric quantity of the wireless device, and the channel state; wherein the optimal design for wireless information and energy transmission of the UAV is modeled as a restrictive Markov decision process within limited time (MDP), a state space of the MDP is S={(γ(t), B(t), E.sub.r (t)): γ(t)∈[0, +∞), B(t)∈[0, B.sub.max], E.sub.r(t)∈[0, E.sub.p]}, an action space is
2. The optimization method according to claim 1, wherein the energy state B(t) of the wireless device is classified into a first scarcity state, a first medium state, and a first sufficiency state, respectively, the first scarcity state corresponding to B(t)<E.sub.d, the first medium state corresponds to E.sub.d≤B(t)<(1+T−t)E.sub.d, and the first sufficiency state corresponds to B(t)≥(1+T−t)E.sub.d; when B(t)<E.sub.d, the wireless device fails to decode, and the UAV does not transmit the information to the wireless device; and when B(t)≥(1+T−t)E.sub.d, a current electric quantity of the wireless device is enough to support decoding of all current and future time slots, and the UAV does not need to determine to charge the wireless device.
3. The optimization method according to claim 2, wherein an energy state of the UAV is classified into a second scarcity state, a second medium state, and a second sufficiency state, respectively, the second scarcity state corresponds to P.sub.f≤E.sub.r(t)<2P.sub.f, the second medium state corresponds to 2P.sub.f≤E.sub.r(t)<(1+T−t)P.sub.f, and the second sufficiency state corresponds to E.sub.r(t)≥(1+T−t)P.sub.f; when P.sub.f≤E.sub.r(t)<2P.sub.f, the UAV does not determine to charge the wireless device, and when E.sub.r(t)≥(1+T−t)P.sub.f, a current electric quantity of the UAV is enough to support information transmission in all current and future time slots, and the UAV does not need to determine to keep silence.
4. The optimization method according to claim 3, wherein the UAV needs to determine the action space in different states, and when there is more than one action in the action space, a value needs to be calculated for each action of the more than one action, an action with a maximum value is selected, and the value of the each action is defined as Q.sub.t(S(t),a(t))≙R.sub.t(S(t), a(t))+F.sub.t(B(t+1), E.sub.r(t+1)); F.sub.t(B(t+1), E.sub.r(t+1)) represents an estimated future revenue after a time slot t; Q.sub.t represents a total revenue of an instantaneous revenue plus the estimated future revenue corresponding to the each action a(t) in a state S(t), S(t) represents a system state of the time slot t, a(t) represents the each action of the time slot t, and R.sub.t represents the instantaneous revenue of the time slot t.
5. The optimization method according to claim 4, wherein when the electric quantity of the UAV is in different states, there are different calculation methods for the estimated future revenue, and the each action of the time slot of the UAV is expressed as
6. The optimization method according to claim 5, wherein when the UAV is in shortage of energy,
7. The optimization method according to claim 6, wherein when the UAV has a medium energy, a number of times the UAV charges the wireless device in a future is estimated as
8. The optimization method according to claim 7, wherein when the UAV has sufficient energy, the number of times the UAV charges the wireless device in the future is estimated as
9. The optimization method according to claim 8, wherein a signal transmitted from the UAV to the wireless device is classified into a direct signal and an indirect signal according to different propagation paths.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)
(2)
(3)
(4)
(5)
DETAILED DESCRIPTION OF THE EMBODIMENTS
(6) As shown in
(7) Description of System Model
(8) A UAV-based downlink wireless information and energy transmission system is considered. In this system, both the UAV and a wireless device are provided with batteries. When the UAV transmits energy to the wireless device, the wireless device stores the energy in its own battery. When the UAV transmits information to the wireless device, the wireless device uses the energy of its battery to receive the signal from the UAV and decode it.
(9) As shown in
(10)
remaining service energy is represented by E.sub.r (t), and a transmission power is represented by
(11)
where P.sub.f represents the operating power of the UAV. The energy required by the wireless device for each decoding operation is represented by E.sub.d. The system state is expressed as S(t)≙(γ(t), B(t), E.sub.r(t)). The system evolves as a Markov decision process, since the system state of the current time slot depends only on the system state of the previous time slot and the action taken by the UAV in that previous time slot.
(12) Channel Model A signal transmitted from the UAV to the wireless device may be classified into a direct signal and an indirect signal according to different propagation paths. The proportion of the direct signal depends on the height and the density of surrounding buildings, the height of the UAV, and the horizontal angle between the UAV and the wireless device, etc., which is expressed by Formula
(13)
where a and b represent parameters related to the environment. θ represents the horizontal angle between the UAV and the wireless device, and is calculated as
(14)
The proportion of the indirect signals is p.sub.N=1−p.sub.L. In the t.sup.th slot, fading of the direct signal and fading of the indirect signal are respectively as below:
γ.sub.L(t)=|h.sub.L(t)|.sup.2(√(L.sup.2+H.sup.2)).sup.−α
γ.sub.N(t)=|h.sub.N(t)|.sup.2(√(L.sup.2+H.sup.2)).sup.−α
(15)
where m.sub.L and m.sub.N represent the Nakagami parameters of the direct signal and the indirect signal, respectively. Ω.sub.L=E{|h.sub.L(t)|.sup.2} and Ω.sub.N=E{|h.sub.N(t)|.sup.2} represent the multipath fading power of the direct signal and that of the indirect signal, respectively. Γ(•) represents the Gamma function. The total signal fading is expressed as
γ(t)=p.sub.Lγ.sub.L(t)+p.sub.Nγ.sub.N(t) (7).
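As a rough illustration of Formula (7), the sketch below draws one sample of the total fading γ(t). It relies on the standard property that the squared envelope |h|.sup.2 of a Nakagami-m coefficient is Gamma distributed with shape m and mean Ω. The function name, the argument names, and the use of one path-loss exponent for both paths are assumptions made for illustration only; the proportion p.sub.L is passed in as a number because its formula is not reproduced here.

```python
import math
import random


def sample_total_fading(p_los, m_l, m_n, omega_l, omega_n,
                        dist_l, height, alpha, rng=random):
    """Draw one sample of gamma(t) = p_L * gamma_L(t) + p_N * gamma_N(t).

    All argument names are hypothetical; alpha is an assumed common
    path-loss exponent for the direct and indirect paths.
    """
    # |h|^2 under Nakagami-m fading ~ Gamma(shape=m, scale=omega/m), mean omega.
    h2_l = rng.gammavariate(m_l, omega_l / m_l)
    h2_n = rng.gammavariate(m_n, omega_n / m_n)
    # Distance-dependent attenuation: (sqrt(L^2 + H^2)) ** (-alpha).
    path_loss = math.hypot(dist_l, height) ** (-alpha)
    gamma_l = h2_l * path_loss
    gamma_n = h2_n * path_loss
    # p_N = 1 - p_L.
    return p_los * gamma_l + (1.0 - p_los) * gamma_n
```

Because both multipath powers have mean Ω, the sample mean of γ(t) should approach Ω multiplied by the path-loss term when Ω.sub.L=Ω.sub.N, which gives a simple sanity check on any implementation.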
(16) State, Action and Revenue of an MDP Model The optimal design for wireless information and energy transmission of the UAV may be modeled as a restrictive Markov decision process within limited time (that is, a constrained, finite-horizon MDP). The state space of this MDP is S={(γ(t),B(t),E.sub.r(t)):γ(t)∈[0,+∞),B(t)∈[0,B.sub.max],E.sub.r(t)∈[0,E.sub.p]}. The action space is:
(17)
The revenue is an information rate, which is expressed as
(18)
where P.sub.0 represents a noise power, and I(•) represents an indicator function.
(19) State Transition If the UAV does not have enough energy to transmit a signal, the UAV remains silent. Therefore, when a strategy is designed, only the case in which the UAV has enough energy to transmit the signal, i.e., E.sub.r(t)≥P.sub.f, needs to be considered. The state transition function of the UAV and the state transition function of the wireless device may be respectively expressed as
(20)
and γ(t) is independent and identically distributed across the time slots t.
(21) An Objective Function and a Restriction The objective function is expressed as
(22)
where π represents an action strategy function whose input is S(t) and whose output is a(t), and J(π) represents the total revenue under the strategy π. The UAV has limited energy, so the restriction of the model is
(23)
where E.sub.r(1) represents the total energy available for the UAV to serve the wireless device.
(24) Action Selection Strategy Three strategies are provided: a greedy strategy, a two-element control strategy, and a God strategy.
(25) Greedy Strategy The first strategy is the simplest greedy strategy, and the action of the UAV in the t.sup.th time slot is
(26)
(27) Two-Element Control Strategy Because γ(t) and B(t) are continuous, the state S is also continuous, and a Markov decision process over such a continuous state space is particularly difficult to solve. Thus, a sub-optimal solution is provided. The energy state of the wireless device is classified into a scarcity state, a medium state, and a sufficiency state, corresponding to B(t)<E.sub.d, E.sub.d≤B(t)<(1+T−t)E.sub.d, and B(t)≥(1+T−t)E.sub.d, respectively. When B(t)<E.sub.d, the wireless device cannot decode, so the UAV does not transmit information to the wireless device. When B(t)≥(1+T−t)E.sub.d, the current electric quantity of the wireless device is enough to support decoding in the current and all future time slots, so the UAV does not need to consider charging the wireless device. The energy state of the UAV is likewise classified into a scarcity state, a medium state, and a sufficiency state, corresponding to P.sub.f≤E.sub.r(t)<2P.sub.f, 2P.sub.f≤E.sub.r(t)<(1+T−t)P.sub.f, and E.sub.r(t)≥(1+T−t)P.sub.f, respectively. When P.sub.f≤E.sub.r(t)<2P.sub.f, the UAV does not consider charging the wireless device, because after charging it could do nothing but remain silent in subsequent time slots. When E.sub.r(t)≥(1+T−t)P.sub.f, the current electric quantity of the UAV is enough to support signal transmission in the current and all future time slots, so the UAV does not need to consider remaining silent.
(28) TABLE 1: action space at current time (rows: energy state B(t) of the wireless device; columns: energy state E.sub.r(t) of the UAV)
B(t) \ E.sub.r(t)        the scarcity state    the medium state      the sufficiency state
the scarcity state       a(t) ∈ {0}            a(t) ∈ {0, 1}         a(t) ∈ {1}
the medium state         a(t) ∈ {0, 2}         a(t) ∈ {0, 1, 2}      a(t) ∈ {1, 2}
the sufficiency state    a(t) ∈ {0, 2}         a(t) ∈ {0, 2}         a(t) ∈ {2}
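The classification thresholds and Table 1 can be sketched as a small lookup, shown below as a minimal illustration rather than the claimed implementation. All function and constant names are hypothetical. The reading of the action labels (0 = keep silent, 1 = charge the wireless device, 2 = transmit information) is inferred from the rules stated in the preceding paragraph, and the order in which the UAV thresholds are tested is an assumption that only matters in the last time slot, where the scarcity and sufficiency ranges overlap.

```python
SCARCITY, MEDIUM, SUFFICIENCY = 0, 1, 2

# Table 1, indexed as ACTION_SPACE[device_state][uav_state].
# Inferred labels: 0 = keep silent, 1 = charge device, 2 = transmit information.
ACTION_SPACE = (
    ((0,), (0, 1), (1,)),          # device in the scarcity state
    ((0, 2), (0, 1, 2), (1, 2)),   # device in the medium state
    ((0, 2), (0, 2), (2,)),        # device in the sufficiency state
)


def classify_device(b, e_d, t, horizon):
    """Classify B(t) against E_d and (1 + T - t) * E_d."""
    if b < e_d:
        return SCARCITY
    if b < (1 + horizon - t) * e_d:
        return MEDIUM
    return SUFFICIENCY


def classify_uav(e_r, p_f, t, horizon):
    """Classify E_r(t), assuming E_r(t) >= P_f."""
    # Assumption: sufficiency is tested first; this only matters when t == T.
    if e_r >= (1 + horizon - t) * p_f:
        return SUFFICIENCY
    if e_r < 2 * p_f:
        return SCARCITY
    return MEDIUM


def action_space(b, e_r, e_d, p_f, t, horizon):
    """Return the tuple of candidate actions for the current time slot."""
    if e_r < p_f:
        return (0,)  # not enough energy to transmit: the UAV keeps silent
    device_state = classify_device(b, e_d, t, horizon)
    uav_state = classify_uav(e_r, p_f, t, horizon)
    return ACTION_SPACE[device_state][uav_state]
```

For example, with E.sub.d=1, P.sub.f=1, t=1 and T=5, a device holding B(t)=2 and a UAV holding E.sub.r(t)=3 are both in the medium state, so all three actions remain candidates.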
(29) Table 1 lists the action space from which the UAV needs to choose in different states. When there is more than one action in the action space, a value is calculated for each action, and the action with the greatest value is selected. In the t.sup.th time slot, the value of an action is defined as
Q.sub.t(S(t),a(t))≙R.sub.t(S(t),a(t))+F.sub.t(B(t+1),E.sub.r(t+1)) (14),
where F.sub.t(B(t+1),E.sub.r(t+1)) represents the estimated future revenue after the time slot t. When the electric quantity of the UAV is in different states, there are different calculation methods provided for F.sub.t(B(t+1),E.sub.r(t+1)).
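The selection rule of Formula (14) amounts to an argmax over the candidate actions. The sketch below is only an illustration: `select_action`, `instant_revenue` and `future_revenue` are hypothetical names, and the two callables stand in for R.sub.t and F.sub.t, whose concrete expressions are not reproduced here.

```python
def select_action(actions, instant_revenue, future_revenue):
    """Return (best_action, best_value) maximizing Q = R + F.

    instant_revenue(a) plays the role of R_t(S(t), a) and future_revenue(a)
    the role of F_t(B(t+1), E_r(t+1)) reached by taking action a; both are
    hypothetical callables supplied by the caller.
    """
    best_action, best_value = None, float("-inf")
    for a in actions:
        q = instant_revenue(a) + future_revenue(a)
        if q > best_value:  # on ties, the action listed first is kept
            best_action, best_value = a, q
    return best_action, best_value


# Toy numbers only: transmitting now (action 2) earns 3.0 immediately, but
# charging (action 1) is estimated to earn 4.0 later, so charging is chosen.
toy_r = {0: 0.0, 1: 0.0, 2: 3.0}
toy_f = {0: 2.0, 1: 4.0, 2: 0.5}
choice = select_action((0, 1, 2), toy_r.get, toy_f.get)
```

The toy values are made up to show that an action with zero instantaneous revenue can still be selected when its estimated future revenue is large enough.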
(30) When the UAV is in shortage of energy, F.sub.t(B(t+1),E.sub.r(t+1)) is expressed as
(31)
where V.sub.n represents an expected revenue of a next time slot and is expressed as
(32)
(33)
(34)
(35)
(36)
(37)
(38)
and finally the action of the t.sup.th time slot may be expressed as
(39)
(40) God Strategy
(41) Because the state space is continuous, it is difficult to obtain an optimal solution of this Markov decision process by solving it in reverse, that is, backward from the last time slot. However, if all future channel states are known in advance, the optimal solution can be obtained through forward search. This method presumes foreknowledge of the channel that is unavailable in practice (hence the name) and has a high time complexity, so it cannot be put into practical application. However, this method can be used as a benchmark for other strategies. As shown in
(42)
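A forward search of this kind can be sketched as an exhaustive enumeration of every action sequence, which is where the exponential cost comes from: three actions per slot over T slots give 3.sup.T sequences. The code below is a minimal illustration under stated assumptions, not the patented algorithm. `forward_search` and `toy_step` are hypothetical names, and `toy_step` uses made-up unit-energy transitions rather than the state transition functions of this disclosure.

```python
from itertools import product


def forward_search(gammas, initial_state, step, actions=(0, 1, 2)):
    """Try all len(actions) ** T action sequences for known channel states.

    step(state, action, gamma) is a caller-supplied (hypothetical) function
    returning (revenue, next_state), or None if the action is infeasible.
    """
    best_total, best_plan = float("-inf"), None
    for plan in product(actions, repeat=len(gammas)):
        state, total, feasible = initial_state, 0.0, True
        for gamma, action in zip(gammas, plan):
            outcome = step(state, action, gamma)
            if outcome is None:
                feasible = False
                break
            revenue, state = outcome
            total += revenue
        if feasible and total > best_total:
            best_total, best_plan = total, plan
    return best_total, best_plan


def toy_step(state, action, gamma):
    """Toy transitions in integer energy units (illustrative assumption)."""
    device, uav = state
    if action == 0:                      # keep silent
        return 0.0, (device, uav)
    if action == 1 and uav >= 1:         # charge: UAV spends 1, device gains 1
        return 0.0, (device + 1, uav - 1)
    if action == 2 and uav >= 1 and device >= 1:  # transmit: both spend 1
        return gamma, (device - 1, uav - 1)
    return None


best_total, best_plan = forward_search([1.0, 5.0, 2.0], (1, 2), toy_step)
```

In this toy run the search waits in the first slot and transmits in the second, where the channel is best, which is the behavior that makes such a foresighted search useful as an upper benchmark.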
The time complexity of this forward algorithm is O(3.sup.T). Two simulation experiments are conducted: one compares the performance of the three strategies, and the other performs a one-dimensional search for the optimal height of the UAV. In the first experiment, the parameters are set as: L=200 m, P.sub.f=100 mW, P.sub.0=−100 dBm, Ω.sub.L=Ω.sub.N=12 mW, m.sub.L=3, m.sub.N=2, a=8.5, b=0.33, E.sub.d=4 μW·s, Δt=0.1 s, E.sub.total=40 mW·s, and B(1)=4 μW·s. The total number T of time slots is increased from 1 to 16, and for each T, each strategy is run for 1000 rounds and the average revenue is calculated. As shown in
(43) In the second simulation experiment, the parameter is set as T=16, and the height H is increased from 10 m to 200 m. As shown in
(44) The foregoing descriptions are merely preferred embodiments of the present disclosure, and are not intended to limit the present disclosure. Any modification, equivalent replacement and improvement made within the spirit and principle of the present disclosure shall fall into the protection scope of the present disclosure.