A METHOD AND SYSTEM FOR CONTROLLING THE FLIGHT OF A PLURALITY OF QUADCOPTERS
20260023396 · 2026-01-22
Assignee
Inventors
Cpc classification
G05D1/6985
PHYSICS
International classification
Abstract
A method and system for controlling the flight of a plurality of quadcopters includes communicating a flight instruction from a user to a leader quadcopter. The method includes calculating a leader formation maneuver and a follower formation maneuver with a leader-follower formation controller configured to use affine transformations and stress matrices to convert a flight instruction into a leader formation maneuver and a follower formation maneuver. The method may include communicating the follower formation maneuver from the leader quadcopter to a follower quadcopter. The method may include executing the leader formation maneuver on the leader quadcopter. The method may include executing the follower formation maneuver on the follower quadcopter.
Claims
1. A method for controlling the flight of a plurality of quadcopters, comprising: communicating a flight instruction from a user to a leader quadcopter; calculating a leader formation maneuver and a follower formation maneuver with a leader-follower formation controller configured to use affine transformations and stress matrices to convert the flight instruction into a leader formation maneuver and a follower formation maneuver; communicating the follower formation maneuver from the leader quadcopter to a follower quadcopter; executing the leader formation maneuver on the leader quadcopter; and executing the follower formation maneuver on the follower quadcopter.
2. The method of claim 1, wherein the leader-follower formation controller is further configured to utilize a barrier function to calculate the leader formation maneuver and the follower formation maneuver.
3. The method of claim 2, wherein the barrier function is a Lyapunov candidate function which trends to infinity at a predetermined constraint value.
4. The method of claim 2, wherein the leader-follower formation controller is further configured to use an actor-critic learning mechanism.
5. The method of claim 4, wherein the actor-critic learning mechanism is a machine learning algorithm.
6. The method of claim 1, wherein the leader-follower formation controller is further configured to use a distributed sliding mode control.
7. The method of claim 6, wherein the distributed sliding mode control is configured to use a control law to protect at least one of the plurality of quadcopters from a malicious cyber-attack.
8. The method of claim 7, wherein the malicious cyber-attack is a distributed denial of service attack, a sensor deception attack, or an actuator injection attack.
9. The method of claim 7, wherein the distributed sliding mode control is configured to use a Nussbaum gain function to mitigate the effects of input gain on the follower formation maneuver, the input gain being created by a malicious cyber-attack.
10. The method of claim 1, wherein the follower formation maneuver comprises: a scaling maneuver; a shearing maneuver; a translation maneuver; and a collinearity maneuver.
11. The method of claim 10, wherein the position of each of the plurality of quadcopters within the follower formation maneuver is defined by: an x position; a y position; a z position; a roll angle; a yaw angle; and a pitch angle.
12. The method of claim 1, wherein the leader-follower formation controller is configured to use a radial basis function neural network to smooth a flight instruction after the flight instruction has been communicated from a user to a leader quadcopter.
13. A method for controlling the flight of a plurality of quadcopters, comprising: communicating a flight instruction from a user to a leader quadcopter; calculating a leader formation maneuver and a follower formation maneuver with a leader-follower formation controller configured to use a distributed sliding mode control and an actor-critic learning mechanism to convert a flight instruction into a leader formation maneuver and a follower formation maneuver; communicating the follower formation maneuver from the leader quadcopter to a follower quadcopter; executing the leader formation maneuver on the leader quadcopter; and executing the follower formation maneuver on the follower quadcopter.
14. The method of claim 13, wherein the sliding mode control is configured to convert the flight instruction into a follower formation maneuver with affine transformations and stress matrices.
15. The method of claim 13, wherein the actor-critic learning mechanism is a machine learning algorithm and the machine learning algorithm is further configured to use a radial basis function to smooth a flight instruction after the flight instruction has been communicated from a user to a leader quadcopter.
16. The method of claim 13, wherein the sliding mode control and the actor-critic learning mechanism are configured to use a control law to protect at least one of the plurality of quadcopters from a malicious cyber-attack.
17. The method of claim 16, wherein the distributed sliding mode control and the actor-critic learning mechanism are configured to use a Nussbaum gain function to mitigate the effects of input gain on a follower formation maneuver, the input gain being created by a malicious cyber-attack.
18. The method of claim 16, wherein the malicious cyber-attack is a distributed denial of service attack, a sensor deception attack, or an actuator injection attack.
19. A system for controlling a plurality of quadcopters, the system comprising: a user input device configured to receive a flight instruction from a user and deliver the flight instruction to a leader quadcopter; a leader quadcopter configured to: receive a flight instruction from a user input device; calculate a leader formation maneuver; calculate a follower formation maneuver with a leader-follower formation controller configured to use affine transformations and stress matrices to calculate the follower formation maneuver; communicate the follower formation maneuver to a follower quadcopter; and execute the leader formation maneuver; and a follower quadcopter configured to: receive the follower formation maneuver; and execute the follower formation maneuver.
20. The system of claim 19, wherein the leader-follower formation controller is further configured to use an actor-critic machine learning algorithm to calculate the follower formation maneuver.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] A more complete appreciation of this disclosure and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:
DETAILED DESCRIPTION
[0030] In the drawings, like reference numerals designate identical or corresponding parts throughout the several views. Further, as used herein, the words "a," "an" and the like generally carry a meaning of "one or more," unless stated otherwise.
[0031] Furthermore, the terms "approximately," "approximate," "about" and similar terms generally refer to ranges that include the identified value within a margin of 20%, 10%, or preferably 5%, and any values therebetween.
[0032] The term "affine formation maneuver" refers to a maneuver of a plurality of objects in formation which retain a spatial relationship consistent with an affine transformation. The term "affine transformation" may refer to the class of mappings, composed of a linear map and a translation, which preserve points, straight lines, and planes, but do not necessarily preserve Euclidean distances or angles. Therefore, points that lie on a common straight line or plane in the geometry of a formation at a first point in time will still lie on a common straight line or plane after an affine formation maneuver has been performed.
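As a minimal illustrative sketch of this property (not part of the disclosed controller), the following Python snippet applies a scale, shear, and translation to a small planar formation and checks that collinear agents stay collinear while inter-agent distances change. The formation coordinates, the matrix `A`, the offset `b`, and the helper names are all hypothetical values chosen for illustration.

```python
import math

def affine(A, b, p):
    """Apply p -> A p + b to a 2-D point p, with A given as a 2x2 nested list."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1])

def cross(p, q, r):
    """Twice the signed area of triangle pqr; zero means the points are collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

# Hypothetical nominal formation: three collinear agents plus one off the line.
nominal = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (0.0, 2.0)]

# Hypothetical maneuver: scale x by 2, shear by 0.5, then translate by (3, -1).
A = [[2.0, 0.5], [0.0, 1.0]]
b = (3.0, -1.0)
moved = [affine(A, b, p) for p in nominal]

collinear_before = abs(cross(*nominal[:3])) < 1e-12
collinear_after = abs(cross(*moved[:3])) < 1e-12
dist_before = math.dist(nominal[0], nominal[1])
dist_after = math.dist(moved[0], moved[1])
```

Here `collinear_before` and `collinear_after` are both true, whereas `dist_before` and `dist_after` differ, which is the sense in which straight lines are preserved but Euclidean distances are not.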
[0033] The terms "leader" and "follower" may describe the independently controlled components of a leader-follower control scheme, respectively. In a leader-follower control scheme, at least one member of the control scheme is chosen as a leader and the leader(s) may dictate or decide the whole formation group's moving trajectory, typically by communicating commands to the non-deciding follower members. In one embodiment of a leader-follower control scheme, a user may send user commands to leader vehicles, thereby externally controlling the leaders. The leaders may then send information about the user commands and/or send follower commands to the follower vehicles, which may act on those instructions, thereby internally controlling the followers. In another embodiment, the user commands may additionally or alternatively be sent to the followers by a user or a separate computer and/or networked device.
[0034] According to an embodiment, an actor-critic learning scheme for safe leader-follower affine formation maneuver control of networked quadrotor unmanned aerial vehicles (UAVs) is described in the face of external disturbances, sensor deception attacks, and injection attacks on the actuators. Typically, the follower quadrotors (followers) aim to track the formation maneuvers such as scaling, shearing, translation, and rotation decided by the leader quadrotors (leaders). Motivated by increasing safety and performance requirements during formation maneuvering, the dynamic states of the quadrotor UAVs are constrained within the prescribed safety region. The term dynamic states may refer to the full range of actual positions/formations that the UAVs may be arranged in during operation.
[0035] To guarantee that the safety constraints are not violated, a barrier Lyapunov function may be used. A distributed sliding mode control with actor-critic learning is also formulated and implemented to facilitate accurate leader-follower affine formation maneuvers and reject malicious cyber-attack signals. A sliding mode control may refer to any nonlinear control method that alters the dynamics of a nonlinear system, such as a flight path, by applying a discontinuous control signal which forces the system to slide along a cross-section of the system's normal behavior thereby regulating the system. The term actor-critic learning may refer to a machine reinforcement learning system wherein software is trained on the outcomes of actions its decisions have informed such that it will make more effective decisions in the future.
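The sliding mode idea can be illustrated on a toy problem. The sketch below is only an assumption-laden example, not the distributed controller of the present disclosure: it simulates the scalar system x' = u + d(t) with a bounded disturbance and the discontinuous law u = -k·sign(x), where the sliding surface is simply x = 0. The function name `simulate_smc`, the gain `k`, the disturbance sin(t), and the step size are hypothetical.

```python
import math

def simulate_smc(x0=1.0, k=2.0, dt=1e-3, t_end=3.0):
    """Euler simulation of x' = u + d(t) under the switching law u = -k*sign(x)."""
    x, t = x0, 0.0
    while t < t_end:
        d = math.sin(t)                  # bounded disturbance, |d| <= 1 < k
        u = -k * math.copysign(1.0, x)   # discontinuous control signal
        x += (u + d) * dt
        t += dt
    return x

final_state = simulate_smc()
```

Because the switching gain exceeds the disturbance bound, the state reaches the surface in finite time and then chatters in a narrow band around zero whose width is set by the step size.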
[0036] Additionally, input gains that arise due to the attacks might corrupt the control direction, and so in the present disclosure a Nussbaum gain function is coupled to the controller to address input gain corruption. The actor system may be responsible for estimating uncertain dynamics together with the malicious attack signals, while the critic network evaluates the control performance through an estimated long-term performance index. In the method and system of the present disclosure, the signals of the closed-loop system can be shown to be uniformly bounded using a Lyapunov stability function.
[0037] In view of the foregoing discussion, the present disclosure provides a solution for leader-follower affine formation maneuver control of networked quadrotor unmanned aerial vehicles (UAVs) subjected to external disturbances, sensor deception attacks, and actuator injection attacks. By considering the safety and physical limitations of the quadrotor UAVs, their dynamic states are constrained to operate within a safe workspace. To prevent violation of the safety constraints, the method and system of the present disclosure employ a barrier Lyapunov function. The safety of the system may refer to the operational conditions under which the formation may maneuver without risking a loss of control of an individual UAV, collisions between UAVs, or some other undesirable flight path for the formation. An actor-critic learning-based distributed sliding mode control technique is then designed to aid the leader-follower formation maneuvers of the quadrotor UAVs within the prescribed workspace while mitigating cyber-attacks. The present disclosure provides a method and system that use the properties of affine transformations and stress matrices to realize various collective affine formation maneuvers of the networked quadrotor UAVs, such as scaling, translation, rotation, shearing, and collinearity. Moreover, the quadrotors in this technology have improved freedom of maneuverability. The present disclosure addresses a safety-guaranteed leader-follower affine formation maneuver control problem associated with multiple quadrotor UAVs under sensor deception attacks and actuator injection attacks. Compared with conventional systems, in which the problem of input gains induced by attack signals was not considered, the present disclosure addresses this problem by integrating the Nussbaum function into the controller. As such, the controller of the present disclosure does not require any prior information about the signs of the input gains induced by the attack signals to counteract them. In conventional systems, the learning-based controllers are in the form of linear quadratic regulators; in contrast, the present disclosure utilizes a distributed sliding mode control approach integrated with a learning mechanism.
[0038] According to an embodiment, the quadrotor of the present disclosure is a system having six degrees of freedom with four control inputs. Assuming that the quadrotor framework is rigid and symmetric, and the center of gravity coincides with the body-frame origin, the dynamic equations of the displacement and rotation of the quadrotor are as given by equations (1) to (6):
wherein $x_i$, $y_i$, and $z_i$ denote the position of the quadrotors in the earth frame, $\phi_i$, $\psi_i$, and $\theta_i$ represent the roll, yaw, and pitch angles, respectively, $\Delta_{\phi i}$, $\Delta_{\psi i}$, $\Delta_{\theta i}$, $\Delta_{xi}$, $\Delta_{yi}$, and $\Delta_{zi}$ stand for the unknown time-varying external disturbances, $l$ is the length from the center of each actuator to the center of mass, $I_p$ is the propeller inertia, $I_x$, $I_y$, and $I_z$ are the moments of inertia about the x, y, and z axes, respectively, $g$ represents gravitational acceleration, $a_1$, $a_2$, $a_3$, $a_4$, $a_5$, and $a_6$ stand for the drag coefficients, and $u_{1i}$, $u_{2i}$, $u_{3i}$, and $u_{4i}$ denote the control inputs.
[0039] The three-degree-of-freedom translational equations of the N quadrotors are given by equations (7) to (9):
[0040] Assuming that the quadrotors are flying at the same altitude, the translational equations of the quadrotors in two-dimensional space can be written as:
[0041] The dynamic states of the quadrotors in equation (10) are constrained in the following compact sets:
where $k_{ci} > 0$ is a constant. The compact set (11) specifies the region of safe operation of the quadrotors.
[0042] Assumption 1: Let $q_i^*$ be the target position of each quadrotor. It is assumed that $q_i^*$ is bounded as:
and that its first and second derivatives are bounded as follows:
where the bounding terms are constants.
[0043] The two-dimensional translational equations of the quadrotors in the face of cyber-attacks are given as:
wherein $\breve{q}_i \in \mathbb{R}^2$ is the vector of the compromised states, $\breve{u}_i \in \mathbb{R}^2$ is the vector of the compromised control inputs, $\xi_i(t, q_i) \in \mathbb{R}^2$ is the vector of the deception attack signals, $m_i = [m_{xi}\ m_{yi}]^T \in \mathbb{R}^2$ is the vector of time-varying injected attack signals, and $h_i = \mathrm{diag}\{h_{xi}, h_{yi}\} \in \mathbb{R}^{2 \times 2}$.
[0044] The deception attack signals can be described by a state-dependent function as $\xi_i(t, q_i) = \omega_i(t) q_i$ since they mimic the state variables, where $\omega_i(t) \in \mathbb{R}^{2 \times 2}$ is a time-varying weight. Thus, the quadrotor states corrupted by the deception attacks are given as:
where $\gamma_i(t) = (1 + \omega_i(t))$.
[0045] As a result of the attack, the actual state variables $q_i$ are no longer available. Therefore, the contaminated state variables $\breve{q}_i$ will be utilized for control design.
[0046] Deriving from (16):
[0047] Noting that
and (15), (18) becomes:
[0048] The quadrotor systems under the cyber-attacks could be expressed as:
[0049] The $N$ quadrotors in (20) can be partitioned into two groups: the $N_l$ leaders and the $N_f = (N - N_l)$ followers.
[0050] Let $\breve{q}_l$ and $\breve{q}_f$ be the vectors of the $N_l$ leaders and $N_f$ followers, respectively. Then:
[0051] The configuration of the N quadrotors which consists of their positions in two-dimensional space under cyberattacks is thus:
[0052] It is advantageous to provide a control law that can maintain healthy quadrotor states within the safety constraints (11) even in the presence of cyberattacks.
[0053] Assumption 2: The $N_l$ leader quadrotors are assumed to be already controlled to achieve the desired formation maneuvers. Accordingly, the control procedure for the leader quadrotors will not be considered.
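[0054] Definition 1. Given a continuous function $H(\cdot): \mathbb{R} \to \mathbb{R}$ that satisfies:
$$\lim_{k \to \infty} \sup \frac{1}{k} \int_0^k H(s)\, ds = +\infty, \qquad \lim_{k \to \infty} \inf \frac{1}{k} \int_0^k H(s)\, ds = -\infty \qquad (24)$$
then $H(\cdot): \mathbb{R} \to \mathbb{R}$ is a Nussbaum function. Many functions satisfy the condition in (24), for instance, $s^2 \cos(s)$, $e^{s^2} \cos\left(\frac{\pi}{2} s\right)$ and $s^2 \sin(s)$.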
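A quick numerical illustration of the Nussbaum property may help. The sketch below assumes the commonly used choice H(s) = s² cos(s) and estimates the running average (1/k)∫₀ᵏ H(s) ds at two upper limits, one where sin(k) = +1 and one where sin(k) = −1. The function names and sample points are hypothetical; the snippet is a demonstration of the defining property, not part of the disclosed controller.

```python
import math

def nussbaum(s):
    """One common Nussbaum-type function, H(s) = s^2 * cos(s)."""
    return s * s * math.cos(s)

def running_average(upper, steps=200_000):
    """Trapezoidal estimate of (1/upper) * integral_0^upper H(s) ds."""
    h = upper / steps
    total = 0.5 * (nussbaum(0.0) + nussbaum(upper))
    for i in range(1, steps):
        total += nussbaum(i * h)
    return total * h / upper

# Sample the running average at points where sin(s) = +1 and sin(s) = -1.
peak = running_average(math.pi / 2 + 20 * math.pi)
trough = running_average(3 * math.pi / 2 + 20 * math.pi)
```

The running average swings to large positive and large negative values as the upper limit grows, which is why a Nussbaum gain can probe both control directions without knowing the sign of the input gain in advance.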
[0055] Lemma 1. Let $L(t)$ and $s(t)$ be smooth functions defined over the range $[0, t_f)$ with $L(t) \ge 0$, and let $H(s)$ be a smooth Nussbaum function. If the following inequality holds:
where $r_1 > 0$ and $r_2 > 0$ are constants and $g(\cdot)$ is a time-varying function, then $L(t)$, $s(t)$, and the integral term in the inequality
are bounded over $[0, t_f)$.
[0056] According to an embodiment, radial basis function neural networks are commonly employed to approximate any smooth and continuous unknown term as:
where $\chi$ is the input vector, $W = [w_1\ w_2\ \dots\ w_k]^T \in \mathbb{R}^k$ is the weight vector, $k$ is the number of nodes in the hidden layer, $\varepsilon(\chi)$ is the approximation error satisfying $|\varepsilon(\chi)| \le \bar{\varepsilon}$ with $\bar{\varepsilon} > 0$, and $\varphi(\chi) = [\varphi_1(\chi)\ \varphi_2(\chi)\ \dots\ \varphi_k(\chi)]^T$ represents the basis function vector with entries $\varphi_i(\chi)$ given by
where $\mu_i$ and $\eta_i$ denote the center and width of the Gaussian function, respectively.
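The approximator described here can be sketched in a few lines of Python. The Gaussian form exp(−‖x − c‖²/width²) used below is one common convention and is an assumption, since the exact basis expression is not reproduced in this text; the helper names and the scalar input are likewise hypothetical. The numeric values are illustrative only.

```python
import math

def rbf_vector(x, centers, width):
    """Gaussian basis vector; entry i is exp(-||x - c_i||^2 / width^2)."""
    return [math.exp(-sum((xj - cj) ** 2 for xj, cj in zip(x, c)) / width ** 2)
            for c in centers]

def rbf_output(weights, x, centers, width):
    """Network output W^T phi(x) for a single scalar output."""
    return sum(w * p for w, p in zip(weights, rbf_vector(x, centers, width)))

# Illustrative values: 5 hidden nodes, centers spread over [-0.5, 0.5], width 0.25.
centers = [(-0.5,), (-0.25,), (0.0,), (0.25,), (0.5,)]
weights = [0.5] * 5
phi_at_center = rbf_vector((0.0,), centers, 0.25)
y = rbf_output(weights, (0.0,), centers, 0.25)
```

Each basis entry peaks at 1 when the input coincides with its center and decays with distance, so the weighted sum acts as a smooth local interpolator whose weights can be adapted online.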
[0057] According to an embodiment, a barrier function is employed to guarantee the safety of dynamic systems. The barrier function tends to infinity as its argument approaches the boundary of the specified safety constraints, so a control law that keeps the function bounded never violates the constraints. This property is utilized to design control laws that keep dynamic systems within the safety barriers.
[0058] Lemma 2. A Lyapunov candidate function $L(z)$ satisfying $L(z) \to \infty$ as $|z| \to c$ is deployed to ensure that the state-variable boundaries are not transgressed. The Lyapunov candidate function is as follows:
where $c$ is the constraint on $z$. Thus, $L$ is positive definite and continuous within the set $|z| < c$. The control policy is developed to meet $\dot{L} \le 0$.
[0059] Lemma 3. For any constant $c > 0$ and $z$ meeting $|z| < c$, one gets
[0060] See Y.-J. Liu, J. Li, S. Tong, C. L. P. Chen, Neural network control based adaptive learning design for nonlinear systems with full state constraints, IEEE Transactions on Neural Networks and Learning Systems 27 (2016) 1562-1571. doi 10.1109, incorporated herein by reference in its entirety.
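For intuition, the sketch below assumes the widely used log-type barrier candidate L(z) = ½ ln(c²/(c² − z²)) and the inequality ln(c²/(c² − z²)) ≤ z²/(c² − z²) for |z| < c. Both forms are assumptions taken from the barrier Lyapunov literature rather than from equations reproduced in this text, and the helper names and the value c = 2 are hypothetical.

```python
import math

def barrier(z, c):
    """Log-type barrier Lyapunov candidate, 0.5 * ln(c^2 / (c^2 - z^2)), for |z| < c."""
    if abs(z) >= c:
        raise ValueError("barrier is only defined for |z| < c")
    return 0.5 * math.log(c * c / (c * c - z * z))

def lemma3_gap(z, c):
    """z^2/(c^2 - z^2) - ln(c^2/(c^2 - z^2)); non-negative whenever |z| < c."""
    return z * z / (c * c - z * z) - math.log(c * c / (c * c - z * z))

c = 2.0  # constraint on the tracking error (illustrative value)
values = [barrier(z, c) for z in (0.0, 1.0, 1.9, 1.999)]
gaps = [lemma3_gap(z, c) for z in (0.0, 0.5, 1.0, 1.9, 1.999)]
```

The entries of `values` grow without bound as z approaches c, and every entry of `gaps` is non-negative, which is the property that lets the logarithmic term be upper-bounded by a rational one in a stability proof.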
[0061] The exchange of information among the quadrotors is modeled by an undirected graph $\mathcal{G}$ consisting of $N$ vertices. Define $\mathcal{V} = \{\nu_1, \dots, \nu_N\}$ and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ as the sets of vertices and edges, respectively. Then, the set of neighbors of the vertex $i$ is defined by $\mathcal{N}_i = \{j \in \mathcal{V} : (i, j) \in \mathcal{E}\}$.
[0062] The formation of the quadrotors $(\mathcal{G}, \breve{q})$ is defined as the graph of the quadrotors $\mathcal{G}$ with their corresponding configuration $\breve{q}$.
[0063] The scalar weight allotted to each edge of the formation $(\mathcal{G}, q)$ is termed the stress of the edge, and it can be positive or negative. The set of the scalar weights is defined by $\{s_{ij}\}_{(i,j) \in \mathcal{E}}$ with $s_{ij} = s_{ji} \in \mathbb{R}$. The stress matrix $S \in \mathbb{R}^{N \times N}$ is described as:
[0064] When the stress satisfies $\sum_{j \in \mathcal{N}_i} s_{ij}(\breve{q}_j - \breve{q}_i) = 0$, or $(S \otimes I_2)\breve{q} = 0$ in compact form, it is referred to as an equilibrium stress.
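The equilibrium-stress condition can be checked numerically on a small example. The sketch below assumes the usual convention in which the off-diagonal entries of the stress matrix are the negated edge stresses and each diagonal entry is the sum of the stresses incident on that vertex; this convention, the four-agent square, and the helper names are illustrative assumptions rather than values from the disclosure.

```python
def stress_matrix(n, stresses):
    """Build S from edge stresses {(i, j): s_ij}: S_ij = -s_ij, S_ii = sum_j s_ij."""
    S = [[0.0] * n for _ in range(n)]
    for (i, j), s in stresses.items():
        S[i][j] -= s
        S[j][i] -= s
        S[i][i] += s
        S[j][j] += s
    return S

def equilibrium_residual(S, q):
    """Row-wise value of (S kron I_2) q, which equals -sum_j s_ij (q_j - q_i)."""
    return [tuple(sum(S[i][j] * q[j][k] for j in range(len(q))) for k in range(2))
            for i in range(len(q))]

# Hypothetical 4-agent square: stress +1 on the sides and -1 on the diagonals.
q = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
stresses = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (0, 3): 1.0,
            (0, 2): -1.0, (1, 3): -1.0}
S = stress_matrix(4, stresses)
residual = equilibrium_residual(S, q)
```

For this square every row of `residual` is zero, so the chosen stresses are in equilibrium for the configuration; the resulting matrix also has rank 1, matching N − dim − 1 for N = 4 agents in the plane.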
[0065] Thus, (30) can be rewritten as:
where the blocks $S_{ll}$, $S_{lf}$, $S_{fl}$, and $S_{ff}$ correspond to the partition of the stress matrix into leaders and followers.
[0066] Let $r$ be a constant configuration. In view of the graph $\mathcal{G}$, a nominal formation $(\mathcal{G}, r)$ is formed. Afterward, the time-varying reference formation is:
where $A(t) \in \mathbb{R}^{2 \times 2}$ and $b(t) \in \mathbb{R}^2$ are time-varying. The affine transformation of $r$ can be achieved by specifying the entries of $A(t)$ and $b(t)$.
[0067] The affine image of $r$ is defined by:
[0068] If $q \in \mathcal{A}(r)$, the $N$ agents achieve the affine formation maneuvers.
[0069] Definition 2. Affine localizability: If, for any configuration in the affine image, $\breve{q}_l$ can uniquely determine $\breve{q}_f$, then the nominal formation $(\mathcal{G}, r)$ is affinely localizable by the leaders.
[0070] Assumption 3. The stress matrix $S$ is positive semidefinite, and $\mathrm{rank}(S) = N - \dim - 1$ holds, where $\dim$ is the dimension of the systems in the Euclidean space. In this work, two-dimensional systems are considered, i.e., $\dim = 2$. This assumption implies that the matrix $S_{ff}$ is positive definite and invertible.
[0071] By virtue of Definition 2 and Assumption 3, for any configuration in the affine image, $\breve{q}_l$ can uniquely determine the follower target $q_f^*$ as:
where $q_f^*$ is the target position of the followers.
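To make affine localizability concrete, the sketch below assumes the standard relation in which the follower targets are obtained from the leader positions through −(S_ff⁻¹ S_fl ⊗ I₂), specialized to a single follower so that S_ff is a scalar. The relation, the four-agent square with three leaders, the maneuver (`A`, `b`), and the helper names are assumptions chosen for illustration.

```python
def follower_target(S, leaders, follower, q_leaders):
    """Single-follower case of q_f = -(S_ff^{-1} S_fl kron I_2) q_l, so S_ff is a scalar."""
    s_ff = S[follower][follower]
    return tuple(-sum(S[follower][l] * q_leaders[l][k] for l in leaders) / s_ff
                 for k in range(2))

def affine(A, b, p):
    """Apply p -> A p + b to a 2-D point p."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1])

# Hypothetical stress matrix of a 4-agent square (sides +1, diagonals -1).
S = [[1.0, -1.0, 1.0, -1.0],
     [-1.0, 1.0, -1.0, 1.0],
     [1.0, -1.0, 1.0, -1.0],
     [-1.0, 1.0, -1.0, 1.0]]
nominal = {0: (1.0, 1.0), 1: (-1.0, 1.0), 2: (-1.0, -1.0), 3: (1.0, -1.0)}
leaders, follower = [0, 1, 2], 3

# Leaders perform a scale + shear + translation; the follower target is then computed.
A, b = [[2.0, 0.5], [0.0, 1.5]], (4.0, -2.0)
q_leaders = {l: affine(A, b, nominal[l]) for l in leaders}
target = follower_target(S, leaders, follower, q_leaders)
expected = affine(A, b, nominal[follower])
```

The computed `target` coincides with the affine image of the follower's nominal position, so the follower never needs to know `A` or `b`: the leader positions and the stress matrix are enough to recover where it should be.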
[0072] The aforementioned mathematics support an intelligent learning control method to ensure the safe formational maneuvering of a system of quadrotor UAVs and shield them against harmful cyber-attacks. Considering Assumption 2, the control of the leaders will not be considered. The distributed sliding mode surface of the followers' affine formation maneuver is designed as:
where $\lambda_1 > 0$ and $\lambda_2 > 0$ are constant diagonal gain matrices, and the function $\Xi_i(t)$ is designed as:
where the exponents satisfy $1 < \varrho_1 < 2$ and $\varrho_2 > \varrho_1$.
[0073] The compact form of (35) can be expressed as:
where $q_f^*$ and $\dot{q}_f^*$ represent the target formation of the followers and its first-order derivative, respectively, and
[0074] The main objective here is to achieve:
[0075] Differentiating (37) with respect to time gives:
From (35), it is known that $\sigma_i = [\sigma_{xi}\ \sigma_{yi}]^T$. Then, a barrier Lyapunov function for the tracking error system (39) is defined as follows:
where $c_{xi} > 0$ and $c_{yi} > 0$ are chosen to make sure that the state constraints are not violated. From Remark 11 and Assumption 1, these are set as $c_{xi} = k_{ci} - \bar{q}_i$ and $c_{yi} = k_{ci} - \bar{q}_i$, where $\bar{q}_i$ denotes the bound on the target position in Assumption 1.
[0076] Differentiating L.sub.1 with respect to time gives:
[0077] Equation (41) can be rewritten as:
[0078] The control protocol for the quadrotors with loss of direction resulting from an actuator attack can be designed as:
where $\lambda_3$ is a constant positive definite diagonal matrix.
[0079] Remark 6. The control input suffers a loss of direction as a result of the attack-induced input gain. The Nussbaum function $N(\cdot)$ is employed to tackle this problem. The control input $u_f$ is the corrected signal.
[0080] Remark 7. The control protocol (43) cannot be applied to the quadrotors directly because of the loss of control direction and the unknown value of the lumped uncertainty vector.
[0081] The critic network is designed to assess the performance of the present control action and produce punishment/reward signals for adaptive learning. The critic network is implemented as a neural network. A long-term cost function is defined as:
where $\kappa$ is a constant used to discount the future cost and $\rho(t)$ is the current cost function expressed as:
where $Q$ and $R$ are constant positive definite matrices.
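As a rough sketch of how such a cost could be evaluated, the snippet below assumes a quadratic current cost of the form σᵀQσ + uᵀRu and an exponentially discounted long-term cost ∫ exp(−τ/κ) ρ(τ) dτ. Both forms are assumptions, since the corresponding equations are not reproduced in this text, and the function names, error vector, and input vector are hypothetical.

```python
import math

def stage_cost(sigma, u, Q_diag, R_diag):
    """Quadratic stage cost sigma^T Q sigma + u^T R u with diagonal Q and R."""
    return (sum(q * s * s for q, s in zip(Q_diag, sigma))
            + sum(r * v * v for r, v in zip(R_diag, u)))

def discounted_cost(rho, kappa, horizon, steps=100_000):
    """Trapezoidal estimate of integral_0^horizon exp(-tau/kappa) * rho(tau) dtau."""
    h = horizon / steps
    total = 0.5 * (rho(0.0) + math.exp(-horizon / kappa) * rho(horizon))
    for i in range(1, steps):
        tau = i * h
        total += math.exp(-tau / kappa) * rho(tau)
    return total * h

# Illustrative values: Q = diag(10, 10), R = diag(5, 5), discount constant 0.2.
rho0 = stage_cost((0.1, -0.2), (0.3, 0.0), (10.0, 10.0), (5.0, 5.0))
J = discounted_cost(lambda tau: rho0, kappa=0.2, horizon=4.0)
```

For a constant current cost the discounted integral is approximately κ times that cost, which gives a simple sanity check; in the disclosed scheme the critic network approximates this quantity online because future costs are not available in closed form.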
[0082] The long-term cost function (44) contains future system information coupled with the compromised variables, so the solution is difficult to calculate. Thus, the critic neural network will be deployed to approximate J as
[0083] The critic neural network estimate of J is given by:
where $\hat{W}_c$ is the weight estimate of $W_c$.
[0084] The continuous-time temporal difference error is obtained as:
[0085] For large $\kappa$, i.e. $\kappa \to \infty$, (48) simplifies to:
where $\nabla$ stands for the gradient operator.
[0086] Define the temporal difference error objective function as:
[0087] The weight update rule of the critic network is derived as follows:
where $\alpha_c > 0$ is the learning rate,
[0088] Design a Lyapunov function L.sub.c for the critic network as:
[0089] Using (51) and (52), one gets:
[0090] At the end of the reinforcement learning, the weight update law (51) will minimize the objective function (50) such that .sub.c.fwdarw.0 and
It follows that:
where .sub.c=.sub.c+.sub.c{grave over ()}.sub.c, .sub.c.sub.c,max, with >0 being the upper-bound.
[0091] Putting (54) into (53), one achieves:
[0092] Therefore, (55) represents a Lyapunov function for the critic network which has been modified by the weight update law (51) to minimize the objective function (50). This function (55) may be utilized by the actor-critic learning scheme to protect the UAVs from external disturbances, sensor deception attacks, and injection attacks on the actuators.
[0094] In the actor-critic control scheme, information about the lumped function is unavailable due to the existence of cyber-attacks, unknown external disturbances, and uncertain dynamics. The actor neural network may be deployed to approximate
as follows:
[0095] The actor network's estimation of the lumped function is thus:
[0096] The current weight estimation error of the actor network is:
where $\tilde{W}_a = W_a - \hat{W}_a$.
[0097] Let $J_d = 0$ be the desired cost-to-go in the actor network. The error between $\hat{J}$ and $J_d$ is:
[0098] The total of the errors in the actor system is expressed as:
where $K_a$ is a diagonal gain matrix. The weight update law of the actor network,
where $\alpha_a > 0$ is the learning rate, is:
[0099] Because is unknown, the weight update law is rewritten as:
[0100] A candidate Lyapunov function for the actor network is selected as follows:
[0101] The time-derivative of L.sub.a yields:
[0102] Considering that $\hat{J} = W_c^T \varphi_c(\chi_c) + \tilde{W}_c^T \varphi_c(\chi_c)$, one gets:
[0103] Inserting (65) into (64):
[0104] Using the actor estimation in (57), (43) becomes:
[0105] Equation (42) can be evaluated as follows:
[0106] Consider the following Young's inequalities:
[0108] Using the inequalities above, (68) gives:
[0109] For
[0110] The scheme ensures that the negative effects of the cyber-attacks are diminished: the state constraints are not violated, the quadrotors maintain operation within the safety boundaries, and the tracking errors in the closed-loop system are bounded within a compact set. The leader-follower affine formation maneuver of the quadrotors is therefore realized.
[0111] The stability of the control scheme is proved by choosing a Lyapunov function candidate as follows:
[0112] By differentiating L with respect to time:
[0113] Based on Lemma 3:
[0114] Equation (75) can be simplified as:
[0115] Integrating (78) over the interval $[0, t]$ leads to:
[0116] Selecting h.sub.3 as the upper-bound of
achieves:
[0117] Based on the inequality above, the tracking error signals will, in the end, remain in the compact sets defined by:
[0120] The stress matrix is computed:
[0121] The initial positions of the quadrotors are set as $q_1(0) = [1\ 0]$, $q_2(0) = [0.5\ 1]$, $q_3(0) = [0.5\ 0.75]$, $q_4(0) = [0\ 0.75]$, $q_5(0) = [0\ 0.75]$, $q_6(0) = [0.75\ 1.5]$ and $q_7(0) = [0.75\ 1.25]$, and $q(0) = [q_1(0)^T, q_2(0)^T, \dots, q_7(0)^T]^T$. To enforce safety for the group of quadrotors, the motion of each quadrotor is constrained within the safety region such that $|x_i| < k_{ci}$ and $|y_i| < k_{ci}$, with $k_{ci} = 15$. It follows that the tracking errors are also constrained as $|\sigma_{xi}| < c_{xi}$ and $|\sigma_{yi}| < c_{yi}$, with $c_{xi} = c_{yi} = 2$ to avoid violation of the safety region. The sensor deception attacks are modeled as $\gamma_i = (1 + \cos(t))$, and the actuator injection attacks are modeled as $m_i = \sin(q_i(t)\breve{q}_i(t))$. The controller gains are chosen as $\lambda_1 = \mathrm{diag}\{15, 15, \dots, 15\}$, $\lambda_2 = \mathrm{diag}\{2, 2, \dots, 2\}$, $\lambda_3 = \mathrm{diag}\{10, 10, \dots, 10\}$, $\varrho_1 = 1.5$, and $\varrho_2 = 2.1$. The parameters in the cost function are $R = \mathrm{diag}\{5, 5, \dots, 5\}$ and $Q = \mathrm{diag}\{10, 10, \dots, 10\}$, and the discount factor is $\kappa = 0.2$. For the reinforcement learning parameters, 5 nodes are used in both the actor and critic neural networks; the learning rates of the actor and critic neural networks are $\alpha_a = 1.5$ and $\alpha_c = 0.01$, respectively; the centers of the Gaussian functions of both the actor and critic networks are selected between $-0.5$ and $0.5$, while the widths of both are set to 0.25; and the initial weights of the actor and critic networks are selected as $\hat{W}_{ai}(0) = \hat{W}_{ci}(0) = [0.5, 0.5, \dots, 0.5]^T$.
[0129] The present disclosure investigates the actor-critic learning-based affine formation maneuver controls of a group of quadrotor UAVs while considering safety constraints and providing security against cyber-attacks. The leaders specify desired formation maneuvers through stress matrices and an affine transformation scheme. By constructing a barrier Lyapunov function, the operation of quadrotor UAVs within a predefined safety range is guaranteed. A distributed sliding mode control coupled with an actor-critic learning mechanism is designed to counter cyber-attacks and achieve the leader-follower affine formation maneuvers of the quadrotor UAVs. The critic system is used to estimate the objective function of the system whereas the actor system takes the proper control action necessary to achieve control objectives. A Lyapunov stability function is employed to prove that the closed-loop system is bounded. The provided example reveals that the method and system of the present disclosure are able to meet the control objective. Compared with traditional systems wherein input gains induced by attack signals are not considered, the present disclosure addresses these problems by integrating the Nussbaum function into the controller. As such, the controller of the present disclosure does not require any prior information about the signs of the input gains induced by the attack signals.
[0130] In conventional methods and systems, learning-based controllers use linear quadratic regulators; in contrast, the present disclosure teaches a distributed sliding mode control approach equipped with a learning mechanism. The learning mechanism may be a machine learning mechanism. A machine learning mechanism may be understood to be any formula, computer program, computer, computer system, program, network of computers, or the like, which is configured to develop, change, or improve its functionality with exposure to data such that improved performance of a particular task, or more desirable responses to particular stimuli, may be learned or developed by the mechanism.
[0131] The quadrotors of the present disclosure may be any unmanned aerial vehicle (UAV) which is configured to use spinning rotors to generate thrust. The control scheme of the present disclosure may be applied to any plurality of UAVs, including but not limited to helicopters, dicopters, tricopters, quadcopters, hexacopters, and octocopters.
[0132] The cyber-attack of the present disclosure may be any malicious digital signal which is configured to interrupt the operation of at least one UAV of the present disclosure. The cyber-attack may be, amongst other attacks, a distributed denial of service attack, a sensor deception attack, or an actuator injection attack.
[0133] Numerous modifications and variations of the present disclosure are possible in light of the above teachings. It is therefore to be understood that within the scope of the appended claims, the invention may be practiced otherwise than as specifically described herein.