A METHOD AND SYSTEM FOR CONTROLLING THE FLIGHT OF A PLURALITY OF QUADCOPTERS

20260023396 · 2026-01-22

Abstract

A method and system for controlling the flight of a plurality of quadcopters includes communicating a flight instruction from a user to a leader quadcopter. The method includes calculating a leader formation maneuver and a follower formation maneuver with a leader-follower formation controller configured to use affine transformations and stress matrices to convert a flight instruction into a leader formation maneuver and a follower formation maneuver. The method may include communicating the follower formation maneuver from the leader quadcopter to a follower quadcopter. The method may include executing the leader formation maneuver on the leader quadcopter. The method may include executing the follower formation maneuver on the follower quadcopter.

Claims

1. A method for controlling the flight of a plurality of quadcopters, comprising: communicating a flight instruction from a user to a leader quadcopter; calculating a leader formation maneuver and a follower formation maneuver with a leader-follower formation controller configured to use affine transformations and stress matrices to convert the flight instruction into a leader formation maneuver and a follower formation maneuver; communicating the follower formation maneuver from the leader quadcopter to a follower quadcopter; executing the leader formation maneuver on the leader quadcopter; and executing the follower formation maneuver on the follower quadcopter.

2. The method of claim 1, wherein the leader-follower formation controller is further configured to utilize a barrier function to calculate the leader formation maneuver and the follower formation maneuver.

3. The method of claim 2, wherein the barrier function is a Lyapunov candidate function which trends to infinity at a predetermined constraint value.

4. The method of claim 2, wherein the leader-follower formation controller is further configured to use an actor-critic learning mechanism.

5. The method of claim 4, wherein the actor-critic learning mechanism is a machine learning algorithm.

6. The method of claim 1, wherein the leader-follower formation controller is further configured to use a distributed sliding mode control.

7. The method of claim 6, wherein the distributed sliding mode control is configured to use a control law to protect at least one of the plurality of quadcopters from a malicious cyber-attack.

8. The method of claim 7, wherein the malicious cyber-attack is a distributed denial of service attack, a sensor deception attack, or an actuator injection attack.

9. The method of claim 7, wherein the distributed sliding mode control is configured to use a Nussbaum gain function to mitigate the effects of input gain on the follower formation maneuver, the input gain being created by a malicious cyber-attack.

10. The method of claim 1, wherein the follower formation maneuver comprises: a scaling maneuver; a shearing maneuver; a translation maneuver; and a collinearity maneuver.

11. The method of claim 10, wherein the position of each of the plurality of quadcopters within the follower formation maneuver is defined by: an x position; a y position; a z position; a roll angle; a yaw angle; and a pitch angle.

12. The method of claim 1, wherein the leader-follower formation controller is configured to use a radial basis function neural network to smooth a flight instruction after the flight instruction has been communicated from a user to a leader quadcopter.

13. A method for controlling the flight of a plurality of quadcopters, comprising: communicating a flight instruction from a user to a leader quadcopter; calculating a leader formation maneuver and a follower formation maneuver with a leader-follower formation controller configured to use a distributed sliding mode control and an actor-critic learning mechanism to convert a flight instruction into a leader formation maneuver and a follower formation maneuver; communicating the follower formation maneuver from the leader quadcopter to a follower quadcopter; executing the leader formation maneuver on the leader quadcopter; and executing the follower formation maneuver on the follower quadcopter.

14. The method of claim 13, wherein the sliding mode control is configured to convert the flight instruction into a follower formation maneuver with affine transformations and stress matrices.

15. The method of claim 13, wherein the actor-critic learning mechanism is a machine learning algorithm and the machine learning algorithm is further configured to use a radial basis function to smooth a flight instruction after the flight instruction has been communicated from a user to a leader quadcopter.

16. The method of claim 13, wherein the sliding mode control and the actor-critic learning mechanism are configured to use a control law to protect at least one of the plurality of quadcopters from a malicious cyber-attack.

17. The method of claim 16, wherein the distributed sliding mode control and the actor-critic learning mechanism are configured to use a Nussbaum gain function to mitigate the effects of input gain on a follower formation maneuver, the input gain being created by a malicious cyber-attack.

18. The method of claim 16, wherein the malicious cyber-attack is a distributed denial of service attack, a sensor deception attack, or an actuator injection attack.

19. A system for controlling a plurality of quadcopters, the system comprising: a user input device configured to receive a flight instruction from a user and deliver the flight instruction to a leader quadcopter; a leader quadcopter configured to: receive a flight instruction from a user input device; calculate a leader formation maneuver; calculate a follower formation maneuver with a leader-follower formation controller configured to use affine transformations and stress matrices to calculate the follower formation maneuver; communicate the follower formation maneuver to a follower quadcopter; and execute the leader formation maneuver; and a follower quadcopter configured to: receive the follower formation maneuver; and execute the follower formation maneuver.

20. The system of claim 19, wherein the leader-follower formation controller is further configured to use an actor-critic machine learning algorithm to calculate the follower formation maneuver.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] A more complete appreciation of this disclosure and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:

[0012] FIG. 1A depicts a block diagram of a system for controlling a plurality of quadcopters, according to certain embodiments.

[0013] FIG. 1B depicts several formation shape maneuvers of a plurality of quadcopters, according to certain embodiments.

[0014] FIG. 2 depicts a nominal formation of the quadrotors, according to certain embodiments.

[0015] FIG. 3A depicts a graph of leader-follower time-varying position trajectories along the x-axis, according to certain embodiments.

[0016] FIG. 3B depicts a graph of leader-follower time-varying position trajectories along the y-axis, according to certain embodiments.

[0017] FIG. 4A depicts a graph of tracking errors of followers in the x-axis, according to certain embodiments.

[0018] FIG. 4B depicts a graph of tracking errors of followers in the y-axis, according to certain embodiments.

[0019] FIG. 5A depicts a graph of actor-critic learning control protocols of the followers in the x-axis, according to certain embodiments.

[0020] FIG. 5B depicts a graph of actor-critic learning control protocols of the followers in the y-axis, according to certain embodiments.

[0021] FIG. 6A-FIG. 6H depict graphs of the 2-norm of the actor weights, according to certain embodiments.

[0022] FIG. 7A-FIG. 7H depict graphs of the 2-norm of the critic weights, according to certain embodiments.

[0023] FIG. 8 depicts a graph of leader-follower affine formation maneuvers, according to certain embodiments.

[0024] FIG. 9 depicts a flowchart of a method for controlling the flight of a plurality of quadcopters, according to certain embodiments.

[0025] FIG. 10 depicts a flowchart of a method for controlling the flight of a plurality of quadcopters, according to certain other embodiments.

[0026] FIG. 11 depicts a block diagram of a computer and/or networked device configured to communicate with at least one quadcopter of the plurality of quadcopters according to certain embodiments.

[0027] FIG. 12 depicts a block diagram of a networked device configured to communicate with at least one quadcopter of the plurality of quadcopters according to certain other embodiments.

[0028] FIG. 13 depicts a block diagram of a central processing unit configured to process information related to the controlling of at least one quadcopter of the plurality of quadcopters according to certain embodiments.

[0029] FIG. 14 depicts a network diagram of a communication network configured to relay information to and from at least one networked device configured to communicate with at least one quadcopter of the plurality of quadcopters according to certain embodiments.

DETAILED DESCRIPTION

[0030] In the drawings, like reference numerals designate identical or corresponding parts throughout the several views. Further, as used herein, the words "a," "an" and the like generally carry a meaning of "one or more," unless stated otherwise.

[0031] Furthermore, the terms "approximately," "approximate," "about," and similar terms generally refer to ranges that include the identified value within a margin of 20%, 10%, or preferably 5%, and any values therebetween.

[0032] The term "affine formation maneuver" refers to a maneuver of a plurality of objects in formation which retain a spatial relationship consistent with an affine transformation. The term "affine transformation" may refer to the class of mappings, each consisting of a linear map combined with a translation, which preserve points, straight lines, and planes, but do not necessarily preserve Euclidean distances or angles. Therefore, the points, straight lines, and planes in the geometry of a formation at a first point in time remain points, straight lines, and planes in the geometry of the formation after an affine formation maneuver has been performed.

[0033] The terms "leader" and "follower" may describe, respectively, the independently controlled components of a leader-follower control scheme. In a leader-follower control scheme, at least one member is chosen as a leader, and the leader(s) may dictate or decide the moving trajectory of the whole formation group, typically by communicating commands to the non-deciding follower members. In one embodiment of a leader-follower control scheme, a user may send user commands to leader vehicles, thereby externally controlling the leaders. The leaders may then send information about the user commands and/or send follower commands to the follower vehicles, which may act on those instructions, thereby internally controlling the followers. In another embodiment, the user commands may additionally or alternatively be sent to the followers by a user or by a separate computer and/or networked device.

[0034] According to an embodiment, an actor-critic learning scheme for safe leader-follower affine formation maneuver control of networked quadrotor unmanned aerial vehicles (UAVs) is described in the face of external disturbances, sensor deception attacks, and injection attacks on the actuators. Typically, the follower quadrotors (followers) aim to track the formation maneuvers such as scaling, shearing, translation, and rotation decided by the leader quadrotors (leaders). Motivated by increasing safety and performance requirements during formation maneuvering, the dynamic states of the quadrotor UAVs are constrained within the prescribed safety region. The term dynamic states may refer to the full range of actual positions/formations that the UAVs may be arranged in during operation.

[0035] To guarantee that the safety constraints are not violated, a barrier Lyapunov function may be used. A distributed sliding mode control with actor-critic learning is also formulated and implemented to facilitate accurate leader-follower affine formation maneuvers and reject malicious cyber-attack signals. A sliding mode control may refer to any nonlinear control method that alters the dynamics of a nonlinear system, such as a flight path, by applying a discontinuous control signal which forces the system to slide along a cross-section of the system's normal behavior thereby regulating the system. The term actor-critic learning may refer to a machine reinforcement learning system wherein software is trained on the outcomes of actions its decisions have informed such that it will make more effective decisions in the future.

[0036] Additionally, input gains that arise due to the attacks might corrupt the control direction. In the present disclosure, a Nussbaum gain function is therefore coupled to the controller to address input gain corruption. The actor network may be responsible for estimating uncertain dynamics together with the malicious attack signals, while the critic network evaluates the control performance through an estimated long-term performance index. In the method and system of the present disclosure, the signals of the overall closed-loop system can be shown to be uniformly bounded using a Lyapunov stability analysis.

[0037] In view of the foregoing discussion, the present disclosure provides a solution for leader-follower affine formation maneuver control of networked quadrotor unmanned aerial vehicles (UAVs) subjected to external disturbances, sensor deception attacks, and actuator injection attacks. By considering the safety and physical limitations of the quadrotor UAVs, their dynamic states are constrained to operate within a safe workspace. To prevent violation of the safety constraints, the method and system of the present disclosure employ a barrier Lyapunov function to improve the safety of the system. The safety of the system may refer to the operational conditions under which the formation may maneuver without risking a loss of control of an individual UAV, collisions between UAVs, or some other undesirable flight path for the formation. The actor-critic learning-based distributed sliding mode control technique is then designed to aid the leader-follower formation maneuvers of the quadrotor UAVs within the prescribed workspace while mitigating cyber-attacks. The present disclosure provides a method and system that utilize the properties of affine transformations and stress matrices to achieve various collective affine formation maneuvers of the networked quadrotor UAVs, such as scaling, translation, rotation, shearing, and collinearity. Moreover, the quadrotors in this technology have improved freedom of maneuverability. The present disclosure addresses a safety-guaranteed leader-follower affine formation maneuver control problem associated with multiple quadrotor UAVs under sensor deception attacks and actuator injection attacks. Compared with conventional systems, in which the problem of input gains induced by attack signals was not considered, the present disclosure addresses this problem by integrating the Nussbaum function into the controller. As such, the controller of the present disclosure does not require any prior information about the signs of the input gains induced by the attack signals in order to counteract them. In conventional systems, the learning-based controllers take the form of linear quadratic regulators; in contrast, the present disclosure utilizes a distributed sliding mode control approach integrated with a learning mechanism.

[0038] According to an embodiment, the quadrotor of the present disclosure is a system having six degrees of freedom with four control inputs. Assuming that the quadrotor framework is rigid and symmetric, and that the center of gravity coincides with the body-frame origin, the dynamic equations of the displacement and rotation of the i-th quadrotor are given by equations (1) to (6):

$$\ddot{\phi}_i = \frac{1}{I_x}\left[(I_y - I_z)\,\dot{\theta}_i\dot{\psi}_i - a_1\dot{\phi}_i^{2} - I_p\dot{\theta}_i + l\,u_{2i}\right] + d_{\phi i} \tag{1}$$

$$\ddot{\theta}_i = \frac{1}{I_y}\left[(I_z - I_x)\,\dot{\phi}_i\dot{\psi}_i - a_2\dot{\theta}_i^{2} + I_p\dot{\phi}_i + l\,u_{3i}\right] + d_{\theta i} \tag{2}$$

$$\ddot{\psi}_i = \frac{1}{I_z}\left[(I_x - I_y)\,\dot{\phi}_i\dot{\theta}_i - a_3\dot{\psi}_i^{2} + l\,u_{4i}\right] + d_{\psi i} \tag{3}$$

$$\ddot{x}_i = -\frac{a_4\dot{x}_i}{m} + (\cos\phi_i\sin\theta_i\cos\psi_i + \sin\phi_i\sin\psi_i)\frac{u_{1i}}{m} + d_{xi} \tag{4}$$

$$\ddot{y}_i = -\frac{a_5\dot{y}_i}{m} + (\cos\phi_i\sin\theta_i\sin\psi_i - \sin\phi_i\cos\psi_i)\frac{u_{1i}}{m} + d_{yi} \tag{5}$$

$$\ddot{z}_i = -\frac{a_6\dot{z}_i}{m} + (\cos\phi_i\cos\theta_i)\frac{u_{1i}}{m} - \frac{g}{m} + d_{zi} \tag{6}$$

wherein $x_i$, $y_i$, and $z_i$ denote the position of the quadrotors in the earth frame; $\phi_i$, $\psi_i$, and $\theta_i$ represent the roll, yaw, and pitch angles, respectively; $d_{\phi i}$, $d_{\theta i}$, $d_{\psi i}$, $d_{xi}$, $d_{yi}$, and $d_{zi}$ stand for the unknown time-varying external disturbances; $l$ is the length from the center of each actuator to the center of mass; $m$ is the mass of the quadrotor; $I_p$ is the propeller inertia; $I_x$, $I_y$, and $I_z$ are the moments of inertia along the x, y, and z axes, respectively; $g$ represents gravitational acceleration; $a_1$, $a_2$, $a_3$, $a_4$, $a_5$, and $a_6$ stand for the drag coefficients; and $u_{1i}$, $u_{2i}$, $u_{3i}$, and $u_{4i}$ denote the control inputs.

[0039] The three-degree-of-freedom translational equations of the N quadrotors are given by equations (7) to (9):

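$$\ddot{x}_i = -\frac{a_4\dot{x}_i}{m} + u_{xi} + d_{xi} \tag{7}$$

$$\ddot{y}_i = -\frac{a_5\dot{y}_i}{m} + u_{yi} + d_{yi} \tag{8}$$

$$\ddot{z}_i = -\frac{a_6\dot{z}_i}{m} + u_{zi} + d_{zi} \tag{9}$$

where

$$u_{xi} = (\cos\phi_i\sin\theta_i\cos\psi_i + \sin\phi_i\sin\psi_i)\frac{u_{1i}}{m},\quad u_{yi} = (\cos\phi_i\sin\theta_i\sin\psi_i - \sin\phi_i\cos\psi_i)\frac{u_{1i}}{m},\quad u_{zi} = (\cos\phi_i\cos\theta_i)\frac{u_{1i}}{m} - \frac{g}{m}.$$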

[0040] Assuming that the quadrotors are flying at the same altitude, the translational equations of the quadrotors in two-dimensional space can be written as:

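$$\ddot{q}_i = f_i(q_i, \dot{q}_i) + u_i + d_i \tag{10}$$

where $q_i = [x_i \;\; y_i]^T \in \mathbb{R}^2$, $u_i = [u_{xi} \;\; u_{yi}]^T \in \mathbb{R}^2$, $f_i(q_i, \dot{q}_i) = \left[-\frac{a_4\dot{x}_i}{m} \;\; -\frac{a_5\dot{y}_i}{m}\right]^T \in \mathbb{R}^2$, and $d_i = [d_{xi} \;\; d_{yi}]^T \in \mathbb{R}^2$.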
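As an illustration only, the planar model of equation (10) can be integrated numerically. The sketch below assumes the form $\ddot{q}_i = f_i + u_i + d_i$ with linear drag; the function names, the drag coefficients, the mass, the step size, and the constant input are hypothetical values chosen for the example and are not taken from the disclosure.

```python
import numpy as np

def planar_accel(q_dot, u, d, a4, a5, m):
    """Right-hand side of the planar model: q_ddot = f(q_dot) + u + d."""
    f = np.array([-a4 * q_dot[0] / m, -a5 * q_dot[1] / m])  # drag term f_i
    return f + u + d

def simulate(q0, v0, u, steps, dt, a4=0.01, a5=0.01, m=1.0):
    """Semi-implicit Euler integration with a constant input and no disturbance."""
    q = np.array(q0, dtype=float)
    v = np.array(v0, dtype=float)
    for _ in range(steps):
        v = v + dt * planar_accel(v, u, np.zeros(2), a4, a5, m)
        q = q + dt * v
    return q, v

# Hypothetical run: unit acceleration command along x for one second.
q_end, v_end = simulate([0.0, 0.0], [0.0, 0.0], u=np.array([1.0, 0.0]),
                        steps=1000, dt=0.001)
```

With the small drag assumed here, the x-velocity after one second stays slightly below the drag-free value, and the y-axis remains at rest.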

[0041] The dynamic states of the quadrotors in equation (10) are constrained to the following compact sets:

$$|q_i| = \begin{bmatrix} |x_i| \\ |y_i| \end{bmatrix} < \begin{bmatrix} k_{ci} \\ k_{ci} \end{bmatrix} \tag{11}$$

where $k_{ci} > 0$ is a constant. The compact set (11) specifies the safe operating region of the quadrotors.

[0042] Assumption 1: Let $q_i^* = [x_i^* \;\; y_i^*]^T$ be the target position of each quadrotor. It is assumed that $q_i^*$ is bounded as:

$$|q_i^*| = \begin{bmatrix} |x_i^*| \\ |y_i^*| \end{bmatrix} < \begin{bmatrix} \kappa_{0i} \\ \kappa_{0i} \end{bmatrix} < \begin{bmatrix} k_{ci} \\ k_{ci} \end{bmatrix} \tag{12}$$

and that its first and second derivatives are bounded as follows:

$$|\dot{q}_i^*| = \begin{bmatrix} |\dot{x}_i^*| \\ |\dot{y}_i^*| \end{bmatrix} < \begin{bmatrix} \kappa_{1i} \\ \kappa_{1i} \end{bmatrix} < \begin{bmatrix} k_{ci} \\ k_{ci} \end{bmatrix} \tag{13}$$

$$|\ddot{q}_i^*| = \begin{bmatrix} |\ddot{x}_i^*| \\ |\ddot{y}_i^*| \end{bmatrix} < \begin{bmatrix} \kappa_{2i} \\ \kappa_{2i} \end{bmatrix} < \begin{bmatrix} k_{ci} \\ k_{ci} \end{bmatrix} \tag{14}$$

where $\kappa_{0i} > 0$, $\kappa_{1i} > 0$, and $\kappa_{2i} > 0$ are constants.

[0043] The two-dimensional translational equations of the quadrotors in the face of cyber-attacks are given as:

$$\begin{cases} \ddot{q}_i = f_i(q_i, \dot{q}_i) + \breve{u}_i + d_i \\ \breve{q}_i = q_i + \delta_i(t, q_i) \\ \breve{u}_i = \bar{h}_i u_i + h_i(t, q_i, \dot{q}_i) \end{cases} \tag{15}$$

wherein $\breve{q}_i \in \mathbb{R}^2$ is the vector of the compromised states, $\breve{u}_i \in \mathbb{R}^2$ is the vector of the compromised control inputs, $\delta_i(t, q_i) \in \mathbb{R}^2$ is the vector of the deception attack signals, $h_i(t, q_i, \dot{q}_i) \in \mathbb{R}^2$ is the vector of time-varying injected attack signals, and $\bar{h}_i = \mathrm{diag}\{h_{xi}, h_{yi}\} \in \mathbb{R}^{2 \times 2}$ is the input gain matrix induced by the attack.

[0044] Since the deception attack signals mimic the state variables, they can be described by the state-dependent function $\delta_i(t, q_i) = \varpi_i(t)\,q_i$, where $\varpi_i(t) \in \mathbb{R}^{2 \times 2}$ is a time-varying weight. Thus, the quadrotor states corrupted by the deception attacks are given as:

$$\breve{q}_i = q_i + \varpi_i(t)\,q_i = \rho_i(t)\,q_i \tag{16}$$

where $\rho_i(t) = (1 + \varpi_i(t))$.

[0045] As a result of the attack, the actual state variables $q_i$ are no longer available. Therefore, the contaminated state variables $\breve{q}_i$ will be utilized for control design.

[0046] Differentiating (16) with respect to time gives:

$$\dot{\breve{q}}_i = \dot{\rho}_i q_i + \rho_i \dot{q}_i \tag{17}$$

$$\ddot{\breve{q}}_i = \ddot{\rho}_i q_i + 2\dot{\rho}_i \dot{q}_i + \rho_i \ddot{q}_i \tag{18}$$

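[0047] Noting that

$$q_i = \frac{\breve{q}_i}{\rho_i} \quad \text{and} \quad \dot{q}_i = \frac{1}{\rho_i}\left[\dot{\breve{q}}_i - \frac{\dot{\rho}_i \breve{q}_i}{\rho_i}\right]$$

and using (15), equation (18) becomes:

$$\ddot{\breve{q}}_i = \frac{\ddot{\rho}_i \breve{q}_i}{\rho_i} + \frac{2\dot{\rho}_i}{\rho_i}\left[\dot{\breve{q}}_i - \frac{\dot{\rho}_i \breve{q}_i}{\rho_i}\right] + \rho_i f_i + \rho_i d_i + \rho_i \bar{h}_i u_i + \rho_i h_i \tag{19}$$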

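[0048] The quadrotor systems under the cyber-attacks can be expressed as:

$$\ddot{\breve{q}}_i = \mathcal{F}_i + g_i u_i \tag{20}$$

where $\mathcal{F}_i = \frac{\ddot{\rho}_i \breve{q}_i}{\rho_i} + \frac{2\dot{\rho}_i}{\rho_i}\left[\dot{\breve{q}}_i - \frac{\dot{\rho}_i \breve{q}_i}{\rho_i}\right] + \rho_i f_i + \rho_i d_i + \rho_i h_i$, and $g_i = \rho_i \bar{h}_i$.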

[0049] The $N$ quadrotors in (20) can be partitioned into two groups: the $N_l$ leaders and the $N_f = N - N_l$ followers.

[0050] Let $\breve{q}_l$ and $\breve{q}_f$ be the vectors of the $N_l$ leaders and $N_f$ followers, respectively. Then:

$$\breve{q}_l = \left[\breve{q}_1^T \;\; \breve{q}_2^T \;\; \cdots \;\; \breve{q}_{N_l}^T\right]^T \tag{21}$$

$$\breve{q}_f = \left[\breve{q}_{N_l+1}^T \;\; \breve{q}_{N_l+2}^T \;\; \cdots \;\; \breve{q}_{N_l+N_f}^T\right]^T \tag{22}$$

[0051] The configuration of the $N$ quadrotors, which consists of their positions in two-dimensional space under cyber-attacks, is thus:

$$\breve{q} = \left[\breve{q}_l^T \;\; \breve{q}_f^T\right]^T = \left[\breve{q}_1^T \;\; \breve{q}_2^T \;\; \cdots \;\; \breve{q}_N^T\right]^T \in \mathbb{R}^{2N} \tag{23}$$

[0052] It is advantageous to provide a control law that can maintain the healthy quadrotor states within the safety constraints (11) even in the presence of cyber-attacks.

[0053] Assumption 2: The $N_l$ leader quadrotors are assumed to be already controlled to achieve the desired formation maneuvers. Accordingly, the control procedure for the leader quadrotors is not considered.

[0054] Definition 1. A continuous function $N(\chi): \mathbb{R} \to \mathbb{R}$ that satisfies:

$$\begin{cases} \displaystyle\lim_{t \to \infty} \sup \frac{1}{t}\int_0^t N(\chi)\,d\chi = +\infty \\[2ex] \displaystyle\lim_{t \to \infty} \inf \frac{1}{t}\int_0^t N(\chi)\,d\chi = -\infty \end{cases} \tag{24}$$

is a Nussbaum function. Many functions satisfy the condition in (24), for instance, $\chi^2\cos(\chi)$, $e^{\chi^2}\cos\left(\tfrac{\pi}{2}\chi\right)$, and $\chi^2\sin(\chi)$.
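The defining property of a Nussbaum function can be illustrated numerically. The sketch below takes the example $\chi^2\cos(\chi)$ and evaluates the running average $\frac{1}{t}\int_0^t \chi^2\cos(\chi)\,d\chi$ using the closed-form antiderivative $\chi^2\sin\chi + 2\chi\cos\chi - 2\sin\chi$; the function names and the evaluation points are chosen for illustration only.

```python
import numpy as np

def nussbaum(chi):
    """Example Nussbaum function chi^2 * cos(chi)."""
    return chi**2 * np.cos(chi)

def running_average(t):
    """(1/t) * integral from 0 to t of chi^2 cos(chi) d(chi), in closed form."""
    integral = t**2 * np.sin(t) + 2.0 * t * np.cos(t) - 2.0 * np.sin(t)
    return integral / t

# Where sin(t) = +1 the average is about +t; where sin(t) = -1 it is about -t.
t_up = (2 * 10 + 0.5) * np.pi
t_down = (2 * 10 + 1.5) * np.pi
avg_up = running_average(t_up)
avg_down = running_average(t_down)
```

Because the average swings between roughly $+t$ and $-t$ as $t$ grows, its upper limit is unbounded above and its lower limit is unbounded below, which is the behavior the definition requires.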

[0055] Lemma 1. Let $L(t)$ and $\chi(t)$ be smooth functions defined over $[0, t_f)$ with $L(t) \ge 0$, and let $N(\chi)$ be a smooth Nussbaum function. If the following inequality holds:

$$L(t) \le r_1 + e^{-r_2 t}\int_0^t \left[g(\tau)N(\chi) + 1\right]\dot{\chi}\,e^{r_2 \tau}\,d\tau \tag{25}$$

where $r_1 > 0$ and $r_2 > 0$ are constants and $g(\cdot)$ is a time-varying function, then $L(t)$, $\chi(t)$, and

$$\int_0^t g(\tau)N(\chi)\dot{\chi}\,d\tau$$

are bounded over $[0, t_f)$.

[0056] According to an embodiment, radial basis function neural networks are employed to approximate any smooth and continuous unknown term $F(Z)$ as:

$$F(Z) = W^T\Phi(Z) + \varepsilon(Z) \tag{26}$$

where $Z$ is the input vector, $W = [w_1 \;\; w_2 \;\; \cdots \;\; w_k]^T \in \mathbb{R}^k$ is the weight vector, $k$ is the number of nodes in the hidden layer, $\varepsilon(Z)$ is the approximation error with $|\varepsilon(Z)| \le \bar{\varepsilon}$ and $\bar{\varepsilon} > 0$, and $\Phi(Z) = [\Phi_1(Z) \;\; \Phi_2(Z) \;\; \cdots \;\; \Phi_k(Z)]^T$ represents the basis function vector with entries $\Phi_i(Z)$ given by

$$\Phi_i(Z) = \exp\left(-\frac{\|Z - \mu_i\|^2}{\bar{\eta}_i^2}\right), \quad i = 1, 2, \ldots, k \tag{27}$$

where $\mu_i$ and $\bar{\eta}_i$ denote the center and width of the Gaussian function, respectively.
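A radial basis function approximator of this kind can be sketched in a few lines. The example assumes Gaussian basis functions of the form $\exp(-\|Z - \mu_i\|^2 / \bar{\eta}_i^2)$ and a linear output layer $W^T\Phi(Z)$; the function names, centers, widths, and weights are hypothetical placeholders, not values from the disclosure.

```python
import numpy as np

def rbf_features(Z, centers, widths):
    """Gaussian basis vector: entry i is exp(-||Z - mu_i||^2 / eta_i^2)."""
    sq_dist = np.sum((centers - Z) ** 2, axis=1)  # squared distance to each center
    return np.exp(-sq_dist / widths**2)

def rbf_output(Z, W, centers, widths):
    """Network output W^T Phi(Z) for a single scalar output."""
    return W @ rbf_features(Z, centers, widths)

# Hypothetical two-node network on a two-dimensional input.
centers = np.array([[0.0, 0.0], [1.0, 0.0]])
widths = np.array([1.0, 1.0])
W = np.array([2.0, -1.0])
features = rbf_features(np.zeros(2), centers, widths)
value = rbf_output(np.zeros(2), W, centers, widths)
```

An input that coincides with a center produces a basis value of exactly one for that node, and the value decays with distance at a rate set by the width.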

[0057] According to an embodiment, a barrier function is employed to guarantee the safety of dynamic systems. The barrier function tends to infinity as its argument approaches the boundary of the specified safety constraints, so that a bounded barrier function implies that the constraints are never violated. This property is utilized to design control laws that keep dynamic systems within the safety barriers.

[0058] Lemma 2. A Lyapunov candidate function $L(z)$ satisfying $L(z) \to \infty$ as $|z| \to c$ is deployed to ensure that the boundaries of the state variables are not transgressed. The Lyapunov candidate function is as follows:

$$L(z) = \frac{1}{2}\log\frac{c^2}{c^2 - z^2} \tag{28}$$

where $c$ is the constraint on $z$. Thus, $L$ is positive definite and continuous within the set $|z| < c$. The control policy is developed to meet $\dot{L} \le 0$.

[0059] Lemma 3. For any constant $c > 0$ and $z \in \mathbb{R}$ meeting $|z| < c$, one gets

$$\log\frac{c^2}{c^2 - z^2} < \frac{z^2}{c^2 - z^2} \tag{29}$$
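The behavior of the log-type barrier function and the bound of Lemma 3 can be checked numerically. The sketch below assumes the form $L(z) = \tfrac{1}{2}\log\left(c^2/(c^2 - z^2)\right)$; the function names and the test grid are illustrative. The comparison uses a non-strict inequality with a small tolerance because both sides are equal to zero at $z = 0$.

```python
import numpy as np

def barrier(z, c):
    """Log-type barrier Lyapunov candidate, defined for |z| < c."""
    return 0.5 * np.log(c**2 / (c**2 - z**2))

def lemma3_gap(z, c):
    """Right-hand side minus left-hand side of the Lemma 3 bound."""
    return z**2 / (c**2 - z**2) - np.log(c**2 / (c**2 - z**2))

c = 1.0
z_grid = np.linspace(-0.99, 0.99, 199)       # grid strictly inside the constraint
gap_ok = bool(np.all(lemma3_gap(z_grid, c) >= -1e-12))
near_boundary = barrier(0.999, c)
well_inside = barrier(0.9, c)
```

The barrier value is zero at the origin and grows without bound as $|z|$ approaches $c$, which is why keeping the barrier function bounded keeps the state inside the constraint.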

[0060] See Y.-J. Liu, J. Li, S. Tong, C. L. P. Chen, Neural network control based adaptive learning design for nonlinear systems with full state constraints, IEEE Transactions on Neural Networks and Learning Systems 27 (2016) 1562-1571. doi 10.1109, incorporated herein by reference in its entirety.

[0061] The exchange of information among the quadrotors is modeled by an undirected graph $\mathcal{G}$ consisting of $N$ vertices. Define $\mathcal{V} = \{\nu_1, \ldots, \nu_N\}$ and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ as the sets of vertices and edges, respectively. Then, the set of neighbors of the vertex $i$ is defined by $\mathcal{N}_i = \{j \in \mathcal{V} : (i, j) \in \mathcal{E}\}$.

[0062] The formation of the quadrotors $(\mathcal{G}, \breve{q})$ is defined as the graph $\mathcal{G}$ of the quadrotors together with their corresponding configuration $\breve{q}$.

[0063] The scalar weight allotted to each edge of the formation $(\mathcal{G}, \breve{q})$ is termed the stress of the edge, and it can be positive or negative. The set of the scalar weights is defined by $\{s_{ij}\}_{(i,j) \in \mathcal{E}}$ with $s_{ij} = s_{ji} \in \mathbb{R}$. The stress matrix $S \in \mathbb{R}^{N \times N}$ is described as:

$$[S]_{ij} = \begin{cases} 0, & i \ne j,\ (i, j) \notin \mathcal{E} \\ -s_{ij}, & i \ne j,\ (i, j) \in \mathcal{E} \\ \sum_{k \in \mathcal{N}_i} s_{ik}, & i = j \end{cases} \tag{30}$$

[0064] When the stress satisfies $\sum_{j \in \mathcal{N}_i} s_{ij}(\breve{q}_j - \breve{q}_i) = 0$, or $(S \otimes I_2)\breve{q} = 0$ in compact form, it is referred to as an equilibrium stress.
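The construction of a stress matrix from edge stresses, and the equilibrium condition $(S \otimes I_2)q = 0$, can be sketched as follows. The three-vertex collinear formation and its stresses are a hypothetical hand-made example (not taken from the disclosure) chosen so that the forces at every vertex cancel; the function names are likewise illustrative.

```python
import numpy as np

def stress_matrix(n, stresses):
    """Build S from {(i, j): s_ij}: off-diagonals are -s_ij, diagonals are row sums of s_ik."""
    S = np.zeros((n, n))
    for (i, j), s in stresses.items():
        S[i, j] -= s
        S[j, i] -= s
        S[i, i] += s
        S[j, j] += s
    return S

def equilibrium_residual(S, q):
    """Compact-form residual (S kron I_2) q for an (N, 2) array of positions."""
    return np.kron(S, np.eye(2)) @ q.reshape(-1)

# Hypothetical example: three collinear points with one negative stress.
q = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
S = stress_matrix(3, {(0, 1): 2.0, (1, 2): 2.0, (0, 2): -1.0})
residual = equilibrium_residual(S, q)
```

By construction the matrix is symmetric with zero row sums, and for this example the residual vanishes, so the chosen weights form an equilibrium stress of the configuration.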

[0065] Thus, (30) can be rewritten in partitioned form as:

$$\bar{S} = S \otimes I_2 = \begin{bmatrix} S_{ll} & S_{lf} \\ S_{fl} & S_{ff} \end{bmatrix} \otimes I_2 = \begin{bmatrix} \bar{S}_{ll} & \bar{S}_{lf} \\ \bar{S}_{fl} & \bar{S}_{ff} \end{bmatrix} \tag{31}$$

where $\bar{S}_{ll} \in \mathbb{R}^{2N_l \times 2N_l}$, $\bar{S}_{ff} \in \mathbb{R}^{2N_f \times 2N_f}$, and $\bar{S}_{fl} \in \mathbb{R}^{2N_f \times 2N_l}$.

[0066] Let

$$r = \left[r_1^T \;\; r_2^T \;\; \cdots \;\; r_N^T\right]^T = \left[r_l^T \;\; r_f^T\right]^T \in \mathbb{R}^{2N}$$

be a constant configuration. In view of the graph $\mathcal{G}$, a nominal formation $(\mathcal{G}, r)$ is formed. The time-varying reference formation is then:

$$q^*(t) = \left[I_N \otimes A(t)\right] r + 1_N \otimes b(t) \tag{32}$$

where $A(t) \in \mathbb{R}^{2 \times 2}$ and $b(t) \in \mathbb{R}^2$ are time-varying. The affine transformation of $r$ can be achieved by specifying the entries of $A(t)$ and $b(t)$.

[0067] The affine image of $r$ is defined by:

$$\mathcal{A}(r) = \left\{\breve{q} \in \mathbb{R}^{2N} : \breve{q} = (I_N \otimes A)\,r + 1_N \otimes b\right\} \tag{33}$$

[0068] If $\breve{q} \in \mathcal{A}(r)$, the $N$ agents acquire the affine formation maneuvers.
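A reference formation of the form $(I_N \otimes A)\,r + 1_N \otimes b$ can be computed directly with Kronecker products. In the sketch below, the nominal configuration, the matrix $A$ (a uniform scaling by two), the translation $b$, and the function name are hypothetical values chosen for illustration.

```python
import numpy as np

def reference_formation(r, A, b):
    """Affine image of a stacked nominal configuration r of length 2N."""
    n_agents = r.size // 2
    return np.kron(np.eye(n_agents), A) @ r + np.kron(np.ones(n_agents), b)

# Hypothetical nominal triangle, scaled by 2 and shifted by one unit along x.
r = np.array([0.0, 0.0, 1.0, 0.0, 0.0, 1.0])   # stacked [x1, y1, x2, y2, x3, y3]
A = 2.0 * np.eye(2)
b = np.array([1.0, 0.0])
q_ref = reference_formation(r, A, b)
```

Choosing a diagonal $A$ produces scaling, an off-diagonal entry produces shearing, a rotation matrix produces rotation, and $b$ alone produces translation.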

[0069] Definition 2. Affine localizability: If for any

$$\breve{q} = \left[\breve{q}_l^T \;\; \breve{q}_f^T\right]^T \in \mathcal{A}(r),$$

$\breve{q}_l$ uniquely determines $\breve{q}_f$, then $(\mathcal{G}, r)$ is affinely localizable by the leaders.

[0070] Assumption 3. The stress matrix $S$ is positive semidefinite, and $\mathrm{rank}(S) = N - \mathrm{dim} - 1$ holds, where $\mathrm{dim}$ is the dimension of the systems in the Euclidean space. In this work, two-dimensional systems are considered, i.e., $\mathrm{dim} = 2$. This assumption implies that the matrix $S_{ff}$ is positive definite and invertible.

[0071] By virtue of Definition 2 and Assumption 3, for any

$$\breve{q} = \left[\breve{q}_l^T \;\; \breve{q}_f^T\right]^T \in \mathcal{A}(r),$$

$\breve{q}_l$ can uniquely determine $q_f^*$ as:

$$q_f^* = -\left(S_{ff}^{-1}S_{fl} \otimes I_2\right)\breve{q}_l \tag{34}$$

where $q_f^*$ is the target position of the followers.
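The way leader positions determine follower targets through the stress matrix can be demonstrated with a small numerical sketch. The example below does not use a stress matrix from the disclosure; it assumes a hypothetical five-agent nominal formation with three non-collinear leaders and builds a positive semidefinite matrix $S$ of rank $N - 3$ that annihilates the nominal positions and the all-ones vector (which generally corresponds to a dense communication graph). The follower targets are then computed as $-S_{ff}^{-1}S_{fl}$ applied to the leader positions.

```python
import numpy as np

def stress_from_nominal(r):
    """PSD matrix S of rank N-3 with S @ [r, 1] = 0, for (N, 2) nominal positions r."""
    n = r.shape[0]
    P = np.hstack([r, np.ones((n, 1))])
    U, _, _ = np.linalg.svd(P, full_matrices=True)
    Q = U[:, 3:]              # orthonormal basis orthogonal to the columns of P
    return Q @ Q.T

def follower_targets(S, q_leaders):
    """Follower targets -S_ff^{-1} S_fl q_l, with the leaders listed first."""
    n_l = q_leaders.shape[0]
    S_fl = S[n_l:, :n_l]
    S_ff = S[n_l:, n_l:]
    return -np.linalg.solve(S_ff, S_fl @ q_leaders)

# Hypothetical nominal formation: first three rows are the leaders.
r = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [1.0, 3.0]])
S = stress_from_nominal(r)

# Leaders perform a scaling/shearing/translation maneuver.
A = np.array([[1.5, 0.3], [0.0, 0.8]])
b = np.array([4.0, -1.0])
q_full = r @ A.T + b
q_followers = follower_targets(S, q_full[:3])
```

In this sketch the computed follower targets coincide with the affine image of the nominal follower positions, so the followers only need the leader positions and the stress weights, not the maneuver parameters $A$ and $b$ themselves.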

[0072] The aforementioned mathematics support an intelligent learning control method to ensure the safe formation maneuvering of a system of quadrotor UAVs and to shield them against harmful cyber-attacks. In view of Assumption 2, the control of the leaders is not considered. The distributed sliding mode surface of the followers' affine formation maneuver is designed as:

$$\sigma_i = \sum_{j=1}^{N} s_{ij}\left(\dot{\breve{q}}_i - \dot{\breve{q}}_j\right) - \sum_{j=1}^{N} s_{ij}\left(\dot{\breve{q}}_i(0) - \dot{\breve{q}}_j(0)\right) + \lambda_1\sum_{j=1}^{N} s_{ij}\left(\breve{q}_i - \breve{q}_j\right) + \lambda_2\int_0^t \vartheta_i\,d\tau, \quad i \in \mathcal{V}_f \tag{35}$$

where $\mathcal{V}_f$ denotes the set of followers, $\lambda_1 > 0$ and $\lambda_2 > 0$ are constant diagonal matrices, and the function $\vartheta_i(t)$ is designed as:

$$\vartheta_i(t) = \left[\sum_{j=1}^{N} s_{ij}\tanh^{\beta_1}\left(\breve{q}_i - \breve{q}_j\right) + \sum_{j=1}^{N} s_{ij}\tanh^{\beta_2}\left(\dot{\breve{q}}_i - \dot{\breve{q}}_j\right)\right] \tag{36}$$

where $1 < \beta_1 < 2$ and $\beta_2 > \beta_1$.

[0073] The compact form of (35) can be expressed as:

$$\sigma_f = \bar{S}_{ff}\left(\dot{\breve{q}}_f - \dot{q}_f^*\right) - \dot{z}_f(0) + \lambda_1 z_f + \lambda_2\int_0^t \vartheta_f\,d\tau \tag{37}$$

where $\bar{S}_{ff} = S_{ff} \otimes I_2$, $z_f = \bar{S}_{ff}\left(\breve{q}_f - q_f^*\right)$, $q_f^* = -\left(S_{ff}^{-1}S_{fl} \otimes I_2\right)\breve{q}_l$ and $\dot{q}_f^* = -\left(S_{ff}^{-1}S_{fl} \otimes I_2\right)\dot{\breve{q}}_l$ represent the target formation of the followers and its first-order derivative, respectively, and

$$\vartheta_f = \left[\vartheta_{N_l+1}^T \;\; \vartheta_{N_l+2}^T \;\; \cdots \;\; \vartheta_{N_l+N_f}^T\right]^T.$$

[0074] The main objective here is to achieve:

$$\lim_{t \to \infty}\left(\breve{q}_f(t) - q_f^*(t)\right) = 0 \tag{38}$$

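[0075] Differentiating (37) with respect to time gives:

$$\dot{\sigma}_f = \bar{S}_{ff}\left(\ddot{\breve{q}}_f - \ddot{q}_f^*\right) + \lambda_1\dot{z}_f + \lambda_2\vartheta_f = \bar{S}_{ff}\left(\mathcal{F}_f + \mathcal{G}_f u_f - \ddot{q}_f^* + \lambda_1\bar{S}_{ff}^{-1}\dot{z}_f + \lambda_2\bar{S}_{ff}^{-1}\vartheta_f\right) \tag{39}$$

where $\bar{S}_{ff} = S_{ff} \otimes I_2$, $\sigma_f = \left[\sigma_{N_l+1}^T \;\; \sigma_{N_l+2}^T \;\; \cdots \;\; \sigma_{N_l+N_f}^T\right]^T$, $z_f = \left[z_{N_l+1}^T \;\; z_{N_l+2}^T \;\; \cdots \;\; z_{N_l+N_f}^T\right]^T$, $\mathcal{F}_f = \left[\mathcal{F}_{N_l+1}^T \;\; \mathcal{F}_{N_l+2}^T \;\; \cdots \;\; \mathcal{F}_{N_l+N_f}^T\right]^T$, $\mathcal{G}_f = \mathrm{diag}\{g_{N_l+1}, g_{N_l+2}, \ldots, g_{N_l+N_f}\}$, and $u_f = \left[u_{N_l+1}^T \;\; u_{N_l+2}^T \;\; \cdots \;\; u_{N_l+N_f}^T\right]^T$.

From (35), it is known that $\sigma_i = [\sigma_{xi} \;\; \sigma_{yi}]^T$. Then, a barrier Lyapunov function for the tracking error system (39) is defined as follows:

$$L_1 = \mathrm{Trace}\left(S_{ff}^{-1}\sum_{i=N_l+1}^{N_l+N_f}\frac{1}{2}\left(\log\frac{c_{xi}^2}{c_{xi}^2 - \sigma_{xi}^2} + \log\frac{c_{yi}^2}{c_{yi}^2 - \sigma_{yi}^2}\right)\right) \tag{40}$$

where $c_{xi} > 0$ and $c_{yi} > 0$ are chosen to make sure that the constraints on the states are not violated. From Remark 11 and Assumption 1, these are set as $c_{xi} = k_{ci} - \kappa_{0i}$ and $c_{yi} = k_{ci} - \kappa_{0i}$, where $\kappa_{0i}$ is the bound on the target position in Assumption 1.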

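[0076] Differentiating $L_1$ with respect to time gives:

$$\begin{aligned} \dot{L}_1 &= \mathrm{Trace}\left(S_{ff}^{-1}\sum_{i=N_l+1}^{N_l+N_f}\left(\frac{\sigma_{xi}\dot{\sigma}_{xi}}{c_{xi}^2 - \sigma_{xi}^2} + \frac{\sigma_{yi}\dot{\sigma}_{yi}}{c_{yi}^2 - \sigma_{yi}^2}\right)\right) \\ &= \mathrm{Trace}\left(S_{ff}^{-1}\sum_{i=N_l+1}^{N_l+N_f}\left(\begin{bmatrix}\dfrac{\sigma_{xi}}{c_{xi}^2 - \sigma_{xi}^2} & \dfrac{\sigma_{yi}}{c_{yi}^2 - \sigma_{yi}^2}\end{bmatrix}\begin{bmatrix}\dot{\sigma}_{xi} \\ \dot{\sigma}_{yi}\end{bmatrix}\right)\right) \\ &= \mathrm{Trace}\left(S_{ff}^{-1}\sum_{i=N_l+1}^{N_l+N_f} O_i^T\dot{\sigma}_i\right) \end{aligned} \tag{41}$$

where $O_i = \left[\dfrac{\sigma_{xi}}{c_{xi}^2 - \sigma_{xi}^2} \;\; \dfrac{\sigma_{yi}}{c_{yi}^2 - \sigma_{yi}^2}\right]^T$.

[0077] Equation (41) can be rewritten as:

$$\dot{L}_1 = \mathrm{Trace}\left(O_f^T\bar{S}_{ff}^{-1}\dot{\sigma}_f\right) = O_f^T\left(\mathcal{F}_f + \mathcal{G}_f u_f - \ddot{q}_f^* + \lambda_1\bar{S}_{ff}^{-1}\dot{z}_f + \lambda_2\bar{S}_{ff}^{-1}\vartheta_f\right) \tag{42}$$

where $O_f = \left[O_{N_l+1}^T \;\; O_{N_l+2}^T \;\; \cdots \;\; O_{N_l+N_f}^T\right]^T$.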

[0078] The control protocol for the quadrotors with loss of direction resulting from an actuator attack can be designed as:

[00042] u f = f - 1 ( ` f + 3 f + 1 2 O f + 1 S ff - 1 z ` f + 2 S ff - 1 f - S ff - 1 q ` f * ) u f = N ( f ) u f ` f = O f T ( ` f + 3 f + 1 2 O f + 1 S ff - 1 z ` f + 2 S ff - 1 f - S ff - 1 q ` f * ) ( 43 )

where .sub.3 is a constant positive definite diagonal matrix.

[0079] Remark 6. the control input custom-character suffered from a loss of direction as a result of the attack gain .sub.f. The Nussbaum function N(custom-character.sub.f) is employed to tackle this problem. The control input u.sub.f is the corrected signal.

[0080] Remark 7. The control protocol (43) cannot be applied to the quadrotors directly because of the loss of control direction and the unknown lumped function vector.

[0081] The critic network is designed to assess the performance of the present control action and produce punishment/reward signals for adaptive learning. The critic network is implemented as a neural network. A long-term cost function is defined as:

[00043] $$J(t) = \int_t^{\infty} e^{-\frac{\tau - t}{\psi}}\, \Gamma(\tau)\, d\tau \tag{44}$$

where $J = [J_{N_l+1}^T \; J_{N_l+2}^T \; \cdots \; J_{N_l+N_f}^T]^T$, ψ>0 is a constant used to discount the future cost, and Γ(t) is the current cost function expressed as:

[00044] $$\Gamma(t) = \sigma_f^T Q\, \sigma_f + u_f^T R\, u_f \tag{45}$$

where Q and R are constant positive definite matrices.
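As a minimal sketch of (44) and (45), the following evaluates the current cost and a finite-horizon approximation of the discounted long-term cost. It assumes an exponential discount kernel exp(-(τ-t)/ψ), which is consistent with the temporal difference error (48). The values Q=10·I, R=5·I, and ψ=0.2 are taken from the simulation example, while the sample error and input vectors and the function names are hypothetical.

```python
import numpy as np


def instant_cost(sigma_f, u_f, Q, R):
    """Quadratic current cost: sigma_f' Q sigma_f + u_f' R u_f."""
    return float(sigma_f @ Q @ sigma_f + u_f @ R @ u_f)


def discounted_cost(cost_samples, dt, psi):
    """Trapezoidal approximation of the discounted cost over a finite horizon."""
    samples = np.asarray(cost_samples, dtype=float)
    tau = np.arange(len(samples)) * dt          # time elapsed since t
    integrand = np.exp(-tau / psi) * samples
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dt)


Q = 10.0 * np.eye(2)
R = 5.0 * np.eye(2)
psi = 0.2
sigma_f = np.array([0.3, -0.1])   # hypothetical tracking error
u_f = np.array([0.2, 0.4])        # hypothetical control input

gamma = instant_cost(sigma_f, u_f, Q, R)
J = discounted_cost(np.full(4001, gamma), dt=1e-3, psi=psi)
print(gamma, J)
```

For a constant current cost the discounted integral reduces to the cost multiplied by ψ, so a small ψ weights the near future heavily.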

[0082] The long-term cost function (44) contains future system information coupled with the compromised variables, so the solution is difficult to calculate. Thus, the critic neural network will be deployed to approximate J as

[00045] $$J = W_c^T S_c(Z_c) + \varepsilon_c \tag{46}$$

[0083] The critic neural network estimate of J is given by:

[00046] $$\hat{J} = \hat{W}_c^T S_c(Z_c) \tag{47}$$

where Ŵ.sub.c is the weight estimate of W.sub.c.

[0084] The continuous-time temporal difference error is obtained as:

[00047] $$\delta_c(t) = \Gamma(t) - \frac{1}{\psi}\hat{J}(t) + \dot{\hat{J}}(t) \tag{48}$$

[0085] For large ψ, i.e., ψ→∞, (48) simplifies to:

[00048] $$\delta_c(t) = \Gamma(t) + \dot{\hat{J}}(t) = \Gamma(t) + \nabla\hat{J}(t)\, \dot{Z}_c \tag{49}$$

where ∇ stands for the gradient operator.

[0086] Define the temporal difference error objective function as:

[00049] $$E_c = \frac{1}{2}\, \delta_c^T \delta_c \tag{50}$$

[0087] The weight update rule of the critic network is derived as follows:

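[00050] $$\begin{aligned}\dot{\hat{W}}_c &= -\gamma_c \frac{\partial E_c}{\partial \hat{W}_c} = -\gamma_c\, \delta_c(t) \frac{\partial \delta_c}{\partial \hat{W}_c} = -\gamma_c\, \delta_c(t) \frac{\partial \left[ \Gamma(t) - \frac{1}{\psi}\hat{J}(t) + \dot{\hat{J}}(t) \right]}{\partial \hat{W}_c} \\ &= -\gamma_c\, \delta_c(t) \left[ -\frac{1}{\psi} \frac{\partial \hat{J}}{\partial \hat{W}_c} + \frac{\partial}{\partial \hat{W}_c}\left( \nabla\hat{J}\, \dot{Z}_c \right) \right] = -\gamma_c \left( \Gamma(t) + \hat{W}_c^T \Lambda \right) \Lambda \end{aligned} \tag{51}$$

where γ.sub.c>0 is the learning rate and

[00051] $$\Lambda = -\frac{S_c}{\psi} + \nabla S_c\, \dot{Z}_c .$$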
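The critic weight update can be sketched as a single explicit Euler step. The following is a minimal illustration, assuming a scalar cost-to-go per channel, Gaussian basis functions of the form exp(-‖z-μ‖²/w²), and a TD error written as the current cost plus the weighted regressor built from the basis value and its gradient. The five nodes, centers in [-0.5, 0.5], width 0.25, learning rate 0.01, discount constant 0.2, and initial weights 0.5 follow the simulation example. The input vector, its rate, the cost value, the step size, and all function names are hypothetical.

```python
import numpy as np


def rbf(z, centers, width):
    """Gaussian basis values S(z) and their Jacobian dS/dz (nodes x dim)."""
    diff = z - centers                                   # (nodes, dim)
    s = np.exp(-np.sum(diff**2, axis=1) / width**2)
    grad = (-2.0 / width**2) * diff * s[:, None]
    return s, grad


def critic_step(w_hat, cost, z, z_dot, centers, width, psi, lr, dt):
    """One Euler step of the gradient-descent critic update on 0.5 * td_error^2."""
    s, grad = rbf(z, centers, width)
    lam = -s / psi + grad @ z_dot        # regressor: -S/psi + (dS/dz) z_dot
    td_error = cost + w_hat @ lam        # TD error for the current weights
    w_next = w_hat - dt * lr * td_error * lam
    return w_next, td_error, lam


centers = np.column_stack([np.linspace(-0.5, 0.5, 5)] * 2)   # 5 nodes, 2-D input
w_hat = np.full(5, 0.5)
z = np.array([0.1, -0.2])          # hypothetical critic input
z_dot = np.array([0.05, 0.02])     # hypothetical input rate

w_next, delta, lam = critic_step(w_hat, cost=4.0, z=z, z_dot=z_dot,
                                 centers=centers, width=0.25,
                                 psi=0.2, lr=0.01, dt=0.01)
delta_next = 4.0 + w_next @ lam    # TD error re-evaluated with the updated weights
print(delta, delta_next)
```

Because the TD error is linear in the weights, a sufficiently small step along the negative gradient always reduces its magnitude.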

[0088] Design a Lyapunov function L.sub.c for the critic network as:

[00052] $$L_c = \frac{1}{2}\, \tilde{W}_c^T \tilde{W}_c \tag{52}$$

[0089] Using (51) and (52), one gets:

[00053] L ` c = W ` c T W ` ` c = W ` c T W ` ` c = - c W ` c T ( ( t ) + W c T ) ( 53 )

[0090] At the end of the reinforcement learning, the weight update law (51) will minimize the objective function (50) such that custom-character.sub.c.fwdarw.0 and

[00054] J - j .

It follows that:

[00055] ( t ) = W c T c + c - I ` c = W c T c + c - ( W c T c ( c ) + c ) ` c = - W c T + c ( 54 )

where .sub.c=.sub.c+.sub.c{grave over ()}.sub.c, .sub.c.sub.c,max, with >0 being the upper-bound.

[0091] Putting (54) into (53), one achieves:

[00056] L ` c = - c W ` c T ( W ` c c + c ) = - c T W ` c T W ` c - c W ` c T c - c T 2 W ` c T W ` c + c 2 c T c - c T 2 W ` c T W ` c + c 2 .Math. c , m ax .Math. 2 ( 55 )

[0092] Therefore, (55) represents a Lyapunov function for the critic network which has been modified by the weight update law (51) to minimize the objective function (50). This function (55) may be utilized by the actor-critic learning scheme to protect the UAVs from external disturbances, sensor deception attacks, and injection attacks on the actuators.

[0093] FIG. 1A depicts a block diagram of a system 100 for controlling a plurality of quadcopters, according to certain embodiments. The system 100 includes a user input device 102, a leader quadcopter 104, and a follower quadcopter 106. The user input device 102 is configured to receive a flight instruction from a user and deliver the flight instruction to a leader quadcopter. The leader quadcopter 104 is configured to receive a flight instruction from a user input device, calculate a leader formation maneuver, calculate a follower formation maneuver with a leader-follower formation controller configured to use affine transformations and stress matrices to calculate the follower formation maneuver, and communicate the follower formation maneuver to a follower quadcopter and execute the leader formation maneuver. The leader quadcopter 104 is a multirotor drone with four arms or booms, each with a rotor. Multirotor drones are unmanned aerial vehicles (UAV) with multiple rotors that are used to generate lift to enable the aircraft to fly. The follower quadcopter 106 is configured to receive the follower formation maneuver and execute the follower formation maneuver. In an implementation, the leader-follower formation controller is further configured to use an actor-critic machine learning algorithm to calculate the follower formation maneuver.

[0094] In the actor-critic control scheme, information about the lumped function is unavailable due to the existence of cyber-attacks, unknown external disturbances, and uncertain dynamics. The actor neural network may be deployed to approximate the lumped function as follows:

[00057] $$\mathcal{F}_f = W_a^T S_a(Z_a) + \varepsilon_a \tag{56}$$

[0095] The actor network's estimate of the lumped function is thus:

[00058] $$\hat{\mathcal{F}}_f = \hat{W}_a^T S_a(Z_a) \tag{57}$$

[0096] The current weight estimation error of the actor network is:

[00059] $$\tilde{\mathcal{F}}_f = \tilde{W}_a^T S_a(Z_a) \tag{58}$$

where W̃.sub.a=W.sub.a−Ŵ.sub.a.

[0097] Let J.sub.d=0 be the desired cost-to-go in the actor network. The error between Ĵ and J.sub.d is:

[00060] $$\tilde{J} = \hat{J} - J_d = \hat{J} \tag{59}$$

[0098] The total of the errors in the actor system is expressed as:

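[00061] $$\delta_a = \tilde{\mathcal{F}}_f + K_a \hat{J} \tag{60}$$

where K.sub.a is a diagonal gain matrix. The weight update law of the actor network is:

$$\dot{\hat{W}}_a = -\gamma_a \frac{\partial E_a}{\partial \hat{W}_a} \tag{61}$$

where γ.sub.a>0 is the learning rate and

[00062] $$E_a = \frac{1}{2}\, \delta_a^T \delta_a, \qquad \frac{\partial E_a}{\partial \hat{W}_a} = \frac{\partial E_a}{\partial \delta_a} \frac{\partial \delta_a}{\partial \tilde{\mathcal{F}}_f} \frac{\partial \tilde{\mathcal{F}}_f}{\partial \hat{W}_a} = \left( \tilde{\mathcal{F}}_f + K_a \hat{J} \right) S_a$$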

[0099] Because the lumped function is unknown, the weight update law is rewritten as:

[00063] $$\dot{\hat{W}}_a = -\gamma_a \left( \hat{W}_a^T S_a(Z_a) + K_a \hat{J} \right) S_a(Z_a) \tag{62}$$
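The rewritten actor update can be sketched in the same way. The example below is a minimal illustration for a single scalar channel, assuming the update drives the combined signal (actor output plus gain times estimated cost-to-go) toward zero. The learning rate 1.5 and the initial weights 0.5 follow the simulation example. The basis values, the gain `k_a`, the cost estimate `j_hat`, the step size, and the function name are hypothetical and are held constant only to make the contraction visible.

```python
import numpy as np


def actor_step(w_hat, s_a, j_hat, k_a, lr, dt):
    """One Euler step of the actor update: w <- w - dt * lr * (w's + k_a * j_hat) * s."""
    err = w_hat @ s_a + k_a * j_hat
    return w_hat - dt * lr * err * s_a, err


s_a = np.array([0.001, 0.135, 0.449, 0.027, 0.0])   # hypothetical basis values
w_hat = np.full(5, 0.5)
j_hat, k_a = 0.8, 1.0                               # hypothetical cost estimate and gain

_, err_initial = actor_step(w_hat, s_a, j_hat, k_a, lr=1.5, dt=0.01)
for _ in range(2000):
    w_hat, _ = actor_step(w_hat, s_a, j_hat, k_a, lr=1.5, dt=0.01)
err_final = w_hat @ s_a + k_a * j_hat
print(err_initial, err_final)
```

With frozen inputs the combined signal shrinks geometrically, which shows how the critic's cost estimate steers the actor weights.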

[0100] A candidate Lyapunov function for the actor network is selected as follows:

[00064] $$L_a = \frac{1}{2}\, \tilde{W}_a^T \tilde{W}_a \tag{63}$$

[0101] The time-derivative of L.sub.a yields:

[00065] $$\dot{L}_a = -\gamma_a \tilde{W}_a^T S_a \left( \hat{W}_a^T S_a(Z_a) + K_a \hat{J} \right) \tag{64}$$

[0102] Considering that Ĵ=W.sub.c.sup.TS.sub.c(Z.sub.c)+W̃.sub.c.sup.TS.sub.c(Z.sub.c), one gets:

[00066] $$\hat{J}^T \hat{J} \le 2 \left( W_c^T S_c \right)^T W_c^T S_c + 2 \left( \tilde{W}_c^T S_c \right)^T \tilde{W}_c^T S_c \tag{65}$$

[0103] Inserting (65) into (64):

[00067] L ` a = - a W ` a T a ( W a T a ( a ) - W ` a T a ( a ) + K a J ` ) = - a W ` a T a W a T a ( a ) + a .Math. W ` a .Math. 2 .Math. a .Math. 2 - a W ` a T a K c J ` - a W ` a T a W a T a ( a ) + a .Math. W ` a .Math. 2 .Math. a .Math. 2 - a 2 .Math. W ` a .Math. 2 .Math. a .Math. 2 + a 2 K a T K a .Math. J ` .Math. .Math. 2 - a 2 .Math. W ` a .Math. 2 .Math. a .Math. 2 + a 2 .Math. W a .Math. 2 .Math. a .Math. 2 - a 2 .Math. W ` a .Math. 2 .Math. a .Math. 2 + a K a T K a ( .Math. W c .Math. 2 .Math. c .Math. 2 + .Math. W ` c .Math. 2 .Math. c .Math. 2 ) ( 66 )

[0104] Using the actor estimation in (57), (43) becomes:

[00068] u f = f - 1 ( W ` a T a ( a ) + 3 f + 1 2 O f + 1 S ff - 1 z ` f + 2 S ff - 1 f - S ff - 1 q ` f * ) u f = N ( f ) u f ` f = O f T ( W ` a T a ( a ) + 3 f + 1 2 O f + 1 S ff - 1 z ` f + 2 S ff - 1 f - S ff - 1 q ` f * ) ( 67 )

[0105] Equation (42) can be evaluated as follows:

[00069] L ` 1 = O f T ` f = ` f - ` f + O f T f u f + O f T ( W a T ( a a ) + a - q ` f * + 1 S ff - 1 z ` f + 2 S ff - 1 f ) = S ` f + f N ( f ) S ` f + O f T ( W a T ( a a ) + a - q ` f * + 1 S ff - 1 z ` f + 2 S ff - 1 f ) - O f T ( W ` a T a ( a ) + 3 f + 1 2 O f + 1 S ff - 1 z ` f + 2 S ff - 1 - S ff - 1 q ` f * ) = - O f T 3 f - 1 2 O f T O f + O f T W ` a T a ( a ) b + O f T a + [ f N ( f ) + 1 ] ` f ( 68 )

[0106] Consider the following Young's inequalities:

[00070] O f T a o f T o f 2 + a T a 2 ( 69 ) - O f T W ` a T a ( a ) o f T o f T 2 + W a T a a T W a 2 ( 70 ) [0107] where >0 is a constant.

[0108] Using the inequalities above, (68) gives:

[00071] L ` 1 - O f T 3 f + O f T O f ( - 1 2 ) + W ` a T a a T W ` a 2 + a T a 2 + [ f N ( f ) + 1 ] ` f ( 71 )

[0109] For

[00072] = 1 2 : L ` 1 - m i n ( 3 ) .Math. i = N l + 1 N l + N f ( xi 2 c xi 2 - xi 2 + yi 2 c yi 2 - yi 2 ) + .Math. W ` a .Math. 2 .Math. a .Math. 2 2 + .Math. a .Math. 2 2 + [ f N ( f ) + 1 ] ` f ( 72 )

[0110] The scheme ensures that the negative effects of the cyber-attacks are diminished because the state constraints are not violated, and the quadrotors maintain operation within the safety boundaries and the tracking errors in the closed-loop system are bounded within a compact set. The leader-follower affine formation maneuver of the quadrotors is therefore realized.

[0111] The control scheme is proved by choosing a Lyapunov function candidate as follows:

[00073] L = L 1 + L a + L c ( 73 )

[0112] By differentiating L with respect to time:

[00074] L ` - m i n ( 3 ) .Math. i = N l + 1 N l + N f ( xi 2 c xi 2 - xi 2 + yi 2 c yi 2 - yi 2 ) - .Math. W ` a .Math. 2 .Math. a .Math. 2 ( a - 1 2 ) - .Math. W ` c .Math. 2 ( c 2 .Math. .Math. 2 - a .Math. K a .Math. 2 .Math. c .Math. 2 ) + a .Math. K a .Math. 2 .Math. W c .Math. 2 .Math. c .Math. 2 + a 2 .Math. W a 2 .Math. .Math. a .Math. 2 + c 2 .Math. c , m ax .Math. 2 + .Math. a .Math. 2 2 + [ f N ( f ) + 1 ] ` f ( 74 )

[0113] Based on Lemma 3:

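[00075] $$\log\left[ \frac{c_{xi}^2}{c_{xi}^2-\sigma_{xi}^2} \right] \le \frac{\sigma_{xi}^2}{c_{xi}^2-\sigma_{xi}^2} \tag{75}$$

$$\log\left[ \frac{c_{yi}^2}{c_{yi}^2-\sigma_{yi}^2} \right] \le \frac{\sigma_{yi}^2}{c_{yi}^2-\sigma_{yi}^2} \tag{76}$$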

[0114] Using (75) and (76), inequality (74) can be simplified as:

[00076] L ` h 1 L + h 2 + [ f N ( f ) + 1 ] S ` f where h 1 = min ( 2 m i n ( 3 ) , 2 .Math. a .Math. 2 ( a - 1 2 ) ) 2 ( c 2 .Math. .Math. 2 - a .Math. K a .Math. 2 .Math. c .Math. 2 ) ) h 2 = a .Math. K a .Math. 2 .Math. W c .Math. 2 .Math. c .Math. 2 + a 2 .Math. W a .Math. 2 .Math. a .Math. 2 + c 2 .Math. c , ma x .Math. 2 + .Math. a .Math. 2 2 and ( a - 1 2 ) > 0 ( c 2 .Math. .Math. 2 - a .Math. K a .Math. 2 .Math. c .Math. 2 ) > 0 ( 77 )

[0115] Integrating (77) over the interval [0, t] leads to:

[00077] $$L(t) \le \frac{h_2}{h_1} + L(0)\, e^{-h_1 t} + e^{-h_1 t} \int_0^t \left[ \rho\, N(\varsigma) + 1 \right] \dot{\varsigma}\, e^{h_1 \tau}\, d\tau \tag{78}$$

[0116] Selecting h.sub.3 as the upper-bound of

[00078] $$e^{-h_1 t} \int_0^t \left[ \rho\, N(\varsigma) + 1 \right] \dot{\varsigma}\, e^{h_1 \tau}\, d\tau$$

achieves:

[00079] $$L(t) \le \frac{h_2}{h_1} + L(0)\, e^{-h_1 t} + h_3 \tag{79}$$

[0117] Based on the inequality above, the tracking error signals will, in the end, remain in the compact sets defined by:

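[00080] $$\left\| \sigma_{xi} \right\| \le c_{xi} \sqrt{1 - e^{-2\left( \frac{h_{2i}}{h_{1i}} + h_{3i} + L_{xi}(0) \right)}} \tag{80}$$

$$\left\| \sigma_{yi} \right\| \le c_{yi} \sqrt{1 - e^{-2\left( \frac{h_{2i}}{h_{1i}} + h_{3i} + L_{yi}(0) \right)}} \tag{81}$$

$$\left\| \tilde{W}_a \right\| \le \sqrt{2\left( \frac{h_2}{h_1} + h_3 + L(0) \right)} \tag{82}$$

$$\left\| \tilde{W}_c \right\| \le \sqrt{2\left( \frac{h_2}{h_1} + h_3 + L(0) \right)} \tag{83}$$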
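The compact sets follow from (79) by inverting each term of the Lyapunov function: a bound B on a term of the form ½ log(c²/(c²−σ²)) gives |σ| ≤ c·sqrt(1−e^(−2B)), and the same bound on a term of the form ½‖W̃‖² gives ‖W̃‖ ≤ sqrt(2B). The following minimal sketch evaluates both expressions. The numerical values of h.sub.1, h.sub.2, h.sub.3, and L(0) are hypothetical, as are the function names.

```python
import math


def error_bound(c, h1, h2, h3, L0):
    """Ultimate bound on a constrained tracking error: c * sqrt(1 - exp(-2B))."""
    B = h2 / h1 + h3 + L0
    return c * math.sqrt(1.0 - math.exp(-2.0 * B))


def weight_bound(h1, h2, h3, L0):
    """Ultimate bound on a weight-error norm: sqrt(2B)."""
    return math.sqrt(2.0 * (h2 / h1 + h3 + L0))


c = 2.0  # constraint value from the simulation example
print(error_bound(c, h1=1.0, h2=0.5, h3=0.1, L0=0.2))   # stays strictly below c
print(weight_bound(h1=1.0, h2=0.5, h3=0.1, L0=0.2))
```

The error bound approaches the constraint c as B grows but never reaches it for finite B, so the tracking error remains inside the prescribed constraint.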

[0118] FIG. 1B illustrates several formation shape maneuvers of a plurality of quadcopters described in FIG. 1A. The formation shape maneuvers depicted in FIG. 1B are non-limiting examples of some of the possible affine formation maneuvers which may be joined in sequence to comprise the overall flight path of a plurality of quadcopters.

[0119] FIG. 2 depicts a nominal formation of the quadrotors 200, according to certain embodiments. The nominal formation 200 is a two-dimensional plot of the position of the quadcopters along the y-axis 202 versus time along the x-axis 204.

[0120] FIG. 3A depicts a graph 300 of leader-follower position trajectories in the x-axis over time, according to certain embodiments. FIG. 3B depicts the graph 324 of leader-follower position trajectories in the y-axis over time, according to certain embodiments. FIG. 3A shows the graph 300 obtained by plotting position trajectories in the x-axis along the y-axis 302 and time along the x-axis 304 and includes curves of first leader 306, second leader 308, third leader 310, first follower 312, second follower 314, third follower 316, fourth follower 318, k_c 320 and k_c 322. FIG. 3B shows the graph 324 obtained by plotting position trajectories in the y-axis along the y-axis 326 and time along the x-axis 328 and includes curves of first leader 330, second leader 332, third leader 334, first follower 336, second follower 338, third follower 340, fourth follower 342, k_c 344 and k_c 346. FIGS. 3A-3B illustrate the leader-follower time-varying trajectories of the quadrotors under sensor deception attacks and actuator injection attacks. It can be seen that the learning-based controller of the present disclosure keeps the quadrotors within the safety region despite the influence of cyber-attacks. Here, simulations are given to highlight the validity of the developed control method. The nominal formation of the quadrotors 200 is depicted in FIG. 2. The formation group comprises three leaders (N.sub.l=3) and four followers (N.sub.f=4). The quadrotors tagged (1, 2, and 3) and (4, 5, 6, and 7) represent the groups of leaders and followers, respectively, with corresponding nominal positions given as r.sub.1=[1 0], r.sub.2=[0.5 0.5], r.sub.3=[0.5 −0.5], r.sub.4=[0 0.5], r.sub.5=[0 −0.5], r.sub.6=[−0.5 0.5] and r.sub.7=[−0.5 −0.5] and

[00081] $$r = [r_1^T, r_2^T, \ldots, r_7^T]^T .$$

The stress matrix S is computed:

[00082] $$S = \begin{bmatrix} S_{ll} & S_{lf} \\ S_{fl} & S_{ff} \end{bmatrix} \tag{84}$$

where

$$S_{fl} = \begin{bmatrix} 0.1370 & -0.5482 & 0 \\ 0.1370 & 0 & -0.5482 \\ 0 & 0 & 0.1370 \\ 0 & 0.1370 & 0 \end{bmatrix}; \quad S_{lf} = \begin{bmatrix} 0.1370 & 0.1370 & 0 & 0 \\ -0.5482 & 0 & 0 & 0.1370 \\ 0 & -0.5482 & 0.1370 & 0 \end{bmatrix}$$

$$S_{ff} = \begin{bmatrix} 0.7538 & -0.0685 & -0.2741 & 0 \\ -0.0685 & 0.7538 & 0 & -0.2741 \\ -0.2741 & 0 & 0.2741 & -0.1370 \\ 0 & -0.2741 & -0.1370 & 0.2741 \end{bmatrix}; \quad S_{ll} = \begin{bmatrix} 0.2742 & -0.2741 & -0.2741 \\ -0.2741 & 0.6853 & 0 \\ -0.2741 & 0 & 0.6853 \end{bmatrix}$$
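The role of the stress matrix can be illustrated with a short numerical sketch. It assumes the standard affine-formation relation in which the follower targets are obtained from the leader positions as −S.sub.ff.sup.−1 S.sub.fl times the stacked leader positions; this relation is not restated in this passage and is used here as an assumption. The leader nominal positions [1 0], [0.5 0.5], and [0.5 −0.5] are assumed with these signs, and the affine map (`A`, `b`) and the function name are hypothetical.

```python
import numpy as np

# Blocks of the stress matrix from (84)
S_fl = np.array([[0.1370, -0.5482, 0.0],
                 [0.1370, 0.0, -0.5482],
                 [0.0, 0.0, 0.1370],
                 [0.0, 0.1370, 0.0]])
S_ff = np.array([[0.7538, -0.0685, -0.2741, 0.0],
                 [-0.0685, 0.7538, 0.0, -0.2741],
                 [-0.2741, 0.0, 0.2741, -0.1370],
                 [0.0, -0.2741, -0.1370, 0.2741]])


def follower_targets(leader_positions):
    """Follower targets (4 x 2) implied by leader positions (3 x 2)."""
    return -np.linalg.solve(S_ff, S_fl @ leader_positions)


r_leaders = np.array([[1.0, 0.0], [0.5, 0.5], [0.5, -0.5]])   # assumed signs
r_followers = follower_targets(r_leaders)
print(np.round(r_followers, 3))

# Hypothetical affine maneuver: scale and shear with A, translate with b
A = np.array([[1.5, 0.3], [0.0, 0.8]])
b = np.array([2.0, -1.0])
moved = follower_targets(r_leaders @ A.T + b)
print(np.round(moved - (r_followers @ A.T + b), 6))
```

Moving only the three leaders through an affine map moves the follower targets through the same map, which is why the leaders alone can specify scaling, shearing, and translation maneuvers for the whole formation.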

[0121] The initial positions of the quadrotors are set as q.sub.1(0)=[1 0], q.sub.2(0)=[0.5 1], q.sub.3(0)=[0.5 0.75], q.sub.4(0)=[0 0.75], q.sub.5(0)=[0 0.75], q.sub.6(0)=[0.75 1.5] and q.sub.7(0)=[0.75 1.25] and q(0)=[q.sub.1(0).sup.T, q.sub.2(0).sup.T, . . . , q.sub.7(0).sup.T].sup.T. To enforce safety for the group of quadrotors, the motion of each quadrotor is constrained within the safety region such that |x.sub.i|<k.sub.ci and |y.sub.i|<k.sub.ci, with k.sub.ci=15. It follows that the tracking errors are also constrained as |.sub.xi|<c.sub.xi and |y.sub.i|<c.sub.yi, with c.sub.xi=c.sub.yi=2 to avoid violation of the safety region. The sensor deception attacks are modeled as .sub.i=(1+cos (t)), the actuator injection attacks are modeled as m.sub.i=sin (q.sub.i(t){grave over (q)}.sub.i(t)). The controller gains are chosen as .sub.1=dia g 15,15, . . . ,15, .sub.2=diag 2,2, . . . ,2, .sub.3=diag 10,10, . . . ,10, .Math..sub.1=1.5, and .Math..sub.2=2.1. The parameters in the cost function are R=diag 5,5, . . . ,5, Q=diag 10,10, . . . ,10 and the discount factor =0.2. When defining the reinforcement learning parameters 5 nodes are considered in both actor and critic neural networks; the learning rates of the actor and critic neural networks are .sub.a=1.5 and .sub.c=0.01, respectively; the centers of the Gaussian functions of both actor and critic networks are selected between 0.5 and 0.5, while the width of both functions are set as 0.25; and the initial weights of the actor and critic networks are selected as {grave over (W)}.sub.ai(0)={grave over (W)}.sub.ci(0)=[0.5,0.5, . . . ,0.5].sup.T.

[0122] FIG. 4A depicts the graph 400 of tracking errors of followers in the x-axis, according to certain embodiments. FIG. 4B depicts the graph 418 of tracking errors of followers in the y-axis, according to certain embodiments. FIG. 4A shows the graph 400 obtained by plotting tracking errors in x-axis position along the y-axis 402 of the graph versus time in seconds along the x-axis 404 of the graph, and includes curves of C_x4 406, C_x6 408, C_x 410, C_x5 412, C_x7 414 and C_x 416. FIG. 4B shows the graph 418 obtained by plotting tracking errors in y-axis position along the y-axis 420 of the graph and time in seconds along x-axis 422 of the graph and includes curves of C_y4 424, C_y6 426, C_y 428, C_y5 430, C_y7 430 and C_y 432. Moreover, the leader-follower tracking errors did not violate the prescribed constraints, as depicted in FIG. 4.

[0123] FIG. 5A depicts the graph 500 of actor-critic learning control protocols of followers in the x-axis, according to certain embodiments. FIG. 5B depicts graph 514 of actor-critic learning control protocols of followers in y-axis position, according to certain embodiments. FIG. 5A shows the graph 500 obtained by plotting controllers in the x-axis position along the y-axis 502 of the graph and time in seconds along the x-axis of the graph 504 and includes curves of uf_x4 506, uf_x5 508, uf_x6 510, and uf_x7 512. FIG. 5B shows the graph 514 obtained by plotting controllers in y-axis position 516 along the y-axis of the graph and time in seconds 518 along the x-axis of the graph and includes curves of uf_y4 520, uf_y5 522, uf_y6 524, and uf_y7 526. The responses of the actor-critic learning-based control protocols that aid the realization of the aforementioned control performance are depicted in FIG. 5.

[0124] FIG. 6A-6H depict the graphs 600, 608, 616, 624, 632, 640, 648, 656 of Norm-2 of the actor weights, according to certain embodiments. FIG. 6A shows the graph 600 obtained by plotting x-axis weights along the y-axis 602 and time in seconds along the x-axis 604 and includes the curve of Wa_x4 606. FIG. 6B shows the graph 608 obtained by plotting x-axis weights along the y-axis 610 and time in seconds along x-axis 612 and includes the curve of Wa_x6 614. FIG. 6C shows the graph 616 obtained by plotting y-axis weights 618 along the y-axis and time in seconds 620 along x-axis and includes the curve of Wa_y4 622. FIG. 6D shows the graph 624 obtained by plotting y-axis weights along y-axis 626 and time in seconds along x-axis 628 and includes the curve of Wa_y6 630. FIG. 6E shows the graph 632 obtained by plotting x-axis weights along y-axis 634 and time in seconds along x-axis 636 and includes the curve of Wa_x5 638. FIG. 6F shows the graph 640 obtained by plotting x-axis weights along the y-axis 642 versus time in seconds along the x-axis 644 and includes the curve of Wa_x7 646. FIG. 6G shows the graph 648 obtained by plotting y-axis weights along the y-axis 650 versus time in seconds along x-axis 652 and includes the curve of Wa_y5 654. FIG. 6H shows the graph 656 obtained by plotting y-axis weights along the y-axis 658 versus time in seconds along x-axis 660 and includes the curve of Wa_y7 662. The evolution of the Norm-2 of the weight vectors of the actor and critic networks is depicted in FIGS. 6A-6H.

[0125] FIGS. 7A-7H depict the graphs 700, 708, 716, 724, 732, 740, 748, 756 of the Norm-2 of the critic weights, according to certain embodiments. FIG. 7A shows the graph 700 obtained by plotting x-axis weights along the y-axis 702 versus time in seconds along the x-axis 704 and includes the curve of Wc_x4 706. FIG. 7B shows the graph 708 obtained by plotting x-axis weights along the y-axis 710 versus time in seconds along the x-axis 712 and includes the curve of Wc_x6 714. FIG. 7C shows the graph 716 obtained by plotting y-axis weights along the y-axis 718 versus time in seconds along the x-axis 720 and includes the curve of Wc_y4 722. FIG. 7D shows the graph 724 obtained by plotting y-axis weights along the y-axis 726 versus time in seconds along the x-axis 728 and includes the curve of Wc_y6 730. FIG. 7E shows the graph 732 obtained by plotting x-axis weights along the y-axis 734 versus time in seconds along the x-axis 736 and includes the curve of Wc_x5 738. FIG. 7F shows the graph 740 obtained by plotting x-axis weights along the y-axis 742 versus time in seconds along the x-axis 744 and includes the curve of Wc_x7 746. FIG. 7G shows the graph 748 obtained by plotting y-axis weights along the y-axis 750 versus time in seconds along the x-axis 752 and includes the curve of Wc_y5 754. FIG. 7H shows the graph 756 obtained by plotting y-axis weights along the y-axis 758 versus time in seconds along the x-axis 760 and includes the curve of Wc_y7 762. The evolution of the Norm-2 of the weight vectors of the actor and critic networks is depicted in FIGS. 6A-6H and FIGS. 7A-7H, respectively.

[0126] FIG. 8 depicts the graph 800 of leader-follower affine formation maneuvers, according to certain embodiments. The graph 800 is obtained by plotting y(m) along the y-axis 802 versus x(m) along the x-axis 804 and includes the curves of 806 and 808. The affine formation maneuvers of the leader-follower multiple-quadrotor system are shown in FIG. 8.

[0127] FIG. 9 depicts a flowchart 900 of a method for controlling the flight of a plurality of quadcopters, according to certain embodiments. At step 902, a flight instruction is communicated from a user to a leader quadcopter. At step 904, a leader formation maneuver and a follower formation maneuver are calculated with a leader-follower formation controller configured to use affine transformations and stress matrices to convert a flight instruction into a leader formation maneuver and a follower formation maneuver. At step 906, the follower formation maneuver is communicated from the leader quadcopter to a follower quadcopter. At step 908, the leader formation maneuver is executed on the leader quadcopter. At step 910, the follower formation maneuver is executed on the follower quadcopter. In an implementation, the leader-follower formation controller is further configured to utilize a barrier function to calculate the leader formation maneuver and the follower formation maneuver. In an implementation, the barrier function is a Lyapunov candidate function which trends to infinity at a predetermined constraint value. In an implementation, the leader-follower formation controller is further configured to use an actor-critic learning mechanism. In an implementation, the actor-critic learning mechanism is a machine learning algorithm. In an implementation, the leader-follower formation controller is further configured to use a distributed sliding mode control. In an implementation, the distributed sliding mode control is configured to use a control law to protect at least one of the plurality of quadcopters from a malicious cyber-attack. In an implementation, the malicious cyber-attack is a distributed denial of service attack, a sensor deception attack, or an actuator injection attack. 
In an implementation, the distributed sliding mode control is configured to use a Nussbaum gain function to mitigate the effects of input gain on the follower formation maneuver, the input gain being created by a malicious cyber-attack. In an implementation, the follower formation maneuver comprises a scaling maneuver, a shearing maneuver, a translation maneuver, and a collinearity maneuver. In an implementation, the position of each of the plurality of quadcopters within the follower formation maneuver is defined by an x position, a y position, a z position, a roll angle, a yaw angle, and a pitch angle. In an implementation, the leader-follower formation controller is configured to use a radial basis function neural network to smooth a flight instruction after the flight instruction has been communicated from a user to a leader quadcopter.

[0128] FIG. 10 depicts a flowchart 1000 of a method for controlling the flight of a plurality of quadcopters, according to certain other embodiments. At step 1002, flight instructions are communicated from a user to a leader quadcopter. At step 1004, a leader formation maneuver and a follower formation maneuver are calculated with a leader-follower formation controller configured to use a distributed sliding mode control and an actor-critic learning mechanism to convert a flight instruction into a leader formation maneuver and a follower formation maneuver. At step 1006, the follower formation maneuver is communicated from the leader quadcopter to a follower quadcopter. At step 1008, the leader formation maneuver is executed on the leader quadcopter. At step 1010, the follower formation maneuver is executed on the follower quadcopter. In an implementation, the sliding mode control is configured to convert the flight instruction into a follower formation maneuver with affine transformations and stress matrices. In an implementation, the actor-critic learning mechanism is a machine learning algorithm, and the machine learning algorithm is further configured to use a radial basis function to smooth a flight instruction after the flight instruction has been communicated from a user to a leader quadcopter. In an implementation, the sliding mode control and the actor-critic learning mechanism are configured to use a control law to protect at least one of the plurality of quadcopters from a malicious cyber-attack. In an implementation, the distributed sliding mode control and the actor-critic learning mechanism are configured to use a Nussbaum gain function to mitigate the effects of input gain on a follower formation maneuver, the input gain being created by a malicious cyber-attack. In an implementation, the malicious cyber-attack is a distributed denial of service attack, a sensor deception attack, or an actuator injection attack.

[0129] The present disclosure investigates the actor-critic learning-based affine formation maneuver controls of a group of quadrotor UAVs while considering safety constraints and providing security against cyber-attacks. The leaders specify desired formation maneuvers through stress matrices and an affine transformation scheme. By constructing a barrier Lyapunov function, the operation of quadrotor UAVs within a predefined safety range is guaranteed. A distributed sliding mode control coupled with an actor-critic learning mechanism is designed to counter cyber-attacks and achieve the leader-follower affine formation maneuvers of the quadrotor UAVs. The critic system is used to estimate the objective function of the system whereas the actor system takes the proper control action necessary to achieve control objectives. A Lyapunov stability function is employed to prove that the closed-loop system is bounded. The provided example reveals that the method and system of the present disclosure are able to meet the control objective. Compared with traditional systems wherein input gains induced by attack signals are not considered, the present disclosure addresses these problems by integrating the Nussbaum function into the controller. As such, the controller of the present disclosure does not require any prior information about the signs of the input gains induced by the attack signals.

[0130] In conventional methods and systems, learning-based controllers use linear quadratic regulators; in contrast, the present disclosure teaches a distributed sliding mode control approach equipped with a learning mechanism. The learning mechanism may be a machine learning mechanism. A machine learning mechanism may be understood to be any formula, computer program, computer, computer system, program, network of computers, or the like, which is configured to develop, change, or improve its functionality with exposure to data such that improved performance of a particular task, or more desirable responses to particular stimuli, may be learned or developed by the mechanism.

[0131] The quadrotors of the present disclosure may be any unmanned aerial vehicle (UAV) which is configured to use spinning rotors to generate thrust. The control scheme of the present disclosure may be applied to any plurality of UAVs, including but not limited to helicopters, dicopters, tricopters, quadcopters, hexacopters, and octocopters.

[0132] The cyber-attack of the present disclosure may be any malicious digital signal which is configured to interrupt the operation of at least one UAV of the present disclosure. The cyber-attack may be, amongst other attacks, a distributed denial of service attack, a sensor deception attack, or an actuator injection attack.

[0133] Numerous modifications and variations of the present disclosure are possible in light of the above teachings. It is therefore to be understood that within the scope of the appended claims, the invention may be practiced otherwise than as specifically described herein.