Generative adversarial network-based optimization method and application

11551098 · 2023-01-10

Abstract

The present invention discloses a generative adversarial network-based optimization (GAN-O) method. The method includes: transforming an application into a function optimization problem; establishing a GAN-based function optimization model based on a test function and a test dimension of the function optimization problem, including constructing a generator G and a discriminator D based on the GAN; training the function optimization model by training the discriminator and the generator alternately, to obtain a trained function optimization model; and using the trained function optimization model to perform iterative calculation to obtain an optimal solution. In this way, the optimal solution is obtained based on the GAN. The present invention can improve the parameter training process of a deep neural network to obtain a better local optimal solution in a shorter time, making the training of the deep neural network more stable and yielding better local search results.

Claims

1. A generative adversarial network-based optimization (GAN-O) method, comprising: (A1) transforming an application into a function optimization problem; (A2) establishing a GAN-based function optimization model according to a test function and a test dimension of the function optimization problem, comprising constructing a generator G and a discriminator D based on the GAN, wherein inputs of the discriminator D are two vectors whose sizes are the same as the test dimension, and an output is a scalar whose value is 1 or 0, indicating whether the latter vector is better than the former vector; an input of the generator is a vector whose size is the test dimension plus a noise dimension, and an output is a vector indicating a guiding direction, whose size is equal to the test dimension; and training the function optimization model by training the discriminator and the generator, to obtain a trained function optimization model; (A3) using the trained function optimization model for iterative calculation; (A31) before iteration, randomly initializing a current solution; (A32) performing the following operations during each iteration: (A321) sampling a motion vector from the generator, calculating a generative solution, performing estimation by using a fitness value function, calculating a loss function of the discriminator, and updating network parameters of the discriminator; (A322) fixing the network parameters of the discriminator, connecting the discriminator to the generator, calculating a loss function of the generator, and updating network parameters of the generator accordingly; (A323) using the generator to generate a new solution and combining it with the current solution to obtain all candidate solutions; (A324) calculating fitness values of all the candidate solutions, and assigning retention probabilities based on the fitness values; (A325) selecting, based on the retention probabilities, solutions of the same quantity as the current solutions from the candidate solutions as new current solutions; and (A326) stopping the iteration when a maximum number of iterations is reached, wherein the solutions retained at that point are the optimal solutions; through the foregoing steps, optimal solutions are obtained based on the GAN.

2. The GAN-O method according to claim 1, wherein step (A1) of transforming an application into a function optimization problem specifically comprises: determining the test function, namely a function representing the function optimization problem of the application; determining the test dimension; and setting the noise dimension; (1a) determining a to-be-optimized test function, comprising a function expression, a domain denoted as D, and a function input dimension denoted as ∥x∥, wherein after an independent variable x meeting requirements is input, the test function outputs a fitness value F(x) of the independent variable; (1b) setting a test dimension ∥x∥, namely the function input dimension; and (1c) setting a noise dimension ∥z∥ that controls the size of a noise vector and is connected to an input solution as an input when the generative network generates the solution.

3. The GAN-O method according to claim 1, wherein the application in step (A1) comprises but is not limited to logistics distribution, machine learning or deep learning.

4. The GAN-O method according to claim 1, wherein for the logistics distribution problem, step (A1) aims to optimize the logistics speed or efficiency by using the existing facilities, so that the total logistics speed is the highest or the total distribution time is the shortest; and specifically, there are a total of M delivery persons in one distribution station, the delivery speed of each delivery person is the same, N locations are covered by the distribution station, and distances between every two distribution locations form matrix T, wherein T(i,j) represents the time for a delivery person to go from location i to location j; and transforming the logistics distribution problem into a discrete function optimization problem specifically comprises the following operations: S1: designing a coding scheme for solutions to form a code x.sub.i ϵ ∥N∥, i=1, 2, . . . , M of a solution, which specifically comprises: (S1a) sorting locations assigned to delivery persons to form a digital code string denoted as x.sub.i ϵ ∥N∥, i=1, 2, . . . , M, wherein x.sub.i represents a location assigned to the i-th delivery person; M is a total quantity of delivery persons; the maximum dimension of x.sub.i is ∥N∥; and if a quantity of locations assigned to a delivery person is less than N, zero is filled in for an unassigned location, indicating that no delivery is required; (S1b) connecting codes of M delivery persons to form a complete solution: x=[x.sub.1, x.sub.2, . . . , x.sub.M]; S2: determining a solution space, namely the domain, wherein: (S2a) the dimension of the solution is equal to the sum of the delivery codes of all delivery persons, that is, ∥x∥=N*M; (S2b) a value of x.sub.i ranges from 0 to N, wherein 0 represents that no delivery is required; a number ranging from 1 to N indicates delivery to the location represented by the number, and the number can appear only once in all dimensions of this solution; and the domain of the solution is expressed as follows: x.sub.i∈Z.sup.+, 0≤x.sub.i≤N, for i=1, 2, . . . , N*M, (Σ.sub.i=1.sup.N*M(1 if x.sub.i==k else 0))=1, ∀k∈Z.sup.+, 1≤k≤N, wherein Z.sup.+ represents the set of positive integers, N is a quantity of locations, and M is a quantity of delivery persons; S3: determining an objective function, namely the test function, wherein M delivery persons deliver goods at the same time, and the total delivery time is the time spent by the delivery person who completes the delivery last, that is, the objective function is expressed as follows: T(x)=max.sub.i=0.sup.M−1(Σ.sub.j=1.sup.N−1 t(x.sub.i*N+j, x.sub.i*N+j+1)), wherein t(k,l)=T(k,l) if k≥1 and l≥1, and t(k,l)=0 otherwise; and k, l stand for x.sub.i*N+j, x.sub.i*N+j+1.

5. The GAN-O method according to claim 1, wherein step (A2) of establishing a network structure and a loss function of the discriminator, and receiving, by the discriminator D, two input solutions x.sub.1 and x.sub.2, and outputting D (x.sub.1, x.sub.2), wherein the output presents comparison between fitness values of x.sub.2 and x.sub.1, specifically comprises the following operations: (2a) for the two input solutions x.sub.1 and x.sub.2, using a 4-layer fully connected network to extract features, wherein the dimension of an input layer of D.sub.f is the dimension of the solution, namely the test dimension; the dimension of the second layer is 64, the dimension of the third layer is 64, and the dimension of the fourth layer is 10; and an activation function of each layer is a rectified linear unit (ReLU), that is, x.sub.1 and x.sub.2 are received through D.sub.f, two 10-dimensional vectors are extracted as features of x.sub.1 and x.sub.2; (2b) performing subtraction between the two extracted vectors; (2c) sending a vector obtained after the subtraction to a 3-layer fully connected network D.sub.c, wherein the dimension of an input layer of D.sub.c is 10, the dimension of the second layer is 10, and the dimension of the third layer is 1; an activation function of the second layer is an ReLU, and an activation function of an output layer is Sigmoid; and finally a scalar ranging from 0 to 1 is output through D.sub.c; (2d) expressing the output of the discriminator as follows:
D(x.sub.1, x.sub.2)=D.sub.c(D.sub.f(x.sub.2)−D.sub.f(x.sub.1)) ; (2e) using a cross-entropy loss function as the loss function of the discriminator, and using Adam algorithm as an optimization algorithm.

6. The GAN-O method according to claim 1, wherein step (A2) of establishing a network structure and a loss function of the generator, wherein the generator G receives inputs, comprising a current solution x.sub.cur, noise z, and a step size L, and outputs a motion vector G(x.sub.cur, z, L), specifically comprises the following operations: (3a) connecting the input current solution x.sub.cur and the noise z to obtain a vector whose dimension is equal to the sum of the test dimension and the noise dimension; (3b) sending the vector to a 3-layer fully connected network G.sub.d, and outputting a direction vector with the same dimension as a solution vector to guide subsequent motion directions; wherein the dimension of an input layer of G.sub.d is equal to the sum of the dimension of the solution vector and the dimension of a noise vector, the dimension of the second layer is 64, and the dimension of an output layer is equal to the dimension of the solution vector; an activation function of the second layer is an ReLU, and an activation function of the output layer is a hyperbolic tangent function (Tanh); (3c) calculating the dot product of the direction vector obtained in (3b) and the input step size, to obtain a motion vector as the final output of the entire generator, wherein the motion vector represents a specific motion direction and distance, and is used to guide how the current solution moves to obtain a better fitness value; (3d) expressing the output of the generator as follows:
G(x.sub.cur, z, L)=L·G.sub.d([x.sub.cur.sup.T, z.sup.T].sup.T); (3e) using a cross-entropy loss function as the loss function of the generator, and using Adam algorithm as an optimization algorithm.

7. The GAN-O method according to claim 1, wherein step (A31) of randomly initializing a current solution S.sub.cur through global random search, and initializing a step size L specifically comprises the following operations: (4a) in global random search GS, randomly sampling solutions through standard Gaussian distribution, and then using formula 1 to move the solutions to a solution space:
GS(x.sub.gs)=B.sub.l+(B.sub.u−B.sub.l)* x.sub.gs, x.sub.gs˜P.sub.gaussian(0,1)   Formula 1 wherein x.sub.gs˜P.sub.gaussian(0,1) represents that x.sub.gs is randomly sampled through standard Gaussian distribution; and B.sub.l and B.sub.u represent a lower limit and an upper limit of the domain D of the test function; (4b) for current solutions x.sub.cur in the set S.sub.cur, using the test function to calculate corresponding fitness values F(x.sub.cur), and sorting them in ascending order of the fitness values to obtain an optimal current solution x.sub.best and its fitness value F(x.sub.best); and (4c) initializing the step size L to 50, setting a currently optimal fitness value L.sub.best_F recorded in a step controller to F(x.sub.best), and setting a counter L.sub.cnt of the step controller to 0.

8. The GAN-O method according to claim 1, wherein the training the function optimization model and using a trained model for iterative calculation specifically comprises the following operations: (5) generating a data set T.sub.D for training the discriminator: (5a) using the generator (GE), local random search (LS), and global random search (GS) respectively to generate a new solution x.sub.gen based on the current solution x.sub.cur selected from a current solution set S.sub.cur, and calculating a fitness value of x.sub.gen; comparing the generated new solution x.sub.gen with the current solution x.sub.cur based on which the new solution is generated, and labeling it based on fitness values (labeling it as 1 when the fitness value of x.sub.gen is less than or equal to that of x.sub.cur; otherwise, labeling it as 0) as the data for training the discriminator; (5b) pooling the solutions generated by using the three methods to obtain new solutions, and calculating their fitness values; (5c) combining each new solution obtained in (5b) and the current solution based on which the new solution is generated into a data pair, and comparing their fitness values; and if the fitness value of the new solution is less than or equal to the fitness value of the current solution, labeling the data pair as 1, otherwise, labeling it as 0; (5d) generating the training data set T.sub.D, which is expressed as follows:
T.sub.D ={(x.sub.cur, x.sub.gen, 1 if F(x.sub.gen)≤F(x.sub.cur) else 0)}  Formula 4 wherein x.sub.gen˜GS(x.sub.gs), GE(x.sub.cur), LS(x.sub.cur), x.sub.cur ϵ S.sub.cur; (6) using the training set T.sub.D to train the discriminator: (6a) sequentially taking m pieces of training data {(x.sub.cur.sup.i, x.sub.gen.sup.i, y.sup.i), i=1,2, . . . , m } in batches from the training set T.sub.D, and calculating a loss function according to formula 5:
loss.sub.D=−Σ.sub.i=1.sup.my.sup.i* log.sub.2(D(x.sub.cur.sup.i, x.sub.gen.sup.i))   Formula 5 wherein D represents the discriminator, and D(x.sub.cur.sup.i, x.sub.gen.sup.i) represents an output of the discriminator when inputs are x.sub.cur.sup.i and x.sub.gen.sup.i; (6b) calculating a gradient g.sub.D based on the loss function according to formula 6: g.sub.D=(1/m)∇.sub.θ.sub.D loss.sub.D   Formula 6 wherein θ.sub.D represents a network parameter of the discriminator; (6c) updating the network parameter of the discriminator by using Adam algorithm: θ.sub.D=θ.sub.D−μ·ŝ/(√r̂+δ)   Formula 7 wherein μ is a learning step size, which is set to 0.001 in the present invention; δ is a small constant for numerical stability; ŝ represents correction of the first-moment deviation; and r̂ represents correction of the second-moment deviation; (7) training the generator by performing the following operations: (7a) setting the network parameter of the discriminator to un-trainable, and connecting the discriminator to the generator to obtain a network C, whose input is the input of the generator and whose output is the output of the discriminator, wherein the specific connection method is as follows: step 1: entering the current solution, noise, and step size into the generator and running the generator; step 2: adding a motion vector generated by the generator to the current solution to obtain a generative solution; step 3: using the current solution and the generative solution obtained in step 2 as the inputs of the discriminator, and using the output of the discriminator as the output of the network C, as shown in formula 8:
C(x.sub.cur, z, L)=D(x.sub.cur, x.sub.cur+G(x.sub.cur, z, L))   Formula 8 (7b) using current solutions batch-sampled from S.sub.cur, random Gaussian noise, and a step size as inputs of C, and running C to obtain output C(x.sub.cur, z, L); (7c) calculating a loss function of the network C according to formula 9:
loss.sub.C=−Σ.sub.i=1.sup.m1* log(C(x.sub.cur.sup.i, z, L))   Formula 9 wherein m is a quantity of samples each time; (7d) calculating a gradient g.sub.G based on the loss function according to formula 10, wherein a network parameter of C is the network parameter of the generator: g.sub.G=(1/m)∇.sub.θ.sub.G loss.sub.C   Formula 10 wherein θ.sub.G represents the network parameter of the generator; (7e) updating the network parameter of the generator G according to formula 11: θ.sub.G=θ.sub.G−μ·ŝ/(√r̂+δ)   Formula 11 (8) selecting and retaining current solutions by performing the following steps: (8a) combining a current solution set and a generative solution set to form a candidate solution set S.sub.candidate; (8b) for each solution in the candidate solution set, calculating its retention probability based on the order of its fitness value in all candidate solutions, as shown in formula 12: P.sub.retain(x)=r.sub.F(x).sup.−α/Σ.sub.i=1.sup.∥S.sub.candidate∥ r.sub.F(x.sub.i).sup.−α   Formula 12 wherein P.sub.retain(x) represents a retention probability of solution x; r.sub.F(x) represents the order of a fitness value of solution x in all candidate solutions arranged in ascending order; ∥S.sub.candidate∥ represents a size of the candidate solution set; and α is a parameter that controls the probability distribution; (8c) selecting a solution based on the retention probability of each solution, retaining the selected solution, and updating S.sub.cur; (9) reducing a step size L by performing the following steps: (9a) checking whether a difference between a fitness value of an optimal solution x.sup.* in the current S.sub.cur and L.sub.best_F is less than a sufficiently small constant:
|F(x.sup.*)−L.sub.best_F|≤ε, wherein ε is a sufficiently small constant; if so, performing (9b); otherwise, setting L.sub.cnt to 0 and performing (9c); (9b) increasing L.sub.cnt by 1, checking whether L.sub.cnt is greater than a specified threshold T, and if so, multiplying the step size L by an attenuation factor γ; (9c) updating L.sub.best_F to the fitness value F(x.sup.*) of the optimal solution x.sup.* in the current S.sub.cur; and (10) determining whether the number of current iterations reaches the maximum number of iterations, and if so, outputting the current optimal solution; otherwise, going back to step (5).
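The rank-based retention rule of step (8b) and the step-size controller of step (9) can be sketched as follows (an illustrative Python/numpy sketch; the names `retention_probs` and `decay_step`, and the example values of ε, T, and γ, are assumptions for illustration, not part of the claims):

```python
import numpy as np

def retention_probs(fitness, alpha=1.0):
    """Retention probability P(x) proportional to r_F(x)^(-alpha),
    where r_F(x) is the 1-based rank of x in ascending fitness order."""
    fitness = np.asarray(fitness, dtype=float)
    ranks = np.empty(len(fitness))
    ranks[np.argsort(fitness)] = np.arange(1, len(fitness) + 1)
    weights = ranks ** (-alpha)
    return weights / weights.sum()

def decay_step(L, L_cnt, L_best_F, best_F, eps=1e-6, T=10, gamma=0.9):
    """Step controller: if the best fitness has stalled (difference below
    eps) for more than T consecutive checks, multiply L by gamma;
    otherwise reset the counter and record the new best fitness."""
    if abs(best_F - L_best_F) <= eps:
        L_cnt += 1
        if L_cnt > T:
            L *= gamma
    else:
        L_cnt = 0
        L_best_F = best_F
    return L, L_cnt, L_best_F
```

With α > 0, lower-fitness (better) candidates receive higher retention probability, while worse candidates keep a nonzero chance of survival, preserving diversity.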

9. The GAN-O method according to claim 8, wherein step (5a) of using the generator G to generate a solution comprises: using the generator to generate a motion vector based on the current solution x.sub.cur, and adding it to the current solution to obtain a generative solution; and the specific steps are as follows: step 1: performing random sampling through standard Gaussian distribution to obtain a noise vector; step 2: inputting the current solution, the noise vector obtained in step 1, and a step size into the generator, and running the generator to generate the motion vector; step 3: adding the motion vector obtained in step 2 to the current solution to obtain the generative solution, as shown in Formula 2:
GE(x.sub.cur)=x.sub.cur+G(x.sub.cur, z, L), z˜P.sub.gaussian(0,1)  Formula 2 wherein z represents the noise vector; and L represents the step size.

10. The GAN-O method according to claim 8, wherein step (5a) of using local random search to obtain a generative solution comprises: performing random Gaussian sampling around the current solution x.sub.cur to obtain the generative solution; and the specific steps are as follows: step 1: performing random sampling through standard Gaussian distribution to obtain a direction vector; step 2: multiplying the direction vector obtained in step 1 by the step size to obtain a motion vector; and step 3: adding the motion vector obtained in step 2 to the current solution to obtain the generative solution, as shown in formula 3:
LS(x.sub.cur)=x.sub.cur+L*d, d˜P.sub.gaussian(0,1)  Formula 3 wherein d represents the direction vector; and L represents the step size.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 is a flowchart according to the present invention.

(2) FIG. 2 is a schematic diagram of data flows in a generator and a discriminator in specific implementation of the present invention.

(3) FIG. 3 is a structural diagram of a discriminator according to the present invention.

(4) FIG. 4 is a structural diagram of a generator according to the present invention.

DETAILED DESCRIPTION

(5) The following further describes the present invention through embodiments with reference to the accompanying drawings, but the scope of the present invention is not limited by any means.

(6) The present invention proposes a new algorithm framework for GAN-based function optimization, to address the lack of diversity of local search in function optimization. FIG. 1 shows the general process of the method in the present invention.

(7) (1) For a given test function set, construct a generator network and a discriminator network.

(8) (2) Randomly initialize a current solution and a direction vector.

(9) (3) Calculate a loss function of the discriminator network based on the current solution and the direction vector, and update parameters of the discriminator network.

(10) (4) Determine the parameters of the discriminator network, connect the discriminator to the generator network, calculate a loss function of the generator network based on the current solution and randomly generated noise, and update parameters of the generator network.

(11) (5) Determine whether the maximum number of estimations is reached, and if so, stop iteration and output a current solution; otherwise, go back to step (3). According to the present invention, a discriminator network and a generator network are trained alternately, where the discriminator network is used to determine whether one solution is superior to another solution, and the generator network is used to generate a guiding vector (GV) from a given solution, and add the GV to a current solution to obtain a new solution.

(12) The following uses logistics distribution as an example to describe in detail the specific implementation of the method for GAN-based function optimization according to the present invention.

(13) Step 1: Determine a test function and a test dimension to be input, and set a noise dimension.

(14) (1a) Determine the to-be-optimized function, including a function expression, a domain D, and a function input dimension ∥x∥. After an independent variable x meeting the requirements is input, the test function outputs a fitness value F(x) of the independent variable. Generally, function optimization seeks an independent variable solution x* that minimizes the fitness value of the function, that is:
F(x*)≤F(x),∀x∈D(F);
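For concreteness, a standard benchmark such as the sphere function satisfies this definition (the sphere function is offered here only as an illustrative test function; it is not named in the text):

```python
import numpy as np

def sphere(x):
    """Sphere benchmark: F(x) = sum of x_i^2. The global optimum is
    x* = 0 with fitness F(x*) = 0, so F(x*) <= F(x) for all x."""
    return float(np.sum(np.asarray(x, dtype=float) ** 2))
```

Any function with a computable fitness value and a known domain can play the same role as the test function.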

(15) (1b) Set a test dimension ∥x∥ (the function input dimension) as required.

(16) (1c) Set a noise dimension ∥z∥. The noise dimension controls a size of a noise vector, and is connected to an input solution as an input when the generative network generates the solution. In the present invention, the noise dimension is set to one-third of the test dimension.

(17) In specific implementation of the present invention, transforming the logistics distribution problem into a function optimization problem includes the following operations:

(18) The logistics distribution problem is a discrete function optimization problem, aiming to minimize the delivery time. For example, there are a total of M delivery persons in one distribution station, the delivery speed of each delivery person is the same, and N locations are covered by the distribution station. Assuming that the distance from the distribution station to each location is the same, the distance from the distribution station to the first delivery location can be ignored, and distances between every two distribution locations form matrix T, where T(i, j) represents the time for a delivery person to go from location i to location j. Logistics distribution is to arrange the delivery persons as reasonably as possible to minimize the total delivery time. The steps for transforming the logistics distribution problem into the function optimization problem are as follows:

(19) 1: Design a coding scheme for solutions to form a code x.sub.i∈∥N∥, i=1, 2, . . . , M of a solution.

(20) (1a) Because each delivery person can only deliver goods in order, sort locations assigned to each delivery person in sequence to form a digital code string denoted as x.sub.i∈∥N∥, i=1, 2, . . . , M. x.sub.i represents a code of a location assigned to the i-th delivery person; and M is a total quantity of delivery persons. Since it is possible to assign all locations to one delivery person, the maximum dimension of x.sub.i is ∥N∥. If a delivery person is assigned less than N delivery locations, add 0 for an unassigned location, indicating that no delivery is required.

(21) (1b) Connect codes of M delivery persons to form a complete solution: x=[x.sub.1, x.sub.2, . . . , x.sub.M].

(22) 2: Determine a solution space, namely the domain.

(23) (1a) The dimension of the solution is equal to the sum of delivery codes of each delivery person, that is, ∥x∥=N*M.

(24) (1b) Because the problem is discrete, each dimension x.sub.i of x must range from 0 to N (0 indicates that no delivery is required; and a number ranging from 1 to N indicates delivery to a location represented by the number), and the number can only appear once in all dimensions of this solution. This can be expressed as follows:

(25) x.sub.i∈Z.sup.+, 0≤x.sub.i≤N, for i=1, 2, . . . , N*M, (Σ.sub.i=1.sup.N*M(1 if x.sub.i==k else 0))=1, ∀k∈Z.sup.+, 1≤k≤N,

(26) where Z.sup.+ represents a set of positive integers, N is a quantity of locations, and M is a quantity of delivery persons.

(27) 3: Determine an objective function.

(28) (1a) Because M delivery persons deliver goods at the same time, the total delivery time is the time spent by a delivery person who completes the delivery last, that is:

(29) T(x)=max.sub.i=0.sup.M−1(Σ.sub.j=1.sup.N−1 t(x.sub.i*N+j, x.sub.i*N+j+1)), where: t(k,l)=T(k,l) if k≥1 and l≥1, and t(k,l)=0 otherwise.

(30) Then the objective function is:

(31) T(x)=max.sub.i=0.sup.M−1(Σ.sub.j=1.sup.N−1 t(x.sub.i*N+j, x.sub.i*N+j+1)), where: t(k,l)=T(k,l) if k≥1 and l≥1, and t(k,l)=0 otherwise;

(32) the dimension of the solution is ∥x∥=N*M; and the domain is:

(33) x.sub.i∈Z.sup.+, 0≤x.sub.i≤N, for i=1, 2, . . . , N*M, (Σ.sub.i=1.sup.N*M(1 if x.sub.i==k else 0))=1, ∀k∈Z.sup.+, 1≤k≤N.
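The coding scheme and objective above can be sketched as follows (an illustrative Python sketch; the function name `delivery_time`, the 0-indexed segment layout, and the small example instance are assumptions for illustration, not part of the specification):

```python
def delivery_time(x, T, M, N):
    """Objective T(x): total delivery time is the time of the slowest
    delivery person. x is a flat code of length N*M; segment i (of
    length N) lists the locations visited by delivery person i in
    order, with 0 meaning 'no delivery' for that slot."""
    def t(k, l):
        # Travel time between coded locations; 0-coded (unassigned)
        # slots contribute zero cost, as in the definition of t(k, l).
        return T[k - 1][l - 1] if k >= 1 and l >= 1 else 0

    worst = 0
    for i in range(M):
        seg = x[i * N:(i + 1) * N]
        total = sum(t(seg[j], seg[j + 1]) for j in range(N - 1))
        worst = max(worst, total)
    return worst
```

For example, with N=2 locations, M=2 delivery persons, and a symmetric travel-time matrix, assigning both locations to one person costs the trip between them, while splitting them costs nothing beyond the ignored station-to-location legs.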

(34) Step 2: Establish a network structure of a discriminator.

(35) (2a) As shown in FIG. 3, the discriminator receives two input solutions x.sub.1 and x.sub.2, and outputs D(x.sub.1, x.sub.2). An output nearer to 1 indicates a higher probability that the fitness value of x.sub.2 is better than the fitness value of x.sub.1, that is, F(x.sub.2)≤F(x.sub.1); an output nearer to 0 indicates a higher probability that the first solution is better than the second solution.

(36) (2b) For the two input solutions x.sub.1 and x.sub.2, use a same 4-layer fully connected network (denoted as D.sub.f) to extract features. The dimension of an input layer of the fully connected network is the solution dimension (that is, the test dimension), the dimension of the second layer is 64, the dimension of the third layer is 64, and the dimension of the fourth layer is 10. An activation function of each layer is a rectified linear unit (ReLU). The fully connected network receives the two input solutions x.sub.1 and x.sub.2, and extracts two 10-dimensional vectors as their respective features.

(37) (2c) Subtract the vector of the first solution from the vector of the second solution.

(38) (2d) Send a vector obtained after the subtraction to a 3-layer fully connected network. The dimension of an input layer of the fully connected network (denoted as D.sub.c) is 10, the dimension of the second layer is 10, and the dimension of the third layer is 1. An activation function of the second layer is an ReLU, and an activation function of an output layer is Sigmoid. After receiving a 10-dimensional vector obtained after the subtraction in (2c), the fully connected network finally outputs a scalar ranging from 0 to 1.

(39) (2e) The output of the discriminator can be expressed as follows:
D(x.sub.1,x.sub.2)=D.sub.c(D.sub.f(x.sub.2)−D.sub.f(x.sub.1));

(40) (2f) Use a cross-entropy loss function as a loss function of the discriminator, use Adam algorithm as an optimization algorithm, and set a learning rate to 0.001.
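The discriminator forward pass of (2b) through (2e) can be sketched with plain numpy (untrained, randomly initialized weights; the helper `init_mlp`, the seed, the 0.1 weight scale, and the example test dimension of 5 are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda v: np.maximum(v, 0.0)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

def init_mlp(dims):
    """Random weights/biases for a fully connected stack (illustrative init)."""
    return [(rng.normal(0, 0.1, (a, b)), np.zeros(b)) for a, b in zip(dims, dims[1:])]

dim = 5                              # example test dimension
Df = init_mlp([dim, 64, 64, 10])     # shared feature extractor D_f, ReLU layers
Dc = init_mlp([10, 10, 1])           # classifier D_c: ReLU hidden, Sigmoid output

def feat(x):
    # D_f: 4-layer fully connected net, ReLU on every layer, 10-dim features
    for W, b in Df:
        x = relu(x @ W + b)
    return x

def D(x1, x2):
    """D(x1, x2) = D_c(D_f(x2) - D_f(x1)): probability that x2 is better."""
    h = feat(x2) - feat(x1)
    (W1, b1), (W2, b2) = Dc
    return float(sigmoid(relu(h @ W1 + b1) @ W2 + b2)[0])
```

A consequence of the shared feature net and subtraction is that identical inputs always yield exactly 0.5, i.e. "no preference", even before training.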

(41) Step 3: Establish a network structure of a generator.

(42) (3a) As shown in FIG. 4, the generator receives inputs such as a current solution (denoted as x.sub.cur), noise (denoted as z), and a step size (denoted as L), and outputs a motion vector G(x.sub.cur, z, L). The motion vector is used to guide how the current solution moves to obtain a better fitness value. This vector will be added to the input current solution to generate a new solution.

(43) (3b) Connect the input current solution x.sub.cur and noise z to obtain a new vector whose dimension is equal to the sum of the solution vector dimension and the noise vector dimension.

(44) (3c) Send the vector obtained in (3b) to a 3-layer fully connected network. The dimension of an input layer of the fully connected network (denoted as G.sub.d) is the sum of the solution vector dimension and the noise vector dimension, the dimension of the second layer is 64, and the dimension of an output layer is the solution vector dimension. An activation function of the second layer is an ReLU, and an activation function of the output layer is a hyperbolic tangent function (Tanh). The fully connected network receives the vector obtained in (3b), and outputs a direction vector with the same dimension as a solution vector to guide subsequent motion directions.

(45) (3d) Calculate the dot product of the direction vector obtained in (3c) and the input step size to obtain a motion vector as a final output of the entire generator, where the motion vector represents a specific motion direction and distance.

(46) (3e) The output of the generator can be expressed as follows:
G(x.sub.cur,z,L)=L·G.sub.d([x.sub.cur.sup.T,z.sup.T].sup.T);

(47) (3f) Use a cross-entropy loss function as a loss function of the generator, use Adam algorithm as an optimization algorithm, and set a learning rate to 0.001.
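The generator forward pass of (3b) through (3e) can be sketched similarly (untrained, randomly initialized weights; the example dimensions 6 and 2 and the 0.1 weight scale are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda v: np.maximum(v, 0.0)

dim, noise_dim = 6, 2                # example test and noise dimensions
# G_d: 3-layer net, input = dim + noise_dim, hidden 64 (ReLU), output dim (Tanh)
W1 = rng.normal(0, 0.1, (dim + noise_dim, 64)); b1 = np.zeros(64)
W2 = rng.normal(0, 0.1, (64, dim)); b2 = np.zeros(dim)

def G(x_cur, z, L):
    """Motion vector G(x_cur, z, L) = L * G_d([x_cur; z]). Tanh bounds each
    direction component to (-1, 1), so the step size L caps the move made
    in any single dimension."""
    v = np.concatenate([x_cur, z])
    direction = np.tanh(relu(v @ W1 + b1) @ W2 + b2)
    return L * direction
```

This makes explicit why the step size L is reduced over time: shrinking L shrinks the maximum per-dimension move, turning exploration into fine-grained local search.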

(48) Step 4: As shown in FIG. 1, randomly initialize the current solution denoted as a set S.sub.cur through global random search, and initialize the step size.

(49) (4a) In global random search (denoted as GS), randomly sample solutions through standard Gaussian distribution, and then use the following formula to move the solutions to a solution space:
GS(x.sub.gs)=B.sub.l+(B.sub.u−B.sub.l)*x.sub.gs,x.sub.gs˜P.sub.gaussian(0,1),

(50) where B.sub.l and B.sub.u represent a lower limit and an upper limit of the domain D of the test function, and P.sub.gaussian(0,1) represents standard Gaussian distribution.

(51) (4b) For current solutions x.sub.cur in the set S.sub.cur, calculate corresponding fitness values F(x.sub.cur), and sort them in ascending order of the fitness values to obtain an optimal current solution x.sub.best and its fitness value F(x.sub.best).

(52) (4c) Initialize the step size L=50, record the fitness value L.sub.best_F=F(x.sub.best) of the optimal current solution, and set an internal counter L.sub.cnt for reducing the step size to 0.
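Step 4 can be sketched as follows (illustrative numpy sketch; the sphere fitness, population size 8, dimension 3, and bounds ±5 are assumed example values not given in the text):

```python
import numpy as np

rng = np.random.default_rng(0)

def global_search(n, dim, B_l, B_u):
    """GS: sample from a standard Gaussian, then map toward the domain
    [B_l, B_u] as in GS(x_gs) = B_l + (B_u - B_l) * x_gs."""
    x_gs = rng.standard_normal((n, dim))
    return B_l + (B_u - B_l) * x_gs

def sphere(x):                       # example fitness (not named in the text)
    return float(np.sum(x ** 2))

S_cur = global_search(8, 3, -5.0, 5.0)           # initial current solutions
fits = np.array([sphere(x) for x in S_cur])
order = np.argsort(fits)                          # ascending fitness
x_best, F_best = S_cur[order[0]], fits[order[0]]  # optimal current solution
L, L_best_F, L_cnt = 50.0, F_best, 0              # step size + controller state
```

Note that Gaussian samples mapped this way are concentrated around B.sub.l rather than confined to the domain; the formula is reproduced as stated in the text.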

(53) Step 5: Generate a data set T.sub.D for training the discriminator, as shown in FIG. 1.

(54) (5a) Use the generator, local random search, and global random search respectively to generate a new solution x.sub.gen based on a current solution x.sub.cur selected from the current solution set S.sub.cur, and calculate a fitness value. Compare the generated solution with the current solution, and label it as 0 or 1 based on their fitness values as the data for training the discriminator.

(55) (5b) Generator (denoted as GE): Use the generator to generate a motion vector based on the current solution and add it to the current solution to obtain a generative solution. This can be expressed as follows:
GE(x.sub.cur)=x.sub.cur+G(x.sub.cur,z,L),z˜P.sub.gaussian(0,1);

(56) (5c) Local random search (denoted as LS): Perform random Gaussian sampling around the current solution to obtain a generative solution. This can be expressed as follows:
LS(x.sub.cur)=x.sub.cur+L*d,d˜P.sub.gaussian(0,1);

(57) (5d) Global random search: Perform it as described in (4a).

(58) (5e) Combine the solutions generated by the three methods to obtain the new solutions, and calculate their fitness values.

(59) (5f) Pair each new solution with the current solution from which it is generated, and compare their fitness values; if the fitness value of the new solution is less than or equal to the fitness value of the current solution, label the data pair as 1; otherwise, label it as 0.

(60) (5g) The generated training data can be expressed as follows:
T.sub.D={(x.sub.cur,x.sub.gen,1 if F(x.sub.gen)≤F(x.sub.cur) else 0)},

(61) where x.sub.gen˜GS(x.sub.gs), GE(x.sub.cur), LS(x.sub.cur), x.sub.cur∈S.sub.cur.
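Steps (5a) through (5g) can be sketched as follows. This is a simplified NumPy sketch: `build_discriminator_data` and its signature are illustrative, and when no trained generator is supplied, the GE branch is skipped rather than approximated.

```python
import numpy as np

def build_discriminator_data(s_cur, fitness, step, b_l, b_u,
                             generator=None, rng=None):
    """Build labeled triples (x_cur, x_gen, y) for discriminator training.
    x_gen is produced by up to three methods: the generator (GE), local
    random search (LS), and global random search (GS). The label y is 1
    if F(x_gen) <= F(x_cur), else 0."""
    rng = np.random.default_rng() if rng is None else rng
    data = []
    for x_cur in s_cur:
        candidates = []
        if generator is not None:
            z = rng.standard_normal(x_cur.shape)          # noise input
            candidates.append(x_cur + generator(x_cur, z, step))  # GE
        candidates.append(x_cur + step * rng.standard_normal(x_cur.shape))       # LS
        candidates.append(b_l + (b_u - b_l) * rng.standard_normal(x_cur.shape))  # GS
        for x_gen in candidates:
            y = 1 if fitness(x_gen) <= fitness(x_cur) else 0
            data.append((x_cur, x_gen, y))
    return data
```

With no generator supplied, each current solution yields two labeled pairs (LS and GS).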

(62) Step 6: As shown in FIG. 1, use a training set T.sub.D to train the discriminator.

(63) (6a) Sequentially take m pieces of training data {(x.sub.cur.sup.i, x.sub.gen.sup.i, y.sup.i), i=1, 2, . . . , m} in batches from the training set T.sub.D, and calculate a loss function:

(64) loss.sub.D=−Σ.sub.i=1.sup.m y.sup.i*log(D(x.sub.cur.sup.i,x.sub.gen.sup.i));

(65) (6b) Calculate a gradient based on the loss function:

(66) g.sub.D=(1/m)∇.sub.θD loss.sub.D,

(67) where θ.sub.D represents a network parameter of the discriminator.

(68) (6c) Use Adam algorithm to update the network parameter of the discriminator:

(69) θ.sub.D=θ.sub.D−μŝ/(√{circumflex over (r)}+δ),

(70) where μ is a learning step size, which is set to 0.001 in the present invention; δ is a small constant for numerical stability, which is set to 1e-7 in the present invention; ŝ represents correction of the first-moment deviation; and {circumflex over (r)} represents correction of the second-moment deviation. The calculation formulas are as follows:

(71) s=ρ.sub.1s+(1−ρ.sub.1)g.sub.D, r=ρ.sub.2r+(1−ρ.sub.2)g.sub.D⊙g.sub.D, ŝ=s/(1−ρ.sub.1.sup.t), {circumflex over (r)}=r/(1−ρ.sub.2.sup.t),

(72) where ρ.sub.1 and ρ.sub.2 are exponential attenuation rates of moment estimation, which are set to 0.9 and 0.999 in the present invention; t is the number of batches, which is initialized to 0 before training and increases by 1 upon update of each batch; and s and r are the first-moment and second-moment variables, which are initialized to 0 before training and updated according to the foregoing formulas each time.
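The Adam update of (6b) and (6c) can be sketched for a scalar parameter as follows. This is a minimal sketch; the class name is an assumption, while the hyperparameters follow the values stated above (μ=0.001, δ=1e-7, ρ.sub.1=0.9, ρ.sub.2=0.999).

```python
import numpy as np

class Adam:
    """Adam optimizer as described in (6c)."""
    def __init__(self, mu=0.001, rho1=0.9, rho2=0.999, delta=1e-7):
        self.mu, self.rho1, self.rho2, self.delta = mu, rho1, rho2, delta
        self.s = 0.0   # first-moment variable, initialized to 0
        self.r = 0.0   # second-moment variable, initialized to 0
        self.t = 0     # batch counter, initialized to 0

    def step(self, theta, g):
        self.t += 1
        self.s = self.rho1 * self.s + (1 - self.rho1) * g
        self.r = self.rho2 * self.r + (1 - self.rho2) * g * g
        s_hat = self.s / (1 - self.rho1 ** self.t)   # bias-corrected first moment
        r_hat = self.r / (1 - self.rho2 ** self.t)   # bias-corrected second moment
        return theta - self.mu * s_hat / (np.sqrt(r_hat) + self.delta)
```

Applied to the gradient of a simple quadratic, the parameter moves steadily toward the minimum.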

(73) Step 7: As shown in FIG. 1, train the generator.

(74) (7a) Set the network parameter of the discriminator to un-trainable, and connect the discriminator to the generator to form a network C, whose input is the input of the generator and whose output is the output of the discriminator. The specific connection method is as follows:

(75) Step 1: enter the current solution, noise, and step size into the generator and run the generator.

(76) Step 2: add a motion vector generated by the generator to the current solution to obtain a generative solution.

(77) Step 3: use the current solution and the generative solution obtained in step 2 as the inputs of the discriminator, and use the output of the discriminator as the output of the network C.

(78) The output of C can be expressed as follows:
C(x.sub.cur,z,L)=D(x.sub.cur,x.sub.cur+G(x.sub.cur,z,L));

(79) (7b) Use current solutions batch-sampled from S.sub.cur, random Gaussian noise, and a step size as inputs of C, and run C to obtain output C(x.sub.cur, z, L).

(80) (7c) Calculate a loss function of the network C:

(81) loss.sub.C=−Σ.sub.i=1.sup.m 1*log(C(x.sub.cur.sup.i,z,L)),

(82) where m is a quantity of samples each time.
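The loss of network C in (7c) can be sketched as follows. This is a hedged sketch assuming C's outputs lie in (0, 1); the small numerical floor to avoid log(0) is an implementation assumption not stated in the source.

```python
import numpy as np

def generator_loss(c_output):
    """loss_C = -sum_i 1 * log(C(x_cur^i, z, L)): every generated pair is
    assigned target label 1, so minimizing this loss pushes the discriminator
    to judge the generative solution as better than the current one."""
    eps = 1e-12  # numerical floor to avoid log(0); an assumption
    return -np.sum(np.log(np.clip(c_output, eps, 1.0)))
```

The loss is zero only when C outputs 1 for every sample in the batch, and grows as outputs fall toward 0.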

(83) (7d) Calculate a gradient based on the loss function (because the network parameter of the discriminator is set to un-trainable, the network parameter of C is the network parameter of the generator):

(84) g.sub.G=(1/m)∇.sub.θG loss.sub.C,

(85) where θ.sub.G represents the network parameter of the generator.

(86) (7e) Update the network parameter of the generator G:

(87) θ.sub.G=θ.sub.G−μŝ/(√{circumflex over (r)}+δ),

(88) The settings and calculation methods of μ, δ, ŝ, and {circumflex over (r)} are described in (6c).

(89) Step 8: As shown in FIG. 1, select and retain current solutions.

(90) (8a) Combine a current solution set and a generative solution set to form a candidate solution set S.sub.candidate.

(91) (8b) For each solution in the candidate solution set, calculate its retention probability based on the order of its fitness value in all candidate solutions:

(92) P.sub.retain(x)=r.sub.F(x).sup.−α/Σ.sub.i=1.sup.∥S.sub.candidate∥ r.sub.F(x.sub.i).sup.−α,

(93) where r.sub.F(x) represents the order of a fitness value of solution x in all candidate solutions arranged in ascending order; ∥S.sub.candidate∥ represents a size of the candidate solution set; and α is a parameter that controls the probability distribution, where a larger value of α indicates a higher retention probability of a solution with a small fitness value and a lower retention probability of a solution with a large fitness value; and α is set to 3 in the present invention.
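The rank-based retention probability in (8b) can be sketched as follows. This is a minimal NumPy sketch with α=3 as stated above; the function name is illustrative.

```python
import numpy as np

def retention_probabilities(fitness_values, alpha=3.0):
    """P_retain(x) = r_F(x)^(-alpha) / sum_i r_F(x_i)^(-alpha), where
    r_F(x) is the 1-based rank of x's fitness in ascending order, so
    smaller fitness values receive higher retention probabilities."""
    fitness_values = np.asarray(fitness_values, dtype=float)
    ranks = np.argsort(np.argsort(fitness_values)) + 1  # 1-based ascending ranks
    weights = ranks.astype(float) ** (-alpha)
    return weights / weights.sum()
```

The solution with the smallest fitness value always receives the largest probability, and larger α concentrates more mass on it.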

(94) (8c) Select a solution based on the retention probability of each solution, retain the selected solution, and update S.sub.cur.

(95) Step 9: As shown in FIG. 1, reduce the step size L.

(96) (9a) Check whether a difference between a fitness value of an optimal solution x* in the current S.sub.cur and L.sub.best_F is less than a sufficiently small constant:
|F(x*)−L.sub.best_F|≤ε,

(97) where ε is a sufficiently small constant, which is set to 1e-8 in the present invention. If so, perform (9b); otherwise, set L.sub.cnt to 0 and perform (9c).

(98) (9b) Increase L.sub.cnt by 1, check whether L.sub.cnt is greater than a specified threshold T, and if so, multiply the step size L by an attenuation factor γ:
L=L*γ, if L.sub.cnt>T,

(99) In the present invention, the threshold T is set to 50, and the attenuation factor γ is set to 0.6. Otherwise, perform (9c).

(100) (9c) Update L.sub.best_F to the fitness value F(x*) of the optimal solution x* in the current S.sub.cur.
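The step-size reduction logic of step 9 can be sketched as follows. This is a minimal sketch; the class name is an assumption, while the defaults follow the stated values (T=50, γ=0.6, ε=1e-8).

```python
class StepSizeController:
    """Reduce the step size L when the best fitness stagnates: if
    |F(x*) - L_best_F| <= eps holds for more than T consecutive
    iterations, multiply L by the attenuation factor gamma; otherwise
    reset the counter. L_best_F is updated to F(x*) in either case."""
    def __init__(self, step=50.0, threshold=50, gamma=0.6, eps=1e-8):
        self.step, self.threshold = step, threshold
        self.gamma, self.eps = gamma, eps
        self.best_f = float("inf")  # L_best_F
        self.cnt = 0                # L_cnt

    def update(self, f_best):
        if abs(f_best - self.best_f) <= self.eps:
            self.cnt += 1                      # (9b): stagnation detected
            if self.cnt > self.threshold:
                self.step *= self.gamma        # L = L * gamma
        else:
            self.cnt = 0                       # progress made; reset counter
        self.best_f = f_best                   # (9c): update L_best_F
        return self.step
```

Repeated calls with an unchanging best fitness shrink the step size geometrically once the threshold is exceeded.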

(101) Step 10: As shown in FIG. 1, determine whether the number of current iterations reaches the maximum number of iterations, and if so, output the optimal current solution; otherwise, go back to step (5).

(102) In the present invention, a simple logistics distribution simulation experiment is designed, with the following parameter settings:

(103) N = 3 , M = 2 , T = [ 0 1 1 0 1 0 1 0 1 0 1 0 0 ] .

(104) Through the foregoing steps, an optimal solution x=[1, 2, 0, 3, 0, 0] is obtained, corresponding to the following optimization scheme: delivery person 1 delivers goods to locations 1 and 2, and delivery person 2 delivers goods to location 3. This is the optimal delivery scheme.

(105) In specific implementation, a CEC2013 real function test set is selected as a test function set to further verify the method in the present invention. The test function set contains 28 real functions, of which the first 5 functions are single-mode functions and the last 23 functions are multi-mode functions. The domain is defined as having a value between −100 and 100 on any dimension. The test dimension is set to 30, and the noise dimension is set to 10. An absolute difference between an algorithm optimal solution and an actual optimal solution is used as an evaluation criterion. The test platform used in the present invention uses NVIDIA GeForce GTX TITAN X and runs on Ubuntu 14.04.5 LTS, with Python 3.5 and Tensorflow 1.3.0.

(106) Specific test parameter settings are as described above. Four other algorithms, including artificial bee colony (ABC), particle swarm optimization (SPSO), differential evolution (DE), and guided fireworks algorithm (GFWA), are selected for comparison. The average value results of 51 validations on the CEC2013 real function test set are shown in Table 1.

(107) TABLE 1 Average optimization results and average ranking of the present invention and the four selected methods

CEC2013                        GAN-O     ABC       SPSO      DE        GFWA
1                              0.00E+00  0.00E+00  0.00E+00  1.89E−03  0.00E+00
2                              4.36E+05  6.20E+06  3.38E+05  5.52E+04  6.96E+05
3                              3.68E+07  5.74E+08  2.88E+08  2.16E+06  3.74E+07
4                              1.40E+04  8.75E+04  3.86E+04  1.32E−01  5.00E−05
5                              9.43E−04  0.00E+00  5.42E−04  2.48E−03  1.55E−03
Average ranking (single-mode)  2.4       3.4       2.6       2.8       2.6
6                              1.89E+01  1.46E+01  3.79E+01  7.82E+00  3.49E+01
7                              4.10E+01  1.25E+02  8.79E+01  4.89E+01  7.58E+01
8                              2.09E+01  2.09E+01  2.09E+01  2.09E+01  2.09E+01
9                              1.56E+01  3.01E+01  2.88E+01  1.59E+01  1.83E+01
10                             2.23E−02  2.27E−01  3.40E−01  3.24E−02  6.08E−02
11                             8.66E+01  0.00E+00  1.05E+02  7.88E+01  7.50E+01
12                             7.76E+01  3.19E+02  1.04E+02  8.14E+01  9.41E+01
13                             1.63E+02  3.29E+02  1.94E+02  1.61E+02  1.61E+02
14                             3.27E+03  3.58E−01  3.99E+03  2.38E+03  3.49E+03
15                             3.04E+03  3.88E+03  3.81E+03  5.19E+03  3.67E+03
16                             1.31E−01  1.07E+00  1.31E+00  1.97E+00  1.00E−01
17                             1.24E+02  3.04E+01  1.16E+02  9.29E+01  8.49E+01
18                             1.22E+02  3.04E+02  1.21E+02  2.34E+02  8.60E+01
19                             4.53E+00  2.62E−01  9.51E+00  4.51E+00  5.08E+00
20                             1.32E+01  1.44E+01  1.35E+01  1.43E+01  1.31E+01
21                             2.76E+02  1.65E+02  3.09E+02  3.20E+02  2.59E+02
22                             3.84E+03  2.41E+01  4.30E+03  1.72E+03  4.27E+03
23                             3.67E+03  4.95E+03  4.83E+03  5.28E+03  4.32E+03
24                             2.53E+02  2.90E+02  2.67E+02  2.47E+02  2.56E+02
25                             2.75E+02  3.06E+02  2.99E+02  2.80E+02  2.89E+02
26                             2.00E+02  2.01E+02  2.86E+02  2.52E+02  2.05E+02
27                             7.85E+02  4.16E+02  1.00E+03  7.64E+02  8.15E+02
28                             3.41E+02  2.58E+02  4.01E+02  4.02E+02  3.60E+02
Average ranking (multi-mode)   2.17      2.96      4.00      2.83      2.57
Average ranking (all)          2.21      3.04      3.75      2.82      2.57

(108) Test results show that the present invention achieves better results than the other four algorithms, which demonstrates the validity of the method proposed by the present invention.

(109) It should be noted that the embodiment is intended to help further understand the present invention, but those skilled in the art can understand that various replacements and modifications may be made without departing from the spirit and scope of the present invention and the claims. Therefore, the present invention is not limited to the content disclosed in the embodiment, and the protection scope of the present invention shall be subject to the protection scope of the claims.