Automatic signal deployer, signal deployment system, automatic signal path deployment method, and behavior control signal generation method of deployment agent
12368472 · 2025-07-22
Assignee
Inventors
- Li-Hsiang Shen (Hsinchu, TW)
- Kai-Ten Feng (Hsinchu, TW)
- Chun-Chieh Kuo (New Taipei, TW)
- Hua-Pei Chiang (Taipei, TW)
- Chyi-Dar Jang (Taipei, TW)
- Teng-Chieh Yang (New Taipei, TW)
- Tsung-Jen Wang (Taipei, TW)
- Chi-Hung Lin (New Taipei, TW)
- Chi-En Chien (New Taipei, TW)
Cpc classification
H04B17/3913
ELECTRICITY
International classification
Abstract
The present invention discloses an automatic signal deployer, a signal deployment system, an automatic signal path deployment method, and a behavior control signal generation method of a deployment agent. The signal deployment system includes an automatic signal deployer, a deployment agent and a base station. The deployment agent receives signal quality data, generates a behavior control signal according to the signal quality data, and sends out the behavior control signal to the automatic signal deployer. The automatic signal deployer receives the behavior control signal and a source signal coming from the base station, performs deployment according to the behavior control signal, whereby the automatic signal deployer can transmit the source signal toward a signal path allocation direction and complete automatic deployment of signal paths.
Claims
1. A behavior control signal generation method of a deployment agent comprising steps: receiving signal quality data, which includes a plurality of pieces of signal quality information; determining whether a quantity of the plurality of pieces of signal quality information reaches a preset value; if the quantity of the plurality of pieces of signal quality information reaches the preset value, using a signal average, which is calculated from the plurality of pieces of signal quality information, as a reward value of a plurality of sub-agents of a multi-agent reinforcement learning algorithm; making the plurality of sub-agents use the reward value to generate a behavior control signal through the multi-agent reinforcement learning algorithm and send out the behavior control signal; receiving new signal quality data after an automatic signal deployer transmits a source signal, which comes from a base station, to a direction of signal path allocation according to the behavior control signal; and stopping sending out the behavior control signal after the new signal quality data achieve a deployment requirement.
2. The behavior control signal generation method of a deployment agent according to claim 1, wherein before the new signal quality data reach the deployment requirement, the deployment agent resumes to undertake the steps beginning from the step of receiving the signal quality data until the new signal quality data reach the deployment requirement.
3. The behavior control signal generation method of a deployment agent according to claim 1, wherein the behavior control signal includes one of carrier behavior signals, behavior adjusting signals and mode adjusting signals, or a combination thereof; the carrier behavior signals include a move-forward instruction, a move-backward instruction, a move-leftward instruction, a move-rightward instruction, and a stop-moving instruction for a self-propelled carrier of the automatic signal deployer; the mode adjusting signals include one of a reflect-mode signal, a refract-mode signal and a relay transmission-mode signal, or a combination thereof for a signal path redistributor on the self-propelled carrier of the automatic signal deployer; the behavior adjusting signals include azimuth angle adjusting information, elevation angle adjusting information, and height adjusting information; the azimuth angle adjusting information is used to increase or decrease an azimuth angle of the signal path redistributor; the elevation angle adjusting information is used to increase or decrease an elevation angle of the signal path redistributor; the height adjusting information is used to increase or decrease a height of the signal path redistributor.
4. A signal deployment system comprising: a base station, sending out a source signal; a deployment agent, receiving signal quality data, generating a behavior control signal according to the signal quality data, and sending out the behavior control signal; and an automatic signal deployer, receiving the behavior control signal and the source signal, moving to an assigned position according to the behavior control signal, switching to one of a reflection mode, a refraction mode and a relay transmission mode, or simultaneously switching to more than two of the reflection mode, the refraction mode and the relay transmission mode, adjusting a signal path allocation direction, and sending the source signal toward the signal path allocation direction.
5. The signal deployment system according to claim 4, wherein the deployment agent receives the signal quality data through a user device or receives the signal quality data through the base station.
6. The signal deployment system according to claim 4, wherein the signal quality data contain a plurality of pieces of signal quality information, including Received-Signal Strength Indicator (RSSI), Signal-to-Noise Ratio (SNR), Signal to Interference plus Noise Ratio (SINR), Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), Bit Error Rate (BER), Packet Error Rate (PER), and Packet Drop Rate (PDR).
7. The signal deployment system according to claim 6, wherein the deployment agent contains a multi-agent reinforcement learning algorithm; the multi-agent reinforcement learning algorithm includes a plurality of sub-agents; the deployment agent uses the plurality of pieces of signal quality information to calculate a signal average value and uses the signal average value as a reward value of the plurality of sub-agents of the deployment agent; the deployment agent makes the plurality of sub-agents use the reward value to generate the behavior control signal and sends out the behavior control signal; after the automatic signal deployer transmits the source signal to the signal path allocation direction, the deployment agent receives new signal quality data; after the new signal quality data reach a deployment requirement, the deployment agent stops sending out the behavior control signal.
8. The signal deployment system according to claim 5, wherein after the deployment agent has received the new signal quality data and before the new signal quality data reach the deployment requirement, the deployment agent receives new signal quality data again and sends out a new behavior control signal until the new signal quality data reach the deployment requirement.
9. An automatic signal path deployment method, which is applied to a signal deployment system containing an automatic signal deployer, a deployment agent and a base station, comprising steps: using the deployment agent to receive signal quality data, including a plurality of signal quality signals; using the deployment agent to generate a behavior control signal according to the plurality of signal quality signals and send out the behavior control signal; using the automatic signal deployer to receive the behavior control signal, moving the automatic signal deployer to an assigned position according to the behavior control signal, and adjusting a signal path allocation direction according to the behavior control signal; and using the automatic signal deployer to transmit a source signal, which comes from the base station, to the signal path allocation direction.
10. An automatic signal deployer comprising: a self-propelled carrier; a signal path redistributor; a multidirectional adjusting device, connected with the signal path redistributor and the self-propelled carrier; a wireless transceiver, receiving a behavior control signal and transmitting the behavior control signal; and a controller, connected with the self-propelled carrier, the multidirectional adjusting device and the wireless transceiver, receiving the behavior control signal, analyzing the behavior control signal to generate one of a carrier behavior signal, a behavior adjusting signal and a mode adjusting signal, or a combination thereof, transmitting the carrier behavior signal to the self-propelled carrier, transmitting the mode adjusting signal to the signal path redistributor, and transmitting the behavior adjusting signal to the multidirectional adjusting device, wherein the self-propelled carrier moves in a space according to the carrier behavior signal; the signal path redistributor switches to one of a reflection mode, a refraction mode and a relay transmission mode or simultaneously switches to more than two of them according to the mode adjusting signal; the multidirectional adjusting device adjusts a signal path allocation direction of the signal path redistributor according to the behavior adjusting signal to enable the signal path redistributor to send the source signal toward the signal path allocation direction.
11. The automatic signal deployer according to claim 10, wherein the behavior adjusting signal includes azimuth angle adjusting information, elevation angle adjusting information, and height adjusting information; the multidirectional adjusting device includes an azimuth angle adjuster, an elevation angle adjuster, and a height lifter; one end of the height lifter is connected with the self-propelled carrier; another end of the height lifter is connected with the azimuth angle adjuster; the elevation angle adjuster is connected with the azimuth angle adjuster; the azimuth angle adjuster is connected with the signal path redistributor; the height lifter receives the height adjusting information and adjusts a height of the signal path redistributor according to the height adjusting information; the azimuth angle adjuster receives the azimuth angle adjusting information and adjusts an azimuth angle of the signal path redistributor according to the azimuth angle adjusting information; the elevation angle adjuster receives the elevation angle adjusting information and adjusts an elevation angle of the signal path redistributor according to the elevation angle adjusting information.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
DETAILED DESCRIPTION OF THE INVENTION
(10) The embodiments of the present invention are described in detail below with reference to the corresponding drawings. In the drawings and the specification, the same numerals represent the same or similar elements wherever possible. For simplicity and ease of labelling, the shapes and thicknesses of the elements may be exaggerated in the drawings. Elements that belong to conventional technologies and are well known to persons skilled in the art may not be depicted in the drawings or described in the specification. Modifications and variations made by persons skilled in the art according to the contents of the present invention are to be included within the scope of the present invention.
(11) Refer to
(12) Refer to
(13) Refer to
(14)
(15) In some embodiments, the deployment agent 2 may receive the signal quality data from the user devices 4. Alternatively, the deployment agent 2 may receive the signal quality data through the base station 3. The base station 3 may be a fixed base station, a small base station or an indoor base station, in particular a 5G millimeter-wave base station.
(16) In some embodiments, the signal quality data include a plurality of pieces of signal quality information, such as one or more of Received-Signal Strength Indicator (RSSI), Signal-to-Noise Ratio (SNR), Signal to Interference plus Noise Ratio (SINR), Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), Bit Error Rate (BER), Packet Error Rate (PER), and Packet Drop Rate (PDR).
(17) In one embodiment, the deployment agent 2 is equipped with a multi-agent reinforcement learning algorithm. The deployment agent 2 uses the signal averages, which are calculated from a plurality of pieces of signal quality information, as the reward values of a plurality of the sub-agents of the multi-agent reinforcement learning algorithm. The deployment agent 2 uses the plurality of sub-agents to generate behavior control signals. After the automatic signal deployer 1 transmits the source signal, which comes from the base station 3, toward the direction of the signal path allocation, the deployment agent 2 receives new signal quality data. After the new signal quality data have achieved the requirement of deployment, the deployment agent 2 stops sending out behavior control signals. In some embodiments, another deep-learning or machine-learning technology may replace the multi-agent reinforcement learning algorithm to generate behavior control signals.
(18) In one embodiment, before the new signal quality data reach the requirement of deployment, the deployment agent 2 receives signal quality data again and sends out new behavior control signals until the new signal quality data achieve the requirement of deployment. It should be noted that receiving signal quality data again may mean either that the deployment agent 2 receives the new signal quality data, or that the deployment agent 2 discards the new signal quality data and receives a further set of signal quality data.
(19) Refer to
(20) Refer to
(21) In some embodiments, before the deployment agent 2 repeats the steps beginning from the step of receiving signal quality data, the deployment agent 2 determines whether the iteration count of the reinforcement learning algorithm has reached the upper limit. If the iteration count has not yet reached the upper limit, the process proceeds to Step S501. Once the iteration count has reached the upper limit, the process proceeds to Step S507.
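The feedback loop described in paragraphs (17) to (21) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function name `run_deployment_loop`, the callables `read_quality` and `generate_and_send`, and all parameter names are hypothetical, and the sketch assumes that the deployment requirement is a lower bound on the averaged signal quality.

```python
from typing import Callable, Sequence


def run_deployment_loop(
    read_quality: Callable[[], Sequence[float]],
    generate_and_send: Callable[[float], None],
    requirement: float,
    preset_count: int,
    max_iterations: int,
) -> bool:
    """Return True if the requirement was met, False if the iteration cap was hit.

    read_quality      -- hypothetical callable returning the latest quality samples
    generate_and_send -- hypothetical callable that turns a reward into a behavior
                         control signal and sends it to the deployer
    """
    for _ in range(max_iterations):
        samples = read_quality()
        if len(samples) < preset_count:
            # Not enough pieces of signal quality information yet; try again.
            continue
        # Signal average over the most recent samples, used as the reward value.
        reward = sum(samples[-preset_count:]) / preset_count
        if reward >= requirement:
            # Requirement reached: stop sending behavior control signals.
            return True
        generate_and_send(reward)
    return False
```

In this sketch an iteration that finds too few samples still counts toward the iteration cap; whether the original flow counts it that way is not stated in the text.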
(22) In some embodiments, the behavior control signals include one of carrier behavior signals, behavior adjusting signals and mode adjusting signals, or a combination thereof. The carrier behavior signals include a move-forward instruction, a move-backward instruction, a move-leftward instruction, a move-rightward instruction, and a stop-moving instruction for the self-propelled carrier 11. The mode adjusting signals include one of a reflect-mode signal, a refract-mode signal and a relay transmission-mode signal, or a combination thereof for the signal path redistributor 12. The behavior adjusting signals include azimuth angle adjusting information, elevation angle adjusting information, and height adjusting information. The azimuth angle adjusting information is used to increase or decrease the azimuth angle of the signal path redistributor 12. The elevation angle adjusting information is used to increase or decrease the elevation angle of the signal path redistributor 12. The height adjusting information is used to increase or decrease the height of the signal path redistributor 12.
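One possible way to model the three signal families of paragraph (22) in software is shown below. The class and field names are hypothetical and chosen only for illustration; the text does not specify any message format. A `Flag` enumeration is used for the modes because the modes may be combined.

```python
from dataclasses import dataclass
from enum import Enum, Flag, auto
from typing import Optional


class CarrierBehavior(Enum):
    """Movement instructions for the self-propelled carrier."""
    MOVE_FORWARD = auto()
    MOVE_BACKWARD = auto()
    MOVE_LEFTWARD = auto()
    MOVE_RIGHTWARD = auto()
    STOP_MOVING = auto()


class Mode(Flag):
    """Modes of the signal path redistributor; members can be OR-ed together."""
    REFLECT = auto()
    REFRACT = auto()
    RELAY = auto()


@dataclass(frozen=True)
class BehaviorAdjustment:
    """Signed increments; positive increases and negative decreases (assumed units)."""
    azimuth_delta: float = 0.0
    elevation_delta: float = 0.0
    height_delta: float = 0.0


@dataclass(frozen=True)
class BehaviorControlSignal:
    """Any one of the three parts, or any combination of them, may be present."""
    carrier: Optional[CarrierBehavior] = None
    adjustment: Optional[BehaviorAdjustment] = None
    mode: Optional[Mode] = None

    def is_empty(self) -> bool:
        return self.carrier is None and self.adjustment is None and self.mode is None
```

For example, `BehaviorControlSignal(carrier=CarrierBehavior.MOVE_FORWARD, mode=Mode.REFLECT | Mode.REFRACT)` would describe a forward move combined with simultaneous reflection and refraction.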
(23) Below, examples are used to further demonstrate how the multi-agent reinforcement learning algorithm generates the behavior control signals.
(24) In an example, N automatic signal deployers 1 (unmanned vehicles) provide service for K user devices 4 in a space X. Each automatic signal deployer 1 is equipped with a signal path redistributor 12, and each signal path redistributor 12 has M elements. N, K, and M are positive integers. The nth unmanned vehicle is expressed by $X_n^{UV}$, the kth user device by $X_k^{UE}$, and the base station by $X^{BS}$. n is the serial number of an unmanned vehicle in a series of N unmanned vehicles. For example, N=5 indicates that there are 5 unmanned vehicles in total; the 3rd unmanned vehicle is expressed by $X_3^{UV}$. k is the serial number of a user device in a series of K user devices. For example, K=8 indicates that there are 8 user devices in total; the 6th user device is expressed by $X_6^{UE}$. The positions, heights, directions, and elevation angles of the nth unmanned vehicle $X_n^{UV}$, the kth user device $X_k^{UE}$, and the base station $X^{BS}$ are defined as follows:
(25)
wherein $\phi$ is the azimuth angle, $\psi$ the elevation angle, x the first axis, y the second axis, and h the height; x and y are jointly used to express the position.
(26) The deployment of the nth automatic signal deployer 1 is defined as $\Theta_n^{X}$:
(27)
wherein Equation (4) expresses the reflection or refraction function. In other words, $\mathrm{ref} \in \{\mathrm{Refl}, \mathrm{Refr}\}$, wherein Refl denotes reflection, Refr refraction, $\beta$ the amplitude, and $\theta$ the phase angle. The total number of the transmission elements of the automatic signal deployer 1 is M, and m is the serial number of a transmission element in a series of M transmission elements. For example, M=9 indicates that there are 9 transmission elements in total; m=3 indicates the 3rd transmission element. The amplitude constraint is expressed by $0 \le \beta_{n,m}^{x} \le 1$. The relationship between reflection and refraction of the transmission element is expressed by
(28) $\beta_{n,m}^{\mathrm{Refl}} + \beta_{n,m}^{\mathrm{Refr}} = 1$
(29) The phase angle constraint is expressed by $0 \le \theta_{n,m}^{\mathrm{ref}} \le 2\pi$. The wireless communication channel between the base station 3 and the nth automatic signal deployer 1 is expressed by $H_n$. The channel between the nth automatic signal deployer 1 and the kth user device 4 is expressed by $G_{n,k}$. It is known that the channel parameters of transmission and reception are highly geometrically correlated. Therefore, $H_n$ and $G_{n,k}$ may be expressed as mapping functions:
(30) $H_n = f(X_n^{UV}, X^{BS})$ and $G_{n,k} = f(X_n^{UV}, X_k^{UE})$
(31) Because of environmental complexity, it is hard to obtain the complicated mappings $f(X_n^{UV}, X^{BS})$ and $f(X_n^{UV}, X_k^{UE})$. For example, signals may attenuate to different extents over different distances and directions in different environments.
(32) The signal $Y_k$ received by the kth user device 4 may be expressed by
(33)
wherein $N_k$ is the noise at the kth user device 4; $L_{n,k}^{x}$ indicates that the kth user device 4 is located in a deployment area x of the nth automatic signal deployer 1 in the space X; $X_k$ is the expected signal emitted from the base station 3 to the kth user device 4; $X_i$ is the interference signal between the kth user device 4 and the base station 3, wherein i is the index of the interference to the user signal and $i \ne k$; the remaining $K-1$ users are the sources of interference signals. Therefore, the power $P_{r,k}$ of the reference signal, which is received by the kth user device 4, may be expressed by
(34)
wherein P expresses power; r expresses reference.
(35) The received signal strength indicator (RSSI) of the kth user device 4 may be obtained from the mapping relationship and expressed by Equation (10):
(36)
(37) The Signal to Interference plus Noise Ratio (SINR) of the kth user device 4 may be obtained from Equation (11):
(38)
wherein $I_k = \left| \sum_{n=1}^{N} L_{n,k}^{x} G_{n,k} \Theta_n^{x} H_n \sum_{i \ne k}^{K} X_i \right|^2$ is the interference; $\sigma^2$ is the noise power.
(39) The Shannon capacity $R_k$ may be obtained from Equation (12):
(40)
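For orientation, the textbook forms of SINR and Shannon capacity can be evaluated as in the sketch below. These are the standard definitions and are only assumed to correspond to Equations (11) and (12); in particular, whether Equation (12) includes the bandwidth factor is not visible in the text. The function and parameter names are hypothetical.

```python
import math


def sinr(signal_power: float, interference_power: float, noise_power: float) -> float:
    """Linear SINR: received signal power over interference plus noise power."""
    return signal_power / (interference_power + noise_power)


def shannon_capacity_bps(bandwidth_hz: float, sinr_linear: float) -> float:
    """Textbook Shannon capacity B * log2(1 + SINR), in bit/s."""
    return bandwidth_hz * math.log2(1.0 + sinr_linear)
```

As a purely arithmetic illustration, `shannon_capacity_bps(100e6, 1023.0)` evaluates to 1e9 bit/s; this is not a statement about the SINR measured in the experiments.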
(41) The overall system speed of the kth user device 4 may be calculated from Equation (12) and expressed by
(42)
(43) The bit error rate of the kth user device 4 may be worked out with the Signal to Interference plus Noise Ratio (SINR) according to Equation (14):
(44)
(45) The Packet Error Rate (PER) may be worked out with the bit error rate according to Equation (15):
(46)
(47) The Packet Drop Rate (PDR) may be worked out according to Equation (16):
(48)
(49) It should be noted that $f_{RSSI}(P_{r,k})$, $f_{BER}(\gamma_k)$, and $f_{PER}(BER_k)$ normally have no closed-form mathematical expressions in a complicated communication system.
(50) In such a problem, it is desired to optimize the overall system utility function:
(51)
wherein $\omega_i$ is the weight of each performance index, and $i \in \{1, 2, 3, 4, 5\}$. Other performance indexes, such as RSRP, RSRQ, SNR, delay, jitter, and packet drop rate, may also be incorporated into Equation (17). The abovementioned performance indexes are only examples, and the present invention is not limited to them. As the system varies with time, the optimal solution is unlikely to be obtained instantly. Therefore, a Multi-Agent Reinforcement Learning (MARL) algorithm is designed for the deployment of the automatic signal deployer 1. The key concept of MARL is that each sub-agent updates its strategy according to the observed state and the reward function. Firstly, the deployment action $A_n$ is defined as
(52)
$n \in \{1, \dots, N\},\ m \in \{1, \dots, M\},\ x \in \{\mathrm{Refl}, \mathrm{Refr}\}$ Equation (18)
(53) wherein $\Delta x_n^{UV}$, $\Delta y_n^{UV}$, and $\Delta h_n^{UV}$ are the unit step lengths of the positional movement of the automatic signal deployer 1; $\Delta\phi_n^{UV}$ and $\Delta\psi_n^{UV}$ are the step lengths of the azimuth angle and the elevation angle of the unmanned vehicle; $\Delta\beta_{n,m}^{\mathrm{ref}}$ and $\Delta\theta_{n,m}^{\mathrm{ref}}$ are the step lengths of the amplitude and the phase angle of the signal path redistributor. The state $S_n$ of each automatic signal deployer 1 is its current geometrical information and is expressed by
(54)
(55) Because the joint action $A_n$ and state $S_n$ would require a large storage space, they are decomposed into per-parameter sub-agents:
(56)
(57) Below are listed the calculation actions of the sub-agents of the multi-agent reinforcement learning algorithm of the automatic signal deployer 1.
(58) Position:
(59) Move-forward ($\Delta x_n^{UV} = 0$, $\Delta y_n^{UV} > 0$); Move-backward ($\Delta x_n^{UV} = 0$, $\Delta y_n^{UV} < 0$); Move-leftward ($\Delta x_n^{UV} < 0$, $\Delta y_n^{UV} = 0$); Move-rightward ($\Delta x_n^{UV} > 0$, $\Delta y_n^{UV} = 0$); Stop-moving ($\Delta x_n^{UV} = 0$, $\Delta y_n^{UV} = 0$).
Height: Rise ($\Delta h_n^{UV} > 0$); Descend ($\Delta h_n^{UV} < 0$).
Azimuth Angle: Increase azimuth angle ($\Delta\phi_n^{UV} > 0$); Decrease azimuth angle ($\Delta\phi_n^{UV} < 0$).
Elevation Angle: Increase elevation angle ($\Delta\psi_n^{UV} > 0$); Decrease elevation angle ($\Delta\psi_n^{UV} < 0$).
Transmission Mode: Reflection ($\beta_{n,m}^{\mathrm{Refl}} > 0$, $\beta_{n,m}^{\mathrm{Refr}} = 0$); Refraction ($\beta_{n,m}^{\mathrm{Refl}} = 0$, $\beta_{n,m}^{\mathrm{Refr}} > 0$); Dual-mode ($\beta_{n,m}^{\mathrm{Refl}} > 0$, $\beta_{n,m}^{\mathrm{Refr}} > 0$, $\beta_{n,m}^{\mathrm{Refl}} + \beta_{n,m}^{\mathrm{Refr}} = 1$).
(60) The amplitude or phase angle may be increased or decreased to satisfy $0 \le \beta_{n,m}^{x} \le 1$ and $0 \le \theta_{n,m}^{x} \le 2\pi$.
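The per-parameter action sets listed above can be written down as small lookup tables, as in the following sketch. The numeric step sizes, the sign conventions (forward as positive y, leftward as negative x), and every name in the code are assumptions made for illustration; the text gives no numeric values. The clamping helpers assume the amplitude is kept in [0, 1] and the phase angle in [0, 2π].

```python
import math

# Hypothetical unit step sizes; the source does not give numeric values.
STEP_XY = 0.5                 # metres per move (assumed)
STEP_H = 0.1                  # metres per height change (assumed)
STEP_ANGLE = math.radians(5)  # radians per angle change (assumed)

# One action table per sub-agent; values are (dx, dy) or a single signed delta.
POSITION_ACTIONS = {
    "move_forward": (0.0, +STEP_XY),
    "move_backward": (0.0, -STEP_XY),
    "move_leftward": (-STEP_XY, 0.0),
    "move_rightward": (+STEP_XY, 0.0),
    "stop_moving": (0.0, 0.0),
}
HEIGHT_ACTIONS = {"rise": +STEP_H, "descend": -STEP_H}
AZIMUTH_ACTIONS = {"increase": +STEP_ANGLE, "decrease": -STEP_ANGLE}
ELEVATION_ACTIONS = {"increase": +STEP_ANGLE, "decrease": -STEP_ANGLE}


def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def step_amplitude(beta: float, delta: float) -> float:
    """Apply an amplitude step while keeping the amplitude within [0, 1]."""
    return clamp(beta + delta, 0.0, 1.0)


def step_phase(theta: float, delta: float) -> float:
    """Apply a phase step while keeping the phase within [0, 2*pi]."""
    return clamp(theta + delta, 0.0, 2.0 * math.pi)


def dual_mode_amplitudes(beta_refl: float) -> tuple:
    """Return (reflection, refraction) amplitudes that sum to 1."""
    beta_refl = clamp(beta_refl, 0.0, 1.0)
    return beta_refl, 1.0 - beta_refl
```

Keeping one small table per parameter mirrors the stated motivation for sub-agents: each table stays small, whereas a joint table over all parameters would grow multiplicatively.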
(61) The time is denoted by t. At the time point t+1, the state $z_{n,t+1}^{UV}$ of the sub-agent of each parameter may be updated as follows.
(62) $z_{n,t+1}^{UV} = z_{n,t}^{UV} + \Delta z_{n,t}^{UV}$
(63) Equation (21) is a general formula. After time t is introduced into Equation (21), it becomes $z_{n,t}^{UV}$, which represents the state of the sub-agent of each parameter at different times. Equation (20) is a general formula. After time t is introduced into Equation (20), it becomes $\Delta z_{n,t}^{UV}$, which represents the varying step length of the state of the sub-agent of each parameter at different times.
(64) The time scale of deployment (in seconds) is much larger than that of wireless signals (in milliseconds). Thus, the average of W pieces of historical received data is used as the reward $R_t$, which may be expressed as
(65)
(66) wherein Equation (17) is the general formula of the overall system utility function; $U_{t-\tau}$ expresses historical received data and also represents the overall system utility function at the time point $t-\tau$; $\tau$ is the time difference between the corresponding time point and the time point t; W expresses the quantity of historical received data.
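The windowed reward described above can be sketched with a fixed-length buffer. This is a minimal sketch under stated assumptions: the class name `RewardWindow` is hypothetical, the window is assumed to hold the W most recent utility values including the current one, and no reward is produced until W values are available.

```python
from collections import deque
from typing import Deque, Optional


class RewardWindow:
    """Keeps the W most recent utility values and averages them as the reward."""

    def __init__(self, size: int) -> None:
        if size <= 0:
            raise ValueError("window size W must be positive")
        self._size = size
        # deque with maxlen drops the oldest value automatically.
        self._values: Deque[float] = deque(maxlen=size)

    def push(self, utility: float) -> None:
        """Record the utility observed at the current time point."""
        self._values.append(utility)

    def reward(self) -> Optional[float]:
        """Return the average of the last W utilities, or None if fewer exist."""
        if len(self._values) < self._size:
            return None
        return sum(self._values) / self._size
```

With W = 3 and throughput readings of 900, 600 and 300 Mbps, `reward()` returns `None` after the first two readings and 600.0 after the third.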
(67) To avoid being trapped in a local optimum, an epsilon-greedy algorithm is used to balance exploration and exploitation in the movement of each automatic signal deployer, and is expressed by
(68)
wherein $\epsilon$ is a pseudo-randomly generated number between 0 and 1, and $\epsilon_{th}$ is the exploration rate. After executing the action $A_{n,z,t}$ and acquiring the reward $R_t$, the automatic signal deployer 1 updates the Q table of each sub-agent. The Q table is denoted by $Q(S_{n,z,t}, A_{n,z,t})$, and its update equation is expressed by
(69)
wherein $\eta_1$ is the discount factor; $\eta_2$ is the learning rate; $N(A_{n,z,t})$ is the total number of executions of the action $A_{n,z,t}$; $R_t$ is the reward obtained from the base station 3.
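The exploration rule and table update can be illustrated with a standard tabular Q-learning sketch. This is not the update equation of the specification, which is not recoverable from the text and also involves the execution count of each action; the sketch substitutes the textbook Q-learning rule and omits that count. The function names, and the assumption that a random draw below the exploration rate triggers exploration, are hypothetical.

```python
import random
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def choose_action(
    q: QTable,
    state: Hashable,
    actions: Sequence[Hashable],
    exploration_rate: float,
    rng: random.Random,
) -> Hashable:
    """Epsilon-greedy choice: explore with probability exploration_rate (assumed)."""
    if rng.random() < exploration_rate:
        return rng.choice(list(actions))
    # Exploit: pick the action with the highest current Q value.
    return max(actions, key=lambda a: q[(state, a)])


def update_q(
    q: QTable,
    state: Hashable,
    action: Hashable,
    reward: float,
    next_state: Hashable,
    actions: Sequence[Hashable],
    discount: float,
    learning_rate: float,
) -> None:
    """Textbook Q-learning update (a stand-in for the patent's own equation)."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + discount * best_next
    q[(state, action)] += learning_rate * (td_target - q[(state, action)])
```

Each sub-agent would hold its own `defaultdict(float)` table over its own small action set, for example the height sub-agent with the actions "rise" and "descend".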
(70) The process that the automatic signal deployer performs deployment in experiment is described below. Refer to
(71) Next, the deployment boundaries are established. They are also called the measured area boundaries and are expressed by $z_{\min,n}^{UV}$ and $z_{\max,n}^{UV}$. The deployment should satisfy the following condition:
(72) $z_{\min,n}^{UV} \le z_n^{UV} \le z_{\max,n}^{UV}$
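A boundary check of this kind can be combined with the per-parameter state update, as in the sketch below. The function name `apply_step` is hypothetical, and rejecting an out-of-bounds step (so that the parameter keeps its previous value) is an assumption; clipping to the boundary would be an equally plausible reading.

```python
def apply_step(z: float, delta: float, z_min: float, z_max: float) -> float:
    """Return z + delta if it stays within [z_min, z_max]; otherwise keep z."""
    candidate = z + delta
    if z_min <= candidate <= z_max:
        return candidate
    # Step would leave the measured area: reject it (assumed behavior).
    return z
```

The same helper can be reused for every sub-agent parameter (x, y, height, azimuth, elevation) with its own pair of bounds.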
(73) It should be noted that a robot using a Lidar, radar, or infrared sensor may be used to establish the deployment boundaries by drawing the layout of the space X, which functions as the measured area boundaries. Next, the size W of the observation window of the sub-agents and the parameters of the reinforcement learning algorithm are established, including the discount factor $\eta_1$, the learning rate $\eta_2$, the Q table, and the exploration rate $\epsilon_{th}$, wherein the Q table is expressed by $Q(S_{n,z,t}, A_{n,z,t})$. After the data transmission process between the base station 3 and the user device 4 has been initialized, the automatic signal deployer 1 obtains the overall system utility function $U_t$ at the time point t (refer to Equation (17)), i.e. the performance of the new signal quality data. For example, the new signal quality data may be the signal throughput between the base station 3 and the user device 4, such as 900 Mbps. If the observation window of size W does not yet contain sufficient signal quality information, the process waits until it does. In other words, the multi-agent reinforcement learning algorithm is executed once $t > W$. W pieces of $U_t$ data are then retrieved to calculate the reward $R_t$ of the multi-agent reinforcement learning algorithm. In the present invention, t is not an absolute time (such as five past three). Only one determination, i.e. one iteration, is made at each time point. $t > W$ indicates that the time spent in iterations is greater than the size of the observation window. When the system is restarted, t is reset to zero (t=0). For example, suppose that one time unit is 3 seconds: t=1 indicates that 3 seconds have elapsed since t=0; t=2 indicates that 6 seconds have elapsed since t=0 and that 3 seconds have elapsed since t=1.
(74) Refer to
(75)
wherein $U_{th}$ is the lowest requirement of deployment. The whole process ends when the deployment requirement is satisfied or the maximum number of iterations is exceeded.
(76) In the experiment, the present invention is tested under the following conditions: non-line-of-sight, multiple reflections, blockage by human bodies, handheld behaviors, distance, and operating bandwidth range. In the non-line-of-sight condition, the throughput of the base station 3 may reach as high as 1 Gbps with a bandwidth of 100 MHz and a BS transmit power of 21 dBm. In comparison with the case without the automatic signal deployer 1, the present invention may raise the throughput by as much as 400 Mbps (2-3 times). The automatic signal deployer 1 does not need optical fiber, which reduces the deployment cost, realizes automatic deployment, and consumes less manpower. The automatic signal deployer 1 can learn from historical data and thus decrease the cost of re-measurement and off-line learning. The signal deployment system of the present invention spends less time in deployment (about 10 minutes for an area of 108 square meters). In comparison with the conventional technology, the present invention can reduce the deployment time by more than 1 hour.
(77) In other experiments, the automatic signal deployer 1 is moved to the assigned position from different start points in an indoor space. Refer to
(78) It should be noted that in the experiments of the present invention, the deployment agent 2 is connected with the user devices 4 through optical fiber. However, the user devices 4 may be wirelessly connected with the deployment agent 2 in practical application. The deployment agent 2 is disposed in the indoor space to receive the signal quality data of the user devices 4. In some embodiments, the deployment agent 2 is divided into a computing unit and a transmitting unit. The computing unit is connected with the base station 3, receives signal quality data from the base station 3, and generates behavior control signals through the multi-agent reinforcement learning algorithm. The computing unit sends the behavior control signals to the transmitting unit, and the transmitting unit transmits the behavior control signals to the automatic signal deployer 1. Such a measure has the following advantages: 1. The telecommunication industry may collect all the signal quality data so as to verify the signal quality of the indoor environments; 2. The transmitting unit does not need to perform computation, which reduces the volume of the apparatus and the power it consumes, while the power consumption of computation is borne on the telecommunication industry side, which can afford it.
(79) In conclusion, the present invention proposes an automatic signal deployer, a signal deployment system, an automatic signal path deployment method, and a behavior control signal generation method of a deployment agent. The present invention optimizes the performance indexes of one or more pieces of signal quality data, such as Throughput, RSSI, BER, PER, and PDR, to satisfy the requirement of communication systems. Considering a plurality of behaviors of the automatic signal deployer 1 and the constraint of the certainty of the behaviors, a multi-agent reinforcement learning algorithm is designed to search for an optimal solution in the action space, whereby one or more performance indexes are optimized. In the specification, the embodiment of optimizing the single performance index of throughput is used to exemplify the present invention. However, the present invention is not limited to that embodiment. In the present invention, one or more optimized performance indexes, or the top several acceptable performance indexes, may be used as the signal quality data, such as Throughput, RSSI, and BER. In a preferred embodiment, two acceptable ones of Throughput, RSSI, and BER are used as the signal quality data of the automatic signal deployer 1.
(80) In the present invention, the signal deployment system uses the plug-and-play automatic signal deployer 1 to evaluate the transmission performance of the base station 3 with the automatic signal deployer 1 disposed indoors. The present invention can give telecommunication operators, device manufacturers, and the telecommunication industry the following advantages: 1. The deployment cost of the present invention is lower than that of a small base station of the 5G communication system. At this lower cost, the signal coverage may be increased and a Gbps-scale throughput may be realized. 2. The present invention is compatible with the existing communication protocol and requires no additional modification. 3. The present invention can decrease the manpower requirement. 4. The algorithm of the present invention can eliminate unnecessary measurement points in different environments and makes deployment fast and correct. 5. The present invention supports fast plug-and-play, requiring neither high-performance computation resources nor costly training of deep-learning models. 6. In the present invention, some base stations may be turned off during idle time, and the automatic signal deployers may provide service for different spaces. In comparison with the conventional deployment method, the present invention can considerably reduce the number of base stations and thus save considerable cost. 7. The application of the automatic signal deployer may be extended to the fields of intelligent services, networks, etc., such as intelligent storage, private networks, WiFi deployment, and wireless resource management.
(81) The embodiments described above only exemplify the present invention and do not limit its scope. Embodiments involving equivalent replacements or variations easily made according to the technical contents disclosed in the specification or claims are also to be included within the scope of the present invention.