METHOD FOR PREDICTING DIOXIN EMISSION CONCENTRATION
20220092482 · 2022-03-24
CPC classification: G06N7/01 (Physics); G06N5/01 (Physics)
Abstract
A method for predicting dioxin (DXN) emission concentration based on hybrid integration of random forest (RF) and gradient boosting decision tree (GBDT). Random sampling of training samples and input features is performed on modeling data with a small sample size and high-dimensional characteristics to generate training subsets. J RF-based DXN sub-models are established from the training subsets. J×I GBDT-based DXN sub-models are established by performing I iterations on each of the RF-based DXN sub-models. Predicted outputs of the RF-based DXN sub-models and the GBDT-based DXN sub-models are combined by a simple average weighting method to obtain a final output.
Claims
1. A method for predicting dioxin (DXN) emission concentration, comprising: (S1) performing, by a training sample and input feature random sampling module, a random sampling with replacement N times on a training sample set {X ∈ R^(N×M), y ∈ R^(N×1)} and a random selection of a fixed number of input features from the training sample set to generate training subsets, wherein X = {x_n}_(n=1)^N ∈ R^(N×M) represents input data {x | x_1, . . . , x_m, . . . , x_M} consisting of process variables of a municipal solid waste incineration (MSWI) process acquired by a process control system while collecting a DXN test sample; the process variables comprise furnace temperature, activated carbon injection amount, stack emission gas concentration, grate speed, primary air flow and secondary air flow; N is the number of training samples; M is the number of process variables; and y = {y_n}_(n=1)^N ∈ R^(N×1) represents output data consisting of the DXN emission concentration at an end of the MSWI process, wherein the end of the MSWI process is a stack emission end, and the DXN emission concentration is obtained by online collection and offline analysis; (S2) establishing, by a random forest (RF)-based DXN sub-model establishing module, RF-based DXN sub-models {f_RF^j(·)}_(j=1)^J by utilizing the training subsets {X^j, y^j}_(j=1)^J; and subtracting predicted values {ŷ^j}_(j=1)^J of the DXN emission concentration from measured values {y^j}_(j=1)^J of the DXN emission concentration to obtain prediction errors {e^(j,0)}_(j=1)^J; (S3) performing, by a gradient boosting decision tree (GBDT)-based DXN sub-model establishing module, I iterations on each of new training subsets {X^j, e^(j,0)}_(j=1)^J to build I×J GBDT-based DXN sub-models {{f_GBDT^(j,i)(·)}_(i=1)^I}_(j=1)^J, wherein each new training subset is formed by the prediction error {e^(j,0)}_(j=1)^J as an output data true value and the input data of the training subset {X^j}_(j=1)^J; (S4) subjecting, by a simple average-based DXN integrated prediction module, the RF-based DXN sub-models {f_RF^j(·)}_(j=1)^J and the GBDT-based DXN sub-models {{f_GBDT^(j,i)(·)}_(i=1)^I}_(j=1)^J to simple averaging to establish a DXN emission concentration prediction model f_DXN(·); and (S5) taking the input data {x | x_1, . . . , x_m, . . . , x_M} as an input of the DXN emission concentration prediction model; and calculating, successively by the RF-based DXN sub-model establishing module, the GBDT-based DXN sub-model establishing module and the simple average-based DXN integrated prediction module, a current DXN emission concentration value as a DXN emission concentration predicted value of the MSWI process.
2. The method of claim 1, wherein the training sample and input feature random sampling module is operated through steps of: processing data of the process variables of the MSWI process by a Bootstrap method and a random subspace method (RSM); extracting the training subsets by the Bootstrap method, wherein the number of samples in each training subset is the same as the number of samples in the training sample set; and introducing the RSM to randomly select some features to generate J training subsets, each comprising N training samples and M^j input features, expressed as follows:
3. The method of claim 2, wherein the RF-based DXN sub-model establishing module is operated through the following steps with the jth training subset {(x^(j,M^
4. The method of claim 3, wherein the GBDT-based DXN sub-model establishing module is operated through steps of: establishing multiple weak learner models in series, wherein the input data of the training subset of the multiple weak learner models is unchanged; a true value of output data of a training subset of a first GBDT-based DXN sub-model is an error between the predicted output of the RF-based DXN sub-model and the measured value; and a true value of output data of a training subset of each other GBDT-based DXN sub-model is a prediction error of the GBDT-based DXN sub-model from the previous iteration; taking establishment of a jth GBDT-based DXN sub-model as an example, and supposing that I GBDT-based DXN sub-models are to be established by the CART: establishing a first GBDT-based DXN sub-model:
5. The method of claim 4, wherein the simple average-based DXN integrated prediction module is operated through steps of: indicating the J RF-based DXN sub-models established in parallel as {f_RF^j(·)}_(j=1)^J; and indicating the J×I GBDT-based DXN sub-models established in series and parallel simultaneously as {{f_GBDT^(j,i)(·)}_(i=1)^I}_(j=1)^J.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF EMBODIMENTS
Description of MSWI Process for DXN Generation
[0020] MSW is transported by vehicle to a weighbridge to be weighed, and is then discharged into a garbage pool. After being biologically fermented and dehydrated for 3-7 days, the MSW is transferred to a garbage hopper by a grab, fed to an incineration grate through a feeder, and subjected to drying, burning and incineration successively. Combustible components of the dried MSW are burned in the combustion air delivered by a primary air fan. The ash residue generated by burning falls from the end of the grate onto a slag conveyor, is transported to a slag pit, and is finally landfilled at a designated location. The temperature of the flue gas produced in the combustion process should be controlled above 850° C. in the first combustor to ensure complete decomposition and combustion of harmful gases. When the flue gas passes through the second combustor, air delivered by a secondary air fan generates turbulence, which ensures that the residence time of the flue gas exceeds 2 s, such that the harmful gases are further decomposed. The flue gas then enters a waste heat boiler and absorbs heat to generate high-temperature steam, which drives a turbo-generator set to generate electricity. Subsequently, the flue gas is mixed with lime and activated carbon and enters a deacidification reactor to undergo a neutralization reaction, allowing the DXN and heavy metals therein to be adsorbed. Flue gas particles, neutralization reactants and activated carbon are then removed in a bag filter. Part of the gas and ash mixture is mixed with water in a mixer and transported back into the deacidification reactor for repeated treatment. Fly ash produced in the deacidification reactor and the bag filter enters a fly ash tank and must be transported for further processing. The final flue gas, which contains soot, CO, NOx, SO2, HCl, HF, Hg, Cd, DXN and so on, is emitted to the atmosphere through a stack by an induced draft fan.
[0021] As shown in
[0022] As shown in
[0023] In
[0024] All sub-models of the DXN emission concentration prediction model based on hybrid integration (EnRFGBDT) herein are established by fully grown classification and regression trees (CART). The training subsets and input features of the RF-based DXN sub-models are generated by random sampling, where the number of features of each RF-based DXN sub-model is much smaller than that of the initial modeling data; therefore, the correlation between the CARTs is reduced, and robustness to outliers and noisy data is improved. Multiple GBDT-based DXN sub-models in series further improve the prediction precision of the CART. As a consequence, the DXN emission concentration prediction model with a "parallel + series" structure is established. The different modules operate as follows.
[0025] (1) Random sampling with replacement N times on the training sample set {X ∈ R^(N×M), y ∈ R^(N×1)} and random selection of a fixed number of input features from the training sample set are performed by the training sample and input feature random sampling module to generate the training subsets {X^j, y^j}_(j=1)^J.
[0026] (2) RF-based DXN sub-models {f_RF^j(·)}_(j=1)^J are established by the random forest (RF)-based DXN sub-model establishing module. The predicted values {ŷ^j}_(j=1)^J of the DXN emission concentration are subtracted from the measured values {y^j}_(j=1)^J of the DXN emission concentration to obtain the prediction errors {e^(j,0)}_(j=1)^J.
[0027] (3) I iterations are performed on each of the new training subsets {X^j, e^(j,0)}_(j=1)^J to build I×J GBDT-based DXN sub-models {{f_GBDT^(j,i)(·)}_(i=1)^I}_(j=1)^J by the GBDT-based DXN sub-model establishing module, where each new training subset is formed by the prediction error {e^(j,0)}_(j=1)^J as the output data true value and the input data of the training subset {X^j}_(j=1)^J.
[0028] (4) The RF-based DXN sub-models {f_RF^j(·)}_(j=1)^J and the GBDT-based sub-models {{f_GBDT^(j,i)(·)}_(i=1)^I}_(j=1)^J are subjected to simple averaging by the simple average-based DXN integrated prediction module to establish the DXN emission concentration prediction model f_DXN(·).
[0029] Accordingly, the steps of the modeling method herein are as follows.
[0030] (1) A random sampling with replacement and a random selection of a fixed number of input features are performed on the process variable of the MSWI process to generate J training subsets.
[0031] (2) J RF-based DXN sub-models {f_RF^j(·)}_(j=1)^J are established.
[0032] (3) I iterations are performed to build I×J GBDT-based DXN sub-models {{f_GBDT^(j,i)(·)}_(i=1)^I}_(j=1)^J, where the prediction error {e^(j,0)}_(j=1)^J of {f_RF^j(·)}_(j=1)^J is used as the output data true value.
[0033] (4) The RF-based DXN sub-model and the GBDT-based sub-model are subjected to simple averaging to establish the DXN emission concentration prediction model.
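The four modeling steps above can be sketched end-to-end. The sketch below is a minimal illustration of the "parallel + series" data flow only; it substitutes a trivial mean predictor for the fully grown CARTs the patent uses, and the function names, J, I, M_sub and the seed are illustrative assumptions, not the patent's.

```python
import numpy as np

def mean_learner(X, target):
    """Placeholder weak learner: always predicts the target mean.
    (The patent uses fully grown CARTs here.)"""
    c = target.mean()
    return lambda X_: np.full(len(X_), c)

def fit_en_rf_gbdt(X, y, J=3, I=2, M_sub=2, seed=0):
    """Hybrid 'parallel + series' ensemble: J bootstrapped/RSM subsets
    in parallel; per subset, one base model plus I residual models in series."""
    rng = np.random.default_rng(seed)
    N, M = X.shape
    subset_cols, rf_models, gbdt_models = [], [], []
    for _ in range(J):
        rows = rng.integers(0, N, size=N)                # step (1): Bootstrap
        cols = rng.choice(M, size=M_sub, replace=False)  # step (1): RSM
        Xj, yj = X[np.ix_(rows, cols)], y[rows]
        f_rf = mean_learner(Xj, yj)                      # step (2): base sub-model
        e = yj - f_rf(Xj)                                # e^{j,0}
        chain = []
        for _ in range(I):                               # step (3): series residual models
            f_g = mean_learner(Xj, e)
            chain.append(f_g)
            e = e - f_g(Xj)
        subset_cols.append(cols)
        rf_models.append(f_rf)
        gbdt_models.append(chain)

    def f_dxn(X_):                                       # step (4): simple average
        outs = []
        for cols, f_rf, chain in zip(subset_cols, rf_models, gbdt_models):
            Xs = X_[:, cols]
            outs.append(f_rf(Xs) + sum(f(Xs) for f in chain))
        return np.mean(outs, axis=0)
    return f_dxn
```

Replacing `mean_learner` with a real regression tree recovers the structure described in the modules below.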
Description of Work Process Of Training Sample and Input Feature Random Sampling Module
[0034] The process variable of the MSWI process is processed by a Bootstrap method and a random subspace method (RSM).
[0035] The training subsets are extracted by the Bootstrap method, where the number of samples in each training subset is the same as the number of samples in the training sample set.
[0036] Then the RSM is introduced to randomly select some features to generate J training subsets, each including N training samples and M^j input features.
[0037] Generation of the training subset is expressed as follows:
[0038] where {X^j, y^j} is the jth training subset; (x^(j,M^
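The Bootstrap-plus-RSM subset generation described above can be sketched as follows. This is a minimal numpy-based sketch; the function name and the seed argument are illustrative, not from the patent.

```python
import numpy as np

def sample_training_subsets(X, y, J, M_sub, seed=0):
    """Generate J training subsets: the Bootstrap draws N rows with
    replacement, and the RSM picks M_sub feature columns without
    replacement for each subset."""
    rng = np.random.default_rng(seed)
    N, M = X.shape
    subsets = []
    for _ in range(J):
        rows = rng.integers(0, N, size=N)                # Bootstrap: N draws with replacement
        cols = rng.choice(M, size=M_sub, replace=False)  # RSM: random feature subset
        subsets.append((X[np.ix_(rows, cols)], y[rows], cols))
    return subsets
```

Each returned triple carries the sampled inputs, targets and the chosen feature indices, so downstream sub-models know which columns to read at prediction time.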
Description of Work Process of RF-Based DXN Sub-Model Establishing Module
[0039] The RF-based DXN sub-model establishing module is operated through the following steps with the jth training subset {(x^(j,M^
[0040] Duplicate samples are removed from the jth training subset {(x^(j,M^
An mth input feature x^(j,m) is taken as the splitting variable, and a value x_(n_
[0041] The number of the optimal splitting variable and the value of the splitting point are found by traversing all input features based on the following criterion:

min_(m,s) [ min_(C_1) Σ_(x^(j,m) ∈ R_1(m,s)) (y_1^j − C_1)² + min_(C_2) Σ_(x^(j,m) ∈ R_2(m,s)) (y_2^j − C_2)² ]

[0042] where y_1^j and y_2^j are the measured values of DXN emission concentration of the jth training subset in R_1 and R_2, respectively; and C_1 and C_2 are the mean values of the measured DXN emission concentration in R_1 and R_2, respectively.
[0043] The above processes are repeated for R_1 and R_2, respectively, until the number of training samples in a leaf node is less than a preset threshold ∂_RF, splitting the input feature space into K areas. The K areas are marked as R_1, . . . , R_k, . . . , R_K, respectively, where K is the number of leaf nodes of the CART.
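The least-squares splitting search above can be written directly. A minimal sketch follows (exhaustive search over features and candidate thresholds; `best_split` is an illustrative name, not from the patent):

```python
import numpy as np

def best_split(X, y):
    """Return (feature m, threshold s, SSE) minimizing the summed squared
    error around the two half-region means C1 and C2."""
    best_m, best_s, best_sse = None, None, np.inf
    for m in range(X.shape[1]):
        for s in np.unique(X[:, m]):
            left = X[:, m] <= s
            if left.all() or not left.any():
                continue  # a valid split must leave both regions non-empty
            c1, c2 = y[left].mean(), y[~left].mean()
            sse = ((y[left] - c1) ** 2).sum() + ((y[~left] - c2) ** 2).sum()
            if sse < best_sse:
                best_m, best_s, best_sse = m, s, sse
    return best_m, best_s, best_sse
```

Applying this recursively to R_1 and R_2 until a leaf holds fewer than the threshold number of samples yields the K leaf regions described above.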
[0044] The RF-based DXN sub-model established by the CART is expressed as follows:

f_RF^j(x) = Σ_(k=1)^K C_k · I(x ∈ R_k), with C_k = (1 / N_(R_k)) Σ_(n=1)^(N_(R_k)) (y_(R_k)^j)_n

[0045] where N_(R_k) is the number of training samples in the area R_k, and (y_(R_k)^j)_n is the nth measured value of DXN emission concentration in R_k.
[0046] The prediction error of the RF-based DXN sub-model established on the jth training subset is calculated as follows:

(e^(j,0))_n = y_n^j − (ŷ_RF^j)_n, n = 1, . . . , N

[0047] where (e^(j,0))_n is the predicted error of DXN emission concentration for the nth training sample.
[0048] The above processes are repeated to obtain J RF-based DXN sub-models {f_RF^j(·)}_(j=1)^J established by the CART. The predicted outputs {ŷ_RF^j}_(j=1)^J of the J RF-based DXN sub-models are respectively subtracted from the measured values {y^j}_(j=1)^J to obtain the prediction errors {e^(j,0)}_(j=1)^J.
Description of Work Process of the GBDT-Based DXN Sub-Model Establishing Module
[0049] Multiple weak learner models are established in series, where the input data of the training subset of the multiple weak learner models is unchanged. The true value of the output data of the training subset of the first GBDT-based DXN sub-model is the error between the predicted output of the RF-based DXN sub-model and the measured value. The true value of the output data of the training subset of each other GBDT-based DXN sub-model is the prediction error of the GBDT-based DXN sub-model from the previous iteration.
[0050] Establishment of the jth GBDT-based DXN sub-model is taken as an example. Suppose I GBDT-based DXN sub-models are to be established by the CART.
[0051] The first GBDT-based DXN sub-model is established on the training subset {X^j, e^(j,0)}:

ŷ_GBDT^(j,1) = f_GBDT^(j,1)(X^j)

[0052] where ŷ_GBDT^(j,1) is the predicted output of the first GBDT-based DXN sub-model.
[0053] The loss function of the first GBDT-based DXN sub-model is defined as the square error:

L^(j,1) = Σ_(n=1)^N ((e^(j,0))_n − (ŷ_GBDT^(j,1))_n)²

[0054] where (ŷ_GBDT^(j,1))_n is the predicted value of the nth sample in the jth training subset.
[0055] The output residual e^(j,1) of the first GBDT-based DXN sub-model f_GBDT^(j,1)(·) is calculated as follows:

e^(j,1) = e^(j,0) − ŷ_GBDT^(j,1)
[0056] The e^(j,1) is taken as the true value of the output data of the training subset of the second GBDT-based DXN sub-model f_GBDT^(j,2)(·), which is expressed as follows:

ŷ_GBDT^(j,2) = f_GBDT^(j,2)(X^j)

[0057] where (e^(j,1))_n is the predicted error of the first GBDT-based DXN sub-model for the nth sample.
[0058] The above processes are repeated. The ith (i ≤ I) GBDT-based DXN sub-model is marked as f_GBDT^(j,i)(·), whose output residual is expressed as follows:

e^(j,i) = e^(j,i−1) − ŷ_GBDT^(j,i)
[0059] After I−1 iterations, the true value of the output data of the training subset of the Ith GBDT-based DXN sub-model is expressed as follows:

e^(j,I−1) = e^(j,I−2) − ŷ_GBDT^(j,I−1)

[0060] where ŷ_GBDT^(j,I−1) is the predicted output of the (I−1)th GBDT-based DXN sub-model.
[0061] The Ith GBDT-based DXN sub-model is expressed as follows:

ŷ_GBDT^(j,I) = f_GBDT^(j,I)(X^j)

[0062] where (e^(j,I−1))_n is the predicted error of the (I−1)th GBDT-based DXN sub-model for the nth sample.
[0063] As a consequence, the I GBDT-based DXN sub-models based on the jth training subset are expressed as {f_GBDT^(j,i)(·)}_(i=1)^I, and their outputs are expressed as {ŷ_GBDT^(j,i)}_(i=1)^I.
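The series of residual-fitting models above can be sketched generically. The sketch assumes any weak learner factory `fit_weak(X, target) -> predict` (the patent uses CARTs; the mean learner below is only a stand-in to exercise the chain, and both names are illustrative):

```python
import numpy as np

def fit_gbdt_chain(X, e0, I, fit_weak):
    """Fit I weak learners in series: model i is trained on the output
    residual of model i-1, starting from the RF residual e^{j,0}."""
    models, residual = [], e0.copy()
    for _ in range(I):
        predict = fit_weak(X, residual)
        models.append(predict)
        residual = residual - predict(X)  # e^{j,i} = e^{j,i-1} - yhat^{j,i}
    return models, residual

def mean_learner(X, target):
    """Trivial stand-in weak learner: always predicts the target mean."""
    c = target.mean()
    return lambda X_: np.full(len(X_), c)
```

The returned `residual` is the final e^(j,I), which shrinks toward zero as the weak learners capture more of the remaining error.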
Description of Work Process of Simple Average-Based DXN Integrated Prediction Module
[0064] The J RF-based DXN sub-models established in parallel are indicated as {f_RF^j(·)}_(j=1)^J. The J×I GBDT-based DXN sub-models, established in series and in parallel simultaneously, are indicated as {{f_GBDT^(j,i)(·)}_(i=1)^I}_(j=1)^J.
[0065] For the jth training subset, one RF-based DXN sub-model and I GBDT-based DXN sub-models are established in series. The sum of the predicted outputs of the RF-based DXN sub-model and the I GBDT-based DXN sub-models is taken as the total output of the jth training subset, which is expressed as follows:

ŷ^j = ŷ_RF^j + Σ_(i=1)^I ŷ_GBDT^(j,i)
[0066] Since the J training subsets are parallel, the RF-based DXN sub-models are combined with the GBDT-based DXN sub-models through the simple average weighting method, where the prediction model f_DXN(·) is expressed as follows:

f_DXN(x) = (1/J) Σ_(j=1)^J [ f_RF^j(x) + Σ_(i=1)^I f_GBDT^(j,i)(x) ]
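Given arrays of sub-model predicted outputs, the integration above reduces to a sum along the series dimension and a mean along the parallel dimension. A minimal sketch (array shapes and the function name are illustrative):

```python
import numpy as np

def integrate_outputs(y_rf, y_gbdt):
    """y_rf: (J, N) RF sub-model outputs; y_gbdt: (J, I, N) GBDT outputs.
    Series: each subset's total output is its RF prediction plus the
    I residual corrections. Parallel: simple average over the J subsets."""
    y_total = y_rf + y_gbdt.sum(axis=1)  # (J, N): yhat^j = yhat_RF^j + sum_i yhat_GBDT^{j,i}
    return y_total.mean(axis=0)          # f_DXN output, shape (N,)
```

Note that the J subsets are weighted equally; no learned combination weights are involved, which matches the simple average weighting described above.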
Description of Work Process of Prediction of DXN Emission Concentration Based on EnRFGBDT Method
[0067] The DXN emission concentration prediction model is established based on the training sample and input feature random sampling module, the RF-based DXN sub-model establishing module, the GBDT-based DXN sub-model establishing module and the simple average-based DXN integrated prediction module.
[0068] The process variables of the MSWI process, including furnace temperature, activated carbon injection amount, stack emission gas concentration, grate speed, primary air flow and secondary air flow, are taken as the input {x | x_1, . . . , x_m, . . . , x_M} of the DXN emission concentration prediction model. The input is processed successively by the RF-based DXN sub-model establishing module, the GBDT-based DXN sub-model establishing module and the simple average-based DXN integrated prediction module, and the current DXN emission concentration value is taken as the DXN emission concentration predicted value of the MSWI process.
Experimental Verification
Modeling Data
[0069] The modeling data herein are the inspection data of incinerator 1# and incinerator 2# of an MSWI power plant in Beijing over the past 6 years, including the process variables as the input data and the measured values of the DXN emission concentration as the output data. The process variables are obtained from the plant subsystems: 53 from the power generation system, 115 from the public electrical system, 14 from the waste heat boiler system, 79 from the incineration system, 20 from the flue gas treatment system and 6 from the terminal detection system. The DXN emission concentration is obtained by online collection and offline analysis, and its unit is ng/Nm³. Two thirds of the 67 samples (45 samples) are used as training data and one third (22 samples) as testing data.
Modeling Experiment
[0070] The square error is taken as the loss function in both the RF method and the GBDT method. The number of training samples is 45. The range of the number of input features is [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]. The range of the iteration times of the GBDT is [1, 2, 3, 4, 5, 6, 7, 8, 9]. The minimum number of training samples included in a leaf node of the CART is 3. The out-of-bag (OOB) data sampled by the Bootstrap algorithm is configured to perform model testing, with the root-mean-square error (RMSE) as the evaluation index.
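The OOB evaluation described above can be sketched as follows: samples never drawn by a subset's Bootstrap form its held-out set, and the RMSE is computed on them. Function names are illustrative, not from the patent.

```python
import numpy as np

def oob_mask(rows, N):
    """Out-of-bag samples: indices of the N training samples that were
    never drawn by the Bootstrap for this subset."""
    mask = np.ones(N, dtype=bool)
    mask[rows] = False
    return mask

def rmse(y, y_hat):
    """Root-mean-square error, the evaluation index used above."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))
```

Because each Bootstrap draw leaves roughly a third of the samples out of bag, this gives a model test set without splitting the already small training set further.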
[0071] For the DXN emission concentration prediction model based on RF, the relationship between the number of input features and the OOB error is shown in Table 1, where the number of CARTs is 5 (results are an average of 50 experiments).
TABLE 1. OOB error under different numbers of input features

  Number of CARTs   Number of input features   OOB error
  5                 5                          1.254442
  5                 10                         1.088729
  5                 15                         1.071183
  5                 17                         1.083559
  5                 20                         1.186283
  5                 25                         1.140331
  5                 30                         1.254986
[0072] As shown in Table 1, the OOB error is minimal in the case of 15 input features. The relationship between the number of CARTs in the DXN emission concentration prediction model based on RF and the OOB error is shown in Table 2, where the number of input features is fixed. The experimental result is an average of 50 experiments.
TABLE 2. OOB error under different numbers of CARTs

  Number of input features   Number of CARTs   OOB error
  15                         10                1.1185
  15                         20                1.0924
  15                         30                1.08139
  15                         40                1.0806
  15                         50                1.0972
  15                         60                1.1153
  15                         70                1.1128
  15                         80                1.1281
  15                         90                1.1248
  15                         100               1.1280
[0073] As shown in Table 2, the OOB error of the DXN emission concentration prediction model based on RF is minimal in the case of 40 CARTs, and this minimum is slightly higher than the minimum in Table 1. Therefore, both the number of CARTs and the number of input features require optimization to obtain better prediction performance.
[0074] For the DXN emission concentration prediction model based on GBDT, the relationship between the square-error loss function and the iteration time is shown in Table 3.
TABLE 3. Relationship between the square-error loss function and the iteration time of the DXN emission concentration prediction model based on GBDT

  Iteration time   Number of CARTs   Value of loss function
  0                1                 44.0000
  1                2                 3.6268
  2                3                 0.6641
  3                4                 0.1888
  4                5                 0.05339
  5                6                 0.02496
  6                7                 0.008141
  7                8                 0.003389
  8                9                 0.002034
  9                10                0.001011
[0075] As shown in Table 3, the value of the loss function decreases gradually as the iteration time increases, and the decrease of the square error slows once the iteration time reaches 5. Accordingly, an appropriate iteration time is necessary to reduce computing consumption.
[0076] Therefore, the preferred parameters herein are as follows: the number of input features is 10; the number of CARTs is 5; and the number of GBDT-based DXN sub-models (iteration times) is 5. Statistical results of the training set and testing set based on the different methods are shown in Table 4.
TABLE 4. Statistical results of DXN emission concentration prediction models constructed respectively based on RF, GBDT and a combination thereof (EnRFGBDT)

  Method      Training set RMSE   Test set RMSE
  RF          0.34060             0.03019
  GBDT        0.02355             0.03529
  EnRFGBDT    0.01478             0.02844
[0077] As shown in Table 4,
[0078] Aiming at the technical problem of real-time detection of DXN emission concentration based on the process variables of the MSWI process, a DXN emission concentration prediction model based on hybrid integration of RF and GBDT is provided. The prediction model herein has the following novelty: the first DXN sub-model is established based on RF, and the other multiple DXN sub-models are established based on GBDT, so that the dimension and the prediction error of the model are reduced simultaneously. Results of simulation experiments based on real data of the MSWI process indicate that the prediction method herein has outstanding prediction performance compared with prediction models based merely on RF or GBDT.