SYSTEM AND METHOD FOR DYNAMIC-MODULAR-NEURAL-NETWORK-BASED MUNICIPAL SOLID WASTE INCINERATION NITROGEN OXIDES EMISSION PREDICTION

Abstract

A dynamic modular neural network (DMNN) for predicting NOx emissions in a municipal solid waste incineration (MSWI) process is provided. First, the input variables are smoothed and normalized. Then, a feature extraction method based on principal component analysis (PCA) is used to dynamically divide the complex operating conditions, so that the prediction task is decomposed into sub-tasks under different conditions. For each sub-task, a long short-term memory (LSTM)-based sub-network is constructed to accurately predict NOx emissions under the corresponding working conditions. A cooperative strategy then integrates the outputs of the sub-networks, further improving the accuracy of the prediction model. Finally, the merits of the proposed DMNN are confirmed on a benchmark and on real industrial data from an MSWI process. The problem that the NOx emission of the MSWI process is difficult to predict accurately because of sensor limitations is thereby effectively addressed.

Claims

1. A method for dynamic-modular-neural-network (DMNN)-based municipal solid waste incineration (MSWI) process nitrogen oxides (NOx) emission prediction, comprising steps of: obtaining sensor data associated with an MSWI process, the sensor data comprising a data set comprising a plurality of samples; preprocessing the sensor data to remove those of the samples that comprise noise and standardizing the data set; decomposing a task of prediction of a NOx emission associated with an MSWI process into a plurality of sub-tasks using principal component analysis, comprising applying a sliding window of a fixed size to the preprocessed sensor data set and identifying key variables of operating conditions of the MSWI process by applying a sliding window to the preprocessed sensor data set, each of the key variables associated with one of the sub-tasks; constructing a long-short-term memory (LSTM) neural network, the LSTM neural network comprising a plurality of sub-networks, wherein each of the sub-networks outputs a value for one of the sub-tasks and a key variable associated with that sub-task serves as an input for that sub-network; obtaining a further set of sensor data associated with a further MSWI process, the further sensor data comprising further data samples; comparing at least one of the further samples to at least some of the samples in the preprocessed sensor data set; activating at least some of the sub-networks based on the comparison; and using the activated sub-networks in the LSTM network to predict the NOx emission for the further MSWI process, wherein the steps are performed by at least one suitably-programmed computer and wherein a plant associated with the further MSWI process is operated based on the NOx prediction for the further MSWI process.

2. A method according to claim 1, wherein the comparison comprises finding a similarity between the at least one of the further samples and the at least some of the samples in the preprocessed sensor data set.

3. A method according to claim 2, wherein the similarity is determined using Euclidean distance.

4. A method according to claim 1, wherein each sub-network comprises at least one cell that comprises an input, a forget, an output, and a cell state gate.

5. A method according to claim 1, wherein the sensor data is obtained using one or more sensors, the sensors comprising one or more of a thermocouple temperature sensor, an air volume sensor, a liquid flow sensor, a continuous emission monitoring system, a distributed control system and an upper computer.

6. A method according to claim 1, wherein the sensor data comprises air flow of combustion grate left side 1-1, air flow of dry grate left side 1, temperature of primary combustion chamber, left side temperature of primary combustion chamber, right side temperature of primary combustion chamber, cumulative primary air flow, cumulative secondary air flow, accumulated urea solution flow, accumulated urea solution supply flow and a NOx emission value associated with the MSWI process.

7. A method according to claim 1, wherein the window moves forward along the preprocessed sensor data set by a step and the key variables are determined successively.

8. A method according to claim 1, wherein the NOx emission for the further MSWI process is predicted in accordance with: $\hat{y}_{NOx} = \frac{\sum_{r=1}^{R} \hat{y}_{NOx}^{r}}{R},$ where $\hat{y}_{NOx}$ denotes the predicted value of the NOx emission, $\hat{y}_{NOx}^{r}$ is a sub-network output, r=1, 2, . . . , R, and R represents a number of the activated sub-networks.

9. A method according to claim 1, wherein the at least one suitably-programmed computer receives the further sensor data in real-time.

10. A method according to claim 1, wherein a denitration control system of the plant is controlled based on the NOx prediction for the further MSWI process.

11. A system for dynamic-modular-neural-network (DMNN)-based municipal solid waste incineration (MSWI) process nitrogen oxides (NOx) emission prediction, comprising: at least one computer configured to: obtain sensor data associated with an MSWI process, the sensor data comprising a data set comprising a plurality of samples; preprocess the sensor data to remove those of the samples that comprise noise and standardize the data set; decompose a task of prediction of a NOx emission associated with an MSWI process into a plurality of sub-tasks using principal component analysis, comprising applying a sliding window of a fixed size to the preprocessed sensor data set and identifying key variables of operating conditions of the MSWI process by applying a sliding window to the preprocessed sensor data set, each of the key variables associated with one of the sub-tasks; construct a long-short-term memory (LSTM) neural network, the LSTM neural network comprising a plurality of sub-networks, wherein each of the sub-networks outputs a value for one of the sub-tasks and a key variable associated with that sub-task serves as an input for that sub-network; obtain a further set of sensor data associated with a further MSWI process, the further sensor data comprising further data samples; compare at least one of the further samples to at least some of the samples in the preprocessed sensor data set; activate at least some of the sub-networks based on the comparison; and use the activated sub-networks in the LSTM network to predict the NOx emission for the further MSWI process, wherein a plant associated with the further MSWI process is operated based on the NOx prediction for the further MSWI process.

12. A system according to claim 11, wherein the comparison comprises finding a similarity between the at least one of the further samples and the at least some of the samples in the preprocessed sensor data set.

13. A system according to claim 12, wherein the similarity is determined using Euclidean distance.

14. A system according to claim 11, wherein each sub-network comprises at least one cell that comprises an input, a forget, an output, and a cell state gate.

15. A system according to claim 11, wherein the sensor data is obtained using one or more sensors, the sensors comprising one or more of a thermocouple temperature sensor, an air volume sensor, a liquid flow sensor, a continuous emission monitoring system, a distributed control system and an upper computer.

16. A system according to claim 11, wherein the sensor data comprises air flow of combustion grate left side 1-1, air flow of dry grate left side 1, temperature of primary combustion chamber, left side temperature of primary combustion chamber, right side temperature of primary combustion chamber, cumulative primary air flow, cumulative secondary air flow, accumulated urea solution flow, accumulated urea solution supply flow and a NOx emission value associated with the MSWI process.

17. A system according to claim 11, wherein the window moves forward along the preprocessed sensor data set by a step and the key variables are determined successively.

18. A system according to claim 11, wherein the NOx emission for the further MSWI process is predicted in accordance with: $\hat{y}_{NOx} = \frac{\sum_{r=1}^{R} \hat{y}_{NOx}^{r}}{R},$ where $\hat{y}_{NOx}$ denotes the predicted value of the NOx emission, $\hat{y}_{NOx}^{r}$ is a sub-network output, r=1, 2, . . . , R, and R represents a number of the activated sub-networks.

19. A system according to claim 11, wherein the at least one suitably-programmed computer receives the further sensor data in real-time.

20. A system according to claim 11, wherein a denitration control system of the plant is controlled based on the NOx prediction for the further MSWI process.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0071] FIG. 1 shows the hardware system of the MSWI process.

[0072] FIG. 2 is a flow chart of the MSWI process.

[0073] FIG. 3 shows the structure of an LSTM neural cell.

[0074] FIG. 4 shows the DMNN-based model for NOx emission prediction in the MSWI process.

[0075] FIG. 5 shows the distribution of variable importance for the debutanizer column.

[0076] FIG. 6 shows the training result of the DMNN for the debutanizer column.

[0077] FIG. 7 shows the testing result of the DMNN for the debutanizer column.

[0078] FIG. 8 shows the regression analysis for the debutanizer column.

[0079] FIG. 9 shows a comparison of the prediction errors of different algorithms for the debutanizer column.

[0080] FIGS. 10a-10e show the distributions of the prediction errors of different algorithms for the debutanizer column.

[0081] FIG. 11 shows the distribution of variable importance for the MSWI process.

[0082] FIG. 12 shows the training result of the DMNN for the MSWI process.

[0083] FIG. 13 shows the testing result of the DMNN for the MSWI process.

[0084] FIG. 14 shows the regression analysis for the MSWI process.

[0085] FIG. 15 shows a comparison of the prediction errors of different algorithms for the MSWI process.

[0086] FIGS. 16a-16e show the distributions of the prediction errors of different algorithms for the MSWI process.

DETAILED DESCRIPTION

[0087] The invention is first verified on the debutanizer column process to confirm the validity of the DMNN method, and is then applied to a real MSWI process to predict the NOx emission concentration.

C4 Prediction Based on the DMNN

[0088] The original dataset is composed of 2394 samples with 7 variables. Table 1 gives a detailed description of these variables. Considering the dynamic characteristics in the real process, a set of optimal variables were selected for C4 prediction, as shown in Eq. (42).

[00016] $\left[ u_{1}(t),\ u_{2}(t),\ u_{3}(t),\ u_{4}(t),\ u_{5}(t),\ u_{5}(t-1),\ u_{5}(t-2),\ u_{5}(t-3),\ \frac{u_{6}(t) + u_{7}(t)}{2},\ y(t-1),\ y(t-2),\ y(t-3),\ y(t-4) \right]^{T} \qquad (42)$

TABLE 1 Variable description on debutanizer column

Secondary variable   Description
u_1                  Top temperature
u_2                  Top pressure
u_3                  Flow of reflux
u_4                  Flow to the next process
u_5                  Temperature of the sixth tray
u_6                  Temperature A at bottom
u_7                  Temperature B at bottom

[0089] The dataset was divided into a training set and a testing set at a ratio of 7:3. The length and step of the sliding window are 600 and 300, respectively.
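As an illustration of the windowing described above, the following Python sketch enumerates the (start, stop) sample indices of fixed-length windows advanced by a constant step. The function name `sliding_windows` and the 1500-sample example are hypothetical. The text does not state how a trailing incomplete window is handled, so this sketch assumes it is dropped.

```python
from typing import List, Tuple


def sliding_windows(n_samples: int, length: int, step: int) -> List[Tuple[int, int]]:
    """Return (start, stop) index pairs of full windows of `length` samples.

    Assumption: a trailing window shorter than `length` is discarded.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [(start, start + length)
            for start in range(0, n_samples - length + 1, step)]


# Illustrative call with the window length (600) and step (300) quoted above.
windows = sliding_windows(n_samples=1500, length=600, step=300)
print(windows)  # [(0, 600), (300, 900), (600, 1200), (900, 1500)]
```

A smaller step produces more overlapping windows, which matches the later choice of a step of 100 for the more strongly fluctuating MSWI data.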

Dynamic Task Decomposition Based on PCA

[0090] To explore the dynamics in different windows, a total of five sub-tasks are obtained via the dynamic task decomposition based on PCA. The variable importance in each sub-task is shown in FIG. 5.

[0091] The results in FIG. 5 illustrate that the distribution of variable importance in each sub-task is different. It can be seen that the dynamic operation characteristics can be described by the variable importance due to the different distribution of samples in each window. To further visualize the importance of each variable, the variables are sorted in descending order according to the cumulative contribution rate, as shown in Table 2.

TABLE 2 Sorting results of variables in different windows

Window number   Order of variables (in descending order of contribution rate)
win_1           x_2  x_1  x_6  x_7  x_9  x_3  x_5  x_8  x_12  x_13  x_10  x_4   x_11
win_2           x_3  x_6  x_7  x_2  x_4  x_9  x_5  x_1  x_8   x_13  x_12  x_11  x_10
win_3           x_2  x_1  x_3  x_6  x_7  x_9  x_5  x_4  x_8   x_13  x_12  x_10  x_11
win_4           x_2  x_1  x_3  x_6  x_7  x_4  x_5  x_9  x_10  x_13  x_8   x_12  x_11
win_5           x_2  x_1  x_6  x_3  x_4  x_7  x_9  x_5  x_8   x_11  x_13  x_12  x_10

[0092] Table 2 shows that the ordering of the variables differs from window to window, which can be used to characterize different operating conditions. The cumulative contribution rate threshold is set to θ=0.85. The top-ranked variables whose cumulative contribution rate exceeds θ are then regarded as the key variables for each sub-task.

C4 Prediction Based on DMNN

[0093] For each sub-task, an LSTM-based sub-network driven by the key variables is established. The training and testing results of C4 prediction for the debutanizer column are shown in FIGS. 6 and 7, which demonstrate that the proposed DMNN has superior approximation capability.

[0094] To demonstrate the merits of the proposed method, the performance of DMNN is compared with those of RBF, LSSVM, DBN and LSTM methods, as shown in Table 3.

TABLE-US-00003 TABLE 3 Comparison of C4 prediction results on debutanizer column Training phase Testing phase Methods RMSE MAPE R.sup.2 RMSE MAPE R.sup.2 RBF 0.0202 0.0789 0.9816 0.1575 0.3982 0.2316 LSSVM 0.0154 0.0581 0.9893 0.0538 0.1479 0.9105 DBN 0.1655 1.8736 LSTM 0.0736 0.8566 DMNN 0.0057 0.0215 0.9986 0.0311 0.1343 0.9701

[0095] Compared with RBF, LSSVM and DBN, the LSTM neural network shows significant advantages in terms of lower RMSE and MAPE and higher R.sup.2. This illustrates that LSTM is better suited to complex tasks because of its memory properties. On this basis, the PCA-based dynamic task decomposition method further improves the prediction accuracy of C4. Compared with the other methods, DMNN shows an average improvement of 65.35% in RMSE, 68.48% in MAPE, and 39.91% in R.sup.2. In addition, the regression performance of the different methods plotted in FIG. 8 reveals the high approximation ability of the proposed DMNN.

[0096] For performance comparison, the prediction errors of each method are visualized in FIGS. 9 and 10a-10e. Compared with the other methods, the prediction errors of the DMNN are significantly closer to 0, which indicates the effectiveness of this method.

Prediction of NOx Emission in MSWI Process

[0097] The MSWI process is a complex dynamic system. Because NOx is one of its important pollutants, accurate prediction of the NOx emission is of great significance for ensuring the stable operation of an MSWI plant. The experiment was carried out on real industrial data. A total of 2215 samples were collected from the DCS with a sampling interval of 10 s. Of these, 1550 samples are used as the training set to construct the model, and the remaining samples are used to evaluate the proposed method. Based on the operating characteristics of the MSWI process, 10 variables that are highly related to NOx are used to establish the prediction model, as shown in Table 4.

TABLE 4 Variable description of NOx prediction model

Index  Variable                                         Range                    Unit
1      Air flow of combustion grate (left side 1-1)     4~13                     km³N/h
2      Air flow of combustion grate (right side 1-1)    5.5~10                   km³N/h
3      Air flow of dry grate (left side 1-1)            1~5                      km³N/h
4      Primary combustion chamber temperature           900~1040                 °C
5      Primary combustion chamber temperature (left)    870~1070                 °C
6      Primary combustion chamber temperature (right)   850~1050                 °C
7      Accumulation of primary air flow                 980~1383                 km³N
8      Accumulation of secondary air flow               55~95                    km³N
9      Accumulation of urea solution                    1370~1876                L
10     Accumulation of urea solvent supply              3.31×10^4 ~ 3.37×10^4    L

[0098] In this section, the length of sliding window is 600. Considering the frequent changes of MSWI process, the moving step of the window is 100.

Dynamic Task Decomposition Based on PCA

[0099] A total of 11 sub-tasks are obtained using the dynamic task decomposition method. The variable importance in each window is shown in FIG. 11.

[0100] FIG. 11 reveals that the distribution of variable importance differs between sub-tasks, which is closely related to the characteristics of the real MSWI process. Influenced by the feed quantity, the waste composition and the operating measures, the MSWI process is complex and fluctuating. Thus, the principal components have different contribution rates to each variable. According to the threshold θ=0.85, the dominant variables are determined for each sub-task, as shown in Table 5.

TABLE 5 Sorting results of variables in different windows

Window number   Order of variables (in descending order of contribution rate)
win_1           x_4  x_2  x_3  x_1  x_8  x_7   x_10  x_5   x_6  x_9
win_2           x_2  x_4  x_3  x_1  x_8  x_7   x_6   x_10  x_9  x_5
win_3           x_1  x_3  x_2  x_4  x_7  x_10  x_8   x_6   x_5  x_9
win_4           x_1  x_4  x_2  x_3  x_7  x_10  x_8   x_6   x_5  x_9
win_5           x_1  x_4  x_3  x_2  x_7  x_10  x_5   x_8   x_9  x_6
win_6           x_1  x_7  x_4  x_3  x_5  x_2   x_6   x_9   x_8  x_10
win_7           x_4  x_1  x_2  x_3  x_5  x_7   x_9   x_10  x_8  x_6
win_8           x_4  x_2  x_1  x_3  x_5  x_9   x_10  x_7   x_6  x_8
win_9           x_4  x_1  x_7  x_5  x_3  x_2   x_6   x_10  x_8  x_9
win_10          x_4  x_2  x_7  x_5  x_1  x_6   x_3   x_10  x_8  x_9
win_11          x_7  x_1  x_4  x_5  x_2  x_6   x_3   x_10  x_8  x_9

[0101] As can be seen from Table 5, the air flow of the combustion grate and the primary combustion chamber temperature play a key role in sub-tasks 1, 2 and 6-11, which indicates that oxygen and temperature have an important impact on NOx emission. In addition, for sub-tasks 3-5, the accumulation of urea solution is also an essential factor that cannot be ignored. From the analysis of the NOx generation and emission mechanism, the coupling relationship between these variables and NOx is different in each sub-task.

NOx Emission Prediction Based on the DMNN

[0102] In this section, a corresponding LSTM-based sub-network is developed for each sub-task. The training and testing results of NOx emission prediction based on DMNN are shown in FIGS. 12 and 13. The results demonstrate that the predicted values of DMNN are generally close to the real values. The testing results for the samples in the range 550~650 show a larger deviation, which can be explained by the violent and frequent fluctuation of the MSWI process. To further demonstrate the merits of the proposed method, the performance of DMNN is compared with that of the RBF, LSSVM, DBN and LSTM neural networks, as shown in Table 6.

TABLE 6 Comparison of NOx emission prediction results on MSWI process

            Training phase                 Testing phase
Methods     RMSE     MAPE     R^2          RMSE      MAPE     R^2
RBF         6.2630   4.0806   0.9696       12.3938   6.8560   0.7659
LSSVM       4.2520   2.5481   0.9860       10.4851   6.9292   0.8325
DBN         4.8278   2.9096   0.9819       8.0834    5.9306   0.9004
LSTM        4.0860   2.5801   0.9871       8.3332    5.0864   0.8942
DMNN        3.4603   2.0801   0.9890       7.3510    4.4921   0.9177

[0103] Table 6 presents the performance comparison of the various methods for NOx emission prediction, which further demonstrates the effectiveness of the proposed DMNN. The LSTM neural network again shows significant advantages in processing time series. In addition, the DMNN with the PCA-based dynamic task decomposition method further improves the prediction accuracy in both the training and testing phases. Compared with the other algorithms, the testing performance of the proposed method is improved by 23.25% (RMSE), 26.4% (MAPE), and 8.65% (R.sup.2) on average. FIG. 14 shows the regression performance of the various methods. Evidently, the prediction outputs of DMNN satisfactorily fit the desired outputs.
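The averaged improvements can be checked against the testing-phase columns of Table 6. The text does not say how the averages were formed, so the sketch below assumes they are the mean of the per-baseline relative changes with respect to each baseline value. The helper name `mean_improvement` is hypothetical. Under that assumption the MAPE and R.sup.2 averages come out at about 26.4% and 8.65%, while the RMSE average evaluates to roughly 22.9%.

```python
# Testing-phase (RMSE, MAPE, R^2) values copied from Table 6.
baselines = {
    "RBF":   (12.3938, 6.8560, 0.7659),
    "LSSVM": (10.4851, 6.9292, 0.8325),
    "DBN":   (8.0834, 5.9306, 0.9004),
    "LSTM":  (8.3332, 5.0864, 0.8942),
}
dmnn = (7.3510, 4.4921, 0.9177)


def mean_improvement(index: int, lower_is_better: bool) -> float:
    """Mean relative improvement of DMNN over the baselines, in percent.

    Assumption: each gain is measured relative to the baseline value.
    """
    gains = []
    for values in baselines.values():
        base, ours = values[index], dmnn[index]
        diff = (base - ours) if lower_is_better else (ours - base)
        gains.append(diff / base)
    return 100.0 * sum(gains) / len(gains)


rmse_gain = mean_improvement(0, lower_is_better=True)
mape_gain = mean_improvement(1, lower_is_better=True)
r2_gain = mean_improvement(2, lower_is_better=False)
print(f"RMSE {rmse_gain:.2f}%  MAPE {mape_gain:.2f}%  R2 {r2_gain:.2f}%")
```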

[0104] Accordingly, the prediction errors of the different methods in the testing phase are plotted in FIGS. 15 and 16a-16e, which clearly illustrate that most prediction errors of the proposed method are close to 0.

[0105] The reasonability and effectiveness of proposed DMNN were evaluated through an industrial benchmark, and it was then applied for NOx emission prediction in the MSWI process. The following advantages can be summarized based on the above analysis:

[0106] (1) A PCA-based dynamic task decomposition method: Different from traditional clustering methods, the proposed method is designed to detect the key variables in each sliding window. The original task with complex dynamics is then divided into several sub-tasks, thus reducing the complexity of the task to be processed.

[0107] (2) A DMNN-based prediction model for NOx emission: For each sub-task, an LSTM driven by the key variables is constructed. The nonlinear relationship between the key variables and the NOx value is then learned to guarantee the prediction accuracy. Table 3 and Table 6 show the performance indices of the various algorithms. The experimental results demonstrate the better generalization of DMNN in terms of RMSE, MAPE and R.sup.2 on both the training and testing sets.

[0108] The technical scheme and steps above can also be described as follows:

[0109] Step 1: Dynamic task decomposition based on PCA;

[0110] To detect the dynamic operating conditions, a sliding window of fixed size is used to decompose the complex task; the characteristics of the operating conditions can then be represented by the key variables in each sliding window;

[0111] The algorithm is described as follows:

[0112] A sliding window is used to detect the principal components in the time series; the size of the sliding window is denoted by win_1; assume that the observation sample matrix in the first sliding window is represented by $X_{m \times n_{1}}^{win\_1}$:

[00017] $X_{m \times n_{1}}^{win\_1} = \begin{bmatrix} x_{1} & x_{2} & \cdots & x_{m} \end{bmatrix}^{T} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1 n_{1}} \\ x_{21} & x_{22} & \cdots & x_{2 n_{1}} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m 1} & x_{m 2} & \cdots & x_{m n_{1}} \end{bmatrix} \qquad (1)$

[0113] where $m$ and $n_{1}$ are the numbers of variables and samples in the sliding window win_1, respectively; $x_{1}, x_{2}, \ldots, x_{m}$ represent the $m$ variables of the matrix, which are the inputs of the prediction model;

[0114] For the debutanizer column dataset, x.sub.1 x.sub.2 . . . x.sub.m denote a total of 13 variables: top temperature, top pressure, flow of reflux, flow to the next process, temperature of the sixth tray at time t, temperature of the sixth tray at t-1, temperature of the sixth tray at t-2, temperature of the sixth tray at t-3, average value of the temperature at bottom at t, and the butane concentration at t-1, t-2, t-3, and t-4, respectively; the size of m is 13 in this case;

[0115] For the MSWI process, x.sub.1 x.sub.2 . . . x.sub.m represent a total of 10 variables: air flow of combustion grate (left side 1-1), air flow of combustion grate (right side 1-1), air flow of dry grate (left side 1-1), primary combustion chamber temperature, primary combustion chamber temperature (left), primary combustion chamber temperature (right), accumulation of primary air flow, accumulation of secondary air flow, accumulation of urea solution, and accumulation of urea solvent supply, respectively; the size of m is 10 for the real industrial data;

[0116] The mean vector $\mu$ of the sample matrix $X_{m \times n_{1}}^{win\_1}$ is denoted as:

[00018] $\mu = [\bar{\mu}_{1}, \bar{\mu}_{2}, \ldots, \bar{\mu}_{m}]^{T} \qquad (2)$

$\bar{\mu}_{i} = \frac{1}{n_{1}} \sum_{j=1}^{n_{1}} x_{ij} \qquad (3)$

[0117] where $\bar{\mu}_{1}, \bar{\mu}_{2}, \ldots, \bar{\mu}_{m}$ represent the mean value of each row in $X_{m \times n_{1}}^{win\_1}$, so that the mean value of each variable is obtained by Eq. (3); $\bar{\mu}_{i}$ denotes the i-th value of $\mu$; $i = 1, 2, \ldots, m$, and $m$ is the number of variables; $x_{ij}$ denotes the value of the i-th variable in the j-th sample; $j = 1, 2, \ldots, n_{1}$, and $n_{1}$ represents the number of samples in the sliding window with the size of win_1;

[0118] The matrix $X_{m \times n_{1}}^{win\_1}$ with the mean subtracted from all samples (decentralized) is denoted as

[00019] $\tilde{X}_{m \times n_{1}}^{win\_1} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1 n_{1}} \\ x_{21} & x_{22} & \cdots & x_{2 n_{1}} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m 1} & x_{m 2} & \cdots & x_{m n_{1}} \end{bmatrix} - \begin{bmatrix} \bar{\mu}_{1} & \bar{\mu}_{1} & \cdots & \bar{\mu}_{1} \\ \bar{\mu}_{2} & \bar{\mu}_{2} & \cdots & \bar{\mu}_{2} \\ \vdots & \vdots & \ddots & \vdots \\ \bar{\mu}_{m} & \bar{\mu}_{m} & \cdots & \bar{\mu}_{m} \end{bmatrix} = \begin{bmatrix} \tilde{x}_{11} & \tilde{x}_{12} & \cdots & \tilde{x}_{1 n_{1}} \\ \tilde{x}_{21} & \tilde{x}_{22} & \cdots & \tilde{x}_{2 n_{1}} \\ \vdots & \vdots & \ddots & \vdots \\ \tilde{x}_{m 1} & \tilde{x}_{m 2} & \cdots & \tilde{x}_{m n_{1}} \end{bmatrix} = [\tilde{x}_{1}, \tilde{x}_{2}, \ldots, \tilde{x}_{m}]^{T} \qquad (4)$

[0119] where $\tilde{X}_{m \times n_{1}}^{win\_1}$ represents the matrix after decentralization, $\tilde{x}_{ij}$ denotes the value of the i-th feature after decentralization in the j-th sample, $m$ represents the number of variables, and $n_{1}$ is the number of samples contained in the sliding window with the size of win_1;

[0120] The covariance matrix $H_{m \times m}^{win\_1}$ of $\tilde{X}_{m \times n_{1}}^{win\_1}$ is calculated as:

[00020] $H_{m \times m}^{win\_1} = \frac{1}{n_{1} - 1} \tilde{X}_{m \times n_{1}}^{win\_1} \cdot \left( \tilde{X}_{m \times n_{1}}^{win\_1} \right)^{T} \qquad (5)$

[0121] where $\left( \tilde{X}_{m \times n_{1}}^{win\_1} \right)^{T}$ is the transpose of $\tilde{X}_{m \times n_{1}}^{win\_1}$;

[0122] Then, the eigenvalues $\lambda$ of the covariance matrix $H_{m \times m}^{win\_1}$ can be calculated from

[00021] $\left| H_{m \times m}^{win\_1} - \lambda I \right| = 0 \qquad (6)$

$I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \qquad (7)$

[0123] where $I$ denotes the unit matrix; based on Eq. (6), the eigenvalues of $H_{m \times m}^{win\_1}$ can be represented as

$\lambda_{1} \geq \lambda_{2} \geq \cdots \geq \lambda_{Q} \qquad (8)$

[0124] where $Q$ is the number of eigenvalues; according to Eq. (8), the eigenvector $\alpha_{k}$ corresponding to each eigenvalue is calculated from

$\left( H_{m \times m}^{win\_1} - \lambda_{k} I \right) \alpha_{k} = 0 \qquad (9)$

[0125] where $H_{m \times m}^{win\_1}$ is the covariance matrix; $\lambda_{k}$ denotes the k-th eigenvalue; $I$ is the unit matrix, which is represented by Eq. (7); $\alpha_{k}$ is the eigenvector corresponding to the k-th eigenvalue; $\alpha_{k} = [\alpha_{1k}, \alpha_{2k}, \ldots, \alpha_{mk}]^{T}$, $(k = 1, 2, \ldots, Q_{0})$;

[0126] The threshold of the cumulative variance contribution rate is set as $\eta$, and if the cumulative variance satisfies

[00022] $\sum_{k=1}^{Q_{0}} \lambda_{k} > \eta \qquad (10)$

[0127] then the first $Q_{0}$ principal components are selected for further analysis; $Q_{0}$ is the number of principal components, which is determined by Eq. (10); the number of retained eigenvalues is $Q_{0}$, which is equal to the number of principal components; $\lambda_{k}$ denotes the k-th eigenvalue; furthermore, the threshold $\eta$ is selected as 0.85;

[0128] Then, the unit eigenvectors $\alpha_{k}$ corresponding to the $Q_{0}$ eigenvalues are used as coefficients of a linear transformation to obtain the $Q_{0}$ principal components:

$z_{k} = \alpha_{k}^{T} x \qquad (11)$

[0129] where $\alpha_{k} = [\alpha_{1k}, \alpha_{2k}, \ldots, \alpha_{mk}]^{T}$ $(k = 1, 2, \ldots, Q_{0})$

[0130] Combining with the samples in $X_{m \times n_{1}}^{win\_1}$, the principal components of the $n_{1}$ samples can be obtained by Eq. (11); the k-th principal component $z_{kj}$ of the j-th sample $x_{j} = [x_{1j}, x_{2j}, \ldots, x_{mj}]^{T}$ $(j = 1, 2, \ldots, n_{1})$ is

[00023] $z_{kj} = [\alpha_{1k}, \alpha_{2k}, \ldots, \alpha_{mk}] \, [x_{1j}, x_{2j}, \ldots, x_{mj}]^{T} = \sum_{i=1}^{m} \alpha_{ik} x_{ij} \qquad (12)$

[0131] where $\alpha_{1k}, \alpha_{2k}, \ldots, \alpha_{mk}$ denote the $m$ values of the k-th unit eigenvector; $x_{1j}, x_{2j}, \ldots, x_{mj}$ represent the $m$ variables of the j-th sample, respectively; $j = 1, 2, \ldots, n_{1}$, $i = 1, 2, \ldots, m$, and $k = 1, 2, \ldots, Q_{0}$;
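The computation in Eqs. (1)-(12) can be sketched in Python with NumPy as follows. This is a minimal illustration, not the patented implementation: the function name `window_pca` and the synthetic data are hypothetical, and the sketch assumes that the cumulative sum in Eq. (10) is normalized by the sum of all eigenvalues so that it can be compared with the threshold 0.85.

```python
import numpy as np


def window_pca(X: np.ndarray, eta: float = 0.85):
    """PCA of one window; X has shape (m, n1) = (variables, samples)."""
    n1 = X.shape[1]
    mean = X.mean(axis=1, keepdims=True)         # Eqs. (2)-(3): row means
    X_tilde = X - mean                           # Eq. (4): centering
    H = X_tilde @ X_tilde.T / (n1 - 1)           # Eq. (5): covariance matrix
    eigvals, eigvecs = np.linalg.eigh(H)         # Eqs. (6), (9): H is symmetric
    order = np.argsort(eigvals)[::-1]            # Eq. (8): descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumption: Eq. (10) is applied to the normalized cumulative variance.
    ratio = np.cumsum(eigvals) / eigvals.sum()
    q0 = min(int(np.searchsorted(ratio, eta, side="right")) + 1, eigvals.size)
    Z = eigvecs[:, :q0].T @ X                    # Eqs. (11)-(12): z_kj
    return eigvals, eigvecs, q0, Z


# Synthetic window: 4 variables, 50 samples, with two correlated variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 50))
X[1] = 2.0 * X[0] + 0.1 * rng.normal(size=50)
eigvals, eigvecs, q0, Z = window_pca(X)
print(q0, Z.shape)
```

`np.linalg.eigh` is used because the covariance matrix is symmetric; it returns unit-norm eigenvectors, which matches the unit eigenvectors required by Eq. (11).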

[0132] According to Eq. (12), the k-th principal component over all $n_{1}$ samples can be denoted by $z_{k} = [z_{k1}, z_{k2}, \ldots, z_{k n_{1}}]$; a factor load is then defined as the correlation between the k-th principal component $z_{k}$ and the i-th feature $x_{i}$, which is calculated as

[00024] $\rho(z_{k}, x_{i}) = \frac{\sqrt{\lambda_{k}} \, \alpha_{ik}}{\sqrt{\sigma_{ii}}} \qquad (13)$

[0133] where $\alpha_{ik}$ denotes the i-th value of the unit eigenvector $\alpha_{k}$; $\sigma_{ii}$ is the variance of the i-th variable $x_{i}$, which is also the i-th diagonal entry of the covariance matrix $H_{m \times m}^{win\_1}$; $k = 1, 2, \ldots, Q_{0}$, $i = 1, 2, \ldots, m$;

[0134] The factor load matrix is expressed as

[00025] $P = \begin{bmatrix} \rho(z_{1}, x_{1}) & \rho(z_{1}, x_{2}) & \cdots & \rho(z_{1}, x_{m}) \\ \rho(z_{2}, x_{1}) & \rho(z_{2}, x_{2}) & \cdots & \rho(z_{2}, x_{m}) \\ \vdots & \vdots & \ddots & \vdots \\ \rho(z_{Q_{0}}, x_{1}) & \rho(z_{Q_{0}}, x_{2}) & \cdots & \rho(z_{Q_{0}}, x_{m}) \end{bmatrix} \qquad (14)$

[0135] Then, the contribution rate $\nu_{i}$ of the $Q_{0}$ principal components to the i-th variable $x_{i}$ $(i = 1, 2, \ldots, m)$ is

[00026] $\nu_{i} = \sum_{k=1}^{Q_{0}} \rho^{2}(z_{k}, x_{i}) \qquad (15)$

[0136] where the contribution rate $\nu_{i}$ is the sum of squares of the factor loads between the $Q_{0}$ principal components and the i-th variable $x_{i}$; the contribution rate matrix $\nu$ of the $Q_{0}$ principal components corresponding to each variable can then be expressed as

$\nu = [\nu_{1}, \nu_{2}, \ldots, \nu_{m}] \qquad (16)$

[0137] where $m$ represents the number of variables contained in $X_{m \times n_{1}}^{win\_1}$; the importance of the variables changes with the fluctuation of the complex operating conditions in the MSWI furnace, that is, the contribution rate $\nu_{i}$ of the principal components corresponding to each variable also changes; therefore, the contribution rate $\nu$ is reordered in descending order

$\mathrm{sort}(\nu) = [\nu_{\max}, \ldots, \nu_{\min}] \qquad (17)$

[0138] where the function $\mathrm{sort}(\cdot)$ sorts data in descending order; $\nu_{\max}$ and $\nu_{\min}$ represent the maximum and minimum values of the contribution rate, respectively; the key variables are determined by defining a threshold value $\theta$:

[00027] $\sum_{i=1}^{F} \nu_{i} > \theta \qquad (18)$

[0139] where the value of $\theta$ is set equal to the threshold of the cumulative variance contribution rate, that is, $\theta = 0.85$; $F$ denotes the number of key variables, which is determined by $\theta$; Eq. (18) indicates that the first $F$ variables have the greatest correlation with the principal components in the current window; the first $F$ key features are then selected as the reference vector for condition identification, as shown in Eq. (19);

$con\_1 = [x_{num\_1}^{win\_1}, x_{num\_2}^{win\_1}, \ldots, x_{num\_F}^{win\_1}] \qquad (19)$

[0140] where $x_{num\_1}^{win\_1}, x_{num\_2}^{win\_1}, \ldots, x_{num\_F}^{win\_1}$ represent the first $F$ variables in $X_{m \times n_{1}}^{win\_1}$; thereafter, the window moves forward by a certain step, and the key variables are detected successively; finally, the key variables in each sub-task are stored in the knowledge base for modeling analysis, which is expressed as

$condition\_library = [con\_1, con\_2, \ldots, con\_W] \qquad (20)$

[0141] where con_1, con_2, . . . , con_W represent the reference vectors corresponding to the different operating conditions, respectively; W denotes the number of operating conditions;
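The key-variable selection of Eqs. (13)-(19) for a single window can be sketched as below. The function name `key_variables` and the synthetic data are hypothetical. Two assumptions are made that the text does not spell out: the cumulative sums in Eqs. (10) and (18) are each normalized by their total before being compared with 0.85, and every variable has non-zero variance in the window.

```python
import numpy as np


def key_variables(X: np.ndarray, eta: float = 0.85, theta: float = 0.85):
    """Rank the variables of one window; X has shape (m, n1)."""
    X_tilde = X - X.mean(axis=1, keepdims=True)
    H = X_tilde @ X_tilde.T / (X.shape[1] - 1)           # covariance, Eq. (5)
    lam, alpha = np.linalg.eigh(H)
    order = np.argsort(lam)[::-1]
    lam, alpha = np.clip(lam[order], 0.0, None), alpha[:, order]
    ratio = np.cumsum(lam) / lam.sum()                   # assumed normalization
    q0 = min(int(np.searchsorted(ratio, eta, side="right")) + 1, lam.size)

    sigma = np.diag(H)                                   # variances sigma_ii
    # Eq. (13): rho[k, i] = sqrt(lambda_k) * alpha_ik / sqrt(sigma_ii)
    rho = np.sqrt(lam[:q0])[:, None] * alpha[:, :q0].T / np.sqrt(sigma)[None, :]
    nu = (rho ** 2).sum(axis=0)                          # Eq. (15)
    ranked = np.argsort(nu)[::-1]                        # Eq. (17)
    cum = np.cumsum(nu[ranked]) / nu.sum()               # assumed normalization
    F = min(int(np.searchsorted(cum, theta, side="right")) + 1, nu.size)
    return ranked[:F], nu                                # Eq. (19): key indices


# Synthetic window with 5 variables and 200 samples.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 200))
X[2] = X[0] - X[1] + 0.05 * rng.normal(size=200)
keys, nu = key_variables(X)
print(keys, np.round(nu, 3))
```

Repeating this call for each window position and collecting the returned index sets would give a structure analogous to the condition library of Eq. (20).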

[0142] The size and moving step of the sliding window are selected according to the specific data set; the simulation phase covers a debutanizer column process and real industrial data from an MSWI process; for the debutanizer column process, the sliding window size is 600; because this dataset fluctuates slowly, the moving step of the sliding window is set to 300; for the MSWI process, the size of the sliding window is 600; because of the complex variation and large fluctuation of the process, the moving step of the sliding window is set to 100;

[0143] Step 2: Construction of the LSTM-based sub-network;

[0144] For each sub-task, an LSTM neural network is built, driven by the corresponding key variables. An LSTM cell comprises forget, input, cell state and output gates, each of which is calculated as follows:

[0145] Forget gate:

$\begin{matrix} f_{t} = \sigma(W_{f} \cdot [h_{t-1}, x_{t}] + b_{f}) & (21) \end{matrix}$

[0146] Input gate:

$\begin{matrix} i_{t} = \sigma(W_{i} \cdot [h_{t-1}, x_{t}] + b_{i}) & (22) \end{matrix}$

[0147] Cell state gate:

$\begin{matrix} \tilde{C}_{t} = \tanh(W_{c} \cdot [h_{t-1}, x_{t}] + b_{c}) & (23) \end{matrix}$

$\begin{matrix} C_{t} = f_{t} \odot C_{t-1} + i_{t} \odot \tilde{C}_{t} & (24) \end{matrix}$

[0148] Output gate:

$\begin{matrix} o_{t} = \sigma(W_{o} \cdot [h_{t-1}, x_{t}] + b_{o}) & (25) \end{matrix}$

[0149] Using Eqs. (21)-(25), the final output of LSTM is

$\begin{matrix} \hat{y}_{NOx}^{t} = o_{t} \odot \tanh(C_{t}) & (26) \end{matrix}$ [0150] where $x_{t}$ denotes the input of the LSTM neural network at time t, which comprises the following variables at time t: air flow of combustion grate (left side 1-1), air flow of combustion grate (right side 1-1), air flow of dry grate (left side 1-1), primary combustion chamber temperature, primary combustion chamber temperature (left), primary combustion chamber temperature (right), accumulation of primary air flow, accumulation of secondary air flow, accumulation of urea solution, and accumulation of urea solvent supply; $h_{t-1}$ is the output of the LSTM neural network at time t-1; $W_{f}$, $W_{i}$, $W_{c}$ and $W_{o}$ denote the weight matrices of the forget, input, cell state and output gates, respectively; $b_{f}$, $b_{i}$, $b_{c}$ and $b_{o}$ are the biases of the forget, input, cell state and output gates, respectively; $f_{t}$, $i_{t}$, $C_{t}$ and $o_{t}$ represent the outputs of the forget, input, cell state and output gates, respectively; $\hat{y}_{NOx}^{t}$ is the output of the LSTM neural network at time t; $\sigma(\cdot)$ and $\tanh(\cdot)$ are the activation functions, which are calculated as

[00028] $\begin{matrix} \sigma(U) = \frac{1}{1 + e^{- U}} & (27) \end{matrix}$ $\begin{matrix} \tanh(U) = \frac{e^{U} - e^{- U}}{e^{U} + e^{- U}} & (28) \end{matrix}$ [0151] where U denotes the input of the activation function in each gate, as shown in Eqs. (29)-(32):

[0152] Forget gate:

$\begin{matrix} U_{f} = W_{f} \cdot [h_{t-1}, x_{t}] + b_{f} & (29) \end{matrix}$

[0153] Input gate:

$\begin{matrix} U_{i} = W_{i} \cdot [h_{t-1}, x_{t}] + b_{i} & (30) \end{matrix}$

[0154] Cell state gate:

$\begin{matrix} U_{c} = W_{c} \cdot [h_{t-1}, x_{t}] + b_{c} & (31) \end{matrix}$

[0155] Output gate:

$\begin{matrix} U_{o} = W_{o} \cdot [h_{t-1}, x_{t}] + b_{o} & (32) \end{matrix}$
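
For illustration, one LSTM time step following Eqs. (21)-(26) can be sketched in NumPy as below. The function name `lstm_step`, the parameter dictionary, the hidden size of 4 and the random weights are assumptions made for this example; the disclosure does not prescribe an implementation, and a trained sub-network would use learned weights.

```python
import numpy as np

def sigmoid(u):
    """Logistic activation, Eq. (27)."""
    return 1.0 / (1.0 + np.exp(-u))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: returns the output and cell state at time t."""
    z = np.concatenate([h_prev, x_t])                      # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])       # Eq. (21)
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])       # Eq. (22)
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])   # Eq. (23)
    c_t = f_t * c_prev + i_t * c_tilde                     # Eq. (24), element-wise
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])       # Eq. (25)
    h_t = o_t * np.tanh(c_t)                               # Eq. (26)
    return h_t, c_t

# Hypothetical sizes: 10 input variables, 4 hidden units, random weights.
rng = np.random.default_rng(0)
n_in, n_hidden = 10, 4
params = {}
for gate in "fico":
    params[f"W_{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in))
    params[f"b_{gate}"] = np.zeros(n_hidden)

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):  # five hypothetical time steps
    h, c = lstm_step(x_t, h, c, params)
print(h)
```

Because the output gate lies in (0, 1) and tanh lies in (-1, 1), every component of the output stays strictly between -1 and 1, which is one reason the inputs and targets are standardized beforehand.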

[0156] Step 3: Cooperation decision strategy;

[0157] During the testing stage, the similarity between the g-th testing sample and the training samples is measured by the Euclidean distance:

$\begin{matrix} d_{g,j}^{test} = \operatorname{dist}(x_{g}^{test}, x_{j}^{train}), \quad (j = 1, 2, \ldots, N) & (33) \end{matrix}$

$\begin{matrix} \operatorname{dist}(x_{g}^{test}, x_{j}^{train}) = \sqrt{\lVert x_{g}^{test\_1} - x_{j}^{train\_1} \rVert^{2} + \cdots + \lVert x_{g}^{test\_m} - x_{j}^{train\_m} \rVert^{2}} & (34) \end{matrix}$

$\begin{matrix} d_{g}^{test} = [d_{g,1}^{test}, d_{g,2}^{test}, \ldots, d_{g,N}^{test}] & (35) \end{matrix}$ [0158] where $x_{g}^{test}$ is the g-th sample of the testing set; $x_{g}^{test\_1}$ and $x_{g}^{test\_m}$ denote the first and m-th variables of the g-th testing sample, respectively; similarly, $x_{j}^{train\_1}$ and $x_{j}^{train\_m}$ denote the first and m-th variables of the j-th training sample, respectively; $d_{g,1}^{test}, d_{g,2}^{test}, \ldots, d_{g,N}^{test}$ represent the Euclidean distances between the g-th sample of the testing set and each sample of the training set; g=1, 2, . . . , G and j=1, 2, . . . , N, where N and G denote the number of samples in the training and testing sets, respectively. According to Eq. (35), the training sample $x_{j}^{train}$ that is closest to the testing sample $x_{g}^{test}$ is selected. The operating condition of $x_{g}^{test}$ is then taken to be that of $x_{j}^{train}$.
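
The nearest-sample lookup of Eqs. (33)-(35) can be sketched in Python as follows. The function name `nearest_condition`, the toy data, and the use of a single integer condition label per training sample are assumptions made for illustration; with overlapping windows a training sample may belong to more than one condition.

```python
import numpy as np

def nearest_condition(x_test, X_train, train_conditions):
    """Return the condition label of the closest training sample and all distances."""
    # Eqs. (33)-(34): Euclidean distance to every training sample.
    d = np.sqrt(((X_train - x_test) ** 2).sum(axis=1))
    # Eq. (35) collects the distances; the smallest one selects the sample.
    j = int(np.argmin(d))
    return train_conditions[j], d

# Hypothetical standardized training samples (N=3, m=2) and their conditions.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
train_conditions = np.array([0, 0, 1])

cond, d = nearest_condition(np.array([3.0, 3.5]), X_train, train_conditions)
print(cond, d)
```

Here the test sample lies closest to the third training sample, so it inherits that sample's condition label and the matching sub-network would be activated.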

[0159] Finally, a cooperation decision strategy is adopted to generate the prediction output of the MNN during the testing phase:

[00029] $\begin{matrix} \hat{y}_{NOx} = \frac{\sum_{r = 1}^{R} \hat{y}_{NOx}^{r}}{R} & (36) \end{matrix}$ [0160] where $\hat{y}_{NOx}$ denotes the predicted value of the NOx emission, and $\hat{y}_{NOx}^{r}$ is the output of the r-th activated sub-network; r=1, 2, . . . , R, where R represents the number of activated sub-networks.
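
Eq. (36) is a plain average over the activated sub-networks, which the following Python sketch illustrates. The function name `cooperative_output` and the three example outputs are hypothetical values chosen only to show the arithmetic.

```python
import numpy as np

def cooperative_output(sub_outputs):
    """Average the outputs of the R activated sub-networks, Eq. (36)."""
    sub_outputs = np.asarray(sub_outputs, dtype=float)
    if sub_outputs.size == 0:
        raise ValueError("at least one sub-network must be activated")
    return float(sub_outputs.mean())

# Hypothetical outputs of R = 3 activated sub-networks.
y_hat = cooperative_output([100.0, 110.0, 120.0])
print(y_hat)
```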

[0161] Step 4: DMNN-based prediction model for NOx emission;

[0162] The DMNN-based NOx emission prediction model for the MSWI process mainly includes four parts: data preprocessing, PCA-based dynamic task decomposition, construction of the sub-networks, and the cooperation decision strategy. As shown in FIG. 3, the original dataset is represented by $X^{ori} \in R^{L \times m}$, where L denotes the number of samples and m is the number of variables. First, the original data are preprocessed via smoothing and normalization, and then represented by $X^{pre} = \{x_{1}^{i}, x_{2}^{i}, \ldots, x_{m}^{i}, y_{NOx}^{i}\}_{i=1}^{N}$. Second, to implement dynamic task decomposition, a sliding window is applied to the training set to determine the key variables, and a corresponding sub-task is formed in each window. Then, an LSTM-based sub-network is established for each sub-task, with its key variables as inputs. During the testing phase, the sub-networks are activated according to the similarity between the testing and training samples, which is measured via the Euclidean distance. The cooperation decision strategy is then used to integrate the activated sub-networks and generate the final NOx prediction.

[0163] In the MSWI process, the sensors usually operate in a hot, dusty environment, which introduces noise into the original data. To reduce the effect of this noise on data analysis, the Rajda (3σ) criterion is used to smooth the original data, as shown in Eq. (37):

$\begin{matrix} |x^{ori} - \mu^{ori}| \geq 3\sigma^{ori} & (37) \end{matrix}$ [0164] where $x^{ori}$ denotes an original sample, and $\mu^{ori}$ and $\sigma^{ori}$ denote the mean and standard deviation of the variables, respectively. Samples satisfying Eq. (37) are regarded as outliers and removed from the original data. The dataset after smoothing is expressed as $X^{smo} \in R^{N \times m}$, where N and m denote the number of samples and variables, respectively.

[0165] The Z-score method is used to standardize the dataset, as calculated in Eq. (38):

[00030] $\begin{matrix} x_{i} = \frac{x_{i}^{smo} - \mu_{i}^{smo}}{\sigma_{i}^{smo}} & (38) \end{matrix}$ [0166] where $x_{i}$, $\mu_{i}^{smo}$, and $\sigma_{i}^{smo}$ (i=1, 2, . . . , m) are the normalized vector, mean and standard deviation of the i-th variable, respectively. The normalized dataset is represented by $X_{N \times m}^{T}$, where N and m denote the number of samples and variables, respectively.
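
The two preprocessing steps of Eqs. (37) and (38) can be sketched in Python as below. The function names, the synthetic data, the use of the population standard deviation, and the guard for constant columns are assumptions for this example; the disclosure does not state these details.

```python
import numpy as np

def remove_outliers_3sigma(X):
    """Drop every sample (row) with at least one variable satisfying Eq. (37)."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)  # assumption: population standard deviation
    # Constant columns (sigma == 0) are ignored so they do not flag every row.
    outlier = ((np.abs(X - mu) >= 3.0 * sigma) & (sigma > 0)).any(axis=1)
    return X[~outlier]

def zscore(X):
    """Standardize each variable to zero mean and unit variance, Eq. (38)."""
    # Assumption: no column is constant after outlier removal.
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Synthetic data: 200 samples, 3 variables, one injected spike.
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(200, 3))
X_raw[0, 0] = 50.0

X_smo = remove_outliers_3sigma(X_raw)
X_norm = zscore(X_smo)
print(X_raw.shape, X_smo.shape)
```

In this sketch the mean and standard deviation are recomputed after outlier removal, matching the superscript smo in Eq. (38).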

[0167] The proposed DMNN-based NOx emission prediction framework for MSWI process (as shown in FIG. 4) is described as follows:

Training Phase

[0168] 1) Preprocess the original data ori_data=[$X^{ori}$ $Y^{ori}$] using Eqs. (37) and (38); the resulting dataset is expressed as dataset=[X Y].

[0169] 2) Set a sliding window with a fixed length of win; the subset contained in the window is $X^{win\_1}$. The key features of $X^{win\_1}$ are constructed by Eqs. (1)-(20). Thereafter, the window moves forward by a set step, and the key variables are detected successively. Finally, the key variables of each sub-task are stored in the knowledge base for modeling analysis.

[0170] 3) For each sub-task, apply an LSTM to establish the sub-network driven by the corresponding key variables; the number of hidden neurons is optimized by trial and error.

[0171] 4) Move the sliding window step by step and repeat steps 2) and 3).

Testing Phase

[0172] 5) Calculate the similarity between the test sample and the training samples via Eqs. (33)-(35), and generate the outputs of the MNN by activating the corresponding sub-networks.

[0173] 6) Obtain the final NOx emission prediction by integrating the outputs of the activated sub-networks with the cooperation decision strategy of Eq. (36).

[0174] While the invention has been particularly shown and described with reference to the embodiments thereof, those skilled in the art will understand that the foregoing and other changes in form and detail may be made therein without departing from the spirit and scope of the invention.


CPC classification

F23G5/50 (MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING)

F23G2207/30 (MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING)

G06N3/08 (PHYSICS)

F23G2207/60 (MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING)

G06N3/045 (PHYSICS)

International classification

G06N3/045 (PHYSICS)

F23G5/50 (MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING)
