Method of Fault Monitoring of Sewage Treatment Process Based on OICA and RNN Fusion Model

Abstract

The invention relates to an intelligent fault monitoring method based on a recurrent neural network enhanced with high-order information, used for real-time fault monitoring of the sewage treatment process. The method comprises two phases: offline modeling and online monitoring. In the offline phase, OICA extracts high-dimensional, high-order information features from the original data, which handles the non-Gaussian character of the data and removes the correlation between variables. The extracted features are then used to train a DRNN. In the online phase, new data are mapped directly to high-order feature components and classified by the DRNN trained offline. If the DRNN reports no fault, the data are passed to an OICA-based monitoring model for unsupervised monitoring. If this model also detects no fault, the process is judged to be fault-free. Otherwise, a process fault is declared, and the fault data are added to the training data of the network for retraining, so that the monitoring accuracy of the DRNN improves continuously.

Claims

1. A method of fault monitoring of a sewage treatment process based on an OICA and RNN fusion model, comprising an offline modeling phase and an online monitoring phase, the specific steps being as follows:

A. offline modeling stage:

1) collect historical data X of the sewage treatment process, the historical data X being composed of normal data of the sewage treatment process obtained from offline tests; the data include N sampling times, and J process variables are collected at each sampling time to form a data matrix $X = [x_1, x_2, \ldots, x_N]^T \in \mathbb{R}^{N \times J}$, wherein $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,J})$ and $x_{i,j}$ represents the measured value of the jth variable at the ith sampling time;

2) the historical data X is then standardized, wherein the standardization formula of the jth variable at the ith sampling time is as follows:

$\overline{x}_{i,j} = \frac{x_{i,j} - \mathrm{Mean}(j)}{\mathrm{Std}(j)}$

wherein i = 1, 2, ..., N and j = 1, 2, ..., J; the standardized data of step 2) is arranged into a two-dimensional matrix, as shown in the following formula:

$\overline{X} = \begin{bmatrix} \overline{x}_{1,1} & \cdots & \overline{x}_{1,J} \\ \vdots & \ddots & \vdots \\ \overline{x}_{N,1} & \cdots & \overline{x}_{N,J} \end{bmatrix}$

3) X is mapped to a high-order feature matrix S using the OICA algorithm, the specific steps being as follows: an unmixing matrix W is calculated by OICA, and the original data X is then mapped into the high-order feature matrix S using W according to the following formula:

$S = W^T X^T$

furthermore, a residual E is obtained based on S according to the following formula:

$E = X - WS$

4) the statistic $I^2$ of the independent component space and the statistic SPE of the residual space are calculated based on S and E respectively, as follows:

$I^2 = S^T S$

$SPE = E^T E$

a kernel density estimation algorithm is used to obtain the estimated values $I^2_{limit}$ and $SPE_{limit}$ of the statistics $I^2$ and SPE at a preset confidence limit, and these values are taken as the control limits of the subsequent fault monitoring using OICA;

5) a label Y, namely normal or fault, is then set up for the historical data X;

6) the high-order feature matrix S obtained in step 3) and the label data Y obtained in step 5) are put into a deep recurrent neural network (DRNN) for supervised training; the parameters and structure of the neurons after supervised training of the DRNN are saved;

B. online monitoring stage:

1) new data collected during online monitoring is preprocessed as in offline step 2), and processed new data $X_{new}$ is obtained;

2) new high-order feature data $S_{new}$ is obtained from the new data $X_{new}$ through the offline unmixing matrix W:

$S_{new} = W^T X_{new}^T$

3) $S_{new}$ is put into the DRNN trained in the offline stage to judge whether there is a fault; a fault index greater than 0.5 indicates a fault, and a fault index less than 0.5 indicates normal operation;

4) when the monitoring result obtained by the DRNN is normal, secondary monitoring is carried out: firstly, the residual $E_{new}$ of the data $X_{new}$ is calculated, as shown in the following formula:

$E_{new} = X_{new} - W S_{new}$

wherein W is the unmixing matrix determined in offline step 3);

5) the monitoring statistics $I_k^2$ and $SPE_k$ of the current sampling time k are calculated, as shown in the following formulas:

$I_k^2 = S_{new}^T S_{new}$

$SPE_k = E_{new}^T E_{new}$

6) the monitoring statistics $I_k^2$ and $SPE_k$ obtained from the above steps are compared with the control limits $I^2_{limit}$ and $SPE_{limit}$ obtained in offline step 4); if either of the two statistics exceeds its limit, it is considered that there is a fault and an alarm is given; otherwise, the process is considered normal;

7) the fault data is given a fault label according to offline step 5) and is added to the training database of the DRNN; the DRNN is trained again using the updated training data so that it learns the new fault information and monitors accurately.

2. The method of fault monitoring of a sewage treatment process based on an OICA and RNN fusion model according to claim 1, wherein the loss function of the deep recurrent neural network (DRNN) is a cross-entropy loss function.

Description

DESCRIPTION OF DRAWINGS

[0025] FIG. 1 is the overall flow chart of the algorithm in the invention;

[0026] FIG. 2 shows the monitoring chart of the sludge bulking fault in sunny weather;

[0027] FIG. 3 shows the monitoring chart of the toxicity impact fault in sunny weather;

[0028] FIG. 4 shows the monitoring chart of the sludge bulking fault in rainy weather;

[0029] FIG. 5 shows the monitoring chart of the toxicity impact fault in rainy weather;

[0030] FIG. 6 is the logic chart of the hardware system on which the method relies;

[0031] FIG. 7 is the schematic chart of the network structure proposed in the method of the present invention.

EXEMPLARY EMBODIMENT

[0032] In order to solve the above problems, a method of fault monitoring of the sewage treatment process based on an OICA and RNN fusion model is proposed, which runs on online monitoring equipment. The equipment includes an input module, an information processing module, a console module and an output visualization module. The proposed method is loaded into the information processing module, the network monitoring model is established using process data retained from the actual industrial process, and the established model is saved for online fault monitoring. During online monitoring of the industrial process, the real-time process variables collected by the plant sensors are first fed to the input module as the input information of the monitoring equipment. The trained model is then selected through the console to perform monitoring, and the monitoring results are displayed in real time through the visualization module, so that on-site staff can take timely measures according to the displayed results and reduce the economic loss caused by process faults.

[0033] The sewage treatment process is extremely complex and includes not only various physical and chemical reactions but also biochemical reactions. In addition, uncertain factors such as changes in influent flow rate, water quality and load pose great challenges to establishing a sewage treatment monitoring model. The invention uses "Benchmark Simulation Model 1" (BSM1), developed by IWA, as the sewage treatment process for real-time simulation. The model consists of five reaction tanks (5999 m³) and one secondary sedimentation tank (6000 m³). In addition, there are three aeration tanks. The aeration tank has 10 layers, is 4 meters deep and covers an area of 1500 m². The reaction process includes internal backflow and external backflow. The average sewage treatment flow rate is 20,000 m³/d and the COD is 300 mg/L. The effluent quality index of the sewage model is shown in Table 1. The fault setting model in the invention simulates two kinds of faults based on the BSM1 model: the sludge bulking fault and the toxicity impact fault.

TABLE 1 The effluent quality index of the sewage

| Variable | Unit |
| --- | --- |
| Effluent flow rate | m³·d⁻¹ |
| The concentration of SI in the effluent | g COD·m⁻³ |
| The concentration of SS in the effluent | g COD·m⁻³ |
| The concentration of XI in the effluent | g COD·m⁻³ |
| The concentration of XS in the effluent | g COD·m⁻³ |
| The concentration of XBH in the effluent | g COD·m⁻³ |
| The concentration of XBA in the effluent | g COD·m⁻³ |
| The concentration of XP in the effluent | g COD·m⁻³ |
| The concentration of SO in the effluent | g (−COD)·m⁻³ |
| The concentration of SNO in the effluent | g N·m⁻³ |
| The concentration of SNH in the effluent | g N·m⁻³ |
| The concentration of SND in the effluent | g N·m⁻³ |
| The concentration of XND in the effluent | g N·m⁻³ |
| The concentration of SALK in the effluent | mol HCO₃⁻·m⁻³ |
| The concentration of TSS in the effluent | g SS·m⁻³ |
| The concentration of Kjeldahl N in the effluent | g N·m⁻³ |

[0034] The application process of the invention in the BSM1 simulation platform is described as follows:

A. Offline Modeling Stage:

[0035] Step 1: The invention simulates the sludge bulking fault and the toxicity impact fault in the sewage treatment process to verify the algorithm. Fourteen days of data are collected from the BSM1 model for each of two weather conditions, normal weather and rainstorm, at a sampling interval of 15 minutes, giving a total of 1344 sampling points for each weather condition. In the experiment, several batches of sludge bulking data with different fault degrees, together with normal data, were used for offline training, and a new single batch of sludge bulking fault data was used for testing. The training and test data for the simulated toxicity impact fault were organized in the same way as those for the sludge bulking fault.

[0036] Step 2: The offline data of the sewage treatment process under normal working conditions is processed. It includes N sampling times collected from multiple batches of data and 16 process variables, which form a data matrix $X = [x_1, x_2, \ldots, x_N]^T \in \mathbb{R}^{N \times 16}$. For each sampling time, $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,J})$ with J = 16, where $x_{i,j}$ represents the measured value of the jth variable at the ith sampling time;

[0037] Step 3: The historical data X is then standardized. The standardization formula of the jth variable at the ith sampling time is as follows:

[00003] $\overline{x}_{i,j} = \frac{x_{i,j} - \mathrm{Mean}(j)}{\mathrm{Std}(j)}$

[0038] Here i = 1, 2, ..., N and j = 1, 2, ..., J, and Mean(j) and Std(j) are the mean and standard deviation of the jth variable. The standardized data is arranged into a two-dimensional matrix, as shown in the following formula:

[00004] $\overline{X} = \begin{bmatrix} \overline{x}_{1,1} & \cdots & \overline{x}_{1,J} \\ \vdots & \ddots & \vdots \\ \overline{x}_{N,1} & \cdots & \overline{x}_{N,J} \end{bmatrix}$
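The standardization of Step 3 can be sketched in a few lines of plain Python. This is a minimal illustration, not the patented implementation: the function names `fit_standardizer` and `standardize` and the toy values are hypothetical, and the population standard deviation is assumed because the text does not say which estimator Std(j) denotes.

```python
import statistics

def fit_standardizer(X):
    """Return per-variable means and standard deviations of the rows of X."""
    columns = list(zip(*X))  # one tuple per process variable
    means = [statistics.mean(col) for col in columns]
    # Assumption: population standard deviation; the source does not specify.
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds

def standardize(X, means, stds):
    """Apply (x_ij - Mean(j)) / Std(j) to every entry of X."""
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in X
    ]

# Toy data (made up): N = 4 sampling times, J = 2 process variables.
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
means, stds = fit_standardizer(X)
X_bar = standardize(X, means, stds)

# Online samples are scaled with the means and deviations fitted offline.
x_new_bar = standardize([[2.5, 25.0]], means, stds)
```

A variable with zero variance would cause a division by zero here; a practical implementation would need to guard against that case.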

[0039] Step 4: Using the OICA algorithm, X is mapped to a high-order feature matrix S. The mapped high-order features reflect the non-Gaussian character of the data and provide more fault information. The specific steps are as follows: the unmixing matrix W is calculated by OICA, and the original data X is then mapped into the high-order feature matrix S using W according to the following formula:

$S = W^T X^T$

[0040] Furthermore, the residual E is obtained based on S according to the following formula:

$E = X - WS$
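The two formulas of Step 4 can be illustrated for a single sample. The sketch below assumes that each sample x is treated as a column vector of J values and that W is stored as J rows of d entries, so that s = Wᵀx and e = x − Ws are dimensionally consistent; the source does not state these shapes. The matrix `W` used here is a hand-picked placeholder, not the output of OICA, and the helper names are hypothetical.

```python
def mat_T_vec(W, x):
    """Compute W^T x for W stored as J rows of d entries."""
    J, d = len(W), len(W[0])
    return [sum(W[j][k] * x[j] for j in range(J)) for k in range(d)]

def mat_vec(W, s):
    """Compute W s."""
    return [sum(w * v for w, v in zip(row, s)) for row in W]

def map_sample(W, x):
    """Return the feature vector s = W^T x and the residual e = x - W s."""
    s = mat_T_vec(W, x)
    reconstruction = mat_vec(W, s)
    e = [xi - ri for xi, ri in zip(x, reconstruction)]
    return s, e

# Placeholder unmixing matrix (J = 3, d = 2), chosen so the arithmetic
# is easy to check by hand. A real W would come from the OICA algorithm.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0]]
x = [0.5, -1.0, 2.0]  # one standardized sample (made-up values)

s, e = map_sample(W, x)
# s holds the first two coordinates of x; e holds what W cannot reconstruct.
```

With this placeholder W, the third variable is not represented in s at all, so it appears in full in the residual e, which is the part of the sample that the SPE statistic later measures.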

[0041] Step 5: The statistic $I^2$ of the independent component space and the statistic SPE of the residual space are calculated based on S and E respectively, as follows:

$I^2 = S^T S$

$SPE = E^T E$

[0042] The kernel density estimation algorithm is used to obtain the estimated values $I^2_{limit}$ and $SPE_{limit}$ of the statistics $I^2$ and SPE at the preset confidence limit. These values serve as the control limits of the subsequent fault monitoring using OICA.
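One way to obtain a control limit from kernel density estimation is sketched below. It assumes a Gaussian kernel, a normal-reference rule-of-thumb bandwidth and a confidence level of 0.99, none of which is specified in the text, and it finds the limit as the point where the estimated cumulative distribution reaches the confidence level. The function names and the randomly generated features are hypothetical.

```python
import math
import random
import statistics

def squared_norm(v):
    """Return v^T v, used for both I^2 = s^T s and SPE = e^T e."""
    return sum(c * c for c in v)

def bandwidth(samples):
    """Normal-reference rule of thumb (an assumed bandwidth choice)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)

def kde_cdf(x, samples, h):
    """Cumulative distribution of a Gaussian kernel density estimate at x."""
    return sum(
        0.5 * (1.0 + math.erf((x - xi) / (h * math.sqrt(2.0))))
        for xi in samples
    ) / len(samples)

def kde_control_limit(samples, confidence=0.99):
    """Value at which the KDE cumulative probability reaches `confidence`."""
    h = bandwidth(samples)
    lo = min(samples) - 10.0 * h  # CDF is essentially 0 here
    hi = max(samples) + 10.0 * h  # CDF is essentially 1 here
    for _ in range(100):          # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < confidence:
            lo = mid
        else:
            hi = mid
    return hi

# Made-up offline features: 200 samples with 3 components each.
rng = random.Random(0)
S_offline = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
I2_offline = [squared_norm(s) for s in S_offline]
I2_limit = kde_control_limit(I2_offline, confidence=0.99)
```

The same function would be applied to the offline SPE values to obtain the second limit. Because both statistics are non-negative, a Gaussian kernel places a little probability mass below zero; this mainly affects the lower tail and is ignored in this sketch.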

[0043] Step 6: A label Y is then set up for the historical data X. According to the fault type corresponding to X at each time, the normal sewage treatment process is labeled 0 and the fault process is labeled 1, so that a network output greater than 0.5 corresponds to a fault.

[0044] Step 7: The high-order feature matrix S obtained in step 4 and the label data Y obtained in step 6 are put into the deep recurrent neural network (DRNN) for supervised training. The input of the network is the high-order feature information S obtained by OICA, and the corresponding target is the fault classification label Y. The parameters and structure of the neurons after supervised training of the DRNN are saved. The network structure and hyper-parameters of the DRNN are shown in the table below.

TABLE 1 The neural network structure and its hyper-parameters of DRNN

| Hyper-parameter | Value |
| --- | --- |
| Iterations | 100 |
| Number of hidden layers | 3 |
| Number of neurons in each hidden layer | 30-20-10 |
| Learning rate | 0.01 |
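To make the hyper-parameters concrete, the following forward-only sketch stacks three simple recurrent layers of 30, 20 and 10 neurons and ends in a single sigmoid output. Only the layer sizes, iteration count and learning rate come from the table, and the cross-entropy loss comes from claim 2. The Elman-style cell, the tanh activation, the sigmoid output, the random initialization and all names are assumptions made for illustration. Training is omitted, and labels of 1 are taken to mean fault, in line with the rule that an output above 0.5 indicates a fault.

```python
import math
import random

# Values from the hyper-parameter table; everything else below is assumed.
CONFIG = {"iterations": 100, "hidden_sizes": [30, 20, 10], "learning_rate": 0.01}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def init_drnn(n_features, hidden_sizes, seed=0):
    """Randomly initialise stacked recurrent layers plus one output unit."""
    rng = random.Random(seed)
    layers, n_in = [], n_features
    for n_hidden in hidden_sizes:
        scale = 1.0 / math.sqrt(n_in + n_hidden)
        layers.append({
            "W_in": [[rng.uniform(-scale, scale) for _ in range(n_in)]
                     for _ in range(n_hidden)],
            "W_rec": [[rng.uniform(-scale, scale) for _ in range(n_hidden)]
                      for _ in range(n_hidden)],
            "b": [0.0] * n_hidden,
        })
        n_in = n_hidden
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_in)]
    return {"layers": layers, "w_out": w_out, "b_out": 0.0}

def forward(model, sequence):
    """Return one fault index in (0, 1) for each time step of `sequence`."""
    states = [[0.0] * len(layer["b"]) for layer in model["layers"]]
    outputs = []
    for s_t in sequence:
        layer_input = s_t
        for i, layer in enumerate(model["layers"]):
            # New hidden state from the current input and the previous state.
            states[i] = [
                math.tanh(dot(w_in, layer_input) + dot(w_rec, states[i]) + b)
                for w_in, w_rec, b in zip(layer["W_in"], layer["W_rec"], layer["b"])
            ]
            layer_input = states[i]
        logit = dot(model["w_out"], layer_input) + model["b_out"]
        outputs.append(1.0 / (1.0 + math.exp(-logit)))
    return outputs

def cross_entropy(labels, outputs):
    """Mean binary cross-entropy between 0/1 labels and outputs in (0, 1)."""
    return -sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for y, p in zip(labels, outputs)
    ) / len(labels)

# Made-up feature sequence: 5 time steps, 4 high-order features each.
rng = random.Random(1)
S_seq = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(5)]
labels = [0, 0, 0, 1, 1]  # assumed convention: 1 = fault

model = init_drnn(n_features=4, hidden_sizes=CONFIG["hidden_sizes"])
y_seq = forward(model, S_seq)
loss = cross_entropy(labels, y_seq)
```

In a full implementation the weights would be updated for the listed number of iterations at the listed learning rate by backpropagation through time, typically with a deep learning library rather than hand-written loops.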

B. Online Monitoring Stage:

[0045] Step 8: New data collected during online monitoring is preprocessed as in offline step 3, and the processed new data $X_{new}$ is obtained.

[0046] Step 9: New high-order feature data $S_{new}$ is obtained from the new data $X_{new}$ through the offline unmixing matrix W:

$S_{new} = W^T X_{new}^T$

[0047] Step 10: $S_{new}$ is fed as the network input into the DRNN with the parameters trained in the offline stage. The network produces an output y, which is the index used to judge whether there is a fault. When y is greater than 0.5, there is a fault; when y is less than 0.5, there is no fault at the present time.

[0048] Step 11: The DRNN classifies known faults well under supervision, but its monitoring performance may decrease when a fault occurs that does not exist in the training database of the DRNN. The invention therefore adds an unsupervised algorithm based on OICA to monitor such faults and thereby calibrate the monitoring results obtained by the DRNN. When the monitoring result obtained by the DRNN is normal, secondary monitoring is carried out. The specific steps are as follows: firstly, the residual $E_{new}$ of the new data $X_{new}$ is obtained through the high-order statistical information $S_{new}$, as shown in the following formula:

$E_{new} = X_{new} - W S_{new}$

[0049] Here W is the unmixing matrix determined in step 4;

[0050] Step 12: The monitoring statistics $I_k^2$ and $SPE_k$ of the current sampling time k are calculated, as shown in the following formulas:

$I_k^2 = S_{new}^T S_{new}$

$SPE_k = E_{new}^T E_{new}$

[0051] Step 13: The monitoring statistics $I_k^2$ and $SPE_k$ obtained from the above steps are compared with the control limits $I^2_{limit}$ and $SPE_{limit}$ obtained in step 5. If either of the two statistics exceeds its limit, it is considered that there is a fault and an alarm is given; otherwise, the process is considered normal;
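The online decision logic of Steps 10 to 13 can be summarized as a two-stage check. The sketch below is illustrative only: the function `monitor_sample`, its return format and the numeric values are hypothetical, and an output of exactly 0.5, which the text leaves undefined, is assumed to be passed on to the secondary check.

```python
def squared_norm(v):
    """Return v^T v."""
    return sum(c * c for c in v)

def monitor_sample(y, s_new, e_new, i2_limit, spe_limit):
    """Two-stage decision for one sampling time k.

    y         -- DRNN output (fault index)
    s_new     -- high-order features of the new sample
    e_new     -- residual of the new sample
    i2_limit  -- offline control limit for I^2
    spe_limit -- offline control limit for SPE
    """
    # Stage 1: supervised decision by the DRNN.
    if y > 0.5:
        return {"fault": True, "source": "DRNN"}
    # Stage 2: unsupervised check, run only when the DRNN reports normal.
    i2_k = squared_norm(s_new)
    spe_k = squared_norm(e_new)
    if i2_k > i2_limit or spe_k > spe_limit:
        return {"fault": True, "source": "OICA", "I2": i2_k, "SPE": spe_k}
    return {"fault": False, "source": None, "I2": i2_k, "SPE": spe_k}

# Made-up limits and samples for illustration.
I2_LIMIT, SPE_LIMIT = 9.0, 4.0

known_fault = monitor_sample(0.93, [0.1, 0.2], [0.0, 0.1], I2_LIMIT, SPE_LIMIT)
novel_fault = monitor_sample(0.12, [3.0, 2.0], [0.5, 0.5], I2_LIMIT, SPE_LIMIT)
normal = monitor_sample(0.08, [1.0, 1.0], [0.5, 0.5], I2_LIMIT, SPE_LIMIT)
```

Samples flagged with source "OICA" are the ones the DRNN missed, so they are the natural candidates to be labeled as faults and added to the DRNN training database.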

[0052] Step 14: The fault data is given a fault label according to offline step 6 and is added to the training database of the DRNN. The DRNN is then trained again on the updated data, and this continuous iterative training keeps the DRNN learning new fault information.

[0053] The above are the specific application steps of the fault monitoring of the sewage treatment process on the BSM1 sewage simulation platform. In order to verify the effectiveness of the method, two kinds of faults, sludge bulking and toxicity impact, are set up in both sunny and rainy weather to test the monitoring accuracy of the invention under different weather conditions. FIGS. 2-5 are the monitoring charts of the sludge bulking fault and the toxicity impact fault in sunny and rainy weather respectively, and a value of 1 in the discrete classification output in the charts represents the occurrence of a fault. Table 2 shows the fault time, the alarm time, and the numbers of false alarms and missed alarms for each fault. It can be seen from FIGS. 2-5 and Table 2 that the method of the invention effectively detects the faults and has a low missed alarm rate and a low false alarm rate. In addition, the method also performs well in complex conditions such as rainy weather, indicating that the invention is robust.

TABLE 2 The monitoring performance of the invention under different conditions

| Type of fault | Fault time | Alarm time | Number of false alarms | Number of missed alarms |
| --- | --- | --- | --- | --- |
| Bulking fault of sludge in sunny days | 672-864 | 672 | 0 | 1 |
| Toxicity impact fault in sunny days | 672-864 | 672 | 3 | 1 |
| Bulking fault of sludge in rainy days | 672-864 | 672 | 1 | 2 |
| Toxicity impact fault in rainy days | 672-864 | 672 | 0 | 1 |
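The table reports alarm counts rather than rates. A hedged sketch of how such counts could be converted into rates follows. It assumes that the fault interval 672-864 is inclusive, that each test batch contains the 1344 sampling points mentioned in Step 1, and that the false alarm rate is taken over normal samples and the missed alarm rate over fault samples. None of these conventions is stated in the text, and the function name is hypothetical.

```python
def alarm_rates(n_total, fault_start, fault_end, n_false, n_missed):
    """Convert alarm counts into (false alarm rate, missed alarm rate)."""
    n_fault = fault_end - fault_start + 1  # assumption: inclusive interval
    n_normal = n_total - n_fault
    false_alarm_rate = n_false / n_normal  # false alarms among normal samples
    missed_alarm_rate = n_missed / n_fault  # missed alarms among fault samples
    return false_alarm_rate, missed_alarm_rate

# Counts from the row "Toxicity impact fault in sunny days".
far, mar = alarm_rates(n_total=1344, fault_start=672, fault_end=864,
                       n_false=3, n_missed=1)
print(f"false alarm rate: {far:.4%}, missed alarm rate: {mar:.4%}")
```

Under different conventions, for example an exclusive end point or a shorter test batch, the resulting rates would differ slightly, so the figures produced here are indicative only.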

CPC classification: G05B23/0254, G05B13/04, G05B23/024, G05B23/0281, G05B13/027 (PHYSICS)

International classification: G05B23/02, G05B13/02, G05B13/04 (PHYSICS)
