Method for predicting aeration quantity required to maintain stable dissolved oxygen concentration in activated sludge system

Abstract

Disclosed is a method for predicting aeration quantity required to maintain a stable dissolved oxygen (DO) concentration in an activated sludge system, including measuring operating parameters of a biochemical tank of a wastewater treatment plant at intervals of a t.sub.1 time period over a certain period of time; replacing the DO concentration data recorded at each time of measurement with DO concentration data after a t.sub.2 time period from the time of measurement; filtering the DO concentration data after being replaced to form a dataset; building a random forest model, and building a machine learning matrix using data in the dataset to train the random forest model; evaluating prediction performance of the trained random forest model; and using the trained random forest model to predict an aeration airflow rate required to achieve a target DO concentration value after the t.sub.2 time period, thereby adjusting an aeration airflow rate.

Claims

1. A method for predicting aeration quantity required to maintain a stable dissolved oxygen concentration in an activated sludge system, comprising the following steps: step 1: measuring and recording influent flow rate, influent organic matter concentration, influent ammonia nitrogen concentration, dissolved oxygen (DO) concentration in the activated sludge system, sludge concentration, and aeration airflow rate in a biochemical tank of a wastewater treatment plant at intervals of a t.sub.1 time period over a certain period of time, wherein the sludge concentration is suspended solid concentration in the activated sludge system; step 2: in view of lag effect in changes in the DO concentration in the activated sludge system, replacing the DO concentration data recorded at a time of measurement in the step 1 with DO concentration data after a t.sub.2 time period from the time of measurement; step 3: filtering the DO concentration data after being replaced in the step 2, deleting the data that falls outside a normal range of DO concentration, and forming a dataset by combining the filtered DO concentration data with the influent flow rate, the influent organic matter concentration, the influent ammonia nitrogen concentration, the DO concentration in the activated sludge system, the sludge concentration, and the aeration airflow rate measured in the step 1; step 4: building a random forest model, building a machine learning matrix using data in the dataset, dividing the machine learning matrix into a training set and a test set, training the random forest model, and evaluating prediction performance of the random forest model; and step 5: taking a preset DO concentration required to be reached after the t.sub.2 time period at a current time as a target DO concentration value, inputting the target DO concentration value and influent flow rate, influent organic matter concentration, influent ammonia nitrogen concentration, and sludge concentration measured at the 
current time into the trained random forest model to predict an aeration airflow rate required to achieve the target DO concentration value, and adjusting an aeration airflow rate of an aeration blower according to the aeration airflow rate; wherein the machine learning matrix in the step 4 is expressed as follows: $\begin{bmatrix} \mathrm{FL}_{1} & \mathrm{COD}_{1} & \mathrm{intFL}_{1} & N_{1} & \mathrm{DO}_{t-1} & \mathrm{SS}_{1} \\ \mathrm{FL}_{2} & \mathrm{COD}_{2} & \mathrm{intFL}_{2} & N_{2} & \mathrm{DO}_{t-2} & \mathrm{SS}_{2} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ \mathrm{FL}_{m} & \mathrm{COD}_{m} & \mathrm{intFL}_{m} & N_{m} & \mathrm{DO}_{t-m} & \mathrm{SS}_{m} \end{bmatrix},$ in the matrix, FL.sub.1-FL.sub.m are data of aeration airflow rate, COD.sub.1-COD.sub.m are data of influent chemical oxygen demand, intFL.sub.1-intFL.sub.m are data of influent flow rate, N.sub.1-N.sub.m are data of influent ammonia nitrogen concentration, DO.sub.t-1-DO.sub.t-m are DO concentration data in the activated sludge system, SS.sub.1-SS.sub.m are data of sludge concentration, and m denotes a total number of records taken at intervals of a t.sub.1 time period over a certain period of time; a method for evaluating prediction performance in the step 4 is as follows: using an error between a predicted value and a measured value to characterize the prediction performance of the random forest model by calculating mean absolute percentage error (MAPE) and coefficient of determination (R.sup.2) of the trained random forest model on the test set; and when R.sup.2 is greater than 0.8 and MAPE is less than 20%, the random forest model is considered to have good prediction performance, otherwise, the DO concentration data in the dataset needs to be further replaced; a method for further replacing the DO concentration data in the dataset is as follows: adjusting values of the t.sub.2 time period and the normal range of DO concentration, replacing the DO concentration data at a time of measurement in the dataset with DO concentration data after an adjusted
t.sub.2 time period from the time of measurement, and using the dataset to build a machine learning matrix again, so as to train the random forest model again to have the good prediction performance; the adjusted t.sub.2 time period represents lag feedback time for the changes in the DO concentration in the activated sludge system, and a value range thereof is 5-30 min and is an integer multiple of the t.sub.1 time period; and in the step 5, the aeration airflow rate of the aeration blower is adjusted in advance to reach the required aeration airflow rate predicted by the trained random forest model, under a condition that a directly monitored value of DO concentration in the activated sludge system exhibits significant fluctuations or is about to exceed a normal range of a target value, such that the DO concentration after the t.sub.2 time period reaches the target value.

2. The method for predicting aeration quantity required to maintain a stable dissolved oxygen concentration in an activated sludge system according to claim 1, wherein the t.sub.2 time period in the step 2 represents lag feedback time for the changes in the DO concentration in the activated sludge system, and is an integer multiple of the t.sub.1 time period.

3. The method for predicting aeration quantity required to maintain a stable dissolved oxygen concentration in an activated sludge system according to claim 1, wherein the normal range of DO concentration in the step 3 is usually 2.0-3.0 mg/L.

4. The method for predicting aeration quantity required to maintain a stable dissolved oxygen concentration in an activated sludge system according to claim 1, wherein the DO concentration is considered to be in a stable state when the directly monitored value of DO concentration in the activated sludge system in the step 5 falls within a range of 0.5 mg/L of the target value; and when the DO concentration exceeds the range, the aeration airflow rate of the aeration blower is adjusted in advance to reach the required aeration airflow rate predicted by the trained random forest model.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 is a schematic diagram of a method according to the present disclosure.

(2) FIG. 2 is a diagram showing fitting effect of prediction of an aeration airflow rate according to a method of the present disclosure.

DETAILED DESCRIPTIONS OF THE EMBODIMENTS

(3) The technical solutions of the disclosure will be further described in detail below with reference to the embodiments and the accompanying drawings.

(4) As shown in FIG. 1, the present disclosure provides a method for predicting aeration quantity required to maintain a stable dissolved oxygen concentration in an activated sludge system, including the following steps:

(5) step 1: measuring and recording influent flow rate, influent organic matter concentration, influent ammonia nitrogen concentration, dissolved oxygen (DO) concentration in the activated sludge system, sludge concentration, and aeration airflow rate in a biochemical tank of a wastewater treatment plant at intervals of a t.sub.1 time period over a certain period of time, where a value range of the t.sub.1 is 1-10 min.

(6) step 2: in view of lag effect in changes in the DO concentration in the activated sludge system, replacing the DO concentration data recorded at a time of measurement in the step 1 with DO concentration data after a t.sub.2 time period from the time of measurement;

(7) where the t.sub.2 time period represents lag feedback time for the changes in the DO concentration in the activated sludge system, and a value range of the t.sub.2 is 5-30 min and is an integer multiple of the t.sub.1 time period;
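The replacement in step 2 can be illustrated with a minimal Python sketch. It assumes the DO readings are held in a plain list sampled every t.sub.1 minutes; the function name `shift_do_forward` and the use of `None` for records without a future reading are illustrative assumptions, not part of the disclosed method.

```python
def shift_do_forward(do_series, t1_min, t2_min):
    """Pair each record with the DO value measured t2 minutes later.

    do_series: DO readings (mg/L) sampled every t1_min minutes.
    Returns a list of the same length. The last t2/t1 records have no
    future reading and are marked None so that they can be dropped later.
    """
    if t1_min <= 0 or t2_min % t1_min != 0:
        raise ValueError("t2 must be an integer multiple of a positive t1")
    lag = t2_min // t1_min  # number of sampling steps covered by the lag
    padding = [None] * min(lag, len(do_series))
    return list(do_series[lag:]) + padding
```

With t.sub.1 = 1 min and t.sub.2 = 10 min as in the embodiment, each record would be paired with the DO reading taken ten samples later.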

(8) step 3: filtering the DO concentration data after being replaced in the step 2, deleting the data that falls outside a normal range of DO concentration (2.0-3.0 mg/L), and forming a dataset by combining the filtered DO concentration data with the influent flow rate, the influent organic matter concentration, the influent ammonia nitrogen concentration, the DO concentration in the activated sludge system, the sludge concentration, and the aeration airflow rate measured in the step 1.
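The filtering in step 3 might be sketched as follows. The record layout (a list of dictionaries with a `"do"` field) and the function name `build_dataset` are hypothetical; the sketch also assumes that the lag-shifted DO value simply overwrites the DO field of each record, which is one possible reading of the step.

```python
def build_dataset(records, shifted_do, do_min=2.0, do_max=3.0):
    """Keep only records whose lag-shifted DO lies inside [do_min, do_max].

    records: list of dicts holding the other measured parameters.
    shifted_do: DO values after the t2 lag, aligned with records;
                None marks records without a future reading.
    """
    dataset = []
    for rec, do_future in zip(records, shifted_do):
        if do_future is None:
            continue  # no reading available t2 minutes later
        if do_min <= do_future <= do_max:
            row = dict(rec)         # copy so the raw record is untouched
            row["do"] = do_future   # store the replaced DO value
            dataset.append(row)
    return dataset
```

The default bounds follow the normal range of 2.0-3.0 mg/L stated above; they are parameters so that the range can be adjusted when the dataset is rebuilt.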

(9) step 4: building a random forest model, building a machine learning matrix using data in the dataset, dividing the machine learning matrix into a training set and a test set, training the random forest model, and evaluating prediction performance of the random forest model.

(10) Specifically, the machine learning matrix is expressed as follows:

(11) $\begin{bmatrix} \mathrm{FL}_{1} & \mathrm{COD}_{1} & \mathrm{intFL}_{1} & N_{1} & \mathrm{DO}_{t-1} & \mathrm{SS}_{1} \\ \mathrm{FL}_{2} & \mathrm{COD}_{2} & \mathrm{intFL}_{2} & N_{2} & \mathrm{DO}_{t-2} & \mathrm{SS}_{2} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ \mathrm{FL}_{m} & \mathrm{COD}_{m} & \mathrm{intFL}_{m} & N_{m} & \mathrm{DO}_{t-m} & \mathrm{SS}_{m} \end{bmatrix}$

(12) where FL.sub.1-FL.sub.m are data of aeration airflow rate, COD.sub.1-COD.sub.m are data of influent chemical oxygen demand, intFL.sub.1-intFL.sub.m are data of influent flow rate, N.sub.1-N.sub.m are data of influent ammonia nitrogen concentration, DO.sub.t-1-DO.sub.t-m are DO concentration data in the activated sludge system, SS.sub.1-SS.sub.m are data of sludge concentration, and m denotes a total number of records taken at intervals of a t.sub.1 time period over a certain period of time.

(13) A training process of the random forest model is as follows: randomly sampling the training set to form a plurality of sub-datasets; building a decision tree independently on each of the sub-datasets to generate a plurality of decision trees; and, in building each decision tree, selecting a random feature subset to split nodes until a predefined number of leaf nodes is reached or further splitting is impossible.
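As a hedged illustration of the matrix and the training process, the following sketch uses `RandomForestRegressor` from scikit-learn. The disclosure states only that scikit-learn is used in Python; the choice of this class, the hyperparameter values, the 80/20 split, and the synthetic data (including the made-up relation that generates the airflow column) are all assumptions for demonstration and are not plant data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
m = 500  # number of records (synthetic, for illustration only)

# Synthetic stand-ins for the measured columns
cod = rng.uniform(150, 400, m)      # influent chemical oxygen demand
int_fl = rng.uniform(800, 1500, m)  # influent flow rate
n = rng.uniform(20, 50, m)          # influent ammonia nitrogen
do = rng.uniform(2.0, 3.0, m)       # lag-shifted, filtered DO
ss = rng.uniform(2500, 4500, m)     # sludge concentration
# Made-up relation, used only to give the model something to learn
fl = 0.002 * int_fl * (cod + 4.6 * n) * do / 2.5 + rng.normal(0, 20, m)

# Machine learning matrix with columns FL, COD, intFL, N, DO, SS
matrix = np.column_stack([fl, cod, int_fl, n, do, ss])
X, y = matrix[:, 1:], matrix[:, 0]  # FL is the quantity to predict

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(
    n_estimators=100,     # number of decision trees
    bootstrap=True,       # each tree sees a random resample of the training set
    max_features="sqrt",  # random feature subset considered at each split
    max_leaf_nodes=64,    # predefined cap on leaf nodes per tree
    random_state=0,
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
```

Here `bootstrap`, `max_features`, and `max_leaf_nodes` correspond to the random sub-datasets, the random feature subset, and the leaf-node limit described above.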

(14) A method for evaluating prediction performance is as follows: using an error between a predicted value and a measured value to characterize the prediction performance of the random forest model by calculating mean absolute percentage error (MAPE) and coefficient of determination (R.sup.2) of the trained random forest model on the test set. Generally, a model with a MAPE of less than 10% is considered a relatively good prediction model, and prediction accuracy is acceptable when the MAPE falls within 10%-20%. When R.sup.2 is greater than 0.8 and the MAPE is less than 20%, the random forest model is considered to have good prediction performance. Otherwise, the DO concentration data in the dataset needs to be further replaced.
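The two metrics and the acceptance criterion can be written out in a short, dependency-free sketch using their standard definitions. The function names are illustrative, and the thresholds are the ones stated above (R.sup.2 greater than 0.8 and MAPE less than 20%).

```python
def mape(measured, predicted):
    """Mean absolute percentage error, in percent."""
    total = sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


def r_squared(measured, predicted):
    """Coefficient of determination."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def has_good_performance(measured, predicted, r2_min=0.8, mape_max=20.0):
    """Apply the acceptance criterion on test-set predictions."""
    return (r_squared(measured, predicted) > r2_min
            and mape(measured, predicted) < mape_max)
```

The MAPE definition assumes that no measured value is zero, which holds for an aeration airflow rate recorded while the blower is running.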

(15) A method for further replacing the DO concentration data in the dataset is as follows: adjusting values of the t.sub.2 time period and the normal range of DO concentration, replacing the DO concentration data at a time of measurement in the dataset with DO concentration data after an adjusted t.sub.2 time period from the time of measurement, and using the dataset to build a machine learning matrix again, so as to retrain the random forest model, where the adjusted t.sub.2 time period represents lag feedback time for the changes in the DO concentration in the activated sludge system, and a value range thereof is 5-30 min and is an integer multiple of the t.sub.1 time period.
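One way to organize the adjustment of t.sub.2 is sketched below. The disclosure does not specify a search strategy, so trying the admissible values in ascending order is an assumption; `evaluate` is a hypothetical callback that rebuilds the dataset for a given t.sub.2, retrains the model, and returns the pair (R.sup.2, MAPE in percent). Adjusting the normal range of DO concentration is left out for brevity.

```python
def candidate_lags(t1_min, lo=5, hi=30):
    """All t2 values in [lo, hi] minutes that are integer multiples of t1."""
    return [t2 for t2 in range(lo, hi + 1) if t2 % t1_min == 0]


def select_lag(t1_min, evaluate, r2_min=0.8, mape_max=20.0):
    """Return the first (t2, r2, mape) meeting the criterion, else None.

    evaluate(t2) -> (r2, mape_percent) is supplied by the caller and is
    expected to rebuild the dataset, retrain, and score on the test set.
    """
    for t2 in candidate_lags(t1_min):
        r2, mape_pct = evaluate(t2)
        if r2 > r2_min and mape_pct < mape_max:
            return t2, r2, mape_pct
    return None
```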

(16) In this embodiment, t.sub.1 is taken as 1 min, and t.sub.2 is taken as 10 min. After the DO concentration data recorded at each time of measurement are replaced with the DO concentration data after the t.sub.2 time period from the time of measurement, and the replaced DO concentration data are filtered, the error metrics obtained by running the corresponding functions of the scikit-learn package in Python are: MSE=11528.36, MAE=59.68, RMSE=102.25, and R.sup.2=0.9828>0.8. An R.sup.2 close to 1 indicates that the model is well trained. The calculated values of the other error metrics depend on the data volume and are used in this embodiment to compare the modeling effect with that of the dataset before replacement and filtering. A prediction dataset is then composed of the predicted values and the corresponding actual aeration airflow rate data of the sewage treatment plant, the relative error between each predicted value of aeration airflow rate in the prediction dataset and its measured value is calculated, and the mean absolute percentage error of the prediction dataset is obtained (MAPE=4.53%<10%), from which the fitting degree of the prediction model can be determined. As shown in FIG. 2, the random forest model built according to the historical operating parameters of the wastewater treatment plant exhibits a good prediction effect.

(17) In order to compare the prediction performance of the dataset after replacement and filtering, an original dataset (where the DO concentration data recorded at a time of measurement are not replaced with DO concentration data after a t.sub.2 time period from the time of measurement, and the DO concentration data are not filtered) is also inputted into the above random forest model for training. The error metrics of the model are calculated as follows: MSE=12122.84, MAE=54.23, RMSE=110.10, R.sup.2=0.9719, and the prediction dataset has a MAPE=5.59%>4.53%. A comprehensive analysis of the evaluation metrics indicates that the accuracy of the model in predicting aeration airflow can be improved by filtering the training dataset.

(18) step 5: taking the DO concentration required to be reached after the t.sub.2 time period at a current time as a target DO concentration value, inputting the target DO concentration value and influent flow rate, influent organic matter concentration, influent ammonia nitrogen concentration, and sludge concentration measured at the current time into the trained random forest model to predict an aeration airflow rate required to achieve the target DO concentration value, and adjusting an aeration airflow rate of an aeration blower according to the aeration airflow rate.

(19) Specifically, in the step 5, the required aeration airflow rate predicted by the trained random forest model is used to adjust the aeration airflow rate of the aeration blower in advance when a directly monitored value of DO concentration in the activated sludge system exhibits significant fluctuations or is about to exceed a normal range of a target value, such that the DO concentration after the t.sub.2 time period reaches the target value. The DO concentration is considered to be in a stable state when the directly monitored value of DO concentration in the activated sludge system falls within ±0.5 mg/L of the target value. When the DO concentration exceeds this range, the aeration airflow rate of the aeration blower is adjusted in advance to the required aeration airflow rate predicted by the trained random forest model. In this embodiment, the target DO concentration is 2.5 mg/L. When the directly monitored value of DO concentration in the activated sludge system exhibits significant fluctuations or is about to leave the range of the target value ±0.5 mg/L, that is, 2.0-3.0 mg/L, the DO in the system is considered unable to remain stable; in this case, the aeration airflow rate is predicted by the model, and the aeration is controlled according to the predicted value of aeration airflow rate.
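The decision logic of step 5 might look like the following sketch. It assumes a trained model exposing a scikit-learn style `predict` method and a feature order of COD, intFL, N, DO, SS (the matrix columns after FL); the function names are hypothetical. The sketch tests only whether the monitored DO has left the ±0.5 mg/L band and does not attempt to detect "significant fluctuations" or an imminent excursion, which the disclosure does not define quantitatively.

```python
def needs_adjustment(do_measured, do_target=2.5, band=0.5):
    """True when the monitored DO lies outside target +/- band (mg/L)."""
    return abs(do_measured - do_target) > band


def required_airflow(model, do_target, int_fl, cod, n, ss):
    """Predict the airflow needed to reach do_target after the t2 lag.

    Assumed feature order: COD, intFL, N, DO, SS.
    """
    features = [[cod, int_fl, n, do_target, ss]]
    return float(model.predict(features)[0])


def control_step(model, do_measured, current_airflow,
                 do_target, int_fl, cod, n, ss, band=0.5):
    """Return the blower airflow setpoint for the current time step."""
    if not needs_adjustment(do_measured, do_target, band):
        return current_airflow  # DO is stable, keep the current setpoint
    return required_airflow(model, do_target, int_fl, cod, n, ss)
```

With the target of 2.5 mg/L used in this embodiment, a monitored value of 3.2 mg/L would trigger a model prediction, whereas 2.7 mg/L would leave the setpoint unchanged.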

CPC classification

C02F2209/225 (CHEMISTRY; METALLURGY)
C02F3/006 (CHEMISTRY; METALLURGY)
C02F2209/005 (CHEMISTRY; METALLURGY)
C02F2209/40 (CHEMISTRY; METALLURGY)
C02F2209/14 (CHEMISTRY; METALLURGY)
C02F2209/38 (CHEMISTRY; METALLURGY)
Y02W10/10 (GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS)
C02F3/12 (CHEMISTRY; METALLURGY)

International classification

C02F3/12 (CHEMISTRY; METALLURGY)
C02F3/00 (CHEMISTRY; METALLURGY)
