Fuzzy curve analysis based soft sensor modeling method using time difference Gaussian process regression

20170061305 ยท 2017-03-02

    Inventors

    Cpc classification

    International classification

    Abstract

    The invention provides a fuzzy curve analysis based soft sensor modeling method using time difference Gaussian process regression, it is suitable for application in chemical process with time delay characteristics. This method can extract stable delay information from the historical database of process and introduce more relevant modeling data sequence to the dominant variable sequence. First of all, the method of fuzzy curve analysis (FCA) can intuitively judge the importance of the input sequence to the output sequence, estimate the time-delay parameters of process, and such offline time-delay parameter set can be utilized to restructure the modeling data. For the new input data, based on the historical variable value before a certain time, the current dominant value can be predicted by time difference Gaussian Process Regression (TDGPR) model. This method does not encounter the problem of model updating and can effectively track the drift between input and output data. Compared with steady-state modeling methods, this invention can achieve more accurate predictions of the key variable, thus improving product quality and reducing production costs.

    Claims

    1. A soft sensor modeling method of fuzzy curve analysis based time difference Gaussian process regression, wherein the method comprises the following steps: step 1: collecting input and output variables of a process, constructing a historical training database and obtaining N samples: {X(t), y(t)}, t=1, 2, . . . , N; and then preprocessing the data; according to a process mechanism and experience, determining a maximum time delay T.sub.max existing in auxiliary variables; step2: for each original auxiliary variable x.sub.i, i{1, 2, . . . , m}, extending it to input variable set with time delay {x.sub.i(t), =0, 1, . . . , T.sub.max}, wherein an extension mode is: ##STR00002## it ( x i ) = exp [ - ( x i ( t ) - x i b ) 2 ] ( 2 ) step 3: determining importance of each variable in time delay input variable set by fuzzy curve analysis and obtaining an optimal time delay variable x.sub.i(td.sub.i); wherein determining the importance comprises: wherein the input variable set is {x.sub.i, i=1, 2, . . . , m} and output variable is y, for input an variable x.sub.i, whose sample value collected at time t is denoted as x.sub.i(t), for (x.sub.i(t),y(t)), a fuzzy membership function of variable x.sub.i is defined as: it ( x i ) = exp [ - ( x i ( t ) - x i b ) 2 ] , ( 2 ) wherein for each x.sub.i, {.sub.it, y(t)} provides a fuzzy rule which is described as {if x.sub.i is .sub.it(x.sub.i), then y is y(t)}, and .sub.it is a fuzzy membership function of input variable x.sub.i at the t-th data point; wherein in formula (2), a Gaussian fuzzy membership function is selected; wherein b is determined as 20% the range of variable x.sub.i; wherein as a result, for N training samples, each sample corresponding to each variable has N fuzzy rules; in the fuzzy membership function, .sub.it=1 holds true at each point {x.sub.i(t),y(t)}; wherein for time delay process, by introducing time delay information, the original variable x.sub.i becomes (T.sub.max+1)-dimensional, which can be expressed as x.sub.i(t), =0, 1, . . . , T.sub.max, is a variable delay value introduced; fuzzy curve C.sub.i, with the condition that is the i-th variable delay value can be obtained by making centroid defuzzification of each new expanded variable with formula (3); wherein as shown in the formula (4), d.sub.i is the which can make the maximum coverage of fuzzy curve C.sub.i,; wherein C.sub.i,().sub.max is the maximum value of the fuzzy curve point range, while C.sub.i,().sub.min is the minimum value of the fuzzy curve point range; C i , ( ) = .Math. i = 1 N .Math. it [ x i ( t - ) ] .Math. y ( t ) .Math. i = 1 N .Math. it [ x i ( t - ) ] ( 3 ) d i = argmax [ C i , ( ) max - C i , ( ) min ] ( 4 ) wherein, if the scope of the C.sub.i,() range is closer to that of y, then the input variable x.sub.i(t) is more important; wherein the important degree of each variable is determined by sorting the coverage of C.sub.i,(); wherein the optimal time delay variable x.sub.i(td.sub.i) is obtained; step 4: using obtained x.sub.i(td.sub.i) in step 3 to form a delay input set X.sub.d(t)=[x.sub.1(td.sub.1), x.sub.2(td.sub.2), . . . , x.sub.m(td.sub.m)].sup.T, and reconstructing soft sensor training sample set {X.sub.d(t),y(t)}; wherein if there is a new input sample X(t+1) available, then the delay input set could be restructured based on historical database samples with the same parameters, then go to step 5, otherwise, wait for the arrival of new data; step 5: processing the training set and the new data by j order time difference treatment, wherein the value of j is determinable according to a sampling period and property of dominant variable; wherein a formula for time difference step is:
    X.sub.d,j(t)=X.sub.d(t)X.sub.d(tj)
    y.sub.j(t)=y(t)y(tj)(5) establishing a Gaussian process model between differential input samples and output samples; wherein a Gaussian process regression algorithm is presented as follows: wherein given training sample sets XR.sup.mN and yR.sup.N, m is the dimension of input data points, N is the number of samples, and the relationship between input sample x.sub.i and output sample y.sub.iR satisfies:
    y.sub.i=f(x.sub.i)+
    N(0,.sub.n.sup.2)(6) wherein in formula (6), f is an unknown function, is Gaussian noise with zero mean and .sub.n.sup.2 variance; for a new input sample, its corresponding probability prediction output also follows Gaussian distribution and the joint Gaussian distribution N(f.sub.gp, cov(f.sub.gp)) is described as below:
    f.sub.gp|X,y,x.sub.*N(f.sub.gp,cov(f.sub.gp))
    s.i. f.sub.gp=k(x.sub.*,X)[K(X,X)+.sub.n.sup.2I.sub.n].sup.1y
    cov(f.sub.gp)=k(x.sub.*,x.sub.*)k(X,x.sub.*).sup.T[K(X,X)+.sub.n.sub.2I.sub.n].sup.1.Math.k(X,x.sub.*)(7) wherein K(X, X) is a n-dimensional covariance matrix of training samples; wherein k(x.sub.*,X) is a covariance vector between test sample and training samples; wherein k(x.sub.*,x.sub.*) is the autocovariance of test sample; wherein a Gaussian covariance function is: k ( x p , x q ) = v .Math. .Math. exp [ - 1 2 .Math. .Math. d = 1 m .Math. .Math. d ( x p d - x q d ) 2 ] ( 8 ) wherein, in GPR algorithm, hyper-parameter .sub.gp=(v, .sub.1, . . . , .sub.D, .sub.n.sup.2) is obtained via maximum likelihood estimation approach; wherein the likelihood function is derived as: L ( gp ) = - 1 2 .Math. y T [ K ( X , X ) + n 2 .Math. I ] - 1 .Math. y - 1 2 .Math. log .Math. .Math. det [ K ( X , X ) + n 2 .Math. I ] - n 2 .Math. log .Math. .Math. 2 .Math. ( 9 ) wherein the hyper-parameter .sub.gp is set to be a random value within a reasonable range in the first place; wherein the conjugate gradient algorithm is utilized to obtain the optimized parameter set; wherein after obtaining the optimal hyper-parameter, for the test sample x.sub.*, formula (7) is used to estimate the output value of GPR model; step 6: after all samples are restructured with delays, when a new input data arrives at time t+1, based on y.sub.j(t+1j), calculating a predictive value y.sub.j,pred(t+1) by TDGPR algorithm, wherein the formula is presented as below:
    X.sub.d,j(t+1)=X.sub.d(t+1)X.sub.d(t+1j)
    y.sub.j,pred(t+1)=f.sub.GPR(X.sub.d,j(t+1))
    y.sub.j,pred(t+1)=y.sub.j(t+1j)+y.sub.j,pred(t+1)(10) wherein y.sub.j,pred(t+1) in formula (10) is the final dominant variable prediction value of the present invention.

    2. A soft sensor modeling method of fuzzy curve analysis based time difference Gaussian process regression according to claim 1, characterized in that this method extracts information of variable delay from process historical database, reconstructing soft sensor modeling data and correcting causal relationship between input and output data.

    Description

    BRIEF DESCRIPTION OF DRAWINGS

    [0015] FIG. 1 is a flow chart of online soft sensor method based on FCA-TDGPR;

    [0016] FIG. 2 is a schematic diagram of debutanizer process;

    [0017] FIG. 3 is a schematic diagram of TDGPR modeling approach;

    [0018] FIG. 4 contains fuzzy curve distribution diagrams of the original variables and optimal time delay variables;

    [0019] FIG. 5 contains scatterplots of butane concentration predictions with different j values.

    EXAMPLES

    [0020] The modeling flow chart, which is shown in FIG. 1 below, is further detailed in the present invention:

    [0021] Take the actual chemical process as an example, debutanizer is an important part of naphtha desulfurization and separation device of oil refining production process, and one of the dominant variables needed to be controlled for this process is the concentration of the bottom butane (C4). The schematic diagram of the process is shown in FIG. 2, due to the value of C4 cannot be directly measured, therefore, there is a delay issue in analyzing and obtaining C4 concentration values. At the same time, different auxiliary variables show different degrees of time delay. Experimental data is derived from the actual industrial process which contains 2394 samples, a total of 7 auxiliary variables. As shown in FIG. 2, x.sub.1 is the top temperature; x.sub.2 is the top pressure; x.sub.3 is the reflux flow; x.sub.4 is the top product outflow; x.sub.5 is the 6th tray temperature; x.sub.6 is the bottom temperature 1; x.sub.7 is the bottom temperature 2. The 1 dominant variable is the bottom butane concentration, which in this invention is predicted as the key variable of the process.

    [0022] Step1: Collect historical input and output data to form a training database which contains N continuous samples. Assuming that the data is expressed as {X(t),y(t)},t 1, 2, . . . , N and is preprocessed, and the 2 bottom temperature variables are averaged as 1 auxiliary variable, then, X(t)=[x.sub.1(t), x.sub.2(t), x.sub.3(t), x.sub.4(t), x.sub.5(t), x.sub.6(t)].sup.T. The maximum time delay T.sub.max of 6 variables is set to 19.

    [0023] Step2: For each of the original variables x.sub.i, i{1, 2, . . . , 6}, they are extended to the input variables with time delay {x.sub.i(t), =0, 1, . . . , T.sub.max} by formula (1), and a set of 120 dimensional delay variables will be obtained for subsequent analysis

    ##STR00001##

    [0024] Step3: Determine the importance of each variable in the time delay input variable set by FCA, for (x.sub.i(t),y(t)), fuzzy membership function of variable x.sub.i is defined as:

    [00001] it ( x i ) = exp [ - ( x i ( t ) - x i b ) 2 ] ( 2 )

    [0025] For each x.sub.i, {.sub.it, y(t)} provides a fuzzy rule which is described as {if x.sub.i is .sub.it(x.sub.i), then y is y(t)}, and .sub.it is a fuzzy membership function of input variable x.sub.i at t-th data point; In formula (2) a Gaussian fuzzy membership function is selected; b is determined as 20% the range of variable x.sub.i. As a result, For N training samples, each sample corresponding to each variable has N fuzzy rules. In the fuzzy membership function, .sub.it=1 holds true at each point {x.sub.i(t),y(t)}.

    [0026] For time delay process, by introducing time delay information the original variable x.sub.i becomes (T.sub.max+1)-dimensional, which can be expressed as x.sub.i(t), =0, 1, . . . , T.sub.max, is a variable delay value to be introduced; fuzzy curve C.sub.i, with the condition that is the i-th variable delay value can be obtained by making centroid defuzzification of each new expanded variable using formula (3); as shown in the formula (4), d.sub.i is the which can make the maximum coverage of fuzzy curve C.sub.i,; C.sub.i,().sub.max is the maximum value of the fuzzy curve point range, while C.sub.i,().sub.min is the minimum value of the fuzzy curve point range;

    [00002] C i , ( ) = .Math. t = 1 N .Math. it [ x i ( t - ) ] .Math. y ( t ) .Math. t = 1 N .Math. it [ x i .Math. ( t - ) ] ( 3 ) d i = argmax [ C i , ( ) max - C i , ( ) min ] ( 4 )

    [0027] If the scope of the C.sub.i,() range is closer to that of y, then the input variable x.sub.i(t) is more important. In view of this point, the importance degree of each variable can be determined by sorting the coverage of C.sub.i,(). Finally, the optimal delay parameter d.sub.i as well as time delay variable x.sub.i(td.sub.i) can thus be obtained by FCA method, which later on can be used for soft sensor modeling data reconstruction.

    [0028] Step4: Based on the previous step, the time delay parameters d.sub.1, d.sub.2, . . . , d.sub.m are used to reconstruct the training input sample set for on-line modeling, the reconstructed input dataset is denoted as X.sub.d(t), X.sub.d(t)[x.sub.1(td.sub.1), x.sub.2(td.sub.2), x.sub.3(td.sub.3), x.sub.4(td.sub.4), x.sub.5(td.sub.5), x.sub.6(td.sub.6)]. If there is a new input sample X(t+1), then the delay input set could be restructured based on historical database samples with the same parameters, then go to step 5, otherwise, wait for the arrival of new data.

    [0029] Step 5: After the reorganization procedure, the training set and the new data are processed by j order time difference treatment (the value of j can be determined according to the sampling period and property of dominant variable):


    X.sub.d,j(t)X.sub.d(t)X.sub.d(tj)


    y.sub.j(t)=y(t)y(tj)(5)

    [0030] Next, make a regression of the relationship between X.sub.d,j(t) and y.sub.j(t) by GPR, which satisfies (t)=f(X.sub.j(t))+e(t). The GPR method can obtain the mapping relationship through the given training input and output samples. In this way, the corresponding predictive value and the uncertainty degree can be obtained given the new input data, which means the result will be probabilistic. The GPR algorithm is shown as below.

    [0031] In general, the relationship between the observed output value y and noise e satisfies:


    y.sub.i=f(x.sub.i)+


    N(0,.sub.n.sup.2)(6)

    [0032] If the mean function and covariance function are determined, then the distribution of the Gaussian process is well-determined. For simplicity, the mean function is usually preprocessed into 0. Covariance function can transform the correlation of output data into the function of input data. As similar inputs produce similar outputs, the covariance function can be selected according to the characteristics of the sample distribution. One condition which must be satisfied is that the closer the distance of samples is, the more correlated the two samples are, and vise versa. The covariance function form of this invention is shown in formula (7):

    [00003] k ( x p , x q ) = v .Math. .Math. exp [ - 1 2 .Math. .Math. d = 1 D .Math. .Math. d ( x p d - x q d ) 2 ] ( 7 )

    [0033] In the formula, x.sub.p, x.sub.qR.sup.D, v controls the magnitude of the covariance function. .sub.d describes the relative importance of each input attribute x.sup.d. The determination of the hyper-parameter .sub.gp=(v, .sub.1, . . . , .sub.D, .sub.R.sup.2) in the Gaussian process is generally estimated by the MLE method. The optimization of the parameters can be realized by using the conjugate gradient method. Based on test sample and training data, the posterior distribution of test data x.sub.* can be calculated, and its predictive value obey the joint Gaussian distribution described in formula (9), where K(X,X) is n-dimensional covariance matrix of training samples; k(x.sub.*,X) is the covariance vector of test sample and training samples; k(x.sub.*,x.sub.*) is the autocovariance of test sample, and f.sub.gp is a predictive value of GPR.

    [00004] L ( gp ) = - 1 2 .Math. y T [ K ( X , X ) + n 2 .Math. I ] - 1 .Math. y - 1 2 .Math. log .Math. .Math. det [ K ( X , X ) + n 2 .Math. I ] - n 2 .Math. log .Math. .Math. 2 .Math. ( 8 ) .Math. f gp X , y , x * ~ N ( f gp _ , cov ( f gp ) ) .Math. .Math. .Math. .Math. s . t . .Math. f gp _ = k ( x * , X ) [ K ( X , X ) + n 2 .Math. I n ] - 1 .Math. y .Math. .Math. .Math. cov ( f gp ) = k ( x * , x * ) - k ( X , x * ) T [ K ( X , X ) + n 2 .Math. I n ] - 1 .Math. k ( X , x * ) ( 9 )

    [0034] when the new input data arrives at time t+1, the calculating formula of predictive value y.sub.j,pred(t+1) with TDGPR method is:


    X(t+1)=X(t+1)X(t+1j)


    y.sub.j,pred(t+1)=f.sub.GPR(X.sub.j(t+1))


    y.sub.j,pred(t+1)=y.sub.j(t+1j)+y.sub.j,pred(t+1)(10)

    [0035] In the actual industrial process, there will be the case of instrument damage or laboratory analysis with delay, and the circumstance that the time interval of obtaining dominant variable is large and the quantity is small or there is a lack of dominant analysis value in the database. Thus, shown in FIG. 3, for new incoming test data X(t+1), based on y.sub.i(t+1j) stored in the database j moment ago, the predictive value of the dominant variable at time t+1 can be obtained. The predicted output y.sub.j,pred(t+1) of the online model is calculated by formula (10), and the predicted result of bottom butane concentration can be obtained.

    [0036] As shown in FIG. 4, compared the original variables without time delay, 6 reconstructed variables contribute more to the dominant variable, which introduce more relevant modeling data for online modeling. At the same time, in order to verify the effectiveness of this invention for on-line estimation, the first 1519 samples are selected in 2394 samples to reconstruct 1500 training samples. The final 875 samples are then used as test samples, and a soft sensor is established for on-line prediction of butane concentration.

    [0037] FIG. 5 contains scatterplots of butane concentration prediction results with two methods respectively denoted as the FCA-TDGPR method (present method) which involves delay estimation, and t-TDGPR method which is without time delay estimation.

    [0038] From FIG. 5, when the time difference order j increases from 1 to 10, the time interval of prediction based on historical database is gradually increasing, and the prediction accuracy is declining. This is because the more recent the analysis value is, the better tracking ability the model has for current process dynamics.

    [0039] Although the accuracy of the two methods are in decline, compared with the TDGPR method without considering the time delay, the predicted results of the present invention can be better close to the true value of the butane concentration when the time difference increases. This suggests that extracted delay information is in line with the actual causal relationship of the process, and the soft sensor model with variable time delay estimation is more accurate.

    [0040] After fuzzy curve analysis method is taken to determine the optimal parameters, reconstructed data is proved to be capable of enhancing the accuracy of online model significantly by introducing more contributing auxiliary variables to dominant variable sequence. At the same time, it reflects that the GPR method can explain the dynamic change of the process well. The online soft sensor model based on TDGPR method can adaptively estimate real-time butane concentration with historical variables collected j time ago.

    [0041] FIGS. 4 and 5 have jointly validated that fuzzy curve analysis based time difference Gaussian process regression soft sensor modeling method has good accuracy for on-line prediction of bottom butane concentration.

    [0042] While the present invention has been described in some detail for purposes of clarity and understanding, one skilled in the art will appreciate that various changes in form and detail can be made without departing from the true scope of the invention. All figures, tables, appendices, patents, patent applications and publications, referred to above, are hereby incorporated by reference.