Process Modelling Method and System for Non-Linear Continuous-Like Process and Application in Pulp and Paper Industry
20260062867 · 2026-03-05
Inventors
- Jiangsheng You (Andover, MA, US)
- Julian Santos (Bogota D.C., CO)
- Simon Feng (Katy, TX, US)
- Hong Zhao (Sugar Land, TX, US)
- Janet L. Tateya-Blancett (Houston, TX, US)
- Heiko Claussen (Auburndale, MA, US)
CPC classification
D21H23/78
TEXTILES; PAPER
International classification
Abstract
A method, system, and computer program product are described capable of controlling an industrial based chemical process through accessing sensor data, pre-processing accessed sensor data through an automated process, forming an initial prediction model of the chemical process, and automatically determining linearity or non-linearity of the initial prediction model. As a function of quantitatively measured linearity, being non-linear, quasilinear, or linear, automatically train the initial prediction model, then deploy the trained prediction model in a manner that controls the subject chemical process at the industrial plant, including optimizing consumption of a certain resource. Data-driven modeling of non-linear, continuous-like industrial or chemical processes results.
Claims
1. A computer-implemented method of controlling an industrial based chemical process, the method comprising: obtaining working data representative of a subject chemical process of an industrial plant, said obtaining being automatically performed by a digital processor; responsively in computer memory, forming an initial model of the subject chemical process based on the working data, the working data having empirical values of dependent variables and independent variables of the subject chemical process, different characteristics of the subject chemical process being represented by different mathematical relationships of respective dependent variables and independent variables, including a certain resource consumption being indicated by one or more independent variables; said forming including the processor automatically determining non-linearity of the initial model as a function of non-linearity of the different mathematical relationships in the working data, and automatically training the initial model based on results of the determining, such that: (a) where the different relationships are determined to be substantially linear, then training the initial model as a linear predictive model and resulting in a trained prediction model, and (b) where the different relationships are determined to be quasi-linear or non-linear, then (i) training the initial model as a non-linear predictive model, (ii) combining the trained non-linear predictive model with one or more local linear models dynamically adjusting control variables of the certain resource consumption in respective windows of time, and (iii) producing a resulting trained prediction model based on the non-linear predictive model combined with the one or more local linear models; and deploying the resulting trained prediction model in a manner controlling the subject chemical process at the industrial plant, including optimizing consumption of the certain resource.
2. The method of claim 1, wherein the subject chemical process is continuous, semi-continuous, or continuous in one or more parts.
3. The method of claim 1, wherein the subject chemical process is a Kraft process or similar, and the industrial plant is of a pulp and paper industry.
4. The method of claim 3, wherein the working data represents quality of production of the subject chemical process as a function of the certain resource consumption, and the quality of production is determined by any one or combination of: a measurement of completeness of a pulping process, a Kappa number measurement or equivalent, total alkaline charge, and amount of residual alkali.
5. A method as claimed in claim 3, wherein the certain resource is a reagent utilized in the Kraft process.
6. A method as claimed in claim 5, wherein the reagent is white liquor.
7. A method as claimed in claim 3, wherein output of the deployed prediction model further controls any one or combination of: total H factor, a black liquor stream after extraction, digester discharge consistency of a digester at the industrial plant, and liquor temperature in the digester.
8. A method as claimed in claim 3, wherein the prediction model is further configured to manipulate control parameters of the subject chemical process at the industrial plant in a manner that minimizes any one or more of: toxic waste of the industrial plant, waste water, and chemicals in a drying process.
9. A method as claimed in claim 1, wherein obtaining working data includes accessing sensor output data indicative of the subject chemical process; and the method further comprising automatically adjusting the prediction model over time based on additional sensor output data.
10. A method as claimed in claim 1 wherein training the initial model as a linear predictive model employs partial least squares (PLS) regression, and number of components for the PLS regression is an estimated rank among the independent variables and dependent variables.
11. A method as claimed in claim 1 wherein training the initial model as a non-linear predictive model employs extreme gradient boosting (XGBoost), and number of estimators for XGBoost is based on an estimated rank among the independent variables and dependent variables.
12. A method as claimed in claim 1 wherein obtaining working data includes: accessing sensor output data indicative of the subject chemical process, the accessed sensor output data including one or more time periods of operating states of the subject chemical process; and pre-processing the accessed sensor output data in a manner that: (i) groups data based on time, and (ii) removes outlier data from groups of data, said pre-processing resulting in the working data, the accessing and pre-processing being automatically performed by the digital processor.
13. A system for controlling an industrial based chemical process, the system comprising: a digital processor; and a process modeler executable by the digital processor such that during execution the digital processor: automatically obtains working data representative of a subject chemical process in an industrial plant; responsively forms in computer memory, an initial model of the subject chemical process based on the working data, the working data having empirical values of dependent variables and independent variables of the subject chemical process, different characteristics of the subject chemical process being represented by different mathematical relationships of respective dependent variables and independent variables, including a certain resource consumption being indicated by one or more independent variables; said forming including the digital processor automatically determining non-linearity of the initial model as a function of non-linearity of the different mathematical relationships in the working data, and automatically training the initial model based on results of the determining, such that: (a) where the different relationships are determined to be substantially linear, then training the initial model as a linear predictive model and resulting in a trained prediction model, and (b) where the different relationships are determined to be quasi-linear or non-linear, then (i) training the initial model as a non-linear predictive model, (ii) combining the trained non-linear predictive model with one or more local linear models dynamically adjusting control variables of the certain resource consumption in respective windows of time, and (iii) producing a resulting trained prediction model based on the non-linear predictive model combined with the one or more local linear models; and deploys the resulting trained prediction model in a manner controlling the subject chemical process at the industrial plant, including optimizing consumption of the certain resource.
14. The system of claim 13, wherein the subject chemical process is continuous, semi-continuous, or continuous in one or more parts.
15. The system of claim 13, wherein the subject chemical process is a Kraft process or similar, and the industrial plant is of a pulp and paper industry.
16. The system of claim 15, wherein the working data represents quality of production of the subject chemical process as a function of the certain resource consumption, and the quality of production is determined by any one or combination of: a measurement of completeness of a pulping process, a Kappa number measurement or equivalent, total alkaline charge, and amount of residual alkali.
17. A system as claimed in claim 15, wherein the certain resource is a reagent utilized in the Kraft process that is white liquor.
18. A system as claimed in claim 15, wherein output of the deployed prediction model further controls any one or combination of: total H factor, a black liquor stream after extraction, digester discharge consistency of a digester at the industrial plant, and liquor temperature in the digester.
19. A system as claimed in claim 15, wherein the prediction model is further configured to manipulate control parameters of the subject chemical process at the industrial plant in a manner that minimizes any one or more of: toxic waste of the industrial plant, waste water, and chemicals in a drying process of the Kraft process.
20. A system as claimed in claim 13 wherein the digital processor obtaining working data includes accessing sensor output data indicative of the subject chemical process; and the process modeler when executed further comprising the digital processor automatically adjusting the prediction model over time based on additional sensor output data.
21. A system as claimed in claim 13, wherein the digital processor obtaining working data includes: automatically accessing sensor output data indicative of the subject chemical process, the accessed sensor output data including one or more time periods of operating states of the chemical process at the industrial plant; and automatically pre-processing the accessed sensor output data in a manner that: (i) groups or otherwise classifies data based on time, and (ii) removes outlier data from the groups of data, said pre-processing resulting in the working data.
22. A non-transitory computer program product controlling an industrial based chemical process, the computer program product comprising a computer-readable medium with computer code instructions stored thereon, the computer code instructions being configured, when executed by a processor, to cause an apparatus associated with the processor to: obtain working data representative of a subject chemical process in an industrial plant; responsively in computer memory, form an initial model of the subject chemical process based on the working data, the working data having empirical values of dependent variables and independent variables of the subject chemical process, different characteristics of the subject chemical process being represented by different mathematical relationships of respective dependent variables and independent variables, including representing quality of production of the subject chemical process as a function of a certain resource consumption, quality of production corresponding to one or more dependent variables, and the certain resource consumption being indicated by one or more independent variables; said responsively form including the apparatus automatically determining non-linearity of the initial model as a function of non-linearity of the different mathematical relationships in the working data, and automatically training the initial model based on results of the determining, such that: (a) where the different relationships are determined to be substantially linear, then training the initial model as a linear predictive model and resulting in a trained prediction model, and (b) where the different relationships are determined to be quasi-linear or non-linear, then (i) training the initial model as a non-linear predictive model, (ii) combining the trained non-linear predictive model with one or more local linear models dynamically adjusting control variables of the certain resource consumption in respective windows of time, and (iii) producing a resulting trained prediction model based on the non-linear predictive model combined with the one or more local linear models; and deploy the resulting trained prediction model in a manner controlling the subject chemical process at the industrial plant, including optimizing consumption of the certain resource.
23. The computer program product of claim 22, wherein obtaining the working data includes: accessing sensor output data indicative of the subject chemical process, the accessed sensor output data including one or more time periods of operating states of the chemical process at the industrial plant; and pre-processing the accessed sensor output data in a manner that: (i) groups or otherwise classifies data based on time, and (ii) removes outlier data from the groups of data, said pre-processing resulting in working data.
24. The computer program product of claim 22 wherein the apparatus automatically determining non-linearity of the initial model includes quantitatively measuring non-linearity of the working data.
25. A computer-implemented method of modeling an industrial-based process, the method comprising: obtaining working data based on the past collected data and real-time collected data indicative of a subject industrial-based process, said obtaining being automatically performed by one or more digital processors; as a function of linearity of the obtained working data, automatically selecting between linearly modeling the subject industrial-based process and non-linearly modeling the subject industrial-based process, said selecting being automatically performed by the one or more digital processors, the working data having empirical values of dependent variables and independent variables of the subject industrial-based process, different characteristics of the subject industrial-based process being represented in the working data by different mathematical relationships of respective dependent variables and independent variables; the one or more digital processors automatically selecting being by: (a) testing linearity of the different mathematical relationships in the obtained working data, and (b) where the different relationships are determined to be substantially linear, then selecting and training a linear predictive model as representative of the subject industrial-based process, and where the different relationships are determined to be quasi-linear or non-linear, then selecting and training a non-linear predictive model as representative of the subject industrial-based process; and generating a resulting model of the subject industrial-based process based on one of: (i) the selected and trained linear predictive model, and (ii) a combination of the selected and trained non-linear predictive model and one or more local linear models dynamically adjusting certain variables of the subject industrial-based process.
26. A computer-implemented method as claimed in claim 25 wherein the testing linearity of the different mathematical relationships in the obtained working data includes the one or more digital processors quantitatively measuring non-linearity of the different mathematical relationships in the obtained working data.
27. A computer-implemented method as claimed in claim 25 wherein the industrial-based process is a chemical process, an industry plant process, or the like.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] The foregoing will be apparent from the following more particular description of example embodiments, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments.
DETAILED DESCRIPTION
[0042] A description of example embodiments follows. Embodiments of the present invention provide methods and systems of generating data-driven models of an industrial (e.g., chemical) process of interest, especially of non-linear continuous chemical processes. By way of non-limiting example and for purposes of illustrating the principles of the present invention, a pulp and paper industry chemical process is described next. It is understood that embodiments (computer-based methods and systems) of the present invention may be applied to various and numerous other industrial (chemical) processes, non-linear such processes, continuous-like processes and the like, producing data-driven models of the same.
[0043] In the pulp and paper industry, the Kraft process is a process for conversion of wood into wood pulp, which comprises almost pure cellulose fibers, the main component of paper. The Kraft process involves treatment of wood chips with a hot mixture of water, sodium hydroxide, and sodium sulfide, known as white liquor, that breaks the bonds that link lignin, hemicellulose, and cellulose. This treatment, and more generally, the Kraft process entails several steps, both mechanical and chemical including the stages of: impregnation, cooking, recovery, blowing, screening, washing, and bleaching.
[0044] In addition, the digestion process 111b comprises a separate cycle relative to the cycle described above. From the digestion process 111b, a thin liquor 112a is removed and evaporated 112b, yielding a black liquor 112c. The black liquor is fed to a recovery boiler 112d, where the cycle splits to afford a green liquor 112f and released energy 112e. The green liquor 112f then undergoes causticization 112g that yields white liquor 112k, which is fed back into the digester 111b as so-called recovered white liquor. The causticization 112g has an additional cycle in which calcium carbonate 112h is removed from the recovered white liquor 112k into a lime kiln 112i to produce lime 112j. The lime 112j then reenters the causticization reactor 112g to complete the cycle.
[0045] Pulp and paper mills or industrial plants implementing the Kraft process use a numerical value called the Kappa number. The Kappa number provides a quantitative test method for determining the level of lignin remaining in a sample, either finished from the pulp process (Kraft process 110) or still undergoing the pulp process. The Kappa number is a value that: (a) measures the completeness of the pulping process, and (b) provides information about properties of the produced pulp (especially the level of residual lignin present). The Kappa number is based on the various chemical reactions performed during the process, and specifically on potassium permanganate oxidizing the wood to release lignin. The Kappa number has a range from 1 to 100. Depending on the wood species involved, it is possible for the Kappa number to be above 100, which may indicate that the precision of the test is decreasing and that the relationship between Kappa number and lignin content is weakening.
[0046] An equation to represent or approximate Kappa number is shown in Equation 1, wherein lignin percent is equal to Kappa number multiplied by a value (a constant). (A more precise relationship can be established by testing the specific pulp of interest through in-process test to reflect the changing conditions of process flow described in
[0047] Turning to
[0048] Based on the recovery cycle 220 described herein, it is possible to determine the Kappa number 332 from this cycle.
[0049] The amount of white liquor that goes into the process of digesting wood chips to produce the pulp, and eventually the paper of interest, is a variable that can be optimized. Optimizing white liquor consumption would reduce production of hazardous waste from the Kraft process 110. Such optimization requires advanced modeling of the intricate wood variables, process variables, and variable interactions in the Kraft process. Applicant's approach of using advanced machine learning (ML) algorithms to model non-linear interactions advantageously unlocks substantial cost savings and waste reduction by reducing chemical and fuel consumption, thereby enhancing cellulose pulp production within existing facilities.
[0051] Returning to
[0053] According to an embodiment, obtaining working data includes: accessing sensor output data indicative of the subject chemical process of an industrial plant as shown in step 451. The accessed sensor output data includes one or more time periods of operating states of the subject chemical process.
[0054] Step 452 continues the tasks of step 441 by pre-processing the accessed sensor output data (from step 451). In particular, the digital processor automatically preprocesses the sensor output data at step 452 by: (i) grouping or otherwise classifying data based on time, and (ii) removing outlier data from the groups of data. The pre-processing by step 452 results in working data obtained at step 441.
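The two pre-processing actions of step 452 can be illustrated with a minimal sketch. It assumes a single sensor delivering (timestamp, value) pairs, fixed-length time windows as the grouping rule, and a 3-sigma rule within each group as the outlier test; the function name `preprocess` and the parameters `window_s` and `n_sigma` are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict
from statistics import mean, pstdev


def preprocess(readings, window_s=3600.0, n_sigma=3.0):
    """Group (timestamp, value) readings into fixed time windows and drop
    values farther than n_sigma standard deviations from their group mean.

    Assumption: fixed windows and a sigma rule stand in for whatever
    grouping and outlier logic a real deployment would use.
    """
    groups = defaultdict(list)
    for ts, value in readings:
        # (i) group data based on time: one bucket per window
        groups[int(ts // window_s)].append((ts, value))

    working = []
    for key in sorted(groups):
        values = [v for _, v in groups[key]]
        mu, sigma = mean(values), pstdev(values)
        for ts, v in groups[key]:
            # (ii) remove outliers; a constant group (sigma == 0) has none
            if sigma == 0 or abs(v - mu) <= n_sigma * sigma:
                working.append((ts, v))
    return working
```

Note that a sigma rule can only flag a point when the group is large enough (roughly a dozen samples or more for 3 sigma), so very short windows pass through unchanged.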
[0055] The method 440 continues at step 442 (
[0056] Next at step 443, the processor automatically determines non-linearity of the initial model (formed by step 442) as a function of non-linearity of the different mathematical relationships in the working data output by step 441. The processor/step 443 automatically trains the initial model based on results of the linearity/non-linearity determination. Specifically: (a) where the different mathematical relationships in the obtained working data are determined to be substantially linear, then the processor trains the initial model as a linear predictive model resulting in a trained prediction model, and (b) where the different mathematical relationships are determined to be quasi-linear or non-linear, then the processor (i) trains the initial model as a non-linear predictive model, (ii) combines the trained non-linear predictive model with one or more local linear models dynamically adjusting control variables of the certain resource consumption in respective windows of time (time periods), and (iii) produces a resulting trained prediction model based on the non-linear predictive model combined with the one or more local linear models. In some embodiments, step 443 training of the initial model as a linear predictive model employs partial least squares (PLS) regression, and the number of components for the PLS regression is an estimated rank among independent variables and dependent variables. In embodiments, step 443 training of the initial model as a non-linear predictive model employs extreme gradient boosting (XGBoost), and the number of estimators for XGBoost is based on an estimated rank among the independent variables and dependent variables as will be made clear below.
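The role of a local linear model within one window of time can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: it uses a single control variable (for example, a white liquor charge) and a single response (for example, a Kappa number), fits ordinary least squares over the most recent window, and uses the fitted slope as a local sensitivity. The names `local_linear_fit`, `suggest_adjustment`, `window`, and `max_step` are hypothetical. The windowed samples could be plant observations or predictions of the trained non-linear model around the current operating point.

```python
from statistics import mean


def local_linear_fit(xs, ys):
    """Ordinary least squares for y ~ intercept + slope * x over one window."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("control variable did not move within the window")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def suggest_adjustment(xs, ys, target, window=20, max_step=1.0):
    """Fit the most recent window and return a clipped change in x that would
    move the locally predicted y to the target (assumed control rule)."""
    intercept, slope = local_linear_fit(xs[-window:], ys[-window:])
    if slope == 0:
        return 0.0  # no local sensitivity, so no adjustment is suggested
    current = intercept + slope * xs[-1]
    step = (target - current) / slope
    # Clip the move so one window cannot request a large jump
    return max(-max_step, min(max_step, step))
```

Refitting on each new window is what lets the linear approximation track a non-linear relationship as the operating point drifts.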
[0057] At step 444, the method 440/processor deploys the resulting trained prediction model (from step 443) in a manner controlling the subject chemical process at the industrial plant, including optimizing consumption of the certain resource, such as white liquor in the above non-limiting example embodiment. Restated, output of the deployed prediction model includes values of plant equipment settings or subject chemical process (Kraft process 110) settings that are the control variables corresponding to the one or more independent variables indicative of the certain resource (white liquor) consumption. In some embodiments, the subject chemical process is continuous, semi-continuous, continuous in parts (or portions), or a combination thereof. According to an embodiment, the method 440 of
[0058] In some embodiments, working data is representative of a continuous, semi-continuous, or continuous in one or more parts or portions process that is based on sensor output data and/or derived from sensor output data.
[0059] In some embodiments, continuous-like can refer to a process that is continuous, semi-continuous, or continuous in one or more parts or portions of the process. It is understood that as used herein the process may be a chemical process, an industrial plant process, a manufacturing process, a reactor process, and the like. The terms industrial-based process, process of interest, and equivalents are used interchangeably with plant process or process given the context of the present disclosure.
[0060] In some embodiments, output of the deployed prediction model further controls any one or combination of: total H factor, a black liquor stream after extraction (filter 331), digester discharge consistency of a digester 221c at the industrial plant, and liquor temperature in the digester. The method 440 may also further configure the prediction model to manipulate control parameters of the subject chemical process (Kraft process 110) at the industrial plant in a manner that minimizes: toxic waste overall, wastewater, and chemicals in a drying stage of the Kraft process in some embodiments.
[0061] In an embodiment, the resulting trained prediction model (deployed at step 444 of
[0062] During model training, embodiments estimate contribution coefficients or ranks advantageously increasing accuracy of Applicant's resulting trained prediction model 560. Embodiments provide this and other advantages over the prior art including:
[0063] a novel method to measure the degree of nonlinearity by estimating the partial ranks of projections of independent and dependent variables represented by sensor data;
[0064] a novel combination of machine learning methods PCA (Principal Component Analysis), tSNE, and UMAP followed by clustering methods HDBSCAN and GMM on the tasks of identifying nonuniform distribution of covariances and spatial connectedness of data 551;
[0065] application of XGBoost to model both linear and nonlinear relationships between the independent and dependent variables;
[0066] a novel scheme of using local LASSO (Least Absolute Shrinkage and Selection Operator) and PLS to approximate global nonlinear correlations to estimate the sensitivity to perform local optimization; and
[0067] novel applications of three data processing results: (a) a normalized score in the range [0,1] for different algorithms to measure the distance of a sample to its cluster center for outlier removal; (b) the optimal number of model components or estimators using the partial ranks; and (c) the partial ranks of concatenated independent and dependent variables to infer the data redundancy.
[0068] In embodiments, data selection module 552 includes both (i) user-interactive selection of outlier data for removal from model training, and (ii) automated processor selection of outlier data in source data 551 and removal of same from model training. The user-interactive selection allows domain knowledge to guide outlier data removal in data selection module 552 as illustrated next. In an embodiment, preprocessing in data selection module 552 includes at least the following steps: 1) data dimension reduction by PCA (Principal Component Analysis), tSNE (t-distributed Stochastic Neighbor Embedding), and/or UMAP (Uniform Manifold Approximation and Projection) to visually identify three different types of data distributions: concentration of variances, statistical density, and spatial density; and 2) clustering through combining Gaussian mixture models (GMM) and hierarchical density-based spatial clustering of applications with noise (HDBSCAN) on original data 551 and projected data to combine visual observations and machine learning in a hybrid model for outlier removal. Doing so: (a) enables visualization for user-interactive selection of data 551 to be excluded from model training applying domain knowledge at runtime, (b) enables visualization of clusters from high-dimensional algorithms in dimension-reduced 2D planes for automated inspection by black-box machine learning algorithms, and (c) provides visual insights on a timeline and spatial location for each of different types of conditions of the subject chemical process, including in some embodiments each pulp type.
[0069] Data selection module 552 projects high dimension data 551 onto a 2D plane for visualizing spatial and statistical distribution in an augmented (dimension reduced) way. In particular, data processing 550 projects source data 551 into a 2D plane through linear projection, statistical transformation, and spatial mapping using PCA, tSNE, and UMAP, respectively. Data selection module 552 continues with user-interactive steps and processor automated steps as follows.
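Of the three projections named above, the linear one (PCA) is simple enough to sketch directly. The example below is a minimal illustration assuming NumPy is available and that the source data is arranged as samples by sensors; the function name `pca_project_2d` is hypothetical. tSNE and UMAP are not shown because they rely on additional libraries.

```python
import numpy as np


def pca_project_2d(data):
    """Project rows of `data` (samples x sensors) onto the first two
    principal components and report the variance share of each."""
    x = np.asarray(data, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return centered @ vt[:2].T, explained[:2]
```

The returned variance shares indicate how much of the concentration of variances the 2D view actually captures, which is useful when judging whether a visual cluster is meaningful.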
[0070] In non-limiting embodiments, linear models or algorithms that may be used by the present invention include: principal component analysis (PCA), PLS and LASSO, and combinations thereof. In other non-limiting embodiments, quasi-linear models or algorithms that may be used by the present invention include: GMM, HDBSCAN, SVM, and combinations thereof. In other non-limiting embodiments, non-linear models or algorithms that may be used by the present invention include: GMM, HDBSCAN, tSNE, UMAP, XGBoost, SVM, and combinations thereof. In other non-limiting embodiments, other models or algorithms that may be used to improve the predictions made by the present invention include: Auto Encoder, ridge regressions, NNs (e.g., long short-term memory (LSTM), temporal convolutional network (TCN), Transformer, etc.), and combinations thereof.
Data Selection Module 552: User-Interactive
[0071] Data Selection module 552 uses density mapping techniques to present the 2D projections (visualizations thereof) to a user for interactive visual inspection. The density maps visually illustrate statistical distribution and spatial distribution of sensor data 551. In
[0072] Data selection module 552 also performs a Z-score visualization for sensor data normalization and selection. This involves an application of sigma rules, which are as follows: 1) exclusion of sensor data with little variances (no dynamics except bias); 2) exclusion of data outside of 3-sigma for each sensor; and 3) use of selected data to calculate minimum ranks. An example output is shown in
[0073] The interactive visualizations also can have a timeline visualization based on operating states from GMM and HDBSCAN 880 as shown in
Data Selection Module 552: Processor-Automated
[0074] 2D projections can incorporate states from GMM and HDBSCAN. As shown in the 2D view of high-dimensional multivariate data, there are several clusters that represent different process conditions. Manual masking is slow and inconsistent over time. To automatically extract the clusters without manual masking and to maintain consistency, the data selection module 552 applies GMM and HDBSCAN to identify the clusters with domain knowledge guided specification of model parameters. Here are some illustrations of the clusters from clustering methods GMM and HDBSCAN (for non-limiting example) in the 2D plane. In
[0075] To improve automation in selecting desired training data without the outliers, the data selection module 552 applies GMM and HDBSCAN to cluster z-transformed data into multiple components, then an automated processor (without human interactive intervention) masks isolated components with a small number (defined threshold) of data points as outliers to be excluded in training models. This processing advantageously complements and improves on outcomes of the user-interactive steps described above. For a given test dataset, data selection module 552 performs a dimension reduction-clustering pair of algorithms from dimension-reduction methods PCA, tSNE, and UMAP and clustering methods GMM and HDBSCAN. In turn, data selection module 552 shows the overlay of clusters on a 2D plane for visual data validation by domain experts. To associate the clusters in the 2D plane and location of these clusters along the timeline, the same data points are visualized in
[0076] According to the above, embodiments of the present invention provide novel data analysis techniques and use including: (1) a combination of density distribution by GMM and spatial connectedness by HDBSCAN for data conditions, (2) 2D visualization of projected data conditions for interactive user-guided selection, and (3) a unified score of all data conditions from different methods.
[0077] Returning to
[0078] With reference to
[0079] In embodiments, the model selection module 552 evaluates a nonlinearity measurement as follows. For a semi-continuous process (such as a non-limiting example Kraft process 110), the nonlinearity measurement may be expressed by a mathematical function f between an independent (predictor) variable X and a dependent (response) variable Y=f(X). If function f(X.sub.t) is linear, then the rank of covariance of sample sets {X.sub.t} and {Z.sub.t}, where Z.sub.t=(X.sub.t, Y.sub.t), would be the same due to the linear correlation. Thus, the rank difference between the covariances can be used to estimate the degree of nonlinearity. Accordingly, embodiments employ the following steps. The data processing 550 (i.e., modules 552, 554):
[0080] (1) performs z-score normalization for both X and Y;
[0081] (2) calculates minimum dimensions of selected samples of X and Z=(X, Y) for a given percentage of total mean squared error in a projected subspace;
[0082] (3) averages dimensions and differences for evenly sampled percents in [0.90, 0.99] (with a default number of 5 samples);
[0083] One embodiment utilizes a regression linearity ranking function. The regression function measures the partial dimension of combined data Z to keep the same percent of covariance from predictor variables X and the combined variables Z. Briefly, the regression function in pseudo code is:
[0084] For a given percent, calculate the difference of minimum ranks of {X.sub.t} and {X.sub.t, Y.sub.t} while keeping that percent of covariance;
[0085] Take evenly sampled percents in [0.90, 0.99], and then regress the integer ranks to estimate the average of extra dimensions introduced by nonlinear f;
[0086] The extra dimension is zero if the function f(X.sub.t) is linear but could become a fraction depending on the nonlinearity of f(X.sub.t);
[0087] (4) uses an average dimension to define the model size when initializing a prediction model for both linear and nonlinear cases;
[0088] (5) uses the difference (the extra dimension from step 3) to measure the nonlinearity with the following criteria: (a) very linear for a range 0.0 to about 0.2 not inclusive, i.e., [0.0, 0.20), which recommends a linear model; (b) quasilinear for a range of about 0.2 to about 0.5 not inclusive, i.e., [0.20, 0.50), which prefers a nonlinear model; and (c) nonlinear for a range of about 0.5 to about the dimension of Y not inclusive, i.e., [0.50, dimension of Y), which recommends a nonlinear model. Other criteria ranges for linear, quasilinear, and/or nonlinear are suitable. In this way, embodiments quantitatively test linearity and measure severity of non-linearity (including gain in non-linearity over time).
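The steps above can be sketched as follows in Python with NumPy, as a minimal illustration and not a definitive implementation. Several choices are assumptions of this sketch: the minimum rank is taken from the eigenvalues of the sample covariance, the extra dimension is a plain average of the rank differences (the disclosure mentions a regression over the integer ranks), the average dimension is taken over the combined data Z, and all function names are hypothetical. Constant sensors are assumed to have been removed beforehand so the z-score is well defined.

```python
import numpy as np

def zscore(a):
    """Column-wise z-score; 1-D input is treated as a single column."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    return (a - a.mean(axis=0)) / a.std(axis=0)

def min_rank(data, percent):
    """Smallest number of principal directions keeping `percent` of variance."""
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    eig = np.clip(np.sort(np.linalg.eigvalsh(cov))[::-1], 0.0, None)
    ratio = np.cumsum(eig) / eig.sum()
    return int(np.searchsorted(ratio, percent - 1e-12) + 1)

def nonlinearity_measure(X, Y, n_percents=5):
    """Return (average dimension of Z, average extra dimension of Z over X)."""
    Xz, Yz = zscore(X), zscore(Y)
    Z = np.hstack([Xz, Yz])
    percents = np.linspace(0.90, 0.99, n_percents)
    rank_x = np.array([min_rank(Xz, p) for p in percents])
    rank_z = np.array([min_rank(Z, p) for p in percents])
    return float(rank_z.mean()), float((rank_z - rank_x).mean())

def classify(extra_dimension):
    """Apply the example ranges [0, 0.20), [0.20, 0.50), [0.50, ...)."""
    if extra_dimension < 0.20:
        return "linear"
    if extra_dimension < 0.50:
        return "quasilinear"
    return "nonlinear"
```

Under these assumptions, a response that is an exact linear combination of the predictors adds no extra dimension, whereas a response such as the square of one predictor adds roughly one full dimension.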
[0089] In some embodiments, the trained prediction model 560 predicts at least three response values. In embodiments pertaining to the Kraft process 110, the three response values are: Kappa number, total alkaline charge, and residual alkali. In that case, the data processing 550 includes: (1) the data selection module 552 recognizing an operating state (i.e., a combination of certain types of wood chips and pulps) and calculating ranks of concentrated independent and dependent variables (i.e., measured sensor data 551); (2) the data selection module 552 using increased ranks to determine whether a linear relationship exists and, where a nonlinear relationship is determined to exist, using a nonlinear algorithm (e.g., XGBoost) to train the prediction model 560; (3) the model selection module 554 using estimated ranks to initialize a number of estimators for the nonlinear algorithm (e.g., XGBoost), or a number of components for a linear algorithm (e.g., PLS), such that: the number of components for the linear algorithm is the estimated rank among the independent and dependent variables, and the product of the number of estimators and the depth of the nonlinear algorithm is equal to twice the estimated ranks; (4) the model selection module 554 balancing at least three prediction variables and configuring a single model with multiple inputs and multiple outputs for all used algorithms (e.g., global XGBoost and local LASSO) through packing/unpacking high-dimensional data; and (5) prediction accuracy being measured by a normalized mean square error which is consistent with model requirements.
[0090] In other embodiments, a nonlinearity measurement of predictor and response variables was evaluated as shown in
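As a hedged illustration of items (3) and (5) above, the following Python sketch derives initial model sizes from an estimated rank and computes a normalized mean square error. The function names, the `max_depth` cap, the rule for splitting the budget between estimators and depth, and the choice of normalizing by the variance of the true values are all assumptions of this sketch; the disclosure states only that the number of PLS components equals the estimated rank and that the number of estimators times the depth equals twice the estimated rank.

```python
def init_model_sizes(estimated_rank, max_depth=3):
    """Derive initial sizes for a linear (PLS) and a nonlinear (boosted) model."""
    rank = max(1, int(round(estimated_rank)))
    budget = 2 * rank  # n_estimators * depth == twice the estimated rank
    # Hypothetical split: the largest depth up to max_depth that divides the budget.
    depth = max(d for d in range(1, max_depth + 1) if budget % d == 0)
    return {
        "pls_components": rank,
        "n_estimators": budget // depth,
        "max_depth": depth,
    }

def nmse(y_true, y_pred):
    """Mean square error normalized by the variance of y_true (an assumption)."""
    mean = sum(y_true) / len(y_true)
    num = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    den = sum((t - mean) ** 2 for t in y_true)
    return num / den
```

With this normalization, a value of 1.0 corresponds to predicting the mean of the observed responses, and values near 0.0 indicate an accurate model.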
[0091] In other embodiments, data was evaluated from the Kraft process 110 using a linear PLS algorithm from the largest component by HDBSCAN as shown in
[0092] The XGBoost algorithm was then applied and evaluated similarly to the PLS algorithm. XGBoost is a state-of-the-art algorithm for small to mid-sized tabular data prediction tasks. Data from the cooking process of the Kraft process 110 is tabular and nonlinear, with size from hundreds to thousands of samples. Thus in one embodiment, XGBoost is employed to solve the white liquor prediction and consumption optimization tasks and in turn overcome limitations of existing linear model approaches. To maintain consistency and compatibility in using XGBoost with existing linear methods, Applicants adopted the following initialization and training techniques: (1) the number of estimators, maximum depth, and maximum leaves are proportional to the average dimension obtained from a previous step (detailed above), the empirical criterion being that the product of the three parameters is three times the average dimension; and (2) selection of the three parameters is further optimized by maximizing the score of XGBoost.
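One possible reading of the initialization criterion above is sketched below in Python: enumerate the integer triples (number of estimators, maximum depth, maximum leaves) whose product is three times the average dimension, then keep the triple with the highest score. The function names and the `score` callable are hypothetical; in practice the score would come from fitting and validating a model with each candidate, which is not shown here.

```python
def candidate_triples(avg_dim):
    """All (n_estimators, max_depth, max_leaves) with product 3 * avg_dim."""
    target = max(1, round(3 * avg_dim))
    triples = []
    for n_estimators in range(1, target + 1):
        if target % n_estimators:
            continue
        rest = target // n_estimators
        for max_depth in range(1, rest + 1):
            if rest % max_depth == 0:
                triples.append((n_estimators, max_depth, rest // max_depth))
    return triples

def pick_best(triples, score):
    """Return the triple maximizing a caller-supplied score function."""
    return max(triples, key=score)
```

For an average dimension of 4, for example, the product is 12 and the candidates include (2, 3, 2) and (3, 2, 2).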
[0093] In other embodiments, data was evaluated from the Kraft process 110 using a nonlinear XGBoost algorithm from largest component by HDBSCAN as shown in
[0094] In some embodiments, PLS-based prediction model coefficients can be used for contribution analysis and optimization control. For the XGBoost model described in
[0095] That is to say, after selecting a linear model configuration or a non-linear model configuration for the target prediction model, model selection module 554 next configures model coefficients. Specifically, model selection module 554 bases model coefficients on relative importance (importance factors) and optimizes predictor variables such as white liquor consumption in our example case Kraft process 110. In embodiments, the model selection module 554 uses data 551 in an interval (time interval) to estimate the importance factors of contributing variables which can be denoted as {X.sub.i} and {Y.sub.i}. In a semi-continuous or continuous process (such as Kraft process 110), there is a data shift (Δ) that is accounted for or otherwise considered for the interval. The model selection module 554 selects the length of Δ according to the data acquisition frequency and chemical reaction response time. In one embodiment, the larger of the two values is chosen to define the length of Δ. The model selection module 554 accordingly shifts the data an interval based on Δ to estimate contributing variables for the same importance factors estimated from the previous step. The contributing variable values are thus moved to {X.sub.i+Δ} and {Y.sub.i+Δ}. In this way, embodiments employ duo-interval values to analyze importance factors and to optimize Kappa number (a response variable Y). Such duo-interval data technique can be used in modeling other semi-continuous or continuous processes and configuring model coefficients for manipulating control parameters to reach desired behaviors. For non-limiting example, the model coefficient configuration can be applied to: reducing toxic waste, wastewater, and certain chemicals in a drying process for pulp and paper; calculating control set input in model predictive control to reach desired output; and/or optimizing distribution system in a short-term supply chain (to increase profit).
[0096] Embodiments use a local PLS model and LASSO (least absolute shrinkage and selection operator) model on a small set of samples in a moving window to calculate coefficients when the prediction model 560 is approximated in a local interval. A subject chemical process, industrial plant process, or other operation of interest is described in Equation 2. Embodiments determine, for predictor-corresponding response (X.sub.t, Y.sub.t), the amount of contribution of each X.sub.t to Y.sub.t. For a new target Y.sub.t+Δ, where Δ is the data shift, embodiments also determine how to change X.sub.t+Δ to meet Y.sub.t+Δ=f(X.sub.t+Δ) as follows:
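The duo-interval selection described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the acquisition period and the reaction response time are expressed in the same time unit, the shift is rounded up to a whole number of samples, and the function names are hypothetical.

```python
import math

def shift_in_samples(acquisition_period, response_time):
    """Data shift as a whole number of samples: the larger of the two values."""
    shift = max(acquisition_period, response_time)
    return max(1, math.ceil(shift / acquisition_period))

def duo_intervals(X, Y, start, length, shift):
    """Return the original interval and the same interval moved by `shift`."""
    first = (X[start:start + length], Y[start:start + length])
    second = (X[start + shift:start + shift + length],
              Y[start + shift:start + shift + length])
    return first, second
```

For a sensor sampled every 60 seconds and a reaction response time of 150 seconds, the shift is three samples, so the second interval starts three samples after the first.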
[0097] For samples of {X.sub.i} predictor and {Y.sub.i} response, the LASSO approximation of the function f is expressed as Equation 3. In Equation 3, N is the number of samples for {X.sub.i} and {Y.sub.i}, and the coefficient set indexed by j is the solution of LASSO under constraints. The purpose of this regression is variable selection, in terms of which variables contribute to the target and how much each contributes. If function f(X.sub.t) is linear, the coefficients (j=1, . . . , N) may be the exact solution if all the constraints are met. Further, for linear cases, Equation 3 may be an exact solution but may be a local approximation for nonlinear cases.
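For illustration, a local LASSO fit on a small window of samples can be sketched with a plain coordinate-descent solver in Python with NumPy. Equation 3 is not reproduced here; the sketch assumes the commonly used objective of one half of the mean squared residual plus `alpha` times the sum of absolute coefficient values, with no additional constraints. The function names and the default `alpha` are hypothetical.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha=0.01, n_iter=200):
    """Return (coefficients, intercept) of a LASSO fit on centered data."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    n, d = Xc.shape
    coef = np.zeros(d)
    col_sq = (Xc ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            # Residual with the contribution of variable j added back.
            partial = yc - Xc @ coef + Xc[:, j] * coef[j]
            rho = Xc[:, j] @ partial / n
            coef[j] = soft_threshold(rho, alpha) / col_sq[j]
    intercept = y_mean - x_mean @ coef
    return coef, intercept
```

The magnitude of each returned coefficient indicates how much the corresponding variable contributes to the target within the window, and coefficients driven exactly to zero mark variables that are not selected.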
[0098] In other non-limiting embodiments, if optimizing a change to response Y.sub.t+Δ is desired, then the corresponding changes to predictor X.sub.t+Δ need to be solved. These changes can be solved according to Equation 4, in which the desired predictors and the desired response are locally assumed to follow the same model as obtained from the previous LASSO regression.
[0099] In Equation 4, to be close to the previous states, the desired predictors and the desired response may be subject to constraints such as lower/upper bounds.
[0100] In other non-limiting embodiments, alternating the regression of Equation 4 may provide a more accurate estimation as described in Equation 5. Equation 5 moves Y.sub.t to Y.sub.t+Δ in multiple steps.
[0101] Based on validation experiments, the coefficients from PLS and LASSO are similar in terms of ranks for the important variables. For simplicity without loss of generality, the coefficients from a nonlinear mathematical function x+y.sup.2z.sup.4 in three different regions of [0.0, 0.1]×[0.0, 0.1]×[0.0, 0.1], [0.5, 0.6]×[0.5, 0.6]×[0.5, 0.6], and [1.0, 1.1]×[1.0, 1.1]×[1.0, 1.1] are illustrated in
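The idea that local linear coefficients of a nonlinear function change from region to region can be illustrated with the following Python sketch using NumPy. Two assumptions are made loudly here: the function is read literally as x + y²·z⁴, and ordinary least squares is used as a simple stand-in for the PLS and LASSO fits discussed above. The function names, sample count, and seed are hypothetical.

```python
import numpy as np

def f(x, y, z):
    # Literal reading of the example function; an assumption of this sketch.
    return x + y ** 2 * z ** 4

def local_linear_coefficients(low, high, n_samples=500, seed=0):
    """Least-squares coefficients of f for points sampled in [low, high]^3."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(low, high, size=(n_samples, 3))
    target = f(pts[:, 0], pts[:, 1], pts[:, 2])
    design = np.column_stack([pts, np.ones(n_samples)])  # add intercept
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[:3]  # coefficients for x, y, z
```

Calling `local_linear_coefficients(0.0, 0.1)`, `local_linear_coefficients(0.5, 0.6)`, and `local_linear_coefficients(1.0, 1.1)` gives a coefficient close to 1 for x in every region, while the coefficients for y and z grow as the region moves away from the origin.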
[0102] To find the optimized predictor variable X for a modified response Y, an embodiment (model selection module 554) applies or otherwise uses a local LASSO in a predefined window with corresponding constraints (Equation 6):
Here, the window average is taken over samples (data 551) of predictor variable X in a time window of [t.sub.0, t.sub.1], {tilde over (Y)} is the desired response at the next timestamp t.sub.1+Δt, and {tilde over (X)} is the optimal solution for the given {tilde over (Y)}. Note that such local LASSO may not be applicable if the change from Y to {tilde over (Y)} is not continuously dependent on the changes from X to {tilde over (X)}. The recommended use case would be within the same process control state. For the contribution analysis during the transition period, a nonlinear approach may be needed to reflect the complicated relationship between the predictors and responses (i.e., variables X, Y).
[0103] In other embodiments, a combination of PLS and LASSO models may be employed in an iterative fashion in calculating the contribution coefficients and performing optimization in a local time window. This novel approach and technique can also be interpreted as a federated gradient-based approach because the iteration is performed on a group of points instead of individual data points in the traditional gradient-based optimization solver. In sum, embodiments of the present invention apply local linear models (e.g., PLS, LASSO, etc.) to address global nonlinear data 551. As shown in the previously mentioned math function of
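As a hedged sketch of how predictors might be adjusted toward a desired response under lower/upper bounds, the following Python function starts from the window average and takes repeated small steps along the local linear coefficients, clipping to the bounds after each step. This is not Equation 6 itself, which is not reproduced here; the minimum-norm update rule, the function name, and the step count are assumptions of this sketch, and the coefficients and intercept are assumed to come from a previously fitted local linear model.

```python
import numpy as np

def move_toward_target(x_avg, coef, intercept, y_target, lower, upper,
                       n_steps=100):
    """Adjust predictors so intercept + coef @ x approaches y_target."""
    x = np.clip(np.asarray(x_avg, dtype=float), lower, upper)
    coef = np.asarray(coef, dtype=float)
    norm_sq = float(coef @ coef)
    if norm_sq == 0.0:
        return x  # no predictor influences the response locally
    for _ in range(n_steps):
        gap = y_target - (intercept + coef @ x)
        # Smallest change that closes the gap, then projection onto the bounds.
        x = np.clip(x + coef * gap / norm_sq, lower, upper)
    return x
```

When a predictor reaches one of its bounds, the remaining gap is closed in later steps by the predictors that are still free to move, which is why several steps are used instead of a single update.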
Computer Support
[0104]
[0105] Client computers/devices 50 and server computer(s) 60 may execute any of the modules, computation steps, or data processes embodying the functionalities and workflows of the present invention as detailed in
[0106]
[0107] In one embodiment, the processor routines 92 and data 94 are a computer program product (generally referenced 92), including a computer readable medium (e.g., a removable storage medium such as one or more DVD-ROM's, CD-ROM's, diskettes, tapes, cloud storage, SD cards, etc.) that provides at least a portion of the software instructions for the invention system. Computer program product 92 can be installed by any suitable software installation procedure, as is well known in the art. In another embodiment, at least a portion of the software instructions may also be downloaded over a cable, communication and/or wireless connection. In other embodiments, the invention programs are a computer program propagated signal product 107 embodied on a propagated signal on a propagation medium (e.g., a radio wave, an infrared wave, a laser wave, a sound wave, or an electrical wave propagated over a global network such as the Internet, or other network(s)). Such carrier medium or signals provide at least a portion of the software instructions for the present invention routines/program 92.
[0108] In alternate embodiments, the propagated signal is an analog carrier wave or digital signal carried on the propagated medium. For example, the propagated signal may be a digitized signal propagated over a global network (e.g., the Internet), a telecommunications network, or other network. In one embodiment, the propagated signal is a signal that is transmitted over the propagation medium over a period of time, such as the instructions for a software application sent in packets over a network over a period of milliseconds, seconds, minutes, or longer. In another embodiment, the computer readable medium of computer program product 92 is a propagation medium that the computer system 50 may receive and read, such as by receiving the propagation medium and identifying a propagated signal embodied in the propagation medium, as described above for computer program propagated signal product.
[0109] Generally speaking, the term carrier medium or transient carrier may encompass the foregoing transient signals, propagated signals, propagated medium, storage medium and the like. In other embodiments, the program product 92 may be implemented as a so-called Software as a Service (SaaS), or other installation or communication supporting end-users.
[0110] It should be understood that the example embodiments described herein may be implemented in many different ways. In some instances, the various methods and machines described herein may each be implemented by a physical, virtual, or hybrid general purpose computer, such as the computer system 50, 60, 2140, 3100, or a computer network environment such as those described below in relation to
[0111] Turning to
[0112] The process control, planning, scheduling, and real-time optimization applications are based on models (of the subject physical, chemical, or engineering process 2124) generated by process modeling system 2130. Plant 2120 may have any number of chemical processes 2124, any number of controllers 2122, and any number of process modeling systems 2130 used to configure and maintain their respective settings 2132.
[0113] In prior art methods, process modeling system 2130 may have utilized controlling an industrial based chemical process. In embodiments of the present invention, the process modeling system 2130 generates and deploys improved models 2110 of the subject chemical process 2124 generated from received and working data 2101a, 2101b, and 2101c (generally 2101) detailing the physical characteristics and operating conditions of chemical process 2124 and from initial models 2102a, 2102b, and 2102c (generally 2102) representing linearity-based prediction models (non-linear, quasi-linear, or linear) of the chemical process 2124. Working data 2101a, 2101b, and 2101c may be received as part of plant data 2105, part of a preexisting dataset, include simulated or otherwise derived data, preprocessed (dimension reduced, outliers removed) data, or any combination of the aforementioned. Models 2110 may be generated from any amount of working data 2101 and initial models 2102 (including global non-linear, and local linear models described above).
[0114] The models 2110 (also referred to herein as resulting models 2110) predict, with improved accuracy, the progress and physical characteristics/conditions of the subject chemical process 2124 (such as white liquor consumption of a Kraft process 110 in
[0115] In a generalized sense, controller 2122 is an interface between process modeling system 2130 and industrial plant 2120. Other interfaces between process modeling system 2130 and plant 2120 in addition to and/or instead of controller 2122 are suitable and in the purview of one skilled in the art given the disclosure herein. For example, there may be an interface between process modeling system 2130 and plant 2120 systems. There may be a user interface for process modeling system 2130. Process modeling system 2130 may effectively be part of a simulator or optimizer for non-limiting examples. Various such interfaces enable an end user, e.g., process engineer, to utilize model predictions in (a) determining different mathematical relationships (of the chemical process physics-based or chemistry-based characteristics) that are substantially linear, then training the initial model as a linear predictive model, in (b) where the different mathematical relationships are determined to be quasi-linear or non-linear, then (i) training the initial model as a global non-linear predictive model, (ii) combining the trained global non-linear predictive model with one or more local linear models dynamically adjusting control variables of the certain resource consumption in respective windows of time, and (iii) producing a resulting trained prediction model based on the global non-linear predictive model combined with the one or more local linear models. In embodiments, an interface enables a process engineer to utilize the model predictions in optimizing (online or offline) the chemical process 2124 at the plant 2120. In these and other similar ways, embodiments enable various improvements in performance of the chemical process 2124 at the subject plant 2120.
[0116]
[0117] The system computers 3101 and 3102 may communicate with the data server 3103 to access collected data for measurable process variables from a historian database 3111. The data server 3103 may be further communicatively coupled to a distributed control system (DCS) 3104, or any other plant control system, which may be configured with instruments 3109A-3109I, 3106, 3107 that collect data at a regular sampling period (e.g., one sample per minute) for the measurable process variables. Instruments 3106, 3107 are online analyzers (e.g., gas chromatographs) that collect data at a longer sampling period. The instruments 3109A-3109I, 3106, 3107 may communicate the collected data to an instrumentation computer 3105, also configured in the DCS 3104, and the instrumentation computer 3105 may in turn communicate the collected data to the data server 3103 over communications network 3108. The data server 3103 may then archive the collected data in the historian database 3111 for model calibration, inferential model training purposes, and the like. The data collected varies according to the type of target process.
[0118] The collected data may include measurements for various measurable process variables. These measurements may include, for example, a feed stream flow rate as measured by a flow meter 3109B, a feed stream temperature as measured by a temperature sensor 3109C, component feed concentrations as determined by an analyzer 3109A, and reflux stream temperature in a pipe as measured by a temperature sensor 3109D. The collected data may also include measurements for process output stream variables, such as, for example, the concentration of produced materials, as measured by analyzers 3106 and 3107. The collected data may further include measurements for manipulated input variables, such as, for example, reflux flow rate as set by valve 3109F and determined by flow meter 3109H, a re-boiler steam flow rate as set by valve 3109E and measured by flow meter 3109I, and pressure in a column as controlled by a valve 3109G. The collected data reflect the operation conditions of the representative plant during a particular sampling period. The collected data is archived in the historian database 3111 for model calibration and inferential model training purposes. The data collected varies according to the type of target process.
[0119] The system computers 3101 or 3102 may execute various types of process controllers for online deployment purposes. The output values generated by the controller(s) on the system computers 3101 or 3102 may be provided to the instrumentation computer 3105 over the network 3108 for an operator to view, or may be provided to automatically program any other component of the DCS 3104, or any other plant control system or processing system coupled to the DCS 3104. Alternatively, the instrumentation computer 3105 can store the historical data through the data server 3103 in the historian database 3111 and execute the process controller(s) in a stand-alone mode. Collectively, the instrumentation computer 3105, the data server 3103, and various sensors and output drivers (e.g., 3109A-3109I, 3106, 3107) form the DCS 3104 and can work together to implement and run the presented application.
[0120] The example architecture 3100 of the computer system supports the process operation of a representative plant. In this embodiment, the representative plant may be, for non-limiting example, a pulp and paper refinery or a chemical processing plant having a number of measurable process variables, such as, for example, temperature, pressure, and flow rate variables. It should be understood that in other embodiments a wide variety of other types of technological processes or equipment in the useful arts may be used.
[0121] Embodiments or aspects thereof may be implemented in the form of hardware, firmware, or software. If implemented in software, the software may be stored on any non-transient computer readable medium that is configured to enable a processor to load the software or subsets of instructions thereof. The processor then executes the instructions and is configured to operate or cause an apparatus to operate in a manner as described herein.
[0122] Further, firmware, software, routines, or instructions may be described herein as performing certain actions and/or functions of the data processors. However, it should be appreciated that such descriptions contained herein are merely for convenience and that such actions in fact result from computing devices, processors, controllers, or other devices executing the firmware, software, routines, instructions, etc. Likewise, where a digital processor is described as performing certain actions, it is understood that one or more digital processors may be performing the actions.
[0123] It should be understood that the flow diagrams, block diagrams, and network diagrams may include more or fewer elements, be arranged differently, or be represented differently. But it further should be understood that certain implementations may dictate the block and network diagrams and the number of block and network diagrams illustrating the execution of the embodiments be implemented in a particular way.
[0124] Accordingly, further embodiments may also be implemented in a variety of computer architectures, physical, virtual, cloud computers, and/or some combination thereof, and thus, the data processors described herein are intended for purposes of illustration only and not as a limitation of the embodiments.
[0125] While example embodiments have been particularly shown and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the embodiments encompassed by the appended claims.
[0126] For example, the foregoing description and details of embodiments in the figures reference Applicant-Assignee AspenTech, tools and platforms, for purposes of illustration and not limitation. Other similar tools and platforms are suitable. In one embodiment, the foregoing methods, techniques, and functions may be integrated into existing software programs or products, such as Applicant-Assignee AspenTech ProMV (trademark). Functions are implemented through Python-based services (for non-limiting example) to ensure the reusability for on-prem, cloud, and MS Edge (trademark) applications. A REST (Representational State Transfer) API may be utilized to unify data input and output to neutralize a specific domain characteristic of different applications, and/or facilitate economic and quick integration across products, solutions, and platforms. Other integration techniques are suitable and in the purview of one skilled in the art given this disclosure.
[0127] Given the above, the present invention provides advantages for process model/algorithm selection based on working data from chemical processes of interest. The determined process model/algorithm optimizes independent and dependent variables and then automatically continues to monitor and adjust the model/algorithm to best fit the working data. In this way, Applicant's approach is (and the methods and systems embodying the present invention are) a data-driven generation of process models of nonlinear, continuous-like industrial (e.g., chemical) processes. The subject chemical process may be continuous, semi-continuous, or continuous in one or more parts or portions. In such industrial (e.g., chemical) processes, the computer-automated data-driven selection of operating conditions and models is unachieved in the prior art. Applicant's computer-automated data-driven selection of operating conditions and models is advantageously more consistent than manual domain knowledge selection in the art. The dynamic calculations of dynamic coefficients and optimization through moving windows of time are advantageous over prior global objective function and solver methods for determining optimized independent variables.
[0128] The non-linear correlations of the present invention help fill the gap of modeling processes that are somewhere between fully continuous and fully batch operations. As described above, the pulp and paper industry is a good example because of the discrete nature of the sequential subprocesses from digesting the wood fibers to extruding the sheets of finished paper. Part of the process is batch; other parts are continuous. Other industries that share similar process traits are reactors that are batch operated as individual process units but also have characteristics of an overall semi-continuous process when one batch can affect the next one in the same production line. Establishing a lag of process conditions from the previous batch to predict the current one may not always linearly model the process with sufficient accuracy. Some polymer reactors and reactor furnaces such as smelting furnaces have this complexity due to less than ideal operating practices where reaction products are not completely recovered between batches. Since the art will continue to be challenged with complex processes, Applicant with the present invention data driven approach provides an alternative way to empirically model such industrial (e.g., chemical) processes.
[0129] Applicant's methods and systems advantageously automatically detect linearity for purposes of digital processor selection of prediction (e.g., process) model. As detailed above, the methods and systems quantify severity of linearity, i.e., quantitatively measure non-linearity, and track over time (across the moving windows of time periods) any gradual changes in linearity/non-linearity of data and relationships therein. Such quantitative measuring of gain in non-linearity and quantitative metrics of linearity/non-linearity severity in embodiments is advantageous in process modeling increasing model accuracy and model performance, and is heretofore not contemplated in the prior art.