SYSTEM AND PREDICTIVE MODELING METHOD FOR SMELTING PROCESS CONTROL BASED ON MULTI-SOURCE INFORMATION WITH HETEROGENEOUS RELATEDNESS

20180081339 · 2018-03-22

Abstract

The present invention is a system and predictive modeling method specially designed to improve process control and energy efficiency for a smelting process used to produce pure metal from an ore containing said metal. Data is collected from various sensor sources in the smelting process to predict whether an increase or decrease is needed in controlling two variables comprising temperature and an additive that allows the reaction in the electrolytic process to proceed at a lower bath temperature. The invention provides a generalized framework to learn the complex heterogeneity embedded in the collected data, and employs a regularized non-negative matrix factorization, which simultaneously decomposes the instance-feature and instance-label matrices while enforcing task relatedness, feature type consistency and label correlations on the collected data. The predictive modeling method disclosed herein effectively mines the hidden correlation among the heterogeneous data and improves the prediction accuracy.

Claims

1. A method for predicting temperature and additive concentration variables in an electrolytic smelting bath process, said variables being regulated together to provide in combination, optimal improvement in the physico-chemical properties of said bath and to lower the temperature of said bath, said optimal combination resulting in a lowering of the melting temperature of said bath at any given time, comprising: measuring process variables of each pot used in an electrolytic process of forming a pure metal; sending said process variables to a central database; receiving said process variables in said central database that was generated in each of said pots and receiving therein historical measurements; representing respective pots as quality predictive modeling tasks based on said historical measurements; grouping predictive modeling tasks for said temperature and said additive in said pots according to heterogeneous relatedness of the production equipment, said grouping step being included, for each pot in which a metal ore is simultaneously being processed, grouping said predictive modeling tasks for all of the respective pots, each said pot of the multiple pots being treated as a group of one or more predictive modeling tasks; partitioning said process variables into two sets comprising generating a prediction model that accommodates said grouping of said pots predictive modeling tasks from said grouping step and said partitioned process variables from said partitioning step; predicting a temperature and/or a concentration of said additive in individual pots based on said prediction model generated in the generating step; sending predicted values of temperature and/or additive concentration obtained from each pot to an advanced process controller and the central database; measuring an actual number value of the temperature and/or additive concentration in at least one of said pots; sending said actual number value of the said temperature and/or said additive concentration in at least one of said pots to an advanced process controller and a virtual machine; updating said virtual machine with said actual number value of the temperature and/or additive concentration in at least one of said pots; determining a feedback control by said advanced process controller; and processing, by increasing or decreasing said temperature and/or said additive concentration by the production equipment in accordance with the feedback control, wherein the receiving, representing, grouping, partitioning, generating, and predicting steps are executed by a virtual machine implemented on a computer.

2. The method defined in claim 1 wherein said additive is AlF.sub.3.

3. The method defined in claim 2 wherein ranges of data measurements generated in said pots comprising various control mechanisms comprising power and resistance control, noise control, alumina feed control and chemical combination data, are classified in a functional 3×3, 9-box matrix control block that issues a series of control signals based upon a low (L), medium (M) and high (H) range of the functional relationship between bath temperature and AlF.sub.3, the optimal bath temperature and AlF.sub.3 concentration ranges for maximum efficiency being located in Box.sub.22 of said 3×3, 9-box matrix having the depicted configuration plotting ideal bath temperature as the ordinate as a function of AlF.sub.3 concentration ranges as the abscissa; said functional 3×3, 9-box matrix being that depicted in FIG. 2 hereof, which is hereby incorporated herein by reference as if set forth identically herein.

4. The method defined in claim 3 wherein said various process control mechanisms are connected in series to smelter groups 1 through 4 through n, which host the smelting process, and which comprise machine heterogeneity, said smelter groups being connected in series respectively to a sensor type functional block comprising a sensor type 1, a sensor type 2, a sensor type j, a sensor type m, through to sensor n, each said sensor collecting one of the feature type data source measurements selected from the power and resistance variables, noise related process variables, various feeding parameters and various chemical contents; and the plurality of process variables resulting from the sensor measurement in a feature heterogeneity stage is divided into different types of said variables as taken from different sources; the data from said feature heterogeneity stage is transmitted to a data warehouse which is a functional block that stores said variable process data, measures said 3×3, 9-box control variables and predicts the 3×3, 9-box control variables; the output of the data warehouse is transmitted to a predictive modeler that is a functional block that uses a trained model that processes a plurality of feature types and data sources and, based upon historical process data and measurements of temperature and AlF.sub.3, predicts control measurements of bath temperature and concentration of AlF.sub.3; the data obtained from said trained model is transmitted to said 3×3, 9-box matrix control that issues a series of control signals based upon the range of temperature and AlF.sub.3 concentration.

5. The method defined in claim 4 wherein said modeling approach comprises the following steps: train a random forest three class Model 1 using all training test data developed and stored in said data warehouse; train a binary classification Model 2 using training data belonging to a class high (H) and medium (M); apply said 3-class Model 1 to said test data; if predicted low beneath Box.sub.22, apply Model 2, output is low or medium; if predicted medium, output is medium; if predicted high above Box.sub.22, apply Model 3, output is high or medium.

6. The method defined in claim 5 wherein triple types of complex heterogeneity embedded in the data using said predictive modeling approach with respect to Model 2 and Model 3 is determined using a regularized non-negative matrix triple factorization operation based upon the depicted matrix in FIG. 6 hereof, which is hereby incorporated herein by reference as if set forth identically herein wherein instance feature data ({tilde over (X)}.sub.ij) is decomposed into 3 non-negative matrices; and wherein instance label data (Y.sub.i) is decomposed into 3 non-negative matrices; the task relatedness, view/feature type, consistence and label correlations are regularized, subject to the limitations wherein: with respect to task relatedness, the jth view feature type, the decomposition of ({tilde over (X)}.sub.ij) in different tasks shares the same feature encoding matrix C.sub.j; with respect to view/feature consistence, for the ith task, the decomposition of ({tilde over (X)}.sub.ij) in different views shares the same feature encoding matrix R.sub.i; with respect to the label correlations, the labels share the same label encoding matrix C.sub.Y across different tasks; M.sub.ij models the correlations between instance clusters and feature clusters; M.sub.iY models the correlations between instance clusters and label clusters; said method simultaneously decomposes the instance-feature and instance label matrices, while concurrently enforcing task relatedness, feature type consistency and label correlations on said data.

7. The method defined in claim 6 wherein said hidden correlations among said heterogeneous data is governed according to:

$$\min_{\{R, M, C\} \geq 0} \; \sum_{i=1}^{T} \sum_{j=1}^{V} \left\| \tilde{X}_{ij} - \tilde{R}_{i} M_{ij} C_{j}^{T} \right\|_{F}^{2} + \alpha \sum_{i=1}^{T} \left\| Y_{i} - R_{i} M_{iY} C_{Y}^{T} \right\|_{F}^{2} + \beta \sum_{i=1}^{T} \sum_{j=1}^{V} \left\| M_{ij} - M_{iY} \right\|_{F}^{2}$$

wherein T is the number of tasks, V is the number of views, M is the number of labels, respectively, F represents Frobenius norm and α and β are regularization coefficients.

8. A method for operating a smelter to produce a pure base metal, said method comprising producing said pure base metal in pots with the aid of a digital computer and a central database for storing data derived from a metal smelter process to produce a pure base metal, said data comprising process variables comprising power and resistance, noise control, feed control and chemicals received from said pots for said smelter process, comprising: means for mining hidden correlations among heterogeneous data within said central database; means decomposing instance feature data ({tilde over (X)}.sub.ij) into three non-negative matrices; means decomposing instance label data (Y.sub.i) into three non-negative matrices; means regularizing task relatedness, view (feature type) consistence and label correlations; means for determining the objective function iteratively; wherein for said task relatedness, the jth view/feature type, said decomposition of {tilde over (X)}.sub.ij in different tasks shares the same feature encoding matrix C.sub.j; wherein, for said ith task, said decomposition of {tilde over (X)}.sub.ij in different views shares the same feature encoding matrix R.sub.i; wherein for said label correlations, said labels share the same label encoding matrix C.sub.Y across different tasks; wherein M.sub.ij models the correlation between instance clusters and feature clusters; and wherein M.sub.iY models the correlation between instance clusters and label clusters.

9. The method defined in claim 8 wherein said additive is AlF.sub.3.

10. The method defined in claim 9 wherein ranges of data measurements generated in said pots comprising various control mechanisms comprising power and resistance control, noise control, alumina feed control and chemical combination data, are classified in a functional 3×3, 9-box matrix control block that issues a series of control signals based upon a low (L), medium (M) and high (H) range of the functional relationship between bath temperature and AlF.sub.3, the optimal bath temperature and AlF.sub.3 concentration ranges for maximum efficiency being located in Box.sub.22 of said 3×3, 9-box matrix having the depicted configuration plotting ideal bath temperature as the ordinate as a function of AlF.sub.3 concentration ranges as the abscissa; said functional 3×3, 9-box matrix being that depicted in FIG. 2 hereof, which is hereby incorporated herein by reference as if set forth identically herein.

11. The method defined in claim 10 wherein said various process control mechanisms are connected in series to smelter groups 1 through 4 through n, which host the smelting process, and which comprise machine heterogeneity, said smelter groups being connected in series respectively to a sensor type functional block comprising a sensor type 1, a sensor type 2, a sensor type j, a sensor type m, through to sensor n, each said sensor collecting one of the feature type data source measurements selected from the power and resistance variables, noise related process variables, various feeding parameters and various chemical contents; and the plurality of process variables resulting from the sensor measurements in a feature heterogeneity stage is divided into different types of said variables as taken from different sources; the data from said feature heterogeneity stage is transmitted to a data warehouse which is a functional block that stores said variable process data, measures the 3×3, 9-box control variables and predicts the 3×3, 9-box control variables; the output of the data warehouse is transmitted to a predictive modeler that is a functional block that uses a trained model that processes a plurality of feature types and data sources and, based upon historical process data and measurements of temperature and AlF.sub.3, predicts control measurements of bath temperature and concentration of AlF.sub.3; the data obtained from said trained model is transmitted to said 3×3, 9-box matrix control that issues a series of control signals based upon the range of temperature and AlF.sub.3.

12. The method defined in claim 11 wherein said modeling approach comprises the following steps: train a random forest three class Model 1 using all training test data developed and stored in said data warehouse; train a binary classification Model 2 using training data belonging to a class high (H) and medium (M); apply said 3-class Model 1 to said test data; if predicted low beneath Box.sub.22, apply Model 2, output is low or medium; if predicted medium, output is medium; if predicted high above Box.sub.22, apply Model 3, output is high or medium.

13. The method defined in claim 12 wherein triple types of heterogeneity embedded in the data using said predictive modeling approach with respect to Model 2 and Model 3 is determined using a regularized non-negative matrix triple factorization operation based upon the depicted matrix in FIG. 6 hereof, which is hereby incorporated herein by reference as if set forth identically herein wherein instance feature data ({tilde over (X)}.sub.ij) is decomposed into 3 non-negative matrices; and wherein instance label data (Y.sub.i) is decomposed into 3 non-negative matrices; the task relatedness, view/feature type, consistence and label correlations are regularized, subject to the limitations wherein: with respect to task relatedness, the jth view feature type, the decomposition of ({tilde over (X)}.sub.ij) in different tasks shares the same feature encoding matrix C.sub.j; with respect to view/feature consistence, for the ith task, the decomposition of ({tilde over (X)}.sub.ij) in different views shares the same feature encoding matrix R.sub.i; with respect to the label correlations, the labels share the same label encoding matrix C.sub.Y across different tasks; M.sub.ij models the correlations between instance clusters and feature clusters; M.sub.iY models the correlations between instance clusters and label clusters.

14. The method defined in claim 13 wherein said hidden correlations among said heterogeneous data is governed according to:

$$\min_{\{R, M, C\} \geq 0} \; \sum_{i=1}^{T} \sum_{j=1}^{V} \left\| \tilde{X}_{ij} - \tilde{R}_{i} M_{ij} C_{j}^{T} \right\|_{F}^{2} + \alpha \sum_{i=1}^{T} \left\| Y_{i} - R_{i} M_{iY} C_{Y}^{T} \right\|_{F}^{2} + \beta \sum_{i=1}^{T} \sum_{j=1}^{V} \left\| M_{ij} - M_{iY} \right\|_{F}^{2}$$

wherein T is the number of tasks, V is the number of views, M is the number of labels, respectively, F represents Frobenius norm and α and β are regularization coefficients.

15. A non-transitory computer readable storage medium having a computer readable program code stored therein, said computer readable code embodied therein storing a program of instructions adapted to be executed to implement a method for predicting temperature and additive concentration variables in an electrolytic smelting bath process, said variables being regulated together to provide in combination, optimal improvement in the physico-chemical properties of said bath and to lower the temperature of said bath, said optimal combination resulting in a lowering of the melting temperature of said bath at any given time, comprising: measuring process variables of each pot used in an electrolytic process of forming a pure metal; sending said process variables to a central database; receiving said process variables in said central database that was generated in each of said pots and receiving therein historical measurements; representing respective pots as quality predictive modeling tasks based on said historical measurements; grouping predictive modeling tasks for said temperature and said additive in said pots according to heterogeneous relatedness of the production equipment, said grouping step being included, for each pot in which a metal ore is simultaneously being processed, grouping said predictive modeling tasks for all of the respective pots, each said pot of the multiple pots being treated as a group of one or more predictive modeling tasks; partitioning said process variables into two sets comprising generating a prediction model that accommodates said grouping of said pots predictive modeling tasks from said grouping step and said partitioned process variables from said partitioning step; predicting a temperature and/or a concentration of said additive in individual pots based on said prediction model generated in the generating step; sending predicted values of temperature and/or additive concentration obtained from each pot to an advanced process controller and the central database; measuring an actual number value of the temperature and/or additive concentration in at least one of said pots; sending said actual number value of the said temperature and/or said additive concentration in at least one of said pots to an advanced process controller and a virtual machine; updating said virtual machine with said actual number value of the temperature and/or additive concentration in at least one of said pots; determining a feedback control by said advanced process controller; and processing, by increasing or decreasing said temperature and/or said additive concentration by the production equipment in accordance with the feedback control, wherein the receiving, representing, grouping, partitioning, generating, and predicting steps are executed by a virtual machine implemented on a computer.

16. The method defined in claim 15 wherein said additive is AlF.sub.3.

17. The method defined in claim 16 wherein ranges of data measurements generated in said pots comprising various control mechanisms comprising power and resistance control, noise control, alumina feed control and chemical combination data, are classified in a functional 3×3, 9-box matrix control block that issues a series of control signals based upon a low (L), medium (M) and high (H) range of the functional relationship between bath temperature and AlF.sub.3, the optimal bath temperature and AlF.sub.3 concentration ranges for maximum efficiency being located in Box.sub.22 of said 3×3, 9-box matrix having the depicted configuration plotting ideal bath temperature as the ordinate as a function of AlF.sub.3 concentration ranges as the abscissa; said functional 3×3, 9-box matrix being that depicted in FIG. 2 hereof, which is hereby incorporated herein by reference as if set forth identically herein.

18. The method defined in claim 17 wherein said various process control mechanisms are connected in series to smelter groups 1 through 4 through n, which host the smelting process, and which comprise machine heterogeneity, said smelter groups being connected in series respectively to a sensor type functional block comprising a sensor type 1, a sensor type 2, a sensor type j, a sensor type m, through to sensor n, each said sensor collecting one of the feature type data source measurements selected from the power and resistance variables, noise related process variables, various feeding parameters and various chemical contents; and the plurality of process variables resulting from the sensor measurement in a feature heterogeneity stage is divided into different types of said variables as taken from different sources; the data from said feature heterogeneity stage is transmitted to a data warehouse which is a functional block that stores said variable process data, measures said 3×3, 9-box control variables and predicts the 3×3, 9-box control variables; the output of the data warehouse is transmitted to a predictive modeler that is a functional block that uses a trained model that processes a plurality of feature types and data sources and, based upon historical process data and measurements of temperature and AlF.sub.3, predicts control measurements of bath temperature and concentration of AlF.sub.3; the data obtained from said trained model is transmitted to said 3×3, 9-box matrix control that issues a series of control signals based upon the range of temperature and AlF.sub.3 concentration.

19. The method defined in claim 18 wherein said modeling approach comprises the following steps: train a random forest three class Model 1 using all training test data developed and stored in said data warehouse; train a binary classification Model 2 using training data belonging to a class high (H) and medium (M); apply said 3-class Model 1 to said test data; if predicted low beneath Box.sub.22, apply Model 2, output is low or medium; if predicted medium, output is medium, if predicted high above Box.sub.22 apply Model 3, output is high or medium.

20. The method defined in claim 19 wherein triple types of complex heterogeneity embedded in the data using said predictive modeling approach with respect to Model 2 and Model 3 is determined using a regularized non-negative matrix triple factorization operation based upon the depicted matrix in FIG. 6 hereof, which is hereby incorporated herein by reference as if set forth identically herein; wherein instance feature data ({tilde over (X)}.sub.ij) is decomposed into 3 non-negative matrices; and wherein instance label data (Y.sub.i) is decomposed into 3 non-negative matrices; the task relatedness, view/feature type, consistence and label correlations are regularized, subject to the limitations wherein: with respect to task relatedness, the jth view feature type, the decomposition of ({tilde over (X)}.sub.ij) in different tasks shares the same feature encoding matrix C.sub.j; with respect to view/feature consistence, for the ith task, the decomposition of ({tilde over (X)}.sub.ij) in different views shares the same feature encoding matrix R.sub.i; with respect to the label correlations, the labels share the same label encoding matrix C.sub.Y across different tasks; M.sub.ij models the correlations between instance clusters and feature clusters; M.sub.iY models the correlations between instance clusters and label clusters; said method simultaneously decomposes the instance-feature and instance label matrices, while concurrently enforcing task relatedness, feature type consistency and label correlations on said data.

21. The method defined in claim 20 wherein said hidden correlations among said heterogeneous data is governed according to:

$$\min_{\{R, M, C\} \geq 0} \; \sum_{i=1}^{T} \sum_{j=1}^{V} \left\| \tilde{X}_{ij} - \tilde{R}_{i} M_{ij} C_{j}^{T} \right\|_{F}^{2} + \alpha \sum_{i=1}^{T} \left\| Y_{i} - R_{i} M_{iY} C_{Y}^{T} \right\|_{F}^{2} + \beta \sum_{i=1}^{T} \sum_{j=1}^{V} \left\| M_{ij} - M_{iY} \right\|_{F}^{2}$$

wherein T is the number of tasks, V is the number of views, M is the number of labels, respectively, F represents Frobenius norm and α and β are regularization coefficients.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0093] FIG. 1 is a side view of an electrolytic smelter.

[0094] FIG. 2 depicts a 3×3, 9-box matrix that embodies the predictive modeling approach of the present invention.

[0095] FIG. 3 is a block diagram of the four types of domain control that are based on a range combination of temperature and metal additive such as Aluminum Fluoride.

[0096] FIG. 4 is a flow diagram of an overview of the elements, methods and systems that are embodied in implementing the present invention.

[0097] FIG. 5 depicts the predictive modeling approach of the present invention.

[0098] FIG. 6 is a more detailed predictive modeling approach relating to two of the models disclosed in FIG. 5.

[0099] FIG. 7 depicts the expression for formulating the two models of FIG. 6.

DESCRIPTION OF THE PREFERRED EMBODIMENT

[0100] The system and predictive modeling technique for smelting process control is based upon multi-source information with triple types of heterogeneity and improves the efficiency of, preferably, an aluminum smelting process.

[0101] The triple types of heterogeneity comprise chamber/machine, source of features/process variables and multiple correlated data instance labels. The present invention addresses a novel setting in model-based predictions where each pot in the smelting manufacturing process represents a single modeling task that predicts whether a temperature and/or additive (aluminum fluoride) in the system should be raised or lowered or remain unchanged, based on historical data.

[0102] It is understood that the methods and systems discussed in the present disclosure include some conventional structures and/or steps. Since these structures and steps are well known in the art, they will only be discussed in a general level of detail.

[0103] FIG. 1 is a cross section of an aluminum producing pot 1 containing pre-baked carbon anodes used in the Hall-Heroult smelting process. The pot depicted is one of many that are connected electrically in series to form a potline.

[0104] In each pot, direct current passes from carbon anodes 2, 2, through a cryolite/alumina molten bath solution 3 containing alumina, to the carbon cathode cell 4 and then to the anodes of the next pot and so on down the line. Steel bars 5 embedded in cathode 4 carry the current out of pot 1, while the other pots in the line are connected through an aluminum bus-bar system (not shown). Pot 1 further consists of a steel shell base 6 in which carbon cathode lining 8 is housed. Insulation layer 7 is sandwiched between steel shell base 6 and cathode 4. Lining 8 holds the molten cryolite bath 3, in which alumina 9, added to the bath via hopper 10, is in solution; molten aluminum 11 is created in the process. An electrically insulated superstructure (not shown), mounted above the shell, stores alumina 9, which is delivered automatically via a sealed system, and holds the carbon anodes 2, 2, suspending them in the pot.

[0105] Aluminum fluoride is an essential additive in the smelting process as, inter alia, it must be added to the bath to match the soda content of the consumed alumina in order to maintain the optimal sodium fluoride/aluminum fluoride ratio. The small percentage of calcium oxide as a normally occurring impurity in the alumina is sufficient to maintain the desired concentration for calcium fluoride (fluorspar).

[0106] The electrolyte 3, which fills the space between anodes 2, 2 in the pot, consists of molten cryolite containing dissolved alumina. A solid crust 13 forms at the surface of the electrolyte.

[0107] The crust is broken periodically by crustbreaker 14 and alumina 9 is stirred into the electrolyte to maintain the alumina concentration.

[0108] As the electrolytic reaction proceeds, aluminum metal, which is slightly denser than the pot bath material, is continuously deposited in a metal pool 11 on the bottom of the pot while oxygen reacts with the carbon material of the anodes to form oxides of carbon. As the anodes are consumed during the process, they must be continuously lowered to maintain a constant distance between the anode and the surface of the metal, which electrically is part of the cathode. The anodes are replaced on a regular schedule.

[0109] The volatilized fluorides and gaseous hydrogen fluoride are collected with other gases evolved from the pots by gas-collecting hoods or manifolds in a chamber 15 and are passed through ducts 16 to central gas treatment and recovery facilities (not shown). For gaining access to the interior of the pot, there is a removable gas collection hood 17 continuing upward and arcuately from steel shell base 6.

[0110] The present invention focuses on two critical parameters that must be closely monitored for the instant process to proceed efficiently.

[0111] First, as electrolysis progresses, the aluminum oxide content of the bath decreases and is intermittently replenished by feed additions from the pot's alumina storage to maintain the dissolved oxide content at about 2 to 5 percent. If the alumina concentration falls to about 1.5 to 2 percent, the phenomenon of anode effect (described above) may occur. The sensors monitoring the pots collect hundreds of variables that reflect the process state, control signals and control responses, which allow the predictive modeling approach in accordance with the present invention to ensure that the bath has a proper alumina concentration.

[0112] Second, as noted above, a bath composition temperature of between about 920 and 980° C. is a critical factor in the aluminum production process. To reduce the melting point of the bath (pure cryolite melts at 1009° C.), the bath contains fluorspar and some excess aluminum fluoride, which, along with the dissolved alumina, reduce the melting temperature sufficiently to permit the pots to be operated in the 920 to 980° C. range. Reduced operating temperature improves pot efficiency (the CE/EE). Temperature of the bath in the proper range is obtained by adjustment using the predictive modeling method, which effectively mines the hidden correlation among the heterogeneous data and thus improves the prediction accuracy.

[0113] The lynchpin element that ultimately controls the predictive modeling approach of the present invention is a 9-box (3×3) matrix. The present invention uses a generalized framework to determine complex heterogeneity embedded in the data recorded by sensors associated with the pots.

[0114] The quest to determine the most efficient temperature range and concentration of AlF.sub.3 is solved by applying a regularized non-negative triple factorization to the data provided by the sensors. The regularized non-negative triple factorization simultaneously decomposes the instance-feature and instance label matrices while enforcing the task relatedness, feature type consistency and label correlations on the supplied data.
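As a rough illustration of the factorization described in paragraph [0114], the following Python sketch evaluates an objective of the form recited in the claims: a reconstruction error for each instance-feature block, a reconstruction error for each instance-label matrix, and a penalty tying the two correlation matrices together. It is a sketch under stated assumptions, not the patented implementation. The function names, the argument layout, the coefficient names `alpha` and `beta`, the use of a single instance encoding matrix per task, and the requirement that feature clusters and label clusters have the same count (so that M.sub.ij and M.sub.iY can be subtracted) are all assumptions made for illustration. The sketch only evaluates the objective; it does not perform the iterative non-negative updates.

```python
import numpy as np


def tri_factorization_objective(X, Y, R, M, C, M_Y, C_Y, alpha=1.0, beta=1.0):
    """Evaluate a regularized tri-factorization objective (illustrative only).

    X[i][j] : (n_i, d_j) instance-feature block for task (pot) i, view j
    Y[i]    : (n_i, m)   instance-label matrix for task i
    R[i]    : (n_i, k)   instance encoding for task i, shared across views
    M[i][j] : (k, c)     instance-cluster / feature-cluster correlations
    C[j]    : (d_j, c)   feature encoding for view j, shared across tasks
    M_Y[i]  : (k, c)     instance-cluster / label-cluster correlations
    C_Y     : (m, c)     label encoding, shared across tasks
    alpha, beta : assumed names for the two regularization coefficients
    """
    total = 0.0
    for i in range(len(X)):
        for j in range(len(X[i])):
            # Reconstruction error of the instance-feature block.
            total += np.linalg.norm(X[i][j] - R[i] @ M[i][j] @ C[j].T, "fro") ** 2
            # Penalty keeping feature and label correlations close.
            total += beta * np.linalg.norm(M[i][j] - M_Y[i], "fro") ** 2
        # Reconstruction error of the instance-label matrix.
        total += alpha * np.linalg.norm(Y[i] - R[i] @ M_Y[i] @ C_Y.T, "fro") ** 2
    return total


def exact_example(seed=0, n_tasks=2, dims=(4, 3), n=5, k=2, c=2, m=3):
    """Build synthetic non-negative factors and data they reproduce exactly."""
    rng = np.random.default_rng(seed)
    C = [rng.random((d, c)) for d in dims]
    C_Y = rng.random((m, c))
    R = [rng.random((n, k)) for _ in range(n_tasks)]
    M_Y = [rng.random((k, c)) for _ in range(n_tasks)]
    M = [[M_Y[i].copy() for _ in dims] for i in range(n_tasks)]
    X = [[R[i] @ M[i][j] @ C[j].T for j in range(len(dims))] for i in range(n_tasks)]
    Y = [R[i] @ M_Y[i] @ C_Y.T for i in range(n_tasks)]
    return X, Y, R, M, C, M_Y, C_Y
```

With factors that reproduce the data exactly and identical correlation matrices, every term vanishes; perturbing any data block makes the objective positive, which is the quantity an iterative solver would drive down.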

[0115] FIG. 2 is a 9-box matrix that plots temperature as a function of AlF.sub.3 concentration, which is explained in greater detail hereinafter.

[0116] Hundreds of process variables that reflect the process state are received by controls that comprise power and resistance control, noise control, alumina feed control and chemical combination control. The data received in the aforementioned controls are fed into the 9-box matrix control to predict any needed change in temperature and/or AlF.sub.3.
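The mapping from a temperature and AlF.sub.3 reading to one of the nine boxes can be sketched as follows. This is a minimal illustration only. The temperature band reuses the 920 to 980° C. operating range quoted in this description as the medium (M) band, which is an assumption; the AlF.sub.3 band limits are hypothetical placeholder values; and the box numbering (first index for temperature, second for AlF.sub.3, with 1 = low, 2 = medium, 3 = high) and the action names are likewise assumptions chosen so that Box.sub.22 is the optimal cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    """Closed interval treated as the medium (M) range of a variable."""
    low: float
    high: float

    def level(self, value: float) -> str:
        if value < self.low:
            return "L"
        if value > self.high:
            return "H"
        return "M"


# Assumption: the quoted operating range is used as the medium band.
TEMP_BAND = Band(920.0, 980.0)   # degrees C
# HYPOTHETICAL limits, not taken from the source.
ALF3_BAND = Band(10.0, 12.0)     # percent

_INDEX = {"L": 1, "M": 2, "H": 3}
_ACTION = {"L": "increase", "M": "hold", "H": "decrease"}


def nine_box(temp_c: float, alf3_pct: float) -> dict:
    """Return the box label and the suggested direction for each variable."""
    t = TEMP_BAND.level(temp_c)
    a = ALF3_BAND.level(alf3_pct)
    return {
        "box": f"Box{_INDEX[t]}{_INDEX[a]}",
        "temperature": _ACTION[t],
        "alf3": _ACTION[a],
    }
```

For instance, `nine_box(950.0, 11.0)` falls in `Box22` and suggests holding both variables, while a hot bath with low AlF.sub.3 lands in a corner box and suggests lowering the temperature and raising the additive.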

[0117] FIG. 3 is a block diagram of the four types of domain control that are based on the range combination of temperature and Aluminum Fluoride. Each control defined therein possesses specific data unique to the control box that is fed into the 9-box matrix control.

[0118] FIG. 4 is a flow diagram of an overview of the elements, methods and systems that are embodied in implementing the present invention. Control mechanism 101 is a functional block of various control mechanisms that issues variable control signals comprising a power control for voltage, current, etc.; a noise control for a change of resistance, heat, etc.; a feed control for feeding the alumina raw material; and a chemical control for defining the various alloys, such as Na, Si, Fe, etc.

[0119] The control signals from Control mechanism 101 are fed to a line of pots in smelters comprising Smelter Group 102, Smelter Group 103, Smelter Group 104 and Smelter Group 105. These smelters collectively embody what is referred to herein as machine heterogeneity. In processing machine heterogeneity in the system, Applicants use MTL (discussed above), wherein a task is a model associated with the features (process variables) and label/target from a single source, in this case, a pot.

[0120] At this stage, the present invention uses MTL, as described above, as an approach to machine learning that learns a particular problem together with other related problems at the same time, using the shared representation noted.

[0121] Signals from Smelter Groups 102, 103, 104 and 105 are fed to sensors 106 (type 1), 107 (type 2), 108 (type j) and 109 (type m), respectively. The sensors designated types 1, 2, j and m are functional blocks that collect different types of process data from diverse sources. These sensor sources 106 to 109 respectively comprise feature type data source 1, consisting of power and resistance data; feature type data source 2, consisting of noise related process variables; feature type data source k, consisting of feed parameters; and feature type data source K, consisting of various chemical contents.

[0122] The data measurements in sensor type 1, 106, sensor type 2, 107, sensor type 3, 108 and sensor type 4, 109 are transmitted to feature type/data source entities comprising source 1, 110 (power and resistance), source 2, 111 (e.g., noise), source k, 112 (e.g., feed parameters) and source K, 113 (e.g., chemicals) respectively. The feature type/data sources depicted collectively embody what is referred to herein as feature heterogeneity. At this stage of the process, multi-view learning is in place. The feature type/data source stored in 110, 111, 112 and 113 is transmitted to data warehouse 114 which operates in collaboration with predictive modeler 115 to predict the optimal data settings for process temperature and AlF.sub.3.

[0123] The data relating to temperature 116 and alumina fluoride concentration 117 is then fed to the 9-box control 118 which is a functional block that issues a series of signals each of which is based upon, after application of an algorithm, the predicted range of temperature 116 and alumina fluoride concentration 117, designated in the 9-box matrix as ranges L, M and H, (e.g., Low, Moderate, High).

[0124] The 9 boxes comprising matrix 118 shown in FIG. 4 define a functional block relationship between the ordinate process temperature and the abscissa AlF.sub.3 concentration. The functional relationships are defined at three levels within the boxes in the matrix.

[0125] The predictive model is, as noted, a functional block that uses a trained model, takes the various feature types/data sources noted, and predicts the control variables, temperature and additive, the additive preferably being AlF.sub.3 in the case of preparing Al metal. The model is trained based upon historical process data and measurements of temperature and additive (e.g., AlF.sub.3).

[0126] The data obtained pursuant to the steps detailed above are used to construct a predictive relationship in the predictive model via a training data set. Most approaches that search through training data for empirical relationships tend to over-fit the data, meaning that they identify apparent relationships in the training data that do not hold in general. A test set is therefore prepared in the present invention, which is a set of data that is independent of the training data but follows the same probability distribution as the training data. If a model fit to the training set also fits the test set well, minimal overfitting has taken place. A markedly better fit on the training set than on the test set usually points to overfitting.
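By way of illustration only, the comparison between training fit and test fit described above can be sketched as follows. The function names `accuracy` and `overfit_gap`, the toy temperature data and the memorizing stand-in model are assumptions introduced for this sketch and are not part of the disclosed system.

```python
def accuracy(predict, samples):
    """Fraction of (features, label) pairs that `predict` labels correctly."""
    correct = sum(1 for features, label in samples if predict(features) == label)
    return correct / len(samples)


def overfit_gap(predict, train_samples, test_samples):
    """Training accuracy minus test accuracy; a large positive gap suggests overfitting."""
    return accuracy(predict, train_samples) - accuracy(predict, test_samples)


# Toy data (hypothetical): one feature (bath temperature, deg C) and an L/M/H label.
train = [(945.0, "L"), (950.0, "L"), (960.0, "M"), (970.0, "M"), (980.0, "H")]
test = [(948.0, "L"), (958.0, "M"), (965.0, "M"), (975.0, "H")]

# A stand-in "model" that memorizes the training pairs and guesses "M" otherwise.
memory = dict(train)
memorizer = lambda temperature: memory.get(temperature, "M")

gap = overfit_gap(memorizer, train, test)
print(f"train/test accuracy gap: {gap:.2f}")
```

The memorizing model is perfect on the training set but right only half the time on the independent test set, which is the pattern the paragraph above identifies as overfitting.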

[0127] There are a number of methods of learning from multi-view data by considering the diversity of different views. These views may be obtained from multiple sources or different feature subsets. There are a number of representative multi-view learning algorithms in different areas that have been classified into three groups: 1) co-training, 2) multiple kernel learning, and 3) subspace learning.

[0128] Co-training style algorithms train alternately to maximize the mutual agreement on two distinct views of the data; multiple kernel learning algorithms exploit kernels that naturally correspond to different views and combine kernels either linearly or non-linearly to improve learning performance; and subspace learning algorithms aim to obtain a latent subspace shared by multiple views by assuming that the input views are generated from this latent subspace.

[0129] Due to different control mechanisms, in this feature heterogeneity step in the process, hundreds of process variables are divided into different types or are considered from different sources.
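A minimal sketch of such a division of process variables into feature types (views) is given below. The variable names, the prefix convention and the helper `partition_into_views` are hypothetical and serve only to illustrate the grouping into the four feature types named in this description.

```python
from collections import defaultdict

# Hypothetical naming convention: the text before the first "_" identifies the view.
VIEW_OF_PREFIX = {
    "pwr": "power_resistance",
    "noise": "noise",
    "feed": "feed",
    "chem": "chemical",
}


def partition_into_views(variable_names):
    """Group process-variable names by feature type (view)."""
    views = defaultdict(list)
    for name in variable_names:
        prefix = name.split("_", 1)[0]
        views[VIEW_OF_PREFIX[prefix]].append(name)
    return dict(views)


variables = ["pwr_voltage", "pwr_resistance", "noise_index",
             "feed_rate", "chem_na", "chem_fe"]
views = partition_into_views(variables)
for view_name, members in views.items():
    print(view_name, members)
```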

[0130] The process data collected at the feature type/data sources are collectively fed to data warehouse 114 which is a functional block that stores valuable process data, measurement of the 9-box control variables and a prediction of 9-box control variables. This segment of the process is referred to as Label heterogeneity and multi label learning.

[0131] From data warehouse 114, the refined data is transmitted to predictive modeler 115, which is a functional block that takes various feature types/data sources, synthesizes same and predicts control variables, temperature 116 and alumina fluoride concentration 117.

[0132] The predictive model is, as noted, a functional block that uses a trained model, takes the various feature types/data sources noted, and predicts the control variables, temperature and additive, the additive preferably being AlF.sub.3 in the case of preparing Al metal. The model is trained based upon historical process data and measurements of temperature and additive (e.g., AlF.sub.3).

[0133] A test set is prepared, which is a set of data that is independent of the training data but follows the same probability distribution as the training data. If a model fit to the training set also fits the test set well, minimal overfitting has taken place. A markedly better fit on the training set than on the test set usually points to overfitting.

[0134] FIG. 2 is a more detailed depiction of the 3×3, 9-box grid 118 of FIG. 4, plotting a specific temperature range along the X axis as a function of AlF.sub.3 concentration along the Y axis, and includes consecutive identifying numbers (1 to 9) for each box or zone, with the ranges covered within the corresponding limits of each numbered zone. In the 9-box matrix, with the numbers listed adjacent the axes, both temperature and alumina fluoride have preferred target ranges that provide for a maximum efficiency of the smelting process (i.e., ranges at which temperature and AlF.sub.3 concentration synergistically combine to provide an absolute maximum result in improving process control and energy efficiency for a smelting process).

[0135] In accordance with the present invention, the preferred target temperature zone, as depicted in FIG. 3, is 965 ± 8° C., which is in the medium temperature zone (M), zone 5 of the 9-box matrix (row 2, column 2). In FIG. 3, 941–957° C. is the low temperature zone (L), and 973–989° C. is the high temperature zone (H).

[0136] Similarly, low, medium and high alumina fluoride zones are defined based on its target value, which is also located in zone 5 (10.80–12.80).

[0137] Ideally, the pot conditions are controlled within medium temperature and medium alumina fluoride both of which are in zone 5 in the matrix at row 2, column 2.
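The zone lookup implied by the ranges above can be sketched as follows. This is only an illustration: the row-major numbering (rows indexed by temperature level, columns by alumina fluoride level), the treatment of values exactly on a boundary, and the names `level` and `nine_box_zone` are assumptions. The text fixes only that zone 5 sits at row 2, column 2, with a medium temperature band of 957 to 973° C. and a medium alumina fluoride band of 10.80 to 12.80.

```python
TEMP_EDGES = (957.0, 973.0)   # deg C: below -> L, within -> M, above -> H
ALF3_EDGES = (10.80, 12.80)   # alumina fluoride: medium band taken from the text


def level(value, edges):
    """Return 0, 1 or 2 for the low, medium or high band (boundaries count as medium)."""
    low_edge, high_edge = edges
    if value < low_edge:
        return 0
    if value <= high_edge:
        return 1
    return 2


def nine_box_zone(temperature, alf3):
    """Return (zone number 1..9, temperature band, alumina fluoride band)."""
    row = level(temperature, TEMP_EDGES)
    col = level(alf3, ALF3_EDGES)
    return 3 * row + col + 1, "LMH"[row], "LMH"[col]


print(nine_box_zone(965.0, 11.8))   # target conditions fall in zone 5
print(nine_box_zone(980.0, 10.0))   # hot bath with low alumina fluoride
```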

[0138] In the present invention, as noted above, there are various control mechanisms that are essential in implementing the process.

[0139] Among the control mechanisms there are a number of functional blocks. As noted above, there is a functional block that issues variable control signals relating to power control, (voltage, current), noise control, (change of resistance, heat, etc.), feed control (supplying the alumina dirt to the pots) and chemical control, (monitoring the formation of various alloys, such as Na, Si, Fe, etc.). In addition there is a functional block that hosts the metal (aluminum) smelting process in the smelter groups. There are hundreds of smelter pots in series that have diverse structures that are significantly different due to generation and design.

[0140] There is also a functional block that collects different types of process data such as power related process variables, noise related process variables, various raw material feeding parameters, various chemical contents, etc.

[0141] As a result of the various different control mechanisms enumerated above, hundreds of process variables can be divided into types or considered from different sources.

[0142] The data warehouse 114 is a functional block that stores variable process data, measures the 9-box control variables and predicts the 9-box control variables.

[0143] The predictive modeler 115 is a functional box that uses a trained model, takes the various feature types/data sources, and predicts control variables, temperature and alumina fluoride. The model is trained based upon historical process data and measurements of temperature and alumina fluoride.

[0144] Finally, the 9-box control as described above is designated as a functional block that issues a series of control signals based upon the range of temperature and alumina fluoride as described above and illustrated in FIG. 2.

[0145] FIG. 5 depicts the initial predictive modeling approach 201. The first step in this approach is a training step, which uses a set of examples for learning, that is, to fit the parameters (i.e., weights) of the classifier. Referring to the 9-box matrix 118 depicted in FIG. 4, Model 1 (202) is a 3-class model: L/M/H; Model 2 (203) is a 2-class model: L/M; and Model 3 (204) is a 2-class model: H/M.

[0146] In Step 1 (training), Model 1 (202) trains a 3-class classification model, which comprises a random forest, using all of the training data. This is illustrated in FIG. 2 by the arrows extending from Model 1 (202) to the entries Low, Medium and High set forth in Step 2: testing.

[0147] Model 2 (203) of Step 1 trains a binary classification model. Model 2 uses training data belonging to the two classes low and medium (L/M). This is illustrated in FIG. 2 by the arrow extending from Low in Step 2, testing to Step 1, Model 2, 2-class: L/M.

[0148] Model 3 (204) of Step 1 trains a binary classification model. Model 3 uses training data belonging to the two classes high and medium (H/M). This is illustrated in FIG. 2 by the arrow extending from High in Step 2, testing to Step 1, Model 3, 2-class: H/M.

[0149] In Step 2, final, the 3-class model 1 (202) is applied to the L/M/H test data for a predictive result. If the prediction is low, Model 2 (203) is applied and the output is low or medium.

[0150] In Step 2, final, the 3-class model 1 (202) is applied to the L/M/H test data for a predictive result. If the prediction is medium, no further model is applied and the output is medium.

[0151] In Step 2, final, the 3-class model 1 (202) is applied to the L/M/H test data for a predictive result. If the prediction is high, Model 3 (204) is applied and the output is high or medium.
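The two-stage decision logic of Step 2 can be sketched as below. The three models are passed in as plain callables. The threshold rules used here are hypothetical stand-ins chosen only so the example runs (the description states that Model 1 comprises a random forest), and `cascade_predict` is a name introduced for this sketch.

```python
def cascade_predict(x, model1, model2, model3):
    """Apply the 3-class model first, then refine L or H with the matching binary model."""
    first = model1(x)
    if first == "L":
        return model2(x)   # binary L/M model: output is "L" or "M"
    if first == "H":
        return model3(x)   # binary H/M model: output is "H" or "M"
    return "M"             # a medium prediction is output unchanged


# Hypothetical stand-in models operating on a single temperature reading (deg C).
model1 = lambda t: "L" if t < 959.0 else ("H" if t > 971.0 else "M")
model2 = lambda t: "L" if t < 957.0 else "M"
model3 = lambda t: "H" if t > 973.0 else "M"

for reading in (950.0, 958.0, 965.0, 972.0, 980.0):
    print(reading, cascade_predict(reading, model1, model2, model3))
```

In this sketch the readings 958.0 and 972.0 are first labeled low and high by the 3-class stand-in and are then corrected to medium by the binary models, which is the refinement the cascade is meant to provide.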

[0152] FIG. 6 presents the next step in the predictive modeling approach. It considers Model 2 (203) and Model 3 (204) of FIG. 5 and decomposes the instance-feature data ({tilde over (X)}.sub.ij) into 3 non-negative matrices ({tilde over (R)}.sub.i, M.sub.ij, C.sub.j). The next step is to decompose the instance-label data (Y.sub.i) into 3 non-negative matrices ({tilde over (R)}.sub.i, M.sub.iY, C.sub.Y). The final step is to regularize task relatedness, view (feature type) consistence and label correlations.

[0153] As has been stated above, a critical feature of the present invention resides in employing a non-negative matrix triple factorization calculation that: models task relatedness by requiring that different feature types across different tasks share the same feature clustering coefficients; enhances feature type consistency by requiring that the instances (i.e., data points) share the same instance clustering coefficients across different feature types; and models the label correlations by requiring that the labels share the same label clustering coefficients across different tasks.

[0154] Matrix factorization or matrix decomposition refers to the transformation of a given matrix into a right-hand-side product of canonical matrices.

[0155] Non-negative matrix factorization (NMF), as used in the present invention, is a group of algorithms in multivariate analysis and linear algebra where a matrix V is factorized into (usually) two matrices W and H, with the property that all three matrices have no negative elements. This non-negativity makes the resulting matrices easier to inspect. Non-negativity is inherent to the data being considered.
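For orientation, a two-factor NMF of the kind described in this paragraph can be sketched with the widely used multiplicative update rules of Lee and Seung. This is a generic textbook technique shown under stated assumptions. It is not asserted to be the solver used in the present invention, and the function name `nmf`, the iteration count and the small constant `eps` are choices made for this sketch.

```python
import numpy as np


def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V by W @ H with W >= 0 and H >= 0."""
    rng = np.random.default_rng(seed)
    n, d = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, d)) + eps
    for _ in range(iters):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# A small non-negative matrix of exact rank 1 (each row is a multiple of [1, 2, 3]).
V = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0]])
W, H = nmf(V, rank=1)
error = np.linalg.norm(V - W @ H)
print("Frobenius reconstruction error:", error)
```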

[0156] Since predictive modeling approach problems are not exactly solvable in general, they are commonly approximated numerically.

[0157] The purposes of matrix factorization as employed in the present invention involve two aspects: computational convenience and analytic simplicity. Generally, it is not feasible to carry out most matrix computations, such as matrix inversion or the matrix determinant, in an optimal explicit way. Converting a difficult matrix computation problem into several easier tasks, such as solving a triangular or diagonal system, therefore greatly facilitates the calculations. Data matrices representing numerical observations, such as a proximity matrix or a correlation matrix, are often huge and hard to analyze. Decomposing the data matrices into lower-order or lower-rank canonical forms reveals the inherent characteristics and structure of the matrices and helps to interpret their meaning.

[0158] With respect to task relatedness, for the j.sup.th view/feature type, the decomposition of ({tilde over (X)}.sub.ij) in different tasks, shares the same feature encoding matrix (C.sub.j). As to view feature consistence, for the i.sup.th task, the decomposition of ({tilde over (X)}.sub.ij) in different views share the same instance encoding matrix R.sub.i. Referring to label correlations, the labels share the same label encoding matrix C.sub.Y across different tasks.

[0159] Finally, M.sub.ij models the correlation between the instance clusters and the feature clusters and M.sub.iY, models the correlation between the instance clusters and the label clusters. The next step is to solve the objective function iteratively.

[0160] Without loss of generality, assume that there are two tasks and two views.

[0161] The view consistence is modeled by sharing the instance encoding matrix R.sub.1 (or R.sub.2) across different feature types; the task relatedness is modeled by sharing the feature encoding matrix C.sub.1 (or C.sub.2) across different tasks; the label correlation, is modeled by sharing the label encoding matrix C.sub.Y across different tasks.

[0162] An instance as used herein is a single object of the world from which a model is learned, or on which a model will be used (e.g., for prediction). In most machine learning work, instances are described by feature vectors, each feature (attribute) being a quantity describing an instance. An attribute as used herein has a domain defined by the attribute type, which denotes the values that can be taken by an attribute.

[0163] The following domain types are common:

[0164] a. Categorical. A finite number of discrete values. The type nominal denotes that there is no ordering between the values, such as last names and colors. The type ordinal denotes that there is an ordering, such as in the present invention, an attribute taking on the values low, medium, or high.

[0165] b. Continuous (quantitative). The subset of real numbers, where there is a measurable difference between the possible values. For the most part, integers are treated as continuous in practical problems.

[0166] More specifically, referring to FIG. 6, the instance-feature data X.sub.11, X.sub.12, X.sub.21 and X.sub.22 are each approximately decomposed via non-negative matrix factorization into three matrices: R.sub.1, M.sub.11 and C.sub.1; R.sub.1, M.sub.12 and C.sub.2; R.sub.2, M.sub.21 and C.sub.1; and R.sub.2, M.sub.22 and C.sub.2, respectively, with the Cs being shared across tasks and the Rs being shared across feature types. Instance-label data Y.sub.1 is also decomposed into R.sub.1, M.sub.1Y and C.sub.Y; and Y.sub.2 is decomposed into R.sub.2, M.sub.2Y and C.sub.Y.
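The sharing pattern for two tasks and two views can be made concrete with a short shape check. All sizes below (numbers of instances, features, labels and latent dimensions) are hypothetical, and the factors are filled with random non-negative values only to show which factor is reused where.

```python
import numpy as np

rng = np.random.default_rng(0)
n = {1: 6, 2: 5}      # instances per task (hypothetical sizes)
d = {1: 4, 2: 3}      # features per view (hypothetical sizes)
p, q, m = 2, 2, 3     # instance latent size, feature/label latent size, number of labels

R = {i: rng.random((n[i], p)) for i in n}               # one per task, shared across views
C = {j: rng.random((d[j], q)) for j in d}               # one per view, shared across tasks
C_Y = rng.random((m, q))                                # one label encoding, shared across tasks
M = {(i, j): rng.random((p, q)) for i in n for j in d}  # one per (task, view) pair
M_Y = {i: rng.random((p, q)) for i in n}                # one per task

# Each data block is approximated by a product of three non-negative factors.
X_hat = {(i, j): R[i] @ M[i, j] @ C[j].T for i in n for j in d}
Y_hat = {i: R[i] @ M_Y[i] @ C_Y.T for i in n}

for key, block in X_hat.items():
    print("X", key, block.shape)
for key, block in Y_hat.items():
    print("Y", key, block.shape)
```

Note that X.sub.11 and X.sub.12 reuse the same R.sub.1, while X.sub.11 and X.sub.21 reuse the same C.sub.1, which is how view consistency and task relatedness enter the model.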

[0167] The general representation of elements depicted in FIG. 6 identified above is as follows:

[0168] Let

$$\tilde{X}_{ij} = \begin{pmatrix} X_{ij} \\ X_{ij}^{u} \end{pmatrix} \in \mathbb{R}^{n_i \times d_j}$$

be the instance-feature matrix for the i.sup.th task and j.sup.th view, where X.sub.ij is the training data and X.sub.ij.sup.u is the test data.

[0169] Let

$$\tilde{R}_{i} = \begin{bmatrix} R_{i} \\ R_{i}^{u} \end{bmatrix} \in \mathbb{R}^{n_i \times p}$$

be the instance encoding matrix where p is the dimensionality of the instance latent space, R.sub.i and R.sub.i.sup.u are for training and test data, respectively.

[0170] Let $C_j \in \mathbb{R}^{d_j \times q}$ be the feature encoding matrix and $C_Y \in \mathbb{R}^{m \times q}$ be the label encoding matrix, where q is the dimensionality of the feature (or label) latent space. Each row in $\tilde{R}_i$ (or $C_j$, $C_Y$) represents the coefficients of the instance (or feature, label) associated with the instance (or feature, label) clusters.

[0171] $M_{ij} \in \mathbb{R}^{p \times q}$ and $M_{iY} \in \mathbb{R}^{p \times q}$ are denoted as the co-latent space matrices.

[0172] Models 2 & 3 are formulated in the following expression as set forth in FIG. 7:


$$\min_{(R,M,C)\,\geq\,0}\ \sum_{i=1}^{T}\sum_{j=1}^{V}\left\|\tilde{X}_{ij}-\tilde{R}_{i}M_{ij}C_{j}^{\top}\right\|_{F}^{2}+\alpha\sum_{i=1}^{T}\left\|Y_{i}-R_{i}M_{iY}C_{Y}^{\top}\right\|_{F}^{2}+\beta\sum_{i=1}^{T}\sum_{j=1}^{V}\left\|M_{ij}-M_{iY}\right\|_{F}^{2};$$

[0173] Wherein T is the number of tasks, V is the number of views and m is the number of labels, respectively; $\|\cdot\|_F$ represents the Frobenius norm; the superscript $\top$ denotes the matrix transpose; and $\alpha$ and $\beta$ are regularization coefficients.
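A sketch of how such an objective might be evaluated numerically is given below. It reads the expression as a sum of squared Frobenius reconstruction errors for the instance-feature blocks, a weighted reconstruction error for the instance-label blocks, and a weighted penalty tying each M.sub.ij to M.sub.iY. The coefficient names `alpha` and `beta`, the dictionary layout and the convention that the labeled (training) rows come first in each task are assumptions made for this illustration.

```python
import numpy as np


def sq_fro(A):
    """Squared Frobenius norm of a matrix."""
    return float(np.sum(A * A))


def objective(X, Y, R, M, M_Y, C, C_Y, alpha, beta):
    """Evaluate the regularized tri-factorization objective for given factors."""
    total = 0.0
    for i in sorted(Y):
        n_train = Y[i].shape[0]                    # labeled rows are assumed to come first
        total += alpha * sq_fro(Y[i] - R[i][:n_train] @ M_Y[i] @ C_Y.T)
        for j in sorted(C):
            total += sq_fro(X[i, j] - R[i] @ M[i, j] @ C[j].T)
            total += beta * sq_fro(M[i, j] - M_Y[i])
    return total


# Synthetic data built from known factors, so the objective is zero at those factors.
rng = np.random.default_rng(0)
R = {1: rng.random((6, 2)), 2: rng.random((5, 2))}
C = {1: rng.random((4, 2)), 2: rng.random((3, 2))}
C_Y = rng.random((3, 2))
M_Y = {i: rng.random((2, 2)) for i in R}
M = {(i, j): M_Y[i].copy() for i in R for j in C}
X = {(i, j): R[i] @ M[i, j] @ C[j].T for i in R for j in C}
Y = {i: R[i][:4] @ M_Y[i] @ C_Y.T for i in R}     # first 4 rows of each task are labeled

value_at_truth = objective(X, Y, R, M, M_Y, C, C_Y, alpha=1.0, beta=0.1)
print("objective at the generating factors:", value_at_truth)
```

An iterative solver would alternate non-negative updates of the R, M and C factors and monitor this value until it stops decreasing.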

EXAMPLE

[0174] The method of the present invention as described herein formulated the problem as a regularized non-negative matrix triple factorization problem, which simultaneously decomposed the instance-feature and instance-label matrices, while enforcing task relatedness, feature type consistency and label correlations on the data.

[0175] The endeavor involved 174 process variables that formed 4 views/feature types based upon the process control practice which included power and resistance, noise control, feed control and chemicals. The data used in this example was collected daily via sensors.

[0176] The process was concurrently run in 245 smelters, which are classified in 5 groups (tasks) based upon their design and generation.

[0177] The objective of the endeavor was to predict temperature and alumina fluoride when they are not measured using the 9-box control as depicted in FIGS. 2 and 4.

[0178] The results of collecting data from the sources and solving the objective function iteratively are reproduced in Table 1.

TABLE 1

ALGORITHM         F.sub.1-Score.sup.1   Accuracy.sup.2    Hamming Loss.sup.3
HiMLS.sup.4       0.848 ± 0.003         0.847 ± 0.003     0.140 ± 0.003
HiMLSD.sup.5      0.873 ± 0.006         0.871 ± 0.006     0.115 ± 0.006
L.sup.2F.sup.6    0.773 ± 0.001         0.772 ± 0.001     0.214 ± 0.000
ML-kNN.sup.7      0.845 ± 0.001         0.842 ± 0.000     0.139 ± 0.001
LS-ML.sup.8       0.852 ± 0.000         0.850 ± 0.001     0.131 ± 0.001
TRAM.sup.9        0.383 ± 0.001         0.381 ± 0.001     0.605 ± 0.000

Table 1 Evaluation Metrics
.sup.1F-1 Score: The harmonic mean of precision and recall - the larger the better.
.sup.2Accuracy: For each instance, the proportion of the predicted correct labels to the total number of labels for that instance - the larger the better.
.sup.3Hamming loss: How many times on average the relevance of an instance to a class label is incorrectly predicted - the smaller the better.

Table 1 Algorithms
.sup.4HiMLS: The method of the instant invention applied to regression; i.e., to predict the value of temperature and alumina fluoride concentration.
.sup.5HiMLSD: The method of the instant invention applied to binary classification; i.e., M2 and M3.
.sup.6L2F: The multi-view, multi-label learning method.
.sup.7ML-kNN: Graph based multi-label method.
.sup.8LS-ML: Multi-label method based upon subspace learning.
.sup.9TRAM: Transductive multi-label learning method.
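The three evaluation metrics can be computed along the following lines. The table notes do not fix the exact formulas, so this sketch assumes common example-based multi-label definitions: accuracy as the size of the intersection over the size of the union of true and predicted label sets, F.sub.1 as the per-instance harmonic mean of precision and recall, and Hamming loss as the fraction of instance-label pairs predicted incorrectly. The label names such as "T_M" and "A_L" are hypothetical encodings of the temperature and alumina fluoride bands.

```python
def hamming_loss(true_sets, pred_sets, n_labels):
    """Fraction of instance-label pairs whose relevance is predicted incorrectly."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)


def example_accuracy(true_sets, pred_sets):
    """Mean over instances of |true & predicted| / |true | predicted|."""
    scores = [len(t & p) / len(t | p) if (t | p) else 1.0
              for t, p in zip(true_sets, pred_sets)]
    return sum(scores) / len(scores)


def example_f1(true_sets, pred_sets):
    """Mean over instances of 2 |true & predicted| / (|true| + |predicted|)."""
    scores = [2 * len(t & p) / (len(t) + len(p)) if (t or p) else 1.0
              for t, p in zip(true_sets, pred_sets)]
    return sum(scores) / len(scores)


# Two pots, six possible labels: T_L, T_M, T_H, A_L, A_M, A_H (hypothetical encoding).
truth = [{"T_M", "A_M"}, {"T_H", "A_L"}]
predicted = [{"T_M", "A_M"}, {"T_M", "A_L"}]

print("Hamming loss:", hamming_loss(truth, predicted, n_labels=6))
print("Accuracy:", example_accuracy(truth, predicted))
print("F1:", example_f1(truth, predicted))
```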

[0179] As reflected in Table 1, the method of the present invention in its HiMLSD form achieves the highest F.sub.1-score and accuracy and the lowest Hamming loss among all of the methods compared. The energy efficiency referred to hereinabove is increased by about 10% and the cost savings amount to about $1 million per year.

[0180] The matrix factorization or matrix decomposition of the present invention provides an optimal CE/EE, wherein desired process conditions and various control actions are applied to adjust the bath temperature, bath chemical composition, current and aluminum oxide content.

[0181] In summary, smelting production equipment has heterogeneous relatedness. In one embodiment specific to an aluminum manufacturing environment, the production equipment may consist of multiple pots, et al., to produce liquid aluminum formed ultimately into aluminum ingots. Typical to aluminum manufacturing, there are multiple pots included in multiple lines of pots.

[0182] Accordingly, one liquid aluminum product is produced in one of the multiple pots of one of the multiple lines comprising the production equipment. The production equipment can measure the process variables of each of the pots, which indicate the processing of the aluminum produced by the production equipment. The process variables may be sent to the central database.

[0183] The process variables in accordance with the present invention are directed primarily to temperature and AlF.sub.3 concentration, but may include, without limitation, pressures, gas flow per unit time, etc.

[0184] The central database includes databases, file systems, or other arrangements of data on nonvolatile memory (e.g., hard disk drives, tape drives, optical drives, etc.), volatile memory (e.g., random access memory (RAM)), or a combination thereof. In one embodiment, the central database may receive and store data generated by and sent from the production equipment and/or the advanced process controller. The central database may also receive actual temperature and AlF.sub.3 concentration quality measurements sent from the actual metrology tool.

[0185] The actual wafer quality measurements may be stored by the central database as histories of the actual physical outcomes of the wafers processed by production equipment. In one embodiment, the central database sends the measured process variables and the historical actual wafer quality measurements to the virtual metrology machine to generate a prediction model of the quality of wafers produced by the production equipment.

[0186] The central processor receives the process variables and the historical measurements from the central database and predicts the temperature and AlF.sub.3 concentration for the production equipment based on the process variables and the historical measurements.

[0187] In one embodiment, the pot is connected to the advanced process controller and sends the predicted temperature and AlF.sub.3 (alone or in combination) to the advanced process controller to facilitate and improve control of the production equipment. The predicted temperature and AlF.sub.3 concentration may also be sent by the pot to the central database for storage.

[0188] The advanced process controller may manage some or all operations of the manufacturing environment. In one embodiment, the advanced process controller monitors and controls the

[0189] This illustrates a conventional computer system within which a set of instructions, for causing the machine to perform any one or more of the methodologies and operations discussed herein, may be executed. Computer system includes a bus or other communication mechanism for communicating information, and a processor or processors coupled with bus for processing information.

[0190] Computer system also includes a main memory, such as a random access memory (RAM) or other dynamic storage device, coupled to bus for storing information and instructions to be executed by processor. Main memory also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 404. Computer system further includes a read only memory (ROM) or other static storage device coupled to bus for storing static information and instructions for processor. A storage device such as a magnetic disk or optical disk, is provided and coupled to bus for storing information and instructions.

[0191] Computer system may be coupled via bus to a display, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device, including alphanumeric and other keys, is coupled to bus for communicating information and command selections to processor.

[0192] Another type of user input device is a cursor control, such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to processor and for controlling cursor movement on display. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allow the device to specify positions in a plane.

[0193] Computer system may be used to process all, using equations and principles discussed herein, into usable data.

[0194] The pertinent programs and executable code are contained in main memory and are selectively accessed and executed in response to processor, which executes one or more sequences of one or more instructions contained in main memory. Such instructions may be read into main memory from another computer-readable medium, such as storage device. One or more processors in a multi-processing arrangement may also be employed to execute the sequences of instructions contained in main memory. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions, and it is to be understood that no specific combination of hardware circuitry and software is required.

[0195] The instructions may be provided in any number of forms such as source code, assembly code, object code, machine language, compressed or encrypted versions of the foregoing, and any and all equivalents thereof.

[0196] Computer-readable medium refers to any medium that participates in providing instructions to processor for execution and program product refers to such a computer-readable medium bearing a computer-executable program. The computer usable medium may be referred to as bearing the instructions, which encompass all ways in which instructions are associated with a computer usable medium.

[0197] Computer-readable mediums include, but are not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as storage device. Volatile media include dynamic memory, such as main memory 406. Transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise bus. Transmission media may comprise acoustic or light waves, such as those generated during radio frequency (RF) and infrared (IR) data communications.

[0198] Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.

[0199] Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor for execution. For example, the instructions may initially be borne on a magnetic disk of a remote computer.

[0200] The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector coupled to a bus can receive the data carried in the infrared signal and place the data on bus. Bus carries the data to main memory, from which processor retrieves and executes the instructions. The instructions received by main memory may optionally be stored on storage device either before or after execution by processor.

[0201] Computer system may also include a communication interface coupled to bus to provide a two-way data communication coupling to a network link connected to a local network.

[0202] For example, communication interface may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

[0203] Network link typically provides data communication through one or more networks to other data devices. For example, network link may provide a connection through local network to a host computer or to data equipment operated by an Internet Service Provider (ISP).

[0204] ISP in turn provides data communication services through the worldwide packet data communication network, now commonly referred to as the Internet. Local network and Internet both use electrical, electromagnetic or optical signals that carry digital data streams.