SYSTEM, DEVICE AND METHOD OF DETECTING ABNORMAL DATAPOINTS
20230267368 · 2023-08-24
Inventors
- Christoph Paulitsch (Karlsruhe, DE)
- Vladimir Lavrik (Dreieich, DE)
- Jing Feng (Hohhot, CN)
- Yang Qiao Meng (Beijing, CN)
Abstract
A system, device, and method of detecting at least one abnormal datapoint in operation data (U) associated with an industrial environment (610) are disclosed. The method comprises iteratively applying one or more anomaly detection models (fi) to at least one subset (S) of the operation data (U), wherein the anomaly detection models (fi) are trained based on a training dataset (L) consisting of datapoints labeled as normal; classifying subset-datapoints in the subset (S) as one of normal datapoints (N) and abnormal datapoints (A) using the anomaly detection models (fi); updating the training dataset at least with the normal datapoints; retraining the anomaly detection models (fi) with the updated training dataset after expiration of a threshold time, wherein the threshold time is based on the number of updates to the training dataset; and detecting the at least one abnormal datapoint in the operation data (U) using the retrained anomaly detection models (f′i).
Claims
1. A method of detecting at least one abnormal datapoint in operation data associated with an industrial environment, wherein the operation data comprises historical data and streaming data corresponding to operation of industrial assets in the industrial environment, the method comprising: applying one or more anomaly detection models to at least one subset of the operation data, wherein the anomaly detection models are trained based on a training dataset consisting of datapoints labeled as normal; classifying subset-datapoints in the subset as one of normal datapoints and abnormal datapoints using the anomaly detection models; determining a confidence index in the classification of the subset-datapoints based on at least one of an engineering software input and operation of a comparable industrial environment; re-classifying the subset-datapoints when a confidence threshold is not satisfied; updating the training dataset with the confidence index and the re-classified datapoints; updating the training dataset with the normal datapoints; retraining the anomaly detection models with the updated training dataset after expiration of a threshold time, wherein the threshold time is based on a number of updates to the training dataset; detecting the at least one abnormal datapoint in the operation data using the anomaly detection models; removing the subset from the operation data; detecting the at least one abnormal datapoint in remainder operation data without the subset using the retrained anomaly detection models; applying the retrained anomaly detection models to a new subset of the operation data and classifying new-datapoints in the new subset as one of the normal datapoints and the abnormal datapoints; and updating the training dataset and retraining the retrained anomaly detection models.
2. The method according to claim 1, further comprising: populating an anomaly dataset containing the abnormal datapoints in the subset and the new subset.
3. The method according to claim 2, further comprising: updating the training dataset with the anomaly dataset.
4. The method according to claim 1, further comprising: receiving the training dataset comprising at least the normal datapoints, wherein the normal datapoints are classified based on at least one of an engineering software input and operation of a comparable industrial environment; and training the anomaly detection models based on the training dataset comprising the normal datapoints.
5. (canceled)
6. The method according to claim 1, wherein the one or more anomaly detection models are contained in an anomaly detection pipeline, wherein the anomaly detection pipeline is implemented as an iterative workflow of training, classification and retraining, wherein the anomaly detection models are trained with the training dataset, wherein the anomaly detection models are validated based on classification of the normal datapoints and the abnormal datapoints and wherein the anomaly detection models are retrained based on the updated training dataset.
7. The method according to claim 6, further comprising: generating the anomaly detection pipeline containing the anomaly detection models associated with the industrial environment, wherein the anomaly detection models comprise physics-based models, data-driven models, and a combination thereof; and determining a deviation score for the anomaly detection models based on properties of the anomaly detection models.
8. The method according to claim 7, wherein generating the anomaly detection pipeline further comprises: enabling selection of at least one anomaly detection model for the anomaly detection pipeline based on the deviation score.
9. The method according to claim 6, further comprising: detecting a batch of abnormal datapoints in the operation data using the anomaly detection pipeline, wherein detecting the batch of abnormal datapoints comprises: applying the anomaly detection pipeline to at least one subset of the batch, wherein the anomaly detection pipeline is trained based on the training dataset consisting of the normal datapoints; classifying the subset-datapoints in the subset as one of the normal datapoints and the abnormal datapoints using the anomaly detection pipeline; enlarging the training dataset at least with the normal datapoints; and retraining the anomaly detection pipeline with the updated training dataset after expiration of the threshold time, wherein the threshold time is based on the number of updates to the training dataset.
10. A computing device for detecting at least one abnormal datapoint in operation data associated with an industrial environment, wherein the operation data comprises historical data and streaming data corresponding to operation of industrial assets in the industrial environment, the computing device comprising: a processing unit; and an anomaly module executable by the processing unit comprising computer-readable instructions when executed by the processing unit is configured to: apply one or more anomaly detection models to at least one subset of the operation data, wherein the anomaly detection models are trained based on a training dataset consisting of datapoints labeled as normal; classify subset-datapoints in the subset as one of normal datapoints and abnormal datapoints using the anomaly detection models; determine a confidence index in the classification of the subset-datapoints based on at least one of an engineering software input and operation of a comparable industrial environment; re-classify the subset-datapoints when a confidence threshold is not satisfied; update the training dataset with the confidence index and the re-classified datapoints; update the training dataset with the normal datapoints; retrain the anomaly detection models with the updated training dataset after expiration of a threshold time, wherein the threshold time is based on a number of updates to the training dataset; detect the at least one abnormal datapoint in the operation data using the anomaly detection models; remove the subset from the operation data; detect the at least one abnormal datapoint in remainder operation data without the subset using the retrained anomaly detection models; apply the retrained anomaly detection models to a new subset of the operation data and classifying new-datapoints in the new subset as one of the normal datapoints and the abnormal datapoints; and update the training dataset and retraining the retrained anomaly detection models.
11. (canceled)
12. A non-transitory computer readable medium, having machine-readable instructions stored therein for detecting at least one abnormal datapoint in operation data associated with an industrial environment, wherein the operation data comprises historical data and streaming data corresponding to operation of industrial assets in the industrial environment, wherein the machine-readable instructions when executed by a processor cause the processor to apply one or more anomaly detection models to at least one subset of the operation data, wherein the anomaly detection models are trained based on a training dataset consisting of datapoints labeled as normal; classify subset-datapoints in the subset as one of normal datapoints and abnormal datapoints using the anomaly detection models; determine a confidence index in the classification of the subset-datapoints based on at least one of an engineering software input and operation of a comparable industrial environment; re-classify the subset-datapoints when a confidence threshold is not satisfied; update the training dataset with the confidence index and the re-classified datapoints; update the training dataset with the normal datapoints; retrain the anomaly detection models with the updated training dataset after expiration of a threshold time, wherein the threshold time is based on a number of updates to the training dataset; detect the at least one abnormal datapoint in the operation data using the anomaly detection models; remove the subset from the operation data; detect the at least one abnormal datapoint in remainder operation data without the subset using the retrained anomaly detection models; apply the retrained anomaly detection models to a new subset of the operation data and classifying new-datapoints in the new subset as one of the normal datapoints and the abnormal datapoints; and update the training dataset and retraining the retrained anomaly detection models.
13. The non-transitory computer readable medium of claim 12, further comprising machine-readable instructions that, when executed by the processor, cause the processor to: populate an anomaly dataset containing the abnormal datapoints in the subset and the new subset.
14. The non-transitory computer readable medium of claim 13, further comprising machine-readable instructions that, when executed by the processor, cause the processor to: update the training dataset with the anomaly dataset.
15. The non-transitory computer readable medium of claim 12, further comprising machine-readable instructions that, when executed by the processor, cause the processor to: receive the training dataset comprising at least the normal datapoints, wherein the normal datapoints are classified based on at least one of an engineering software input and operation of a comparable industrial environment; and train the anomaly detection models based on the training dataset comprising the normal datapoints.
16. The non-transitory computer readable medium of claim 12, wherein the one or more anomaly detection models are contained in an anomaly detection pipeline, wherein the anomaly detection pipeline is implemented as an iterative workflow of training, classification, and retraining, wherein the anomaly detection models are trained with the training dataset, wherein the anomaly detection models are validated based on classification of the normal datapoints and the abnormal datapoints and wherein the anomaly detection models are retrained based on the updated training dataset.
17. The non-transitory computer readable medium of claim 16, further comprising machine-readable instructions that, when executed by the processor, cause the processor to: generate the anomaly detection pipeline containing the anomaly detection models associated with the industrial environment, wherein the anomaly detection models comprise physics-based models, data-driven models, and a combination thereof; and determine a deviation score for the anomaly detection models based on properties of the anomaly detection models.
18. The non-transitory computer readable medium of claim 17, wherein generating the anomaly detection pipeline further comprises computer readable instructions that when executed by the processor, cause the processor to: enable selection of at least one anomaly detection model for the anomaly detection pipeline based on the deviation score.
19. The non-transitory computer readable medium of claim 18, further comprising machine-readable instructions that, when executed by the processor, cause the processor to: detect a batch of abnormal datapoints in the operation data using the anomaly detection pipeline, wherein detecting the batch of abnormal datapoints comprises: applying the anomaly detection pipeline to at least one subset of the batch, wherein the anomaly detection pipeline is trained based on the training dataset consisting of the normal datapoints; classifying the subset-datapoints in the subset as one of the normal datapoints and the abnormal datapoints using the anomaly detection pipeline; enlarging the training dataset at least with the normal datapoints; and retraining the anomaly detection pipeline with the updated training dataset after expiration of the threshold time, wherein the threshold time is based on the number of updates to the training dataset.
Description
DETAILED DESCRIPTION
[0041] Hereinafter, embodiments for carrying out the invention are described in detail. The various embodiments are described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more embodiments. It may be evident that such embodiments may be practiced without these specific details.
[0042]
[0043] The method begins in a first iteration i=1 at step 102 with the receipt/determination of a reference normal dataset L. For example, the reference normal dataset L may be received as measurements of comparable industrial assets in comparable industrial environments. In another example, the reference normal dataset L may be generated based on simulations of the industrial environment. The reference normal dataset L is used as a training dataset for anomaly detection models. Hereinafter, the training dataset will be referred to as the training dataset L.
[0044] At step 104, the unlabeled operation data U is received. The unlabeled operation data U may be received in time series or as a batch file that may not be in continuous time series. The operation data U is a dataset/data stream in which abnormal datapoints are detected.
[0045] At step 106, an anomaly detection model f_i is trained based on the training dataset L. Based on the training, the anomaly detection model is configured to classify datapoints as normal datapoints. At step 108, the anomaly detection model f_i is applied to the training dataset to validate the classification as normal datapoints.
[0046] At step 110, a subset S is selected from the operation data U. At step 112, the anomaly detection model f_i is applied to the subset S to generate classified datapoints (label_i) that are labeled as either normal datapoints N or abnormal datapoints A. In an embodiment, the classified datapoints label_i are illustrated as a trend of 1 and 0, where 1 indicates normal datapoints and 0 indicates abnormal datapoints.
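The labeling of a subset as a trend of 1 and 0 can be sketched as follows. This is a minimal illustration only: the band model (mean plus or minus k standard deviations), the function names train_band and label_subset, and the sample values are assumptions made for the example and are not taken from the disclosure, which leaves the model type open.

```python
from statistics import mean, stdev

def train_band(normal_values, k=3.0):
    """Fit a stand-in anomaly model: a band of mean +/- k standard deviations.

    The band model is an assumption for illustration; any one-class model
    trained on normal datapoints could take its place.
    """
    mu = mean(normal_values)
    sigma = stdev(normal_values)
    return (mu - k * sigma, mu + k * sigma)

def label_subset(band, subset):
    """Return the label trend for a subset: 1 = normal, 0 = abnormal."""
    low, high = band
    return [1 if low <= x <= high else 0 for x in subset]

# Reference normal dataset L and a subset S of the operation data U
# (made-up sample values).
L = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
S = [10.1, 9.7, 14.5, 10.0]

band = train_band(L)
labels = label_subset(band, S)

# Split the subset into normal datapoints N and abnormal datapoints A.
N = [x for x, lab in zip(S, labels) if lab == 1]
A = [x for x, lab in zip(S, labels) if lab == 0]
```

Here the third value of S falls outside the band learned from L, so it is the only datapoint labeled 0.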
[0047] At step 116, the training dataset is enlarged with the normal datapoints classified at step 112. Further, at step 118, an anomaly dataset is generated with the abnormal datapoints identified at step 112. In an embodiment, the steps 116 and 118 may be collectively referred to as enlarging the training dataset, which is indicated by step 114. In addition to enlarging the training dataset, at step 118 the anomaly detection model f_i is retrained based on the enlarged training dataset. The retrained anomaly detection model for the next iteration i+1 is referred to as f_{i+1}. In an embodiment, the retrained anomaly detection model f′_i is generated according to the application of an optimization function of anomaly detection models f_i and f_{i+1} in iterations i and i+1: f_i = opt{f_i, f_{i−1}}. At step 120, the subset S is removed from the operation data U and the steps 110-118 are repeated with the retrained anomaly detection model f′_i.
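One possible reading of the optimization function is a choice between the models of two consecutive iterations. The sketch below assumes, purely for illustration, that the criterion is the false-alarm rate on known-normal datapoints and that ties favor the retrained model; the disclosure does not specify the criterion, and the names train, false_alarm_rate, and opt are hypothetical.

```python
from statistics import mean, stdev

def train(values, k=3.0):
    """Stand-in model: (low, high) band of mean +/- k standard deviations."""
    mu, sigma = mean(values), stdev(values)
    return (mu - k * sigma, mu + k * sigma)

def false_alarm_rate(model, normal_values):
    """Fraction of known-normal datapoints that the model flags as abnormal."""
    low, high = model
    flagged = sum(1 for x in normal_values if not (low <= x <= high))
    return flagged / len(normal_values)

def opt(f_prev, f_next, validation_normals):
    """Hypothetical opt{}: keep the model with fewer false alarms.

    Ties go to the retrained model f_next (an assumption of this sketch).
    """
    if false_alarm_rate(f_next, validation_normals) <= false_alarm_rate(f_prev, validation_normals):
        return f_next
    return f_prev

# Training dataset before and after enlargement (made-up values).
L = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
N_new = [10.3, 9.7]            # datapoints classified as normal at step 112
L_enlarged = L + N_new         # step 116: enlarge the training dataset

f_i = train(L)                 # model of the current iteration
f_next = train(L_enlarged)     # model retrained on the enlarged dataset
f_selected = opt(f_i, f_next, L_enlarged)
```

With these values neither model raises a false alarm on the enlarged dataset, so the tie rule selects the retrained model.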
[0048]
[0049] At the supervised learning stage 210, the anomaly detection model is selected as part of an anomaly detection pipeline, and its parameters are initialized. For example, the anomaly detection model is an Isolation Forest model. At step 212, the selection of the Isolation Forest model is indicated by model type=1. Further, at step 212 a counter is initialized as cnt=0. At step 214, the training dataset L is received, and the Isolation Forest model is trained based on the training dataset L. The training dataset L includes normal datapoints N and abnormal datapoints A. The normal datapoints N are incorporated in the training dataset L at step 216. In an embodiment, the abnormal datapoints A are incorporated in the training dataset at step 218.
[0050] At the validation stage 220, domain experts may provide inputs to validate the classification of the normal datapoints N and the abnormal datapoints A. For example, at step 222 an automation engineer may provide ground-truth to validate the classification. At step 224 the anomaly detection pipeline is applied on the operation data U iteratively. At step 226, a subset S is picked from the operation data U and the anomaly detection pipeline is applied. The subset S is picked according to the time order. Accordingly, the leftmost block of the time-series operation data U is picked. Therefore, the earliest time block is the subset S and is input for classification in the first round.
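Picking the earliest time block as the subset S can be sketched as below. The record layout (timestamp, value), the function name pick_earliest_block, and the fixed block size are assumptions made for this example.

```python
def pick_earliest_block(U, block_size):
    """Split off the earliest block of time-ordered records.

    U is a list of (timestamp, value) tuples in any order.
    Returns (S, remainder), where S holds the block_size earliest records.
    """
    ordered = sorted(U, key=lambda record: record[0])
    return ordered[:block_size], ordered[block_size:]

# Operation data U with made-up timestamps and values.
U = [(3, 10.2), (1, 9.9), (2, 10.0), (5, 14.1), (4, 10.1)]

# First round: the leftmost (earliest) block becomes the subset S.
S, U_rest = pick_earliest_block(U, block_size=2)
```

Calling the function again on U_rest yields the next block, which mirrors the round-by-round processing of the time series.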
[0051] At the unsupervised learning stage 230, the anomaly detection pipeline is applied to a series of subsets of the operation data U. At predetermined time steps, the anomaly detection pipeline is reinforced with an enlarged normal dataset. At step 232, the subset S is removed from the operation data U when the classification is completed by the application of the anomaly detection pipeline. At step 234, time-series operation data from the industrial assets is received in real time. At step 236, the counter cnt is incremented. At step 238, the counter cnt is checked to confirm whether its value is less than the threshold time. Additionally, it is checked whether the time-series operation data U is not empty. If any of the checks are positive, a new subset is picked from the operation data U and the method continues from step 224.
[0052] The stage 230 is also the reinforcement stage. The reinforcement is performed by enlarging the training dataset L and retraining the anomaly detection pipeline. The reinforcement enables the anomaly detection pipeline to be robust and extends its generalizability through retraining.
[0053] The method 200 may be coded as follows.
[0054] Input: Data: L reference normal dataset as the training dataset; Data: M empty dataset to be populated as abnormal dataset; Data: U time series operation data; Threshold time: t the constraint to reinforce the trained model; Model: f model selected as part of the anomaly detection pipeline.
[0055] Output: a collection of normal datapoints and abnormal datapoints, comprising: M, the detected anomalies; L, the enlarged normal dataset; and f, the robust anomaly detection model.
TABLE-US-00001
for f in model types do in parallel:
    Train f on L
    cnt = 0
    Initialize empty sets: abnormal set A, normal set N
    while (U is not empty) and (industrial asset is not interrupted) do:
        while (cnt <= t) and (U is not empty) do:
            Pick up subset S of U
            Apply f to the subset S; add detected abnormal datapoints to A, and add detected normal datapoints to N; add N to L; add A to M
            Remove subset S from U
            cnt += 1
            if (cnt == t) then:
                Train f from L
            end
        cnt = 0
    end
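The listing above can be approximated in runnable form as follows. This is a sketch under stated assumptions, not the disclosed implementation: a simple band model stands in for the Isolation Forest or one-class SVM, the nested loops are flattened so that the model is retrained after every t subsets, and the names run_pipeline, subset_size, and asset_running are hypothetical.

```python
from statistics import mean, stdev

def train(L, k=3.0):
    """Stand-in model f: band of mean +/- k standard deviations over L."""
    mu, sigma = mean(L), stdev(L)
    return (mu - k * sigma, mu + k * sigma)

def is_normal(f, x):
    low, high = f
    return low <= x <= high

def run_pipeline(L, U, t, subset_size=2, asset_running=lambda: True):
    """Classify U subset by subset, enlarging L and retraining every t subsets.

    Returns (M, L, f, retrain_count): detected anomalies, enlarged normal
    dataset, final model, and the number of retraining passes.
    """
    L, U, M = list(L), list(U), []
    f = train(L)
    cnt = 0
    retrain_count = 0
    while U and asset_running():
        # Pick the next subset S and remove it from U.
        S, U = U[:subset_size], U[subset_size:]
        N = [x for x in S if is_normal(f, x)]
        A = [x for x in S if not is_normal(f, x)]
        L.extend(N)   # add N to L
        M.extend(A)   # add A to M
        cnt += 1
        if cnt == t:  # threshold reached: reinforce the model
            f = train(L)
            retrain_count += 1
            cnt = 0
    return M, L, f, retrain_count

# Made-up reference data and operation data.
L0 = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
U0 = [10.1, 9.7, 14.5, 10.0, 10.3, 5.0, 9.9, 10.2]
M_out, L_out, f_out, retrains = run_pipeline(L0, U0, t=2)
```

In this run the two outliers end up in M, the six remaining operation datapoints are added to L, and the model is retrained twice.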
[0056]
[0057] At step 304, the training dataset L is used to train the one-class SVM model. At step 306, the anomaly detection pipeline is applied to the operation data U and classifies datapoints in the subset S as normal datapoints N or abnormal datapoints A. At step 308, a domain expert may validate the classification. At step 310, the normal datapoints N are incorporated in the training dataset to generate the enlarged training data L1. At step 312, the abnormal datapoints A are added to an anomaly dataset M.
[0058] The aforementioned steps are repeated as the operation data U streams in at round 320. Accordingly, at step 322 the operation data U is streamed, and a new subset is picked at step 324. The anomaly detection pipeline is retrained/reinforced with the enlarged training dataset L1. The steps 330 and 332 are repetitions of steps 310 and 312. Further rounds 330 proceed as indicated above.
[0059] The steps of the method 300 may be repeated until the operation data U is no longer streamed or the industrial asset operation is interrupted, for example, when a motor is stopped. When the steps are complete, the outcome is an enlarged training dataset L2 and an enlarged anomaly dataset M. Further, the anomaly detection pipeline includes robust models for the industrial asset. Additional uses of the enlarged datasets L2 and M may include clustering operation data from a comparable industrial asset.
[0060]
[0061] In
[0062]
[0063] The method begins at step 510 by applying one or more anomaly detection models to at least one subset of the operation data. The anomaly detection models are trained based on the training dataset. The training dataset initially includes datapoints labelled as normal. Accordingly, step 510 includes training the anomaly detection models based on the training dataset. The training dataset may be provided with the normal datapoints based on actions performed by an expert. Accordingly, the step 510 may include receiving the training data set including at least the normal datapoints, wherein the normal datapoints are classified based on at least one of engineering software input and operation of a comparable industrial environment.
[0064] When the anomaly detection models are applied to the subset of the operation data, classification of datapoints within the subset is possible. Step 520 includes classifying subset-datapoints in the subset as normal datapoints or abnormal datapoints. The classification of the subset-datapoints enables labelling of unlabeled datapoints in the operation data.
[0065] In an embodiment, step 520 may include determining a confidence index in the classification of the subset-datapoints based on at least one of the engineering software input and operation of the comparable industrial environment. The method may also include re-classifying the subset-datapoints when a confidence threshold is not satisfied and updating the training dataset with the confidence index and the re-classified datapoints. Embodiments provide a mechanism for validating the classification of the subset-datapoints or the new datapoints through the determination of the confidence index. For example, the difference between the datapoint classified as normal and the datapoints within boundary conditions set by the automation engineer may be used to determine the confidence index. Further, the expert input may be provided for the datapoints that require re-classification.
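A confidence index derived from engineer-set boundary conditions might look like the sketch below. The formula, the scale parameter, the fallback of re-classifying against the boundary conditions, and the names confidence_index and reclassify are all assumptions for illustration; the disclosure only states that the difference from the boundary conditions may be used.

```python
def confidence_index(x, label, bounds, scale):
    """Hypothetical confidence in [0, 1] for one classified datapoint.

    label:  1 = normal, 0 = abnormal (the model's classification).
    bounds: (low, high) boundary conditions set by the automation engineer.
    scale:  distance outside the bounds at which agreement drops to 0.
    """
    low, high = bounds
    distance = max(low - x, 0.0, x - high)        # 0.0 when x is inside the bounds
    agreement = max(0.0, 1.0 - distance / scale)  # 1.0 inside, falling to 0.0
    return agreement if label == 1 else 1.0 - agreement

def reclassify(points, labels, bounds, scale=1.0, threshold=0.5):
    """Re-classify datapoints whose confidence misses the threshold.

    In this sketch the boundary conditions act as the fallback classifier;
    in practice an expert could supply the new label instead.
    Returns a list of (value, label, confidence) tuples.
    """
    low, high = bounds
    results = []
    for x, label in zip(points, labels):
        conf = confidence_index(x, label, bounds, scale)
        if conf < threshold:
            label = 1 if low <= x <= high else 0
            conf = confidence_index(x, label, bounds, scale)
        results.append((x, label, conf))
    return results

# Made-up subset-datapoints with the model's labels.
points = [10.0, 11.2, 10.2]
labels = [1, 1, 0]
results = reclassify(points, labels, bounds=(9.5, 10.5))
```

The second and third datapoints disagree with the boundary conditions strongly enough to fall below the threshold, so their labels are flipped; the resulting tuples could then be added to the training dataset together with their confidence values.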
[0066] The labelling provides new training data for the anomaly detection models. Accordingly, step 530 includes updating the training dataset with at least the normal datapoints. Step 530 may also include populating an anomaly dataset containing the abnormal datapoints in the subset and the new subset. The anomaly dataset may be populated after classification of the datapoints in the subset or the new subset. The update of the training dataset may be performed iteratively. The iterative population of the anomaly dataset enables continuous learning and retraining of the anomaly detection models.
[0067] When an iterative threshold, i.e., the threshold time, is satisfied, the anomaly detection models are retrained based on the updated training dataset. Step 540 includes retraining the anomaly detection models with the updated training dataset after expiration of the threshold time. The retraining of the anomaly detection models reinforces the classification performed on the subset-datapoints. Step 550 includes detecting the abnormal datapoint in the operation data based on the application of the retrained anomaly detection models.
[0068]
[0069] In an embodiment, the computing device 620 is located within the industrial environment 610. For example, the computing device 620 may be an edge computing device. The edge computing device 620 may be a lightweight, low-cost device that collects data from various sensors deployed in the industrial environment 610, stores and buffers the collected data (i.e., operation data of the industrial assets 612-618), conducts analysis of the collected data, and performs an action (e.g., issuing a control command) based on the outcome of the analysis. In embodiments, the analysis performed by the edge computing device 620 is the detection of abnormal datapoints in the operation data.
[0070] In another embodiment, the computing device 620′ may be hosted on an IOT platform 650. The IOT platform 650 may include a cloud computing platform, a fog/edge computing platform, or a combination of both. In an embodiment, the IOT platform 650 serves as a host on which the computing device 620′ is implemented. The IOT platform 650 includes distributed compute resources connected by means of a communication network.
[0071] As used herein, “cloud computing” refers to a processing environment including configurable computing physical and logical resources, for example, networks, servers, storage, applications, services, etc., and data distributed over the network, for example, the internet. The cloud computing system provides on-demand network access to a shared pool of the configurable computing physical and logical resources. The network is, for example, a wired network, a wireless network, a communication network, or a network formed from any combination of these networks.
[0072] As used herein, “fog computing” or “edge computing” enables the IOT platform to be realized closer to the automation environment. Fog/edge computing extends cloud computing to the physical location of the devices belonging to the automation network. It may be a combination of multiple edge devices 620 that are configured to execute the operation of the IOT platform 650.
[0073] The predictive maintenance system 600 includes the computing devices 620 and 620′, the IOT platform 650 and a database 640. The database 640 is configured to store historical operation data of the industrial assets 612-618. The historical operation data refers to data collected from the sensors in the industrial environment 610 and stored in the database 640 or in the computing device 620. Further, the database 640 may also include training datasets associated with the industrial environments. Furthermore, the database 640 may be configured to store simulation models associated with the industrial environment 610 and the industrial assets 612-618.
[0074] The computing devices 620 and 620′ may include comparable hardware/software modules to implement embodiments. The following description is with regard to the edge device 620. The edge device 620 includes a processing unit 622, a display 624 configured to receive a user input and display an output based on commands from the processing unit 622 and memory 630 communicatively coupled to the processing unit 622. Although not illustrated in
[0075] The memory 630 includes an anomaly module 635 that when executed enables the edge device 620 to detect abnormal datapoints in the operation data. The anomaly module 635 includes a training module 632, a labelling module 634, an update module 636 and retraining module 638.
[0076] In operation, the edge device 620 receives operation data from the sensors associated with the industrial assets 612-618 via the communication interfaces. Further, the edge device 620 extracts the training dataset stored in the database 640. The training module 632 is configured to train one or more anomaly detection models generated from the simulation models in the database 640. The training is based on the training dataset extracted from the database 640. The training dataset includes normal datapoints that have been validated based on operation of the comparable industrial environment.
[0077] In an embodiment, the anomaly detection models are selected to form an anomaly detection pipeline. The anomaly detection pipeline is implemented as an iterative workflow of training, classification and retraining. The training module 632 may be configured to generate the anomaly detection pipeline containing the anomaly detection models based on a deviation score determined for each of the anomaly detection models. The deviation score is determined based on the properties of the anomaly detection models and indicates the accuracy of each of the anomaly detection models.
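Selecting models for the pipeline by deviation score can be sketched as follows. The score used here (the fraction of labeled validation datapoints that a candidate misclassifies), the candidate band models, and the names make_band_model, deviation_score, and select_models are assumptions for illustration, since the disclosure does not define how the score is computed.

```python
from statistics import mean, stdev

def make_band_model(k):
    """Return a fit function for a band model of mean +/- k standard deviations."""
    def fit(values):
        mu, sigma = mean(values), stdev(values)
        low, high = mu - k * sigma, mu + k * sigma
        return lambda x: low <= x <= high   # True means "normal"
    return fit

def deviation_score(predict_normal, normals, anomalies):
    """Hypothetical deviation score: fraction of labeled datapoints misclassified."""
    errors = sum(1 for x in normals if not predict_normal(x))
    errors += sum(1 for x in anomalies if predict_normal(x))
    return errors / (len(normals) + len(anomalies))

def select_models(candidates, L, normals, anomalies, max_score):
    """Keep the candidates whose deviation score does not exceed max_score."""
    pipeline = {}
    for name, fit in candidates.items():
        model = fit(L)
        score = deviation_score(model, normals, anomalies)
        if score <= max_score:
            pipeline[name] = score
    return pipeline

# Made-up training data, validation data, and candidate models.
L = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
val_normals = [10.0, 10.2, 9.8, 10.3]
val_anomalies = [14.5, 5.0]
candidates = {
    "tight": make_band_model(1.0),
    "medium": make_band_model(3.0),
    "loose": make_band_model(40.0),
}
selected = select_models(candidates, L, val_normals, val_anomalies, max_score=0.1)
```

The tight band rejects several normal datapoints and the loose band accepts both anomalies, so only the medium candidate stays in the pipeline.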
[0078] The labelling module 634 is configured to classify datapoints within a subset of the operation data. The datapoints within the subset are referred to as subset-datapoints. The subset-datapoints are classified as either the normal datapoints or abnormal datapoints using the anomaly detection models. In an embodiment, the labelling module 634 is further configured to determine a confidence index in the classification of the subset-datapoints based on the simulation models or operation of the comparable industrial environment. The labelling module 634 may be configured to re-classify the subset-datapoints when a confidence threshold is not satisfied. For example, the confidence threshold may be a predefined value that an automation engineer may set using the GUI 626 to ensure that the quality of the classification does not fall below the predefined value. In another embodiment, the automation engineer may re-classify the datapoints that do not satisfy the confidence threshold.
[0079] The update module 636 is configured to update the training dataset with the newly classified normal datapoints in the subset. In an embodiment, the training dataset may also be updated with the confidence index and the re-classified datapoints. The update module 636 is configured to populate an anomaly dataset containing the abnormal datapoints in the subset. In an embodiment, the training dataset may be updated with the anomaly dataset. The update module 636 is configured to determine a threshold time based on the number of updates performed to the training dataset.
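The bookkeeping performed by the update module can be sketched as a small class. The class name UpdateModule, its attributes, and the rule that the threshold expires after a fixed number of updates are assumptions made for this example; the disclosure states only that the threshold time is based on the number of updates.

```python
class UpdateModule:
    """Tracks the training dataset, the anomaly dataset, and the update count."""

    def __init__(self, training_dataset, updates_before_retrain):
        self.training_dataset = list(training_dataset)
        self.anomaly_dataset = []
        self.updates_before_retrain = updates_before_retrain
        self.update_count = 0

    def update(self, normal_points, abnormal_points):
        """Add one classified subset and count it as a single update."""
        self.training_dataset.extend(normal_points)
        self.anomaly_dataset.extend(abnormal_points)
        self.update_count += 1

    def threshold_expired(self):
        """True once enough updates have accumulated to justify retraining."""
        return self.update_count >= self.updates_before_retrain

    def reset(self):
        """Restart the count after the models have been retrained."""
        self.update_count = 0

# Two classified subsets arrive (made-up values); retraining is due after the second.
module = UpdateModule([10.0, 10.1], updates_before_retrain=2)
module.update([9.9], [14.5])
due_after_first = module.threshold_expired()
module.update([10.2], [])
due_after_second = module.threshold_expired()
```

A retraining module would poll threshold_expired(), retrain on training_dataset when it returns True, and then call reset().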
[0080] When the threshold time expires, the retraining module 638 is configured to retrain the anomaly detection models with the updated training dataset. Further, the retraining module 638 extracts a new subset from the operation data and applies the retrained anomaly detection models to a new subset of the operation data. Thereafter, the labelling module 634 is configured to classify new-datapoints in the new subset as the normal datapoints or the abnormal datapoints. Based on the classification, the anomaly module 635 is configured to iteratively detect the abnormal datapoints in the operation data. In an embodiment, the anomaly module 635 is configured to detect a batch of abnormal datapoints in the operation data.
[0081] Embodiments may take the form of a computer program product including program modules accessible from a computer-usable or computer-readable medium storing program code for use by or in connection with one or more computers, processors, or instruction execution systems. For the purpose of this description, a computer-usable or computer-readable medium may be any apparatus that may contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The medium may be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device). Propagation media in and of themselves, as signal carriers, are not included in the definition of a physical computer-readable medium. Examples of a physical computer-readable medium include a semiconductor or solid-state memory, magnetic tape, a removable computer diskette, random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, and an optical disk such as compact disk read-only memory (CD-ROM), compact disk read/write, and DVD. Both processors and program code for implementing each aspect of the technology may be centralized or distributed (or a combination thereof) as known to those skilled in the art.
[0082] It is to be understood that the elements and features recited in the appended claims may be combined in different ways to produce new claims that likewise fall within the scope of the present invention. Thus, whereas the dependent claims appended below depend from only a single independent or dependent claim, it is to be understood that these dependent claims may, alternatively, be made to depend in the alternative from any preceding or following claim, whether independent or dependent, and that such new combinations are to be understood as forming a part of the present specification.
[0083] While the present invention has been described above by reference to various embodiments, it may be understood that many changes and modifications may be made to the described embodiments. It is therefore intended that the foregoing description be regarded as illustrative rather than limiting, and that it be understood that all equivalents and/or combinations of embodiments are intended to be included in this description.