METHOD OF DETECTING ANOMALIES ON APPLIANCES AND SYSTEM THEREOF
20170315855 · 2017-11-02
Inventors
Cpc classification
G06F11/0757
PHYSICS
G06N7/01
PHYSICS
G06F11/0736
PHYSICS
G06F11/0709
PHYSICS
International classification
Abstract
A method, system and computer program product, the method comprising: obtaining transition probabilities, each transition probability associated with transition of a home appliance between states; receiving sensor readings indicating behavior of the home appliance; identifying by the processor a transition event occurring in the sensor readings; determining by the processor a source cluster and a destination cluster associated with the transition event; determining by the processor a duration indicator associated with the transition event; determining by the processor a transition probability by looking up in the transition probabilities, a probability associated with the duration indicator, the source cluster and the destination cluster; comparing by the processor the transition probability to a threshold; and responsive to the transition probability exceeding a threshold, providing an indication of abnormal behavior of the home appliance to a user.
Claims
1. A computer-implemented method for identifying anomalies in data streams using a processor operatively connected to a memory, the method comprising: receiving sensor readings associated with a home appliance of a home appliance type; clustering by a processor the sensor readings into a plurality of clusters; extracting by the processor from the sensor readings transition features associated with a transition, in accordance with the plurality of clusters, the transitions indicating state changes in the home appliance, each state associated with a cluster; and based on the transition features, determining transition probabilities between states of the home appliance for a plurality of transition time indicators and accommodating the transition probabilities in the memory, wherein the transition probabilities are adapted for detecting anomalies in transitions occurring in further sensor readings, thus identifying abnormal behavior of another appliance of the home appliance type.
2. The method of claim 1, wherein clustering is performed by a K-means clustering process.
3. The method of claim 1, wherein clustering is performed by a DBscan clustering process.
4. The method of claim 1, wherein determining the transition probabilities comprises: indicating a time duration for each transition; determining number of transitions for each combination of source and destination for each time duration; and normalizing the number of transitions.
5. The method of claim 3, wherein determining the number of transitions for each time duration comprises Markov chain sampling.
6. A computer-implemented method for identifying anomalies in data streams indicating behavior of a home appliance using a processor operatively connected to a memory, the method comprising: obtaining transition probabilities, each transition probability associated with transition of a home appliance between states; receiving sensor readings indicating behavior of the home appliance; identifying by the processor a transition event occurring in the sensor readings; determining by the processor a source cluster and a destination cluster associated with the transition event; determining by the processor a duration indicator associated with the transition event; determining by the processor a transition probability by looking up in the transition probabilities, a probability associated with the duration indicator, the source cluster and the destination cluster; comparing by the processor the transition probability to a threshold; and responsive to the transition probability exceeding a threshold, providing an indication of abnormal behavior of the home appliance to a user.
7. The method of claim 5, wherein the duration indicator is a discretized transition duration associated with the transition event.
8. The method of claim 6, wherein the discretized transition duration is an index of a Fibonacci number larger than the transition duration.
9. The method of claim 5, wherein the sensor readings refer to at least one item selected from the group consisting of: power consumption; current; voltage; fluid flow; temperature; and humidity.
10. The method of claim 5, wherein obtaining the transition probabilities comprises: receiving sensor readings associated with a home appliance; clustering the sensor readings into a plurality of clusters; extracting from the sensor readings transition features associated with a transition, in accordance with the plurality of clusters, the transitions indicating state changes in the home appliance, each state associated with a cluster; and based on the transition features, determining transition probabilities between states of the home appliance for a plurality of transition time indicators.
11. The method of claim 10, wherein clustering is performed by a K-means clustering process.
12. The method of claim 10, wherein clustering is performed by a process selected from the group consisting of: DBscan, K-Histograms and Ward's Method.
13. The method of claim 10, wherein determining the transition probabilities comprises: indicating a time duration for each transition; determining number of transitions for each combination of source and destination for each time duration; and normalizing the number of transitions.
14. The method of claim 13, wherein determining the number of transitions for each time duration comprises Markov chain sampling.
15. A computerized system for projecting a machine learning model, the system comprising a processor, wherein: the processor is configured to obtain transition probabilities, each transition probability associated with transition of a home appliance between states; the processor is configured to receive sensor readings indicating behavior of the home appliance; the processor is configured to identify by the processor a transition event occurring in the sensor readings; the processor is configured to determine a source cluster and a destination cluster associated with the transition event; the processor is configured to determine a duration indicator associated with the transition event; the processor is configured to determine a transition probability by looking up in the transition probabilities, a probability associated with the duration indicator, the source cluster and the destination cluster; the processor is configured to compare the transition probability to a threshold; and the processor is configured to provide an indication of abnormal behavior of the home appliance to a user determine, responsive to the transition probability exceeding a threshold.
16. The system of claim 15, wherein the duration indicator is a discretized transition duration associated with the transition event and wherein the discretized transition duration is an index of a Fibonacci number larger than the transition duration.
17. The system of claim 15, wherein obtaining the transition probabilities comprises: receiving sensor readings associated with a home appliance; clustering the sensor readings into a plurality of clusters; extracting from the sensor readings transition features associated with a transition, in accordance with the plurality of clusters, the transitions indicating state changes in the home appliance, each state associated with a cluster; and based on the transition features, determining transition probabilities between states of the home appliance for a plurality of transition time indicators.
18. The system of claim 17, wherein clustering is performed by a process selected from the group consisting of: DBscan, K-Histograms and Ward's Method.
19. The system of claim 17, wherein determining the transition probabilities comprises: indicating a time duration for each transition; determining number of transitions for each combination of source and destination for each time duration; and normalizing the number of transitions.
20. A computer program product comprising a computer readable storage medium retaining program instructions, which program instructions when read by a processor, cause the processor to perform a method comprising: obtaining transition probabilities, each transition probability associated with transition of a home appliance between states; receiving sensor readings indicating behavior of the home appliance; identifying by the processor a transition event occurring in the sensor readings; determining by the processor a source cluster and a destination cluster associated with the transition event; determining by the processor a duration indicator associated with the transition event; determining by the processor a transition probability by looking up in the transition probabilities, a probability associated with the duration indicator, the source cluster and the destination cluster; comparing by the processor the transition probability to a threshold; and responsive to the transition probability exceeding a threshold, providing an indication of abnormal behavior of the home appliance to a user.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] In order to understand the invention and to see how it can be carried out in practice, embodiments will be described, by way of non-limiting examples, with reference to the accompanying drawings, in which:
[0016]
[0017]
[0018]
[0019]
DETAILED DESCRIPTION
[0020] In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the presently disclosed subject matter may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the presently disclosed subject matter.
[0021] Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “processing”, “computing”, “representing”, “comparing”, “generating”, “assessing”, “matching”, “updating”, “determining” or the like, refer to the action(s) and/or process(es) of a computer that manipulate and/or transform data into other data, said data represented as physical, such as electronic, quantities and/or said data representing the physical objects. The term “computer” should be expansively construed to cover any kind of hardware-based electronic device with data processing capabilities.
[0022] The terms “non-transitory memory” and “non-transitory storage medium” as used herein should be expansively construed to include any volatile or non-volatile computer memory suitable to the presently disclosed subject matter.
[0023] The operations in accordance with the teachings herein may be performed by a computer specially constructed for the desired purposes or by a general-purpose computer specially configured for the desired purpose by a computer program stored in a non-transitory computer-readable storage medium.
[0024] Embodiments of the presently disclosed subject matter are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the presently disclosed subject matter as described herein.
[0025] The disclosure relates to identifying abnormal behaviors in devices such as home appliances. It will be appreciated that in some cases it may take a long time after a problem in a device occurs until it is noticed, at which point in time it may be too late or more expensive to correct the situation. By identifying that an unlikely transition has occurred between states of a device, early problem discovery may be enabled which may avoid a problematic situation.
[0026] For example, a refrigerator door left open may be discovered before the temperature within the refrigerator increases enough to be noticed. In another example, by identifying that the filters of an air-conditioner need to be cleaned, energy may be saved and the air-conditioner engine can avoid excessive work.
[0027] Bearing this in mind, attention is drawn to
[0028] In some embodiments of the invention, the method comprises a training stage 100 and a runtime stage 104, each comprising multiple steps as detailed below.
[0029] During training stage 100 the normal behavior of a specific device such as a home appliance, or a device type such as a home appliance type may be learned, such that deviations from this behavior can then be detected, as they may indicate problems with the device.
[0030] On step 108, sensor readings may be received, for example as a data stream. The sensor readings may comprise readings of parameters associated with the device itself, such as current, voltage, temperature within the device, pressure, or the like. Additionally or alternatively, the readings may include environmental parameters, such as temperature in the environment of the device, pressure, light, noise, or any other measurable parameter. The sensor readings may be associated with time stamps, which may be absolute and indicate the time, or relative and indicate the time since measurements started. Alternatively, the measurements may be assumed to be taken at fixed time intervals, such that the same period of time elapses between any two consecutive measurements.
[0031] It will be appreciated that the sensor readings are not limited to a single parameter or to a one-dimensional parameter. Rather, readings may be received which relate to two or more parameters, such as voltage and temperature. Additionally or alternatively, the readings may relate to one or more multi-dimensional parameters, such as two-dimensional coordinates, or the like.
[0032] On step 112, the readings may be clustered into groups based on their values, using any desired clustering method, such as but not limited to K-means clustering, K-Histograms, or DBSCAN. It will be appreciated that if readings are received from multiple sensors, or from one or more multi-dimensional sensors, then more complex clustering methods may be more appropriate, e.g., DBSCAN or Ward's Method.
[0033] The clustering results include two or more clusters, each having a cluster ID. For example, in K-means clustering, the cluster ID may be the centroid of a cluster.
[0034] Each reading is associated with one of the clusters and is closer to the centroid of the respective cluster than to the centroids of other clusters.
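The association of a reading with its closest centroid can be sketched as follows. This is a minimal illustration for one-dimensional readings, assuming the centroids have already been computed by some clustering step; the function name `nearest_cluster` and the sample values are hypothetical and not taken from the disclosure.

```python
def nearest_cluster(reading, centroids):
    """Return the index of the centroid closest to a scalar reading."""
    return min(range(len(centroids)), key=lambda i: abs(reading - centroids[i]))


# Illustrative centroids; the list index serves as the cluster ID.
centroids = [70, 30, 40]
readings = [71, 69, 31, 29, 41]
labels = [nearest_cluster(r, centroids) for r in readings]
print(labels)  # [0, 0, 1, 1, 2]
```

For multi-dimensional readings the absolute difference would be replaced by a suitable distance measure, such as Euclidean distance.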
[0035] On step 116, transition features may be extracted from the readings and the clusters. A transition is identified when two consecutive measured values are associated with two different clusters. The features associated with each transition may thus comprise a source cluster, a destination cluster, and a transition duration, i.e., a period of time or number of measurements for which the measured values were associated with the first cluster prior to the transition. In some embodiments, the transition durations may be discretized to obtain transition indicators. In some embodiments, the discretization may use fixed intervals. However, in other embodiments, the discretization may use other scales, for example Fibonacci numbers. Extracting the transition features is further detailed in association with steps 128, 132 and 136 below.
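A possible sketch of the transition feature extraction is given below. It assumes that readings arrive at fixed intervals and have already been mapped to cluster IDs, so that the duration can be counted in readings; the function name `extract_transitions` is hypothetical.

```python
def extract_transitions(labels):
    """Return a list of (source, destination, duration) tuples.

    A transition is recorded whenever two consecutive labels differ.
    The duration is the number of consecutive readings that were
    associated with the source cluster immediately before the change.
    """
    transitions = []
    run_length = 1
    for prev, curr in zip(labels, labels[1:]):
        if curr == prev:
            run_length += 1
        else:
            transitions.append((prev, curr, run_length))
            run_length = 1
    return transitions


labels = [0, 0, 1, 1, 1, 1, 1, 2, 2, 1]
print(extract_transitions(labels))  # [(0, 1, 2), (1, 2, 5), (2, 1, 2)]
```

When readings carry time stamps instead, the duration could be computed as the difference between the time stamp of the transition and that of the first reading in the source cluster.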
[0036] It will be appreciated that the resulting features, obtained by discretization of the values as done by clustering, discretization of time, and detection of the transitions, may be viewed as Markov chains. It will be appreciated that Markov chains are typically referred to as being memory-less, i.e., a transition is independent of previously occurring transitions. Additionally or alternatively, Markov chains with memory may be used, typically referred to as “Additive Markov Chains” or “Markov chain of order m”, wherein m indicates the number of past states the transition depends on.
[0037] On step 120 the transition probabilities may be determined, for example by normalizing the numbers of all transitions associated with a given duration indicator and a given source cluster. The probabilities may thus indicate the probability of transition to a given destination cluster for a given transition duration and given source cluster.
[0038] The transition probabilities may then be stored and used for determining anomalies during runtime.
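The counting and normalization of step 120 might be sketched as follows. The dictionary layout, keyed by duration indicator and source cluster, is an assumption of this example; any other data structure may be used, and the name `transition_probabilities` is hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(transitions):
    """Map (indicator, source) -> {destination: probability}.

    `transitions` is an iterable of (source, destination, indicator)
    tuples, where indicator is the already discretized duration.
    """
    counts = defaultdict(Counter)
    for source, destination, indicator in transitions:
        counts[(indicator, source)][destination] += 1

    probabilities = {}
    for key, destinations in counts.items():
        total = sum(destinations.values())
        probabilities[key] = {d: n / total for d, n in destinations.items()}
    return probabilities


# Hypothetical observations: (source, destination, indicator).
observed = [(1, 0, 1), (1, 2, 1), (1, 2, 3), (2, 1, 1)]
table = transition_probabilities(observed)
print(table[(1, 1)])  # {0: 0.5, 2: 0.5}
```

For each duration indicator and source cluster, the stored probabilities sum to one over the observed destination clusters.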
[0039] It will be appreciated that the training stage may be performed for a device type by a manufacturer and utilized for manufactured devices during usage. Alternatively, the training stage may be performed for each device when installed or when usage starts, and used later on. Even further, the training may be updated continuously or at times.
[0040] For runtime stage 104, the transition probabilities as determined on training stage 100 may be obtained. The transition probabilities may be calculated based on a training period, received with the device, received separately from another source, updated, or the like.
[0041] On step 122, sensor readings may be received, for example as a data stream, which may be received continuously, discretely, or the like. The readings may refer to the same parameter(s) for which training was performed.
[0042] On step 124, transition events may be identified within the received readings. On step 128, each reading may be associated with one of the clusters determined on step 112, for example by determining the cluster whose centroid is closest to the reading.
[0043] On step 132, a transition may be identified as two consecutive readings being associated with two different clusters, such that a first reading is associated with a source cluster and a second reading is associated with a destination cluster.
[0044] On step 136, the transition duration may be determined as the period of time or the number of readings associated with the source cluster prior to the transition. A transition indicator may be obtained by time discretization thereof. The time discretization may be performed as the time discretization performed during training stage 100, i.e., using fixed time intervals, fixed number of readings, Fibonacci series, or the like. The transition indicator may also be obtained by a clustering technique, e.g. K-Means or others.
[0045] On step 140, the probability of the transition may be determined by looking up, in the received transition probabilities, the entry corresponding to the transition duration, the source cluster and the destination cluster.
[0046] On step 144, the retrieved probability may be compared against a threshold.
[0047] On step 148, if the probability is below the threshold, the transition is unlikely and may indicate abnormal behavior of the device. An anomaly indication may then be provided, for example by sending a message to a user, such as an instant message or a text message being sent to a mobile device of a user, an e-mail message sent to an e-mail account of a user, a message or a phone call initiated to an emergency center, or the like.
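Steps 140 to 148 may be illustrated by the short sketch below. The table layout, the function name `is_anomalous` and the treatment of combinations never seen during training as having probability zero are all assumptions of this example rather than features stated in the disclosure.

```python
def is_anomalous(table, source, destination, indicator, threshold):
    """Return True when the looked-up probability is below the threshold.

    `table` maps (indicator, source) -> {destination: probability}.
    Combinations absent from the table are treated as probability 0.0,
    which is an assumption of this sketch.
    """
    probability = table.get((indicator, source), {}).get(destination, 0.0)
    return probability < threshold


# Hypothetical probabilities obtained from a training stage.
table = {(1, 2): {1: 0.5, 0: 0.5}, (3, 1): {2: 1.0}}
print(is_anomalous(table, source=2, destination=1, indicator=1, threshold=0.1))  # False
print(is_anomalous(table, source=1, destination=0, indicator=3, threshold=0.1))  # True
```

A caller would issue the anomaly indication, for example a message to the user, whenever the function returns True.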
[0048] It is noted that the teachings of the presently disclosed subject matter are not bound by the flow chart illustrated in
[0049] Referring now to
[0050] In the example of
[0051] The values may then be clustered, using for example K-means clustering to obtain the clusters shown in table 204. Thus, cluster 0 has a centroid of 70, cluster 1 has a centroid of 30, and cluster 2 has a centroid of 40. It will be appreciated that the centroid is not necessarily a value that appeared in the measurements.
[0052] Transitions between clusters may then be identified within the readings of table 200. Thus, it can be seen that two minutes after the start of the readings, at 09:03, there was a transition between readings close to 70 (cluster 0) and readings close to 30 (cluster 1); after a further five minutes there was a transition to values close to 40 (cluster 2); and after two more minutes a transition to a reading of 30 (cluster 1). The times and centroids of the involved clusters are summarized in table 208.
[0053] Table 212 shows a series of Fibonacci numbers and their respective indices.
[0054] Table 216 shows table 208 in which the duration time in minutes has been converted to an index of the first Fibonacci number that is equal to or larger than the duration. Thus, the value of two is associated with Fibonacci index 1, while the value of five is associated with Fibonacci index 3. If the series had contained a transition having a duration of 18, then the first Fibonacci number exceeding it is 21, and the transition would have been associated with the Fibonacci index of 6.
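The conversion of a duration to a Fibonacci index can be sketched as follows. Table 212 is not reproduced here, so the series 1, 2, 3, 5, 8, 13, 21 indexed from 0 is an assumption, chosen because it reproduces the indices given in the examples above (2 maps to 1, 5 maps to 3, 18 maps to 6). The function name `fibonacci_index` is hypothetical.

```python
def fibonacci_index(duration):
    """Return the index of the smallest Fibonacci number >= duration.

    Assumes the series 1, 2, 3, 5, 8, 13, 21, ... indexed from 0.
    """
    previous, current, index = 1, 2, 0
    while previous < duration:
        previous, current = current, previous + current
        index += 1
    return index


for minutes in (2, 5, 18):
    print(minutes, fibonacci_index(minutes))
# 2 1
# 5 3
# 18 6
```

Because the gaps between consecutive Fibonacci numbers grow, short durations are resolved finely while long durations share coarser bins, which keeps the number of probability tables small.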
[0055] Then, a table may be constructed for each Fibonacci index. Thus, for the index of 1, table 220 may be created, showing that one transition occurred from 40 to 30, and another occurred from 70 to 30.
[0056] No transition occurred for the index of 2, thus table 224 is empty.
[0057] Table 228, created for the index of 3, shows the only transition that occurred for this duration indicator, being from 30 to 40.
[0058] Referring now to
[0059] Each row in each table may then be normalized, obtaining normalized tables 320, 324 and 328. Thus, the second row of table 300 is normalized from {1, 1, 0} to {0.5, 0.5, 0}, and the first row of table 308 is normalized from {0, 2, 1} to {0, 0.67, 0.33}.
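The row normalization can be checked with a few lines of code. The sketch below assumes each table is held as a list of rows of counts, and leaves rows without any transitions as all zeros, which is a choice of this example; the function name `normalize_rows` is hypothetical.

```python
def normalize_rows(table):
    """Divide each row by its sum; rows with no transitions stay all zero."""
    normalized = []
    for row in table:
        total = sum(row)
        normalized.append([n / total if total else 0.0 for n in row])
    return normalized


print(normalize_rows([[1, 1, 0]]))  # [[0.5, 0.5, 0.0]]
print([round(p, 2) for p in normalize_rows([[0, 2, 1]])[0]])  # [0.0, 0.67, 0.33]
```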
[0060] It will be appreciated that representing the data as the tables discussed above is exemplary only and any other data structure may be used to represent the probabilities.
[0061] Referring now to
[0062] An event 340 is received, in which at 1:45 minutes into the measurements a transition from a measurement of 42 to a measurement of 32 occurred.
[0063] On step 348 it is determined that the first measurement of the transition, being 42, is associated with cluster 2 having a centroid of 40.
[0064] On step 352 it is determined that the second measurement of the transition, being 32, is associated with cluster 0 having a centroid of 30.
[0065] On step 356 it is determined that the next Fibonacci number larger than the transition duration, being 1:45 minutes, is 2, which is associated with a Fibonacci index of 1.
[0066] Therefore table 320, associated with Fibonacci index of 1 is examined. The second row is associated with a source cluster having a centroid of 40, and the first entry in the row relates to transition to a destination cluster having a centroid of 30, which has a probability of 0.5.
[0067] Thus, the transition identified in the measurements has a probability of 0.5. Depending on a threshold associated with the device, this probability may or may not indicate an abnormal behavior and an anomaly indicator may or may not be issued to a user. It may be assumed that 0.5 is above the threshold for many cases, since such transition occurs in half the cases, and therefore an anomaly indication will not be provided, but this is not necessarily so.
[0068] It will be appreciated that in some cases multiple transition probabilities may be considered. For example, two or more transitions within a predetermined time period, each having a probability slightly above the threshold may be considered as an anomaly, too.
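One way such a combined check could be realized is sketched below. The disclosure does not define what counts as slightly above the threshold, so the `margin`, `window` and `min_count` parameters, as well as the function name `near_threshold_anomaly`, are hypothetical.

```python
def near_threshold_anomaly(events, threshold, margin, window, min_count=2):
    """Detect several near-threshold transitions within a time window.

    `events` is a list of (timestamp, probability) pairs sorted by
    timestamp. Returns True if at least `min_count` events whose
    probability lies in [threshold, threshold + margin) occur within
    `window` time units of each other.
    """
    times = [t for t, p in events if threshold <= p < threshold + margin]
    for i in range(len(times) - min_count + 1):
        if times[i + min_count - 1] - times[i] <= window:
            return True
    return False


events = [(0, 0.12), (5, 0.5), (8, 0.11)]
print(near_threshold_anomaly(events, threshold=0.1, margin=0.05, window=10))  # True
print(near_threshold_anomaly(events, threshold=0.1, margin=0.05, window=5))   # False
```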
[0069] It will also be appreciated that different thresholds may be associated with different tables or even different rows in the tables. For example, transition to high temperatures which endanger the home appliance may have a lower threshold than other transitions.
[0070] Referring now to
[0071] Computing platform 400 may comprise a storage device 404. Storage device 404 may be a hard disk drive, a Flash disk, a Random Access Memory (RAM), a memory chip, or the like. In some exemplary embodiments, storage device 404 may retain program code operative to cause processor 412 to perform acts associated with any of the subcomponents of computing platform 400.
[0072] In some exemplary embodiments of the disclosed subject matter, computing platform 400 may comprise an Input/Output (I/O) device 408 such as a display, a pointing device, a keyboard, a touch screen, or the like. I/O device 408 may be utilized to provide output to and receive input from a user.
[0073] Computing platform 400 may comprise one or more processor(s) 412. Processor 412 may be a Central Processing Unit (CPU), a microprocessor, an electronic circuit, an Integrated Circuit (IC) or the like. Processor 412 may be utilized to perform computations required by computing platform 400 or any of its subcomponents, such as steps of the method of
[0074] It will be appreciated that processor 412 can be configured to execute several functional modules in accordance with computer-readable instructions implemented on a non-transitory computer-readable storage medium. Such functional modules are referred to hereinafter as comprised in the processor.
[0075] Processor 412 may comprise clustering component 416 for receiving a series of values, for example values of readings of a parameter associated with a device. Clustering component 416 may then determine two or more clusters each having a centroid, such that each value is associated with one of the clusters. Clustering component 416 may use K-means clustering or any other clustering method currently known or that will become known in the future.
[0076] Processor 412 may comprise transition feature extraction component 420 for determining transition within a received series of values, wherein each transition may be associated with a source cluster, a destination cluster and a transition duration.
[0077] Processor 412 may comprise duration indication handling component 424 for discretizing the transition duration, for example using a Fibonacci series.
[0078] Processor 412 may comprise transition probability determination component 428 for determining the probabilities of each transition during training stage 100, for example determining tables 320, 324 and 328.
[0079] Processor 412 may comprise transition probability lookup component 432 for looking up a probability of a given transition, for example during runtime stage 104.
[0080] Processor 412 may comprise anomaly detection component 436 for comparing one or more transition probabilities to thresholds, and determining whether the transition may indicate an abnormal behavior.
[0081] Processor 412 may comprise interface to sensor readings 440 for receiving readings from one or more sensors associated with one or more devices, either during training stage 100 or during runtime stage 104. The readings may be received by directly connecting to the device, from estimating conditions in the environment, by a remote computing platform through a communication channel, or in any other manner.
[0082] Processor 412 may comprise user interface 444 for receiving input from a user or providing output to a user, such as alert indications. User interface 444 may exchange information with a user utilizing I/O device 408.
[0083] The components detailed above may be implemented as one or more sets of interrelated computer instructions, executed for example by processor 412 or by another processor. The components may be arranged as one or more executable files, dynamic libraries, static libraries, methods, functions, services, or the like, programmed in any programming language and under any computing environment.
[0084] It will be appreciated that some components, such as clustering component 416, may not be present on a device coupled to a monitored device, but only on a system used during the training stage 100 for determining the probability tables. On the other hand, components such as transition probability lookup component 432 may be present only in runtime stage 104 in a device coupled to a monitored appliance, or on a remote computing platform accessible from a computing platform receiving the measurements.
[0085] In some embodiments, each device may perform training stage 100 as well as runtime stage 104 for a particular device, in which case all components may be present.
[0086] It is noted that the teachings of the presently disclosed subject matter are not bound by the computing platform described with reference to
[0087] The system can be a standalone entity, or integrated, fully or partly, with other entities, which may be directly connected thereto or via a network.
[0088] It is also noted that whilst
[0089] For purpose of illustration only, the description is provided for devices such as home appliances. Those skilled in the art will readily appreciate that the teachings of the presently disclosed subject matter are, likewise, applicable to any other electrical, mechanical, electro-mechanical or other devices, intended for domestic, industrial, commercial, or other uses.
[0090] It is to be understood that the invention is not limited in its application to the details set forth in the description contained herein or illustrated in the drawings. The invention is capable of other embodiments and of being practiced and carried out in various ways. Hence, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting. As such, those skilled in the art will appreciate that the conception upon which this disclosure is based may readily be utilized as a basis for designing other structures, methods, and systems for carrying out the several purposes of the presently disclosed subject matter.
[0091] It will also be understood that the system according to the invention may be, at least partly, implemented on a suitably programmed computer. Likewise, the invention contemplates a computer program being readable by a computer for executing the method of the invention. The invention further contemplates a non-transitory computer-readable memory tangibly embodying a program of instructions executable by the computer for executing the method of the invention.
[0092] Those skilled in the art will readily appreciate that various modifications and changes can be applied to the embodiments of the invention as hereinbefore described without departing from its scope, defined in and by the appended claims.