ANOMALY DETECTION IN AN ENVIRONMENT OR SYSTEM
20250377417 · 2025-12-11
Assignee
Inventors
Cpc classification
G01V3/087
PHYSICS
G01R33/0029
PHYSICS
International classification
G01R33/00
PHYSICS
G01V13/00
PHYSICS
Abstract
A device and a method for detecting anomalies, including: acquiring a normal signal reflecting a normal state, determining on the basis of the normal signal a probability density which models a normal behavioral state, setting an information filter based on the probability density, the information filter being configured to converge toward a limit value L when it is applied to samples of a normal signal, while increasing its value in response to the detection of an anomaly, setting a threshold value S based on the convergence limit value, acquiring a current signal reflecting the current behavioral state, sampling the current signal in a current series of N samples, computing a result of applying the information filter to the current series of N samples, comparing the result with the threshold value S, an anomaly being detected if the result exceeds the threshold value.
Claims
1. A method for detecting anomalies within an environment or system, comprising the following steps: acquiring a normal signal reflecting the normal state of the environment or system, determining on the basis of the normal signal a probability density which models a normal behavioral state of the environment or system, setting an information filter based on said probability density, said information filter being configured to converge toward a limit value L when it is applied to samples of a normal signal, while increasing its value in response to the detection of an anomaly, setting a threshold value S based on said convergence limit value, acquiring a current signal reflecting the current behavioral state of the environment or system, sampling said current signal in a current series of N samples, computing a result of applying the information filter to said current series of N samples, comparing (E8) said result with said threshold value S, an anomaly being detected if said result exceeds said threshold value.
2. The method according to claim 1, wherein said threshold value S is equal to said limit value L, plus an additional value chosen according to a compromise sought between reliable detection and minimizing the number of false alarms.
3. The method according to claim 1, wherein the information filter corresponds to a first filter designed such that, when it is applied to samples of a normal signal, it converges toward a limit value which corresponds to the entropy of the probability density associated with this normal signal.
4. The method according to claim 3, wherein applying said first filter to said current series of a determined number of N samples consists in computing the natural logarithm of the value of the probability density for each of the samples x.sub.k, then summing these logarithms, and multiplying the result of this sum by the negative factor corresponding to the inverse of said determined number N of samples, according to the following formula:
5. The method according to claim 3, wherein said threshold value S is equal to the value of the entropy H of the probability density, plus a value within an interval ranging from 1% to 20% of the absolute value of said entropy, according to the following formula:
6. The method according to claim 5, wherein said additional value is defined in a range between 10% and 20% of the absolute value of the entropy, in order to achieve a very low probability of false alarms, between 10.sup.-4 and 10.sup.-3.
7. The method according to claim 5, wherein said additional value is defined in a range between 1% and 10% of the absolute value of the entropy, to promote achieving maximum reliability in anomaly detection.
8. The method according to claim 1, wherein the information filter corresponds to a second filter which is based on said first filter, and uses a continuous function which represents an approximation of the data probability density of the current signal over a predetermined sliding window W.sub.i.
9. The method according to claim 8, wherein applying said second filter to said current series of a determined number N of samples consists in computing the natural logarithm of the value of the continuous function for each of the samples, computing the average of the results obtained for these natural logarithms, and adding the result of said average to the result of the application of the first filter to the same current series of samples, according to the following formula:
10. A device for detecting anomalies within an environment or system, comprising: an acquisition module configured to acquire a current signal reflecting the current behavioral state of the environment or system, a processor configured for: sampling said current signal in a current series of N samples, computing a result of applying an information filter to said current series of N samples, said information filter being based on a probability density modeling a normal behavioral state of the environment or system, said information filter being configured to converge toward a limit value when it is applied to samples of a normal signal, while increasing its value in response to the detection of an anomaly, comparing said result with a threshold value based on said convergence limit value, an anomaly being detected if said result exceeds said threshold value, and an output interface configured to indicate the detection of an anomaly.
11. A detection device according to claim 10, wherein the detection device comprises a magnetometer configured to generate said current signal by picking up an ambient magnetic field, comprising the Earth's field and various magnetic disturbances.
12. The detection device according to claim 10, wherein the detection device comprises an accelerometer configured to generate the current signal by picking up the vibrations originating from a machine to be monitored.
13. The detection device according to claim 10, wherein the detection device comprises a counter configured to generate the current signal by measuring the volume of data exchanged at a specific node of a computer network to be monitored.
14. The detection device according to claim 10, wherein the detection device comprises an eddy current probe configured to produce the current signal by measuring the thickness of a pipeline under monitoring.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0047] The present invention will be better understood upon reading the following description of exemplary embodiments, given for merely indicative and non-limiting purposes, with reference to the appended drawings.
[0048]-[0052] [Descriptions of the figures not reproduced.]
DETAILED DISCLOSURE OF PARTICULAR EMBODIMENTS
[0053] The underlying concept of the invention is that of providing a detection technique wherein the threshold is determined intuitively from a predefined value known in advance.
[0055] The device 1 for detecting anomalies comprises an acquisition module 3, a processor 5 which could be a microcontroller, a central processor or a microprocessor, a dedicated memory 7, as well as input/output interfaces 9.
[0056] The detection device 1 is adapted to acquire signals which may be different in nature according to the problem considered. For example, it may pick up magnetic signals via a magnetometer to identify ferromagnetic targets, vibration signals from a vibration sensor for tracking the state of a machine, or acoustic signals to check the integrity of a pipeline, among other applications. Consequently, the device 1 is designed to be equipped with or associated with a suitable sensor 11, according to the specificities of the intended application.
[0057] The detection device 1 is specifically designed to implement a detection method detailed in the diagram of
[0058] Indeed,
[0059] The method is structured around two phases: an initial calibration phase, described by steps E1 to E4, and an operational phase, dedicated to active anomaly detection, detailed in steps E5 to E8.
[0060] In step E1, the acquisition module 3 is configured to acquire a normal signal S.sub.N reflecting the normal state of the environment or system. An example is represented by the time signal in the graph of
[0061] In step E2, the processor 5 is configured to determine a probability density which models a normal behavioral state of the environment or system studied. An example of probability density is given in the graph of
[0062] In step E3, the processor 5 is configured to set an information filter F based on the probability density. The information filter F is set taking into account the probability density and the size N of the window. It can also integrate a continuous function g constructed on the basis of a current signal. The information filter F thus generalizes the two filters I and K described in the embodiments below.
[0063] The information filter F(x.sub.k) converges toward a predefined limit value L when it is applied to samples of a normal signal S.sub.N. Thus, it is simply necessary to use samples of a normal signal S.sub.N to determine this limit value L. According to the example of a first embodiment, explained in relation to
[0064] In step E4, the processor 5 is configured to set a threshold value S based on the convergence limit value L. Since the information filter F(x.sub.k) naturally tends toward a well-determined limit value L when it analyzes a normal signal S.sub.N, and increases in the event of an anomaly, the definition of the threshold value S becomes intuitive: it is set slightly above L so that, in operation, any value of the filter exceeding S signals an anomaly. For illustration, it is possible to model the threshold S as an affine function of L, according to the equation S = αL + ε, where α is a positive or zero adjustment factor and ε is a small positive constant additional value, ensuring a margin above the limit value L for reliable detection of anomalies.
[0065] Advantageously, the threshold value S can simply be equal to the limit value L, plus the constant additional value. This additional value is chosen according to a compromise sought between reliable detection and minimizing the number of false alarms.
[0066] Within the detection device 1, the memory 7 stores the probability density function, the information filter F, the convergence limit value L as well as the value of the threshold S for efficient recall during the detection operations.
[0067] Steps E1 to E4 define the initial calibration phase which, once completed, does not need to be repeated for each new anomaly detection sequence.
[0068] Active anomaly detection starts at step E5, during which the processor 5 is configured to acquire a current signal S.sub.C reflecting the current behavioral state of the environment or system to be monitored.
[0069] In step E6, the processor 5 is configured to sample the current signal S.sub.C in an initial series of N samples using a determined sliding window W.sub.i.
[0070] It should be pointed out that the size of the selected window is adjustable. This ability to adjust the size of the window is explained by the fact that, according to the present invention, the threshold value S is not linked to the adopted window size. The same threshold value S is applied, which ensures uniformity of the detection criterion.
[0071] In step E7, the processor 5 is configured to compute the result of applying the information filter F to the current series of N samples taken from the current signal S.sub.C.
[0072] In step E8, the processor 5 is configured to compare this result with the threshold value S. An anomaly is detected if the result exceeds the threshold value S. The detection of an anomaly can be signaled by the output interface 9 of the detection device 1.
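The operational phase (steps E5 to E8) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the names `sliding_windows` and `detect`, the one-sample window step, and the placeholder filter used in the usage lines are all introduced here and do not come from the description.

```python
from typing import Callable, Iterator, List, Sequence


def sliding_windows(signal: Sequence[float], n: int) -> Iterator[Sequence[float]]:
    """Yield successive windows W_i of n consecutive samples (assumed step: one sample)."""
    for start in range(len(signal) - n + 1):
        yield signal[start:start + n]


def detect(
    signal: Sequence[float],
    n: int,
    information_filter: Callable[[Sequence[float]], float],
    threshold: float,
) -> List[int]:
    """Return the start index of every window whose filter output exceeds the threshold S."""
    return [
        i
        for i, window in enumerate(sliding_windows(signal, n))
        if information_filter(window) > threshold
    ]


# Placeholder filter (NOT an information filter): it only exercises the loop.
def toy_filter(window: Sequence[float]) -> float:
    return max(abs(x) for x in window)


alarms = detect([0.0, 0.0, 0.0, 5.0, 0.0, 0.0], n=2,
                information_filter=toy_filter, threshold=1.0)
print(alarms)  # prints [2, 3]: the two windows that contain the sample 5.0
```

Passing the filter as a callable mirrors the way the description treats F as a general filter that the later embodiments specialize.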
[0074] In order to clarify the presentation of this method, it is explained with reference to the examples illustrated by
[0076] According to this example, the detection device 1 is associated with a magnetometer 11 configured to generate the current signal by picking up an ambient magnetic field, comprising the Earth's field and various magnetic disturbances. For example, the magnetometer 11 is positioned on the Earth's surface in order to pick up variations in the local magnetic field. The recorded signal reflects the Earth's natural magnetic field, while including various interferences and parasitic noise, such as the intrinsic noise of the magnetometer 11 and external geomagnetic disturbances.
[0078] As previously established, the calibration phase, which extends from step E11 to step E14, serves to define the probability density function, the information filter, the convergence limit value and the threshold value.
[0079] In step E11, the acquisition module 3 is configured to acquire a normal signal S.sub.N reflecting the normal state of the environment or system.
[0080] Such a signal S.sub.N is illustrated in
[0081] In step E12, the processor 5 is configured to determine the probability density which models the normal behavioral state of the environment or system studied.
[0082] By way of example,
[0083] Based on these parameters, the probability density can be expressed as follows:
[0084] Let us consider, for example, the normal signal of the Earth's magnetic field, observed over a period of 1000 minutes as illustrated in
[0085] It will be noted that the probability density function representing the normal behavior of a signal can be represented either by a continuous function, or by a histogram, according to the specific nature of the signal studied.
[0086] In step E13, the processor 5 is configured to set a mean information filter I, named first filter, which is based on the probability density function. This first filter is designed such that, when it is applied to a normal signal S.sub.N, it converges toward a limit value which corresponds to the entropy of the probability density associated with this normal signal.
[0087] Applying the first filter to an initial series or a current series of a determined number N of samples consists in computing the natural logarithm of the value of the probability density for each of the samples x.sub.k in the set of N samples. The sum of these logarithms is then obtained, and the total is multiplied by the negative of the inverse of the number N of samples, that is by -1/N, according to the following formula:
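Writing p for the probability density of the normal signal (a notation assumed here) and x.sub.1 to x.sub.N for the samples of the window W.sub.i, the computation described in words above can be written, as a sketch in LaTeX notation:

```latex
I(W_i) = -\frac{1}{N} \sum_{k=1}^{N} \ln p(x_k)
```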
[0088] The value to which the output of the first filter tends, when the number of samples of the normal signal S.sub.N increases indefinitely, corresponds to the entropy of the probability density, denoted as H. This relationship is described by the following formula:
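With p denoting the probability density (an assumed notation) and assuming a continuous density with independent samples drawn from it, this limit can be sketched as follows; for a density represented by a histogram, the integral would become a sum over the bins:

```latex
\lim_{N \to \infty} I(W_i) = H(p) = -\int_{X} p(x) \, \ln p(x) \, dx
```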
where X corresponds to the set of possible values for x.
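The convergence stated above can be checked numerically. The sketch below assumes, purely for illustration, that the normal signal is standard Gaussian noise, for which the entropy has the closed form 0.5 ln(2πe); the function names are hypothetical.

```python
import math
import random


def gaussian_log_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Natural logarithm of the Gaussian density, computed directly to avoid underflow."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def first_filter(samples, log_pdf) -> float:
    """I = -(1/N) * sum over the samples of ln p(x_k)."""
    return -sum(log_pdf(x) for x in samples) / len(samples)


# Closed-form differential entropy of a Gaussian with sigma = 1.
entropy = 0.5 * math.log(2.0 * math.pi * math.e)

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (100, 1_000, 20_000):
    samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
    value = first_filter(samples, gaussian_log_pdf)
    print(f"N={n:6d}  I={value:.4f}  H={entropy:.4f}")
```

The gap between I and H shrinks as N grows, which is the behavior the threshold definition relies on.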
[0089] In step E14, the processor 5 is configured to set the threshold value S based on the value H of the entropy of the probability density associated with the normal signal S.sub.N.
[0090] Advantageously, the threshold value S is equal to the value of the entropy H of the probability density, increased by a value within an interval ranging from 1% to 20% of the absolute value of the entropy H, according to the following formula:
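As a sketch in LaTeX notation, writing p for the probability density and ε for the additional value (both symbols are assumptions made here), the rule stated above reads:

```latex
S = H(p) + \varepsilon, \qquad 0.01 \, |H(p)| \le \varepsilon \le 0.20 \, |H(p)|
```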
[0091] Thus, the choice of the threshold S is very simple: it suffices to compute the entropy H of the probability density, and to take this entropy plus a small value as the threshold value.
[0092] It should be pointed out that raising the additional value reduces the number of false alarms, but it can also reduce detection sensitivity and reliability. Conversely, lowering the additional value enhances reliability thanks to better sensitivity, in exchange for an increase in false alarms. Consequently, this value can be set according to a balance between the correct detection rate and the risk of false alarms, according to the specific requirements of each application.
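The trade-off described above can be illustrated with a small Monte Carlo experiment. The sketch below is a toy setup under stated assumptions: the normal signal is standard Gaussian noise, the window holds N = 50 samples, and the function names, trial count and seed are arbitrary choices made here. The rates it produces depend on these choices and are not the figures quoted in the description.

```python
import math
import random


def first_filter(samples) -> float:
    """I = -(1/N) * sum of ln p(x_k), for a standard Gaussian density p (assumed)."""
    log_norm = 0.5 * math.log(2.0 * math.pi)
    return sum(0.5 * x * x + log_norm for x in samples) / len(samples)


def false_alarm_rate(margin: float, n: int = 50, trials: int = 2000, seed: int = 0) -> float:
    """Fraction of anomaly-free windows whose filter output exceeds S = H + margin * |H|."""
    entropy = 0.5 * math.log(2.0 * math.pi * math.e)
    threshold = entropy + margin * abs(entropy)
    rng = random.Random(seed)  # same seed for every margin, so the windows are identical
    alarms = 0
    for _ in range(trials):
        window = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if first_filter(window) > threshold:
            alarms += 1
    return alarms / trials


rates = {margin: false_alarm_rate(margin) for margin in (0.01, 0.10, 0.20)}
for margin, rate in rates.items():
    print(f"margin {margin:.0%}: empirical false-alarm rate {rate:.4f}")
```

Because every margin is evaluated on the same simulated windows, the empirical false-alarm rate can only decrease or stay equal as the margin grows.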
[0093] According to a first example, the additional value is defined in a range between 10% and 20% of the absolute value of the entropy, in order to achieve a very low probability of false alarms, between 10.sup.-4 and 10.sup.-3.
[0094] According to a further example, the additional value is defined in a range between 1% and 10% of the absolute value of the entropy, to promote achieving maximum reliability in anomaly detection.
[0095] For illustration purposes, let us take the entropy H computed for the probability density shown in
[0096] Steps E11 to E14 constitute the initial calibration phase. Once this phase is completed, the operational detection of anomalies is launched, starting with step E15.
[0097] The remainder of the description is continued through
[0098] In step E15, the processor 5 is configured to acquire a current signal S.sub.C reflecting the current behavioral state of the environment or system to be monitored. This current signal S.sub.C is shown in
[0099] In step E16, the processor 5 is configured to sample the current signal S.sub.C in a series of N samples using a determined sliding window W.sub.i.
[0100] In step E17, the processor 5 is configured to compute the result I(W.sub.i) of applying the first information filter to the current series of N samples. Indeed, the processor 5 first evaluates the probability density for the data x.sub.k from the current signal S.sub.C illustrated in
[0101] In step E18, the processor 5 is configured to compare this result I(W.sub.i) with the threshold value S. An anomaly is detected if the result I(W.sub.i) exceeds the threshold value S: I(W.sub.i) > S.
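Steps E11 to E18 of this first embodiment can be sketched end to end as follows. This is a hedged illustration, not the implementation of the description: it assumes a Gaussian model for the normal signal, synthetic data in arbitrary units, a 20% margin, a window of N = 100 samples and a hypothetical disturbance of +8 units; all function and variable names are introduced here.

```python
import math
import random


def fit_gaussian(samples):
    """Calibration (E12): estimate mean and standard deviation of the normal signal."""
    n = len(samples)
    mu = sum(samples) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in samples) / n)
    return mu, sigma


def first_filter(window, mu, sigma):
    """I(W_i) = -(1/N) * sum of ln p(x_k), with p the fitted Gaussian density."""
    log_norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(0.5 * ((x - mu) / sigma) ** 2 + log_norm for x in window) / len(window)


rng = random.Random(1)

# Calibration phase (E11 to E14) on a synthetic normal signal.
normal_signal = [rng.gauss(50.0, 2.0) for _ in range(5000)]
mu, sigma = fit_gaussian(normal_signal)
entropy = 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)  # Gaussian entropy
threshold = entropy + 0.20 * abs(entropy)                      # S = H + 20% of |H|

# Operational phase (E15 to E18) on a current signal with a hypothetical disturbance.
current_signal = [rng.gauss(50.0, 2.0) for _ in range(600)]
for k in range(300, 400):
    current_signal[k] += 8.0

n = 100
flags = [
    first_filter(current_signal[i:i + n], mu, sigma) > threshold
    for i in range(len(current_signal) - n + 1)
]
print("first flagged window:", flags.index(True) if any(flags) else None)
```

Windows that only partly overlap the disturbed segment are flagged as soon as enough disturbed samples raise the mean log-density term above S.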
[0104] This method differs from that described in
[0105] In step E21, the acquisition module 3 is configured to acquire a normal signal S.sub.N reflecting the normal state of the environment or system.
[0106] In step E22, the processor 5 is configured to determine on the basis of the normal signal S.sub.N the probability density which models the normal behavioral state of the environment or of the studied system.
[0107] In step E23, the processor 5 is configured to acquire a current signal S.sub.C reflecting the current behavioral state of the environment or system to be monitored.
[0108] In step E24, the processor 5 is configured to sample the current signal S.sub.C in order to obtain a series of N samples, using a predefined sliding window W.sub.i.
[0109] In step E25, the processor 5 is configured to approximate the probability density of the data of the current signal S.sub.C on this sliding window W.sub.i by a continuous probability density g. The estimation of the probability density g can be performed using a histogram or by the non-parametric kernel estimation approach, also known as Kernel Density Estimation (KDE). KDE uses functions referred to as kernels to assign local weights, and makes it possible to obtain smoothing of data from a finite set of samples. This technique is particularly useful when the number of samples N is limited.
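A kernel density estimate of this kind can be sketched with the standard library alone. The sketch assumes Gaussian kernels and, for the bandwidth, the common normal reference rule of thumb h = 1.06 σ N^(-1/5); neither choice is specified in the description, and the function names are hypothetical.

```python
import math
from typing import Callable, Sequence


def normal_reference_bandwidth(samples: Sequence[float]) -> float:
    """Rule-of-thumb bandwidth h = 1.06 * std * N**(-1/5) (an assumed choice)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)


def gaussian_kde(samples: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return g(x) = 1/(N*h) * sum_k phi((x - x_k)/h), with phi the standard normal density."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def g(x: float) -> float:
        return norm * sum(math.exp(-0.5 * ((x - xk) / bandwidth) ** 2) for xk in samples)

    return g


# Usage on three samples with a hand-picked bandwidth.
g = gaussian_kde([0.0, 1.0, 2.0], bandwidth=0.5)
print(g(1.0) > g(5.0))  # the estimate is larger near the samples than far from them
```

The bandwidth controls the amount of smoothing: a larger value gives a smoother g at the cost of blurring narrow features of the window.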
[0110] According to this second embodiment, applying the second filter, denoted as K, to an initial calibration series or a current series of a determined number N of samples by the processor 5 consists in first computing the natural logarithm of the value of the continuous function g for each of the samples. Then, the processor 5 proceeds to compute the arithmetic mean of the results obtained for these natural logarithms. This mean is subsequently added to the result of applying the first filter to the same series of samples. The formula used by the processor 5 is as follows:
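Writing p for the probability density of the normal signal (an assumed notation) and I(W.sub.i) for the output of the first filter, the computation described in words above can be sketched in LaTeX notation as:

```latex
K(W_i) = I(W_i) + \frac{1}{N} \sum_{k=1}^{N} \ln g(x_k)
       = \frac{1}{N} \sum_{k=1}^{N} \ln \frac{g(x_k)}{p(x_k)}
```

The second form is a sample average of the log-ratio between g and p, which gives an intuition for why the value stays close to zero when g matches the normal density.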
[0111] When this second filter is applied to the samples of the normal signal S.sub.N, its value converges toward zero. Furthermore, the value of the second filter increases in the presence of an anomaly, as for the first embodiment.
[0112] In step E26, the processor 5 is configured to set the threshold value S which, since the limit value of the second filter is zero, is expressed simply as S = ε, where ε is a small positive additional value.
[0113] Therefore, determining the threshold S is even simpler than in the first embodiment. It is not necessary to compute the entropy H: it suffices to take a small fixed value. The latter must still be adjusted according to a compromise between the probability of correct detection and the probability of a false alarm.
[0114] In step E27, the processor 5 is configured to compute the value K(W.sub.i) resulting from applying the second information filter to the current series of N samples.
[0115] Finally, in step E28, the processor 5 is configured to compare the value K(W.sub.i) obtained previously with the threshold S. The detection of an anomaly is confirmed if K(W.sub.i) exceeds the threshold value S: K(W.sub.i) > S.
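The second embodiment can be sketched as follows. The sketch makes several assumptions that are not in the description: the normal density is a standard Gaussian, g is a Gaussian-kernel estimate built from the window itself with a hand-picked bandwidth of 0.4, and the threshold is set to 0.3; all names are hypothetical. With a finite window, the value obtained on a normal signal is close to zero but not exactly zero.

```python
import math
import random


def gaussian_log_pdf(x: float, mu: float, sigma: float) -> float:
    """ln p(x) for the Gaussian density modeling the normal state (assumed model)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def kde_log_density(window, bandwidth: float):
    """Return ln g, where g is a Gaussian-kernel estimate built from the window itself."""
    log_norm = math.log(len(window) * bandwidth * math.sqrt(2.0 * math.pi))

    def log_g(x: float) -> float:
        total = sum(math.exp(-0.5 * ((x - xk) / bandwidth) ** 2) for xk in window)
        return math.log(total) - log_norm

    return log_g


def second_filter(window, mu: float, sigma: float, bandwidth: float) -> float:
    """K(W_i) = (1/N) * sum_k [ln g(x_k) - ln p(x_k)], i.e. I(W_i) plus the mean of ln g."""
    log_g = kde_log_density(window, bandwidth)
    return sum(log_g(x) - gaussian_log_pdf(x, mu, sigma) for x in window) / len(window)


rng = random.Random(2)
mu, sigma = 0.0, 1.0   # normal-state density, assumed already calibrated
threshold = 0.3        # S: a small fixed value, picked arbitrarily here
bandwidth = 0.4        # KDE bandwidth, picked by hand

normal_window = [rng.gauss(mu, sigma) for _ in range(200)]
anomalous_window = [x + 3.0 for x in normal_window]  # hypothetical disturbance

k_normal = second_filter(normal_window, mu, sigma, bandwidth)
k_anomalous = second_filter(anomalous_window, mu, sigma, bandwidth)
print(f"K(normal)={k_normal:.3f}  K(anomalous)={k_anomalous:.3f}  S={threshold}")
```

Only the windowed estimate g changes with the data; the calibrated density p stays fixed, so the filter measures how far the current window has drifted from the normal state.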
[0116] It should be noted that the present invention, illustrated by a detailed example associated with
[0117] A first application example is the detection of attacks in a computer network. For this situation, a relevant time series, denoted as X.sub.t,T (where t refers to time), records the amount of data, measured in bytes or bits, transiting through a node of the network over the time interval [t; t+T]. The sensor used is then a counter configured to generate the current signal by measuring the volume of data exchanged at a specific node of the computer network to be monitored.
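Such a counter can be sketched as a function that aggregates packet records into the series X.sub.t,T. The sketch assumes packets are given as (timestamp, size in bytes) pairs and that the intervals are consecutive, non-overlapping and half-open, so that each packet is counted once; the function name and the sample data are hypothetical.

```python
from typing import Iterable, List, Tuple


def traffic_volume_series(
    packets: Iterable[Tuple[float, int]],
    start: float,
    interval: float,
    count: int,
) -> List[int]:
    """Return the bytes seen in each interval [t, t + T) for t = start, start + T, ..."""
    volumes = [0] * count
    for timestamp, size_bytes in packets:
        index = int((timestamp - start) // interval)
        if 0 <= index < count:  # ignore packets outside the observed span
            volumes[index] += size_bytes
    return volumes


# Usage: four packets observed over four one-second intervals.
packets = [(0.2, 100), (0.9, 50), (1.5, 1500), (3.1, 40)]
series = traffic_volume_series(packets, start=0.0, interval=1.0, count=4)
print(series)  # prints [150, 1500, 0, 40]
```

The resulting list plays the role of the current signal that is then sampled window by window.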
[0118] A second example relates to the detection of anomalies in machines. For this purpose, a suitable sensor could be an accelerometer or an acoustic sensor, either being configured to generate the current signal by detecting the vibrations emitted by the machine to be monitored.
[0119] A third example is non-destructive testing of metal pipelines. In this scenario, a suitable sensor is an eddy current probe, which is configured to generate the current signal by measuring the thickness of a pipeline under monitoring.