SELF-TUNING EVENT DETECTION

Abstract

A method is provided for processing environmental sensor data comprising the following steps. One or more raw data values are received from an environmental sensor, an average value and a measure of dispersion are determined over a defined time period only from raw data values between a lower threshold and an upper threshold, and the lower threshold and the upper threshold are redefined depending on the average value and the measure of dispersion. The method may be used in an environmental sensor, e.g. a MOX sensor or a VOC sensor, and implemented as a computer program.

Claims

1. Method for processing environmental sensor data, comprising the steps of receiving from an environmental sensor one or more raw data values, determining an average value and a measure of dispersion over a defined time period only from raw data values between a lower threshold and an upper threshold, redefining the lower threshold and the upper threshold depending on the average value and the measure of dispersion, in response to a gating adaptation event determining the average value and the measure of dispersion dependent on the raw data value received for the corresponding time step even if not between the lower threshold and the upper threshold.

2. Method according to claim 1, wherein one of the raw data values is received per discrete time step, wherein the average value and the measure of dispersion are determined anew per time step dependent on the raw data value received at the corresponding time step, wherein the lower threshold and the upper threshold are redefined anew per time step dependent on the average value and the measure of dispersion determined for the corresponding time step.

3. Method according to claim 2, wherein the lower threshold is the average value minus the measure of dispersion, wherein the upper threshold is the average value plus the measure of dispersion, wherein the measure of dispersion corresponds to one or two times a standard deviation of the raw data values, and/or wherein the average value corresponds to an arithmetic mean of the raw data values, and/or wherein the defined time period is at least one hour.

4. Method according to claim 1, additionally comprising the steps of determining normalized data values from the raw data values depending on the average value and/or the measure of dispersion, and outputting the normalized data values, determining the normalized data value anew per time step from the corresponding raw data value depending on the average value and/or the measure of dispersion determined for the corresponding time step.

5. Method according to claim 1, wherein the average value and the measure of dispersion are determined recursively each.

6. Method according to claim 5, wherein the average value is determined by $av_t = \alpha \cdot av_{t-1} + (1-\alpha) \cdot rdv_t$ with $av_t$ as average value at time t, $av_{t-1}$ as average value at time t−1, $\alpha$ as smoothing factor, and $rdv_t$ as raw data value received at time t, wherein the measure of dispersion is determined by $\sigma_t = \sqrt{\alpha \cdot \sigma_{t-1}^2 + (1-\alpha)(rdv_t - av_{t-1})^2}$ with $\sigma_t$ as standard deviation at discrete time step t, $\sigma_{t-1}$ as standard deviation at discrete time step t−1, $av_{t-1}$ as average value at discrete time step t−1, $\alpha$ as smoothing factor, and $rdv_t$ as raw data value received at discrete time step t.

7. Method according to claim 1, wherein in response to the gating adaptation event (36), between the gating adaptation event and a gating adaptation disengagement event and as long as the defined time period exceeds an interval between the gating adaptation event and the gating adaptation disengagement event, the average value and the measure of dispersion are determined dependent on all raw data values even if not between the lower threshold and the upper threshold received on or after the gating adaptation event and dependent on the last average value and the last measure of dispersion determined prior to the gating adaptation event.

8. Method according to claim 1, determining for a monitoring time period a ratio between a number of raw data values that are not between the lower threshold and the upper threshold and a total number of raw data values, and setting the gating adaptation event if the ratio is larger than a maximum ratio.

9. Method according to claim 1, setting the gating adaptation event if for a monitoring time period no raw data value is between the lower threshold and the upper threshold.

10. Method according to claim 7, setting the gating adaptation disengagement event after a predefined period in time since the gating adaptation event, in response to the gating adaptation disengagement event determining the average value and the measure of dispersion dependent on the raw data value received for the corresponding time step only if between the lower threshold and the upper threshold.

11. Method according to claim 10, after the gating adaptation disengagement event, determining the average value and the measure of dispersion per time step dependent only from the raw data values between the lower threshold and the upper threshold and received after the gating adaptation disengagement event and the last average value and the last measure of dispersion determined prior to the gating adaptation disengagement event.

12. Method according to claim 1, additionally comprising the steps of determining weights of a weighting function for the raw data values depending on the average value and the measure of dispersion, applying the weights to the raw data values when determining the average value and the measure of dispersion.

13. Method according to claim 12, wherein the weights are largest at or around the average value.

14. Method according to claim 13, wherein the weights increase monotonically for raw data values between zero and the average value and/or wherein the weights decrease monotonically for raw data values between the average value and infinity.

15. Method according to claim 1, comprising the steps of at the beginning, receiving initial values for the lower threshold and the upper threshold, iterating the steps of the method according to claim 1.

16. Method according to claim 1, comprising the steps of at the beginning, receiving from the environmental sensor initial raw data values, determining an average value and a measure of dispersion from the initial raw data values, defining a lower threshold and an upper threshold depending on the average value and the measure of dispersion, iterating the steps of the method according to claim 1.

17. Method according to claim 1, comprising the steps of in response to the receiving of the raw data value, determining if the received raw data value is between the lower threshold and the upper threshold, in response to determining if the received raw data value is between the lower threshold and the upper threshold determining the average value and the measure of dispersion over the defined time period only from raw data values between the lower threshold and the upper threshold thereby including the received raw data value only if between the lower threshold and the upper threshold, thereby excluding the received raw data value from the determination of the average value and the measure of dispersion if not between the lower threshold and the upper threshold, in response to determining the average value and the measure of dispersion redefining the lower threshold and the upper threshold depending on the determined average value and the determined measure of dispersion.

18. A computer program product comprising instructions which, when the program is executed by a processor, cause the processor to execute the steps of the method according to claim 1.

19. An environmental sensor comprising a sensor and a processor adapted to execute the steps of the method according to claim 1.

20. The environmental sensor of claim 19, wherein the sensor comprises a MOX sensor, and/or wherein the sensor comprises a VOC sensor.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0060] The invention will be better understood and objects other than those set forth above will become apparent from the following detailed description thereof. Such description makes reference to the annexed drawings, wherein:

[0061] FIG. 1 shows a time series of raw data and of different thresholds determined from the raw data in a conventional way and, respectively, in a method with gating according to an embodiment of the invention;

[0062] FIG. 2 shows a flow chart of a method for processing environmental sensor data according to an embodiment of the invention;

[0063] FIG. 3 shows a time series of raw data and of different thresholds applying “normal gating” and “robust gating”, respectively, according to embodiments of the invention;

[0064] FIG. 4 shows a flow chart of a method for processing environmental sensor data applying robust gating according to an embodiment of the invention;

[0065] FIGS. 5a and 5b show time series of two raw data signals each and of corresponding output signals when gating with a “hard threshold” is applied (FIG. 5a) and when a “soft threshold” is applied (FIG. 5b) according to embodiments of the invention;

[0066] FIG. 6 shows a weighting function with weights depending on the raw signal value for use in a method for processing environmental sensor data applying gating with a “soft threshold” according to an embodiment of the invention.

MODES FOR CARRYING OUT THE INVENTION

[0067] FIG. 1 shows an example time series of a raw signal 11, i.e. raw data values, as measured by an environmental sensor and optionally pre-processed, e.g. by a calibration algorithm. In the example time series, the raw signal 11 fluctuates around some constant value (not shown), while some of the raw data values are far off, i.e. significantly higher or lower than the constant value. Such significantly deviating raw data values, which are e.g. deviating from a mean/average value of the raw signal by more than one or more than two standard deviations, are denoted as outliers.

[0068] Further, FIG. 1 shows a lower threshold and an upper threshold 12 as determined without any gating. The threshold value 12 at a certain time is calculated as the mean value plus (for the upper threshold), and respectively minus (for the lower threshold), the standard deviation of all raw data values over a defined time period before the certain time. It is a known problem that the threshold 12 heavily depends on outliers, i.e. a single significantly deviating raw data value strongly influences the threshold 12 as shown in FIG. 1.

[0069] It is proposed to apply a gating algorithm when determining the threshold, the result of which is shown as lower threshold and upper threshold 13 in FIG. 1. Gating means that in the determination of the lower and upper threshold 13 at a certain time only raw data values are taken into account which are between an earlier determined lower and upper threshold. This means that outliers are excluded from the determination of the mean and the standard deviation and consequently of the new lower and upper threshold. While both thresholds 12 and 13 adapt to the raw data values over time, i.e. they are not constant in time, the threshold 13 with “self-tuning” gating is more robust against outliers. Such threshold 13 with self-tuning gating may be obtained by the method depicted in FIG. 2.

[0070] FIG. 2 shows a flow chart of a method for processing environmental sensor data according to an embodiment of the invention. The method comprises the steps S1, S2 and S3 as well as optional steps S4 and S5. The steps may be iterated and e.g. be implemented as a loop in a computer program. In step S1, a “new” raw data value rdv is received from an environmental sensor, such as a MOX gas sensor or a VOC sensor. With condition C1, it is checked whether the new raw data value rdv falls within the interval between the lower threshold lt and the upper threshold ut. Condition C1 presupposes that a lower and an upper threshold are given, e.g. from an earlier determination step or as initial threshold values. Only if the new raw data value rdv fulfills the criterion lt<=rdv<=ut does the processing proceed with step S2; otherwise it jumps back to step S1. Hence condition C1 may be seen as the “gate” in the gating algorithm.

[0071] In step S2, the arithmetic mean av and the standard deviation σ of the raw data values that fulfil condition C1 are determined. In the example of FIG. 1, the mean and the standard deviation are calculated over the defined time period, e.g. an hour or a day. When the method of FIG. 2 is iterated, this amounts to a moving window for which the mean and the standard deviation are determined, preferably in a recursive way, in which the updated average value av is determined by

$av_t = \alpha \cdot av_{t-1} + (1-\alpha) \cdot rdv_t$

[0072] in case the raw data value $rdv_t$ lies between the lower threshold lt and the upper threshold ut or is equal to one of them, i.e. “C1=yes”. In case the raw data value lies outside the lower and the upper threshold lt and ut, i.e. “C1=no”, the updated average value $av_t$ is set to the previous average value: $av_t = av_{t-1}$.
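The recursive update of paragraphs [0071] and [0072] may be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `update_stats` and the default smoothing factor are hypothetical, the recursion for the standard deviation is taken from claim 6, and holding the standard deviation unchanged for gated-out values is an assumption made by analogy with the average value.

```python
import math

def update_stats(av_prev, sigma_prev, rdv, lt, ut, alpha=0.99):
    """One recursive update of average value and standard deviation.

    Returns (av, sigma). alpha is the smoothing factor; its default
    here is an arbitrary illustrative choice.
    """
    if not (lt <= rdv <= ut):
        # C1 = no: the raw data value is gated out, so the previous
        # estimates are kept (assumed to apply to sigma as well).
        return av_prev, sigma_prev
    # C1 = yes: exponential smoothing of the mean ...
    av = alpha * av_prev + (1.0 - alpha) * rdv
    # ... and of the variance, using the previous mean as in claim 6.
    var = alpha * sigma_prev ** 2 + (1.0 - alpha) * (rdv - av_prev) ** 2
    return av, math.sqrt(var)
```

Only the two previous estimates need to be stored, which reflects the saving in storage space that a recursive determination offers over an explicit moving window.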

[0073] In general, the mean may be replaced by any average value, e.g. a median which typically is a more robust measure of average than the arithmetic mean, and the standard deviation may be replaced by any measure of dispersion, e.g. an interquartile range. In a specific embodiment, the average value and the measure of dispersion may be calculated recursively, e.g. by exponential smoothing functions as described above, which saves computing power and storage space.

[0074] In step S3, the lower threshold lt and the upper threshold ut are redefined depending on the average value and the measure of dispersion determined in step S2. This may be done by the relations lt=av−nσ and ut=av+nσ, wherein n is a constant factor which is typically in the range between 1 and 3, e.g. 2. Alternatively, the lower threshold lt may be chosen to be a percentile, e.g. the 10th or 25th percentile, of the raw data, and the upper threshold ut to be a higher percentile, e.g. the 90th or 75th percentile.
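Both variants of step S3 may be illustrated as below. The function names are hypothetical, and the percentile variant assumes that a buffer of recent raw data values is available and that the "inclusive" interpolation of Python's `statistics.quantiles` is an acceptable percentile definition; the text does not prescribe one.

```python
import statistics

def sigma_thresholds(av, sigma, n=2.0):
    """lt = av - n*sigma, ut = av + n*sigma, with n typically 1 to 3."""
    return av - n * sigma, av + n * sigma

def percentile_thresholds(values, lower_pct=10, upper_pct=90):
    """Thresholds as percentiles of buffered raw data values.

    quantiles(..., n=100) returns 99 cut points; cut point k-1 is
    the k-th percentile.
    """
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]
```

The percentile variant needs the stored values of the window, whereas the relation based on av and σ works with the recursive estimates alone.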

[0075] The lower threshold lt and the upper threshold ut redefined in step S3 replace their former values. Thus they are taken into account in the gate condition C1 in the next iteration of the loop depicted in FIG. 2.

[0076] In a variant, the method of FIG. 2 also comprises one or both of the optional steps S4 and S5. In step S4, a normalized data value ndv is determined from the raw data value rdv. A typical normalization is the subtraction of the average value as ndv=rdv−av. Alternatively, the normalized data value may additionally depend on the measure of dispersion, e.g. as ndv=(rdv−av)/σ. Such normalized data value is useful in that it better shows relative changes in the measured quantity than the raw data value. Thus the normalized data values are naturally suited as output quantities, e.g. for a rapid perception of relevant changes by the user and for a good user experience.
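The two normalizations of step S4 may be written as a small helper. The function name `normalize` and the error raised for a non-positive measure of dispersion are assumptions added for illustration.

```python
def normalize(rdv, av, sigma=None):
    """Normalized data value ndv for step S4.

    Without sigma: ndv = rdv - av.
    With sigma:    ndv = (rdv - av) / sigma.
    """
    if sigma is None:
        return rdv - av
    if sigma <= 0:
        # Guard added for this sketch; the text does not discuss it.
        raise ValueError("measure of dispersion must be positive")
    return (rdv - av) / sigma
```

For instance, with av=10 and σ=4, a raw data value of 12 is mapped to 2 by the first form and to 0.5 by the second.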

[0077] Accordingly, the normalized data value ndv is output in optional step S5. The output may be implemented as numerical value which is stored or transferred to a remote system via an interface. Or the normalized data value ndv may be output by means of a display, e.g. as graph showing the normalized data values over time.

[0078] FIG. 3 again shows a time series of a raw signal 31, i.e. a plurality of raw data values rdv. In this case, the raw signal 31 shows an abrupt increase at point 35, which is due to a change of environment, e.g. because the environmental sensor is transferred to another place or because a gas concentration at the environmental sensor changes abruptly. For the present purpose, an abrupt change is defined as a change in raw data values by more than the measure of dispersion, e.g. by more than one standard deviation.

[0079] Further, FIG. 3 shows lower and upper thresholds 33 as obtained by the above described method of FIG. 2. After point 35, at which time the environment changes, all raw data values fall outside the range between the lower and upper threshold 33. This means that none of the later raw data values passes the gate condition C1, and that the above method never reaches steps S2 and S3. Hence the average value av and the measure of dispersion σ are not re-determined, and the lower and upper threshold 33 are not redefined, such that all later raw data values are treated as outliers.

[0080] Such behaviour is undesired, especially if the later raw data values represent a new environmental state which is stable at least for some time, e.g. 1 min, 10 min or 1 h. In the case of normalized data values and an output according to steps S4 and S5, such situation leads to useless results. Either the normalization as described above works insufficiently and yields unreasonable results or, even worse, the outliers may not even be displayed at all.

[0081] As a remedy for such situation, a “robust gating” is proposed which leads to the lower and upper thresholds 34 in FIG. 3. In robust gating, the concept of a gating adaptation event 36 is introduced. At the gating adaptation event 36, the gating is suspended, i.e. condition C1 is removed from the method of FIG. 2, and all raw data values rdv are taken into account for the determination of the average value av and the measure of dispersion σ in step S2. In this way, the lower and upper thresholds 34 will adapt to the raw data values on the different level due to the changed environment, but only after the gating adaptation event 36. Hence robust gating facilitates reliable sensor readings and state estimation even under abruptly changing environmental conditions.

[0082] One way of implementing the gating adaptation event 36 is looking at the time period for which no raw data value rdv fulfilled the gating criterion C1. If this time period exceeds a monitoring time period 37, e.g. 1 min, 10 min or an hour, then the gating adaptation event 36 is triggered.

[0083] An embodiment of a method for processing environmental sensor data applying robust gating is shown in the flow chart of FIG. 4. Step S1 and condition C1 are analogous to those in the method of FIG. 2. Criterion C2 represents the check whether gating adaptation is enabled. If the gating adaptation is enabled, i.e. there has been a gating adaptation event, a raw data value rdv after being received in step S1 directly goes into the re-determination of the average value av and the measure of dispersion σ in step S2, regardless of whether it would pass the gating criterion C1.

[0084] On the other hand, if the gating adaptation is not enabled, it is checked whether the raw data value rdv falls within the interval between the lower threshold and the upper threshold in criterion C1. If yes, rdv is directly passed to step S2 and contributes to the re-determination of av and σ. If no, it is checked with criterion C3 whether the conditions for a gating adaptation event are fulfilled. If the conditions for the gating adaptation event are fulfilled in C3, the gating adaptation event is triggered in step S6. This means that in step S6 the gating adaptation is enabled. If the conditions for the gating adaptation event are not fulfilled in C3, the raw data value rdv is not taken into account in the re-determination of av and σ, and the processing starts again with step S1, i.e. receiving a new raw data value.

[0085] As shown in the embodiment of FIG. 3, the condition for the presence of a gating adaptation event in criterion C3 may be formulated as: Does the time period since the last raw data value fulfilled criterion C1 exceed the monitoring time period? Alternatively, the gating adaptation criterion C3 may comprise the determination of a ratio between a number of raw data values not fulfilling the gating criterion C1 and a total number of raw data values during the monitoring time period. In that case, the gating adaptation event is triggered if the ratio is larger than a maximum ratio, e.g. 10%, 25% or 33%.
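The two formulations of criterion C3 may be sketched with a sliding window over the most recent gate decisions. The class name `AdaptationMonitor` is hypothetical, and expressing the monitoring time period as a number of time steps is an assumption that fits the case of one raw data value per discrete time step.

```python
from collections import deque

class AdaptationMonitor:
    """Evaluates criterion C3 over the last `window` time steps."""

    def __init__(self, window, max_ratio=None):
        # Each entry is True if the raw data value failed criterion C1.
        self.flags = deque(maxlen=window)
        # None selects the "no value passed C1" formulation.
        self.max_ratio = max_ratio

    def observe(self, passed_c1):
        """Record one gate decision; return True if C3 is fulfilled."""
        self.flags.append(not passed_c1)
        if len(self.flags) < self.flags.maxlen:
            return False  # monitoring time period not yet filled
        if self.max_ratio is None:
            return all(self.flags)
        return sum(self.flags) / len(self.flags) > self.max_ratio
```

With `max_ratio=None` the event is triggered only after a full window without any value passing C1; with e.g. `max_ratio=0.25` it is triggered as soon as more than a quarter of the values in the window were gated out.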

[0086] As it is not reasonable to take into account all raw data values for an infinite time after one gating adaptation event is triggered in step S6, it is useful to define a gating adaptation disengagement event 38 as shown in FIG. 3. The gating adaptation disengagement event 38 causes the data processing to take the normal “gated” route, i.e. after such event 38 newly received raw data values contribute to the re-determination of the average value, the measure of dispersion and the lower and upper thresholds only if they fulfil the gating criterion C1. In the embodiment of FIG. 3, the gating adaptation disengagement event 38 is triggered when a predefined period in time 39, e.g. in the range of 10 min to an hour, has passed since the last gating adaptation event 36.

[0087] In terms of the method depicted in FIG. 4, the gating adaptation disengagement event 38 may be represented as gating adaptation disengagement criterion C4, e.g. as a check whether the time since the last gating adaptation event 36 exceeds the predefined period 39. Alternatively, the gating adaptation disengagement criterion C4 may be formulated as whether a certain number of raw data values received since the last gating adaptation event 36 have passed the gating criterion C1.

[0088] If the outcome of the check of criterion C4 is positive, the gating adaptation disengagement event 38 is triggered, the gating adaptation is disabled in step S7, and the processing restarts with step S1. If the outcome of the check of criterion C4 is negative, the present received raw data value is taken into account in the re-determination of av and σ in step S2 and in the redefinition of the lower and upper thresholds in step S3. Details for steps S2 and S3 have been described earlier.
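One possible reading of the flow chart of FIG. 4 is sketched below. It is an illustration under several assumptions: the function name and parameters are hypothetical, the monitoring time period 37 and the predefined period 39 are counted in time steps, C3 uses the "no value passed C1" formulation, C4 uses the elapsed-time formulation, and the raw data value that triggers step S6 or S7 is itself not used for the update.

```python
import math

def robust_gating(raw_values, av, sigma, n=2.0, alpha=0.9,
                  monitor_steps=3, adapt_steps=5):
    """Return a list of (lt, ut, adapting) tuples, one per time step."""
    lt, ut = av - n * sigma, av + n * sigma
    adapting = False   # state checked by criterion C2
    misses = 0         # consecutive values that failed C1
    since_event = 0    # time steps since the gating adaptation event
    out = []
    for rdv in raw_values:                       # S1
        use = False
        if adapting:                             # C2 = yes
            since_event += 1
            if since_event > adapt_steps:        # C4 = yes
                adapting = False                 # S7
                misses = 0
            else:
                use = True                       # gate suspended
        elif lt <= rdv <= ut:                    # C1 = yes
            misses = 0
            use = True
        else:                                    # C1 = no
            misses += 1
            if misses >= monitor_steps:          # C3 = yes
                adapting = True                  # S6
                since_event = 0
        if use:                                  # S2 and S3
            # Both right-hand sides use the previous av (claim 6).
            av, sigma = (alpha * av + (1 - alpha) * rdv,
                         math.sqrt(alpha * sigma ** 2
                                   + (1 - alpha) * (rdv - av) ** 2))
            lt, ut = av - n * sigma, av + n * sigma
        out.append((lt, ut, adapting))
    return out
```

Fed with a signal that jumps from one stable level to another, the thresholds stay frozen for `monitor_steps` values, then follow the new level while the adaptation is enabled, and finally return to normal gating around the new level.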

[0089] In general, the method of FIG. 4 may additionally comprise steps S4 and S5 as described above. Also a different ordering of the steps as well as variations in the implementation are possible.

[0090] FIGS. 5a, 5b and 6 show embodiments of a different aspect of a method applying “robust gating” which may be combined with the earlier described aspects. FIG. 5a depicts a time series of an undesirable situation that may occur in terms of output signals 52a, 52b in case of two raw signals 51a, 51b that slightly differ from each other. The raw signals 51a, 51b may be measured by two different environmental sensors of the same type. Both sensors are assumed to have the same current upper threshold 53. However, the raw data values of raw signal 51b measured by sensor B are slightly, e.g. by 1% or by 5%, lower than the ones of raw signal 51a measured by sensor A.

[0091] In case one or more raw data values of raw signal 51a are above the upper threshold 53, i.e. they do not pass the gating criterion C1, they do not contribute to the re-determination of av, σ and the thresholds. Hence the threshold 53 for sensor A is not adapted. Raw data values measured by sensor B at the same time may, however, fall below the threshold 53 and hence pass the gating criterion C1. Thus av, σ and the threshold for sensor B will be adapted on the basis of these raw data values. In that case, the average values av, the measures of dispersion σ and the thresholds for sensor A and sensor B will begin to differ, and they will also continue to differ as long as no reset is performed.

[0092] Since the determination of the output signals 52a, 52b may depend on the present average value av and measure of dispersion σ, see above in the context of normalized data, also the output signals 52a, 52b of the sensors A and B will differ. Such situation leads to non-intuitive readings of the two sensors and to a suboptimal user experience.

[0093] As a solution to such issue, a method for processing environmental sensor data with gating is proposed wherein the gating comprises a “soft” instead of a “hard” threshold. FIG. 5b shows the same time series of raw signals 51a, 51b of the sensors A and B as in FIG. 5a. However, instead of the hard threshold 53 of FIG. 5a, a soft threshold 54 is applied in FIG. 5b. Such soft threshold may be implemented as a weighting function, e.g. as shown in FIG. 6. Instead of applying the gating criterion C1 in the form of the check of lt<=rdv<=ut, raw data values rdv are weighted for the determination of the average value av and the measure of dispersion σ. Preferably raw data values near the previously defined average av have higher weights than raw data values further away from av, see e.g. FIG. 6.

[0094] As shown in FIG. 6, it is beneficial that the weighting function of a soft threshold is a continuous, in particular a smooth, function. Preferably, the weighting function tends to zero for raw data values far from the average value av. The weighting function may e.g. take the form of a Gaussian function around av and with the measure of dispersion σ as standard deviation. For comparison, FIG. 6 shows as dotted line a weighting function that corresponds to the hard threshold of criterion C1 described earlier.
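A Gaussian weighting function of this kind, and one conceivable way of applying the weights in the recursive determination, may be sketched as follows. The text does not specify how the weights enter the update; scaling the smoothing gain (1−α) by the weight is an assumption of this sketch, as are the function names. σ is assumed to be positive.

```python
import math

def gaussian_weight(rdv, av, sigma):
    """Weight 1 at the average value, tending to 0 far away from it."""
    return math.exp(-0.5 * ((rdv - av) / sigma) ** 2)

def soft_update(av, sigma, rdv, alpha=0.9):
    """Weighted recursive update of (av, sigma).

    Assumption: the weight scales the gain (1 - alpha), so a weight
    of 1 gives the ungated update and a weight of 0 leaves the
    estimates unchanged, like a gated-out value.
    """
    gain = (1.0 - alpha) * gaussian_weight(rdv, av, sigma)
    new_av = (1.0 - gain) * av + gain * rdv
    new_var = (1.0 - gain) * sigma ** 2 + gain * (rdv - av) ** 2
    return new_av, math.sqrt(new_var)
```

Because the weight varies continuously with the raw data value, two values that differ only slightly receive nearly the same weight, in contrast to the step at lt and ut of the hard threshold.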

[0095] In general, the soft threshold with a weighting function has the effect that neighbouring raw data values, i.e. differing by 0 to 5%, get similar weights, i.e. differing by 0 to 5%. Hence in FIG. 5b, the output signals 52a, 52b, which are e.g. normalized versions of the corresponding raw signals 51a and 51b, respectively, differ only slightly. Such output is robust and reliable irrespective of the value range of the raw signals. Moreover it yields a desired user experience, in particular if two sensors of the same type are e.g. placed next to each other and measure the same environmental conditions.

CPC classification

G01N33/0047
G01N33/0062
G06F18/10
G01N33/0065

International classification

G01N33/00
