ASYNCHRONOUS INTERCORRELATED TIME SERIES DATASETS ALIGNMENT METHOD
20240248954 · 2024-07-25
Inventors
- Narayan SCHÜTZ (Bern, CH)
- Angela BOTROS (Zürich, CH)
- Philipp BULUSCHEK (Denges, CH)
- Guillaume DuPASQUIER (Belmont-sur-Lausanne, CH)
- Michael SINGLE (Herzogenbuchsee, CH)
- Stephan GERBER (Lyss, CH)
- Tobias NEF (Visp, CH)
CPC classification
G06F18/2414
PHYSICS
G06F2218/00
PHYSICS
Abstract
A computer-implemented method for aligning intercorrelated asynchronous time series datasets includes the steps of: (a) retrieving a first time series dataset (x) and a second time series dataset (y), the first and second time series dataset being intercorrelated, (b) segmenting each of the first and second time series dataset (x, y) into a plurality of consecutive smaller segments (x.sub.i, y.sub.i), all segments of the first and second time series dataset (x, y) having the same length, (c) determining pairs of corresponding segments by associating successive segments of the first time series dataset (x) with corresponding segments of the second time series dataset (y), (d) optimizing, for each pair of corresponding segments, a correlation function to obtain an approximation of a first time series transformation function (f.sub.1) and of a second time series transformation function (f.sub.2), (e) using the first and second time series transformation functions (f.sub.1, f.sub.2) to determine a vector of segment shifts (s) whose components contain approximations of the shifts between the first and the second segment in a pair of corresponding segments (x.sub.i, y.sub.i), (f) applying a multi-model fitting algorithm to the segment shift vector (s), said multi-model fitting algorithm outputting a shift function (f.sup.opt) for aligning segments of each pair of corresponding segments (x.sub.i, y.sub.i), and (g) aligning the first time series dataset (x) with the second time series dataset by applying said shift function (f.sup.opt) to all pairs of corresponding segments (x.sub.i, y.sub.i),
wherein the first time series transformation function (f.sub.1) is parametrized by weights (w.sub.1) of a first neural network (N.sub.1) and outputs a first highly correlated time series dataset, and wherein the second time series transformation function (f.sub.2) is parametrized by weights (w.sub.2) of a second neural network (N.sub.2) and outputs a second highly correlated time series dataset.
Claims
1. A computer-implemented method for aligning intercorrelated asynchronous time series datasets comprising the steps of: (a) retrieving a first time series dataset and a second time series dataset, the first and second time series dataset being intercorrelated, (b) segmenting each of the first and second time series dataset into a plurality of consecutive smaller segments, all segments of the first and second time series dataset having the same length, (c) determining pairs of corresponding segments by associating successive segments of the first time series dataset with corresponding segments of the second time series dataset, (d) optimizing, for each pair of corresponding segments, a correlation function to obtain an approximation of a first time series transformation function and of a second time series transformation function, (e) using the first and second time series transformation functions to determine a vector of segment shifts whose components contain approximations of the shifts between the first and the second segment in a pair of corresponding segments, (f) applying a multi-model fitting algorithm to the segment shift vector, said multi-model fitting algorithm outputting a shift function for aligning segments of each pair of corresponding segments, and (g) aligning the first time series dataset with the second time series dataset by applying said shift function to all pairs of corresponding segments, wherein the first time series transformation function is parametrized by weights of a first neural network and outputs a first highly correlated time series dataset, and wherein the second time series transformation function is parametrized by weights of a second neural network and outputs a second highly correlated time series dataset.
2. Method according to claim 1, wherein said first and second time series datasets are respectively a first and a second digital signal.
3. Method according to claim 1, further comprising a step of measuring a first signal with a first sensor and a second signal with a second sensor and converting the first signal into said first digital signal and the second signal into said second digital signal.
4. Method according to claim 3, wherein said first and second sensors are biomedical sensors.
5. Method according to claim 4, wherein said biomedical sensors are both selected among ECG sensors, BCG sensors, PPG sensors, EEG sensors, EMG sensors and medical accelerometers.
6. Method according to claim 1, wherein said approximation of said first and second time series transformation functions is made by successively fixing the first time series transformation function to be the identity function while optimizing the second time series transformation function, and then optimizing the first time series transformation function while fixing the second time series transformation function that has been optimized.
7. Method according to claim 1, wherein at least one of said first and second neural network is a residual convolutional neural network with locally constrained receptive fields.
8. Method according to claim 1, wherein said optimizing of said correlation function comprises a step of optimizing a normalized same cross correlation function.
9. Method according to claim 8, wherein said optimizing of said normalized same cross correlation function is realized through regular stochastic gradient descent.
10. Method according to claim 1, wherein at least one of said first and second time series transformation functions is non-linear.
11. Method according to claim 1, wherein said multi-model fitting algorithm is an energy based multi-model fitting algorithm.
12. Method according to claim 1, wherein steps (d)-(e) are performed in parallel on a plurality of processors.
Description
SHORT DESCRIPTION OF THE DRAWINGS
[0040] Exemplary embodiments of the invention are disclosed in the description and illustrated by the drawings in which:
DETAILED DESCRIPTION OF THE INVENTION
[0049] The present invention relates to a method for aligning asynchronous intercorrelated time series. Misalignment between time series may for example arise from a lack of synchronization between the timestamping mechanisms that assign a time value to a certain measurement. It may also arise from delays inherent to the processes being observed or to the technology used. The induced time-shift, or lag, may be constant over certain periods of time; most often, however, the time-shift is non-constant yet piecewise continuous, in the sense that over certain periods of time the lag can be expressed as a continuous function of time. In some cases, this function is periodic, meaning that the lag is identical over certain periods of time, when, for instance, the clocks are regularly reinitialized. In some other cases, this function does not follow any particular pattern, as the lag is due to a human interaction. This is typically the case for stock market price time series, where the timestamping depends on the reaction time of an investor.
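As a toy illustration of the problem (hypothetical data, not taken from this disclosure), a pair of intercorrelated signals with a piecewise-continuous, non-constant lag can be simulated by delaying one sinusoid with a lag function of time:

```python
import numpy as np

# Toy illustration: y observes the same process as x, but delayed by a
# lag that is constant on the first half and grows linearly afterwards.
t = np.linspace(0.0, 10.0, 1000)
lag = np.where(t < 5.0, 0.2, 0.2 + 0.05 * (t - 5.0))  # piecewise-continuous lag
x = np.sin(2 * np.pi * t)
y = np.sin(2 * np.pi * (t - lag))  # y is x delayed by lag(t)
```

The alignment task described in the remainder of the disclosure amounts to recovering an approximation of `lag` from x and y alone.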
[0050] The variety of fields requiring high-precision time series alignment is rather large and includes for instance:
[0051] Manufacturing and heavy industry: with the advent of industry 4.0, predictive maintenance in manufacturing has become key. This is often the result of the combination of measurements (e.g. temperature, pressure, vibration, force, current, voltage, radiation, inductance, etc.) taken continuously by independent sensors on machines at different locations. Predictive maintenance models rely on all these data sources being correctly synchronized, sometimes to a high precision in which case the invention is applicable.
[0052] Automotive: self-driving vehicles handle a stream of a variety of sensors as input for decision models. These sensors are currently generally managed centrally from a main processing unit in the vehicle. If these sensors can become independent and rely on their own internal clocks, clock drifts can provide a false correlated information. Here the invention could provide automatic and continuous synchronization.
[0053] Sensor calibration and characterization: the invention allows not only synchronizing two signals, but also characterizing the time-based behavior and accuracy of a sensor across various operating conditions with respect to a gold standard. Taking a gold standard as a reference and recording a series of test events through both the gold standard sensor or clock and a second sensor allows using the invention to measure and characterize the accuracy of the second sensor with respect to the gold standard. This can for instance be used for quality control.
[0054] Research: many research fields need time series from different sensors to be well synchronized, for instance experimental physics (CERN), astronomy, or oceanography (measurement of sound).
[0055] Telecommunication, defense & aerospace: all technologies related to Signal Intelligence, notably Radar, HF and microwave communication but also all other type of small size sensors with internal clocks.
[0056] In a particular embodiment, a first time series dataset x and a second time series dataset y are retrieved. These time series datasets are supposed to be intercorrelated and may arise from any of the fields listed above. They may be collected by retrieving them from a time series database or by any other suitable way known to the man skilled in the art.
[0057] If needed, the time series datasets are resampled and cropped to a same time period so as to obtain time series datasets of the same length.
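A minimal sketch of this preprocessing step, assuming NumPy and illustrative names (`resample_and_crop`, `fs` are not from the disclosure):

```python
import numpy as np

def resample_and_crop(x, t_x, y, t_y, fs):
    """Resample two time series onto a common uniform grid at rate fs
    and crop them to their overlapping time span, so both outputs have
    the same length."""
    t0 = max(t_x[0], t_y[0])         # start of the common period
    t1 = min(t_x[-1], t_y[-1])       # end of the common period
    t = np.arange(t0, t1, 1.0 / fs)  # shared uniform time grid
    return np.interp(t, t_x, x), np.interp(t, t_y, y), t
```

Linear interpolation is used here for simplicity; any resampling scheme producing equal-length datasets over the same period would serve.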
[0058] The first and second time series datasets are then segmented into a plurality of consecutive smaller segments x.sub.i, y.sub.i, all segments of the first and second time series dataset x, y having the same length. This allows a distribution of the processing over a plurality of processors, which considerably reduces the computing time. It also allows the processing of comparatively long datasets that would not fit in the computer's memory. Then, pairs of corresponding segments are determined by associating successive segments of the first time series dataset x with corresponding segments of the second time series dataset y. In other words, the first segment of the first time series dataset is associated with the first segment of the second time series dataset, the second segment of the first time series dataset with the second segment of the second time series dataset, and so on until all segments are paired. In general, one has to ensure that the length of the smaller segments is large compared to the expected time-shift in order to be able to detect the correlation between corresponding segments.
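The segmentation and index-wise pairing can be sketched in a few lines of NumPy (`segment_pairs` and `m` are illustrative names, not from the disclosure):

```python
import numpy as np

def segment_pairs(x, y, m):
    """Split two equal-length series into consecutive segments of
    length m and pair them index-wise: (x_1, y_1), (x_2, y_2), ...
    Any trailing samples that do not fill a segment are dropped."""
    n = len(x) // m               # number of complete segments
    xs = x[:n * m].reshape(n, m)  # segments x_1 ... x_n
    ys = y[:n * m].reshape(n, m)  # segments y_1 ... y_n
    return list(zip(xs, ys))
```

Each returned pair can then be processed independently, which is what enables the parallel distribution mentioned above.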
[0059] Next, for each pair of corresponding segments, a correlation function is optimized to obtain an approximation of a first time series transformation function f.sub.1 and of a second time series transformation function f.sub.2, the first time series transformation function f.sub.1 being parametrized by weights w.sub.1 of a first neural network N.sub.1 and the second time series transformation function f.sub.2 being parametrized by weights w.sub.2 of a second neural network N.sub.2. The first and second time series transformation functions f.sub.1, f.sub.2 are then used to determine a vector of segment shifts s whose components contain approximations of the shifts between the first and the second segment in a pair of corresponding segments x.sub.i, y.sub.i.
[0060] The training of these transformation functions allows a filtering of the input time series datasets by detecting common patterns, as well as transforming morphologically distinct time series datasets into correlated time series datasets. These transformation functions are trained to optimize a cross-correlation function. To efficiently optimize the correlation function, a differentiable normalized same cross correlation (NSCC) is used as loss function. This loss function can then be optimized by regular stochastic gradient descent.
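The disclosure optimizes the NSCC through neural-network weights by stochastic gradient descent; as a simplified, gradient-free sketch of the underlying quantity, the normalized "same" cross correlation of one segment pair, and the lag that maximizes it, can be computed with NumPy (function names are illustrative):

```python
import numpy as np

def nscc(a, b):
    """Normalized 'same' cross correlation: correlate the zero-mean,
    unit-norm versions of two segments in 'same' mode, so the output
    has the same length as the inputs."""
    a = a - a.mean()
    b = b - b.mean()
    a = a / (np.linalg.norm(a) + 1e-12)  # epsilon guards constant segments
    b = b / (np.linalg.norm(b) + 1e-12)
    return np.correlate(a, b, mode="same")

def estimate_shift(a, b):
    """Approximate the segment shift s_i as the lag maximizing the NSCC;
    positive when features in a occur later than in b."""
    c = nscc(a, b)
    return int(np.argmax(c)) - len(a) // 2
```

In the disclosed method this correlation is made differentiable and maximized through the networks' weights rather than read off an argmax; the sketch only illustrates the correlation measure itself.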
[0061] The first and second neural networks N.sub.1, N.sub.2 output a first and a second filtered and highly correlated time series dataset, and, for each pair of segments, a shift s.sub.i approximating the time shift between the first and the second segment. Because of the noise in the original time series datasets, or because the time series datasets are only partially correlated, some of the shifts do not correspond to the real misalignment, so that a final noise-tolerant, energy-based multi-model fitting step is necessary.
[0062] The choice of the above loss function is not restrictive and other loss functions could be chosen by a man skilled in the art.
[0063] In the embodiment represented in
[0064] The choice of neural networks for aligning intercorrelated high-frequency signals strongly depends on the nature of the time series datasets. Both deep and shallow neural networks may be used according to the complexity of the input time series datasets.
[0065] In one embodiment, the transformation functions are parametrized by ResNet like convolutional neural networks (convolutional residual neural networks) with locally constrained receptive fields. This type of network architecture is well suited for alignment tasks, allowing for effective computations and adjustment of network capacity based on the network depth.
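A minimal NumPy sketch of one such residual block with a small, locally constrained receptive field; a real implementation would use a deep-learning framework with learned weights, and all names here are illustrative:

```python
import numpy as np

def conv1d(x, w):
    """'Same'-length 1-D convolution with a short kernel w; the kernel
    length is the locally constrained receptive field of the layer."""
    pad = len(w) // 2
    xp = np.pad(x, pad, mode="edge")
    return np.array([xp[i:i + len(w)] @ w for i in range(len(x))])

def residual_block(x, w1, w2):
    """ResNet-style block: two short convolutions with a ReLU in
    between, plus an identity skip connection (x + F(x))."""
    h = np.maximum(conv1d(x, w1), 0.0)  # conv + ReLU
    return x + conv1d(h, w2)            # residual skip connection
```

Stacking more such blocks is the depth-based capacity adjustment mentioned above: each extra block enlarges the effective receptive field while the skip connections keep the network trainable.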
[0066] The optimizing of the cross correlation function also provides, for each pair of segments x.sub.i, y.sub.i, a time shift value s.sub.i. As mentioned before, this shift value may not correspond to the actual misalignment because of the presence of noise in the input time series datasets, as well as because of the only partial correlation of the two signals. This problem is solved by further applying an energy-based multi-model fitting algorithm to find correction models for each shift value s.sub.i. In this particular embodiment, the algorithm used is the PEARL algorithm, which combines the optimization of a global discrete energy with an α-expansion algorithm and a recomputing of the proposal models based on the previously found candidates. This provides a piecewise continuous shift function f.sup.opt that can be applied to the first and second time series datasets x, y to align every segment x.sub.i onto its corresponding segment y.sub.i.
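A greatly simplified stand-in for this step (not the PEARL algorithm itself, which uses a global discrete energy and graph-cut α-expansion) alternates between assigning each noisy segment shift to the nearest constant-shift model and re-estimating each model robustly; all names are illustrative:

```python
import numpy as np

def fit_shift_models(s, n_models=2, n_iter=10):
    """Simplified multi-model fitting over a vector of noisy segment
    shifts s: alternate between (1) labeling each shift with the nearest
    constant-shift model and (2) re-estimating each model as the median
    of its assigned shifts (median for noise tolerance). Returns a
    piecewise-constant corrected shift per segment, playing the role
    of f_opt evaluated on the segments."""
    s = np.asarray(s, dtype=float)
    # propose initial models from quantiles of the observed shifts
    models = np.quantile(s, np.linspace(0.1, 0.9, n_models))
    for _ in range(n_iter):
        labels = np.argmin(np.abs(s[:, None] - models[None, :]), axis=1)
        for k in range(n_models):
            if np.any(labels == k):
                models[k] = np.median(s[labels == k])
    return models[labels]
```

Unlike PEARL, this sketch has no spatial smoothness term, so an isolated outlier is snapped to the nearest model rather than to its neighbors' model; it only illustrates the propose-assign-refit structure.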
[0067] The use of the PEARL algorithm should not be construed as a restriction. Other multi-model fitting algorithms may be chosen by the man skilled in the art.
Biomedical Sensors Embodiments
[0068] In some preferred embodiments described hereafter, the time series datasets consist of high-frequency signals that are sensed by biomedical sensors.
[0069] In the context of the present disclosure, the term biomedical sensor refers to any type of device that detects a signal of medical relevance, such as a change of voltage, a change in light absorption or an acceleration. Electrocardiographs (ECG), ballistocardiographs (BCG), pulse oximeters for performing photoplethysmographies (PPG), all types of electrodes for performing electroencephalographies (EEG), all kinds of accelerometers such as wrist or ankle accelerometers, and electromyographs (EMG) are a few examples among many others of biomedical sensors.
[0070] The frequency range that is considered in the present disclosure depends on the length of the patterns that are observed and on the amount of time-shift happening in the signal. Usually, it is assumed that the relative change of time-shift within a window is negligible. As an example, the heart rate is approximately 1 Hz and its monitoring with standard BCG and ECG typically leads to an observed time-shift of 20 s over 4 hours of measurements. With a window size (i.e. a segment size) of 60 s at a sampling frequency of 100 Hz, every window has a relative time-shift of 1.4e-5 s, which is considered to be negligible. Moreover, the intrinsic quality of the sensor control unit which controls the sampling frequency, as well as exterior factors like the temperature or the battery level, may induce a time-shift of a few nanoseconds or milliseconds. For these reasons, the term high-frequency refers, in the present disclosure, to frequencies that are in general higher than 1 Hz, but that strongly depend on the measured signal.
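One consistent reading of the 1.4e-5 s figure is the drift accumulated per sample at 100 Hz, which can be checked with a line of arithmetic (this interpretation is an assumption, not stated explicitly above):

```python
# Drift arithmetic for the BCG/ECG example: 20 s of misalignment
# accumulated over 4 hours, sampled at 100 Hz.
total_drift = 20.0       # seconds of drift over the recording
duration = 4 * 3600      # recording length in seconds
fs = 100.0               # sampling frequency in Hz

rate = total_drift / duration  # drift per second of signal, ~1.4e-3
per_sample = rate / fs         # drift per sample, ~1.4e-5 s
print(round(per_sample, 7))    # prints 1.39e-05
```

At this scale the shift within any single 60 s window is indeed negligible compared to the heart-beat patterns being correlated.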
[0071] In a preferred embodiment represented in
[0072] Each pair of corresponding segments x.sub.i, y.sub.i is subsequently input into a first and a second neural network N.sub.1, N.sub.2, whose respective weights w.sub.1, w.sub.2 parametrize respectively a first and a second signal transformation function f.sub.1, f.sub.2. More precisely, the first segment x.sub.i of the pair is input into the first neural network N.sub.1 and the second segment y.sub.i of the pair is input into the second neural network N.sub.2. By optimizing a suitable correlation function, the first and second neural networks output a first and a second filtered and highly correlated signal, and, for each pair of segments, a shift s.sub.i approximating the time shift between the first and the second segment. Because of the noise in the original signals, or because the signals are only partially correlated, some of the shifts do not correspond to the real signal misalignment, so that a final noise-tolerant, energy-based multi-model fitting step is necessary.
[0073] In this embodiment, the original raw digital signals x, y are typically of the same length T. They are both divided into n smaller segments x.sub.i, y.sub.i that are also of the same fixed length, say m, so that T=m·n. This subdivision allows working with long signals that would not fit into the computer memory. Moreover, it allows a distribution of the alignment workload across multiple processors, which considerably reduces the computing time.
[0074] As previously mentioned, in the embodiment represented in
[0075] The choice of the above loss function is not restrictive and other loss functions could be chosen by a man skilled in the art.
[0076] In the embodiment represented in
[0077] The choice of neural networks for aligning intercorrelated high-frequency signals strongly depends on the nature of the signals. Both deep and shallow neural networks may be used according to the complexity of the input signals.
[0078] In one embodiment, the transformation functions are parametrized by ResNet like convolutional neural networks (convolutional residual neural networks) with locally constrained receptive fields. This type of network architecture is well suited for alignment tasks, allowing for effective computations and adjustment of network capacity based on the network depth.
[0079] In the embodiment illustrated in
[0080] The use of the PEARL algorithm should not be construed as a restriction. Other multi-model fitting algorithms may be chosen by the man skilled in the art.
[0081] In a particular embodiment, the two sensors are accelerometers, the first being worn on the right wrist and the second on the right ankle. Since the movements of the wrist and the movements of the ankle are only partially correlated, the corresponding signals are partially correlated as well.
[0082] In another embodiment, a first raw signal x, represented in
[0083] In yet another embodiment, a first signal is measured with a 12-lead ECG and a second PPG signal is measured with an arm-worn device.