ASYNCHRONOUS INTERCORRELATED TIME SERIES DATASETS ALIGNMENT METHOD
20240248954 · 2024-07-25
Inventors
- Narayan SCHÜTZ (Bern, CH)
- Angela BOTROS (Zürich, CH)
- Philipp BULUSCHEK (Denges, CH)
- Guillaume DuPASQUIER (Belmont-sur-Lausanne, CH)
- Michael SINGLE (Herzogenbuchsee, CH)
- Stephan GERBER (Lyss, CH)
- Tobias NEF (Visp, CH)
CPC classification
G06F18/2414
PHYSICS
G06F2218/00
PHYSICS
Abstract
A computer-implemented method for aligning intercorrelated asynchronous time series datasets includes the steps of: (a) retrieving a first time series dataset (x) and a second time series dataset (y), the first and second time series dataset being intercorrelated, (b) segmenting each of the first and second time series dataset (x, y) into a plurality of consecutive smaller segments (x.sub.i, y.sub.i), all segments of the first and second time series dataset (x, y) having the same length, (c) determining pairs of corresponding segments by associating successive segments of the first time series dataset (x) with corresponding segments of the second time series dataset (y), (d) optimizing, for each pair of corresponding segments, a correlation function to obtain an approximation of a first time series transformation function (f.sub.1) and of a second time series transformation function (f.sub.2), (e) using the first and second time series transformation functions (f.sub.1, f.sub.2) to determine a vector of segment shifts (s) whose components contain approximations of the shifts between the first and the second segment in a pair of corresponding segments (x.sub.i, y.sub.i), (f) applying a multi-model fitting algorithm to the segment shift vector (s), said multi-model fitting algorithm outputting a shift function (f.sup.opt) for aligning segments of each pair of corresponding segments (x.sub.i, y.sub.i), and (g) aligning the first time series dataset (x) with the second time series dataset by applying said shift function (f.sup.opt) to all pairs of corresponding segments (x.sub.i, y.sub.i),
wherein the first time series transformation function (f.sub.1) is parametrized by weights (w.sub.1) of a first neural network (N.sub.1) and outputs a first highly correlated time series dataset, and wherein the second time series transformation function (f.sub.2) is parametrized by weights (w.sub.2) of a second neural network (N.sub.2) and outputs a second highly correlated time series dataset.
Claims
1. A computer-implemented method for aligning intercorrelated asynchronous time series datasets comprising the steps of: (a) retrieving a first time series dataset and a second time series dataset, the first and second time series dataset being intercorrelated, (b) segmenting each of the first and second time series dataset into a plurality of consecutive smaller segments, all segments of the first and second time series dataset having the same length, (c) determining pairs of corresponding segments by associating successive segments of the first time series dataset with corresponding segments of the second time series dataset, (d) optimizing, for each pair of corresponding segments, a correlation function to obtain an approximation of a first time series transformation function and of a second time series transformation function, (e) using the first and second time series transformation functions to determine a vector of segment shifts whose components contain approximations of the shifts between the first and the second segment in a pair of corresponding segments, (f) applying a multi-model fitting algorithm to the segment shift vector, said multi-model fitting algorithm outputting a shift function for aligning segments of each pair of corresponding segments, and (g) aligning the first time series dataset with the second time series dataset by applying said shift function to all pairs of corresponding segments, wherein the first time series transformation function is parametrized by weights of a first neural network and outputs a first highly correlated time series dataset, and wherein the second time series transformation function is parametrized by weights of a second neural network and outputs a second highly correlated time series dataset.
2. Method according to claim 1, wherein said first and second time series datasets are respectively a first and a second digital signal.
3. Method according to claim 1, further comprising a step of measuring a first signal with a first sensor and a second signal with a second sensor and converting the first signal into said first digital signal and the second signal into said second digital signal.
4. Method according to claim 3, wherein said first and second sensors are biomedical sensors.
5. Method according to claim 4, wherein said biomedical sensors are both selected among ECG sensors, BCG sensors, PPG sensors, EEG sensors, EMG sensors and medical accelerometers.
6. Method according to claim 1, wherein said approximation of said first and second time series transformation functions is made by successively fixing the first time series transformation function to be the identity function while optimizing the second time series transformation function, and then optimizing the first time series transformation function while fixing the second time series transformation function that has been optimized.
7. Method according to claim 1, wherein at least one of said first and second neural network is a residual convolutional neural network with locally constrained receptive fields.
8. Method according to claim 1, wherein said optimizing of said correlation function comprises a step of optimizing a normalized same cross correlation function.
9. Method according to claim 8, wherein said optimizing of said normalized same cross correlation function is realized through regular stochastic gradient descent.
10. Method according to claim 1, wherein at least one of said first and second time series transformation functions is non-linear.
11. Method according to claim 1, wherein said multi-model fitting algorithm is an energy based multi-model fitting algorithm.
12. Method according to claim 1, wherein steps (d)-(e) are performed in parallel on a plurality of processors.
Description
SHORT DESCRIPTION OF THE DRAWINGS
[0040] Exemplary embodiments of the invention are disclosed in the description and illustrated by the drawings in which:
DETAILED DESCRIPTION OF THE INVENTION
[0049] The present invention relates to a method for aligning asynchronous intercorrelated time series. Misalignment between time series may for example arise from a lack of synchronization between the timestamping mechanisms that assign a time value to a certain measurement. It may also arise from delays inherent to the processes being observed or to the technology used. The induced time-shift, or lag, may be constant over certain periods of time; most often, however, the time-shift is non-constant yet piecewise continuous, in the sense that over certain periods of time the lag can be expressed as a continuous function of time. In some cases, this function is periodic, meaning that the lag is identical over certain periods of time, when, for instance, the clocks are regularly reinitialized. In some other cases, this function does not follow any particular pattern, as the lag is due to a human interaction. This is typically the case for stock market price time series, where the timestamping depends on the reaction time of an investor.
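As a toy illustration of the problem (hypothetical data, not taken from this disclosure), a pair of intercorrelated signals with a piecewise-continuous, non-constant lag can be simulated by delaying one sinusoid with a lag function of time:

```python
import numpy as np

# Toy illustration: y observes the same process as x, but delayed by a
# lag that is constant on the first half and grows linearly afterwards.
t = np.linspace(0.0, 10.0, 1000)
lag = np.where(t < 5.0, 0.2, 0.2 + 0.05 * (t - 5.0))  # piecewise-continuous lag
x = np.sin(2 * np.pi * t)
y = np.sin(2 * np.pi * (t - lag))  # y is x delayed by lag(t)
```

The alignment task described in the remainder of the disclosure amounts to recovering an approximation of `lag` from x and y alone.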
[0050] The variety of fields requiring high-precision time series alignment is rather large and includes for instance:
[0051] Manufacturing and heavy industry: with the advent of industry 4.0, predictive maintenance in manufacturing has become key. This is often the result of the combination of measurements (e.g. temperature, pressure, vibration, force, current, voltage, radiation, inductance, etc.) taken continuously by independent sensors on machines at different locations. Predictive maintenance models rely on all these data sources being correctly synchronized, sometimes to a high precision in which case the invention is applicable.
[0052] Automotive: self-driving vehicles handle a stream of a variety of sensors as input for decision models. These sensors are currently generally managed centrally from a main processing unit in the vehicle. If these sensors can become independent and rely on their own internal clocks, clock drifts can provide a false correlated information. Here the invention could provide automatic and continuous synchronization.
[0053] Sensor calibration and characterization: the invention allows not only synchronizing two signals, but also characterizing the time-based behavior and accuracy of a sensor across various operating conditions with respect to a gold standard. Taking a gold standard as a reference and recording a series of test events through both the gold standard sensor or clock and a second sensor allows using the invention to measure and characterize the accuracy of the second sensor with respect to the gold standard. This can for instance be used for quality control.
[0054] Research: many research fields need time series from different sensors to be well synchronized, for instance experimental physics (CERN), astronomy, or oceanography (measurement of sound).
[0055] Telecommunication, defense & aerospace: all technologies related to Signal Intelligence, notably Radar, HF and microwave communication but also all other type of small size sensors with internal clocks.
[0056] In a particular embodiment, a first time series dataset x and a second time series dataset y are retrieved. These time series datasets are supposed to be intercorrelated and may arise from any of the fields listed above. They may be collected by retrieving them from a time series database or by any other suitable way known to the man skilled in the art.
[0057] If needed, the time series datasets are resampled and cropped to a same time period so as to obtain time series datasets of the same length.
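A minimal sketch of this preprocessing step, assuming NumPy and illustrative names (`resample_and_crop`, `fs` are not from the disclosure):

```python
import numpy as np

def resample_and_crop(x, t_x, y, t_y, fs):
    """Resample two time series onto a common uniform grid at rate fs
    and crop them to their overlapping time span, so both outputs have
    the same length."""
    t0 = max(t_x[0], t_y[0])         # start of the common period
    t1 = min(t_x[-1], t_y[-1])       # end of the common period
    t = np.arange(t0, t1, 1.0 / fs)  # shared uniform time grid
    return np.interp(t, t_x, x), np.interp(t, t_y, y), t
```

Linear interpolation is used here for simplicity; any resampling scheme producing equal-length datasets over the same period would serve.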
[0058] The first and second time series datasets are then segmented into a plurality of consecutive smaller segments x.sub.i, y.sub.i, all segments of the first and second time series dataset x, y having the same length. This allows a distribution of the processing over a plurality of processors, which considerably reduces the computing time. It also allows the processing of comparatively long datasets that would not fit in the computer's memory. Then, pairs of corresponding segments are determined by associating successive segments of the first time series dataset x with corresponding segments of the second time series dataset y. In other words, the first segment of the first time series dataset is associated with the first segment of the second time series dataset, the second segment of the first time series dataset with the second segment of the second time series dataset, and so on until all segments are paired. In general, one has to ensure that the length of the smaller segments is large compared to the expected time-shift in order to be able to detect the correlation between corresponding segments.
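The segmentation and index-wise pairing can be sketched in a few lines of NumPy (`segment_pairs` and `m` are illustrative names, not from the disclosure):

```python
import numpy as np

def segment_pairs(x, y, m):
    """Split two equal-length series into consecutive segments of
    length m and pair them index-wise: (x_1, y_1), (x_2, y_2), ...
    Any trailing samples that do not fill a segment are dropped."""
    n = len(x) // m               # number of complete segments
    xs = x[:n * m].reshape(n, m)  # segments x_1 ... x_n
    ys = y[:n * m].reshape(n, m)  # segments y_1 ... y_n
    return list(zip(xs, ys))
```

Each returned pair can then be processed independently, which is what enables the parallel distribution mentioned above.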
[0059] Next, for each pair of corresponding segments, a correlation function is optimized to obtain an approximation of a first time series transformation function f.sub.1 and of a second time series transformation function f.sub.2, the first time series transformation function f.sub.1 being parametrized by weights w.sub.1 of a first neural network N.sub.1 and the second time series transformation function f.sub.2 being parametrized by weights w.sub.2 of a second neural network N.sub.2. The first and second time series transformation functions f.sub.1, f.sub.2 are then used to determine a vector of segment shifts s whose components contain approximations of the shifts between the first and the second segment in a pair of corresponding segments x.sub.i, y.sub.i.
[0060] The training of these transformation functions allows a filtering of the input time series datasets by detecting common patterns, as well as transforming morphologically distinct time series datasets into correlated time series datasets. These transformation functions are trained to optimize a cross-correlation function. To efficiently optimize the correlation function, a differentiable normalized same cross correlation (NSCC) is used as loss function. This loss function can then be optimized by regular stochastic gradient descent.
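The disclosure optimizes the NSCC through neural-network weights by stochastic gradient descent; as a simplified, gradient-free sketch of the underlying quantity, the normalized "same" cross correlation of one segment pair, and the lag that maximizes it, can be computed with NumPy (function names are illustrative):

```python
import numpy as np

def nscc(a, b):
    """Normalized 'same' cross correlation: correlate the zero-mean,
    unit-norm versions of two segments in 'same' mode, so the output
    has the same length as the inputs."""
    a = a - a.mean()
    b = b - b.mean()
    a = a / (np.linalg.norm(a) + 1e-12)  # epsilon guards constant segments
    b = b / (np.linalg.norm(b) + 1e-12)
    return np.correlate(a, b, mode="same")

def estimate_shift(a, b):
    """Approximate the segment shift s_i as the lag maximizing the NSCC;
    positive when features in a occur later than in b."""
    c = nscc(a, b)
    return int(np.argmax(c)) - len(a) // 2
```

In the disclosed method this correlation is made differentiable and maximized through the networks' weights rather than read off an argmax; the sketch only illustrates the correlation measure itself.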
[0061] The first and second neural networks N.sub.1, N.sub.2 output a first and a second filtered and highly correlated time series dataset, and, for each pair of segments, a shift s.sub.i approximating the time shift between the first and the second segment. Because of the noise in the original time series datasets, or because the time series datasets are only partially correlated, some of the shifts do not correspond to the real misalignment, so that a final noise-tolerant, energy-based multi-model fitting step is necessary.
[0062] The choice of the above loss function is not restrictive and other loss functions could be chosen by a man skilled in the art.
[0063] In the embodiment represented in
[0064] The choice of neural networks for aligning intercorrelated high-frequency signals strongly depends on the nature of the time series datasets. Both deep and shallow neural networks may be used according to the complexity of the input time series datasets.
[0065] In one embodiment, the transformation functions are parametrized by ResNet like convolutional neural networks (convolutional residual neural networks) with locally constrained receptive fields. This type of network architecture is well suited for alignment tasks, allowing for effective computations and adjustment of network capacity based on the network depth.
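A minimal NumPy sketch of one such residual block with a small, locally constrained receptive field; a real implementation would use a deep-learning framework with learned weights, and all names here are illustrative:

```python
import numpy as np

def conv1d(x, w):
    """'Same'-length 1-D convolution with a short kernel w; the kernel
    length is the locally constrained receptive field of the layer."""
    pad = len(w) // 2
    xp = np.pad(x, pad, mode="edge")
    return np.array([xp[i:i + len(w)] @ w for i in range(len(x))])

def residual_block(x, w1, w2):
    """ResNet-style block: two short convolutions with a ReLU in
    between, plus an identity skip connection (x + F(x))."""
    h = np.maximum(conv1d(x, w1), 0.0)  # conv + ReLU
    return x + conv1d(h, w2)            # residual skip connection
```

Stacking more such blocks is the depth-based capacity adjustment mentioned above: each extra block enlarges the effective receptive field while the skip connections keep the network trainable.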
[0066] The optimizing of the cross correlation function also provides, for each pair of segments x.sub.i, y.sub.i, a time shift value s.sub.i. As mentioned before, this shift value may not correspond to the actual misalignment because of the presence of noise in the input time series datasets, as well as because of the only partial correlation of the two signals. This problem is solved by further applying an energy-based multi-model fitting algorithm to find correction models for each shift value s.sub.i. In this particular embodiment, the algorithm used is the PEARL algorithm, which combines the optimization of a global discrete energy with an α-expansion algorithm and a recomputing of the proposal models based on the previously found candidates. This provides a piecewise continuous shift function f.sup.opt that can be applied to the first and second time series datasets x, y to align every segment x.sub.i onto its corresponding segment y.sub.i.
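A greatly simplified stand-in for this step (not the PEARL algorithm itself, which uses a global discrete energy and graph-cut α-expansion) alternates between assigning each noisy segment shift to the nearest constant-shift model and re-estimating each model robustly; all names are illustrative:

```python
import numpy as np

def fit_shift_models(s, n_models=2, n_iter=10):
    """Simplified multi-model fitting over a vector of noisy segment
    shifts s: alternate between (1) labeling each shift with the nearest
    constant-shift model and (2) re-estimating each model as the median
    of its assigned shifts (median for noise tolerance). Returns a
    piecewise-constant corrected shift per segment, playing the role
    of f_opt evaluated on the segments."""
    s = np.asarray(s, dtype=float)
    # propose initial models from quantiles of the observed shifts
    models = np.quantile(s, np.linspace(0.1, 0.9, n_models))
    for _ in range(n_iter):
        labels = np.argmin(np.abs(s[:, None] - models[None, :]), axis=1)
        for k in range(n_models):
            if np.any(labels == k):
                models[k] = np.median(s[labels == k])
    return models[labels]
```

Unlike PEARL, this sketch has no spatial smoothness term, so an isolated outlier is snapped to the nearest model rather than to its neighbors' model; it only illustrates the propose-assign-refit structure.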
[0067] The use of the PEARL algorithm should not be construed as a restriction. Other multi-model fitting algorithms may be chosen by the man skilled in the art.
Biomedical Sensors Embodiments
[0068] In some preferred embodiments described hereafter, the time series datasets consist of high-frequency signals that are sensed by biomedical sensors.
[0069] In the context of the present disclosure, the term biomedical sensor refers to any type of device that detects a signal of medical relevance, such as a change of voltage, a change in light absorption or an acceleration. Electrocardiographs (ECG), ballistocardiographs (BCG), pulse oximeters for performing photoplethysmographies (PPG), all types of electrodes for performing electroencephalographies (EEG), all kinds of accelerometers such as wrist or ankle accelerometers, and electromyographs (EMG) are a few examples among many others of biomedical sensors.
[0070] The frequency range that is considered in the present disclosure depends on the length of the patterns that are observed and on the amount of time-shift happening in the signal. Usually, it is assumed that the relative change of time-shift within a window is negligible. As an example, the heart rate is approximately 1 Hz and its monitoring with standard BCG and ECG typically leads to an observed time-shift of 20 s over 4 hours of measurements. With a window size (i.e. a segment size) of 60 s at a sampling frequency of 100 Hz, every window has a relative time-shift of 1.4e-5 s, which is considered to be negligible. Moreover, the intrinsic quality of the sensor control unit which controls the sampling frequency, as well as exterior factors like the temperature or the battery level, may induce a time-shift of a few nanoseconds or milliseconds. For these reasons, the term high-frequency refers, in the present disclosure, to frequencies that are in general higher than 1 Hz, but that strongly depend on the measured signal.
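One consistent reading of the 1.4e-5 s figure is the drift accumulated per sample at 100 Hz, which can be checked with a line of arithmetic (this interpretation is an assumption, not stated explicitly above):

```python
# Drift arithmetic for the BCG/ECG example: 20 s of misalignment
# accumulated over 4 hours, sampled at 100 Hz.
total_drift = 20.0       # seconds of drift over the recording
duration = 4 * 3600      # recording length in seconds
fs = 100.0               # sampling frequency in Hz

rate = total_drift / duration  # drift per second of signal, ~1.4e-3
per_sample = rate / fs         # drift per sample, ~1.4e-5 s
print(round(per_sample, 7))    # prints 1.39e-05
```

At this scale the shift within any single 60 s window is indeed negligible compared to the heart-beat patterns being correlated.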
[0071] In a preferred embodiment represented in
[0072] Each pair of corresponding segments x.sub.i, y.sub.i is subsequently input into a first and a second neural network N.sub.1, N.sub.2, whose respective weights w.sub.1, w.sub.2 parametrize respectively a first and a second signal transformation function f.sub.1, f.sub.2. More precisely, the first segment x.sub.i of the pair is input into the first neural network N.sub.1 and the second segment y.sub.i of the pair is input into the second neural network N.sub.2. By optimizing a suitable correlation function, the first and second neural networks output a first and a second filtered and highly correlated signal, and, for each pair of segments, a shift s.sub.i approximating the time shift between the first and the second segment. Because of the noise in the original signals, or because the signals are only partially correlated, some of the shifts do not correspond to the real signal misalignment, so that a final noise-tolerant, energy-based multi-model fitting step is necessary.
[0073] In this embodiment, the original raw digital signals x, y are typically of the same length T. They are both divided into n smaller segments x.sub.i, y.sub.i that are also of the same fixed length, say m, so that T=m·n. This subdivision allows working with long signals that would not fit into the computer memory. Moreover, it allows a distribution of the alignment workload across multiple processors, which considerably reduces the computing time.
[0074] As previously mentioned, in the embodiment represented in
[0075] The choice of the above loss function is not restrictive and other loss functions could be chosen by a man skilled in the art.
[0076] In the embodiment represented in
[0077] The choice of neural networks for aligning intercorrelated high-frequency signals strongly depends on the nature of the signals. Both deep and shallow neural networks may be used according to the complexity of the input signals.
[0078] In one embodiment, the transformation functions are parametrized by ResNet like convolutional neural networks (convolutional residual neural networks) with locally constrained receptive fields. This type of network architecture is well suited for alignment tasks, allowing for effective computations and adjustment of network capacity based on the network depth.
[0079] In the embodiment illustrated in
[0080] The use of the PEARL algorithm should not be construed as a restriction. Other multi-model fitting algorithms may be chosen by the man skilled in the art.
[0081] In a particular embodiment, the two sensors are accelerometers, the first being worn on the right wrist and the second on the right ankle. Since the movements of the wrist and the movements of the ankle are only partially correlated, the corresponding signals are partially correlated as well.
[0082] In another embodiment, a first raw signal x, represented in
[0083] In yet another embodiment, a first signal is measured with a 12-lead ECG and a second PPG signal is measured with an arm-worn device.