Analysis of cardiac data

11497430 · 2022-11-15

Abstract

The present invention relates to a method of analysing cardiac data relating to a patient, comprising: providing cardiac data relating to the patient—optionally by using a means for providing physiological data (20); determining one or more properties of the data, wherein the or each property is determined over a particular context length, the context length being selected based on the or each property—optionally using an analysis module (24); comparing the or each property against a respective predetermined threshold value, thereby to indicate a probability of the patient experiencing a cardiac event—optionally using a means for providing an output (26); and providing an output based on the comparison. A system and apparatus corresponding to this method is also disclosed.

Claims

1. A method of analysing cardiac data relating to a patient, comprising: providing cardiac data relating to the patient; identifying a plurality of types of property; determining a plurality of context lengths, wherein a context length is determined for each type of property, and wherein each context length defines a number of heartbeats to be used for determining a property; determining a plurality of properties of the cardiac data, wherein each property is determined over the determined context length for the type of said property; comparing the plurality of properties against one or more predetermined threshold values, thereby to indicate a probability of the patient experiencing a cardiac event; and providing an output based on the comparison.

2. The method of claim 1, wherein the threshold value is determined based on a dataset comprising a plurality of data obtained from multiple sources.

3. The method of claim 1, wherein the context length is between 10 and 100,000 heartbeats.

4. The method of claim 1, wherein the context length is an optimally discriminating context length for a combination of types of property.

5. The method of claim 1, wherein the property is determined over a context length which is an optimally discriminating context length for the identified type of property.

6. The method of claim 4, wherein an optimally discriminating context length is determined using at least one of: a chi-squared (χ²) test, a Kolmogorov-Smirnov test, and an Energy test.

7. The method of claim 1, further comprising: providing further data relating to the patient, wherein the further data comprises at least one of: physiological data, demographic data, admission data, past medical history, laboratory data, imaging data; determining one or more properties of the further data, wherein the or each property of the further data is determined over a particular context length, the context length being selected based on the property; comparing the or each property of the further data against a respective predetermined threshold value for the further data, thereby to indicate a probability of the patient experiencing a cardiac event; and providing an output based on the comparison.

8. The method of claim 7, further comprising providing an output based upon a combination of the comparison of the property data and the comparison of the further data.

9. The method of claim 1, wherein the data comprises data from multiple heartbeats and/or wherein the data comprises RR intervals of multiple heartbeats, for example as indicated on an electrocardiogram (ECG).

10. The method of claim 9, further comprising at least one of: processing the data in batches, processing the data in batches of a size of at least 5 heartbeats.

11. The method of claim 1, wherein the type of property comprises at least one of: a property of multiple heartbeats; a mean of multiple heartbeats; a standard deviation of multiple heartbeats; a standard deviation in successive differences of multiple heartbeats; a measured heart rate variability (HRV) of a patient; and a fraction of multiple heartbeats that exceed an abnormality threshold.

12. The method of claim 1, further comprising calculating a rate of change of the property.

13. The method of claim 1, wherein the property is compared against the respective predetermined value for a given time window.

14. The method of claim 1, wherein the output comprises triggering an alert when the indicated probability exceeds a predetermined probability threshold and/or wherein the indicated probability comprises an indication of a corresponding time and/or a display of a period of highest risk.

15. The method of claim 1, wherein the indicated probability is determined using Bayesian inference.

16. The method of claim 1, wherein the indicated probability comprises an indication of a corresponding time and/or a display of a period of highest risk.

17. The method of claim 1, wherein the respective predetermined threshold is determined by: training at least two classifiers to classify a property of multiple heartbeats within the cardiac data using at least one machine learning algorithm; and combining the at least two classifiers to produce a hybrid classifier; wherein the combination is based on a performance metric.

18. A system for analysing cardiac data relating to a patient, comprising: a data module for providing cardiac data relating to the patient; an analysis module for: identifying a plurality of types of property; determining a plurality of context lengths, wherein a context length is determined for each type of property, and wherein each context length defines a number of heartbeats to be used for determining a property; determining a plurality of properties of the cardiac data, wherein each property is determined over the determined context length for the type of said property; a comparison module for comparing the plurality of properties against one or more predetermined threshold values, thereby to indicate a probability of the patient experiencing a cardiac event; and a presentation module for providing an output based on the comparison.

19. The system of claim 18, wherein the data module comprises at least one of: an electrocardiogram (ECG) machine; a pulsometer; a wearable cardioverter defibrillator; an implantable cardioverter defibrillator; a respiratory monitor; and a capnography monitor, or other such source extracting data from the cardiorespiratory system of a patient.

20. An apparatus for analysing patient health using data relating to a patient, comprising: an analysis module for: identifying a plurality of types of property; determining a plurality of context lengths, wherein a context length is determined for each type of property, and wherein each context length defines a number of heartbeats to be used for determining a property; determining a plurality of properties of the cardiac data, wherein each property is determined over the determined context length for the type of said property; a comparison module for comparing the properties against one or more predetermined threshold values, thereby to indicate a probability of the patient experiencing a cardiac event; and a presentation module for providing an output based on the comparison.

Description

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

(1) At least one exemplary embodiment of the present invention will now be described with reference to the accompanying figures, wherein similar reference numerals may be used to refer to similar features, and in which:

(2) FIG. 1 shows an example of a typical electrocardiographic tracing;

(3) FIG. 2a is a general process flowchart for the method described herein;

(4) FIG. 2b is a specific process flowchart for the method described herein as pertaining to one of the possible inputs to the algorithm;

(5) FIGS. 3a and 3b show all of the heartbeats in a “Spontaneous Ventricular Tachyarrhythmia (VTA) Database”, before and after outliers have been removed, respectively;

(6) FIG. 4 shows the distribution of RR intervals for a heart leading up to an arrhythmia (circles), the same distribution with 5 minutes of data removed prior to the episode (triangles), and a normally functioning heart (squares);

(7) FIG. 5 shows the time evolution of the mean RR interval leading up to an arrhythmia at t=0 s for the ‘arrhythmic’ distribution;

(8) FIG. 6 shows the time evolution of one of the time domain inputs to the algorithm, the standard deviation in RR intervals;

(9) FIG. 7 shows the statistical compatibility of the SD2 variable between the ‘arrhythmic’ and ‘normal’ distributions as a function of the number of beats included in the computation;

(10) FIG. 8 shows the time evolution of the probability for an arrhythmic episode for the Random Forest classifier;

(11) FIG. 9 shows the separation between the ‘arrhythmic’ and ‘normal’ probability distributions as a function of probability in units of standard deviations for the Random Forest classifier;

(12) FIG. 10 shows the distribution of fraction of abnormal beats for ‘arrhythmic’ and ‘normal’ patients scaled to unit area such that the y-axis scale is in arbitrary units (A.U.);

(13) FIG. 11 shows an exemplary system for predicting cardiac events; and

(14) FIG. 12 shows a component diagram for analysing patient data and displaying an output.

DESCRIPTION OF SPECIFIC EMBODIMENTS OF THE INVENTION

(15) In what follows, cardiac data that terminate with a Ventricular Tachyarrhythmia (VTA) is referred to as ‘arrhythmic’, and cardiac data from control samples is labelled as ‘normal’.

(16) Prediction Method for Use on Patients

(17) FIG. 2a illustrates a method of predicting cardiac events. Physiological data, obtained from a monitoring device, are input into an analysis module comprising a pre-trained classifier. The physiological data can comprise data relating to the patient that is collected in real time, for example cardiac data. In some embodiments, the physiological data alternatively or additionally comprises respiratory data relating to the patient collected from a respiratory monitoring device, which are also input into the pre-trained classifier.

(18) The analysis module uses the input physiological data to analyse the heartbeat of the patient, and determine one or more probabilities of the patient experiencing a cardiac event within a period of time in the future.

(19) RR interval sequences, as illustrated in FIG. 1, are taken as input, data analysis is performed, and a classifier separates ‘arrhythmic’ and ‘normal’ beat sequences. The classifier attributes a probability to each heartbeat and then aggregates the output into an abnormality fraction in [0,1] that forms the basis for a decision to alert the healthcare provider. The abnormality fraction may thereby serve as a useful, actionable, easy-to-interpret number that may guide healthcare providers, patients, or other people who may be in a position to assist the patient. The warning may enable preparation for an appropriate response to the cardiac event and/or prevention of the cardiac event, such as by administering medications and running diagnostic tests.

(20) The method will now be described in more detail, with reference to the process flow illustrated in FIG. 2b. Physiological data, in this example in the form of RR intervals, are input into the analysis module from a monitoring device. Physiological data may contain false measurements (e.g. “outliers”) owing, for example, to movement of the patient and/or poor connections in the monitoring device, which lead to artefacts in the datasets.

(21) Patients that suffer from VTAs are also likely to suffer from ectopic beats such as premature ventricular complexes. Therefore, as indicated in the process flow in FIG. 2b, it is necessary to identify and remove any outliers from the physiological data. This may be achieved using, for example, criteria described in G. D. Clifford, F. Azuaje, and P. E. McSharry, Advanced Methods and Tools for ECG Data Analysis, Artech House Publishers, 2006. The presence of ectopic beats is, however, recorded and used in the subsequent analysis (see below).

(22) The effect of outlier removal on the data is illustrated in FIGS. 3a and 3b. FIG. 3a illustrates the raw RR interval data taken from the “Spontaneous Ventricular Tachyarrhythmia (VTA) Database”, which comprises multiple outlying datapoints 10. The cleaned version of the same data is shown in FIG. 3b.

(23) The cleaned physiological data are then pre-processed as indicated in the process flow of FIG. 2b to obtain unbiased measurements of frequency domain parameters (as discussed below). More specifically, the data first undergoes cubic spline interpolation, and is then resampled at, for example, 7 Hz. Subsequently, the spectral power is computed using a Welch periodogram, for example with a 256-point window overlapped at 50%.
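
By way of illustration, the pre-processing chain just described may be sketched in Python as follows. The function name and the synthetic tachogram are illustrative only; the sketch assumes SciPy's CubicSpline and welch routines (the latter applies 50% segment overlap by default), and is not the implementation of the disclosure:

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import welch

def rr_spectral_power(rr_s, fs=7.0, nperseg=256):
    """Interpolate an RR-interval tachogram with a cubic spline, resample
    it uniformly at fs Hz, and compute its spectral power with a Welch
    periodogram (nperseg-point window, 50% overlap by default)."""
    rr_s = np.asarray(rr_s, dtype=float)
    t = np.cumsum(rr_s)                        # beat times in seconds
    spline = CubicSpline(t, rr_s)              # cubic spline interpolation
    t_uniform = np.arange(t[0], t[-1], 1.0 / fs)
    rr_uniform = spline(t_uniform)             # resampled at fs Hz
    # SciPy's welch overlaps segments by 50% unless noverlap is given.
    freqs, power = welch(rr_uniform, fs=fs, nperseg=nperseg)
    return freqs, power

# Synthetic tachogram: ~600 beats around 0.8 s with a slow modulation.
rr = 0.8 + 0.05 * np.sin(np.linspace(0.0, 60.0, 600))
freqs, power = rr_spectral_power(rr)
```

The returned spectrum can then be integrated over the VLF, LF and HF bands to obtain the frequency domain parameters discussed below.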

(24) A series of derived quantities are computed based on RR interval data. The derived quantities (listed below) are referred to (interchangeably) as ‘features’ or ‘properties’:

(25) i) Time Domain: the arithmetic mean, μ, of the RR intervals; the standard deviation, σ, of the RR intervals; the standard deviation in successive differences, σ_Diff, of the RR intervals.

(26) The distribution of RR intervals in the time domain can provide valuable data relating to the probability of a patient undergoing a cardiac event.
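
The three time-domain features listed under i) may be computed over a given context window with a minimal NumPy sketch such as the following (the function name and the sample values are illustrative only):

```python
import numpy as np

def time_domain_features(rr):
    """Time-domain features over a context window of RR intervals (seconds)."""
    rr = np.asarray(rr, dtype=float)
    mean_rr = rr.mean()             # arithmetic mean, mu
    sd_rr = rr.std()                # standard deviation, sigma
    sd_diff = np.diff(rr).std()     # std of successive differences, sigma_Diff
    return mean_rr, sd_rr, sd_diff

mu, sigma, sigma_diff = time_domain_features([0.80, 0.82, 0.78, 0.85, 0.79])
```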

(27) For example, FIG. 4 illustrates distributions of RR intervals for Arrhythmic and Normal sets of heartbeats, measured in arbitrary units (A.U.). The distribution of normal heartbeats is shown using squares, while that of the arrhythmic heartbeats is shown using circles. Furthermore, the distribution of arrhythmic heartbeats with five minutes of data prior to the arrhythmia is shown using triangles.

(28) FIG. 5 shows the time evolution of the mean RR interval leading up to an arrhythmia at t=0 s for the ‘Arrhythmic’ distribution. In particular, FIG. 5 illustrates the dramatic drop in mean RR intervals at the onset of the arrhythmia near t=0 s.

(29) FIG. 6 shows the time evolution of one of the time domain inputs to the algorithm, the standard deviation in RR intervals. The cardiac event occurs at t=0 s, which appears at the leftmost point of the x-axis. Time flows from right (the past) to left (terminating at the event).

(30) ii) Nonlinear Poincaré: Poincaré nonlinear analysis variables SD1, SD2, and SD1/SD2.

(31) A Poincaré HRV plot is a graph in which successive RR intervals are plotted against one another. From this plot values for SD1 (the dispersion of points perpendicular to the line of identity) and SD2 (the dispersion of points parallel to the line of identity) are determinable. These plots, and the determination of the SD1 and SD2 values, are well known. SD1, SD2, or a combination of SD1 and SD2 are used as inputs to the AI classifier.
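
Since the determination of SD1 and SD2 is well known, it may be sketched using the standard relations SD1 = σ(ΔRR)/√2 and SD2 = √(2σ(RR)² − SD1²); the function name and the synthetic tachogram below are illustrative:

```python
import numpy as np

def poincare_sd(rr):
    """SD1/SD2 descriptors of a Poincaré plot of successive RR intervals."""
    rr = np.asarray(rr, dtype=float)
    diff = np.diff(rr)
    # SD1: dispersion of points perpendicular to the line of identity.
    sd1 = diff.std() / np.sqrt(2.0)
    # SD2: dispersion of points parallel to the line of identity.
    sd2 = np.sqrt(max(2.0 * rr.std() ** 2 - sd1 ** 2, 0.0))
    return sd1, sd2, sd1 / sd2

# Smooth synthetic tachogram: long-term variability dominates (SD2 > SD1).
rr = 0.8 + 0.02 * np.sin(np.linspace(0.0, 6.0, 50))
sd1, sd2, ratio = poincare_sd(rr)
```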

(32) iii) Sample Entropy: sample entropy over four epochs, S1, S2, S3 and S4.
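
One common formulation of sample entropy, SampEn(m, r) = −ln(A/B), may be sketched as follows. The embedding dimension m = 2 and tolerance r = 0.2σ are conventional defaults, not values taken from the disclosure; the per-epoch values S1–S4 would be obtained by applying the function to four successive epochs of data:

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """Sample entropy SampEn(m, r): -ln(A/B), where B counts template
    matches of length m and A counts matches of length m + 1."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()                  # a common tolerance choice
    n = len(x)

    def matches(length):
        templates = np.array([x[i:i + length] for i in range(n - length)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance from template i to every later template.
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(d <= r))
        return count

    b = matches(m)
    a = matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else float("inf")

# A strictly periodic rhythm is highly regular, so its entropy is near zero.
se_regular = sample_entropy(np.tile([0.8, 0.9], 50))
```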

(33) iv) Frequency Domain: frequency domain parameters VLF, LF, HF and LF/HF, derived from the spectral power calculated from the Welch periodogram.

(34) v) Ectopic Beat Frequency: the relative frequency of ectopic beats, f_e.

(35) The optimal context for each feature, i.e. the optimal—or maximally discriminating—‘context length’ (as discussed below) for determining whether a feature is indicative of a cardiac event, is determined before each feature is input into an Artificial Intelligence based classifier.

(36) The features derived from the RR interval data are input into an Artificial Intelligence Based Classifier (the AI classifier). The AI classifier can comprise a pre-trained classifier, or preferably multiple pre-trained classifiers combined into a hybrid classifier, that has been trained (as described below) to identify abnormal beats in the physiological data by assigning a probability (i.e. a number in [0,1]) to each heartbeat that reflects the likelihood for the given heartbeat to lead to an arrhythmic episode.

(37) In order to arrive at a robust decision, the number of ‘abnormal’ heartbeats (e.g. which cross a threshold probability) is counted, and the fraction of said ‘abnormal’ heartbeats occurring in a given time window (for example, five minutes) is computed. This leads to an abnormality fraction, F, which is attributed to each patient. A ‘yes/no’ decision is then made based on this fraction, and an alert may be issued (or another action taken) for positive decisions. The alert may, for example, indicate that a cardiac event is predicted; in some embodiments, it also provides additional data related to the probability of the event occurring.
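
The abnormality-fraction decision described above can be sketched as follows. The function names and the decision boundary of 0.3 are purely illustrative assumptions, not values specified by the disclosure:

```python
import numpy as np

def abnormality_fraction(beat_probs, prob_threshold=0.5):
    """Fraction F of heartbeats in a time window whose classifier
    probability crosses the per-beat abnormality threshold."""
    p = np.asarray(beat_probs, dtype=float)
    return float(np.mean(p > prob_threshold))

def alert_decision(beat_probs, prob_threshold=0.5, fraction_threshold=0.3):
    """Yes/no alert decision based on the abnormality fraction F;
    the 0.3 decision boundary here is illustrative only."""
    f = abnormality_fraction(beat_probs, prob_threshold)
    return f, f > fraction_threshold

# Hypothetical per-beat probabilities for six consecutive heartbeats.
f, alert = alert_decision([0.9, 0.1, 0.8, 0.7, 0.2, 0.6])
```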

(38) The counting of ‘abnormal’ heartbeats may also be used to obtain a rate of change of the occurrence of ‘abnormal’ heartbeats, where this rate of change may be used to identify both that a cardiac event is likely, and also to predict an urgency—where a high rate of change may indicate that a cardiac event is likely to occur soon.

(39) Classifier Training/Architecture

(40) The AI classifier can be trained by a machine learning system receiving as input examples of heartbeats from a training dataset comprising known normal and abnormal heartbeats from which the system can learn to predict whether an arrhythmia is going to occur. Each heartbeat in the training data set is represented as a real-valued vector containing values for features that describe the specific heartbeat, and enable a classification to be made. The training data is pre-processed in the same way as described above in relation to FIG. 2b, providing the same features that will be used in the prediction method for use in the training process.

(41) There is freedom in the number of preceding heartbeats that should be included in the computation of a feature. This is referred to herein as ‘context length’. Multiple context lengths from 10 beats to 100,000 beats (though preferably context lengths of less than around 3,600) are considered as variables for time domain measures (μ, σ, and σ_Diff) and Poincaré nonlinear analysis.

(42) A χ²-test (‘chi-squared’ test) for statistical compatibility is performed for each ‘feature’ (i.e. derived quantity) and each context length between the ‘arrhythmic’ and ‘normal’ data sample distributions. Context lengths that are optimally discriminating, i.e. where the data range is the most significant for detecting a cardiac event, can then be selected, as evidenced by a large χ²/ndf between the respective distributions, where “ndf” is the number of degrees of freedom.
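
The context-length scan may be illustrated with the following sketch, which computes a χ²/ndf figure of merit between binned feature distributions and selects the length that maximises it. The synthetic ‘arrhythmic’ and ‘normal’ samples, the candidate lengths, and the choice of the standard deviation as the scanned feature are illustrative assumptions, not the data behind FIG. 7:

```python
import numpy as np

def chi2_per_ndf(sample_a, sample_b, bins=20):
    """Chi-squared per degree of freedom between two binned feature
    distributions; larger values indicate better discrimination."""
    lo = min(np.min(sample_a), np.min(sample_b))
    hi = max(np.max(sample_a), np.max(sample_b))
    h_a, _ = np.histogram(sample_a, bins=bins, range=(lo, hi))
    h_b, _ = np.histogram(sample_b, bins=bins, range=(lo, hi))
    mask = (h_a + h_b) > 0
    chi2 = np.sum((h_a[mask] - h_b[mask]) ** 2 / (h_a[mask] + h_b[mask]))
    return chi2 / max(mask.sum() - 1, 1)

def best_context_length(arr_rr, norm_rr, lengths, feature=np.std):
    """Scan candidate context lengths and return the most discriminating."""
    scores = {}
    for L in lengths:
        f_arr = np.array([feature(arr_rr[i - L:i]) for i in range(L, len(arr_rr) + 1)])
        f_norm = np.array([feature(norm_rr[i - L:i]) for i in range(L, len(norm_rr) + 1)])
        scores[L] = chi2_per_ndf(f_arr, f_norm)
    return max(scores, key=scores.get), scores

rng = np.random.default_rng(1)
arrhythmic = rng.normal(0.65, 0.08, 500)   # synthetic 'arrhythmic' RR intervals
normal = rng.normal(0.85, 0.05, 500)       # synthetic 'normal' RR intervals
best, scores = best_context_length(arrhythmic, normal, [10, 30, 100])
```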

(43) For example, the χ²/ndf for different context lengths for the SD2 variable is shown in FIG. 7 to illustrate that a context length of 230 beats is optimally discriminating for this variable. Determining the optimal context length preferably occurs prior to training. The context length is then held constant during the classifier training phase.

(44) A maximum context length may also be enforced in order to limit the data storage needed, and the recording time needed. The 3,600 beats mentioned previously may be used to limit the amount of data which must be considered.

(45) In order to use the available dataset maximally, a 10-fold cross-validation is performed, whereby the dataset is divided into ten parts and the model is trained ten times. Each time, eight parts are used for training, one part for hyper-parameter tuning and one part for testing. The assignment of different folds is rotated during the ten times.
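
The rotating fold assignment described above can be sketched as follows; the representation of each split as a (train, dev, test) tuple of fold indices is an illustrative choice:

```python
def rotating_folds(n_folds=10):
    """10-fold cross-validation assignments: in each of the ten
    repetitions, eight folds train the model, one tunes hyper-parameters
    (dev) and one tests; the roles rotate so every fold is the test set
    exactly once."""
    splits = []
    for rep in range(n_folds):
        test_fold = rep
        dev_fold = (rep + 1) % n_folds
        train_folds = [f for f in range(n_folds)
                       if f not in (test_fold, dev_fold)]
        splits.append((train_folds, dev_fold, test_fold))
    return splits

splits = rotating_folds()
```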

(46) Five separate machine learning algorithms, in particular, can be used in order to train classifiers (although this method is, of course, extendable to other algorithms). The algorithms are then, preferably, later combined to form a hybrid algorithm, in order to take advantage of each of their strengths.

(47) In some embodiments, the classifier is a long short-term memory unit which may record values over an arbitrary time interval. This type of classifier is particularly useful for processes which have time lags between events (such as cardiac events).

(48) In some embodiments, a convolutional neural network could be used to detect patterns within the recorded data, where this may be combined with an attention mechanism. An attention mechanism enables the neural network to ‘learn’ where it needs to focus and dynamically assign more importance to those areas. The attention mechanism calculates a weight for each time-window in the input stream and uses it to scale the importance of information coming from that window. This method has been shown to be very successful in other domains such as language processing and also enables visualisation of where the model is focusing, thereby making the actions of the system more human-interpretable.

(49) 1. Artificial Neural Network

(50) The feature vectors are given as input to an artificial neural network consisting of three layers. The first layer is an “input layer”, the size of which depends on the number of features in the feature vectors. The second layer is a “hidden layer” of size 10 with tanh activation. The third layer is a single neuron with sigmoid activation. The neurons in the hidden layer automatically discover useful features from the input data, and the model can then make a prediction based on this higher-level representation. The network may be optimised using AdaDelta, for example, with parameters updated based on mean squared error as the loss function. The model may be tested on the development set after every full pass through the training data, preferably wherein the best model is used for final evaluation.
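
A minimal NumPy sketch of the forward pass of this three-layer network is given below. Training (AdaDelta optimisation against a mean-squared-error loss, as described above) is omitted, and the feature count of 14, the weight scale, and the random initialisation are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_network(n_features, hidden=10):
    """Random weights for the described architecture:
    input layer -> hidden layer (tanh, size 10) -> single sigmoid neuron."""
    return {"W1": rng.normal(0.0, 0.1, (n_features, hidden)),
            "b1": np.zeros(hidden),
            "W2": rng.normal(0.0, 0.1, (hidden, 1)),
            "b2": np.zeros(1)}

def forward(params, x):
    """Per-heartbeat probability in (0, 1) for a batch of feature vectors."""
    h = np.tanh(x @ params["W1"] + params["b1"])     # hidden layer, tanh
    z = h @ params["W2"] + params["b2"]              # output neuron
    return (1.0 / (1.0 + np.exp(-z))).ravel()        # sigmoid activation

# Hypothetical batch of five heartbeats described by 14 features each.
params = init_network(n_features=14)
probs = forward(params, rng.normal(size=(5, 14)))
```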

(51) 2. Support Vector Machines (SVM)

(52) Support Vector Machines (SVM) are a separate class of supervised machine learning algorithms. Instead of focusing on finding useful features, they treat the problem as a task of separation in a high-dimensional space. Given that the feature vectors contain n features, they aim to find an (n-1)-dimensional hyperplane that best separates the positive and negative cases. This hyperplane is optimised during training so that the distance to the nearest datapoint in either class is maximal.

(53) 3. k-Nearest Neighbours

(54) k-Nearest Neighbours (k-NN) is an algorithm that analyses individual points in the high-dimensional feature space. Given a new feature vector that we wish to classify, k-NN returns the k most similar points from the training data. Since we know the labels of these points, k-NN assigns the most frequent label as the prediction for the new point. This offers an alternative view of the problem—it no longer assumes that heartbeats of a single class are in a similar area in the feature space, but instead allows us to look for individual points that have very similar features.

(55) 4. Gaussian Process

(56) Gaussian Process is a statistical model where each datapoint is associated with a normally distributed random variable. The Gaussian Process itself is a distribution over distributions, which is learned during training. This model associates each prediction also with a measure of uncertainty, allowing us to evaluate how confident the model is in its own classification. As this type of model is difficult to train with more than 3,000 datapoints, it is preferable to ensure that a suitable size is sampled during training.

(57) 5. Random Forest

(58) Random forests are based on constructing multiple decision trees and averaging the results. Each decision tree is a model that attempts to separate two samples based on sequential splittings for each input feature. In this implementation, datapoints that are misclassified are given a weight larger than one (referred to as ‘boosting’).

(59) Each classifier assigns a probability (i.e. a number in [0,1]) to each heartbeat that reflects the likelihood for the given heartbeat to lead to an arrhythmic episode. Several different thresholds for the probability may be considered and the value that optimally separates the ‘arrhythmic’ and ‘normal’ datasets is chosen. This may be referred to as optimal classification separation.

(60) In some embodiments, the methods of predicting cardiac events are used (and/or embedded) within a portable device, such as a pacemaker or an implantable cardioverter-defibrillator. Within such a device, it is important that computations are minimised, to maximise the battery life of the device. In order to achieve this, algorithms with low computational cost are used (possibly at the expense of some accuracy).

(61) An example of using low computational cost algorithms is the use of difference of area (DOA) methods, which have a low complexity, within waveform analysis. Bin area methods (BAM) may also be used as these provide a trade-off between complexity and accuracy. More generally, it is preferable to use algorithms which analyse time domain features as opposed to those which analyse frequency domain features.

(62) In order to speed up the execution of the Random Forest algorithm, in some embodiments each input feature is discretised so that the volume of information fed to the decision trees is reduced. This approach is used to speed up the execution of the classifier and to reduce the effect of noise by choosing step sizes greater than the fluctuations present in the features on account of noise.
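
The discretisation of an input feature may be sketched as a simple quantisation onto a fixed grid; the step size of 0.05 below is an illustrative value, chosen in practice to exceed the noise fluctuations of the feature:

```python
import numpy as np

def discretise(values, step):
    """Quantise feature values onto a grid of the given step size; a step
    larger than the noise fluctuations suppresses noise and reduces the
    volume of information fed to the decision trees."""
    return np.round(np.asarray(values, dtype=float) / step) * step

quantised = discretise([0.81, 0.84, 0.86], step=0.05)
```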

(63) In some embodiments, classifiers are formed using ‘distilling’. First, a very complex and computation-intensive neural network is trained. Next, a simpler and faster model is constructed and trained on the output of the former model. This approach results in models (and classifiers) that have the benefits of both speed and accuracy.

(64) ‘Batching’ is another method that is used in some embodiments to speed up computation. If a model has limited processing power and cannot process one heartbeat at a time, the incoming data can be combined into batches of ten heartbeats to reduce the computational burden. This results in the model being up to ten beats behind in making predictions, but enables the use of more accurate models.

(65) In some embodiments, an adversarial training model is used, where cases for which the classifier would misclassify data are determined and these cases are used to improve the performance of the classifiers.

(66) As an example: a neural network is provided that is trained to classify RR sequences. Starting with a healthy rhythm, it is determined which (small) changes need to be made to this rhythm in order for the network to misclassify it as a VT example. This method then enables identification of the weak points of the network. These examples (of misclassified datasets) are subsequently introduced into the training data and the classifiers are trained to classify them correctly. This results in a more robust model with a decreased likelihood of misclassifications.

(67) FIG. 8 shows the time evolution of the probability for an arrhythmic episode for the Random Forest classifier. The significance of the separation (between the ‘arrhythmic’ and ‘normal’ datasets in standard deviations) as a function of threshold probability is shown for the Random Forest classifier in FIG. 9, which indicates that a threshold of 50% leads to a significance of roughly 1.9 standard deviations.

(68) FIG. 10 illustrates the distributions of abnormality fractions, F—for ‘arrhythmic’ and ‘normal’ patients for a Random Forest classifier. The distributions have been normalised to unit area for presentational purposes, where A.U. stands for arbitrary units.

(69) An optimal decision boundary is arrived at by minimising the root mean square error, denoted as RMSE, and defined as:

(70) RMSE = √(Σ_i (F_i - F_decision)²)   (Equation 1.1)

(71) where F_i is the fraction of abnormal heartbeats for the ith misclassified patient and F_decision is the abnormality fraction under consideration. RMSE can be thought of as a measure of distance from the decision boundary for misclassifications.

(72) A hybrid classifier may be created by combining the abnormality fractions, F, for each model listed above. The combination is a weighted sum defined as:

(73) F_hybrid = Σ_j (w_j F_j)   (Equation 1.2)

(74) where w_j is the weight attributed to the jth classifier and F_j is the corresponding abnormality fraction. The weights, w_j, are determined according to the performance of the classifiers, as measured by their RMSE values.

(75) More specifically, the weights, w_j, are determined dependent upon their RMSE values over misclassifications. The motivation for doing so is to achieve optimal performance of the resulting hybrid classifier in an unbiased way. Other commonly used metrics could lead to the wrong weights being attributed to classifiers and, consequently, to suboptimal decisions.
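
Equation 1.2 may be sketched as follows. The exact mapping from a classifier's RMSE to its weight w_j is not specified above, so the inverse-RMSE weighting and the normalisation of the weights to sum to one are illustrative assumptions:

```python
import numpy as np

def hybrid_fraction(fractions, rmse):
    """Combine per-classifier abnormality fractions F_j into F_hybrid via a
    weighted sum (Equation 1.2). Lower RMSE -> larger weight; the
    inverse-RMSE mapping used here is an illustrative assumption."""
    fractions = np.asarray(fractions, dtype=float)
    w = 1.0 / np.asarray(rmse, dtype=float)
    w /= w.sum()                       # normalise the weights to sum to 1
    return float(np.dot(w, fractions))

# Equal RMSE values reduce the weighted sum to a plain average.
f_equal = hybrid_fraction([0.2, 0.4], [1.0, 1.0])
```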

(76) The performance of the method described herein may be determined according to three metrics, listed below.

Accuracy (A), defined as:

(77) A = (TP + TN) / (TP + TN + FP + FN)   (Equation 1.3)

(78) where the numerator is a sum of true positives (TP) and true negatives (TN) and the denominator additionally includes false positives (FP) and false negatives (FN).

Sensitivity (SE), defined as:

(79) SE = TP / (TP + FN)   (Equation 1.4)

Specificity (SP), defined as:

(80) SP = TN / (TN + FP)   (Equation 1.5)
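
Equations 1.3 to 1.5 translate directly into code; the confusion-matrix counts in the example are illustrative:

```python
def performance_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity (Equations 1.3-1.5) from
    confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)      # Equation 1.3
    sensitivity = tp / (tp + fn)                    # Equation 1.4
    specificity = tn / (tn + fp)                    # Equation 1.5
    return accuracy, sensitivity, specificity

# Hypothetical evaluation: 8 TP, 85 TN, 5 FP, 2 FN.
a, se, sp = performance_metrics(tp=8, tn=85, fp=5, fn=2)
```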

(81) The method described herein may be integrated with and/or implemented by existing patient monitoring equipment.

(82) System Architecture

(83) FIG. 11 illustrates an example of a system for predicting cardiac events. A physiological data source 20 (e.g. means for providing physiological data), extracted from the cardiorespiratory system of a patient 22, is communicated to an analysis module 24, which analyses the extracted physiological data. This communication may occur over a wired connection or a wireless network. The physiological data source 20 can be, for example, an electrocardiogram (ECG) machine; a pulsometer; a wearable cardioverter defibrillator; an implantable cardioverter defibrillator; a respiratory monitor; and/or a capnography monitor.

(84) The analysis module 24 is configured to evaluate the extracted physiological data, for example evaluating a property of multiple heartbeats in the data, and determine whether said property exceeds an abnormality threshold. This information is then used to derive a probability of the patient experiencing a cardiac event, for example using the method described above in relation to FIGS. 2a and 2b to evaluate said property and derive said probability.

(85) The analysis module 24 comprises a hybrid classifier trained and operating as described above in relation to FIG. 2b. The module 24 may comprise part of a dedicated machine, for example running locally to the patient and data source, or be part of a network, running on a server or in the “cloud”.

(86) If the analysis module 24 determines that the probability of the patient experiencing a cardiac event in a subsequent time period is above some pre-defined threshold, then the analysis module will trigger a means for providing an output 26, for example an alarm, or other alert, that can alert a healthcare provider that the patient is at risk. This can enable the healthcare provider to take preventative action.

(87) Display of Output

(88) The output displays one or more probabilities, as determined using the methods described. The probabilities are output in numerous forms. First, a binary assessment is used as a threshold indicator, where a critical value triggers an alarm. This is particularly useful as a first indicator that a patient may require attention. A threshold here is used to indicate that urgent help is required, or that patient data should be looked at more closely. There may be multiple thresholds, each with a differing level of urgency. Second, a probability of a cardiac event is output, which allows a user to allocate resources, and make other decisions, appropriately. An uncertainty estimate is output alongside this probability. In differing embodiments, the probability is output quantitatively (for example as a percentage risk) and/or qualitatively (for example, a patient may be categorised as one of low risk, medium risk, or high risk, where these correspond to probability ranges). A qualitative measure may be used to simplify the immediate interpretation by a user. Third, a probability density function is output, which allows a user to more fully assess a situation.

(89) These probabilities are typically used in conjunction so that, upon a threshold risk being passed, a user is directed to view a probability, or a probability function, to determine an appropriate action. This can then be used as a general indicator of a patient's health, where an increased likelihood of a cardiac event indicates that a patient is more likely to need attention during a certain period.

(90) An uncertainty also being displayed further aids the determination of an appropriate action. A potential problem with any data-based analysis, particularly analysis of a complex situation, such as the prediction of a cardiac event, is that a precise result is rarely achievable; this means that a figure (such as a probability) on its own is of limited use, especially given the difficulty of determining whether the figure is reasonable. The inclusion of an uncertainty-based measure (such as a variance, or error bounds) enables a better judgement to be made regarding any given figure or probability.

(91) Advantageously, a probability enables a user to make a rapid assessment, as a probability is intuitively interpreted more easily than, for example, a risk score. Additionally, a probability density function gives a user a large amount of information in a concise format.

(92) In various embodiments, probabilities are also output for a number of timeframes. An initial output is simply a probability without any time reference. A more useful output is a probable time-to-cardiac-event. More specifically, probabilities may be output for time ranges, where this allows efficient allocation of resources.

(93) The outputting of probability density functions for numerous timeframes enables limited resources to be scheduled effectively: for example a limited number of staff to be directed to be ready to assist certain patients at times of increased risk; a probability density function may be used to assess whether a cardiac event is almost certain or whether the risk is more unpredictable.

(94) In some embodiments, a probability density function is displayed numerically, where a mean, a standard deviation, and a kurtosis (indicating the weight of the tails of the distribution) are displayed. In these, or other, embodiments, the function is (also) displayed graphically.

(95) There are, in some embodiments, numerous, user selectable, ways to illustrate a probability, for example a best fit normal distribution, a skew normal distribution, or a Poisson distribution. A preferred distribution is suggested during analysis, where a suitable distribution depends on, for example, the amount of information available.

(96) In some embodiments, the probability assessment is continuously updated, where this occurs as relevant information is obtained. An initial assessment uses historic data, and/or admissions data; this initial assessment is then updated (and improved) using recorded and evaluated data (such as the RR intervals above) as it becomes available.

(97) In preferred embodiments, a Bayesian probabilistic framework is used in this updating, where Bayesian inference is used to obtain a probability. This is related to a form of Bayes rule, which is displayed in equation 2.1 below:

(98) P(Y|X,α) = P(X|Y)·P(Y|α) / P(X|α) ∝ P(X|Y)·P(Y|α)  (Equation 2.1)

(99) where: P(Y|α) is the prior distribution (i.e. the previously calculated probability);

(100) P(Y|X,α) is the posterior distribution (i.e. the updated probability);

(101) P(X|α) is the marginal likelihood (i.e. the likelihood of the recently sampled data given the entire set of data);

(102) P(X|Y) is the sampling distribution (i.e. the probability of the observed data given the current distribution); and

(103) α is the hyperparameter of the parameter distribution (i.e. Y˜P(Y|α)).

(104) This equation is used to derive an updated probability based upon a prior probability and the probability of the occurrence of the recently sampled data. Using this equation, recent data which is indicative of a cardiac event being likely would be more concerning in a patient previously judged to be high-risk than it would in a patient previously judged to be low-risk (an interpretation of this is that in the low-risk patient this data is more likely to be anomalous). The use of Bayesian inference is then useful for reducing the rate of false positives, as the prior probability will be small for low-risk patients.

(105) Notably, in the given example, the occurrence of data indicative of a cardiac event would be unlikely given the prior distribution, and so this would have a significant effect on the posterior distribution. Due to this, the data would not simply be written off entirely as anomalous; while it may not immediately result in a warning, continued occurrence of data indicative of a likely cardiac event would rapidly increase the probability (so that the chance of missing a cardiac event is unlikely); however, advantageously, a single (potentially anomalous) datapoint would not trigger a false positive warning.

(106) To further reduce the likelihood of false negatives, in some embodiments, a Bayesian inference model is used alongside a threshold marginal likelihood: a marginal likelihood which is indicative of a very high chance of an upcoming cardiac event then triggers a warning even if the overall probability remains low due to a consistently low prior probability.
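The Bayesian updating of equation 2.1, together with the marginal-likelihood override just described, can be sketched for a binary event as follows; the likelihood values and the thresholds in this sketch are illustrative assumptions rather than figures from the specification.

```python
# Hedged sketch of the Bayesian update of equation 2.1 for a binary
# outcome Y in {event, no event}, plus the warning override described
# above. All numeric thresholds here are assumptions for illustration.
def bayes_update(prior: float, p_data_given_event: float,
                 p_data_given_no_event: float) -> float:
    """Posterior P(event | data) via Bayes' rule.

    The marginal likelihood P(data) is expanded as
    P(data|event)*prior + P(data|no event)*(1 - prior).
    """
    marginal = p_data_given_event * prior + p_data_given_no_event * (1 - prior)
    return p_data_given_event * prior / marginal


def should_warn(posterior: float, p_data_given_event: float,
                posterior_threshold: float = 0.5,
                likelihood_threshold: float = 0.95) -> bool:
    """Warn on a high posterior, or (per the override above) on data that
    is very strongly indicative of an event, even when a consistently low
    prior keeps the posterior small."""
    return (posterior > posterior_threshold
            or p_data_given_event > likelihood_threshold)
```

Feeding each posterior back in as the next prior reproduces the behaviour described above: in a low-risk patient a single event-indicative observation raises the probability only modestly, while repeated such observations rapidly drive it upwards.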

(107) The updating of the probability takes place periodically (for example every five seconds, or every minute), where a longer update (or refresh) period uses less computing power. This update period is, in some embodiments, small enough that the probability is updated effectively continuously (i.e. the period is so small as not to be noticeable by a user).

(108) In some embodiments, there is a component within the apparatus which allows a choice of the update period—this may also be selectively determined based on the use of the apparatus (where an implanted device may prioritise battery longevity over rapid updates).

(109) A consideration here is that, in many situations, it is possible to maintain an accurate probability while making only periodic updates, especially where there is a strong prior distribution (i.e. where measurements have been taken for a long time). The update period is then based upon the prior distribution. As an upper limit, the update period may be capped, so that updates remain regular enough not to miss a cardiac event.

(110) FIG. 12 shows a component diagram for analysing patient data and displaying an output.

(111) One or more measurement device(s) (e.g. an ECG, a patient file) 32 transmit(s) data to a local server 34. These data are then transmitted to a network server 36, and fed through an analysis module 24 (as discussed above, e.g. with reference to FIG. 2b). The output of the classifier passes through a results formatter 40 before being transmitted back to the local server 34 (this results formatter 40, for example, formats results to be output as a warning alarm, or as a display of probability). The output is then presented on a UI 42 for one or more users, using, for example, a smartphone, a screen, or a display distributed by a hospital. In some embodiments, the UI also comprises a speaker, which provides an audible output if a threshold probability is exceeded.

(112) By sending data via a network server 36, instead of storing all data on a local device, the data can be displayed to numerous users simultaneously. This allows multiple opinions to be gathered, or numerous users to be alerted simultaneously, so that the user in the best position to respond may be notified.

(113) The use of a network server 36 also enables remote monitoring of a patient. This may be used for a patient with an implantable device, where data recorded by the device is transferred to a network server 36, evaluated by the analysis module 24, and then displayed on a UI 42 to both the user and (separately) a healthcare professional, who may then check on the user at an appropriate time.

(114) The figures as described above show a system for monitoring a patient. As a general overview: in FIG. 11, there is a patient 22, for whom it is desired to output a probability of a cardiac event. A means for providing physiological data 20, such as an electrocardiogram (ECG), is used to obtain this data. Typical data is shown in FIG. 1; specific data, such as the RR intervals, is extracted from this data. This data is then fed into an analysis module 24, which is discussed with reference to FIG. 2b.

(115) The analysis module 24 is provided with the specific data (the RR intervals) as in FIG. 2b. Processing then occurs:
1. Outliers are removed. This is demonstrated by FIG. 3.
2. Numerous properties are determined, such as the mean RR interval, and the ectopic beat frequency.
3. Optimal context lengths are determined for each property (or for a grouping of properties, such as time domain properties). This is demonstrated by FIG. 7.
4. The optimal length of data is fed into an artificial intelligence based classifier.
5. The artificial intelligence based classifier, which is formed of multiple different classifiers combined to obtain a hybrid classifier, determines threshold abnormality values for each property, which are indicative of an upcoming cardiac event (e.g. a threshold mean RR interval is calculated, where a mean interval above this threshold is indicative of an upcoming cardiac event). The threshold values are determined based upon past data from multiple sources; for example, a database containing physiological data for patients alongside occurrences of cardiac events may be used for training a classifier.
6A. The data which has been fed into the classifier is compared to the relevant threshold, and a probability of a cardiac event occurring is determined (based on the fraction of the data which exceeds the threshold). This probability is displayed, and an alarm is sounded if a high probability of a cardiac event is obtained.
6B. The data which has been fed into the classifier is output to an optimisation stream, where it is used to further optimise the determination of subsequent threshold values (i.e. it is incorporated into the training set).
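Steps 1, 2, and 6A above can be sketched for a single property (the mean RR interval) as follows; the outlier rule, context length, and threshold used here are assumed for illustration, whereas in the described system the context lengths and thresholds are determined by the hybrid classifier.

```python
# Illustrative sketch of outlier removal, property extraction over a
# context length, and the threshold comparison that yields a probability
# (the fraction of windows exceeding the threshold). The 3-sigma rule,
# 50-beat context length, and 1.1 s threshold are assumed values.
from statistics import mean, stdev

def remove_outliers(rr_intervals, n_sigma=3.0):
    """Step 1: drop intervals more than n_sigma standard deviations from
    the mean (a simple stand-in for the outlier removal of FIG. 3)."""
    mu, sigma = mean(rr_intervals), stdev(rr_intervals)
    return [r for r in rr_intervals if abs(r - mu) <= n_sigma * sigma]

def event_probability(rr_intervals, context_length=50, threshold=1.1):
    """Steps 2 and 6A: compute the mean RR interval over sliding windows
    of the given context length, and return the fraction of windows
    whose mean exceeds the abnormality threshold (in seconds)."""
    cleaned = remove_outliers(rr_intervals)
    windows = [cleaned[i:i + context_length]
               for i in range(len(cleaned) - context_length + 1)]
    if not windows:
        return 0.0
    exceeding = sum(1 for w in windows if mean(w) > threshold)
    return exceeding / len(windows)
```
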

(116) This output is then presented using a means for providing an output 26.

(117) The means for providing physiological data, and the means for providing an output are described in more detail above with reference to FIG. 12.

(118) Alternatives and Modifications

(119) Data Types

(120) The use of RR intervals is an example of a type of data—more specifically a type of physiological data, and even more specifically a type of cardiac data—which is usable with the described methods; more generally, any type of patient data, or any combination of types of data, could be used with these methods, where the use of a combination of patient data may lead to fewer false positives (or false negatives). Examples of preferred types of data are (with some overlap as, for example, telemetry records and clinical data both comprise physiological data): telemetry records, such as arterial blood pressure, pulse contour data, or pulse rate; demographic data, such as age, sex, or race (this may come from an electronic health report/patient profile); admission/historic data, such as a recent illness or any history of illness, in particular concomitant conditions, such as emphysema or diabetes; clinical data, such as haemoglobin values; laboratory data, such as the results of tests; and imaging data, such as x-rays or MRI scans.

(121) Where multiple data types are considered, each of these types of data is treated similarly to the RR intervals: properties (such as a mean or a standard deviation) are extracted, and an optimal context length for these features is determined. As an example, there is an optimal length of patient history to consider, where data more than, for example, 10 years old may have a negligible contribution to a prediction of future health. Numerous data types are considered in the determination of a probability, where, in some embodiments, each data type has a different weighting (where this weighting is based upon historic data and is determined by the classifiers).
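The per-data-type weighting described above can be sketched as a weighted average; the data types and weight values shown are illustrative assumptions, since in the described system the weights are determined by the classifiers from historic data.

```python
# Hedged sketch: combining per-data-type event probabilities using
# learned weights. The keys and weights below are illustrative only.
def combined_probability(per_type_probabilities: dict,
                         weights: dict) -> float:
    """Weighted average of per-data-type cardiac-event probabilities."""
    total_weight = sum(weights[t] for t in per_type_probabilities)
    if total_weight == 0:
        raise ValueError("weights must not all be zero")
    return sum(per_type_probabilities[t] * weights[t]
               for t in per_type_probabilities) / total_weight
```
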

(122) In various embodiments, the data types used are optimised, with the optimised selection then used in the determination and display of a probability. In each situation, a combination of data features with the most significant effect is selected; this is particularly useful where an implantable device is used and a low number of data types is desirable, as this minimises the computational burden.

(123) In some embodiments, to avoid the need for new measuring equipment, analysis occurs only using data which is attainable using current measuring methods.

(124) While the data recording methods discussed have primarily involved specialist equipment (e.g. electrocardiograms), the methods discussed could equally be used with other, more widely available, equipment. As an example, there exist many user-wearable devices which are used to monitor a heart rate or a pulse (such as a Fitbit™). The data recorded using this, or a similar, device could be used with the AI classifier described above to obtain a probability of a cardiac event, or to output a general health measure. If used in such a device, the output may be a probability, or measure of health, displayed to the user, or an automatic warning sent to, for example, an ambulance, if a threshold probability is exceeded. This may be particularly useful in devices such as a Fitbit™, which are used during periods of increased activity (where stress may be placed upon the heart).

(125) Context Length Determination

(126) The context length determination has been explained using the example of a χ²-test (‘chi-squared’ test); numerous other tests could be used to make this determination. Various embodiments use one of (or a combination of): a Kolmogorov-Smirnov test, a comparison of the moments of distributions, or an Energy Test (as described by Guenter Zech and Berkan Aslan).
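As an illustration of one of the alternative tests named above, a two-sample Kolmogorov-Smirnov statistic can be computed as follows; this plain implementation is a sketch, with the choice of samples (e.g. pre-event versus normal heartbeat data at a candidate context length) left to the surrounding method.

```python
# Illustrative pure-Python two-sample Kolmogorov-Smirnov statistic.
# A larger D for a given context length indicates better separation
# between the two empirical distributions being compared.
def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    values = sorted(set(a) | set(b))
    d = 0.0
    for v in values:
        cdf_a = sum(1 for x in a if x <= v) / len(a)
        cdf_b = sum(1 for x in b if x <= v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```
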

(127) When using the Energy Test, an Energy Test metric, T, is computed between two distinct unbinned multivariate distributions. One such example is arrhythmic and normal heartbeat distributions, which give a non-zero T-value. This is used in some embodiments as an additional test on the probability of a cardiac event: an Energy Test is performed and a T-value calculated; this T-value is updated after each heartbeat, and a warning is issued if the T-value exceeds a predetermined threshold (which is based on past data, and may be determined for each patient based upon their specific data). The context length over which the Energy Test is performed is determined as with any other dataset. This test may be used in isolation, or in conjunction with any other method described, where use in conjunction with other methods may reduce the likelihood of false negatives or false positives.
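A sketch of an Energy Test statistic T in the spirit of the Aslan-Zech construction is given below for one-dimensional samples; the logarithmic distance weighting R(r) = -ln(r + eps) and the eps regulariser are assumptions made for illustration, not the specific formulation of the patent.

```python
# Hedged sketch of an Energy Test statistic between two samples,
# following the Aslan-Zech construction with a logarithmic distance
# weighting. T grows as the two distributions separate.
from math import log

def energy_test_t(sample_x, sample_y, eps=1e-6):
    """Energy Test statistic T for two 1-D samples; eps regularises the
    logarithm at zero distance (an assumed implementation detail)."""
    n, m = len(sample_x), len(sample_y)

    def r(a, b):
        return -log(abs(a - b) + eps)

    xx = sum(r(sample_x[i], sample_x[j])
             for i in range(n) for j in range(i + 1, n)) / (n * n)
    yy = sum(r(sample_y[i], sample_y[j])
             for i in range(m) for j in range(i + 1, m)) / (m * m)
    xy = sum(r(a, b) for a in sample_x for b in sample_y) / (n * m)
    return xx + yy - xy
```
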

(128) Autoregressive Models

(129) In some embodiments, autocorrelation is considered along with a measure of the lag required to obtain an autocorrelation. As an example, in the short term, the occurrence of one cardiac event may be indicative of another cardiac event being likely to occur (i.e. recent cardiac events may have high autocorrelation), as these events are often related to periods of otherwise poor health. In the long term, a previously occurring cardiac event (e.g. a cardiac event which occurred in a previous year) may be a poor indicator of a subsequent cardiac event (i.e. distant cardiac events may have low autocorrelation), as the period of poor health may have passed. The suitability of using an autoregressive model is determined by comparing these correlations and lags.

(130) A consideration with autocorrelation is that (useful) autocorrelation may be negative or positive. In the previously used example, it may be the case that a previous, but distant cardiac event (e.g. one that occurred in a previous year), is a good indicator that a cardiac event is unlikely, as the person may have worked to improve their health in response to the previous event.
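The lag-dependent autocorrelation discussed above, which may be positive or negative, can be estimated with a plain sample autocorrelation, sketched below; this is a generic estimator, not the specific procedure of the described method.

```python
# Illustrative lag-k sample autocorrelation of a measurement series.
# Positive values suggest recent behaviour predicts similar behaviour;
# negative values suggest the opposite (as in the distant-event example).
from statistics import mean

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    mu = mean(series)
    var = sum((x - mu) ** 2 for x in series)
    if var == 0:
        return 0.0  # a constant series carries no correlation signal
    cov = sum((series[i] - mu) * (series[i + lag] - mu)
              for i in range(len(series) - lag))
    return cov / var
```
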

(131) Other Conditions

(132) The methods described could be used for a range of other conditions, for example, as well as a cardiac event, indicators of an upcoming arrhythmia may also be used to predict a stroke. The methods disclosed herein could also be used to measure conditions away from the heart: the flow of blood could, for example, be monitored as relates to transfer to the brain. In this situation, a context length would still be of relevance: monitoring the blood flow into the brain could be used to give a prediction of brain related events (such as brain aneurysms).

(133) More generally, the methods disclosed could be used as a general indication of health. Abnormal operation of any pulse based condition is a possible indicator of not only the probability of a specific event (e.g. arrhythmia), but also that the patient is likely to be at heightened risk of a more general health-related incident. These methods may then be used to indicate that a patient may need more careful monitoring during a determined period, or that it may be valuable to analyse patient data in more detail and/or to carry out tests.

(134) It will be understood that the invention has been described above purely by way of example, and modifications of detail can be made within the scope of the invention.

(135) Each feature disclosed in the description, and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination.

(136) Reference numerals appearing in the claims are by way of illustration only and shall have no limiting effect on the scope of the claims.