Method and apparatus for vital signs measurement
10939852 · 2021-03-09
Assignee
Inventors
- Alessandro Guazzi (Oxford, GB)
- Mauricio Villarroel Montoya (Oxford, GB)
- Lionel Tarassenko (Oxford, GB)
Cpc classification
A61B5/7221
HUMAN NECESSITIES
A61B5/0077
HUMAN NECESSITIES
A61B5/02416
HUMAN NECESSITIES
A61B5/0816
HUMAN NECESSITIES
International classification
A61B5/1455
HUMAN NECESSITIES
A61B5/00
HUMAN NECESSITIES
Abstract
A method of monitoring changes in oxygen saturation of a subject by analysing a three colour channel video image of the exposed skin of the subject. Within each colour channel a normalised signal is obtained by dividing the intensity signal by its mean value, and the normalised signals are averaged across plural regions of interest within the exposed skin area image of the subject. Regions of interest are selected on the basis of the signal-to-noise ratios for the heart rate and breathing rate components. A single representative waveform for each colour channel is obtained by signal averaging, and the ratio of the amplitudes of the representative waveforms from two different colour channels, e.g. blue and red, is taken. The changes in the ratio of amplitudes are output as a measure of changes in blood oxygen saturation.
Claims
1. A method of determining changes in blood oxygen saturation of a subject comprising the steps of: obtaining a video image of an area of exposed skin of the subject, the video image comprising signals representing intensity in at least two different colour channels; defining in the image a plurality of regions of interest in said area of exposed skin of the subject; determining a signal to noise ratio of a heart rate or breathing rate frequency component of one of the colour channel signals or both for each region of interest; determining whether to reject or not reject regions of interest based on the determined signal to noise ratio of a heart rate or breathing rate frequency component or both; processing the colour channel signals from non-rejected regions of interest by: normalising the signal in each of the colour channels by dividing each of the signals by its baseline component; determining the ratio of the amplitudes of the normalised signals from two of the colour channel signals; and outputting changes over time in the ratio as representing changes in blood oxygen saturation of the subject.
2. The method according to claim 1 wherein the baseline component is the average value of the signal over a predetermined period.
3. The method according to claim 1 wherein the ratio of the amplitudes of the averaged normalised signals from blue and red colour channel signals are determined.
4. The method according to claim 1 further comprising the step of averaging each of the normalised signals within each colour channel before calculating the ratio of amplitudes.
5. The method according to claim 4 wherein the step of averaging each of the normalised signals within each colour channel comprises averaging together the signals for that colour channel from each of a plurality of regions of interest within said area of skin of the subject.
6. The method according to claim 5 wherein in averaging each of the normalised signals within each colour channel the signals from each region of interest are weighted according to the strength of the signal to noise ratio for a heart rate frequency component of one of the colour channel signals from that region of interest.
7. The method according to claim 4 wherein the step of averaging each of the normalised signals within each colour channel comprises signal averaging to determine a representative waveform for a predetermined time period of said signal.
8. The method according to claim 7 wherein said signal averaging comprises detecting the times of peaks in one of the colour channel signals, then within each colour channel selecting sections of the normalised signal extending a predetermined time either side of the detected peak times and averaging together the selected sections of the normalised signal.
9. The method according to claim 1 wherein the area of exposed skin of the subject is selected by detecting areas in the image for which a signal to noise ratio function for the heart rate is maximised.
10. The method according to claim 1 wherein the regions of interest are formed by dividing the selected area into plural contiguous regions.
11. The method according to claim 1 wherein regions of interest are rejected if a signal to noise ratio for a heart rate frequency component of a colour channel signal from the region of interest is below a predetermined threshold.
12. The method according to claim 1 wherein regions of interest are rejected if a signal to noise ratio for a breathing rate frequency component of a colour channel signal from the region of interest is above a predetermined threshold.
13. The method according to claim 1 wherein regions of interest are rejected if the phase of a heart rate frequency component of a colour channel signal from the region of interest is outside a predetermined threshold of the phase of a heart rate frequency component of the colour channel signal averaged over a plurality of the regions of interest.
14. The method according to claim 1 wherein the signal in each colour channel is windowed into temporal windows, and the processing steps of normalising and determining are performed for each time window to determine and output a ratio for each time window.
15. A system for determining changes in blood oxygen saturation of a subject comprising: a video camera for obtaining a video image of an area of exposed skin of the subject, the video image comprising signals representing intensity in at least two different colour channels; a signal processor adapted to receive the signals from the video camera and process them by: normalising the signal in each of the colour channels by dividing each of the signals by its baseline component; defining in the image a plurality of regions of interest in said area of exposed skin of the subject; determining a signal to noise ratio of a heart rate or breathing rate frequency component of one of the colour channel signals or both for each region of interest; determining whether to reject or not reject regions of interest based on the determined signal to noise ratio of a heart rate or breathing rate frequency component or both; determining the ratio of the amplitudes of the normalised signals from two of the colour channel signals for non-rejected regions of interest; and outputting changes over time in the ratio as representing changes in blood oxygen saturation of the subject; and a display adapted to display the changes over time in the ratio.
Description
(1) The invention will be further described by way of example with reference to the accompanying drawings in which:
(2)
(3)
(4)
(5)
(6) In the context of broad-band illumination and the use of RGB sensors, the time series $S_c^i$ of intensity values recorded by a camera from any given region of skin i for any given colour channel c may be decomposed into two parts: the baseline, or DC component, due to the residual blood present in the tissue at all times, and the pulsatile, heart-rate synchronous signal due to the change in colour as the blood flows in and out of the skin:
$$S_c^i(t) = DC_c^i(t) + AC_c^i(t) \qquad (1)$$
(7) This assumes that all light is reflected diffusely and ignores any component due to specular reflection (we will see below that the signal processing used in the invention makes this a reasonable assumption). We will further assume that the whole time series $S_c^i(t)$ is affected by only the following variable factors: i) the intensity of the light reaching the skin at i; ii) the vascular volume at i; iii) the oxygen saturation.
(8) Other factors that will affect the absorption of light in the skin, such as melanin, and the spectral distribution of the light source at i, are assumed to be constant. The method of the invention allows the elimination of the effects of light intensity and vascular volume changes so as to determine changes in oxygen saturation. Of the three factors i) to iii) above, the oxygen saturation is assumed to be locally-invariant, whereas the intensity of the light is allowed to vary locally (for example through geometrical effects such as shadowing), and the vascular volume is also allowed to vary locally (reflecting anatomical variation in the vasculature of the skin). Under these assumptions the relationship between the recorded intensity time series $S_c^i$ from a region i and the spectrally-invariant light intensity $I_c^i(t)$ reaching the region i may be written as:
$$S_c^i(t) = I_c^i(t)\,[DC_c^i(t) + AC_c^i(t)] \qquad (2)$$
(9) Using this model, the normalisation of the signal by the DC component will lead to an elimination of the effect of the local light intensity $I_c^i(t)$:
(10)
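As a hedged illustration of this step (a sketch that follows from the multiplicative model of equation (2), and not necessarily the exact form of the equation in the original specification), dividing the pulsatile part of the recorded signal by its baseline part cancels the common light-intensity factor:

```latex
\frac{I_c^i(t)\,AC_c^i(t)}{I_c^i(t)\,DC_c^i(t)} = \frac{AC_c^i(t)}{DC_c^i(t)}
```

The right-hand side no longer depends on $I_c^i(t)$, which is why shadowing or a dimmer light source should leave the normalised signal unchanged under the stated assumptions.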
(11) There are, however, other factors, such as local blood volume increases, that come into play with respect to the normalised AC component and that are not necessarily eliminated by the normalisation itself. To eliminate these, the ratio of ratios $R^i(t)$ for two colour channels 1 and 2 is taken, under the assumption that all changes other than those due to oxygenation (which cause a change in colour) will be proportional across the channels:
(12)
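The effect of taking a ratio of ratios can be checked numerically. The following is a minimal sketch with synthetic signals; the baseline levels, pulsatile gains, frame rate and the helper names `normalised_amplitude`, `ratio_of_ratios` and `recorded` are all hypothetical choices made for illustration, and peak-to-peak amplitude is assumed as the amplitude measure.

```python
import numpy as np

def normalised_amplitude(signal):
    """Peak-to-peak amplitude of the AC part divided by the DC (mean) level."""
    dc = signal.mean()
    ac = signal - dc
    return (ac.max() - ac.min()) / dc

def ratio_of_ratios(channel_1, channel_2):
    """Normalised amplitude of channel 1 divided by that of channel 2."""
    return normalised_amplitude(channel_1) / normalised_amplitude(channel_2)

# 12 s of synthetic data at an assumed 30 frames per second.
t = np.arange(360) / 30.0
pulse = np.sin(2 * np.pi * 1.25 * t)  # 1.25 Hz, i.e. 75 beats per minute

def recorded(light, volume):
    """Hypothetical blue and red signals: light * (DC + volume * gain * pulse)."""
    blue = light * (100.0 + volume * 0.5 * pulse)
    red = light * (150.0 + volume * 1.0 * pulse)
    return blue, red

r_ref = ratio_of_ratios(*recorded(light=1.0, volume=1.0))
r_dim = ratio_of_ratios(*recorded(light=0.4, volume=1.0))   # dimmer light
r_vol = ratio_of_ratios(*recorded(light=1.0, volume=2.0))   # larger pulsation
```

In this toy model the ratio is the same in all three cases, because the light factor cancels within each channel and the volume factor cancels between channels. A change in the relative pulsatile gains of the two channels, which is how a colour change would appear here, is the only thing that moves it.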
(13)
(14) The processing of the video signals by the processor 5 will be described with reference to the flowchart of
(15) Following the start of processing at step 100, during an initialization step 101 an area of exposed skin is identified. This may be done either by applying specific prior knowledge about the scene (for example by face-detection if a face is known to be in the image), or by simply searching a very large search area for the position at which the result of the SNR function for heart rate ($\mathrm{SNR}_{HR}$) is maximised, where the SNR function is defined as:
(16)
(17) for a detrended and appropriately filtered time series x(t), its Fourier transform $\mathcal{F}\{x(t)\}$, and a double-step function $V(\nu)$ defined by the convolution:
$$V(\nu) = [\delta(\hat{\nu}) + \delta(2\hat{\nu})] * \Pi(\nu_h)$$
(18) centred on the fundamental frequency of interest $\hat{\nu}$ and its first harmonic (e.g. $\hat{\nu} = \hat{\nu}_{HR}$ for heart rate), with $\delta$ as the Dirac delta function, and $\Pi$ as the rect function of half-width $\nu_h$.
(19) For the purpose of initialisation, x(t) is taken as the green signal over a period of 12 seconds, and the double-step function for the heart rate SNR is constructed with $\nu_{HR} = 1.4$ Hz and $\nu_h = 0.7$ Hz so as to cover the entire span of the expected physiological heart rate range.
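The defining equation of the SNR function is not reproduced in this text, so the following is only one plausible reading, offered as a sketch. It assumes the SNR is the ratio, in decibels, of spectral power inside the double-step template (a band around the fundamental and around its first harmonic) to the power outside it, both restricted to frequency limits a and b. The names `double_step_mask` and `snr_db` are hypothetical, and mean removal stands in for the detrending and filtering mentioned above.

```python
import numpy as np

def double_step_mask(freqs, f0, half_width):
    """1.0 within half_width of f0 or of its first harmonic 2*f0, else 0.0."""
    near_fundamental = np.abs(freqs - f0) <= half_width
    near_harmonic = np.abs(freqs - 2.0 * f0) <= half_width
    return (near_fundamental | near_harmonic).astype(float)

def snr_db(x, fs, f0, half_width, band):
    """Assumed SNR: in-template power over out-of-template power, in dB.

    x          : 1D time series
    fs         : sampling rate in Hz
    f0         : fundamental frequency of interest in Hz
    half_width : half-width of each step of the template in Hz
    band       : (a, b) frequency limits in Hz
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # simple stand-in for detrending and filtering
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    template = double_step_mask(freqs, f0, half_width)
    eps = 1e-12  # avoids division by zero and log of zero
    signal_power = np.sum(power[in_band] * template[in_band]) + eps
    noise_power = np.sum(power[in_band] * (1.0 - template[in_band])) + eps
    return 10.0 * np.log10(signal_power / noise_power)
```

With a fundamental of 1.4 Hz and a half-width of 0.7 Hz, the first step of this template spans 0.7 to 2.1 Hz, which matches the stated aim of covering the whole expected heart rate range during initialisation.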
(20) Once a search area in which the totality of the skin to be imaged is included has been defined in step 102, the area is subdivided into N contiguous n by n pixel regions of interest i (n=40 for example). The size of the region of interest is set depending on the camera, lighting and physiology of the subject so that the region is as small as possible while still giving a detectable heart rate signal. Taking 12 second windows slid by 1 second at a time, crude estimates for the heart rate, $\hat{\nu}_{HR}$, and the breathing rate, $\hat{\nu}_{BR}$, are found. The heart rate estimate is found by taking the average of all the signals resulting from the per-frame spatial average of the green channel, then detrending and high-pass filtering this average prior to finding the peak of the Fourier transform. The breathing rate estimate is instead found by taking the average of the power spectral density (PSD) of the detrended blue channel in the frequency domain across all regions of interest and then searching for a peak present in the expected physiological range (between 0.1 and 0.7 Hz, corresponding to 6 to 42 breaths per minute). The two rates are calculated differently because the heart rate signal is predominantly due to colour changes caused by the inflow and outflow of blood during the cardiac cycle, so the relative phase shift between the heart rate signals estimated in any two regions of interest is uniquely determined by the pulse transit time between the two regions. This phase shift is expected to be far smaller than $\pi/2$ radians. The breathing rate signals, on the other hand, are caused by changes in colour due to movement, and so can be either in phase or in antiphase, as this depends solely on the relative intensity of the pixels in the region of interest through time; an average of the time-domain signals would in fact minimise the breathing rate signal.
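The two crude estimates can be sketched as follows. This is an illustrative reading rather than the specification's own implementation: linear detrending via `np.polyfit` is an assumed choice, the high-pass filter is replaced by restricting the peak search to a band (0.7 to 2.4 Hz for heart rate is an assumption borrowed from the SNR limits quoted later), and the function names are hypothetical.

```python
import numpy as np

def detrend_linear(x):
    """Subtract the least-squares straight line from a 1D signal."""
    idx = np.arange(x.size)
    slope, intercept = np.polyfit(idx, x, 1)
    return x - (slope * idx + intercept)

def peak_frequency(power, freqs, lo, hi):
    """Frequency of the largest spectral value between lo and hi (Hz)."""
    band = (freqs >= lo) & (freqs <= hi)
    return freqs[band][np.argmax(power[band])]

def crude_heart_rate(green_rois, fs, lo=0.7, hi=2.4):
    """green_rois: (N, T) spatial means of the green channel per region.

    The regions are averaged in the time domain first, which is safe
    because heart rate signals are expected to be nearly in phase.
    """
    mean_signal = detrend_linear(green_rois.mean(axis=0))
    freqs = np.fft.rfftfreq(mean_signal.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(mean_signal)) ** 2
    return peak_frequency(power, freqs, lo, hi)

def crude_breathing_rate(blue_rois, fs, lo=0.1, hi=0.7):
    """blue_rois: (N, T) spatial means of the blue channel per region.

    The power spectra are averaged instead of the signals, so regions
    that are in antiphase reinforce rather than cancel each other.
    """
    detrended = np.array([detrend_linear(x) for x in blue_rois])
    freqs = np.fft.rfftfreq(detrended.shape[1], d=1.0 / fs)
    psd = np.abs(np.fft.rfft(detrended, axis=1)) ** 2
    return peak_frequency(psd.mean(axis=0), freqs, lo, hi)
```

Averaging power spectra discards phase, which is the design point: two regions whose breathing signals are in antiphase would sum to nearly zero in the time domain but contribute equally to the averaged PSD.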
(21) The spatial averages across the 12 second windows are then calculated for each of the channels of the N regions of interest i to reduce the three 2D plus time colour channel signals from each region to three 1D signals. The heart rate SNR function (5) as above is then applied to each of the N regions of interest using the red channel only, fixing the frequency limits a and b of the function at a=0.7 Hz and b=2.4 Hz, with $\nu_h$ taken to be the quantisation limit of the FFT applied (e.g. 0.7 Hz). A breathing rate SNR function as defined in Equation 5 is also applied to each of the N regions of interest using the blue channel only, fixing the frequency limits at a=0.1 Hz and b=0.7 Hz; the half-width $\nu_h$ of the breathing rate rect function is once more the quantisation limit of the FFT.
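The reduction from 2D-plus-time signals to 1D signals per region can be sketched as a block average over n by n tiles. The array layout (frames, height, width, channels) and the name `roi_time_series` are assumptions made for this illustration; pixels that do not fill a complete tile at the right and bottom edges are simply dropped.

```python
import numpy as np

def roi_time_series(video, n):
    """Spatially average a video over contiguous n-by-n pixel regions.

    video : array of shape (T, H, W, C), frames by height by width by channel
    n     : side length of each square region of interest, in pixels
    Returns an array of shape (N, T, C), one 1D signal per region and channel,
    with regions ordered row by row.
    """
    frames, height, width, channels = video.shape
    rows, cols = height // n, width // n
    cropped = video[:, : rows * n, : cols * n, :]
    # Split each spatial axis into (tile index, offset within tile).
    tiles = cropped.reshape(frames, rows, n, cols, n, channels)
    means = tiles.mean(axis=(2, 4))  # shape (T, rows, cols, C)
    return means.reshape(frames, rows * cols, channels).transpose(1, 0, 2)
```

For example, an 80 by 120 pixel search area with n=40 gives N=6 regions, each contributing three 1D signals per 12 second window.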
(22) The results of the heart rate and breathing rate SNR functions serve to create a logical inclusion function $L^i$ that determines in step 105 whether the region of interest will be used in further calculations for that window or will be rejected. This serves to eliminate oscillations in the signal caused by specular reflections, because the inclusion function introduces a degree of confidence that the signal from the selected region of interest is from a colour change only. In fact, for a time series that contains only a pulsatility due to a true PPG skin colour change, we would expect a high result for the heart rate SNR, but a low result for the breathing rate SNR. This is because breathing rate is mostly associated with movement, and any region of interest that has a high breathing rate SNR will therefore have some component (whether a physical feature or a specular reflection) that is moving and thus does not fit the model's assumptions. In addition to this, a further condition is imposed: the heart rate component of the green channel of the region of interest needs to be in phase with the green heart rate signal derived for the whole search area. This condition stems from the understanding that only movement-induced pulsations can be in phase or antiphase, and is determined by meeting the condition $P^i = 1$ where:
(23)
where $\phi_g^i(\hat{\nu}_{HR})$ is the phase of the heart rate frequency component of the green signal of the region of interest i considered, and $\phi_G(\hat{\nu}_{HR})$ is the phase of the heart rate frequency component of the green signal averaged over the entire area. A phase difference of $\pi/2$ is chosen here because this gives the clearest demarcation between phase estimates that can be said to be in phase (but for a phase shift caused by a delay due to the pulse transit time) and the case for which two phase estimates are exactly in antiphase. The overall logical inclusion function is then given by:
$$L^i = (\mathrm{SNR}_{HR}^i > \mathrm{SNR}_{HR}^{thresh}) \land (\mathrm{SNR}_{BR}^i < \mathrm{SNR}_{BR}^{thresh}) \land P^i$$
with $\mathrm{SNR}_{HR}^{thresh}$ and $\mathrm{SNR}_{BR}^{thresh}$ determined by the initial conditions of the video (these will depend on skin colour, light intensity, light spectrum and distance from the camera). The thresholds can, for example, be taken as the mean plus one standard deviation of the SNRs in each of the regions of interest over the first stable window. In step 106, the normalised amplitude is determined for each colour channel as per Equation 4 in all M regions of interest that meet the condition $L^i = 1$. As the method depends on multiple regions to reduce the measurement error, a minimum number of regions $M \geq 3$ is set at each iteration or the window is rejected. The DC component is taken as the time-average of the window for each colour channel and each region of interest individually, and the AC component is taken to be the residual of the original time series after the DC component has been subtracted out. A weighted average of all the M normalised amplitudes is then taken in step 107 for each of the channels, in which the weighting of each region of interest i is a function of the heart rate SNR, such that:
(24)
(25) The weighting introduces an additional degree of belief in each region of interest, favouring regions of interest that have a high heart rate signal-to-noise ratio as these are more likely to correspond to the ideal surfaces that are considered in the theoretical model.
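Steps 105 to 107 can be sketched together for a single colour channel. Several details below are assumptions, since the corresponding equations are not reproduced in this text: the phase test is taken to be an absolute wrapped phase difference below pi/2, and the weights are taken to be proportional to the heart rate SNR and normalised to sum to one, which presumes positive SNR values in the retained regions. All function names are hypothetical.

```python
import numpy as np

def in_phase(phase_roi, phase_ref, limit=np.pi / 2):
    """True where the wrapped phase difference is smaller than limit."""
    diff = np.angle(np.exp(1j * (phase_roi - phase_ref)))  # wrap to (-pi, pi]
    return np.abs(diff) < limit

def inclusion(snr_hr, snr_br, phase_roi, phase_ref, hr_thresh, br_thresh):
    """Logical inclusion: high heart rate SNR, low breathing rate SNR, in phase."""
    return (snr_hr > hr_thresh) & (snr_br < br_thresh) & in_phase(phase_roi, phase_ref)

def normalise(window):
    """window: (M, T). DC is the time-average; AC is the residual; returns AC/DC."""
    dc = window.mean(axis=1, keepdims=True)
    return (window - dc) / dc

def weighted_waveform(window, snr_hr, keep, min_regions=3):
    """SNR-weighted average of the normalised signals of the retained regions.

    Returns None when fewer than min_regions regions are retained,
    meaning that the window is rejected.
    """
    if keep.sum() < min_regions:
        return None
    # ASSUMPTION: weights proportional to heart rate SNR, summing to one.
    weights = snr_hr[keep] / snr_hr[keep].sum()
    return weights @ normalise(window[keep])
```

The same `keep` mask and weights would be reused for each colour channel of a window, so that the three averaged waveforms are built from the same set of regions.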
(26) Finally, in step 108 a self-referential signal averaging procedure is applied to the averaged waveforms. This is done by taking the green channel averaged waveform, high-pass filtering it and finding the positions of the peaks in the waveform (the green channel is used because it has a high signal-to-noise ratio). In each of the three colour channels all samples around the peaks
(27)
in the averaged waveforms are then averaged together to obtain a single representative waveform for the entire 12-second period for each channel. The amplitude of the waveform is taken as the normalised amplitude for that channel and the ratio of the blue normalised amplitude with respect to the normalised amplitude of the red channel is then taken as the ratio of ratios in step 109. The ratio of the normalised blue amplitude divided by the normalised red amplitude emerging from step 109, or its logarithm, is then output in step 110 as representative of the oxygen saturation and displayed on display 7. Steps 103 to 110 are then repeated until there is a significant movement of the subject outside of the search area as checked at step 111, at which point the algorithm is halted until there is a period of no movement and then reinitialised in step 112 and restarted.
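The signal averaging of step 108 and the ratio of step 109 can be sketched as below. This is a simplified illustration: the peak detector is a basic local-maximum search standing in for whatever detector is actually used, the high-pass filtering of the green waveform is omitted, the half-width of the section around each peak is a free parameter, and "amplitude" is assumed to mean peak-to-peak amplitude of the representative waveform. The function names are hypothetical.

```python
import numpy as np

def find_peaks_simple(x):
    """Indices of local maxima that lie above the mean of x."""
    centre, left, right = x[1:-1], x[:-2], x[2:]
    is_peak = (centre > left) & (centre >= right) & (centre > x.mean())
    return np.flatnonzero(is_peak) + 1

def representative_waveform(x, peaks, half_width):
    """Average the sections of x extending half_width samples either side of each peak."""
    sections = [
        x[p - half_width : p + half_width + 1]
        for p in peaks
        if p - half_width >= 0 and p + half_width < x.size  # skip truncated sections
    ]
    return np.mean(sections, axis=0)

def blue_red_ratio(green, blue, red, half_width):
    """Ratio of blue to red amplitude, using peak times taken from the green channel."""
    peaks = find_peaks_simple(green)
    amplitude = {}
    for name, x in (("blue", blue), ("red", red)):
        waveform = representative_waveform(x, peaks, half_width)
        amplitude[name] = waveform.max() - waveform.min()  # assumed: peak-to-peak
    return amplitude["blue"] / amplitude["red"]
```

Because every channel is cut at the peak times found in the green channel, noise that is not synchronous with the heartbeat tends to average out, while the pulsatile waveform is preserved in each channel.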
(28)
(29) Although in
(30)
(31) As can be seen, the changes in oxygen saturation as measured by the method of the invention using the video camera shown in
(32) As mentioned above, the calculation made by the invention does not directly result in an oxygen saturation value. However, in a clinical setting oxygen saturation values may be obtained by first calibrating the system using a standard pulse oximeter. Thus a subject's oxygen saturation can be measured initially (and potentially at intervals thereafter) using a standard finger-probe pulse oximeter, with this value being used to calibrate the system of the invention, the method of the invention then being used primarily to track variations from that initial saturation. Significant decreases in oxygen saturation, which might represent a worsening of the subject's condition, can be used to trigger an alarm to the clinicians.
(33) Although the main thrust of the invention is to track changes in oxygen saturation, the estimated heart rate and breathing rate used in the method can be output and displayed on display 7 as additional vital signs information.