Method and apparatus for vital signs measurement

Abstract

A method of monitoring changes in oxygen saturation of a subject by analysing a three colour channel video image of the exposed skin of the subject. Within each colour channel a normalised signal is obtained by dividing the intensity signal by its mean value, and the normalised signals are averaged across plural regions of interest within the exposed skin area image of the subject. Regions of interest are selected on the basis of the signal-to-noise ratios for the heart rate and breathing rate components. A single representative waveform for each colour channel is obtained by signal averaging, and the ratio of the amplitudes of the representative waveforms from two different colour channels, e.g. blue and red, is taken. The changes in the ratio of amplitudes are output as a measure of changes in blood oxygen saturation.

Claims

1. A method of determining changes in blood oxygen saturation of a subject comprising the steps of: obtaining a video image of an area of exposed skin of the subject, the video image comprising signals representing intensity in at least two different colour channels; defining in the image a plurality of regions of interest in said area of exposed skin of the subject; determining a signal to noise ratio of a heart rate or breathing rate frequency component of one of the colour channel signals or both for each region of interest; determining whether to reject or not reject regions of interest based on the determined signal to noise ratio of a heart rate or breathing rate frequency component or both; processing the colour channel signals from non-rejected regions of interest by: normalising the signal in each of the colour channels by dividing each of the signals by its baseline component; determining the ratio of the amplitudes of the normalised signals from two of the colour channel signals; and outputting changes over time in the ratio as representing changes in blood oxygen saturation of the subject.

2. The method according to claim 1 wherein the baseline component is the average value of the signal over a predetermined period.

3. The method according to claim 1 wherein the ratio of the amplitudes of the averaged normalised signals from blue and red colour channel signals is determined.

4. The method according to claim 1 further comprising the step of averaging each of the normalised signals within each colour channel before calculating the ratio of amplitudes.

5. The method according to claim 4 wherein the step of averaging each of the normalised signals within each colour channel comprises averaging together the signals for that colour channel from each of a plurality of regions of interest within said area of skin of the subject.

6. The method according to claim 5 wherein in averaging each of the normalised signals within each colour channel the signals from each region of interest are weighted according to the strength of the signal to noise ratio for a heart rate frequency component of one of the colour channel signals from that region of interest.

7. The method according to claim 4 wherein the step of averaging each of the normalised signals within each colour channel comprises signal averaging to determine a representative waveform for a predetermined time period of said signal.

8. The method according to claim 7 wherein said signal averaging comprises detecting the times of peaks in one of the colour channel signals, then within each colour channel selecting sections of the normalised signal extending a predetermined time either side of the detected peak times and averaging together the selected sections of the normalised signal.

9. The method according to claim 1 wherein the area of exposed skin of the subject is selected by detecting areas in the image for which a signal to noise ratio function for the heart rate is maximised.

10. The method according to claim 1 wherein the regions of interest are formed by dividing the selected area into plural contiguous regions.

11. The method according to claim 1 wherein regions of interest are rejected if a signal to noise ratio for a heart rate frequency component of a colour channel signal from the region of interest is below a predetermined threshold.

12. The method according to claim 1 wherein regions of interest are rejected if a signal to noise ratio for a breathing rate frequency component of a colour channel signal from the region of interest is above a predetermined threshold.

13. The method according to claim 1 wherein regions of interest are rejected if the phase of a heart rate frequency component of a colour channel signal from the region of interest is outside a predetermined threshold of the phase of a heart rate frequency component of the colour channel signal averaged over a plurality of the regions of interest.

14. The method according to claim 1 wherein the signal in each colour channel is windowed into temporal windows, and the processing steps of normalising and determining are performed for each time window to determine and output a ratio for each time window.

15. A system for determining changes in blood oxygen saturation of a subject comprising: a video camera for obtaining a video image of an area of exposed skin of the subject, the video image comprising signals representing intensity in at least two different colour channels; a signal processor adapted to receive the signals from the video camera and process them by: normalising the signal in each of the colour channels by dividing each of the signals by its baseline component; defining in the image a plurality of regions of interest in said area of exposed skin of the subject; determining a signal to noise ratio of a heart rate or breathing rate frequency component of one of the colour channel signals or both for each region of interest; determining whether to reject or not reject regions of interest based on the determined signal to noise ratio of a heart rate or breathing rate frequency component or both; determining the ratio of the amplitudes of the normalised signals from two of the colour channel signals for non-rejected regions of interest; and outputting changes over time in the ratio as representing changes in blood oxygen saturation of the subject; and a display adapted to display the changes over time in the ratio.

Description

(1) The invention will be further described by way of example with reference to the accompanying drawings in which:

(2) FIG. 1 schematically illustrates the system of the invention;

(3) FIG. 2 is a flow diagram illustrating the steps of one embodiment of the invention;

(4) FIGS. 3(a) and 3(b) schematically illustrate signal averaging; and

(5) FIGS. 4(a) and 4(b) compare the results of monitoring oxygen saturation with an embodiment of the invention to results obtained using a finger-probe pulse oximeter.

(6) In the context of broad-band illumination and the use of RGB sensors, the time series $S_c^i$ of intensity values recorded by a camera from any given region of skin i for any given colour channel c may be decomposed into two parts: the baseline, or DC component, due to the residual blood present in the tissue at all times, and the pulsatile, heart-rate synchronous signal due to the change in colour as the blood flows in and out of the skin:
$S_c^i(t) = DC_c^i(t) + AC_c^i(t) \qquad (1)$

(7) This assumes that all light is reflected diffusely and ignores any component due to specular reflection (we will see below that the signal processing used in the invention makes this a reasonable assumption). We will further assume that the whole time series $S_c^i(t)$ will be affected by only the following variable factors: i) the intensity of the light reaching the skin at i; ii) the vascular volume at i; iii) the oxygen saturation.

(8) Other factors that will affect the absorption of light in the skin, such as melanin, and the spectral distribution of the light source at i, are assumed to be constant. The method of the invention allows the elimination of the effects of light intensity and vascular volume changes so as to determine changes in oxygen saturation. Of the three factors i) to iii) above, the oxygen saturation is assumed to be locally invariant, whereas the intensity of the light is allowed to vary locally (for example through geometrical effects such as shadowing), and the vascular volume is also allowed to vary locally (reflecting anatomical variation in the vasculature of the skin). Under these assumptions the relationship between the recorded intensity time series $S_c^i$ from a region i and the spectrally-invariant light intensity $I_c^i(t)$ reaching the region i may be written as:
$S_c^i(t) = I_c^i(t)\,[DC_c^i(t) + AC_c^i(t)] \qquad (2)$

(9) Using this model, the normalisation of the signal by the DC component will lead to an elimination of the effect of the local light intensity $I_c^i(t)$:

(10) $\begin{aligned} S_c^{\prime\, i}(t) &= \frac{I_c^i(t)\,[DC_c^i(t) + AC_c^i(t)]}{I_c^i(t)\,DC_c^i(t)} \\ &= 1 + \frac{AC_c^i(t)}{DC_c^i(t)} \end{aligned} \qquad (3)$
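The cancellation of the light intensity in Equation 3 can be checked numerically. The following is a minimal sketch under stated assumptions: the signal is synthetic, the light intensity is taken as constant within the window, and the baseline is estimated as the window mean (the option named in claim 2). The function name `normalise` and all numeric values are illustrative and do not come from the source.

```python
import math

def normalise(signal):
    """Divide a window of intensity samples by its mean (the baseline estimate)."""
    baseline = sum(signal) / len(signal)
    return [s / baseline for s in signal]

# Synthetic 12 s window at 20 frames/s with a 1.2 Hz pulsatile component
# (illustrative values, not measured data).
fs, seconds, f_hr = 20, 12, 1.2
t = [k / fs for k in range(fs * seconds)]
dc, ac_amp = 100.0, 0.5
tissue = [dc + ac_amp * math.sin(2 * math.pi * f_hr * tk) for tk in t]

# The same skin region seen under two different illumination levels.
dim = normalise([0.4 * x for x in tissue])
bright = normalise([1.7 * x for x in tissue])

# After normalisation the two windows coincide: the intensity factor cancels.
max_diff = max(abs(a - b) for a, b in zip(dim, bright))
```

Because the scale factor multiplies both the samples and their mean, it drops out of the quotient, which is the property Equation 3 relies on.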

(11) There are, however, other factors, such as local blood volume increases, that come into play with respect to the normalised AC component and that are not necessarily eliminated by the normalisation itself. To eliminate these, the ratio of ratios $R^i(t)$ for two colour channels 1 and 2 is taken, under the assumption that all changes that are not due to oxygenation (and therefore cause a change in colour), will be proportional across the channels:

(12) $R^i(t) = \frac{S_1^{\prime\, i}(t) - 1}{S_2^{\prime\, i}(t) - 1} = \frac{\frac{k^i(t)\,AC_1^i(t)}{DC_1^i(t)}}{\frac{k^i(t)\,AC_2^i(t)}{DC_2^i(t)}} \qquad (4)$
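The effect of taking the ratio in Equation 4 can be illustrated with a few numbers. This is a small sketch, assuming a common factor k that scales the normalised pulsatile amplitude of both channels equally; the function name `ratio_of_ratios` and the AC/DC values are hypothetical.

```python
def ratio_of_ratios(norm_amp_1, norm_amp_2):
    """Equation 4: ratio of the normalised pulsatile amplitudes of two channels."""
    return norm_amp_1 / norm_amp_2

# Illustrative AC/DC values for two colour channels (not measured data).
ac_over_dc_1, ac_over_dc_2 = 0.004, 0.010

# A common factor k (for example a local change in blood volume) scales
# both channels by the same amount and therefore cancels in the ratio.
r_values = [ratio_of_ratios(k * ac_over_dc_1, k * ac_over_dc_2)
            for k in (0.5, 1.0, 2.0)]
```

All three entries of `r_values` agree to within floating-point error, so only a change that affects the two channels differently moves the ratio.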

(13) FIG. 1 schematically illustrates the system of the invention. A human (or animal) subject 1 is in the field of vision of an RGB video camera 3 whose three colour channel output is fed to a signal processor 5 and results are displayed on a display 7. The processor 5 identifies and analyses signals from one or more exposed areas of skin 10, 12 on the subject 1 and, as explained in more detail below, each of these exposed areas 10, 12 is itself divided into plural regions of interest.

(14) The processing of the video signals by the processor 5 will be described with reference to the flowchart of FIG. 2.

(15) Following starting of the processing at step 100, during an initialization step 101 an area of exposed skin is identified. This may be done either by applying specific prior knowledge about the scene (for example by face-detection if a face is known to be in the image), or by simply doing a search using a very large search area for the position at which the result of the SNR function for heart rate ($SNR_{HR}$) is maximised, where the SNR function is defined as:

(16) $\begin{aligned} SNR &= SNR\{x(t)\}_a^b \\ &= 10 \log \left( \frac{\int_a^b V(f)\,|F\{x(t)\}|^2\,df}{\int_a^b (1 - V(f))\,|F\{x(t)\}|^2\,df} \right) \end{aligned} \qquad (5)$

(17) for a detrended and appropriately filtered time series x(t), its Fourier transform F{x(t)}, and a double-step function V(f) defined by the convolution:
$V(f) = [\delta(\hat{f}) + \delta(2\hat{f})] * \Pi(f_h)$

(18) centred on the fundamental frequency of interest $\hat{f}$ and its first harmonic (e.g. $\hat{f} = \hat{f}_{HR}$ for heart rate), with $\delta$ as the Dirac delta function, and $\Pi$ as the rect function of half-width $f_h$.

(19) For the purpose of initialisation, x(t) is taken as the green signal over a period of 12 seconds, and the double-step function for the heart rate SNR is constructed with $f_{HR}$ = 1.4 Hz and $f_h$ = 0.7 Hz so as to cover the entire span of the expected physiological heart rate range.
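The SNR function of Equation 5 can be sketched in discrete form as follows. This is an illustration under stated assumptions rather than the implementation used in the invention: the integrals are replaced by sums over DFT bins, detrending is reduced to mean removal, the logarithm is assumed to be base 10 (decibels), and a plain DFT is used to keep the sketch dependency-free. The names `power_spectrum`, `in_template` and `snr_db`, and the test signal parameters, are hypothetical.

```python
import cmath
import math
import random

def power_spectrum(x, fs):
    """Return (frequencies, |F{x}|^2) for the non-negative DFT bins of x."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power

def in_template(f, f_hat, f_h):
    """V(f): true within f_h of the fundamental f_hat or of its first harmonic."""
    return abs(f - f_hat) <= f_h or abs(f - 2 * f_hat) <= f_h

def snr_db(x, fs, f_hat, f_h, a, b):
    """Equation 5 as a discrete sum over the DFT bins in the band [a, b]."""
    mean = sum(x) / len(x)
    x = [v - mean for v in x]  # crude detrend: mean removal only (assumption)
    freqs, power = power_spectrum(x, fs)
    signal = noise = 0.0
    for f, p in zip(freqs, power):
        if a <= f <= b:
            if in_template(f, f_hat, f_h):
                signal += p
            else:
                noise += p
    return 10 * math.log10(signal / noise)

# Illustrative 12 s window at 20 frames/s: a 1.25 Hz pulse plus noise,
# compared with the noise alone.
fs, n = 20, 240
rng = random.Random(0)
noise_only = [rng.gauss(0.0, 0.3) for _ in range(n)]
pulse = [math.sin(2 * math.pi * 1.25 * k / fs) + noise_only[k] for k in range(n)]

snr_pulse = snr_db(pulse, fs, f_hat=1.25, f_h=0.2, a=0.5, b=4.0)
snr_noise = snr_db(noise_only, fs, f_hat=1.25, f_h=0.2, a=0.5, b=4.0)
```

The window containing a pulse yields a markedly higher value than the noise-only window, which is what allows the function to be used both for locating skin and for ranking regions of interest.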

(20) Once a search area that includes the totality of the skin to be imaged has been defined in step 102, the area is subdivided into N contiguous n by n pixel regions of interest i (n=40 for example). The size of the region of interest is set depending on the camera, lighting and physiology of the subject so that the region is as small as possible while still giving a detectable heart rate signal. Taking 12 second windows slid by 1 second at a time, crude estimates for the heart rate, $\hat{f}_{HR}$, and the breathing rate, $\hat{f}_{BR}$, are found. The heart rate estimate is found by taking the average of all the signals resulting from the per-frame spatial average of the green channel, then detrending and high-pass filtering this average prior to finding the peak of the Fourier Transform. The breathing rate estimate is instead found by taking the average of the power spectral density (PSD) of the detrended blue channel in the frequency domain across all regions of interest and then searching for a peak present in the expected physiological range (between 0.1 and 0.7 Hz, corresponding to 6 to 42 breaths per minute). The two rates are calculated differently because the heart rate signal is predominantly due to colour changes resulting from the inflow and outflow of blood during the cardiac cycle, so the relative phase shift between the heart rate signals estimated in two different regions of interest is uniquely determined by the pulse transit time between the two regions. This phase shift is expected to be far smaller than $\pi/2$ radians. The breathing rate signals, on the other hand, are caused by changes in colour due to movement, and so can be either in phase or in antiphase, as this depends solely on the relative intensity of the pixels in the region of interest through time; a temporal average would in fact minimise the breathing rate signal.
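The two estimation strategies described above can be contrasted in a short sketch. It assumes synthetic region-of-interest signals, reduces detrending and high-pass filtering to mean removal plus a band limit, and uses a plain DFT to stay dependency-free. The function names, the heart rate search band of 0.7 to 2.4 Hz and the test frequencies are assumptions made for illustration.

```python
import cmath
import math

def power_spectrum(x, fs):
    """Return (frequencies, |F{x}|^2) for the non-negative DFT bins, mean removed."""
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]
    freqs = [k * fs / n for k in range(n // 2 + 1)]
    power = [abs(sum(x[m] * cmath.exp(-2j * math.pi * k * m / n)
                     for m in range(n))) ** 2
             for k in range(n // 2 + 1)]
    return freqs, power

def peak_frequency(freqs, power, lo, hi):
    """Frequency of the largest spectral bin within [lo, hi]."""
    band = [(p, f) for f, p in zip(freqs, power) if lo <= f <= hi]
    return max(band)[1]

def estimate_heart_rate(green_rois, fs, lo=0.7, hi=2.4):
    """Average the ROI signals in the time domain, then take the spectral peak."""
    n = len(green_rois[0])
    mean_signal = [sum(roi[m] for roi in green_rois) / len(green_rois)
                   for m in range(n)]
    freqs, power = power_spectrum(mean_signal, fs)
    return peak_frequency(freqs, power, lo, hi)

def estimate_breathing_rate(blue_rois, fs, lo=0.1, hi=0.7):
    """Average the ROI power spectra (phase is discarded), then take the peak."""
    spectra = [power_spectrum(roi, fs) for roi in blue_rois]
    freqs = spectra[0][0]
    mean_power = [sum(p[k] for _, p in spectra) / len(spectra)
                  for k in range(len(freqs))]
    return peak_frequency(freqs, mean_power, lo, hi)

# Two synthetic ROIs over 12 s at 10 frames/s (illustrative values).
fs, n = 10, 120
t = [k / fs for k in range(n)]
# Pulse signals differ only by a small transit-time phase shift.
green_rois = [[math.sin(2 * math.pi * 1.25 * tk + shift) for tk in t]
              for shift in (0.0, 0.2)]
# Movement-induced breathing signals in exact antiphase.
blue_rois = [[sign * math.sin(2 * math.pi * 0.25 * tk) for tk in t]
             for sign in (1.0, -1.0)]

hr = estimate_heart_rate(green_rois, fs)
br = estimate_breathing_rate(blue_rois, fs)
# Averaging the antiphase breathing signals in the time domain cancels them.
cancelled = [(u + v) / 2 for u, v in zip(*blue_rois)]
```

The time-domain average of the two breathing signals is identically zero, while the averaged power spectra still show the breathing peak, which is why the breathing rate is estimated in the frequency domain.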

(21) The spatial averages across the 12 second windows are then calculated for each of the channels of the N regions of interest i to reduce the three 2D plus time colour channel signals from each region to three 1D signals. The heart rate SNR function (5) as above is then applied to each of the N regions of interest using the red channel only and fixing the frequency limits a and b of the function at a=0.7 Hz and b=2.4 Hz, and $f_h$ is taken to be the quantisation limit of the FFT applied, e.g. 0.7 Hz. A breathing rate SNR function as defined in Equation 5 is also applied to each of the N regions of interest using the blue channel only and fixing the frequency limits at a=0.1 Hz and b=0.7 Hz; the half-width $f_h$ of the breathing rate rect function is once more the quantisation limit of the FFT.
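The reduction from 2D plus time to 1D signals can be sketched as a per-frame spatial mean over each n by n tile. This is a minimal illustration, assuming frames are held as nested lists of (r, g, b) tuples and that pixels beyond the last complete tile are ignored; the function name `roi_means` and the tiny test frames are hypothetical.

```python
def roi_means(frames, n):
    """Spatially average each n-by-n tile of every frame, per colour channel.

    frames[t][y][x] is an (r, g, b) tuple. Returns signals[roi][channel][t].
    Pixels beyond the last complete tile are ignored (assumption).
    """
    height, width = len(frames[0]), len(frames[0][0])
    tiles = [(y0, x0)
             for y0 in range(0, height - n + 1, n)
             for x0 in range(0, width - n + 1, n)]
    signals = [[[] for _ in range(3)] for _ in tiles]
    for frame in frames:
        for roi, (y0, x0) in enumerate(tiles):
            for c in range(3):
                total = sum(frame[y][x][c]
                            for y in range(y0, y0 + n)
                            for x in range(x0, x0 + n))
                signals[roi][c].append(total / (n * n))
    return signals

# Three 4x4 frames whose pixel value encodes (10 * t, y, x), split into 2x2 tiles.
frames = [[[(10 * t, y, x) for x in range(4)] for y in range(4)]
          for t in range(3)]
signals = roi_means(frames, 2)
```

Each region then contributes three 1D time series per window, one per colour channel, to which the SNR functions can be applied.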

(22) The results of the heart rate and breathing rate SNR functions serve to create a logical inclusion function $L^i$ that determines in step 105 whether the region of interest will be used in further calculations for that window or will be rejected. This serves to eliminate oscillations in the signal caused by specular reflections because the inclusion function serves to introduce a degree of confidence that the signal from the selected region of interest is from a colour change only. In fact, for a time series that contains only a pulsatility due to a true PPG skin colour change, we would expect a high result for the heart rate SNR, but a low result for the breathing rate SNR. This is because breathing rate is mostly associated with movement, and any region of interest that has a high breathing rate SNR will therefore have some component (whether a physical feature or a specular reflection) that is moving and thus does not fit the model's assumptions. In addition to this, a further condition is imposed: the heart rate component of the green channel of the region of interest needs to be in phase with the green heart rate signal derived for the whole search area. This condition stems from the understanding that only movement-induced pulsations can be in phase or antiphase, and is determined by meeting the condition $P^i = 1$ where:

(23) $P^i = \begin{cases} 1, & \text{if } \left| \phi_g^i(\hat{f}_{HR}) - \phi_G(\hat{f}_{HR}) \right| < \frac{\pi}{2} \\ 0, & \text{otherwise} \end{cases}$
where $\phi_g^i(\hat{f}_{HR})$ is the phase of the heart rate frequency component of the green signal of the region of interest i considered, and $\phi_G(\hat{f}_{HR})$ is the phase of the heart rate frequency component of the green signal averaged over the entire area. A phase difference of $\pi/2$ is chosen here because this gives the clearest demarcation between phase estimates that can be said to be in phase (but for a phase shift caused by a delay due to the pulse transit time) and the case for which two phase estimates are exactly in antiphase. The overall logical inclusion function is then given by:
$L^i = (SNR_{HR}^i > SNR_{HR}^{thresh}) \wedge (SNR_{BR}^i < SNR_{BR}^{thresh}) \wedge P^i$
with $SNR_{HR}^{thresh}$ and $SNR_{BR}^{thresh}$ determined by the initial conditions of the video (these will depend on skin colour, light intensity, light spectrum and distance from the camera). The thresholds can, for example, be taken as the mean plus one standard deviation of the SNRs in each of the regions of interest over the first stable window. In step 106, the normalised amplitude is determined for each colour channel as per Equation 4 in all M regions of interest that meet the condition $L^i = 1$. As the method depends on multiple regions to reduce the measurement error, a minimum number of regions $M \geq 3$ is set at each iteration or the window is rejected. The DC component is taken as the time-average of the window for each colour channel and each region of interest individually and the AC component is taken to be the residual of the original time series after the DC component has been subtracted out. A weighted average of all the M normalised amplitudes is then taken in step 107 for each of the channels, in which the region of interest weightings $w_i$ are a function of the heart rate SNR, such that:

(24) $w_i = \frac{SNR_{HR}^{i}}{\sum_{j=1}^{M} SNR_{HR}^{j}}, \quad \{i : L^i = 1\}$

(25) The weighting introduces an additional degree of belief in each region of interest, favouring regions of interest that have a high heart rate signal-to-noise ratio as these are more likely to correspond to the ideal surfaces that are considered in the theoretical model.
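The inclusion test and the SNR weighting of steps 105 to 107 can be sketched together. The following is an illustration under stated assumptions: the phase difference is wrapped to the interval from minus pi to pi before comparison (a detail the source does not spell out), the heart rate SNRs of included regions are assumed positive so that the weights are well defined, and the thresholds, SNR values, phases and amplitudes are invented for the example. The function names are hypothetical.

```python
import math

def phase_ok(phase_roi, phase_area):
    """P^i: true if the ROI heart-rate phase is within pi/2 of the whole-area phase."""
    # Wrap the difference to [-pi, pi) before comparing (assumption).
    diff = (phase_roi - phase_area + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) < math.pi / 2

def include(snr_hr, snr_br, phase_roi, phase_area, hr_thresh, br_thresh):
    """L^i: keep the ROI only if all three conditions hold."""
    return (snr_hr > hr_thresh and snr_br < br_thresh
            and phase_ok(phase_roi, phase_area))

def snr_weights(snr_hr_included):
    """Weights proportional to heart-rate SNR, normalised to sum to 1."""
    total = sum(snr_hr_included)
    return [s / total for s in snr_hr_included]

# Illustrative regions: (name, SNR_HR, SNR_BR, phase in radians, normalised amplitude).
rois = [
    ("A", 6.0, -4.0, 0.1, 0.004),
    ("B", 2.0, -4.0, 0.0, 0.009),   # heart-rate SNR too low
    ("C", 9.0, 2.0, 0.0, 0.002),    # breathing-rate SNR too high
    ("D", 8.0, -3.0, 3.0, 0.007),   # close to antiphase with the whole area
    ("E", 4.0, -1.0, -0.3, 0.006),
    ("F", 10.0, -5.0, 0.2, 0.005),
]
hr_thresh, br_thresh, area_phase = 3.0, 0.0, 0.0  # illustrative values

kept = [r for r in rois
        if include(r[1], r[2], r[3], area_phase, hr_thresh, br_thresh)]
included = [r[0] for r in kept]
window_ok = len(kept) >= 3  # the window is rejected if fewer than 3 regions remain

weights = snr_weights([r[1] for r in kept])
weighted_amplitude = sum(w * r[4] for w, r in zip(weights, kept))
```

In this example regions A, E and F survive, with weights of 0.3, 0.2 and 0.5, so the region with the strongest heart rate component contributes most to the averaged amplitude.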

(26) Finally, in step 108 a self-referential signal averaging procedure is applied to the averaged waveforms. This is done by taking the green channel averaged waveform, high-pass filtering it and finding the positions of the peaks in the waveform (the green channel is used because it has a high signal-to-noise ratio). In each of the three colour channels all samples around the peaks

(27) $\pm \frac{2}{3 f_{HR}}$
in the averaged waveforms are then averaged together to obtain a single representative waveform for the entire 12-second period for each channel. The amplitude of the waveform is taken as the normalised amplitude for that channel, and the ratio of the normalised amplitude of the blue channel to the normalised amplitude of the red channel is then taken as the ratio of ratios in step 109. This ratio, or its logarithm, is then output in step 110 as representative of the oxygen saturation and displayed on display 7. Steps 103 to 110 are then repeated until there is a significant movement of the subject outside of the search area as checked at step 111, at which point the algorithm is halted until there is a period of no movement and then reinitialised in step 112 and restarted.

(28) FIGS. 3(a) and (b) schematically illustrate a signal averaging process applied to a waveform from one colour channel. In FIG. 3(a), sections 31 of each of the successive waveforms centred on the peaks 33 are taken, the peaks are aligned and the waveforms averaged to produce a single representative shape waveform for that signal as shown in FIG. 3(b). The amplitude 35 of the representative waveform is taken as the amplitude for that signal.
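The peak-aligned signal averaging of FIG. 3 can be sketched as follows. This is an illustration under stated assumptions: peaks are detected as simple local maxima of a noise-free green waveform, the section half-width is taken as 2/(3 f_HR) seconds either side of each peak, sections that would run past the ends of the window are skipped, and the amplitude is measured peak to trough. The function names and the synthetic waveforms are hypothetical.

```python
import math

def find_peaks(x):
    """Indices of simple local maxima (strictly greater than both neighbours)."""
    return [k for k in range(1, len(x) - 1) if x[k - 1] < x[k] > x[k + 1]]

def signal_average(x, peaks, half_width):
    """Average the sections of x centred on each peak, aligned on the peak."""
    sections = [x[p - half_width:p + half_width + 1]
                for p in peaks
                if p - half_width >= 0 and p + half_width < len(x)]
    count = len(sections)
    return [sum(sec[k] for sec in sections) / count
            for k in range(2 * half_width + 1)]

def amplitude(waveform):
    """Peak-to-trough amplitude of the representative waveform (assumption)."""
    return max(waveform) - min(waveform)

# Synthetic averaged waveforms at 10 samples/s with a 1 Hz pulse (illustrative).
fs, f_hr = 10, 1.0
green = [math.cos(2 * math.pi * f_hr * k / fs) for k in range(50)]
blue = [0.4 * g for g in green]
red = [0.8 * g for g in green]

# Peak times come from the green channel and are reused for every channel.
peaks = find_peaks(green)
half_width = round(2 / (3 * f_hr) * fs)

amp_blue = amplitude(signal_average(blue, peaks, half_width))
amp_red = amplitude(signal_average(red, peaks, half_width))
ratio = amp_blue / amp_red
```

Reusing the green-channel peak times for all channels keeps the sections aligned across channels, so the amplitude ratio reflects the relative pulsatile strength rather than differences in peak detection.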

(29) Although in FIG. 2 and as explained above, the 12-second sections of signal are first averaged over all regions of interest in step 107, and then signal averaging within the 12-second section is performed on the result in step 108, these steps can be performed the other way around. Thus signal averaging for each 12-second section can be conducted for each region of interest to produce a single representative waveform for each region of interest. Then the representative waveforms from the plural regions of interest can be averaged to produce a final representative waveform and corresponding amplitude for that 12-second section of signal.

(30) FIG. 4 illustrates results obtained (log of the ratio of ratios of the normalised blue signal to the normalised red signal) by tracking a volunteer's oxygen saturation in accordance with the invention and simultaneously using a standard finger-probe pulse oximeter. A 3-CCD (JAI AT-200CL) RGB camera was used and the volunteer was placed in a study chamber in which the relative concentrations of oxygen, carbon dioxide, and nitrogen could be modified so as to induce mild hypoxia and hypercapnia. To produce the results of FIG. 4 the oxygen concentration in the chamber was modified by changing the concentration of nitrogen in accordance with the following protocol: the concentration was lowered so as to induce a change in oxygen saturation in steps of five percent (as measured by the reference pulse oximeter), each lasting seven minutes, from baseline oxygen saturation (around 97 percent) to 80 percent. Two cycles of fast re-saturations and de-saturations then took place through the use of a nasal cannula for oxygen delivery.

(31) As can be seen, the changes in oxygen saturation as measured by the method of the invention using the video camera shown in FIG. 4(a) track the changes in oxygen saturation measured by the reference pulse oximeter in FIG. 4(b) reasonably well.

(32) As mentioned above, the calculation made by the invention does not directly result in an oxygen saturation value. However, in a clinical setting oxygen saturation values may be obtained by first calibrating the system using a standard pulse oximeter. Thus a subject's oxygen saturation can be measured initially (and potentially at intervals thereafter) using a standard finger-probe pulse oximeter, with this value being used to calibrate the system of the invention, the method of the invention then being used primarily to track variations from that initial saturation. Significant decreases in oxygen saturation, which might represent a worsening of the subject's condition, can be used to trigger an alarm to the clinicians.

(33) Although the main thrust of the invention is to track changes in oxygen saturation, the estimated heart rate and breathing rate used in the method can be output and displayed on display 7 as additional vital signs information.

CPC classification

A61B5/7221 (Human Necessities), A61B5/0077 (Human Necessities), G06T7/0012 (Physics), G06T2207/30076 (Physics), G06T2207/30104 (Physics), A61B5/02416 (Human Necessities), A61B5/0816 (Human Necessities), A61B5/14551 (Human Necessities), A61B5/7257 (Human Necessities)

International classification

A61B5/1455 (Human Necessities), A61B5/00 (Human Necessities), G06T7/00 (Physics), A61B5/024 (Human Necessities)
