DERIVING INFORMATION ABOUT A PERSON'S SLEEP AND WAKE STATES FROM A SEQUENCE OF VIDEO FRAMES

Abstract

For the purpose of obtaining information about a person's sleep and wake states, an arrangement (100) comprising a video camera (10) and a processing unit (20) is used. The video camera (10) serves for capturing a sequence of video frames during a time period, and the processing unit (20) is configured to process video frames provided by the video camera (10) and to provide output representative of the person's sleep and wake states during the time period. In particular, the processing unit (20) is configured to execute an algorithm according to which (i) a motion value-time relation, (ii) sets of features relating to respective epochs in the motion value-time relation and (iii) classifiers of the respective epochs are determined, wherein the algorithm is further configured to apply an adaptive prior probability determined for the particular person in dependence of the motion values of the respective epochs to the classifiers.

Claims

1. An arrangement designed to derive information about a person's sleep and wake states from a sequence of video frames, comprising: a video camera for capturing a sequence of video frames during a time period, and a processing unit configured to process video frames provided by the video camera and to provide output representative of the person's sleep and wake states during the time period, wherein the processing unit is configured to execute an algorithm according to which a motion value-time relation is determined from the video frames, sets of features relating to respective epochs in the motion value-time relation are determined by extracting a number of different features from the motion values in each of the respective epochs, classifiers of the respective epochs are determined by classifying the respective sets of features relating to the respective epochs as being representative of the person's sleep or wake state, and an adaptive prior probability determined for the particular person in dependence of the motion values of the respective epochs is applied to the classifiers.

2. The arrangement according to claim 1, wherein the algorithm is configured to determine the adaptive prior probability in dependence of a distribution of the motion values over the respective epochs.

3. The arrangement according to claim 2, wherein the algorithm is configured to determine the adaptive prior probability in dependence of a number value of epochs with motion values which are larger than a reference motion value that is equal to or higher than zero in a total of epochs.

4. The arrangement according to claim 3, wherein the algorithm is configured to assign a high prior probability for wake when the number value of epochs with motion values which are larger than the reference motion value is equal to or higher than a threshold, and to assign a low prior probability for wake when the number value of epochs with motion values which are larger than the reference motion value is lower than the threshold.

5. The arrangement according to claim 3, wherein the algorithm is configured to determine an optimal relation between the number value of epochs with motion values which are larger than the reference motion value and a prior probability for wake.

6. The arrangement according to claim 1, wherein the algorithm is further configured to apply a smoothing filter removing short period deviations from an overall pattern of classifiers in respect of successive epochs to the classifiers.

7. The arrangement according to claim 6, wherein, in order to have the smoothing filter, the algorithm is configured to assess whether two sequences of epochs both representing a minimum length of time and involving sets of features to be classified as being representative of only one of the person's sleep or wake state are interrupted by one epoch or a limited number of epochs involving (a) set(s) of features to be classified as being representative of the other of the person's sleep or wake state, and if such is found to be the case, to set the classifier(s) of the one epoch or the limited number of epochs to be the same as the classifiers of the two sequences of epochs.

8. The arrangement according to claim 1, wherein the features to be extracted from the motion values in each of the respective epochs include at least one of (i) the mean of the motion values in each of the respective epochs and (ii) the number of motion values which are larger than a reference motion value that is equal to or higher than zero in each of the respective epochs.

9. The arrangement according to claim 8, wherein the features to be extracted from the motion values in each of the respective epochs include a relative possibility of sleep, and wherein the algorithm is configured to include a step of determining a time distance of each of the respective epochs to a nearest epoch with a high activity level in a process of determining the relative possibility of sleep.

10. The arrangement according to claim 8, wherein the features to be extracted from the motion values in each of the respective epochs include at least one of (i) a relative possibility of sleep, wherein the algorithm is configured to include a step of determining a time distance of each of the respective epochs to a nearest epoch with a high activity level in a process of determining the relative possibility of sleep, and wherein the algorithm is configured to identify the epochs with a high activity level by taking the epochs with the highest mean of the motion values, up to a predetermined maximum percentage of a total number of epochs, and (ii) a relative possibility of sleep, wherein the algorithm is configured to include a step of determining a time distance of each of the respective epochs to a nearest epoch with a high activity level in a process of determining the relative possibility of sleep, and wherein the algorithm is configured to identify the epochs with a high activity level by taking the epochs with the highest number of motion values which are larger than the reference motion value, up to a predetermined maximum percentage of a total number of epochs.

11. The arrangement according to claim 1, wherein the algorithm is configured to normalize the features.

12. The arrangement according to claim 1, wherein the algorithm is configured to determine machine learning classifiers on the basis of differences between (i) an initial set of classifiers determined on the basis of the features and (ii) a final set of classifiers determined by applying at least the adaptive prior probability, and to use the machine learning classifiers for making adjustments in the algorithm as far as determining the classifiers of the respective epochs is concerned.

13. The arrangement according to claim 1, wherein the algorithm is configured to apply 3D recursive search motion estimation for determining the motion value-time relation from the video frames and/or to apply Bayesian-based linear discriminant analysis for determining the classifiers of the respective epochs.

14. The arrangement according to claim 1, designed for use in infant care, wherein the algorithm is configured to classify the respective sets of features relating to the respective epochs as being representative of a care state of the infant besides the infant's sleep or wake state, and wherein the algorithm is configured to set the classifier of an epoch to be a care state classifier when the epoch is an epoch with an activity level that is above a threshold chosen to distinguish between the wake state and the care state.

15. A computer program product comprising a program code of a computer program to make a computer execute the algorithm of the arrangement according to claim 1 when the computer program is loaded on the computer.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0026] The invention will now be explained in greater detail with reference to the figures, in which equal or similar parts are indicated by the same reference signs, and in which:

[0027] FIG. 1 diagrammatically shows a video camera and a processing unit of the arrangement according to the invention, as used for monitoring sleep and wake states of an infant in an incubator,

[0028] FIG. 2 is a diagram of various steps of an algorithm to be executed by the processing unit for the purpose of providing output representative of the infant's sleep and wake states during a recording,

[0029] FIG. 3 shows a first example of a graph of a motion value-time relation and monitoring results associated therewith,

[0030] FIG. 4 shows a second example of a graph of a motion value-time relation and monitoring results associated therewith, and

[0031] FIG. 5 shows a scatter plot and a linear fitting between percentage of epochs with non-zero motion values and percentage of wake of a number of recordings.

DETAILED DESCRIPTION OF EMBODIMENTS

[0032] As explained in the foregoing, the invention is about obtaining reliable information about a person's sleep in a non-obtrusive way, particularly by capturing a sequence of video frames and using a processing unit that is programmed to follow a certain algorithm in analyzing the video frames. In many cases, when a certain time period is considered, it is desirable to know during which epochs of the time period the person was asleep and during which epochs the person was awake. According to the invention, video detection of motion of the person is at the basis of the analysis to be performed for gaining the knowledge as desired.

[0033] FIG. 1 diagrammatically shows a video camera 10 and a processing unit 20 of an arrangement 100 according to the invention, as used for monitoring sleep and wake states of an infant 30 in an incubator 40. The positioning of the video camera 10 with respect to the incubator 40 is not very critical, as in fact the only requirement is to have the video camera 10 arranged at a position where the video camera 10 is capable of recording motion of at least some part of the infant's body. In order to keep costs of the arrangement 100 as low as possible, it is advantageous to use only one video camera 10, but that does not alter the fact that the invention covers the use of two or even more video cameras 10 as well.

[0034] The processing unit 20 can be provided in various ways. Practical examples include an arrangement of the processing unit 20 as an integral part of the video camera 10 and an arrangement of the processing unit 20 in a computer system positioned separately from the video camera 10. In any case, the processing unit 20 is arranged and configured so as to be enabled to receive information from the video camera 10. Further, any type of display device such as a screen (not shown) can be used for displaying information output from the processing unit 20 to a user. Communication between one or more components of the arrangement 100 for the purpose of conveying information can take place through a wired system or in a wireless manner, wherein use can be made of the Internet if so desired.

[0035] FIG. 2 is a diagram of various steps of the algorithm to be executed by the processing unit 20 for the purpose of providing output representative of the infant's sleep and wake states during a recording. The following description refers to the various steps and provides further information about each of those steps, wherein it is assumed that the invention is applied in a context as illustrated in FIG. 1, i.e. a context in which it is desirable to obtain information about a wake-sleep pattern of an infant 30 in an incubator 40. However, it is understood that the invention is also applicable to other contexts, including a domestic context in which an infant (or an older person) can be monitored while lying in bed.

[0036] Motion or video actigraphy, i.e. motion estimated from a video recording, is mainly caused by body motion, parent activity/caretaking, or other disturbances, e.g. from other moving objects. This is important information that can tell whether the infant 30 is present in the incubator 40, further enabling an automated analysis of the infant's sleep. The main idea is that body motion of the infant 30 is highly associated with the infant's sleep and wake states, assuming that an infant 30 usually has a higher degree of body motion during the wake state than during the sleep state.

[0037] In a first step 1 of the analysis of the video frames, a motion estimation technique is employed for quantifying motion as video actigraphy (VA). In the shown example, the motion estimation technique is assumed to be a technique known as 3D recursive search (3DRS). It has been demonstrated that this particular technique is robust to scene changes, especially light changes, meaning that the effect of daytime light can be eliminated. The type of video recording may be RGB at grayscale. Another feasible example of the type of video recording is NIR, wherein it is noted that the application of RGB may be best applicable to the context of preterm infants, while the application of NIR may be best applicable to the context of term infants. The raw 3DRS motion estimates corresponding to the video recording may have a frame rate of approximately 15 Hz, but may also have another frame rate such as 8 Hz, depending on the video camera 10 that is employed.

[0038] FIG. 3 shows an example of VA values relating to a preterm infant 30, obtained by running a 3DRS algorithm for processing video frames of a recording of about 2 hours taken at a frequency of about 15 Hz. FIG. 4 shows an example of VA values relating to a healthy term infant 30, obtained by running a 3DRS algorithm for processing video frames of a recording of about 24 hours taken at the same frequency of about 15 Hz. A larger estimated VA value usually corresponds to a large body movement or more body movements.

[0039] In a second step 2 of the analysis of the video frames, features are extracted from the raw estimated motion data. According to guidelines of the American Academy of Sleep Medicine (AASM), sleep states should be classified for continuous non-overlapping epochs of 30 seconds. In view thereof, it is assumed that the features are extracted on a 30 seconds basis. In the following, it is explained how four features can be extracted for each epoch, two of those features being computed based on the mean of the VA values over the respective epochs, and another two of those features being computed by counting the non-zero VA values over the respective epochs. The set of four features that is obtained in this way for each of the epochs is summarized as follows: [0040] (i) video actigraphy mean (VAM)=mean of VA values (motion estimates) over 30 seconds, computed based on raw VA data after 3DRS motion estimations, [0041] (ii) video actigraphy count (VAC)=(averaged) count of non-zero VA values (motion estimates) over 30 seconds, computed based on raw VA data after 3DRS motion estimations, [0042] (iii) relative possibility of sleep based on VAM (PS.sub.VAM)=relative possibility of sleep, as measured by the time distance to the nearest epoch with a high activity level as computed based on VAM, and [0043] (iv) relative possibility of sleep based on VAC (PS.sub.VAC)=relative possibility of sleep, as measured by the time distance to the nearest epoch with a high activity level as computed based on VAC.

[0044] In the following, further details of the various features and the way in which they are determined are provided.

[0045] The VAM aims to capture the average magnitude of body movements, while the VAC characterizes movement frequency, i.e. number of movements, for each epoch. Assuming the raw VA data within a 30 seconds epoch is u={u.sub.1, u.sub.2, . . . , u.sub.k}, where k is 450 when the frequency of the video frames is 15 Hz, the mean of the VA data over the epoch is computed by

[00001] $VAM = \frac{\sum_{i=1}^{k} u_{i}}{k}$

and the (averaged) count of non-zero VA data over the epoch is computed by

[00002] $VAC = \frac{\sum_{i=1}^{k} v_{i}}{k}$, where $v_{i} = \begin{cases} 1, & u_{i} > 0 \\ 0, & \text{otherwise} \end{cases}$ for all $i$ $(i = 1, 2, \ldots, k)$
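The two formulas above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the invention: the function name `epoch_features` and its parameters are hypothetical, and dropping an incomplete trailing epoch is an assumption that the text does not address.

```python
def epoch_features(va, frame_rate_hz=15, epoch_seconds=30):
    """Return a list of (VAM, VAC) pairs, one per complete epoch.

    va is the sequence of raw video actigraphy (motion) values, one per frame.
    """
    k = frame_rate_hz * epoch_seconds  # samples per epoch, 450 at 15 Hz
    features = []
    # Assumption: an incomplete epoch at the end of the recording is dropped.
    for start in range(0, len(va) - k + 1, k):
        u = va[start:start + k]
        vam = sum(u) / k                              # mean of the VA values
        vac = sum(1 for value in u if value > 0) / k  # averaged count of non-zero values
        features.append((vam, vac))
    return features
```

For instance, with a toy epoch length of 4 samples, the values 1, 3, 0, 0 give a VAM of 1.0 and a VAC of 0.5.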

[0046] It is challenging to identify wakefulness in situations of reduced body movements, e.g. during quiet wake, with VAM or VAC exclusively. Further accuracy can be realized by extracting features PS.sub.VAM and PS.sub.VAC that characterize the possibility of being asleep before/after a very high activity level. This can be done by quantifying the logarithm of the time difference between each epoch and its nearest epoch with a lot of body movements, in correspondence to a large VAM value or a large VAC value. The outcome can be smoothed through a moving average operation with an experimentally optimized window of 10 minutes. For each recording, an epoch with a VAM or VAC value higher than the 95th percentile of the VAM or VAC values over the entire recording is considered to have a high activity level for the purpose of computing PS.sub.VAM or PS.sub.VAC values. The hypothesis is that epochs closer to an epoch with a high level of activity, and thereby with a smaller time difference, are more likely to correspond to the wake state, albeit possibly with fewer body movements. Assuming a series of n epoch-based VAM or VAC feature values a={a.sub.1, a.sub.2, . . . , a.sub.n} over an entire recording, a.sub.T is a subset of a in which feature values are larger than a threshold T, where the associated epoch indices are e.sub.T={e.sub.1, e.sub.2, . . . , e.sub.m}. Accordingly, b={b.sub.1, b.sub.2, . . . , b.sub.n} is a set of PS.sub.VAM or PS.sub.VAC feature values from the same series. The value b.sub.x at epoch x (x=1, 2, . . . , n) can then be computed such that

b.sub.x=ln(min{|x−e.sub.1|,|x−e.sub.2|, . . . ,|x−e.sub.m|})

[0047] T may be experimentally chosen as the 95th percentile of the feature values over the entire recording, for example. In any case, it follows from the foregoing that the second step 2 of the analysis of the video frames may involve obtaining two new, secondary features PS.sub.VAM and PS.sub.VAC, which are computed from the two primary features VAM and VAC.
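The computation of b.sub.x may be illustrated by the following sketch. Several points are assumptions that the text does not settle: the nearest-rank percentile method, the clamping of a zero distance to 1 so that the logarithm is defined for the high-activity epochs themselves, the centered form of the moving average, and all function names.

```python
import math

def high_activity_epochs(a, percentile=95):
    """Indices of epochs whose feature value is larger than the threshold T."""
    ordered = sorted(a)
    # Assumption: nearest-rank percentile, computed with integer arithmetic.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    threshold = ordered[rank - 1]
    return [x for x, value in enumerate(a) if value > threshold]

def possibility_of_sleep(a, percentile=95):
    """b_x = ln(min_i |x - e_i|), with distances measured in epochs."""
    e = high_activity_epochs(a, percentile)
    if not e:
        raise ValueError("no epoch exceeds the threshold T")
    # Assumption: a distance of 0 is clamped to 1, since ln(0) is undefined.
    return [math.log(max(1, min(abs(x - e_i) for e_i in e)))
            for x in range(len(a))]

def moving_average(b, window_epochs=20):
    """Centered moving average; 20 epochs of 30 s span about 10 minutes."""
    half = window_epochs // 2
    smoothed = []
    for x in range(len(b)):
        window = b[max(0, x - half):x + half + 1]  # truncated at the edges
        smoothed.append(sum(window) / len(window))
    return smoothed
```

In this sketch, a recording without any epoch above T raises an error instead of returning a value, since the formula is not defined for an empty set of high-activity epochs.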

[0048] In a third step 3 of the analysis of the video frames, in order to reduce the global variability between subjects and between days conveyed by VA, features are normalized for each recording. For PS.sub.VAM and PS.sub.VAC, feature values may be normalized to zero mean and unit standard deviation of the entire recording (Z-score normalization), while for VAM and VAC, it may be practical to normalize the feature values between 0 and 1 (max-min normalization). Within the framework of the invention, any suitable type of normalization method may be applied, wherein it is noted that normalization methods can be different for different features.
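Both normalization methods mentioned above can be sketched as follows. The use of the population standard deviation and the handling of a constant feature series (all values mapped to zero) are assumptions; the function names are hypothetical.

```python
import statistics

def z_score(values):
    """Normalize to zero mean and unit standard deviation over the recording."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0:
        return [0.0 for _ in values]  # assumption: constant series maps to zero
    return [(value - mean) / std for value in values]

def min_max(values):
    """Normalize to the range between 0 and 1 over the recording."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]  # assumption: constant series maps to zero
    return [(value - lowest) / (highest - lowest) for value in values]
```

Following the text, `z_score` would be applied to the PS.sub.VAM and PS.sub.VAC values of a recording, and `min_max` to the VAM and VAC values.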

[0049] In a fourth step 4 of the analysis of the video frames, classification takes place on the basis of the sets of features of the respective epochs. In particular, for each epoch, the set of features of that particular epoch is classified as being representative of a wake state or a sleep state, possibly also an out of bed state. In FIGS. 3 and 4, the outcome of the classification process is represented above the respective graphs. The classifiers of the respective epochs can be determined in any suitable manner. For example, it may be practical to use Bayesian-based linear discriminant analysis in the process.
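To illustrate how a Bayesian-based linear discriminant analysis combines feature values with class priors, the following sketch fits a two-class model on a single feature. It is a simplified assumption for illustration only: the text works with sets of several features per epoch, and the function names and the label coding (1 for wake, 0 for sleep) are hypothetical.

```python
import math

def fit_lda_1d(values, labels):
    """Fit class means and a pooled variance; labels: 1 = wake, 0 = sleep."""
    wake = [v for v, y in zip(values, labels) if y == 1]
    sleep = [v for v, y in zip(values, labels) if y == 0]
    mu_w = sum(wake) / len(wake)
    mu_s = sum(sleep) / len(sleep)
    squares = (sum((v - mu_w) ** 2 for v in wake)
               + sum((v - mu_s) ** 2 for v in sleep))
    variance = squares / (len(values) - 2)  # pooled over both classes
    return mu_w, mu_s, variance

def classify(value, model, prior_wake=0.5):
    """Return 1 (wake) or 0 (sleep) for one epoch."""
    mu_w, mu_s, variance = model
    # Linear discriminant score: x*mu/var - mu^2/(2*var) + ln(prior)
    score_w = value * mu_w / variance - mu_w ** 2 / (2 * variance) + math.log(prior_wake)
    score_s = value * mu_s / variance - mu_s ** 2 / (2 * variance) + math.log(1 - prior_wake)
    return 1 if score_w > score_s else 0
```

Lowering `prior_wake` shifts the decision boundary towards the wake mean, so that borderline epochs are classified as sleep.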

[0050] The processing unit 20 is configured to take the prior probability of the classifiers into account. This step of the analysis of the video frames is a parallel step that is indicated by reference numeral 4a in FIG. 2. In general, the prior probability of a classifier is set to give a preference to one class when making decisions. On the basis of the assumption that body movements indicate a wake state, it is intuitive that a recording having more epochs with body movements should have more wake epochs, leading to a higher probability for wake epochs and a lower probability for sleep epochs. This is confirmed by the significant correlation shown in FIG. 5, in which a relation between a percentage of non-zero motion epochs and a percentage of wake epochs is illustrated. Therefore, the invention proposes personalized adaptive priors depending on the percentage of epochs with non-zero motions, which can be computed based on VAM or VAC, for example. When it is assumed that the prior probability is PriW for wake and 1−PriW for sleep, and that the percentage of epochs with non-zero motions is PE, PE is compared to a threshold S. If PE is equal to or larger than S, a higher PriW (PriW_high) is assigned; otherwise, a lower PriW (PriW_low) is assigned, such that

[00003] $PriW = \begin{cases} \mathrm{PriW\_low}, & PE < S \\ \mathrm{PriW\_high}, & PE \geq S \end{cases}$

[0051] In an example involving experimentally chosen values, S is 0.14, the higher PriW is 0.5, and the lower PriW is 0.1. The personalized prior probability can also be determined using a linear regression method, wherein an optimal linear relation between PE and PriW is established.
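The threshold rule for the adaptive prior probability can be sketched as follows, using the example values S = 0.14, PriW_high = 0.5 and PriW_low = 0.1 as defaults. The function name is hypothetical, and PE is expressed as a fraction between 0 and 1, which is an assumption suggested by the example value of S.

```python
def adaptive_wake_prior(epoch_values, s=0.14, pri_w_low=0.1, pri_w_high=0.5,
                        reference=0.0):
    """Return PriW for one recording from its epoch-based VAM or VAC values."""
    # PE: fraction of epochs with a motion value above the reference value.
    pe = sum(1 for value in epoch_values if value > reference) / len(epoch_values)
    return pri_w_high if pe >= s else pri_w_low
```

The prior probability for sleep then follows as 1 − PriW. The linear regression alternative mentioned above would replace the two fixed values by a fitted linear function of PE; it is not shown here.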

[0052] FIG. 5 shows a scatter plot and illustrates a linear fitting between the percentage of non-zero motion epochs and the percentage of wake epochs. The scatter plot is derived from 45 recordings for healthy term infants. Pearson's correlation coefficient is 0.66 at p<0.0001.

[0053] In a fifth step 5 of the analysis of the video frames, suspicious segmented awakenings or sleep states with a very short time duration are “filtered” out, while relatively long periods of wake and sleep are preserved to be annotated. It is therefore advantageous to apply a “low pass filter” to smooth the detection results, as by doing so, it may be possible to correct some single or very short periods of misclassified wake and sleep epochs. The window size can be experimentally chosen to optimize classification performance.
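One possible form of such a filter is a median filter over the sequence of epoch classifiers, which is the type of filter named in the notes to the tables of results. The following sketch assumes binary labels (1 for wake, 0 for sleep); the window truncation at the edges of the recording and the tie rule are assumptions, and the function name is hypothetical.

```python
def median_filter(labels, window=5):
    """Smooth a sequence of 0/1 epoch labels with an odd-sized median window."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        segment = labels[max(0, i - half):i + half + 1]  # truncated at the edges
        ones = sum(segment)
        zeros = len(segment) - ones
        if ones > zeros:
            smoothed.append(1)
        elif zeros > ones:
            smoothed.append(0)
        else:
            # Assumption: on a tie (only possible at the edges), keep the label.
            smoothed.append(labels[i])
    return smoothed
```

With a window of 3 epochs, a single wake epoch surrounded by sleep epochs is relabeled as sleep, while a run of four wake epochs is preserved.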

[0054] Finally, the processing unit 20 outputs the identification output that can be communicated to and interpreted by a user in order for the user to be provided with knowledge about the wake-sleep behavior of a recorded infant 30 over the time of the recording.

[0055] In the context of the invention, experiments were performed to validate the above-described algorithm, and to check whether the invention is suitable to be used for obtaining reliable information about the wake-sleep behavior of both healthy term infants and preterm infants. For preterm infants, 45 video recordings (738×480 pixels or 768×576 pixels) from 7 infants with an average gestational age of 29.9 weeks were included. For healthy term infants, video data of 29 hours (1280×720 pixels) from 8 infants aged 6 months on average were included. Inclusion criteria were that the term infants needed to sleep (most of the time) in their own bed and bedroom. Sleep and wake states were scored by human annotators for non-overlapping epochs that lasted 30 seconds. These annotations served as the gold standard for automated classification. For preterm infants, caretaking was also annotated and considered as wake, yet there were far fewer wake epochs.

[0056] To demonstrate the validity of the proposed classification algorithm, a (subject-independent) leave-one-subject-out cross validation (LOOCV) was applied. Overall accuracy and the chance-compensated metric Cohen's Kappa coefficient were used to assess the classification performance. The following tables present and compare the sleep and wake classification results using different feature sets and settings/methods for preterm infants and healthy term infants, respectively. It can be seen that using adaptive prior probability for the individual infants and result filtering can improve the classification performance for both preterm and term infants.

TABLE-US-00001 Sleep and wake (and caretaking) classification results for preterm infants (average and standard deviation over infants, LOOCV)

Feature set | Setting | Accuracy | Cohen's Kappa
VAM + PS.sub.VAM | Basic classifier | 0.82 ± 0.11 | 0.38 ± 0.14
VAM + PS.sub.VAM | Basic classifier + result filtering | 0.85 ± 0.11 | 0.42 ± 0.16
VAC + PS.sub.VAC | Basic classifier | 0.79 ± 0.16 | 0.39 ± 0.16
VAC + PS.sub.VAC | Basic classifier + result filtering | 0.81 ± 0.17 | 0.41 ± 0.15
Best set (on Kappa): VAM + VAC + PS.sub.VAC | Basic classifier | 0.81 ± 0.15 | 0.40 ± 0.17
Best set (on Kappa): VAM + VAC + PS.sub.VAC | Basic classifier + result filtering | 0.83 ± 0.15 | 0.44 ± 0.18

Note: Adaptive prior probability is not available/applicable due to short recordings from preterm infants. Median filter window size for result filtering was chosen to optimize Kappa.

TABLE-US-00002 Sleep and wake (and caretaking) classification results for healthy term infants (average and standard deviation over infants, LOOCV)

Feature set | Setting | Accuracy | Cohen's Kappa
VAM + PS.sub.VAM | Basic classifier | 0.85 ± 0.10 | 0.48 ± 0.15
VAM + PS.sub.VAM | Basic classifier + adaptive prior probability + result filtering | 0.89 ± 0.04 | 0.55 ± 0.08
VAC + PS.sub.VAC | Basic classifier | 0.90 ± 0.03 | 0.52 ± 0.14
VAC + PS.sub.VAC | Basic classifier + adaptive prior probability + result filtering | 0.92 ± 0.02 | 0.59 ± 0.16
Best set (on Kappa): VAM + VAC + PS.sub.VAC | Basic classifier | 0.90 ± 0.03 | 0.53 ± 0.15
Best set (on Kappa): VAM + VAC + PS.sub.VAC | Basic classifier + adaptive prior probability + result filtering | 0.93 ± 0.02 | 0.60 ± 0.16

Note: Median filter window size for result filtering was chosen to optimize Kappa.
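The two metrics reported in the tables, overall accuracy and Cohen's Kappa coefficient, can be computed per recording as in the following sketch. The function name is hypothetical, and returning a Kappa of 0.0 when the expected agreement equals 1 is an assumption made to avoid a division by zero.

```python
def accuracy_and_kappa(reference, predicted):
    """Compare epoch labels from human annotation with automated labels."""
    n = len(reference)
    observed = sum(1 for r, p in zip(reference, predicted) if r == p) / n
    classes = set(reference) | set(predicted)
    # Agreement expected by chance, from the label frequencies of both raters.
    expected = sum((reference.count(c) / n) * (predicted.count(c) / n)
                   for c in classes)
    if expected == 1:
        return observed, 0.0  # assumption: Kappa set to 0 when undefined
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

In a leave-one-subject-out cross validation, these values would be computed for each left-out infant and then averaged over infants, as indicated in the table captions.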

[0057] The sleep and wake classification results can be used for higher level interpretations, e.g. total time of infant in bed, total sleep time, total wake time, number of awakenings, and other infant sleep/wake statistics. As mentioned earlier, the invention also covers application of the proposed algorithm for the purpose of monitoring wake-sleep behavior of persons older than infants. The arrangement 100 according to the invention can be used in homes, hospitals and other settings. Various reasons for applying the invention are available, including a desire to obtain sleep quality information of a person and a desire to schedule caretaking actions during wake states as much as possible.
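Some of the higher level statistics mentioned above can be derived from the sequence of epoch classifiers as in the following sketch. The label coding (1 for wake, 0 for sleep), the definition of an awakening as a transition from a sleep epoch to a wake epoch, and the function name are assumptions.

```python
def sleep_statistics(labels, epoch_seconds=30):
    """Summarize a sequence of epoch labels (1 = wake, 0 = sleep)."""
    wake_epochs = sum(labels)
    sleep_epochs = len(labels) - wake_epochs
    # Assumption: an awakening is a sleep epoch directly followed by a wake epoch.
    awakenings = sum(1 for previous, current in zip(labels, labels[1:])
                     if previous == 0 and current == 1)
    return {
        "total_sleep_minutes": sleep_epochs * epoch_seconds / 60,
        "total_wake_minutes": wake_epochs * epoch_seconds / 60,
        "number_of_awakenings": awakenings,
    }
```

Applying the statistics to the filtered classifiers rather than the raw ones avoids counting isolated misclassified epochs as awakenings.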

[0058] It will be clear to a person skilled in the art that the scope of the invention is not limited to the examples discussed in the foregoing, but that several amendments and modifications thereof are possible without deviating from the scope of the invention as defined in the attached claims. It is intended that the invention be construed as including all such amendments and modifications insofar as they come within the scope of the claims or the equivalents thereof. While the invention has been illustrated and described in detail in the figures and the description, such illustration and description are to be considered illustrative or exemplary only, and not restrictive. The invention is not limited to the disclosed embodiments. The drawings are schematic, wherein details that are not required for understanding the invention may have been omitted, and are not necessarily to scale.

[0059] Variations to the disclosed embodiments can be understood and effected by a person skilled in the art in practicing the claimed invention, from a study of the figures, the description and the attached claims. In the claims, the word “comprising” does not exclude other steps or elements, and the indefinite article “a” or “an” does not exclude a plurality. Any reference signs in the claims should not be construed as limiting the scope of the invention.

[0060] Elements and aspects discussed for or in relation with a particular embodiment may be suitably combined with elements and aspects of other embodiments, unless explicitly stated otherwise. Thus, the mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

[0061] The term “comprise” as used in this text will be understood by a person skilled in the art as covering the term “consist of”. Hence, the term “comprise” may in respect of an embodiment mean “consist of”, but may in another embodiment mean “contain/include at least the defined species and optionally one or more other species”.

[0062] A possible summary of the invention reads as follows. For the purpose of deriving information about a person's sleep and wake states from a sequence of video frames, an arrangement comprising a video camera 10 and a processing unit 20 is used. The video camera 10 serves for capturing a sequence of video frames during a time period, and the processing unit 20 is configured to process video frames provided by the video camera 10 and to provide output representative of the person's sleep and wake states during the time period. In particular, the processing unit 20 is configured to execute an algorithm according to which (i) a motion value-time relation, (ii) sets of features relating to respective epochs in the motion value-time relation and (iii) classifiers of the respective epochs are determined, wherein the algorithm is further configured to apply a personalized adaptive prior probability, i.e. an adaptive prior probability determined for the particular person in dependence of the motion values of the respective epochs, to the classifiers.

CPC classification

A61B2503/045 (HUMAN NECESSITIES)

A61B5/1128 (HUMAN NECESSITIES)

A61B5/0033 (HUMAN NECESSITIES)

A61B5/4812 (HUMAN NECESSITIES)

A61B2576/00 (HUMAN NECESSITIES)

G16H50/20 (PHYSICS)

A61B5/4809 (HUMAN NECESSITIES)

A61B5/7264 (HUMAN NECESSITIES)

G16H30/40 (PHYSICS)

A61B5/7225 (HUMAN NECESSITIES)

International classification

A61B5/00 (HUMAN NECESSITIES)

A61B5/11 (HUMAN NECESSITIES)
