DEVICE, METHOD AND SYSTEM FOR REGISTERING A FIRST IMAGE FRAME AND A SECOND IMAGE FRAME
20220343517 · 2022-10-27
CPC classification
A61B5/02416 (HUMAN NECESSITIES)
Abstract
The present invention relates to a remote photoplethysmography device (150) for registering a first image frame (120) acquired by a first imaging unit (110) and a second image frame (140) acquired by a second imaging unit (130), both the first and the second image frames (120, 140) depicting a common region of interest (160), the remote photoplethysmography device (150) comprising a processing unit (190) configured to measure a first pixel displacement (200) between the first image frame (120) and the second image frame (140), to correct the first pixel displacement (200) according to spatial and/or temporal geometric constraints between the first imaging unit (110) and the second imaging unit (130), and to register the first image frame (120) with the second image frame (140) based on the corrected first pixel displacement (200).
Claims
1. A remote photoplethysmography device for registering a first image frame acquired by a first imaging unit and a second image frame acquired by a second imaging unit, both the first and the second image frames depicting a common region of interest, the remote photoplethysmography device comprising a processor configured to: measure a first pixel displacement between the first image frame and the second image frame; correct the first pixel displacement according to spatial and/or temporal geometric constraints between the first imaging unit and the second imaging unit; and register the first image frame with the second image frame based on the corrected first pixel displacement.
2. The remote photoplethysmography device according to claim 1, wherein the first image frame and the second image frame are acquired at a same point in time.
3. The remote photoplethysmography device according to claim 1, wherein the processor is configured to measure, as the first pixel displacement, a pixel-to-pixel displacement between pixels or a displacement between a group of pixels inside the region of interest.
4. The remote photoplethysmography device according to claim 1, wherein the processor is configured to measure the first pixel displacement based on a dense optical flow acquired for each individual pixel inside the region of interest or for a group of pixels inside the region of interest.
5. The remote photoplethysmography device according to claim 4, wherein the dense optical flow is based on one of the Lucas-Kanade flow, the Farnebäck flow, the Horn-Schunck flow, the block-matching flow, the deep-nets flow and/or the 3DRS flow.
6. The remote photoplethysmography device according to claim 1, wherein the processor is further configured to: analytically calculate a second pixel displacement based on the spatial geometric constraints and/or the temporal geometric constraints; and smooth the first pixel displacement by calculating a mean value of the first pixel displacement and the second pixel displacement.
7. The remote photoplethysmography device according to claim 1, wherein the processor is further configured to: analytically calculate a second pixel displacement based on the spatial geometric constraints and/or the temporal geometric constraints, detect outliers in the measured first pixel displacement by comparing said first pixel displacement with the second pixel displacement, and correct the first pixel displacement by rejecting the detected outliers.
8. The remote photoplethysmography device according to claim 1, wherein the processor is configured to downscale the first image frame and the second image frame and/or to upscale the first pixel displacement.
9. The remote photoplethysmography device according to claim 1, wherein the spatial geometric constraints are based on predetermined geometric constraints between the first imaging unit and the second imaging unit.
10. A remote photoplethysmography system comprising: a first imager configured to acquire a first image frame, a second imager spaced apart from the first imager and configured to acquire a second image frame, and a remote photoplethysmography device according to claim 1 for registering the first image frame and the second image frame.
11. The remote photoplethysmography system according to claim 10, wherein the first imager is a monochrome camera and/or a multi-spectrum camera and wherein the second imager is a monochrome camera and/or a multi-spectrum camera.
12. The remote photoplethysmography system according to claim 10, wherein the first imager is configured to acquire a first wavelength or wavelength range in the visible or infrared wavelength range, and wherein the second imager is configured to acquire a second wavelength or wavelength range, different from the first wavelength or wavelength range, in the visible or infrared wavelength range.
13. The remote photoplethysmography system according to claim 10, further comprising a health parameter extractor configured to extract vital signs of a subject based on the registered image frames.
14. A remote photoplethysmography method for non-linear registration of a first image frame acquired by a first imager and a second image frame acquired by a second imager, both the first and the second image frames depicting a common region of interest, the method comprising the steps of: measuring a first pixel displacement between the first image frame and the second image frame, wherein, as the first pixel displacement, a pixel-to-pixel displacement between pixels or a displacement between a group of pixels inside the region of interest is measured; correcting the first pixel displacement according to spatial and/or temporal geometric constraints between the first imager and the second imager; and registering the first image frame with the second image frame based on the corrected first pixel displacement.
15. A non-transitory computer-readable medium that stores therein a computer program product, which, when executed on a processor, causes the steps of the method as claimed in claim 14 to be performed.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0043] These and other embodiments of the invention will be apparent from and elucidated with reference to the embodiment(s) described hereinafter.
DETAILED DESCRIPTION OF EMBODIMENTS
[0054] The system 100 further comprises a device 150 for registering the first image frame 120 acquired by the first imaging unit 110 and the second image frame 140 acquired by the second imaging unit 130. The imaging units 110, 130 may also be referred to as camera-based or remote PPG sensors. Both the first image frame 120 and the second image frame 140 depict a common region of interest 160 of a subject 170. Both image frames 120, 140 include information used to determine physiological information indicative of at least one vital sign of the subject 170.
[0055] The subject 170 may be a patient, in this example a patient lying in a bed 180, e.g. in a hospital or other healthcare facility, but may also be a neonate or premature infant with very sensitive skin in a NICU, e.g. lying in an incubator, a patient with damaged (e.g. burned) skin, or a person at home or in a different environment.
[0056] There exist different embodiments of a device for registering image frames acquired by different imaging units depicting a common region of interest of a subject's body, which may be used alternatively (which is preferred) or in combination. In the system 100, one exemplary embodiment of the device 150 is shown and will be explained below.
[0057] In one embodiment of the system 100, the first imaging unit 110 is a first camera and the second imaging unit 130 is a second camera. Here, the first camera 110 is a monochrome camera and the second camera 130 is a multi-spectrum camera. In other embodiments, both the first and the second imaging unit 110, 130 may be a monochrome camera and/or a multi-spectrum camera. Preferably, the first camera 110 is configured to acquire a first wavelength (such as red light at 700 nm), a wavelength range (such as red light from 680 nm to 720 nm) or an infrared wavelength range (above 790 nm), whereas the second camera 130 is configured to acquire a second wavelength (such as green light at 550 nm) or wavelength range (such as green light from 530 nm to 570 nm). The second wavelength or wavelength range is preferably different from the first wavelength or wavelength range.
[0058] In other embodiments, the system may comprise more than two imaging units spaced apart from one another. For example, according to a preferred embodiment, the system may further comprise a third imaging unit directed at the common region of interest 160 of the subject 170. The third imaging unit is preferably configured to acquire a third image frame, preferably at a third wavelength or wavelength range which differs from the first and the second wavelength or wavelength range.
[0059] Both the first camera 110 and the second camera 130 preferably include a suitable photosensor for (remotely and unobtrusively) capturing image frames (such as the first image frame 120 and the second image frame 140) of the region of interest 160 of the subject 170, in particular for acquiring a sequence of image frames of the subject 170 over time, from which photoplethysmography signals can be derived. The image frames captured by the cameras 110, 130 may particularly correspond to a video sequence captured by means of an analog or digital photosensor, e.g. in a (digital) camera. Such cameras 110, 130 usually include a photosensor, such as a CMOS or CCD sensor, which may also operate in a specific spectral range (visible, IR) or provide information for different spectral ranges. The cameras 110, 130 may provide an analog or digital signal.
[0060] The image frames 120, 140 include a plurality of image pixels having associated pixel values. Particularly, the image frames 120, 140 include pixels representing light intensity values captured with different photosensitive elements of a photosensor. These photosensitive elements may be sensitive in a specific spectral range (i.e. representing a specific color or pseudo-color (in NIR)). The image frames 120, 140 include at least some image pixels being representative of a skin portion of the subject 170. Thereby, an image pixel may correspond to one photosensitive element of a photo-detector and its (analog or digital) output or may be determined based on a combination (e.g. through binning) of a plurality of the photosensitive elements.
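As a concrete illustration of binning, the following minimal sketch (not part of the patent; the function name and the 2x2 binning factor are assumptions) combines four neighbouring photosensitive elements into a single image pixel by summation:

```python
import numpy as np

def bin_2x2(raw: np.ndarray) -> np.ndarray:
    """Sum each 2x2 block of photosensitive elements into one image pixel.

    raw: HxW array of sensor readouts; H and W are assumed to be even.
    """
    h, w = raw.shape
    return raw.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))
```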
[0061] In some embodiments, the system 100 may further comprise a light source (also called illumination source), such as a lamp, for illuminating the region of interest 160, such as the skin of the subject's 170 face (e.g. part of the cheek or forehead), with light, for instance in predetermined wavelengths or wavelength ranges (e.g. in the red, green and/or infrared wavelength range(s)). The light reflected from said region of interest 160 in response to said illumination may be detected by the cameras 110, 130. In another embodiment, no dedicated light source is provided, but ambient light is used for illumination of the subject 170. From the reflected light, only light in a number of desired wavelength ranges (e.g. green and red or infrared light, or light in a sufficiently large wavelength range covering at least two wavelength channels) may be detected and/or evaluated by the cameras 110, 130. Therefore, the cameras 110, 130 may be equipped with optical filters which are preferably different, though their filter bandwidths may overlap. It is sufficient if their wavelength-dependent transmission is different.
[0062] The device 150 according to one aspect of the invention comprises a processing unit 190, which may be a processor of a computational device, a system-on-a-chip or any other suitable unit for data processing.
[0063] The processing unit 190 is configured to measure a first pixel displacement 200 between the first image frame 120 and the second image frame 140 and thereby executes step S100 of the method according to another aspect of the present invention.
[0064] In order to monitor health-related parameters based on the registered image frames, information included in the image frames, namely pixel-based or group-of-pixels-based color or grayscale information, may be extracted from the image frames in time sequence. To this end, a health parameter extraction unit 210 may be connected to the device 150 via one or more cables or wirelessly, or may be integrated in the device 150. The health parameter extraction unit 210 is preferably configured to extract one or more health-related parameters from registered successive image frames (such as the image frames 120, 140).
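By way of illustration only (the patent does not fix the extraction algorithm; the function and parameter names below are assumptions), such a unit might spatially average the skin pixels inside the region of interest in each registered frame and each spectral channel, yielding time signals from which vital signs such as pulse rate can be derived:

```python
import numpy as np

def extract_ppg_traces(registered_frames, roi_mask):
    """Spatially average skin pixels per frame and per channel.

    registered_frames: iterable of HxWxC frames, one per time step.
    roi_mask: HxW boolean mask of skin pixels inside the region of interest.
    Returns a TxC array: one intensity trace per spectral channel.
    """
    traces = [frame[roi_mask].mean(axis=0) for frame in registered_frames]
    return np.asarray(traces)
```

A pulse rate could then, for example, be estimated from the dominant frequency of such a trace within the physiological band.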
[0065] A system 100 as illustrated above may be used for contactless monitoring of the vital signs of a subject 170, e.g. in a hospital or at home.
[0066] In general, contactless monitoring may be more convenient than monitoring with contact sensors, which are still used in general wards or at triage in emergency departments. In addition, such contactless monitoring may be applicable to monitoring of automotive drivers as well as to sleep monitoring, where for the latter especially NIR-based monitoring, preferably multi-spectral NIR-based monitoring, may be applied to improve the robustness of vital signs extraction.
[0068] In the following, the nonlinear adaptive image registration is explained in detail.
[0069] Typically, the image frames acquired by the central camera are taken as the reference image frames. The first pixel displacement 200 is then measured as the dense optical flow between the reference and the non-reference image frames:
$D = \mathrm{DOF}(I_{\mathrm{ref}}, I_{\mathrm{nonref}}),$

where $\mathrm{DOF}(\cdot)$ denotes the dense optical flow, $I_{\mathrm{ref}}$ the reference image frame (i.e. image frame 120), $I_{\mathrm{nonref}}$ the non-reference image frames (i.e. image frames 140, 230) and $D$ the first pixel displacement 200. $D$ is used to correlate/interpolate the non-reference image frames 140, 230 in order to register them with the reference image frame 120:

$I_{\mathrm{reg}} = \mathrm{Interp}(I_{\mathrm{nonref}}, D),$

where $\mathrm{Interp}(\cdot)$ denotes the interpolation/correlation and $I_{\mathrm{reg}}$ the registered image. The pixel-based interpolation is a highly nonlinear image transformation, and the dense optical flow measurement and interpolation are performed for each individual image frame. Thus, the registration is adaptive to the video content and robust to scenes with depth changes or object position changes (e.g., distance-to-camera changes) during monitoring.
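A minimal sketch of this measure-and-interpolate step, assuming grayscale input frames and using OpenCV's Farnebäck dense optical flow (one of the variants listed in claim 5) as the $\mathrm{DOF}(\cdot)$ operator; all parameter values are illustrative:

```python
import cv2
import numpy as np

def register_frame(i_ref, i_nonref):
    """Warp i_nonref onto i_ref using dense optical flow (grayscale uint8)."""
    # D = DOF(I_ref, I_nonref): per-pixel displacement field (HxWx2)
    flow = cv2.calcOpticalFlowFarneback(
        i_ref, i_nonref, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

    # I_reg = Interp(I_nonref, D): sample the non-reference frame at the
    # displaced coordinates so the result is aligned with the reference.
    h, w = i_ref.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(i_nonref, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```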
[0070] Furthermore, according to the present invention, the first pixel displacement is corrected according to spatial and/or temporal geometric constraints. These constraints are applied as a post-processing step on the "raw" dense optical flow results described above. Since the setup (e.g., camera position) is preferably fixed during the measurement, the parallax-induced pixel displacement across the cameras 110, 130, 220 depends on predefined geometric relationships (e.g., epipolar geometry). Epipolar geometry describes, for example, the geometry of stereo vision, wherein two cameras view a 3D scene from two distinct positions. In such a setup, there are a number of geometric relations between 3D points and their projections onto the 2D image frames, leading to constraints between these image points. These relations may be derived based on the assumption that the cameras can be approximated by a pinhole camera model. Such relationships may be used as spatial geometric constraints to smooth the measurement of the first pixel displacement 200 or to reject outliers.
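For instance, a flow vector whose endpoint strays too far from the corresponding epipolar line can be flagged as an outlier. A minimal sketch, under the assumption that a fundamental matrix F is available from the fixed camera setup (function name and threshold are illustrative):

```python
import numpy as np

def epipolar_outliers(pts1, pts2, F, thresh=1.0):
    """Flag correspondences violating the epipolar constraint x2^T F x1 = 0.

    pts1, pts2: Nx2 corresponding points (e.g., pixel positions and their
    positions displaced by the dense optical flow); F: 3x3 fundamental matrix.
    Returns a boolean mask of outliers (distance to epipolar line > thresh).
    """
    ones = np.ones((len(pts1), 1))
    x1 = np.hstack([pts1, ones])   # homogeneous coordinates in image 1
    x2 = np.hstack([pts2, ones])   # homogeneous coordinates in image 2
    lines = x1 @ F.T               # epipolar lines in image 2: l' = F x1
    # point-to-line distance |l'^T x2| / sqrt(a^2 + b^2)
    num = np.abs(np.sum(lines * x2, axis=1))
    den = np.hypot(lines[:, 0], lines[:, 1])
    return (num / den) > thresh
```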
[0071] Based on the spatial geometric constraint(s), a second pixel displacement can be analytically calculated. Based on the second pixel displacement, the first pixel displacement 200 may preferably be smoothed by calculating a mean value of the first pixel displacement 200 and the second pixel displacement. In other embodiments, the second pixel displacement may be used to detect outliers in the measured first pixel displacement 200 by comparing said first pixel displacement 200 with the second pixel displacement, and to correct the first pixel displacement 200 by rejecting the detected outliers.
[0072] In an example with three cameras 110, 130, 220, the spatial geometric constraint yields:

$D_{1\rightarrow 2}$ (measured solution),

$D'_{1\rightarrow 2} = D_{1\rightarrow 3} - D_{2\rightarrow 3}$ (analytic solution),

$D''_{1\rightarrow 2} = (D_{1\rightarrow 2} + D'_{1\rightarrow 2})/2$ (smoothed solution),

where $D_{1\rightarrow 2}$, $D_{2\rightarrow 3}$ and $D_{1\rightarrow 3}$ denote the (group of) pixel displacements from camera 110 to 130, from camera 130 to 220, and from camera 110 to 220, respectively. Thereby, $D_{1\rightarrow 2}$ is the solution measured by dense optical flow, $D'_{1\rightarrow 2}$ is the analytic solution deduced from $D_{1\rightarrow 3}$ and $D_{2\rightarrow 3}$, and $D''_{1\rightarrow 2}$ is the smoothed solution, i.e. the mean value of $D_{1\rightarrow 2}$ and $D'_{1\rightarrow 2}$. It should be noted that further metrics (not described here) may indicate that the "measured" or the "analytic" solution is more appropriate than the "smoothed" solution. Furthermore, it is possible to use $D'_{1\rightarrow 2}$ to reject measurement outliers in $D_{1\rightarrow 2}$.
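A sketch of this correction, assuming all three displacement fields are given as HxWx2 arrays (e.g., from the dense optical flow sketched above) and that detected outliers are replaced by the analytic solution, which is one possible rejection strategy; the threshold is an assumption:

```python
import numpy as np

def correct_displacement(d_12_measured, d_13, d_23, outlier_thresh=3.0):
    """Smooth D_1->2 with the analytic solution and reject outliers.

    d_12_measured, d_13, d_23: HxWx2 dense displacement fields.
    """
    # Analytic solution: D'_1->2 = D_1->3 - D_2->3
    d_12_analytic = d_13 - d_23

    # Smoothed solution: D''_1->2 = (D_1->2 + D'_1->2) / 2
    d_12_smoothed = 0.5 * (d_12_measured + d_12_analytic)

    # Outlier rejection: where measured and analytic flow disagree strongly
    # (in pixels), fall back to the analytic solution.
    err = np.linalg.norm(d_12_measured - d_12_analytic, axis=-1)
    outliers = err > outlier_thresh
    d_12_smoothed[outliers] = d_12_analytic[outliers]
    return d_12_smoothed
```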
[0073] Similar to the use of spatial geometric constraints, temporal geometric constraints may be used to smooth the measurement of the first pixel displacement 200:

$D_{1\rightarrow 2}$ (measured solution),

$D'_{1\rightarrow 2} = T_{2\rightarrow 1} - T_{2\rightarrow 2}$ (analytic solution),

$D''_{1\rightarrow 2} = (D_{1\rightarrow 2} + D'_{1\rightarrow 2})/2$ (smoothed solution),

where $T_{2\rightarrow 1}$ denotes the pixel displacement from the second camera 130 at time t to the first camera 110 at time t+1 and $T_{2\rightarrow 2}$ the pixel displacement from the second camera 130 at time t to the second camera 130 at time t+1.
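The temporal constraint can be applied with the same helper as in the spatial sketch above, substituting the temporal flows for $D_{1\rightarrow 3}$ and $D_{2\rightarrow 3}$ (variable names and frame size are assumptions):

```python
import numpy as np

h, w = 480, 640  # example frame size
# Dense flow fields, e.g., measured with the Farnebäck flow sketched above
# (zeros here only as placeholders for measured data):
d_12_measured = np.zeros((h, w, 2), np.float32)  # camera 110 -> 130 at t+1
t_21 = np.zeros((h, w, 2), np.float32)  # camera 130 at t -> camera 110 at t+1
t_22 = np.zeros((h, w, 2), np.float32)  # camera 130 at t -> camera 130 at t+1

# D'_1->2 = T_2->1 - T_2->2, then smoothed exactly as in the spatial case,
# reusing correct_displacement() from the previous sketch:
d_12_corrected = correct_displacement(d_12_measured, t_21, t_22)
```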
[0074] Preferably, spatial and temporal geometric constraints are applied simultaneously.
[0080] While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive; the invention is not limited to the disclosed embodiments. Other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure, and the appended claims.
[0081] In the claims, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
[0082] A computer program may be stored/distributed on a suitable non-transitory medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
[0083] Any reference signs in the claims should not be construed as limiting the scope.