Determination of the image depth map of a scene
09854222 · 2017-12-26
Assignee
- Commissariat A L'energie Atomique Et Aux Energies Alternatives (Paris, FR)
- Centre National De La Recherche Scientifique (Paris, FR)
Inventors
CPC classification
H04N9/03
ELECTRICITY
H04N2013/0081
ELECTRICITY
International classification
Abstract
A method for estimating the image depth map of a scene includes the following steps: providing (E1) an image, the focus of which depends on the depth and wavelength of the considered object points of the scene, using a longitudinal chromatic optical system; determining (E2) a set of spectral images from the image provided by the longitudinal chromatic optical system; deconvoluting (E3) the spectral images to provide estimated spectral images with extended depth of field; and analyzing (E4) a cost criterion depending on the estimated spectral images with extended depth of field, to provide an estimated depth map.
Claims
1. A method for estimating an image depth map of a scene, which comprises: providing an image in which the focus depends on a depth and wavelength of considered object points of the scene, using a longitudinal chromatic optical system, determining a set of spectral images from the image provided by the longitudinal chromatic optical system, deconvoluting the spectral images in order to provide estimated spectral images with extended depth of field, and analyzing a cost criterion that is dependent on the estimated spectral images with extended depth of field, in order to provide an estimated depth map {circumflex over (Z)}(x,y) of the actual depth Z(x,y) of the scene, where x and y are coordinates of the actual scene.
2. The method according to claim 1, wherein the step of determining a set of spectral images comprises the steps of: receiving the image formed by the longitudinal chromatic optical system and forming a mosaic image, using an image sensor equipped with a plurality of spectral filters, demosaicing the mosaic image to provide a set of filtered images, and performing spectral estimation to receive the set of filtered images and provide the set of spectral images.
3. The method according to claim 2, wherein the step of analyzing a cost criterion comprises: forming a mosaic image reconstructed from the estimated spectral images with extended depth of field of the scene, and estimating the depth at each point of the image by minimizing a cost criterion that is dependent on a point-to-point squared difference between the mosaic image formed and the mosaic image reconstructed from estimated spectral images with extended depth of field of the scene, for depths in a predetermined depth domain.
4. The method according to claim 1, further comprising a step of estimating a set of spectral images with extended depth of field of the scene, for the estimated depth map provided in the analysis step.
5. The method according to claim 4, wherein estimating the set of spectral images with extended depth of field of the scene comprises selecting, by iterating on the depth, values of spectral images with extended depth of field provided in the deconvolution step for which the considered depth corresponds to the estimated depth map provided in the analysis step.
6. The method according to claim 2, wherein the spectral estimation step and the spectral image deconvolution step are combined into a single step.
7. The method according to claim 2, wherein the demosaicing step, the spectral estimation step, and the spectral image deconvolution step are combined into a single step, and wherein this step uses images from a database of images to calculate a transfer matrix for converting between the space of the mosaic image and the space of the spectral images.
8. A device for estimating the image depth map of a scene, which comprises: a longitudinal chromatic optical system for providing an image in which the focus depends on the depth and wavelength of the considered object points of the scene; a spectral image sensor for receiving the image provided by the longitudinal chromatic optical system and for providing a set of spectral images; and a computer, the computer being configured to deconvolute the spectral images in order to provide estimated spectral images with extended depth of field, and analyze a cost criterion that is dependent on the estimated spectral images with extended depth of field, in order to provide an estimated depth map {circumflex over (Z)}(x,y) of the actual depth Z(x,y) of the scene, where x and y are coordinates of the actual scene.
9. The device according to claim 8, wherein the spectral image sensor comprises: an image sensor equipped with a plurality of spectral filters for receiving the image provided by the longitudinal chromatic optical system and for providing a mosaic image, and wherein the computer demosaics the mosaic image to provide a set of filtered images, and performs spectral estimation on the set of filtered images to provide the set of spectral images.
10. The device according to claim 9, wherein the computer analyzing the cost criterion is additionally configured to: form a mosaic image reconstructed from the estimated spectral images with extended depth of field of the scene, and estimate the depth at each point in the image by minimizing a cost criterion that is dependent on a point-to-point squared difference between the mosaic image formed and the mosaic image reconstructed from the estimated spectral images with extended depth of field of the scene, for depths in a predetermined depth domain.
11. The device according to claim 8, wherein the computer further estimates a set of spectral images with extended depth of field of the scene, for the estimated depth map provided by the analysis module.
12. The device according to claim 11, wherein the computer is adapted to select, by iterating on the depth, values of spectral images with extended depth of field provided by the deconvolution module for which the considered depth corresponds to the estimated depth map provided by the analysis module.
13. The device according to claim 9, wherein the spectral estimation and the spectral image deconvolution are combined.
14. The device according to claim 9, wherein the demosaicing, the spectral estimation and the spectral image deconvolution are combined to use images from a database of images to calculate a transfer matrix for converting between the space of the mosaic image and the space of the spectral images.
15. A computer program stored on a non-transitory storage medium, comprising instructions for executing the steps of the method according to claim 1 when said program is executed by a computer.
16. A computer-readable non-transitory storage medium on which is stored a computer program comprising instructions for executing the steps of the method according to claim 1.
17. The method according to claim 2, further comprising a step of estimating a set of spectral images with extended depth of field of the scene, for the estimated depth map provided in the analysis step.
18. The method according to claim 3, further comprising a step of estimating a set of spectral images with extended depth of field of the scene, for the estimated depth map provided in the analysis step.
19. The method according to claim 3, wherein the spectral estimation step and the spectral image deconvolution step are combined into a single step.
20. The method according to claim 3, wherein the demosaicing step, the spectral estimation step, and the spectral image deconvolution step are combined into a single step, and wherein this step uses images from a database of images to calculate a transfer matrix for converting between the space of the mosaic image and the space of the spectral images.
Description
BRIEF DESCRIPTION OF DRAWINGS
(1) Other features and advantages will become apparent upon reading a preferred embodiment given by way of non-limiting example, described below with reference to the figures.
DETAILED DESCRIPTION OF SPECIFIC EMBODIMENTS
(7) According to a preferred embodiment described with reference to
(8) Such a system has the characteristic of maintaining the longitudinal chromatic aberration and minimizing the other aberrations.
(9) The longitudinal chromatic aberration is due to the presence in the optical system of optical elements whose materials have refractive indices dependent on the wavelength in vacuum of the ray of light passing through them. The focal length of the optical system 1 therefore depends on the wavelength that traverses it: the focal length increases with the wavelength, and is thus greater for red than for blue.
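By way of illustration only (not part of the patent), this wavelength dependence of the focal length can be sketched with the thin-lens lensmaker's equation and a Cauchy dispersion model; the glass constants and radii of curvature below are illustrative assumptions:

```python
def refractive_index(wavelength_um, A=1.5046, B=0.00420):
    """Cauchy dispersion model n(lambda) = A + B / lambda^2 (BK7-like constants):
    the index decreases as the wavelength grows."""
    return A + B / wavelength_um ** 2

def focal_length(wavelength_um, R1=0.10, R2=-0.10):
    """Thin-lens lensmaker's equation 1/f = (n - 1)(1/R1 - 1/R2), radii in meters."""
    n = refractive_index(wavelength_um)
    return 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2))

# The index is lower for red than for blue, so the focal length is greater for red.
f_blue = focal_length(0.450)
f_red = focal_length(0.650)
```

Because n(650 nm) < n(450 nm), the computed f_red exceeds f_blue, reproducing the longitudinal chromatic behavior described above.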
(10) In other words, considering a given object point and its image through the longitudinal chromatic optical system, the focal distance and the image defocus are dependent on the depth and on the wavelength in vacuum of the considered object point. The given object point has a spectral signature in terms of image focus that is specific to the fixed properties of the system and to the distance of the object point relative to the optical system.
(11) It should be noted that the longitudinal chromatic optical system has no transverse or lateral chromatic aberrations, and therefore no perspective distortion related to spectral variation of the lens. The focal length only varies longitudinally, as a function of the wavelength. Thus, a given object point corresponds to the same image point regardless of the wavelength.
(12) When the longitudinal chromatic optical system 1 is used to capture an image of a scene SC, it carries out a spectral modulation of the image according to the depth and wavelength of the object point corresponding to the image point considered.
(13) The longitudinal chromatic optical system 1 has an output connected to an input of a spectral image sensor 2 to which it delivers the result of the spectral modulation of the image.
(14) From this input, the spectral image sensor 2 comprises a matrix image sensor 21 on which is placed a color filter array 22. The color filters may be colored resins, but a person skilled in the art may use other known filters such as interference filters or nanostructured filters. The spatial position and the spectral response of each color filter are optimized to facilitate reconstruction of the desired spectral images. The matrix image sensor 21 and the color filter array 22 allow estimating the spectral composition of the scene in a single image capture.
(15) The matrix image sensor 21 and the color filter array 22 output a mosaic image I.sub.CFA.sup.Z*(x,y)(x,y).
(16) The actual spectral images of the scene, denoted EDOF.sub.S(x,y,λ), are geometric projections on the plane of the sensor 21 of rays from the scene concerned.
(17) Considering a coordinate system of the image sensor 21, one can mathematically express the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) as a function of the actual spectral images of the observed scene, the characteristics of the longitudinal chromatic optical system, the characteristics of the color filter array, and on additive noise, according to the formula:
(18) I.sub.CFA.sup.Z*(x,y)(x,y)=Σ.sub.i=1.sup.N.sup.f m.sub.i(x,y)·∫h.sub.i(λ)·[PSF(x,y,Z*(x,y),λ){circle around (×)}.sub.(x,y)EDOF.sub.S(x,y,λ)]dλ+n(x,y)
(19) where: (x,y) are the coordinates of a point in the reference system associated with the image sensor, λ is the wavelength, Z*(x,y) is the actual depth corresponding to the point having coordinates (x,y), EDOF.sub.S are the actual spectral images of the scene, and EDOF.sub.S(x,y,λ) is the value of the spectral image corresponding to the point having coordinates (x,y) for wavelength λ, PSF (Point Spread Function) is the impulse response of the longitudinal chromatic optical system and PSF(x,y,Z*(x,y),λ) is the value of this impulse response at the point having coordinates (x,y), for the actual depth Z*(x,y) and for wavelength λ, h.sub.i(λ) is the spectral transmission of the i.sup.th color filter, for wavelength λ, m.sub.i(x,y) is the spatial position of the i.sup.th color filter, n(x, y) is the value of the additive noise at the point having coordinates (x,y), {circle around (×)}.sub.(x,y) is the convolution operation considered at the point having coordinates (x,y), N.sub.f is the number of color filters.
(20) The same notations are retained throughout the description.
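As a hedged numerical illustration (not part of the patent text), the image-formation formula above can be sketched on a discretized wavelength axis; the array shapes and the circular-convolution shortcut are assumptions of this sketch:

```python
import numpy as np

def _conv2(img, ker):
    """Circular 2-D convolution via FFT (a simplification adequate for this sketch)."""
    kh, kw = ker.shape
    pad = np.zeros_like(img, dtype=float)
    pad[:kh, :kw] = ker
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))  # center the kernel
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(pad)))

def form_mosaic(edof_s, psf, h, m, noise=None):
    """Discrete version of the formation model: for each filter i, blur each spectral
    plane with the wavelength-dependent PSF, weight by the filter transmission h_i,
    sum over wavelength, and keep the pixels selected by the spatial mask m_i.

    edof_s : (H, W, Nl) spectral images, psf : (k, k, Nl) PSF per wavelength sample,
    h : (Nf, Nl) filter transmissions, m : (Nf, H, W) binary filter masks.
    """
    Nf, Nl = h.shape
    out = np.zeros(edof_s.shape[:2])
    for i in range(Nf):
        acc = np.zeros(edof_s.shape[:2])
        for l in range(Nl):
            acc += h[i, l] * _conv2(edof_s[:, :, l], psf[:, :, l])
        out += m[i] * acc
    return out if noise is None else out + noise
```

With a single filter of unit transmission, a full mask, and a delta PSF, the mosaic reduces to the spectral image itself, which provides a quick sanity check of the model.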
(21) The output of the color filter array 22 is connected to an input of a demosaicing module 23. The color filter array 22 delivers the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) to said module.
(22) The demosaicing module 23 estimates Nf images filtered by means of a filter D.sub.M which is applied to the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y), where Nf is the number of color filters in the array 22. The filtered images are at the resolution of the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) in the reference system associated with the image sensor.
(23)
(24) The calculation module determines a linear matrix filter D.sub.M that is optimal in the least squares sense. The optimal linear matrix filter D.sub.M is determined by minimizing the squared difference between the reference filtered images and the filtered images estimated by applying the filter D.sub.M to the corresponding mosaic image.
(25) The optimal linear matrix filter D.sub.M enables transfer from the space of the mosaic images to the space of the corresponding filtered images.
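A minimal sketch of such a least-squares learning step follows (not the patent's implementation: the flattening convention and the small ridge term added for numerical stability are assumptions of this sketch):

```python
import numpy as np

def learn_demosaic_filter(mosaics, filtered_refs, reg=1e-8):
    """Learn a linear demosaicing operator D_M in the least-squares sense.

    mosaics       : (T, P)       T reference mosaic images flattened to P pixels
    filtered_refs : (T, Nf * P)  corresponding reference filtered images, flattened
    Returns D_M of shape (Nf * P, P) minimizing the squared difference between the
    reference filtered images and D_M applied to the mosaics.
    """
    G = mosaics.T          # (P, T)
    F = filtered_refs.T    # (Nf*P, T)
    # ridge-regularized normal equations: D_M = F G^T (G G^T + reg I)^{-1}
    gram = G @ G.T + reg * np.eye(G.shape[0])
    return F @ G.T @ np.linalg.inv(gram)
```

Once learned, `learn_demosaic_filter(...) @ mosaic.ravel()` transfers a new mosaic image to the space of the filtered images, matching the role described for D.sub.M above.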
(26) Thus, the demosaicing module 23 allows transfer from the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) to filtered images. The filtered image SOC.sub.f.sub.i(x,y) estimated for the i.sup.th color filter can then be written as the sum of the actual filtered image and a noise term n(x,y).
(27) This formula assumes that these filtered images are given by the sum of the actual filtered images (left term) and a noise term (right term). The estimation errors related to demosaicing are thus contained in this additional noise term. This term is not strictly identical to the noise term in the equation of the mosaic image considered (although the notation is the same); however, the demosaicing algorithm used is assumed to induce in each filtered image a noise term whose statistical properties are identical to those of the noise of the mosaic image. It is this preservation of statistical properties which is exploited here.
(28) The demosaicing thus results in a set of Nf filtered images SOC.sub.f.sub.i(x,y), for i=1 to Nf.
(29) The output of the demosaicing module 23 is connected to an input of a spectral estimation module 24. The demosaicing module 23 delivers the Nf filtered images SOC.sub.f.sub.i(x,y) to said module.
(30) The spectral estimation module 24 determines Nλ spectral images by means of a linear matrix inversion filter F.sub.S applied to the Nf filtered images. The operation of module 24 is described in detail below.
(31) Each spectral image relates to a wavelength sample. The number Nλ of spectral images depends on the range of wavelengths of the spectral imaging system and on the size of a spectral sample.
(32) In one particular embodiment, the number of filters in the filter array is equal to the number of spectral images desired. In this case, the spectral images are the images filtered by the demosaicing module. In other words, the demosaicing directly provides the spectral image, and spectral estimation is not necessary.
(33) The output of the spectral estimation module 24 is connected to an input 25 of a deconvolution module. The spectral estimation module 24 delivers the Nλ spectral images to module 25.
(34) Module 25 applies deconvolution by means of a linear matrix inversion filter C.sup.Z.sup.t applied to the Nλ spectral images, for a test depth Z.sup.t.
(35) The deconvolution module 25 determines estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) for the test depth Z.sup.t.
(36) The output from the deconvolution module 25 is connected to an input of a module 26 for estimating the actual depth map of the scene considered. The deconvolution module 25 delivers the estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) to module 26.
(37) The output of the color filter array 22 is also connected to an input of module 26. The array 22 delivers the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) to module 26.
(38) Module 26 determines an estimation {circumflex over (Z)}(x,y) of the actual depth map Z(x,y) of the scene considered, based on a minimization of a cost criterion calculated from the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) provided by the array 22 and from the Nλ estimated spectral images EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) provided by the deconvolution module 25.
(39) An output of module 26 is connected to an input of the deconvolution module 25, to provide it with the test depth Z.sup.t.
(40) The operation of module 26 is detailed below.
(41) The output of the deconvolution module 25 is connected to an input of an adaptive pixel selection module 27. The deconvolution module 25 supplies the Nλ estimated spectral images EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) to module 27.
(42) The output of the module 26 for estimating the actual depth map of the scene is connected to an input of the adaptive pixel selection module 27. Module 26 supplies the estimation {circumflex over (Z)}(x,y) of the actual depth map of the scene to module 27.
(43) The adaptive pixel selection module 27 determines Nλ estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.{circumflex over (Z)}(x,y)(x,y,λ) adapted to the estimated depth map {circumflex over (Z)}(x,y). The operation of module 27 is detailed below.
(44) In a preferred embodiment, the spectral estimation module 24 and the spectral image deconvolution module 25 are combined into a single module.
(45) In this case, a test depth domain is considered. For a given test depth Z.sup.t within this domain, Nλ estimated spectral images EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) are estimated directly from the Nf filtered images SOC.sub.f.sub.i(x,y).
(46) This embodiment is based on the assumption that the object scene is planar, perpendicular to the optical axis of the system used, meaning that the object depth is independent of the coordinates (x,y).
(47) We thus obtain the following linear matrix representation:
G.sub.i.sup.Z*=H.sub.i.sup.Z*·S+N
(48) where: G.sub.i.sup.Z* is the matrix representation of the Fourier transform of the filtered image SOC.sub.f.sub.i(x,y), H.sub.i.sup.Z* is the matrix representation of the corresponding transfer model (combining the impulse response of the optical system at the actual depth Z* and the transmission of the i.sup.th color filter), S is the matrix representation of the Fourier transform of the spectral images of the scene, and N is the matrix representation of the noise.
(49) A least-squares estimate of matrix S is then performed in modules 24 and 25 considered together, according to the formula:
(50) Ŝ.sup.Z.sup.t=[(H.sup.Z.sup.t).sup.H·H.sup.Z.sup.t+σ.sup.2·Id].sup.−1·(H.sup.Z.sup.t).sup.H·G
(51) where: Z.sup.t is a given depth in the test domain, Ŝ.sup.Z.sup.t is the least-squares estimate of matrix S for the test depth Z.sup.t, H.sup.Z.sup.t is the model matrix for the test depth Z.sup.t, .sup.H denotes the conjugate transpose, σ.sup.2 is a regularization term related to the noise, and Id is an identity matrix.
(52) The estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) are then obtained from the estimate Ŝ.sup.Z.sup.t by inverse Fourier transform.
(53) In the case where the operations of modules 24 and 25 are carried out simultaneously, the matrix term, expressed in Fourier space,
(54) [(H.sup.Z.sup.t).sup.H·H.sup.Z.sup.t+σ.sup.2·Id].sup.−1·(H.sup.Z.sup.t).sup.H
represents a linear matrix inversion filter equivalent to the cascade of filters F.sub.S·C.sup.Z.sup.t.
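A minimal numerical sketch of such a regularized least-squares inversion follows (the regularization weight and matrix shapes are assumptions of this sketch, not values from the patent):

```python
import numpy as np

def invert_least_squares(G, H_model, reg=1e-3):
    """Estimate the spectral matrix S from the linear model G = H·S + N by
    Tikhonov-regularized least squares: S_hat = (H^H H + reg·Id)^{-1} H^H G.

    G       : (Nf, T)  observed (Fourier-domain) filtered-image coefficients
    H_model : (Nf, Nl) model matrix for the test depth considered
    """
    HH = H_model.conj().T
    Id = np.eye(H_model.shape[1])
    return np.linalg.solve(HH @ H_model + reg * Id, HH @ G)
```

With `reg` set to zero and an invertible model matrix, the estimate recovers S exactly; the regularization term keeps the inversion stable when the model is ill-conditioned.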
(55) A mosaic image is reconstructed in module 26 from the estimated spectral images and for test depth Z.sup.t, according to the formula:
(56) I.sub.CFA.sup.Z.sup.t(x,y)=Σ.sub.i=1.sup.N.sup.f m.sub.i(x,y)·∫h.sub.i(λ)·[PSF(x,y,Z.sup.t,λ){circle around (×)}.sub.(x,y)EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ)]dλ
(57) The pixel-to-pixel squared difference between the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) determined by module 22 and the reconstructed mosaic image I.sub.CFA.sup.Z.sup.t(x,y) is calculated for each test depth Z.sup.t of the predetermined depth domain, giving the cost criterion:
W.sup.Z.sup.t(x,y)=[I.sub.CFA.sup.Z*(x,y)(x,y)−I.sub.CFA.sup.Z.sup.t(x,y)].sup.2
The depth estimated at the point having coordinates (x,y) is the test depth Z.sup.t that minimizes this criterion.
(58) The set of estimated depth values, for all positions (x,y) of the image, gives the estimated depth map {circumflex over (Z)}(x,y).
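The per-pixel minimization just described can be sketched as follows (a hedged illustration; the stacking of reconstructed mosaics per test depth is an assumption of this sketch):

```python
import numpy as np

def estimate_depth_map(mosaic, reconstructed, depths):
    """Estimated depth map: for each pixel, the test depth whose reconstructed
    mosaic best matches the captured one in the squared-difference sense.

    mosaic        : (H, W)      captured mosaic image
    reconstructed : (Nd, H, W)  mosaic reconstructed for each of the Nd test depths
    depths        : (Nd,)       test depth values of the predetermined depth domain
    """
    cost = (reconstructed - mosaic[None]) ** 2          # squared-difference criterion
    return np.asarray(depths)[np.argmin(cost, axis=0)]  # per-pixel argmin over depths
```

The argmin is taken independently at each pixel, so the result is a full depth map rather than a single scalar depth.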
(59) Module 27 then determines the Nλ estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.{circumflex over (Z)}(x,y)(x,y,λ) by selecting, iterating on the depth Z.sup.t, only the values of estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) for which the considered test depth Z.sup.t corresponds to the estimated depth map {circumflex over (Z)}(x,y).
(60) In another preferred embodiment, the demosaicing module 23, the spectral estimation module 24, and the spectral image deconvolution module 25 are combined into a single module.
(61) This embodiment is based on the fact that it is possible to find a law of transfer from the mosaic image space to another space, here the space of spectral images with extended depth of field, in the form of a linear matrix representation.
(62) As represented in
(63) This embodiment is based on the assumption that the scene is a plane in the object space, perpendicular to the optical axis of the system.
(64) The transfer matrix D.sub.EDOF.sub.
(65)
(66) where: Id represents an identity matrix of the same size as that of matrix
(67)
(68) The transfer matrix D.sub.EDOF.sub.S.sup.Z.sup.t enables transfer from the space of the mosaic image to the space of the spectral images with extended depth of field, for the test depth Z.sup.t.
(69) Thus a plurality of transfer matrices D.sub.EDOF.sub.S.sup.Z.sup.t is calculated, one for each test depth Z.sup.t of the test depth domain.
(70) The estimated spectral image with extended depth of field for depth Z.sup.t is obtained by matrix multiplication of matrix D.sub.EDOF.sub.S.sup.Z.sup.t with the mosaic image, according to the formula:
EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ)=D.sub.EDOF.sub.S.sup.Z.sup.t·I.sub.CFA.sup.Z*(x,y)(x,y)
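As a hedged sketch of this single-multiplication transfer (the flattening convention is an assumption of this sketch, not specified in the text):

```python
import numpy as np

def edof_from_transfer_matrix(d_edof, mosaic):
    """Apply a precomputed transfer matrix for one test depth: one matrix multiply
    maps the flattened mosaic image to the Nl flattened EDOF spectral images.

    d_edof : (Nl * P, P) transfer matrix, mosaic : (H, W) with P = H * W pixels.
    Returns spectral images of shape (Nl, H, W).
    """
    H_, W_ = mosaic.shape
    P = H_ * W_
    Nl = d_edof.shape[0] // P
    return (d_edof @ mosaic.ravel()).reshape(Nl, H_, W_)
```

Because the whole chain (demosaicing, spectral estimation, deconvolution) is folded into one matrix per test depth, the per-depth cost at run time is a single matrix-vector product.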
(71) As in the first embodiment, a mosaic image is reconstructed in module 26 from estimated spectral images and for test depth Z.sup.t, according to the formula:
(72) I.sub.CFA.sup.Z.sup.t(x,y)=Σ.sub.i=1.sup.N.sup.f m.sub.i(x,y)·∫h.sub.i(λ)·[PSF(x,y,Z.sup.t,λ){circle around (×)}.sub.(x,y)EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ)]dλ
(73) The pixel-to-pixel squared difference between the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) determined by module 22 and the reconstructed mosaic image I.sub.CFA.sup.Z.sup.t(x,y) is calculated for each test depth Z.sup.t of the predetermined depth domain, giving the cost criterion:
W.sup.Z.sup.t(x,y)=[I.sub.CFA.sup.Z*(x,y)(x,y)−I.sub.CFA.sup.Z.sup.t(x,y)].sup.2
The depth estimated at the point having coordinates (x,y) is the test depth Z.sup.t that minimizes this criterion.
(74) The set of estimated depth values at each position (x,y) of the image gives the estimated depth map {circumflex over (Z)}(x,y).
(75) Lastly, in module 27, the Nλ estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.{circumflex over (Z)}(x,y)(x,y,λ) are determined by selecting, iterating on the depth Z.sup.t, only the values of estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) for which the considered test depth Z.sup.t corresponds to the estimated depth map {circumflex over (Z)}(x,y).
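This adaptive per-pixel selection can be sketched as follows (array shapes are assumptions of this sketch; the iteration on the depth mirrors the description above):

```python
import numpy as np

def adaptive_selection(edof_stack, zhat, depths):
    """Assemble, pixel by pixel, the EDOF spectral values computed at the test
    depth matching the estimated depth map, iterating on the depth.

    edof_stack : (Nd, H, W, Nl) EDOF estimates, one per test depth
    zhat       : (H, W)         estimated depth map, values drawn from `depths`
    depths     : (Nd,)          test depth values
    """
    out = np.zeros(edof_stack.shape[1:])
    for k, z in enumerate(depths):
        mask = zhat == z           # pixels whose estimated depth equals this test depth
        out[mask] = edof_stack[k][mask]
    return out
```

Each pixel thus receives the spectral values that were deconvolved under the depth hypothesis retained for that pixel, yielding images sharp over the whole depth range.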
(76)
(77) Step E1 is a step of capturing a scene using the device described above. Only one image capture is required in the context of the invention.
(78) The longitudinal chromatic optical system 1 then delivers to the spectral image sensor 2 a spectral flux modulated as a function of the distance and wavelength for each point of the image.
(79) In the next step E2, the image provided by the longitudinal chromatic optical system is received by the spectral image sensor 2. This provides a set of Nλ spectral images. Step E2 is detailed below.
(80) In the next step E3, the Nλ spectral images are deconvoluted in order to provide estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) for each test depth Z.sup.t of a test depth domain.
(81) The next step E4 determines an estimate {circumflex over (Z)}(x,y) of the actual depth map Z(x,y) of the scene considered, based on minimization of a cost criterion calculated from the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) provided by array 22 and on the Nλ estimated spectral images EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) provided in step E3.
(82) The next step E5 is an adaptive selection of pixels in order to estimate a set of spectral images with extended depth of field of the scene, for the depth map {circumflex over (Z)}(x,y) estimated in step E4.
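The chain of steps E2 to E5 can be summarized in a hedged end-to-end sketch (the callable interfaces standing in for steps E2-E3 and for the mosaic reconstruction are assumptions of this sketch):

```python
import numpy as np

def depth_pipeline(mosaic, edof_for_depth, reconstruct, depths):
    """Sketch of steps E2-E5: estimate EDOF spectral images for each test depth,
    reconstruct a mosaic per depth, take the per-pixel argmin of the squared
    difference (step E4), then select EDOF values adaptively (step E5).

    edof_for_depth : callable z -> (H, W, Nl) EDOF spectral estimate for depth z
    reconstruct    : callable (edof, z) -> (H, W) reconstructed mosaic image
    """
    edofs = [edof_for_depth(z) for z in depths]
    costs = np.stack([(reconstruct(e, z) - mosaic) ** 2
                      for e, z in zip(edofs, depths)])
    idx = np.argmin(costs, axis=0)                 # per-pixel best test depth
    zhat = np.asarray(depths)[idx]                 # estimated depth map
    stack = np.stack(edofs)                        # (Nd, H, W, Nl)
    rows = np.arange(mosaic.shape[0])[:, None]
    cols = np.arange(mosaic.shape[1])[None, :]
    edof_hat = stack[idx, rows, cols]              # per-pixel adaptive selection
    return zhat, edof_hat
```

A single captured mosaic image thus yields both the depth map and the extended-depth-of-field spectral images, consistent with the single-capture requirement of step E1.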
(83)
(84) In step E21, the image provided by the longitudinal chromatic optical system 1 is converted into a mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) by the image sensor 21 and the color filter array 22.
(85) In the next step E22, the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) undergoes demosaicing in order to produce Nf filtered images SOC.sub.f.sub.i(x,y).
(86) The next step E23 is a spectral estimation in order to determine Nλ spectral images from the Nf filtered images SOC.sub.f.sub.i(x,y).
(87) In a preferred embodiment, spectral estimation step E23 and spectral image deconvolution step E3 are combined into a single step.
(88) In this case, a test depth domain is considered. For a given test depth Z.sup.t, Nλ estimated spectral images EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) are estimated.
(89) This estimation is performed based on the Nf filtered images SOC.sub.f.sub.i(x,y).
(90) This embodiment is based on the assumption of a planar object scene, perpendicular to the optical axis of the system used, meaning that the object depth is independent of the coordinates (x,y).
(91) We thus obtain the following linear matrix representation:
G.sub.i.sup.Z*=H.sub.i.sup.Z*·S+N
(92) where: G.sub.i.sup.Z* is the matrix representation of the Fourier transform of the filtered image SOC.sub.f.sub.i(x,y), H.sub.i.sup.Z* is the matrix representation of the corresponding transfer model (combining the impulse response of the optical system at the actual depth Z* and the transmission of the i.sup.th color filter), S is the matrix representation of the Fourier transform of the spectral images of the scene, and N is the matrix representation of the noise.
(93) A least squares estimate of matrix S is then given by the formula:
(94) Ŝ.sup.Z.sup.t=[(H.sup.Z.sup.t).sup.H·H.sup.Z.sup.t+σ.sup.2·Id].sup.−1·(H.sup.Z.sup.t).sup.H·G
(95) where: Z.sup.t is a given depth in the test domain, Ŝ.sup.Z.sup.t is the least-squares estimate of matrix S for the test depth Z.sup.t, H.sup.Z.sup.t is the model matrix for the test depth Z.sup.t, .sup.H denotes the conjugate transpose, σ.sup.2 is a regularization term related to the noise, and Id is an identity matrix.
(96) The estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) are then obtained from the estimate Ŝ.sup.Z.sup.t by inverse Fourier transform.
(97) In step E4, a mosaic image I.sub.CFA.sup.Z.sup.t(x,y) is reconstructed from the estimated spectral images and for test depth Z.sup.t, according to the formula:
(98) I.sub.CFA.sup.Z.sup.t(x,y)=Σ.sub.i=1.sup.N.sup.f m.sub.i(x,y)·∫h.sub.i(λ)·[PSF(x,y,Z.sup.t,λ){circle around (×)}.sub.(x,y)EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ)]dλ
(99) The pixel-to-pixel squared difference between the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) determined in step E21 and the reconstructed mosaic image I.sub.CFA.sup.Z.sup.t(x,y) is calculated for each test depth Z.sup.t of the predetermined depth domain, giving the cost criterion:
W.sup.Z.sup.t(x,y)=[I.sub.CFA.sup.Z*(x,y)(x,y)−I.sub.CFA.sup.Z.sup.t(x,y)].sup.2
The depth estimated at the point having coordinates (x,y) is the test depth Z.sup.t that minimizes this criterion.
(100) The set of estimated depth values for each position (x,y) of the image gives the estimated depth map {circumflex over (Z)}(x,y).
(101) Step E5 then determines the Nλ estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.{circumflex over (Z)}(x,y)(x,y,λ) by selecting, iterating on the depth Z.sup.t, only the values of estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) for which the considered test depth Z.sup.t corresponds to the estimated depth map {circumflex over (Z)}(x,y).
(102) In another preferred embodiment, the demosaicing step E22, spectral estimation step E23, and spectral image deconvolution step E3 are combined into a single step.
(103) This embodiment is based on the fact that it is possible to find a transfer law to convert from the space of the mosaic image to another space, here the space of spectral images with extended depth of field, in the form of a linear matrix representation.
(104) This embodiment uses images from a database of images to calculate a transfer matrix for converting between the space of the mosaic image and the space of the spectral images with extended depth of field.
(105) This embodiment is based on the assumption that the scene is a plane in the object space, perpendicular to the optical axis of the system.
(106) The transfer matrix D.sub.EDOF.sub.
(107)
(108) where: Id represents an identity matrix of the same size as that of matrix
(109)
(110) The transfer matrix D.sub.EDOF.sub.S.sup.Z.sup.t enables transfer from the space of the mosaic image to the space of the spectral images with extended depth of field, for the test depth Z.sup.t.
(111) Thus a plurality of transfer matrices D.sub.EDOF.sub.S.sup.Z.sup.t is calculated, one for each test depth Z.sup.t of the test depth domain.
(112) The estimated spectral image with extended depth of field for depth Z.sup.t is obtained by matrix multiplication of matrix D.sub.EDOF.sub.S.sup.Z.sup.t with the mosaic image, according to the formula:
EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ)=D.sub.EDOF.sub.S.sup.Z.sup.t·I.sub.CFA.sup.Z*(x,y)(x,y)
(113) As in the first embodiment, in step E4 a mosaic image is reconstructed in module 26 from estimated spectral images and for test depth Z.sup.t, according to the formula:
(114) I.sub.CFA.sup.Z.sup.t(x,y)=Σ.sub.i=1.sup.N.sup.f m.sub.i(x,y)·∫h.sub.i(λ)·[PSF(x,y,Z.sup.t,λ){circle around (×)}.sub.(x,y)EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ)]dλ
(115) The pixel-to-pixel squared difference between the mosaic image I.sub.CFA.sup.Z*(x,y)(x,y) and the reconstructed mosaic image I.sub.CFA.sup.Z.sup.t(x,y) is then minimized, as in the first embodiment, to give the estimated depth map {circumflex over (Z)}(x,y).
(116) Step E5 then determines the Nλ estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.{circumflex over (Z)}(x,y)(x,y,λ) by selecting, iterating on the depth Z.sup.t, only the values of estimated spectral images with extended depth of field EDO{circumflex over (F)}.sub.S.sup.Z.sup.t(x,y,λ) for which the considered test depth Z.sup.t corresponds to the estimated depth map {circumflex over (Z)}(x,y).
PRIOR ART
(117) [1]: Article entitled “Generalized Assorted Pixel Camera: Post-Capture Control of Resolution, Dynamic Range and Spectrum”, by Yasuma, F.; Mitsunaga, T.; Iso, D. & Nayar, S. K. (2010), in IEEE Transactions on Image Processing;
(118) [2]: Article entitled “Extended depth-of-field using sharpness transport across color channels” by Guichard, F.; Nguyen, H.-P.; Tessieres, R.; Pyanet, M.; Tarchouna, I. & Cao, F. (2009), in ‘Proceedings of the SPIE—The International Society for Optical Engineering’;
(119) [3]: U.S. Pat. No. 7,626,769, “Extended Depth Of Field Imaging System using Chromatic Aberration”, 2009, Datalogic Scanning;
(120) [4]: Patent WO 2009/019362 “Optical System Furnished With a Device For Increasing its Depth Of Field”, DxO Labs;
(121) [5]: Article entitled “Coded Aperture Pairs for Depth from Defocus and Defocus Deblurring” by Zhou, C.; Lin, S. & Nayar, S. (2011), in International Journal on Computer Vision;
(122) [6]: Article entitled “Statistics of spatial cone-excitation ratios in natural scenes” by Nascimento, S. M. C.; Ferreira, F. & Foster, D. H. (2002), in Journal of the Optical Society of America A, 19, 1484-1490;
(123) [7]: Article entitled “Color filters including infrared cut-off integrated on CMOS image sensor” by Frey, L.; Parrein, P.; Raby, J.; Pelle, C.; Herault, D.; Marty, M. & Michailos, J. (2011), Opt. Express 19(14), 13073-13080.