Method and system for unsynchronized structured lighting
11425358 · 2022-08-23
Assignee
Inventors
Cpc classification
G06T7/521
PHYSICS
H04N13/254
ELECTRICITY
International classification
H04N13/254
ELECTRICITY
G06T7/521
PHYSICS
Abstract
A system and method to capture the surface geometry of a three-dimensional object in a scene using unsynchronized structured lighting is disclosed. The method and system include a pattern projector configured and arranged to project a sequence of image patterns onto the scene at a pattern frame rate, a camera configured and arranged to capture a sequence of unsynchronized image patterns of the scene at an image capture rate, and a processor configured and arranged to synthesize a sequence of synchronized image frames from the unsynchronized image patterns of the scene. Each of the synchronized image frames corresponds to one image pattern of the sequence of image patterns.
Claims
1. A system to capture the surface geometry of a three-dimensional object in a scene comprising a pattern projector, a camera, and a computing processor; the pattern projector configured and arranged to project a sequence of image patterns onto the scene at a pattern frame rate; the pattern frame rate measured in image patterns per second; the sequence of image patterns containing a number of image patterns; the number of image patterns being equal to the length of the sequence of image patterns; the camera having an image sensor; the camera configured and arranged to capture a sequence of unsynchronized image frames of the scene at an image capture rate; the image capture rate measured in frames per second; the image capture rate being greater than or equal to the pattern frame rate; the sequence of unsynchronized image frames containing a number of unsynchronized image frames; the number of unsynchronized image frames being equal to the length of the sequence of unsynchronized image frames; the number of unsynchronized image frames captured by the camera being greater than or equal to the number of image patterns; the sequence of image patterns having a first image pattern, a plurality of intermediate image patterns, and a last image pattern; the sequence of unsynchronized image frames having a first unsynchronized image frame, a plurality of intermediate unsynchronized image frames, and a last unsynchronized image frame; the projector starting projecting the first image pattern at a first start projection time; the projector ending projecting the first image pattern at a first stop projection time; the camera starting capturing the first unsynchronized image frame at a first start capture time; the first start capture time being not earlier than the first start projection time; the first start capture time being not later than the first stop projection time; the projector starting projecting the last image pattern at a last start projection time; the projector
ending projecting the last image pattern at a last stop projection time; the camera ending capturing the last unsynchronized image frame at a last stop capture time; the last stop capture time being not earlier than the last start projection time; the last stop capture time being not later than the last stop projection time; the computing processor configured and arranged to execute a process to synthesize a sequence of synchronized image frames from the sequence of unsynchronized image frames; the unsynchronized image frames and the synchronized image frames being digital images of the same dimensions, and sharing a plurality of common image pixels; each unsynchronized image frame having an unsynchronized pixel value corresponding to each common image pixel; each synchronized image frame having a synchronized pixel value corresponding to each common image pixel; the sequence of synchronized image frames containing a number of synchronized image frames; the number of synchronized image frames being equal to the length of the sequence of synchronized image frames; the number of synchronized image frames being equal to the number of image patterns; and each of the synchronized image frames corresponding to one image pattern in the sequence of image patterns.
2. The system, as in claim 1, where the camera and the computing processor are components of a single device such as a digital camera, smartphone, or computer tablet.
3. The system as in claim 1, further comprising processes for decoding, three-dimensional triangulation, and optionally geometric processing, executed by the computing processor.
4. The system as in claim 1, where the projector can select the pattern rate from a plurality of supported pattern rates, the camera can select the frame rate from a plurality of supported frame rates, and the camera can capture the unsynchronized image frames in burst mode at a fast frame rate.
5. The system as in claim 4, where the projector has a knob to select the pattern rate.
6. The system as in claim 4, where the pattern rate is set by a pattern rate code sent to the projector through a communications link.
7. The system as in claim 4, wherein the pattern rate and the frame rate are set so that the frame rate is not slower than the pattern rate.
8. The system as in claim 4, wherein a user sets the pattern rate and the frame rate.
9. The system as in claim 4, where the camera can receive a camera trigger signal, and set the number of burst mode frames.
10. The system as in claim 9, where the camera trigger signal is generated by a camera trigger push-button, and the camera starts capturing the unsynchronized image frames at the set frame rate as soon as it receives the trigger signal, and it stops capturing unsynchronized image frames after capturing the set number of burst mode frames.
11. The system as in claim 9, where the projector continuously projects the sequence of patterns in a cyclic fashion.
12. The system as in claim 11, wherein the system can detect when the first pattern is about to be projected, and the camera trigger signal is delayed until that moment.
13. The system as in claim 9, where the projector can receive a projector trigger signal.
14. The system as in claim 13, where the camera can send the projector trigger signal to the projector.
15. The system as in claim 14, where the camera has a flash trigger output, and it sends the projector trigger signal to the projector through the flash trigger output, the projector starts projecting the sequence of patterns at the set pattern rate when it receives the trigger signal, and the projector stops projecting patterns after it projects the last pattern.
16. The system as in claim 1, where the camera image sensor is a global shutter image sensor, the image capture rate is equal to the pattern frame rate, the number of image patterns in the sequence of image patterns is equal to the number of unsynchronized image frames in the sequence of unsynchronized image frames, and a method to synthesize the sequence of synchronized image frames from the sequence of unsynchronized image frames comprises the following steps: (a) determining a plurality of active pixels; the plurality of active pixels being a subset of the plurality of common pixels; each active pixel in the plurality of active pixels corresponding to a three-dimensional point in the scene illuminated by the image patterns projected by the projector; (b) determining a plurality of active pixels with known synchronized values; the plurality of active pixels with known synchronized values being a subset of the plurality of active pixels; (c) estimating a normalized start time for the first unsynchronized image frame as a function of the plurality of active pixels with known synchronized values, and the unsynchronized image frames; the normalized start time being the difference between the first start capture time and the first start projection time, divided by the image capture frame time; the image capture frame time being the inverse of the image capture rate; (d) selecting an active pixel; (e) estimating the synchronized pixel values of the synchronized image frames corresponding to the active pixel; and (f) repeating steps (d) and (e) until all active pixels have been selected.
17. The system as in claim 16, where the step of estimating the normalized start time for the first unsynchronized image frame is performed by minimizing the following expression with respect to the normalized start time
18. The system as in claim 16, where the step of estimating the synchronized pixel values of the synchronized image frames corresponding to the active pixel is performed by solving the following N×N system of linear equations with respect to the synchronized pixel values of the synchronized image frames corresponding to the active pixel
βP.sub.n−1(x,y)+αP.sub.n(x,y)+βP.sub.n+1(x,y)=t.sub.0I.sub.n−1(x,y)+(1−t.sub.0)I.sub.n(x,y), n=1, . . . ,N, where t.sub.0 is the normalized start time, N is the number of image patterns, n is a pattern index, (x,y) is the active pixel, α=t.sub.0.sup.2+(1−t.sub.0).sup.2, β=t.sub.0(1−t.sub.0), P.sub.1(x,y), . . . ,P.sub.N(x,y) are the synchronized pixel values of the synchronized image frames corresponding to the active pixel (x,y) being estimated, P.sub.0(x,y)=P.sub.N(x,y), P.sub.N+1(x,y)=P.sub.1(x,y), I.sub.1(x,y), . . . ,I.sub.N(x,y) are the pixel values of the unsynchronized image frames corresponding to the active pixel (x,y), and I.sub.0(x,y)=I.sub.N(x,y).
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) These and other features, aspects, and advantages of the method and system will become better understood with reference to the following description, appended claims, and accompanying drawings.
DETAILED DESCRIPTION
(12) A system and method to capture the surface geometry of a three-dimensional object in a scene using unsynchronized structured lighting is shown generally in
(13) One object of the present invention is a system to synthesize a synchronized sequence of image frames from an unsynchronized sequence of image frames, illustrated in
(14) Another object of the invention is an unsynchronized three-dimensional shape capture system, comprising the system to synthesize a synchronized sequence of image frames from an unsynchronized sequence of image frames described above, and further comprising prior art methods for decoding, three-dimensional triangulation, and optionally geometric processing, executed by the computer processor.
(15) Another object of the invention is a three-dimensional snapshot camera comprising the unsynchronized three-dimensional shape capture system, where the projector has the means to select the pattern rate from a plurality of supported pattern rates, the camera has the means to select the frame rate from a plurality of supported frame rates, and the camera is capable of capturing the unsynchronized image frames in burst mode at a fast frame rate. In a preferred embodiment the projector has a knob to select the pattern rate. In another preferred embodiment the pattern rate is set by a pattern rate code sent to the projector through a communications link. Furthermore, the system has means to set the pattern rate and the frame rate so that the frame rate is not slower than the pattern rate. In a more preferred embodiment the user sets the pattern rate and the frame rate.
(16) In a more preferred embodiment of the snapshot camera, the camera has the means to receive a camera trigger signal, and the means to set the number of burst mode frames. In an even more preferred embodiment, the camera trigger signal is generated by a camera trigger push-button. When the camera receives the trigger signal it starts capturing the unsynchronized image frames at the set frame rate, and it stops capturing unsynchronized image frames after capturing the set number of burst mode frames.
(17) In a first preferred embodiment of the snapshot camera with camera trigger signal, the projector continuously projects the sequence of patterns in a cyclic fashion. In a more preferred embodiment the system has the means of detecting when the first pattern is about to be projected, and the camera trigger signal is delayed until that moment.
(18) In a second preferred embodiment of the snapshot camera with camera trigger signal, the projector has the means to receive a projector trigger signal. In a more preferred embodiment the camera generates the projector trigger signal after receiving the camera trigger signal, and the camera has the means to send the projector trigger signal to the projector. In an even more preferred embodiment the camera has a flash trigger output, and it sends the projector trigger signal to the projector through the flash trigger output. When the projector receives the trigger signal it starts projecting the sequence of patterns at the set pattern rate, and it stops projecting patterns after it projects the last pattern.
(19) Another object of this invention is a method to synthesize a synchronized sequence of image frames from an unsynchronized sequence of image frames, generating a number of frames in the synchronized sequence of image frames equal to the number of projected patterns, and representing estimates of what the camera would have captured if it were synchronized with the projector.
(20) As will be described in greater detail below in the associated proofs, the method to synthesize the synchronized sequence of image frames from the unsynchronized sequence of image frames is shown generally in
(21) In a preferred embodiment, the method to synthesize the synchronized sequence of image frames from an unsynchronized sequence of image frames applies to a global shutter image sensor where the image frame rate is identical to the pattern frame rate. In this case, the image pixel values are given by
I.sub.n(x,y)=(1−t.sub.0)P.sub.n(x,y)+t.sub.0P.sub.n+1(x,y)
(22) where P.sub.n(x, y) and P.sub.n+1(x, y) represent the pattern values to be estimated that contribute to the image pixel (x, y) and P.sub.N+1=P.sub.1. Projected patterns are known in advance, but since it is not known which projector pixel illuminates each image pixel, they have to be treated as unknown. To estimate the value of t.sub.0, the following expression is minimized
(23) E(t.sub.0)=½Σ.sub.(x,y)Σ.sub.n=1.sup.N((1−t.sub.0)P.sub.n(x,y)+t.sub.0P.sub.n+1(x,y)−I.sub.n(x,y)).sup.2
(24) with respect to t.sub.0, where the sum is over a subset of pixels (x, y) for which the corresponding pattern pixel values P.sub.n(x, y) and P.sub.n+1(x, y) are known. Differentiating E(t.sub.0) with respect to t.sub.0 and equating the result to zero, an expression to estimate t.sub.0 is obtained
(25) t.sub.0=Σ.sub.(x,y)Σ.sub.n(P.sub.n+1(x,y)−P.sub.n(x,y))(I.sub.n(x,y)−P.sub.n(x,y))/Σ.sub.(x,y)Σ.sub.n(P.sub.n+1(x,y)−P.sub.n(x,y)).sup.2
(26) Once the value of t.sub.0 has been estimated, the N pattern pixel values P.sub.1(x, y), . . . , P.sub.N(x, y) can be estimated for each pixel (x, y) by minimizing the following expression
E(P.sub.1(x,y), . . . ,P.sub.N(x,y))=½Σ.sub.n=1.sup.N((1−t.sub.0)P.sub.n(x,y)+t.sub.0P.sub.n+1(x,y)−I.sub.n(x,y)).sup.2
(27) which reduces to solving the following system of N linear equations
βP.sub.n−1(x,y)+αP.sub.n(x,y)+βP.sub.n+1(x,y)=t.sub.0I.sub.n−1(x,y)+(1−t.sub.0)I.sub.n(x,y)
(28) for n=1, . . . , N, where α=t.sub.0.sup.2+(1−t.sub.0).sup.2 and β=t.sub.0(1−t.sub.0).
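The two estimation steps of this embodiment, the closed-form t.sub.0 estimate and the cyclic N×N linear system above, can be illustrated with a short numerical sketch. This is not part of the patent disclosure; it assumes NumPy, and the helper names are hypothetical:

```python
import numpy as np

def estimate_t0(P_known, P_next_known, I_known):
    """Closed-form minimizer of E(t0) over samples with known pattern
    values: t0 = sum(d*(I-P)) / sum(d^2) with d = P_{n+1} - P_n."""
    d = P_next_known - P_known
    return np.sum(d * (I_known - P_known)) / np.sum(d * d)

def solve_pixel(I, t0):
    """Recover synchronized values P_1..P_N at one pixel from the
    unsynchronized samples I_1..I_N by solving the cyclic tridiagonal
    system beta*P_{n-1} + alpha*P_n + beta*P_{n+1}
           = t0*I_{n-1} + (1-t0)*I_n."""
    N = len(I)
    alpha = t0**2 + (1.0 - t0)**2
    beta = t0 * (1.0 - t0)
    A = np.zeros((N, N))
    for n in range(N):
        A[n, n] = alpha
        A[n, (n - 1) % N] += beta        # cyclic convention P_0 = P_N
        A[n, (n + 1) % N] += beta        # cyclic convention P_{N+1} = P_1
    b = t0 * np.roll(I, 1) + (1.0 - t0) * I   # roll(I,1) gives I_{n-1}
    return np.linalg.solve(A, b)
```

One caveat worth noting: the cyclic matrix becomes singular when t.sub.0 is exactly 0.5 and N is even (a property of this circulant system), so in practice the offset should not coincide with exactly half a frame.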
(29) In another preferred embodiment, the method to synthesize the synchronized sequence of image frames from an unsynchronized sequence of image frames applies to a rolling shutter image sensor where the image frame rate is identical to the pattern frame rate.
(30) Camera row y in image n begins being exposed at time t.sub.n,y
t.sub.n,y=t.sub.0+(n−1)t.sub.f+yt.sub.r, y=0, . . . ,Y−1,
(31) and exposure ends at time t.sub.n,y+t.sub.e.
(32) In this model image n is exposed while patterns P.sub.n and P.sub.n+1 are being projected. The intensity level measured at a pixel in row y is given by
I.sub.n,y=(n−t.sub.n,y)k.sub.n,yP.sub.n+(t.sub.n,y+t.sub.e−n)k.sub.n,yP.sub.n+1+C.sub.n,y,
(33) The constants k.sub.n, y and C.sub.n, y are scene dependent.
(34) Let min{I.sub.n,y} be a pixel value exposed while P(t)=0, and max{I.sub.n,y} a pixel value exposed while P(t)=1, so that min{I.sub.n,y}=C.sub.n,y and max{I.sub.n,y}=t.sub.ek.sub.n,y+C.sub.n,y. Now, we define a normalized image Ī.sub.n,y as,
(35) Ī.sub.n,y=(I.sub.n,y−min{I.sub.n,y})/(max{I.sub.n,y}−min{I.sub.n,y})
(36) A normalized image is completely defined by the time variables and pattern values. In this section we want to estimate the time variables. Let us rewrite Equation 58 as
(37)
(38) where t.sub.0 and d are unknown. Image pixel values are given by
I.sub.n(x,y)=(1−t.sub.0−yd)P.sub.n(x,y)+(t.sub.0+yd)P.sub.n+1(x,y),
(39) As before, P.sub.n(x, y) and P.sub.n+1(x, y) represent the pattern values contributing to camera pixel (x, y); we define P.sub.N+1=P.sub.1, P.sub.0=P.sub.N, I.sub.N+1=I.sub.1, and I.sub.0=I.sub.N, and we omit the pixel (x, y) to simplify the notation. We now minimize the following energy to find the time variables t.sub.0 and d
(40)
(41) The partial derivatives are given by
(42)
(43) We set the gradient equal to the null vector and reorder as
(44)
(45) We use Equation 29 to compute t.sub.0 and d when we have some known (or estimated) pattern values.
(46) With known t.sub.0 and d we estimate pattern values minimizing
(47) E(P.sub.1, . . . ,P.sub.N)=½Σ.sub.n=1.sup.N((1−t.sub.0−yd)P.sub.n+(t.sub.0+yd)P.sub.n+1−I.sub.n).sup.2
(48) Analogously to Case 1 we obtain that Ap=b with A as in Equation 12 and α, β, and b defined as
α=(1−t.sub.0−yd).sup.2+(t.sub.0+yd).sup.2, β=(1−t.sub.0−yd)(t.sub.0+yd)
b=(1−t.sub.0−yd)(I.sub.1,I.sub.2, . . . ,I.sub.N).sup.T+(t.sub.0+yd)(I.sub.N,I.sub.1, . . . ,I.sub.N−1).sup.T
(49) Pattern values for each pixel are given by p=A.sup.−1 b.
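As a numerical illustration of this rolling-shutter case (again, not part of the disclosure; NumPy is assumed and the helper name is hypothetical), the only change from the global-shutter system is that the normalized offset becomes row dependent, s=t.sub.0+yd, so each row y solves its own cyclic system:

```python
import numpy as np

def solve_pixel_rolling(I, t0, d, y):
    """Rolling-shutter variant: for a pixel in row y the effective offset
    is s = t0 + y*d, and the cyclic system
    beta*P_{n-1} + alpha*P_n + beta*P_{n+1} = (1-s)*I_n + s*I_{n-1}
    uses alpha = (1-s)^2 + s^2 and beta = (1-s)*s."""
    N = len(I)
    s = t0 + y * d                         # per-row normalized offset
    alpha = (1.0 - s)**2 + s**2
    beta = (1.0 - s) * s
    A = np.zeros((N, N))
    for n in range(N):
        A[n, n] = alpha
        A[n, (n - 1) % N] += beta          # cyclic: P_0 = P_N
        A[n, (n + 1) % N] += beta          # cyclic: P_{N+1} = P_1
    b = (1.0 - s) * I + s * np.roll(I, 1)  # roll(I,1) gives I_{n-1}
    return np.linalg.solve(A, b)
```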
(50) In another preferred embodiment, the method to synthesize the synchronized sequence of image frames from an unsynchronized sequence of image frames applies to a global shutter image sensor where the image frame rate is greater than or equal to the pattern frame rate.
(51)
(52) Let Δt=t.sub.n−t.sub.n−1 be the time between image frames, let p=(P.sub.1, . . . ,P.sub.M).sup.T and Φ.sub.n(t.sub.0,Δt)=(Φ(n,1,t.sub.0,Δt), . . . ,Φ(n,M,t.sub.0,Δt)).sup.T, and rewrite Equation 33 as
(53)
(54) Each function Φ(n,m,t.sub.0,Δt)=∫.sub.t.sub.n−1.sup.t.sub.nf.sub.m(t)dt can be written as
Φ(n,m,t.sub.0,Δt)=max(0,min(m,t.sub.n)−max(m−1,t.sub.n−1))
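This expression is simply the length of the overlap between the exposure interval of frame n, [t.sub.n−1, t.sub.n], and the display interval [m−1, m] of pattern m, with all times in units of the pattern period. A small sketch (illustrative only; it assumes NumPy and that t.sub.n=t.sub.0+nΔt, consistent with Δt being the time between frames):

```python
import numpy as np

def phi(n, m, t0, dt):
    """Overlap between the exposure interval [t_{n-1}, t_n] of frame n,
    with t_n = t0 + n*dt, and the display interval [m-1, m] of pattern m
    (times in units of the pattern period)."""
    t_prev = t0 + (n - 1) * dt
    t_n = t0 + n * dt
    return max(0.0, min(m, t_n) - max(m - 1, t_prev))

def phi_matrix(N, M, t0, dt):
    """Stack the per-frame vectors Phi_n^T into an N x M matrix."""
    return np.array([[phi(n, m, t0, dt) for m in range(1, M + 1)]
                     for n in range(1, N + 1)])
```

With t.sub.0=0 and Δt=1 (perfect synchronization at equal rates) the matrix reduces to the identity, as expected: each frame sees exactly one pattern.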
(55) As before, P.sub.n(x, y) represents a pattern value contributing to camera pixel (x, y); we define P.sub.N+1=P.sub.1, P.sub.0=P.sub.N, I.sub.N+1=I.sub.1, and I.sub.0=I.sub.N, and we omit the pixel (x, y) to simplify the notation.
(56) We now minimize the following energy to find the time variables t.sub.0 and Δt
(57)
(58) We solve for t.sub.0 and Δt by making
(59)
(60) Because the Jacobian JΦ.sub.n(t.sub.0,Δt) depends on the unknown vector t=(t.sub.0,Δt).sup.T, we solve for it iteratively
(61)
(62) Matrix V.sub.A(n,t) and vector V.sub.b(n,t) are defined as
(63)
(64) For completeness we include the following definitions:
(65)
(66) With known t.sub.0 and Δt we estimate pattern values minimizing
(67)
(68) Analogously to Case 1 we obtain that Ap=b with
(69)
(70) Pattern values for each pixel are given by p=A.sup.−1b.
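Since the definition of A in paragraph (69) did not survive extraction, the following sketch assumes the standard least-squares reading: minimizing Σ.sub.n(Φ.sub.n.sup.Tp−I.sub.n).sup.2 yields A=Σ.sub.nΦ.sub.nΦ.sub.n.sup.T and b=Σ.sub.nΦ.sub.nI.sub.n, i.e. an ordinary per-pixel linear least-squares problem. This is an illustrative assumption, not the patent's exact formulation:

```python
import numpy as np

def recover_patterns(I, Phi):
    """Per-pixel least squares: recover the M pattern values p from the N
    unsynchronized samples I, given the N x M overlap matrix Phi whose
    rows are Phi_n^T.  Equivalent to solving A p = b with A = Phi^T Phi
    and b = Phi^T I (assumed reading of the lost Equation (69))."""
    p, *_ = np.linalg.lstsq(Phi, I, rcond=None)
    return p
```

Using least squares rather than an explicit inverse also handles the case N&gt;M, where more frames than patterns are captured and the system is overdetermined.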
(71) In another preferred embodiment, the method to synthesize the synchronized sequence of image frames from an unsynchronized sequence of image frames applies to a rolling shutter image sensor where the image frame rate is greater than or equal to the pattern frame rate.
(72) Camera row y in image n begins being exposed at time t.sub.n, y
t.sub.n,y=t.sub.0+(n−1)t.sub.f+yt.sub.r, y=0, . . . ,Y−1
(73) and exposure ends at time t.sub.n,y+t.sub.e.
(74) In this model a pixel intensity in image n at row y is given by
(75)
(76) The constants k.sub.n,y and C.sub.n,y are scene dependent, and P.sub.m is either 0 or 1.
(77) Let min{I.sub.n,y} be a pixel value exposed while P(t)=0, and max{I.sub.n,y} a pixel value exposed while P(t)=1,
min{I.sub.n,y}=C.sub.n,y
max{I.sub.n,y}=t.sub.ek.sub.n,y+C.sub.n,y
(78) Now, we define a normalized image Ī.sub.n,y as,
(79) Ī.sub.n,y=(I.sub.n,y−min{I.sub.n,y})/(max{I.sub.n,y}−min{I.sub.n,y})
(80) A normalized image is completely defined by the time variables and pattern values. In this section we want to estimate the time variables. Let us rewrite the previous equation as,
(81)
(82) Let
(83)
(84)
(85) We now minimize the following energy to find the unknown h
(86)
(87) with the following constraints
(88)
(89) or equivalently
(90)
(91) Equation E(h) cannot be minimized in closed form because the matrix V.sub.n,y depends on the unknown values. Using an iterative approach, the current value h.sup.(i) is used to compute V.sub.n,y.sup.(i) and the next value h.sup.(i+1).
(92) Up to this point we have assumed that the only unknown is h, meaning that pattern values are known for all image pixels. The difficulty lies in knowing which pattern pixel is being observed by each camera pixel. We simplify this issue by making calibration patterns all ‘black’ or all ‘white’, best seen in
(93) Decoding is done in two steps: 1) the time offset t.sub.0 needs to be estimated for this particular sequence; 2) the pattern values are estimated for each camera pixel, as shown in
(94) Similarly as for the time variables, pattern values are estimated by minimizing the following energy
(95)
(96) The matrix h.sup.T V.sub.n,y.sup.T is bi-diagonal for N=M and it is fixed if h is known.
(97) Therefore, it can be seen that the exemplary embodiments of the method and system provide a unique solution to the problem of using structured lighting for three-dimensional image capture where the camera and projector are unsynchronized.
(98) It would be appreciated by those skilled in the art that various changes and modifications can be made to the illustrated embodiments without departing from the spirit of the present invention. All such modifications and changes are intended to be within the scope of the present invention except as limited by the scope of the appended claims.