Apparatus for providing calibration data, camera system and method for obtaining calibration data
10803624 · 2020-10-13
Abstract
An apparatus comprises a first interface for receiving a plurality of partially overlapping images of an object from a corresponding plurality of cameras being arranged along a first and a second direction according to a camera pattern. The apparatus comprises an analyzing unit configured for selecting at least one corresponding reference point in an overlap area of a set of images, and for determining displacement information along the first and the second direction of the reference point in each of the other images of the set of images. A misalignment of the plurality of images along the first and the second direction is compensated by the displacement information so as to obtain aligned images. The apparatus comprises a determining unit configured for determining offset information between principal points of the plurality of cameras using at least three aligned images. The apparatus comprises a second interface for providing calibration data based on the displacement information and based on the offset information. The calibration data allows for calibrating the plurality of images so as to comply with the camera pattern.
Claims
1. Apparatus comprising: a first interface for receiving a plurality of partially overlapping images of an object from a corresponding plurality of camera positions being arranged along a first and a second direction according to a two-dimensional or three-dimensional camera pattern, wherein patterns comprising camera positions that differ in two directions are two-dimensional patterns and patterns comprising camera positions that differ in three directions are three-dimensional patterns; an analyzing unit configured for selecting at least one corresponding reference point in an overlap area of a set of overlapping images, and for determining a displacement information along the first and the second direction of the reference point in each of the other images of the set of images, wherein a misalignment of the plurality of images along the first and the second direction is compensated by the displacement information so as to acquire aligned images; a determining unit configured for determining an offset information between principal points at the plurality of camera positions using at least three aligned images; and a second interface for providing calibration data based on the displacement information and based on the offset information, the calibration data allowing for calibrating the plurality of images so as to comply with the camera pattern; wherein the analyzing unit is configured to determine the displacement information minimizing an error of a first minimization criterion, wherein for a 2D camera pattern the first minimization criterion is based on the determination rule;
2. The apparatus according to claim 1, wherein the apparatus is configured, for determining the displacement information, for using a feature detection and matching algorithm to detect references on pairs of images and for acquiring matched image coordinates between the images of the pair of images.
3. The apparatus according to claim 1, wherein the apparatus is configured for performing a self-calibration without imaging a calibration chart.
4. The apparatus according to claim 1, wherein the analyzing unit is configured for determining the displacement information using a set of parameters indicating a real condition of the camera pattern, the parameters comprising a non-linear relationship with respect to each other, wherein the analyzing unit is configured to use a linearized version of the set of parameters and to determine the displacement information by minimizing an error of the linearized version with respect to a desired condition of the camera pattern.
5. The apparatus according to claim 1, wherein the analyzing unit is configured to determine the displacement information minimizing an error of a first minimization criterion, wherein for a 3D camera pattern the first minimization criterion is based on the determination rule;
6. The apparatus according to claim 1, wherein the analyzing unit is configured to determine the displacement information minimizing an error of a linearized first minimization criterion, wherein for a 3D camera pattern the linearized first minimization criterion is based on the determination rule;
H(x_p, C_p, C_q) = R_0 · K · R_i = R_0 · H̃(x_p)

with H̃(x_p) = K · R_i corresponding to camera p with parameter vector x_p; x_q and x_r are vectors with parameters to be determined so as to minimize the error for cameras p, q and r; m_{p,q} = [m^(p)_{p,q} m^(q)_{p,q}] denotes a set of reference points, wherein m^(p)_{p,q} corresponds to the respective reference points in camera p; m_{p,q,r} = [m^(p)_{p,q,r} m^(q)_{p,q,r} m^(r)_{p,q,r}] denotes a set of reference points, wherein m^(p)_{p,q,r} corresponds to the respective reference points in camera p; the first set comprises all pairs of partially overlapping images; the second set comprises triplets of partially overlapping images, such that each of the N cameras considered in the minimization problem is comprised in at least one pair of cameras of the first set and in at least one triplet of cameras of the second set. Functions u(·) and v(·) extract the horizontal and vertical component of an image point, respectively. J_{p,q} = [J^(p)_{p,q} J^(q)_{p,q}] and J_{p,q,r} = [J^(p)_{p,q,r} J^(q)_{p,q,r} J^(r)_{p,q,r}] denote the Jacobian matrices according to

T̃(x, a) := (T ∘ F_{x,a})(1) ≈ F_{x,a}(0) + F′_{x,a}(0)

with F_{x,a}: t ↦ T(a + t · (x − a))

and

0 = [T̃g_{1,1}(x, a), T̃g_{1,2}(x, a), . . . , T̃g_{3,3}(x, a)]^T

0 = J · x + b

for camera pairs and camera triplets, respectively; J^(p)_{p,q} and J^(p)_{p,q,r} denote sub-matrices corresponding to a camera p; b_{p,q} and b_{p,q,r} denote residual elements; T_{(p,q,r),i} represents a matrix representing the trifocal tensor corresponding to camera indices p, q and r; m_p^# denotes a transformed point according to a homography H, wherein u_p^# and v_p^# denote the horizontal and vertical component of a point, respectively, and H̃(x_i) of camera i is a multiplication of a rotation matrix R_i, correcting a camera's orientation such that it corresponds to its ideal orientation as given by an ideal camera setup, and a center matrix K modelling a standard camera matrix based on
7. The apparatus of claim 1, wherein the analyzing unit is configured to determine the displacement information pairwise for a pair of images of the plurality of images.
8. The apparatus of claim 1, wherein the analyzing unit is configured to first determine the displacement information along one of the first and the second direction and to subsequently determine the displacement information for the other direction.
9. The apparatus of claim 1, wherein the analyzing unit is configured to determine the displacement information along a first image direction independent from the displacement information along a second image direction.
10. The apparatus of claim 1, wherein the calibration data is based on angles that describe a rotation of the cameras, on a focal length of the cameras and on a principal point of the cameras, wherein the calibration data does not comprise information indicating a position of cameras of the plurality of camera positions.
11. The apparatus of claim 1, further comprising at least one camera being configured to provide the plurality of images from the plurality of camera positions, wherein the apparatus is configured to determine a depth map of an object region comprising a plurality of subregions, wherein the at least one camera is configured to provide the corresponding image by projecting one of the plurality of subregions.
12. The apparatus of claim 1, further comprising at least one camera being configured to provide the plurality of images from the plurality of camera positions, wherein the apparatus is configured to apply the calibration data to the plurality of images so as to acquire a plurality of rectified images and to provide the plurality of rectified images.
13. The apparatus of claim 1, wherein displacement information referring to an image of the plurality of images comprises at least one of a shift of the image along a lateral direction and a rotation of the image.
14. The apparatus of claim 1, wherein the set of images is a subset of the plurality of images.
15. Camera system comprising at least one camera being configured to provide a plurality of images from a corresponding plurality of camera positions and comprising a memory having stored thereon calibration information derived from calibration data generated by an apparatus according to one of the previous claims, wherein the calibration information is the calibration data or incorporates at least part thereof.
16. The camera system of claim 15, further comprising an apparatus of claim 1.
17. Method for acquiring calibration data, the method comprising: receiving a plurality of partially overlapping images of an object from a corresponding plurality of camera positions being arranged along a first and a second direction according to a two-dimensional or three-dimensional camera pattern, wherein patterns comprising camera positions that differ in two directions are two-dimensional patterns and patterns comprising camera positions that differ in three directions are three-dimensional patterns; selecting at least one corresponding reference point in an overlap area of a set of overlapping images, and determining a displacement information along the first and the second direction of the reference point in each of the other images of the set of images, such that a misalignment of the plurality of images along the first and the second direction is compensated by the displacement information so as to acquire aligned images; determining an offset information between principal points at the plurality of camera positions using at least three aligned images; and providing calibration data based on the displacement information and based on the offset information, the calibration data allowing for calibrating the plurality of images so as to comply with the camera pattern; and determining the displacement information minimizing an error of a first minimization criterion, wherein for a 2D camera pattern the first minimization criterion is based on the determination rule;
18. A non-transitory digital storage medium having a computer program stored thereon to perform the method for acquiring calibration data, the method comprising: receiving a plurality of partially overlapping images of an object from a corresponding plurality of camera positions being arranged along a first and a second direction according to a two-dimensional or three-dimensional camera pattern, wherein patterns comprising camera positions that differ in two directions are two-dimensional patterns and patterns comprising camera positions that differ in three directions are three-dimensional patterns; selecting at least one corresponding reference point in an overlap area of a set of overlapping images, and determining a displacement information along the first and the second direction of the reference point in each of the other images of the set of images, such that a misalignment of the plurality of images along the first and the second direction is compensated by the displacement information so as to acquire aligned images; determining an offset information between principal points at the plurality of camera positions using at least three aligned images; and providing calibration data based on the displacement information and based on the offset information, the calibration data allowing for calibrating the plurality of images so as to comply with the camera pattern; and determining the displacement information minimizing an error of a first minimization criterion, wherein for a 2D camera pattern the first minimization criterion is based on the determination rule;
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Embodiments of the present invention will be detailed subsequently referring to the appended drawings.
DETAILED DESCRIPTION OF THE INVENTION
(22) Equal or equivalent elements or elements with equal or equivalent functionality are denoted in the following description by equal or equivalent reference numerals even if occurring in different figures.
(23) In the following description, a plurality of details is set forth to provide a more thorough explanation of embodiments of the present invention. However, it will be apparent to those skilled in the art that embodiments of the present invention may be practiced without these specific details. In other instances, well known structures and devices are shown in block diagram form rather than in detail in order to avoid obscuring embodiments of the present invention. In addition, features of the different embodiments described hereinafter may be combined with each other, unless specifically noted otherwise.
(24) In the following, reference is made to camera arrays comprising a plurality of cameras that are arranged at different positions. Such arrays are described herein as being arranged according to a two-dimensional pattern for illustrating the advantages of the embodiments. Although referring to a two-dimensional pattern, other patterns may also be used that comprise cameras being arranged according to a three-dimensional pattern. For example, this may be considered when allowing for divergences of the focal length of a camera. According to other embodiments, the camera pattern may comprise only one dimension, i.e., the cameras are arranged along a single line (or row). Some embodiments are described as one camera being arranged at each camera position such that embodiments that relate to a plurality of partially overlapping images of an object being received from a corresponding plurality of camera positions may be understood as the plurality of images being received from a corresponding plurality of cameras.
(25) Without limitation one camera may capture two or more images at different locations when being moved from a location to another such that images from different camera positions may be captured by a single camera.
(26) Embodiments described herein are explained in terms of stepwise rectifying images. This may relate to rectifying or transforming images so as to obtain transformed images and to use such transformed images for further steps or for further processing. Without any limitation, such descriptions relate to correction data that is obtained stepwise, i.e., the original images may remain partially or completely unchanged, wherein a transformation of the images is only theoretically performed, e.g., when determining displacement data. Thus, rectifying or transforming the images may be equivalent to determining the amount of the transformation that has to be applied to the images.
(27) Embodiments described herein relate to a displacement of images with respect to each other and/or with respect to a camera pattern. The displacement relates to a transformation according to a first and/or second direction parallel to an object plane but also relates to a rotation of the image in the three-dimensional space. In the following, reference is made to rectifying images, i.e., to calibration of images. Such rectification may comprise a transformation of images. Such transformation may relate to a shifting or a movement of the image along at least one image direction, to a rotation of the image in the image plane and/or to a scaling (skew) of images. Thus and unless stated otherwise, reference made herein that relates to a shift of an image also relates to a rotation and/or a scaling of the image.
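The shift, in-plane rotation and scaling operations referred to above can be sketched, purely as an illustration outside the language of the disclosure, as 3×3 homographies acting on homogeneous pixel coordinates (function names are my own):

```python
import numpy as np

def shift(tx, ty):
    # Translation along the first and second image direction.
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def rotate(theta):
    # In-plane rotation of the image about the origin, angle in radians.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def scale(sx, sy):
    # (Anisotropic) scaling of the image.
    return np.array([[sx, 0.0, 0.0],
                     [0.0, sy, 0.0],
                     [0.0, 0.0, 1.0]])

def apply(H, point):
    # Apply a homography to a 2D point and normalize the homogeneous result.
    p = H @ np.array([point[0], point[1], 1.0])
    return p[:2] / p[2]
```

Composed transformations are simply matrix products, e.g. `rotate(t) @ shift(tx, ty)`.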
(29) Further, the analyzing unit is configured for receiving the images 14 from the plurality of cameras, e.g., without or only coarse prior knowledge with respect to the content thereof and/or overlaps of the images. The analyzing unit may be configured for selecting a corresponding reference point, i.e., image content in the overlap area, e.g., by comparing images. The corresponding reference point or reference region may be a single point or region/object in the captured scene. By identifying similar or identical content, the overlap area may be identified and a set of cameras (pairs, triplets, etc.) may be identified having overlaps in their images at least at a region comprising the reference point 26.sub.1 and/or 26.sub.2. The set of images may thus be configured such that the set of images is a subset of the plurality of images 14, i.e., at least one of the images is not contained in the respective set. In contrast each camera is contained in one or more, e.g., at least two sets.
(30) Based thereon relationships between the cameras may be determined as being described in connection with
(31) Further images of the plurality of images 14 may be misaligned with respect to the image 14a and/or the image 14b. The analyzing unit 24 is configured to determine the displacement, i.e., to determine the displacement information 34, between the plurality of images 14. The analyzing unit 24 may be configured to determine the misalignment or displacement globally, i.e., to select all images of the plurality of images 14 that illustrate the same reference point 26 and to determine the displacement information globally for the selected set of images. Alternatively, the analyzing unit 24 may be configured to align two images such as the images 14a and 14b with respect to each other, i.e., to operate pair-wise. The analyzing unit 24 may be configured to select a set of images that comprise the same reference point 26 and to pair-wise align the images with respect to each other. The analyzing unit 24 may be configured to select more than just one reference point in the overlap area such as, for example, at least 5, 10 or 100. For determining the displacement information along one of the first and second direction, a first image of a selected pair of images may be rotated and scaled so as to be aligned along one image direction, i.e., such that lines or columns of pixels point towards the other camera position. Then, the second image may be transformed along the respective direction so as to align the at least one selected reference point. This may be repeated for the other direction. The apparatus has knowledge about the position of all cameras; this information is assumed to be reliable.
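The per-direction alignment step just described can be sketched minimally as follows, assuming matched reference points are already available; the names and the least-squares choice are illustrative and not prescribed by the disclosure:

```python
import numpy as np

def shift_along_direction(pts_ref, pts_other, axis=0):
    """Least-squares 1-D displacement along one image direction.

    pts_ref, pts_other: (N, 2) arrays of matched reference points in a
    pair of already rotated/scaled images. Returns the shift to apply to
    pts_other along the chosen axis (0: first direction, 1: second
    direction) so that the reference points align with pts_ref.
    """
    pts_ref = np.asarray(pts_ref, dtype=float)
    pts_other = np.asarray(pts_other, dtype=float)
    return float(np.mean(pts_ref[:, axis] - pts_other[:, axis]))
```

Calling this first with `axis=0` and then with `axis=1` mirrors the sequential, direction-by-direction procedure described above.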
(32) The apparatus 10 comprises a determining unit 36 configured for determining an offset information 38 that relates to principal points 42a to 42c of the plurality of images 14. The offset information may thus be a distance information that describes a distance between principal points and may thus be a relative information, relative with respect to the images associated to the principal points. Based on a physical distance between cameras (or camera positions) 18a and 18b, or between cameras (or camera positions) 18a and 18c, or between cameras (or camera positions) 18b and 18c, a view on the scene or object area, i.e., the object 16, is different from camera to camera. This effect is known as disparity and is used for determining 3D images. For a stereoscopic view, two cameras are needed, for example, cameras 18a and 18b. Through the use of cameras 18b and 18c, or 18a and 18c, respectively, a second stereoscopic view may be obtained that may allow for an increase in information. When considering now a first stereoscopic view being obtained by using cameras 18a and 18b and a second stereoscopic view being obtained by using cameras 18b and 18c, the stereoscopic information may differ from one another when a distance 44a between principal points 42a and 42b and a distance 44b between the principal points 42b and 42c differ from each other. Based thereon, disparity estimation and/or depth reconstruction might lead to different results, which may be prevented by the described embodiments.
(33) The determining unit 36 is configured to determine the offset information 38 so as to comprise information related to the distances 44a and 44b. The determining unit 36 is configured to use at least three images, for example, the images of the cameras 18a to 18c, for determining the offset information 38. For determining the distances 44a and/or 44b, the reference points may be used. The respective reference point may be identical in a pair of images and may be identical to or different from each other in different sets of images, i.e., in a picture that is part of different pairs of images, different reference points may be used. Determining the offset information 38 so as to obtain information on how to transform the images while allowing for low or a minimum of remaining disparity offset for all of the considered images may be a second optimization problem. The determining unit 36 may be configured to solve the optimization problem, as will be described later.
(34) The apparatus 10 comprises an interface 46 for providing calibration data 48 that is based on the displacement information 34 and based on the offset information 38. The calibration data may allow for rectifying the images 14, including a compensation of differences between the distances 44a and 44b between the principal points 42a to 42c. The apparatus 10 may comprise an optional directing unit 52 being configured to process and/or to direct the calibration data 48. For example, the apparatus 10 may comprise an optional memory 54, wherein the directing unit 52 is configured to store the calibration data 48 in the memory 54. The directing unit 52 may be implemented by a central processing unit (CPU), by a microcontroller, by a field programmable gate array (FPGA) or the like.
(35) Based on the displacement information 34, the apparatus 10 has knowledge about the displacement of the images 14 and may compensate for the displacement either by transforming, such as shifting, the images with respect to each other or by considering the displacement during further operation. This allows determining the differences in the distances 44a and 44b using aligned images, such that the distance may be calculated along one direction, such as x, for obtaining sufficiently precise information. An alignment of the images 14 may allow for reconstructing one-dimensional dependencies in the camera pattern such as lines, rows or diagonals of the cameras 18a to 18c. The alignment performed by the analyzing unit 24 may, for example, allow for each of the images 14 being in line or in row of the camera pattern. This may leave unconsidered a distance between two lines of the pattern, or two rows of the pattern, respectively. However, by determining the distance 44a and the distance 44b, differences in the disparity between different stereoscopic images may be compensated for. Thus, by pre-aligning the images, the compensation for deviations in distances of principal points may be performed along one direction and may, at the same time, provide sufficient information. Thus, disparity estimation and/or depth reconstruction as well as rectification of images may be performed with low computational effort.
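The effect of differing principal-point distances on disparity can be illustrated with a small numeric sketch (assuming already aligned images and equally spaced camera positions; this layout is my own, not taken from the disclosure):

```python
import numpy as np

def disparity_offset(u_a, u_b, u_c):
    """Offset between the disparities of two camera pairs.

    u_a, u_b, u_c: horizontal coordinates of the same reference points
    observed in three aligned images a, b, c. For equally spaced camera
    positions, a constant difference between the pairwise disparities
    indicates an offset between the principal points.
    """
    d_ab = np.asarray(u_a, dtype=float) - np.asarray(u_b, dtype=float)
    d_bc = np.asarray(u_b, dtype=float) - np.asarray(u_c, dtype=float)
    return float(np.mean(d_ab - d_bc))
```

A result of zero means both stereo pairs report consistent disparities; a non-zero result is the kind of offset the determining unit compensates for.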
(41) When referring again to the model of the pinhole camera, the projection matrix P may be formulated according to:
P = K · R · [I | −C]   (Formula 1)
(42) K is, for example, a 3×3 matrix consisting of the intrinsic parameters. These include the focal length, the principal point and the skew factor s.
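A minimal numeric sketch of the projection of Formula 1 (using the sign convention P = K·R·[I | −C] that is common for the pinhole model; all names are illustrative):

```python
import numpy as np

def camera_matrix(f, s, px, py):
    # Intrinsic matrix K: focal length f, skew s, principal point (px, py).
    return np.array([[f,   s,   px],
                     [0.0, f,   py],
                     [0.0, 0.0, 1.0]])

def projection_matrix(K, R, C):
    # P = K * R * [I | -C], with camera centre C in world coordinates.
    C = np.asarray(C, dtype=float).reshape(3)
    return K @ R @ np.hstack([np.eye(3), -C[:, None]])

def project(P, M):
    # Project a 3D world point M to normalized pixel coordinates (u, v).
    m = P @ np.append(np.asarray(M, dtype=float), 1.0)
    return m[:2] / m[2]
```

For example, a camera at the origin with f = 100 and principal point (50, 50) maps the point (0, 0, 10) on its optical axis to the principal point.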
(44) One parameter that may be evaluated for determining the displacement between images is the focal length. In
(46) The analyzing unit 24 may be configured to determine the displacement information along the x direction and subsequently along the y direction, or vice versa. Compared to determining the displacement along the x direction and the y direction simultaneously, this may provide for a high robustness. The apparatus may have knowledge about the error-free pattern 30 and may determine that lines 74a and 74b should be parallel to each other and that columns 76a and 76b should also be parallel to each other. The analyzing unit 24 may be configured to determine that this is not true and that the images have to be transformed with respect to each other as illustrated on the right hand side of
(47) Thus, the left hand side of
(48) In other words, the left hand side of
(50) An angle α_14 indicates an angle by which image 1 (24a) has to be rotated around an anchor point, possibly around a center 27_1 of the image 24a, such that its direction u, or alternatively v, at the anchor is directed towards the comparable anchor 27_4 in image 4 (24d). An angle α_24 indicates an angle by which image 2 (24b) has to be rotated around an anchor point, possibly around a center 27_2 of the image 24b, such that its direction u at the anchor is directed towards the comparable anchor 27_4 in image 4 (24d). Such angles may provide for information indicating the displacement information. α_pq may therefore be a parameter fixed for a pair of images or views (p, q), as it only depends on the mechanical center/position of cameras p and q. Centers 24_1 to 24_4 may also indicate the used camera positions C_1 to C_4.
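Since such an angle only depends on the two anchor positions, it can be computed directly from them; a sketch (assuming both anchors lie in a common plane, function name illustrative):

```python
import numpy as np

def anchor_angle(center_p, center_q):
    """Angle by which image p must be rotated so that its u direction
    at the anchor points towards the anchor of image q."""
    d = np.asarray(center_q, dtype=float) - np.asarray(center_p, dtype=float)
    return float(np.arctan2(d[1], d[0]))
```

For a pair of views the result is fixed by the mechanical camera positions, matching the statement above.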
(55) Although having described patterns that comprise a square or rectangular shape, any other shape may be realized. Any number of cameras such as at least three, five, nine or sixteen may be used. Thus, known approaches are extended towards two dimensional arrays of cameras, for example, such as shown in
(56) Apparatus 10 implements a concept that transforms the images such that the remaining deviation compared to a given model is minimized. This may be achieved by approximating a highly non-linear problem by a series of sub-problems that may be of a linear nature. The overall problem may be approximated and solved using iterative minimization methods minimizing the residual error. Alternatively, the apparatus 10 may also be configured to solve a non-linear optimization.
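The strategy of approximating the non-linear problem by a series of linear sub-problems can be sketched as a plain Gauss-Newton loop (a generic illustration of iterative residual minimization, not the specific solver of the disclosure):

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=20):
    """Iteratively linearize r(x) ~ r(x_k) + J * dx and solve the
    resulting linear least-squares sub-problem for the update dx."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual(x)
        J = jacobian(x)
        # Linear sub-problem: minimize || J * dx + r ||^2.
        dx, *_ = np.linalg.lstsq(J, -r, rcond=None)
        x = x + dx
        if np.linalg.norm(dx) < 1e-12:
            break
    return x
```

With each iteration the residual of the linearized model is driven towards zero, which is the sense in which the overall problem is "approximated and solved using iterative minimization methods" above.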
(59) In other words, typically, the precision of mechanical setups is limited. Therefore, the exact properties of one or several cameras need to be reconstructed. Such a reconstruction of camera parameters is also known as calibration. In this procedure, different measurements are made and based on a physical or mathematical model, unknown parameters are found using the analyzing unit and/or the determining unit. Once those unknown parameters are found, one can use these parameters, for example, to define a function that transforms a set of non-rectified images into a set of rectified images. Thus, the obtained calibration data may be applied to non-rectified images which allows for storing or transmitting the non-rectified images even if the displacement information has already been obtained.
(60) The focal length f of a camera may differ from its intended/ideal value in horizontal and/or vertical direction via an error Δf. The skew factor s may be used to describe the skew of a pixel. In the case of modern imaging sensors, this factor is typically 0, meaning that pixel cells are rectangular. The principal point describes the point of intersection between the camera's optical axis and the sensor. If the sensor is positioned exactly behind the central point of the optical axis, Δ_x, the deviation along the x direction, and Δ_y, the deviation along the y direction, are both zero. Therefore, the properties of a camera may be formulated as a matrix as:
(61)
K = [ f + Δf     s      Δ_x ]
    [   0      f + Δf   Δ_y ]
    [   0        0       1  ]
(62) By convention it may be formulated that, in a default state, an ideal camera is oriented such that it is strictly pointing down the z axis of a coordinate system such as the one illustrated in
(63) m_i = P · [M_i^T, 1]^T
(64) In this case, [M_i^T, 1]^T denotes a homogeneous 4D coordinate. The resulting image coordinate m_i may then be a three-element vector. This point represents an image coordinate such that the vector may be normalized to its third component. After this normalization, u and v denote the horizontal/vertical position of a point in an image plane, which may also be understood as distance (in pixels) from a side of the image.
(65) For each vector m, the following rule may be applied:
(66) m / m_3 = [u, v, 1]^T
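The normalization to the third component, together with the component-extraction functions u(·) and v(·) used in the claims, can be sketched as:

```python
import numpy as np

def normalize(m):
    # Normalize a homogeneous image point to its third component.
    m = np.asarray(m, dtype=float)
    return m / m[2]

def u(m):
    # Horizontal component of a homogeneous image point.
    return normalize(m)[0]

def v(m):
    # Vertical component of a homogeneous image point.
    return normalize(m)[1]
```

After normalization the third component is 1, and u and v are the pixel distances from the image sides as described above.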
(67) In the context of multi-camera self-calibration, bundle adjustment (BA), as described in [2], chapter 18, is a known and proven algorithm. This algorithm uses iterative optimization methods like the Levenberg-Marquardt algorithm to minimize the following problem:
(68) min over P̂^i, M̂_j of Σ_i Σ_j d(P̂^i · M̂_j, x_j^i)²
(69) wherein d(·) denotes the geometric distance between an image point x_j^i and a re-projected point P̂^i · M̂_j. P̂^i denotes the estimated camera matrix for camera i and M̂_j denotes an estimated world point. Variants of this algorithm are known that are designed to improve efficiency for specific applications and working points. In [3] a method is presented to calibrate a similar camera system consisting of 25 cameras in total. A well-known chart-based calibration method presented in [4] is used and the result is refined using global optimization. Other works as presented in [5], [6] and [7] use additional means like projected patterns or laser pointers to calibrate a set of cameras, or entail actively controlling the camera orientation or zoom. A calibration procedure as described in [8], whilst making references to [9], requires the user to provide a calibration object (such as a checkerboard) in front of the multi-camera system. Furthermore, the system is limited to two-dimensional (2D) camera setups, and the calibration object needs to be at least partially visible in all cameras at once. The properties of the calibration object need to be known to the algorithm; thus, the calibration object cannot be selected arbitrarily. In known systems, properties of the calibration object, e.g., the number of patches on the checkerboard or the fact that all patches are squares, are known to the algorithm and directly exploited therein. The algorithm will fail if the calibration object cannot be detected properly.
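The quantity that bundle adjustment minimizes, the summed squared geometric re-projection error, can be sketched as follows (a generic illustration of the cost described in [2], chapter 18; the data layout is my own):

```python
import numpy as np

def reprojection_error(P_list, M_list, observations):
    """Sum of squared geometric distances d(x_j^i, P^i * M_j)^2.

    P_list: estimated 3x4 camera matrices, one per camera i.
    M_list: estimated 3D world points, one per point j.
    observations: observations[i][j] is the observed pixel (u, v) of
    world point j in camera i.
    """
    total = 0.0
    for i, P in enumerate(P_list):
        for j, M in enumerate(M_list):
            m = P @ np.append(np.asarray(M, dtype=float), 1.0)
            uv = m[:2] / m[2]  # re-projected, normalized image point
            total += float(np.sum((np.asarray(observations[i][j]) - uv) ** 2))
    return total
```

An iterative optimizer such as Levenberg-Marquardt adjusts the entries of P_list and M_list to drive this cost towards zero.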
(70) Camera systems such as the ones illustrated in
(71) The starting point of the calibration procedure that may be performed with apparatus 10 may be seen on the left hand side of
(72) Based on the pinhole camera model and the projection matrix P, the situation on the right hand side of
(73) In the remaining part of the document, subscripts as in x_p are used to indicate that a vector x denotes properties or information belonging to a specific element such as a camera p. If a variable encodes information belonging to a pair or tuple of cameras, we denote this using a two-element subscript such as α_pq. In this example, α_pq encodes an angular relationship between two cameras p and q. Similarly, a relationship or information comprising three cameras is encoded using a three-element subscript such as in A_pqr.
(74) The calibration task can be formulated as measuring the deviation from its ideal state and finding a set of functions H(x_p, C_p, C_q) and H(x_q, C_p, C_q) that, given a pair of images, transforms each image point such that corresponding image points are located in the same image row. For example, matrix H is a 3×3 homography matrix. Mathematically, this may be expressed as:
(75)
m_p ≃ P_p · M,  m_q ≃ P_q · M   (Formula 6)
v(H(x_p, C_p, C_q) · m_p) = v(H(x_q, C_p, C_q) · m_q)   (Formula 7)
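The row-alignment requirement just formulated, that corresponding points end up in the same image row after applying the homographies, can be checked numerically; a sketch with illustrative names:

```python
import numpy as np

def v(m):
    # Vertical component of a homogeneous image point.
    return m[1] / m[2]

def row_aligned(H_p, H_q, m_p, m_q, tol=1e-9):
    """Rectification criterion: after applying each camera's homography,
    the corresponding homogeneous points m_p and m_q lie in the same row."""
    return abs(v(H_p @ m_p) - v(H_q @ m_q)) < tol
```

In a calibrated pair, this predicate holds for every pair of corresponding points; the calibration task is to find homographies that make it hold.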
(76) M denotes a 3D world coordinate, and P_p denotes the projection matrix for camera p. Typically, M as well as P are unknown. Assuming that the world point M can be seen in both cameras p and q, this point forms one pair of corresponding image points m_p and m_q. H(x_p, C_p, C_q) is a function of each camera's position C_p and C_q and several intrinsic parameters x_p. The row vector x may be formulated according to the rule:
x:=[θ φ ρ a.sub.f p.sub.x p.sub.y] Formula 8
(77) H(x.sub.p,C.sub.p,C.sub.q) may be modeled using the pinhole camera model. The outer function v() is used to extract the vertical component of an image point. The equation given above may be implemented for all possible pairs of cameras of a given subset which is relevant for the further processing. Although Formulas 6 and 7 and other descriptions made herein differentiate between a left view and a right view, these may be exchanged with respect to each other or manipulated, without any limitation. When referring, for example, to
(78) The problem discussed above is valid for the ideal case of a pinhole camera that does not suffer from side effects like lens distortion. If lens distortion is relevant for a specific camera and lens in an implementation, Formula 7 may be extended:
v(H(x.sub.p,C.sub.p,C.sub.q).Math.undist(m.sub.p,x.sub.p))=v(H(x.sub.q,C.sub.p,C.sub.q).Math.undist(m.sub.q,x.sub.q)) Formula 9
(79) undist() denotes a function that reverts lens distortion according to some model and parameters contained in x (cf. Formula 8). undist() may be defined as:
(80)
(81) The function L(r, κ) actually models the distortion and is controlled by the parameter vector κ. Different models may be selected, as well as a different length of the parameter vector κ. Here, the discussion is limited to two elements in κ. In the ongoing part of this text, explicit elements in κ will be referred to as k such that κ may be expressed as: κ=[k.sub.1, k.sub.2, . . . ].
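The concrete distortion model is left open above; as an illustration, here is a minimal sketch assuming the common two-coefficient polynomial model L(r, κ) = 1 + k.sub.1·r² + k.sub.2·r⁴. The function names and the first-order inversion are our assumptions, not taken from the patent:

```python
import numpy as np

# Sketch of a radial distortion function L(r, kappa) with two coefficients,
# assuming the common polynomial model L(r) = 1 + k1*r^2 + k2*r^4. The patent
# leaves the concrete model open, so this is one illustrative choice.
def L(r, kappa):
    k1, k2 = kappa
    return 1.0 + k1 * r**2 + k2 * r**4

def undist(m, kappa):
    """Revert lens distortion for a homogeneous image point m = [u, v, 1].
    The point is scaled by 1/L(r, kappa) (a simple first-order inversion)."""
    u, v, _ = m
    r = np.hypot(u, v)
    s = L(r, kappa)
    return np.array([u / s, v / s, 1.0])
```

With κ = [0, 0] the function reduces to the identity, matching the undistorted pinhole case discussed above.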
(82) In principle, this problem may be solved using stereo calibration methods on each pair of cameras. In general, such a method will estimate a parameter set x.sub.p, x.sub.q for images p and q, but a different parameter set {tilde over (x)}.sub.p, {tilde over (x)}.sub.r for images p and r, wherein image r is located, for example, below image p, such as image 14c with respect to image 14a. In contrast, it is an object to obtain only one set of parameters x per camera. The desired calibrated situation may also be expressed using the camera matrices P. In the calibrated case, all cameras have identical camera matrices K. All cameras are oriented identically with their optical axis perpendicular to the camera baseline. In this situation, the camera matrix P only depends on the camera position C as described by:
P.sub.p=K.Math.[I|−C.sub.p] Formula 12
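The calibrated state of Formula 12 can be sketched numerically; the snippet assumes the conventional pinhole form P = K·[I | −C] with one shared intrinsic matrix K for all cameras (function names and numeric values are illustrative):

```python
import numpy as np

# Sketch of Formula 12: in the calibrated state every camera shares one
# intrinsic matrix K and the identity orientation, so P_p = K @ [I | -C_p].
def projection_matrix(K, C):
    return K @ np.hstack([np.eye(3), -C.reshape(3, 1)])

def project(P, M):
    """Project a finite 3D world point M to inhomogeneous pixel coordinates."""
    m = P @ np.append(M, 1.0)
    return m[:2] / m[2]
```

For example, a camera at C = (1, 0, 0) with focal length 100 maps the world point (0, 0, 10) to pixel (−10, 0), i.e., the image shifts with the camera position only.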
(83) In the case of independent stereo calibration, the obtained solution is slightly different and doesn't match all criteria: self-calibration methods can only estimate a relative difference between two cameras. This is, of course, also true for the principal point. In the case of a horizontally calibrated stereo rig with corresponding image points on identical image lines, both horizontal principal points p.sub.x,1 and p.sub.x,2 may be selected and manipulated arbitrarily. Both images still form a stereo system. This fact is, for example, also exploited in stereoscopic productions to manipulate the convergence plane.
(84) In the case of complex image processing such as multi-camera disparity estimation, one would like to ensure that corresponding image points across more than two views can be correlated by a simple relationship. Based on the simplified projection matrices given in Formula 12, one can deduce the following formula:
(85)
(86) Given a triplet of matching feature points, one can see that stereo disparity directly corresponds to the distance between the cameras' centers. The denominator of Formula 13 may be the distance 44 or b.sub.x, b.sub.y, respectively.
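The proportionality between disparity and camera-center distance can be checked with the textbook relation disparity = f·b/z; the symbols f (focal length), b (baseline) and z (depth) are assumptions consistent with, but not quoted from, the text:

```python
import numpy as np

# Numerical check of the statement above: for ideally aligned cameras the
# horizontal disparity of a world point scales linearly with the baseline
# between the camera centers (disparity = f * b / z for depth z).
def disparity(f, baseline, depth):
    return f * baseline / depth

d_pq = disparity(f=100.0, baseline=0.1, depth=5.0)  # cameras p, q
d_pr = disparity(f=100.0, baseline=0.2, depth=5.0)  # cameras p, r, twice the baseline
```

Doubling the baseline doubles the disparity, which is exactly the geometric consistency that the triplet condition enforces.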
(87) In a real setup, the true camera positions will slightly differ from the given values C, and known calibration methods are in general not applicable. However, the concept according to embodiments described herein assumes that the positional error is very small and can be ignored, which allows for a reduced space of parameters and therefore for a lower complexity of the problem. The concept according to the embodiments described herein may still be applied but may slightly increase the back-propagation error.
(88) It is therefore one aspect of the embodiments described herein to first align the images and afterwards determine the distance between the principal points by using a different number of images. For determining the displacement information, images may be used pair-wise, wherein the offset information may be determined using three cameras.
(89) The displacement information as well as the offset information may be determined without user-interaction. For each pair of images/triplet of images robust corresponding points, i.e., reference points, regions or objects are determined in an automatic manner, especially without a user providing special calibration patterns such as checkerboards or similar objects. In contrast to existing systems (such as [8]) it is in addition not needed that a single (possibly known) object or point can be observed in all cameras at one time.
(90) Thus, the number of cameras is adapted to the problem to be solved. Furthermore, the underlying non-linear optimization problem may be reformulated. The analyzing unit and/or the determining unit may be configured to solve the non-linear optimization problems. According to an embodiment, the analyzing unit and/or the determining unit may be configured to linearize the non-linear problem at some distinct points and to solve the linearized versions using standard iterative minimization algorithms. Instead of solving one large non-linear problem, such embodiments set up a list of local, piece-wise linear problems that are used to formulate the global problem. This global problem may then be solved iteratively using linear methods. When assuming that the mechanical design of the system is known, including camera position and ideal orientation, the calibration procedure may enforce camera views to obey this model. Additionally, the determination of the displacement information may be split into two parts. By this, highly correlated parameters leading to unstable results can be separated and optimized sequentially. Known approaches such as BA involve estimating 3D world points from image correspondences. This is, in itself, a complex, error-prone procedure. In embodiments, 3D world points are not needed, increasing the stability and efficiency of the approach. For example, BA optimizes camera parameters as well as the positions of 3D world points. The total number of degrees of freedom can be computed as 7N+3P, where N denotes the number of cameras and P denotes the number of 3D world points. According to embodiments, a focus is set on the intrinsic parameters and the camera rotation. This initially yields, in an example, a number of degrees of freedom of 8N. When referring again to Formula 8, the vector x comprises 6 parameters.
When formulating a problem based on these 6 parameters together with the two distortion parameters per camera, the number of cameras N times 8 values per camera yields the value above. Thus, embodiments described herein focus on obtaining a result, i.e., the displacement information and the offset information, that makes subsequent image processing steps efficient. In the first step, correspondences between image pairs are sufficient. The second step involves correspondences in three, possibly adjacent, views. This solution may spread an error in the camera array among all cameras, which is a specific property of embodiments described herein. Thus, an apparatus such as apparatus 10 may be configured to impinge an error on an error-free image so as to compensate for inter-image errors. The concept is designed to fit a set of images to a given model, i.e., the camera pattern. The physical model, as well as the mapping between an image and its position in the array, is specified by a user.
(91)
(92)
(93) Sub-method 603 comprises block 604, which is a linear optimization procedure, and block 606, which is performed alternatively or in addition to block 604 and is a non-linear optimization procedure. Only one of the blocks 604 and 606 needs to be executed in order to obtain a set of calibration data 607. Block 608 generates, in a block 612, a set of calibrated images from the calibration data obtained in 607 so as to obtain L rectified images in a block 614, or stores the calibration data in a block 616 for future use. Method 600 allows for the use of linear strategies as well as of non-linear strategies and is consistent for planar and non-planar camera arrays.
(94) In the following, reference will be made to the use of a linear optimization, for example, in block 604 using steps 612, 614 and 616 of method 600. In an M×N camera array, (M−1)·N+M·(N−1) individual stereo pairs can be formed using directly adjacent camera positions. When referring, for example, to
(95) Although the description was made in connection with a number of six parameters that are to be determined in order to calibrate images, a different number of parameters or different parameters may also be used. E.g., in case one or more of the above-mentioned parameters is considered to be within a tolerance range, calibration of that parameter may be skipped. Alternatively or in addition, other parameters may be considered, such as a distortion of lenses used in the cameras, which may lead to low picture quality. The apparatus 10 may be configured to calibrate for the distortion, e.g., during determination of the displacement information. Different reference points in the image may allow for compensating the distortion of the lens.
(96) In this way, each view is involved in at least two individual stereo pairs. The collection of stereo pairs is further referenced as the set. The set may, for example, comprise the pairs (p,q), (p,r), (q,r) and others. In general, the method is not limited to directly adjacent cameras. Each possible pair of views with overlapping image areas that allow for a common reference point may be formed. In step 614, interest points, or reference points, are detected on each of the images, and these interest points are matched on the mentioned stereo pairs in step 616. This may be done using any arbitrary feature detection and matching algorithm like SIFT, SURF or BRIEF. As a result, sets of corresponding, matched image coordinates
(97)
between two images p and q is obtained.
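The patent leaves the detector and descriptor open (SIFT, SURF or BRIEF are named as examples). As a minimal stand-in for the matching step, a nearest-neighbour matcher with Lowe's ratio test over toy descriptor vectors; all names and values here are illustrative:

```python
import numpy as np

# Minimal stand-in for the matching step: nearest-neighbour matching of
# descriptor vectors with a ratio test. Real systems would use SIFT, SURF or
# BRIEF descriptors; tiny toy descriptors are used here instead.
def match_descriptors(desc_p, desc_q, ratio=0.8):
    """Return index pairs (i, j) of best matches passing the ratio test."""
    matches = []
    for i, d in enumerate(desc_p):
        dists = np.linalg.norm(desc_q - d, axis=1)   # distance to every candidate
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:      # unambiguous match only
            matches.append((i, best))
    return matches
```

The output index pairs correspond to the matched coordinate sets m.sub.p,q referred to in the text.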
(98) u.sub.p and v.sub.p denote vectors containing l horizontal/vertical positions of feature matches between view p and view q respectively. In the example of
(99) In block 604, disparities along a direction, e.g., vertical disparities, for all stereo pairs contained in the set are reduced or minimized. For example, this may be done along the vertical direction such as the y direction. Formally, for one pair (p,q) this can be written as:
0=v(H(x.sub.p,C.sub.p,C.sub.q).Math.m.sub.p,q.sup.(p))−v(H(x.sub.q,C.sub.p,C.sub.q).Math.m.sub.p,q.sup.(q)) Formula 14
(100) As before, H(x.sub.p,C.sub.p,C.sub.q) denotes a function returning a 3×3 homography matrix for view p. The argument x.sub.p of this function is a vector of length k. The function v(x) extracts the vertical component of a feature point coordinate:
(101)
where x is a 3×l matrix. Consequently, v(x) returns a vector of length l. In the present consideration, x is a variable unrelated to other variables named x herein and shall simply denote an argument of the function v. The overall minimization task can now be formulated as:
(102)
(103) The result thereof may allow the analyzing unit to obtain a solution for all images p, q, r and s forming pairs, the pairs being comprised in the set, i.e., a common solution for all x may be obtained.
(104) In Formula 15, the square norm is selected in order to model the error between two views. Other choices are possible. The subscript of the summation indicates that the tuple (p,q) is one tuple exemplarily selected from the set, and also that the summation is performed over all tuples contained in the set. H(x.sub.p,C.sub.p,C.sub.q) is defined as a product of three functions R.sub.0, K and R.sub.i:
H(x.sub.p,C.sub.p,C.sub.q)=R.sub.0.Math.K.Math.R.sub.i=R.sub.0.Math.{tilde over (H)}(x.sub.p)
with {tilde over (H)}(x.sub.p)=K.Math.R.sub.i Formula 17
(105) This formulation can be interpreted as follows: The right-most rotation matrix R.sub.i corrects a camera's orientation such that it corresponds to its ideal orientation as given by the ideal camera setup. The center matrix K corrects parameters like focal length and principal point. The left-most rotation R.sub.0 rotates a camera p such that it forms a stereo system with a camera q. Only this matrix is individual for each pair of views.
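The three-factor structure of Formula 17 can be sketched as a simple matrix product; rot_z stands in for the planar-array case mentioned below, where the outer rotation R.sub.0 reduces to a single rotation around the optical axis (function names are ours, not the patent's):

```python
import numpy as np

def rot_z(angle):
    """Rotation around the optical (z) axis, the planar-array case for R_0."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Sketch of Formula 17: H = R_0 @ K @ R_i. R_i corrects one camera towards its
# ideal orientation, K corrects intrinsics like focal length and principal
# point, and only the outer rotation R_0 depends on the pair of views.
def H(R_0, K, R_i):
    return R_0 @ K @ R_i
```

In the calibrated limit all three factors approach the identity, so H approaches the identity as well.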
(106) K may model the standard camera matrix as given above. R.sub.0 and R.sub.i denote the outer and inner rotation matrix. More precisely, R.sub.0 is computed from the relative position and orientation of two cameras p and q. Because each camera position is known, R.sub.0 is known. In some arrangements, all cameras share one common plane, such that the outer rotation R.sub.0 can be modeled using a single rotation around the optical axis. This may be referred to as the angle that is explained in connection with
(107) Alternatively, Formula 17 may also be defined as:
H(x.sub.p,C.sub.p,C.sub.q)=R.sub.0.Math.R.sub.i.Math.K=R.sub.0.Math.{tilde over (H)}(x.sub.p)
with {tilde over (H)}(x.sub.p)=R.sub.i.Math.K Formula 17b
(108) In the more general case, when the cameras do not share a common plane, the apparatus, in particular the analyzing unit, is configured to determine R.sub.0 from the positions C.sub.p and C.sub.q of two cameras p and q. R.sub.0 is a 3×3 matrix with elements as listed below:
(109) TABLE-US-00001 Formula 18: the elements R.sub.0(i,j) expressed in terms of the camera positions C.sub.p and C.sub.q.
(110) The parameter vector x.sub.p contains all unknown parameters for a camera p and is defined according to Formula 8. Based on this definition H(x.sub.p,C.sub.p,C.sub.q) may now be completely defined as:
(111)
(112) An integer number subscript such as in C.sub.p,2 denotes the second element of the vector C.sub.p. Note that in Formula 19 the camera subscripts in the elements of x.sub.p have been omitted. It should, however, be clear that in any formula comprising elements of x, those elements need to be interpreted with respect to the specific camera p. In Formula 19, the ideal orientation of all cameras is identical and perpendicular to the baseline. If this is not the case, H can be extended by an additional rotation matrix R describing the specific orientation with respect to the coordinate frame:
H(x.sub.p,C.sub.p,C.sub.q)=R.sub.o(0,0,α.sub.p,q).Math.R(θ,φ,ρ).sup.−1.Math.K(a.sub.f,p.sub.x,p.sub.y).Math.R.sub.i(θ,φ,ρ) Formula 21
(113) In the ongoing description the case expressed in Formula 14 is considered. If needed, calculations can be redone including R(,, ).
(114) With respect to the alternative definition of Formula 17 as given in Formula 17b, Formulae 19, 20 and 21 can be adapted by switching the order of the two rightmost matrices R.sub.i and K.
(115) According to embodiments, the apparatus is configured to consider the lens distortion. The functionality of the apparatus may be expressed by including Formula 10 in Formulas 11 and 12 so as to obtain an undistorted version of the reference points m.
(116) The term of Formula 14 forms a non-linear system. This formula can be solved using iterative least-squares minimization techniques like the Levenberg-Marquardt algorithm or comparable methods. However, this solution is inefficient. This non-linear minimization is depicted in block 606 of
(117) The problem in Formula 12 may be reformulated in order to find the values in H efficiently: therefore, it may be assumed that all variables in x are small. This corresponds to block 604 in
(118) With the complete definition of H, Formula 12 may be evaluated for one pair of cameras p, q and for one pair of point correspondences.
(119) In the next step, such a formulation may be expanded, for example, using a first order Taylor expansion in several variables. This expansion may be defined as:
T(x,a):=F.sub.x,a(0)+F′.sub.x,a(0)≈f(x)
F.sub.x,a: ℝ.fwdarw.ℝ
F.sub.x,a:=t.fwdarw.f(a+t.Math.(x−a)) Formulas 22 & 23
(120) In Formulas 22 and 23, F.sub.x,a is a function of t that evaluates the function f(x) at the point a+t.Math.(x−a). f(x) is the function to approximate at some point a. Parameters to the function f(x) are contained in the parameter vector x. t is a scalar value. a is a vector of identical size as x. The first order Taylor approximation T of a function f(x) at some point a may be expressed as a Taylor expansion of the scalar function F.sub.x,a(t). F.sub.x,a is expanded at the point t.sub.0=0 by computing the function value of F.sub.x,a at t.sub.0=0 and the value of the first order derivative
(121)
also at t.sub.0=0. Evaluating the resulting term at t=1 yields the first order Taylor approximation of the function f(x) at the point a. The expansion may, for example, be computed for a single pair of images p and q and a single pair of reference points. Then, Formula 16 (here also including lens distortion as given in Formula 9) may be simplified as written in Formula 24.
v(H(x.sub.p,C.sub.p,C.sub.q).Math.undist([u.sub.p,v.sub.p,1].sup.T,x.sub.p))−v(H(x.sub.q,C.sub.p,C.sub.q).Math.undist([u.sub.q,v.sub.q,1].sup.T,x.sub.q)) Formula 24
(122) The apparatus may be configured to determine the Taylor expansion using Formula 22 for Formula 24 for all variables contained in x.sub.p and x.sub.q, forming a parameter vector like [x.sub.p, x.sub.q]. This parameter vector is the argument to Formula 22, with expansion point a=0. Without limitation, other expansion points may be selected.
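The expansion of Formulas 22 & 23 can be sketched with a numerical derivative; f, x and a follow the text, while the central-difference step is our implementation choice, not part of the patent:

```python
import numpy as np

# Sketch of Formulas 22 & 23: first-order Taylor approximation of f at the
# expansion point a, computed by expanding the scalar function
# F(t) = f(a + t*(x - a)) at t = 0 and evaluating the result at t = 1.
def taylor_first_order(f, x, a, eps=1e-6):
    F = lambda t: f(a + t * (x - a))
    dF0 = (F(eps) - F(-eps)) / (2 * eps)   # F'(0) via central difference
    return F(0.0) + dF0                     # T(x, a) = F(0) + F'(0) * 1
```

For f(x) = x² expanded at a = 1, the sketch reproduces the analytic tangent value f(a) + f′(a)·(x − a).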
(123) This yields an approximation for the point x.sub.p=0 and x.sub.q=0. The resulting equation can be differentiated with respect to every unknown variable in x.sub.p and x.sub.q. As a result, one obtains the Jacobian matrix of this system. Each column in the Jacobian matrix contains an approximation for each parameter in the system. These approximations may also be expressed as:
(124) TABLE-US-00002 Formula 25 (the columns of the Jacobian, evaluated for one correspondence between views p and q; α.sub.pq denotes the pair angle):
θ.sub.p: −v.sub.p u.sub.p sin(α.sub.pq)−cos(α.sub.pq) v.sub.p.sup.2−cos(α.sub.pq)
θ.sub.q: v.sub.q u.sub.q sin(α.sub.pq)+cos(α.sub.pq) v.sub.q.sup.2+cos(α.sub.pq)
φ.sub.p: sin(α.sub.pq) u.sub.p.sup.2+v.sub.p u.sub.p cos(α.sub.pq)+sin(α.sub.pq)
φ.sub.q: −sin(α.sub.pq) u.sub.q.sup.2−v.sub.q u.sub.q cos(α.sub.pq)−sin(α.sub.pq)
ρ.sub.p: u.sub.p cos(α.sub.pq)−v.sub.p sin(α.sub.pq)
ρ.sub.q: −u.sub.q cos(α.sub.pq)+v.sub.q sin(α.sub.pq)
a.sub.f,p: sin(α.sub.pq) u.sub.p+cos(α.sub.pq) v.sub.p
a.sub.f,q: −sin(α.sub.pq) u.sub.q−cos(α.sub.pq) v.sub.q
p.sub.x,p: sin(α.sub.pq)
p.sub.x,q: −sin(α.sub.pq)
p.sub.y,p: cos(α.sub.pq)
p.sub.y,q: −cos(α.sub.pq)
k.sub.p,1: (sin(α.sub.pq) u.sub.p+cos(α.sub.pq) v.sub.p) {square root over (u.sub.p.sup.2+v.sub.p.sup.2)}
k.sub.q,1: −(sin(α.sub.pq) u.sub.q+cos(α.sub.pq) v.sub.q) {square root over (u.sub.q.sup.2+v.sub.q.sup.2)}
k.sub.p,2: (u.sub.p.sup.2+v.sub.p.sup.2) (sin(α.sub.pq) u.sub.p+cos(α.sub.pq) v.sub.p)
k.sub.q,2: −(u.sub.q.sup.2+v.sub.q.sup.2) (sin(α.sub.pq) u.sub.q+cos(α.sub.pq) v.sub.q)
(125) In this case, k.sub.g,h denotes the parameters modeling the lens distortion of view g, g being either q or p, according to Formulas 10 and 11 and the definition of the vector κ.
(126) The remaining constant part may be formulated as:
b.sub.p,q=(v.sub.p−v.sub.q) cos(α.sub.pq)+sin(α.sub.pq)(u.sub.p−u.sub.q) Formula 26
(127) All elements in J and also the constant part b.sub.p,q can now be evaluated for point correspondences. The resulting system can be re-written as a linear system according to:
(128)
(129) In Formulas 27 & 28, J.sub.p,q is an l×(2·m) matrix, where l denotes the number of point correspondences and m denotes the number of variables in x. b is a column vector of length l.
(130) As the Taylor expansion only approximates the function value at one point, the true parameter vector x.sub.p needs to be replaced by an approximated value {tilde over (x)}.sub.p. In Formula 27, J.sub.p,q and b.sub.p,q denote the Jacobian matrix and constant part of two views (p, q) evaluated for all point correspondences. As before, J.sub.p,q.sup.(p) in Formula 28 denotes the columns of J.sub.p,q belonging to unknowns of camera p.
(131) For the set of all stereo pairs, the overall minimization task may be formulated by the analyzing unit 24 as follows: each pair (p,q) in the set forms one line in the global constraint matrix A that is generated in a step 621 of method 600. For a 2×2 camera array with four individual stereo pairs, the problem can be written as:
(132)
(133) The number of feature matches in each stereo pair may be greater than the number of unknown variables in x, such that the overall system is overdetermined. To solve the linearized system, and thus find a set of parameters approximating the real condition of the plurality of cameras as given by Formula 15, the system can be solved by computing the pseudo-inverse of A. In total, the solution may be obtained as:
Δ=(A.sup.T.Math.A).sup.−1.Math.A.sup.T.Math.b Formula 31
(134) Thus, the analyzing unit may be configured to minimize the error based on the determination rule that corresponds to Formula 31. The analyzing unit may be configured for determining the displacement information using a set of parameters that indicates a real condition of the plurality of cameras. The parameters may comprise a non-linear relationship with respect to each other. The analyzing unit may be configured to determine a linearized version of the parameters in the vector and may be configured to determine the displacement information by minimizing an error of the linearized version with respect to a desired condition of the plurality of cameras. This is indicated as step 622, i.e., the system is solved. Alternatively, the analyzing unit may solve the non-linear problem. At least two elements in x show the approximation for p.sub.x and p.sub.y. As both do not depend on image coordinates and only depend on the angle α.sub.pq, which is constant for a given pair, the linearization does not provide valuable information on these parameters. For this reason, both parameters p.sub.x and p.sub.y are excluded in this step. This works by dropping the values in {tilde over (x)}.sub.p and {tilde over (x)}.sub.q as well as the appropriate columns in J.sub.p,q. As a consequence, deviations in a camera's principal point will be approximated by a small additional rotation around the y axis and x axis, respectively. I.e., the calibration data may be based on angles that describe a rotation of the cameras, on a focal length of the cameras and on a principal point of the cameras, wherein the calibration data may be in absence of a position of the cameras of the plurality of cameras, i.e., it possibly does not contain information indicating the camera positions. This is possible as the camera position is considered a known variable in the system.
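The solve of Formula 31 can be sketched directly; the explicit pseudo-inverse is shown next to np.linalg.lstsq, which is the numerically preferable equivalent for an overdetermined system:

```python
import numpy as np

# Sketch of Formula 31: the overdetermined linearized system A @ x = b is
# solved through the pseudo-inverse (A^T A)^(-1) A^T b. In practice
# np.linalg.lstsq is preferable numerically; both routes are shown.
def solve_pseudo_inverse(A, b):
    return np.linalg.inv(A.T @ A) @ A.T @ b

def solve_lstsq(A, b):
    return np.linalg.lstsq(A, b, rcond=None)[0]
```

Both routes return the same least-squares minimizer whenever A has full column rank.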
(135) The elements in Δ now contain an approximated solution for the overall non-linear problem. At this point, the overall problem can be solved using an iterative optimization algorithm like gradient descent. In every iteration, a small portion β.Math.Δ is added to the previous x.sub.p.sup.(t-1), forming a new x.sub.p.sup.(t), i.e., the solution vector is updated in a step 623. The current x.sub.p.sup.(t) is then used to generate an intermediate mapping homography {tilde over (H)} for each camera. This intermediate homography is used to update all feature sets (p,q). Those updated feature sets {tilde over (m)}.sub.p,q form new initial feature sets in the next iteration, i.e., the feature points are transformed in a step 624. Ideally, x.sub.p.sup.(t) iteratively approaches the desired solution. As soon as the update is smaller than a threshold or the remaining error cannot be reduced anymore, the algorithm stops. I.e., the analyzing unit may be configured to iteratively minimize the error of the linearized version. This is indicated in step 625, checking if convergence is reached. This may also be expressed as:
x.sub.p.sup.(t)=β.Math.Δ+x.sub.p.sup.(t-1)
{tilde over (m)}.sub.p,q.sup.(p)={tilde over (H)}.sub.p.Math.m.sub.p,q.sup.(p)=K(a.sub.f,p,p.sub.x,p,p.sub.y,p).Math.R.sub.i(θ.sub.p,φ.sub.p,ρ.sub.p).Math.m.sub.p,q.sup.(p)
and {tilde over (m)}.sub.p,q.sup.(q)={tilde over (H)}.sub.q.Math.m.sub.p,q.sup.(q)=K(a.sub.f,q,p.sub.x,q,p.sub.y,q).Math.R.sub.i(θ.sub.q,φ.sub.q,ρ.sub.q).Math.m.sub.p,q.sup.(q) Formulas 32 & 33
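The update loop of Formulas 32 & 33 can be sketched generically; solve_step and remap are hypothetical placeholders for the linearized solve and the homography re-mapping of the feature sets, and beta plays the role of the damping portion added per iteration:

```python
import numpy as np

# Sketch of the iteration around Formulas 32 & 33: a damped portion of the
# linearized solution delta is added to the parameter vector, the feature
# points are re-mapped with the intermediate homography, and the loop stops
# once the update falls below a threshold eps0.
def iterate(x0, solve_step, remap, features, beta=0.5, eps0=1e-8, max_iter=100):
    x = x0
    for _ in range(max_iter):
        delta = solve_step(x, features)   # placeholder: linearized solve
        x = x + beta * delta              # x^(t) = beta * delta + x^(t-1)
        features = remap(x, features)     # placeholder: apply intermediate H
        if np.linalg.norm(delta) < eps0:  # convergence check (step 625)
            break
    return x
```

On a toy one-dimensional problem whose residual is simply target − x, the loop converges geometrically to the target.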
(136) Once this iterative minimization algorithm has converged, the second part of this linearized calibration procedure can be started, i.e., block 627 may be entered, e.g., by the determining unit. At the end of block 625, a result according to
(137) The final state after the convergence criterion is met is illustrated in
(138) As formally expressed in Formula 13, corresponding disparity values in three cameras may depend on the array geometry only. Up to this point, this has not been taken into account in block 604. Consequently, it is possible that Formula 13 is not met. Simplified, a distance between columns in pairs may differ or vary. This condition can be re-written in order to formulate a condition similar to Formula 15. For a triplet of three cameras this may be formulated as:
(139)
(140) H denotes the mapping homography that has been obtained in the last iteration of the previous steps, i.e., block 625. H() may be defined as given in Formula 17 or 17b with respect to a parameter vector x. The function u() extracts the horizontal component of an image point. d(C.sub.p,C.sub.q) denotes the Euclidean distance between two cameras C.sub.p and C.sub.q. Using {tilde over (m)} instead of m, it may be assumed that the optical axes are parallel and that all cameras have identical focal lengths. The remaining error can be modeled by a shift in the cameras' principal points. Formula 34 can now be used to define a second minimization problem, which is subject to minimization and formulated as follows:
(141)
(142) Formula 36 refers to the general case when the cameras are not in a calibrated state. The analyzing unit may be configured to solve this problem so as to determine the displacement information. Alternatively or in addition, the analyzing unit may be configured to solve a linearized version thereof that may be expressed by
(143)
(144) The function {tilde over (H)}({circumflex over (x)}.sub.p,C.sub.p,C.sub.q) may be defined as:
{tilde over (H)}({circumflex over (x)}.sub.p,C.sub.p,C.sub.q)=R.sub.o(0,0,α.sub.p,q).Math.K.sub.i(0,{circumflex over (p)}.sub.x,p,{circumflex over (p)}.sub.y,p)
with {circumflex over (x)}:=[{circumflex over (p)}.sub.x,{circumflex over (p)}.sub.y] Formula 38
(145) The function given in Formula 38 models a transformation that performs a shift in a camera's principal point and, as before, rotates two images such that they may form a stereo pair. Thus, the remaining error as given in Formulas 36 and 37 may be minimized by comparing two individual stereo pairs and finding optimum values for the principal point shifts {circumflex over (p)}.
(146) As an additional condition, Formulas 36 and 37 may be minimized with respect to the minimization criterion given in Formula 15. Consistent with the definition of the sets of feature point pairs m.sub.p,q given before, m.sub.p,q,r defines a set of triple matching feature points. All triplets (p,q,r) of cameras sharing overlapping image areas are contained in the set.
(147) Formulas 36 and 37 may be evaluated for various camera configurations with the requirements given above. Depending on the camera configuration and the resulting image overlap, the set of equations that is obtained will have a different shape. The resulting system may be solvable exactly with linear methods, or it may be over- or under-determined.
(148) In matrix notation, problem 37 may be written as follows: For one triplet of cameras (p,q,r) this may be formulated as:
(149)
(150) In matrix A, the first line arises from Formula 36 evaluated for a triplet (p,q,r). Lines 2 to 4 arise from the additional requirement as given by Formula 15. As the transformed points {tilde over (m)} are used here, Formula 15 simplifies and only R.sub.o (or the elements comprised in R.sub.o) remains.
(151) For a camera pattern comprising more than 3 cameras, and thus more than one camera triplet in the set, each triplet forms one line in a global matrix A.sub.G.
(152)
(153) In Formula 42, the set comprises four triplets denoted as pqr, prs, qrs and prt. A solution for the equation system may then be found as:
[{circumflex over (x)}.sub.p {circumflex over (x)}.sub.q {circumflex over (x)}.sub.r . . . ].sup.T=(A.sub.G.sup.T.Math.A.sub.G).sup.−1.Math.A.sub.G.sup.T.Math.b.sub.G Formula 43
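The stacking of triplet blocks into the global system of Formula 43 can be sketched as follows; the block contents are illustrative only, standing in for the constraint rows each triplet contributes:

```python
import numpy as np

# Sketch of Formulas 42 & 43: every camera triplet contributes a block row to
# a global matrix A_G and right-hand side b_G; the stacked system is solved in
# one least-squares step, spreading the residual error over all cameras.
def solve_global(triplet_blocks):
    """triplet_blocks: list of (A_t, b_t) pairs, one per triplet in the set."""
    A_G = np.vstack([A_t for A_t, _ in triplet_blocks])
    b_G = np.concatenate([b_t for _, b_t in triplet_blocks])
    x, *_ = np.linalg.lstsq(A_G, b_G, rcond=None)
    return x
```

Because all triplets are solved jointly, no single camera pair dominates the solution, matching the error-spreading property described in the text.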
(154) A critical setup needs to be discussed for the case when only three cameras or images are available. The presented strategy can easily be extended to such a scenario. Evaluating the sum of Formula 36 for the camera triplet (C.sub.1, C.sub.2, C.sub.3) as given in
(155)
(156) Evaluating Formula 15 for the pairs (C.sub.1, C.sub.2), (C.sub.1, C.sub.3) and (C.sub.2, C.sub.3) yields three additional conditions:
v.sub.1+{circumflex over (p)}.sub.y,1−v.sub.2−{circumflex over (p)}.sub.y,2=0
−u.sub.1−{circumflex over (p)}.sub.x,1+u.sub.3+{circumflex over (p)}.sub.x,3=0
u.sub.2+v.sub.2+{circumflex over (p)}.sub.x,2+{circumflex over (p)}.sub.y,2−u.sub.3−v.sub.3−{circumflex over (p)}.sub.x,3−{circumflex over (p)}.sub.y,3=0 Formula 45
(157) Rewritten in matrix notation, this becomes:
(158)
(159) Formulas 44, 45 and 46 may be evaluated for one triplet-point correspondence m.sub.1,2,3={u.sub.1,v.sub.1,u.sub.2,v.sub.2,u.sub.3,v.sub.3}. The resulting system has six unknown variables but only four equations. It is therefore under-determined. Additionally, the matrix A does not depend on the values of u and v. This means that by adding additional triplet-point correspondences the matrix A cannot obtain full row rank: any line added to A will only be a duplicate of an already existing line. Therefore, this system cannot be solved without further assumptions or without reducing the number of unknown variables. One possibility to reduce the number of unknowns is to define C.sub.1 as the reference camera with {circumflex over (p)}.sub.x,1={circumflex over (p)}.sub.y,1=0. This allows removal of the first and second column of matrix A. The remaining linear system can then easily be solved using linear methods. This is shown by:
(160)
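The reference-camera trick can be sketched as column removal before the least-squares solve; the matrix values in the example are illustrative only, not taken from the patent:

```python
import numpy as np

# Sketch of the reference-camera trick: the three-camera system is
# under-determined, so camera C_1 is fixed as reference (its principal-point
# offsets set to zero), which amounts to deleting the corresponding columns
# of A before solving and re-inserting zeros afterwards.
def solve_with_reference(A, b, ref_cols=(0, 1)):
    keep = [j for j in range(A.shape[1]) if j not in ref_cols]
    A_red = A[:, keep]                                 # drop reference columns
    x_red, *_ = np.linalg.lstsq(A_red, b, rcond=None)  # solve reduced system
    x = np.zeros(A.shape[1])
    x[keep] = x_red                                    # reference entries stay 0
    return x
```

The returned vector carries zeros at the reference-camera positions, reflecting the choice {circumflex over (p)}.sub.x,1={circumflex over (p)}.sub.y,1=0.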
(161) This strategy can be extended towards a general planar camera array. Depending on the field of view of each camera, their position and orientation, there may exist camera combinations for which no triplet matches can be formed. It is, in general, therefore not possible to predict how many triplets and pairs can be formed. The number of linearly independent equations one obtains to solve for each principal point is unknown. This needs to be taken into account when setting up an array.
(162) Alternatively, the linearization of Formula 36 may also be expressed as:
(163)
(164) Wherein the argument of Formula 36 is approximated as
(165)
(166) Alternatively, the analyzing unit may be configured to determine the displacement information, i.e., to minimize the error, based on a non-linear concept. This may lead to a slight increase in computational effort but may in return allow for the usage of commonly available implementations, such as in software libraries or the like. In the case of non-linear optimization, both problems as given in Formula 15 and Formula 36 are subject to minimization with respect to the vectors x.
(167) Based on {circumflex over (x)}.sub.p, x.sub.p, C.sub.p and C.sub.q, and as part of blocks 612 and/or 616, the total calibrating function H, being referred to as a homography matrix, can be computed as:
(168)
(169) H() may thus be a function providing a homography matrix undistorting points according to a parameter vector x.sub.p and camera positions C.sub.p and C.sub.q.
(170) Formula 48 relates to the definition of H as given in Formula 17. Alternatively, the definition as given in Formula 17b may be used.
(171) One can see that the right part is constant for a camera p while the left part depends on the relative position of both cameras. An application of the calibration data, i.e., the homography in step 612, allows for obtaining rectified, i.e., calibrated, images in a step 614. In the special scenario of rectangular arrays, as used as an example herein, this can be exploited to increase performance. In a rectangular system, matrix R.sub.0 is either the identity matrix or a rotation by 90°. It is therefore beneficial to apply the constant part once per image and execute the second part on demand. Within a computer system, a rotation by 90° can easily be executed by swapping rows and columns of an image without further computations. In other words,
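The row/column-swap remark can be illustrated directly; the counter-clockwise convention and the equivalence to np.rot90 are our assumptions for the sketch:

```python
import numpy as np

# Illustration of the remark above: a rotation by 90 degrees needs no
# arithmetic on pixel values; transposing the pixel array and reversing the
# row order is enough (equivalent to np.rot90), so the pair-dependent part
# of H can be applied on demand while the constant part is applied once.
def rotate90(image):
    # 90 degrees counter-clockwise: swap rows/columns, then reverse row order
    return image.T[::-1]
```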
(172) For the following calibration procedure, it may be assumed that all cameras in the array are mechanically mounted on a strict, static grid. Therefore, it may be assumed that the system has a good pre-calibration. This mechanical setup is, in general, not aligned at pixel level and comprises slight deviations that remain in the images. As illustrated in
(173)
(174) wherein f denotes a normalized focal length and may be set to 1. a_f denotes a small deviation of the focal length. Furthermore, the focal length may be considered constant in horizontal and vertical direction, which means that square pixels are present, and a_f << f = 1. For the camera's principal point, one can further assume small deviations Δx ≈ 0 and Δy ≈ 0. The orientation is modeled using a standard rotation matrix R(α, β, γ), with α, β and γ denoting tilt, yaw and roll, respectively. In this case, it may be assumed that |α|, |β|, |γ| << 1, wherein all angles may be given in radians. At the end of the calibration, the deviations from the ideal model are known.
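The small-deviation camera model described above can be sketched as follows. The axis order of the rotation and the parameter names (`a_f`, `dx`, `dy`) are assumptions chosen for illustration, since the text does not fix a convention:

```python
import numpy as np

def rotation(tilt, yaw, roll):
    """Standard rotation matrix R(alpha, beta, gamma); angles in radians,
    assumed small (|alpha|, |beta|, |gamma| << 1). Axis order Rz @ Ry @ Rx
    is one common convention, not mandated by the text."""
    ca, sa = np.cos(tilt), np.sin(tilt)
    cb, sb = np.cos(yaw), np.sin(yaw)
    cg, sg = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    Ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    Rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def camera_matrix(a_f=0.0, dx=0.0, dy=0.0):
    """Intrinsic matrix with normalized focal length f = 1 + a_f (a_f << 1),
    square pixels, and a principal point deviating by (dx, dy)."""
    f = 1.0 + a_f
    return np.array([[f, 0.0, dx], [0.0, f, dy], [0.0, 0.0, 1.0]])
```

At the end of the calibration, the parameters a_f, dx, dy and the three angles hold the deviations from the ideal model.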
(175) Details of above embodiments may relate to a 2D camera pattern. When determining calibration data for a 3D camera pattern, the same or at least related considerations may be taken into account.
(176) In a general case, when cameras do not reside on a common plane, i.e., do not form a 2D pattern, Formula 13 cannot be applied and needs to be replaced by a more general formulation that fits a 3D pattern. This formulation is known as the Trifocal Tensor and is presented in [2]. The basic equation is expressed in Formula 50. The Formula states that, given three corresponding image points m_p, m_q and m_r visible in three cameras, the function results in a 3×3 zero matrix. This also entails that camera properties like position, orientation and intrinsics are known precisely.
(177)
(178) Formula 50 may be expanded as:
[m_q]_× · (u_p · T_(p,q,r),1 + v_p · T_(p,q,r),2 + T_(p,q,r),3) · [m_r]_× = 0_(3×3)   Formula 51
(179) In Formula 50 and Formula 51, [a]_× denotes a skew-symmetric matrix corresponding to some vector a as:
(180)
(181) The three elements T_(p,q,r),1, T_(p,q,r),2 and T_(p,q,r),3 denote a representation of the Trifocal Tensor. Each element T_(p,q,r),i may be represented as a 3×3 matrix, and its entries may be computed (according to [2]) from the three projection matrices P_p, P_q and P_r. Based on the ideal layout and ideal extrinsic and intrinsic camera parameters, it may be assumed that all projection matrices and consequently all Tensor elements may be computed.
(182) Due to mechanical misalignment, measured points in images p, q and r will not obey this ideal setup, and consequently computing Formula 51 for some triplet of corresponding points and idealized Tensor elements may result in values different from a zero matrix.
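A minimal numeric sketch of this constraint, assuming the canonical form P_p = [I | 0] and the tensor construction from the multiple-view-geometry literature cited as [2]; the helper names are hypothetical:

```python
import numpy as np

def skew(a):
    """Skew-symmetric matrix [a]_x such that skew(a) @ b == cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def trifocal_tensor(Pq, Pr):
    """Tensor elements T_1..T_3 for a canonical triple with P_p = [I | 0]:
    T_i = a_i b_4^T - a_4 b_i^T, where a_i, b_i are columns of P_q, P_r
    (construction per [2])."""
    return [np.outer(Pq[:, i], Pr[:, 3]) - np.outer(Pq[:, 3], Pr[:, i])
            for i in range(3)]

def residual(mp, mq, mr, T):
    """Formula 51: [m_q]_x (u_p T_1 + v_p T_2 + T_3) [m_r]_x.
    Zero 3x3 matrix for exactly corresponding points under exact cameras;
    the expression is homogeneous in each point, so unnormalized
    projections P @ X may be used directly."""
    M = mp[0] * T[0] + mp[1] * T[1] + mp[2] * T[2]
    return skew(mq) @ M @ skew(mr)
```

Perturbing any of the measured points, as mechanical misalignment would, makes the residual deviate from the zero matrix, which is the quantity the calibration minimizes.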
(183) As in the planar system, Formula 51 may be modified as follows:
[m_q^#]_× · (u_p^# · T_(p,q,r),1 + v_p^# · T_(p,q,r),2 + T_(p,q,r),3) · [m_r^#]_× = 0_(3×3)   Formula 53
g(x_p, x_q, x_r) := [H̃(x_q) · m_q]_× · (u_p^# · T_(p,q,r),1 + v_p^# · T_(p,q,r),2 + T_(p,q,r),3) · [H̃(x_r) · m_r]_×   Formula 54
wherein
m^# = [u^# v^# 1]^T = H̃(x_p) · m   Formula 55
(184) A calibrated image point m_p^# may be obtained by transforming an uncalibrated image point m_p by a mapping homography as m_p^# = H̃(x_p) · m_p. H̃(x_p) may again be constructed from a camera matrix K and a rotation matrix R as in Formula 17 or 17b, with parameters contained in a parameter vector x.
(185) As before, an additional element to compensate for lens-distortion may be integrated in order to obtain calibrated image points m.sub.p.sup.#.
m_p^#(x_p) := H̃(x_p) · undist(m_p, x_p)   Formula 56
(186) Formula 54 may therefore also be represented as a function of x_p, x_q and x_r:
g(x_p, x_q, x_r) := [m_q^#]_× · (u_p^# · T_(p,q,r),1 + v_p^# · T_(p,q,r),2 + T_(p,q,r),3) · [m_r^#]_×   Formula 57
(187) For a general set of cameras, the apparatus may be configured to solve two optimization problems simultaneously, wherein one of the optimization problems may be subject to be minimized according to:
(188)
(189) In Formula 53,
(190) Again, the apparatus may be configured to execute a non-linear optimization method or a linearized optimization method, identical to what was presented for the planar case as depicted in
(191) For the linearization, a local sub-problem consisting of three cameras p, q and r is evaluated for each element of Formula 57. This yields nine functions of x_p, x_q and x_r. Each of these functions can be approximated using a Taylor expansion as defined in Formulae 22 and 23. Formally, this can be expressed as:
g_i,j(x_p, x_q, x_r) ≈ Tg_i,j(x̃, a)
x̃ = [x_p, x_q, x_r]^T   Formula 59 & 60
(192) The approximated elements g_i,j may be composed to a vector and subsequently re-written in matrix notation, yielding a Jacobian matrix J and a column vector b. As a result, a linearized system is obtained with unknowns in x̃. As before, a is a zero-vector with the same length as x̃.
0 = [Tg_1,1(x̃, a), Tg_1,2(x̃, a), . . . , Tg_3,3(x̃, a)]^T
0 = J · x̃ + b   Formula 61 & 62
(193) Tg_i,j(x̃, a) denotes the Taylor approximation of a function g and should not be confused with the Tensor elements in Formula 57. The indices i, j in g denote an entry in the matrix returned by the function g (compare also Formulae 53 and 54). However, it should be clear that the Tensor elements appear in Formulae 58 to 62 and further on.
(194) Alternatively, the analyzing unit may be configured to linearly minimize the error. In case of a linearized optimization method as shown in
J_p,q,r = [J_p,q,r^(p)  J_p,q,r^(q)  J_p,q,r^(r)]   Formula 63
(195) In the same fashion as presented in Formula 29, the linearized global problem can be constructed from a series of local, linearized problems. Therefore, each camera triplet contained in
(196)
(197) In a gradient-descent-like manner, the overall problem may be solved by solving A_G for a set of approximated vectors x̃ and using a portion of these values to update the true solution vector x, as stated in Formula 32. This true solution vector x may subsequently be used to update the initial points m. With those updated points m̃, the matrix A_G may be recomputed. The process continues until convergence is reached.
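The iteration just described (linearize, solve, apply only a portion of the update, recompute) is essentially a damped Gauss-Newton scheme. A generic sketch under that assumption, with a numeric forward-difference Jacobian standing in for the analytic Taylor terms of Formulae 61/62:

```python
import numpy as np

def damped_gauss_newton(res, x0, step=0.5, tol=1e-10, max_iter=200):
    """Repeatedly linearize 0 ~ J @ dx + b around the current x, solve the
    linear system in the least-squares sense, and apply only a portion
    (step) of the update, until the update becomes negligible."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        b = res(x)
        eps = 1e-7  # forward-difference Jacobian of the residual vector
        J = np.stack([(res(x + eps * e) - b) / eps for e in np.eye(x.size)],
                     axis=1)
        dx = np.linalg.lstsq(J, -b, rcond=None)[0]
        x = x + step * dx  # use only a portion of the computed update
        if np.linalg.norm(dx) < tol:
            break
    return x
```

The damping factor plays the role of "using a portion of these values": it trades convergence speed for robustness when the linearization is only locally valid.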
(198) In other words, the analyzing unit may be configured to solve a first and a second optimization criteria for determining the displacement information, both for the case of a 2D camera pattern and for the case of a 3D camera pattern, wherein the criteria may differ between the 2D pattern and the 3D pattern.
(199) For the 2D pattern, a possibly non-linear first minimization criteria, may be based on the determination rule
(200)
(201) as described above. The possibly non-linear second minimization criteria may be based on the determination rule
(202)
(203) as described above. For the 3D pattern, the possibly non-linear first minimization criteria may be based on the determination rule
(204)
(205) and the possibly non-linear second minimization criteria may be based on the determination rule
(206)
(207) wherein argmin denotes the minimization criteria, C_p and C_q denote camera positions of cameras p and q, d(C_p, C_q) is a distance between the camera positions, H is a homography matrix, x_p, x_q and x_r are vectors with parameters to be determined so as to minimize the error for cameras p, q and r, m_q, m_r and m_p,q refer to a respective minimization result of a previous minimization iteration, the respective sets denote the images used for minimization, N is a number of parameters, v is an image direction, u^# and v^# denote a calibrated position of an image point along directions u and v obtained as m_p^# = H_p · m_p, and T_i represents element i of a Trifocal Tensor. The explanation of the global optimization problem is represented in Formula 58, wherein Formula 58 may be built from (local) sub-problems given by Formula 54.
(208) Alternatively or in addition, the analyzing unit may be configured to solve linearized minimization criteria, i.e., to use a linearized version of the above mentioned criteria. The analyzing unit may thus be configured to determine the displacement information minimizing an error of a linearized first minimization criteria. Accordingly, the determining unit may be configured to determine the offset information minimizing an error of a linearized second minimization criteria. For the 2D camera pattern the first linearized minimization criteria may be based on the determination rule
(209)
(210) wherein this approximation rule relates to the non-linear minimization problem as:
(211)
(212) and for the 2D camera pattern the linearized second minimization criteria may be based on the determination rule
(213)
(214) wherein this approximation rule relates to the non-linear minimization problem as:
(215)
(216) A solution may be determined, for example as:
(217)
(218) For the 3D camera pattern the linearized first minimization criteria may be based on the approximation rule
(219)
(220) wherein this approximation rule relates to the non-linear minimization problem as:
(221)
(222) wherein for the 3D camera pattern the linearized second minimization criteria is based on the approximation rule
(223)
(224) wherein this approximation rule relates to the non-linear minimization problem as:
(225)
(226) A solution may be determined, for example as:
(227)
(228) Elements in J_p,q,r and b_p,q,r are formed from the Taylor approximation of g(x_p, x_q, x_r)^2 according to Formula 58 to Formula 62, wherein argmin denotes the minimization criteria, C_p and C_q denote camera positions of cameras p and q, d(C_p, C_q) is a distance between the camera positions, H̃ is a homography matrix, x̃_p and x̃_q are vectors with linearized parameters to be determined so as to minimize the error for cameras p and q with respect to a constant b_q,p, J denotes a Jacobian matrix, and T represents a Trifocal Tensor. Subscripts p, q, r, s and t denote indices referring to a specific camera.
(229) Thus, the apparatus may be configured to perform optimization linearly and/or non-linearly, wherein the selection is up to the implementation. Non-linear methods may allow for using a standard library, but may result in a higher computational effort.
(230) The first optimization problem or criteria may be solved by the analyzing unit so as to pre-align the images. The second optimization problem or criteria may be solved by the determining unit so as to reduce or compensate for remaining differences in disparities.
(231) According to an embodiment, only a coarse setup of the camera system is known to the algorithm, and the exact extent of overlapping image regions is approximated or estimated. The analyzing unit is therefore configured to automatically detect corresponding points in all possible pairs and triplets of images.
(232) A minimum requirement for proper execution according to an example is shown in
(233)
(234) Edges E_ij between nodes N_i and N_j of the graph, and the corresponding labeling, indicate cameras that are incorporated in both sets of triplet matches. For the sake of simplicity, E_ij = E_ji. For example, cameras 1 and 2 are both incorporated in the triplet set {1, 2, 3} corresponding to node N_1 and in the set {1, 2, 4} corresponding to node N_2. Thus, cameras C_1 and C_2 have a (first) overlap containing a (first) correspondence point/object with camera C_3, and a second overlap containing a (second) correspondence point/object with camera C_4. Thereby, a correlation between camera C_3 and camera C_4 may be determined by the analyzing unit even if cameras C_3 and C_4 do not have a common overlap.
(235) It is possible, but not necessary and even unlikely, that a common corresponding point is found in each pair of images. However, the embodiments described herein are not limited to a corresponding point being common to each of the cameras or their pictures. It is sufficient that each camera considered in the minimization problem is comprised in at least one pair of cameras contained in the set and/or in at least one triplet of cameras contained in the set. That is, it may be sufficient that for each pair and/or triplet a possibly individual corresponding point can be identified.
(236) The analyzing unit may be configured for determining the triplets N_i from the pictures taken with the cameras C_i by identifying correspondence points in the respective pictures. The analyzing unit may be configured for determining the triplets and/or the graph 60, or a representation thereof, such that the graph 60 is fully connected.
(237) As shown in
(238) In contrast
(239) As can be seen, the node N_8 {6,8,9} is not connected to any other node, although all cameras (C_1 . . . C_9) are represented in the graph. In this case, the graph is not fully connected. Hence, it is not guaranteed that the condition as specified in Formula 36 or Formula 58 is met.
(240) The feature-detection and matching step in the analyzing unit is implemented to build up a fully connected graph, or a data representation thereof, as shown in
(241) A similar condition may also have to be met for pairs of cameras. However, any set comprising three cameras can be seen as three individual pairs of cameras. Therefore, any camera that is incorporated in a triplet set is also incorporated in at least two pairs of correspondence points. Thus, the explanation given relates directly to 2D and to 3D camera patterns.
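The fully-connected-graph condition can be checked with a small sketch. The edge criterion (two triplets sharing at least one camera pair) follows the N_1/N_2 example above; the helper name is hypothetical:

```python
from itertools import combinations

def triplet_graph_is_connected(triplets):
    """Nodes are camera triplets; an edge E_ij exists when triplets N_i and
    N_j share at least two cameras, i.e., a camera pair. Returns True when
    the graph is fully connected (every node reachable from every other)."""
    nodes = [frozenset(t) for t in triplets]
    adj = {i: set() for i in range(len(nodes))}
    for i, j in combinations(range(len(nodes)), 2):
        if len(nodes[i] & nodes[j]) >= 2:  # shared camera pair -> edge E_ij
            adj[i].add(j)
            adj[j].add(i)
    seen, stack = {0}, [0]  # depth-first traversal from the first node
    while stack:
        for k in adj[stack.pop()]:
            if k not in seen:
                seen.add(k)
                stack.append(k)
    return len(seen) == len(nodes)
```

An isolated triplet such as {6, 8, 9} in the example makes the check fail, signalling that the condition of Formula 36 or Formula 58 is not guaranteed.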
(242) The example given in
(243) The analyzing unit may determine the reference points so as to be contained in pairs and/or triplets of images.
(244) According to embodiments, the analyzing unit selects or determines reference/correspondence points in overlapping image regions of the pairs and/or triplets. It is possible that not every possible pair of images overlaps. Further, the analyzing unit may be configured for searching or identifying at least one or up to a specific number of reference points, e.g., 2, 3, 5, 10 or more. When the specific number is reached, the analyzing unit may be configured for stopping the search, thereby determining fewer than the possible number of reference points in the overlap area. This allows saving time and computational effort. Further, it may be sufficient to determine a specific reference point only in the images of the cameras of a pair or triplet, even if the same reference point is also seen in further images.
(245) When compared to known concepts, it is therefore neither needed to have a known reference object, nor to have a specific object contained in every image. It is of advantage to search for corresponding content/points in pairs and/or triplets of images since, e.g., for spherical or cylindrical arrays, it becomes increasingly unlikely that an object is imaged by more than three cameras when increasing the number of cameras of the array.
(246)
(247)
(248) The camera system 80 may further comprise an apparatus 10 that is configured to provide for the calibration information 48.
(249) Some explanations given herein relate to mathematical expressions of actions implemented by an apparatus and/or a method. It is to be pointed out that such expressions are used for explanatory reasons only so as to clearly define what an apparatus according to the respective embodiment does. Therefore, the embodiments do not relate to the mathematical expressions but to the actions implemented by the embodiments.
(250) Some embodiments are described in connection with a 2D camera pattern, whereas other embodiments are described in connection with 3D camera patterns. Further, some embodiments are described in connection with solving a possibly non-linear optimization, whereas other embodiments are described in connection with solving linearized versions thereof. Although being described as separate solutions, further embodiments may also provide for combinations. E.g., an analyzing unit may receive information on whether the camera pattern is of 2D type or 3D type and may perform different actions responsive to said information. Alternatively, the analyzing unit may be blind to said pattern information, perform both the 2D determination and the 3D determination, and take the result of higher quality. Further, the analyzing unit may perform a 3D optimization and take this result even if the pattern is a 2D pattern, because the result is also valid.
(251) Accordingly, the analyzing unit may implement one or both of the non-linear and linear approach, e.g., by first solving the linearized formulation and in case of having insufficient results solving the non-linear formulation.
(252) In the following, additional embodiments and aspects of the invention will be described which can be used individually or in combination with any of the features and functionalities and details described herein. 1. Apparatus (10; 70) comprising: a first interface (12) for receiving a plurality of partially overlapping images (14; 14a-i) of an object (16) from a corresponding plurality of camera positions (18a-i) being arranged along a first and a second direction according to a camera pattern (30; 40; 40); an analyzing unit (24) configured for selecting at least one corresponding reference point (26.sub.1-26.sub.4) in an overlap area (28) of a set of overlapping images (14; 14a-i), and for determining a displacement information (34) along the first and the second direction (x, y) of the reference point (26.sub.1-26.sub.4) in each of the other images of the set of images, wherein a misalignment of the plurality of images (14; 14a-i) along the first and the second direction (x, y) is compensated by the displacement information (34) so as to obtain aligned images; a determining unit configured for determining an offset information (38) between principal points (42a-c) at the plurality of camera positions (18a-i) using at least three aligned images; and a second interface (46) for providing calibration data (46) based on the displacement information (34) and based on the offset information (38), the calibration data allowing for calibrating the plurality of images so as to comply to the camera pattern. 2. 
The apparatus according to aspect 1, wherein the analyzing unit (24) is configured for determining the displacement information (34) using a set of parameters indicating a real condition of the camera pattern (30; 40; 40), the parameters comprising a non-linear relationship with respect to each other, wherein the analyzing unit (24) is configured to use a linearized version of the set of parameters and to determine the displacement information (34) by minimizing an error of the linearized version with respect to a desired condition of the camera pattern (30; 40; 40). 3. The apparatus according to aspect 1 or 2, wherein the analyzing unit is configured to determine the displacement information minimizing an error of a first minimization criteria, wherein for a 2D camera pattern the first minimization criteria is based on the determination rule;
(253)
(254)
(255)
(256)
(257)
(258)
(259)
(260)
(261) Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
(262) Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed.
(263) Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
(264) Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may for example be stored on a machine readable carrier.
(265) Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
(266) In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
(267) A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.
(268) A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.
(269) A further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
(270) A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
(271) In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are performed by any hardware apparatus.
(272) While this invention has been described in terms of several advantageous embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations, and equivalents as fall within the true spirit and scope of the present invention.
LITERATURE
(273) [1] Zilly F., Riechert C., Müller M., Waizenegger W., Kauff P., Determination of rectifying homographies for a camera array, EP 2917895 A1, 2015
[2] Hartley R., Zisserman A., Multiple View Geometry, Cambridge University Press, 2003
[3] Wilburn B., Joshi N., Vaish V., Talvala E., Antunez E., Barth A., Adams A., Levoy M., Horowitz M., High Performance Imaging Using Large Camera Arrays, Proc. of ACM SIGGRAPH 2005, Vol. 24, No. 3, pp. 765-776, 2005
[4] Zhang Z., A flexible new technique for camera calibration, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11), pp. 1330-1334
[5] Aerts M., Tytgat D., Macq J., Lievens S., Method and arrangement for multi-camera calibration, EP 2 375 376 B1, 2013
[6] Zilly F., Method for the automated analysis, control and correction of stereoscopic distortions and parameters for 3D-TV applications, TU Berlin, 2015, http://dx.doi.org/10.14279/depositonce-4618
[7] Zilly F., Riechert C., Müller M., Waizenegger W., Sikora T., Kauff P., Multi-camera rectification using linearized trifocal tensor, 21st International Conference on Pattern Recognition (ICPR), 2012
[8] Kurillo G., Baker H., Li Z., Bajcsy R., Geometric and color calibration of multiview panoramic cameras for life-size 3D immersive video, International Conference on 3D Vision (3DV 2013), pp. 374-381, IEEE, 2013
[9] Li Z., Baker H., Kurillo G., Bajcsy R., Projective epipolar rectification for a linear multi-imager array, 3DPVT, 2010