DISPARITY IMAGE STITCHING AND VISUALIZATION METHOD BASED ON MULTIPLE PAIRS OF BINOCULAR CAMERAS

Abstract

The present invention discloses a disparity image stitching and visualization method based on multiple pairs of binocular cameras. A calibration algorithm is used to solve the positional relationship between binocular cameras, and the prior information is used to solve a homography matrix between images; internal parameters and external parameters of the cameras are used to perform camera coordinate system transformation of depth images; the graph cut algorithm has high time complexity and depends on the number of nodes in a graph; the present invention divides the images into layers, and solutions are obtained layer by layer and iterated; then the homography matrix is used to perform image coordinate system transformation of the depth images, and a stitching seam is synthesized to realize seamless panoramic depth image stitching; and finally, depth information of a disparity image is superimposed on a visible light image.

Claims

1. A disparity image stitching and visualization method based on multiple pairs of binocular cameras, comprising the following steps: step 1): calibrating internal parameters and external parameters of each binocular camera; the internal parameters K include a focal length focus and optical center coordinates C.sub.x, C.sub.y; the external parameters include a rotation matrix R and a translation vector T; obtaining a baseline length baseline of each binocular camera by calibration; and obtaining visible light images and disparity images of two binocular cameras; step 2): calculating homography matrix: calculating a homography matrix H according to the internal parameters and external parameters of each binocular camera, the placement angle between the cameras, and a scene plane distance d; and the value range of d is 8-15 m; step 3): using the internal parameters of the binocular cameras and the external parameters between the binocular cameras obtained in step 1) and step 2) to perform camera coordinate system transformation of the disparity images; step 4): building overlapping area model: using the homography matrix H between images obtained in step 2) to calculate an overlapping area ROI of images, and modeling the overlapping area; step 5): dividing each image into blocks with a size of B.sub.1*B.sub.2, taking the divided blocks as nodes of a graph, performing graph cut to find a local optimal solution, then continuing to divide each node corresponding to an optimal stitching line corresponding to B.sub.1*B.sub.2 until a final block size is equal to a pixel value, and finally and approximately finding a global optimal solution by finding the local optimal solution each time; step 6): using the homography matrix H to perform image coordinate system transformation of the disparity images; seamlessly stitching the optimal stitching line in step 5); when the number of binocular cameras is greater than two, repeating steps 3)-6) to obtain disparity images with a larger field angle; and step 7): adding the stitched disparity information to the visible light images.

2. The disparity image stitching and visualization method based on multiple pairs of binocular cameras according to claim 1, wherein the specific steps of calculating homography matrix described in step 2) are as follows: 2-1) imaging a plane in a scene by two binocular cameras, and assuming that the unit normal vector of the plane in the coordinate system of the first binocular camera is N, and the distance from the plane to the center of the first binocular camera (i.e., the scene plane distance) is d, then the plane π is expressed as:
N.sup.T=C.sub.1=d (1) wherein C.sub.1 is the three-dimensional coordinate of a three-dimensional point P in the coordinate system of the first binocular camera, and the coordinate of the three-dimensional point P in the coordinate system of the second binocular camera is C.sub.2, then the relationship between C.sub.1 and C.sub.2 is:
C.sub.2=RC.sub.1+T (2) formula (2) is further expressed as: $\begin{matrix} C_{2} = {RC}_{1} + T \frac{1}{d} N^{T} C_{1} = (R + T \frac{1}{d} N^{T}) C_{1} = H^{'} C_{1} & (3) \end{matrix}$ wherein R and T are respectively a rotation vector and a translation vector from the first binocular camera to the second binocular camera; 2-2) transforming C.sub.1 and C.sub.2 in step 2-1) from the internal parameters of the cameras into the image coordinate system:
c.sub.1=K.sub.1C.sub.1 (4)
c.sub.2=K.sub.2C.sub.2 (5) it can be obtained from formulas (3), (4) and (5) that: $\begin{matrix} c_{2} = K_{1} (R + T \frac{1}{d} N^{T}) {K_{2}}^{- 1} c_{1} = K_{1} H^{'} {K_{2}}^{- 1} c_{1} & (6) \end{matrix}$ finally, a calculation formula of the homography matrix calculated by the internal parameters and the external parameters is obtained: $\begin{matrix} H = K_{1} H^{'} {K_{2}}^{- 1} = K_{1} (R + T \frac{1}{d} N^{T}) {K_{2}}^{- 1} & (7) \\ H = [\begin{matrix} a_{1 1} & a_{1 2} & a_{1 3} \\ a_{2 1} & a_{2 2} & a_{2 3} \\ a_{31} & a_{3 2} & a_{3 3} \end{matrix}] & (8) \end{matrix}$ wherein c.sub.1 is a corresponding coordinate of C.sub.1 in the coordinate system of an imaging plane, and c.sub.2 is a corresponding coordinate of C.sub.2 in the coordinate system of the imaging plane; K.sub.1 is the internal parameters of the first binocular camera; K.sub.2 is the internal parameters of the second binocular camera; and the finally obtained transformation matrix H is a 3*3 matrix, and a.sub.11-a.sub.33 represent specific values.

3. A disparity image stitching and visualization method based on multiple pairs of binocular cameras according to claim 1, wherein the specific steps of step 3) are as follows: 3-1) using the internal parameters K.sub.1 (i.e., the baseline length baseline.sub.1 and the focal length focus.sub.1) of the first binocular camera to restore the disparity images to a point cloud in the coordinate system of the first binocular camera, and the calculation formulas of the three-dimensional coordinates C.sub.1 (X.sub.1, Y.sub.1, Z.sub.1) of the point cloud are as follows: $\begin{matrix} Z_{1} = \frac{{baseline}_{1} * {focus}_{1}}{{disparity}_{1}} & (9) \\ X_{1} = \frac{(x_{1} - C_{x}) * {baseline}_{1}}{{disparity}_{1}} & (10) \\ Y_{1} = \frac{(y_{1} - C_{y}) * {focus}_{1}}{{disparity}_{1}} & (11) \end{matrix}$ wherein x.sub.1 and y.sub.1 are the pixel coordinates of the first binocular camera; disparity is a disparity value; 3-2) using the external parameters R and T from the first binocular camera to the second binocular camera to transform the camera coordinate system of the point cloud and obtain the three-dimensional coordinates of the point cloud in the coordinate system of the second binocular camera; and the coordinate transformation formula is as follows: $\begin{matrix} (\begin{matrix} X_{2} \\ Y_{2} \\ Z_{2} \end{matrix}) = R (\begin{matrix} X_{1} \\ Y_{1} \\ Z_{1} \end{matrix}) + T & (12) \end{matrix}$ 3-3) using the internal parameters K.sub.2 (i.e., the baseline length baseline.sub.2 and the focal length focus.sub.2) of the second binocular camera to restore the point cloud to the disparity images, at this time, only Z.sub.2 is needed to calculate the disparity value in the coordinate system of the second binocular camera, and the calculation formula is as follows: $\begin{matrix} {disparity}_{2} = \frac{{baseline}_{2} * {focus}_{2}}{Z_{2}} . & (13) \end{matrix}$

4. The disparity image stitching and visualization method based on multiple pairs of binocular cameras according to claim 1, wherein the specific steps of building overlapping area model described in step 4) are as follows: 4-1) for the pixels of two images in the overlapping area, calculating the second norm of the RGB pixels corresponding to the overlapping area of the two images, and building t-links; the calculation formulas of the second norm are as follows:
e(p,q)=∥p—p′∥+∥q−q′∥ (14)
∥p−p′∥=(R.sub.p−R.sub.p′).sup.2+(G.sub.p−G.sub.p′).sup.2+(B.sub.p−B.sub.p′).sup.2 (15)
∥q−q′∥=(R.sub.q−R.sub.q′).sup.2+(G.sub.q−G.sub.q′).sup.2+(B.sub.q−B.sub.q′).sup.2 (16) wherein e(⋅) is a weight function, p is a source image, q is a target image, p is the pixel value of one point in the image p, p′ is the pixel value of a p adjacent point, q is the pixel value of one point in the target image, q′ is the pixel value of a q adjacent point, R.sub.p is the value of R channel at p point, R.sub.p′ is the value of R channel at p′ point, G.sub.p is the value of G channel at p point, G.sub.p′ is the value of G channel at p′ point,; is the value of B channel at p point, B.sub.p′ is the value of B channel at p′ point, R.sub.q is the value of R channel at q point, R.sub.q′ is the value of R channel at q′ point, G.sub.q is the value of G channel at q point, G.sub.q′ is the value of G channel at q′ point, B.sub.q is the value of B channel at q point, and B.sub.q′ is the value of B channel at q′ point; 4-2) finding the optimal stitching line for the built model, and solving the stitching seam by graph cut; an energy function is defined as:
E(f)=Σ.sub.p,q∈NS.sub.p,q(l.sub.p,l.sub.q)+Σ.sub.p∈pD.sub.p(l.sub.p) (17) wherein S.sub.p,q is a smoothing term representing the cost of assigning a pair of pixels (p, q) in the overlapping area to (l.sub.p,l.sub.q), l.sub.p is a label assigned to the pixel p, l.sub.q is a label assigned to the pixel q, and D.sub.P is a data term representing the cost of marking the pixel p in the overlapping area as l.sub.p.

5. The disparity image stitching and visualization method based on multiple pairs of binocular cameras according to claim 3, wherein the specific steps of building overlapping area model described in step 4) are as follows: 4-1) for the pixels of two images in the overlapping area, calculating the second norm of the RGB pixels corresponding to the overlapping area of the two images, and building t-links; the calculation formulas of the second norm are as follows:
e(p,q)=∥p—p′∥+∥q−q′∥ (14)
∥p−p′∥=(R.sub.p−R.sub.p′).sup.2+(G.sub.p−G.sub.p′).sup.2+(B.sub.p−B.sub.p′).sup.2 (15)
∥q−q′∥=(R.sub.q−R.sub.q′).sup.2+(G.sub.q−G.sub.q′).sup.2+(B.sub.q−B.sub.q′).sup.2 (16) wherein e(⋅) is a weight function, p is a source image, q is a target image, p is the pixel value of one point in the image p, p′ is the pixel value of a p adjacent point, q is the pixel value of one point in the target image, q′ is the pixel value of a q adjacent point, R.sub.p is the value of R channel at p point, R.sub.p′ is the value of R channel at p′ point, G.sub.p is the value of G channel at p point, G.sub.p′ is the value of G channel at p′ point, B.sub.p is the value of B channel at p point, B.sub.p′ is the value of B channel at p′ point, R.sub.q is the value of R channel at q point, R.sub.q′ is the value of R channel at q′ point, G.sub.q is the value of G channel at q point, G.sub.q′ is the value of G channel at q′ point, B.sub.q is the value of B channel at q point, and B.sub.q′ is the value of B channel at q′ point; 4-2) finding the optimal stitching line for the built model, and solving the stitching seam by graph cut; an energy function is defined as:
E(f)=Σ.sub.p,q∈NS.sub.p,q(l.sub.p,l.sub.q)+Σ.sub.p∈PD.sub.p(l.sub.p) (17) wherein S.sub.p,q is a smoothing term representing the cost of assigning a pair of pixels (p, q) in the overlapping area to (l.sub.p,l.sub.q), l.sub.p is a label assigned to the pixel p, l.sub.q is a label assigned to the pixel q, and D.sub.p is a data term representing the cost of marking the pixel p in the overlapping area as l.sub.p.

6. The disparity image stitching and visualization method based on multiple pairs of binocular cameras according to claim 1, wherein the specific steps of disparity image stitching described in step 6) are as follows: 6-1) transforming the disparity image of the first binocular camera into the image coordinate system of the second binocular camera: $\begin{matrix} W (\begin{matrix} x_{2} \\ y_{2} \\ 1 \end{matrix}) = H (\begin{matrix} x_{1} \\ y_{1} \\ 1 \end{matrix}) & (18) \end{matrix}$ wherein x.sub.1 and y.sub.1 are the coordinates in the image coordinate system of the first binocular camera; x.sub.2 and y.sub.2 are the coordinates in the image coordinate system of the second binocular camera; and w is a normalization coefficient; 6-2) stitching image: comparing the positions of the first binocular image after image coordinate system transformation and the second binocular image corresponding to an optimal stitching seam, and merging two visible light images and two disparity images respectively.

7. The disparity image stitching and visualization method based on multiple pairs of binocular cameras according to claim 3, wherein the specific steps of disparity image stitching described in step 6) are as follows: 6-1) transforming the disparity image of the first binocular camera into the image coordinate system of the second binocular camera: $\begin{matrix} W (\begin{matrix} x_{2} \\ y_{2} \\ 1 \end{matrix}) = H (\begin{matrix} x_{1} \\ y_{1} \\ 1 \end{matrix}) & (18) \end{matrix}$ wherein x.sub.1 and y.sub.1 are the coordinates in the image coordinate system of the first binocular camera; x.sub.2 and y.sub.2 are the coordinates in the image coordinate system of the second binocular camera; and w is a normalization coefficient; 6-2) stitching image: comparing the positions of the first binocular image after image coordinate system transformation and the second binocular image corresponding to an optimal stitching seam, and merging two visible light images and two disparity images respectively.

8. The disparity image stitching and visualization method based on multiple pairs of binocular cameras according to claim 4, wherein the specific steps of disparity image stitching described in step 6) are as follows: 6-1) transforming the disparity image of the first binocular camera into the image coordinate system of the second binocular camera: $\begin{matrix} W (\begin{matrix} x_{2} \\ y_{2} \\ 1 \end{matrix}) = H (\begin{matrix} x_{1} \\ y_{1} \\ 1 \end{matrix}) & (18) \end{matrix}$ wherein x.sub.1 and y.sub.1 are the coordinates in the image coordinate system of the first binocular camera; x.sub.2 and y.sub.2 are the coordinates in the image coordinate system of the second binocular camera; and w is a normalization coefficient; 6-2) stitching image: comparing the positions of the first binocular image after image coordinate system transformation and the second binocular image corresponding to an optimal stitching seam, and merging two visible light images and two disparity images respectively.

9. The disparity image stitching and visualization method based on multiple pairs of binocular cameras according to claim 1, wherein the specific steps of step 7) are as follows: 7-1) converting the disparity images to color images, replacing disparity information with color information, and using different colors to represent different depths; 7-2) superimposing and fusing the color images obtained from the disparity images and the visible light images, and the superposition method is a weighted average method:
Fused image=k*visible light image+(1−k)*color image (19) wherein k is a weight coefficient.

10. The disparity image stitching and visualization method based on multiple pairs of binocular cameras according to claim 6, wherein the specific steps of step 7) are as follows: 7-1) converting the disparity images to color images, replacing disparity information with color information, and using different colors to represent different depths; 7-2) superimposing and fusing the color images obtained from the disparity images and the visible light images, and the superposition method is a weighted average method:
Fused image=k*visible light image+(1−k)*color image (19) wherein k is a weight coefficient.

Description

DESCRIPTION OF DRAWINGS

[0035] FIG. 1 is a flow chart of the present invention.

[0036] FIG. 2 is a system structure diagram of binocular cameras of an embodiment of the present invention.

DETAILED DESCRIPTION

[0037] The present invention proposes a disparity image stitching and visualization method based on multiple pairs of binocular cameras, and will be described in detail below in combination with drawings and embodiments.

[0038] The present invention uses multiple pairs of horizontally placed binocular cameras as an imaging system to perform multi-viewpoint image collection, wherein K.sub.1 is the internal parameters of the first binocular camera, and K.sub.2 is the internal parameters of the second binocular camera. The resolution of each binocular camera is 1024*768, the video frame rate is greater than 20 frames per second, and a system reference structure is shown in FIG. 2. The spatial transformation relationship R and T between each pair of binocular cameras is calculated on this basis, and the homography matrix H between images is calculated through R, T and the distance d of the imaging plane; the horizontal translation of each image is calculated by taking an intermediate image as a benchmark; and finally, the calculated parameters are used as inputs for stitching and visualization. The specific process is as follows:

[0039] 1) System calibration and data collection

[0040] 1-1) Calibrating each pair of binocular cameras to obtain the internal parameters including focal length and optical center and the external parameters including rotation and translation of each pair of binocular cameras;

[0041] 1-2) Connecting each pair of binocular cameras to multiple computers, and using a router to control and conduct synchronous data collection;

[0042] 1-3) Using special customized calibration boards to collect images at the same time; paying attention to ensure that the positional relationship between the binocular cameras is consistent during the collection process and keep the calibration boards fixed; and rotating the calibration boards to collect 10-15 groups of images for each pair of binocular cameras according to the actual conditions.

[0043] 2) Calculating homography matrix between image transformations

[0044] 2-1) Imaging a plane in a scene by two cameras, and assuming that the unit normal vector of the plane in the coordinate system of the first camera is N, and the distance from the plane to the center (coordinate origin) of the first camera is d, then the plane π can be expressed as:

N.sup.TC.sub.1=d

Wherein C.sub.1 is the coordinate of a three-dimensional point P in the coordinate system of the first camera, X.sub.1 and the coordinate of the three-dimensional point P in the coordinate system of the second camera is C.sub.2, then the relationship between the two is:

[00008] $C_{2} = R * C_{1} + T$ $C_{2} = {RC}_{1} + T \frac{1}{d} N^{T} C_{1} = (R + T \frac{1}{d} N^{T}) C_{1} = H^{'} C_{1}$

2-2) Obtaining the homography matrix obtained in step 2-1) from the coordinate system of the first camera, and the homography matrix needs to be transformed into the coordinate system of the imaging plane:

c.sub.1=K.sub.1C.sub.1

c.sub.2=K.sub.2C.sub.2

H=K.sub.1H′K.sub.2.sup.−1

The value of d in the above formula can be set manually, and the rest are fixed values. In this way, the homography matrix H from the first binocular camera to the second binocular camera is obtained.

[0045] 3) Using the internal parameters of the binocular cameras and the external parameters between the binocular cameras obtained in steps 1) and 2) to perform camera coordinate system transformation of the disparity images;

[0046] 3-1) Using the internal parameters K.sub.1, etc. of the first pair of binocular cameras to restore the disparity images to a point cloud in the coordinate system of the first camera:

[00009] $Z_{1} = \frac{{baseline}_{1} * {focus}_{1}}{{disparity}_{1}}$ $X_{1} = \frac{(x_{1} - C_{x}) * {baseline}_{1}}{{disparity}_{1}}$ $Y_{1} = \frac{(y_{1} - C_{y}) * {focus}_{1}}{{disparity}_{1}}$

[0047] 3-2) Using R and T from the first binocular camera to the second binocular camera to transform the camera coordinate system of the point cloud:

[00010] $(\begin{matrix} X_{2} \\ Y_{2} \\ Z_{2} \end{matrix}) = R (\begin{matrix} X_{1} \\ Y_{1} \\ Z_{1} \end{matrix}) + T$

[0048] Using the internal parameters K.sub.2 of an intermediate viewpoint binocular camera to restore the point cloud to the disparity images, at this time, only Z.sub.2 is needed to obtain the disparity images, and the calculation formula is as follows:

[00011] ${disparity}_{2} = \frac{{baseline}_{2} * {focus}_{2}}{Z_{2}}$

Calculating overlapping area of images and solving optimal stitching seam by modeling: first, calculating an overlapping area ROI through the homography matrix between images, and then building an overlapping area model; and the specific steps are as follows:

[0049] 4-1) Calculating the size of the overlapping area by the homography matrix between images:

[0050] Taking the four vertices (0, 0), (img.cols, 0), (img.cols, img.rows) and (0, img.rows) of an image, calculating the transformed coordinates, the transformed upper left corner coordinate is for the stitched image, and the homography transformation matrix H is:

[00012] $H = [\begin{matrix} a_{1 1} & a_{1 2} & a_{1 3} \\ a_{2 1} & a_{2 2} & a_{2 3} \\ a_{31} & a_{3 2} & a_{3 3} \end{matrix}]$

[0051] The calculation formulas are:

[00013] $x = \frac{x^{'}}{w^{'}} = \frac{a_{11} u + a_{12} v + a_{13}}{a_{31} u + a_{32} v + a_{33}}$ $y = \frac{y^{'}}{w^{'}} = \frac{a_{21} u + a_{22} u + a_{23}}{a_{31} u + a_{32} u + a_{33}}$

Wherein x is the x-coordinate of the source image p point after perspective transformation, y is the y-coordinate of the source image p point after perspective transformation, u is the x-coordinate of the source image p point, and v is the y-coordinate of the source image p point;

[0052] 4-2) Building an energy model (seam-driven image stitching), and constructing an energy function of a graph cut algorithm:

[00014] $E (1) = \underset{p \in P}{.Math.} D_{p} (l_{p}) + \underset{(p, q) \in N}{.Math.} S_{p, q} (l_{p}, l_{q})$

Wherein the data term D.sub.p(l.sub.p) represents the assigned value of pixel p in the overlapping area:

[00015] ${\begin{matrix} D_{p} (1) = 0 D_{p} (0) = μ & if p \in \partial I_{0} .Math. \partial P \\ D_{p} (0) = 0 D_{p} (1) = μ & if p \in \partial I_{1} .Math. \partial P \\ D_{p} (0) = D_{p} (1) = 0 & otherwise \end{matrix}$

[0053] In order to avoid marking errors, pi is set to be a very large number;

[0054] S.sub.p,q(l.sub.p, l.sub.q) is a smoothing term;

S.sub.p,q(l.sub.p,l.sub.q)=I.sub.*(p)+I.sub.*(q)

I.sub.*(p)=∥I.sub.0(⋅)−I.sub.1(⋅)∥.sub.2

[0055] 5) After the model is built, obtaining a solution by graph cut, and the result is an optimal stitching seam. It can be known that the construction of the energy function is very important for the result of the stitching seam.

[0056] 5-1) As the operation time of graph cut is related to the number of nodes in a graph, and the algorithm complexity is relatively high, only by down-sampling or stratifying the overlapping area to reduce the number of nodes in the constructed graph, and making the local optimal solution obtained by this method be approximately equal to the global optimal solution, can the real-time performance of the algorithm meet requirements.

[0057] 5-2) In addition, the parallelization of the graph cut algorithm can also achieve a further acceleration effect. (Fast graphcut on GPU CVPR2008) 6) The specific steps of disparity image stitching are as follows:

[0058] 6-1) Transforming the depth image of the first binocular camera into the image coordinate system of the second binocular camera:

[00016] $W (\begin{matrix} x_{2} \\ y_{2} \\ 1 \end{matrix}) = H (\begin{matrix} x_{1} \\ y_{1} \\ 1 \end{matrix})$

[0059] 6-2) Stitching image: comparing the positions of the disparity image after image coordinate system transformation and an intermediate disparity image corresponding to the optimal stitching seam, and merging the two disparity images.

[0060] Completing the disparity image stitching of one pair of binocular cameras by steps 1)-6), and repeating steps 1)-6) to complete the disparity image stitching of the second pair of binocular cameras (e.g., the second and third binocular cameras).

[0061] 7) Adding the stitched disparity information to the visible light images:

[0062] 7-1) Converting the disparity images to color images, replacing disparity information with color information, and using different colors to represent different depths, wherein the color images calculated from the disparity images includes but is not limited to pseudo-color images and rainbow images;

[0063] 7-2) Superimposing and fusing the color images obtained from the disparity images and the visible light images, and the superposition method is a weighted average method:

Fused image=k*visible light image+(1−k)*color image

[0064] k is a weight coefficient; when k is relatively large (1-*0.5), visible light information can be observed more clearly; and when k is relatively small (0.5-0), more depth information can be observed.

DISPARITY IMAGE STITCHING AND VISUALIZATION METHOD BASED ON MULTIPLE PAIRS OF BINOCULAR CAMERAS

Inventors

Cpc classification

Classification Explorer

G06T2200/32

PHYSICS

Classification Explorer

G06T7/11

PHYSICS

Classification Explorer

H04N13/239

ELECTRICITY

Classification Explorer

G06T2207/20021

PHYSICS

Classification Explorer

G06T2207/10012

PHYSICS

Classification Explorer

G06T2207/10024

PHYSICS

Classification Explorer

H04N13/246

ELECTRICITY

Classification Explorer

H04N13/15

ELECTRICITY

Classification Explorer

H04N13/243

ELECTRICITY

Classification Explorer

H04N13/156

ELECTRICITY

Classification Explorer

G06T2207/20221

PHYSICS

Classification Explorer

G06T7/85

PHYSICS

Classification Explorer

G06T7/593

PHYSICS

International classification

Classification Explorer

H04N13/156

ELECTRICITY

Classification Explorer

G06T7/11

PHYSICS

Classification Explorer

G06T7/80

PHYSICS

Classification Explorer

H04N13/15

ELECTRICITY

Classification Explorer

H04N13/239

ELECTRICITY

Classification Explorer

H04N13/246

ELECTRICITY

Abstract

Claims

Description