Multispectral stereo camera self-calibration algorithm based on track feature registration
11575873 · 2023-02-07
Inventors
- Wei Zhong (Liaoning, CN)
- Haojie Li (Liaoning, CN)
- Boqian Liu (Liaoning, CN)
- Zhihui Wang (Liaoning, CN)
- Risheng Liu (Liaoning, CN)
- Zhongxuan Luo (Liaoning, CN)
- Xin Fan (Liaoning, CN)
CPC classification
- G06T7/246 (Physics)
- Y02A90/10 (General tagging of cross-sectional technologies)
- G06V10/243 (Physics)
- H04N13/254 (Electricity)
- G06V10/74 (Physics)
- H04N13/25 (Electricity)
International classification
- H04N13/254 (Electricity)
- G06T7/80 (Physics)
Abstract
The present invention discloses a multispectral stereo camera self-calibration algorithm based on track feature registration, and belongs to the field of image processing and computer vision. Optimal matching points are obtained by extracting and matching the motion tracks of objects, and the external parameters are corrected accordingly. Compared with ordinary methods, the present invention uses the tracks of moving objects as the features required for self-calibration; the advantage of using tracks is good cross-modal robustness. In addition, matching the tracks directly eliminates the steps of extracting and matching feature points, so the method is simple to operate and gives accurate results.
Claims
1. A multispectral stereo camera self-calibration algorithm based on track feature registration, stored on a non-transitory computer-readable medium, comprising the following steps: 1) using an infrared camera and a visible light camera to shoot a group of continuous frames with moving objects at the same time; 2) original image correction: conducting de-distortion and binocular correction on an original image according to internal parameters and original external parameters of the infrared camera and the visible light camera; 3) calculating tracks of the moving objects; 4) obtaining an optimal track corresponding point and obtaining a transformation matrix from an infrared image to a visible light image accordingly; 5) further optimizing matching results of the track corresponding points: selecting registration point pairs with lower error as candidate feature point pairs; 6) judging a feature point coverage area: dividing the image into m*n grids; if the feature points cover all the grids, executing the next step; otherwise, continuing to shoot images and repeating step 1) to step 5); 7) correcting the calibration result: using the image coordinates of all the feature points to calculate the corrected positional relationship between the two cameras, and then superimposing it on the original external parameters.
2. The multispectral stereo camera self-calibration algorithm based on track feature registration according to claim 1, wherein the original image correction in the step 2) specifically comprises the following steps: 2-1) calculating the coordinates in a normal coordinate system corresponding to the pixel points of the image, wherein a pixel coordinate system takes the upper left corner of the image as an origin, and the x-axis and y-axis of the pixel coordinate system are parallel to the x-axis and y-axis of an image coordinate system, respectively; the unit of the pixel coordinate system is the pixel; taking the optical center of the camera as the origin of the image coordinate system and scaling the distance from the optical center to the image plane to 1; the relationship between pixel coordinates and normal coordinates is as follows:

$$X = K^{-1}u$$

2-2) removing image distortion: the radial distortion of the image is a position deviation of the image pixel points along the radial direction with the distortion center as the center point, which distorts the picture formed in the image; the radial distortion is described as follows:

$$x_d = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)$$
$$y_d = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)$$

wherein $r^2 = x^2 + y^2$, and $k_1$, $k_2$ and $k_3$ are radial distortion parameters; the tangential distortion of the image is generated by a defect in camera manufacturing that makes the lens not parallel to the image plane, and is quantitatively described as:

$$x_d = x + \left(2p_1 xy + p_2(r^2 + 2x^2)\right)$$
$$y_d = y + \left(p_1(r^2 + 2y^2) + 2p_2 xy\right)$$

wherein $p_1$ and $p_2$ are tangential distortion coefficients; the coordinate relationship before and after distortion is as follows:

$$x_d = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + \left(2p_1 xy + p_2(r^2 + 2x^2)\right)$$
$$y_d = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + \left(p_1(r^2 + 2y^2) + 2p_2 xy\right)$$

wherein $(x, y)$ is the normal coordinate in the ideal state, and $(x_d, y_d)$ is the actual normal coordinate with distortion; 2-3) rotating the two images toward each other according to the original rotation relationship between the two cameras: the original rotation matrix $R$ and translation vector $t$ between the two cameras are known and satisfy:

$$X_r = R X_l + t$$

wherein $X_l$ indicates the normal coordinate of the infrared camera, and $X_r$ indicates the normal coordinate of the visible light camera; the infrared image is rotated by half the angle of $R$ in the positive direction, and the visible light image is rotated by half the angle of $R$ in the opposite direction; 2-4) restoring the de-distorted and rotated image to the pixel coordinate system according to the formula $u = KX$.
3. The multispectral stereo camera self-calibration algorithm based on track feature registration according to claim 1, wherein the step 4) of obtaining the optimal track corresponding point comprises the following steps: 4-1) randomly selecting a pair of tracks, and repeating the following steps until the error is small enough: a. randomly selecting 4 pairs of points from the selected track pair; b. calculating a transformation matrix H from infrared image points to visible light image points; c. adding point pairs with small enough error obtained by using the transformation matrix H; d. recalculating H; e. calculating and assessing the error; 4-2) adding a track pair with a small enough error obtained by using the transformation matrix H; 4-3) recalculating H; 4-4) calculating and assessing the error, and if the error is not small enough, repeating step 4-1).
4. The multispectral stereo camera self-calibration algorithm based on track feature registration according to claim 1, wherein the step 7) of correcting the calibration result comprises the following steps: 7-1) further screening the point pairs by using random sample consensus; 7-2) solving a fundamental matrix F and an essential matrix E: the relationship between the corresponding infrared and visible light pixel points $u_l$ and $u_r$ and the fundamental matrix $F$ is:

$$u_r^T F u_l = 0$$

substituting the coordinates of the corresponding points into the above formula to construct a homogeneous linear equation system to solve $F$; the relationship between the fundamental matrix and the essential matrix is:

$$E = K_r^T F K_l$$

wherein $K_l$ and $K_r$ are respectively the internal parameter matrices of the infrared camera and the visible light camera; 7-3) decomposing rotation and translation from the essential matrix: the relationship between the essential matrix $E$, the rotation $R$ and the translation $t$ is as follows:

$$E = [t]_\times R$$

wherein $[t]_\times$ indicates the cross-product matrix of $t$; and conducting singular value decomposition on $E$ to obtain the rotation and translation.
Description
DETAILED DESCRIPTION
(3) The present invention addresses changes in the positional relationship between an infrared camera and a visible light camera caused by factors such as temperature, humidity and vibration. The present invention is described in detail below in combination with the drawings and embodiments.
(4) 1) Using the infrared camera and the visible light camera to shoot a group of continuous frames with moving objects at the same time.
(5) 2) Original image correction: conducting de-distortion and binocular correction on the original images according to the internal parameters and original external parameters of the infrared camera and the visible light camera. The flow is as follows.
(6) 2-1) Calculating the coordinates in a normal coordinate system corresponding to the pixel points of the image, wherein the normal coordinate system is the projection of the camera coordinate system onto the plane Z=1; the camera coordinate system takes the center of the camera as the origin of the image coordinate system, the image directions as the X and Y axis directions, and the direction perpendicular to the image as the Z axis direction; the pixel coordinate system takes the upper left corner of the image as its origin, and its x-axis and y-axis are parallel to the x-axis and y-axis of the image coordinate system, respectively; the unit of the pixel coordinate system is the pixel; the relationship between pixel coordinates and normal coordinates is as follows:

(7) $$u = KX$$

(8) wherein

(9) $$u = \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$

indicates the pixel coordinate of the image;

(10) $$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

indicates the internal parameter matrix of the camera; $f_x$ and $f_y$ respectively indicate the focal lengths of the image in the x direction and y direction, in pixels; $(c_x, c_y)$ indicates the principal point position of the camera, i.e., the corresponding position of the camera center on the image; and

(11) $$X = \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$$

is the coordinate in the normal coordinate system. The normal coordinate corresponding to each pixel point is calculated as $X = K^{-1}u$ from the known pixel coordinates of the image and the internal parameters of the camera.
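As a concrete illustration of step 2-1, the following minimal Python sketch maps a pixel coordinate to the normalized plane $Z = 1$ via $X = K^{-1}u$; the intrinsic values used are illustrative placeholders, not values from the patent.

```python
import numpy as np

# Illustrative intrinsic matrix K: f_x, f_y on the diagonal, principal
# point (c_x, c_y) in the last column (placeholder values).
K = np.array([[460.0,   0.0, 320.0],
              [  0.0, 460.0, 240.0],
              [  0.0,   0.0,   1.0]])

def pixel_to_normal(u, v, K):
    """Back-project pixel (u, v) to the normal coordinate system (Z = 1)."""
    return np.linalg.inv(K) @ np.array([u, v, 1.0])  # X = K^-1 u

print(pixel_to_normal(320.0, 240.0, K))  # principal point -> [0, 0, 1]
```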
(12) 2-2) Removing image distortion: due to limitations of the lens production process, a lens under actual conditions exhibits some distortion, causing nonlinear distortion. Therefore, a purely linear model cannot accurately describe the imaging geometry. The nonlinear distortion can be roughly classified into radial distortion and tangential distortion.
(13) The radial distortion of the image is a position deviation of the image pixel points along the radial direction with the distortion center as the center point, which distorts the picture formed in the image. The radial distortion is roughly described as follows:

$$x_d = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)$$
$$y_d = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)$$
(14) wherein $r^2 = x^2 + y^2$; $k_1$, $k_2$ and $k_3$ are radial distortion parameters.
(15) The tangential distortion of the image is generated by a defect in camera manufacturing that makes the lens not parallel to the image plane, and can be quantitatively described as:

$$x_d = x + \left(2p_1 xy + p_2(r^2 + 2x^2)\right)$$
$$y_d = y + \left(p_1(r^2 + 2y^2) + 2p_2 xy\right)$$
(16) wherein $p_1$ and $p_2$ are tangential distortion coefficients.
(17) Combining the two, the coordinate relationship before and after distortion is as follows:

$$x_d = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + \left(2p_1 xy + p_2(r^2 + 2x^2)\right)$$
$$y_d = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + \left(p_1(r^2 + 2y^2) + 2p_2 xy\right)$$
(18) wherein $(x, y)$ is the normal coordinate in the ideal state, and $(x_d, y_d)$ is the actual normal coordinate with distortion.
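To make the combined model concrete, here is a small Python sketch that applies the radial and tangential terms above to an ideal normal coordinate; the coefficient values passed by a caller are assumed to come from a prior intrinsic calibration. De-distortion inverts this mapping numerically (OpenCV's cv2.undistortPoints, for example, does so iteratively).

```python
def distort(x, y, k1, k2, k3, p1, p2):
    """Map an ideal normal coordinate (x, y) to its distorted position
    (x_d, y_d) using the radial + tangential model of step 2-2."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x * radial + (2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x))
    y_d = y * radial + (p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y)
    return x_d, y_d

# Placeholder coefficients, purely for illustration.
print(distort(0.1, -0.05, k1=-0.28, k2=0.07, k3=0.0, p1=1e-4, p2=-2e-4))
```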
(19) 2-3) Rotating the two images toward each other according to the original rotation relationship between the two cameras: the original rotation matrix $R$ and translation vector $t$ between the two cameras are known and satisfy:

$$X_r = R X_l + t$$

(20) wherein $X_l$ indicates the normal coordinate of the infrared camera, and $X_r$ indicates the normal coordinate of the visible light camera. The infrared image is rotated by half the angle of $R$ in the positive direction, and the visible light image is rotated by half the angle of $R$ in the opposite direction.
(21) 2-4) Restoring the de-distorted and rotated image to the pixel coordinate system according to the formula $u = KX$.
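One common way to realize the half-angle rotations of step 2-3 is through the axis-angle (Rodrigues) representation. The sketch below, assuming OpenCV is available, splits the original rotation R into the two half-rotations applied to the infrared and visible light images respectively.

```python
import cv2
import numpy as np

def half_rotations(R):
    """Split rotation matrix R into half-rotations in the positive and
    opposite directions, via the axis-angle representation."""
    rvec, _ = cv2.Rodrigues(R)                  # axis * angle vector
    R_half_pos, _ = cv2.Rodrigues(rvec / 2.0)   # applied to the infrared image
    R_half_neg, _ = cv2.Rodrigues(-rvec / 2.0)  # applied to the visible image
    return R_half_pos, R_half_neg
```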
(22) 3) Calculating tracks of the moving objects.
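The patent does not prescribe a particular tracker for step 3; as one hypothetical realization, the sketch below obtains a track as the per-frame centroid of the largest moving region found by background subtraction.

```python
import cv2

def extract_centroid_track(frames):
    """Return the per-frame centroid of the largest moving region,
    found by MOG2 background subtraction (one possible tracker)."""
    subtractor = cv2.createBackgroundSubtractorMOG2()
    track = []
    for frame in frames:
        mask = subtractor.apply(frame)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            continue
        largest = max(contours, key=cv2.contourArea)
        m = cv2.moments(largest)
        if m["m00"] > 0:
            track.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return track
```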
(23) 4) Obtaining an optimal track corresponding point and obtaining a transformation matrix from an infrared image to a visible light image accordingly.
(24) 4-1) Randomly selecting a pair of tracks, and repeating the following steps until the error is small enough: a. randomly selecting 4 pairs of points from the selected track pair; b. calculating a transformation matrix H from infrared image points to visible light image points; c. adding point pairs with small enough error obtained by using the transformation matrix H; d. recalculating H; e. calculating and assessing the error. (A sketch of this loop is given after step 4-4.)
(25) 4-2) Adding a track pair with a small enough error obtained by using the transformation matrix H.
(26) 4-3) Recalculating H.
(27) 4-4) Calculating and assessing the error, and if the error is not small enough, repeating step 4-1).
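The following Python sketch illustrates the loop of steps 4-1) to 4-4) on a single track pair, assuming the two tracks are already stored as corresponding (N, 2) point arrays; the threshold and iteration count are illustrative, and degenerate (collinear) 4-point samples are not guarded against. In practice, cv2.findHomography(..., cv2.RANSAC) packages the same estimate-and-grow scheme.

```python
import cv2
import numpy as np

def fit_track_homography(pts_ir, pts_vis, thresh=3.0, iters=500):
    """pts_ir, pts_vis: (N, 2) arrays of corresponding track points.
    Returns the transformation H with the largest inlier set."""
    best_H, best_inliers = None, 0
    rng = np.random.default_rng(0)
    for _ in range(iters):
        idx = rng.choice(len(pts_ir), size=4, replace=False)   # step a
        H = cv2.getPerspectiveTransform(                       # step b
            pts_ir[idx].astype(np.float32), pts_vis[idx].astype(np.float32))
        proj = cv2.perspectiveTransform(                       # steps c-e
            pts_ir.reshape(-1, 1, 2).astype(np.float32), H).reshape(-1, 2)
        err = np.linalg.norm(proj - pts_vis, axis=1)
        inliers = int((err < thresh).sum())
        if inliers > best_inliers:
            best_H, best_inliers = H, inliers
    return best_H, best_inliers
```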
(28) 5) Further optimizing the matching results of the track corresponding points: selecting the registration point pairs with lower error as candidate feature point pairs.
(29) 6) Judging the feature point coverage area: dividing the image into m*n grids; if the feature points cover all the grids, executing the next step; otherwise, continuing to shoot images and repeating step 1) to step 5).
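A sketch of this coverage test: bin the feature points into an m x n grid and check that every cell holds at least one point. The image size and grid shape passed by a caller are illustrative.

```python
import numpy as np

def covers_all_cells(points, width, height, m, n):
    """True if the feature points hit every cell of an m x n grid
    laid over a width x height image."""
    hit = np.zeros((m, n), dtype=bool)
    for x, y in points:
        i = min(int(x * m / width), m - 1)
        j = min(int(y * n / height), n - 1)
        hit[i, j] = True
    return bool(hit.all())
```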
(30) 7) Correcting the calibration result: using the image coordinates of all the feature points to calculate the corrected positional relationship between the two cameras, and then superimposing it on the original external parameters.
(31) 7-1) Further screening the point pairs by using random sample consensus (RANSAC).
(32) 7-2) Solving a fundamental matrix F and an essential matrix E: the relationship between the corresponding infrared and visible light pixel points $u_l$ and $u_r$ and the fundamental matrix $F$ is:

$$u_r^T F u_l = 0$$
(33) The coordinates of the corresponding points are substituted into the above formula to construct a homogeneous linear equation system, which is solved for F.

(34) The relationship between the fundamental matrix and the essential matrix is:

$$E = K_r^T F K_l$$
(35) wherein $K_l$ and $K_r$ are respectively the internal parameter matrices of the infrared camera and the visible light camera.
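A minimal sketch of step 7-2 using OpenCV: estimate F from the screened point pairs (here with OpenCV's built-in RANSAC), then form $E = K_r^T F K_l$. The argument arrays and intrinsics are assumed to be supplied by the caller.

```python
import cv2
import numpy as np

def essential_from_pairs(pts_ir, pts_vis, K_l, K_r):
    """Estimate F from matched pixel points, then form E = K_r^T F K_l."""
    F, inlier_mask = cv2.findFundamentalMat(pts_ir, pts_vis, cv2.FM_RANSAC)
    E = K_r.T @ F @ K_l
    return E, F, inlier_mask
```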
(36) 7-3) Decomposing rotation and translation from the essential matrix: the relationship between the essential matrix $E$, the rotation $R$ and the translation $t$ is as follows:

$$E = [t]_\times R$$

(37) wherein $[t]_\times$ indicates the cross-product matrix of $t$.
(38) Conducting singular value decomposition on E to obtain:

(39) $$E = U \Sigma V^T, \quad \Sigma = \mathrm{diag}(1, 1, 0)\ \text{(up to scale)}$$

(40) Defining two matrices:

(41) $$Z = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad W = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

(42) Thus, E can be written in the following two forms:

(43) (1) $E = UZU^T\,UWV^T$

(44) setting $[t]_\times = UZU^T$, $R = UWV^T$;

(45) (2) $E = -UZU^T\,UW^TV^T$

(46) setting $[t]_\times = -UZU^T$, $R = UW^TV^T$.
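A sketch of step 7-3 in Python: recover the two candidate (R, t) factorizations from the SVD of E using the W and Z matrices of paragraph (41); the sign normalization of U and V is an implementation detail assumed here to keep the factors proper rotations.

```python
import numpy as np

# Standard W and Z matrices from paragraph (41).
W = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
Z = np.array([[0., 1., 0.], [-1., 0., 0.], [0., 0., 0.]])

def decompose_essential(E):
    """Return the two (R, t) candidates from the factorizations (43)-(46)."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:   # keep U, V proper rotations (sign of E is free)
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    R1 = U @ W @ Vt            # form (1): [t]x =  U Z U^T
    R2 = U @ W.T @ Vt          # form (2): [t]x = -U Z U^T
    tx = U @ Z @ U.T           # read t off the cross-product matrix
    t = np.array([tx[2, 1], tx[0, 2], tx[1, 0]])
    return (R1, t), (R2, -t)
```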
(47) 7-4) Superimposing the decomposed relationship between rotation and translation onto the original positional relationship between the infrared camera and the visible light camera.
(48) Recording the rotation matrix before de-distortion as $R_0$ and the translation vector as $t_0 = (t_x, t_y, t_z)^T$; recording the rotation matrix calculated in the previous step as $R$ and the translation vector as $t = (t_x', t_y', t_z')^T$; and the new $R_{new}$ and $t_{new}$ are as follows:
(49)
(50) In addition, multiplying $t_{new}$ by a coefficient so that the component of $t_{new}$ in the x direction satisfies $t_x^{new} = t_x$.
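Since the essential matrix determines translation only up to scale, paragraph (50) fixes the scale from the original baseline. A one-line sketch, assuming the composed translation t_new and the original component t_x are given:

```python
import numpy as np

def rescale_translation(t_new, t_x):
    """Scale t_new so its x component equals the original baseline t_x,
    fixing the scale that the essential matrix leaves undetermined."""
    return t_new * (t_x / t_new[0])

print(rescale_translation(np.array([0.5, 0.01, 0.02]), -120.0))
```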