Method of stabilizing a sequence of images
11477382 · 2022-10-18
Assignee
Inventors
CPC classification
H04N23/683
ELECTRICITY
H04N23/6812
ELECTRICITY
International classification
Abstract
A method operable within an image capture device for stabilizing a sequence of images captured by the image capture device is disclosed. The method comprises, using lens based sensors indicating image capture device movement during image acquisition, performing optical image stabilization (OIS) during acquisition of each image of the sequence of images to provide a sequence of OIS corrected images. Movement of the device for each frame during which each OIS corrected image is captured is determined using inertial measurement sensors. At least an estimate of OIS control performed during acquisition of an image is obtained. The estimate is removed from the intra-frame movement determined for the frame during which the OIS corrected image was captured to provide a residual measurement of movement for the frame. Electronic image stabilization (EIS) of each OIS corrected image based on the residual measurement is performed to provide a stabilized sequence of images.
Claims
1. A camera module (60) for an image capture device comprising: inertial measurement sensors (20) arranged to indicate image capture device movement during image acquisition, a lens (12), an image sensor, and a camera module processor (70) arranged to: provide a measurement of movement for an image frame (74) in terms of rotation around each of said image capture device's X, Y and Z axes represented as quaternions; back-project a plurality of nodes from an image sensor space to a 3-dimensional space according to a projection model for said lens to provide quaternion representations Q for each node of a correction grid; rotate said quaternions Q according to said measurement of movement for an image frame; project said rotated quaternions back to the image sensor space to form the correction grid (76); and provide the correction grid for said frame and said frame to a central camera processor (72) for correction of said image frame (74) based on said measurement of movement and said projection model.
2. A camera module according to claim 1 wherein a quaternion (w,i,j,k) represents point orientation P(X, Y, Z) in space, wherein projecting each rotated quaternion Q=(w,i,j,k) comprises: calculating
3. A camera module according to claim 2 wherein said look-up table is pre-calculated.
4. A camera module according to claim 1 wherein said camera module processor is further arranged to: selectively perform optical image stabilization (OIS) control (14) during capture of images in a sequence of images based on inertial measurement sensors' signals to obtain OIS corrected images from said image sensor, obtain at least an estimate of OIS control performed during acquisition of an image; adjust the correction grid to remove said estimate from the intra-frame movement determined for the frame during which said OIS corrected image was captured.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Embodiments of the invention will now be described, by way of example, with reference to the accompanying drawings, in which:
DESCRIPTION OF THE EMBODIMENTS
(8) Referring to
(9) Referring back to
(10) Note that it is important that the record of device movement R[ ] captured by the IMU sensors 20 be capable of being synchronized with the lens movement T[ ] recorded by the OIS controller 14. While it is not necessary that these be captured at the same spatio-temporal resolution, if the values are to be correlated accurately with one another, they need to share the same time base. Thus, in some embodiments, the matrix T[ ] provided by the OIS controller is time stamped using the same timer used to generate timestamps for the IMU matrix R[ ], or at least the timestamp sources are calibrated so that the matrices R[ ] and T[ ] can be correlated with one another. In other embodiments, a common clock signal could be employed by each of the OIS controller 14 and the IMU sensors 20, but it will be appreciated that any synchronization technique can be used.
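The resampling implied above, where OIS lens-shift samples are brought onto the IMU timestamp base so that R[ ] and T[ ] can be correlated, can be sketched as follows. This is a minimal illustration only; the function name, the sample layout and the use of simple linear interpolation are assumptions, not part of the disclosure:

```python
import numpy as np

def align_ois_to_imu(t_imu, t_ois, ois_shift):
    """Resample OIS lens-shift samples onto the IMU timestamp base.

    t_imu:     IMU sample timestamps (seconds), shape (N,)
    t_ois:     OIS sample timestamps on the same (or calibrated) clock, shape (M,)
    ois_shift: lens barrel shift (x, y) per OIS sample, shape (M, 2)

    Returns the lens shift linearly interpolated at each IMU timestamp, shape (N, 2).
    """
    x = np.interp(t_imu, t_ois, ois_shift[:, 0])
    y = np.interp(t_imu, t_ois, ois_shift[:, 1])
    return np.stack([x, y], axis=1)
```

With a shared time base, each row of the resampled shift can be paired directly with the corresponding IMU measurement.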
(11) In any case, each of the movement matrices T[ ] and R[ ] are fed to a video stabilization module 18. In one embodiment, the video stabilization module 18 uses the matrix R[ ] to calculate the amount of correction (local displacement in the sensor plane) required for video stabilization based on the camera orientation change with respect to orientation in the previous frame.
(12) The video stabilization module 18 then subtracts the lens barrel shift amount indicated by the matrix T[ ] to provide a final correction matrix M[ ]. This removes the correction already applied by the OIS controller 14; without this subtraction, the correction calculated using the IMU data would lead to overcorrection.
(13) The video stabilization module 18 provides the final correction matrix M[ ] to an image warping module 22 in order to produce a stabilized output frame 24 based on the OIS corrected input image 26 corresponding to the matrix T[ ].
(14) More formally, knowing the camera intrinsic matrix K:
(15) K=[f, s, x.sub.0; 0, f, y.sub.0; 0, 0, 1]
where f=focal length; x.sub.0, y.sub.0 are the principal point offsets; and s=axis skew, the final correcting transformation matrix M can be defined as follows:
M=KRK.sup.−1T.sup.−1
where R[ ] and T.sup.−1[ ] have been normalized to correspond with one another.
(16) Thus after inverting the correction T applied by the OIS controller 14, EIS based on a final correction (M) applied by the image warping module 22 can be performed without introducing distortion into the resultant output image 24.
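The final correction M=KRK.sup.−1T.sup.−1 above can be sketched numerically as follows. This is a minimal numpy illustration under stated assumptions: the function names are hypothetical, R is the 3×3 inter-frame rotation from the IMU, and the OIS lens shift T is represented as a homogeneous 3×3 translation matrix:

```python
import numpy as np

def intrinsic_matrix(f, x0, y0, s=0.0):
    # Camera intrinsic matrix K with focal length f, principal point
    # offsets (x0, y0) and axis skew s.
    return np.array([[f,   s,   x0],
                     [0.0, f,   y0],
                     [0.0, 0.0, 1.0]])

def eis_correction(K, R, T):
    # Final correcting transformation M = K R K^-1 T^-1, where R is the
    # inter-frame rotation measured by the IMU and T is the correction
    # already applied by the OIS controller (here a homogeneous shift).
    return K @ R @ np.linalg.inv(K) @ np.linalg.inv(T)
```

Note that with no camera rotation (R the identity), M reduces to T.sup.−1, i.e. pure removal of the OIS lens shift, as the derivation requires.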
(17) Unlike the OIS controller 14 of the first embodiment, an OIS controller may not provide information about lens position; in that case, a precise combination of OIS and IMU sensor based EIS stabilization is not possible.
(18) Referring now to
(19) The embodiment of
(20) As before, each input image frame . . . N−1, N . . . captured by the image sensor is already stabilized using OIS, but the level of stabilization is unknown. Note that because the OIS controller typically only uses inertial sensors, it is unaffected by the motion of objects that could be in the camera's field of view.
(21) Nonetheless, a displacement map V[ ] (
(22) Thus, this embodiment is based on knowing the overall frame to frame motion R[ ] from the IMU sensors 20 and combining this information with the displacement map V[ ] to extract an estimate of OIS correction applied across the image so that this can be removed before an image warping module 22, similar to that of
(23) Again, the camera IMU sensors 20 provide information about actual camera rotation along all three axes (R.sub.X R.sub.Y R.sub.Z) during frame acquisition. Where the OIS controller does not correct for rotation around optical axis (typically Z axis), correction for movement around this axis can be applied in full, by the image warping module 22, based on the gyroscope input.
(24) Thus, before a position correction matrix is calculated, the R.sub.Z components of movement across an image can be removed from the displacement map V[ ] produced by the local motion estimation unit 52 by an R.sub.Z Removal block 54. After this, the motion field V-Rz[ ] will contain only motion in X,Y directions partially corrected by the OIS controller and containing outliers caused by the moving objects and estimation errors.
(25) A final correction calculation module 56 calculates a residual correction matrix M[ ] using image analysis supported by the IMU sensor output R.sub.X R.sub.Y R.sub.Z. In this case, R.sub.X R.sub.Y R.sub.Z are not applied directly to V-Rz[ ], but help in verifying the local motion vectors retrieved by the image analysis performed by the block 56 to extract the OIS controller motion component T[ ] from the V-Rz[ ] matrix. So, for example, the final correction calculation block 56 can use the IMU sensor output R[ ] to filter out any outlier vectors from the motion field V-Rz[ ]. The remaining vectors can then be used to calculate the transformation matrix T[ ].
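The two operations described in paragraphs (24) and (25), subtracting the Z-axis rotation component from the displacement map and rejecting motion vectors that disagree with the IMU-predicted flow, can be sketched as follows. This is an illustrative sketch only; the function names, the small-grid representation and the fixed deviation threshold are assumptions:

```python
import numpy as np

def remove_rz(grid_xy, V, theta, center):
    # Displacement each grid node would undergo under a pure rotation
    # theta (radians) about the optical (Z) axis through `center`;
    # subtracting it leaves only the X,Y motion in the field V.
    c, s = np.cos(theta), np.sin(theta)
    rel = grid_xy - center
    rot = rel @ np.array([[c, -s], [s, c]]).T
    rz_flow = rot - rel
    return V - rz_flow

def filter_outliers(V, imu_flow, max_dev):
    # Keep only local motion vectors that agree (within max_dev pixels)
    # with the flow predicted from the IMU rotation measurements.
    keep = np.linalg.norm(V - imu_flow, axis=-1) <= max_dev
    return V[keep], keep
```

The surviving vectors would then feed the estimation of the translation field T[ ] as described in the following paragraphs.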
(26) Once this matrix T[ ] has been generated, the residual correction matrix M[ ] can be generated as in the first embodiment to indicate the X,Y stabilization that needs to be performed across the image by an image warping module 22.
(27) Because the rotation of the camera R.sub.Z was previously subtracted from the motion field, the final correction calculation block 56 adds this back to form the final transformation between two consecutive frames. This matrix M+Rz[ ] can be further filtered if required.
(28) In summary, using the second embodiment, a motion field V.sub.I similar in form to that shown in
(29) Assuming a perfect motion field V.sub.I (no outliers or errors) the shift introduced by the OIS will be:
T=V.sub.R−V.sub.I
(30) In a real situation, the V.sub.I field will contain outliers and, as a result, the vector field T will contain outliers. However, since the vector field T results from motion strictly in the image plane, all we need to find is the translation matrix with two independent parameters X,Y. By comparison, estimation of a homography matrix would require finding 8 or 9 independent parameters and is not only more complex but also more prone to numerical conditioning problems and overfitting.
(31) Assuming we are dealing with a rolling shutter camera, we need to find the translation value for each of the rows of vectors and interpolate intermediate values if needed. This will give the estimated trajectory T[ ] applied by the OIS controller.
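The per-row estimation described above, finding a robust two-parameter translation for each row of vectors and interpolating intermediate values, can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, the vector field is a rows × cols × 2 array, and a per-row median stands in for whatever robust estimator an implementation actually uses:

```python
import numpy as np

def per_row_translation(T_field, valid=None):
    """Estimate a 2-parameter (x, y) translation per row of the OIS shift
    vector field T (shape rows x cols x 2), using the median across each
    row to suppress outlier vectors.  Rows with no valid vectors are
    filled by linear interpolation from neighbouring rows."""
    rows = T_field.shape[0]
    est = np.full((rows, 2), np.nan)
    for r in range(rows):
        vecs = T_field[r] if valid is None else T_field[r][valid[r]]
        if len(vecs):
            est[r] = np.median(vecs, axis=0)
    # interpolate rows for which no estimate was available
    idx = np.arange(rows)
    for axis in range(2):
        good = ~np.isnan(est[:, axis])
        est[:, axis] = np.interp(idx, idx[good], est[good, axis])
    return est
```

The resulting per-row translations form the estimated OIS trajectory T[ ] for a rolling shutter frame.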
(32) The next step will be calculation of the correction values M[ ] using camera rotations obtained from IMU and lens projection parameters. From this correction we need to subtract the motion already corrected by the OIS (based on T motion field) to get the final correction.
(33) Using the above embodiments, all calculations can be performed at any point in time, allowing the camera trajectory T[ ] during the exposure time of the frame to be recovered and, as a consequence, effective rolling shutter removal to be performed.
(34) Incorporating the information from the IMU sensors 20 reduces the number of degrees of freedom during calculation of the residual correction matrix M[ ]. This helps in removing outliers from the original motion field and increases the reliability of estimated correction matrix.
(35) In variants of the above described embodiments, measurements R.sub.X R.sub.Y R.sub.Z from the camera IMU 20, especially gyroscope signals, can be integrated as a function of the exposure time of the image frames as disclosed in co-filed U.S. patent application Ser. No. 15/048,224 entitled “A method for correcting an acquired image” (Reference: FN-483-US), to mitigate distortion caused by high frequency vibration of the camera. These signals with appropriate conditioning may substitute for, either in part or in whole, or be combined with raw R.sub.X R.sub.Y R.sub.Z measurements in performing EIS as described above.
(36) The above described embodiments have been described in terms of modules 18, in
(37) It will be appreciated that once such functionality has been incorporated within a camera module 60, the functionality of the camera module may be further extended to control the correction grid and to accommodate distortion effects other than EIS as described below in more detail in relation to
(38) In
(39) Note that the output of the correction calculation modules 18 and 56 of the embodiments of
(40) Thus, in the embodiment of
(41) The motion processing unit 70 writes acquired input images 74 along with respective associated correction grids 76 into system memory 80 so that a processing unit, in this case a dedicated graphics processing unit (GPU) 72, can correct each input image and write a corrected output image 78 back to system memory 80. As in WO2014/005783 (Ref: FN-384), the correction grid 76 provided for each image, in this case referred to as a hybrid correction grid, can take into account global transformation characteristics, local transformation characteristics or even affine transformation characteristics, for example, to compensate for rolling shutter distortion.
(42) As well as providing acquired images and their associated correction grids, the camera module 60 writes motion data 79 to system memory 80 so that it can be used by other applications or modules running in the device, thereby avoiding the need for a second IMU within the device.
(43) In addition to this functionality, the camera module 60 also incorporates a lens projection model 62 in order to enable the motion processing unit 70 to properly predict the behavior of an image projected by the camera lens 12 and acquired by the camera imaging sensor.
(44) Using this lens model 62, the motion processing unit 70 can control the correction grid provided for each image frame to take into account the characteristics of the image acquisition system, the lens model 62, as well as IMU input 20 and, if activated, OIS control 14.
(45) Typically, lens projection can be represented as a 3D grid that is used to transform an image in order to apply correction to artefacts caused by camera movement and rolling shutter for example as disclosed in WO2014/005783 (Ref: FN-384).
(46) As mentioned above, in some embodiments, the changes in camera orientation during image acquisition (R.sub.X R.sub.Y R.sub.Z) can be represented using quaternions. In this case, rather than converting the quaternion to a rotation matrix in order to transform a correction grid node, represented as a vector in a Cartesian coordinate system, by multiplication (which poses a much greater computational cost than multiplication of two quaternions), correction grid nodes determined based on the lens projection can be represented as quaternions and transformed as such. This is possible because the projection of a camera can be represented as a mapping of the intersections of the incoming light rays with a unit sphere to locations on the surface of the projection plane.
(47) Obtaining the quaternion Q=(w, i, j, k) by rotating a vector A=(0, 0, 1) representing the optical axis to P=(X, Y, Z) can be performed as follows:
1. d=dot(A,P) (vector dot product)
2. ax=cross(A,P) (vector cross product)
3. w=sqrt(norm(A)^2·norm(P)^2)+d (w component of the quaternion)
4. Q=normalizeQuaternion([w, i=ax(0), j=ax(1), k=ax(2)]) (final quaternion)
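The steps above correspond to the standard "shortest-arc" construction of a quaternion rotating one vector onto another, and can be sketched as follows (an illustrative sketch; the function names are hypothetical, and the degenerate case A = −P is not handled):

```python
import math

def normalize(q):
    # Scale a quaternion (w, i, j, k) to unit norm.
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def quat_from_to(A, P):
    # Shortest-arc quaternion rotating vector A onto vector P:
    # w = sqrt(|A|^2 |P|^2) + A.P, (i, j, k) = A x P, then normalize.
    d = A[0] * P[0] + A[1] * P[1] + A[2] * P[2]     # dot product
    ax = (A[1] * P[2] - A[2] * P[1],                # cross product
          A[2] * P[0] - A[0] * P[2],
          A[0] * P[1] - A[1] * P[0])
    norm2_A = sum(c * c for c in A)
    norm2_P = sum(c * c for c in P)
    w = math.sqrt(norm2_A * norm2_P) + d
    return normalize((w, ax[0], ax[1], ax[2]))
```

For unit vectors this reduces to the half-angle form (1+cos θ, sin θ·axis) before normalization, which is why the subsequent normalization yields the correct rotation quaternion.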
(48) The projection of a point P in 3D space, corresponding to a node of the correction grid, onto a projection plane is performed in step 3 of the following algorithm, after taking into account rotation of the camera. Thus calculation of the correction grid for an image comprises:
1. Back-projection from the image (sensor) space to 3D according to the lens projection model to provide quaternion representations Q for each node of a correction grid;
2. Rotation of the quaternions Q to take into account rotation measured by a gyroscope or other means. The rotation is calculated using input from a gyroscope within the IMU 20 and is itself represented as a quaternion;
3. Projection back to the image space. The reference 3D points (or quaternions Q as calculated above) based on the lens projection model are rotated according to the camera motion quaternions and projected back to 2D space to form a correction grid, which can then be used by the GPU 72 to perform the correction.
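The rotation step above (step 2) can be sketched with plain quaternion algebra, rotating a 3-D grid node P by the camera-motion quaternion via P′ = q P q*. This is an illustrative sketch only; function names are hypothetical, and the back-projection and re-projection steps depend on the particular lens projection model:

```python
import math

def qmul(a, b):
    # Hamilton product of two quaternions (w, i, j, k).
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def qconj(q):
    # Conjugate (inverse, for a unit quaternion).
    return (q[0], -q[1], -q[2], -q[3])

def rotate_node(P, q_motion):
    # Rotate a 3-D grid node P (a unit vector from the lens projection
    # model) by the unit camera-motion quaternion: P' = q P q*.
    p = (0.0, P[0], P[1], P[2])  # embed the point as a pure quaternion
    w, x, y, z = qmul(qmul(q_motion, p), qconj(q_motion))
    return (x, y, z)
```

Each rotated node is then projected back to 2D sensor coordinates to form the correction grid.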
(49) In some implementations, step 3 could be performed using a conventional approach, where finding the projected coordinates p=(x,y) of the reference point P=(X, Y, Z) requires finding the distance of the point P to the projection axis:
R=√(X.sup.2+Y.sup.2)
calculating the angle of incidence:
α=atan(R,Z)
obtaining the projected radius
r=ƒ(α)
and finally calculating the location of the point on the projection plane
(50) x=r·X/R, y=r·Y/R
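The conventional projection above can be sketched end to end as follows. This is an illustrative sketch: the function name is hypothetical, and an equidistant model r = f·α stands in for the actual lens function ƒ(α), which in practice comes from the lens projection model:

```python
import math

def project(P, f):
    # Conventional projection of a 3-D point P = (X, Y, Z) onto the
    # image plane, here with an equidistant model r = f * alpha.
    X, Y, Z = P
    R = math.hypot(X, Y)          # distance of P to the projection axis
    if R == 0.0:
        return (0.0, 0.0)         # on-axis point projects to the centre
    alpha = math.atan2(R, Z)      # angle of incidence
    r = f * alpha                 # projected radius r = f(alpha)
    return (r * X / R, r * Y / R)
```

As the next paragraph notes, the square root and atan( ) here are exactly the operations the quaternion lookup-table approach is designed to avoid.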
(51) However, it will be seen that the square roots and atan( ) function required to do so would be processor intensive.
(52) In embodiments based on quaternions, the w component cannot be used directly as an index to a lookup table representing conversion of w to the radius because of the high nonlinearity of this function. Rather than using w directly as an index to a look-up table, step 3 of the above process works as follows:
1. Take a quaternion (w, i, j, k) representing point orientation P(X,Y,Z) in space;
2. Calculate
(53)
(54) Now, once the correction grid taking into account lens projection and changes in camera orientation during image acquisition (R.sub.X R.sub.Y R.sub.Z) represented using quaternions has been determined as described above, the correction grid can be further adjusted to take into account OIS control during the acquisition of the image by for example, subtracting the translation T[ ] applied by the OIS controller 14 as measured by the IMU 20 from the correction grid during image acquisition.
(55) An alternative way of calculating the lens projection can be used in the case of forward distortion mapping. Here, the undistorted grid is associated with the sensor image and the pixel location on the output image is calculated. Since the regular grid is associated with the sensor, it can be assumed that pixels along a single row of grid nodes were captured at the same time and share the same camera orientation, so the correction rotation can be calculated once per row and applied to all the grid nodes belonging to that row. In this case it is more efficient to store the 3D reference grid as normal 3D vectors in Cartesian coordinates, calculate the rotation matrix from the quaternion and multiply the vectors by the matrix: matrix multiplication requires 9 scalar multiplications versus 16 for quaternion multiplication. This approach can also benefit from the fact that all the 3D vectors forming the reference grid lie on a unit sphere, so the Z coordinate of each vector is in a functional relationship to the angle of incidence. Following a similar idea to the quaternion case, the Z coordinate can be used for indexing a lookup table instead of the w component of the quaternion, with similar computational cost. Thus, point 2 above will have the following form:
(56)
(57) The lookup table has to be built accordingly, following similar steps to the quaternion approach (step 3). Step 4 is not necessary, as the direction is given explicitly by the X and Y components of the vector. Since the X, Y components do not form a unit vector, the lookup table needs to contain pre-calculated values of r/R instead of r alone.
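The Z-indexed variant above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and an equidistant model r = f·α with α = acos(Z) stands in for the real lens projection model from which the table would actually be built; the table stores r/R as the paragraph requires:

```python
import math
import numpy as np

def build_z_lut(f, n=1024):
    # Lookup table mapping the Z coordinate of a unit reference vector
    # to r/R, pre-calculated from the (assumed) lens model r = f*alpha.
    z = np.linspace(-1.0, 1.0, n)
    alpha = np.arccos(np.clip(z, -1.0, 1.0))          # angle of incidence
    R = np.sqrt(np.maximum(1.0 - z * z, 1e-12))       # R for a unit vector
    return z, (f * alpha) / R                         # pre-calculated r/R

def project_cartesian(P, z_grid, r_over_R):
    # Projection of a rotated unit reference vector using the Z-indexed
    # LUT: x = (r/R) * X, y = (r/R) * Y, with no sqrt or atan at runtime.
    X, Y, Z = P
    k = np.interp(Z, z_grid, r_over_R)
    return (k * X, k * Y)
```

Because X and Y are used directly, no separate direction-normalization step is needed, matching the observation that step 4 of the quaternion approach can be dropped.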
(58) Using a system such as shown in
(59) The correction grid 76 can be generated in two styles:
(60) 1. Forward mapping grid suitable for GPU-style correction where a content of the destination image is warped by the underlying warping grid as disclosed in WO2014/005783 (Ref: FN-384).
(61) 2. A texture mapping grid, where the transformation maps an output coordinate system to the source coordinate system.
(62) Generation of each correction grid 76 by the motion processing unit 70 can also take into account auto-focusing (AF) activity to prevent or smooth changes in image scale during the "focus hunting" process. In order to do so, the motion processing unit 70 acquires data from sensors 64 both internal and external to the camera module 60, for example, to track the DAC codes employed to drive the auto-focus mechanism, as described in more detail in WO2016/000874 (Ref: FN-396); this can be used to scale the correction grid to ensure that imaged objects, such as faces, maintain their scale within an image as focus is varied.
(63) Providing a correction grid 76 as meta-data together with an input image 74 relieves downstream processors such as the GPU 72 from needing to perform motion analysis and synchronization between motion data and image data.
(64) As will be appreciated, there may be some delay in the motion processing unit 70 calculating a correction grid 76 for any given acquired input image 74; for this reason, motion data 79 provided by the camera module 60 can be internally buffered and the correction grid 76 generated with a pre-programmed delay. For example, if the delay is 10 frames, the camera module 60 can buffer frames numbered 0 to 9 and, together with frame 10, provide a correction grid for frame 0 to the remainder of the system; then, with frame 11, provide a correction grid for frame 1, and so on. In this case the camera module 60 has to buffer a pre-defined number of frames, but this also allows for better motion analysis and more optimal image stabilization.
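The pre-programmed delay described above can be sketched as a simple frame buffer that emits frame N − delay, paired with its (now fully analysed) correction grid, each time frame N arrives. The class and method names are illustrative assumptions, not part of the disclosure:

```python
from collections import deque

class DelayedGridBuffer:
    """Sketch of the pre-programmed delay: frames are buffered so that
    each new frame is accompanied by the correction grid computed, with
    `delay` frames of look-ahead, for the oldest buffered frame."""

    def __init__(self, delay):
        self.delay = delay
        self.frames = deque()

    def push(self, frame, grid_for_oldest):
        # `grid_for_oldest` is the correction grid computed for the
        # oldest buffered frame (None while the buffer is still filling).
        self.frames.append(frame)
        if len(self.frames) > self.delay:
            return self.frames.popleft(), grid_for_oldest
        return None  # buffer not yet full; nothing emitted
```

With delay=10, frames 0 to 9 are buffered; on frame 10 the buffer emits frame 0 with its grid, on frame 11 it emits frame 1, and so on, mirroring the example in the text.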