Velocity estimation from imagery using symmetric displaced frame difference equation
09547911 · 2017-01-17
Assignee
Inventors
Cpc classification
H04N7/0137
ELECTRICITY
International classification
Abstract
A method and apparatus for processing an image sequence are described herein. First, a symmetric displaced frame difference equation is provided. Next, an input image sequence can be received that includes a pair of image frames individually including multidimensional image data corresponding to a plurality of pixel locations at different times. Finally, using at least one processor, the symmetric displaced frame difference equation can be solved using an iteration equation and the image data of the pair of image frames to determine a displacement field describing displacement vectors at half of the total displacement vector.
Claims
1. A method for processing an image sequence, comprising: providing a symmetric displaced frame difference equation; receiving an input image sequence comprising a pair of image frames individually including multidimensional image data corresponding to a plurality of pixel locations at different times; using at least one processor, solving the symmetric displaced frame difference equation using: an iteration equation and the image data of the pair of image frames to determine a displacement field describing displacement vectors at half of the total displacement vector, and at least one bilinear polynomial function expressing a multidimensional displacement field; and determining, using the at least one processor, displacement by motion for the input image sequence based on the solved symmetric displaced frame difference equation.
2. The method of claim 1, wherein the multidimensional image data of the pair of frames is two-dimensional, and wherein the symmetric displaced frame difference equation is
$I(i + u_{ij}\Delta t_h,\ j + v_{ij}\Delta t_h,\ t_2) - I(i - u_{ij}\Delta t_h,\ j - v_{ij}\Delta t_h,\ t_1) = 0$, wherein: $I$ is an intensity of the image data, $i$ is a pixel index in a horizontal direction x, $j$ is a pixel index in a vertical direction y orthogonal to the horizontal direction x, $u_{ij}$ is a component of the velocity vector on a pixel point, $v_{ij}$ is another component of the velocity vector on a pixel point, and $\Delta t_h$ is equal to $\Delta t/2$, wherein $\Delta t$ is a time difference between the pair of image frames ($t_1$ and $t_2$).
3. The method of claim 1, wherein solving the symmetric displaced frame difference equation comprises using at least one iteration equation derived from the symmetric displaced frame difference equation by conversion to a fully or over-constrained system using a nonlinear least squares model of the displacement field.
4. A motion estimator apparatus, comprising: at least one processor; and a memory storing a symmetric displaced frame difference equation, wherein the at least one processor is operative to: receive an input image sequence comprising a pair of image frames individually including multidimensional image data corresponding to a plurality of pixel locations at different times, solve the symmetric displaced frame difference equation using: an iteration equation and the image data of the pair of image frames to determine a displacement field describing displacement vectors at half of the total displacement vector, and at least one bilinear polynomial function expressing a multidimensional displacement field, and determine displacement by motion for the input image sequence based on the solved symmetric displaced frame difference equation.
5. The motion estimator apparatus of claim 4, wherein the multidimensional image data of the pair of frames is two-dimensional, and wherein the symmetric displaced frame difference equation is
$I(i + u_{ij}\Delta t_h,\ j + v_{ij}\Delta t_h,\ t_2) - I(i - u_{ij}\Delta t_h,\ j - v_{ij}\Delta t_h,\ t_1) = 0$, wherein: $I$ is an intensity of the image data, $i$ is a pixel index in a horizontal direction x, $j$ is a pixel index in a vertical direction y orthogonal to the horizontal direction x, $u_{ij}$ is a component of the velocity vector on a pixel point, $v_{ij}$ is another component of the velocity vector on a pixel point, and $\Delta t_h$ is equal to $\Delta t/2$, wherein $\Delta t$ is a time difference between the pair of image frames ($t_1$ and $t_2$).
6. The motion estimator apparatus of claim 4, wherein the at least one processor is operative to solve the symmetric displaced frame difference equation using at least one iteration equation derived from the symmetric displaced frame difference equation by conversion to a fully or over-constrained system using a nonlinear least squares model of the displacement field.
7. A non-transitory computer readable medium with computer executable instructions for: providing a symmetric displaced frame difference equation; receiving an input image sequence comprising a pair of image frames individually including multidimensional image data corresponding to a plurality of pixel locations at different times; solving the symmetric displaced frame difference equation using: an iteration equation and the image data of the pair of image frames to determine a displacement field describing displacement vectors at half of the total displacement vector, and at least one bilinear polynomial function expressing a multidimensional displacement field; and determining displacement by motion for the input image sequence, based on the solved symmetric displaced frame difference equation.
8. The non-transitory computer readable medium of claim 7: wherein the multidimensional image data of the pair of frames is two-dimensional, and wherein the symmetric displaced frame difference equation is
$I(i + u_{ij}\Delta t_h,\ j + v_{ij}\Delta t_h,\ t_2) - I(i - u_{ij}\Delta t_h,\ j - v_{ij}\Delta t_h,\ t_1) = 0$, wherein: $I$ is an intensity of the image data, $i$ is a pixel index in a horizontal direction x, $j$ is a pixel index in a vertical direction y orthogonal to the horizontal direction x, $u_{ij}$ is a component of the velocity vector on a pixel point, $v_{ij}$ is another component of the velocity vector on a pixel point, and $\Delta t_h$ is equal to $\Delta t/2$, wherein $\Delta t$ is a time difference between the pair of image frames ($t_1$ and $t_2$).
9. The non-transitory computer readable medium of claim 7, comprising computer-executable instructions for solving the symmetric displaced frame difference equation using at least one iteration equation derived from the symmetric displaced frame difference equation by conversion to a fully or over-constrained system using a nonlinear least squares model of the displacement field.
10. The method of claim 1, wherein receiving the input image sequence comprises receiving only two image frames.
11. The motion estimator apparatus of claim 4, wherein the input image sequence comprises only two image frames.
12. The non-transitory computer readable medium of claim 7, wherein receiving the input image sequence comprises receiving only two image frames.
13. The method of claim 1, further comprising: determining, using the at least one processor, displacement by motion for the input image sequence based on the solved symmetric displaced frame difference equation.
14. The method of claim 1, wherein solving the symmetric displaced frame difference equation using the iteration equation and the image data of the pair of image frames to determine a displacement field describing displacement vectors comprises: determining, using the at least one processor, a displacement vector corresponding to a moving particle in the image sequence.
15. The method of claim 14, further comprising: determining, using the at least one processor and the determined displacement vector corresponding to the moving particle, a plot of a motion particle trajectory of the moving particle.
16. The motion estimator apparatus of claim 4, wherein the at least one processor is further operative to: determine displacement by motion for the input image sequence based on the solved symmetric displaced frame difference equation.
17. The non-transitory computer readable medium of claim 7, further comprising computer-executable instructions for: determining, using the at least one processor, displacement by motion for the input image sequence based on the solved symmetric displaced frame difference equation.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The following description and drawings set forth certain illustrative implementations of the disclosure in detail, which are indicative of several exemplary ways in which the various principles of the disclosure may be carried out. The illustrated examples, however, are not exhaustive of the many possible embodiments of the disclosure. Other objects, advantages and novel features of the disclosure will be set forth in the following detailed description of the disclosure when considered in conjunction with the drawings, in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
(15) One or more embodiments or implementations are hereinafter described in conjunction with the drawings, where like reference numerals refer to like elements throughout, and where the various features are not necessarily drawn to scale.
(16)
(17)
(18) The estimator apparatus 100 receives the input image sequence 10 and generates a displacement vector field 140, which can be stored in the internal memory 120 and/or may be output by the apparatus 100 alone or as part of a processed image sequence 200. In addition, the estimator provides an equation system 130, which may be stored in the electronic memory 120. The illustrated estimator 100 further includes at least one iteration equation 132 and the bilinear displacement (or motion) vector function 134, which can be stored in the memory 120 or otherwise be accessible for use by the processor 110 in performing the displacement field estimation function set forth herein. In particular, the iteration equations 132 in certain embodiments are derived from the equation set 130 by conversion to a fully or over-constrained system using a nonlinear least squares model of the displacement field, as discussed further below. In addition, the bilinear motion vector function 134 in certain embodiments expresses a multidimensional displacement field. The bilinear displacement vector function 134 in certain embodiments can be represented by the compact form shown in equation (10) below.
(19) The equation system 130 is a fully constrained nonlinear set of equations, where four exemplary equations are illustrated in the example of
(20) In certain embodiments, the PROC equation Solver 110a is programmed to solve the equation set 130 using iterative numerical and PROC techniques to determine the displacement field 140, and may employ any suitable initial conditions and loop termination logic, including without limitation a maximum number of iterations per pixel location (i, j), alone or in combination with termination based on computed value changes being less than a predetermined threshold value. In certain embodiments, the Motion Vector Solver 110b solves the equations (3) and (4) below using a damped Newton-Raphson method with suitable initial values used in the computation. In other embodiments, the Motion Vector Solver 110b solves the equations (3) and (4) using bilinear modeling of the displacement field 140. The estimator 100 may provide the derived displacement field 140 for use in a variety of applications, such as video processing using an interpolator to construct one or more additional frames for frame rate up-conversion. In another example, the estimator 100 may provide the displacement field 140 for use with compression processing in a video encoder for selectively dropping certain frames 12 received with the input image sequence 10.
(21)
(22) At 302 in
(23) At 320 in
(24) In the illustrated embodiment, the displacement vectors on node points are solved at 130 (
(25) If the cost function is not minimized and no other termination conditions are satisfied (NO at 340 in
(26) As noted above, the motion estimator apparatus 100 employs a fully constrained nonlinear system of equations 130 which includes forward and backward displaced frame difference equations (EQ1 and EQ2 in
(27) As further discussed below, the inventor has appreciated that such an equation set 130 facilitates motion determination from an image sequence in a variety of computer vision and remote sensing applications such as velocity or displacement estimation from motion, object tracking or recognition, medical imaging, advanced video editing, and ocean surface current estimation. The use of this fully constrained equation set 130 without having to make further approximations or to impose any additional constraints or assumptions provides a novel solution to the inverse problem of motion estimation in two successive image frames 12. In this regard, determination of the instant velocity is ill-posed because the information about the path and rate is lost after a temporal sampling. If the images have enough texture morphologies, both initial and final configurations of a moving particle are recorded by the image sequence. The inverse problem for determination of the displacement field 140, however, is well-posed because both initial and final positions can be determined and observed physically based on the input image sequence 10.
(28) The inventor has appreciated that if the initial and final positions of a particular artifact (e.g., artifact 14 in
(29) As used herein, conservative velocity is not intended to mean that the physical velocity is a conservative quantity. This concept of equivalent motion fields between the displacement and the conservative velocity is based on the definition. If the initial and final positions of a moving particle or artifact in two successive frames are the only input information, the unique total displacement vector may be associated with many different intermediate dynamic motion states. However, only the motion with the conservative velocity is collinear with the displacement trajectory (a straight line). Focusing strictly on the initial and final results, the conservative velocity as used herein is the resolved velocity field, i.e., the solved velocity is only an average (conservative) velocity. To avoid any misunderstanding of this concept, all derivations in this disclosure are based on the displacement vector field 140 that describes displacement vectors ($\Delta\mathbf{r}_{ij}(t_1)$, $\Delta\mathbf{r}_{ij}(t_2)$) with respect to pixel locations (i, j) at the first and second times ($t_1$, $t_2$). In this respect, it is understood that there is only limited information available from an input image sequence 10 to study the inverse problem without any way of absolutely knowing the real physical (dynamic) processes or the identity of physical objects, which can be rigid bodies, liquid flow, or deformed objects in an image scene 12, but the above-described process 300 can be advantageously employed to determine motion fields that are consistent with the physical observation.
(30) To illustrate, a function I(r, t) is defined as a scalar property (e.g., intensity) of a continuum of particles that may be expressed as a point function of the coordinates (r=r(t)), where r is a multidimensional position (e.g., x and y positions in a two dimensional example). In other examples, depending on the application, I(r, t) can be defined as the intensity of the optical flow (in computer vision), the tracer concentration (ocean color), or the temperature of heat flow (in geosciences and remote sensing). If the intensity between two images is conserved, then the following equation (1) represents the conservation constraint, regardless of a source term:
(31) $$\frac{dI(\mathbf{r}(t), t)}{dt} = \frac{\partial I}{\partial t} + \mathbf{v} \cdot \nabla I = 0, \qquad (1)$$
where the operator
(32) $$\frac{d}{dt}$$
denotes a total derivative with respect to time t, and
(33) $$\mathbf{v} = \frac{d\mathbf{r}}{dt}$$
is the velocity vector in Cartesian coordinates. Equation (1) is sometimes referred to as the optical flow or brightness conservation constraint equation (in computer vision), or the heat flow or tracer conservation equation (in geophysics), and is a differential form of the conservation constraint. The differential form of the conservation constraint equation (1), which contains linear terms of the components of the velocity, holds only for infinitesimal or nearly infinitesimal motions.
In order to constrain the image scenes at time $t = t_1$ and $t = t_2$, equation (1) is integrated from time $t_1$ to $t_2$ as follows:
(34) $$\int_{t_1}^{t_2} \frac{dI(\mathbf{r}(t), t)}{dt}\,dt = I(\mathbf{r}(t_2), t_2) - I(\mathbf{r}(t_1), t_1) \equiv 0, \qquad (2)$$
where $\mathbf{r}(t_1)$ and $\mathbf{r}(t_2)$ are the position vectors at time $t_1$ and $t_2$. If a displacement vector field is defined by $\Delta\mathbf{r} = \mathbf{r}(t_2) - \mathbf{r}(t_1)$, then the displaced frame difference (DFD) equation is given by $\mathrm{DFD} = I(\mathbf{r}(t_1) + \Delta\mathbf{r}, t_2) - I(\mathbf{r}(t_1), t_1) \equiv 0$. The conservation constraint equation (1) is thus temporally integrated into the path independent equation (2). It is noted that although the DFD equation is an implicit function of the displacement field 140, the two path independent terms in equation (2) correspond to the initial and final states of motion associated with the two successive frames for this conservation system. Employing the DFD equation derived by integrating between the times of two successive frames can achieve higher accuracy in comparison with the differential form of the conservation constraint optical flow equation (1) (a first order term of the Taylor expansion of the DFD), especially for large scale displacement motion estimation. As the image intensity change at a point due to motion has only one constraint equation (1), while the motion vector at the same point has two components (a projection case), the motion field was previously believed to be not computable without an additional constraint.
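The accuracy argument above can be illustrated numerically. The following is a minimal sketch, not the disclosed implementation: it assumes a hypothetical synthetic scene (`intensity`) that translates rigidly with a constant velocity, and compares the DFD residual with a first-order (optical-flow style) residual built from the frame difference and the spatial gradient at $t_1$. All function names and constants are illustrative assumptions.

```python
import math


def intensity(x, y, t, u=3.0, v=-2.0):
    """Hypothetical scene: a smooth pattern translating with velocity (u, v)."""
    return math.sin(0.4 * (x - u * t)) * math.cos(0.3 * (y - v * t))


def dfd(x, y, dx, dy, t1, t2):
    """Displaced frame difference I(r + dr, t2) - I(r, t1)."""
    return intensity(x + dx, y + dy, t2) - intensity(x, y, t1)


def linearized_residual(x, y, dx, dy, t1, t2, h=1e-4):
    """First-order approximation: frame difference plus dr . grad(I) at t1."""
    ix = (intensity(x + h, y, t1) - intensity(x - h, y, t1)) / (2 * h)
    iy = (intensity(x, y + h, t1) - intensity(x, y - h, t1)) / (2 * h)
    frame_difference = intensity(x, y, t2) - intensity(x, y, t1)
    return frame_difference + ix * dx + iy * dy


# True displacement over one frame interval is (3, -2) pixels.
exact = dfd(5.0, 7.0, 3.0, -2.0, 0.0, 1.0)
approx = linearized_residual(5.0, 7.0, 3.0, -2.0, 0.0, 1.0)
print(f"DFD residual: {exact:.3e}, linearized residual: {approx:.3e}")
```

For this several-pixel displacement the DFD residual vanishes at the true displacement, while the linearized residual does not, which is consistent with the statement that the differential form holds only for small motions.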
(35) Referring also to
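$$\Delta\mathbf{r}(\mathbf{r}(t_2), t_2) + \Delta\mathbf{r}(\mathbf{r}(t_2) + \Delta\mathbf{r}(\mathbf{r}(t_2), t_2), t_1) = 0, \qquad (3)$$
or
$$\Delta\mathbf{r}(\mathbf{r}(t_1), t_1) + \Delta\mathbf{r}(\mathbf{r}(t_1) + \Delta\mathbf{r}(\mathbf{r}(t_1), t_1), t_2) = 0,$$
and the Conservative Velocity Constraint (CVC) equation is given by
$$\mathbf{v}(\mathbf{r}(t_1) + \mathbf{v}(\mathbf{r}(t_1), t_1)\Delta t, t_2) = \mathbf{v}(\mathbf{r}(t_1), t_1), \qquad (4)$$
or
$$\mathbf{v}(\mathbf{r}(t_2), t_2) = \mathbf{v}(\mathbf{r}(t_2) - \mathbf{v}(\mathbf{r}(t_2), t_2)\Delta t, t_1),$$
where $\Delta t = t_2 - t_1$.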
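As a hedged illustration of the DVI relationship, the sketch below takes a hypothetical smooth forward displacement field (`forward_field`, an assumption for demonstration only) and recovers the backward displacement at a fixed position so that the backward vector plus the forward vector evaluated at the displaced position sum to zero. A plain fixed-point iteration is used here purely for brevity; it is a swapped-in technique, not one of the solution methods described in this disclosure.

```python
import math


def forward_field(x, y):
    """Hypothetical forward displacement field dr(r, t1) in pixels."""
    return 2.0 + 0.1 * math.sin(0.5 * y), -1.0 + 0.1 * math.cos(0.5 * x)


def backward_from_forward(x, y, iterations=50):
    """Solve dr(r, t2) + dr(r + dr(r, t2), t1) = 0 for the backward vector.

    Fixed-point iteration b <- -forward(r + b); it converges here because
    the assumed forward field varies slowly with position.
    """
    bx, by = 0.0, 0.0
    for _ in range(iterations):
        fx, fy = forward_field(x + bx, y + by)
        bx, by = -fx, -fy
    return bx, by


bx, by = backward_from_forward(10.0, 4.0)
fx, fy = forward_field(10.0 + bx, 4.0 + by)
print("backward vector:", (bx, by))
print("DVI residual:", (bx + fx, by + fy))
```

The printed residual checks that the two fields differ by a positional shift equal to the displacement itself, rather than being equal and opposite at the same pixel.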
(36) The DVI equation (3) establishes an implicit function relationship between the forward and backward displacement fields. Both displacement and conservative velocity fields are equivalent based on the definition. According to the definition of the forward and backward displacement vectors $\Delta\mathbf{r}(\mathbf{r}(t_1), t_1)$ and $\Delta\mathbf{r}(\mathbf{r}(t_2), t_2)$ in
(37)
(38) Equations (5) indicate that the displacement and conservative velocity vectors are equivalent because each vector can be obtained from the other by multiplying or dividing by a time difference factor ($t_2 - t_1$) or ($t_1 - t_2$). The DVI equation (3) indicates that the motion fields at time $t_1$ and $t_2$ are not equal at the same position, but both fields have a shift from each other for moving objects. The shift vector is the displacement vector of the time differences. Equation (3) or (4) establishes an implicit recursive relationship at time $t_1$ and $t_2$ for the displacement vector fields or conservative velocity fields. The inventor has appreciated that since the DVI equation (3) or the CVC equation (4) is a vector equation, the total number of component equations is equal to the number of dimensions of the velocity, and as a result, if one field is given or solved then the other corresponding field can be determined completely by equation (3) or (4).
(39) A fully constrained system of equations 130 can thus be provided in the motion estimator apparatus 100 using the motion compensation concept. The temporal integral form of the conservation constraint equation (2) or the DFD equation indicates that the image at time $t_1$ can be predicted by motion-compensated prediction with an image at time $t_2$ and the displacement field 140 at time $t_1$, thus facilitating interpolation or extrapolation for video coding and/or frame rate up-conversion applications. Likewise, the image at time $t_2$ can be predicted with an image at time $t_1$ and the displacement field 140 at time $t_2$. The inventor has thus found that the equation (2) can be described forward and backward at a fixed position based on the displacement fields 140 at different times, $t_1$ and $t_2$.
(40) Assuming that the number of pixels in an image frame 12 is equal to N, the total number of data points is equal to 2N for an input image sequence 10 having two frames (e.g., frames 12.sub.t1 and 12.sub.t2 in
(41)
where $\Delta\mathbf{r}(\mathbf{r}, t_1)$ and $\Delta\mathbf{r}(\mathbf{r}, t_2)$ are the forward and backward displacement fields 140 at a fixed position $\mathbf{r}$ as shown in
(42) The forward and backward displacement vectors on two image frames 12.sub.t1 and 12.sub.t2 at times t.sub.1 and t.sub.2 as shown in
(43)
Since the above two equations hold for all image scenes, the equations (6) result if all position vectors in above equations are at a fixed position r, and the first and second equations are denoted by FDFD and BDFD. The solution to the inverse problem is therefore based on the motion analysis in physics, the DVI equation, and recognized different displacement fields at time t.sub.1 and t.sub.2 in both FDFD and BDFD equations. If the forward displacement field is chosen as independent variables, the DVI equations (whether forward DVI equations or backward DVI equations are used) link the FDFD and BDFD equations together for solving the forward two component displacement field.
(44) Using the conservative velocity and CVC equation to replace the displacement vector and DVI equation in (6), a fully constrained nonlinear system of equations 130 can be determined for solving the conservative velocity field. Defining a function with discrete variables i and j as $f_{ij}(t) = f(i, j, t)$, the FDFD, BDFD, and the component DVI equations in (6) on a given pixel point (i, j) are given by the following equations (7) and (8) as the equation set 130:
(45)
where $\Delta\mathbf{r}_{ij}(t) = (\Delta x_{ij}(t), \Delta y_{ij}(t))^T$. The displacement vector field at time $t_2$ is a function of the field at time $t_1$ based on the DVI equation (8), and the same is true of the backward DVI equations discussed below. There are several available techniques for solving nonlinear systems of equations 130. For example, a damped Newton-Raphson method is an efficient technique, but convergence may depend on having a good guess for the initial values, especially for a high-dimensional problem.
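To make the damped Newton-Raphson idea concrete, here is a minimal sketch on a small two-variable nonlinear system. The system itself (`residual`, with a root at x = 1, y = 2) is a hypothetical stand-in chosen for illustration and is not the equation set 130; the step-halving rule is one common damping choice and is likewise an assumption.

```python
def residual(x, y):
    """Hypothetical two-equation nonlinear system with a root at (1, 2)."""
    return x * x + y - 3.0, x + y * y - 5.0


def jacobian(x, y):
    """Analytic Jacobian of the toy system above."""
    return (2.0 * x, 1.0), (1.0, 2.0 * y)


def norm(f):
    return (f[0] ** 2 + f[1] ** 2) ** 0.5


def damped_newton(x, y, tol=1e-12, max_iter=50):
    """Newton-Raphson with step halving until the residual norm decreases."""
    for _ in range(max_iter):
        f = residual(x, y)
        if norm(f) < tol:
            break
        (a, b), (c, d) = jacobian(x, y)
        det = a * d - b * c
        # Newton step s = -J^{-1} f for a 2x2 Jacobian.
        sx = -(d * f[0] - b * f[1]) / det
        sy = -(a * f[1] - c * f[0]) / det
        step = 1.0
        while step > 1e-4 and norm(residual(x + step * sx, y + step * sy)) >= norm(f):
            step *= 0.5  # damping: shorten the step if it does not help
        x, y = x + step * sx, y + step * sy
    return x, y


print(damped_newton(3.0, 3.0))
```

Starting from a different initial guess may lead to a different root or to slower convergence, which is the sensitivity to initial values noted above.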
(46) Referring also to
(47)
where the function H.sub.a,b (x, y) is defined by:
(48)
and where the parameters of block size n.sub.x and n.sub.y are the sampled spaces of the function f on x and y directions as shown in
(49)
where $\lfloor\,\cdot\,\rfloor$ denotes an integer operator. The {p, q} serve as tile indices because the integer operator increments them by unity after an additional $n_x$ or $n_y$ pixels are counted.
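The tile-based interpolation described above can be sketched as follows. This is an illustrative example using the standard bilinear weights; it does not reproduce the exact form of the function H in this disclosure. The storage layout (`nodes[a][b]` holding the value at pixel `(a * nx, b * ny)`) and the function name are assumptions.

```python
def bilinear_from_nodes(nodes, i, j, nx, ny):
    """Interpolate a value at pixel (i, j) from values stored on node points.

    nodes[a][b] is assumed to hold the function value at pixel (a*nx, b*ny).
    """
    p, q = i // nx, j // ny            # tile indices from the integer operator
    s = (i - p * nx) / nx              # fractional position inside the tile
    t = (j - q * ny) / ny
    p1 = min(p + 1, len(nodes) - 1)    # clamp at the last row/column of nodes
    q1 = min(q + 1, len(nodes[0]) - 1)
    return ((1 - s) * (1 - t) * nodes[p][q]
            + s * (1 - t) * nodes[p1][q]
            + (1 - s) * t * nodes[p][q1]
            + s * t * nodes[p1][q1])


# Node values sampled from f(i, j) = 2*i + 3*j with tiles of 4 x 2 pixels.
nx, ny = 4, 2
nodes = [[2.0 * a * nx + 3.0 * b * ny for b in range(4)] for a in range(3)]
print(bilinear_from_nodes(nodes, 5, 3, nx, ny))
```

Because bilinear interpolation reproduces linear functions exactly, the value printed for pixel (5, 3) matches 2*5 + 3*3.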
(50) Referring also to
(51) The displacement field is modeled at 602 in the adaptive framework 600 of
(52)
where the displacement vector $\Delta\mathbf{r}_{ij} = \Delta\mathbf{r}_{ij}(t_1) = (\Delta x_{ij}, \Delta y_{ij})$. In the special case when the block size is unity ($n_x = n_y = 1$), $\Delta\mathbf{r}_{ij} = \Delta\mathbf{r}_{pq}$ for all indices i and j. All displacement vectors $\Delta\mathbf{r}_{ij}$ can be calculated with the bilinear function using the displacements on node points, expressed as $\Delta\mathbf{r}_{pq}$. In the over-constrained case ($n = n_x = n_y > 1$), off-node displacement vectors $\Delta\mathbf{r}_{ij}$ are no longer independent variables; only the vectors on node points are independent. In the fully constrained case ($n = 1$), every pixel is a node point.
(53) In this regard, the block (tile) size parameter $n \ge 1$ can be adjusted to control the number of interpolation points related to the resolution of the displacement field and the degree of the over-constraint. When the parameter n is equal to one, all node points and pixel positions coincide and the system is fully constrained. The system is over-constrained if the parameter n is greater than one.
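The relationship between block size and degree of over-constraint can be illustrated with a simple count. This sketch assumes two equations per pixel (one FDFD and one BDFD) and two unknown displacement components per node point, with node points placed every n pixels; the function name and the node-layout formula are assumptions for illustration, not taken from the disclosure.

```python
def constraint_ratio(width, height, n):
    """Ratio of equations to unknowns for a square block size n.

    Assumes 2 equations per pixel and 2 unknowns per node point, with
    node points at every n-th pixel along each axis.
    """
    equations = 2 * width * height
    nodes_x = (width - 1) // n + 1
    nodes_y = (height - 1) // n + 1
    unknowns = 2 * nodes_x * nodes_y
    return equations / unknowns


for n in (1, 2, 4, 8):
    print(n, round(constraint_ratio(65, 65, n), 2))
```

Under these assumptions a ratio of 1 corresponds to the fully constrained case (n = 1), and the ratio grows roughly as the square of n for larger blocks.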
(54) A nonlinear least-squares model can also be used at 604 in
(55)
where the range of i and j is the entire ($N = N_x \times N_y$) image pixel domain ($i \in [0, N_x - 1]$, $j \in [0, N_y - 1]$), and the weighting factor can be equal to unity in this case. By minimizing the cost function with respect to the displacement components $\Delta x_{kl}$ and $\Delta y_{kl}$ as variables for given indices k and l on all node points, a fully or over-constrained system of equations (11) for the displacement may be written as follows:
(56)
(57) In addition, the summation domain is reduced from the entire image plane to only a local region .sub.kl or .sub.kl, so that:
(58)
where ={k, k} and ={1, 1}, and regions of the summation coverage k and l are defined by {k, l}={k+x.sub.k, I+y.sub.kl}. To obtain equation (14) below, the following is used:
(59)
where $\delta_{ij}$ is the Kronecker delta symbol.
(60) Equations (11) obtained by the nonlinear least-squares model are a nonlinear system of equations with all displacement components $\Delta x_{pq}$ and $\Delta y_{pq}$ on node points as variables. To solve the nonlinear system of equations (11), an iterative equation can be formulated at 606 in
(61)
where m is an iteration index, and
$$\{\mathrm{FDFD}_{ij}^{(m)}, \mathrm{BDFD}_{ij}^{(m)}\} = \{\mathrm{FDFD}_{ij}(\Delta x_{pq}^{(m)}, \Delta y_{pq}^{(m)}),\ \mathrm{BDFD}_{ij}(\Delta x_{pq}^{(m)}, \Delta y_{pq}^{(m)})\}.$$
(62) Utilizing these expansions and the above equations (11), the following iterative equations can be used for all indices k and l:
(63)
(64) The parameter $\lambda \ge 0$ is a Levenberg-Marquardt factor that is adjusted at each iteration to guarantee that the MSE is convergent. A smaller value of the factor $\lambda$ can be used, bringing the algorithm closer to the Gauss-Newton method with second-order convergence. This Levenberg-Marquardt method can greatly improve convergence properties in practice and has become the standard for nonlinear least-squares routines.
(65) All displacement components $\Delta x_{pq}$ and $\Delta y_{pq}$ on node points can be obtained by iterating the equations in (12), and the displacement components $\Delta x_{ij}$ and $\Delta y_{ij}$ off node points (for n>1) can be calculated by the bilinear functions in (10). Ultimately, all displacement vectors $\Delta\mathbf{r}$ can be determined, and an optimum solution can be achieved over the large-scale image. In addition, the block size parameter $n \ge 1$ can be adjusted to control the number of interpolation points within a tile, to resolve the displacement field, and to control the degree of the over-constraint.
(66) An optimized motion-compensated prediction is also possible. In particular, using an additionally constrained system to estimate the motion field, the inverse problem has previously been addressed by minimizing an objective function with a weighting (penalty) parameter. However, there are two major issues with this use of a weighting parameter. The first is the determination of the optimized weighting parameter: several different values of the weighting parameter have been proposed, but it is difficult to find a single optimal value of the parameter for realistic applications if the ground truth flow field is unknown. The second issue is that the Peak Signal-to-Noise Ratio (PSNR):
(67)
is not optimized by minimizing the objective function with the weighting parameter. The flow field estimated by this approach does not always lead to an optimal Motion-Compensated Prediction or Interpolation (MCI) image for video compression applications.
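For reference, the following sketch computes PSNR in its commonly used form, 10·log10(peak² / MSE). The peak value of 255 (8-bit imagery) and the list-of-rows image layout are assumptions; the exact expression used in this disclosure is not reproduced here.

```python
import math


def psnr(predicted, reference, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are lists of rows of numbers; peak=255 assumes 8-bit data.
    """
    squared_error = 0.0
    count = 0
    for row_p, row_r in zip(predicted, reference):
        for p, r in zip(row_p, row_r):
            squared_error += (p - r) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


reference = [[10.0, 20.0], [30.0, 40.0]]
predicted = [[11.0, 21.0], [31.0, 41.0]]
print(psnr(predicted, reference))
```

Because PSNR is a monotonically decreasing function of the MSE, minimizing the MSE of a motion-compensated prediction is equivalent to maximizing its PSNR.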
(68) As seen above, however, the iteration equations (12) are derived based on the least-squares principle that leads directly to a solution of the displacement field 140 with a minimized target function MSE or a maximized PSNR (An average PSNR for the Forward and Backward MCP (FMCP and BMCP)). Since both FDFD and BDFD are equivalent physical quantities, the target function PSNR is an optimized function without any additional parameters. Therefore the MCP image using the estimated displacement field 140 based on the fully constrained system 130 is believed to be optimized.
(69) The adjustable block size approach, using a smaller number of displacement components on nodes to interpolate a full density displacement field 140 (for the n>1 case), provides a powerful capability for different applications in both computer vision and remote sensing fields. If the block size shown in
(70) With respect to iteration algorithms, certain embodiments of the process 300 and apparatus 100 can start from a set of preset initial values of the displacement field $\Delta\mathbf{r}^{(0)}$ at time $t_1$; the Motion Vector Solver 110b then solves the correspondence field $\Delta\mathbf{r}^{(0)}(t_2)$ numerically using all the component equations (8). An iteration step based on these two displacement fields at time $t_1$ and $t_2$ and the iterative equations in (12) can be performed. Furthermore, employing a principle similar to that of the Gauss-Seidel iteration algorithm, updated values of $\Delta x_{pq}^{(m)}$ and $\Delta y_{pq}^{(m)}$ can be used on the right-hand side of equation (12) as soon as they become available during the iteration processing. In certain embodiments, all initial displacement field vectors are preset to be equal to 0.01, and the initial Levenberg-Marquardt factor is set to 0.001 and is multiplied by ten in each step if the iteration is not convergent.
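The iteration settings described above (initial displacement 0.01, initial Levenberg-Marquardt factor 0.001, multiplied by ten when a step does not reduce the MSE) can be sketched on a toy problem. This is a simplified illustration, not the iteration equations (12): it estimates a single global shift of a hypothetical smooth pattern `g` using only a forward DFD residual, a finite-difference Jacobian, and a damping term added to the diagonal. All names and the test pattern are assumptions.

```python
import math

TRUE_SHIFT = (1.5, -0.8)  # hypothetical ground-truth displacement
PIXELS = [(i, j) for i in range(20) for j in range(20)]


def g(x, y):
    """Hypothetical smooth first-frame pattern."""
    return math.sin(0.3 * x) + math.cos(0.25 * y) + 0.5 * math.sin(0.2 * (x + y))


def residuals(dx, dy):
    """Forward DFD per pixel; frame 2 is frame 1 shifted by TRUE_SHIFT."""
    a, b = TRUE_SHIFT
    return [g(i + dx - a, j + dy - b) - g(i, j) for i, j in PIXELS]


def mse(res):
    return sum(r * r for r in res) / len(res)


def levenberg_marquardt(dx=0.01, dy=0.01, lam=0.001, steps=60, h=1e-6):
    res = residuals(dx, dy)
    for _ in range(steps):
        # Central-difference Jacobian columns with respect to dx and dy.
        jx = [(p - m) / (2 * h) for p, m in
              zip(residuals(dx + h, dy), residuals(dx - h, dy))]
        jy = [(p - m) / (2 * h) for p, m in
              zip(residuals(dx, dy + h), residuals(dx, dy - h))]
        a11 = sum(v * v for v in jx)
        a22 = sum(v * v for v in jy)
        a12 = sum(u * v for u, v in zip(jx, jy))
        g1 = sum(u * r for u, r in zip(jx, res))
        g2 = sum(v * r for v, r in zip(jy, res))
        # Solve (J^T J + lam * I) s = -J^T r for the 2x2 system.
        det = (a11 + lam) * (a22 + lam) - a12 * a12
        sx = -((a22 + lam) * g1 - a12 * g2) / det
        sy = -((a11 + lam) * g2 - a12 * g1) / det
        trial = residuals(dx + sx, dy + sy)
        if mse(trial) < mse(res):
            dx, dy, res = dx + sx, dy + sy, trial  # accept the step
        else:
            lam *= 10.0  # not convergent: increase the damping factor
    return dx, dy, lam


dx, dy, lam = levenberg_marquardt()
print(dx, dy)
```

In the full method the unknowns are the node displacements of the whole field rather than a single shift, so the linear solve at each step is correspondingly larger.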
(71) The FDFD and BDFD equations on each pixel include two motion-compensated predictions $I(i \pm \Delta x_{ij},\ j \pm \Delta y_{ij},\ t_{\{2,1\}})$ with arguments that may fall between pixel positions in an image scene. To compute the motion-compensated predictions, the general bilinear interpolation function in (4) is used as follows:
(72)
where the function H is evaluated when $n_x = n_y = 1$, and $\{p, q\} \equiv \{p(i \pm \Delta x_{ij}),\ q(j \pm \Delta y_{ij})\}$.
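A minimal sketch of this sub-pixel evaluation is given below. The `sample` function performs standard bilinear interpolation on the pixel grid (block size of one), and `symmetric_dfd` applies it to the symmetric displaced frame difference recited in the claims, in which each frame is sampled at half of the total displacement. The border clamping, the `image[i][j]` layout, and the function names are assumptions made for illustration.

```python
import math


def sample(image, x, y):
    """Bilinear interpolation of image[i][j] at a non-integer position (x, y).

    Positions outside the image are clamped to the border (an assumption).
    """
    nx, ny = len(image), len(image[0])
    x = min(max(x, 0.0), nx - 1.0)
    y = min(max(y, 0.0), ny - 1.0)
    p = min(int(math.floor(x)), nx - 2)
    q = min(int(math.floor(y)), ny - 2)
    s, t = x - p, y - q
    return ((1 - s) * (1 - t) * image[p][q]
            + s * (1 - t) * image[p + 1][q]
            + (1 - s) * t * image[p][q + 1]
            + s * t * image[p + 1][q + 1])


def symmetric_dfd(frame1, frame2, i, j, dx, dy):
    """I(i + dx/2, j + dy/2, t2) - I(i - dx/2, j - dy/2, t1)."""
    return (sample(frame2, i + 0.5 * dx, j + 0.5 * dy)
            - sample(frame1, i - 0.5 * dx, j - 0.5 * dy))


# Frame 2 is frame 1 translated by (1, 2) pixels.
frame1 = [[2.0 * i + j for j in range(10)] for i in range(10)]
frame2 = [[2.0 * (i - 1) + (j - 2) for j in range(10)] for i in range(10)]
print(symmetric_dfd(frame1, frame2, 5, 5, 1.0, 2.0))  # true displacement
print(symmetric_dfd(frame1, frame2, 5, 5, 0.0, 0.0))  # zero displacement
```

The residual vanishes at the true displacement and equals the plain frame difference when the displacement is set to zero.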
(73) The proposed displacement estimation approach leads to a nonlinear system of equations that may have multiple solutions depending on texture morphology in an image sequence. For example, to estimate a displacement field within a featureless region, the displacement field may not be unique because the initial and final positions of a particular particle cannot be physically determined. Even in a texture-rich environment, the realistic motion fields may have multiple possibilities, which satisfy the same equations and are consistent with the same physical observation. The multiple solutions in the inverse problem by solving a nonlinear system are congruent with this physical property.
(74) The inventor has further appreciated that, in order to approach a globally minimized solution using the iteration equations (12), a PROC algorithm that adapts a variable resolution of the displacement structure during the iterations can be employed. In certain embodiments, an initial block size parameter $n_0$ is selected to be greater than a preset value of the block size n at the initial iteration, and it reaches the preset value of n at the final iteration. In certain embodiments, the displacement field can be regularized by changing the block size parameter n from a larger value (higher degree of over-constraint) to a smaller one (lower degree of over-constraint), decreasing it by one every Nth iteration until it reaches the preset value of n. The inventor has appreciated that the PROC algorithm is helpful for seeking a flow field in which each vector is consistent with its neighbors.
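The coarse-to-fine block size schedule can be sketched as a simple function. The function name and parameters (`n0`, `n`, `every`, `total_iterations`) are hypothetical; the sketch only shows one way to decrease the block size by one every Nth iteration until the preset value is reached.

```python
def block_size_schedule(n0, n, every, total_iterations):
    """Block size used at each iteration, starting at n0 and stepping down to n.

    The size is reduced by one every `every` iterations and then held at n.
    """
    sizes = []
    current = n0
    for m in range(total_iterations):
        if m > 0 and m % every == 0 and current > n:
            current -= 1
        sizes.append(current)
    return sizes


print(block_size_schedule(4, 2, 3, 10))
# [4, 4, 4, 3, 3, 3, 2, 2, 2, 2]
```

Early iterations therefore solve a more strongly over-constrained (smoother) problem, and later iterations refine the field at the target resolution.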
(75) As seen in
(76)
If the conservative velocity field $\{\Delta x_{ij}(t_1), \Delta y_{ij}(t_1)\}$ at time $t_1$ is given or solved, then the correspondence field $\{\Delta x_{ij}(t_2), \Delta y_{ij}(t_2)\}$ at time $t_2$ can be determined (or vice versa) by all component equations in (A1) or (A2). Thus, the processor 110 of the estimator 100 is programmed to solve the equation set 130 using the forward and backward displaced frame difference equations in combination with either the forward DVI equations (A1) (
(77) In one example, when the velocity field $\{\Delta x_{ij}(t_1), \Delta y_{ij}(t_1)\}$ or $\{\Delta x_{ij}(t_2), \Delta y_{ij}(t_2)\}$ is given, numerically solving the equation set 130 at time $t_2$ using equations (A1) or (A2) by the Motion Vector Solver 110b involves expanding the field by a bilinear polynomial function, where a bilinear expression of a two-dimensional displacement field is given by equation (10).
(78) The forward and backward DVI equations (A1) and (A2) are implicit recursive functions of the fields ({x.sub.ij(t.sub.1), y.sub.ij(t.sub.1)}) at time t and t.sub.1. Three methods for solving the matrix field {x.sub.ij(t.sub.2), y.sub.ij(t.sub.2)} are described below for the case in which the matrix field {x.sub.ij(t.sub.1), y.sub.ij(t.sub.1)} are given, and it will be appreciated that the converse problem can be solved by similar techniques for solving the equation set 130 at time t.sub.1 where the matrix field is given at time t.sub.2.
(79) An interpolation method with a searching algorithm can be used to solve the equation set 130 using either of the forward DVI equations (
(80) Another technique involves solving the forward DVI equations (A1) by a Newton-Raphson method, such as a damped Newton-Raphson method. Assuming that the forward DVI equations (A1) are nonlinear functions with variables x.sub.ij(t.sub.2) and y.sub.ij(t.sub.2), the system of equations 130 can be solved by a Newton-Raphson method. The two-component nonlinear system of equations with variables x.sub.ij(t.sub.2) and y.sub.ij(t.sub.2) is given by:
(81)
(82) Using the bilinear function in the above equation (A3) to expand the given field at time t.sub.1, all off site values (with non-integer value variables) of the given field at time t.sub.1 are evaluated by the function (10). Since both indexes p and q are functions of the variables x.sub.ij(t.sub.2) and y.sub.ij(t.sub.2), the variables x.sub.ij(t.sub.2) and y.sub.ij(t.sub.2) cannot be solved directly from the above equations. However, these equations are quasi-quadratic, and thus can be solved by a damped Newton-Raphson method. Iteration equations for solving the matrix field at time t for all i and j are given by:
(83)
where m is an iteration index. All derivatives with respect to the variables x.sub.ij(t.sub.2) and y.sub.ij(t.sub.2) in the above equations can be evaluated by the bilinear function (10). The two index variables p and q in function (10) are integer functions of the variables x.sub.ij(t.sub.2) and y.sub.ij(t.sub.2), but the derivative of an integer function is equal to zero, and thus:
(84)
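As a rough illustration of the damped Newton-Raphson iteration referred to above, the following Python sketch solves a generic two-component nonlinear system F(x, y) = 0. The names `damped_newton_2d`, `toy_system`, and `toy_jacobian`, the step-halving damping rule, and the toy system itself are assumptions made for illustration only; they are not the forward DVI equations (A1) or the iteration equations of this disclosure.

```python
import numpy as np

def damped_newton_2d(F, J, x0, tol=1e-10, max_iter=50):
    """Solve F(x) = 0 for a 2-vector x using Newton steps with step halving."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        f = F(x)
        norm_f = np.linalg.norm(f)
        if norm_f < tol:
            break
        step = np.linalg.solve(J(x), -f)  # full Newton step
        damping = 1.0
        # Damping (assumed rule): halve the step until the residual norm decreases.
        while damping > 1e-6 and np.linalg.norm(F(x + damping * step)) >= norm_f:
            damping *= 0.5
        x = x + damping * step
    return x

# Hypothetical toy system (not equations (A1)): x^2 + y^2 = 4 and x*y = 1.
def toy_system(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 4.0, x[0] * x[1] - 1.0])

def toy_jacobian(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [x[1], x[0]]])

root = damped_newton_2d(toy_system, toy_jacobian, x0=[2.0, 0.3])
```

In this sketch the Jacobian is supplied analytically; in the setting described above, the derivatives would instead be evaluated from the bilinear function (10).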
(85) A third approach can be used to solve equation set 130 employing the backward DVI equations (A2) (
(86)
where x.sub.1 and y.sub.1 are two components of a position vector. Two new index variables are introduced as follows:
(87)
The above equations become:
(88)
(89) All the displacement vector fields x.sub.ij(t.sub.2) and y.sub.ij(t.sub.2) in equation (A6) on all pixel points (i, j) can be determined by off-site field at time t.sub.1 after all position coordinates x.sub.1 and y.sub.1 are solved from equations (A5). According to the bilinear expansion in equation (A3), the displacement field at time t.sub.1 can be expressed by:
(90)
(91) Iterative equations for solving position coordinate x.sub.1 and y.sub.1 are given by
(92)
(93) Using the property of the integer function that the derivative is zero yields:
(94)
All position coordinates x.sub.1 and y.sub.1 for given indexes i and j, and the displacement field at time t.sub.1, can be solved in only a few iteration steps, because these equations are quasi-quadratic for this motion model.
(95) In certain implementations, the Motion Vector Solver 110b is programmed to solve the forward displacement vector invariant equations (
(96) Referring now to
(97) In an alternative exemplary embodiment of the invention, a method for processing an image sequence can include an under-constrained system with a symmetric displaced frame difference equation (SDFD).
(98) As noted herein, there can be numerous local minima in image data applications when the nonlinear model is used to deal with a problem of global nonlinear minimization with a huge number of unknown parameters. A convergent solution can depend on having a good guess for the initial values, especially for a problem of huge dimension. However, in an exemplary embodiment of the invention, to avoid obtaining a specious local minimum solution when solving a nonlinear problem using an initial guess far away from the real flow field, an SDFD equation with only half displacement parameters used for resolving the total displacement field can be utilized.
(99) The estimated motion vector from two successive images is a displacement vector (or average velocity in the time interval) between the initial and final positions of a particle in the scene as shown in
(100) The motion field can be estimated from successive NOAA Advanced Very High Resolution Radiometer (AVHRR) images or ocean-color imagery spaced several hours apart, and the field is relevant to the important question of surface (or near-surface) drift velocity. For example, one application of this would be tracking the drift of pollution (e.g., an oil plume).
(101) The image intensity I(r, t) measured by brightness (optical flow from visible band images) or temperature (heat flow from thermal images) can be a scalar function of position coordinates r(t) and time t. A brightness or tracer conservation constraint equation can be as follows:
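(102) $$\frac{dI(\mathbf{r}(t),t)}{dt}=\frac{\partial I}{\partial t}+\mathbf{v}\cdot\nabla I=0 \qquad (B1)$$
where operator
(103) $$\frac{d}{dt}=\frac{\partial}{\partial t}+\mathbf{v}\cdot\nabla$$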
denotes a total derivative with respect to time t. The differential form of equation (B1) can be linear in the velocity components and can hold only for infinitesimal or nearly infinitesimal motions. Using two successive frames, an estimate of partial derivatives with respect to time in (B1) can be made; however, equation (B1) may not exactly describe the motion between the two successive frames, especially for large-scale displacement motion.
(104) In order to constrain the image scenes at time t=t.sub.1 and t=t.sub.2, equation (B1) can be integrated from time t.sub.1 to t.sub.2
(105) $$I(\mathbf{r}(t_2),t_2)-I(\mathbf{r}(t_1),t_1)=0 \qquad (B2)$$
where r(t.sub.1) and r(t.sub.2) can be the position vectors at time t.sub.1 and t.sub.2.
(106) In order to estimate a motion field with as small a displacement magnitude as possible, a half displacement vector Δr.sub.h=Δr/2=vΔt.sub.h can be defined along the total displacement vector 1210 as shown in
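(107) $$\mathbf{r}(t_2)=\mathbf{r}(t_1+\Delta t_h)+\mathbf{v}\,\Delta t_h,\qquad \mathbf{r}(t_1)=\mathbf{r}(t_1+\Delta t_h)-\mathbf{v}\,\Delta t_h$$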
Substituting the position vectors in the above equations into equation (B2), the symmetric displaced frame difference (SDFD) equation can be given by
$$\mathrm{DFCD}=I(\mathbf{r}(t_1+\Delta t_h)+\mathbf{v}\,\Delta t_h,\;t_2)-I(\mathbf{r}(t_1+\Delta t_h)-\mathbf{v}\,\Delta t_h,\;t_1)=0 \qquad (B3)$$
(108) The conservation constraint equation (B1) can be integrated into path-independent equation (B3), and all terms in equation (B3) are associated with intensity at time t.sub.1 and t.sub.2. It can be clear that employing the SDFD equation (B3) for motion estimations can achieve higher accuracy in comparison with the heat or optical flow equation (B1), especially for large scale displacement estimation. Since the estimated displacement vectors within the terms in equation (B3) are only half of the total displacement vector, convergence to a global optimal solution can be more robust than when the total displacement vector is estimated by an iteration algorithm.
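The property that equation (B3) holds exactly for a large displacement can be checked with a small Python sketch. It assumes a pure translation of a smooth analytic pattern between the two frames; the names `pattern`, `frame1`, `frame2`, and `sdfd_residual`, and the displacement values, are hypothetical and chosen only for illustration.

```python
import numpy as np

def pattern(x, y):
    """Hypothetical smooth tracer pattern used as the frame at time t1."""
    return np.sin(0.3 * x) * np.cos(0.2 * y)

# Assumed total displacement (in pixels) between t1 and t2: a pure translation.
d = np.array([6.0, -4.0])

def frame1(x, y):
    return pattern(x, y)

def frame2(x, y):
    return pattern(x - d[0], y - d[1])

def sdfd_residual(r, half_disp):
    """I(r + v*dt_h, t2) - I(r - v*dt_h, t1), with r the midpoint position."""
    return (frame2(r[0] + half_disp[0], r[1] + half_disp[1])
            - frame1(r[0] - half_disp[0], r[1] - half_disp[1]))

r_mid = np.array([10.0, 7.0])
res_true = sdfd_residual(r_mid, d / 2.0)        # half of the total displacement
res_wrong = sdfd_residual(r_mid, np.zeros(2))   # zero-motion guess
```

With the half displacement equal to d/2 the residual vanishes even though the total displacement spans several pixels, whereas the zero-motion guess leaves a clearly nonzero residual.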
(109) The intensity terms at time t.sub.1 and t.sub.2 in equation (B3) can be calculated from the intensity fields on the image sequence. The inverse problem for solving the velocity vector is under-constrained because two unknown velocity components must be derived from a single SDFD equation (B3) at each of these pixel points.
(110) In an exemplary embodiment of the invention, one of the efficient approaches for resolving the velocity field from the under-constrained system can be to express the velocity field as bilinear polynomial functions or two-dimensional B-Splines functions. Bilinear modeling of the displacement field can be utilized to develop a unified adaptive framework that can allow the estimator to control the resolutions of the displacement field and the degree of over-constraint, and to simplify the computational complexity. The velocity fields u(x, y, t) and v(x, y, t) can be represented as two-dimensional bilinear functions controlled by a smaller number of velocity estimates u(p, q, t) and v(p, q, t) which can lie on a coarser node point in grid as shown in
(111) In general, any two-dimensional function can be approximated by a Lagrange's bilinear function
(112)
where function H.sub.a,b(x, y) can be the bilinear forms. The image domain can be partitioned into sub-domains, or tiles, each of which can contain an n.sub.x×n.sub.y array of pixels as represented in
(113)
where ⌊ ⌋ denotes an integer operator. The {p, q} serve as tile indices, since the integer operator increments them by unity after an additional n.sub.x or n.sub.y pixels are counted. The bilinear function H.sub.a,b (x, y) can be defined by
(114)
(115) The two component velocity fields on pixels in an image can be approximated by the following discrete forms of the bilinear approximation functions with first order continuity that holds globally for all N=N.sub.x×N.sub.y pixels of the image
(116)
where a function with discrete variables i and j can be denoted by v.sub.ij=v.sub.ij(t)=v(i, j, t) or .sub.ij(t)=(i, j, t). In a special case, v.sub.ij can be equal to v.sub.pq for all indices i and j when block size parameter n is unity (n=n.sub.x=n.sub.y=1).
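A minimal sketch of how a coarse grid of on-node values can be expanded to every pixel by bilinear interpolation is given below. It assumes that nodes sit at pixel indices that are multiples of a single block size n and that the image spans a whole number of tiles; the function name `expand_nodes_bilinear` and the test field are hypothetical and are not taken from the equations of this disclosure.

```python
import numpy as np

def expand_nodes_bilinear(node_vals, n):
    """Interpolate values on a coarse node grid (spacing n pixels) to every pixel.

    node_vals has shape (Q, P) with Q, P >= 2; the output has shape
    ((Q - 1) * n + 1, (P - 1) * n + 1).
    """
    node_vals = np.asarray(node_vals, dtype=float)
    ny = (node_vals.shape[0] - 1) * n + 1
    nx = (node_vals.shape[1] - 1) * n + 1
    j, i = np.mgrid[0:ny, 0:nx]
    # Tile indices of the lower corner node, clipped so that the last pixel
    # row and column fall inside the final tile.
    q = np.minimum(j // n, node_vals.shape[0] - 2)
    p = np.minimum(i // n, node_vals.shape[1] - 2)
    # Fractional position inside the tile, in [0, 1].
    fy = (j - q * n) / n
    fx = (i - p * n) / n
    return ((1 - fx) * (1 - fy) * node_vals[q, p]
            + fx * (1 - fy) * node_vals[q, p + 1]
            + (1 - fx) * fy * node_vals[q + 1, p]
            + fx * fy * node_vals[q + 1, p + 1])

n = 4
qq, pp = np.mgrid[0:3, 0:4]
u_nodes = 2.0 * (pp * n) + 3.0 * (qq * n)   # hypothetical on-node velocities
u_pixels = expand_nodes_bilinear(u_nodes, n)
```

With n = 1 every pixel is a node and the function returns the node values unchanged, which corresponds to the special case noted above.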
(117) All off-node velocities can be calculated by equation (B6) using the on-node velocities expressed as u.sub.pq and v.sub.pq. The SDFD equation in (B3) can become
$$\mathrm{DFCD}_{ij}=I(i+u_{ij}\Delta t_h,\;j+v_{ij}\Delta t_h,\;t_2)-I(i-u_{ij}\Delta t_h,\;j-v_{ij}\Delta t_h,\;t_1)=0, \qquad (B7)$$
where u.sub.ij and v.sub.ij are two components of velocities on each pixel with horizontal and vertical indices {i, j} for the whole image. The SDFD in (B7) is a function of variables of two velocity components that depend on node point velocities in (B6), i.e.
DFCD.sub.ij=DFCD.sub.ij(u.sub.ij,v.sub.ij)=DFCD.sub.ij(u.sub.pq,v.sub.pq).
(118) The SDFD equations (B3) and (B7) are equivalent only when the block size parameters n.sub.x=1 and n.sub.y=1; in this case, p=i and q=j. All independent SDFD.sub.ij equations have only a smaller number of the independent velocities on nodes when the velocity indices i=p(i) and j=q(j). The total number of SDFD equations is N=N.sub.x×N.sub.y for an N.sub.x×N.sub.y image sequence. The number of node points shown in
(119)
(120) The total number of independent velocity fields with two components u.sub.pq and v.sub.pq is 2N.sub.node. It can be clear that this system is over-constrained because the number of SDFD equations in (B7) for all pixels is greater than the number of independent velocity components u.sub.pq and v.sub.pq (i.e., N>2N.sub.node) if block size n.sub.x>1 and n.sub.y>1.
(121) The presence of possible quantization errors and noise in the remote sensing measurements suggests that the SDFD equation in (B7) is never identically zero; however, we can choose a set of v.sub.ij for which it can be minimized in a least-squares sense. Accordingly, we define a Mean Square Error (MSE) function based on equation (B7) as
(122) $$\mathrm{MSE}=\frac{1}{N}\sum_{i,j}\mathrm{DFCD}_{ij}^{2}$$
where i and j go over all pixels in the N.sub.x×N.sub.y image (i∈[0, N.sub.x−1] and j∈[0, N.sub.y−1]). Minimizing the cost function with parameters u.sub.kl and v.sub.kl as variables for given indices k and l on all node points in an image represented in
(123)
To obtain equation (B8), the following is used
(124)
where symbol δ.sub.ij is the Kronecker-Delta symbol, and the summation domain is reduced from the entire image plane to only a local region Ω.sub.kl, so that
(125)
The two independent equations in (B8) on a node point can reduce to a single SDFD equation (B7) when block sizes n.sub.x and n.sub.y are equal to unity. Of the original pixel set, the velocities u.sub.kl and v.sub.kl and their nearest neighbors are the only members contained in the summations in equation (B8).
(126) To solve the nonlinear system of equations (B8), the SDFD can be expanded in a Taylor series using the Gauss-Newton method
(127)
where m is an iteration index, and
DFCD.sub.ij.sup.(m)=DFCD.sub.ij(u.sub.pq.sup.(m),v.sub.pq.sup.(m))
Substituting SDFD expansion into equation (B8), the following iterative equations of two component velocity u.sub.kl and v.sub.kl can be found for all indexes k and l on node points
(128)
where λ≥0 is a Levenberg-Marquardt factor that can be adjusted at each iteration to guarantee that the objective function MSE is convergent. A smaller value of the factor λ can be used, bringing the algorithm closer to the Gauss-Newton method with second-order convergence. This Levenberg-Marquardt method can improve convergence properties efficiently in practice and has become the standard of nonlinear least-squares routines.
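The role of the Levenberg-Marquardt factor can be illustrated by a generic single update step for a small least-squares problem. The sketch below is not the iteration equation (B10); it assumes Marquardt's diagonal scaling of the normal matrix, and the function name `lm_step` and the linear toy residuals are hypothetical.

```python
import numpy as np

def lm_step(residuals, jac, lam):
    """One Levenberg-Marquardt update for the parameters of a least-squares fit.

    residuals: shape (M,), residual values at the current parameters.
    jac:       shape (M, K), partial derivatives of the residuals.
    lam:       non-negative damping factor; lam = 0 gives a Gauss-Newton step.
    """
    jtj = jac.T @ jac
    # Assumed scaling: inflate each diagonal entry by the factor (1 + lam).
    damped = jtj + lam * np.diag(np.diag(jtj))
    return np.linalg.solve(damped, -jac.T @ residuals)

# Hypothetical linear residuals r = A @ theta - b, for which Gauss-Newton is exact.
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
theta_true = np.array([0.5, -0.25])
b = A @ theta_true
theta0 = np.zeros(2)
step_gn = lm_step(A @ theta0 - b, A, lam=0.0)
step_lm = lm_step(A @ theta0 - b, A, lam=10.0)
```

In this toy case the step with lam = 0 lands on the exact solution in one update, while the step with lam = 10 is much shorter, which is the trade-off between speed and robustness that the factor controls.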
(129) The block size parameters n=n.sub.x=n.sub.y can be adjusted to control the degree of the over-constrained system and robustness to noise, and to obtain the velocity field from high to low resolutions of the field structures for different applications.
(130) The new approach is named as the Estimator with the SDFD equation to differentiate from the traditional Linear GOS (LGOS) method.
(131) The SDFD equation on each pixel can consist of two motion-compensated predictions I(i±u.sub.ijΔt.sub.h, j±v.sub.ijΔt.sub.h, t.sub.{2,1}) with variables that may fall off the pixel positions in an image. In order to compute the motion-compensated predictions, the general bilinear interpolation function in (B4) can be utilized as follows
(132)
where the function H.sub.a,b (x, y) in (B5) can be evaluated when n.sub.x=n.sub.y=1, and {p, q}={p(i±u.sub.ijΔt.sub.h), q(j±v.sub.ijΔt.sub.h)}.
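A minimal sketch of the two motion-compensated predictions and their per-pixel difference is given below, using plain bilinear interpolation on a unit pixel grid. The function names `sample_bilinear` and `dfcd_field`, the clipping of positions at the image border, and the ramp test images are assumptions made for illustration.

```python
import numpy as np

def sample_bilinear(img, x, y):
    """Bilinearly interpolate img (indexed img[j, i]) at real-valued positions (x, y)."""
    h, w = img.shape
    x = np.clip(x, 0.0, w - 1.0)   # assumed border handling: clamp to the image
    y = np.clip(y, 0.0, h - 1.0)
    p = np.minimum(np.floor(x).astype(int), w - 2)   # integer parts {p, q}
    q = np.minimum(np.floor(y).astype(int), h - 2)
    fx, fy = x - p, y - q
    return ((1 - fx) * (1 - fy) * img[q, p] + fx * (1 - fy) * img[q, p + 1]
            + (1 - fx) * fy * img[q + 1, p] + fx * fy * img[q + 1, p + 1])

def dfcd_field(img1, img2, u, v, dt_h):
    """Per-pixel difference of the two motion-compensated predictions."""
    j, i = np.mgrid[0:img1.shape[0], 0:img1.shape[1]].astype(float)
    return (sample_bilinear(img2, i + u * dt_h, j + v * dt_h)
            - sample_bilinear(img1, i - u * dt_h, j - v * dt_h))

jj, ii = np.mgrid[0:16, 0:16].astype(float)
img1 = 2.0 * ii + 3.0 * jj                     # hypothetical ramp image at t1
img2 = 2.0 * (ii - 2.0) + 3.0 * (jj - 1.0)     # same ramp displaced by (2, 1) pixels
u = np.full(img1.shape, 1.0)                   # velocity for dt = 2, so dt_h = 1
v = np.full(img1.shape, 0.5)
resid = dfcd_field(img1, img2, u, v, dt_h=1.0)
mse_interior = np.mean(resid[2:-2, 2:-2] ** 2)  # border pixels excluded (clamping)
```

The mean is taken over interior pixels only because positions clamped at the border no longer follow the assumed translation.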
(133) The partial derivatives in (B9) with respect to coordinates x and y can be computed by their finite differences. In order to improve computation accuracy of numerical differentiation from the discrete set of images, the partial differential method can be implemented with two-point difference (with mask coefficients {1, 0}) and N-point central differences (for example, with mask coefficients {−1, 0, 1}/2, {1, −8, 0, 8, −1}/12, or {−1, 9, −45, 0, 45, −9, 1}/60) for differentiation. The number of the convolution points in the differentiation can be dependent on an application. In general, a larger number of the central differences can improve the performance of the velocity estimation if the scale of the feature structures (some featureless regions are located within these structures) is larger. Moreover, a spatial Gaussian low pass filter with standard deviation about one pixel in space can be used to smooth the gradient fields.
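The central-difference masks can be applied along the x direction as in the following sketch, which uses the standard signed 3-, 5-, and 7-point coefficients for unit pixel spacing. The function name `ddx` and the choice to leave border columns at zero are assumptions for illustration.

```python
import numpy as np

# Standard central-difference masks (offsets run from -half to +half).
MASKS = {
    3: np.array([-1.0, 0.0, 1.0]) / 2.0,
    5: np.array([1.0, -8.0, 0.0, 8.0, -1.0]) / 12.0,
    7: np.array([-1.0, 9.0, -45.0, 0.0, 45.0, -9.0, 1.0]) / 60.0,
}

def ddx(img, points=5):
    """Derivative along axis 1 (x); columns without full mask support stay 0."""
    mask = MASKS[points]
    half = points // 2
    width = img.shape[1]
    out = np.zeros(img.shape, dtype=float)
    for k, c in enumerate(mask):
        # Coefficient k multiplies the pixel at offset (k - half) from the center.
        out[:, half:width - half] += c * img[:, k:width - 2 * half + k]
    return out

xs = np.arange(12.0)
cubic = np.tile(xs ** 3, (4, 1))      # hypothetical test image, I = x^3
d_cubic = ddx(cubic, points=5)        # equals 3 x^2 on interior columns
```

The derivative along y can be obtained by applying the same function to the transposed image and transposing the result.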
(134) The gradient field smoothing processes can improve the performance of the velocity estimation, especially for real world image data, because a motion can be observed and detected only if there are enough textures in the region of interest. Smoothing gradient vector fields can help increase gradient values in the featureless regions from their higher gradient neighbors by a convolution processing.
(135) First, gradient fields with components in x and y can be evaluated on each pixel using a numerical differentiation method. Then, a Gaussian low-pass filter with standard deviation from 0.375 to 1.125 pixels in space can be used to smooth the gradient fields. Finally, utilizing the general interpolation function in (B4) when n.sub.x=n.sub.y=1, the values of the gradient fields in equations (B9) and (B10) can be calculated at any position (x, y)=(i±u.sub.ijΔt.sub.h, j±v.sub.ijΔt.sub.h) in an image scene.
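The smoothing step can be sketched with a separable Gaussian kernel as follows. The kernel cutoff at three standard deviations, the edge-replication padding, and the names `gaussian_kernel` and `smooth_gradient` are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(sigma, truncate=3.0):
    """Normalized 1-D Gaussian kernel; the cutoff at truncate*sigma is assumed."""
    radius = max(1, int(np.ceil(truncate * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def smooth_gradient(grad, sigma=1.0):
    """Separable Gaussian low-pass filter with edge replication at the borders."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(grad, r, mode="edge")
    # Filter along x (axis 1), then along y (axis 0); "valid" removes the padding.
    rows = np.apply_along_axis(lambda a: np.convolve(a, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda a: np.convolve(a, k, mode="valid"), 0, rows)

gx = np.zeros((9, 9))
gx[4, 4] = 1.0                       # hypothetical isolated gradient value
gx_smooth = smooth_gradient(gx, sigma=1.0)
```

After smoothing, the neighbors of the isolated value carry nonzero gradient, which is the effect described above for featureless regions next to textured ones.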
(136) Almost all methods for solving the velocity in an image sequence can be based on an assumption that the surface temperature, ocean color, or brightness is conserved. Most of the successive satellite-borne images are recorded over a long temporal range or from different sensors in remote sensing applications. The conservation constraint of the intensity is not always satisfied for such applications. For example, there can be calibration differences and diurnal warming of the surface layer for the AVHRR images, and sun angle differences for ocean color reflectance data.
(137) In the case of a global gray level change in an image scene, gray scale normalization techniques exist that address the problem; these were originally developed for side looking radar image analysis. Applying these techniques, one image can be normalized with respect to the other by a linear transformation in such a manner that the gray level distributions in both images have the same mean and variance. Assuming the first image at time t=t.sub.1 is a reference image, then the normalized second image at time t=t.sub.2 can be given by
(138) $$\hat{I}(\mathbf{r},t_2)=\frac{\sigma_1}{\sigma_2}\left[I(\mathbf{r},t_2)-\mu_2\right]+\mu_1$$
where I and I-hat are the intensity recorded at time t=t.sub.2 and the normalized intensity, and μ.sub.i and σ.sub.i are the mean and standard deviation of the intensity values recorded within unmasked regions in the image at time t=t.sub.i.
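The gray scale normalization can be sketched as a linear rescaling of the second image so that its mean and standard deviation over the unmasked pixels match those of the reference image. The function name `normalize_to_reference` and the use of one shared boolean mask for both images are assumptions for illustration.

```python
import numpy as np

def normalize_to_reference(img2, img1, mask=None):
    """Linearly rescale img2 so its unmasked mean and std match those of img1."""
    if mask is None:
        mask = np.ones(img1.shape, dtype=bool)   # True marks unmasked pixels
    mu1, sigma1 = img1[mask].mean(), img1[mask].std()
    mu2, sigma2 = img2[mask].mean(), img2[mask].std()
    return (sigma1 / sigma2) * (img2 - mu2) + mu1

rng = np.random.default_rng(0)
ref = rng.normal(10.0, 2.0, size=(20, 20))   # hypothetical reference image at t1
second = 3.0 * ref + 5.0                     # same scene with a gain and an offset
second_norm = normalize_to_reference(second, ref)
```

In this toy case the second image differs from the reference only by a gain and an offset, so the normalization recovers the reference image.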
(139) The velocity estimation with the approaches herein can lead to solving a nonlinear system of equations that may have multiple solutions depending on texture structures in an image sequence. For example, to estimate a velocity field within a featureless region, the velocity field may not be unique because the initial and final positions of a particular tracked particle cannot be physically determined. This inconsistency between physical motion and observation in a featureless region is the intrinsic physical property of the inverse problem in the real world. The realistic motion fields may have multiple possibilities which satisfy the same equations and can be consistent with the same physical observation. The multiple solutions in the inverse problem by solving a nonlinear system can be congruent with this physical property.
(140) In order to approach a globally minimized solution using iteration equations (B10), an algorithm of progressive relaxation of the over-constraint (PROC) that adapts a variable resolution of the velocity structure during the iterations can be employed in this framework. An initial block size parameter n.sub.0 can always be set to be greater than a preset value of the block size n at the initial iteration, and can reach the preset value of n at the end of the iteration. The velocity field can be regularized by changing the block size parameter n from a larger value (higher degree of over-constraint) to a smaller one (lower degree of over-constraint) by one on every N-th iteration until it approaches the preset value of n. The PROC algorithm can be helpful for seeking a flow field in which each vector is consistent with its neighbors.
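The block size schedule of the PROC algorithm can be written as a one-line rule, as in the sketch below. The function name `proc_block_size`, the zero-based iteration index, and the example values n.sub.0 = 8, n = 4, and N = 5 are hypothetical.

```python
def proc_block_size(m, n0, n_final, every):
    """Block size at iteration m (0-based): start at n0, decrease by one every
    `every` iterations, and never go below the preset value n_final."""
    return max(n_final, n0 - m // every)

# Example schedule: n0 = 8 relaxed down to the preset n = 4, one step per 5 iterations.
schedule = [proc_block_size(m, n0=8, n_final=4, every=5) for m in range(30)]
```

The schedule holds each block size for five iterations and then stays at the preset value once it has been reached.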
(141) All initial displacement field vectors can be preset to be equal to 0.01. The initial Levenberg-Marquardt factor can be set to 0.001. The procedures can start from a set of preset initial values of the velocity field, then an iteration step at a fixed pixel point based on the previous iterative velocity field and the iterative equations in (B10) can be performed. Furthermore, employing a principle similar to that of the Gauss-Seidel iteration algorithm, updated values of u.sub.pq.sup.(m) and v.sub.pq.sup.(m) on the right-hand side of (B10) can be used as soon as they become available during the iteration processing. After all pixels are scanned in the above iteration procedures, the program can then check the convergence criterion by comparing the previous MSE.sup.(m-1) (before the iteration) with the current MSE.sup.(m) (after the iteration). If the MSE.sup.(m-1) is greater than the MSE.sup.(m), then the program can take the next iteration. Otherwise, the Levenberg-Marquardt factor can be multiplied by ten and the iteration procedures can be repeated until MSE.sup.(m)<MSE.sup.(m-1). Combined with the PROC algorithm, it can be found that the program converges to a solution for almost all test cases for remote sensing datasets.
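The accept-or-retry logic described above can be sketched as a short control loop. The sketch uses a scalar toy problem rather than the iteration equations (B10); the names `run_iterations`, `toy_cost`, and `toy_step`, the starting value, and the upper bound `max_lam` on the factor are assumptions for illustration.

```python
import numpy as np

def run_iterations(cost, step, params, lam=0.001, max_iter=100, max_lam=1e8):
    """Accept a step only if it lowers the cost; otherwise multiply lam by ten."""
    mse_prev = cost(params)
    history = [mse_prev]
    for _ in range(max_iter):
        trial = params + step(params, lam)
        mse_new = cost(trial)
        while mse_new >= mse_prev and lam < max_lam:
            lam *= 10.0                      # stronger damping, shorter step
            trial = params + step(params, lam)
            mse_new = cost(trial)
        if mse_new >= mse_prev:
            break                            # no improving step was found
        params, mse_prev = trial, mse_new
        history.append(mse_prev)
    return params, history

# Hypothetical scalar problem: residual arctan(theta), whose undamped
# Gauss-Newton step overshoots when started far from the solution theta = 0.
def toy_cost(theta):
    return float(np.arctan(theta[0]) ** 2)

def toy_step(theta, lam):
    r = np.arctan(theta[0])
    j = 1.0 / (1.0 + theta[0] ** 2)
    return np.array([-r / (j * (1.0 + lam))])   # Gauss-Newton step / (1 + lam)

theta, history = run_iterations(toy_cost, toy_step, np.array([3.0]))
```

In this toy case the first trial steps are rejected until the factor has grown from 0.001 to 10, after which every step lowers the cost.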
(142) If a Peak Signal-to-Noise Ratio (PSNR) is defined by
(143)
three typical converging curves that demonstrate PSNR versus the converged iteration index with different initial displacement vectors (in pixel scale) for estimating the velocity field using the PROC algorithm can be plotted. The second-order convergence properties of the Gauss-Newton and Levenberg-Marquardt methods can indicate that the PSNR for the image sequence has a sharp increase within the first five iterations. However, a convergence to a static value of 43.0 (dB) can be approached after forty-four iterations for all different initial guesses.
(144) The flow field estimation framework has been validated on both synthetic and real world infrared image sequences. First, using the solution of a numerical simulation model as a benchmark, a surface tracer field can be introduced as an initial condition. Angular and magnitude measures of error can be introduced in this section, and their mean values can be applied to evaluate the performance of the velocity estimations. Finally, qualitative evaluations can be performed on real satellite-borne infrared images.
(145) In an exemplary embodiment of the invention, a system and method are described herein to estimate surface velocity between two successive images, such as two satellite images. A symmetric displaced frame difference, or displaced frame central difference, equation can be utilized with iterative equations, and Gauss-Newton and Levenberg-Marquardt algorithms can be formulated for the determination of velocity field. An adaptive framework for solving the nonlinear problem can be developed based on the SDFD equations, velocity field modeling, a nonlinear least-squares model, and an algorithm of progressive relaxation of the over-constraint for seeking a flow field in which each vector is consistent with its neighbors.
(146) There are two major advantages of using the SDFD equation in applications. The toughest problem is the convergence problem in resolving a nonlinear system with a huge number of parameters. Therefore, the first advantage of the SDFD equation is that the scale of the displacement vector in the SDFD equation is only half of the total displacement of motion. The program using the SDFD equation can converge easily to a global optimal solution by iteration, even when the preset initial velocity field is far away from a converged solution. The second advantage is that the spatial derivatives in the iteration equation (B10) are for both image scenes for this nonlinear model. The average of the derivatives in equation (B9) for two tracer structures at different times can be helpful to improve the performance of the velocity estimation.
(147) In an exemplary embodiment of the invention, a unified adaptive framework for motion field estimation from an image sequence is described herein. The computational and implementation complexities of the inverse problem have been greatly simplified by the new iteration approach with equations (B10) for full dense velocity estimation. Moreover, the resolved displacement vectors in an iteration procedure are only half of the total travel distance of a feature structure. Using the SDFD equation, the flow field by the adaptive framework can converge rapidly to a global optimal solution, especially for larger scale displacement estimation. Since both linear and nonlinear models are iterative, there is no significant difference in computation cost between the linear and nonlinear models in a single iteration.
(148) The proposed approach using a smaller number of on-node velocity components to interpolate full density velocity field can have potential applications in both computer vision and remote sensing fields. The feature of robustness to noise by adjusting the block size is demonstrated using AVHRR image sequences in remote sensing applications.
(149) Experimental evaluations, comparisons, and applications with a numerical simulation model and infrared image sequences have been demonstrated. Both average angular and magnitude error measurements between the estimated flow fields, and the numerical simulation model and the CODAR array indicate that the proposed work provides major improvement over the linear inverse model for the simulation and real AVHRR data sets.
(150) The above examples are merely illustrative of several possible embodiments of various aspects of the present disclosure, wherein equivalent alterations and/or modifications will occur to others skilled in the art upon reading and understanding this specification and the annexed drawings. In particular regard to the various functions performed by the above described components (processor-executed processes, assemblies, devices, systems, circuits, and the like), the terms (including a reference to a means) used to describe such components are intended to correspond, unless otherwise indicated, to any component, such as hardware, processor-executed software, or combinations thereof, which performs the specified function of the described component (i.e., that is functionally equivalent), even though not structurally equivalent to the disclosed structure which performs the function in the illustrated implementations of the disclosure. In addition, although a particular feature of the disclosure may have been illustrated and/or described with respect to only one of several implementations, such feature may be combined with one or more other features of the other implementations as may be desired and advantageous for any given or particular application. Also, to the extent that the terms including, includes, having, has, with, or variants thereof are used in the detailed description and/or in the claims, such terms are intended to be inclusive in a manner similar to the term comprising.
(151) Portions of the invention can comprise a computer program that embodies the functions described herein. Furthermore, the modules described herein, such as the tomography module, scope algorithm module, and optimization module, can be implemented in a computer system that comprises instructions stored in a machine-readable medium and a processor that executes the instructions. However, it should be apparent that there could be many different ways of implementing the invention in computer programming, and the invention should not be construed as limited to any one set of computer program instructions. Further, a skilled programmer would be able to write such a computer program to implement an exemplary embodiment based on the flow charts and associated description in the application text. Therefore, disclosure of a particular set of program code instructions is not considered necessary for an adequate understanding of how to make and use the invention. The inventive functionality of the claimed computer is explained herein in more detail read in conjunction with the figures illustrating the program flow.
(152) It should be understood that the foregoing relates only to illustrative embodiments of the present invention, and that numerous changes may be made therein without departing from the scope and spirit of the invention as defined by the following claims.