COMPUTER-IMPLEMENTED METHOD AND SYSTEM FOR PROCESSING VIDEO WITH TEMPORAL CONSISTENCY
20170345163 · 2017-11-30
Inventors
- Rida SADEK (Barcelona, ES)
- Juan Francisco GARAMENDI (Fuenlabrada, ES)
- Pablo Arias Martinez (Barcelona, ES)
CPC classification
G06T2207/20182
PHYSICS
G11B27/031
PHYSICS
International classification
G06T7/246
PHYSICS
Abstract
System and computer-implemented method for editing a video sequence with temporal consistency. The method includes the steps of: computing a motion field modeling temporal consistency between successive frames; defining an energy functional modeling the desired properties to be enforced on the video sequence; splitting the video sequence into two sets containing the even frames and the odd frames; computing the motion field between consecutive frames of the split sequences; recursively performing the splitting and motion-field steps until the sets to be split contain one frame to edit; minimizing the energy functional for each set containing one frame to edit; and merging the edited frames and outputting the edited video sequence.
Claims
1. A computer-implemented method for editing a video sequence with temporal consistency comprising the steps of: i) computing a motion field modeling temporal consistency between successive frames; ii) defining an energy functional modeling the desired properties to be enforced on the video sequence; iii) splitting the video sequence into two sets with even frames and odd frames; iv) computing the motion field between consecutive frames on the split sequences; v) recursively performing steps iii) and iv) until the sets to be split contain one frame to edit; vi) minimizing the energy functional for each set containing one frame to edit; vii) merging the edited frames and outputting the edited video sequence.
2. The computer-implemented method of claim 1, further comprising a preliminary step of identifying a frame in the video sequence having an edited object to be propagated.
3. The computer-implemented method of claim 1, wherein the energy functional comprises a summation of editing energies depending on single-frames and temporal coupling energies depending on pairs of consecutive frames.
4. The computer-implemented method of claim 1, wherein the energy functional is based on at least one of the following models: the global brightness change assumption; the brightness constancy assumption; the gradient constancy assumption.
5. A system for editing a video sequence with temporal consistency comprising: a first processing means for computing a motion field (v) modeling temporal consistency between successive frames (f.sub.i, f.sub.i+1) and for defining an energy functional (J) modeling the properties to be enforced in the video sequence (f); a second processing means for splitting the video sequence (f) into two sets with even frames (f.sup.even) and odd frames (f.sup.odd), wherein the splitting is done recursively until there is only one frame to edit in each set; a third processing means for computing the motion field (v) between consecutive frames in each set after each split; a fourth processing means for minimizing the energy functional (J) on each set to obtain edited frames; a fifth processing means for merging the edited frames and outputting the entire edited video sequence (u).
6. The system for editing a video sequence with temporal consistency of claim 5, further comprising a sixth processing means for identifying a frame in a video sequence having an edited object to be propagated.
7. A computer program product for editing a video sequence with temporal consistency comprising computer code instructions stored thereon that, when executed by a processor, cause the processor to perform the method of claim 1.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0082] A series of drawings which aid in better understanding the invention and which are expressly related to an embodiment of said invention, presented as a non-limiting example thereof, is very briefly described below.
DETAILED DESCRIPTION OF THE INVENTION
[0092] The present embodiment describes a method for reducing the complexity and solving time of video editing schemes that assume temporal consistency. Let Ω×[1,T] be the spatio-temporal video domain, where Ω⊂R.sup.2 is the spatial domain and T≥1 is the number of frames in the video. Let u: Ω×[1,T]→R.sup.M be a given color (M>1) or grayscale (M=1) video, such that u(x,t) represents the pixel value at location x in frame t. Moreover, let v: Ω×[1,T−1]→R.sup.2 be the corresponding motion field. This motion field gives the correspondence between pixels in frames t and t+1, in such a way that u(x,t)=u(x+v(x,t),t+1) (under the brightness constancy assumption); that is, v establishes the temporal coherence of the video.
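The brightness constancy relation above can be checked numerically. In the following sketch the frame contents and the integer-valued, uniform motion field are made up purely for illustration: frame t+1 is frame t shifted one column to the right, and sampling frame t+1 at x+v(x) recovers frame t exactly.

```python
import numpy as np

# Toy example: frame t+1 is frame t shifted one pixel to the right,
# so the motion field is v(x) = (0, 1) in (row, col) order for every pixel.
h, w = 4, 6
rng = np.random.default_rng(0)
frame_t = rng.random((h, w))
frame_t1 = np.roll(frame_t, 1, axis=1)  # shift right by one column (wraps)

# Brightness constancy: u(x, t) == u(x + v(x), t+1).
vx = np.ones((h, w), dtype=int)               # horizontal component of v
cols = (np.arange(w)[None, :] + vx) % w       # x + v(x), wrapped at the border
rows = np.arange(h)[:, None] * np.ones((h, w), dtype=int)
warped = frame_t1[rows, cols]                 # sample frame t+1 at x + v(x)

assert np.allclose(warped, frame_t)  # temporal coherence holds exactly here
```

The wrap-around border handling matches the `np.roll` used to build the toy pair; a real video would need a border policy of its own.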
[0093] By way of example, several algorithms written in pseudocode are defined below for a better understanding. These algorithms can be implemented in any processing device (e.g. a computer) to advantageously edit a video sequence according to the principles of the invention. In particular, the core of the invention can be seen in the sequence of steps of Algorithm 2.
Algorithm 1: u←Minimize(J,f,v)
Require:
[0094] A grayscale or color video f(x,t), t∈[1,T], composed of T frames. The motion field v(x,t), t∈[1,T−1].
[0095] An energy functional J(u,v,f) to minimize, or the partial differential equation that involves the given video f and the motion field v.
Ensure: An edited video u(x,t), t∈[1,T].
1. u←minimum of J(u,v,f) w.r.t. u
Algorithm 2: u←SplitAndMinimize(J,f,v)
Require:
[0096] A grayscale or color video f(x,t), t∈[1,T], composed of T frames. The motion field v(x,t), t∈[1,T−1].
[0097] An energy functional J(u,v,f) to minimize, or the partial differential equation that involves the given video f and the motion field v.
Ensure:
[0098] An edited video u(x,t), t∈[1,T].
1. nFrames←number of frames
2. if nFrames is 1 then [0099] a. u←Minimize(J,f,v) w.r.t. u
3. else{nFrames>1} [0100] a. f.sup.odd(x,t)←VideoOddSplit(f) [0101] b. f.sup.even(x,t)←VideoEvenSplit(f) [0102] c. v.sup.odd(x,t)←MotionOddSplit(v) [0103] d. v.sup.even(x,t)←MotionEvenSplit(v) [0104] e. u.sup.odd←SplitAndMinimize(J,f.sup.odd,v.sup.odd) [0105] f. u.sup.even←SplitAndMinimize(J,f.sup.even,v.sup.even) [0106] g. u←Merge(u.sup.odd,u.sup.even)
4. end if
5. return u
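Algorithm 2 is the core recursion. The following Python sketch transcribes its control flow with the motion-field bookkeeping and the actual energy minimization abstracted away: `minimize` is a placeholder callable standing in for Algorithm 1, and frames are held in a plain list.

```python
def split_and_minimize(frames, minimize):
    """Sketch of Algorithm 2 (motion-field handling elided).

    frames: list of frames; minimize: placeholder for the single-frame
    energy minimization of Algorithm 1.
    """
    if len(frames) == 1:
        return [minimize(frames[0])]
    odd = frames[0::2]    # 1st, 3rd, 5th, ... frame (VideoOddSplit)
    even = frames[1::2]   # 2nd, 4th, 6th, ... frame (VideoEvenSplit)
    u_odd = split_and_minimize(odd, minimize)
    u_even = split_and_minimize(even, minimize)
    # Merge: interleave the odd and even solutions (Algorithm 7).
    merged = []
    for i in range(len(u_odd)):          # odd half is never the shorter one
        merged.append(u_odd[i])
        if i < len(u_even):
            merged.append(u_even[i])
    return merged

# The recursion bottoms out at single frames and reassembles them in order:
print(split_and_minimize(list("abcde"), str.upper))  # ['A', 'B', 'C', 'D', 'E']
```

The recursion depth is logarithmic in the number of frames, which is the source of the complexity reduction the description claims.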
Algorithm 3: f.sup.odd←VideoOddSplit(f)
Require:
[0107] A grayscale or color video f(x,t), t∈[1,T], composed of T frames
Ensure: A video f.sup.odd(x,t) composed of the odd frames from f
1. nFrames←number of frames
2. if nFrames is 1 then
   a. f.sup.odd←f
3. else{nFrames>1}
   a. j←1
   b. for i = 1 to nFrames do
      i. if i is odd then
         1. f.sup.odd(x,j)←f(x,i)
         2. j←j+1
      ii. end if
      iii. i←i+1
   c. end for
4. end if
5. return f.sup.odd
Algorithm 4: f.sup.even←VideoEvenSplit(f)
Require:
[0108] A grayscale or color video f(x,t), t∈[1,T], composed of T frames
Ensure: A video f.sup.even(x,t) composed of the even frames from f
1. nFrames←number of frames
2. if nFrames is 1 then
   a. f.sup.even←f
3. else{nFrames>1}
   a. j←1
   b. for i = 1 to nFrames do
      i. if i is even then
         1. f.sup.even(x,j)←f(x,i)
         2. j←j+1
      ii. end if
      iii. i←i+1
   c. end for
4. end if
5. return f.sup.even
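Algorithms 3 and 4 simply select every other frame. In Python the same loops reduce to extended slices; frames are held in a list for this sketch.

```python
def video_odd_split(frames):
    """Algorithm 3: keep the 1st, 3rd, 5th, ... frame (1-indexed odd positions)."""
    return frames if len(frames) == 1 else frames[0::2]

def video_even_split(frames):
    """Algorithm 4: keep the 2nd, 4th, 6th, ... frame (1-indexed even positions)."""
    return frames if len(frames) == 1 else frames[1::2]

frames = ["f1", "f2", "f3", "f4", "f5"]
print(video_odd_split(frames))   # ['f1', 'f3', 'f5']
print(video_even_split(frames))  # ['f2', 'f4']
```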
Algorithm 5: v.sup.odd←MotionOddSplit(v)
Require:
[0109] A motion field v(x,t), t∈[1,T−1], from a video composed of T frames.
[0110] The number of frames nFrames of the corresponding video.
Ensure: A motion field v.sup.odd(x,t) that should be coherent with VideoOddSplit(f)
1. nFrames←number of frames
2. if nFrames is 1 then
   a. v.sup.odd←v
3. else{nFrames>1}
   a. j←1
   b. for i = 1 to nFrames−1 do
      i. if i is odd then
         1. v.sup.odd(x,j)←v(x,i)+v(x+v(x,i),i+1)
         2. j←j+1
      ii. end if
      iii. i←i+1
   c. end for
4. end if
5. return v.sup.odd
Algorithm 6: v.sup.even←MotionEvenSplit(v)
Require:
[0111] A motion field v(x,t), t∈[1,T−1], from a video composed of T frames.
[0112] The number of frames nFrames of the corresponding video.
Ensure: A motion field v.sup.even(x,t) that should be coherent with VideoEvenSplit(f)
1. nFrames←number of frames
2. if nFrames is 1 then
   a. v.sup.even←v
3. else{nFrames>1}
   a. j←1
   b. for i = 1 to nFrames−1 do
      i. if i is even then
         1. v.sup.even(x,j)←v(x,i)+v(x+v(x,i),i+1)
         2. j←j+1
      ii. end if
      iii. i←i+1
   c. end for
4. end if
5. return v.sup.even
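The key step of Algorithms 5 and 6 is replacing two consecutive flows by their composition, v(x,i)+v(x+v(x,i),i+1), so that the split sequences keep frame-to-frame coherence. A minimal numpy sketch of that composition (integer-valued flows and clamping at the image border are assumptions of the sketch; the patent does not specify a border policy):

```python
import numpy as np

def compose_flows(v01, v12):
    """Compose flow v01 (frame 1 -> 2) with v12 (frame 2 -> 3):
    result(x) = v01(x) + v12(x + v01(x)).
    Flows are (h, w, 2) integer arrays in (row, col) order; samples falling
    outside the image are clamped to the border (a sketch assumption)."""
    h, w, _ = v01.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    r2 = np.clip(rows + v01[..., 0], 0, h - 1)
    c2 = np.clip(cols + v01[..., 1], 0, w - 1)
    return v01 + v12[r2, c2]

# Uniform translations: (+1 row) followed by (+2 rows) composes to (+3 rows).
v01 = np.tile(np.array([1, 0]), (4, 5, 1))
v12 = np.tile(np.array([2, 0]), (4, 5, 1))
print(compose_flows(v01, v12)[0, 0])  # [3 0]
```

With real (sub-pixel) optical flow the lookup v12[r2, c2] would be replaced by bilinear interpolation.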
Algorithm 7: u←Merge (u.sup.odd,u.sup.even)
Require:
[0113] Two color or grey video sequences (u.sup.odd,u.sup.even).
[0114] The difference in the number of frames cannot be greater than one.
Ensure: A new video u composed of u.sup.odd and u.sup.even
[0115] 1. nFrames.sup.odd←number of frames from u.sup.odd
[0116] 2. nFrames.sup.even←number of frames from u.sup.even
[0117] 3. i←1
[0118] 4. j←1
[0119] 5. while i≤nFrames.sup.odd or i≤nFrames.sup.even do
[0120] a. u(x,j)←u.sup.odd(x,i)
[0121] b. u(x,j+1)←u.sup.even(x,i)
[0122] c. i←i+1
[0123] d. j←j+2
[0124] 6. end while
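Algorithm 7's interleaving can be written compactly; this sketch tolerates either input being one frame longer, as the Require clause permits, and the guards prevent reading past the shorter sequence.

```python
def merge(u_odd, u_even):
    """Algorithm 7: interleave the odd and even solutions, odd frame first.
    The lengths of the two sequences may differ by at most one."""
    assert abs(len(u_odd) - len(u_even)) <= 1
    u = []
    for i in range(max(len(u_odd), len(u_even))):
        if i < len(u_odd):
            u.append(u_odd[i])
        if i < len(u_even):
            u.append(u_even[i])
    return u

print(merge(["u1", "u3", "u5"], ["u2", "u4"]))  # ['u1', 'u2', 'u3', 'u4', 'u5']
```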
[0125] Now the following video editing problem is considered as an example of how to proceed for solving the problem according to the invention: let f be a color video composed of 5 frames, as shown in the first row of the figure.
[0126] One of the possible methods in the literature for solving this problem goes as follows. Let O⊂Ω×[1,3] be the editing domain (O comprises the sides of the box shown in the figure).
[0127] The problem can be solved by minimizing an energy functional. In this example, the global brightness change assumption is used:
J(u)=∫.sub.0.sup.T∫.sub.Ω|∇∂.sub.vu(x,t)|.sup.2 dx dt (15)
[0128] where ∇ is the gradient operator and ∂.sub.v is the convective derivative. Following the calculus of variations, the minimum of energy (15) is the solution of the Euler-Lagrange equation given by the following fourth-order PDE (Partial Differential Equation):
∂.sub.v*div ∇∂.sub.vu(x,t)=0, (x,t)∈O (16)
where div is the spatial divergence, adjoint to −∇, and ∂.sub.v* denotes the adjoint operator of the convective derivative.
[0129] This equation is completed with Dirichlet boundary conditions,
u(x,t)=u.sub.0(x,t), x∈Ω∖O.sub.t (17)
where O.sub.t denotes the slice of the editing domain O at frame t.
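In discrete form, the convective derivative appearing in the energy can be approximated as ∂.sub.vu(x,t)≈u(x+v(x,t),t+1)−u(x,t), i.e. the residual of the brightness constancy relation. A small numpy sketch of that discretization (the integer-valued flow and the wrap-around border handling are assumptions of the example, not the patent's choice):

```python
import numpy as np

def convective_derivative(u_t, u_t1, v):
    """Discrete convective derivative: u(x + v(x), t+1) - u(x, t).
    u_t, u_t1: (h, w) frames; v: (h, w, 2) integer flow in (row, col) order.
    Coordinates wrap around the border to keep the toy example exact."""
    h, w = u_t.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    r = (rows + v[..., 0]) % h
    c = (cols + v[..., 1]) % w
    return u_t1[r, c] - u_t

# When the flow matches the true motion the derivative vanishes,
# which is exactly the brightness constancy relation.
u_t = np.arange(20.0).reshape(4, 5)
u_t1 = np.roll(u_t, 1, axis=1)            # scene shifts one column to the right
v = np.tile(np.array([0, 1]), (4, 5, 1))  # flow: one column to the right
print(np.abs(convective_derivative(u_t, u_t1, v)).max())  # 0.0
```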
[0130] According to the present proposal, neither the whole video u.sub.0 nor the whole correspondence map v is used. Informally, the method proceeds as follows: the inputs of the problem are the video f and the connectivity information between consecutive frames. This connectivity information (or motion field) is usually approximated by the optical flow v.
[0131] The first step consists in splitting the input video sequence to be edited. According to this step, the input video is split into two sets: the odd frames and the even frames.
[0132] Now, let us describe Algorithm 2 step by step in the context of the previous video editing problem, for a video sequence f={f0, f1, f2, f3, f4} with manually edited frames {f0, f4} and motion field v={v0, v1, v2, v3}.
1. u←SplitAndMinimize(J,f,v) (because f has more than one frame to edit)
[0133] 1.1. Split the original video sequence f={f0, f1, f2, f3, f4} following algorithms 3 and 4.
[0134] f.sub.o←VideoOddSplit(f)
[0135] f.sub.o={f0, f1, f3, f4}
[0136] f.sub.e←VideoEvenSplit(f)
[0137] f.sub.e={f0, f2, f4}
[0138] 1.2. Compute the new motion fields from v following algorithms 5 and 6.
[0139] v.sub.o←MotionOddSplit(v)
[0140] v.sub.o={v(x,0), v(x,1)+v(x+v(x,1),2), v(x,3)}
[0141] v.sub.e←MotionEvenSplit(v)
[0142] v.sub.e={v(x,0)+v(x+v(x,0),1), v(x,2)+v(x+v(x,2),3)}
[0143] 1.3. Solve the problem for each of f.sub.o, f.sub.e and their corresponding motion fields v.sub.o, v.sub.e.
[0144] u.sub.e←Minimize(J,f.sub.e,v.sub.e) (because f.sub.e has only one frame to edit)
[0145] u.sub.e={f0, u.sub.e1, f4}
[0146] u.sub.o←SplitAndMinimize(J,f.sub.o,v.sub.o) (because f.sub.o has more than one frame to edit)
[0147] 1.3.1. Split the video sequence f.sub.o following algorithms 3 and 4.
[0148] f.sub.oo←VideoOddSplit(f.sub.o)
[0149] f.sub.oo={f0, f1, f4}
[0150] f.sub.oe←VideoEvenSplit(f.sub.o)
[0151] f.sub.oe={f0, f3, f4}
[0152] 1.3.2. Compute the new motion fields from v.sub.o following algorithms 5 and 6.
[0153] v.sub.oo←MotionOddSplit(v.sub.o)
[0154] v.sub.oo={v.sub.o(x,0), v.sub.o(x,1)+v.sub.o(x+v.sub.o(x,1),2)}
[0155] v.sub.oe←MotionEvenSplit(v.sub.o)
[0156] v.sub.oe={v.sub.o(x,0)+v.sub.o(x+v.sub.o(x,0),1), v.sub.o(x,2)}
[0157] 1.3.3. Because the number of frames to edit in each of f.sub.oo and f.sub.oe is one, solve:
[0158] u.sub.oo←Minimize(J,f.sub.oo,v.sub.oo)
[0159] u.sub.oo={f0, u.sub.oo1, f4}
[0160] u.sub.oe←Minimize(J,f.sub.oe,v.sub.oe)
[0161] u.sub.oe={f0, u.sub.oe1, f4}
[0162] 1.3.4. Merge the solutions u.sub.oo, u.sub.oe:
[0163] u.sub.o←Merge(u.sub.oo, u.sub.oe)
[0164] u.sub.o={f0, u.sub.oo1, u.sub.oe1, f4}
[0165] 1.4. Merge the solutions u.sub.o, u.sub.e:
[0166] u←Merge(u.sub.o, u.sub.e)
[0167] u={f0, u.sub.oo1, u.sub.e1, u.sub.oe1, f4}
[0168] 1.5. Return the edited video u.
[0169] As is apparent, the above algorithms can be coded as instructions in a suitable computer language to automatically perform the described operations when executed on a computer.
[0171] Although the invention has been explained in relation to its preferred embodiment(s) as mentioned above, it will be understood that many other modifications and variations can be made without departing from the scope of the present invention defined by the appended claims.