Method and apparatus for constructing an epitome from an image

09661334 · 2017-05-23

Abstract

A method for constructing an epitome from an image divided into non overlapping blocks is disclosed. The method comprises: determining, for each block, similar patches in the image, a similar patch being a patch with similar content; constructing at least one epitome chart for the picture from the similar patches;
wherein determining, for each block, similar patches in the image comprises: a) determining, for each current block, similar blocks in the image, a similar block being a block with content similar to the content of the current block; b) determining, for one current block and for the similar blocks determined for the current block, similar patches in the image, a similar patch being a patch with content similar to the content of the current block; c) repeating step b) for a next current block for which no similar patch is determined until at least one similar patch is determined for each block in the image.

Claims

1. A method for constructing an epitome from an image divided into non overlapping blocks comprising: determining, for each block of the block grid, similar patches of the pixel-grid in the image, a similar patch being a patch with content similar to the content of said current block; and constructing at least one epitome chart for said picture from said similar patches; wherein determining, for each block, similar patches in the image comprises: a) determining, for each current block, similar blocks in the image, a similar block being a block with content similar to the content of said current block, said determining comprising calculating a distance between content of said current block and content of blocks in the image and determining, as similar blocks, those blocks for which the calculated distance is below a first threshold value; b) determining, for one current block and for the similar blocks determined for said current block, similar patches in the image, a similar patch being a patch with content similar to the content of said current block, wherein said determining similar patches comprises: i) calculating a distance between content of said current block and content of patches in the image and determining, as similar patches, the patches for which the calculated distance is below a second threshold value higher than the first threshold value; c) determining, for similar blocks determined for said current block, among said similar patches, the similar patches whose distance to the current block is below a threshold value equal to the difference between said second threshold value and said first threshold value; d) repeating steps b) and c) for a next current block for which no similar patch is determined until at least one similar patch is determined for each block in the image.

2. A non-transitory processor readable medium having stored therein instructions for causing a processor to perform at least the steps of the method for constructing an epitome according to claim 1.

3. An apparatus for constructing an epitome from an image divided into non overlapping blocks comprising at least one processor configured to: determine, for each block of the block grid, similar patches of the pixel-grid in the image, a similar patch being a patch with content similar to the content of said current block; and construct at least one epitome chart for said picture from said similar patches; wherein to determine, for each block, similar patches in the image, said processor is further configured: a) to determine, for each current block, similar blocks in the image, a similar block being a block with content similar to the content of said current block, wherein to determine, for each current block, similar blocks in the image comprises to calculate a distance between content of said current block and content of blocks in the image and to determine, as similar blocks, those blocks for which the calculated distance is below a first threshold value; b) to determine, for one current block and for the similar blocks determined for said current block, similar patches in the image, a similar patch being a patch with content similar to the content of said current block, wherein to determine, for one current block and for the similar blocks, similar patches in the image comprises to calculate a distance between content of said current block and content of patches in the image and to determine, as similar patches, the patches for which the calculated distance is below a second threshold value higher than the first threshold value; c) to determine, for similar blocks determined for said current block, among said similar patches, the similar patches whose distance to the current block is below a threshold value equal to the difference between said second threshold value and said first threshold value; d) to repeat steps b) and c) for a next current block for which no similar patch is determined until at least one similar patch is determined for each block in the image.

Description

4. BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 illustrates the construction of an epitome from an image Y and the reconstruction of an image Y from a texture epitome E and a transform map according to the prior art;

(2) FIG. 2 represents on the left a picture Y divided into non overlapping blocks and on the right the selection of patches to construct an epitome according to a specific and non-limitative embodiment;

(3) FIG. 3 represents a flowchart of a method for epitome construction according to a specific and non-limitative embodiment;

(4) FIG. 4 represents a detail of the flowchart depicted on FIG. 3;

(5) FIG. 5 represents a chart initialization step: on the left, grey blocks in the image are the blocks able to be reconstructed by the current chart, the current epitome EC.sub.n being initially represented by a single patch E.sub.0;

(6) FIG. 6 represents a detail of the flowchart depicted on FIG. 4;

(7) FIG. 7 illustrates a chart extension step: image blocks reconstructed by the current epitome (left) and the current epitome (right) extended by an increment ΔE; and

(8) FIG. 8 represents an apparatus for epitome construction according to a specific and non-limitative embodiment.

5. DETAILED DESCRIPTION

(9) A method of constructing an epitome of an image divided into non-overlapping blocks is disclosed. A block is located on a block grid as depicted on the left part of FIG. 2. A patch is a block of pixels located on the pixel grid as depicted on the right part of FIG. 2. In the following, the word block is used to designate the blocks of pixels located on the block grid while the word patch is used to designate the blocks of pixels located on the pixel grid.

(10) FIG. 3 represents a flowchart of the method for constructing an epitome from a current image Y according to a specific and non-limitative embodiment. The current image Y is factorized, i.e. a texture epitome E and a transform map are determined for the current image. The texture epitome E is determined from pieces of texture (e.g. a set of charts) taken from the current image. The method is disclosed for a current image and can be applied on each image of a sequence of images. The method comprises, in a step 8, determining for each block in the image Y at least one similar patch also called matched patch and, in a step 16, constructing at least one epitome chart from the matched patches determined in the step 8. The step 8 is detailed below.

(11) In a step 10, similar block(s) A.sub.i,I are determined for each block B.sub.i in the image Y, where i is an integer identifying the block B.sub.i and I is an integer identifying the similar block A.sub.i,I. A block is similar to the block B.sub.i if a distance d calculated between the content of these two blocks is below a first threshold value ε.sub.A. The distance d equals for example the Sum of Absolute Differences (SAD), wherein the differences are the pixel by pixel differences between the two blocks. According to a variant, the distance equals the Sum of Square Errors (SSE), wherein the errors are the pixel by pixel differences between the two blocks. Many other such metrics may be used. A block in Y can have no similar blocks, a single similar block or a plurality of similar blocks. With respect to FIG. 2, the block B.sub.i has 5 similar blocks A.sub.i,0, A.sub.i,1, A.sub.i,2, A.sub.i,3 and A.sub.i,4 (also referred to as B.sub.j). B.sub.j has 3 similar blocks A.sub.j,0, A.sub.j,1 and A.sub.j,2 and B.sub.k has 3 similar blocks A.sub.k,0, A.sub.k,1 and A.sub.k,2. B.sub.h has 1 similar block A.sub.i,0.
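Step 10 can be illustrated by the following minimal sketch, which is not part of the disclosed embodiments. It assumes a grayscale image stored as a list of pixel rows, square blocks of side `size`, and the hypothetical helper names `sad`, `crop` and `similar_blocks`; the threshold `eps_a` stands for the first threshold value ε.sub.A.

```python
def sad(a, b):
    """Sum of Absolute Differences between two equally sized pixel arrays."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def crop(image, top, left, size):
    """Return the size x size array of pixels whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def similar_blocks(image, size, eps_a, dist=sad):
    """Map each block-grid position to the other blocks whose distance is below eps_a."""
    h, w = len(image), len(image[0])
    # Blocks live on the block grid: positions are multiples of the block size.
    grid = [(r, c) for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)]
    blocks = {pos: crop(image, pos[0], pos[1], size) for pos in grid}
    return {
        pos: [other for other in grid
              if other != pos and dist(blocks[pos], blocks[other]) < eps_a]
        for pos in grid
    }


# Toy 4x4 image split into four 2x2 blocks (illustrative values only).
image = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [11, 10, 90, 90],
    [10, 10, 90, 90],
]
result = similar_blocks(image, size=2, eps_a=4)
```

In this toy image only the two left-hand blocks are similar to each other; passing another distance function as `dist` would give the SSE variant.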

(12) In a step 12, similar patches M.sub.i,p are determined for one current block B.sub.i and for the similar blocks A.sub.i,I determined in step 10 for B.sub.i, where p is an integer identifying the similar patch. As an example, the current block is the block for which the number of similar blocks A.sub.i,I determined in step 10 is the highest. If two blocks have the same number of similar blocks A.sub.i,I, the current block can be the first block encountered when going through the picture in a specific scan order, e.g. raster scan order (i.e. from top to bottom and from left to right). With respect to FIG. 2, B.sub.i is the block with the highest number of similar blocks. According to a specific embodiment, an exhaustive search is performed in the entire image, i.e. all the patches comprised in the image are tested. According to a variant, only a subset of the patches is tested, e.g. one out of two. The similar patches M.sub.i,p determined for the current block B.sub.i are further associated with the similar blocks A.sub.i,I determined in step 10 for B.sub.i. With respect to FIG. 2, the set of patches {M.sub.i,0, M.sub.i,1, . . . } similar to B.sub.i are further associated with the blocks A.sub.i,0, A.sub.i,1, A.sub.i,2, A.sub.i,3 and A.sub.i,4, i.e. the patches {M.sub.i,0, M.sub.i,1, . . . } are also considered to be patches similar to the similar blocks A.sub.i,0, A.sub.i,1, A.sub.i,2, A.sub.i,3 and A.sub.i,4. The similar patches M.sub.i,p, also called matched patches, are the patches whose content is similar to the content of the current block B.sub.i. A patch M.sub.i,p is similar to B.sub.i when the distance d calculated between the current block B.sub.i and the patch M.sub.i,p is below a second threshold value ε.sub.M higher than the first threshold ε.sub.A. According to a specific embodiment,
ε.sub.A=α.sub.A*ε.sub.M with 0≤α.sub.A<1
The value of ε.sub.A is set via the coefficient α.sub.A. In practice, an appropriate value for α.sub.A is 0.5. When the method of epitome construction is used in an encoder/decoder, the value of the parameter α.sub.A could be particularly useful in order to tune the complexity of the encoder/decoder.
This solution advantageously reduces the number of blocks considered during step 12 for the determination of similar patches. By doing so, a patch similar to a current block B.sub.i can have a distance to a similar block A.sub.i,I larger than ε.sub.M. According to a variant, only a subset of the patches {M.sub.i,0, M.sub.i,1, . . . } similar to the current block B.sub.i are associated with the similar blocks A.sub.i,I. Specifically, a patch M.sub.i,p similar to a current block B.sub.i is further associated with a similar block A.sub.i,I when the following inequality is verified: d(M.sub.i,p; B.sub.i)≤ε.sub.M−ε.sub.A, where d(M.sub.i,p; B.sub.i) is the distance between the contents of B.sub.i and M.sub.i,p. This ensures that the distance d between any similar block A.sub.i,I and any of its matched patches is below the second threshold value ε.sub.M. At the end, for each block B.sub.i belonging to the block grid, a list L.sub.match(B.sub.i)={M.sub.i,0, M.sub.i,1, . . . } of matched patches is determined that approximates B.sub.i with a given error tolerance ε.sub.M.
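The bound stated for the variant can be checked with the triangle inequality. The short derivation below is an illustration that assumes the distance d satisfies the triangle inequality, which is the case for the SAD; it does not hold in general for the SSE. It combines the association condition on the patch with the step 10 condition on the similar block:

```latex
d(M_{i,p}, A_{i,l}) \;\le\; d(M_{i,p}, B_i) + d(B_i, A_{i,l})
\;<\; (\varepsilon_M - \varepsilon_A) + \varepsilon_A \;=\; \varepsilon_M
```

For example, with ε.sub.M=8 and α.sub.A=0.5 (so ε.sub.A=4), a patch at distance 3 from B.sub.i is at a distance below 3+4=7 from every block similar to B.sub.i.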
The step 12 is repeated for a next current block for which no matched patch is determined until at least one matched patch is determined (step 14) for each block. Consequently, the step 12 is not repeated for blocks A.sub.i,0, A.sub.i,1, A.sub.i,2, A.sub.i,3 and B.sub.j (=A.sub.i,4) because these blocks already have matched patches, namely the patches {M.sub.i,0, M.sub.i,1, . . . } similar to block B.sub.i. When matched patches have already been determined for a block, the block is removed from the list of similar blocks it belongs to. Exemplarily, A.sub.i,0 is a block similar to B.sub.i and B.sub.h. Consequently, A.sub.i,0 is removed from the list of blocks similar to B.sub.h because the matched patches {M.sub.i,0, M.sub.i,1, . . . } are associated with A.sub.i,0 when considering block B.sub.i. According to a variant, a block A.sub.n,m belonging to several lists of similar blocks is left in the list of the block to which A.sub.n,m is the closest in the sense of the distance d. Exemplarily, if A.sub.n,m is a block similar to B.sub.p and B.sub.q and d(A.sub.n,m, B.sub.q)<d(A.sub.n,m, B.sub.p) then A.sub.n,m is removed from the list of blocks similar to B.sub.p and left in the list of blocks similar to B.sub.q.
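Steps 10 to 14 taken together can be sketched as follows. This is a hedged illustration rather than the disclosed implementation: the function name `match_all_blocks`, the data layout and the use of the SAD are assumptions, the search is exhaustive, and the patch co-located with a block is assumed to be a valid candidate (distance 0), so that every block ends up with at least one matched patch when `eps_m` is positive. The variant with the reduced threshold `eps_m - eps_a` is used when sharing patches with similar blocks.

```python
def sad(a, b):
    """Sum of Absolute Differences between two equally sized pixel arrays."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def crop(image, top, left, size):
    """Return the size x size array of pixels whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def match_all_blocks(image, size, eps_m, alpha_a=0.5):
    """Return {block position: [matched patch positions]} for every block."""
    eps_a = alpha_a * eps_m  # first threshold derived from the second one
    h, w = len(image), len(image[0])
    grid = [(r, c) for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)]
    block = {pos: crop(image, pos[0], pos[1], size) for pos in grid}
    # Step 10: similar blocks on the block grid.
    similar = {b: [o for o in grid if o != b and sad(block[b], block[o]) < eps_a]
               for b in grid}
    # Patches live on the pixel grid: every pixel position is a candidate.
    patches = [(r, c) for r in range(h - size + 1) for c in range(w - size + 1)]
    matches = {}
    while len(matches) < len(grid):  # step 14: loop until every block is matched
        pending = [b for b in grid if b not in matches]
        # Current block: most still-unmatched similar blocks; max() keeps the
        # first maximum, so ties are resolved in raster scan order.
        current = max(pending,
                      key=lambda b: sum(o not in matches for o in similar[b]))
        # Step 12: exhaustive patch search for the current block.
        scored = [(p, sad(block[current], crop(image, p[0], p[1], size)))
                  for p in patches]
        matches[current] = [p for p, d in scored if d < eps_m]
        # Variant: share only the patches within eps_m - eps_a of the current block.
        shared = [p for p, d in scored if d < eps_m and d <= eps_m - eps_a]
        for other in similar[current]:
            if other not in matches and shared:
                matches[other] = list(shared)
    return matches


image = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [11, 10, 90, 90],
    [10, 10, 90, 90],
]
matches = match_all_blocks(image, size=2, eps_m=8)
```

In this toy run the patch search is executed three times instead of four, because the lower-left block inherits the matched patches of the upper-left block.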

(13) In a step 16, at least one epitome chart is constructed from the lists of matched patches. The method of Wang disclosed in the article entitled Factoring Repeated Content Within and Among Images published in the proceedings of ACM SIGGRAPH 2008 (ACM Transaction on Graphics, vol. 27, no. 3, pp. 1-10, 2008) can be used. Many other such methods for constructing epitome chart(s) using lists of matched patches may be used. According to a specific and non-limiting embodiment depicted on FIG. 4, constructing at least one epitome chart comprises, in a step 160, determining new lists L.sub.match(M.sub.j,l) indicating the set of image blocks that can be represented, i.e. able to be reconstructed, by a patch M.sub.j,l. The new lists L.sub.match (M.sub.j,l) are determined for example by reversing the lists L.sub.match (B.sub.i) determined in step 12. One block B.sub.i can be in two different lists L.sub.match(M.sub.j,l).
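The reversal of the lists in step 160 amounts to inverting a one-to-many mapping. A minimal sketch, assuming the lists L.sub.match(B.sub.i) are held in a dictionary and using the hypothetical name `reverse_match_lists` and placeholder labels for blocks and patches:

```python
from collections import defaultdict


def reverse_match_lists(l_match_blocks):
    """Turn {block: [patch, ...]} into {patch: [block, ...]}."""
    l_match_patches = defaultdict(list)
    for block, patches in l_match_blocks.items():
        for patch in patches:
            # A block may appear in the list of several patches.
            l_match_patches[patch].append(block)
    return dict(l_match_patches)


# Placeholder labels, for illustration only.
l_match_blocks = {"B0": ["M0", "M1"], "B1": ["M1"], "B2": ["M1", "M2"]}
l_match_patches = reverse_match_lists(l_match_blocks)
```

Here the block "B0" ends up in the lists of both "M0" and "M1", which mirrors the remark that one block can be in two different lists.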

(14) In a step 162, at least one epitome chart is constructed. To this aim, matched patches are selected in order to construct epitome charts, the union of all the epitome charts constituting the texture epitome E. A matched patch selected to be part of an epitome chart is called an epitome patch. Each epitome chart represents specific regions of the image Y in term of texture. Step 162 is detailed below.

(15) In a step 1620, an index n is set equal to 0, n is an integer.

(16) In a step 1622, a first epitome chart EC.sub.n is initialized. Several candidate matched patches can be used to initialize the epitome chart EC.sub.n. Each epitome chart is initialized by the matched patch E.sub.0 which is the most representative of the remaining blocks that are not yet reconstructed, i.e. not yet represented. A block B.sub.i is able to be reconstructed by a matched patch M.sub.j,l if B.sub.i belongs to the list L.sub.match (M.sub.j,l). Let Y∈R.sup.N×M denote the input image and let Y′∈R.sup.N×M denote the image reconstructed by a candidate matched patch and the epitome charts previously constructed. To initialize a chart, a selection criterion based on the minimization of the Mean Absolute Error (MAE, equation 1) or of the Mean Square Error (MSE, equation 2) can be used:

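(17)
$$FC_{init} = \min\left(\frac{\sum_{i}^{N}\sum_{j}^{M}\left|Y_{i,j} - Y'_{i,j}\right|}{N * M}\right) \qquad (1)$$
$$FC_{init} = \min\left(\frac{\sum_{i}^{N}\sum_{j}^{M}\left(Y_{i,j} - Y'_{i,j}\right)^{2}}{N * M}\right) \qquad (2)$$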
where Y.sub.i,j is the image value of pixel (i,j) in the image Y and Y′.sub.i,j is the image value of pixel (i,j) in the reconstructed image Y′. Other metrics can be used to compute the reconstruction error.
The selected criterion takes into account the reconstruction errors on the whole image. This criterion allows the epitome to be extended by a texture pattern that allows the reconstruction of the largest number of blocks while minimizing the reconstruction error. The reconstruction error is computed between the image Y and the image Y′ reconstructed from the current epitome. The current epitome comprises a candidate matched patch and the epitome charts previously constructed. In a specific and non-limitative embodiment, when computing the image reconstruction error, a zero value is assigned to the pixels of blocks in the image Y that are not yet represented by epitome patches of the current epitome. Thus, the error for these pixels is equal to the value of the pixels in the original image. As a consequence, the overall distortion depends not only on the reconstructed part of the image but also on the non-reconstructed part. According to a variant, a value different from zero is used. As an example, the value 128 is used instead of zero. According to yet another variant, the error for these pixels is set to a maximum value, e.g. 255. The latter solution tends to promote reconstruction of a larger part of the image, thus accelerating the creation of the epitome. FIG. 5 shows the image blocks reconstructed by the first epitome patch E.sub.0.
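The initialization criterion of equation (1) can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the names `reconstruct`, `mae` and `pick_initial_patch` are hypothetical, the reversed lists are given as a dictionary from patch position to block positions, and pixels of blocks that are not yet represented receive the value `fill` (0 here, 128 in the variant).

```python
def reconstruct(image, size, covered, fill=0):
    """Rebuild the image; covered maps a block position to the patch position used."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]  # unrepresented pixels keep the fill value
    for (br, bc), (pr, pc) in covered.items():
        for dr in range(size):
            for dc in range(size):
                out[br + dr][bc + dc] = image[pr + dr][pc + dc]
    return out


def mae(image, recon):
    """Mean absolute error per pixel over the whole image, as in equation (1)."""
    n = len(image) * len(image[0])
    return sum(abs(a - b) for ra, rb in zip(image, recon)
               for a, b in zip(ra, rb)) / n


def pick_initial_patch(image, size, l_match_patches, already=None, fill=0):
    """Return (patch, cost) minimizing the whole-image reconstruction error."""
    best = None
    for patch, blocks in l_match_patches.items():
        covered = dict(already or {})  # blocks represented by earlier charts
        covered.update({b: patch for b in blocks})
        cost = mae(image, reconstruct(image, size, covered, fill))
        if best is None or cost < best[1]:
            best = (patch, cost)
    return best


image = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [11, 10, 90, 90],
    [10, 10, 90, 90],
]
l_match_patches = {(0, 0): [(0, 0), (2, 0)], (0, 2): [(0, 2)], (2, 2): [(2, 2)]}
best = pick_initial_patch(image, size=2, l_match_patches=l_match_patches)
```

With a zero fill value, the toy run selects the patch at (2, 2) although it represents a single block, because leaving the brightest block unrepresented costs the most. This illustrates the dependence of the distortion on the non-reconstructed part of the image.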

(18) In a step 1624, the epitome chart EC.sub.n is then progressively enlarged. The step is detailed on FIG. 6. Each time the epitome chart is enlarged, one keeps track of the number of additional blocks which can be reconstructed in the image as depicted on FIG. 7. This step is also known as epitome chart extension. The initial epitome chart EC.sub.n(0) corresponds to the matched patch selected at the initialization step 1622. The epitome enlargement step proceeds first by determining the set of matched patches M.sub.j,l that overlap the current chart EC.sub.n(k) and represent other blocks, i.e. blocks which are not yet represented by the current chart EC.sub.n(k), k being an integer. Therefore, there are several candidate regions ΔE that can be used as an extension of the current epitome chart. For each chart growth candidate ΔE, the additional image blocks that can be reconstructed are determined from the list L.sub.match (M.sub.j,k) related to the matched patch M.sub.j,k containing the set of pixels ΔE. According to a specific and non-limitative embodiment, the additional blocks able to be reconstructed/represented by inferred patches as defined in EP2011794733 are also determined. The candidate ΔE.sub.opt leading to the best match according to a rate distortion criterion is selected among the set of the candidate chart growths. Let Y∈R.sup.N×M denote the input image and let Y′∈R.sup.N×M denote the image reconstructed by the current epitome E.sub.curr and a chart growth candidate ΔE. Note that the current epitome E.sub.curr is composed of previously constructed epitome charts and the current epitome chart EC.sub.n(k). This selection is for example made according to a rate distortion minimization of the Lagrangian criterion FC.sub.ext:

(19)
$$FC_{ext} = \min\left(D_{E_{curr}+\Delta E} + \lambda * R_{E_{curr}+\Delta E}\right) \quad \text{with} \quad E_{curr} = \bigcup_{i=0}^{n} EC_{i}$$

(20)
$$\Delta E_{opt}^{k} = \arg\min_{\Delta E}\left(D_{E_{curr}+\Delta E} + \lambda * R_{E_{curr}+\Delta E}\right)$$
where D is a distortion and R a rate.

(21) Exemplarily,

(22)
$$\Delta E_{opt}^{k} = \arg\min_{\Delta E}\left(\frac{\sum_{i}^{N}\sum_{j}^{M}\left|Y_{i,j} - Y'_{i,j}\right|}{N * M} + \lambda * \left(\frac{EC(k) + \Delta E}{N * M}\right)\right)$$
According to a variant,

(23)
$$\Delta E_{opt}^{k} = \arg\min_{\Delta E}\left(\frac{\sum_{i}^{N}\sum_{j}^{M}\left(Y_{i,j} - Y'_{i,j}\right)^{2}}{N * M} + \lambda * \left(\frac{EC(k) + \Delta E}{N * M}\right)\right)$$

(24) In a preferred embodiment, the value of λ is set to 1000. The first term of the criterion refers to the average reconstruction error per pixel when the input image is reconstructed by texture information contained in the current epitome

(25)
$$E_{curr} = \bigcup_{i=0}^{n} EC_{i}$$
and the increment ΔE. As in the initialization step, when pixels are represented neither by the current epitome E.sub.curr nor by the increment nor by the inferred patches (i.e. they do not belong to a block that can be reconstructed from E.sub.curr, from the matched patch that contains the increment or from the inferred blocks), a zero value is assigned to them. According to a variant, a value different from zero is used. As an example, the value 128 is used instead of zero. According to yet another variant, the error for these pixels is set to a maximum value, e.g. 255. The latter solution tends to promote reconstruction of a larger part of the image, thus accelerating the creation of the epitome. The second term of the criterion corresponds to a rate per pixel when constructing the epitome, which is roughly estimated as the number of pixels in the current epitome and its increment divided by the total number of pixels in the image. After having selected the locally optimal increment ΔE.sub.opt, the current epitome chart becomes: EC.sub.n(k+1)=EC.sub.n(k)+ΔE.sub.opt. The assignation map is updated for the blocks newly reconstructed by EC.sub.n(k+1).
Then, the current chart is extended, during the next iteration k+1, until there are no more matched patches M.sub.j,l which overlap the current chart EC.sub.n(k) and represent other blocks. If such overlapping patches exist then step 1624 is repeated with EC.sub.n(k+1).
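The selection of the increment by the rate distortion criterion can be sketched as below. The sketch assumes that the distortion obtained with each candidate has already been computed (for instance as a mean absolute error per pixel) and that the rate is estimated as the pixel count of the epitome plus the increment divided by the image size. The name `select_increment`, the dictionary keys and the numerical values are hypothetical.

```python
def select_increment(candidates, epitome_pixels, image_pixels, lam=1000.0):
    """Return the candidate minimizing distortion + lam * rate."""
    def cost(candidate):
        # Rate per pixel: epitome size after growth over the image size.
        rate = (epitome_pixels + candidate["new_pixels"]) / image_pixels
        return candidate["distortion"] + lam * rate
    return min(candidates, key=cost)


# Illustrative values only: distortion after adding each candidate, and the
# number of pixels the candidate adds to the current epitome.
candidates = [
    {"name": "dE_a", "distortion": 12.0, "new_pixels": 16},
    {"name": "dE_b", "distortion": 9.5, "new_pixels": 24},
    {"name": "dE_c", "distortion": 11.0, "new_pixels": 8},
]
best = select_increment(candidates, epitome_pixels=64, image_pixels=4096)
```

In this example the candidate with the lowest distortion is not retained: its 24 additional pixels cost more in rate than the distortion they save, so the smaller increment "dE_c" wins.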
According to a specific embodiment, when the current chart EC.sub.n(k) cannot be enlarged anymore, it is padded so that the current chart EC.sub.n(k) is aligned on the block grid. To this aim, the pixels are for example padded with their value in the original picture Y. Once the current epitome chart is padded, it is checked whether the padded chart contains new inferred patches able to reconstruct new blocks. This embodiment accelerates the reconstruction of the image, especially when there are many inferred patches in the padded chart. It is preferable to pad the current epitome chart EC.sub.n(k) after its entire construction rather than after each enlargement by ΔE.sub.opt. Indeed, the latter leads to an increase of the size of the epitome chart.
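One possible way to align a chart on the block grid is sketched below. It is an assumption made for illustration: the chart is represented as a set of pixel coordinates, its bounding box is extended outward to the nearest block-grid lines, and the image dimensions are assumed to be multiples of the block size. The function name `pad_to_block_grid` is hypothetical. The added coordinates would then take their values from the original picture Y.

```python
def pad_to_block_grid(chart_pixels, size):
    """Extend the bounding box of a chart outward to the nearest block-grid lines."""
    rows = [r for r, _ in chart_pixels]
    cols = [c for _, c in chart_pixels]
    top = (min(rows) // size) * size             # round down to a grid line
    left = (min(cols) // size) * size
    bottom = -(-(max(rows) + 1) // size) * size  # round up to a grid line
    right = -(-(max(cols) + 1) // size) * size
    return {(r, c) for r in range(top, bottom) for c in range(left, right)}


# An 8x8 patch whose top-left corner is at pixel (3, 5), with 8x8 blocks.
chart = {(r, c) for r in range(3, 11) for c in range(5, 13)}
padded = pad_to_block_grid(chart, size=8)
```

The unaligned 8x8 patch straddles four blocks, so the padded chart covers a 16x16 area; this growth is the reason padding is better performed once, after the chart is complete.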
When the current chart cannot be extended anymore and when the whole image is not yet reconstructed by the current epitome (step 1626), the index n is incremented by 1 in a step 1628 and another epitome chart is constructed in a new location in the image. The method thus continues with the new epitome chart at step 1622, i.e. the new chart is first initialized before its enlargement. The process ends when the whole image is reconstructed/represented by the epitome (step 1626). The texture epitome E comprises the union of all epitome charts EC.sub.n. The assignation map indicates for each block B.sub.i of the current image Y the location in the texture epitome of the epitome patch that is to be used for its reconstruction.
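The overall loop over charts (steps 1620 to 1628) can be outlined as follows. This is a deliberately simplified sketch: instead of the MAE and Lagrangian criteria described above, it ranks patches by the number of not yet represented blocks they cover, and it ignores inferred patches and padding. The names `overlaps` and `build_charts` are hypothetical, and patches are identified by the position of their top-left pixel.

```python
def overlaps(p, q, size):
    """True when two size x size patches share at least one pixel."""
    return abs(p[0] - q[0]) < size and abs(p[1] - q[1]) < size


def build_charts(l_match_patches, all_blocks, size):
    """Greedily group matched patches into charts until every block is represented."""
    remaining = set(all_blocks)
    charts = []

    def gain(patch):
        # Number of not yet represented blocks this patch can reconstruct.
        return len(remaining & set(l_match_patches[patch]))

    while remaining and l_match_patches:
        seed = max(l_match_patches, key=gain)  # chart initialization
        if gain(seed) == 0:
            break  # leftover blocks that no patch can represent
        chart = [seed]
        remaining.difference_update(l_match_patches[seed])
        while True:  # chart extension
            candidates = [p for p in l_match_patches
                          if p not in chart and gain(p) > 0
                          and any(overlaps(p, q, size) for q in chart)]
            if not candidates:
                break  # chart cannot grow: start a new chart elsewhere
            best = max(candidates, key=gain)
            chart.append(best)
            remaining.difference_update(l_match_patches[best])
        charts.append(chart)
    return charts


l_match_patches = {(0, 0): [(0, 0), (2, 0)], (0, 1): [(0, 2)], (4, 4): [(2, 2)]}
all_blocks = [(0, 0), (0, 2), (2, 0), (2, 2)]
charts = build_charts(l_match_patches, all_blocks, size=2)
```

In the toy run the first chart is seeded by the patch at (0, 0) and grows with the overlapping patch at (0, 1); the patch at (4, 4) overlaps neither, so it seeds a second chart.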

(26) FIG. 8 represents an exemplary architecture of an apparatus 100 configured to construct an epitome from an image Y according to an exemplary embodiment. The apparatus 100 comprises one or more processor(s) 110, which is(are), for example, a CPU, a GPU and/or a DSP (English acronym of Digital Signal Processor), along with internal memory 120 (e.g. RAM, ROM, EPROM). The apparatus 100 comprises one or several Input/Output interface(s) 130 adapted to display output information and/or allow a user to enter commands and/or data (e.g. a keyboard, a mouse, a touchpad, a webcam); and a power source 140 which may be external to the apparatus 100. The apparatus 100 may also comprise network interface(s) (not shown). The image Y may be obtained from a source. According to different embodiments, the source belongs to a set comprising: a local memory, e.g. a video memory, a RAM, a flash memory, a hard disk; a storage interface, e.g. an interface with a mass storage, a ROM, an optical disc or a magnetic support; a communication interface, e.g. a wireline interface (for example a bus interface, a wide area network interface, a local area network interface) or a wireless interface (such as an IEEE 802.11 interface or a Bluetooth interface); and an image capturing circuit (e.g. a sensor such as, for example, a CCD (or Charge-Coupled Device) or CMOS (or Complementary Metal-Oxide-Semiconductor)).
According to different embodiments, the epitome may be sent to a destination. As an example, the epitome is stored in a remote or in a local memory, e.g. a video memory or a RAM, a hard disk. In a variant, the epitome is sent to a storage interface, e.g. an interface with a mass storage, a ROM, a flash memory, an optical disc or a magnetic support and/or transmitted over a communication interface, e.g. an interface to a point to point link, a communication bus, a point to multipoint link or a broadcast network.
According to an exemplary and non-limitative embodiment, the apparatus 100 further comprises a computer program stored in the memory 120. The computer program comprises instructions which, when executed by the apparatus 100, in particular by the processor 110, make the apparatus 100 carry out the method described with reference to FIG. 3. According to a variant, the computer program is stored externally to the apparatus 100 on a non-transitory digital data support, e.g. on an external storage medium such as an HDD, a CD-ROM, a DVD, a read-only DVD drive and/or a DVD Read/Write drive, all known in the art. The apparatus 100 thus comprises an interface to read the computer program. Further, the apparatus 100 could access one or more Universal Serial Bus (USB)-type storage devices (e.g., memory sticks) through corresponding USB ports (not shown).
According to exemplary and non-limitative embodiments, the apparatus 100 is a device, which belongs to a set comprising: a mobile device; a communication device; a game device; a tablet (or tablet computer); a laptop; a still image camera; a video camera; an encoding chip; a decoding chip; a display; a still image server; and a video server (e.g. a broadcast server, a video-on-demand server or a web server).

(27) The implementations described herein may be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method or a device), the implementation of features discussed may also be implemented in other forms (for example a program). An apparatus may be implemented in, for example, appropriate hardware, software, and firmware. The methods may be implemented in, for example, an apparatus such as, for example, a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device. Processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants (PDAs), and other devices that facilitate communication of information between end-users.

(28) Implementations of the various processes and features described herein may be embodied in a variety of different equipment or applications. Examples of such equipment include an encoder, a decoder, a post-processor processing output from a decoder, a pre-processor providing input to an encoder, a video coder, a video decoder, a video codec, a web server, a set-top box, a laptop, a personal computer, a cell phone, a PDA, and other communication devices. As should be clear, the equipment may be mobile and even installed in a mobile vehicle.

(29) Additionally, the methods may be implemented by instructions being performed by a processor, and such instructions (and/or data values produced by an implementation) may be stored on a processor-readable medium such as, for example, an integrated circuit, a software carrier or other storage device such as, for example, a hard disk, a compact diskette (CD), an optical disc (such as, for example, a DVD, often referred to as a digital versatile disc or a digital video disc), a random access memory (RAM), or a read-only memory (ROM). The instructions may form an application program tangibly embodied on a processor-readable medium. Instructions may be, for example, in hardware, firmware, software, or a combination. Instructions may be found in, for example, an operating system, a separate application, or a combination of the two. A processor may be characterized, therefore, as, for example, both a device configured to carry out a process and a device that includes a processor-readable medium (such as a storage device) having instructions for carrying out a process. Further, a processor-readable medium may store, in addition to or in lieu of instructions, data values produced by an implementation.

(30) As will be evident to one of skill in the art, implementations may produce a variety of signals formatted to carry information that may be, for example, stored or transmitted. The information may include, for example, instructions for performing a method, or data produced by one of the described implementations. For example, a signal may be formatted to carry as data the rules for writing or reading the syntax of a described embodiment, or to carry as data the actual syntax-values written by a described embodiment. Such a signal may be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal. The formatting may include, for example, encoding a data stream and modulating a carrier with the encoded data stream. The information that the signal carries may be, for example, analog or digital information. The signal may be transmitted over a variety of different wired or wireless links, as is known. The signal may be stored on a processor-readable medium.

(31) A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made. For example, elements of different implementations may be combined, supplemented, modified, or removed to produce other implementations. Additionally, one of ordinary skill will understand that other structures and processes may be substituted for those disclosed and the resulting implementations will perform at least substantially the same function(s), in at least substantially the same way(s), to achieve at least substantially the same result(s) as the implementations disclosed. Accordingly, these and other implementations are contemplated by this application.

(32) The present principles find their interest in all domains concerned with image epitome reduction. Applications related to video compression and to the representation of videos are concerned.