APPARATUS AND METHOD FOR EXTRACTING OBJECT
20170278246 · 2017-09-28
Inventors
Cpc classification
G06T7/143
PHYSICS
International classification
Abstract
According to one general aspect, an apparatus for extracting an object includes an image receiver configured to receive an image; a coupled saliency-map generator configured to generate a coupled saliency-map which is the sum of the product of a global saliency-map of the image and a predetermined weight value and a local saliency-map; an adaptive tri-map generator configured to generate an adaptive tri-map corresponding to the coupled saliency-map; an alpha matte generator configured to generate an alpha matte based on the adaptive tri-map; and an object detector configured to extract an object according to transparency of the alpha matte to generate an object image.
Claims
1. An apparatus for extracting an object comprising: an image receiver configured to receive an image; a coupled saliency-map generator configured to generate a coupled saliency-map which is the sum of the product of a global saliency-map of the image and a predetermined weight value and a local saliency-map; an adaptive tri-map generator configured to generate an adaptive tri-map corresponding to the coupled saliency-map; an alpha matte generator configured to generate an alpha matte based on the adaptive tri-map; and an object detector configured to extract an object according to transparency of the alpha matte to generate an object image.
2. The apparatus of claim 1, wherein the local saliency-map is a saliency-map including pixels representing Euclidean distance between a mean color vector of each pixel of the image and a vector generated through Gaussian blur treatment for the image, and wherein the global saliency-map is a saliency-map defined based on space and color of regions which are segmented to each component after representing the image in a Gaussian mixture model composed of a plurality of components.
3. The apparatus of claim 1, wherein the adaptive tri-map generator generates a tri-map from the coupled saliency-map through Gaussian blur and image clustering, selects a pixel, of which distance is the least to any one of mean color values of foreground region, background region and unknown region of the tri-map among pixels of the coupled saliency-map, as the shortest distance pixel, and generates the adaptive tri-map by replacing a pixel value of the tri-map, corresponding to the shortest distance pixel and a pixel adjacent to the shortest distance pixel, as a mean color value of the foreground region when the shortest distance pixel is identical to the mean color value of the foreground region and is located within the unknown region of the tri-map.
4. The apparatus of claim 1, wherein the alpha matte generator generates the alpha matte through a parallel processing using GPGPU by a preconditioned conjugate gradient method.
5. The apparatus of claim 1, further comprising: an output interface configured to show extraction result of the object; and an input interface configured to receive a tri-map correction input which requests for correcting the adaptive tri-map from a user, wherein the adaptive tri-map generator corrects the adaptive tri-map based on the tri-map correction input.
6. A method for extracting an object using an apparatus for extracting an object from an image, the method comprising: receiving the image; generating a coupled saliency-map which is the sum of the product of a global saliency-map of the image and a predetermined weight value and a local saliency-map; generating an adaptive tri-map corresponding to the coupled saliency-map; generating an alpha matte based on the adaptive tri-map; and extracting an object according to transparency of the alpha matte to generate an object image.
7. The method of claim 6, wherein the local saliency-map is a saliency-map including pixels representing Euclidean distance between a mean color vector of each pixel of the image and a vector generated through Gaussian blur treatment for the image, and wherein the global saliency-map is a saliency-map defined based on space and color of regions which are segmented to each component after representing the image in a Gaussian mixture model composed of a plurality of components.
8. The method of claim 6, wherein the generating an adaptive tri-map corresponding to the coupled saliency-map comprises: generating a tri-map from the coupled saliency-map through Gaussian blur and image clustering; selecting a pixel, of which distance is the least to any one of mean color values of foreground region, background region and unknown region of the tri-map among pixels of the coupled saliency-map, as the shortest distance pixel; and generating the adaptive tri-map by replacing a pixel value of the tri-map, corresponding to the shortest distance pixel and a pixel adjacent to the shortest distance pixel, as a mean color value of the foreground region when the shortest distance pixel is identical to the mean color value of the foreground region and is located within the unknown region of the tri-map.
9. The method of claim 6, wherein the generating an alpha matte based on the adaptive tri-map comprises generating the alpha matte through a parallel processing using GPGPU by a preconditioned conjugate gradient method.
10. The method of claim 6, further comprising: showing extraction result of the object; receiving a tri-map correction input which requests for correcting the adaptive tri-map from a user; correcting the adaptive tri-map according to the tri-map correction input; and generating an object image according to the corrected adaptive tri-map.
Description
BRIEF DESCRIPTION OF DRAWINGS
[0022] Hereinafter, the following description will be given with reference to embodiments illustrated in the accompanying drawings. To aid understanding of the following description, identical reference numerals are assigned to identical elements throughout the accompanying drawings. The elements illustrated throughout the accompanying drawings are mere examples of embodiments illustrated for the purpose of describing the following description and are not to be used to restrict the scope of the following description.
[0023]
[0024]
[0025]
[0026]
[0027]
[0028]
[0029]
[0030]
[0031] Throughout the drawings and the detailed description, the same reference numerals refer to the same elements. The drawings may not be to scale, and the relative size, proportions, and depiction of elements in the drawings may be exaggerated for clarity, illustration, and convenience.
DETAILED DESCRIPTION
[0032] Since there can be a variety of permutations and embodiments of the following description, certain embodiments will be illustrated and described with reference to the accompanying drawings. This, however, is by no means to restrict the following description to certain embodiments, and shall be construed as including all permutations, equivalents and substitutes covered by the ideas and scope of the following description.
[0033] When one element is described as “transferring a signal” or “transmitting a signal” to another element, it shall be construed as transferring or transmitting the signal to the other element directly, but also as possibly having another element in between.
[0034]
[0035] Referring to
[0036] The image receiver 110 may receive an image from an external device such as a terminal, a storage medium and the like through a network or a predetermined input terminal. The image receiver 110 may transfer the received image to the coupled saliency-map generator 120 and the object extractor 150.
[0037] The coupled saliency-map generator 120 may generate a coupled saliency-map corresponding to the received image. The coupled saliency-map generator 120 may generate a balanced coupled saliency-map between a local feature (e.g., edges, luminance color)-based saliency-map(spatial frequency & luminance based saliency-map, hereinafter, referred to as a local saliency-map) and a global feature (e.g., region-level cluster and contrast)-based saliency-map(spatial region & color contrast based saliency-map, hereinafter, referred to as a global saliency-map). The coupled saliency-map generator 120 may generate a coupled saliency-map according to the following Equation 1:
$S^{c} = S^{f} + \omega S^{g}$ [Equation 1]
wherein $I_i$ may be a vector of the i-th pixel of the image; $S^{f}$ may be a local saliency-map including pixels representing the Euclidean distance between the mean color vector $I_{\mu}$ and the Gaussian-blurred vector $(G_{\sigma} * I_j)$ of N pixels of the image; $S^{g}$ may be a global saliency-map defined as an integration of region-based spatial and color contrast relationships after a Gaussian mixture model based representation having K components $\{c_p\}_{p=1}^{K}$; $|c_p|$ may be the number of pixels of a component region $c_p$; $D_s(c_q, c_p)$ may be a spatial distance between component regions $c_p$ and $c_q$; $D_r(c_q, c_p)$ may be a color distance between component regions $c_p$ and $c_q$; and $\sigma_s$ may be a stiffness parameter representing the strength of color contrast weighting. $\omega$ may be the predetermined weight value applied between $S^{f}$ and $S^{g}$, and preferably may be set to 1.
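The combination described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the apparatus's implementation: the function names (`blur`, `local_saliency`, `coupled_saliency`), the blur radius and sigma, and the edge clamping are all hypothetical choices; the image is assumed to be already converted to the desired color space; and the global saliency-map is taken as an input because its Gaussian mixture model step is not reproduced here.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian weights for offsets -radius..radius."""
    w = [math.exp(-(k * k) / (2.0 * sigma * sigma))
         for k in range(-radius, radius + 1)]
    total = sum(w)
    return [x / total for x in w]

def blur(image, sigma=1.0, radius=2):
    """Separable Gaussian blur of an H x W grid of 3-channel pixels.
    Border pixels are clamped (an assumption made for this sketch)."""
    h, w = len(image), len(image[0])
    kern = gaussian_kernel(sigma, radius)

    def pass_1d(src, horizontal):
        out = [[None] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                acc = [0.0, 0.0, 0.0]
                for k, wt in zip(range(-radius, radius + 1), kern):
                    yy = y if horizontal else min(max(y + k, 0), h - 1)
                    xx = min(max(x + k, 0), w - 1) if horizontal else x
                    for c in range(3):
                        acc[c] += wt * src[yy][xx][c]
                out[y][x] = tuple(acc)
        return out

    return pass_1d(pass_1d(image, True), False)

def local_saliency(image):
    """Per-pixel Euclidean distance between the image's mean color
    vector and the Gaussian-blurred pixel vector."""
    h, w = len(image), len(image[0])
    n = h * w
    mean = [sum(image[y][x][c] for y in range(h) for x in range(w)) / n
            for c in range(3)]
    blurred = blur(image)
    return [[math.dist(mean, blurred[y][x]) for x in range(w)]
            for y in range(h)]

def coupled_saliency(local_map, global_map, weight=1.0):
    """Coupled map = local map + weight * global map, pixel by pixel."""
    return [[lf + weight * g for lf, g in zip(row_l, row_g)]
            for row_l, row_g in zip(local_map, global_map)]
```

With `weight=1.0`, the value the description calls preferable, the coupled map is simply the pixel-wise sum of the two maps.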
[0038] Here, the local saliency-map represents uniformly highlighted regions with well-defined boundaries of high frequencies in the CIELab color space, while the global saliency-map represents spatial distance and color distance through image segmentation in the RGB color space.
[0039] Referring to
[0040] Furthermore, the mean absolute error (MAE) and F-measure of the local saliency-map, the global saliency-map and the coupled saliency-map for two images, and the MAE and F-measure for the filtered versions of the same maps, are shown in the following Table 1.
TABLE 1
Image    Map                   MAE (before filtering)  F-measure (before filtering)  MAE (after filtering)  F-measure (after filtering)
Image 1  Coupled saliency-map  0.1876                  0.9434                        0.1447                 0.9375
Image 1  Global saliency-map   0.1232                  0.9502                        0.1449                 0.9368
Image 1  Local saliency-map    0.2676                  0.9042                        0.2062                 0.8929
Image 2  Coupled saliency-map  0.1557                  0.9588                        0.1036                 0.9596
Image 2  Global saliency-map   0.1011                  0.9628                        0.1209                 0.9533
Image 2  Local saliency-map    0.2279                  0.9178                        0.1629                 0.9216
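[0041] Here, MAE and F-measure are values representing similarity of images which can be obtained by the following Equation 2:
$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|S_i^{*} - G_i\right|$, $F_{\beta} = \frac{(1+\beta^{2})\,\mathrm{Precision}\cdot\mathrm{Recall}}{\beta^{2}\,\mathrm{Precision} + \mathrm{Recall}}$ [Equation 2]
wherein $G_i$ may be the i-th pixel of a ground-truth image, $S_i^{*}$ may be the i-th pixel of each saliency-map, and N may be the number of pixels. $F_{\beta}$ may be the F-measure with $\beta^{2} = 1$, representing the harmonic mean of precision and recall.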
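The two measures can be computed as in the following sketch. It assumes saliency and ground-truth values lie in [0, 1] and are given as flat lists; the binarization threshold of 0.5 used for precision and recall, and the function names, are assumptions made for illustration and are not stated in the description.

```python
def mean_absolute_error(saliency, ground_truth):
    """Average absolute difference between a saliency-map and the
    ground truth, both given as flat lists of values in [0, 1]."""
    n = len(saliency)
    return sum(abs(s - g) for s, g in zip(saliency, ground_truth)) / n

def f_measure(saliency, ground_truth, threshold=0.5, beta_sq=1.0):
    """F-measure of the thresholded saliency-map against the ground
    truth. With beta_sq = 1 this is the harmonic mean of precision
    and recall. The threshold is an assumption of this sketch."""
    pred = [s >= threshold for s in saliency]
    truth = [g >= 0.5 for g in ground_truth]
    true_pos = sum(1 for p, t in zip(pred, truth) if p and t)
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(pred)
    recall = true_pos / sum(truth)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
```

A lower MAE and a higher F-measure both indicate a saliency-map closer to the ground truth, which is how the values in Table 1 are read.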
[0042] The coupled saliency-map generator 120 may transfer the coupled saliency-map to the adaptive tri-map generator 130.
[0043] The adaptive tri-map generator 130 may generate an adaptive tri-map by referring to the coupled saliency-map. For example, the adaptive tri-map generator 130 may generate a tri-map by applying Gaussian blur and image clustering to the coupled saliency-map.
[0044] Then, the adaptive tri-map generator 130 may select a pixel, of which distance is the least to any one of mean color values of the tri-map's foreground region, background region and unknown region (hereinafter, referred to as 'the shortest distance pixel'), among pixels in the coupled saliency-map. When the shortest distance pixel is a pixel identical to the foreground mean color value and located in the unknown region, the adaptive tri-map generator 130 may generate an adaptive tri-map by replacing the shortest distance pixel and the pixel which is adjacent to the shortest distance pixel (e.g., 410 in
[0045] The adaptive tri-map generator 130 may transfer the adaptive tri-map to the alpha matte generator 140.
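The initial tri-map generation by image clustering could be sketched as follows. The description does not name a clustering algorithm, so the use of one-dimensional k-means with three clusters, the label constants, and the function names are all assumptions of this sketch; the Gaussian blur step is assumed to have been applied to the saliency-map beforehand.

```python
BACKGROUND, UNKNOWN, FOREGROUND = 0, 1, 2

def kmeans_1d(values, k=3, iterations=20):
    """Plain k-means on scalar values, with centers initialized at
    evenly spaced positions between the minimum and maximum."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Keep the old center when a cluster receives no values.
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers

def make_trimap(saliency):
    """Label each pixel by its nearest cluster center: the lowest
    center is background, the middle unknown, the highest foreground."""
    flat = [v for row in saliency for v in row]
    centers = sorted(kmeans_1d(flat))

    def label(v):
        return min(range(3), key=lambda i: abs(v - centers[i]))

    return [[label(v) for v in row] for row in saliency]
```

Because the centers are sorted before labeling, the returned indices line up with the `BACKGROUND`, `UNKNOWN` and `FOREGROUND` constants.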
[0046] The alpha matte generator 140 may generate an alpha matte by referring to the adaptive tri-map. The alpha matte generator 140 may determine a pre-defined alpha value $\hat{\alpha}_v$ by setting transparency as 1 at the position of a pixel located in the foreground region of the adaptive tri-map among pixels of the image, and transparency as 0 at the position of a pixel located in the background region. The alpha matte generator 140 may generate a pre-defined alpha value of each pixel of the image according to the following Equation 3:
$\hat{\alpha}_v = 1$ if $v \in F$, and $\hat{\alpha}_v = 0$ if $v \in B$ [Equation 3]
wherein v is a pixel of an image, I is an image, F is a foreground region, B is a background region, U is an unknown region, and T is a tri-map.
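The pre-defined alpha values can be derived from a tri-map as in this sketch. The label constants and the function name are hypothetical. Two further assumptions are made here that the description does not state: unknown pixels receive a placeholder value of 0, and a separate mask marks which pixels are constrained (foreground or background).

```python
BACKGROUND, UNKNOWN, FOREGROUND = 0, 1, 2

def predefined_alpha(trimap):
    """Return (alpha_hat, known) for a tri-map of labels.
    alpha_hat is 1 on foreground pixels and 0 elsewhere; known is 1
    where the tri-map fixes the value (foreground or background) and
    0 on unknown pixels. The known mask is an assumption of this
    sketch about how the constraints are weighted."""
    alpha_hat = [[1.0 if t == FOREGROUND else 0.0 for t in row]
                 for row in trimap]
    known = [[0.0 if t == UNKNOWN else 1.0 for t in row]
             for row in trimap]
    return alpha_hat, known
```

For a tri-map row `[FOREGROUND, UNKNOWN, BACKGROUND]` this yields `alpha_hat = [1.0, 0.0, 0.0]` and `known = [1.0, 0.0, 1.0]`.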
[0047] The alpha matte generator 140 may generate an alpha matte by employing the pre-defined alpha value. For example, the alpha matte generator 140 may generate transparency corresponding to each pixel of the alpha matte by employing a sparse linear system represented by the following Equation 4:
$(L + \lambda W_v)\,\alpha = \lambda \hat{\alpha}_v$ [Equation 4]
wherein L is a matting Laplacian matrix and $W_v$ is a diagonal matrix which consists of elements having pre-defined alpha values. Here, the matting Laplacian matrix may be represented by $L = D - A$, in which A is a matting affinity matrix, which is a matrix having each element value according to the following Equation 5. $\alpha$ is transparency of the alpha matte.
[0048] Here, $\mu_k$ is the mean in the k-th local window, $\Sigma_k$ is a covariance matrix in the k-th local window, and $I_c$ is a c × c identity matrix.
[0049] D may be a diagonal matrix, which consists of elements such as $D_{ii} = \sum_j A_{ij}$.
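For orientation, the matting affinity commonly used in the closed-form matting literature has the form below. Treating it as the form of Equation 5 is an assumption; the local window $w_k$, its pixel count $|w_k|$, the regularization constant $\varepsilon$, and the pixel color vectors $I_i$ and $I_j$ are symbols introduced here for illustration rather than defined in the description.

```latex
A_{ij} = \sum_{k \,\mid\, (i,j) \in w_k} \frac{1}{|w_k|}
\left( 1 + (I_i - \mu_k)^{T}
\left( \Sigma_k + \frac{\varepsilon}{|w_k|}\, I_c \right)^{-1}
(I_j - \mu_k) \right),
\qquad
D_{ii} = \sum_{j} A_{ij},
\qquad
L = D - A .
```

Under this form, two pixels receive a large affinity when they share local windows in which their colors deviate from the window mean in a similar direction.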
[0050] Equation 4 is obtained by expressing the matting equation of Equation 6 as a Lagrangian equation according to Equation 7 and then differentiating the Lagrangian equation.
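The step from a constrained matting problem to Equation 4 can be worked through as follows. The specific forms written for the matting equation and its Lagrangian are assumptions modeled on the standard closed-form matting formulation; L is assumed symmetric, and $W_v$ is assumed to have ones on the diagonal at pixels constrained by the tri-map and zeros elsewhere.

```latex
% Assumed matting problem: a smooth alpha that agrees with the
% pre-defined values on constrained pixels.
\min_{\alpha}\; \alpha^{T} L\, \alpha
\quad \text{subject to} \quad
W_v\,(\alpha - \hat{\alpha}_v) = 0

% Assumed Lagrangian (penalty) form with multiplier \lambda.
J(\alpha) = \alpha^{T} L\, \alpha
+ \lambda\, (\alpha - \hat{\alpha}_v)^{T} W_v\, (\alpha - \hat{\alpha}_v)

% Setting the gradient to zero.
\frac{\partial J}{\partial \alpha}
= 2 L \alpha + 2 \lambda W_v (\alpha - \hat{\alpha}_v) = 0
\;\Longrightarrow\;
(L + \lambda W_v)\,\alpha = \lambda W_v\, \hat{\alpha}_v

% Since \hat{\alpha}_v is zero wherever W_v is zero,
% W_v \hat{\alpha}_v = \hat{\alpha}_v, which gives
(L + \lambda W_v)\,\alpha = \lambda\, \hat{\alpha}_v
```

A large $\lambda$ forces the solution to match the pre-defined values on constrained pixels, while the Laplacian term propagates them smoothly into the unknown region.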
[0051] The alpha matte generator 140 may transfer the alpha matte to the object extractor 150.
[0052] The object extractor 150 may determine the region of the pixel with the transparency of the alpha matte of 1 as an object region and then generate and output the object image including the pixel value of the object region.
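The extraction step can be illustrated with a short sketch. The function name, the fill color used for non-object pixels, and the small tolerance applied when testing for a transparency of 1 are assumptions made for illustration.

```python
def extract_object(image, alpha, background=(0, 0, 0), tol=1e-6):
    """Keep pixels whose alpha is 1 (within tol) and replace every
    other pixel with the background fill color. The fill color and
    tolerance are assumptions of this sketch."""
    return [[px if a >= 1.0 - tol else background
             for px, a in zip(img_row, alpha_row)]
            for img_row, alpha_row in zip(image, alpha)]
```

A pixel with an intermediate alpha value, such as 0.4, is treated as outside the object region in this sketch, matching the description's rule that the object region is where the transparency is 1.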
[0053] The output interface 160 may display the object extraction result by being connected with an output device such as a monitor. Accordingly, a user may see the object extraction result through the output device.
[0054] The input interface 170 may receive a tri-map correction input by a user to correct the adaptive tri-map generated by the adaptive tri-map generator 130. The input interface 170 may transfer the tri-map correction input to the adaptive tri-map generator 130. The adaptive tri-map generator 130 may correct the adaptive tri-map based on the tri-map correction input and transfer the corrected adaptive tri-map to the alpha matte generator 140. The alpha matte generator 140 may generate an alpha matte based on the corrected adaptive tri-map and the object extractor 150 may extract an object based on the re-generated alpha matte.
[0055] Here, the alpha matte generator 140 may generate the alpha matte with parallel processing by using GPGPU through a direct method such as CF (Cholesky factorization), or an iterative method such as CG (conjugate gradient) or PCG (preconditioned conjugate gradient). An alpha matte generation algorithm through PCG may be represented as shown in
[0056] The apparatus for extracting an object according to an example is able to generate an alpha matte quickly even for a large-scale image by reducing the computing time required to generate the alpha matte.
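A preconditioned conjugate gradient solve of the sparse linear system can be sketched as below. This is a serial, pure-Python illustration, not the GPGPU implementation the description refers to. Several things are assumptions: the Jacobi (diagonal) preconditioner, the path-graph Laplacian that stands in for the matting Laplacian so the example stays small, the value of `lam`, and all function names.

```python
def chain_laplacian(n):
    """Graph Laplacian L = D - A of a path graph with n nodes,
    used here as a small stand-in for the matting Laplacian."""
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        lap[i][i] += 1.0
        lap[i + 1][i + 1] += 1.0
        lap[i][i + 1] -= 1.0
        lap[i + 1][i] -= 1.0
    return lap

def matvec(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pcg(m, b, tol=1e-10, max_iter=1000):
    """Conjugate gradient with a Jacobi preconditioner for a
    symmetric positive definite matrix m."""
    inv_diag = [1.0 / m[i][i] for i in range(len(b))]
    x = [0.0] * len(b)
    r = list(b)                                  # residual b - m x
    z = [d * ri for d, ri in zip(inv_diag, r)]   # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        mp = matvec(m, p)
        step = rz / dot(p, mp)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * mi for ri, mi in zip(r, mp)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

def solve_alpha(alpha_hat, known, lam=100.0):
    """Solve (L + lam * W) alpha = lam * W * alpha_hat, where W is
    the diagonal matrix built from the known mask."""
    n = len(alpha_hat)
    m = chain_laplacian(n)
    for i in range(n):
        m[i][i] += lam * known[i]
    b = [lam * known[i] * alpha_hat[i] for i in range(n)]
    return pcg(m, b)
```

For example, `solve_alpha([1.0, 0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0, 1.0])` pins the first pixel near 1 and the last near 0, with the three unknown pixels interpolated between them. The matrix-vector products and vector updates are the parts that lend themselves to parallel execution.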
[0057]
[0058] Referring to
[0059] In step 620, the apparatus for extracting an object generates a coupled saliency-map by referring to the image.
[0060] In step 630, the apparatus for extracting an object generates an adaptive tri-map using the coupled saliency-map. A process for generating an adaptive tri-map using the coupled saliency-map will be explained in more detail with reference to
[0061] In step 640, the apparatus for extracting an object generates an alpha matte corresponding to the adaptive tri-map.
[0062] In step 650, the apparatus for extracting an object extracts an object through the alpha matte and generates an object image including the object.
[0063] In step 660, the apparatus for extracting an object determines whether a tri-map correction input is received from a user or not.
[0064] When it is determined that the tri-map correction input is received from a user, the apparatus for extracting an object corrects the adaptive tri-map according to the tri-map correction input in step 670.
[0065] In step 680, the apparatus for extracting an object regenerates an object image by extracting an object according to the corrected adaptive tri-map.
[0066] In step 690, the apparatus for extracting an object outputs the extracted object image.
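The control flow of these steps can be sketched as a small driver function. Every callable passed in is a hypothetical placeholder for the corresponding component, and modeling the user's correction input as an iterable of correction functions is an assumption of this sketch; the flow described above checks for a correction input once.

```python
def extract_with_corrections(image, make_saliency, make_trimap,
                             make_alpha, extract, corrections=()):
    """Run the extraction flow, then re-run the alpha matte and
    extraction stages for each tri-map correction supplied."""
    saliency = make_saliency(image)        # step 620
    trimap = make_trimap(saliency)         # step 630
    alpha = make_alpha(image, trimap)      # step 640
    result = extract(image, alpha)         # step 650
    for correct in corrections:            # step 660: correction received
        trimap = correct(trimap)           # step 670
        alpha = make_alpha(image, trimap)  # step 680: regenerate
        result = extract(image, alpha)
    return result                          # step 690
```

Only the tri-map, the alpha matte and the object image are recomputed after a correction; the coupled saliency-map is reused.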
[0067]
[0068] Referring to
[0069] In step 720, the apparatus for extracting an object estimates each mean of foreground, background and unknown regions of the tri-map.
[0070] In step 730, the apparatus for extracting an object selects a pixel of the coupled saliency-map of which distance is the least to any one of the means as the shortest distance pixel.
[0071] In step 740, the apparatus for extracting an object determines whether the foreground mean color value is identical to a value of the shortest distance pixel.
[0072] In step 740, when the foreground mean color value is not identical to a value of the shortest distance pixel, the apparatus for extracting an object ends the process for generating an adaptive tri-map for the shortest distance pixel.
[0073] On the other hand, when the foreground mean color value is identical to a value of the shortest distance pixel, the apparatus for extracting an object determines if the shortest distance pixel is located within an unknown region in step 750.
[0074] When it is determined that the shortest distance pixel is not located within an unknown region, the apparatus for extracting an object ends the process for generating an adaptive tri-map for the shortest distance pixel.
[0075] On the other hand, when it is determined as that the shortest distance pixel is located within an unknown region, the apparatus for extracting an object replaces the shortest distance pixel and the pixel which is adjacent to the shortest distance pixel in the tri-map with a tri-map's foreground mean color value in step 760.
[0076] In step 770, the apparatus for extracting an object generates an adaptive tri-map including the replaced pixel value.
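One possible reading of steps 720 to 770 is sketched below. It is an interpretation, not the definitive procedure: each pixel is treated as a candidate shortest distance pixel, "identical to the foreground mean" is read as "the foreground mean is the nearest of the three region means", adjacency is taken to be the four direct neighbors, and the saliency-map is treated as a single-channel value. The label constants and function names are hypothetical.

```python
BACKGROUND, UNKNOWN, FOREGROUND = 0, 1, 2

def region_means(saliency, trimap):
    """Mean saliency value of each tri-map region (step 720)."""
    sums, counts = {}, {}
    for s_row, t_row in zip(saliency, trimap):
        for s, t in zip(s_row, t_row):
            sums[t] = sums.get(t, 0.0) + s
            counts[t] = counts.get(t, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def adapt_trimap(saliency, trimap):
    """Promote unknown pixels whose nearest region mean is the
    foreground mean, together with their 4-neighbors, to foreground
    (steps 730 to 770 under the reading stated above)."""
    means = region_means(saliency, trimap)
    h, w = len(trimap), len(trimap[0])
    out = [row[:] for row in trimap]
    for y in range(h):
        for x in range(w):
            nearest = min(means,
                          key=lambda k: abs(saliency[y][x] - means[k]))
            if nearest == FOREGROUND and trimap[y][x] == UNKNOWN:
                for dy, dx in ((0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = FOREGROUND
    return out
```

The decisions are made against the original tri-map and written to a copy, so a promoted pixel does not trigger further promotions within the same pass.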
[0077] Exemplary embodiments of the present disclosure may be implemented in a computer system.
[0078] Exemplary embodiments of the present disclosure may be implemented in a computer system, for example, a computer readable recording medium. As shown in
[0079] Accordingly, the exemplary embodiment of the present disclosure can be implemented as a computer-implemented method or as a non-volatile computer recording medium storing computer-executable instructions. The instructions can perform the method according to at least one embodiment of the present disclosure when they are executed by a processor.