Apparatus and method for re-identifying object in image processing

Abstract

The apparatus includes: a weighted feature extractor configured to extract a weighted feature from an input image and generate a weighted descriptor to which a feature of a salient region is applied; a dictionary constructor configured to construct a dictionary composed of images with different characteristics of one object using the weighted descriptor to which the feature of the salient region is applied by the weighted feature extractor and store the dictionary in a database (DB); and a coefficient estimator and ID determiner configured to apply sparse representation for estimating a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object constituting the dictionary, and perform identification using an error between a target and the reconstructed object.

Claims

1. An apparatus for re-identifying an object in image processing, the apparatus comprising: a weighted feature extractor configured to extract a weighted feature from an input image and generate a weighted descriptor to which a feature of a salient region is applied; a dictionary constructor configured to construct a dictionary composed of images with the most different characteristics of one object using the weighted descriptor to which the feature of the salient region is applied by the weighted feature extractor and store the dictionary in a database (DB) such that redundant information is not included in the dictionary; and a coefficient estimator and ID determiner configured to apply sparse representation for estimating a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object constituting the dictionary, and perform identification using an error between a target and the reconstructed object; wherein an application of the sparse representation is defined as an equation of $y = D\alpha$, where y denotes an object (target) to be re-identified, D denotes a set of candidate objects, $\alpha$ represents a weight assigned as an $N \times 1$ column vector, D represents a pre-defined $M \times N$ dictionary with N samples that have M-dimensional features, and y is reconstructed signals having a same size as $\alpha$; wherein atoms of $\alpha$ must have a non-zero value, y is generated by combining only a few dictionary elements selected by non-zero atoms of $\alpha$, and when dictionary D is given, the equation is modified as $\hat{\alpha} = \underset{\alpha}{\arg\min} \left( \| y - D\alpha \|_{2}^{2} + \lambda \| \alpha \|_{1} \right)$ to minimize a number of the non-zero atoms, where $\hat{\alpha}$ represents a modified sparse vector, $\lambda$ represents a regularization factor, $\| \cdot \|_{2}^{2}$ represents the $l_{2}$-norm operator, $\| \cdot \|_{0}$ represents the $l_{0}$-norm operator, and $\hat{\alpha}$ is a first term which is optimally reconstructed due to an error between input signal y and recovered version $D\alpha$ and satisfies a sparsity condition.

2. The apparatus of claim 1, wherein the weighted feature descriptor of the weighted feature extractor is defined by a weighted hue saturation value (HSV) histogram as $W_{s}^{C}(b) = \begin{cases} W_{s}^{C}(b) + W(u, v), & \text{if } I_{s}^{H}(u, v) \in b \\ W_{s}^{C}(b), & \text{otherwise,} \end{cases}$ where $I_{s}^{H}$ represents a hue channel of an $s^{th}$ stripe, $W_{s}^{C}$ represents a weighted vector, b is a bin of the histogram, and (u, v) are coordinates of each stripe.

3. The apparatus of claim 2, wherein each histogram bin is modified by multiplying the histogram bin by a corresponding weight as shown in $f_{s}^{wC}(b) = W_{s}^{C}(b) f_{s}^{C}(b)$, wherein $f_{s}^{wC}$ represents a color histogram of the $s^{th}$ stripe satisfying $f_{s}^{wC} \in \mathbb{R}^{1 \times M_{C}}$ with $M_{C}$ histogram bins.

4. The apparatus of claim 1, wherein the weighted feature descriptor of the weighted feature extractor is defined by a weighted local binary pattern as $f_{s}^{wT}(I_{s}^{T}(u, v)) = W_{s}^{T}(I_{s}^{T}(u, v)) f_{s}^{T}(I_{s}^{T}(u, v))$, where $f_{s}^{wT}$ denotes a weighted texture descriptor of the $s^{th}$ stripe satisfying $f_{s}^{wT} \in \mathbb{R}^{1 \times M_{T}}$ with $M_{T}$ bins, $I_{s}^{T}$ is a texture pattern computed by LBP at (u, v), $f_{s}^{T}$ denotes an LBP histogram from each stripe region, and $W_{s}^{T}$ denotes a weight vector at a same size as $f_{s}^{T}$.

5. The apparatus of claim 4, wherein a total descriptor $d \in \mathbb{R}^{M}$ of an object is defined by arranging all stripe regions as $d = [f_{1}^{wC}\ f_{1}^{wT}\ f_{2}^{wC}\ f_{2}^{wT}\ \ldots\ f_{L}^{wC}\ f_{L}^{wT}]$.

6. The apparatus of claim 1, wherein the coefficient estimator and ID determiner estimates a sparse coefficient using a least absolute shrinkage and selection operator (LASSO), obtains a reconstruction error of galleries by computing a difference between the galleries and a reconstructed probe, and searches an optimal gallery using a reconstructed probe from an estimated coefficient and D from N gallery images, and a decision of a minimum error and an ID is defined as $r_{i} = \min_{k} \| y_{k} - D_{i} \hat{\alpha}_{i}^{k} \|,$ where $r_{i}$ represents a minimum reconstruction error of an $i^{th}$ gallery for a probe, and $y_{k}$ represents a feature descriptor of a $k^{th}$ image in a probe.

7. The apparatus of claim 6, wherein an index of a gallery corresponding to a probe is $c = \underset{i}{\arg\min}\, r_{i}, \quad i = 1, 2, \ldots, N,$ where c represents a selected identity of a probe to re-identify.

8. A method of re-identifying an object in image processing, the method comprising: a weighted feature extraction operation of extracting a weighted feature from an input image and generating a weighted descriptor to which a feature of a salient region is applied; a dictionary construction operation of constructing a dictionary composed of images with the most different characteristics of one object using the weighted descriptor to which the feature of the salient region is applied and storing the dictionary in a database (DB) such that redundant information is not included in the dictionary; and a coefficient estimation and ID decision operation of applying sparse representation for estimating a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object constituting the dictionary, and performing identification using an error between a target and the reconstructed object; wherein an application of the sparse representation is defined as an equation of $y = D\alpha$, where y denotes an object (target) to be re-identified, D denotes a set of candidate objects, $\alpha$ represents a weight assigned as an $N \times 1$ column vector, D represents a pre-defined $M \times N$ dictionary with N samples that have M-dimensional features, and y is reconstructed signals having a same size as $\alpha$; and wherein atoms of $\alpha$ must have a non-zero value, y is generated by combining only a few dictionary elements selected by non-zero atoms of $\alpha$, and when dictionary D is given, the equation is modified as $\hat{\alpha} = \underset{\alpha}{\arg\min} \left( \| y - D\alpha \|_{2}^{2} + \lambda \| \alpha \|_{1} \right)$ to minimize a number of the non-zero atoms, where $\hat{\alpha}$ represents a modified sparse vector, $\lambda$ represents a regularization factor, $\| \cdot \|_{2}^{2}$ represents the $l_{2}$-norm operator, $\| \cdot \|_{0}$ represents the $l_{0}$-norm operator, and $\hat{\alpha}$ is the first term which is optimally reconstructed due to an error between input signal y and recovered version $D\alpha$ and satisfies a sparsity condition.

9. The method of claim 8, wherein the weighted feature descriptor in the weighted feature extraction operation is defined by a weighted hue saturation value (HSV) histogram as $W_{s}^{C}(b) = \begin{cases} W_{s}^{C}(b) + W(u, v), & \text{if } I_{s}^{H}(u, v) \in b \\ W_{s}^{C}(b), & \text{otherwise,} \end{cases}$ where $I_{s}^{H}$ represents a hue channel of an $s^{th}$ stripe, $W_{s}^{C}$ represents a weighted vector, b is a bin of the histogram, and (u, v) are coordinates of each stripe.

10. The method of claim 9, wherein each histogram bin is modified by multiplying the histogram bin by a corresponding weight as shown in $f_{s}^{wC}(b) = W_{s}^{C}(b) f_{s}^{C}(b)$, wherein $f_{s}^{wC}$ represents a color histogram of the $s^{th}$ stripe satisfying $f_{s}^{wC} \in \mathbb{R}^{1 \times M_{C}}$ with $M_{C}$ histogram bins.

11. The method of claim 8, wherein the weighted feature descriptor in the weighted feature extraction operation is defined by a weighted local binary pattern as $f_{s}^{wT}(I_{s}^{T}(u, v)) = W_{s}^{T}(I_{s}^{T}(u, v)) f_{s}^{T}(I_{s}^{T}(u, v))$, where $f_{s}^{wT}$ denotes a weighted texture descriptor of the $s^{th}$ stripe satisfying $f_{s}^{wT} \in \mathbb{R}^{1 \times M_{T}}$ with $M_{T}$ bins, $I_{s}^{T}$ is a texture pattern computed by LBP at (u, v), $f_{s}^{T}$ denotes an LBP histogram from each stripe region, and $W_{s}^{T}$ denotes a weight vector at a same size as $f_{s}^{T}$.

12. The method of claim 11, wherein a total descriptor $d \in \mathbb{R}^{M}$ of an object is defined by arranging all stripe regions as $d = [f_{1}^{wC}\ f_{1}^{wT}\ f_{2}^{wC}\ f_{2}^{wT}\ \ldots\ f_{L}^{wC}\ f_{L}^{wT}]$.

13. The method of claim 8, wherein the coefficient estimation and ID decision operation comprises estimating a sparse coefficient using a least absolute shrinkage and selection operator (LASSO), obtaining a reconstruction error of galleries by computing a difference between the galleries and a reconstructed probe, searching an optimal gallery using a reconstructed probe from an estimated coefficient and D from N gallery images, and a decision of a minimum error and an ID is defined as $r_{i} = \min_{k} \| y_{k} - D_{i} \hat{\alpha}_{i}^{k} \|,$ where $r_{i}$ represents a minimum reconstruction error of an $i^{th}$ gallery for a probe, and $y_{k}$ represents a feature descriptor of a $k^{th}$ image in a probe.

14. The method of claim 13, wherein an index of a gallery corresponding to a probe is $c = \underset{i}{\arg\min}\, r_{i}, \quad i = 1, 2, \ldots, N,$ where c represents a selected identity of a probe to re-identify.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) The above and other objects, features and advantages of the present invention will become more apparent to those of ordinary skill in the art by describing exemplary embodiments thereof in detail with reference to the accompanying drawings, in which:

(2) FIG. 1 is a configuration diagram illustrating a general surveillance camera and an image processing process;

(3) FIG. 2 is a configuration diagram illustrating problematic factors in an object re-identification process of a conventional technology;

(4) FIG. 3 is a configuration diagram illustrating an apparatus for re-identifying an object in image processing according to the present invention;

(5) FIG. 4 is a detailed configuration diagram illustrating an apparatus for re-identifying an object in image processing according to the present invention;

(6) FIG. 5 is a detailed configuration diagram illustrating a weighted feature extractor;

(7) FIG. 6 is a configuration diagram illustrating a dictionary constructor;

(8) FIG. 7 is a configuration diagram of a coefficient estimator and ID determiner; and

(9) FIG. 8 is a flowchart illustrating a method of re-identifying an object in image processing according to the present invention.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

(10) Hereinafter, exemplary embodiments of an apparatus and method for re-identifying an object in image processing according to the present invention will be described.

(11) Advantages and features of the apparatus and method for re-identifying an object in image processing according to the present invention will become apparent with reference to the embodiments described in detail below with reference to the accompanying drawings.

(12) FIG. 3 is a configuration diagram illustrating an apparatus for re-identifying an object in image processing according to the present invention, and FIG. 4 is a detailed configuration diagram illustrating an apparatus for re-identifying an object in image processing according to the present invention.

(13) The apparatus and method for re-identifying an object in image processing according to the present invention are provided to allow cameras having different characteristics for global monitoring to accurately recognize one object.

(14) To this end, the present invention includes a configuration using saliency-based learning and sparse representation in order to enhance the accuracy and robustness of object identification and to ensure real-time object identification, where saliency indicates an apparent visual-perceptual characteristic of an object.

(15) The present invention includes a configuration to reconstruct an observed signal with a few linear combinations of atoms constituting a dictionary using sparse representation.

(16) The present invention includes a configuration to estimate a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object constituting a dictionary and perform identification using an error between a target and a reconstructed object.

(17) The present invention includes a configuration to generate a descriptor to which a feature of a salient region is applied, thereby reducing a dictionary due to removal of redundant information and facilitating coefficient estimation.

(18) The configurations for re-identifying an object in image processing according to the present invention, which will be described hereinafter, are construed to be applicable to any apparatus that can recognize an object contained in an image and output information about the recognized object, such as a surveillance camera (for example, a closed-circuit television (CCTV) camera), an image tracking system, a personal digital assistant (PDA), a smart phone, a navigation terminal, a desktop computer, or a personal computer such as a notebook computer.

(19) As shown in FIGS. 3 and 4, the apparatus for re-identifying an object in image processing according to the present invention includes a weighted feature extractor 30 configured to extract a weighted feature from an input image and generate a descriptor to which a feature of a salient region is applied, a dictionary constructor 40 configured to construct a dictionary composed of images with the most different characteristics of one object so as to reduce the dictionary and facilitate coefficient estimation using the descriptor to which the feature of the salient region is applied by the weighted feature extractor 30 and then store the dictionary in a database (DB) 50, and a coefficient estimator and ID determiner 60 configured to estimate a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object and perform identification using an error between a target and the reconstructed object.

(20) Sparsity of sparse representation used in the configuration for re-identifying an object in image processing according to the present invention indicates that a coefficient is zero or close to zero.

(21) The present invention uses such sparse representation and uses a principle that reconstructs an observed signal with a few linear combinations of atoms constituting a dictionary.

(22) This will be expressed below:
$y = D\alpha$ [Equation 1]

(23) In the equation representing the application of sparse representation of object re-identification, y denotes an object (target) to be re-identified and D denotes a set of candidate objects.

(24) As such, a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of the target object constituting a dictionary is estimated and identification (matching) using an error between the target and the reconstructed object is performed.

(25) In Equation 1, $\alpha$ represents a weight assigned as an $N \times 1$ column vector, D represents a pre-defined $M \times N$ dictionary with N samples that have M-dimensional features, and y is reconstructed signals having the same size as $\alpha$.

(26) To satisfy the equation, atoms of $\alpha$ should have a non-zero value, and y is generated by combining only a few dictionary elements selected by non-zero atoms of $\alpha$.

(27) When dictionary D is given, Equation 1 is modified to minimize the number of the non-zero atoms.

(28) $\hat{\alpha} = \underset{\alpha}{\arg\min} \| y - D\alpha \|_{2}^{2} + \lambda \| \alpha \|_{0},$ [Equation 2]

(29) where $\hat{\alpha}$ represents a modified sparse vector, $\lambda$ represents a regularization factor, $\| \cdot \|_{2}^{2}$ represents the $l_{2}$-norm operator, and $\| \cdot \|_{0}$ represents the $l_{0}$-norm operator.

(30) $\hat{\alpha}$ is the first term which is optimally reconstructed due to an error between the input signal y and recovered version $D\alpha$ and satisfies a sparsity condition.
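The trade-off in Equation 2 can be illustrated with a minimal NumPy sketch. The dictionary, the sizes M and N, the regularization value, and the function name `sparse_objective` are all hypothetical choices for illustration; the sketch only evaluates the objective for two candidate coefficient vectors and does not solve the minimization.

```python
import numpy as np

def sparse_objective(y, D, alpha, lam):
    """Squared l2 reconstruction error plus lam times the number of non-zero atoms."""
    residual = y - D @ alpha
    return float(residual @ residual + lam * np.count_nonzero(alpha))

rng = np.random.default_rng(0)
M, N = 8, 20                          # M-dimensional features, N dictionary samples
D = rng.standard_normal((M, N))       # hypothetical dictionary, not real image features
alpha_true = np.zeros(N)
alpha_true[[3, 11]] = [1.5, -0.7]     # only two non-zero atoms
y = D @ alpha_true                    # Equation 1: y = D alpha

# A sparse coefficient vector and a dense one that both reconstruct y.
sparse_cost = sparse_objective(y, D, alpha_true, lam=0.1)
dense_alpha = np.linalg.pinv(D) @ y   # minimum-norm solution, generally dense
dense_cost = sparse_objective(y, D, dense_alpha, lam=0.1)
```

Both vectors reconstruct y, but the penalty term makes the vector with fewer non-zero atoms the cheaper one, which is the behaviour the sparsity condition asks for.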

(31) FIG. 5 is a detailed configuration diagram illustrating a weighted feature extractor.

(32) The weighted feature extractor may solve a feature extraction problem restricted by a small area of an object occupying an image.

(33) A weighted feature descriptor may be defined below.

(34) Equations 3 and 4 define a weighted hue saturation value histogram.

(35) $W_{s}^{C}(b) = \begin{cases} W_{s}^{C}(b) + W(u, v), & \text{if } I_{s}^{H}(u, v) \in b \\ W_{s}^{C}(b), & \text{otherwise,} \end{cases}$ [Equation 3]

(36) where $I_{s}^{H}$ represents a hue channel of the $s^{th}$ stripe, $W_{s}^{C}$ represents a weighted vector, b is a bin of the histogram, and (u, v) are coordinates of each stripe.
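A minimal sketch of the accumulation in Equation 3 follows. It assumes that W(u, v) is a per-pixel saliency weight and that hue lies in [0, 180), as in some common image libraries; neither the hue range nor the function name `weighted_bin_weights` comes from this description.

```python
import numpy as np

def weighted_bin_weights(hue, saliency, n_bins, hue_max=180.0):
    """Accumulate W(u, v) into the bin that contains each pixel's hue (Equation 3)."""
    # Map each hue value to a bin index in [0, n_bins - 1].
    bins = np.minimum((hue / hue_max * n_bins).astype(int), n_bins - 1)
    w = np.zeros(n_bins)
    np.add.at(w, bins.ravel(), saliency.ravel())   # unbuffered add handles repeated bins
    return w

hue = np.array([[0.0, 10.0], [100.0, 179.0]])      # toy 2x2 stripe
saliency = np.array([[0.5, 0.25], [1.0, 2.0]])     # hypothetical W(u, v)
w_c = weighted_bin_weights(hue, saliency, n_bins=4)
```

Pixels that fall in the same bin add their weights together, so a bin populated by salient pixels ends up with a larger entry in the weight vector.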

(37) Each histogram bin is modified by multiplying it by a corresponding weight as shown in Equation 4.
$f_{s}^{wC}(b) = W_{s}^{C}(b) f_{s}^{C}(b),$ [Equation 4]

(38) where $f_{s}^{wC}$ represents a color histogram of the $s^{th}$ stripe satisfying $f_{s}^{wC} \in \mathbb{R}^{1 \times M_{C}}$ with $M_{C}$ histogram bins.
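Equation 4 is a bin-wise product, which can be sketched as below. The hue range of [0, 180), the toy values, and the function name `weighted_color_histogram` are assumptions made for illustration only.

```python
import numpy as np

def weighted_color_histogram(hue, bin_weights, hue_max=180.0):
    """Multiply each hue-histogram bin by its weight (Equation 4)."""
    n_bins = bin_weights.shape[0]
    f_c, _ = np.histogram(hue, bins=n_bins, range=(0.0, hue_max))  # plain histogram
    return bin_weights * f_c                                       # weighted histogram

hue = np.array([[0.0, 10.0], [100.0, 179.0]])   # toy 2x2 stripe
w_c = np.array([0.75, 0.0, 1.0, 2.0])           # hypothetical bin weights
f_wc = weighted_color_histogram(hue, w_c)
```

The result keeps the length of the original histogram, one entry per bin.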

(39) Equations 5 and 6 define a weighted local binary pattern (LBP).
$f_{s}^{wT}(I_{s}^{T}(u, v)) = W_{s}^{T}(I_{s}^{T}(u, v)) f_{s}^{T}(I_{s}^{T}(u, v)),$ [Equation 5]

(40) wherein $f_{s}^{wT}$ denotes a weighted texture descriptor of the $s^{th}$ stripe satisfying $f_{s}^{wT} \in \mathbb{R}^{1 \times M_{T}}$ with $M_{T}$ bins, $I_{s}^{T}$ is a texture pattern computed by LBP at (u, v), $f_{s}^{T}$ denotes an LBP histogram from each stripe region, and $W_{s}^{T}$ denotes a weight vector at the same size as $f_{s}^{T}$.
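The weighted texture descriptor of Equation 5 might be sketched as follows. The basic 8-neighbour LBP variant, the 256-bin histogram, and the way the weight vector is accumulated from a per-pixel saliency map (by analogy with Equation 3) are all assumptions; the description does not fix these details, and the function names are hypothetical.

```python
import numpy as np

def lbp_codes(gray):
    """8-neighbour LBP code for every interior pixel of a 2-D grayscale array."""
    center = gray[1:-1, 1:-1]
    # Neighbour offsets, clockwise from the top-left pixel; one bit per neighbour.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    h, w = gray.shape
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes

def weighted_lbp_histogram(gray, saliency, n_bins=256):
    """Weight each LBP histogram bin by the saliency accumulated in it (Equation 5)."""
    codes = lbp_codes(gray).ravel()
    sal = saliency[1:-1, 1:-1].ravel()                          # match interior pixels
    f_t = np.bincount(codes, minlength=n_bins).astype(float)    # LBP histogram
    w_t = np.bincount(codes, weights=sal, minlength=n_bins)     # assumed weight vector
    return w_t * f_t

gray = np.full((4, 4), 7)                # flat toy stripe: every interior code is 255
saliency = np.full((4, 4), 0.5)          # hypothetical per-pixel weights
f_wt = weighted_lbp_histogram(gray, saliency)
```

The weight vector and the histogram have the same length, so the product is taken bin by bin.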

(41) A total descriptor $d \in \mathbb{R}^{M}$ of an object is defined by arranging all stripe regions as shown in Equation 6.
$d = [f_{1}^{wC}\ f_{1}^{wT}\ f_{2}^{wC}\ f_{2}^{wT}\ \ldots\ f_{L}^{wC}\ f_{L}^{wT}]^{T}$ [Equation 6]
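The arrangement in Equation 6 amounts to interleaving the per-stripe color and texture histograms and concatenating them. The sketch below assumes L stripes with fixed bin counts, so that the descriptor length is L times the sum of the color and texture bin counts; that relation and the function name `total_descriptor` are inferences for illustration, not statements from this description.

```python
import numpy as np

def total_descriptor(color_hists, texture_hists):
    """Interleave per-stripe color and texture histograms into one vector (Equation 6)."""
    parts = []
    for f_wc, f_wt in zip(color_hists, texture_hists):
        parts.extend([np.asarray(f_wc, dtype=float), np.asarray(f_wt, dtype=float)])
    return np.concatenate(parts)

# Toy input: L = 2 stripes, 2 color bins and 3 texture bins per stripe.
color = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
texture = [np.array([5.0, 6.0, 7.0]), np.array([8.0, 9.0, 10.0])]
d = total_descriptor(color, texture)
```

Each object is then represented by a single vector that can serve as one column of the dictionary or as a probe descriptor.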

(42) FIG. 6 is a configuration diagram illustrating a dictionary constructor.

(43) In the present invention, the dictionary is constructed so as to solve two problems: the inefficiency of a method that builds each gallery from multiple images, and the increase in the number of combinations that can represent a target when there are many similar atoms.

(44) In the present invention, a dictionary is constructed with images with the most different characteristics of one object, thereby reducing the dictionary and facilitating coefficient estimation due to removal of redundant information.
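The description does not state how the images with the most different characteristics are chosen. One possible reading, shown here purely as an assumption, is a greedy farthest-point selection over the descriptors of one object: each new atom is the image farthest from everything already selected. The function name `select_diverse_atoms` and the toy data are hypothetical.

```python
import numpy as np

def select_diverse_atoms(descriptors, n_atoms):
    """Greedily pick the descriptors that are farthest from those already chosen."""
    X = np.asarray(descriptors, dtype=float)           # shape (n_images, M)
    # Start from the image farthest from the mean descriptor.
    chosen = [int(np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1)))]
    min_dist = np.linalg.norm(X - X[chosen[0]], axis=1)
    while len(chosen) < min(n_atoms, len(X)):
        nxt = int(np.argmax(min_dist))                 # farthest from the chosen set
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen].T, chosen                         # dictionary block is M x n_atoms

# Toy descriptors: images 0 and 1 are near-duplicates.
X = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 0.0], [0.0, 10.0]])
D_i, chosen = select_diverse_atoms(X, n_atoms=3)
```

In this toy case only one of the two near-duplicate images is kept, which mirrors the stated goal of excluding redundant information from the dictionary.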

(45) FIG. 7 is a configuration diagram of a coefficient estimator and ID determiner.

(46) In the present invention, a sparse coefficient is estimated using a least absolute shrinkage and selection operator (LASSO) as shown in Equation 2.
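LASSO is usually stated with an $l_{1}$ penalty on the coefficients. The sketch below assumes that form and solves it with iterative soft thresholding (ISTA), which is one of several standard solvers and is not named in this description. The dictionary, the regularization value, the iteration count, and the function names are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink every entry of x toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(y, D, lam, n_iter=500):
    """Minimise ||y - D a||_2^2 + lam * ||a||_1 by iterative soft thresholding."""
    # Step size is the inverse Lipschitz constant of the gradient of the squared error.
    step = 1.0 / (2.0 * np.linalg.norm(D, 2) ** 2)
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ alpha - y)
        alpha = soft_threshold(alpha - step * grad, step * lam)
    return alpha

rng = np.random.default_rng(0)
D = rng.standard_normal((30, 60))        # hypothetical 30 x 60 dictionary
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
alpha_true = np.zeros(60)
alpha_true[[5, 42]] = [1.0, -2.0]        # two active atoms
y = D @ alpha_true
alpha_hat = lasso_ista(y, D, lam=0.05)
```

For an identity dictionary the iteration reduces to a single soft-thresholding of y by half the regularization factor, which gives a quick sanity check of the solver.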

(47) The method used measures a reconstruction error for the galleries by computing the difference between the galleries and a reconstructed probe, and searches for the optimal gallery using the probe reconstructed from the estimated coefficient and D from the N gallery images.

(48) Equations 7 and 8 represent decision of a minimum error and an ID using multi-shot probes.

(49) $r_{i} = \min_{k} \| y_{k} - D_{i} \hat{\alpha}_{i}^{k} \|,$ [Equation 7]

(50) where $r_{i}$ represents a minimum reconstruction error of the $i^{th}$ gallery for a probe, and $y_{k}$ represents a feature descriptor of the $k^{th}$ image in a probe.

(51) An index of the gallery corresponding to the probe is defined as Equation 8 below.

(52) $c = \underset{i}{\arg\min}\, r_{i}, \quad i = 1, 2, \ldots, N,$ [Equation 8]

(53) where c represents a selected identity of the probe to re-identify.
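Equations 7 and 8 can be sketched as a double loop over galleries and probe images. To keep the example short, an ordinary least-squares fit stands in for the coefficient estimator; that substitution is an assumption, since the description estimates the coefficient with LASSO. Indices in the code are 0-based, whereas Equation 8 counts from 1, and the gallery data are random placeholders.

```python
import numpy as np

def identify(probe_descriptors, gallery_dicts, estimate):
    """Return (c, r): index of the gallery with the smallest error and all errors r_i."""
    r = []
    for D_i in gallery_dicts:
        errors = []
        for y_k in probe_descriptors:
            alpha = estimate(y_k, D_i)                  # coefficient for this probe image
            errors.append(np.linalg.norm(y_k - D_i @ alpha))
        r.append(min(errors))                           # Equation 7
    r = np.array(r)
    return int(np.argmin(r)), r                         # Equation 8

def least_squares(y, D):
    # Stand-in estimator (an assumption); the description uses LASSO here.
    return np.linalg.lstsq(D, y, rcond=None)[0]

rng = np.random.default_rng(1)
galleries = [rng.standard_normal((16, 3)) for _ in range(4)]    # N = 4 identities
# Multi-shot probe: noisy combinations of the atoms of gallery 2.
probes = [galleries[2] @ np.array([0.6, 0.0, 0.4]) + 0.01 * rng.standard_normal(16)
          for _ in range(5)]
c, r = identify(probes, galleries, least_squares)
```

Taking the minimum over the probe images means a single well-reconstructed shot is enough to favour a gallery, which is how the multi-shot probe is used in Equation 7.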

(54) FIG. 8 is a flowchart illustrating a method of re-identifying an object in image processing according to the present invention.

(55) Initially, when an image is input (S801), a weighted feature is extracted and a descriptor to which a feature of a salient region is applied is generated (S802).

(56) Subsequently, a dictionary is constructed with images with the most different characteristics of one object such that the dictionary is reduced due to removal of redundant information and coefficient estimation is facilitated (S803).

(57) Subsequently, a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object constituting the dictionary is estimated (S804).

(58) An identification process using an error between a target and the reconstructed object is performed (S805).

(59) The above-described apparatus and method for re-identifying an object in image processing according to the present invention allow cameras having different characteristics for global monitoring to accurately identify one object. The results below were obtained using continuous images captured by two non-overlapping cameras in a crowded airport, with challenges such as background clutter, occlusion, and viewpoint and illumination changes.

(60) Table 1 shows results of saliency-based weighted feature extraction.

(61) TABLE 1 (Dataset: ILIDS-VID)

Method                          Rank 1   Rank 5   Rank 10   Rank 20
HSV                             10%      24.6%    31.3%     41.3%
LBP                             5.3%     17.3%    26%       38.6%
HSV + LBP                       16.6%    34.6%    43.3%     50%
Weighted HSV                    14%      28.6%    37.3%     48.6%
Weighted LBP                    8.3%     21.3%    28.6%     40.6%
Proposed weighted descriptor    28.6%    47.3%    58.6%     66.6%

(62) It can be seen that the performance is improved by the robust atom configuration of the weighted feature descriptor by applying the apparatus and method for re-identifying an object in image processing according to the present invention.

(63) Table 2 shows a result of dictionary construction to which the apparatus and method for re-identifying an object in image processing according to the present invention are applied.

(64) TABLE 2 (Dataset: ILIDS-VID)

Method                              Rank 1   Rank 5   Rank 10   Rank 20
Sequential images                   14.6%    34%      43.3%     60%
Stride selection                    20%      42.6%    53.3%     65.3%
Proposed dictionary construction    28.6%    47.3%    58.6%     66.6%

(65) It can be seen that a coefficient estimation problem caused by similar atoms can be solved.

(66) Table 3 shows a result of object re-identification to which the apparatus and method for re-identifying an object in image processing according to the present invention are applied.

(67) TABLE 3 (Dataset: ILIDS-VID)

Method             Rank 1   Rank 5   Rank 10   Rank 20
ISR                10%      24.6%    31.3%     41.3%
eSDC               5.3%     17.3%    26%       38.6%
LC-KSVD            24.3%    38.5%    42.3%     47.3%
DVR                23.3%    42.4%    55.3%     68.4%
SRID               24.9%    44.5%    54.1%     68.8%
Proposed method    28.6%    47.3%    58.6%     66.6%
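The description does not say how the Rank columns are computed. Assuming they are cumulative matching rates, that is, the fraction of probes whose true gallery appears among the k galleries with the smallest reconstruction error, they could be computed as in the sketch below. The error matrix, the identities, and the function name `rank_k_rates` are toy placeholders, not data from the tables.

```python
import numpy as np

def rank_k_rates(errors, true_ids, ranks=(1, 5, 10, 20)):
    """errors[p, i] is the reconstruction error of gallery i for probe p."""
    order = np.argsort(errors, axis=1)                  # best gallery first
    # 0-based position at which each probe's true gallery appears.
    position = np.argmax(order == np.asarray(true_ids)[:, None], axis=1)
    return {k: float(np.mean(position < k)) for k in ranks}

errors = np.array([[0.1, 0.5, 0.9],
                   [0.4, 0.3, 0.2],
                   [0.7, 0.6, 0.8]])                    # toy errors, 3 probes x 3 galleries
rates = rank_k_rates(errors, true_ids=[0, 1, 2], ranks=(1, 2, 3))
```

Under this reading the rate can only stay equal or grow as the rank increases, which matches the pattern of every row in the tables.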

(68) The above-described apparatus and method for re-identifying an object in image processing according to the present invention resolves a feature extraction problem restricted by a small area of an object occupying an image by using a weighted feature descriptor to which a feature of a salient region is applied, thereby allowing cameras having different characteristics for global monitoring to accurately identify one object.

(69) The apparatus and method for re-identifying an object in image processing according to the present invention achieve the following effects.

(70) First, by using saliency-based learning and sparse representation, it is possible to allow cameras having different characteristics for global monitoring to accurately identify one object.

(71) Second, by using saliency-based learning and sparse representation, it is possible to enhance the accuracy and robustness of object identification.

(72) Third, by using sparse representation, it is possible to reconstruct an observed signal with a few linear combinations of atoms constituting a dictionary.

(73) Fourth, it is possible to estimate a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects constituting a dictionary and perform identification using an error between a target and the reconstructed object, thereby increasing accuracy.

(74) Fifth, it is possible to resolve a feature extraction problem restricted by a small area of an object occupying an image by using a weighted feature descriptor.

(75) Sixth, a weighted descriptor to which a feature of a salient region is applied is generated for re-identification of an object, thereby reducing a dictionary due to removal of redundant information and facilitating coefficient estimation.

(76) It should be apparent to those skilled in the art that various modifications can be made to the above-described exemplary embodiments of the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention covers all such modifications provided they come within the scope of the appended claims and their equivalents.

REFERENCE NUMERALS

(77)
30: WEIGHTED FEATURE EXTRACTOR
40: DICTIONARY CONSTRUCTOR
50: DB
60: COEFFICIENT ESTIMATOR AND ID DETERMINER

CPC classification

Section G (PHYSICS): G06V20/52, G06V10/464, G06T7/73, G06T7/41

International classification

Section G (PHYSICS): G06K9/00, G06K9/46, G06T7/73, G06T7/41
