Apparatus and method for re-identifying object in image processing
10825194 · 2020-11-03
Assignee
Inventors
Cpc classification
G06V20/52
PHYSICS
G06V10/464
PHYSICS
International classification
Abstract
The apparatus includes: a weighted feature extractor configured to extract a weighted feature from an input image and generate a weighted descriptor to which a feature of a salient region is applied; a dictionary constructor configured to construct a dictionary composed of images with different characteristics of one object using the weighted descriptor to which the feature of the salient region is applied by the weighted feature extractor and store the dictionary in a database (DB); and a coefficient estimator and ID determiner configured to apply sparse representation for estimating a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object constituting the dictionary, and perform identification using an error between a target and the reconstructed object.
Claims
1. An apparatus for re-identifying an object in image processing, the apparatus comprising: a weighted feature extractor configured to extract a weighted feature from an input image and generate a weighted descriptor to which a feature of a salient region is applied; a dictionary constructor configured to construct a dictionary composed of images with the most different characteristics of one object using the weighted descriptor to which the feature of the salient region is applied by the weighted feature extractor and store the dictionary in a database (DB) such that redundant information is not included in the dictionary; and a coefficient estimator and ID determiner configured to apply sparse representation for estimating a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object constituting the dictionary, and perform identification using an error between a target and the reconstructed object; wherein an application of the sparse representation is defined as an equation of $y = D\alpha$, where $y$ denotes an object (target) to be re-identified, $D$ denotes a set of candidate objects, $\alpha$ represents a weight assigned as an $N \times 1$ column vector, $D$ represents a pre-defined $M \times N$ dictionary with $N$ samples that have $M$-dimensional features, and $y$ is reconstructed signals having a same size as $\alpha$; wherein atoms of $\alpha$ must have a non-zero value, $y$ is generated by combining only a few dictionary elements selected by non-zero atoms of $\alpha$, and when dictionary $D$ is given, the equation is modified as
2. The apparatus of claim 1, wherein the weighted feature descriptor of the weighted feature extractor is defined by a weighted hue saturation value (HSV) histogram as
3. The apparatus of claim 2, wherein each histogram bin is modified by multiplying the histogram bin by a corresponding weight as shown in $f_s^{wC}(b) = W_s^{C}(b)\, f_s^{C}(b)$, wherein $f_s^{wC}$ represents a color histogram of the $s$-th stripe satisfying $f_s^{wC} \in \mathbb{R}^{1 \times M^{\cdots}}$
4. The apparatus of claim 1, wherein the weighted feature descriptor of the weighted feature extractor is defined by a weighted local binary pattern as $f_s^{wT}(I_s^{T}(u,v)) = W_s^{T}(I_s^{T}(u,v))\, f_s^{T}(I_s^{T}(u,v))$, where $f_s^{wT}$ denotes a weighted texture descriptor of the $s$-th stripe satisfying $f_s^{wT} \in \mathbb{R}^{1 \times M^{\cdots}}$
5. The apparatus of claim 4, wherein a total descriptor $d \in \mathbb{R}^{M}$ of an object is defined by arranging all stripe regions as $d = [f_1^{wC}\ f_1^{wT}\ f_2^{wC}\ f_2^{wT}\ \dots\ f_L^{wC}\ f_L^{wT}]$.
6. The apparatus of claim 1, wherein the coefficient estimator and ID determiner estimates a sparse coefficient using a least absolute shrinkage and selection operator (LASSO), obtains a reconstruction error of galleries by computing a difference between the galleries and a reconstructed probe, and searches an optimal gallery using a reconstructed probe from an estimated coefficient and D from N gallery images, and a decision of a minimum error and an ID is defined as
7. The apparatus of claim 6, wherein an index of a gallery corresponding to a probe is
8. A method of re-identifying an object in image processing, the method comprising: a weighted feature extraction operation of extracting a weighted feature from an input image and generating a weighted descriptor to which a feature of a salient region is applied; a dictionary construction operation of constructing a dictionary composed of images with the most different characteristics of one object using the weighted descriptor to which the feature of the salient region is applied and storing the dictionary in a database (DB) such that redundant information is not included in the dictionary; and a coefficient estimation and ID decision operation of applying sparse representation for estimating a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object constituting the dictionary, and performing identification using an error between a target and the reconstructed object; wherein an application of the sparse representation is defined as an equation of $y = D\alpha$, where $y$ denotes an object (target) to be re-identified, $D$ denotes a set of candidate objects, $\alpha$ represents a weight assigned as an $N \times 1$ column vector, $D$ represents a pre-defined $M \times N$ dictionary with $N$ samples that have $M$-dimensional features, and $y$ is reconstructed signals having a same size as $\alpha$; and wherein atoms of $\alpha$ must have a non-zero value, $y$ is generated by combining only a few dictionary elements selected by non-zero atoms of $\alpha$, and when dictionary $D$ is given, the equation is modified as
9. The method of claim 8, wherein the weighted feature descriptor in the weighted feature extraction operation is defined by a weighted hue saturation value (HSV) histogram as
10. The method of claim 9, wherein each histogram bin is modified by multiplying the histogram bin by a corresponding weight as shown in $f_s^{wC}(b) = W_s^{C}(b)\, f_s^{C}(b)$, wherein $f_s^{wC}$ represents a color histogram of the $s$-th stripe satisfying $f_s^{wC} \in \mathbb{R}^{1 \times M^{\cdots}}$
11. The method of claim 8, wherein the weighted feature descriptor in the weighted feature extraction operation is defined by a weighted local binary pattern as $f_s^{wT}(I_s^{T}(u,v)) = W_s^{T}(I_s^{T}(u,v))\, f_s^{T}(I_s^{T}(u,v))$, where $f_s^{wT}$ denotes a weighted texture descriptor of the $s$-th stripe satisfying $f_s^{wT} \in \mathbb{R}^{1 \times M^{\cdots}}$
12. The method of claim 11, wherein a total descriptor $d \in \mathbb{R}^{M}$ of an object is defined by arranging all stripe regions as $d = [f_1^{wC}\ f_1^{wT}\ f_2^{wC}\ f_2^{wT}\ \dots\ f_L^{wC}\ f_L^{wT}]$.
13. The method of claim 8, wherein the coefficient estimation and ID decision operation comprises estimating a sparse coefficient using a least absolute shrinkage and selection operator (LASSO), obtaining a reconstruction error of galleries by computing a difference between the galleries and a reconstructed probe, searching an optimal gallery using a reconstructed probe from an estimated coefficient and D from N gallery images, and a decision of a minimum error and an ID is defined as
14. The method of claim 13, wherein an index of a gallery corresponding to a probe is
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The above and other objects, features and advantages of the present invention will become more apparent to those of ordinary skill in the art by describing exemplary embodiments thereof in detail with reference to the accompanying drawings, in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
(10) Hereinafter, exemplary embodiments of an apparatus and method for re-identifying an object in image processing according to the present invention will be described.
(11) Advantages and features of the apparatus and method for re-identifying an object in image processing according to the present invention will become apparent with reference to the embodiments described in detail below with reference to the accompanying drawings.
(12)
(13) The apparatus and method for re-identifying an object in image processing according to the present invention are provided to allow cameras having different characteristics for global monitoring to accurately recognize one object.
(14) To this end, the present invention includes a configuration using saliency-based learning and sparse representation in order to enhance the accuracy and robustness of object identification and ensure real-time object identification, where saliency indicates an apparent visual-perceptual characteristic of an object.
(15) The present invention includes a configuration to reconstruct an observed signal with a few linear combinations of atoms constituting a dictionary using sparse representation.
(16) The present invention includes a configuration to estimate a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object constituting a dictionary and perform identification using an error between a target and a reconstructed object.
(17) The present invention includes a configuration to generate a descriptor to which a feature of a salient region is applied, thereby reducing a dictionary due to removal of redundant information and facilitating coefficient estimation.
(18) The configurations for re-identifying an object in image processing according to the present invention, which will be described hereinafter, are construed to be applicable to any apparatus that can recognize an object contained in an image and output information about the recognized object, such as a surveillance camera (for example, a closed-circuit television (CCTV) camera), an image tracking system, a personal digital assistant (PDA), a smart phone, a navigation terminal, a desktop computer, or a personal computer such as a notebook computer.
(19) As shown in
(20) Sparsity of sparse representation used in the configuration for re-identifying an object in image processing according to the present invention indicates that a coefficient is zero or close to zero.
(21) The present invention uses such sparse representation and uses a principle that reconstructs an observed signal with a few linear combinations of atoms constituting a dictionary.
(22) This will be expressed below:
$y = D\alpha$ [Equation 1]
(23) In the equation representing the application of sparse representation of object re-identification, y denotes an object (target) to be re-identified and D denotes a set of candidate objects.
(24) As such, a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of the target object constituting a dictionary is estimated and identification (matching) using an error between the target and the reconstructed object is performed.
(25) In Equation 1, $\alpha$ represents a weight assigned as an $N \times 1$ column vector, $D$ represents a pre-defined $M \times N$ dictionary with $N$ samples that have $M$-dimensional features, and $y$ is the reconstructed signal having the same size as $\alpha$.
(26) To satisfy the equation, atoms of $\alpha$ should have non-zero values, and $y$ is generated by combining only a few dictionary elements selected by the non-zero atoms of $\alpha$.
(27) When dictionary D is given, Equation 1 is modified to minimize the number of the non-zero atoms.
(28) $\hat{\alpha} = \arg\min_{\alpha} \|y - D\alpha\|_2^2 + \lambda \|\alpha\|_0$ [Equation 2]
(29) where $\hat{\alpha}$ represents a modified sparse vector, $\lambda$ represents a regularization factor, $\|\cdot\|_2^2$ represents the squared $l_2$-norm operator, and $\|\cdot\|_0$ represents the $l_0$-norm operator.
(30) $\hat{\alpha}$ is optimally reconstructed by the first term, which measures the error between the input signal $y$ and its recovered version $D\alpha$, while satisfying the sparsity condition.
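As a minimal sketch (not the implementation of the present invention), the following Python snippet evaluates a cost made of the squared reconstruction error plus a penalty on the number of non-zero coefficients, for two candidate coefficient vectors that reconstruct the same target. The toy dictionary, the value of `lam`, and the function name `sparse_cost` are illustrative assumptions.

```python
import numpy as np

def sparse_cost(y, D, alpha, lam):
    """Squared l2 reconstruction error plus lam times the number of non-zero coefficients."""
    residual = y - D @ alpha
    return float(residual @ residual + lam * np.count_nonzero(alpha))

# Toy dictionary (assumed values): M = 3 feature dimensions, N = 4 atoms stored as columns.
D = np.array([[1.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
y = np.array([1.0, 1.0, 0.0])  # target equals the 4th atom exactly

alpha_sparse = np.array([0.0, 0.0, 0.0, 1.0])  # one non-zero atom
alpha_dense = np.array([1.0, 1.0, 0.0, 0.0])   # two non-zero atoms, same reconstruction

cost_sparse = sparse_cost(y, D, alpha_sparse, lam=0.1)
cost_dense = sparse_cost(y, D, alpha_dense, lam=0.1)
```

Both candidates reconstruct the target without error, so the penalty term alone decides between them and the candidate with fewer non-zero atoms has the lower cost.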
(31)
(32) The weighted feature extractor may solve the problem that feature extraction is restricted when an object occupies only a small area of an image.
(33) A weighted feature descriptor may be defined below.
(34) Equations 3 and 4 define a weighted hue saturation value histogram.
(35)
(36) where $I_s^{H}$ represents a hue channel of the $s$-th stripe, $W_s^{C}$ represents a weight vector, $b$ is a bin of the histogram, and $(u, v)$ are coordinates of each stripe.
(37) Each histogram bin is modified by multiplying it by a corresponding weight as shown in Equation 4.
$f_s^{wC}(b) = W_s^{C}(b)\, f_s^{C}(b)$, [Equation 4]
(38) where $f_s^{wC}$ represents a color histogram of the $s$-th stripe satisfying $f_s^{wC} \in \mathbb{R}^{1 \times M^{\cdots}}$
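The bin-wise weighting of a hue histogram can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the present invention: the unweighted histogram is taken to be a plain pixel count, hue is assumed to be normalized to [0, 1), and the weight values and the function name `weighted_hue_histogram` are hypothetical.

```python
import numpy as np

def weighted_hue_histogram(hue_stripe, weights, n_bins):
    """Histogram of the hue values in one stripe, with each bin scaled by its weight."""
    # Assumption: hue values are normalized to the range [0, 1).
    hist, _ = np.histogram(hue_stripe, bins=n_bins, range=(0.0, 1.0))
    # Bin-wise product: weighted bin = weight of the bin * count in the bin.
    return weights * hist.astype(float)

# Hue values of a tiny 2 x 3 stripe (assumed data).
hue_stripe = np.array([[0.05, 0.10, 0.60],
                       [0.55, 0.65, 0.90]])
# Hypothetical saliency weights, one per histogram bin.
weights = np.array([0.5, 1.0, 2.0, 1.0])

f_wc = weighted_hue_histogram(hue_stripe, weights, n_bins=4)
```

In this toy stripe the third bin holds the most pixels and also carries the largest weight, so it dominates the weighted histogram.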
(39) Equations 5 and 6 define a weighted local binary pattern (LBP).
$f_s^{wT}(I_s^{T}(u,v)) = W_s^{T}(I_s^{T}(u,v))\, f_s^{T}(I_s^{T}(u,v))$, [Equation 5]
(40) wherein $f_s^{wT}$ denotes a weighted texture descriptor of the $s$-th stripe satisfying $f_s^{wT} \in \mathbb{R}^{1 \times M^{\cdots}}$
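A weighted texture descriptor of this kind can be sketched with a basic 8-neighbour LBP. The neighbour ordering, the 256-bin histogram, the uniform weight vector, and the function names below are assumptions made for illustration; the present text does not specify them.

```python
import numpy as np

def lbp_codes(gray):
    """8-neighbour LBP code for every interior pixel of a 2-D grayscale array."""
    center = gray[1:-1, 1:-1]
    # Neighbour offsets in clockwise order, starting at the top-left pixel (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    h, w = gray.shape
    for bit, (du, dv) in enumerate(offsets):
        neighbour = gray[1 + du:h - 1 + du, 1 + dv:w - 1 + dv]
        # Set this bit where the neighbour is at least as bright as the center pixel.
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes

def weighted_lbp_histogram(gray, weights):
    """Histogram over the 256 LBP codes, with each code's count scaled by its weight."""
    hist = np.bincount(lbp_codes(gray).ravel(), minlength=256).astype(float)
    return weights * hist

gray = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])
weights = np.full(256, 0.5)  # hypothetical weights, one per LBP code
f_wt = weighted_lbp_histogram(gray, weights)
```

The 3 x 3 example has a single interior pixel, so exactly one histogram bin is non-zero after weighting.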
(41) A total descriptor $d \in \mathbb{R}^{M}$ of an object is defined by arranging all stripe regions as shown in Equation 6.
$d = [f_1^{wC}\ f_1^{wT}\ f_2^{wC}\ f_2^{wT}\ \dots\ f_L^{wC}\ f_L^{wT}]^{T}$ [Equation 6]
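The arrangement of the per-stripe color and texture features into one descriptor can be sketched as a simple concatenation. The feature lengths and the function name `total_descriptor` are illustrative assumptions.

```python
import numpy as np

def total_descriptor(color_feats, texture_feats):
    """Interleave per-stripe color and texture features and stack them into one vector."""
    parts = []
    for f_wc, f_wt in zip(color_feats, texture_feats):
        parts.extend([f_wc, f_wt])
    return np.concatenate(parts)

# L = 2 stripes with toy feature lengths (2 color bins and 3 texture bins per stripe).
color_feats = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
texture_feats = [np.array([0.1, 0.2, 0.3]), np.array([0.4, 0.5, 0.6])]

d = total_descriptor(color_feats, texture_feats)
```

The resulting vector has one entry per color bin and texture bin across all stripes, and a vector of this form can serve as one column of the dictionary.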
(42)
(43) In the present invention, a dictionary is constructed to solve two problems: the inefficiency of a method that constructs the dictionary from multiple images of each gallery, and the increase in the number of combinations that can represent a target when there are many similar atoms.
(44) In the present invention, a dictionary is constructed with images with the most different characteristics of one object, thereby reducing the dictionary and facilitating coefficient estimation due to removal of redundant information.
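The present text does not state how the images with the most different characteristics are chosen. One possible selection rule, shown here purely as an assumption, is a greedy farthest-point selection over the descriptors of one object; the Euclidean distance, the starting rule, and the function name `select_diverse` are all hypothetical.

```python
import numpy as np

def select_diverse(descriptors, k):
    """Greedily pick k rows that are far apart in Euclidean distance."""
    # Start from the descriptor farthest from the mean of all descriptors.
    mean = descriptors.mean(axis=0)
    chosen = [int(np.argmax(np.linalg.norm(descriptors - mean, axis=1)))]
    # Distance from every descriptor to its nearest already-chosen descriptor.
    min_dist = np.linalg.norm(descriptors - descriptors[chosen[0]], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        dist = np.linalg.norm(descriptors - descriptors[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return chosen

# Five descriptors of one object (assumed data); rows 0/1 and 2/3 are near-duplicates.
descriptors = np.array([[0.0, 0.0],
                        [0.1, 0.0],
                        [5.0, 0.0],
                        [5.1, 0.0],
                        [0.0, 9.0]])

picked = select_diverse(descriptors, k=3)
atoms = descriptors[picked].T  # selected descriptors as dictionary columns
```

With this rule, only one descriptor from each near-duplicate pair is kept, which is the kind of redundancy removal described above.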
(45)
(46) In the present invention, a sparse coefficient is estimated using a least absolute shrinkage and selection operator (LASSO) as shown in Equation 2.
(47) The reconstruction error of each gallery is measured by computing the difference between the gallery and a reconstructed probe, and an optimal gallery is searched for using the probe reconstructed from the estimated coefficient and the dictionary D constructed from N gallery images.
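For illustration, a LASSO problem in its conventional form, least squares with an $l_1$ penalty, can be solved with iterative soft thresholding (ISTA). The solver choice, the 0.5 scaling of the data term, the iteration count, and the names `soft_threshold` and `lasso_ista` are assumptions; the present text names LASSO but does not specify a solver.

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink every entry of x toward zero by t, setting small entries to exactly zero."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(y, D, lam, n_iter=500):
    """Minimise 0.5 * ||y - D a||_2^2 + lam * ||a||_1 by iterative soft thresholding."""
    # Step size = 1 / Lipschitz constant of the gradient (squared spectral norm of D).
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ alpha - y)
        alpha = soft_threshold(alpha - step * grad, step * lam)
    return alpha

# Identity dictionary (assumed) so that the exact solution is a soft-thresholded copy of y.
D = np.eye(3)
y = np.array([2.0, 0.05, -1.0])
alpha_hat = lasso_ista(y, D, lam=0.1)
```

The small middle entry of the target is driven to exactly zero, which shows how the penalty produces a sparse coefficient vector.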
(48) Equations 7 and 8 represent decision of a minimum error and an ID using multi-shot probes.
(49)
(50) where $r_i$ represents a minimum reconstruction error of the $i$-th gallery for a probe, and $y_k$ represents a feature descriptor of the $k$-th image in a probe.
(51) An index of the gallery corresponding to the probe is defined as Equation 8 below.
(52)
(53) where $c$ represents the selected identity of the probe to be re-identified.
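The decision step can be sketched as follows. Because the equations for the minimum error and the gallery index are not reproduced in this text, the form used here is an assumption: each gallery's error is the smallest $l_2$ distance between a probe image and its reconstruction from that gallery's atoms only, and the identity is the gallery with the smallest error. The function name `decide_identity` and all data values are hypothetical.

```python
import numpy as np

def decide_identity(probes, D, labels, coeffs):
    """Return (c, r): the gallery index with the smallest residual, and all residuals.

    probes : (M, K) array, one column per probe image
    D      : (M, N) dictionary of gallery atoms
    labels : (N,) gallery index of each atom
    coeffs : (N, K) estimated sparse coefficients, one column per probe image
    """
    ids = np.unique(labels)
    r = np.empty(len(ids))
    for j, i in enumerate(ids):
        mask = labels == i
        # Reconstruct every probe image from the atoms of gallery i only.
        recon = D[:, mask] @ coeffs[mask, :]
        errors = np.linalg.norm(probes - recon, axis=0)
        r[j] = errors.min()  # minimum over the K probe images
    return int(ids[np.argmin(r)]), r

D = np.eye(3)                       # three atoms (assumed data)
labels = np.array([0, 0, 1])        # atoms 0 and 1 belong to gallery 0, atom 2 to gallery 1
probes = np.array([[1.0, 0.9],
                   [1.0, 1.0],
                   [0.0, 0.1]])     # two probe images as columns
coeffs = probes.copy()              # with an identity dictionary the coefficients equal the probes

c, r = decide_identity(probes, D, labels, coeffs)
```

Here the first probe image is reconstructed exactly by the atoms of gallery 0, so gallery 0 is selected.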
(54)
(55) Initially, when an image is input (S801), a weighted feature is extracted and a descriptor to which a feature of a salient region is applied is generated (S802).
(56) Subsequently, a dictionary is constructed with images with the most different characteristics of one object such that the dictionary is reduced due to removal of redundant information and coefficient estimation is facilitated (S803).
(57) Subsequently, a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects of a target object constituting the dictionary is estimated (S804).
(58) An identification process using an error between a target and the reconstructed object is performed (S805).
(59) The above-described apparatus and method for re-identifying an object in image processing according to the present invention allow cameras having different characteristics for global monitoring to accurately identify one object. The results reported below were obtained using continuous images captured by two non-overlapping cameras in a crowded airport, with challenges such as background clutter, occlusion, and viewpoint and illumination changes.
(60) Table 1 shows results of saliency-based weighted feature extraction.
(61) TABLE 1 (Dataset: ILIDS-VID)
Rank                           1       5       10      20
HSV                            10%     24.6%   31.3%   41.3%
LBP                            5.3%    17.3%   26%     38.6%
HSV + LBP                      16.6%   34.6%   43.3%   50%
Weighted HSV                   14%     28.6%   37.3%   48.6%
Weighted LBP                   8.3%    21.3%   28.6%   40.6%
Proposed weighted descriptor   28.6%   47.3%   58.6%   66.6%
(62) It can be seen that applying the apparatus and method for re-identifying an object in image processing according to the present invention improves performance, owing to the robust atom configuration provided by the weighted feature descriptor.
(63) Table 2 shows a result of dictionary construction to which the apparatus and method for re-identifying an object in image processing according to the present invention are applied.
(64) TABLE 2 (Dataset: ILIDS-VID)
Rank                               1       5       10      20
Sequential images                  14.6%   34%     43.3%   60%
Stride selection                   20%     42.6%   53.3%   65.3%
Proposed dictionary construction   28.6%   47.3%   58.6%   66.6%
(65) It can be seen that a coefficient estimation problem caused by similar atoms can be solved.
(66) Table 3 shows a result of object re-identification to which the apparatus and method for re-identifying an object in image processing according to the present invention are applied.
(67) TABLE 3 (Dataset: ILIDS-VID)
Rank              1       5       10      20
ISR               10%     24.6%   31.3%   41.3%
eSDC              5.3%    17.3%   26%     38.6%
LC-KSVD           24.3%   38.5%   42.3%   47.3%
DVR               23.3%   42.4%   55.3%   68.4%
SRID              24.9%   44.5%   54.1%   68.8%
Proposed method   28.6%   47.3%   58.6%   66.6%
(68) The above-described apparatus and method for re-identifying an object in image processing according to the present invention resolves a feature extraction problem restricted by a small area of an object occupying an image by using a weighted feature descriptor to which a feature of a salient region is applied, thereby allowing cameras having different characteristics for global monitoring to accurately identify one object.
(69) The apparatus and method for re-identifying an object in image processing according to the present invention achieve the following effects.
(70) First, by using saliency-based learning and sparse representation, it is possible to allow cameras having different characteristics for global monitoring to accurately identify one object.
(71) Second, by using saliency-based learning and sparse representation, it is possible to enhance the accuracy and robustness of object identification.
(72) Third, by using sparse representation, it is possible to reconstruct an observed signal with a few linear combinations of atoms constituting a dictionary.
(73) Fourth, it is possible to estimate a coefficient that allows an object to be reconstructed as much as possible with a few linear combinations of candidate objects constituting a dictionary and perform identification using an error between a target and the reconstructed object, thereby increasing accuracy.
(74) Fifth, it is possible to resolve a feature extraction problem restricted by a small area of an object occupying an image by using a weighted feature descriptor.
(75) Sixth, a weighted descriptor to which a feature of a salient region is applied is generated for re-identification of an object, thereby reducing a dictionary due to removal of redundant information and facilitating coefficient estimation.
(76) It should be apparent to those skilled in the art that various modifications can be made to the above-described exemplary embodiments of the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention covers all such modifications provided they come within the scope of the appended claims and their equivalents.
REFERENCE NUMERALS
(77) 30: WEIGHTED FEATURE EXTRACTOR
40: DICTIONARY CONSTRUCTOR
50: DB
60: COEFFICIENT ESTIMATOR AND ID DETERMINER