Supervised facial recognition system and method
10192142 · 2019-01-29
Assignee
Inventors
CPC classification
G06V10/76 (PHYSICS)
G06F18/2134 (PHYSICS)
International classification
Abstract
A computer executed method for supervised facial recognition comprising the operations of preprocessing, feature extraction and recognition. Preprocessing may comprise dividing received face images into several subimages, converting the different face image (or subimage) dimensions into a common dimension and/or converting the datatypes of all of the face images (or subimages) into an appropriate datatype. In feature extraction, 2D DMWT is used to extract information from the face images. Application of the 2D DMWT may be followed by FastICA. FastICA, or, in cases where FastICA is not used, 2D DMWT, may be followed by application of the l.sub.2-norm and/or eigendecomposition to obtain discriminating and independent features. The resulting independent features are fed into the recognition phase, which may use a neural network, to identify an unknown face image.
Claims
1. A computer executed method for facial recognition comprising: receiving a face image; performing preprocessing on the face image; applying a 2D DMWT to the preprocessed face image to obtain a resultant image matrix for the face image, the resultant image matrix having a plurality of subimages; converting each of the subimages into a vector; combining the vectors for each of the subimages to create a feature matrix; applying 2D FastICA to the feature matrix to obtain a plurality of independent subimages; converting the plurality of independent subimages into two-dimensional form; determining a resultant feature vector using the plurality of two-dimensional independent subimages; and performing recognition of the resultant feature vector.
2. The method of claim 1, wherein the operation of performing preprocessing on the face image comprises: converting an image dimension of the face image to a common dimension; and converting the face image from a first datatype to a second datatype.
3. The method of claim 2, wherein the common dimension is of size N×N and wherein N is a power of two.
4. The method of claim 2, wherein the second datatype is a double datatype.
5. The method of claim 1, wherein the plurality of subimages is four subimages and the plurality of independent subimages is four independent subimages.
6. The method of claim 1, wherein the operation of determining the resultant feature vector using the plurality of two-dimensional independent subimages comprises: determining an eigenvalue for each of the two-dimensional independent subimages; combining the eigenvalues to obtain a resultant feature matrix; and converting the resultant feature matrix into a resultant feature vector.
7. The method of claim 1, wherein the operation of determining the resultant feature vector using the plurality of two-dimensional independent subimages comprises: determining an eigenvector for each of the plurality of two-dimensional independent subimages; converting each eigenvector into a feature vector; combining the feature vectors to create a resultant feature matrix; determining an l.sub.2-norm for each row in the resultant feature matrix; and converting the resultant feature matrix to the resultant feature vector using the l.sub.2-norms.
8. The method of claim 1, wherein the operation of performing recognition of the resultant feature vector comprises: identifying the face image by using the resultant feature vector as input to a neural network.
9. A computer executed method for facial recognition comprising: receiving a face image; performing preprocessing on the face image, wherein said step of preprocessing comprises dividing the face image into subimages, converting each of said subimages to a common dimension, and converting the face image to a datatype suitable for transform using 2D DMWT; applying a 2D DMWT to the preprocessed face image to obtain a resultant image matrix for the face image, the resultant image matrix having a plurality of subimages; determining a resultant feature vector using the plurality of subimages; and performing recognition of the resultant feature vector.
10. The method of claim 9, wherein the operation of performing preprocessing on the face image comprises: converting an image dimension of the face image to a common dimension; and converting the face image from a first datatype to a second datatype.
11. The method of claim 10, wherein the common dimension is of size N×N and wherein N is a power of two.
12. The method of claim 10, wherein the second datatype is a double datatype.
13. The method of claim 9, wherein the plurality of subimages is four subimages.
14. The method of claim 9, wherein the operation of determining the resultant feature vector using the plurality of subimages comprises: determining an eigenvalue for each of the subimages; combining the eigenvalues to obtain a resultant feature matrix; and converting the resultant feature matrix into a resultant feature vector.
15. The method of claim 9, wherein the operation of determining the resultant feature vector using the plurality of subimages comprises: determining an eigenvector for each of the subimages; converting each eigenvector into a feature vector; combining the feature vectors to create a resultant feature matrix; determining an l.sub.2-norm for each row in the resultant feature matrix; and converting the resultant feature matrix to the resultant feature vector using the l.sub.2-norms.
16. The method of claim 9, wherein the operation of performing recognition of the resultant feature vector comprises: identifying the face image by using the resultant feature vector as input to a neural network.
17. A computer executed method for facial recognition comprising: receiving a face image; performing preprocessing on the face image to obtain a plurality of subimages resulting in a face image of dimension N×N; applying a 2D DMWT to the plurality of preprocessed subimages to obtain a resultant subimage matrix for each subimage, each resultant image matrix having a plurality of sub-subimages; converting each of the sub-subimages into a vector; combining the vectors to create a pose matrix; determining a resultant feature vector using the pose matrix; and performing recognition of the resultant feature vector.
18. The method of claim 17, wherein the operation of performing preprocessing on the face image to obtain a plurality of subimages comprises: dividing the face image into the plurality of subimages, wherein the plurality of subimages have an original dimension; converting the original dimension of the plurality of subimages to a common dimension; and converting the plurality of subimages from a first datatype to a second datatype.
19. The method of claim 18, wherein the common dimension is of size N×N and wherein N is a power of two.
20. The method of claim 18, wherein the second datatype is a double datatype.
21. The method of claim 17, wherein the plurality of subimages is four subimages and the plurality of sub-subimages is four sub-subimages.
22. The method of claim 17, wherein the operation of determining the resultant feature vector using the pose matrix comprises: applying 2D FastICA to the pose matrix to obtain a FastICA matrix; determining an l.sub.2-norm for each row in the FastICA matrix; and converting the FastICA matrix to the resultant feature vector using the l.sub.2-norms.
23. The method of claim 22, wherein the FastICA matrix is a FastICA signal matrix.
24. The method of claim 22, wherein the FastICA matrix is a mixing matrix of a FastICA signal matrix.
25. The method of claim 22, wherein the FastICA matrix is a feature matrix of a FastICA signal matrix.
26. The method of claim 17, wherein the operation of determining the resultant feature vector using the pose matrix comprises: determining an l.sub.2-norm for each row in the pose matrix; and converting the pose matrix to the resultant feature vector using the l.sub.2-norms.
27. The method of claim 17, wherein the operation of performing recognition of the resultant feature vector comprises: identifying the face image by using the resultant feature vector as input to a neural network.
28. A computer executed method for facial recognition comprising: receiving a face image; performing preprocessing on the face image to obtain a plurality of subimages, resulting in a face image of N×N dimension; applying a 2D DMWT to the plurality of preprocessed subimages to obtain a resultant subimage matrix for each subimage, the resultant image matrix having a plurality of sub-subimages that correspond to each of the plurality of subimages; converting each of the plurality of sub-subimages into a plurality of vectors, wherein each vector corresponds to one of the plurality of sub-subimages; combining the vectors that correspond to each of the subimages to create a plurality of pose matrices, wherein each pose matrix corresponds to one of the subimages; determining a plurality of resultant feature vectors using the plurality of pose matrices; and performing recognition of the plurality of resultant feature vectors.
29. The method of claim 28, wherein the operation of performing preprocessing on the face image to obtain a plurality of subimages comprises: dividing the face image into the plurality of subimages, wherein the plurality of subimages have an original dimension; converting the original dimension of the plurality of subimages to a common dimension; and converting the plurality of subimages from a first datatype to a second datatype.
30. The method of claim 29, wherein the common dimension is of size N×N and wherein N is a power of two.
31. The method of claim 29, wherein the second datatype is a double datatype.
32. The method of claim 28, wherein the plurality of subimages is four subimages and the plurality of sub-subimages is four sub-subimages.
33. The method of claim 28, wherein the operation of determining the plurality of resultant feature vectors using the plurality of pose matrices comprises: applying 2D FastICA to the pose matrices to obtain a plurality of FastICA matrices, wherein each FastICA matrix corresponds to one of the pose matrices; determining an l.sub.2-norm for each row in each of the plurality of FastICA matrices; and converting the FastICA matrices to the plurality of resultant feature vectors using the l.sub.2-norms.
34. The method of claim 33, wherein the plurality of FastICA matrices are FastICA signal matrices.
35. The method of claim 33, wherein the plurality of FastICA matrices are mixing matrices from FastICA signal matrices.
36. The method of claim 33, wherein the plurality of FastICA matrices are feature matrices from FastICA signal matrices.
37. The method of claim 28, wherein the operation of determining the plurality of resultant feature vectors using the plurality of pose matrices comprises: determining an l.sub.2-norm for each row in each of the plurality of pose matrices; and converting the pose matrices to the plurality of resultant feature vectors using the l.sub.2-norms.
38. The method of claim 28, wherein the operation of performing recognition of the plurality of resultant feature vectors comprises: identifying the face image by using the plurality of resultant feature vectors as input to a neural network.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
(23) A detailed description of the embodiments for methods and systems of supervised-learning facial recognition using multiresolution and independent features will now be presented with reference to the accompanying drawings.
(25) Electronic device 107 may include or be in communication with database 110. During training and testing of embodiments of the facial recognition method, face images 109 may come from database 110.
(26) Electronic device 107 may be a mobile computing device. Electronic device 107 may comprise a user interface for providing output and/or receiving input. Electronic device 107 may comprise an output device, such as a display, which is coupled to a processor. The user input interface, which allows electronic device 107 to receive data, may comprise one or more devices such as a keypad, a touch display (for example, if the display comprises touch capability), and/or the like.
(27) Electronic device 107 may comprise a memory device including, in one embodiment, volatile memory, such as volatile Random Access Memory (RAM) including a cache area for the temporary storage of data. Electronic device 107 may also comprise other memory, for example, non-volatile memory, which may be embedded and/or may be removable. The non-volatile memory may comprise an EEPROM, flash memory or the like. The memories may store any of a number of pieces of information, and data. The information and data may be used by electronic device 107 to implement one or more functions of the electronic device.
(28) Electronic device 107 can be connected by conventional access hardware to the sensor via a wired or wireless connection. Electronic device 107 can be connected by conventional access hardware to the Internet. Electronic device 107 and sensor 108 may be in bi-directional communication with each other via the Internet.
(29) Electronic device 107 of an exemplary embodiment need not be the entire electronic device, but may be a component or group of components of the electronic device in other exemplary embodiments. Electronic device 107 may comprise a processor or other processing circuitry. As used in this application, the term circuitry refers to at least all of the following: (a) hardware-only implementations (such as implementations in only analog and/or digital circuitry); (b) combinations of circuits and software and/or firmware, such as a combination of processors or portions of processors/software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a computer, to perform various functions; and (c) circuits, such as a microprocessor(s) or portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present. This definition of circuitry applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term circuitry would also cover an implementation of merely a processor, multiple processors, or portion of a processor and its (or their) accompanying software and/or firmware.
(30) Further, the processor(s) may comprise functionality to operate one or more software programs, which may be stored in memory and which may, among other things, cause the processor to implement at least one embodiment including, for example, one or more of the functions or operations described herein.
(31) Embodiments of the facial recognition method comprise three phases: preprocessing, feature extraction and recognition. In embodiments, these phases are performed in sequence, as shown in the accompanying flowcharts.
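The preprocessing phase described above (converting each face image to a common N×N dimension, with N a power of two, and converting the datatype to double) can be sketched as follows. This is a minimal illustration only: the nearest-neighbor resampling and the 92×112 stand-in image are assumptions, not details from the patent.

```python
import numpy as np

def preprocess(img, n=128):
    # resize to a common n x n dimension (n a power of two) by
    # nearest-neighbor sampling, then convert to a double datatype
    r = np.linspace(0, img.shape[0] - 1, n).round().astype(int)
    c = np.linspace(0, img.shape[1] - 1, n).round().astype(int)
    return img[np.ix_(r, c)].astype(np.float64)

# stand-in 92 x 112 8-bit face image (placeholder data)
face = (np.arange(92 * 112).reshape(92, 112) % 255).astype(np.uint8)
out = preprocess(face)
```

After this step every image (or subimage) has the same dimension and a datatype suitable for the 2D DMWT.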
(32) Embodiments of the facial recognition methods were trained and tested using the ORL, YALE and FERET databases, which have a large number of poses and persons and different light conditions, facial expressions and angle rotations. The variation in the number of poses, facial expressions, light conditions and angle rotations of these databases can be seen in the accompanying figures.
(33) A face image has highly redundant information and large dimensions. For feature extraction, techniques can be applied to get an efficient representation of the images by extracting discriminating and independent features. Exemplary methods for feature extraction, including use of the 2D DMWT, FastICA, the l.sub.2-norm and/or eigendecomposition, to obtain discriminating and independent features, are described in further detail below.
(34) The 2D DMWT, based on MRA, can be used for dimensionality reduction, for localizing all of the useful information in one single band, and/or for noise reduction. 2D FastICA can be used for decorrelating the high-order statistics (since most of the important information is contained in the high-order statistics of the image), for reducing the computational complexity, and for improving the convergence rate. ICA features may be less sensitive to the facial variations arising from different facial expressions and different poses. ICA features are independent, which may lead to a better representation and, hence, better identification and recognition rates.
(35) In an embodiment, illustrated in the accompanying flowchart, the 2D DMWT is applied to each preprocessed N×N (e.g. 128×128) face image. The resultant matrix is divided into four subimages of dimension N/2×N/2 (e.g. 64×64), and each one is further divided into four N/4×N/4 (e.g. 32×32) subimages. As can be seen, all of the useful information is localized in the upper left band, which corresponds to the low-low (LL) frequency band of the multiwavelet transform. The LL sub-band is retained, while the remaining sub-bands are eliminated. Therefore, the resultant image matrix in this example is N/2×N/2 (e.g. 64×64). Returning to the flowchart, each of the N/4×N/4 (e.g. 32×32) subimages is converted to a (N/4)²×1 (e.g. 1024×1) vector in operation 16, thereby creating a vector corresponding to each subimage (and four vectors corresponding to each image).
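The flow just described (transform, keep the LL band, split into four subimages, vectorize, stack) can be sketched in NumPy. A single-level 2D Haar average is used here purely as a stand-in for the 2D DMWT; the patent's actual multiwavelet filter bank is not reproduced.

```python
import numpy as np

def haar_ll(img):
    # single-level 2D Haar averaging: keep only the low-low (LL) band,
    # a simple stand-in for the multiwavelet LL band retained by the method
    rows = (img[0::2, :] + img[1::2, :]) / 2.0
    return (rows[:, 0::2] + rows[:, 1::2]) / 2.0

rng = np.random.default_rng(0)
image = rng.standard_normal((128, 128))          # preprocessed N x N image, N = 128

ll = haar_ll(image)                              # resultant image matrix (64 x 64)

# divide the LL band into four 32 x 32 subimages and vectorize each (cf. operation 16)
subimages = [ll[i:i + 32, j:j + 32] for i in (0, 32) for j in (0, 32)]
vectors = [s.reshape(-1, 1) for s in subimages]  # four 1024 x 1 vectors

# combine the four vectors into a 1024 x 4 feature matrix (cf. operation 17)
feature_matrix = np.hstack(vectors)
```

The 1024×4 `feature_matrix` is the input to the FastICA step described next.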
(41) Next, in operation 17, the 1024×1 vectors for each set of four subimages are combined to create a feature matrix, each having dimension (N/4)²×4 (e.g. 1024×4). Each feature matrix corresponds to a set of four N/4×N/4 subimages. 2D FastICA is applied, in operation 18, to each of the feature matrices to obtain a set of four (e.g. 32×1) independent subimages for each feature matrix. An example of the application of 2D FastICA is shown in the accompanying figure.
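A minimal FastICA in the spirit of operation 18 can be sketched in pure NumPy, using symmetric fixed-point iteration with a tanh nonlinearity. This is a generic ICA sketch under stated assumptions, not the patent's specific 2D FastICA formulation; the two-signal demo at the end is illustrative only.

```python
import itertools
import numpy as np

def fastica(X, n_components, n_iter=200, tol=1e-6, seed=0):
    # X: (n_samples, n_features) observed mixtures; returns estimated sources
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)
    # whiten via eigendecomposition of the covariance matrix
    d, E = np.linalg.eigh(np.cov(X, rowvar=False))
    idx = np.argsort(d)[::-1][:n_components]
    K = E[:, idx] / np.sqrt(d[idx])          # whitening matrix
    Z = X @ K                                # whitened data, unit covariance
    W = rng.normal(size=(n_components, n_components))
    for _ in range(n_iter):
        W_old = W
        G = np.tanh(Z @ W.T)                 # nonlinearity g(w^T z)
        Gp = 1.0 - G ** 2                    # its derivative
        W = (G.T @ Z) / len(Z) - np.diag(Gp.mean(axis=0)) @ W
        u, _, vt = np.linalg.svd(W)          # symmetric decorrelation
        W = u @ vt
        if np.max(np.abs(np.abs(np.diag(W @ W_old.T)) - 1.0)) < tol:
            break
    return Z @ W.T                           # estimated independent components

# demo: unmix two artificially mixed independent signals
rng = np.random.default_rng(1)
S = np.column_stack([np.sign(np.sin(np.linspace(0, 40, 2000))),
                     rng.uniform(-1, 1, 2000)])
X = S @ np.array([[1.0, 0.5], [0.4, 1.0]]).T   # observed mixtures
Y = fastica(X, n_components=2)

def match_quality(S, Y):
    # best correlation match over permutations (ICA recovers sources
    # only up to order, sign and scale)
    c = lambda a, b: abs(np.corrcoef(a, b)[0, 1])
    return max(min(c(S[:, i], Y[:, p[i]]) for i in range(S.shape[1]))
               for p in itertools.permutations(range(Y.shape[1])))
```

In practice a library implementation such as scikit-learn's `FastICA` would typically be used instead of hand-rolled code.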
(47) Next, new features are extracted from the original features. In operation 20, the eigenvalues are determined for each of the independent subimages. The eigenvalues for each set of four independent subimages are then combined in operation 21 to obtain a resultant feature matrix having dimension (N/4)×4 (e.g. 32×4). Then, in operation 22, each of the resultant feature matrices is converted into one-dimensional form as an N×1 (e.g. 128×1) resultant feature vector. Recognition is then performed, in operation 23, using the resultant feature vectors.
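Operations 20 through 22 (eigenvalues of each independent subimage, a 32×4 resultant matrix, then a 128×1 vector) reduce to a few lines of NumPy. The random subimages and the choice to keep only the real parts of the eigenvalues are illustrative assumptions, since the patent does not specify how complex eigenvalues are handled.

```python
import numpy as np

rng = np.random.default_rng(0)
# four illustrative 32 x 32 independent subimages (placeholder data)
subimages = [rng.standard_normal((32, 32)) for _ in range(4)]

# cf. operations 20-21: eigenvalues of each subimage, stacked column-wise -> 32 x 4
# (.real is an assumption: eigenvalues of a general real matrix may be complex)
eigvals = np.column_stack([np.linalg.eigvals(s).real for s in subimages])

# cf. operation 22: flatten the 32 x 4 matrix, one subimage's eigenvalues
# after another, into a 128 x 1 resultant feature vector
feature_vector = eigvals.reshape(-1, 1, order="F")
```

The 128×1 vector is what gets fed to the recognition phase.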
(49) In the recognition phase, one of several methods can be used to identify the unknown image including, for example, by measuring the Euclidean distance, which may be done using a neural network toolbox (NNT) based classifier for training and testing. Training and testing can also be done using the Back Propagation Training Algorithm (BPTA), Radial Basis Function or Kohonen Self-Organizing Network based classifiers.
(50) Alternatively, as shown in the accompanying flowchart, each of the N/4×N/4 (e.g. 32×32) subimages is converted to a (N/4)²×1 (e.g. 1024×1) vector in operation 116, thereby creating a vector corresponding to each subimage (and four vectors corresponding to each image).
(53) Next, in operation 117, the 1024×1 vectors for each set of four subimages are combined to create a feature matrix, each having dimension (N/4)²×4 (e.g. 1024×4). Each feature matrix corresponds to a set of four N/4×N/4 subimages. 2D FastICA is applied, in operation 118, to each of the feature matrices to obtain a set of four (e.g. 32×1) independent subimages for each feature matrix. As described above, an example of the application of 2D FastICA is shown in the accompanying figure.
(59) Next, new features are extracted from the original features. In operation 124, eigenvectors for each of the independent subimages are determined. The eigenvector matrices have dimension N/4×N/4 (e.g. 32×32). Then, in operation 125, each eigenvector matrix is converted into a feature vector having dimension (N/4)²×1 (e.g. 1024×1). The feature vectors for each set of four independent subimages are then combined, in operation 126, to create a resultant feature matrix having dimension (N/4)²×4 (e.g. 1024×4). In operation 127, the l.sub.2-norm is determined for each row in each of the resultant feature matrices and used, in operation 128, to convert each resultant feature matrix to a resultant feature vector having dimension (N/4)²×1 (e.g. 1024×1), thereby reducing the dimensionality and constraining the energy of each image in a column. Recognition is then performed using the resultant feature vectors.
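Operations 124 through 128 can be sketched the same way: eigenvector matrices flattened, stacked, then collapsed by row-wise l.sub.2-norms. The random input subimages are placeholders, and taking real parts of the eigenvectors is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# four illustrative 32 x 32 independent subimages (placeholder data)
subimages = [rng.standard_normal((32, 32)) for _ in range(4)]

# cf. operations 124-125: the 32 x 32 eigenvector matrix of each subimage,
# flattened to a 1024 x 1 feature vector (.real is an assumption)
feat_vecs = [np.linalg.eig(s)[1].real.reshape(-1, 1) for s in subimages]

# cf. operation 126: combine into a 1024 x 4 resultant feature matrix
F = np.hstack(feat_vecs)

# cf. operations 127-128: row-wise l2-norms collapse the matrix
# to a single 1024 x 1 resultant feature vector
feature_vector = np.linalg.norm(F, axis=1, keepdims=True)
```

The row-wise norm concentrates the energy of the four feature vectors into one column, as the text describes.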
(64) In the recognition phase, one of several methods can be used to identify the unknown image including, for example, by measuring the Euclidean distance, which may be done using a neural network toolbox (NNT) based classifier for training and testing. Training and testing can also be done using the Back Propagation Training Algorithm (BPTA), Radial Basis Function or Kohonen Self-Organizing Network based classifiers.
(65) In another embodiment, illustrated in the accompanying flowchart, the resultant feature vectors are determined directly from the 2D DMWT subimages, without applying FastICA.
(66) In operation 45, the eigenvalues are determined for each of the N/4×N/4 (e.g. 32×32) subimages. The eigenvalues for each set of four subimages are then combined in operation 46 to obtain a resultant feature matrix. Then, in operation 47, each of the resultant feature matrices is converted into one-dimensional form as an N×1 (e.g. 128×1) resultant feature vector. Recognition is then performed, in operation 48, using the resultant feature vectors.
(68) Alternatively, as shown in the accompanying flowchart, the eigenvectors of the subimages may be used instead of the eigenvalues.
(69) In operation 49, eigenvectors for each of the N/4×N/4 (e.g. 32×32) subimages are determined. Then, in operation 50, each eigenvector matrix is converted into a feature vector. The feature vectors for each set of four subimages are then combined, in operation 51, to create a resultant feature matrix. In operation 52, the l.sub.2-norm is determined for each row in each of the resultant feature matrices and used, in operation 53, to convert each resultant feature matrix to a resultant feature vector having dimension (N/4)²×1 (e.g. 1024×1), thereby reducing the dimensionality and constraining the energy of each image in a column. Recognition is then performed, in operation 48, using the resultant feature vectors.
(72) In another exemplary embodiment, as shown in the accompanying figures, the received face image is divided into a plurality of subimages during preprocessing, and features are then extracted from each subimage.
(73) In an embodiment, illustrated in the accompanying flowchart, the 2D DMWT is applied to each of the preprocessed N×N subimages. The resultant matrix for each subimage is divided into four of dimension N/2×N/2 (e.g. 64×64), and each one is further divided into four N/4×N/4 (e.g. 32×32) sub-subimages. As can be seen in the corresponding figure, the useful information is again localized in the LL band, so the resultant subimage matrix is N/2×N/2 (e.g. 64×64). Returning to the flowchart, each of the N/4×N/4 sub-subimages is converted into a vector of (N/4)²×1 (e.g. 1024×1) dimension in operation 75. Next, in operation 76, the 1024×1 vectors that correspond to the subimages in each set of four subimages (for a total of 16 sub-subimages, which represent all of the sub-subimages that correspond to a single original face image) are combined to create a (N/4)²×16 (e.g. 1024×16) pose matrix. Each pose matrix corresponds to a set of four subimages.
(81) 2D FastICA is applied, in operation 77, to each of the 1024×16 pose matrices to obtain a FastICA matrix, which may be, for example, a FastICA signal matrix, a mixing matrix of a FastICA signal matrix or a feature matrix of a FastICA signal matrix. A FastICA matrix is created for each pose matrix. In operation 78, the l.sub.2-norm is determined for each row in each of the FastICA matrices and used, in operation 79, to convert each FastICA matrix to a resultant feature vector having dimension (N/4)²×1 (e.g. 1024×1), thereby reducing the dimensionality and constraining the energy of each image in a column. Recognition is then performed, in operation 80, using the resultant feature vectors.
(84) As an alternative, facial recognition method 85 extracts features without using FastICA, as shown in the accompanying flowchart. Each of the N/4×N/4 sub-subimages is converted into a vector of (N/4)²×1 (e.g. 1024×1) dimension in operation 175. The 1024×1 vectors that correspond to the subimages in each set of four subimages (for a total of 16 sub-subimages, which represent all of the sub-subimages that correspond to a single original face image) are combined to create a (N/4)²×16 (e.g. 1024×16) pose matrix. Each pose matrix corresponds to a set of four subimages. After the pose matrix is created in operation 176, the l.sub.2-norm is determined for each row of each pose matrix in operation 181. The l.sub.2-norm is then used, in operation 182, to convert each pose matrix to a resultant feature vector having dimension (N/4)²×1 (e.g. 1024×1). Recognition is then performed, in operation 180, using the resultant feature vectors.
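The pose-matrix path of method 85 (sixteen 1024×1 vectors combined into one matrix, then row-wise l.sub.2-norms) can be sketched as follows, with random stand-ins for the sixteen 32×32 sub-subimages of a single face image:

```python
import numpy as np

rng = np.random.default_rng(0)
# sixteen 32 x 32 sub-subimages for one face image (placeholder data)
sub_subimages = rng.standard_normal((16, 32, 32))

# each sub-subimage becomes a 1024 x 1 vector; together they form
# the 1024 x 16 pose matrix (cf. operations 175-176)
pose = np.stack([s.reshape(-1) for s in sub_subimages], axis=1)

# cf. operations 181-182: row-wise l2-norms reduce the pose matrix
# to a single 1024 x 1 resultant feature vector
feature_vector = np.linalg.norm(pose, axis=1, keepdims=True)
```

The same pattern applies to the per-subimage 1024×4 pose matrices of methods 90 and 105, with 4 columns instead of 16.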
(91) In another embodiment, illustrated in the accompanying flowchart, a pose matrix is created for each subimage (with dimension (N/4)²×4, e.g. 1024×4), rather than for the set of four subimages (with dimension (N/4)²×16, e.g. 1024×16) that correspond to a single face image. As shown in the accompanying figure, the resultant matrix for each subimage is divided into four of dimension N/2×N/2 (e.g. 64×64), and each one is further divided into four N/4×N/4 (e.g. 32×32) sub-subimages. The resultant subimage matrix is N/2×N/2 (e.g. 64×64). Returning to the flowchart, each of the N/4×N/4 sub-subimages is converted into a vector of (N/4)²×1 (e.g. 1024×1) dimension in operation 95. Next, the 1024×1 vectors for each set of four sub-subimages are combined to create a (N/4)²×4 (e.g. 1024×4) pose matrix. Each pose matrix corresponds to a set of four sub-subimages.
(101) 2D FastICA is applied, in operation 97, to each of the 1024×4 pose matrices to obtain a FastICA matrix, which may be, for example, a FastICA signal matrix, a mixing matrix of a FastICA signal matrix or a feature matrix of a FastICA signal matrix. A FastICA matrix is created for each pose matrix. In operation 98, the l.sub.2-norm is determined for each row in each of the FastICA matrices and used, in operation 99, to convert each FastICA matrix to a resultant feature vector having dimension (N/4)²×1 (e.g. 1024×1), thereby reducing the dimensionality and constraining the energy of each image in a column. Recognition is then performed, in operation 100, using the resultant feature vectors.
(104) As an alternative, facial recognition method 105 extracts features without using FastICA. Each of the N/4×N/4 sub-subimages is converted into a vector of (N/4)²×1 (e.g. 1024×1) dimension in operation 195. The 1024×1 vectors for each set of four sub-subimages are combined to create a (N/4)²×4 (e.g. 1024×4) pose matrix. Each pose matrix corresponds to a set of four sub-subimages. The l.sub.2-norm is determined for each row of each pose matrix in operation 197. The l.sub.2-norm is then used, in operation 198, to convert each pose matrix to a resultant feature vector having dimension (N/4)²×1 (e.g. 1024×1). Recognition is then performed, in operation 199, using the resultant feature vectors.
(110) In the recognition phase, a neural network based on the BPTA was used for training and testing. BPTA is a supervised learning algorithm. Therefore, it is necessary to choose a desired output for each database. ORL, YALE and FERET databases were used as an example for training and testing the recognition phase. There are 40, 15, and 200 different desired outputs for the ORL, YALE, and FERET databases, respectively, corresponding to the different number of persons in each database. Three layers are used in the NNT, namely, an input, a hidden, and an output layer.
(111) Exemplary experimental results of various embodiments described herein are provided in the accompanying tables.
(112) There are 40 persons in the ORL database, each with 10 different poses. Therefore, the total number of poses used to test the system is 400 poses. P denotes the number of poses per person used for training. Hence, 10−P poses per person are used for testing. P=1, P=3 and P=5 poses were used here. The results are summarized in the accompanying table.
(113) The YALE database consists of 15 persons, each with 11 different poses. Therefore, the total number of poses used to test the system is 165 poses. In the training phase, P=1, P=3 and P=5 poses were used. The results are summarized in the accompanying table.
(114) There are 200 persons in the FERET database, each with 11 different poses. Therefore, the total number of poses used to test the system is 2200 poses. In the training phase, P=1, P=3 and P=5 poses were used. The results are summarized in the accompanying table.
(115) Note that FastICA decorrelates the images and produces statistically independent sets of images. Then eigenanalysis of the resulting features generates an efficient image representation. The configuration of the neural network during the training phase can affect the overall performance of the methods. Choosing the number of hidden layers, the number of neurons in the hidden layers, the types of the activation functions, the training function, the training method and the target performance can impact the overall performance of the system. In the exemplary results for methods 10, 30, 40 and 55, one hidden layer was used with 512 neurons for the ORL and YALE databases, and two hidden layers were used with 512 and 256 neurons for the first and second hidden layer, respectively, for the FERET database. In the above exemplary results for methods 70, 85, 90 and 105, one hidden layer was used with 1024 neurons for the ORL and YALE databases, and two hidden layers were used with 1024 and 512 neurons for the first and second hidden layer, respectively, for the FERET database. The activation function that was used was the hyperbolic tangent sigmoid, and back propagation was used for training and testing. The target mean square error (MSE) was 10.sup.-7.
(116) Having now described the invention, the construction, the operation and use of preferred embodiments thereof, and the advantageous new and useful results obtained thereby, the new and useful constructions, and reasonable mechanical equivalents thereof obvious to those skilled in the art, are set forth in the appended claims.