SYSTEMS AND METHODS FOR RECOGNITION OF FACES E.G. FROM MOBILE-DEVICE-GENERATED IMAGES OF FACES
20170262472 · 2017-09-14
Inventors
CPC classification
G06F21/32
PHYSICS
International classification
Abstract
A method for recognizing faces including providing image/s in which a face is to be recognized; using a processor for constructing biometric feature set/s, including a statistical distribution thereof, from a multiplicity of N facial images (samples); generating template/s, using a processor, from a multiplicity of M facial images which is at least partly disjoint to the N facial images; computing scores, using a processor, to quantify an extent to which at least some of the templates match one another, pairwise; and testing whether an enroll image and a test image match by using plural feature extraction technologies to generate plural respective templates for the enroll image and for the test image and comparing therebetween thereby to generate score/s indicating an extent to which the enroll and test images match.
Claims
1. A method for recognizing faces including: providing at least one image in which a face is to be recognized; using a processor for constructing at least one biometric feature set, including a statistical distribution thereof, from a multiplicity of N facial images (samples); generating at least one template, using a processor, from a multiplicity of M facial images which is at least partly disjoint to the multiplicity of N facial images; computing scores, using a processor, to quantify an extent to which at least some of said templates match one another, pairwise; and testing whether an enroll image and a test image match by using plural feature extraction technologies to generate plural respective templates for the enroll image and plural respective templates for the test image and comparing said plural templates for the enroll image to said plural templates for the test image thereby to generate at least one score indicating an extent to which the enroll and test images match.
2. A method according to claim 1 wherein WPCA is used to construct at least one biometric feature set from the multiplicity of N facial images.
3. A method according to claim 1 wherein LDA is used to construct at least one biometric feature set from the multiplicity of N facial images.
4. A method according to claim 1 wherein PLDA is used to construct at least one biometric feature set from the multiplicity of N facial images.
5. A method according to claim 1 wherein at least one biometric feature set is constructed from a multiplicity of N gray-scale registered images each generated from a raw color image.
6. A method according to claim 1 wherein at least one biometric feature set is constructed from a multiplicity of N registered photo-normalized images each generated from a raw color image.
7. A face recognition system including: a repository including at least one biometric feature set, including a statistical distribution thereof, constructed from a multiplicity of N facial images (samples); and a processor configured for generating at least one template from a multiplicity of M facial images which is at least partly disjoint to the multiplicity of N facial images, computing scores to quantify an extent to which at least some of said templates match one another, pairwise; and testing whether an enroll image and a test image match by using plural feature extraction technologies to generate plural respective templates for the enroll image and plural respective templates for the test image and comparing said plural templates for the enroll image to said plural templates for the test image thereby to generate at least one score indicating an extent to which the enroll and test images match.
8. A method according to claim 5 or claim 6 wherein said raw color image is imaged by a mobile device camera.
9. A method according to claim 1 wherein said at least one biometric feature set comprises first and second biometric feature sets respectively constructed from a first multiplicity of N gray-scale registered images and a second multiplicity of N registered photo-normalized images and wherein corresponding pairs of first and second images, from among the first and second multiplicities respectively, are both generated from the same raw color image in a data repository including N raw color images.
10. A method according to claim 1 wherein WPCA is used to generate at least one template from the multiplicity of M facial images.
11. A method according to claim 1 wherein LDA is used to generate at least one template from the multiplicity of M facial images.
12. A method according to claim 1 wherein PLDA is used to generate at least one template from the multiplicity of M facial images.
13. A method according to claim 1 wherein at least one biometric feature set is generated from a multiplicity of M grayscale registered images each generated from a raw color image.
14. A method according to claim 1 wherein at least one biometric feature set is generated from a multiplicity of M registered photo-normalized images each generated from a raw color image.
15. A method according to claim 13 or 14 wherein said raw color image is imaged by a mobile device camera.
16. A method according to claim 1 wherein said at least one biometric feature set comprises first and second biometric feature sets respectively constructed from a first multiplicity of M gray-scale registered images and a second multiplicity of M registered photo-normalized images and wherein corresponding pairs of first and second images, from among the first and second multiplicities respectively, are both generated from the same raw color image in a data repository including M raw color images.
17. A method according to claim 1 wherein said computing of scores employs cosine-based scoring.
18. A method according to claim 1 wherein said templates include templates derived using plural feature extraction technologies and wherein said computing scores comprises fusing scores quantifying an extent to which templates derived using a first feature extraction technology match pairwise with scores quantifying an extent to which templates derived using at least a second feature extraction technology match pairwise.
19. A method according to claim 1 wherein said templates include templates derived from grayscale registered images and templates derived from registered photo-normalized images and wherein said computing scores comprises fusing scores quantifying an extent to which templates derived from gray-scale registered images match pairwise with scores quantifying an extent to which templates derived from registered photo-normalized images match pairwise.
20. A method according to claim 1 wherein said plural feature extraction technologies include at least one of: Gabor, LBP, DCT.
21. A method according to claim 19 wherein said fusing comprises computing a linear combination of scores.
22. A method according to claim 19 wherein said fusing comprises LLR fusion.
23. A method according to claim 1 wherein said testing includes generating gray-scale registered and registered photo-normalized images from a full color raw enroll image, generating gray-scale registered and registered photo-normalized images from a full color raw test image, and applying plural feature extraction technologies to the gray-scale registered and registered photo-normalized images generated from the full color raw enroll image and also to the gray-scale registered and registered photo-normalized images generated from the full color raw test image.
24. A method according to claim 1 and also comprising thresholding said score using at least first and second thresholds and determining whether the enroll and test images do or do not match, if the score outlies the first and second thresholds respectively and performing at least one additional identity verification process if the score lies between the first and second thresholds.
25. A computer program product, comprising a non-transitory tangible computer readable medium having computer readable program code embodied therein, said computer readable program code adapted to be executed to implement a method for recognizing faces including: providing at least one image in which a face is to be recognized; constructing at least one biometric feature set, including a statistical distribution thereof, from a multiplicity of N facial images (samples); generating at least one template from a multiplicity of M facial images which is at least partly disjoint to the multiplicity of N facial images; computing scores to quantify an extent to which at least some of said templates match one another, pairwise; and testing whether an enroll image and a test image match by using plural feature extraction technologies to generate plural respective templates for the enroll image and plural respective templates for the test image and comparing said plural templates for the enroll image to said plural templates for the test image thereby to generate at least one score indicating an extent to which the enroll and test images match.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0182] Methods and systems included in the scope of the present invention may include some (e.g. any suitable subset) or all of the functional blocks shown in the specifically illustrated implementations by way of example, in any suitable order e.g. as shown.
[0183] Computational components described and illustrated herein can be implemented in various forms, for example, as hardware circuits such as but not limited to custom VLSI circuits or gate arrays or programmable hardware devices such as but not limited to FPGAs, or as software program code stored on at least one tangible or intangible computer readable medium and executable by at least one processor, or any suitable combination thereof. A specific functional component may be formed by one particular sequence of software code, or by a plurality of such, which collectively act or behave as described herein with reference to the functional component in question. For example, the component may be distributed over several code sequences such as but not limited to objects, procedures, functions, routines and programs and may originate from several computer files which typically operate synergistically.
[0184] Any method described herein is intended to include within the scope of the embodiments of the present invention also any software or computer program performing some or all of the method's operations, including a mobile application, platform or operating system e.g. as stored in a medium, as well as combining the computer program with a hardware device to perform some or all of the operations of the method.
[0185] Data can be stored on one or more tangible or intangible computer readable media stored at one or more different locations, different network nodes or different storage devices at a single node or location.
[0186] It is appreciated that any computer data storage technology, including any type of storage or memory and any type of computer components and recording media that retain digital data used for computing for an interval of time, and any type of information retention technology, may be used to store the various data provided and employed herein. Suitable computer data storage or information retention apparatus may include apparatus which is primary, secondary, tertiary or off-line; which is of any type or level or amount or category of volatility, differentiation, mutability, accessibility, addressability, capacity, performance and energy use; and which is based on any suitable technologies such as semiconductor, magnetic, optical, paper and others.
DETAILED DESCRIPTION OF CERTAIN EMBODIMENTS
[0187] A face recognition system (engine/s) for recognizing faces in images captured by mobile communication devices is described herein in detail. The system may comprise some or all of the following software modules: face detection module, feature extraction module, feature fusion module and matching module e.g. as shown in the block diagram of
[0188] The input to the face recognition system typically comprises a color digital image acquired by a device camera. Typically, there is no restriction on image size which may be, but is not necessarily, VGA or QVGA. Typically, there is no restriction on image format which may be, but is not necessarily, jpeg, png, tiff. The first module in the face recognition system typically comprises a real-time face detection module which may be based on the Viola-Jones object detector, e.g. as implemented in OpenCV. The face detection module is used to detect a face (e.g. distinguish faces from non-faces) in the acquired image, e.g. locate the face, if any, in the image, and typically crop accordingly.
[0189] Typically, the training data included some slight tilted-pose variation, e.g. up to ±15 degrees, to facilitate detection of slightly tilted faces.
[0190] More generally, any suitable face detection module may be employed, which may perform some or all of the following stages, suitably ordered e.g. as follows: Feature Selection e.g. Haar feature selection; creating an Integral Image; training e.g. Adaboost Training; and cascading classifiers.
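By way of non-limiting illustration, the integral-image stage named above may be sketched as follows; the function names and the `box_sum` helper are assumptions for illustration only and do not appear in the text:

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows then columns; ii[y, x] holds the sum
    of all pixels above and to the left of (y, x), inclusive."""
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, top, left, h, w):
    """Sum of any h x w rectangle in O(1) using 4 corner lookups.
    A zero-padded row/column avoids special-casing the image border."""
    p = np.pad(ii, ((1, 0), (1, 0)))
    return (p[top + h, left + w] - p[top, left + w]
            - p[top + h, left] + p[top, left])

img = np.arange(16, dtype=np.int64).reshape(4, 4)
ii = integral_image(img)
```

The O(1) rectangle sums are what make evaluating many Haar features per window fast enough for cascaded classification.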
[0191] Typically, if multiple faces are detected by the face detection module 20, the face with the largest size is selected, using conventional image processing e.g. for identifying the largest rectangle from among rectangles surrounding or circumscribing the respective face regions. According to certain embodiments, if a low light environment is detected by face detection module 20 and if no face is detected, a suitable low light compensation algorithm (e.g. adaptive histogram equalization followed by Gaussian smoothing to flatten noise) is employed by the face detection module 20 to facilitate detecting the face, if any.
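By way of non-limiting illustration, the low light compensation of paragraph [0191] might be sketched as below; for brevity the sketch uses global rather than adaptive histogram equalization, and the kernel radius and sigma are illustrative assumptions:

```python
import numpy as np

def equalize_hist(gray):
    """Histogram equalization for an 8-bit image: map each intensity
    level through the normalized CDF so levels spread over [0, 255]."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1)
    return (cdf * 255).astype(np.uint8)[gray]

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian smoothing to flatten the noise that
    equalization tends to amplify in dark regions."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    tmp = np.apply_along_axis(
        lambda m: np.convolve(m, k, mode="same"), 0, img.astype(float))
    return np.apply_along_axis(
        lambda m: np.convolve(m, k, mode="same"), 1, tmp)
```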
[0192] The facial image yielded by face detector 20 next undergoes “registration”.
[0193] Within the detected face, a suitable tool, e.g. an application, once again, of the Viola-Jones object detector, is used for eye detection. Alternatively, any suitable eye detector tool may be employed to find eye coordinates; it is appreciated that object detectors such as Viola-Jones typically comprise a trainable machine; face data or eye data may be used thereby to form a specific face-detection or eye-detection model, respectively.
[0194] Once eye coordinates are found by the eye detector, the facial image undergoes “registration” e.g. geometric normalization, after which the face is said to be “registered and normalized”. Typically, registration includes face rotation for plane rotated faces, so that the registered face is aligned to the reference X-axis. Typically, the face is cropped according to the left (say) eye coordinates, le(x,y), using crop parameters, aka distance parameters, pl, pu, pr and pd, typically comprising predetermined percentages of the face's ocular distance which is abs(re(x,y)−le(x,y)).
[0195] Typically, then, the face is cropped to the left by a distance le(x,y)−pl, upward by a distance le(x,y)−pu, to the right by a distance abs(re(x,y)−le(x,y))+pr and downward by a distance le(x,y)+pd.
[0196] Suitable values may be selected, conventionally, to yield suitable face scales.
[0197] The final color face image is also rescaled to a uniform size e.g. 100×100 pixels.
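By way of non-limiting illustration, the cropping and rescaling of paragraphs [0194]-[0197] may be sketched as follows; the sketch assumes the face has already been rotated so that the eyes are level, and the crop-parameter values pl, pu, pr, pd are illustrative assumptions only:

```python
import numpy as np

def register_face(gray, le, re, pl=0.3, pu=0.4, pr=0.3, pd=1.2, out=100):
    """Hypothetical registration sketch: crop by fractions of the
    ocular distance around the (level) eyes, then rescale to out x out.
    le, re are the (x, y) coordinates of the left and right eye."""
    lx, ly = le
    rx, ry = re
    d = abs(rx - lx)                       # ocular distance
    # crop window: pl*d left of the left eye, pu*d above, pr*d right
    # of the right eye, pd*d below, clipped to the image bounds
    x0 = max(int(round(lx - pl * d)), 0)
    y0 = max(int(round(ly - pu * d)), 0)
    x1 = min(int(round(rx + pr * d)), gray.shape[1])
    y1 = min(int(round(ly + pd * d)), gray.shape[0])
    crop = gray[y0:y1, x0:x1]
    # nearest-neighbour rescale to the uniform out x out size
    ys = np.arange(out) * crop.shape[0] // out
    xs = np.arange(out) * crop.shape[1] // out
    return crop[np.ix_(ys, xs)]
```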
[0198] The color registered face image is next converted into a gray-scale image, which then undergoes a photo-normalization procedure using any conventional contrast stretching, histogram stretching or dynamic range expansion technique, particularly if some or many of the faces are believed to suffer from uneven or low illumination.
[0199] A feature extraction module 30 typically utilizes 3 separate processes (Gabor feature extraction, Local Binary Pattern (LBP) feature extraction, and Discrete Cosine Transform (DCT) feature extraction) for extracting facial features from the grayscale normalized face image and/or from the registered face image.
[0200] Typically, the feature extraction module 30 operates on both of: the original gray-scale image and the photo-normalized images of the face.
[0201] If Gabor feature extraction is used, the gray-scale face image typically undergoes a convolution process with a set of Gabor bank filters which may have 8 spatial orientations and 5 frequency scales in which case the convolution process output includes 40 complex results. The Gabor features are then constructed by taking the magnitude of the Gabor convolution process. Finally, the resulting data is down-sampled, e.g. to 16×16 pixels, in which case the output comprises a first 10240-dimensional Gabor based vector.
[0202] The same procedure is applied to the photo-normalized image, thereby yielding a second 10240-dimensional (say) Gabor based vector.
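By way of non-limiting illustration, the Gabor stage of paragraphs [0201]-[0202] may be sketched as follows; kernel size, sigma and the frequency progression are illustrative assumptions, since the text fixes only the 8 orientations, 5 scales and the 16×16 down-sampling:

```python
import numpy as np

def gabor_kernel(freq, theta, size=31, sigma=4.0):
    """Complex Gabor kernel: a Gaussian envelope modulating a plane
    wave of spatial frequency `freq` at orientation `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return env * np.exp(2j * np.pi * freq * xr)

def gabor_features(gray, n_orient=8, n_scale=5, grid=16):
    """Convolve with the 8 x 5 filter bank (40 complex responses),
    take magnitudes, down-sample each response to grid x grid and
    concatenate: 40 * 16 * 16 = 10240 dimensions for a 100x100 input."""
    h, w = gray.shape
    F = np.fft.fft2(gray)
    feats = []
    for s in range(n_scale):
        for o in range(n_orient):
            k = gabor_kernel(0.25 / 2**s, np.pi * o / n_orient)
            resp = np.fft.ifft2(F * np.fft.fft2(k, s=(h, w)))  # circular convolution
            mag = np.abs(resp)
            feats.append(mag[::h // grid, ::w // grid][:grid, :grid].ravel())
    return np.concatenate(feats)
```

Applying the same function to the photo-normalized image yields the second 10240-dimensional vector of paragraph [0202].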
[0203] If LBP feature extraction is used, a second set of vectors may be generated by computing LBP histograms locally, over an 8×8 window scanning the whole image. It is known in the art that LBP image descriptors transform an image into an array or image of integer labels describing the small-scale appearance of the image. This array or image of integer labels may be represented by suitable image statistics such as a histogram.
[0204] An LBP image descriptor is typically applied to both image representations, i.e. to the gray-scale face image and also to the photo-normalized image.
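By way of non-limiting illustration, the local LBP histogram computation may be sketched as below; the basic 8-neighbour operator and the non-overlapping 8×8 windows are one plausible reading of the text:

```python
import numpy as np

def lbp_image(gray):
    """Basic 8-neighbour LBP: each interior pixel becomes an 8-bit
    code, one bit per neighbour that is >= the centre pixel."""
    c = gray[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(c.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code

def lbp_histograms(gray, win=8):
    """Local 256-bin histograms over non-overlapping win x win windows
    of the code image, concatenated into one descriptor vector."""
    codes = lbp_image(gray)
    hists = []
    for y in range(0, codes.shape[0] - win + 1, win):
        for x in range(0, codes.shape[1] - win + 1, win):
            block = codes[y:y + win, x:x + win]
            hists.append(np.bincount(block.ravel(), minlength=256))
    return np.concatenate(hists)
```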
[0205] If DCT feature extraction is used, a conventional cosine transformation may be run, on the pixel level, for each of the 2 images (the gray-scale face (first) image and the photo-normalized (second) image). The DCT transforms the pixel space into a frequency domain. The output for each image typically comprises an ordered frequency coefficient matrix having the same dimension (say m×n) as the original (first or second) image, which is scanned lexicographically e.g. from left to right and from top to bottom. The rows of the scanned matrix are concatenated into a single column-wise vector of dimension [mn]×1, whose components are then sorted in descending (say) order of their coefficient values.
[0206] The resulting vector is then cropped to obtain a shorter vector including only the L (e.g. 960) first i.e. largest elements.
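By way of non-limiting illustration, the DCT feature extraction of paragraphs [0205]-[0206] may be sketched as follows; the orthonormal DCT-II construction is an implementation choice, and sorting by signed coefficient value follows the text's wording:

```python
import numpy as np

def dct2(img):
    """Orthonormal 2-D DCT-II built from an explicit basis matrix
    (no SciPy dependency)."""
    n, m = img.shape
    def dct_mat(k):
        j = np.arange(k)
        M = np.cos(np.pi * (2 * j[None, :] + 1) * j[:, None] / (2 * k))
        M[0] *= 1 / np.sqrt(2)
        return M * np.sqrt(2 / k)
    return dct_mat(n) @ img @ dct_mat(m).T

def dct_features(gray, L=960):
    """Lexicographic (row-major) scan of the coefficient matrix into an
    [mn] x 1 vector, sorted descending and cropped to the L (e.g. 960)
    largest elements."""
    coeffs = dct2(gray.astype(float)).ravel()  # row-major scan
    return np.sort(coeffs)[::-1][:L]
```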
[0207] The output of the feature extraction module 30 for each face-containing image, is termed herein the “compact biometric feature data set” of each image and typically totals 2 Gabor based vectors, 2 LBP histograms, and 2 shortened column vectors, also termed herein “the 6 (2×3) trained biometric features” or “the 6 feature vectors” or “the 6 transformation data elements” (assuming that the 2 above-described image representations and the 3 above-described image descriptors are all used).
[0208] A system training procedure is typically performed during set up, in which suitable transformation matrices are typically computed, including the following 6 sets of matrices:
[0209] PCA, LDA and PLDA transformation matrices to be used for Gabor features extracted from the original gray-scale image.
[0210] PCA, LDA and PLDA transformation matrices to be used for Gabor features extracted from the photometrically normalized gray-scale image.
[0211] PCA, LDA and PLDA transformation matrices to be used for LBP features extracted from the original gray-scale image.
[0212] PCA, LDA and PLDA transformation matrices to be used for LBP features extracted from the photometrically normalized gray-scale image.
[0213] LDA and PLDA transformation matrices to be used for pixel-intensity features of the original gray-scale image.
[0214] LDA and PLDA transformation matrices to be used for pixel-intensity features of the photometrically normalized gray-scale image.
[0215] The training procedure typically trains on annotated training data (e.g., facial imagery); the annotation typically comprises an indication associating certain images as all being shots of a single person.
[0216] Typically, the data repository which serves as input to the training stage includes as many images as possible of each human subject. Typically, some or all human subjects are imaged under different conditions such as different illumination conditions (bright/dark/uneven), and with/without various occluding accessories, such as glasses or a hat.
[0217] The face recognition system is typically model based; the encoded statistics represented by the 6 transformation data may be extracted using a large set of facial imagery which may include tens of thousands of sample images of human subjects. At least 3 samples are typically provided per human subject. Large numbers of distinctive classes of unique identities (humans) may be harvested from multiple public databases such as Facebook or proprietary data repositories. To ensure that the face recognition system, once trained, can cope with a large degree of environmental variability, the imagery (training) set includes facial images acquired in a variety of real life environmental conditions including various illumination conditions, various partial occlusions, also termed herein “occlusions”, such as but not limited to hat, glasses, scarf, beard, fingers, or poses which deviate slightly from the frontal pose.
[0218] To ensure that the process is computationally tractable, the PCA (Principal Component Analysis)-based whitening transform (WPCA) may be used, as is conventional, to reduce data dimensionality. WPCA may also be employed as is conventional, to de-correlate data as much as possible, so that redundant information is removed while valuable and distinctive information is kept.
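By way of non-limiting illustration, the whitening PCA transform may be sketched as below; after projection the kept components are decorrelated and have unit variance, which is the de-correlation property referred to above:

```python
import numpy as np

def wpca(X, dim):
    """Whitening PCA: project centred data onto the top `dim`
    eigenvectors of the covariance and divide by sqrt(eigenvalue),
    so the retained components are decorrelated with unit variance."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / (len(X) - 1)
    w, V = np.linalg.eigh(C)               # ascending eigenvalues
    idx = np.argsort(w)[::-1][:dim]        # keep the `dim` largest
    W = V[:, idx] / np.sqrt(w[idx])        # whitening transform
    return Xc @ W, W
```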
[0219] To reduce the memory footprint, WPCA may be applied only to the first 2 transformations (i.e. to the Gabor and LBP transformations, and not to the cosine transformation). The WPCA operation is followed by a linear discriminant analysis (LDA) projection operation, followed by computation of PLDA (Probabilistic Linear Discriminant Analysis) feature vectors for each of the two image representations employed. The LDA typically minimizes the within-class scatter matrix and maximizes the between-class scatter matrix.
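By way of non-limiting illustration, an LDA projection that minimizes the within-class scatter and maximizes the between-class scatter may be sketched as follows; the eigen-decomposition of Sw⁻¹Sb is one standard realization:

```python
import numpy as np

def lda(X, labels, dim):
    """Fisher LDA sketch: build the within-class scatter Sw and the
    between-class scatter Sb, then take the top eigenvectors of
    Sw^-1 Sb as the projection directions."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][:dim]
    return evecs[:, order].real
```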
[0220] As described above, the compact “biometric feature data set” of each image typically comprises statistics yielded by 6 separate transformations which may be regarded as 6 respective “biometric models”.
[0221] A “matching attempt” refers to an evaluation of a target sample, including determining whether the target sample is genuine or impostor relative to a given query sample (a previously imaged human subject). The target and query samples are suitably compared; for example, some or all of the following operations may be performed:
[0222] a. Each facial target image sample may first undergo face detection, pre-processing and feature extraction as above, which may yield 6 trained biometric features; projection into a corresponding model may yield a single target template.
[0223] b. The same procedure is applied to the query sample, yielding a single query template.
[0224] c. Conventional cosine-based scoring may be computed between the two biometric templates generated in operations a, b above.
[0225] d. Operations a, b, c may be repeated e.g. 5 more times, if 6 different scores are to be provided, e.g. assuming that the 2 above-described image representations are crossed with the 3 above-described image descriptors.
[0226] e. All 6 (say) scores are combined e.g. using a linear-logistic-regression-based (LLR) fusion technique, thereby to generate a final score for a given matching attempt. The fusion technique provides 6 different weights, corresponding to each model's contribution, which are used to compute a single overall score.
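By way of non-limiting illustration, operations c and e above may be sketched as below; the fusion weights and bias are assumed to have been learned beforehand on genuine/impostor trials:

```python
import numpy as np

def cosine_score(t1, t2):
    """Cosine-based score between two templates, in [-1, 1]."""
    return float(t1 @ t2 / (np.linalg.norm(t1) * np.linalg.norm(t2)))

def fuse_scores(scores, weights, bias=0.0):
    """Linear-logistic-regression-style fusion: a weighted sum of the
    6 per-model scores mapped through a sigmoid to a single overall
    score (weights and bias are assumed pre-trained)."""
    z = float(np.dot(weights, scores) + bias)
    return 1.0 / (1.0 + np.exp(-z))
```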
[0227] If the result of the above evaluation or comparison yields similarity, this indicates that the target sample is a “genuine” representation of the face in the “query sample”. If the result of the above evaluation or comparison indicates a lack of similarity between the target and query samples, this indicates that the target sample is not a genuine representation of the face in the “query sample”; instead, the human whose face is represented in the target sample may be regarded as an “impostor”.
[0228] When matching, the face recognition system employs an enroll data set including a multiplicity of images in which each of a corresponding multiplicity of registered users, typically identified respectively by unique identifiers e.g. names or ID numbers, has captured his or her facial biometrics. Each such facial image typically undergoes face detection, pre-processing and feature extraction, as described above.
[0229] A similar enroll (gallery) biometric template may be generated by “projecting” the enroll biometric data features into the trained models by multiplying the matrix from the biometric template with the matrix corresponding to the trained model.
[0230] To control unknown users' access to a system a test biometric template (aka “compact biometric feature data set”) may be computed for an unknown user and matched versus all 100,000 (say) gallery (enroll) biometric templates.
[0231] A false acceptance threshold (FAT) and a similarity threshold (ST) may be predetermined e.g. using a conventional receiver operating characteristic curve, which may be derived from a suitable “development/test” set of images. When the matching score is below FAT, access may be denied. When the computed score is above ST, access to the unknown user may be granted. If the score value is between FAT and ST, a learning operation may be performed, e.g. by asking the user to provide his credentials.
[0232] For example, in an enrollment phase, each user may be shown a pictogram including a predefined set of images and may select a few of these images, thereby to define for himself a “graphic password” comprising a selected set of, or sequence of, predetermined images. If this is the case, then if a user's score is between FAT and ST, the user is asked to present additional validation e.g. his graphic password. If the graphic password (say) supplied by the unknown user is correct, the unknown user is assumed to be genuine. The application typically has a bypass procedure in case the user is manually deemed genuine, although his score is under FAT.
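By way of non-limiting illustration, the two-threshold decision logic of paragraphs [0231]-[0232] may be sketched as follows; the string return values are illustrative assumptions:

```python
def access_decision(score, fat, st):
    """Two-threshold rule: deny below the false acceptance threshold
    (FAT), grant above the similarity threshold (ST), and otherwise
    fall back to an additional verification step, e.g. asking for the
    user's graphic password."""
    if score < fat:
        return "deny"
    if score > st:
        return "grant"
    return "verify"
```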
[0233] Typically, the above operations are performed by an application (cell app) downloaded onto a user's mobile device. Typically, the app does not store any graphical depiction of the user's face and instead may store only the template e.g. compact biometric feature data set for that user's face such that each user's privacy is protected since his facial identity is stored only in encoded form.
[0234] An example facial recognition method for recognizing faces e.g. in images imaged by mobile device cameras is now described in detail with reference to
[0235] The method of
1. Image pre-processing e.g. as per
2. Training to form biometric features e.g. as per
e.g. Input a set of N (say: 100,000) images, construct feature set therefor (e.g. using at least one suitable feature extraction transformation such as but not limited to Gabor, LBP, DCT) and learn feature statistics e.g. by applying WPCA, LDA, and/or PLDA to at least one feature set, thereby to generate at least one template; Output includes: 6 Biometric Trained Models (output of operations 2.3.4, 2.3.8, 2.3.12, 2.3.16, 2.3.19, and 2.3.22, respectively).
3. Training to compute 6 individual matching scores e.g. as per
[0236] It is appreciated that typically, two separate at least partly disjoint sets of images (samples) are provided, one for fusion and one for training models, rather than using the same set for both purposes. Typically, the fewer samples the two sets have in common, the smaller the resulting bias.
4. Compute Scores by matching (comparing) each sample against others e.g. as per
5. Fusion at score level e.g. by computing an optimum weight vector, including 6 (say) weights, which may be used to linearly combine the 6 (say) scores a-f (each of which may be represented as a 6×P matrix) computed in operation 4 (e.g. in operations 4.1-4.6 respectively) into a final aggregated score. LLR fusion technique may be used to aggregate scores.
6. Determine: does the test image match the enrolled (“enroll”) image/s?
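By way of non-limiting illustration, the score-level fusion training of operation 5 may be sketched as below; plain gradient-descent logistic regression is one simple way to obtain the 6 weights, and the learning rate and iteration count are illustrative assumptions:

```python
import numpy as np

def fit_llr_weights(S, y, lr=0.5, iters=2000):
    """Learn fusion weights by logistic regression with plain gradient
    descent: S is a (trials x 6) matrix of per-model scores, y marks
    each trial 1 (genuine) or 0 (impostor)."""
    n, k = S.shape
    w = np.zeros(k)
    b = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(S @ w + b)))  # predicted genuine prob.
        g = p - y                               # logistic-loss gradient
        w -= lr * S.T @ g / n
        b -= lr * g.mean()
    return w, b
```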
[0237] The method of
1.1 detect face within a full color image; Output includes: image of detected face e.g. by cropping from the full image
1.2 Face Registration: transform the image, e.g. as per
1.3 Photo-normalization of the image (e.g. using any suitable conventional contrast stretching, histogram stretching or dynamic range expansion) to render the image less sensitive to variation in illumination between images. Input includes: output from 1.2.5; Output includes: photo-normalized face image.
[0238] The method of
1.2.1 Eyes detection. Input includes: output from 1.1; Process: detect left eye, detect right eye; Output includes: (x,y) coordinates for the left and right eye.
1.2.2 Ocular distance. Input includes: output from 1.2.1; Process: Compute the ocular distance; Output includes: Ocular distance.
1.2.3 Face alignment according to X axis (rotation). Input includes: output from 1.1 and output from 1.2.1; Process: the face is rotated so that the difference between the y-coordinate of the left eye and y-coordinate of the right eye is zero; Output includes: X-axis aligned face image.
1.2.4 Face cropping. Input includes: output from 1.2.2 and output from 1.2.3; Process: the face is cropped according to some predefined distances computed relative to the ocular distance; next, the cropped face is rescaled to 100×100 pixels; Output includes: registered (scaled) face image.
1.2.5 RGB to gray-scale. Input includes: output from 1.2.4. Process: color registered face image conversion to gray-scale face image; Output includes: gray-scale registered face image.
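By way of non-limiting illustration, operation 1.2.5 may be realised with a standard luminance weighting; the ITU-R BT.601 coefficients below are an assumption, as the text does not fix the conversion formula:

```python
import numpy as np

def rgb_to_gray(rgb):
    """Convert an (..., 3) RGB array to 8-bit gray-scale using the
    common BT.601 luminance weights (an illustrative choice)."""
    lum = (rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587
           + rgb[..., 2] * 0.114)
    return np.rint(lum).astype(np.uint8)
```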
[0239] The method of
2.1 provide data repository, aka BFTS (biometric feature training set), including a multiplicity N, e.g. tens, thousands, tens of thousands or hundreds of thousands, of “clusters” or “classes” of full color images of human subjects' faces, each cluster or class including plural e.g. several or dozens or more samples (images) of a single subject, typically acquired under various environment conditions. Each class or cluster is assigned (annotated with) a unique ID.
2.2 generate 2 sets of N images from the biometric feature training set provided in operation 2.1: a first set, F1, of N registered gray-scale face images and a second set, F2, of N registered photo-normalized face images.
2.3 Feature extraction from sets F1, F2 e.g. as per
[0240] The method of
2.3a Gabor processing on registered grayscale image/s e.g. as per
2.3b Gabor processing on registered photo-normalized image/s e.g. as per
2.3c LBP processing on registered grayscale image/s e.g. as per
2.3d LBP processing on registered photo-normalized image/s e.g. as per
2.3e DCT processing on registered grayscale image/s e.g. as per
2.3f DCT processing on registered photo-normalized image/s e.g. as per
FIG. 6a
[0241] The method of
2.3.1 Gabor feature extraction to Set F1. Input includes: Set F1; Process: Apply Gabor feature based extraction to each image from Set F1. Output includes: 100,000 Gabor transformation vectors corresponding to Set F1 collected into a Big Gabor Set F1 matrix.
2.3.2 Apply WPCA operation to Gabor Set F1. Input includes: output from 2.3.1; Process: Apply WPCA to reduce dimension of matrix generated in operation 2.3.1 from 100,000 to a pre-defined size. Output includes: Reduced Gabor Set F1 matrix.
2.3.3 LDA and Gabor data projection for Set F1. Input includes: output from 2.3.2; Process: Apply LDA, then project this result onto output of 2.3.2 (e.g. using matrix multiplication). Output includes: LDA Gabor Set F1 matrix.
2.3.4 PLDA to Gabor for Set F1. Input includes: output from 2.3.3; Process: Apply PLDA to that input. Output includes: Gabor type Biometric Trained Model for Set F1.
[0242] The method of
2.3.5 Gabor feature extraction to Set F2. Input includes: Set F2; Process: Apply Gabor feature based extraction to each image from Set F2. Output includes: 100,000 Gabor transformation vectors corresponding to Set F2 collected into a Big Gabor Set F2 matrix.
2.3.6 Apply WPCA operation to Gabor Set F2. Input includes: output from 2.3.5; Process: Apply WPCA to reduce dimension of matrix generated in operation 2.3.5 from 100,000 to a pre-defined size. Output includes: Reduced Gabor Set F2 matrix.
2.3.7 LDA and Gabor data projection for Set F2. Input includes: output from 2.3.6; Process: Apply LDA, then project this result onto output of 2.3.6 (e.g. using matrix multiplication). Output includes: LDA Gabor Set F2 matrix.
2.3.8 PLDA to Gabor for Set F2. Input includes: output from 2.3.7; Process: Apply PLDA to that input. Output includes: Gabor type Biometric Trained Model for Set F2.
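The WPCA dimensionality-reduction step (operations 2.3.2, 2.3.6, 2.3.10, 2.3.14) may be sketched as a PCA projection whose components are whitened (divided by their singular values); shapes and the output dimension below are assumed for illustration:

```python
import numpy as np

def wpca_fit(X, out_dim):
    """Fit a whitened-PCA projection. X: (n_samples, n_features) feature matrix.
    Returns the mean and a projection matrix mapping features to out_dim
    whitened components."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; economical when n_features >> n_samples
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Whitening: divide each retained principal direction by its singular value
    W = Vt[:out_dim].T / S[:out_dim]
    return mean, W

def wpca_transform(X, mean, W):
    """Project feature vectors into the reduced, whitened space."""
    return (X - mean) @ W
```

After whitening, the projected training samples have (approximately) identity second moments, which is the usual motivation for applying LDA and PLDA on top of the reduced matrix as in operations 2.3.3 and 2.3.4.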
[0243] The method of
2.3.9 LBP feature extraction to Set F1. Input includes: Set F1; Process: Apply LBP feature based extraction to each image from Set F1. Output includes: 100,000 LBP transformation vectors corresponding to Set F1 collected into a Big LBP Set F1 matrix.
2.3.10 Apply WPCA operation to LBP Set F1. Input includes: output from 2.3.9; Process: Apply WPCA to reduce dimension of matrix generated in operation 2.3.9 from 100,000 to a pre-defined size. Output includes: Reduced LBP Set F1 matrix.
2.3.11 LDA and LBP data projection for Set F1. Input includes: output from 2.3.10; Process: Apply LDA, then project this result onto output of 2.3.10 (e.g. using matrix multiplication). Output includes: LDA LBP Set F1 matrix.
2.3.12 PLDA to LBP for Set F1. Input includes: output from 2.3.11; Process: Apply PLDA to that input. Output includes: LBP type Biometric Trained Model for Set F1.
[0244] The method of
2.3.13 LBP feature extraction to Set F2. Input includes: Set F2; Process: Apply LBP feature based extraction to each image from Set F2. Output includes: 100,000 LBP transformation vectors corresponding to Set F2 collected into a Big LBP Set F2 matrix.
2.3.14 Apply WPCA operation to LBP Set F2. Input includes: output from 2.3.13; Process: Apply WPCA to reduce dimension of matrix generated in operation 2.3.13 from 100,000 to a pre-defined size. Output includes: Reduced LBP Set F2 matrix.
2.3.15 LDA and LBP data projection for Set F2. Input includes: output from 2.3.14; Process: Apply LDA, then project this result onto output of 2.3.14 (e.g. using matrix multiplication). Output includes: LDA LBP Set F2 matrix.
2.3.16 PLDA to LBP for Set F2. Input includes: output from 2.3.15; Process: Apply PLDA to that input. Output includes: LBP type Biometric Trained Model for Set F2.
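The LBP feature extraction of operations 2.3.9 and 2.3.13 may, in one non-limiting sketch, encode each pixel by comparing it against its 8 neighbours and histogramming the resulting codes; the 3x3 neighbourhood and whole-image histogram below are assumed simplifications:

```python
import numpy as np

def lbp_image(gray):
    """Basic 3x3 local binary pattern: each interior pixel is encoded by
    comparing its 8 neighbours against it, yielding a code in 0..255."""
    g = np.asarray(gray, dtype=np.int32)
    center = g[1:-1, 1:-1]
    # Neighbour offsets in a fixed clockwise order (one bit each)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= ((neighbour >= center).astype(np.int32) << bit)
    return code

def lbp_histogram(gray, bins=256):
    """Normalized histogram of LBP codes, usable as an LBP transformation vector."""
    codes = lbp_image(gray)
    hist, _ = np.histogram(codes, bins=bins, range=(0, bins))
    return hist / max(codes.size, 1)
```

One such histogram per registered image may then be collected into the "Big LBP Set" matrices before the WPCA/LDA/PLDA stages of operations 2.3.10 through 2.3.16.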
[0245] The method of
2.3.17 DCT feature extraction to Set F1. Input includes: Set F1; Process: Apply DCT feature based extraction to each image from Set F1. Output includes: 100,000 DCT transformation vectors corresponding to Set F1 collected into DCT Set F1 matrix.
2.3.18 LDA and DCT data projection for Set F1. Input includes: output from 2.3.17; Process: Apply LDA, then project this result onto output of 2.3.17 (e.g. using matrix multiplication). Output includes: LDA DCT Set F1 matrix.
2.3.19 PLDA to DCT for Set F1. Input includes: output from 2.3.18; Process: Apply PLDA to that input. Output includes: DCT type Biometric Trained Model for Set F1.
[0246] The method of
2.3.20 DCT feature extraction to Set F2. Input includes: Set F2; Process: Apply DCT feature based extraction to each image from Set F2. Output includes: 100,000 DCT transformation vectors corresponding to Set F2 collected into a DCT Set F2 matrix.
2.3.21 LDA and DCT data projection for Set F2. Input includes: output from 2.3.20; Process: Apply LDA, then project this result onto output of 2.3.20 (e.g. using matrix multiplication). Output includes: LDA DCT Set F2 matrix.
2.3.22 PLDA to DCT for Set F2. Input includes: output from 2.3.21; Process: Apply PLDA to that input. Output includes: DCT type Biometric Trained Model for Set F2.
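The DCT feature extraction of operations 2.3.17 and 2.3.20 may be sketched as a 2-D DCT-II of an image block, keeping only low-frequency coefficients; the block size, the triangular coefficient-selection rule and the number of coefficients kept are assumed for illustration:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0] *= 1 / np.sqrt(2)
    return C * np.sqrt(2 / n)

def dct_features(block, keep=10):
    """2-D DCT of a square image block; keep the first `keep` low-frequency
    coefficients from the top-left corner (an assumed selection)."""
    n = block.shape[0]
    C = dct_matrix(n)
    coeffs = C @ block @ C.T
    idx = [(r, c) for r in range(n) for c in range(n) if r + c < 4]
    return np.array([coeffs[r, c] for r, c in idx])[:keep]
```

In practice such coefficients may be computed per block over a grid of face-image blocks and concatenated, yielding one DCT transformation vector per image for the DCT Set matrices.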
[0247] The method of
3.1 Initial Setup: generate BMTS (biometric matching training set): a set of M full color images of different people including several samples (images) per subject. Typically this set is at least partly disjoint to BFTS to avoid bias. M may be any suitable number such as 200, 500, 1000, 2000, 5000, 10,000 or more.
3.2 BMTS pre-processing. Input includes: BMTS: Process: Apply operation 1 above mutatis mutandis; Output includes: 2 sets of M images, including set M1 with M registered grayscale face images and set M2 with M registered photo-normalized face images.
3.3 Template creation e.g. as per
[0248] The method of
3.3a Gabor processing on M registered grayscale images and on M registered photo-normalized images e.g. as per
3.3c LBP processing on M registered grayscale images and on M registered photo-normalized images e.g. as per
3.3e DCT processing on M registered grayscale images and on M registered photo-normalized image/s e.g. as per
[0249] The method of
3.3.1 Gabor feature extraction to Set M1. Input: Set M1; Process: Apply Gabor feature based extraction to each image from Set M1. Output includes: M Gabor transformation vectors corresponding to Set M1 collected into a Gabor Set M1 matrix.
3.3.2 Gabor matrix projection for Set M1. Input includes: output from 3.3.1 AND output of 2.3.4; Process: Project output of 3.3.1 into output of 2.3.4, Output includes: 1×M Gabor type Biometric Templates for Set M1.
3.3.3 Gabor feature extraction to Set M2. Input includes: Set M2; Process: Apply Gabor feature based extraction to each image from Set M2. Output includes: M Gabor transformation vectors corresponding to Set M2 collected into a Gabor Set M2 matrix.
3.3.4 Gabor matrix projection for Set M2. Input includes: output from 3.3.3 AND output of 2.3.8; Process: Project output of 3.3.3 into output of 2.3.8, Output includes: 1×M Gabor type Biometric Templates for Set M2.
[0250] The method of
3.3.5 LBP feature extraction to Set M1. Input includes: Set M1; Process: Apply LBP feature based extraction to each image from Set M1. Output includes: M LBP transformation vectors corresponding to Set M1 collected into a LBP Set M1 matrix.
3.3.6 LBP matrix projection for Set M1. Input includes: output from 3.3.5 AND output of 2.3.12; Process: Project output of 3.3.5 into output of 2.3.12, Output includes: 1×M LBP type Biometric Templates for Set M1.
3.3.7 LBP feature extraction to Set M2. Input includes: Set M2; Process: Apply LBP feature based extraction to each image from Set M2. Output includes: M LBP transformation vectors corresponding to Set M2 collected into a LBP Set M2 matrix.
3.3.8 LBP matrix projection for Set M2. Input includes: output from 3.3.7 AND output of 2.3.16; Process: Project output of 3.3.7 into output of 2.3.16, Output includes: 1×M LBP type Biometric Templates for Set M2.
[0251] The method of
3.3.9 DCT feature extraction to Set M1. Input includes: Set M1; Process: Apply DCT feature based extraction to each image from Set M1. Output includes: M DCT transformation vectors corresponding to Set M1 collected into a DCT Set M1 matrix.
3.3.10 DCT matrix projection for Set M1. Input includes: output from 3.3.9 AND output of 2.3.19; Process: Project output of 3.3.9 into output of 2.3.19, Output includes: 1×M DCT type Biometric Templates for Set M1.
3.3.11 DCT feature extraction to Set M2. Input includes: Set M2; Process: Apply DCT feature based extraction to each image from Set M2. Output includes: M DCT transformation vectors corresponding to Set M2 collected into a DCT Set M2 matrix.
3.3.12 DCT matrix projection for Set M2. Input includes: output from 3.3.11 AND output of 2.3.22; Process: Project output of 3.3.11 into output of 2.3.22, Output includes: 1×M DCT type Biometric Templates for Set M2.
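The projection steps of operations 3.3.2, 3.3.4, 3.3.6, 3.3.8, 3.3.10 and 3.3.12 all share one shape: multiply a feature matrix by the corresponding trained model to obtain one biometric template per sample. A minimal sketch, with all dictionary keys and matrix shapes assumed for illustration:

```python
import numpy as np

def make_templates(image_features, models):
    """Build one biometric template matrix per (feature type, image set)
    combination by projecting each feature matrix into its trained model.
    `image_features` and `models` are dicts keyed by e.g. ('gabor', 'M1'),
    ('lbp', 'M2'), ('dct', 'M1'), ...; projection is plain matrix
    multiplication, as in operations 3.3.1-3.3.12."""
    return {key: image_features[key] @ models[key] for key in image_features}
```

With 3 feature types crossed with the 2 pre-processed sets M1 and M2, this yields the 6 template families used downstream in operations 4.1 through 4.6.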
[0252] The method of
4.0 Form P pairs from the full set of M samples. For example, if M is 1598, P may be M squared=2,553,604. Typically P=CONTC+CONTI, where CONTC is the number of pairs (i,j) satisfying CLASS_ID(i)=CLASS_ID(j) (i.e. samples i and j of the pair belong to the same class) and CONTI is the number of pairs (i,j) satisfying CLASS_ID(i)≠CLASS_ID(j) (i.e. samples i and j belong to different classes), for any i=1, . . . , M and any j=1, . . . , M. Typically each pair of templates is associated with a pair of samples from the BMTS. According to certain embodiments, the number of templates equals the number of samples.
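Operation 4.0 can be sketched as follows: enumerate all M×M ordered index pairs and split the count into CONTC (same-class, "genuine") and CONTI (different-class, "impostor") pairs:

```python
from itertools import product

def form_pairs(class_ids):
    """Form all M*M ordered pairs of sample indices and count the genuine
    (same-class, CONTC) and impostor (different-class, CONTI) pairs."""
    M = len(class_ids)
    pairs = list(product(range(M), repeat=2))
    contc = sum(1 for i, j in pairs if class_ids[i] == class_ids[j])
    conti = len(pairs) - contc
    return pairs, contc, conti
```

For M=1598 this yields P=1598²=2,553,604 pairs, matching the example above; in practice one may restrict to unordered pairs or exclude i=j, which would change P accordingly.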
4.1 compute Score a). Input includes: Pairs of biometrics templates formed in operation 4.0, to which operation 3.3.2 was applied; Process: Compute conventional cosine-based scoring between pairs; Output may include a 1×P score vector, where P is the number of all possible pairs (corresponding to Gabor Template for Set M1). Typically, the P components of the output vector are respectively associated with the P pairs.
4.2 compute Score b). Input includes: Pairs of biometric templates formed in operation 4.0, to which operation 3.3.4 was applied; Process: Compute conventional cosine-based scoring between pairs; Output includes: 1×P dimensional score vector, where P is the number of all possible pairs (corresponding to Gabor Template for Set M2).
4.3 compute Score c). Input includes: Pairs of biometric templates formed in operation 4.0, to which operation 3.3.6 was applied; Process: Compute conventional cosine-based scoring between pairs; Output includes: 1×P dimensional score vector, where P is the number of all possible pairs (corresponding to LBP Template for Set M1).
4.4 compute Score d). Input includes: Pairs of biometric templates formed in operation 4.0, to which operation 3.3.8 was applied; Process: Compute conventional cosine-based scoring between pairs; Output includes: 1×P dimensional score vector, where P is the number of all possible pairs (corresponding to LBP Template for Set M2).
4.5 compute Score e). Input includes: Pairs of biometric templates formed in operation 4.0, to which operation 3.3.10 was applied; Process: Compute conventional cosine-based scoring between pairs; Output includes: 1×P dimensional score vector, where P is the number of all possible pairs (corresponding to DCT Template for Set M1).
4.6 compute Score f). Input includes: Pairs of biometrics templates formed in operation 4.0, to which operation 3.3.12 was applied; Process: Compute conventional cosine-based scoring between pairs; Output includes: 1×P dimensional score vector, where P is the number of all possible pairs (corresponding to DCT Template for Set M2).
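The cosine-based scoring used in each of operations 4.1 through 4.6 may be sketched as the normalized dot product between two templates, evaluated over every pair to produce the 1×P score vector:

```python
import numpy as np

def cosine_score(a, b):
    """Conventional cosine similarity between two biometric templates."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pairwise_scores(templates, pairs):
    """1 x P score vector for a list of (i, j) template-index pairs."""
    return np.array([cosine_score(templates[i], templates[j]) for i, j in pairs])
```

Running this once per template family (Gabor/LBP/DCT, each for M1 and M2) yields the six 1×P score vectors that may subsequently be used to learn fusion weights.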
[0253] The method of
6.1 Enroll processing: apply operation 3 to an enroll image, typically to generate 6 Enroll Biometric Templates (outputs of applying operations 3.3.2, 3.3.4, 3.3.6, 3.3.8, 3.3.10, and 3.3.12 respectively).
6.2 Test processing. Input includes: 1 test image, Process: Apply operation 3 mutatis mutandis; Output includes: Six (6) Test Biometric Templates (output of 3.3.2, 3.3.4, 3.3.6, 3.3.8, 3.3.10, and 3.3.12)
6.3 Score and match. Input includes: 6 Enroll Biometric Templates and 6 Test Biometric Templates, thereby defining 6 pairs of templates. Compute conventional cosine-based scoring between each pair of biometric templates. Linearly combine the 6 scores using the weights computed in operation 5. Output includes: a single score value, also termed “the Final Score”.
6.4 Final decision. Input includes: Final Score computed in operation 6.3. The score may be thresholded using a single threshold, such that scores above the single threshold indicate that the test and enroll images match, whereas the test image is rejected as dissimilar to the enroll image if the final score generated in operation 6.3 falls below the single threshold. Alternatively however, 2 thresholds, FAT and ST, are predefined: the final match/reject decision may then include suitable further processing if the Final Score is intermediate between FAT and ST e.g. as per
[0254] The method of
6.4.1 For Final Score value >ST, decide “Match”,
6.4.2 For Final Score value between FAT and ST, additional human/manual intervention determines whether this is a match or a reject. e.g. user picks a “correct” pictogram from among several presented to her or him; password recognition; voice recognition, etc.
6.4.3 For Final Score <FAT, decide user Reject.
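Operations 6.3 and 6.4.1-6.4.3 can be sketched together: fuse the six per-technology scores by a weighted linear combination (the weights are assumed to have been learned beforehand, e.g. in operation 5) and apply the two-threshold rule:

```python
import numpy as np

def final_decision(scores, weights, fat, st):
    """Fuse six per-technology scores into a Final Score and apply the
    two-threshold rule: above ST -> match, below FAT -> reject, and the
    intermediate band falls back to a secondary check."""
    final = float(np.dot(weights, scores))
    if final > st:
        return final, "match"
    if final < fat:
        return final, "reject"
    # FAT <= Final Score <= ST: additional intervention, e.g. pictogram
    # selection, password or voice recognition, per operation 6.4.2
    return final, "secondary-check"
```

The single-threshold variant described above is the special case FAT=ST, in which the intermediate band is empty.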
[0255] It is appreciated that the particular operations in
[0256] Referring now to
[0257] The application, in its different manifestations, may use Face Recognition to authenticate a smartphone's user as its legitimate owner and allow/deny access to the phone itself (phone unlock) and/or to specific applications as defined by the user.
[0258] The application may perform some or all of:
[0259] 1. Enrollment
[0260] 2. Setup & Application Assignment
[0261] 3. User authentication: Face capture
[0262] 4. User authentication: Successful analysis and positive authentication of user
[0263] 5. User authentication: Successful analysis and negative authentication of user
[0264] 6. Failure: either face capture or analysis fail
[0265] At startup: By default, once the application is installed, enrolled, provisioned and enabled, the application may require no launch since it works in the background, constantly monitoring the applications it is assigned to protect. If such an application is launched by the user, the application may sense that and automatically launch the notification avatar, which may act to authenticate the user while working concurrently with the launched application; thus the workflow does not stop to authenticate the user, and authentication is handled in parallel with the normal workflow. The notification avatar can be disabled, e.g. from a setup panel of the Athena application.
[0266] The application may be enabled or disabled (as a master switch) via the application setup. If disabled, it protects no application. If enabled, all protection logic is available for selected applications, users and the phone.
[0267] In one example, a smartphone user has installed the application. On operation of the smartphone, the application detects the face of the user and performs global histogram equalization and then face recognition. If the face recognition is successful (that is, the person operating the phone is the registered user), the smartphone may be activated; in cases where the recognition fails, the smartphone may be locked.
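The global histogram equalization mentioned in this example may be sketched as the standard CDF remapping of an 8-bit grayscale image:

```python
import numpy as np

def equalize_histogram(gray):
    """Global histogram equalization of an 8-bit grayscale image: remap each
    gray level through the normalized cumulative histogram."""
    g = np.asarray(gray, dtype=np.uint8)
    hist = np.bincount(g.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    # Standard equalization lookup table; clip guards unused low levels
    lut = np.clip(np.round((cdf - cdf_min) / max(g.size - cdf_min, 1) * 255),
                  0, 255).astype(np.uint8)
    return lut[g]
```

Equalization spreads a narrow intensity range across the full 0-255 scale, which helps the subsequent face recognition cope with dark rooms or bright sunshine.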
[0268] In another example, a tablet user has installed the application. The application runs while the tablet is working, and only when the user tries to access a predefined second application on the tablet does the recognition application alert on failure/success of the face recognition.
[0269] The particular description herein is not intended to be limiting. Any suitable implementation of
[0270] According to certain embodiments of facial authentication, plural feature extraction technologies (which may include some or all of Gabor, LBP, DCT) are applied to raw images of faces, directly or after pre-processing thereof, and plural similarity scores are generated respectively by comparing resulting features from images of faces to be compared; then a final similarity determination is made including combining (using any suitable combination technique known in the art) the plural scores into a single final score.
[0271] According to certain embodiments, plural pre-processing technologies are employed to generate images (e.g. to generate both gray-scale registered images and registered photo-normalized images from raw images of faces) to which to apply one or more feature extraction technology, and plural similarity scores are generated by comparing resulting features; then a final similarity determination is made including combining (using any suitable combination technique known in the art) the plural scores into a single final score.
[0272] According to certain embodiments, plural (e.g. K) feature extraction technologies are crossed with plural (e.g. L) pre-processing technologies and plural (K×L) similarity scores are generated by comparing resulting features; then a final similarity determination is made including combining (using any suitable combination technique known in the art) the plural scores into a single final score.
[0273] Any or all of the operations of any or all of the following methods A-E may for example be performed, together or separately; operations within methods A-E may also be suitably combined with the methods of
[0274] Face Recognition—Method A
[0275] 100: setup: training (e.g. by performing method B)
[0276] 110: Acquire “target” color images from each of a population of registered users; store in “enroll data set” repository in association with unique identifier for each user
[0277] 115: generate 6-feature “compact biometric feature data set” for each target image (e.g. by performing method c) and, typically, store in the “enroll data set” repository for subsequent uses e.g. in operation 120
[0278] 120: matching attempt—does a specific query image match a specific target image? (e.g. by performing method c). If the query image is of an unknown user, method c may be performed 100,000 times (say) to compare the query image's data set to each of the 100,000 (say) data sets in the “enroll data set” repository
[0279] Training—Method b
[0280] 210: Assemble, from repositories such as Facebook, plural images of each of a multiplicity of human subjects with representation for various illumination conditions and presence/absence of various common occlusions
[0281] 220: In computer storage, annotate images to indicate which images are associated with a single subject
[0282] 230: Train on the annotated images, thereby to generate 3×2=6 PCA transformation matrices, 6 LDA transformation matrices and 6 PLDA transformation matrices to be used, in operation 470, for extraction of each of Gabor features, LBP features and pixel-intensity features, from each of 1st and 2nd gray-scale representations of a current image
[0283] 240: Use PCA (Principal Component Analysis)-based whitening transform (WPCA) to reduce data dimensionality.
[0284] 250: Apply WPCA to the Gabor and LBP transformations to reduce memory footprint
[0285] 260: Linear discriminant analysis (LDA) projection
[0286] 270: Compute PLDA (Probabilistic Linear Discriminant Analysis) feature vectors for each of the two image descriptors employed in the LBP transformation.
[0287] Matching—Method c
[0288] 305—Generate “compact biometric feature data set” for 1st & 2nd gray-scale representations of the query image (e.g. by performing method d)
[0289] 315—Compute 6 cosine-based scores by respectively comparing each of the 6 features of the target image's feature data set with the corresponding feature of the query image's feature data set
[0290] 325—Use linear-logistic-regression-based (LLR) fusion to combine the 6 cosine-based scores into a final target-query similarity score for a given matching attempt
[0291] 335—Output genuine/imposter if final similarity score is above/below predetermined threshold
[0292] Feature Extraction—Method d:
[0293] 420: Detect face/s e.g. using Viola-Jones object detector trained on faces
[0294] 430: If no face detected and light is low, repeat face detection with low-light compensation
[0295] 440: Crop largest face
[0296] 450: Registration of face thereby to generate 1st gray-scale image (e.g. by performing method e)
[0297] 460: Generate 2nd gray-scale image using photo-normalization procedure
[0298] 470: Extract facial Features: use the PCA, LDA and PLDA transformation matrices generated in training operation 100, to extract each of Gabor features, LBP features and pixel-intensity features, respectively, from each of 1st and 2nd gray-scale images, thereby to generate, per face, a “compact biometric feature data set” including 6 sets of facial features
[0299] Face Registration—Method e
[0300] 610: Detect eyes e.g. using Viola-Jones object detector trained on eyes
[0301] 620: Find ocular distance
[0302] 630: Determine left/right/up/down crop parameters by computing predetermined percentages of ocular distance
[0303] 640: Crop face relative to left eye coordinates, using crop parameters
[0304] 650: Rescale face to predetermined uniform size (x pixels times y pixels)
[0305] 660: output 1st gray-scale image
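Operations 620 through 650 of method e can be sketched as follows; the crop margins expressed as fractions of the ocular distance, the output size and the nearest-neighbour resampling are all illustrative assumptions:

```python
import numpy as np

def register_face(gray, left_eye, right_eye, out_size=(64, 64)):
    """Geometric face registration per method e: compute the ocular distance,
    crop around the eyes using margins given as fractions of that distance
    (fractions assumed), and rescale to a uniform size."""
    (lx, ly), (rx, ry) = left_eye, right_eye
    ocular = np.hypot(rx - lx, ry - ly)            # operation 620
    # Operation 630: assumed crop margins as percentages of ocular distance
    left   = int(lx - 0.5 * ocular)
    right  = int(rx + 0.5 * ocular)
    top    = int(min(ly, ry) - 0.6 * ocular)
    bottom = int(max(ly, ry) + 1.4 * ocular)
    h, w = gray.shape                              # operation 640: crop
    crop = gray[max(top, 0):min(bottom, h), max(left, 0):min(right, w)]
    # Operation 650: nearest-neighbour rescale to the uniform output size
    ys = np.linspace(0, crop.shape[0] - 1, out_size[0]).astype(int)
    xs = np.linspace(0, crop.shape[1] - 1, out_size[1]).astype(int)
    return crop[np.ix_(ys, xs)]
```

Eye coordinates are assumed here to come from the detector of operation 610; the returned array is the registered gray-scale image of operation 660.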
[0306] An enroll (gallery) biometric template may be generated including projecting the enroll biometric data features into trained models by multiplying the matrix associated with the biometric template with the matrix associated with the trained model.
[0307] It is appreciated that parameters and values stipulated herein are merely by way of example.
[0308] Mobile devices may be biometrically enabled using certain embodiments herein, and/or biometric apps, implementing certain embodiments herein, may be downloaded.
[0309] Advantages of certain embodiments include: [0310] a. low false reject levels i.e. failure to recognize registered users is infrequent even given challenges hampering face recognition such as bright sunshine, dark rooms, and shadows on the user's face. [0311] b. set-up, e.g. training the face authentication system to recognize an end user's face, requires only that the end user take 2-3 unsupervised selfies, typically in any conditions. [0312] c. Since a mobile device protected with the face recognition system herein is uniquely personal, enterprise management is facilitated e.g. in terms of employee management such as but not limited to Time & Attendance fraud prevention e.g. “buddy punching”, allowing remote employees to clock in and out, obviating the need for an on-site attendance clock, and management of restricted data access and restricted work areas. [0313] d. privacy may be enhanced by storing digital data e.g. templates representing registered users' faces, rather than the faces themselves. [0314] e. yields an electronic verification tool accurate enough for secure online and mobile operations including but not limited to banking and communication, including but not limited to email and social networking. [0315] f. fast enough and accurate enough to be useable for real-time mobile applications having limited hardware capabilities and resources, as compared to PC server based applications.
[0316] It is appreciated that terminology such as “mandatory”, “required”, “need” and “must” refer to implementation choices made within the context of a particular implementation or application described herewithin for clarity and are not intended to be limiting since in an alternative implementation, the same elements might be defined as not mandatory and not required, or might even be eliminated altogether.
[0317] It is appreciated that software components of the present invention including programs and data may, if desired, be implemented in ROM (read only memory) form including CD-ROMs, EPROMs and EEPROMs, or may be stored in any other suitable typically non-transitory computer-readable medium such as but not limited to disks of various kinds, cards of various kinds and RAMs. Components described herein as software may, alternatively, be implemented wholly or partly in hardware and/or firmware, if desired, using conventional techniques, and vice-versa. Each module or component may be centralized in a single location or distributed over several locations.
[0318] Included in the scope of the present disclosure, inter alia, are electromagnetic signals in accordance with the description herein. These may carry computer-readable instructions for performing any or all of the operations of any of the methods shown and described herein, in any suitable order including simultaneous performance of suitable groups of operations as appropriate; machine-readable instructions for performing any or all of the operations of any of the methods shown and described herein, in any suitable order; program storage devices readable by machine, tangibly embodying a program of instructions executable by the machine to perform any or all of the operations of any of the methods shown and described herein, in any suitable order; a computer program product comprising a computer useable medium having computer readable program code, such as executable code, having embodied therein, and/or including computer readable program code for performing, any or all of the operations of any of the methods shown and described herein, in any suitable order; any technical effects brought about by any or all of the operations of any of the methods shown and described herein, when performed in any suitable order; any suitable apparatus or device or combination of such, programmed to perform, alone or in combination, any or all of the operations of any of the methods shown and described herein, in any suitable order; electronic devices each including at least one processor and/or cooperating input device and/or output device and operative to perform e.g. in software any operations shown and described herein; information storage devices or physical records, such as disks or hard drives, causing at least one computer or other device to be configured so as to carry out any or all of the operations of any of the methods shown and described herein, in any suitable order; at least one program pre-stored e.g. 
in memory or on an information network such as the Internet, before or after being downloaded, which embodies any or all of the operations of any of the methods shown and described herein, in any suitable order, and the method of uploading or downloading such, and a system including server/s and/or client/s for using such; at least one processor configured to perform any combination of the described operations or to execute any combination of the described modules; and hardware which performs any or all of the operations of any of the methods shown and described herein, in any suitable order, either alone or in conjunction with software. Any computer-readable or machine-readable media described herein is intended to include non-transitory computer- or machine-readable media.
[0319] Any computations or other forms of analysis described herein may be performed by a suitable computerized method. Any operation or functionality described herein may be wholly or partially computer-implemented e.g. by one or more processors. The invention shown and described herein may include (a) using a computerized method to identify a solution to any of the problems or for any of the objectives described herein, the solution optionally includes at least one of a decision, an action, a product, a service or any other information described herein that impacts, in a positive manner, a problem or objectives described herein; and (b) outputting the solution.
[0320] The system may if desired be implemented as a web-based system employing software, computers, routers and telecommunications equipment as appropriate.
[0321] Any suitable deployment may be employed to provide functionalities e.g. software functionalities shown and described herein. For example, a server may store certain applications, for download to clients, which are executed at the client side, the server side serving only as a storehouse. Some or all functionalities e.g. software functionalities shown and described herein may be deployed in a cloud environment. Clients e.g. mobile communication devices such as smartphones may be operatively associated with but external to the cloud.
[0322] The scope of the present invention is not limited to structures and functions specifically described herein and is also intended to include devices which have the capacity to yield a structure, or perform a function, described herein, such that even though users of the device may not use the capacity, they are if they so desire able to modify the device to obtain the structure or function.
[0323] Features of the present invention, including operations, which are described in the context of separate embodiments may also be provided in combination in a single embodiment. For example, a system embodiment is intended to include a corresponding process embodiment and vice versa. Also, each system embodiment is intended to include a server-centered “view” or client centered “view”, or “view” from any other node of the system, of the entire functionality of the system, computer-readable medium, apparatus, including only those functionalities performed at that server or client or node. Features may also be combined with features known in the art and particularly although not limited to those described in the Background section or in publications mentioned therein.
[0324] Conversely, features of the invention, including operations, which are described for brevity in the context of a single embodiment or in a certain order may be provided separately or in any suitable subcombination, including with features known in the art (particularly although not limited to those described in the Background section or in publications mentioned therein) or in a different order. “e.g.” is used herein in the sense of a specific example which is not intended to be limiting. Each method may comprise some or all of the operations illustrated or described, suitably ordered e.g. as illustrated or described herein.
[0325] Devices, apparatus or systems shown coupled in any of the drawings may in fact be integrated into a single platform in certain embodiments or may be coupled via any appropriate wired or wireless coupling such as but not limited to optical fiber, Ethernet, Wireless LAN, HomePNA, power line communication, cell phone, PDA, Blackberry GPRS, Satellite including GPS, or other mobile delivery. It is appreciated that in the description and drawings shown and described herein, functionalities described or illustrated as systems and sub-units thereof can also be provided as methods and operations therewithin, and functionalities described or illustrated as methods and operations therewithin can also be provided as systems and sub-units thereof. The scale used to illustrate various elements in the drawings is merely exemplary and/or appropriate for clarity of presentation and is not intended to be limiting.