METHOD AND SYSTEM FOR DETECTING A SPOOFING ATTEMPT
20230118532 · 2023-04-20
Assignee
Inventors
- Björn BENDERIUS (Lund, SE)
- Jimmie Jönsson (Lund, SE)
- Johan Jeppsson KARLIN (Lund, SE)
- Niclas SVENSSON (Lund, SE)
- Andreas MUHRBECK (Lund, SE)
Cpc classification
G06T7/80
PHYSICS
G06V40/171
PHYSICS
International classification
G06T7/80
PHYSICS
Abstract
A first image is captured by the camera, using a first focus setting and a first aperture size. A first protrusion focus measure in a protrusion area of an object in the first image and a first recess focus measure in a recess area of the object in the first image are determined. A second image is captured by the camera, using the first focus setting and a second aperture size, and the object is detected. A second protrusion focus measure and a second recess focus measure are determined in the second image. A protrusion focus difference between the first and second protrusion focus measures, and a recess focus difference between the first and second recess focus measures are calculated. The protrusion focus difference and the recess focus difference are compared and if they differ by less than a predetermined threshold amount, it is determined that the object is fake.
Claims
1. A method for detecting if an object detected in a surveillance or access control system comprising a camera is fake, the method comprising: capturing a first image by the camera, using a first focus setting and a first aperture size, detecting the object in the first image, determining a first protrusion focus measure in a protrusion area of the object in the first image, determining a first recess focus measure in a recess area of the object in the first image, capturing a second image by the camera, using the first focus setting and a second aperture size which is different from the first aperture size, detecting the object in the second image, determining a second protrusion focus measure in a protrusion area of the object in the second image, determining a second recess focus measure in a recess area of the object in the second image, calculating a protrusion focus difference between the first and second protrusion focus measures, calculating a recess focus difference between the first and second recess focus measures, comparing the protrusion focus difference and the recess focus difference, and if the protrusion focus difference and the recess focus difference differ by less than a predetermined threshold amount, determining that the object is fake.
2. The method according to claim 1, wherein the object is a face and wherein the protrusion area is an area corresponding to a nose area of the face, and wherein the recess area is an area corresponding to an ear, cheek, chin, or forehead area of the face.
3. The method according to claim 1, further comprising determining a first additional focus measure in an additional area of the object in the first image, determining a second additional focus measure in an additional area of the object in the second image, calculating an additional focus difference between the first and second additional focus measures, comparing the additional focus difference and at least one of the protrusion focus difference and the recess focus difference, and if the additional focus difference and said at least one of the protrusion focus difference and the recess focus difference differ by less than a predetermined threshold amount, determining that the object is fake.
4. The method according to claim 1, wherein the focus measures are determined using a contrast detection algorithm.
5. The method according to claim 1, wherein the focus measures are determined using an algorithm chosen from the group consisting of a Sobel algorithm, a Laplacian algorithm, a Gaussian algorithm, a Scharr algorithm, a Roberts algorithm, a Prewitt algorithm, a Brenner algorithm, a Tenengrad algorithm, a histogram algorithm, a Vollath algorithm, a frequency analysis algorithm using FFT, and a frequency analysis algorithm using DCT.
6. The method according to claim 1, wherein the steps of determining focus measures, calculating focus differences, and comparing focus differences are performed by a neural network.
7. The method according to claim 1, further comprising marking the second image as a non-display image.
8. A system for detecting if an object detected in a surveillance or access control system comprising a camera is fake, the system comprising: an aperture setting controller arranged to control an aperture size of the camera, an image capture initiator arranged to initiate capture of a first image, and a second image, wherein the aperture setting controller is arranged to control the aperture size such that the first image is captured using a first aperture size and the second image is captured using a second aperture size, which is different from the first aperture size, the system further comprising: an object detector arranged to detect an object in the first and second images, a focus determinator arranged to determine a first protrusion focus measure in a protrusion area of the object in the first image, a first recess focus measure in a recess area of the object in the first image, a second protrusion focus measure in a protrusion area of the object in the second image, and a second recess focus measure in a recess area of the object in the second image, a focus difference calculator arranged to calculate a protrusion focus difference between the first and second protrusion focus measures, and a recess focus difference between the first and second recess focus measures, a focus difference comparator arranged to compare the protrusion focus difference and the recess focus difference, and an evaluator arranged to determine that the object is fake if the protrusion focus difference and the recess focus difference differ by less than a predetermined threshold amount.
9. The system according to claim 8, wherein the object detector is a face detector.
10. A camera comprising a system according to claim 8.
11. A non-transitory computer readable storage medium having stored thereon instructions for implementing the method according to claim 1, when executed on a device having processing capabilities.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0025] The invention will now be described in more detail by way of example and with reference to the accompanying schematic drawings, in which:
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
DETAILED DESCRIPTION OF EMBODIMENTS
[0032] In
[0033] Let us assume that the person 1 standing at the door station 2 is in fact not supposed to be allowed to enter. The person 1 may in such case try to gain entry by attempting to trick the receptionist or guard, or the face recognition algorithm. As shown in
[0034] The camera 3 captures a first image 31, shown in
[0035] A face detection algorithm in the camera 3 is used for detecting the face 5 of the person 1 appearing before the door station 2, and the nose area 6 and an ear area 7 of the face are located. The nose and ear areas are chosen because they represent a protrusion area and a recess area, respectively, of the face. Other areas of the face may be chosen, as long as one of them is an area that is expected to protrude in relation to another of the chosen areas. For example, the forehead may be chosen as a protrusion area and an ear as a recess area, or the nose may be chosen as a protrusion area and a cheek or the chin as a recess area.
[0036] When the nose area has been located, a focus measure is determined for this area. This will in the following be referred to as a protrusion focus measure since the nose has been chosen as protrusion area. The focus measure for the nose area in the first image will be referred to as a first protrusion focus measure F.sub.p1.
[0037] Similarly, a focus measure is determined for the ear area. This will be referred to as a recess focus measure since the ear has been chosen as recess area. The focus measure for the ear area in the first image will be referred to as a first recess focus measure F.sub.r1.
[0038] The focus measures are determined using any suitable known focus algorithm, such as a contrast detection algorithm. If the camera 3 has an autofocus function, the same focus determination algorithm may be used as for the autofocus function. Some examples of focus determination algorithms that may be useful are a Sobel algorithm, a Laplacian algorithm, a Gaussian algorithm, a Scharr algorithm, a Roberts algorithm, a Prewitt algorithm, a Brenner algorithm, a Tenengrad algorithm, a histogram algorithm, a Vollath algorithm, a frequency analysis algorithm using FFT, and a frequency analysis algorithm using DCT.
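As an illustration of one of the listed options, a Laplacian-based focus measure could be sketched as follows. This is a minimal sketch and not the implementation of the application: the function name `laplacian_focus_measure`, the `(top, left, height, width)` region format, and the choice of the variance of a 4-neighbour Laplacian as the score are assumptions made for the example.

```python
import numpy as np


def laplacian_focus_measure(gray, region):
    """Return a sharpness score for one rectangular area of a grayscale image.

    ASSUMPTION: `region` is (top, left, height, width) in pixels, and the
    score is the variance of a 4-neighbour discrete Laplacian. Higher values
    indicate more high-frequency detail, i.e. better focus.
    """
    top, left, height, width = region
    patch = np.asarray(gray, dtype=np.float64)[top:top + height, left:left + width]
    # Discrete Laplacian evaluated on the interior pixels of the patch.
    lap = (
        patch[:-2, 1:-1] + patch[2:, 1:-1]
        + patch[1:-1, :-2] + patch[1:-1, 2:]
        - 4.0 * patch[1:-1, 1:-1]
    )
    return float(lap.var())
```

The same function would be applied to the nose area and the ear area in both images, so that the four focus measures are directly comparable.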
[0039] The camera 3 captures a second image 32. For this second image 32, the camera 3 uses the same first focus setting as for the first image 31. However, for the second image 32, the camera 3 uses a second aperture size. The second aperture size may be smaller or larger than the first aperture size, but not equal to the first aperture size. When a different aperture size is used, the depth of field changes. If a larger aperture size is used, the depth of field decreases. Thus, some parts of the image that were in focus with a smaller aperture will now be out of focus. Conversely, if a smaller aperture size is used, the depth of field increases, and some parts of the image that were previously out of focus will now be in focus. In
[0040] The face 5 is detected also in the second image 32. A second protrusion focus measure F.sub.p2 is determined for the nose area of the face 5 in second image 32 and a second recess focus measure F.sub.r2 is determined for the ear area. These focus measures are determined in the same way as for the first image 31.
[0041] When the focus measures have been determined for the first and second images, a protrusion focus difference ΔF.sub.p is calculated as the difference between the first protrusion focus measure F.sub.p1 and the second protrusion focus measure F.sub.p2:
ΔF.sub.p=F.sub.p1−F.sub.p2
[0042] Analogously, a recess focus difference ΔF.sub.r is calculated as the difference between the first recess focus measure F.sub.r1 and the second recess focus measure F.sub.r2:
ΔF.sub.r=F.sub.r1−F.sub.r2
[0043] Alternatively, the protrusion focus difference and the recess focus difference may be calculated as relative differences. Thus, the protrusion focus difference ΔF.sub.p may instead be calculated as follows:
[0044] Analogously, the recess focus difference ΔF.sub.r may be calculated as follows:
[0045] The protrusion focus difference ΔF.sub.p is compared to the recess focus difference ΔF.sub.r. If the face is real, the change in focus measure between the first and second images is expected to be different for the nose area and the ear area since the nose is expected to protrude more than the ear, or in other words, the nose is expected to be located closer to the camera, along the optical axis of the camera, than the ear. If, on the other hand, the face is a photo, the change in focus measure will likely be the same for the nose as for the ear, as they are both in the same plane, i.e. the plane of the two-dimensional photo. Therefore, if the protrusion focus difference ΔF.sub.p differs from the recess focus difference ΔF.sub.r by less than a predetermined threshold amount δ.sub.th, it is determined that the face is not a real three-dimensional face, but only a two-dimensional representation, such as a photo. In other words, it is determined that the face is fake. The principle of comparing to the predetermined threshold amount δ.sub.th is the same whether the focus differences are calculated as absolute or relative differences, but as the skilled person will appreciate, the value of the predetermined threshold amount δ.sub.th will be different depending on if the differences are absolute or relative.
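The comparison described above, using absolute focus differences, may be sketched as follows. The function names and the numeric values in the usage note are hypothetical and serve only to illustrate the logic. The relative-difference variant is not shown, and it would require a different threshold value.

```python
def focus_difference(first_measure, second_measure):
    """Absolute focus difference between the first and second image."""
    return first_measure - second_measure


def is_fake(f_p1, f_r1, f_p2, f_r2, threshold):
    """Return True if the object appears flat, i.e. fake.

    f_p1, f_p2: protrusion focus measures in the first and second image.
    f_r1, f_r2: recess focus measures in the first and second image.
    threshold:  the predetermined threshold amount (value is an assumption).
    """
    delta_p = focus_difference(f_p1, f_p2)
    delta_r = focus_difference(f_r1, f_r2)
    # A flat photo changes focus equally in both areas, so the two
    # differences are nearly identical.
    return abs(delta_p - delta_r) < threshold
```

For example, with illustrative values, `is_fake(100, 80, 60, 40, threshold=5)` returns `True`, since both areas lost the same amount of sharpness, while `is_fake(100, 80, 90, 30, threshold=5)` returns `False`.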
[0046] The predetermined threshold amount δ.sub.th may, e.g., be determined empirically by studying focus differences for a number of real faces and for a number of photos, possibly bent or angled to different degrees.
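One conceivable way of deriving the threshold from such a study is sketched below. The selection rule (the midpoint between neighbouring observed values that gives the fewest misclassifications) is an assumption made for illustration. The application does not prescribe a particular procedure, and the function name `choose_threshold` is hypothetical.

```python
def choose_threshold(real_gaps, photo_gaps):
    """Pick a threshold from measured values of |delta_p - delta_r|.

    real_gaps:  values observed for real, three-dimensional faces.
    photo_gaps: values observed for photos (possibly bent or angled).
    ASSUMPTION: the candidate thresholds are the midpoints between
    neighbouring observed values; the one with the fewest errors wins.
    Returns None if fewer than two distinct values are available.
    """
    candidates = sorted(set(real_gaps) | set(photo_gaps))
    best_threshold, best_errors = None, None
    for low, high in zip(candidates, candidates[1:]):
        t = (low + high) / 2.0
        # A real face below the threshold is a false alarm; a photo at or
        # above the threshold is a missed spoofing attempt.
        errors = sum(g < t for g in real_gaps) + sum(g >= t for g in photo_gaps)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold
```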
[0047] If it has been determined that the face is fake an alert to this effect may be generated. For instance, if a receptionist looking at images from the door station makes the decision to allow or deny entry, a warning message may be displayed as an overlay on the images, such that the receptionist is made aware of the possible intrusion attempt. If an automated process decides if the person should be allowed or denied entry, the alert of the determination that the face is fake may trigger a denial of entry. Additionally, a warning message may be sent to, e.g., a security guard, such that the attempted intrusion may be investigated further.
[0048] Added protection against spoofing attempts may be achieved if an additional area of the face is taken into account when studying focus measures. If the nose area was chosen as the protrusion area and an ear area was chosen as the recess area, the other ear, a cheek, the chin, or the forehead may be chosen as an additional area 8. In the first image, a first additional focus measure F.sub.a1 is determined, in the same way as the protrusion focus measures and recess focus measures discussed above, and in the second image, a second additional focus measure F.sub.a2 is determined.
[0049] An additional focus difference ΔF.sub.a is calculated as the difference between the first additional focus measure F.sub.a1 and the second additional focus measure F.sub.a2:
ΔF.sub.a=F.sub.a1−F.sub.a2
[0050] The additional focus difference ΔF.sub.a is compared to at least one of the protrusion focus difference ΔF.sub.p and the recess focus difference ΔF.sub.r. In the same way as described above, if the additional focus difference ΔF.sub.a differs from the focus difference it is compared to by less than a predetermined threshold amount δ.sub.th, it is determined that the object is fake.
[0051] By adding an additional area to the analysis, the risk that someone manages to trick the physical access control system by presenting a photo at an angle to the camera may be reduced.
[0052] Depending on whether false alarms or missed spoofing attempts are considered more important to avoid, the method may be varied. If false alarms should be avoided as far as possible, it may be determined that the face is fake only if both the comparison of the protrusion focus difference ΔF.sub.p and the recess focus difference ΔF.sub.r, and the comparison of the additional focus difference ΔF.sub.a and one of the two other focus differences ΔF.sub.p, ΔF.sub.r shows that they differ by less than the predetermined threshold amount δ.sub.th. If it is deemed more important not to miss any spoofing attempts, it may be determined that the face is fake if at least one of the comparisons shows that the focus differences differ by less than the predetermined threshold amount δ.sub.th.
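The two variants described above may be sketched as follows. The function name, the flag `avoid_false_alarms`, and the choice of comparing the additional focus difference to the protrusion focus difference (rather than to the recess focus difference) are assumptions made for the example.

```python
def is_fake_three_areas(delta_p, delta_r, delta_a, threshold, avoid_false_alarms=True):
    """Combine two focus-difference comparisons into one decision.

    delta_p, delta_r, delta_a: protrusion, recess and additional focus differences.
    avoid_false_alarms: if True, both comparisons must indicate a flat object;
                        if False, one such comparison is sufficient.
    """
    main_flat = abs(delta_p - delta_r) < threshold
    # ASSUMPTION: the additional area is compared to the protrusion area.
    extra_flat = abs(delta_a - delta_p) < threshold
    if avoid_false_alarms:
        return main_flat and extra_flat
    return main_flat or extra_flat
```

With illustrative values where only the first comparison indicates a flat object, such as `delta_p=40`, `delta_r=40`, `delta_a=10` and `threshold=5`, the stricter variant returns `False` and the more sensitive variant returns `True`.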
[0053] Regardless of the number of areas studied in the images, a neural network may be used for determining the focus measures, for calculating the focus differences, and for comparing the focus differences. If a neural network is used for these method steps, they need not be distinct steps, but could be integrated into each other. Even if a neural network is used, it is not necessary to perform all of the mentioned steps by means of the neural network. For instance, the focus measures may be determined by a regular focus measure algorithm and the resulting focus measures may be input to a neural network that determines whether these focus measures indicate that the studied object is real or fake. The neural network may be a deep learning model that has been trained to distinguish between images of real, three-dimensional objects and fake, two-dimensional photos of objects. The deep learning model may be trained in a supervised or unsupervised setting. In a supervised setting, the deep learning model is trained using labelled datasets to classify data or predict outcomes accurately, in this case to classify images as depicting real or fake objects. As input data are fed into the deep learning model, the model adjusts its weights until the model has been fitted appropriately, which occurs as part of a cross validation process. In an unsupervised setting, the deep learning model is trained using unlabelled datasets. From the unlabelled datasets, the deep learning model discovers patterns that can be used to cluster data from the datasets into groups of data having common properties. Common clustering algorithms are hierarchical, k-means, and Gaussian mixture models. Thus, the deep learning model may be trained to learn representations of data.
[0054] As already noted, when the aperture size is altered, the depth of field is also altered. Depending on by how much and in what parts of the image focus is thereby changed, the change in focus may be annoying to a person viewing the images. In order to avoid such annoying focus changes, the second image 32 may be marked as a non-display image when encoding the images. Thereby, both the first image 31 and the second image 32 may be included in a stream transmitted by the camera 3, but only the first image 31 need be displayed to a viewer at the receiving end. The need to mark the second image 32 as a non-display image may be reduced by adjusting an exposure time or gain, either already at the sensor, or in image processing.
[0055] In some variants of the method, the first focus setting may be chosen such that it is a little “too close” compared to what would otherwise be used. In order to get a normal image, the first aperture setting may be reduced just a little, such that, e.g., an aperture size of F5.6 is used rather than F2.0 which could normally be expected. In this manner, focus will be in the distance range of interest. For a door station, focus will be at a distance at which a person seeking to enter may be expected to stand. When the detection of a possible spoofing attempt is to be performed, a sweep may be done with the iris or diaphragm of the camera, such that the depth of field is moved from the rear to the front. In this manner, the likelihood of achieving different focus measures at different aperture sizes increases. If instead an “optimal” focus distance is used as the first focus setting, as described above, the difference between different aperture sizes may be smaller, since it is the endpoints of the depth of field that are moved when the aperture size is adjusted.
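How the endpoints of the depth of field move with the aperture size can be illustrated with the textbook thin-lens approximation. The formulas below are standard optics and are not taken from the application. The function name, the circle of confusion of 0.03 mm, and the focal length and focus distance in the usage note are assumptions chosen only for illustration.

```python
import math


def depth_of_field_mm(focal_length_mm, f_number, focus_distance_mm, coc_mm=0.03):
    """Return (near_limit, far_limit) of acceptable sharpness in millimetres.

    Uses the common hyperfocal-distance approximation:
        H    = f^2 / (N * c) + f
        near = s * (H - f) / (H + s - 2 * f)
        far  = s * (H - f) / (H - s)      (infinite when s >= H)
    ASSUMPTION: coc_mm (circle of confusion) is an illustrative default.
    """
    f = focal_length_mm
    s = focus_distance_mm
    hyperfocal = f ** 2 / (f_number * coc_mm) + f
    near = s * (hyperfocal - f) / (hyperfocal + s - 2 * f)
    if s >= hyperfocal:
        return near, math.inf
    far = s * (hyperfocal - f) / (hyperfocal - s)
    return near, far
```

Calling the function for the same focal length and focus distance at F2.0 and at F5.6, for instance `depth_of_field_mm(50, 2.0, 1000)` and `depth_of_field_mm(50, 5.6, 1000)`, shows the near limit moving closer to the camera and the far limit moving away as the aperture size is reduced.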
[0056] The method for detecting a spoofing attempt will now be summarised with reference to
[0057] In step S1, a first image is captured, using a first focus setting and a first aperture size. In step S2, an object is detected in the first image. This object may, e.g., be a face, such as in the preceding examples, or another object that it is desired to detect and be able to determine if it is potentially fake. In step S3a, a first protrusion focus measure is determined in a protrusion area of the object. If the object is a face, the protrusion area may, for instance, be the area of a nose or a forehead. Similarly, in step S3b, a first recess focus measure is determined in a recess area of the object. For a face, the recess area may, e.g., be the area of an ear or a cheek.
[0058] A second image is captured in step S4 using the first focus setting and a second aperture size. The second aperture size is different from the first aperture size. In the same way as for the first image, in step S5, the object that was detected in the first image is also detected in the second image. In step S6a, a second protrusion focus measure is determined in the protrusion area of the object in the second image. The protrusion area studied in the second image is the same as the protrusion area studied in the first image. Thus, if the nose was chosen as the protrusion area in the first image, the nose will be the protrusion area also in the second image. Similarly, in step S6b, a second recess focus measure is determined in the recess area of the object in the second image. The recess area studied in the second image is the same as the recess area studied in the first image. In other words, if an ear was chosen as the recess area in the first image, the same ear will be the recess area in the second image.
[0059] In step S7a, the difference between the first protrusion focus measure and the second protrusion focus measure is calculated, and in step S7b the difference between the first recess focus measure and the second recess focus measure is calculated. In step S8, these differences are compared, and in step S9, if the focus differences differ by less than the predetermined threshold amount, it is determined that the object is fake.
[0060] The spoofing detection methods described above may be implemented by means of a system such as the one illustrated in
[0061] The system 50 may be integrated in the camera 3, such as exemplified in
[0062] The camera 3 in
[0063] The camera 3 may in turn be integrated in another device, such as the door station 2 shown in
[0064] Instead of being integrated in the camera 3, the spoofing detection system 50 may be arranged separately, but operatively connected to the camera 3.
[0065] The spoofing detection system 50 may be embodied in hardware, firmware, or software, or any combination thereof. When embodied as software, the spoofing detection system may be provided in the form of computer code or instructions that when executed on a device having processing capabilities will implement the spoofing detection method described above. Such device may for instance be, or include, a central processing unit (CPU), a graphics processing unit (GPU), a custom-made processing device implemented in an integrated circuit, an ASIC, an FPGA, or logical circuitry including discrete components. When embodied as hardware, the system may comprise circuitry in the form of an arrangement of a circuit or a system of circuits. It may for example be arranged on a chip and may further comprise or be otherwise arranged together with software for performing the processing.
[0066] It will be appreciated that a person skilled in the art can modify the above-described embodiments in many ways and still use the advantages of the invention as shown in the embodiments above. As an example, the invention has so far been described in the context of a physical access control system, in which a door station employs a camera and a face detection or face recognition algorithm. However, the anti-spoofing methods and systems of the invention can be used also in other systems, such as monitoring or surveillance systems employing cameras. They may also be used in contexts where other objects than faces are detected. As long as the detected type of object has at least one protrusion area and at least one recess area, the spoofing detection methods and systems may be used for detecting if a detected object appears to be a flat photo rather than a real three-dimensional object. For instance, looking at a car more or less from the front, the headlights and the outer rear-view mirrors are expected to be at different distances from the camera. Hence, a difference in focus measure is expected between the headlights and the rear-view mirrors if the detected car is three-dimensional and not just a photo of a car. The skilled person will realise that if the objects to study are expected to be located further away from the camera than is typically expected of a face in front of a door station, the focal length of the camera will need to be longer.
[0067] In the examples above, the person trying to gain access attempts to trick the system using a photo of a person who is authorised to enter. Instead of a photo the intruder could just as well hold a display, such as on a mobile phone or tablet, up to the camera. The display could show a still photo or a video sequence.
[0068] Thus, the invention should not be limited to the shown embodiments but should only be defined by the appended claims.