METHOD AND SYSTEM FOR DETECTING A SPOOFING ATTEMPT

20230118532 · 2023-04-20

Abstract

A first image is captured by the camera, using a first focus setting and a first aperture size. A first protrusion focus measure in a protrusion area of an object in the first image and a first recess focus measure in a recess area of the object in the first image are determined. A second image is captured by the camera, using the first focus setting and a second aperture size, and the object is detected. A second protrusion focus measure and a second recess focus measure are determined in the second image. A protrusion focus difference between the first and second protrusion focus measures, and a recess focus difference between the first and second recess focus measures are calculated. The protrusion focus difference and the recess focus difference are compared and if they differ by less than a predetermined threshold amount, it is determined that the object is fake.

Claims

1. A method for detecting if an object detected in a surveillance or access control system comprising a camera is fake, the method comprising: capturing a first image by the camera, using a first focus setting and a first aperture size, detecting the object in the first image, determining a first protrusion focus measure in a protrusion area of the object in the first image, determining a first recess focus measure in a recess area of the object in the first image, capturing a second image by the camera, using the first focus setting and a second aperture size which is different from the first aperture size, detecting the object in the second image, determining a second protrusion focus measure in a protrusion area of the object in the second image, determining a second recess focus measure in a recess area of the object in the second image, calculating a protrusion focus difference between the first and second protrusion focus measures, calculating a recess focus difference between the first and second recess focus measures, comparing the protrusion focus difference and the recess focus difference, and if the protrusion focus difference and the recess focus difference differ by less than a predetermined threshold amount, determining that the object is fake.

2. The method according to claim 1, wherein the object is a face and wherein the protrusion area is an area corresponding to a nose area of the face, and wherein the recess area is an area corresponding to an ear, cheek, chin, or forehead area of the face.

3. The method according to claim 1, further comprising determining a first additional focus measure in an additional area of the object in the first image, determining a second additional focus measure in an additional area of the object in the second image, calculating an additional focus difference between the first and second additional focus measures, comparing the additional focus difference and at least one of the protrusion focus difference and the recess focus difference, and if the additional focus difference and said at least one of the protrusion focus difference and the recess focus difference differ by less than a predetermined threshold amount, determining that the object is fake.

4. The method according to claim 1, wherein the focus measures are determined using a contrast detection algorithm.

5. The method according to claim 1, wherein the focus measures are determined using an algorithm chosen from the group consisting of a Sobel algorithm, a Laplacian algorithm, a Gaussian algorithm, a Scharr algorithm, a Roberts algorithm, a Prewitt algorithm, a Brenner algorithm, a Tenengrad algorithm, a histogram algorithm, a Vollath algorithm, a frequency analysis algorithm using FFT, and a frequency analysis algorithm using DCT.

6. The method according to claim 1, wherein the steps of determining focus measures, calculating focus differences, and comparing focus differences are performed by a neural network.

7. The method according to claim 1, further comprising marking the second image as a non-display image.

8. A system for detecting if an object detected in a surveillance or access control system comprising a camera is fake, the system comprising: an aperture setting controller arranged to control an aperture size of the camera, an image capture initiator arranged to initiate capture of a first image, and a second image, wherein the aperture setting controller is arranged to control the aperture size such that the first image is captured using a first aperture size and the second image is captured using a second aperture size, which is different from the first aperture size, the system further comprising: an object detector arranged to detect an object in the first and second images, a focus determinator arranged to determine a first protrusion focus measure in a protrusion area of the object in the first image, a first recess focus measure in a recess area of the object in the first image, a second protrusion focus measure in a protrusion area of the object in the second image, and a second recess focus measure in a recess area of the object in the second image, a focus difference calculator arranged to calculate a protrusion focus difference between the first and second protrusion focus measures, and a recess focus difference between the first and second recess focus measures, a focus difference comparator arranged to compare the protrusion focus difference and the recess focus difference, and an evaluator arranged to determine that the object is fake if the protrusion focus difference and the recess focus difference differ by less than a predetermined threshold amount.

9. The system according to claim 8, wherein the object detector is a face detector.

10. A camera comprising a system according to claim 8.

11. A non-transitory computer readable storage medium having stored thereon instructions for implementing the method according to claim 1, when executed on a device having processing capabilities.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0025] The invention will now be described in more detail by way of example and with reference to the accompanying schematic drawings, in which:

[0026] FIG. 1 is an illustration of a person standing in front of a door station,

[0027] FIG. 2 is an illustration of a person standing in front of a door station and holding up a photo of another person,

[0028] FIG. 3 shows a first image and a second image of the person in FIG. 2,

[0029] FIG. 4 is a flow chart showing the steps of a method for detecting if an object detected in a surveillance or access control system comprising a camera is fake,

[0030] FIG. 5 is a block diagram of a system for detecting if an object detected in a surveillance or access control system comprising a camera is fake, and

[0031] FIG. 6 is a block diagram of a camera comprising the system in FIG. 5.

DETAILED DESCRIPTION OF EMBODIMENTS

[0032] In FIG. 1, a scenario is shown in which a person 1 is standing in front of a door station 2. The door station has a camera 3 that can be used for taking images of anyone standing in front of the door station. The images may be still images or video sequences. As noted in the background section above, such images may be used for determining who the person at the door station is. This may be done manually, by a receptionist or guard watching the images, or it may be done automatically, by a face recognition algorithm.

[0033] Let us assume that the person 1 standing at the door station 2 is in fact not supposed to be allowed to enter. The person 1 may in such case try to gain entry by attempting to trick the receptionist or guard, or the face recognition algorithm. As shown in FIG. 2, the person 1 may attempt to spoof the system by holding up a photo 4 of another person, who is trusted to enter. If the photo is of high enough quality, it may appear as though the trusted person is standing in front of the door station 2. Thus, the receptionist or guard may decide to let the unwelcome person 1 in. Similarly, a face recognition algorithm may unintentionally let the unwelcome person 1 in. By means of the invention, such spoofing attempts may be automatically detected. Examples of how such method and system may be embodied will now be described.

[0034] The camera 3 captures a first image 31, shown in FIG. 3. When capturing the first image 31, the camera 3 uses a first focus setting, adapted for focusing on a person 1 standing at an expected distance in front of the door station 2. Additionally, the camera 3 uses a first aperture size. The first aperture size may be a default aperture size for the camera 3, or it may be specifically chosen for the lighting conditions at the door station 2 and for providing a desired depth of field to be able to focus on a person standing in front of the door station 2. For instance, the first aperture size may be set such that a depth of field is achieved which gives acceptable focus on the person 1 even when standing slightly closer to or slightly further away from the door station 2 than expected. The depth of field need not enable focus at distances further away from the door station than a person can reasonably be expected to stand when interacting with the door station 2. In this way, monitoring of a wider area than intended around the door station 2 can be prevented.

[0035] A face detection algorithm in the camera 3 is used for detecting the face 5 of the person 1 appearing before the door station 2, and the nose area 6 and an ear area 7 of the face are located. The nose and ear areas are chosen because they represent a protrusion area and a recess area, respectively, of the face. Other areas of the face may be chosen, as long as one of them is an area that is expected to protrude in relation to another of the chosen areas. For example, the forehead may be chosen as a protrusion area and an ear as a recess area, or the nose may be chosen as a protrusion area and a cheek or the chin as a recess area.

[0036] When the nose area has been located, a focus measure is determined for this area. This will in the following be referred to as a protrusion focus measure since the nose has been chosen as protrusion area. The focus measure for the nose area in the first image will be referred to as a first protrusion focus measure F.sub.p1.

[0037] Similarly, a focus measure is determined for the ear area. This will be referred to as a recess focus measure since the ear has been chosen as recess area. The focus measure for the ear area in the first image will be referred to as a first recess focus measure F.sub.r1.

[0038] The focus measures are determined using any suitable known focus algorithm, such as a contrast detection algorithm. If the camera 3 has an autofocus function, the same focus determination algorithm may be used as for the autofocus function. Some examples of focus determination algorithms that may be useful are a Sobel algorithm, a Laplacian algorithm, a Gaussian algorithm, a Scharr algorithm, a Roberts algorithm, a Prewitt algorithm, a Brenner algorithm, a Tenengrad algorithm, a histogram algorithm, a Vollath algorithm, a frequency analysis algorithm using FFT, and a frequency analysis algorithm using DCT.
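As an illustration of what such a focus measure may look like, the following is a minimal sketch of two of the listed families, a Laplacian-based and a Tenengrad (Sobel-based) measure, written with NumPy. The function names, the 3x3 kernels, and the `box_blur` helper used as a crude stand-in for defocus are assumptions made for the example and are not taken from the embodiments.

```python
import numpy as np


def laplacian_variance(region: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian response; higher means sharper."""
    g = region.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


def tenengrad(region: np.ndarray) -> float:
    """Mean squared Sobel gradient magnitude; higher means sharper."""
    g = region.astype(np.float64)
    # Horizontal Sobel response: right column minus left column.
    gx = ((g[:-2, 2:] + 2.0 * g[1:-1, 2:] + g[2:, 2:])
          - (g[:-2, :-2] + 2.0 * g[1:-1, :-2] + g[2:, :-2]))
    # Vertical Sobel response: bottom row minus top row.
    gy = ((g[2:, :-2] + 2.0 * g[2:, 1:-1] + g[2:, 2:])
          - (g[:-2, :-2] + 2.0 * g[:-2, 1:-1] + g[:-2, 2:]))
    return float(np.mean(gx ** 2 + gy ** 2))


def box_blur(region: np.ndarray, passes: int = 3) -> np.ndarray:
    """Hypothetical defocus stand-in: repeated 3x3 mean filter."""
    g = region.astype(np.float64)
    for _ in range(passes):
        p = np.pad(g, 1, mode="edge")
        h, w = g.shape
        g = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return g


# Synthetic texture standing in for a nose or ear crop.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
blurred = box_blur(sharp)
print(laplacian_variance(sharp) > laplacian_variance(blurred))
print(tenengrad(sharp) > tenengrad(blurred))
```

Either function can be applied to the crop of the protrusion area and to the crop of the recess area. What matters for the method is that the same measure is used for both areas and for both images.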

[0039] The camera 3 captures a second image 32. For this second image 32, the camera 3 uses the same first focus setting as for the first image 31. However, for the second image 32, the camera 3 uses a second aperture size. The second aperture size may be smaller or larger than the first aperture size, but not equal to the first aperture size. When a different aperture size is used, the depth of field changes. If a larger aperture size is used, the depth of field decreases. Thus, some parts of the image that were in focus with a smaller aperture will now be out of focus. Conversely, if a smaller aperture size is used, the depth of field increases, and some parts of the image that were previously out of focus will now be in focus. In FIG. 3, hatching illustrates schematically that focus has changed.
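The relationship between aperture size and depth of field can be made concrete with the standard thin-lens approximation. The sketch below is only illustrative: the 4 mm focal length, the 0.005 mm circle of confusion, and the 500 mm focus distance are assumed values chosen for the example, not parameters of the camera 3.

```python
def depth_of_field(focal_mm: float, f_number: float, focus_mm: float,
                   coc_mm: float = 0.005) -> tuple:
    """Return (near, far) limits of acceptable focus in mm (thin-lens model).

    coc_mm is the acceptable circle of confusion, an assumed value here.
    """
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = (focus_mm * (hyperfocal - focal_mm)
            / (hyperfocal + focus_mm - 2.0 * focal_mm))
    if focus_mm >= hyperfocal:
        far = float("inf")  # everything beyond the near limit is acceptable
    else:
        far = focus_mm * (hyperfocal - focal_mm) / (hyperfocal - focus_mm)
    return near, far


# Same focus setting, two aperture sizes (a small f-number is a large aperture).
for n in (2.0, 5.6):
    near, far = depth_of_field(focal_mm=4.0, f_number=n, focus_mm=500.0)
    print(f"f/{n}: acceptable focus from {near:.0f} mm to {far:.0f} mm")
```

With these assumed values, the larger aperture gives the narrower range of acceptable focus, which is the effect the method relies on.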

[0040] The face 5 is detected also in the second image 32. A second protrusion focus measure F.sub.p2 is determined for the nose area of the face 5 in the second image 32 and a second recess focus measure F.sub.r2 is determined for the ear area. These focus measures are determined in the same way as for the first image 31.

[0041] When the focus measures have been determined for the first and second images, a protrusion focus difference ΔF.sub.p is calculated as the difference between the first protrusion focus measure F.sub.p1 and the second protrusion focus measure F.sub.p2:


ΔF.sub.p=F.sub.p1−F.sub.p2

[0042] Analogously, a recess focus difference ΔF.sub.r is calculated as the difference between the first recess focus measure F.sub.r1 and the second recess focus measure F.sub.r2:


ΔF.sub.r=F.sub.r1−F.sub.r2

[0043] Alternatively, the protrusion focus difference and the recess focus difference may be calculated as relative differences. Thus, the protrusion focus difference ΔF.sub.p may instead be calculated as follows:

[00001] $\Delta F_p = \frac{F_{p1} - F_{p2}}{F_{p1}}$

[0044] Analogously, the recess focus difference ΔF.sub.r may be calculated as follows:

[00002] $\Delta F_r = \frac{F_{r1} - F_{r2}}{F_{r1}}$
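The absolute and relative variants of the focus difference can be written as two small helper functions. The function names are hypothetical, and the guard against a zero first focus measure is an added assumption, since the description does not discuss that case.

```python
def absolute_focus_difference(f_first: float, f_second: float) -> float:
    """Focus difference as first measure minus second measure."""
    return f_first - f_second


def relative_focus_difference(f_first: float, f_second: float) -> float:
    """Focus difference normalised by the first focus measure."""
    if f_first == 0:
        # Assumption: a zero first measure is treated as an error.
        raise ValueError("first focus measure must be non-zero")
    return (f_first - f_second) / f_first


# Illustrative numbers only, not measured values.
f_p1, f_p2 = 10.0, 8.0
print(absolute_focus_difference(f_p1, f_p2))
print(relative_focus_difference(f_p1, f_p2))
```

The same helpers apply unchanged to the recess focus measures F.sub.r1 and F.sub.r2.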

[0045] The protrusion focus difference ΔF.sub.p is compared to the recess focus difference ΔF.sub.r. If the face is real, the change in focus measure between the first and second images is expected to be different for the nose area and the ear area since the nose is expected to protrude more than the ear, or in other words, the nose is expected to be located closer to the camera, along the optical axis of the camera, than the ear. If, on the other hand, the face is a photo, the change in focus measure will likely be the same for the nose as for the ear, as they are both in the same plane, i.e. the plane of the two-dimensional photo. Therefore, if the protrusion focus difference ΔF.sub.p differs from the recess focus difference ΔF.sub.r by less than a predetermined threshold amount δ.sub.th, it is determined that the face is not a real three-dimensional face, but only a two-dimensional representation, such as a photo. In other words, it is determined that the face is fake. The principle of comparing to the predetermined threshold amount δ.sub.th is the same whether the focus differences are calculated as absolute or relative differences, but as the skilled person will appreciate, the value of the predetermined threshold amount δ.sub.th will be different depending on if the differences are absolute or relative.
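The comparison described above reduces to a single test on the two focus differences. The following sketch assumes that "differ by less than" means the magnitude of the difference between ΔF.sub.p and ΔF.sub.r; the function name and the numbers are hypothetical.

```python
def is_fake(delta_protrusion: float, delta_recess: float,
            threshold: float) -> bool:
    """True if the two focus differences are too similar for a 3D object."""
    return abs(delta_protrusion - delta_recess) < threshold


# Illustrative relative differences and threshold, not measured values.
threshold = 0.05
print(is_fake(0.42, 0.40, threshold))  # nose and ear changed alike
print(is_fake(0.45, 0.10, threshold))  # nose and ear changed differently
```

The threshold passed in must be on the same scale as the differences, so a value chosen for relative differences cannot be reused for absolute differences.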

[0046] The predetermined threshold amount δ.sub.th may, e.g., be determined empirically by studying focus differences for a number of real faces and for a number of photos, possibly bent or angled to different degrees.
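One conceivable way of deriving the threshold from such a study is sketched below. It assumes that, for each real face and each photo, the magnitude of the difference between ΔF.sub.p and ΔF.sub.r has been recorded, and it picks the candidate threshold with the best balanced accuracy. The selection criterion, the function name, and all numbers are assumptions made for the example.

```python
def choose_threshold(real_gaps: list, photo_gaps: list) -> float:
    """Pick a threshold separating photo gaps (small) from real gaps (large).

    Each gap is |delta_protrusion - delta_recess| for one sample.
    """
    if not real_gaps or not photo_gaps:
        raise ValueError("need samples of both real faces and photos")
    values = sorted(set(real_gaps) | set(photo_gaps))
    if len(values) < 2:
        raise ValueError("need at least two distinct gap values")
    best_t, best_score = values[0], -1.0
    # Candidate thresholds are midpoints between neighbouring observed gaps.
    for low, high in zip(values, values[1:]):
        t = (low + high) / 2.0
        photos_caught = sum(g < t for g in photo_gaps) / len(photo_gaps)
        reals_passed = sum(g >= t for g in real_gaps) / len(real_gaps)
        score = (photos_caught + reals_passed) / 2.0
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Hypothetical study data, for illustration only.
real_gaps = [0.30, 0.25, 0.41, 0.18]
photo_gaps = [0.02, 0.05, 0.01, 0.09]
print(choose_threshold(real_gaps, photo_gaps))
```

Including bent or angled photos among the photo samples, as suggested above, would tend to raise the photo gaps and therefore move the chosen threshold.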

[0047] If it has been determined that the face is fake, an alert to this effect may be generated. For instance, if a receptionist looking at images from the door station makes the decision to allow or deny entry, a warning message may be displayed as an overlay on the images, such that the receptionist is made aware of the possible intrusion attempt. If an automated process decides if the person should be allowed or denied entry, the alert of the determination that the face is fake may trigger a denial of entry. Additionally, a warning message may be sent to, e.g., a security guard, such that the attempted intrusion may be investigated further.

[0048] Added protection against spoofing attempts may be achieved if an additional area of the face is taken into account when studying focus measures. If the nose area was chosen as the protrusion area and an ear area was chosen as the recess area, the other ear, a cheek, the chin, or the forehead may be chosen as an additional area 8. In the first image, a first additional focus measure F.sub.a1 is determined, in the same way as the protrusion focus measures and recess focus measures discussed above, and in the second image, a second additional focus measure F.sub.a2 is determined.

[0049] An additional focus difference ΔF.sub.a is calculated as the difference between the first additional focus measure F.sub.a1 and the second additional focus measure F.sub.a2:


ΔF.sub.a=F.sub.a1−F.sub.a2

[0050] The additional focus difference ΔF.sub.a is compared to at least one of the protrusion focus difference ΔF.sub.p and the recess focus difference ΔF.sub.r. In the same way as described above, if the additional focus difference ΔF.sub.a differs from the focus difference it is compared to by less than a predetermined threshold amount δ.sub.th, it is determined that the object is fake.

[0051] By adding an additional area to the analysis, the risk that someone manages to trick the physical access control system by presenting a photo at an angle to the camera may be reduced.

[0052] Depending on whether false alarms or missed spoofing attempts are considered more important to avoid, the method may be varied. If false alarms should be avoided as far as possible, it may be determined that the face is fake only if both the comparison of the protrusion focus difference ΔF.sub.p and the recess focus difference ΔF.sub.r, and the comparison of the additional focus difference ΔF.sub.a and one of the two other focus differences ΔF.sub.p, ΔF.sub.r shows that they differ by less than the predetermined threshold amount δ.sub.th. If it is deemed more important not to miss any spoofing attempts, it may be determined that the face is fake if at least one of the comparisons shows that the focus differences differ by less than the predetermined threshold amount δ.sub.th.
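The two policies can be expressed as one function with a switch. In this sketch the additional focus difference is compared to the protrusion focus difference by default; the function name, the `reference` and `require_all` parameters, and the numbers are hypothetical.

```python
def is_fake_with_additional(delta_p: float, delta_r: float, delta_a: float,
                            threshold: float, require_all: bool = True,
                            reference: str = "protrusion") -> bool:
    """Combine the main comparison with the additional-area comparison.

    require_all=True favours few false alarms (both comparisons must agree).
    require_all=False favours few missed spoofing attempts (either suffices).
    """
    if reference not in ("protrusion", "recess"):
        raise ValueError("reference must be 'protrusion' or 'recess'")
    other = delta_p if reference == "protrusion" else delta_r
    main_similar = abs(delta_p - delta_r) < threshold
    additional_similar = abs(delta_a - other) < threshold
    if require_all:
        return main_similar and additional_similar
    return main_similar or additional_similar


# Illustrative case: nose and ear agree, but the additional area does not.
print(is_fake_with_additional(0.30, 0.29, 0.10, 0.05, require_all=True))
print(is_fake_with_additional(0.30, 0.29, 0.10, 0.05, require_all=False))
```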

[0053] Regardless of the number of areas studied in the images, a neural network may be used for determining the focus measures, for calculating the focus differences, and for comparing the focus differences. If a neural network is used for these method steps, they need not be distinct steps, but could be integrated into each other. Even if a neural network is used, it is not necessary to perform all of the mentioned steps by means of the neural network. For instance, the focus measures may be determined by a regular focus measure algorithm and the resulting focus measures may be input to a neural network that determines whether these focus measures indicate that the studied object is real or fake. The neural network may be a deep learning model that has been trained to distinguish between images of real, three-dimensional objects and fake, two-dimensional photos of objects. The deep learning model may be trained in a supervised or unsupervised setting. In a supervised setting, the deep learning model is trained using labelled datasets to classify data or predict outcomes accurately, in this case to classify images as depicting real or fake objects. As input data are fed into the deep learning model, the model adjusts its weights until the model has been fitted appropriately, which occurs as part of a cross validation process. In an unsupervised setting, the deep learning model is trained using unlabelled datasets. From the unlabelled datasets, the deep-learning model discovers patterns that can be used to cluster data from the datasets into groups of data having common properties. Common clustering algorithms are hierarchical, k-means, and Gaussian mixture models. Thus, the deep learning model may be trained to learn representations of data.

[0054] As already noted, when the aperture size is altered, the depth of field is also altered. Depending on by how much and in what parts of the image focus is thereby changed, the change in focus may be annoying to a person viewing the images. In order to avoid such annoying focus changes, the second image 32 may be marked as a non-display image when encoding the images. Thereby, both the first image 31 and the second image 32 may be included in a stream transmitted by the camera 3, but only the first image 31 need be displayed to a viewer at the receiving end. The need to mark the second image 32 as a non-display image may be reduced by adjusting an exposure time or gain, either already at the sensor, or in image processing.
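At the receiving end, handling of the non-display marking may be pictured as a simple filter on the stream. The `Frame` structure and its `display` flag below are hypothetical stand-ins for whatever signalling the chosen encoder provides; they do not correspond to a specific codec syntax.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Frame:
    frame_id: int
    aperture: str          # "first" or "second" aperture size
    display: bool = True   # False marks a non-display image


def frames_to_show(stream: Iterable[Frame]) -> Iterator[Frame]:
    """Yield only the frames a viewer should see."""
    return (frame for frame in stream if frame.display)


# The second-aperture frame is transmitted for analysis but not displayed.
stream = [
    Frame(0, "first"),
    Frame(1, "second", display=False),
    Frame(2, "first"),
]
print([frame.frame_id for frame in frames_to_show(stream)])
```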

[0055] In some variants of the method, the first focus setting may be chosen such that it is a little “too close” compared to what would otherwise be used. In order to get a normal image, the first aperture setting may be reduced just a little, such that, e.g., an aperture size of F5.6 is used rather than F2.0 which could normally be expected. In this manner, focus will be in the distance range of interest. For a door station, focus will be at a distance at which a person seeking to enter may be expected to stand. When the detection of a possible spoofing attempt is to be performed, a sweep may be done with the iris or diaphragm of the camera, such that the depth of field is moved from the rear to the front. In this manner, the likelihood of achieving different focus measures at different aperture sizes increases. If instead an “optimal” focus distance is used as the first focus setting, as described above, the difference between different aperture sizes may be smaller, since it is the endpoints of the depth of field that are moved when the aperture size is adjusted.
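The benefit of focusing slightly too close can be illustrated with the thin-lens blur circle, used here as a rough proxy for how a focus measure would react. All values are assumptions made for the example: a 4 mm lens, a nose at 480 mm, an ear at 580 mm, a "too close" focus at 450 mm, and a focus at 530 mm, between the nose and the ear.

```python
def blur_circle_mm(focal_mm: float, f_number: float,
                   focus_mm: float, object_mm: float) -> float:
    """Blur circle diameter on the sensor for a point at object_mm."""
    aperture_diameter = focal_mm / f_number
    return (aperture_diameter
            * abs(object_mm - focus_mm) / object_mm
            * focal_mm / (focus_mm - focal_mm))


def blur_gap_change(focus_mm: float, focal_mm: float = 4.0,
                    nose_mm: float = 480.0, ear_mm: float = 580.0,
                    n_open: float = 2.0, n_closed: float = 5.6) -> float:
    """How much the ear-minus-nose blur gap changes between two apertures."""
    def gap(f_number: float) -> float:
        return (blur_circle_mm(focal_mm, f_number, focus_mm, ear_mm)
                - blur_circle_mm(focal_mm, f_number, focus_mm, nose_mm))
    return abs(gap(n_open) - gap(n_closed))


# Sweep the aperture for both focus settings.
for focus in (450.0, 530.0):
    print(f"focus at {focus:.0f} mm: gap change {blur_gap_change(focus):.5f} mm")
```

Under these assumptions, the nose and ear react far more differently to the aperture change when focus is set in front of the face than when it is set between the two areas, consistent with the reasoning above. A flat photo would give a gap change of zero in both cases, since nose and ear are then at the same distance.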

[0056] The method for detecting a spoofing attempt will now be summarised with reference to FIG. 4, which is a flow chart showing the steps of an example of the method.

[0057] In step S1, a first image is captured, using a first focus setting and a first aperture size. In step S2, an object is detected in the first image. This object may, e.g., be a face, such as in the preceding examples, or another object that it is desired to detect and be able to determine if it is potentially fake. In step S3a, a first protrusion focus measure is determined in a protrusion area of the object. If the object is a face, the protrusion area may, for instance, be the area of a nose or a forehead. Similarly, in step S3b, a first recess focus measure is determined in a recess area of the object. For a face, the recess area may, e.g., be the area of an ear or a cheek.

[0058] A second image is captured in step S4 using the first focus setting and a second aperture size. The second aperture size is different from the first aperture size. In the same way as for the first image, in step S5, the object that was detected in the first image is also detected in the second image. In step S6a, a second protrusion focus measure is determined in the protrusion area of the object in the second image. The protrusion area studied in the second image is the same as the protrusion area studied in the first image. Thus, if the nose was chosen as the protrusion area in the first image, the nose will be the protrusion area also in the second image. Similarly, in step S6b, a second recess focus measure is determined in the recess area of the object in the second image. The recess area studied in the second image is the same as the recess area studied in the first image. In other words, if an ear was chosen as the recess area in the first image, the same ear will be the recess area in the second image.

[0059] In step S7a, the difference between the first protrusion focus measure and the second protrusion focus measure is calculated, and in step S7b the difference between the first recess focus measure and the second recess focus measure is calculated. In step S8, these differences are compared, and in step S9, if the focus differences differ by less than the predetermined threshold amount, it is determined that the object is fake.
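Steps S1 to S9 can be sketched end to end as follows. The `capture` and `locate` callables are hypothetical interfaces standing in for the camera and the object detector, `gradient_energy` is a simple contrast-type focus measure chosen for the example, and absolute focus differences are used.

```python
from typing import Callable, Optional, Tuple

import numpy as np

Region = Tuple[slice, slice]  # (rows, columns) of an area in the image


def gradient_energy(patch: np.ndarray) -> float:
    """Simple contrast-type focus measure: mean squared pixel differences."""
    g = patch.astype(np.float64)
    return float(np.mean(np.diff(g, axis=0) ** 2)
                 + np.mean(np.diff(g, axis=1) ** 2))


def detect_spoof(
    capture: Callable[[float], np.ndarray],
    locate: Callable[[np.ndarray], Optional[Tuple[Region, Region]]],
    aperture_1: float,
    aperture_2: float,
    threshold: float,
    focus_measure: Callable[[np.ndarray], float] = gradient_energy,
) -> Optional[bool]:
    """Return True if the object appears fake, False if it appears real,
    and None if no object was found in one of the images."""
    if aperture_1 == aperture_2:
        raise ValueError("the two aperture sizes must differ")

    first = capture(aperture_1)                 # S1: same focus setting
    areas_1 = locate(first)                     # S2
    if areas_1 is None:
        return None
    protrusion_1, recess_1 = areas_1
    f_p1 = focus_measure(first[protrusion_1])   # S3a
    f_r1 = focus_measure(first[recess_1])       # S3b

    second = capture(aperture_2)                # S4: same focus setting
    areas_2 = locate(second)                    # S5
    if areas_2 is None:
        return None
    protrusion_2, recess_2 = areas_2
    f_p2 = focus_measure(second[protrusion_2])  # S6a
    f_r2 = focus_measure(second[recess_2])      # S6b

    delta_p = f_p1 - f_p2                       # S7a
    delta_r = f_r1 - f_r2                       # S7b
    return abs(delta_p - delta_r) < threshold   # S8 and S9
```

Returning `None` when detection fails is a design choice of this sketch; the description does not specify what should happen in that case.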

[0060] The spoofing detection methods described above may be implemented by means of a system such as the one illustrated in FIG. 5. The system 50 comprises an aperture setting controller 51, which is arranged to control the aperture size of the camera. The system 50 further comprises an image capture initiator 52, which is arranged to initiate capture of a first image, and a second image. The aperture setting controller is arranged to control the aperture size, such that the first image is captured with a first aperture size and such that the second image is captured with a second aperture size. The first and second aperture sizes are not the same. The system 50 also comprises an object detector 53. If the objects of interest are faces, then the object detector 53 may be a face detector. The object detector 53 is arranged to detect an object in the first and second images. Furthermore, the system 50 comprises a focus determinator 54. The focus determinator 54 is arranged to determine the first protrusion focus measure in the protrusion area of the object, such as a nose area in a face, in the first image and to determine the corresponding second protrusion focus measure in the second image. The focus determinator 54 is also arranged to determine the first recess focus measure in the recess area of the object, such as an ear area in a face, and to determine the corresponding second recess focus measure in the second image. A focus difference calculator 55 comprised in the system 50 is arranged to calculate the protrusion focus difference between the first and second protrusion focus measures and to calculate the recess focus difference between the first and second recess focus measures. The system 50 further comprises a focus difference comparator 56, which is arranged to compare the protrusion focus difference and the recess focus difference. Additionally, the system 50 comprises an evaluator 57, which is arranged to determine that the object is fake if the protrusion focus difference and the recess focus difference differ by less than a predetermined threshold amount, i.e. if the protrusion focus difference and the recess focus difference are too similar.

[0061] The system 50 may be integrated in the camera 3, such as exemplified in FIG. 6. In FIG. 6, only a very simplified block diagram of the camera 3 is shown. The camera 3 has other components as well, but as these are not of particular relevance to the present invention, they are not shown in the drawings and will not be further discussed here.

[0062] The camera 3 in FIG. 6 has a lens 61, and a sensor 62 for capturing images. Further, the camera has an image processor 63 for processing the captured images, an encoder 64 for encoding the images, and a network interface 65 for transmitting encoded images from the camera 3. Additionally, the camera 3 has a system 50 for detecting if an object detected in images captured by the camera 3 is fake. The system 50 may be integrated in the image processor 63, or it may be a separate module, such as shown in FIG. 6.

[0063] The camera 3 may in turn be integrated in another device, such as the door station 2 shown in FIGS. 1 and 2, or it may be a standalone camera.

[0064] Instead of being integrated in the camera 3, the spoofing detection system 50 may be arranged separately, but operatively connected to the camera 3.

[0065] The spoofing detection system 50 may be embodied in hardware, firmware, or software, or any combination thereof. When embodied as software, the spoofing detection system may be provided in the form of computer code or instructions that when executed on a device having processing capabilities will implement the spoofing detection method described above. Such device may for instance be, or include, a central processing unit (CPU), a graphics processing unit (GPU), a custom-made processing device implemented in an integrated circuit, an ASIC, an FPGA, or logical circuitry including discrete components. When embodied as hardware, the system may comprise circuitry in the form of an arrangement of a circuit or a system of circuits. It may for example be arranged on a chip and may further comprise or be otherwise arranged together with software for performing the processing.

[0066] It will be appreciated that a person skilled in the art can modify the above-described embodiments in many ways and still use the advantages of the invention as shown in the embodiments above. As an example, the invention has so far been described in the context of a physical access control system, in which a door station employs a camera and a face detection or face recognition algorithm. However, the anti-spoofing methods and systems of the invention can be used also in other systems, such as monitoring or surveillance systems employing cameras. They may also be used in contexts where other objects than faces are detected. As long as the detected type of object has at least one protrusion area and at least one recess area, the spoofing detection methods and systems may be used for detecting if a detected object appears to be a flat photo rather than a real three-dimensional object. For instance, looking at a car more or less from the front, the headlights and the outer rear-view mirrors are expected to be at different distances from the camera. Hence, a difference in focus measure is expected between the headlights and the rear-view mirrors if the detected car is three-dimensional and not just a photo of a car. The skilled person will realise that if the objects to study are expected to be located further away from the camera than is typically expected of a face in front of a door station, the focal length of the camera will need to be longer.

[0067] In the examples above, the person trying to gain access attempts to trick the system using a photo of a person who is authorised to enter. Instead of a photo the intruder could just as well hold a display, such as on a mobile phone or tablet, up to the camera. The display could show a still photo or a video sequence.

[0068] Thus, the invention should not be limited to the shown embodiments but should only be defined by the appended claims.