Method for detecting and recognizing long-range high-density visual markers
10956799 · 2021-03-23
Assignee
Inventors
- Juan Manuel Saez Martínez (Alicante, ES)
- Francisco Escolano Ruiz (Alicante, ES)
- Miguel Ángel Lozano Ortega (Alicante, ES)
- Javier Pita Lozano (Alicante, ES)
CPC classification
G06K19/0614
PHYSICS
International classification
G06K19/06
PHYSICS
G06F11/10
PHYSICS
Abstract
The proposal relates to a complete system of long-range, high-density visual markers (marker design and detection method). In the design, a conventional location system for the long-range markers is used. The proposal therefore focuses on a system for coding information, which in this case is a colour-based code having four states, doubling the code density with respect to conventional black-and-white systems. Moreover, the detection method requires very few computational resources, making it very efficient and especially suitable for mobile devices. To a great extent, the success of the technique lies in the methods proposed for the treatment of the colour.
Claims
1. A detection method for detecting a visual marker in the form of a black frame on a white background, comprising the steps of: detecting the black frame, locating coordinates of corners of the black frame; obtaining a grid from the black frame by means of bilinear interpolation, wherein the grid is made up of elements having four different color tones, with the four different color tones being distinguishable from one another and forming a color palette, wherein the grid has a plurality of cells, wherein a central cell of the grid determines a size of the grid, wherein the cells corresponding with a central row and column, except for the central cell, define a cyclic redundancy check and wherein remaining cells forming the grid are elements dedicated to a message that can be transmitted by the visual marker; obtaining a color of each cell, calculating an arithmetic mean of colors of the black frame and white background for obtaining a reference black and a reference white; normalizing the color of each cell from the reference black and white by establishing a white balance; correcting an orientation of the visual marker so that the cell of a darkest corner is in a last position of the color palette; obtaining the color palette from the corners and labelling the visual marker indicating which value of the color palette corresponds to each cell; verifying that the visual marker belongs to a desired family by analyzing whether a central label is what was expected; composing the message and the cyclic redundancy check; and verifying integrity of the message, recalculating the cyclic redundancy check and comparing the recalculated cyclic redundancy check with the cyclic redundancy check read from a code.
2. The method according to claim 1, wherein coordinates of a center of each cell are used in a bilinear interpolation, also taking into account the black frames and the white background.
3. The method according to claim 1, wherein to obtain the color of each cell and given that its center is in real coordinates and the image is in discrete coordinates, a 4-neighbor point bilinear interpolation is performed to obtain the color.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) A series of drawings which help to better understand the invention and which are expressly related to an embodiment of said invention presented as a non-limiting example thereof are very briefly described below.
DETAILED DESCRIPTION OF AN EMBODIMENT OF THE INVENTION
(7) The proposed marker design is based on a grid of N×N elements which can take up to four different colours inside a black frame which, in turn, is inside a white frame, as can best be seen in the drawings.
(8) The colour palette is made up of the four colours which are going to be used in the marker:
P = [p₀, p₁, p₂, p₃]
(9) This palette is introduced in the actual marker, specifically in the four corners of the data grid, as can be seen in the drawings.
(10) It is possible to have grids of size N ∈ {5, 7, 9, 11} according to the needs of the application. In order to determine the size of the grid of the current marker, the cell in the central position of the grid is consulted. Note that the four sizes the grid may have (i.e. N ∈ {5, 7, 9, 11}) are odd, so there is always a well-defined central position. This central cell, like the rest of the cells of the grid, can take any of the values of the palette S ∈ {p₀, p₁, p₂, p₃}, corresponding, respectively, with the possible sizes {5, 7, 9, 11} of the grid (see the drawings).
(11) Taking into account that each cell holds 2 bits (i.e. four combinations), a marker of N×N elements contains 2N² − 4N − 6 bits of data and 4N − 4 bits of CRC. Therefore, the CRC length grows in accordance with the message length. Table 1 shows, for each marker size, the message length, the CRC length, and the generator polynomial used for the calculation. In this sense, standard generator polynomials have been used (which have proven effectiveness) in accordance with each length.
(12) TABLE 1

    N×N     Message length   CRC length   CRC polynomial
    5×5     24 bits          16 bits      CRC-16-CDMA2000
    7×7     64 bits          24 bits      CRC-24-Radix-64
    9×9     120 bits         32 bits      CRC-32Q
    11×11   192 bits         40 bits      CRC-40-GSM
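The message and CRC lengths in Table 1 follow directly from the cell layout: 2 bits per cell, minus the four palette corners, the central size cell, and the 2N − 2 CRC cells of the central row and column. A minimal sketch of the arithmetic (illustrative Python, not part of the patent):

```python
# Capacity of an N x N four-colour marker: each cell encodes 2 bits.
# Fixed cells: 4 palette corners, 1 central size cell, and the
# 2N - 2 cells of the central row and column that carry the CRC.
def crc_bits(n):
    return 2 * (2 * n - 2)                 # 4N - 4 bits of CRC

def message_bits(n):
    total = 2 * n * n                      # 2 bits per cell
    fixed = 2 * (4 + 1) + crc_bits(n)      # palette + size cell + CRC
    return total - fixed                   # 2N^2 - 4N - 6 bits of data

for n in (5, 7, 9, 11):
    print(f"{n}x{n}: {message_bits(n)} message bits, {crc_bits(n)} CRC bits")
```

Evaluating these formulas for N ∈ {5, 7, 9, 11} reproduces the four rows of Table 1.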
(13) To observe the effect of the inclusion of colour on the density of the message, and as comparative data, the 5×5 system described in [S. Garrido, R. Muñoz, F. J. Madrid, M. J. Marín, Automatic generation and detection of highly reliable fiducial markers under occlusion, Pattern Recognition, 2014] provides 1024 combinations, and the 6×6 system described in [E. Olson, AprilTag: A robust and flexible visual fiducial system, IEEE International Conference on Robotics and Automation (ICRA), 2011] provides only 500 combinations, whereas the present invention, in its least dense version (5×5), is capable of handling 24 bits of message, that is, 16,777,216 combinations.
(14) The steps of the method for detecting the marker are summarized below. Given an image I(x,y) and a marker size N ∈ {5, 7, 9, 11}, extract the frames M = {m₀, m₁, . . . , mₖ₋₁} of I(x,y) and for each mᵢ ∈ M:
(15) 1) Obtain the grid of coordinates Gᵢ(x,y) of (N+4)×(N+4) points from mᵢ
(16) 2) Obtain the data colours Cᵢ(x,y), the white reference Rᵢᵇ and the black reference Rᵢⁿ from Gᵢ
(17) 3) Normalise: C′ᵢ(x,y) = (Cᵢ(x,y) − Rᵢⁿ)/(Rᵢᵇ − Rᵢⁿ)
(18) 4) Obtain C″ᵢ by orienting C′ᵢ with its reference corner
(19) 5) Obtain the palette Pᵢ = [p₀, p₁, p₂, p₃] from the corners of C″ᵢ
(20) 6) Label Eᵢ(x,y) by the nearest neighbour of C″ᵢ(x,y) in Pᵢ
(21) 7) If Eᵢ([N/2],[N/2]) = (N−5)/2 holds, then:
(22) Extract the message Iᵢᵐ and the CRC Iᵢᶜ from Eᵢ. Calculate the CRC of Iᵢᵐ; if it is consistent with Iᵢᶜ, add Iᵢᵐ to T.
Therefore, given a digital colour image I(x,y) captured by the camera of the device and the desired marker size N ∈ {5, 7, 9, 11}, in order to detect the set of markers T contained in this image, a frame detection algorithm is applied first. As a result of the frame detection algorithm on the image I(x,y), a set of frames M = {m₀, m₁, . . . , mₖ₋₁} contained in the image is obtained. Each frame is defined by four coordinates mᵢ = (c₀, c₁, c₂, c₃) on the space of the image, corresponding with the outer corners of the frame in a clockwise order (see the drawings).
(23) Each detected frame mᵢ represents a possible marker. Taking as a reference the four coordinates of the frame mᵢ = (c₀, c₁, c₂, c₃), bilinear interpolation is performed to obtain a grid Gᵢ(x,y) of (N+4)×(N+4) equidistant coordinates. This grid contains the coordinates of the centres of the N×N cells of information of the marker, 4N+4 coordinates on the black frame and 4N+12 coordinates on the white frame (see the drawings).
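The grid construction just described can be sketched as follows. This is a hypothetical Python helper: the clockwise corner order and the cell-centre offsets (i + 0.5)/(N + 4) are assumptions drawn from this paragraph, not an exact transcription of the patent's implementation:

```python
# Build the (N+4) x (N+4) grid of cell-centre coordinates by bilinear
# interpolation of the four outer frame corners (clockwise from top-left).
def marker_grid(corners, n):
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    size = n + 4
    grid = []
    for row in range(size):
        v = (row + 0.5) / size
        # points on the left (c0 -> c3) and right (c1 -> c2) edges
        lx, ly = x0 + (x3 - x0) * v, y0 + (y3 - y0) * v
        rx, ry = x1 + (x2 - x1) * v, y1 + (y2 - y1) * v
        grid.append([(lx + (rx - lx) * u, ly + (ry - ly) * u)
                     for u in ((col + 0.5) / size for col in range(size))])
    return grid
```

For a strongly perspective-distorted frame a homography would be the more accurate mapping; the bilinear form is the one this paragraph names.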
(24) For each coordinate of Gᵢ, the corresponding colour in the image, I(Gᵢ(x,y)), is obtained (since Gᵢ is in real coordinates and the image is in discrete coordinates, this colour is obtained by a 4-neighbour bilinear interpolation). With the (N+4)×(N+4) colours obtained, Cᵢ is taken as the N×N grid of colours belonging to the data of the marker, and Rᵢᵇ and Rᵢⁿ as the arithmetic means of the colours belonging to the white and black frames, respectively.
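The 4-neighbour bilinear look-up can be illustrated with a small sketch (Python; the row-major image layout and RGB triplets are illustration assumptions, not specified by the patent):

```python
# Sample an image of RGB triplets at a real-valued coordinate by
# blending the four surrounding discrete pixels with bilinear weights.
def sample_bilinear(img, x, y):
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0              # fractional parts of the coordinate
    x1 = min(x0 + 1, len(img[0]) - 1)    # clamp at the image border
    y1 = min(y0 + 1, len(img) - 1)
    return tuple(
        img[y0][x0][c] * (1 - fx) * (1 - fy)
        + img[y0][x1][c] * fx * (1 - fy)
        + img[y1][x0][c] * (1 - fx) * fy
        + img[y1][x1][c] * fx * fy
        for c in range(3))
```

At integer coordinates this reduces to a plain pixel read; halfway between two pixels it returns their average, channel by channel.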
(25) The values Rᵢᵇ and Rᵢⁿ play an essential role in the present invention, since they represent the white and black references of the marker. Therefore, the location frames can be used not only to locate the marker but also to perform a white balance and thereby treat the colour in a robust manner. To that end, C′ᵢ is formed from Cᵢ by normalising each of the cells as follows:
C′ᵢ(x,y) = (Cᵢ(x,y) − Rᵢⁿ)/(Rᵢᵇ − Rᵢⁿ).
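In code, this white balance is a per-channel affine map that sends the black reference to 0 and the white reference to 1 (a minimal Python sketch, assuming colours are given as RGB triplets):

```python
# Normalise a cell colour against the black (ref_black) and white
# (ref_white) references estimated from the location frames:
# black maps to 0.0 and white maps to 1.0 on every channel.
def normalise(colour, ref_black, ref_white):
    return tuple((c - b) / (w - b)
                 for c, b, w in zip(colour, ref_black, ref_white))
```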
(26) Although C′ᵢ contains a normalised reference of the colour of the N×N cells of the marker, the orientation is still unknown, since the frame detection algorithm does not provide this information (the frame itself contains no orientation information). To resolve this, the corners of the marker are taken into account:
{C′ᵢ(0,0), C′ᵢ(N−1,0), C′ᵢ(N−1,N−1), C′ᵢ(0,N−1)}
(27) They contain the palette in a clockwise order, with the darkest element serving as reference in the last position. Therefore, from the 4 possible orientations of C′ᵢ, the one leaving the darkest element (the lowest luminance) of the four corners in position (0, N−1) is selected, obtaining C″ᵢ.
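The orientation step amounts to trying the four rotations and keeping the one that leaves the lowest-luminance corner at (0, N−1). A Python sketch follows; the Rec. 601 luma formula used for luminance is an assumption, since the patent does not state how luminance is computed:

```python
# Rotate the N x N colour grid (row-major, grid[row][col]) until the
# darkest of the four corners sits at the bottom-left cell, which is
# position (0, N-1) in the patent's (x, y) indexing.
def orient(grid):
    def luminance(c):
        r, g, b = c
        return 0.299 * r + 0.587 * g + 0.114 * b   # Rec. 601 luma (assumed)
    def rot90(g):                                   # one clockwise rotation
        return [list(row) for row in zip(*g[::-1])]
    for _ in range(4):
        n = len(grid)
        corners = [grid[0][0], grid[0][n - 1],
                   grid[n - 1][n - 1], grid[n - 1][0]]
        if luminance(grid[n - 1][0]) == min(luminance(c) for c in corners):
            return grid
        grid = rot90(grid)
    return grid
```

At most three rotations are ever needed, so the loop always terminates with the darkest corner in place.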
(28) Using the ordered colour samples, the colour palette of the four corners is obtained:
Pᵢ = [p₀, p₁, p₂, p₃] = [C″ᵢ(0,0), C″ᵢ(N−1,0), C″ᵢ(N−1,N−1), C″ᵢ(0,N−1)]
(29) The labelling Eᵢ(x,y) of the marker is obtained from the palette. It is a matrix which indicates, for each cell, the index of the palette value to which it corresponds. To that end, a nearest neighbour classification is performed (assigning the index of the palette entry with the colour value nearest to the colour of the cell):
Eᵢ(x,y) = argmin k∈{0,1,2,3} ‖C″ᵢ(x,y) − pₖ‖.
(30) To calculate the Euclidean distance ‖·‖ between two colours, the CIE 1976 L*a*b* colour space is recommended, since this space is isotropic (unlike other spaces such as RGB), which justifies the use of this distance function.
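The nearest-neighbour labelling can be sketched as follows (Python; the sketch applies the Euclidean distance to whatever triplets it is given, so the RGB-to-L*a*b* conversion recommended above is left to the caller):

```python
# Assign a cell the index of the nearest palette colour, using the
# squared Euclidean distance (monotonic in the distance, so no square
# root is needed for the argmin).
def label(cell_colour, palette):
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(palette)),
               key=lambda k: dist2(cell_colour, palette[k]))
```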
(31) Once the labelling of the cells has been obtained, it is necessary to verify if the marker which is being analysed is from the desired family of markers. To that end, it is necessary to verify that the size specified in the central cell coincides with the desired marker size, that is:
Eᵢ([N/2],[N/2]) = (N−5)/2.
(32) Otherwise, the current frame is ruled out from the possible markers.
(33) At this point it is time to extract the information from the marker, that is, the message Iᵢᵐ and the cyclic redundancy check Iᵢᶜ. To obtain Iᵢᶜ, the cells of the central column and row are taken (except the central cell, which determines the type of marker), and a single number of 4N − 4 bits is formed with the labels of the cells in binary (labels [0,1,2,3] correspond with binary codes [00,01,10,11]), following the reading order on the matrix (from left to right and from top to bottom).
(34) Likewise, Iᵢᵐ is obtained by composing the cells of the message (those which do not correspond with the palette, the CRC, or the marker type), forming a binary number of 2N² − 4N − 6 bits.
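Paragraphs (33) and (34) can be condensed into one pass over the label grid. A Python sketch (the left-to-right, top-to-bottom reading order is taken from paragraph (33); representing the results as bit strings is an illustration choice):

```python
# Split the N x N label grid E (values 0..3, indexed E[y][x]) into the
# message and CRC bit strings: the corners hold the palette, the central
# cell the size, the rest of the central row/column the CRC, and all
# remaining cells the message. Labels map to 2-bit codes 00..11.
def compose(E):
    n = len(E)
    mid = n // 2
    corners = {(0, 0), (0, n - 1), (n - 1, 0), (n - 1, n - 1)}
    msg, crc = "", ""
    for y in range(n):
        for x in range(n):
            if (x, y) == (mid, mid) or (x, y) in corners:
                continue                       # size cell / palette corners
            elif x == mid or y == mid:
                crc += format(E[y][x], "02b")  # central row and column
            else:
                msg += format(E[y][x], "02b")
    return msg, crc
```

For N = 5 this yields exactly 24 message bits and 16 CRC bits, matching Table 1.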
(35) To finish the detection, the integrity of the message is verified by calculating the cyclic redundancy check of Iᵢᵐ with the suitable polynomial (see Table 1) and comparing it with the read CRC Iᵢᶜ. If both codes coincide, the message is considered valid and Iᵢᵐ is added to the set of markers T detected in the current image I(x,y).
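The final check can be sketched with a generic MSB-first CRC over the message bit string. The default parameters shown for the 5×5 case (width 16, polynomial 0xC867, initial value 0xFFFF, values commonly quoted for CRC-16-CDMA2000) are assumptions; the patent itself only names the standard polynomials in Table 1:

```python
# Generic MSB-first CRC over a string of '0'/'1' characters.
# Default parameters are the commonly quoted CRC-16-CDMA2000 values
# (poly 0xC867, init 0xFFFF) -- an assumption, not taken from the patent.
def crc(bits, width=16, poly=0xC867, init=0xFFFF):
    mask = (1 << width) - 1
    reg = init
    for b in bits:
        feedback = ((reg >> (width - 1)) & 1) ^ int(b)
        reg = ((reg << 1) & mask) ^ (poly if feedback else 0)
    return reg

# A marker is accepted when the recomputed CRC of the message bits
# matches the CRC read from the central row and column of the code.
def verify(message_bits, crc_bits_read, width=16, poly=0xC867):
    return crc(message_bits, width, poly) == int(crc_bits_read, 2)
```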
Example 1. Results of the Method
(36) To assure the correct operation of the markers and the detection method proposed in the present invention, a functional prototype has been developed which consists of two applications: a marker generator and a marker detector. Both applications have been developed in C++ in the high-performance cross-platform programming environment Qt SDK (http://www.qt-project.org).
(37) The generator manages a marker database with the information of each marker (marker code, text associated with the marker, real print size, etc.) and allows creating, removing, searching for, and printing these markers with the selected physical size.
(38) Furthermore, the detector is in charge of detecting the markers on the images obtained from the camera of the device. In this case, it has been developed both for Android and iOS devices. When a marker is detected, the system queries its code in the database and vocalises (using Text To Speech) its content. Furthermore, it also vocalises the real distance at which this marker is located, since the print size of each marker is stored in the database and this information together with the camera information (focal distance and aperture) is used to recover its real scale.
(39) As for performance, the system is capable of processing a mean of 18.6 fps. This figure varies depending on the device and camera resolution. In this case, a Samsung Galaxy S6 (SM-G920F) equipped with an octa-core Cortex-A57 processor at 2.1 GHz and 3 GB of RAM, with a camera resolution of 1280×720, has been used, and the frame detection has been performed with version 2.0 of the library described in [S. Garrido, R. Muñoz, F. J. Madrid, M. J. Marín, Automatic generation and detection of highly reliable fiducial markers under occlusion, Pattern Recognition, 2014], which is available at the following link: http://www.uco.es/investiga/grupos/ava/node/26.
(40) Taking into account that the camera of the device provides images at 30 fps, by eliminating camera access times the system is capable of processing each frame in a mean time of 20.43 milliseconds in the above-mentioned device.
(41) Regarding the detection distance, it depends on the camera resolution, lighting conditions, and marker size and type. Under favourable lighting conditions (daylight) and with the aforementioned resolution (1280×720), a 5×5 marker printed at a size of 20×20 cm (standard A4 print size) is detected at a maximum distance of 8.12 meters.
(42) Markers of this type may be applied to contexts in which both the range and the code density play an important role (see the drawings).