VISION SYSTEM FOR OBJECT DETECTION, RECOGNITION, CLASSIFICATION AND TRACKING AND THE METHOD THEREOF
20230145405 · 2023-05-11
Inventors
- Palle Geltzer Dinesen (Dyssegard, DK)
- Boris Stankovic (Copenhagen NV, DK)
- Per Eld Ibsen (Copenhagen K, DK)
- Mohammad Tavakoli (Copenhagen S, DK)
- Christoffer Gøthgen (Copenhagen SV, DK)
CPC classification
- G06T7/80 (Physics)
- G06V20/52 (Physics)
- G06V10/462 (Physics)
- G08B13/19602 (Physics)
- G06F18/2148 (Physics)
International classification
- G06F18/214 (Physics)
- G06T7/80 (Physics)
- G06V10/46 (Physics)
- G06V10/94 (Physics)
Abstract
The present invention relates to a method (100) for object detection (140), recognition, classification and tracking using a distributed networked architecture comprising one or more sensor units (20), in which image acquisition and initial feature extraction are performed, and a gateway processor (30) for further data processing. The present invention also relates to a vision system (10) for object detection (140) in which the method may be implemented, to the devices of the vision system (10), and to the algorithms implemented in the vision system (10) for executing the method acts.
Claims
1-17. (canceled)
18. A method of object detection, identification and localization, the method including acts of: acquiring an image from a camera; generating a pre-processed image by performing image pre-processing of the said acquired image; detecting and identifying an object in the pre-processed image using a computer vision detection algorithm; localizing the object; wherein localizing the object includes approximating a distance of the detected object to the camera.
19. The method of claim 18, wherein the acts are performed on a single image.
20. The method of claim 18, further comprising an act of: extracting a feature on the detected and identified object using a computer vision data feature extraction algorithm (DFE algorithm) and generating a reduced dataset comprising extracted data features.
21. The method of claim 18, wherein the act of approximating the distance is performed by: acquiring a pixel object distance of the detected object; and comparing the pixel object distance with tabulated physical object height(s) and tabulated camera parameter(s).
22. The method of claim 20, wherein the act of approximating the distance is performed by: acquiring a pixel object distance of the detected object from the reduced dataset; and comparing the pixel object distance with a tabulated physical object height and a tabulated camera parameter.
23. The method of claim 18, being performed in a sequence and further comprising an act of motion tracking the localized object.
24. The method of claim 20, further comprising an act of approximating an object-camera angle between a feature point in the feature and a center point in a feature plane that is parallel to an image plane of the camera.
25. The method of claim 24, wherein the acts are performed on a single image.
26. The method of claim 24, further comprising an act of combining the approximation of the distance and the approximation of angle to improve the localization of the object.
27. The method of claim 18 performed by acquiring images from multiple cameras.
28. The method of claim 24, wherein the angle approximation may be used on one object, using two sensors with overlapping fields of view and triangulation, for a more precise object location.
29. The method of claim 27 further including acts of approximating a first object-camera-distance to a detected object in a first preprocessed image, approximating a second object-camera-distance to a detected object in a second pre-processed image, where the first pre-processed image captures a first scene, and the second preprocessed image captures a second scene which completely or partly overlaps the first scene, and using the first and second object-camera-distances to validate that the detected object in the first and second pre-processed image is the same object.
30. The method of claim 27 further including an act of estimating an orientation of the object.
31. The method of claim 27, further including an act of self-calibration based on at least two approximated distances.
32. The method of claim 24, further including an act of self-calibration based on at least two approximated angles.
33. The method of claim 27, further including an act of self-calibration based on at least two approximated angles.
34. The method of claim 27, further including an act of time-synchronization of acquiring a plurality of images.
35. The method of claim 27, further including an act of spatial coordination of acquiring cameras by deducing relative geometries of the cameras from their pixel correspondence.
36. A sensor unit configured to perform the acts of claim 18.
37. The sensor unit according to claim 36 further comprising sensor communication means arranged for transmitting detected, identified and localized object data.
38. A vision system comprising one or more sensor units according to claim 36.
Description
DESCRIPTION OF THE DRAWING
TABLE-US-00001 Detailed Description of the Invention

No   Item
10   Vision system
20   Sensor unit
22   Sensor communication means
24   Camera
26   Pre-processor means
28   Camera parameter
30   Gateway processor
32   Gateway communication means
40   Management server
42   Object data
50   Computer program product
52   Computer-readable medium
60   Acquired image
62   Full-frame image
64   Sub-frame image
70   Pre-processed image
80   Reduced dataset
90   Detected object
92   Pixel object height
94   Physical object height
96   Object-camera distance
97   Object-camera angle
100  Method
110  Acquiring
112  Performing
114  Transmitting
116  Receiving
118  Obtaining
120  Generating
122  Feeding
124  Comparing
126  Approximating
130  Pre-processing
140  Object detection
142  Object feature
150  Object recognition
160  Object tracking
180  Object classification
190  Data feature extraction (DFE)
192  Extracted data features
210  Computer vision detection algorithm
220  Computer vision DFE algorithm
240  Machine learning algorithm
242  Machine learning model
[0137] The pre-processed image 70 is used for performing 112 object detection 140. The object detection 140 is performed using a computer vision detection algorithm 210. In another method act, data feature extraction 190 is performed 112 and a reduced dataset 80 is generated. The data feature extraction 190 is performed using a computer vision DFE algorithm 220. The pre-processed image 70, information from the performed object detection 140, and object features 142 are used in the computer vision DFE algorithm 220 to generate the reduced dataset 80 comprising extracted data features 192. The reduced dataset 80 is transmitted 114 from the sensor unit 20 to the gateway processor 30 using the sensor communication means 22. Optionally, object features 142 may also be transmitted to the gateway processor 30, either as separate data or comprised in the reduced dataset 80. In the gateway processor 30, the reduced dataset 80 is received 116 using the gateway communication means 32.
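The flow of paragraph [0137] — detect objects, run the DFE algorithm on each detection, and package only the extracted data features for transmission — can be sketched in Python. This is an illustrative sketch only; `DetectedObject`, `extract_reduced_dataset`, and the `dfe` callable are hypothetical names, not part of the disclosed system.

```python
from dataclasses import dataclass, field

@dataclass
class DetectedObject:
    """One detection from the computer vision detection algorithm (210)."""
    label: str
    bbox: tuple                      # (x, y, w, h) in pixels
    features: dict = field(default_factory=dict)

def extract_reduced_dataset(pre_processed_image, detections, dfe):
    """Run the DFE algorithm (220) on each detection and keep only the
    extracted data features (192), not the image itself, so the result
    is small enough to transmit to the gateway processor (30)."""
    reduced = []
    for det in detections:
        det.features = dfe(pre_processed_image, det.bbox)
        reduced.append({"label": det.label,
                        "bbox": det.bbox,
                        "features": det.features})
    return reduced
```

The key design point mirrored here is that the full pre-processed image stays on the sensor unit; only the reduced dataset crosses the network.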
[0139] The gateway processor 30 and the sensor unit(s) 20 may each comprise a computer program product 50 comprising instructions, which, when executed by a computer, may cause the computer to carry out one or more of the illustrated method acts.
[0140] The gateway processor 30 and the sensor unit(s) 20 may each comprise a computer-readable medium 52 comprising instructions which, when executed by a computer, may cause the computer to carry out one or more of the illustrated method acts.
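Claims 21 and 22 describe approximating the object-camera distance (96) by comparing a pixel object height (92) with a tabulated physical object height (94) and a tabulated camera parameter (28). The claims do not fix a specific model; a minimal sketch, assuming a pinhole camera with the focal length tabulated in pixels, could be:

```python
def approximate_distance(pixel_object_height, physical_object_height,
                         focal_length_px):
    """Pinhole-camera similar triangles: h_px / f = H / Z, so the
    object-camera distance Z = f * H / h_px.

    pixel_object_height    -- detected object height in pixels (92)
    physical_object_height -- tabulated real-world height, e.g. metres (94)
    focal_length_px        -- tabulated camera parameter (28)
    """
    return focal_length_px * physical_object_height / pixel_object_height
```

For example, a 1.7 m person imaged 100 px tall by a camera with a 1000 px focal length would be approximated at 17 m; the same person at 50 px would be at 34 m.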
[0142] One embodiment of the method acts of image pre-processing 130 is illustrated in
[0146] One embodiment of object tracking is illustrated in
[0147] The object tracking may thus be performed by tracking object features 142. Only a minor degree of analysis is performed on the subsequent sub-frame images: the object features are tracked, and the sub-frame image is not analysed for new objects. For the subsequent full-frame images, the other sub-frame images may be successively analysed.
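The lightweight tracking in paragraph [0147] — updating known feature positions in a sub-frame without re-running full detection — can be sketched as a nearest-neighbour match between tracked feature points and points found in the new sub-frame. The function name and the greedy matching strategy are illustrative assumptions, not the disclosed algorithm.

```python
def track_features(prev_tracks, current_points, max_dist=20.0):
    """Match each tracked feature point (142) to the nearest point
    detected in the new sub-frame image (64); tracks with no match
    within max_dist pixels keep their previous position."""
    updated = {}
    taken = set()
    for track_id, (px, py) in prev_tracks.items():
        best, best_d = None, max_dist
        for i, (cx, cy) in enumerate(current_points):
            if i in taken:
                continue  # each new point may extend only one track
            d = ((cx - px) ** 2 + (cy - py) ** 2) ** 0.5
            if d < best_d:
                best, best_d = i, d
        if best is not None:
            taken.add(best)
            updated[track_id] = current_points[best]
        else:
            updated[track_id] = (px, py)   # no match: hold position
    return updated
```

Because only feature points are compared, no new-object detection pass is needed on the sub-frame, which is the efficiency the paragraph describes.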
[0148] Using object features for tracking may enable further uses of the method and the vision system. The object features may reveal the mood of a person, for example by estimating the distance from the eyes to the mouth corners, a change in eye size, or a change in the position of the shoulders, to mention a few features which may be used.
[0149] One embodiment of the use of the vision system 10 is illustrated in
[0150] This embodiment illustrates the use of multiple sensor units. One or more persons may be imaged by multiple sensor units, each imaging a scene different from the scenes of the other sensor units. Person x4 is illustrated as imaged by five sensor units: where x4 is placed facing the table, he is imaged from the back, from the side, frontally and semi-frontally. This embodiment may illustrate the item referred to in the description of the invention as mitigation of doublets.
[0151] This illustrated embodiment may have the effect of mitigating the appearance of doublets of objects when the reduced datasets are further analysed after being transmitted from the sensor units, thereby increasing the quality and the robustness of the vision system 10.
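Claim 29 validates that detections in two overlapping scenes are the same object by using the two object-camera distances. One possible consistency check, assuming the relative camera positions are known (the function name and tolerance are hypothetical), is the triangle inequality: circles of the two measured radii around the two cameras must be able to intersect at a common object position.

```python
def is_same_object(cam1_pos, dist1, cam2_pos, dist2, tolerance=0.5):
    """Return True if a first object-camera-distance (dist1) and a
    second object-camera-distance (dist2), measured from cameras at
    known 2-D positions, are geometrically consistent with a single
    object, i.e. the two range circles can intersect."""
    baseline = ((cam1_pos[0] - cam2_pos[0]) ** 2 +
                (cam1_pos[1] - cam2_pos[1]) ** 2) ** 0.5
    # circles intersect iff |d1 - d2| <= baseline <= d1 + d2,
    # relaxed by a tolerance to absorb measurement error
    return (abs(dist1 - dist2) <= baseline + tolerance and
            baseline <= dist1 + dist2 + tolerance)
```

Detections failing this check cannot be the same physical object, so one of the two candidate doublets can be kept and the inconsistent pairing rejected.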
[0152] The embodiment in
[0153] Furthermore,
[0154] Another embodiment of the use of the vision system 10 is illustrated in
[0155] The room in