GPU BASED EMBEDDED VISION SOLUTION

Abstract

The present invention relates to a system and method that detects objects, assigns a tracker to each object, and maps each detection to its corresponding geo-coordinates. Based on the number of detections, the system assigns new trackers and counts the identified objects in a computationally efficient manner. The present system requires just one detection and its location to create a tracker accurately. Also, if an object is missing at any given location, the present system identifies the geo-coordinates of that location and shows the image along with its details.

Claims

1. A system (1000) for identifying and analyzing roadway assets, comprising: a video capturing unit (300) configured to record a video; a comparison unit (200) configured to detect fixed assets by comparing data from a master video with the video recorded by the video capturing unit (300); an identification unit (300) configured to identify random assets in the captured video; and a GPS unit (400) configured to provide location co-ordinates for the fixed assets and the random assets, such that the system (1000) uses this information to identify and track the assets.

2. The system (1000), as claimed in claim 1, wherein the master video is obtained from one or more users.

3. The system (1000), as claimed in claim 1, wherein the comparison unit (200) extracts data from the master video.

4. The system (1000), as claimed in claim 1, wherein the video is captured by the video capturing unit (300) while patrolling to monitor roadways.

5. The system (1000), as claimed in claim 1, wherein the GPS unit (400) can be based on any positioning system.

6. The system (1000), as claimed in claim 1, wherein the system (1000) monitors the roadway assets by detecting the fixed assets and identifying the random assets.

7. The system (1000), as claimed in claim 1, further comprising a tracking module (500) that uses a vehicle speed and object coordinates, i.e., whether the object is on the left side or the right side of the roadway, to predict the object's next location in consecutive frames, wherein the vehicle speed and object coordinates are determined by identifying the pattern of movement of bounding boxes along their direction, and this information is used to create a custom tracker model.

8. The system (1000), as claimed in claim 1, wherein the tracker module (500) is configured to generate a count and to identify missing objects and their locations based on the count.

9. The system (1000), as claimed in claim 1, further comprising a mapping module (500) configured to compute the distance between a reference point in the master video and the identified assets captured in the video, and to cross-map each identified asset to its original database coordinates.

10. The system (1000), as claimed in claim 1, further comprising a localizing module (600) configured to map the identified assets to corresponding chainage blocks for accurate location identification.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] In the figures, similar components and/or features may have the same reference label. Further, various components of the same type may be distinguished by following the reference label with a second label that distinguishes among the similar components. If only the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label irrespective of the second reference label.

[0015] FIG. 1 illustrates a block diagram of a system, in accordance with one exemplary embodiment of the present disclosure.

[0016] FIG. 2 illustrates a block diagram of the comparison unit of the system, in accordance with one exemplary embodiment of the present disclosure.

[0017] FIG. 3 illustrates a flow chart of the custom-developed tracker module of the system, in accordance with one exemplary embodiment of the present disclosure.

[0018] FIG. 4 illustrates a functional aspect of the mapping module, in accordance with one exemplary embodiment of the present disclosure.

DETAILED DESCRIPTION OF THE EMBODIMENTS

[0019] Before the present system and method for identifying and analyzing assets on roads, streets, pathways, and highways are described, it is to be understood that this disclosure is not limited to the particular system and method described, since these may vary within the scope indicated in the specification. Throughout the specification, the terms roads, streets, and highways may be referred to collectively as roadways or individually, and the disclosure refers generically to a system and method for roadway asset management that detects, tracks, and monitors the assets.

[0020] It is also to be understood that the terminology used in the description is for the purpose of describing the particular versions or embodiments only, and is not intended to limit the scope of the present invention, which will be limited only by the appended claims. The words “comprising,” “having,” “containing,” and “including,” and other forms thereof, are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. The disclosed embodiments are merely exemplary methods of the invention, which may be embodied in various forms.

[0021] In accordance with one general embodiment of the present disclosure, a system for detecting objects, assigning a tracker to each object, and mapping each detection to its corresponding geo-coordinates is disclosed. Based on the number of detections, the system assigns new trackers and counts the identified objects in a computationally efficient manner. The present system requires just one detection and its location to create a tracker accurately. Also, if an object is missing at any given location, the system identifies the geo-coordinates and shows the image along with its details. Particularly, in one significant embodiment, a GPU-based embedded computer vision technology for roadway asset management, which detects, tracks, and monitors the assets, is disclosed.

[0022] In one example embodiment, the list of assets includes:
[0023] 1. Kilometer Stone
[0024] 2. Delineator
[0025] 3. Thermoplastic Paint
[0026] 4. Kerb
[0027] 5. Metal beam crash barrier
[0028] 6. All variations of High Mast Light
[0029] 7. All variations of Solar Blinker
[0030] 8. Hazard Board
[0031] 9. Hectometer Stone
[0032] 10. All variations of Signboards
[0033] 11. Gantry Board
[0034] 12. VMS Gantry
[0035] 13. All variations of Guard Rails
[0036] 14. All variations of Street Lights
[0037] 15. All variations of Streetlight DB Box
[0038] 16. All variations of ATCC RO Panel
[0039] 17. All variations of Emergency Call Box (SOS)
[0040] 18. All variations of Litter Bins along the project
[0041] 19. Crash Barriers
[0042] 20. Expansion joints
[0043] 21. New Jersey barrier
[0044] 22. Median Plantation
[0045] 23. All types of Encroachment (Unauthorized Occupants)
[0046] 24. Potholes
[0047] 25. Cracks
[0048] 26. All variations of Mud Accumulation
[0049]     a. Normal Sand
[0050]     b. Red Sand
[0051]     c. Cement
[0052] 27. Water Stagnation
[0053] 28. Drainage
[0054]     a. Lined drain
[0055]     b. Divider drain
[0056] 29. Over speeding of Route Patrol Vehicle
[0057] 30. Wall Posters
[0058] 31. Shop Boards/Hoardings
[0059] 32. Vegetation growth on structures
[0060] 33. All variations of CCTV Camera
[0061] 34. Night Video: Working Status of Street Light
[0062] 35. Night Video: Working Status of Solar Blinker
[0063] 36. Night Video: Working Status of VMS

[0064] Referring now to FIG. 1, a GPU-embedded system and method for detecting missing and damaged assets/use cases along the roadway is illustrated. These assets include streetlights, signboards, kilometer stones, delineators, cracks, potholes, etc. In one particular embodiment, the system 1000 comprises a comparison unit 100 and an identification unit 200 for identifying and tracking the assets of the roadway. Here, the comparison unit 100 is configured to compare incoming video against a master video that the user has uploaded. Data extracted from the master video is compared with the everyday shift/patrol videos. The comparison unit 100 works best for managing fixed assets such as signboards, kilometer stones, etc.

[0065] In another embodiment, the identification unit 200 does not require the master video. Thus, the identification unit 200 works best for linear/random assets/use cases such as potholes, cracks, missing metal beam crash barriers (MBCB), damaged kerbs, etc. In one specific embodiment, the system also works with night videos. These can be used to determine the working status of assets such as street lights and solar blinkers, and of reflective assets such as hazard markers, caution boards, etc.

[0066] Further, the system 1000 includes a video capturing unit 300 to record the daily patrolling videos. In an aspect, when a user uploads a master video, the comparison unit 100 compares the data extracted from the master video, which provides information on fixed assets such as signboards, kilometer stones, etc. on the roadway, with the everyday shift/patrol videos from the video capturing unit 300. In another aspect, the present invention includes a GPS unit 400, which can be any positioning system known to a person skilled in the art. The GPS unit 400 associates the fixed assets detected by the comparison unit 100 and the random assets identified by the identification unit 200 with location co-ordinates. Thus, the system 1000 of the present invention is used for monitoring the assets on the roadways.

[0067] In accordance with one preferred embodiment, the present invention provides a combination of the YOLACT and YoloV4 algorithms, customized to run on a GPU as a multi-threaded process, to detect assets on the roadways. In addition to the YoloV4 model, the present invention includes modules such as a tracking module 450, a mapping module 500, a localizing module 600, etc. The present invention also combines other proven segmentation models, such as YOLACT, to create a unique solution that addresses a wider range of use cases. The system 1000 of the present invention is customized such that the entire framework works specifically for roadways.

[0068] Following from the above, the tracking module 450 and the mapping module 500 (shown in FIG. 4) are discussed in one working aspect of the present disclosure. Because GPS readings for the same location are not consistent from day to day, the coordinates assigned to an asset differ in every recorded video. With GPS coordinates varying for the same location, the mapping module 500 is configured to map the asset accurately. Keeping a reference point in the master data, the distance between the reference point and the identified asset is computed in every video. By comparing this distance value and the details of the asset identified at a location, the asset is cross-mapped to its original database coordinates. This data further validates whether the asset is missing in the current video.
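
The cross-mapping step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: it assumes the distance from the reference point is a great-circle (haversine) distance, and the names `haversine_m` and `cross_map`, the dictionary keys, and the 15 m tolerance are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, assumed for this sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cross_map(reference, master_assets, detections, tolerance_m=15.0):
    """Match each master asset to a detection of the same class whose distance
    from the reference point agrees within tolerance_m (hypothetical value).

    Returns (mapped, missing): mapped holds (master id, detection index) pairs,
    missing holds the ids of master assets with no matching detection.
    """
    ref_lat, ref_lon = reference
    used = set()
    mapped, missing = [], []
    for asset in master_assets:
        d_master = haversine_m(ref_lat, ref_lon, asset["lat"], asset["lon"])
        best, best_err = None, tolerance_m
        for i, det in enumerate(detections):
            if i in used or det["cls"] != asset["cls"]:
                continue
            d_det = haversine_m(ref_lat, ref_lon, det["lat"], det["lon"])
            err = abs(d_det - d_master)
            if err <= best_err:  # keep the closest distance agreement
                best, best_err = i, err
        if best is None:
            missing.append(asset["id"])
        else:
            used.add(best)
            mapped.append((asset["id"], best))
    return mapped, missing
```

Because the comparison uses the distance from a shared reference point rather than raw coordinates, a detection whose GPS fix has drifted by a few metres can still be matched to its master record, and any master record left unmatched is reported as missing.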

[0069] Next, the localizing module 600 is discussed in sufficient detail. Every road network has chainage blocks for easy referencing of locations and their assets. The starting and ending coordinates of every chainage block are required to map the assets belonging to its region. The process of identifying the required starting and ending chainage coordinates is now automated. This is achieved by obtaining distance data from the Google Maps Distance Matrix API and comparing it with the GPS coordinates generated from the master video, to identify the boundary coordinates of every chainage and divide the road into the respective chainage blocks. Even though original database coordinates are mapped to the missing assets, users will not be able to readily identify or locate them from coordinates alone. With the localizing module 600, every identified or missing asset can be accurately mapped to its respective chainage block, thereby making the assets easier to locate.
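
The chainage-block assignment can be illustrated with a short sketch. It assumes that the along-road distance of each GPS sample in the master video has already been obtained (for example, from a distance service), so no external API is called here. The function names, the tuple layout, and the 1000 m block length are hypothetical choices for illustration.

```python
from bisect import bisect_right


def chainage_boundaries(track, block_len_m=1000.0):
    """track: list of (cumulative_distance_m, lat, lon), sorted by distance.

    Returns one (start_distance_m, lat, lon) entry per chainage block, using
    the first track sample at or beyond each multiple of block_len_m as the
    boundary coordinate. If samples are sparse, one sample may serve as the
    boundary for more than one block.
    """
    boundaries = []
    next_mark = 0.0
    for dist, lat, lon in track:
        while dist >= next_mark:
            boundaries.append((next_mark, lat, lon))
            next_mark += block_len_m
    return boundaries


def block_index(asset_distance_m, boundaries):
    """Index of the chainage block containing the given along-road distance."""
    starts = [b[0] for b in boundaries]
    return max(bisect_right(starts, asset_distance_m) - 1, 0)


def block_label(index, block_len_m=1000.0):
    """Human-readable label for a chainage block, in kilometres."""
    start_km = index * block_len_m / 1000.0
    end_km = (index + 1) * block_len_m / 1000.0
    return f"{start_km:.3f}-{end_km:.3f} km"
```

A binary search over the block start distances keeps the lookup cheap even for long stretches, and the label gives users a chainage range rather than raw coordinates.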

[0070] In one example embodiment, more than 35 use cases are to be identified and classified, and many of them look very similar. The default YOLO model assigns equal priority to every class, but roadway assets are of various sizes and shapes. Thus, the loss functions are modified according to the dataset and the anchor boxes are tuned to obtain results that are more than 95% accurate (based on trials). The hyperparameters are then finalized after several rounds of design of experiments.

[0071] Having a large number of trackers also increases processing time. A conventional tracker requires at least 3 frames to assign a tracker and track its path. Any occlusion, which is very common on roadways, leads to multiple counts, which affects the whole solution. In the present disclosure of tracking roadway assets, however, various occlusions are anticipated, and an asset may be visible in just one frame. The present invention therefore develops a tracking algorithm that not only works on a single frame but also strikes the right balance between performance and processing time. Conventional trackers use multiple frames to assign a tracker to an asset. They also use the intersection over union (IOU) of bounding boxes to triangulate the asset and identify the speed and direction of its movement, and they assign a new tracker if the asset is missed over multiple frames.
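
For reference, the IOU measure that conventional trackers rely on can be computed as in the sketch below. This is the standard definition of intersection over union for axis-aligned boxes, shown only to make the contrast with the present tracker concrete; the function name and the `(x1, y1, x2, y2)` box layout are assumptions of this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2),
    with x1 < x2 and y1 < y2 in pixel coordinates."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

When an asset is seen in only one frame, or reappears after an occlusion far from its last box, the overlap is zero and an IOU-based tracker has nothing to associate, which is the limitation described above.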

[0072] Accordingly, the video frame is divided into left and right sides, as there will not be any assets in the middle of the road. The tracker of the present disclosure works on the logic that an asset present on the left side always moves toward the left, and vice versa. Instead of using the IOU of bounding boxes, the centre point of each bounding box is used, and the difference in pixel movement is compared to triangulate an asset. Further, the speed data is extracted from the GPS metadata to anticipate the tracker's next location, instead of calculating the speed of the trackers. This way, the present system does not require multiple frames to identify the speed and direction of a tracker, as opposed to conventional trackers.
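
The side-based prediction described above can be sketched as follows. The sketch assumes a simple linear relation between vehicle speed and horizontal pixel shift, governed by a hypothetical calibration constant `px_per_metre`; the source does not specify how speed is converted to pixel movement, so this relation, the `Box` type, and the function names are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def centre(self):
        """Centre point of the bounding box in pixels."""
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)


def side_of(box, frame_width):
    """Classify a detection as lying on the left or right half of the frame."""
    return "left" if box.centre[0] < frame_width / 2.0 else "right"


def predict_next_centre(box, frame_width, speed_kmph, fps, px_per_metre):
    """Predict the centre in the next frame from a single detection.

    Direction comes from the side of the frame (left assets drift left,
    right assets drift right); magnitude comes from the vehicle speed in the
    GPS metadata. px_per_metre is a hypothetical calibration constant.
    """
    cx, cy = box.centre
    metres_per_frame = (speed_kmph / 3.6) / fps
    shift_px = metres_per_frame * px_per_metre
    direction = -1.0 if side_of(box, frame_width) == "left" else 1.0
    return (cx + direction * shift_px, cy)
```

Because both direction and speed are available from the first detection, no history of earlier frames is needed before a tracker can be created.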

[0073] The present invention consumes considerably less computational power than existing tracker algorithms. Also, because many assets are located close together and look similar, it is difficult to use existing trackers, such as those based on a Kalman filter or IOU, for tracking the objects.

[0074] In one exemplary embodiment of the present disclosure, as the solution requires mapping assets to their locations, the present invention uses the geo-coordinates and speed recorded by the dash camera on the video to locate and mark each asset's location. On the whole, the present invention runs multiple processes simultaneously using a multithreading approach: one for identifying and classifying objects (GPU process), one for tracking the objects (for counting), etc.
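
One way to picture the multithreaded arrangement is a producer-consumer pipeline in which a detection thread feeds a tracking thread through a queue. The sketch below is an assumption about how such a pipeline could be organised, not the disclosed design: `run_pipeline` is a hypothetical name, and `detect` and `track` are stand-in callables supplied by the caller (the real detector would run on the GPU).

```python
import queue
import threading


def run_pipeline(frames, detect, track):
    """Run detect() and track() in separate threads joined by a bounded queue.

    detect(frame) returns the detections for a frame; track(frame, detections)
    returns a per-frame tracking result. Both are stand-ins for this sketch.
    """
    detections_q = queue.Queue(maxsize=8)  # bounded so detection cannot run far ahead
    results = []
    stop = object()  # sentinel marking the end of the stream

    def detector():
        for frame in frames:
            detections_q.put((frame, detect(frame)))
        detections_q.put(stop)

    def tracker():
        while True:
            item = detections_q.get()
            if item is stop:
                break
            frame, dets = item
            results.append(track(frame, dets))

    threads = [threading.Thread(target=detector), threading.Thread(target=tracker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With a single producer and a single consumer, frames are tracked in the order they were detected, so counts derived from the results remain deterministic.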

[0075] FIG. 2 illustrates a brief overview of the system and its various modules, in accordance with one preferred embodiment. Further, as will be discussed with reference to FIG. 3, the system follows the steps below to identify and track objects and to detect missing objects. Detections from the YoloV4 model are handed over to the custom-developed tracker module 500. The GPS and speed data of the vehicle are collected. In general, trackers determine the speed and direction of an object over consecutive frames and then assign a tracker to it. In the present disclosure, however, the system collects frames from roadways where receiving consecutive frames with detections is not always possible. This could be due to many factors, such as vehicle occlusions, lighting conditions, smaller objects, etc. Traditional trackers cannot handle this situation, as they would assign multiple trackers to the same object.

[0076] The present system 1000 detects a pattern that all the objects detected on the right side will always move along the right side in the next consecutive frames. The same pattern can be identified for the objects detected on the left side. Based on this pattern the tracker module 500 is configured to estimate the direction of the object’s movements. For accurate tracking of objects, any module requires both the speed and direction of the objects.

[0077] The presently developed tracker innovatively combines the above-mentioned vehicle speed and object coordinates, i.e., whether the object is on the left side or the right side of the roadway, to predict the object's next location in consecutive frames. This method significantly decreases the computation power required and also reduces false counts considerably. As illustrated in FIG. 3, the system 1000 assigns trackers to the detected objects. If there are no active trackers, the system 1000 assigns a tracker to the specific detected object. If the object is detected for the first time, its geo-coordinates and the asset count are saved to the database. In the next consecutive frames, if the same object is detected within a pixel distance threshold, the system 1000 does not assign a new tracker; instead, it updates the active tracker’s current position. During post-processing, the system 1000 tallies the object detections in the given stretch. If the numbers do not match the database, the ‘Missing Object’ module 700 calculates and locates the objects missing in the stretch.
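
The assignment and tally logic of FIG. 3, as described above, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the class name `SingleFrameTracker`, the 80-pixel threshold, the in-memory `saved` list standing in for the database, and the `missing_counts` helper are all hypothetical, and the sketch only counts missing objects per class rather than locating them.

```python
import math
from collections import Counter


class SingleFrameTracker:
    """Creates a tracker from one detection and counts each asset once."""

    def __init__(self, max_px_distance=80.0):
        self.max_px_distance = max_px_distance  # hypothetical pixel threshold
        self.active = {}   # tracker id -> (class name, last centre)
        self.saved = []    # (tracker id, class name, (lat, lon)); stands in for the database
        self._next_id = 0

    def update(self, cls, centre, geo):
        """Return the tracker id for a detection, creating one if needed."""
        # Find the nearest active tracker of the same class within the threshold.
        best_id, best_dist = None, self.max_px_distance
        for tid, (t_cls, t_centre) in self.active.items():
            d = math.dist(centre, t_centre)
            if t_cls == cls and d <= best_dist:
                best_id, best_dist = tid, d
        if best_id is not None:
            # Same object seen again: update position, do not count it twice.
            self.active[best_id] = (cls, centre)
            return best_id
        # First sighting: create a tracker and save geo-coordinates and count.
        tid = self._next_id
        self._next_id += 1
        self.active[tid] = (cls, centre)
        self.saved.append((tid, cls, geo))
        return tid


def missing_counts(expected, saved):
    """Compare per-class counts expected from the database with what was seen."""
    seen = Counter(cls for _, cls, _ in saved)
    return {cls: n - seen[cls] for cls, n in expected.items() if n > seen[cls]}
```

In a fuller version, the predicted next centre from the vehicle speed would replace the last observed centre in the distance check, and the tally would be restricted to the stretch being post-processed.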

[0078] The foregoing description is a specific embodiment of the present disclosure. It should be appreciated that this embodiment is described for purpose of illustration only, and that numerous alterations and modifications may be practiced by those skilled in the art without departing from the spirit and scope of the invention. It is intended that all such modifications and alterations be included insofar as they come within the scope of the invention as claimed or the equivalents thereof.

CPC classification (Section G, Physics): G06T11/20; G06V20/58; G06V20/54; G06V20/588; G06T2210/12; G06V20/56

International classification (Section G, Physics): G06V20/56; G06T11/20; G06V20/58
