OBJECT DETECTION METHOD AND OBJECT DETECTION SYSTEM
20230091892 · 2023-03-23
Assignee
Inventors
CPC classification
G06V10/457
PHYSICS
G06V10/25
PHYSICS
G06V10/26
PHYSICS
G06T3/40
PHYSICS
G06V10/22
PHYSICS
G06V10/478
PHYSICS
International classification
Abstract
An object detection method, for detecting a target object, comprising: capturing at least two detection portions with a first aspect ratio from an input image with a second aspect ratio; and confirming whether any object is detected in each of the detection portions and obtaining corresponding boundary boxes for detected objects; wherein the first aspect ratio is different from the second aspect ratio.
Claims
1. An object detection method, for detecting a target object, comprising: capturing at least two detection portions with a first aspect ratio from an input image with a second aspect ratio; and confirming whether any object is detected in each of the detection portions and obtaining corresponding boundary boxes for detected objects; wherein the first aspect ratio is different from the second aspect ratio.
2. The object detection method of claim 1, further comprising: resizing the detection portions for detection of the target object.
3. The object detection method of claim 1, wherein a CNN model is executed to confirm and obtain the corresponding boundary boxes.
4. The object detection method of claim 1, wherein a size of a union of the detection portions is equal to a size of the input image.
5. The object detection method of claim 1, wherein at least part of the detection portions is identical.
6. The object detection method of claim 1, further comprising: providing a ROI (region of interest), and computing motions of the target object in the ROI; wherein the ROI is adjusted based on the boundary boxes when the ROI is larger than a ROI threshold area, and not adjusted based on the boundary boxes when the ROI is smaller than the ROI threshold area.
7. The object detection method of claim 1, further comprising: removing or merging the boundary boxes.
8. The object detection method of claim 7, wherein the step of removing or merging the boundary boxes comprises: defining at least one filtering region in the input image; classifying the boundary box having an edge in the filtering region as a candidate boundary box, and classifying the boundary box having no edge in the filtering region as a maintained boundary box; and removing the candidate boundary box from the boundary boxes, according to a relation between an area of an intersection region of the candidate boundary box and the maintained boundary box and an area of the candidate boundary box, or a relation between the area of the intersection region and the area of the maintained boundary box.
9. The object detection method of claim 8, wherein the step of removing the candidate boundary box from the boundary boxes removes the candidate boundary box from the boundary boxes if AI/MA is larger than a threshold value, wherein AI is the area of the intersection region and MA is a minimum one of the area of the candidate boundary box and the area of the maintained boundary box.
10. The object detection method of claim 8, wherein the at least one filtering region comprises a first filtering region and a second filtering region, wherein the first filtering region covers all vertical coordinates and X1 to X2 horizontal coordinates of the input image, wherein the second filtering region covers all of the vertical coordinates and X3 to X4 horizontal coordinates of the input image, and X4>X3>X2>X1.
11. An object detection system, for detecting a target object, comprising: a partial image capturing device, configured to capture at least two detection portions with a first aspect ratio from an input image with a second aspect ratio; and an object detector, configured to receive the at least two detection portions with the first aspect ratio, to confirm whether any object is detected in each of the detection portions, and to obtain corresponding boundary boxes for detected objects; wherein the first aspect ratio is different from the second aspect ratio.
12. The object detection system of claim 11, wherein the detection portions are resized for the object detector to confirm whether the object is detected in each of the detection portions.
13. The object detection system of claim 11, wherein the object detector executes a CNN model to confirm and obtain the corresponding boundary boxes.
14. The object detection system of claim 13, wherein a size of a union of the detection portions is equal to the input image.
15. The object detection system of claim 13, wherein at least part of the detection portions is identical.
16. The object detection system of claim 11, further comprising: a motion computing device, configured to provide a ROI (region of interest), and to compute motions of the target object in the ROI; wherein the motion computing device adjusts the ROI based on the boundary boxes when the ROI is larger than a ROI threshold area, and does not adjust the ROI based on the boundary boxes when the ROI is smaller than the ROI threshold area.
17. The object detection system of claim 11, further comprising: a filter, configured to remove or to merge the boundary boxes.
18. The object detection system of claim 17, wherein the filter performs the following steps to remove or merge the boundary boxes: defining at least one filtering region in the input image; classifying the boundary box having an edge in the filtering region as a candidate boundary box, and classifying the boundary box having no edge in the filtering region as a maintained boundary box; and removing the candidate boundary box from the boundary boxes, according to a relation between an area of an intersection region of the candidate boundary box and the maintained boundary box and an area of the candidate boundary box, or a relation between the area of the intersection region and the area of the maintained boundary box.
19. The object detection system of claim 18, wherein the step of removing the candidate boundary box from the boundary boxes removes the candidate boundary box from the boundary boxes if AI/MA is larger than a threshold value, wherein AI is the area of the intersection region and MA is a minimum one of the area of the candidate boundary box and the area of the maintained boundary box.
20. The object detection system of claim 18, wherein the at least one filtering region comprises a first filtering region and a second filtering region, wherein the first filtering region covers all vertical coordinates and X1 to X2 horizontal coordinates of the input image, wherein the second filtering region covers all of the vertical coordinates and X3 to X4 horizontal coordinates of the input image, and X4>X3>X2>X1.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0012]
[0013]
[0014]
[0015]
[0016]
[0017]
[0018]
[0019]
[0020]
DETAILED DESCRIPTION
[0021] Several embodiments are provided in the following descriptions to explain the concept of the present invention. Each component in the following descriptions can be implemented by hardware (e.g. a device or a circuit) or by hardware with software (e.g. a program installed to a processor). Besides, the method in the following descriptions can be executed by programs stored in a non-transitory computer readable recording medium such as a hard disk, an optical disc or a memory. Additionally, the terms “first”, “second” and “third” in the following descriptions are only for the purpose of distinguishing different elements, and do not indicate the sequence of the elements. For example, a first device and a second device only mean these devices can have the same structure but are different devices.
[0022] Furthermore, in the following embodiments, the target object which is desired to be detected is a person, but the target object can be any other object such as a specific animal or a vehicle. Additionally, the following embodiments can be applied to an image capturing device such as a camera, but can be applied to any other device as well.
[0023]
[0024] Further, after the first coordinates are computed and recorded, the first resized detection portion RDP1 is removed from the buffer. A second detection portion DP2 of the input image 200 is resized to generate a second resized detection portion RDP2. The second detection portion DP2 comprises at least a second portion of the target object image 201. Please note, the generation of the second resized detection portion RDP2 is not limited to being performed after the first coordinates are computed. At least part of the first portion is identical with the second portion, as illustrated in
[0025] After the first resized detection portion RDP1 is removed from the buffer, the second resized detection portion RDP2 is buffered to the buffer. Then, second coordinates of the second portion of the target object image 201 are computed according to the second resized detection portion RDP2 in the buffer. After the first coordinates and the second coordinates are acquired, an object range of the target object image 201 is computed according to the first coordinates and the second coordinates. The first/second detection portions DP1/DP2 and the first/second resized detection portions RDP1/RDP2 share the same aspect ratio, which matches the input of the target object detection. In this way, the target object is prevented from being over-shrunk. In another input image case, the detection portions DP1 and DP2 may not both comprise at least a portion of the target object image 201.
[0026] In one embodiment, the first detection portion DP1 and the second detection portion DP2 are first squares. Additionally, the first resized detection portion RDP1 and the second resized detection portion RDP2 are second squares smaller than the first squares. Further, in one embodiment, a width and a length of the first resized detection portion RDP1 and the second resized detection portion RDP2 are less than half of those of the first detection portion DP1 and the second detection portion DP2.
[0027] For example, in one embodiment, the input image 200 is a 640×480 image. Also, the first detection portion DP1 and the second detection portion DP2 are 480×480 images. Besides, the first resized detection portion RDP1 and the second resized detection portion RDP2 are 224×224 images. However, the sizes of the first detection portion DP1, the second detection portion DP2, the first resized detection portion RDP1 and the second resized detection portion RDP2 are not limited to these examples.
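The example above can be sketched in code. The helper names and the nearest-neighbor resize below are illustrative assumptions, not taken from the disclosure; they only show how two square detection portions could be cropped from a wider input image and shrunk to the detector's input size.

```python
def capture_detection_portions(width, height, side):
    """Return (x, y, w, h) crop windows: one square anchored at the
    left edge and one at the right edge, so their union covers the
    whole input image (e.g. 640x480 -> two 480x480 portions)."""
    return [(0, 0, side, side), (width - side, 0, side, side)]


def resize_nearest(pixels, src_side, dst_side):
    """Nearest-neighbor resize of a square image stored as a list of
    rows (e.g. 480x480 -> 224x224); a stand-in for any real resizer."""
    return [
        [pixels[y * src_side // dst_side][x * src_side // dst_side]
         for x in range(dst_side)]
        for y in range(dst_side)
    ]
```

Because the two windows overlap in the middle of the image, part of the first detection portion is identical with part of the second one, matching the overlap described above.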
[0028] In the above-mentioned example, the aspect ratio of the input image 200 (640/480) is different from that of the detection portions DP1 and DP2 (480/480). Also, the aspect ratios of the detection portions DP1 and DP2 (480/480) and the resized detection portions RDP1 and RDP2 (224/224) are the same.
[0029] Besides, the above-mentioned object range, which is computed based on the first coordinates and the second coordinates, can be a boundary box shown in
[0030] Please note, in the embodiment of
[0031] The above-mentioned object detection can be summarized as
[0032] Step 301
[0033] Resize a first detection portion DP1 of an input image 200 to generate a first resized detection portion RDP1. The first detection portion DP1 comprises at least a first portion of a target object image 201 of the target object.
[0034] The target object can be a person, an animal, a vehicle, or any other object that is desired to be detected.
[0035] Step 303
[0036] Buffer the first resized detection portion RDP1 to the buffer.
[0037] Step 305
[0038] Compute first coordinates of the first portion of the target object image 201 according to the first resized detection portion RDP1 in the buffer.
[0039] Step 307
[0040] Remove the first resized detection portion RDP1 from the buffer.
[0041] Step 309
[0042] Resize a second detection portion DP2 of the input image 200 to generate a second resized detection portion RDP2. The second detection portion DP2 comprises at least a second portion of the target object image 201.
[0043] Step 311
[0044] Buffer the second resized detection portion RDP2 to the buffer after the first resized detection portion RDP1 is removed from the buffer.
[0045] Step 313
[0046] Compute second coordinates of the second portion of the target object image 201 according to the second resized detection portion RDP2 in the buffer and an object detection algorithm.
[0047] Step 315
[0048] Compute an object range of the target object image 201 according to the first coordinates and the second coordinates.
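The steps 301 through 315 above can be sketched as one pipeline. The function and parameter names (`crop1`, `crop2`, `resize`, `detect`) are hypothetical placeholders for the devices described in the disclosure; `detect` stands for any detector returning the coordinates of the target object portion found in a resized detection portion.

```python
def detect_object_range(input_image, crop1, crop2, resize, detect):
    """Sketch of steps 301-315 with a single-slot buffer: each resized
    detection portion is buffered, processed, and removed before the
    next one is buffered, so only one small image is held at a time."""
    buffer = []

    rdp1 = resize(crop1(input_image))   # step 301: resize DP1 -> RDP1
    buffer.append(rdp1)                 # step 303: buffer RDP1
    coords1 = detect(buffer[0])         # step 305: first coordinates
    buffer.pop()                        # step 307: remove RDP1

    rdp2 = resize(crop2(input_image))   # step 309: resize DP2 -> RDP2
    buffer.append(rdp2)                 # step 311: buffer RDP2
    coords2 = detect(buffer[0])         # step 313: second coordinates
    buffer.pop()

    # Step 315: combine both coordinate sets into one object range,
    # here taken as the bounding box of all detected points.
    xs = [x for x, y in coords1 + coords2]
    ys = [y for x, y in coords1 + coords2]
    return (min(xs), min(ys), max(xs), max(ys))
```

The single-slot buffer illustrates why this scheme can reduce buffer size: the buffer only ever holds one resized portion, not the whole input image at detector scale.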
[0049] Please note, the sequence of the object detection method corresponding to the embodiment of
[0050] Compared with the prior art, a size of the buffer can be reduced since the input image 200 is processed based on two smaller images. Also, the resized images RDP1 and RDP2 do not have blank regions as shown in
[0051] In one embodiment, a ROI (region of interest) is provided in the input image to compute motions of the target objects in the ROI. However, if some objects are wrongly determined as the target object, the ROI may be too large. In such a case, the power consumption is high and the motion computation may be inaccurate. The object detection method illustrated in
[0052] As illustrated in the upper drawing in
[0053] Via the embodiment illustrated in
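The ROI adjustment described above can be sketched as follows. This is a minimal illustration, assuming axis-aligned `(x1, y1, x2, y2)` boxes and a hypothetical rule that an oversized ROI is shrunk to the union of the detected boundary boxes; the disclosure itself does not fix the exact adjustment rule.

```python
def update_roi(roi, boundary_boxes, roi_threshold_area):
    """Adjust the ROI based on the boundary boxes only when the ROI is
    larger than the ROI threshold area; otherwise keep it unchanged."""
    x1, y1, x2, y2 = roi
    if (x2 - x1) * (y2 - y1) <= roi_threshold_area or not boundary_boxes:
        return roi  # ROI already small enough (or nothing detected)
    # Shrink the ROI to the bounding box of all detected objects.
    return (min(b[0] for b in boundary_boxes),
            min(b[1] for b in boundary_boxes),
            max(b[2] for b in boundary_boxes),
            max(b[3] for b in boundary_boxes))
```

Keeping a small ROI untouched avoids spending computation on re-fitting a region that is already tight around the target.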
[0054] Following the embodiments illustrated in
[0055] As shown in
[0056] Also, the intersection region of the first object range OR1 and the second object range OR2 is computed. Besides, the union region of the first object range OR1 and the second object range OR2 is also computed. The definitions of the intersection region and the union region are illustrated in
[0057] After the intersection region and the union region are acquired, the first object range OR1 or the second object range OR2 is removed according to a relation between an area of the intersection region and an area of the union region. In one embodiment, a smaller one of the first object range OR1 and the second object range OR2 is removed if AI/AU is larger than a first threshold value, where AI is the area of the intersection region and AU is the area of the union region.
[0058] The steps illustrated in
[0059] Step 701
[0060] Compute a plurality of object ranges corresponding to a target object image of the target object. Each of the object ranges corresponds to at least one portion of the target object image.
[0061] The object range can be acquired by the object detection method illustrated in
[0062] Step 703
[0063] Compute an intersection region of at least two of the object ranges and computing a union region of the at least two of the object ranges.
[0064] Step 705
[0065] Remove at least a corresponding one of the object ranges according to a relation between an area of the intersection region and an area of the union region.
[0066] In one embodiment, the step 705 removes at least a corresponding one of the object ranges if AI/AU is larger than a first threshold value, where AI is the area of the intersection region acquired in the step 703 and AU is the area of the union region acquired in the step 703.
[0067] Another method for removing unneeded object ranges is provided in following
[0068] As shown in
[0069] In one embodiment, the candidate object range is removed from the object ranges if AI/MA is larger than a second threshold value. The second threshold value can be the same as or different from the above-mentioned first threshold value. AI is the area of the intersection region of the candidate object range and the maintained object range, and MA is a minimum one of the areas of the candidate object range and the maintained object range.
[0070] The filtering regions can be set corresponding to different requirements. In one embodiment, the filtering regions comprise a first filtering region (e.g., the filtering region FR1) and a second filtering region (e.g., the filtering region FR2). As shown in
[0071] In the embodiment of
[0072] The embodiment illustrated in
[0073] Step 901
[0074] Define at least one filtering region in an input image. The input image can be an image which is not processed yet, but can also be an image which has been processed by the method in
[0075] Step 903
[0076] Compute a plurality of object ranges corresponding to a target object image of the target object. Such object ranges can be generated by the object detection method illustrated in
[0077] Step 905
[0078] Classify the object range having an edge in the filtering region as a candidate object range, and classify the object range having no edge in the filtering region as a maintained object range.
[0079] For example, in the embodiment of
[0080] Step 907
[0081] Remove the candidate object range from the object ranges, according to a relation between an area of an intersection region of the candidate object range and the maintained object range and an area of the candidate object range, or a relation between the area of the intersection region and the area of the maintained object range.
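The filtering steps above can be sketched as follows. The representation is an assumption for illustration: object ranges are `(x1, y1, x2, y2)` boxes, and each filtering region is given as a horizontal span `(rx1, rx2)` that covers all vertical coordinates, as in the two-region embodiment described earlier.

```python
def box_area(box):
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)


def box_intersection(a, b):
    return (max(a[0], b[0]), max(a[1], b[1]),
            min(a[2], b[2]), min(a[3], b[3]))


def filter_object_ranges(ranges, filtering_regions, second_threshold):
    """Steps 901-907: classify ranges with an edge inside a filtering
    region as candidates, then remove a candidate when AI/MA (area of
    its intersection with a maintained range over the minimum of the
    two areas) exceeds the second threshold value."""
    def edge_in_region(box, region):
        rx1, rx2 = region  # region spans all vertical coordinates
        return rx1 <= box[0] <= rx2 or rx1 <= box[2] <= rx2

    candidates = [b for b in ranges
                  if any(edge_in_region(b, r) for r in filtering_regions)]
    maintained = [b for b in ranges if b not in candidates]

    kept = list(maintained)
    for c in candidates:
        removed = False
        for m in maintained:
            ai = box_area(box_intersection(c, m))
            ma = min(box_area(c), box_area(m))
            if ma > 0 and ai / ma > second_threshold:
                removed = True  # candidate mostly overlaps a kept range
                break
        if not removed:
            kept.append(c)
    return kept
```

A candidate that barely touches a maintained range survives the test, while one that largely duplicates a maintained detection is dropped, which matches the intent of pruning spurious boxes near the image borders.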
[0082] Other detail steps are illustrated in the embodiment of
[0083]
[0084] The frame buffer 1001 is configured to buffer an input image such as the input image 200 shown in
[0085] The object detector 1009 is configured to confirm whether any target object, such as a person, is detected in each of the detection portions and to obtain corresponding boundary boxes for detected objects. In one embodiment, the object detector executes a CNN (Convolutional Neural Network) model to confirm and obtain the corresponding boundary boxes. The CNN model is a result of a known CNN training method, which trains the CNN model with a large number of images for detecting at least one kind of specific object, such as cars, persons or dogs.
[0086] In the embodiment of
[0087] Besides the components illustrated in
[0088] In view of the above-mentioned embodiments, the detection of persons can be more accurate without increasing the size of the buffer.
[0089] Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.