Image recognizing method for preventing recognition results from confusion
10275692 · 2019-04-30
Cpc classification
G06V20/41
PHYSICS
G06F18/214
PHYSICS
Abstract
An image recognizing method adopted by a platform is disclosed. The method first receives multiple targets to be recognized at the platform, and inquires a pre-established semantic tree with reference to the targets to determine whether the recognition results of the multiple targets will cause confusion. If no confusion is foreseeable, the method obtains a parent-classifier corresponding to the parent-category of each of the targets, and uses the parent-classifiers directly to perform a recognition action on the targets. Otherwise, the method obtains multiple child-classifiers corresponding to the multiple subcategories below each of the targets, and uses the child-classifiers to perform the recognition action on the targets.
Claims
1. An image recognizing method adopted by a recognition platform, the image recognizing method comprising: a) receiving multiple targets to be recognized at the recognition platform; b) providing a semantic tree to be inquired according to the multiple targets to determine whether recognition results of the multiple targets may cause confusion or not, wherein two of the multiple targets cause confusion in the recognition results when a subcategory below any one of the multiple targets is overlapped with other subcategory below another of the multiple targets; c) obtaining multiple parent-classifiers respectively corresponding to multiple parent-categories of the multiple targets if the recognition results of the multiple targets do not cause confusion; d) using the parent-classifiers to perform a recognition action to a target video at the recognition platform after the step c; e) obtaining multiple child-classifiers respectively corresponding to multiple subcategories below each of the multiple targets if the recognition results of the multiple targets cause confusion, wherein the multiple parent-categories are unions of the multiple subcategories; f) obtaining a specific parent-classifier corresponding to specific parent-category of one of the multiple targets that do not cause confusion if the multiple targets cause confusion in the recognition results; g) using the specific parent-classifier and the multiple child-classifiers to perform the recognition action to the target video after the step f; h) determining whether any of the multiple child-classifiers obtains an effective recognition value after the step g; i) performing a translation to a name of the subcategory corresponding to the child-classifier obtaining the effective recognition value in order to obtain multiple parent-categories that encompass the subcategory; and j) outputting names of the multiple parent-categories as a recognition result.
2. The image recognition method in claim 1, further comprising following steps of: k) determining whether any of the multiple parent-classifiers obtains an effective recognition value after the step d; and l) outputting a name of the parent-category corresponding to the parent-classifier obtaining the effective recognition value as recognition result.
3. The image recognition method in claim 1, wherein the multiple targets comprise objects or scenes.
4. The image recognition method in claim 3, wherein the multiple parent-categories comprise a Phone category, a Tablet category, a TV category, a Laptop category and a Monitor category, and the multiple subcategories comprise a Phone monitor category, a Tablet monitor category, a TV monitor category and a Laptop monitor category.
5. The image recognition method in claim 3, wherein the multiple parent-categories comprise a Laptop category, a PC category and a Keyboard category, and the multiple subcategories comprise a Laptop keyboard category and a PC keyboard category.
6. The image recognition method in claim 3, wherein the multiple parent-categories comprise an Automobile category, a Bicycle category and a Wheel category, and the multiple subcategories comprise an Automobile wheel category and a Bicycle wheel category.
7. The image recognition method in claim 3, wherein the multiple parent-categories comprise a Restaurant category, a BAR category and a Decoration category, and the multiple subcategories comprise a Restaurant decoration category and a BAR decoration category.
Description
BRIEF DESCRIPTION OF THE INVENTION
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
DETAILED DESCRIPTION OF THE INVENTION
(11) The technical contents and detailed description of the present invention are described hereinafter, in conjunction with the attached drawings, according to a preferred embodiment, which is not intended to limit the scope of the invention. Any equivalent variation and modification made according to the appended claims is covered by the claims of the present invention.
(12) One aspect of the present invention discloses an image recognizing method for preventing recognition results from confusion (referred to as the method hereinafter); the method is adopted by an off-line recognition system or an on-line recognition platform. The following embodiments use an on-line recognition platform connected to the Internet as the example for further discussion.
(13) The recognition platform may analyze static images and dynamic videos, so as to recognize different types of targets appearing in each of the images and the videos. For example, a recognition platform adopting the FITAMOS system developed by Viscovery Pte, Ltd. may recognize at least seven types of targets from the images and the videos, including Faces, Images/Trademarks, Texts, Audio, Motions, Objects and Scenes.
(14) In one aspect of the present invention, before performing a recognition action on multiple targets simultaneously, the recognition platform inquires a pre-established semantic tree to determine whether the recognition results of the multiple targets will cause confusion. According to that determination, the recognition platform decides either to use upper classifiers directly corresponding to the parent-categories of the multiple targets, or to use lower classifiers corresponding to the subcategories below the multiple targets, to perform the recognition action.
(15)
(16) In the embodiment shown in
(17) If the Automobile category is regarded as a parent-category, at least three subcategories such as a Sports car category, a Sedan category and a Sightseeing bus category are included below the Automobile category. That is to say, the Automobile category (parent-category) is a union of the Sports car category, the Sedan category and the Sightseeing bus category (subcategories). If the Sports car category is regarded as a parent-category, at least two subcategories such as a 2D sports car category and a 3D sports car category are included below the Sports car category. If the 2D sports car category is regarded as a parent-category, at least two subcategories such as a Wheel category and a Car door category are included below the 2D sports car category.
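The parent-category and subcategory relationship described above can be pictured as a nested mapping. The following is only a minimal sketch under assumptions: the dictionary layout, the name SEMANTIC_TREE and the helper subcategories() are hypothetical and are not taken from the disclosure; only the category names come from the text.

```python
# Hypothetical encoding of the Automobile branch described above.
# Each key is a category; its value maps subcategory names to their own subtrees.
SEMANTIC_TREE = {
    "Automobile": {
        "Sports car": {
            "2D sports car": {"Wheel": {}, "Car door": {}},
            "3D sports car": {},
        },
        "Sedan": {},
        "Sightseeing bus": {},
    },
}


def subcategories(tree, category):
    """Return the sorted direct subcategories of `category`, or None if it is absent."""
    for name, children in tree.items():
        if name == category:
            return sorted(children)
        found = subcategories(children, category)
        if found is not None:
            return found
    return None


print(subcategories(SEMANTIC_TREE, "Sports car"))
# ['2D sports car', '3D sports car']
```

Under this layout a parent-category is simply the union of the keys nested beneath it, which mirrors the statement that the Automobile category is a union of its subcategories.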
(18) Similarly in
(19) It should be mentioned that the semantic tree 3 is a tree-type semantic structure built according to recognition demand. When the classifiers are trained, the structure of the semantic tree 3 is used to train each corresponding classifier (including parent-classifiers and child-classifiers). The type and the number of subcategories below a parent-category depend on the actual recognition demand. In one embodiment shown in
(20) In one aspect of the present invention, the recognition platform may inquire the semantic tree 3 when performing the recognition action on a target video, so as to analyze and determine whether the recognition results of multiple targets in the target video will cause confusion. Based on this determination, the recognition platform may decide to use the parent-classifiers respectively corresponding to the parent-categories of the targets to be recognized, or to use the child-classifiers respectively corresponding to the subcategories below the targets to be recognized, to perform the recognition action (described in detail below). In particular, the names of the parent-categories of the targets are the same as the names of the targets to be recognized.
(21)
(22) In one embodiment, the multiple child-classifiers (such as a Phone monitor classifier, a TV monitor classifier, etc.) respectively corresponding to multiple subcategories below each of the parent-categories are also well-trained in advance and may be obtained and used directly by the recognition platform. In the embodiment, the relationship among these parent-categories and subcategories is the same as the definition indicated by the semantic tree 3.
(23) After the step S10, the recognition platform inquires the semantic tree 3 according to the multiple targets (step S12), so as to determine whether the multiple targets may cause confusion in their recognition results (step S14).
(24) In one aspect of the present invention, the recognition platform determines that the recognition results of two targets may cause confusion if any subcategory below one of the two targets overlaps with any subcategory below the other target. For example, in
(25) As mentioned above, if the recognition platform determines in the step S14 that the recognition results of the multiple targets will not cause confusion (i.e., no overlapped subcategory exists below the multiple targets), the recognition platform then obtains parent-classifiers respectively corresponding to the parent-categories of the multiple targets (step S16), and uses the parent-classifiers to perform a recognition action on the target video (step S18).
(26) For example, if the multiple targets inputted by the user include Phone and Automobile, the recognition platform may determine that no overlapped subcategories exist below a Phone category and an Automobile category after inquiring the semantic tree 3, and uses a phone classifier and an automobile classifier corresponding to the parent-categories (i.e., parent-classifiers) to perform the recognition action to the target video.
(27) Conversely, if the recognition platform determines in the step S14 that the recognition results of the multiple targets may cause confusion (i.e., at least one overlapped subcategory exists below the multiple targets), the recognition platform then obtains multiple child-classifiers respectively corresponding to multiple subcategories below the multiple targets that may cause confusion (step S20), and uses the child-classifiers to perform the recognition action on the target video. In one embodiment, the parent-categories described in the step S16 are unions of the subcategories described in the step S20.
(28) For example, if the multiple targets inputted by the user include Phone and Monitor, the recognition platform may determine, after inquiring the semantic tree 3, that an overlapped Phone monitor subcategory exists below both the Phone category and the Monitor category. In this case, the recognition platform does not use the phone classifier and the monitor classifier directly corresponding to the parent-categories of the multiple targets (i.e., parent-classifiers) to perform the recognition action on the target video, but uses multiple child-classifiers below the multiple targets, such as a Phone monitor classifier, a Back cover of phone classifier, a Phone shield classifier, a TV monitor classifier, a Computer monitor classifier, etc., to perform the recognition action on the target video.
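The overlap test of the step S14 can be illustrated with set intersection. This is a hedged sketch only: the table SUBCATEGORIES and the function confusing_pairs() are hypothetical names, and the assignment of each subcategory to a parent-category is an assumption inferred from the examples above rather than a reproduction of the semantic tree 3.

```python
from itertools import combinations

# Assumed subcategory sets; the grouping under each parent is illustrative only.
SUBCATEGORIES = {
    "Phone": {"Phone monitor", "Back cover of phone", "Phone shield"},
    "Monitor": {"Phone monitor", "TV monitor", "Computer monitor"},
    "Automobile": {"Sports car", "Sedan", "Sightseeing bus"},
}


def confusing_pairs(targets, subcategories=SUBCATEGORIES):
    """Return the pairs of targets that share at least one subcategory."""
    return [
        (a, b)
        for a, b in combinations(targets, 2)
        if subcategories[a] & subcategories[b]
    ]


print(confusing_pairs(["Phone", "Automobile"]))  # [] (no shared subcategory)
print(confusing_pairs(["Phone", "Monitor"]))     # [('Phone', 'Monitor')]
```

An empty result corresponds to the branch in which the parent-classifiers are used directly; a non-empty result corresponds to the branch in which child-classifiers are obtained.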
(29) It should be mentioned that if the multiple targets inputted by the user include targets that may cause confusion in the recognition results (such as Phone and Monitor) and also include another target that will not cause confusion in the recognition result (such as Automobile), the recognition platform may further obtain a parent-classifier corresponding to the parent-category of the other target that will not cause confusion (step S22). The recognition platform then simultaneously uses the multiple child-classifiers obtained in the step S20 and the parent-classifier obtained in the step S22 to perform the recognition action on the target video (step S18).
(30) The subcategories described in the step S20 and the parent-categories described in the step S22 belong to different layers in the semantic tree 3. As the semantic tree 3 shown in
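The mixed case, in which some targets are confusable and others are not, might be sketched as follows. The function select_classifiers() and the table SUBCATEGORIES are hypothetical, the subcategory groupings are assumptions, and classifiers are represented only by the names of their categories.

```python
# Assumed subcategory sets; the grouping under each parent is illustrative only.
SUBCATEGORIES = {
    "Phone": {"Phone monitor", "Back cover of phone", "Phone shield"},
    "Monitor": {"Phone monitor", "TV monitor", "Computer monitor"},
    "Automobile": {"Sports car", "Sedan", "Sightseeing bus"},
}


def select_classifiers(targets, subcategories=SUBCATEGORIES):
    """Return (child-classifier names, parent-classifier names) for the targets."""
    # A target is confusable if it shares a subcategory with any other target.
    confused = {
        t
        for t in targets
        if any(subcategories[t] & subcategories[o] for o in targets if o != t)
    }
    # Step S20 analogue: child-classifiers for every subcategory of confusable targets.
    child = sorted(set().union(*(subcategories[t] for t in confused)))
    # Step S22 analogue: parent-classifiers for the remaining targets.
    parent = sorted(t for t in targets if t not in confused)
    return child, parent


child, parent = select_classifiers(["Phone", "Monitor", "Automobile"])
print(child)
print(parent)  # ['Automobile']
```

Both lists would then be applied to the target video together, as in the step S18.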
(31)
(32) In the embodiment, the recognition platform uses the parent-classifiers obtained in the step S16 of
(33) After the step S30, if any one of the parent-classifiers obtains the effective recognition value, the recognition platform may output the name of the parent-category corresponding to that parent-classifier as a recognition result of the corresponding target (step S32). For example, if the Car classifier recognizes successfully and obtains the effective recognition value, the recognition platform may directly output Car as a recognition result of the corresponding target.
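One way to picture the steps S30 and S32 is as a filter over classifier scores. The text here does not define how an effective recognition value is decided, so the numeric score, the cutoff EFFECTIVE_THRESHOLD and the function parent_results() are all assumptions made purely for illustration.

```python
# Assumed cutoff; the source does not state how "effective" is decided.
EFFECTIVE_THRESHOLD = 0.5


def parent_results(scores, threshold=EFFECTIVE_THRESHOLD):
    """Return names of parent-categories whose classifier score is treated as effective."""
    return [name for name, score in scores.items() if score >= threshold]


# Hypothetical scores produced by two parent-classifiers for one object.
print(parent_results({"Car": 0.9, "Human": 0.2}))  # ['Car']
```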
(34)
(35) In the embodiment, the recognition platform uses the child-classifiers obtained in the step S20 of
(36) For example, if the Phone monitor classifier (regarded as a child-classifier) recognizes successfully, the recognition platform performs a translation of the Phone monitor subcategory, so as to obtain the Phone category and the Monitor category (both are parent-categories, and each encompasses the Phone monitor subcategory). Next, the recognition platform may output Phone and Monitor simultaneously as a recognition result of the corresponding object.
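The translation from a subcategory name to the parent-categories that encompass it amounts to a reverse lookup. The sketch below assumes a hypothetical table PARENT_TO_SUBS and function translate(); the groupings are illustrative and are not taken from the semantic tree 3.

```python
# Assumed mapping from parent-category to its subcategories (illustrative only).
PARENT_TO_SUBS = {
    "Phone": {"Phone monitor", "Back cover of phone"},
    "Monitor": {"Phone monitor", "TV monitor"},
}


def translate(subcategory, table=PARENT_TO_SUBS):
    """Return the sorted names of all parent-categories that encompass `subcategory`."""
    return sorted(parent for parent, subs in table.items() if subcategory in subs)


print(translate("Phone monitor"))  # ['Monitor', 'Phone']
print(translate("TV monitor"))     # ['Monitor']
```

A subcategory shared by two parents yields both names, which matches the example in which Phone and Monitor are output together.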
(37)
(38) In one embodiment, the recognition platform inquires the semantic tree 3 according to the four targets, and determines that the Car target will not cause confusion with the Human target, the Computer target or the Monitor target in recognition results, so the recognition platform obtains a parent-classifier corresponding to a parent-category of the Car target, uses the parent-classifier to perform the analysis action, and obtains a recognition result showing that a first object 41 in the video 4 is a car.
(39) In one embodiment, the recognition platform determines, after inquiring the semantic tree 3, that the Human target will not cause confusion with the Car target, the Computer target or the Monitor target in recognition results, so the recognition platform obtains a parent-classifier corresponding to a parent-category of the Human target, uses the parent-classifier to perform the analysis action, and obtains a recognition result showing that a second object 42 in the video 4 is a human.
(40) In one embodiment, after inquiring the semantic tree 3, the recognition platform determines that the Computer category and the Monitor category share the same subcategory, namely the Computer monitor subcategory, so the Computer target and the Monitor target may cause confusion in recognition results. Therefore, the recognition platform does not use the parent-classifiers respectively corresponding to the two parent-categories of the Computer target and the Monitor target, but uses multiple child-classifiers respectively corresponding to multiple subcategories below the two parent-categories, such as a Computer keyboard classifier, a Computer shield classifier, a Computer monitor classifier, a Phone monitor classifier, etc., to perform the analysis action on the video 4.
(41) In the embodiment shown in
(42) In the aforementioned embodiments, the multiple targets to be recognized are objects of the images or the videos. In other embodiments, the method may also be used to recognize scenes of the images or the videos, not limited thereto.
(43)
(44)
(45)
(46)
(47) According to the embodiments of the present invention, the method may increase the accuracy rate of recognizing images and videos, prevent the recognition results from confusion, and provide the recognition results that can satisfy user demand.
(48) As the skilled person will appreciate, various changes and modifications can be made to the described embodiment. It is intended to include all such variations, modifications and equivalents which fall within the scope of the present invention, as defined in the accompanying claims.