HIERARCHICAL CONSTRAINT (HC)-BASED METHOD AND SYSTEM FOR CLASSIFYING FINE-GRAINED GRAPTOLITE IMAGES
20230267703 · 2023-08-24
Assignee
- Nanjing Institute of Geology and Palaeontology, CAS (Nanjing City, CN)
- Tianjin University (Tianjin, CN)
Inventors
CPC classification
- G06V10/774 (PHYSICS)
- G06V10/7715 (PHYSICS)
- G06V10/26 (PHYSICS)
- G06V10/25 (PHYSICS)
International classification
- G06V10/74 (PHYSICS)
- G06V10/24 (PHYSICS)
- G06V10/26 (PHYSICS)
- G06V10/77 (PHYSICS)
- G06V10/774 (PHYSICS)
Abstract
The present disclosure relates to a hierarchical constraint (HC)-based method and system for classifying fine-grained graptolite images. The method includes: constructing a graptolite fossil dataset; extracting features from graptolite images; calculating the similarity between graptolite images and weighting it according to the genetic relationship among species to obtain a weighted HC loss function (HC-Loss) over all graptolite images; calculating a cross-entropy loss (CE-Loss); taking a weighted sum of HC-Loss and CE-Loss as the total loss function in the training stage; and performing model training. The system of the present disclosure includes a processor and a memory.
Claims
1. A hierarchical constraint (HC)-based method for classifying fine-grained graptolite images, comprising: step 1, collecting an original graptolite image; step 2, annotating fine granularity of a graptolite specimen in the original graptolite image; step 3, according to an annotation result of fine granularity, conducting pixel-level cropping, cropping by an annotation box, and data enhancement on the original graptolite image to obtain a graptolite image representing the graptolite specimen, and constructing a graptolite dataset; step 4, extracting features in the graptolite image by a Convolutional Neural Network (CNN) model, which specifically comprises: extracting a feature map from an input graptolite image by convolution, activation and pooling operations of the CNN model to obtain a feature vector; and projecting, by an embedding layer, the feature vector into a feature with a dimension being a number of categories in the dataset, wherein a feature vector obtained after projection represents a prediction vector of the CNN model for the input graptolite image, each value in the prediction vector represents a prediction score of a corresponding category of the graptolite image, and the higher the prediction score, the greater the probability that a graptolite in the graptolite image belongs to this category; step 5, calculating a similarity between graptolite images, and performing weighting according to a genetic relationship among species to obtain a weighted HC loss function (HC-Loss) of all graptolite images, which specifically comprises: dividing graptolite images in a same batch randomly into multiple groups of graptolite image pairs, each group of graptolite image pairs comprising two graptolite images; for the two graptolite images in each graptolite image pair, quantifying a similarity weight value according to the genetic relationship of the categories to which the graptolites in the two graptolite images belong, wherein for graptolite images belonging to two categories, the closer the genetic relationship is, the greater the degree of similarity is, and the greater the similarity weight value is; conversely, the farther the genetic relationship is, the smaller the similarity weight value is; and in each training batch, calculating the weighted HC-Loss of all graptolite images by the following process: calculating a Euclidean distance between prediction vectors of the two graptolite images in each graptolite image pair; and according to the similarity weight value of each graptolite image pair, performing a weighted summation of the similarities of all groups of graptolite image pairs and dividing the sum by the number of groups to obtain the weighted HC-Loss of all graptolite images; step 6, calculating a cross-entropy loss (CE-Loss) used to represent a difference between a predicted probability distribution of the CNN model and a real label distribution of images; step 7, taking a weighted sum of HC-Loss and CE-Loss as a total loss function of the CNN model in a training stage; and step 8, training the CNN model.
2. The method for classifying fine-grained graptolite images according to claim 1, wherein in step 1, the collected original graptolite image comprises high-resolution images covering various families, genera and species.
3. The method for classifying fine-grained graptolite images according to claim 1, wherein step 4 is executed by the following process: extracting, from an input graptolite image x, a feature map f.sub.x with respect to x by the convolution, activation and pooling operation of the CNN model, and setting a size of f.sub.x to C×H×W, wherein C, H and W denote a channel, a height and a width of the feature map, respectively; and expanding the feature map f.sub.x to a feature vector with one dimension being C×H×W, and projecting the feature vector into a feature vector with a dimension of N through an embedding layer, wherein N denotes a number of categories in a dataset, the embedding layer is implemented through a fully connected layer, and the final image feature vector represents a prediction vector of the CNN model to the inputted image x.
4. The method for classifying fine-grained graptolite images according to claim 3, wherein the Euclidean distance between prediction vectors of two graptolite images in each graptolite image pair is expressed as follows: d(x.sub.m,x.sub.n)={square root over (Σ.sub.i(φ(x.sub.m).sub.i−φ(x.sub.n).sub.i).sup.2)}, wherein φ(x.sub.m) and φ(x.sub.n) denote the prediction vectors of the two graptolite images x.sub.m and x.sub.n, and φ(x).sub.i denotes an ith element of a prediction vector φ(x).
5. The method for classifying fine-grained graptolite images according to claim 1, wherein in step 5, said setting a similarity weight value for a graptolite image pair according to the categories and genetic relationship specifically comprises: if two graptolite images of a graptolite image pair belong to the same category and a lowest common parent category is at a species level, setting the similarity weight value to 0; if two graptolite images of a graptolite image pair belong to different species of the same genus and a lowest common parent category is at a genus level, setting the similarity weight value to 1.0; if two graptolite images of a graptolite image pair belong to different genera of the same family and a lowest common parent category is at a family level, setting the similarity weight value to be greater than 0.5 and less than 1.0; and if two graptolite images of a graptolite image pair belong to different families and a lowest common parent category is at an order level, setting the similarity weight value to be greater than 0.1 and less than 0.3.
6. The method for classifying fine-grained graptolite images according to claim 5, wherein in step 5, if two graptolite images of a graptolite image pair belong to different genera of the same family and the lowest common parent category is at a family level, the similarity weight value is set to 0.6.
7. The method for classifying fine-grained graptolite images according to claim 5, wherein in step 5, if two graptolite images of a graptolite image pair belong to different families and the lowest common parent category is at an order level, the similarity weight value is set to 0.2.
8. An HC-based system for classifying fine-grained graptolite images, comprising a processor and a memory, wherein the memory has a program instruction stored therein, and the processor calls the program instruction stored in the memory to enable the system to perform the steps of the method according to claim 1.
9. The HC-based system for classifying fine-grained graptolite images according to claim 8, wherein in step 1, the collected original graptolite image comprises high-resolution images covering various families, genera and species.
10. The HC-based system for classifying fine-grained graptolite images according to claim 8, wherein step 4 is executed by the following process: extracting, from an input graptolite image x, a feature map f.sub.x with respect to x by the convolution, activation and pooling operation of the CNN model, and setting a size of f.sub.x to C×H×W, wherein C, H and W denote a channel, a height and a width of the feature map, respectively; and expanding the feature map f.sub.x to a feature vector with one dimension being C×H×W, and projecting the feature vector into a feature vector with a dimension of N through an embedding layer, wherein N denotes a number of categories in a dataset, the embedding layer is implemented through a fully connected layer, and the final image feature vector represents a prediction vector of the CNN model to the inputted image x.
11. The HC-based system for classifying fine-grained graptolite images according to claim 10, wherein the Euclidean distance between prediction vectors of two graptolite images in each graptolite image pair is expressed as follows: d(x.sub.m,x.sub.n)={square root over (Σ.sub.i(φ(x.sub.m).sub.i−φ(x.sub.n).sub.i).sup.2)}, wherein φ(x.sub.m) and φ(x.sub.n) denote the prediction vectors of the two graptolite images x.sub.m and x.sub.n, and φ(x).sub.i denotes an ith element of a prediction vector φ(x).
12. The HC-based system for classifying fine-grained graptolite images according to claim 8, wherein in step 5, said setting a similarity weight value for a graptolite image pair according to the categories and genetic relationship specifically comprises: if two graptolite images of a graptolite image pair belong to the same category and a lowest common parent category is at a species level, setting the similarity weight value to 0; if two graptolite images of a graptolite image pair belong to different species of the same genus and a lowest common parent category is at a genus level, setting the similarity weight value to 1.0; if two graptolite images of a graptolite image pair belong to different genera of the same family and a lowest common parent category is at a family level, setting the similarity weight value to be greater than 0.5 and less than 1.0; and if two graptolite images of a graptolite image pair belong to different families and a lowest common parent category is at an order level, setting the similarity weight value to be greater than 0.1 and less than 0.3.
13. The HC-based system for classifying fine-grained graptolite images according to claim 12, wherein in step 5, if two graptolite images of a graptolite image pair belong to different genera of the same family and the lowest common parent category is at a family level, the similarity weight value is set to 0.6.
14. The HC-based system for classifying fine-grained graptolite images according to claim 12, wherein in step 5, if two graptolite images of a graptolite image pair belong to different families and the lowest common parent category is at an order level, the similarity weight value is set to 0.2.
15. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program that comprises program instructions, and the program instructions, when executed by a processor, enable the processor to perform the steps of the method according to claim 1.
16. The computer-readable storage medium according to claim 15, wherein in step 1, the collected original graptolite image comprises high-resolution images covering various families, genera and species.
17. The computer-readable storage medium according to claim 15, wherein step 4 is executed by the following process: extracting, from an input graptolite image x, a feature map f.sub.x with respect to x by the convolution, activation and pooling operation of the CNN model, and setting a size of f.sub.x to C×H×W, wherein C, H and W denote a channel, a height and a width of the feature map, respectively; and expanding the feature map f.sub.x to a feature vector with one dimension being C×H×W, and projecting the feature vector into a feature vector with a dimension of N through an embedding layer, wherein N denotes a number of categories in a dataset, the embedding layer is implemented through a fully connected layer, and the final image feature vector represents a prediction vector of the CNN model to the inputted image x.
18. The computer-readable storage medium according to claim 17, wherein the Euclidean distance between prediction vectors of two graptolite images in each graptolite image pair is expressed as follows: d(x.sub.m,x.sub.n)={square root over (Σ.sub.i(φ(x.sub.m).sub.i−φ(x.sub.n).sub.i).sup.2)}, wherein φ(x.sub.m) and φ(x.sub.n) denote the prediction vectors of the two graptolite images x.sub.m and x.sub.n, and φ(x).sub.i denotes an ith element of a prediction vector φ(x).
19. The computer-readable storage medium according to claim 15, wherein in step 5, said setting a similarity weight value for a graptolite image pair according to the categories and genetic relationship specifically comprises: if two graptolite images of a graptolite image pair belong to the same category and a lowest common parent category is at a species level, setting the similarity weight value to 0; if two graptolite images of a graptolite image pair belong to different species of the same genus and a lowest common parent category is at a genus level, setting the similarity weight value to 1.0; if two graptolite images of a graptolite image pair belong to different genera of the same family and a lowest common parent category is at a family level, setting the similarity weight value to be greater than 0.5 and less than 1.0; and if two graptolite images of a graptolite image pair belong to different families and a lowest common parent category is at an order level, setting the similarity weight value to be greater than 0.1 and less than 0.3.
20. The computer-readable storage medium according to claim 19, wherein in step 5, if two graptolite images of a graptolite image pair belong to different genera of the same family and the lowest common parent category is at a family level, the similarity weight value is set to 0.6.
Description
BRIEF DESCRIPTION OF DRAWINGS
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0043] To make the technical solutions of the present disclosure clearer, the present disclosure is further described below with reference to the accompanying drawings and specific embodiments.
[0044] The present disclosure is inspired by the following prior knowledge: organisms are inherently organized into a classification hierarchy, and from the perspective of that hierarchy, subcategory specimens belonging to the same parent category have similar morphological characteristics; the degree of similarity usually increases as the level of their lowest common parent category decreases. Therefore, compared with current advanced fine-grained image classification methods, the present disclosure can achieve higher classification accuracy on graptolite images, and it is also suitable for other biological images. In addition, the present disclosure introduces no additional parameters in the CNN training stage and can be added to any CNN for end-to-end training.
[0045] Step 1, Construct a graptolite dataset.
[0046] (1) Perform Graptolite Image Collection and Fine Granularity Annotation.
[0047] All the original graptolite images (also known as graptolite fossil images) are collected from 1,565 fossil specimens stored in Nanjing Institute of Geology and Palaeontology, Chinese Academy of Sciences. In total, 40,597 original graptolite images are collected, including 20,644 SLR images (each with a resolution of 4,912×7,360 pixels) and 19,953 microscope images (each with a resolution of 2,720×2,048 pixels). The dataset is then cleaned, and 5,977 low-quality original graptolite images are deleted. The cleaned dataset retains 34,613 original graptolite images, covering 15 families, 42 genera and 113 species. After image collection, the annotator uses COCO Annotator (an open-source image annotation tool) to mark the graptolite specimens in the cleaned original graptolite images at the pixel level.
[0048] (2) Perform Graptolite Image Cropping and Data Enhancement.
[0049] Because fossils are exposed to natural factors such as weathering and erosion, the tissue structure and texture features of graptolites are often seriously damaged, causing problems such as missing and indistinguishable features. Considering this, researchers first crop all the original graptolite images at the pixel level according to the annotation results to improve the classification accuracy of the CNN. Then, because the resolution of the original graptolite images is very high and some graptolite specimens occupy only a small proportion of the image, all the original graptolite images are cropped by the annotation box to scale the graptolite specimens to an appropriate proportion, thereby obtaining the graptolite images representing the graptolite specimens. In addition, because the original graptolite images all come from specimens, different original graptolite images collected from the same specimen may be very similar after the above two cropping steps. The graptolite images are therefore further augmented by random rotation, random flipping, random translation, random scaling and other operations to enhance the diversity of images in the dataset and thus improve the model's generalization ability (e.g., its ability to adapt to various test images).
[0050] (3) Perform Division of Graptolite Datasets.
[0051] Because the original graptolite images come from specimens, different original graptolite images collected from the same specimen may present similar visual content, differing only in angle, spatial position and specimen size. Therefore, the graptolite dataset is not divided randomly but in accordance with the following principle: graptolite images belonging to the same specimen must not appear in both the training set and the test set; instead, all images from one specimen are kept in the same split. The resulting test set contains 8,454 graptolite images, accounting for about 24% of the total images in the dataset, while the training set contains 26,159 graptolite images, accounting for about 76% of the total.
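The specimen-level division principle above can be sketched in plain Python. The 24% test ratio follows the description, while the function name, the `(specimen_id, image_id)` record format and the shuffling seed are illustrative assumptions, not part of the original disclosure:

```python
import random
from collections import defaultdict

def split_by_specimen(images, test_ratio=0.24, seed=0):
    """Split (specimen_id, image_id) records into train/test sets so that
    all images from one specimen always land in the same split."""
    groups = defaultdict(list)
    for specimen_id, image_id in images:
        groups[specimen_id].append(image_id)
    specimen_ids = sorted(groups)
    random.Random(seed).shuffle(specimen_ids)   # shuffle specimens, not images
    train, test = [], []
    for sid in specimen_ids:
        # Fill the test set until it reaches roughly the desired ratio.
        target = test if len(test) < test_ratio * len(images) else train
        target.extend((sid, img) for img in groups[sid])
    return train, test

# Toy usage: 3 specimens with 2 images each.
imgs = [(s, i) for s in ("sp1", "sp2", "sp3") for i in range(2)]
train, test = split_by_specimen(imgs)
print(len(train) + len(test) == len(imgs))  # True
```

Because whole specimens are assigned to a split, the achieved ratio only approximates `test_ratio`; that matches the "about 24%" wording of the description.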
[0052] Step 2, Extract features of a graptolite image using CNN.
[0053] As shown in
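The feature-extraction step (flattening a C×H×W feature map and projecting it to an N-dimensional prediction vector through a fully connected embedding layer) can be illustrated with a minimal sketch. The toy sizes (C=2, H=2, W=2, N=3) and the random weights are assumptions for illustration only; in practice the feature map comes from the CNN backbone:

```python
import random

def flatten(feature_map):
    """Flatten a C×H×W feature map (nested lists) into a 1-D feature vector."""
    return [v for channel in feature_map for row in channel for v in row]

def embed(vector, weights, bias):
    """Fully connected embedding layer: project the flattened vector
    to N category scores (the prediction vector)."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

# Toy example: C=2, H=2, W=2 feature map projected to N=3 category scores.
random.seed(0)
fmap = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
feat = flatten(fmap)                      # length C*H*W = 8
W_emb = [[random.uniform(-0.1, 0.1) for _ in range(8)] for _ in range(3)]
b_emb = [0.0, 0.0, 0.0]
pred = embed(feat, W_emb, b_emb)          # one prediction score per category
print(len(feat), len(pred))               # 8 3
```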
[0054] Step 3, Calculate a similarity between graptolite images, and perform weighting according to a genetic relationship among species.
[0055] As shown in
[0056] Assume that a batch has a batch size of B, where B is set to an even number; all graptolite images in the batch are divided into B/2 groups. Then, for each graptolite image pair (x.sub.m,x.sub.n), a Euclidean distance is used to calculate the similarity between their features as a constraint:
d(x.sub.m,x.sub.n)={square root over (Σ.sub.i(φ(x.sub.m).sub.i−φ(x.sub.n).sub.i).sup.2)}
[0057] where φ(x) denotes the feature vector extracted by the CNN for a graptolite image x, φ(x).sub.i denotes an ith element in the feature vector, and d(·) denotes the Euclidean distance.
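The distance d(·) between two prediction vectors is the standard Euclidean distance; a plain-Python sketch:

```python
import math

def euclidean_distance(phi_m, phi_n):
    """d(x_m, x_n): Euclidean distance between two prediction vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(phi_m, phi_n)))

print(euclidean_distance([0.0, 3.0], [4.0, 0.0]))  # 5.0
```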
[0058] (2) Quantify a similarity weight according to the genetic relationship of the categories to which the graptolites in the two graptolite images belong. For graptolite images belonging to two categories, the closer the genetic relationship is, the greater the degree of similarity is and the easier it is for the CNN to focus on detail features, which in turn leads to over-fitting; the similarity weight value is therefore set larger to constrain the model. Conversely, the farther the genetic relationship is, the smaller the similarity weight value is set.
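The weight assignment can be sketched as a lookup on the lowest common parent category. The concrete values below (0 for the same species, 1.0 for the same genus, 0.6 for the same family, 0.2 otherwise) are the ones given in the claims; the `(family, genus, species)` tuple encoding of labels is an assumption for illustration:

```python
def similarity_weight(label_i, label_j):
    """Similarity weight for an image pair, determined by the lowest common
    parent category; labels are (family, genus, species) tuples."""
    if label_i == label_j:
        return 0.0   # same species: the pair needs no constraint
    if label_i[:2] == label_j[:2]:
        return 1.0   # same genus, different species: closest relationship
    if label_i[0] == label_j[0]:
        return 0.6   # same family, different genera
    return 0.2       # different families: common parent at the order level

print(similarity_weight(("F1", "G1", "S1"), ("F1", "G1", "S2")))  # 1.0
```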
[0063] Finally, in each batch, the weighted hierarchical constraint loss (HC-Loss) for all graptolite images is calculated as follows:
L.sub.HC(θ)=(1/n)Σw.sub.i,j×d(x.sub.m,x.sub.n)
[0064] where the sum runs over all n groups of graptolite image pairs (x.sub.m,x.sub.n), S denotes all graptolite images in a batch, which are divided into n groups of graptolite image pairs (n=B/2), and w.sub.i,j denotes the weight value determined according to the genetic relationship between category i and category j in the classification hierarchy.
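Putting the random pairing, the Euclidean distance and the weights together, the batch HC-Loss can be sketched as follows; the pairing seed and the `weight_fn` signature are assumptions of this sketch:

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hc_loss(preds, labels, weight_fn, seed=0):
    """Weighted hierarchical constraint loss over one batch.
    preds: B prediction vectors (B even); labels: parallel category labels;
    weight_fn maps two labels to the similarity weight w_ij."""
    idx = list(range(len(preds)))
    random.Random(seed).shuffle(idx)                 # random pairing
    pairs = [(idx[k], idx[k + 1]) for k in range(0, len(idx), 2)]
    total = sum(weight_fn(labels[m], labels[n]) * euclidean(preds[m], preds[n])
                for m, n in pairs)
    return total / len(pairs)                        # average over n = B/2 pairs
```

Because the loss is computed from the prediction vectors alone, it adds no trainable parameters, consistent with the statement in paragraph [0044].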
[0065] Step 4, Calculate a weighted sum of HC-Loss and CE-Loss.
[0066] When HC-Loss is adopted, the total loss function Loss(θ) of the CNN in the training stage consists of two parts: one is CE-Loss, namely L.sub.CE(θ); the other is HC-Loss, namely L.sub.HC(θ):
Loss(θ)=L.sub.CE(θ)+μ×L.sub.HC(θ)
[0067] where θ denotes all parameters in the CNN model, and μ denotes a hyper-parameter configured to control the weight of HC-Loss L.sub.HC. CE-Loss is calculated as follows:
L.sub.CE(θ)=−(1/|S|)Σ.sub.x∈SΣ.sub.c y.sub.x,c log p.sub.x,c
[0068] where y represents the real label distribution, S represents all the input graptolite images in a batch, and p.sub.x,c represents the probability value of category c in the prediction probability distribution of the CNN model for graptolite image x; p.sub.x,c is calculated through the Softmax function.
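CE-Loss and the combined objective can be sketched in the same style. The Softmax, the batch average and the μ weighting follow the formulas above, while the concrete value μ=0.5 is only an assumed example, not a value given in the disclosure:

```python
import math

def softmax(scores):
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def ce_loss(pred_vectors, true_indices):
    """Average cross-entropy over a batch; true_indices hold each image's
    real category index (i.e., a one-hot real label distribution y)."""
    return sum(-math.log(softmax(p)[c])
               for p, c in zip(pred_vectors, true_indices)) / len(pred_vectors)

def total_loss(l_ce, l_hc, mu=0.5):
    """Loss(θ) = L_CE(θ) + μ·L_HC(θ); μ = 0.5 is an assumed example value."""
    return l_ce + mu * l_hc
```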
[0069] In the training stage, the gradient descent method is used to optimize the model parameters during backward propagation of the CNN. If the input data contains graptolite images that have similar visual content but belong to different categories, the similarity between them is quantified by HC-Loss and used as a constraint term to restrain the model from over-learning the detail features that distinguish them, so as to prevent over-fitting and improve the classification accuracy of the model.
[0070] Step 5, Test the classification effect of the present disclosure.
[0071] First, the classification effect of current advanced fine-grained image classification methods is tested on the graptolite datasets. As shown in
[0072] In contrast, the HC-Loss proposed in the present disclosure can effectively improve the classification accuracy of graptolite images by CNN of different architectures without using additional training parameters, and the classification accuracy is higher than the classification results obtained by the current advanced methods. As shown in
[0073] HC-Loss can not only improve the classification effect of the model at the species level, but also significantly improve the classification results of CNN at the family and genus levels. As shown in
[0074] On biological fine-grained image datasets, HC-Loss can also significantly improve the classification accuracy of CNNs of different architectures. As shown in
[0075] As another corresponding embodiment, the present disclosure further provides an embodiment of the system. The hardware structure is shown in
[0076] In
[0077] Data signals are transmitted between the memory 2 and the processor 1 through a bus 3, which will not be repeated herein.
[0078] Based on the same inventive concept, the embodiments of the present disclosure further provide a computer-readable storage medium. The storage medium includes a stored program, and when the program runs, it controls a device where the storage medium is located to implement the steps of the method described in the foregoing embodiments.
[0079] The computer-readable storage medium includes but is not limited to a flash memory, a hard disk, a solid state disk and the like.
[0080] It should be noted herein that the readable storage medium described in the above embodiments corresponds to the method described in the embodiments, which will not be repeated herein.
[0081] Some or all of the functions in the foregoing embodiments may be implemented by software, hardware, firmware, or any combination thereof. When software is used to implement the functions, some or all of the functions may be implemented in a form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedure or functions according to the embodiments of the present disclosure are completely or partially generated.
[0082] The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable apparatus. Computer instructions can be stored in a computer-readable storage medium or transmitted through a computer-readable storage medium. The computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium, a semiconductor medium, or the like.
[0083] Through the description of the foregoing implementations, those skilled in the art can clearly understand that the foregoing embodiments can be implemented by either software or software plus a necessary universal hardware platform. Based on this understanding, the technical solutions according to the embodiments of the present disclosure may be implemented in a form of a software product. The software product may be stored in a non-volatile storage medium (which may be a USB flash drive, a removable hard disk, or the like), and includes a plurality of instructions to enable a computer device (which may be a personal computer, a server, a network device, or the like) to perform the method according to the embodiments of the present disclosure.
[0084] It should be noted herein that the system described in the above embodiment corresponds to the method described in the embodiment, which will not be repeated herein.
[0085] The above merely describes a preferred example of the present disclosure, but the protection scope of the present disclosure is not limited thereto. A person skilled in the art can easily conceive modifications or replacements within the technical scope of the present disclosure, and these modifications or replacements shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure should be subject to the protection scope defined by the claims.