METHOD AND SYSTEM FOR ANNOTATION OF MEDICAL IMAGES
20230096522 · 2023-03-30
Inventors
Cpc classification
G16H50/20
PHYSICS
G16H15/00
PHYSICS
International classification
Abstract
The present invention relates to image data processing, in particular to a method and system for annotation of medical images. The method includes: retrieving a plurality of medical images; preparing reports by physicians based on assessment of the retrieved medical images; sending the initial data, which contains the images and reports, from a database to a merge selector, wherein the data is prepared and exported for annotation; checking the returned annotated data for discrepancies and disagreements; sending correctly annotated data, free of discrepancies and disagreements, from the database as input to model training, thereby generating a trained model which makes automatic data annotations, wherein all decisions made by the trained model are additionally checked, and if there are discrepancies, the decisions are corrected and returned to the database to improve the model training.
Claims
1. A method for improving a training model for annotation of medical images, comprising: retrieving a plurality of medical images; preparing reports by physicians based on assessment of the retrieved medical images; sending the initial data that contains images and reports from a database to a merge selector, wherein the data is prepared and exported for annotation; checking the returned annotated data for discrepancies and disagreements; sending correctly annotated data without discrepancies and disagreements from the database as input to the model training, generating a trained model which makes automatic data annotation; characterized in that all decisions made by the trained model are checked additionally, and if there are discrepancies, the decisions are corrected and returned to the database for the improvement of the model training.
2. The method of claim 1, characterized in that the segmentation tool is used for data annotation creating the mandatory tissue and pathology segmentation and classifications.
3. The method of claim 1, characterized in that data annotation is performed for data samples where the software can learn some new valuable information.
4. The method of claim 1, characterized in that the merge selector selects data with incorrect and/or mandatory segmentations, and incorrect classifications based on those segmentations, and only the chosen data is sent for a second check and annotation.
5. The method of claim 1, characterized in that data for annotation is sent to an arbiter, if there are classification disagreements, wherein the arbiter makes the final decision creating a majority opinion.
6. The method of claim 1, characterized in that an intelligent segmentation merge process is used to take annotations from multiple inputs, compared by provided classification and segmentation.
7. The method of claim 1, characterized in that the trained model creates automated annotations, wherein annotations are used by the intelligent segmentation merge process to fill in missing data.
8. The method of claim 1, characterized in that the data generated from the trained model is merged with the data in the database to make it easier to annotate the future data.
9. A data processing system comprising means for carrying out the steps of the method according to claim 1.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0066] The following embodiments of the method for annotating medical images are described with reference to the enclosed figures:
[0067]
[0068]
[0069] DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0070] The present invention discloses a method and a system for annotation of medical images.
[0071] First, a plurality of medical images is acquired. The images can be provided by any kind of image acquisition apparatus (102), for example an X-ray apparatus, a computed tomography scanner or a magnetic resonance scanner, located in a medical center (101), or can be retrieved from an image database.
[0072] Once the images are acquired, they are reviewed and analysed by a physician, and a report is issued (103) based on assessment of images.
[0073] Before the images and the report are sent outside the medical center (101), they are anonymized (104) to protect the identity of the patient.
[0074] The studies (the data) containing images and reports go through a second anonymization process to ensure their anonymity. Afterwards, the studies are saved in a database (106) and flagged as original (not yet annotated).
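By way of non-limiting illustration only, the two anonymization passes and the "original" flag may be sketched as follows. The field names, the `IDENTIFYING_FIELDS` set and the list-based database are hypothetical assumptions made for the sketch; the disclosure does not specify which fields are removed or how the database (106) is implemented.

```python
# Hypothetical set of identifying fields; the disclosure does not list them.
IDENTIFYING_FIELDS = frozenset({"patient_name", "patient_id", "birth_date"})


def anonymize(study: dict) -> dict:
    """Return a copy of the study without any identifying fields."""
    return {k: v for k, v in study.items() if k not in IDENTIFYING_FIELDS}


def store_original(study: dict, database: list) -> dict:
    """Anonymize twice, then save the study flagged as original."""
    first_pass = anonymize(study)        # performed inside the medical center
    second_pass = anonymize(first_pass)  # second pass before saving
    if IDENTIFYING_FIELDS & second_pass.keys():
        raise ValueError("identifying fields survived anonymization")
    record = {**second_pass, "flag": "original"}  # not yet annotated
    database.append(record)
    return record
```

In this sketch the second pass repeats the first; in practice it could apply stricter or different checks.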
[0075] The studies are sent from the database (106) to the merge selector (107), wherein the merge selector (107) selects which studies should be sent for annotation based on the information available for each study.
[0076] A specially designed segmentation tool (108) is used for data annotation. It is used to create the mandatory tissue and pathology segmentations, and additionally to create classifications. Once the annotation process is complete, the studies are sent back and saved in the database (106). They are flagged as annotated during importing. The annotated studies are passed through the merge selector (107) once again, and if there are missing mandatory tissues, the segmentations from the algorithm are added. If there are still missing tissues or segmentation-based disagreements, only the slices in need of correction are selected and sent for a second check and annotation. Additionally, a structured report containing a list of the pathologies and mandatory tissues is included. The report is used to assist in determining what corrections need to be made. The correction process follows the aforementioned method of annotation.
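By way of non-limiting illustration only, the step of adding algorithm-generated segmentations for missing mandatory tissues and then selecting only the slices that still need correction may be sketched as follows. The tissue names, the per-slice dictionary layout and the assumption that every slice must contain every mandatory tissue are hypothetical simplifications, not part of the disclosure.

```python
from typing import Dict, List, Tuple

# Hypothetical mandatory tissues; the disclosure does not enumerate them.
MANDATORY_TISSUES = ("vertebra", "disc")

# A slice maps a tissue name to its segmentation mask (any object).
Slice = Dict[str, object]


def merge_slice(annotated: Slice, model: Slice) -> Slice:
    """Fill mandatory tissues missing from the annotation with model output."""
    merged = dict(annotated)
    for tissue in MANDATORY_TISSUES:
        if tissue not in merged and tissue in model:
            merged[tissue] = model[tissue]
    return merged


def select_slices_for_correction(
    annotated: Dict[int, Slice], model: Dict[int, Slice]
) -> Tuple[Dict[int, Slice], List[int]]:
    """Merge every slice and list the slice indices still missing tissues."""
    merged_study: Dict[int, Slice] = {}
    to_correct: List[int] = []
    for index in sorted(annotated):
        merged = merge_slice(annotated[index], model.get(index, {}))
        merged_study[index] = merged
        if any(tissue not in merged for tissue in MANDATORY_TISSUES):
            to_correct.append(index)
    return merged_study, to_correct
```

Only the indices returned in `to_correct` would be exported for the second check; slices completed by the model output are not re-sent in this sketch.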
[0077] If there are classification disagreements, the studies are selected and sent to an arbiter. Using another specially designed tool (109), the arbiter resolves the disagreements by creating a majority opinion. The studies are then imported back into the database (106). After this step, they are passed through the merge selector (107) again.
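By way of non-limiting illustration only, the creation of a majority opinion may be sketched as follows. The function names are hypothetical and do not describe the tool (109) itself; the fallback to the arbiter's label when no majority emerges is an assumption based on the arbiter making the final decision.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_opinion(labels: Sequence[str]) -> Optional[str]:
    """Return the label held by a strict majority of votes, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def resolve_with_arbiter(labels: Sequence[str], arbiter_label: str) -> str:
    """Add the arbiter's vote to the existing ones and return the outcome.

    Assumption: if the votes still show no majority, the arbiter's label
    is taken as the final decision.
    """
    decision = majority_opinion(list(labels) + [arbiter_label])
    return decision if decision is not None else arbiter_label
```

For example, two annotators who disagree on a classification produce no majority on their own; the arbiter's vote for one of the two labels creates it.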
[0078] The prepared and corrected studies are merged again. If all the mandatory tissues are present and there are no more disagreements on any of the classifications, the studies are flagged as “ready for training”. The “ready for training” studies are compiled and sent to be used as an input for the model training (110).
A study is considered “complete” and ready to be imported into the final database for training when:
[0079] there are no disagreements left on the classifications;
[0080] all the mandatory tissues are present;
[0081] all tissues for pathologies associated with the classifications are present.
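By way of non-limiting illustration only, the three completeness conditions listed above may be checked as in the following sketch. The `Study` structure, the tissue and pathology names and the "present" vote value are hypothetical assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Hypothetical vocabularies; the disclosure does not enumerate them.
MANDATORY_TISSUES = frozenset({"vertebra", "disc"})
PATHOLOGY_TISSUES = {"herniation": frozenset({"herniated_disc"})}


@dataclass
class Study:
    classifications: Dict[str, List[str]]  # pathology -> one vote per annotator
    segmented_tissues: Set[str]
    flag: str = "annotated"


def is_complete(study: Study) -> bool:
    """Check the three conditions for a complete study."""
    # 1. No disagreements left on the classifications.
    if any(len(set(votes)) > 1 for votes in study.classifications.values()):
        return False
    # 2. All the mandatory tissues are present.
    if not MANDATORY_TISSUES <= study.segmented_tissues:
        return False
    # 3. Tissues for every pathology classified as present are segmented.
    for pathology, votes in study.classifications.items():
        if votes and votes[0] == "present":
            required = PATHOLOGY_TISSUES.get(pathology, frozenset())
            if not required <= study.segmented_tissues:
                return False
    return True


def flag_if_ready(study: Study) -> Study:
    """Flag the study as ready for training when it is complete."""
    if is_complete(study):
        study.flag = "ready for training"
    return study
```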
[0082] The model training (110) is the process of teaching a model how to make segmentations and, afterwards, measurements based on them. Using these segmentations and measurements, the software can then make its own classification. Part of the studies is used during the learning process of the software. After training, the model is tested on a test set of studies, which is another part of the already annotated data. In order to improve the efficiency of the model in relation to the volume of training data, the task is divided into subtasks, which are trained simultaneously. Training is the minimization of a common goal function, which is a combination of the goal functions of each of the subtasks. The subtasks are localization, segmentation and classification of individual objects.
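By way of non-limiting illustration only, one common way to combine the goal functions of the three subtasks is a weighted sum. The weighted-sum form and the weights $\lambda_{\mathrm{loc}}$, $\lambda_{\mathrm{seg}}$, $\lambda_{\mathrm{cls}}$ are assumptions; the disclosure states only that the common goal function is a combination of the subtask goal functions. Here $\theta$ denotes the model parameters shared by the subtasks.

```latex
\[
\mathcal{L}(\theta)
  = \lambda_{\mathrm{loc}}\,\mathcal{L}_{\mathrm{loc}}(\theta)
  + \lambda_{\mathrm{seg}}\,\mathcal{L}_{\mathrm{seg}}(\theta)
  + \lambda_{\mathrm{cls}}\,\mathcal{L}_{\mathrm{cls}}(\theta),
\qquad
\theta^{*} = \arg\min_{\theta}\, \mathcal{L}(\theta)
\]
```

Because all three terms depend on the same parameters, minimizing the sum trains the subtasks simultaneously rather than one after another.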
[0083] The information given by the software is compared to the ground truth, wherein the ground truth is the information from the annotation. Statistics are created based on this comparison to measure the accuracy of the software. If the accuracy is low, the model is trained again using an improved dataset as input. The model also uses some rule-based techniques to determine the presence or specific characteristics of different pathologies.
[0084] The trained model (111) is the machine learning model (algorithm), which is the product of the successful model training (110). The trained model can then be used in software as the main resource (112) that makes the analytics.
[0085] Additionally, the trained model can also be used to create the aforementioned segmentations used in the merge selector when merging data before exporting it. This ensures that any missed tissues are added to the annotated data before the data is sent for review and/or correction.
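By way of non-limiting illustration only, the comparison against the ground truth may be sketched as follows. The choice of the Dice coefficient for segmentations, plain accuracy for classifications, and the threshold values are assumptions; the disclosure does not name the statistics used or define when accuracy is low.

```python
from typing import Sequence, Set, Tuple

Pixel = Tuple[int, int]


def dice(predicted: Set[Pixel], truth: Set[Pixel]) -> float:
    """Dice overlap between a predicted mask and the annotated mask."""
    if not predicted and not truth:
        return 1.0  # both empty: treated as perfect agreement
    return 2 * len(predicted & truth) / (len(predicted) + len(truth))


def classification_accuracy(predicted: Sequence[str], truth: Sequence[str]) -> float:
    """Fraction of classifications that match the ground truth."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def needs_retraining(mean_dice: float, accuracy: float,
                     min_dice: float = 0.8, min_accuracy: float = 0.9) -> bool:
    """Assumed rule: retrain when either statistic falls below its threshold."""
    return mean_dice < min_dice or accuracy < min_accuracy
```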
[0086] The database (106) consists of initial data (201) containing images and reports from physicians, and annotated data (202) containing classifications and segmentations made for the studies.
[0087] The reports are put through a custom parser (203) that picks out the different pathologies and their specifications, and creates the initial classifications in the form of a structured report. The parser is a custom script that reads the provided report and extracts only the needed information through a report parsing process.
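By way of non-limiting illustration only, a report parser of this kind may be sketched with simple pattern matching. The pathology vocabulary, the negation rule and the output layout are hypothetical assumptions; the custom parser (203) of the disclosure is not described at this level of detail.

```python
import re
from typing import Dict, List

# Hypothetical vocabulary; the disclosure does not list the pathologies.
PATHOLOGY_TERMS = ("herniation", "stenosis", "fracture")
NEGATION = re.compile(r"\b(?:no|without)\b")


def parse_report(text: str) -> List[Dict[str, object]]:
    """Extract mentioned pathologies as a simple structured report."""
    findings: List[Dict[str, object]] = []
    # Split the free-text report into sentences at end punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        lowered = sentence.lower()
        for term in PATHOLOGY_TERMS:
            match = re.search(rf"\b{re.escape(term)}\b", lowered)
            if match is None:
                continue
            # A negation word earlier in the sentence marks the finding absent.
            negated = NEGATION.search(lowered[: match.start()]) is not None
            findings.append(
                {"pathology": term, "present": not negated, "source": sentence}
            )
    return findings
```

Each entry keeps the source sentence so that an annotator can verify the initial classification against the original wording.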
[0088] The initial data (201) goes through the merge selector (107), before being sent for initial annotation. Inside the merge selector (107) the data is prepared and exported. Then it is sent for annotation.
[0089] After the annotation process, the data is returned and is ready to be checked by the merge selector (107) again. Inside the merge selector (107), the data is first imported using an import script. The imported data (205) is merged with the data generated by the trained model (111). Afterwards, the data is checked by the merge selector (107) for any incorrect or missing segmentations and for classification disagreements.
[0090] If there are incorrect/missing segmentations or classification disagreements between the initial annotation and the algorithm-generated one, the information is prepared again and is sent for further correction.
[0091] If all the mandatory tissues are present and correctly segmented, the data is flagged as “ready for training”. The data is then sent to be used as input for the model training (110), the implementation of which is described above.
[0092] The model is the product of the whole training process. It is then used to generate data that is used by the merge selector (107) in the aforementioned steps.
[0093] The above description of the most preferred embodiments is provided to illustrate and describe the present invention. It is not an exhaustive or limiting description intended to fix a specific form or embodiment. Many modifications and variations will be apparent to those skilled in the art. The embodiments were selected and described so that those skilled in the art may better understand the principles of the present invention and their practical application in various embodiments, with modifications suited to a particular use. It is intended that the scope of the invention be defined by the accompanying claims and their equivalents, in which all of the above terms have their broadest meaning unless otherwise indicated.
[0094] The embodiments described may be modified by those skilled in the art within the scope of the present invention as defined in the claims.