High-definition labeling system for medical imaging AI algorithms
20230118546 · 2023-04-20
Assignee
Inventors
Cpc classification
G06V20/70
PHYSICS
G16H50/20
PHYSICS
G06V10/26
PHYSICS
G06V10/7788
PHYSICS
G06T2207/20101
PHYSICS
G16H50/30
PHYSICS
International classification
G06V10/778
PHYSICS
G06V10/26
PHYSICS
G06V10/94
PHYSICS
G06V20/70
PHYSICS
Abstract
An authoring tool and method by which users (e.g., diagnosticians) are enabled to design, train, and deploy custom-made AI models tailored to their needs and specific to their data. In the approach herein, and using the authoring tool, users are provided the ability to provide (feed) actual labeling to the AI during the model training process itself (i.e., prior to validation testing of the model results themselves), preferably via a master template (or “questionnaire”) that is specific to a single modality-single body part pair.
Claims
1. A method for medical imaging, comprising: providing a template uniquely associated with a modality-body part pair, the template exposing a set of information fields that define multi-layered criteria for diagnosing a feature of interest, the information fields soliciting at least two of: an answer to a question, definition of a region of interest on a medical image by drawing or annotation, and identification of a risk factor; receiving information in the set of information fields, thereby configuring the template with respect to the feature of interest, the information comprising a multi-layered data set; for each of one or more users: as a patient medical image is rendered in a viewer, and in response to identification of the feature of interest, retrieving the configured template; based on the configured template, receiving data entered by the user; and in response, generating a multi-layered labeled data set for the feature of interest; using the multi-layered labeled data sets derived from the one or more users to train a machine learning model; and using the machine learning model to automatically classify the feature of interest for one or more additional medical images.
2. The method as described in claim 1 wherein the one or more users comprise a set of users associated with one of: a given facility, and two or more facilities.
3. The method as described in claim 1 wherein providing the template includes rendering a configuration page that exposes the set of information fields.
4. The method as described in claim 3 wherein the configuration page exposes one or more lesion templates, wherein a lesion template defines the region of interest on the medical image.
5. The method as described in claim 1 wherein the information configuring the template includes one of: an answer to the question, data defining a lesion template, and data defining the one or more risk factors.
6. The method as described in claim 1 wherein the machine learning model provides one of: classification, and segmentation.
7. The method as described in claim 1 wherein the machine learning model is a convolutional neural network (CNN).
8. The method as described in claim 1 wherein receiving data in the configured tool includes activating a segmenting tool for a particular class of pathology as represented by the feature of interest.
9. The method as described in claim 1 wherein the feature of interest is one of: a mass, a calcification and an architectural distortion.
10. The method as described in claim 1 wherein the data received in the configured template includes one of: location, shape, density, and an indication of associated calcification.
11. The method as described in claim 1 wherein the clinical feature of interest is breast cancer and the machine learning model is a Mask-RCNN model.
12. The method as described in claim 1 further including validating the machine learning model.
13. A software-as-a-service (SaaS) platform, comprising: a set of hardware processors; computer memory holding computer program code executed by the one or more hardware processors to train and use a machine learning algorithm for use in medical imaging, the computer program code comprising program code configured to: provide a template uniquely associated with a modality-body part pair, the template exposing a set of information fields that define multi-layered criteria for diagnosing a feature of interest, the information fields soliciting at least two of: an answer to a question, definition of a region of interest on a medical image by drawing or annotation, and identification of a risk factor; receive information in the set of information fields, thereby configuring the template with respect to the feature of interest, the information comprising a multi-layered data set; for each of one or more users: as a patient medical image is rendered in a viewer, and in response to identification of the feature of interest, retrieve the configured template; based on the configured template, receive data entered by the user; and in response, generate a multi-layered labeled data set for the feature of interest; use the multi-layered labeled data sets derived from the one or more users to train a machine learning model; and use the machine learning model to automatically classify the feature of interest for one or more additional images.
14. The SaaS platform as described in claim 13 wherein the one or more users comprise a set of users associated with one or more enterprises or facilities.
15. A computer program product in a non-transitory computer-readable medium, the computer program product comprising computer program code executable by a hardware processor to train and use a machine learning model for use in medical imaging, the computer program code configured to: provide a template uniquely associated with a modality-body part pair, the template exposing a set of information fields that define multi-layered criteria for diagnosing a feature of interest, the information fields soliciting at least two of: an answer to a question, definition of a region of interest on a medical image by drawing or annotation, and identification of a risk factor; receive information in the set of information fields, thereby configuring the template with respect to the feature of interest, the information comprising a multi-layered data set; for each of one or more users: as a patient medical image is rendered in a viewer, and in response to identification of a clinical feature of interest, retrieve the configured template; based on the configured template, receive data entered by the user; and in response, generate a multi-layered labeled data set for the feature of interest; use the multi-layered labeled data sets derived from the one or more users to train a machine learning model; and use the machine learning model to automatically classify the feature of interest for one or more additional images.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] For a more complete understanding of the subject matter and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
[0010]
[0011]
[0012]
[0013]
[0014]
[0015]
[0016]
[0017]
DETAILED DESCRIPTION
[0018]
[0019] In the preferred embodiment, the platform is accessible as a service and thus multiple independent enterprises (e.g., hospitals, health care facilities, offices, labs, etc.) can utilize the authoring tools to self-develop their own medical imaging models. These models can be shared across enterprises. This deployment architecture is not intended to be limiting, as the approach herein can also be implemented in a standalone (or private) manner.
[0020]
[0021]
[0022] To facilitate the machine learning, the platform provides an authoring tool 321 in the form of a web-based editor from which one or more master templates are built. As noted above, and according to this disclosure, users are provided the ability to provide (feed) actual labeling to the AI during the model training process itself (i.e., prior to validation testing of the model results themselves), preferably via a master template (or “questionnaire”) that is specific to a single modality-single body part pair. To generate a master template, the modality and body part are selected from a predefined list, and the master template is provided a unique identifier (or name). Typically, once the modality is selected, body parts are identified according to the selected modality. A given master template typically includes at least one question and/or one lesion template. A question typically requests non-localized information, whereas lesion templates typically seek specific localized information, e.g., that prompt the user to draw on images (e.g., an x-ray) and provide some information identifying a region-of-interest (ROI).
[0023] As depicted in
[0024] As depicted in
[0025] According to this disclosure, the platform exposes a web-based Questionnaire to enable a user (or user group) to custom build questions that pertain to diagnostic criteria of a particular clinical feature of interest, typically an abnormality such as a mass, a lesion, a cyst, or the like. A clinical feature of interest is sometimes referred to herein as a region of interest (ROI). A Questionnaire is sometimes referred to herein as a template. A template exposes a set of information fields that define multi-layered criteria for diagnosing the clinical feature of interest. Typically, the information fields solicit one or more of the following “layers” of information with respect to an image, namely, an answer to a general (i.e., non-localized) question about the clinical feature of interest, a prompt for the user to draw localized information on the image and to enter accompanying descriptive information identifying the ROI, and information about a patient risk factor. Some information, such as the patient risk factor data, may be obtained directly or programmatically from other data sources (e.g., EMRs). During provisioning, a template is configured by specifying the information fields and the data to be captured by those fields. As will be described, the configured template for the clinical feature of interest is sometimes referred to herein as a multi-layered data set. The system then selectively exposes the configured template as other users evaluate imaging studies for the clinical feature of interest. As those other users interact with the configured template and, in particular, as they review the radiographic findings and examine morphological features, the users are prompted by the configured template to enter information specific to what they are viewing. As a result, and for each such interaction, a multi-layered labeled data set for the clinical feature of interest is generated. 
The system then collects these multi-layered labeled data sets and uses them to train an AI algorithm. Once trained, that AI algorithm is then used to classify new images for the feature of interest.
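The layered record described in this paragraph can be pictured with a minimal Python sketch. All class, field, and function names below (RegionOfInterest, LabeledDataSet, collect_for_training) are hypothetical and chosen for illustration only; the disclosure does not specify a storage format. The requirement that a record carry at least two layers before it is collected is likewise an assumption, adopted here from the "at least two of" language in the claims.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RegionOfInterest:
    """Localized layer: a shape drawn on one image plus descriptive attributes."""
    image_id: str
    polygon: List[Tuple[float, float]]  # (x, y) vertices in pixel coordinates
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class LabeledDataSet:
    """One multi-layered labeled data set produced by one user interaction."""
    template_id: str
    user_id: str
    answers: Dict[str, str] = field(default_factory=dict)           # non-localized layer
    regions: List[RegionOfInterest] = field(default_factory=list)   # localized layer
    risk_factors: Dict[str, str] = field(default_factory=dict)      # e.g., from an EMR

    def layers_present(self) -> List[str]:
        """Return the names of the layers that actually carry data."""
        present = []
        if self.answers:
            present.append("answers")
        if self.regions:
            present.append("regions")
        if self.risk_factors:
            present.append("risk_factors")
        return present


def collect_for_training(records: List[LabeledDataSet],
                         template_id: str) -> List[LabeledDataSet]:
    """Gather records for one template that carry at least two layers (assumed rule)."""
    return [r for r in records
            if r.template_id == template_id and len(r.layers_present()) >= 2]
```

In this sketch, each user interaction with a configured template yields one LabeledDataSet, and the filtered list returned by collect_for_training stands in for the collection that is handed to model training.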
[0026] In one embodiment, the Questionnaire is defined by an administrator and then used by a set of clinicians (the users). A service provider may publish (make available) a predefined or configured set of templates from a template repository. A particular entity (e.g., a hospital) or entity location (a regional branch of a hospital, a working group, or the like) may define its own Questionnaire; thus, the nature of the particular ML model that is generated from the data captured from a particular Questionnaire may be global or location specific, entity-specific, department-specific, and the like. As noted above, the information solicited by the template enables particular radiographic findings and morphological features to be examined and annotated to facilitate the high definition training for a self-generated AI model. The Questionnaire thus enables users to define criteria that are useful to feed clinical input to an AI model.
[0027]
[0028] Generalizing, the configuration pages that comprise the Questionnaire comprise a substrate for mapping out the work of image classification, vision segmentation and labeling for a particular ML algorithm of interest. Using the custom fields, appropriate selections are entered to define the diagnostic criteria for the clinical feature of interest that is being defined by the particular Questionnaire. The nature and type of information that are defined/selected by the user will depend on the clinical feature. In an example, assume that the feature of interest is breast cancer. In this example, the user may enter “Mammography for breast cancer” in the Name field and enter an appropriate description of the algorithm in the Description field 608. Typically, the Name field is unique. The modality and body part are selected from predefined dropdown lists. Modality typically refers to a type of imaging machine (e.g., CT, MRI, US (ultrasound), MG (mammogram), etc.). Conveniently, multiple modality data sets can be assigned to the algorithm. Thus, for example, here the user may select both mammogram (MG) and ultrasound studies from the dropdown list for Modality 610. In this example, the user enters “breast” in the Body Part field. The Questionnaire 600 includes at least one question (configured in General Question field 602) or one Lesion Template (configured in Lesion Template field 604), and typically there are multiple questions and multiple lesion templates. The inclusion of a Lesion Template enables the user to draw on images and mark regions of interest (ROI). General Questions typically involve the user just answering a question without drawing (i.e., no localization on a particular image). Typically, each Questionnaire exposes several possible combinations of prompting: an optional list of General Questions (e.g., Do you see any radiological signs of Tuberculosis?) 
and an optional list of Lesion Templates that should be drawn on a specific image in a study (e.g., draw lesion with specific attributes on a given view position of a chest x-ray image). Preferably, the system configurator does not allow the administrator or other user to provision multiple questionnaires for the same modality and same body part. In other words, a Questionnaire for a modality-body part pair is unique. Preferably, entries in the Modality and Body Part fields are mandatory. Once the Modality is selected from the dropdown, the Body Parts are filled according to the selected Modality. As noted above, a user can optionally add one or more Image Grouping Criteria (e.g., attribute name and supported values) from several options, e.g., image view position, image laterality, image type (e.g., the user can select an attribute name=image laterality with L and R supported values). Preferably, the user is prevented from adding grouping criteria with the same attribute. More generally, the user (or at least an authorized user, such as an administrator) has the ability to add, edit or delete a Questionnaire. Preferably, the system confirms the user's intention (e.g., via a prompt) when he or she is attempting to delete a template that is currently linked with branches to one or more other templates.
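The provisioning rules stated in this paragraph (mandatory Modality and Body Part, a unique Name, one Questionnaire per modality-body part pair, at least one General Question or Lesion Template, and no repeated grouping attribute) can be sketched as follows. This is an illustrative Python sketch only; the Questionnaire and QuestionnaireRegistry names are hypothetical, and for simplicity the sketch assigns a single modality to each Questionnaire even though the text notes that multiple modality data sets can be assigned.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Questionnaire:
    """A master template for one modality-body part pair (hypothetical model)."""
    name: str
    modality: str
    body_part: str
    general_questions: List[str] = field(default_factory=list)
    lesion_templates: List[str] = field(default_factory=list)
    grouping_criteria: Dict[str, List[str]] = field(default_factory=dict)

    def add_grouping_criterion(self, attribute: str, values: List[str]) -> None:
        # The same attribute may not be added twice.
        if attribute in self.grouping_criteria:
            raise ValueError(f"grouping criterion {attribute!r} already defined")
        self.grouping_criteria[attribute] = list(values)


class QuestionnaireRegistry:
    """Stores Questionnaires and enforces the uniqueness rules on insertion."""

    def __init__(self) -> None:
        self._by_pair: Dict[Tuple[str, str], Questionnaire] = {}

    def add(self, q: Questionnaire) -> None:
        if not q.modality or not q.body_part:
            raise ValueError("modality and body part are mandatory")
        if not q.general_questions and not q.lesion_templates:
            raise ValueError("need at least one general question or lesion template")
        if any(existing.name == q.name for existing in self._by_pair.values()):
            raise ValueError(f"name {q.name!r} is already in use")
        key = (q.modality, q.body_part)
        if key in self._by_pair:
            raise ValueError(f"a questionnaire already exists for {key}")
        self._by_pair[key] = q

    def get(self, modality: str, body_part: str) -> Questionnaire:
        return self._by_pair[(modality, body_part)]
```

Keying the registry on the (modality, body part) tuple makes the uniqueness rule a simple dictionary lookup; a production system would more likely enforce it with a database constraint.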
[0029] Referring back to the General Questions field, preferably each question has a specific type, such as MCQ (multiple choice), OCQ (one choice), Polar (yes/no) and Fill In (answer is free text). Preferably, a question has the following attributes: Question Text, Question Type (MCQ, OCQ, etc.), and whether the question is Mandatory or Optional (default is Mandatory).
[0030] Question details typically are defined according to the Question Type. For MCQ and OCQ questions, each question must have one or more possible answers, and preferably the system affords the user the ability to delete or edit a specific answer. For Polar and Fill-In questions, no specific details are required.
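The question attributes and per-type rules above lend themselves to a small validation sketch. The following Python is illustrative only: the QuestionType, Question, and check_response names are hypothetical, and the response-checking rules (for instance, accepting only "yes" or "no" for a Polar question) are assumptions that go slightly beyond what the text states.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class QuestionType(Enum):
    MCQ = "multiple choice"
    OCQ = "one choice"
    POLAR = "yes/no"
    FILL_IN = "free text"


@dataclass
class Question:
    text: str
    qtype: QuestionType
    mandatory: bool = True                      # default is Mandatory
    answers: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Check the definition-time rules for this question type."""
        if not self.text.strip():
            raise ValueError("question text is required")
        if self.qtype in (QuestionType.MCQ, QuestionType.OCQ) and not self.answers:
            raise ValueError("MCQ and OCQ questions need one or more possible answers")


def check_response(q: Question, response: List[str]) -> bool:
    """Return True if a user's response is acceptable (assumed rules)."""
    if not response:
        return not q.mandatory                  # empty is fine only if optional
    if q.qtype is QuestionType.MCQ:
        return all(r in q.answers for r in response)
    if q.qtype is QuestionType.OCQ:
        return len(response) == 1 and response[0] in q.answers
    if q.qtype is QuestionType.POLAR:
        return len(response) == 1 and response[0] in ("yes", "no")
    return len(response) == 1                   # FILL_IN: any single free-text entry
```

Here validate covers what the designer must supply when the question is defined, while check_response covers what a clinician enters later when the configured template is exposed.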
[0031] Selection of an entry in the Lesion Templates page navigates the user to a linked Lesions page 700, such as depicted in
[0032] Lesion Templates enable classes of anomalies (mass, cyst, calcification, architectural distortion, etc.) to be segmented on the images displayed to the user. Each type of anomaly typically has an associated type for the segmenting tool, and typically there are multiple segmenting tools associated with a template. Using the Lesion Template, and as described above, the designer can add questions about the morphology of each lesion (e.g., lesion type, location, shape, density, associated calcification, etc.).
[0033]
[0034] As noted above, the platform of this disclosure enables ML algorithms to be developed in a DIY manner. A representative field of use is for radiology, although this is not a limitation. The platform enables researchers, clinicians and data scientists to self-develop machine learning algorithms that require zero coding and that are clinically-validated. From a high level configuration page, the platform enables the designer to select (e.g., from a dropdown) a type of ML model (e.g., segmentation, classification, or the like), and to enable the model for use in the imaging system. Selection of “segmentation” type enables the user to segment lesions, and selection of the “classification” type enables classification of the image. From this configuration page, the designer can identify a user or set of users to participate in the model training, i.e., to work collaboratively, in the training by interactions with the templates. In addition, and to facilitate training image models, the platform preferably is linked with available data sources in a facility (or across facilities) so that facilities can access and use their existing data sets (e.g., images) as well as incoming or other data sets available to them.
[0035] In the example referenced above, a questionnaire is developed for an algorithm that is developed to study breast malignancy on conventional diagnostic mammograms. For example, the model is a Mask-RCNN model, which provides pixel-based segmentation of lesions in the mammogram, and in this example the model is trained on six (6) different classes: Benign Architectural Distortion, Benign Calcification, Benign Masses, Malignant Architectural Distortion, Malignant Calcification, and Malignant Masses. By way of background, CNN refers to a Convolutional Neural Network, which is an artificial neural network architecture that consists of three main types of layers: a convolutional layer, a pooling layer, and one or more fully connected layers. The convolutional layer abstracts an image input as a feature map via the use of filters. The pooling layer down-samples feature maps by summarizing the presence of features therein. The fully connected layers connect every neuron in one layer to every neuron in another layer. R-CNN refers to a family of CNN-based machine learning models for computer vision and specifically object detection. Given an input image, R-CNN begins by applying a mechanism called selective search to extract regions of interest (ROI), where each ROI is a rectangle that may represent the boundary of an object in the image. Each ROI is fed through a neural network to produce output features. For each ROI's output features, a collection of support-vector machine classifiers is used to determine what type of object (if any) is contained within the ROI. While the original R-CNN independently computed the neural network features on each of the regions of interest, Fast R-CNN runs the neural network once on the whole image. Further, whereas previous versions of R-CNN focused on object detection, Mask R-CNN adds the capability for instance segmentation. 
In this example, the high definition training data collected by the user interaction(s) with the system are utilized for training the Mask-RCNN model.
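As an illustration of how the six classes named above might be turned into integer training targets, the following Python sketch builds a label map from the two pathology values and three finding types. The constant and function names are hypothetical, and reserving index 0 for a background class is an assumption based on a common convention in instance-segmentation tooling, not something the disclosure specifies.

```python
from itertools import product
from typing import Dict, List

PATHOLOGIES = ("Benign", "Malignant")
FINDINGS = ("Architectural Distortion", "Calcification", "Masses")

# Index 0 is reserved for background (assumed convention); the six lesion
# classes follow in the same order in which the text lists them.
CLASS_NAMES: List[str] = ["Background"] + [
    f"{pathology} {finding}" for pathology, finding in product(PATHOLOGIES, FINDINGS)
]
CLASS_TO_ID: Dict[str, int] = {name: i for i, name in enumerate(CLASS_NAMES)}


def class_id(pathology: str, finding: str) -> int:
    """Map a (pathology, finding) pair from a lesion annotation to a class index."""
    return CLASS_TO_ID[f"{pathology} {finding}"]
```

A lesion drawn with the mass segmenting tool and labeled malignant would, under this mapping, be stored with the index returned by class_id("Malignant", "Masses").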
[0036] Assume now that the user (e.g. a Radiologist) has selected a study for analysis. Image rendering systems may be used for this purpose. In a preferred embodiment, the platform technologies and tools of this disclosure are integrated with the existing image rendering system, e.g., and exposed as an option for selection. To provide a concrete example, and with reference to
[0037] Continuing with the breast cancer detection model as the example, breast malignancy can manifest in the form of a mass, a cluster of micro-calcifications, and/or architectural distortion. Thus, in this example, the segmentation toolbar also exposes other tabs (panels) for enabling each of these anomalies to be selected as appropriate. These toolbar options are shown in
[0038] As noted above, information obtained by the questionnaire may be augmented with additional relevant Risk Factor data, such as laboratory results, data from the patient's health records, and the like.
[0039] A particular template may identify multiple features of interest for labeling. Thus, for example, a first level feature of interest may be a tumor that has various associated characteristics, such as the tumor's contour. In this example, the template may also expose an additional set of questions or annotation/drawing options with respect to these additional characteristics. These template elements constitute a second level of labeling.
[0040] The techniques of this disclosure have many advantages. Self-authoring of AI-based medical imaging algorithms as described herein reduces patient privacy risk, reduces time and cost, and reduces or obviates extensive software coding. Imaging systems that incorporate the described technologies are more robust and efficient, as they enable automated detection of sophisticated radiomics. The techniques herein provide for improvements to such imaging technologies.
Enabling Technologies
[0041] Typically, the computing platform (
[0042] The system (
[0043] As described above, the platform supports a machine learning system. The nature and type of Machine Learning (ML) algorithms that are used to process the query may vary. As is known, ML algorithms iteratively learn from the data, thus allowing the system to find hidden insights without being explicitly programmed where to look. ML tasks are typically classified into various categories depending on the nature of the learning signal or feedback available to a learning system, namely supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, the algorithm trains on labeled historic data and learns general rules that map input to output/target. The discovery of relationships between the input variables and the label/target variable in supervised learning is done with a training set, and the system learns from the training data. In this approach, a test set is used to evaluate whether the discovered relationships hold and the strength and utility of the predictive relationship is assessed by feeding the model with the input variables of the test data and comparing the label predicted by the model with the actual label of the data. The most widely used supervised learning algorithms are Support Vector Machines, linear regression, logistic regression, naive Bayes, and neural networks. As will be described, the techniques herein preferably leverage a network of neural networks. Formally, a NN is a function $g: X \to Y$, where $X$ is an input space, and $Y$ is an output space representing a categorical set in a classification setting (or a real number in a regression setting). For a sample $x$ that is an element of $X$, $g(x) = f_L(f_{L-1}(\cdots(f_1(x))))$. Each $f_i$ represents a layer, and $f_L$ is the last output layer. The last output layer creates a mapping from a hidden space to the output space (class labels) through a softmax function that outputs a vector of real numbers in the range [0, 1] that add up to 1. 
The output of the softmax function is a probability distribution of input x over C different possible output classes.
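The layer composition and softmax output described above can be illustrated with a short, dependency-free Python sketch. The toy layers (scale and relu) and the helper names are hypothetical stand-ins chosen for the example; they are not layers of the model described in this disclosure.

```python
import math
from functools import reduce
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def softmax(z: Vector) -> Vector:
    """Map real-valued scores to probabilities in [0, 1] that sum to 1."""
    m = max(z)                                   # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def compose(layers: Sequence[Layer]) -> Layer:
    """Build g(x) = f_L(f_{L-1}(...f_1(x))) from layers given in order f_1..f_L."""
    def g(x: Vector) -> Vector:
        return reduce(lambda h, f: f(h), layers, x)
    return g


def scale(h: Vector) -> Vector:
    """Toy hidden layer: multiply every component by 2."""
    return [2.0 * v for v in h]


def relu(h: Vector) -> Vector:
    """Toy hidden layer: clamp negative components to 0."""
    return [max(0.0, v) for v in h]


g = compose([scale, relu, softmax])              # softmax plays the role of f_L
probs = g([1.0, -1.0, 0.5])
predicted_class = probs.index(max(probs))        # index of the most probable class
```

The predicted class is simply the index of the largest entry in the probability vector, which is how a softmax output is typically read in a classification setting.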
[0044] Thus, for example, in one embodiment, and without limitation, a neural network such as described is used to extract features from an utterance, with those extracted features then being used to train a Support Vector Machine (SVM).
[0045] One or more functions of the computing platform of this disclosure may be implemented in a cloud-based architecture (
[0046] The platform may comprise co-located hardware and software resources, or resources that are physically, logically, virtually and/or geographically distinct. Communication networks used to communicate to and from the platform services may be packet-based, non-packet based, and secure or non-secure, or some combination thereof.
[0047] More generally, the techniques described herein are provided using a set of one or more computing-related entities (systems, machines, processes, programs, libraries, functions, or the like) that together facilitate or provide the described functionality described above. In a typical implementation, a representative machine on which the software executes comprises commodity hardware, an operating system, an application runtime environment, and a set of applications or processes and associated data, that provide the functionality of a given system or subsystem. As described, the functionality may be implemented in a standalone machine, or across a distributed set of machines.
[0048] Other enabling technologies for the machine learning algorithms include, without limitation, vector autoregressive modeling (e.g., Autoregressive Integrated Moving Average (ARIMA)), state space modeling (e.g., using a Kalman filter), a Hidden Markov Model (HMM), recurrent neural network (RNN) modeling, RNN with long short-term memory (LSTM), Random Forests, Generalized Linear Models, Extreme Gradient Boosting, Extreme Random Trees, and others. By applying these modeling techniques, new types of features are extracted, e.g., as follows: model parameters (e.g. coefficients for dynamics, noise variance, etc.), latent states, and predicted values for a next couple of observation periods.
[0049] Typically, but without limitation, a client device is a mobile device, such as a smartphone, tablet, or wearable computing device, or a laptop or desktop. A typical mobile device comprises a CPU (central processing unit), computer memory, such as RAM, and a drive. The device software includes an operating system (e.g., Google® Android™, or the like), and generic support applications and utilities. The device may also include a graphics processing unit (GPU). The mobile device also includes a touch-sensing device or interface configured to receive input from a user's touch and to send this information to the processor. The touch-sensing device typically is a touch screen. The mobile device comprises suitable programming to facilitate gesture-based control, in a manner that is known in the art.
[0050] Generalizing, the mobile device is any wireless client device, e.g., a cellphone, pager, a personal digital assistant (PDA, e.g., with GPRS NIC), a mobile computer with a smartphone client, or the like. Other mobile devices in which the technique may be practiced include any access protocol-enabled device (e.g., an Android™-based device, or the like) that is capable of sending and receiving data in a wireless manner using a wireless protocol. Typical wireless protocols are: WiFi, GSM/GPRS, CDMA or WiMax. These protocols implement the ISO/OSI Physical and Data Link layers (Layers 1 & 2) upon which a traditional networking stack is built, complete with IP, TCP, SSL/TLS and HTTP.
[0051] Each above-described process preferably is implemented in computer software as a set of program instructions executable in one or more processors, as a special-purpose machine.
[0052] While the above describes a particular order of operations performed by certain embodiments, it should be understood that such order is exemplary, as alternative embodiments may perform the operations in a different order, combine certain operations, overlap certain operations, or the like. References in the specification to a given embodiment indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic.
[0053] While the disclosed subject matter has been described in the context of a method or process, the subject matter also relates to apparatus for performing the operations herein. This apparatus may be a particular machine that is specially constructed for the required purposes, or it may comprise a computer otherwise selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including an optical disk, a CD-ROM, and a magnetic-optical disk, a read-only memory (ROM), a random access memory (RAM), a magnetic or optical card, or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.
[0054] A given implementation of the computing platform is software that executes on a hardware platform running an operating system such as Linux. A machine implementing the techniques herein comprises a hardware processor, and non-transitory computer memory holding computer program instructions that are executed by the processor to perform the above-described methods.
[0055] The functionality may be implemented with other application layer protocols besides HTTP/HTTPS, or any other protocol having similar operating characteristics.
[0056] There is no limitation on the type of computing entity that may implement the client-side or server-side of the connection. Any computing entity (system, machine, device, program, process, utility, or the like) may act as the client or the server.
[0057] While given components of the system have been described separately, one of ordinary skill will appreciate that some of the functions may be combined or shared in given instructions, program sequences, code portions, and the like. Any application or functionality described herein may be implemented as native code, by providing hooks into another application, by facilitating use of the mechanism as a plug-in, by linking to the mechanism, and the like.
[0058] The platform functionality (
[0059] Each above-described process preferably is implemented in computer software as a set of program instructions executable in one or more processors, as a special-purpose machine.
[0060] As previously noted, the techniques herein generally provide for the above-described improvements to a technology or technical field (e.g., medical imaging systems), as well as the specific technological improvements to various fields, all as described above.
[0061] Preferably, the authoring tool is implemented as a web-based editor tool, namely, software executing on a hardware processor.
[0062] Of course, the breast cancer detection model described above is not intended to be limiting, as the basic approach herein can be used for many other types of diseases of interest and their associated modalities. These include, without limitation, common thorax disease (DX|CR modality), intracranial hemorrhage (CT modality), acute liver failure (ALF) (CT modality), thyroid nodules (US modality), Liver tumors (MR modality), bone age assessment (DX|CR modality), liver cancer (CT modality), hand tumor assessment (CR|DX modality), focal splenic lesions (CT), Reed Sternberg Cell (OT modality), Ductal Carcinoma in situ (SM|OT modality), CT lung nodules (CT), focal sclerotic lesions (MR), cell type classification (OT|SM), and others. The AI models for these conditions of course will vary.
[0063] What is claimed is as follows: