SYSTEMS AND METHODS USING WEIGHTED-ENSEMBLE SUPERVISED-LEARNING FOR AUTOMATIC DETECTION OF OPHTHALMIC DISEASE FROM IMAGES
20170357879 · 2017-12-14
Assignee
Inventors
CPC classification
A61B2576/02
HUMAN NECESSITIES
G06F18/214
PHYSICS
G06V10/454
PHYSICS
A61B3/0025
HUMAN NECESSITIES
G06F18/254
PHYSICS
G16H50/20
PHYSICS
A61B3/14
HUMAN NECESSITIES
G06N3/082
PHYSICS
International classification
A61B3/00
HUMAN NECESSITIES
Abstract
Disclosed herein are systems, methods, and devices for classifying ophthalmic images according to disease type, state, and stage. The disclosed invention details systems, methods, and devices that perform the aforementioned classification based on weighted-linkage of an ensemble of machine learning models. In some embodiments, each model is trained on a training dataset and tested on a test dataset. In other embodiments, the models are ranked based on classification performance, and model weights are assigned based on model rank. To classify an ophthalmic image, the image is presented to each model of the ensemble for classification, yielding a probabilistic classification score from each model. Using the model weights, a weighted average of the individual model-generated probabilistic scores is computed and used for the classification.
Claims
1. A method for weighted-ensemble training of machine-learning models to classify ophthalmic images according to features such as disease type and state, wherein the method comprises:
a) an ensemble of machine-learning models, each of which comprises:
i. a feature extraction mechanism;
ii. a classification mechanism;
b) a step to split the input data into training and test sets;
c) a step to initialize the weights;
d) for each model, a step in which the feature extraction mechanism yields a feature vector or other object encoding the ophthalmic image features;
e) for each model, a step in which the feature vector is passed into the classifier to yield a class prediction;
f) for each model, a mechanism to iteratively update the weights to reduce class prediction error;
g) for each model, a stopping mechanism for the iteration;
h) a step to compare and rank the models based on their performance on a test dataset;
i) a step to assign weights to the various models in the ensemble;
j) given a subject ophthalmic image, a step to compute the weighted average of the class predictions of the plurality of models, and to choose the ophthalmic image class based on this weighted-averaging step.
2. The method of claim 1, wherein some model of the ensemble is a convolutional neural network.
3. The method of claim 1, wherein some model of the ensemble is a recurrent neural network.
4. The method of claim 1, wherein a rectified linear unit (ReLU) or leaky ReLU is used as the activation function of the hidden layers.
5. The method of claim 1, wherein a softmax function is used as the activation function of the output layer.
6. The method of claim 1, wherein batch normalization is performed.
7. The method of claim 1, wherein dropout regularization is performed in the input layers.
8. The method of claim 1, wherein the weight initialization step utilizes a pre-trained model.
9. The method of claim 1, wherein the weight initialization step is based on random assignment.
10. The method of claim 1, wherein the iterative weight update mechanism is back-propagation.
11. The method of claim 1, wherein the stopping mechanism is to proceed iteratively until a preset number of iterations or a preset prediction performance threshold is reached.
12. The method of claim 1, wherein the method for assigning weights to models is based on model performance rank.
13. The method of claim 1, wherein a pooling step is performed between feature extraction or classification layers.
14. A combined imaging and computing system, comprising:
a) a system to capture or retrieve an ophthalmic image;
b) a computer or computing environment comprising processing and storage components;
c) a trained weighted-ensemble of machine learning models stored on the storage component;
d) executable commands stored on the storage component such that, upon command:
i. an ophthalmic image is obtained;
ii. the ophthalmic image is stored in the storage components;
iii. the ophthalmic image is retrieved and classified by passage through the trained weighted-ensemble;
iv. the image class, such as disease state and stage, is provided as output;
v. the image class can be transmitted over a network to a third party for storage, further interpretation, and/or appropriate action.
15. The system of claim 14, wherein the ophthalmic image is obtained by an integrated local device which captures an image of an eye, or some of its parts, in real time.
16. The system of claim 14, wherein the ophthalmic image is obtained by retrieval from a remote imaging system or database.
17. The system of claim 14, wherein some of the models in the ensemble are convolutional neural networks.
18. The system of claim 14, wherein some of the models in the ensemble are recurrent neural networks.
19. The system of claim 14, wherein the trained weighted-ensemble is trained as follows:
a) a database of labeled ophthalmic images is split into training and test sets;
b) each model in the ensemble is trained and tested;
c) the models are ranked based on their performance on the test dataset;
d) a model weight is assigned to each model based on its performance rank.
20. The system of claim 19, wherein classification of an ophthalmic image is done as follows:
a) the image is passed through each model, generating probabilistic class scores for each;
b) using the model weights, a weighted average of the probabilistic class scores is computed across models;
c) the weighted average of class scores is used to classify the image. (A code sketch of this train-rank-weight-classify procedure follows the claims.)
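By way of a non-authoritative illustration of claims 1, 19, and 20, the following is a minimal Python sketch of the train-rank-weight-classify procedure. It assumes scikit-learn-style member models exposing fit and predict_proba; the class name WeightedEnsemble, the linear rank-to-weight map, and the accuracy metric are illustrative assumptions, not elements of the disclosure.

import numpy as np

class WeightedEnsemble:
    def __init__(self, models):
        self.models = models          # each model: fit(X, y) and predict_proba(X)
        self.weights = None

    def fit(self, X_train, y_train, X_test, y_test):
        # Train every member model, then score it on the held-out test set.
        accuracies = []
        for m in self.models:
            m.fit(X_train, y_train)
            acc = (m.predict_proba(X_test).argmax(axis=1) == np.asarray(y_test)).mean()
            accuracies.append(acc)
        # Rank-based, order-preserving weight assignment: better rank -> larger weight.
        ranks = np.argsort(np.argsort(accuracies))        # 0 = worst, N-1 = best
        self.weights = (ranks + 1).astype(float)
        self.weights /= self.weights.sum()
        return self

    def predict_proba(self, X):
        # Weighted average of per-model probabilistic class scores.
        scores = np.stack([m.predict_proba(X) for m in self.models])  # (N, batch, classes)
        return np.tensordot(self.weights, scores, axes=1)             # (batch, classes)

    def predict(self, X):
        return self.predict_proba(X).argmax(axis=1)

The rank-to-weight map here is one admissible order-preserving scheme; any mapping that grows with test-set performance would serve the same role.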
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0025] In the following detailed description of the invention, reference is made to the accompanying drawings and their associated descriptions.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0038] The illustration in
[0039] The depiction in
[0040] In some embodiments of the invention, some of the members of the ensemble can be convolutional neural networks (CNNs). An exemplary illustration of a feature extraction scheme of a CNN is depicted in
[0041] Depicted in
c.sub.k=Σ.sub.i u.sub.i v.sub.i,k, (1)
[0042] where u.sub.i is the ith pixel value in the filter, v.sub.i,k is the ith pixel value of the portion of the ophthalmic image that overlaps the filter when the filter is in the kth position, and c.sub.k is the value of the kth pixel of the generated feature map. The multiple overlapping positions of the filter can be thought of as the filter scanning over the ophthalmic image and performing the aforementioned computation as it does so. In
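As a concrete, hedged illustration of equation (1), the following NumPy sketch scans a filter over an image and computes c.sub.k at each overlap position. Stride-1 scanning, "valid" placement, and the function name feature_map are assumptions made for the example, not requirements of the disclosure.

import numpy as np

def feature_map(image, filt):
    # Slide the filter over every fully overlapping position of the image;
    # at the kth position, c_k = sum_i u_i * v_{i,k} (equation (1)).
    H, W = image.shape
    h, w = filt.shape
    out = np.empty((H - h + 1, W - w + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(filt * image[r:r + h, c:c + w])
    return out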
[0043] In some embodiments of the invention, the ensemble contains some machine learning models whose classification mechanisms are multilayer perceptrons, also known as fully connected layers. An exemplification of such a fully connected layer is depicted in
[0044] The depiction in
[0045] The depiction in
Σ.sub.i=1.sup.n w.sub.ij x.sub.i, (2)
[0046] where x.sub.α denotes the output from neuron X.sub.α, w.sub.ij is the weight connecting neuron X.sub.i to neuron X.sub.j, and n is the number of neurons providing input into neuron X.sub.j, such as is depicted in 710 of
[0047] Equation (2) and its type are then fed as input into an activation function σ(x), such as a ReLU by way of example and not limitation, yielding the following form:
x.sub.j=σ(Σ.sub.i=1.sup.n w.sub.ij x.sub.i). (3)
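A minimal sketch of equations (2) and (3), assuming the ReLU activation mentioned above; the names relu and neuron_output are illustrative.

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def neuron_output(x, w_j):
    # x: outputs of the n neurons feeding neuron X_j; w_j: weights w_ij, i = 1..n.
    z_j = np.dot(w_j, x)   # equation (2): sum_i w_ij * x_i
    return relu(z_j)       # equation (3): sigma(sum_i w_ij * x_i)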
[0048] An exemplary method by which an individual model of the ensemble performs feature extraction and subsequent classification is depicted in
[0054] The error computed above is the objective function which we seek to minimize. An example is as follows:
L(w)=Σ.sub.p[ŷ.sub.p−ρ(Σ.sub.k w.sub.kp γ(Σ.sub.j w.sub.jk σ(Σ.sub.i w.sub.ij x.sub.i)))].sup.2, (4)
[0055] where x.sub.i are the input features; w are the weights; σ, γ, ρ are activation functions; and ŷ.sub.p is the target value of the pth class. Of note, L is a composite function consisting of the weighted linear combinations of inputs into each successive layer. The effect of any given weight on the net loss can therefore be computed using the chain rule. For instance, we can re-write the loss function in the notationally concise functional form
L(w)=b(c(d( . . . i(j(w))))), (5)
[0056] where w is a weight and b, c, d, . . . , i, j are functions describing the network. Then the effect of weight w on loss L, denoted ∂L/∂w, is given by
∂L/∂w=(∂b/∂c)(∂c/∂d) . . . (∂i/∂j)(∂j/∂w). (6)
[0057] This is done in a computationally efficient manner using the well-known back-propagation algorithm. In a preferred embodiment of the invention disclosed herein, an ophthalmic image input is obtained and the training procedure is carried out in an iterative manner as shown in
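To make equations (5) and (6) concrete, the following is a minimal sketch using assumed example functions for b, c, and j (square, tanh, and a scaling, respectively, which are illustrative choices, not from the disclosure). It computes the effect of a weight on the loss as a product of local derivatives, which is the computation that back-propagation organizes efficiently across a whole network, and applies a gradient-descent update of the kind used in claim 1(f).

import numpy as np

# Example instantiation of L(w) = b(c(j(w))) from equation (5),
# with b(x) = x^2, c(x) = tanh(x), j(w) = 3w.
def L(w):
    return np.tanh(3.0 * w) ** 2

# Equation (6): dL/dw as the product of local derivatives db/dc, dc/dj, dj/dw.
def dL_dw(w):
    j = 3.0 * w
    c = np.tanh(j)
    db_dc = 2.0 * c          # b(c) = c^2     -> db/dc = 2c
    dc_dj = 1.0 - c ** 2     # c(j) = tanh(j) -> dc/dj = 1 - tanh(j)^2
    dj_dw = 3.0              # j(w) = 3w      -> dj/dw = 3
    return db_dc * dc_dj * dj_dw

# Iterative weight update to reduce the loss: w <- w - learning_rate * dL/dw.
w = 0.5
for _ in range(100):
    w -= 0.1 * dL_dw(w)
# Sanity check: dL_dw agrees with a central finite difference of L at any w.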
[0058] 1050 is the probability predicted by model 1, 1010, that ophthalmic image u, 1000, is of class t.sub.j:
P(u∈t.sub.j|m.sub.1). (7)
[0059] Similarly, 1060 is the probability predicted by model 2, 1020, that ophthalmic image u 1000 is of class t.sub.j; 1070 is the probability predicted by model 3, 1030, that ophthalmic image u 1000 is of class t.sub.j; and 1080 is the probability predicted by model N, 1040, that ophthalmic image u 1000 is of class t.sub.j. Model weights are determined based on the performance of the individual models on test data. Any number of order-preserving weight assignment schemes can be applied, such that the better the relative performance of a model, the higher its assigned weight. The weight assignment scheme can include a performance threshold below which a weight of zero is assigned, i.e., models with sufficiently low performance can be excluded from the voting (a sketch of one such scheme follows this paragraph). In
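The following is one hedged example of such an order-preserving, rank-based weight assignment with a performance threshold; the linear rank-to-weight map and the name rank_weights are illustrative choices among the many admissible schemes.

import numpy as np

def rank_weights(test_accuracies, threshold=0.0):
    # Ranks are order-preserving: the best-performing model gets the largest weight.
    acc = np.asarray(test_accuracies, dtype=float)
    ranks = np.argsort(np.argsort(acc)) + 1   # 1 = worst, N = best
    w = ranks.astype(float)
    w[acc < threshold] = 0.0                  # exclude weak models from the vote
    return w / w.sum()                        # assumes at least one model clears the threshold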
[0060] In
P(u∈t.sub.j)=Σ.sub.k=1.sup.N w.sub.k P(u∈t.sub.j|m.sub.k)/Σ.sub.k=1.sup.N w.sub.k, (8)
[0061] where w.sub.k is the weight assigned to model m.sub.k. The denominator in the above equation is the normalization factor that makes the weighted-ensemble class scores a distribution, i.e., sum to unity. In contrast to the loss function, whose evaluation can be negative and hence can require exponentiation (or a similar mechanism) to ensure positivity and to allow the formation of a distribution, each of the individual model predictions here is typically already a probability, i.e., non-negative and in [0, 1].
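A minimal sketch of equation (8), assuming the per-model class scores are supplied as the rows of an array; the function name is illustrative.

import numpy as np

def ensemble_scores(model_probs, weights):
    # model_probs: shape (N_models, N_classes); weights: shape (N_models,).
    P = np.asarray(model_probs, dtype=float)
    w = np.asarray(weights, dtype=float)
    return (w @ P) / w.sum()   # the denominator normalizes the scores to sum to one

# e.g., three models voting over two classes with rank weights 3, 2, 1:
# ensemble_scores([[0.9, 0.1], [0.7, 0.3], [0.6, 0.4]], [3, 2, 1]) -> approx. [0.78, 0.22]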
[0062] Those skilled in the art will recognize that the invention disclosed herein can be implemented over an arbitrary range of computing configurations. We will refer to any instantiation of these computing configurations as the computing environment. An exemplary illustration of a computing environment is depicted in
[0063] As illustrated in
[0064] In some embodiments of the invention disclosed herein, the computing environment can contain a memory mechanism to store computer-readable media. By way of example and not limitation, this can include removable or non-removable media and volatile or non-volatile media. By way of example and not limitation, removable media can be in the form of flash memory cards, USB drives, compact discs (CDs), Blu-ray discs, digital versatile discs (DVDs) or other removable optical storage forms, floppy disks, magnetic tapes, magnetic cassettes, and external hard disk drives. By way of example and not limitation, non-removable media can be in the form of magnetic drives, random access memory (RAM), read-only memory (ROM), and any other memory media fixed to the computer.
[0065] As depicted in
[0066] The computer-readable content stored on the various memory devices can include an operating system, computer code, and other applications 16050. By way of example and not limitation, the operating system can be any number of proprietary software products such as Microsoft Windows, Android, the Macintosh operating system, the iPhone operating system (iOS), or commercial Linux distributions. It can also be open-source software, such as Linux distributions, e.g., Ubuntu. In other embodiments of the invention, imaging software and connection instructions for an imaging device 16060 can also be stored on the memory mechanism. The procedural algorithm set forth in the disclosure herein can be stored on any of the aforementioned memory mechanisms, among others. In particular, computer-readable instructions for training and subsequent image classification tasks can be stored on the memory mechanism.
[0067] The computing environment typically includes a system bus 16010 through which the various computing components are connected and communicate with each other. The system bus 16010 can consist of a memory bus, an address bus, and a control bus. Furthermore, it can be implemented via a number of architectures including, but not limited to, the Industry Standard Architecture (ISA) bus, Extended ISA (EISA) bus, Universal Serial Bus (USB), Micro Channel bus, Peripheral Component Interconnect (PCI) bus, PCI-Express bus, Video Electronics Standards Association (VESA) local bus, Small Computer System Interface (SCSI) bus, and Accelerated Graphics Port (AGP) bus. The bus system can take the form of wired or wireless channels, and all components of the computer can be located remote from each other and connected via the bus system. By way of example and not of limitation, the processing unit 16000, memory 16020, input devices 16120, and output devices 16150 can all be connected via the bus system. In the representation depicted in
[0068] In some embodiments of the invention disclosed herein,
[0069] In some embodiments of the invention disclosed herein,
[0070] In some embodiments of the invention disclosed herein, some of the computing components can be located remotely and connected via a wired or wireless network. By way of example and not limitation,
[0071] In some embodiments of the invention disclosed herein, an imaging system which captures and pre-processes images, e.g. 16060, is attached directly to the system. Stored in the memory mechanism (16020, 16240, or 16210) is a model trained according to the machine learning procedure set forth herein. Computer-readable instructions are also stored in the memory mechanism so that, upon command, images can be captured from a patient in real time or received over a network from a remote or local previously collated database. In response to a command, such images can be classified by the pre-trained machine learning procedure disclosed herein. The classification output can then be transmitted to the care provider and/or patient for information, interpretation, storage, and appropriate action. This transmission can be done over a wired or wireless network as previously detailed, as the recipient of the classification output can be at a remote location. (A minimal sketch of this flow follows.)
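By way of a hedged illustration of this capture-classify-transmit flow, the following sketch wires the steps together. Every name here (classify_and_report, acquire_image, transmit_report) is a hypothetical placeholder rather than an interface from the disclosure, and the ensemble is assumed to expose predict_proba as in the earlier sketch.

def classify_and_report(ensemble, acquire_image, transmit_report):
    # acquire_image: callable for local real-time capture or remote retrieval (placeholder).
    image = acquire_image()
    scores = ensemble.predict_proba([image])[0]   # probabilistic class scores
    label = int(scores.argmax())                  # e.g., disease state/stage index
    # transmit_report: callable that sends the result to a third party (placeholder).
    transmit_report({"class": label, "scores": scores.tolist()})
    return label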
[0072] Illustrating the invention disclosed herein, an anonymized database of 3000 optical coherence tomograms (OCTs) of the macula was compiled. Binary labels were assigned by an American board-certified ophthalmologist and retina specialist. The labels were 'actively exudating age-related macular degeneration' or 'not actively exudating age-related macular degeneration'. The database was split into one dataset for training and a separate dataset for validation. 400 OCT images were used for validation: 200 'actively exudating' and 200 'not actively exudating'. The algorithm achieved 99.2% accuracy in distinguishing between 'actively exudating' and 'not actively exudating'.
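The following sketch, offered as an assumption-laden illustration rather than the authors' actual protocol, shows how such a held-out validation accuracy could be computed. The variable names, the stratified split, and the use of scikit-learn's train_test_split are assumptions; no results are reproduced here.

import numpy as np
from sklearn.model_selection import train_test_split

def evaluate(model, images, labels, n_val=400):
    # Hold out n_val labeled images (stratified, so classes are balanced as described).
    X_tr, X_val, y_tr, y_val = train_test_split(
        images, labels, test_size=n_val, stratify=labels, random_state=0)
    model.fit(X_tr, y_tr)
    preds = model.predict_proba(X_val).argmax(axis=1)
    return float(np.mean(preds == np.asarray(y_val)))   # validation accuracy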
[0073] The objects set forth in the preceding are presented in an illustrative manner for reasons of efficiency. It is hereby noted that the above-disclosed methods and systems can be implemented in manners in which modifications are made to the particular illustrations presented above while the spirit and scope of the invention are retained. The interpretation of the above disclosure is to encompass such modifications, and is not to be limited to the particular illustrative examples and associated drawings set forth herein.
[0074] Furthermore, by intention, the following claims encompass all of the general and specific attributes of the invention described herein, and encompass all possible expressions of the scope of the invention which, as pertaining to language, can be interpreted as falling between the aforementioned general and specific ends.