Deep-learning-based cancer classification using a hierarchical classification framework
10939874 · 2021-03-09
Assignee
Inventors
- Kyung Hyun Sung (Los Angeles, CA, US)
- William Hsu (Westlake Village, CA, US)
- Shiwen Shen (Los Angeles, CA, US)
- Xinran Zhong (Los Angeles, CA, US)
CPC classification
A61B5/055
HUMAN NECESSITIES
A61B5/08
HUMAN NECESSITIES
G16H50/70
PHYSICS
International classification
A61B5/00
HUMAN NECESSITIES
A61B5/055
HUMAN NECESSITIES
Abstract
An automatic classification method for distinguishing between indolent and clinically significant carcinoma using multiparametric MRI (mp-MRI) is provided. By utilizing a convolutional neural network (CNN), which automatically extracts deep features, the hierarchical classification framework avoids deficiencies of current schemes in the art, such as the need for handcrafted features predefined by a domain expert and for precise delineation of lesion boundaries by a human or a computerized algorithm. The hierarchical classification framework is trained on previously acquired mp-MRI data with known cancer classification characteristics, and the framework is then applied to mp-MRI images of new patients to identify suspicious lesions and provide computerized cancer classification results.
Claims
1. An apparatus for detecting and grading carcinoma, comprising: (a) a computer processor; and (b) a non-transitory computer-readable memory storing instructions executable by the computer processor; (c) wherein said instructions, when executed by the computer processor, perform steps comprising: (i) acquiring a plurality of multi-parametric MRI (mp-MRI) images of a subject with an mp-MRI imager; (ii) pre-processing the acquired mp-MRI images to produce standardized small image patches; (iii) extracting deep learning features from T2-weighted (T2w), apparent diffusion coefficient (ADC) and K^trans data of the standardized small image patches with a convolution neural network (CNN) method; (iv) obtaining a prediction score for each set of deep learning features by applying a first order classification of support vector machine (SVM) classifiers; and (v) applying a second order classification of a Gaussian radial basis function kernel SVM classification of combined first order classification data to produce a final classification.
2. The apparatus of claim 1, wherein said pre-processing of mp-MRI images instructions further comprise pre-processing the mp-MRI images with pixel intensity normalization, pixel spacing normalization and rescaling to produce said standardized small image patches.
3. The apparatus of claim 1, wherein said convolution neural network (CNN) method is pre-trained.
4. The apparatus of claim 3, wherein said pre-trained convolution neural network (CNN) method comprises OverFeat.
5. The apparatus of claim 1, wherein said second order classification comprises a Gaussian radial basis function kernel SVM classification of combined first order classification data and one or more standard imaging features selected from the group of features consisting of: skewness of intensity histograms in T2w images; an average ADC value; lowest 10th percentile ADC value; an average K^trans; highest 10th percentile K^trans value; and region of interest size in T2w images.
6. A computer implemented method for detecting and grading carcinoma, the method comprising: (a) acquiring a plurality of magnetic resonance images of a subject; (b) pre-processing the acquired images; (c) applying a convolution neural network (CNN) method to extract deep learning features from said pre-processed images; (d) applying support vector machine (SVM) classifiers to the extracted deep learning features to produce SVM decision values; and (e) obtaining a Gaussian radial basis function (RBF) kernel SVM classification of combined support vector machine (SVM) decision values and statistical features to produce a final decision; and (f) wherein said method is performed by a computer processor executing instructions stored on a non-transitory computer-readable medium.
7. The method of claim 6, wherein said magnetic resonance images comprise multi-parametric MRI (mp-MRI) images.
8. The method of claim 6, wherein said pre-processing comprises: (a) pixel intensity normalization; (b) pixel spacing normalization; and (c) rescaling.
9. The method of claim 6, wherein said convolution neural network (CNN) method is pre-trained.
10. The method of claim 9, wherein said pre-trained convolution neural network (CNN) method comprises OverFeat.
11. The method of claim 7, wherein said applying a convolution neural network (CNN) method to extract deep learning features from said pre-processed images comprises extracting deep learning features from T2-weighted (T2w), apparent diffusion coefficient (ADC) and K^trans data of standardized small image patches.
12. The method of claim 7, wherein said applying support vector machine (SVM) classifiers to the extracted deep learning features to produce SVM decision values comprises obtaining a prediction score for each set of deep learning features by applying a first order classification of support vector machine (SVM) classifiers.
13. The method of claim 11, wherein said support vector machine (SVM) decision values are combined with one or more statistical features (f_s) from the group of statistical features consisting of: (a) skewness of intensity histograms in T2w images; (b) average ADC value; (c) lowest 10th percentile ADC value; (d) average K^trans; (e) highest 10th percentile K^trans value; and (f) ROI size in T2w images.
14. A non-transitory computer readable medium storing instructions executable by a computer processor, said instructions when executed by the computer processor performing steps comprising: (a) acquiring a plurality of multi-parametric MRI (mp-MRI) images of a subject; (b) pre-processing the images to produce standardized small image patches; (c) extracting deep learning features from T2-weighted (T2w), apparent diffusion coefficient (ADC) and K^trans data of the standardized small image patches with a convolution neural network (CNN); (d) obtaining a prediction score for each set of deep learning features by applying a first order classification of support vector machine (SVM) classifiers; and (e) applying a second order classification of a Gaussian radial basis function kernel SVM classification of combined first order classification data to produce a final classification.
15. The medium of claim 14, wherein said pre-processing of mp-MRI images instructions further comprise pre-processing the mp-MRI images with pixel intensity normalization, pixel spacing normalization and rescaling to produce said standardized small image patches.
16. The medium of claim 14, wherein said convolution neural network (CNN) method is pre-trained.
17. The medium of claim 16, wherein said pre-trained convolution neural network (CNN) method comprises OverFeat.
18. The medium of claim 14, wherein said second order classification comprises a Gaussian radial basis function kernel SVM classification of combined first order classification data and one or more standard imaging features selected from the group of features consisting of: skewness of intensity histograms in T2w images; an average ADC value; lowest 10th percentile ADC value; an average K^trans; highest 10th percentile K^trans value; and region of interest size in T2w images.
19. The method of claim 7, wherein said pre-processing comprises: (a) pixel intensity normalization; (b) pixel spacing normalization; and (c) rescaling.
20. The method of claim 7, wherein said convolution neural network (CNN) method is pre-trained.
21. The method of claim 20, wherein said pre-trained convolution neural network (CNN) method comprises OverFeat.
Description
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
(1) The technology described herein will be more fully understood by reference to the following drawings which are for illustrative purposes only:
DETAILED DESCRIPTION
(5) Referring more specifically to the drawings, for illustrative purposes, embodiments of methods for cancer classification from diagnostic imaging are generally shown. Several embodiments of the technology are described generally in the accompanying figures.
(6) Generally, an automatic classification method to distinguish between indolent and clinically significant prostatic carcinoma using multi-parametric MRI (mp-MRI) is used to illustrate the technology. Although the methods are demonstrated in the domain of prostate cancer, they can be adapted and applied to other types of cancer classification tasks, including breast, lung, kidney, and liver cancers.
(7) The main contributions of the methods include: 1) utilizing state-of-the-art deep learning methods to characterize a lesion in mp-MRI through a pre-trained convolutional neural network model; 2) building a hybrid two-order classification model that combines deep learning and conventional statistical features; and thereby 3) avoiding the need for precise lesion boundaries and anatomical-location-specific training.
(8) Turning now to the drawings, the classification workflow begins with the acquisition of mp-MRI data at block 12.
(9) The preferred mp-MRI data that is obtained at block 12 typically includes T2-weighted (T2w) imaging, dynamic contrast-enhanced imaging and diffusion-weighted imaging.
(10) Dynamic contrast-enhanced MRI (DCE-MRI) images are obtained from rapid T1w gradient echo scans taken before, during and following intravenous administration of a gadolinium-based contrast agent (GBCA). Diffusion-weighted imaging (DWI) normally includes an ADC map and high b-value images. An ADC map is a map of the calculated ADC values for each voxel in an image. ADC mean and DWI signal intensities may also be obtained.
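Per-voxel ADC values follow the mono-exponential diffusion model S_b = S_0 * exp(-b * ADC). A minimal sketch of the map computation (in Python rather than the Matlab used later in this document; the two-b-value form and function name are illustrative):

```python
import numpy as np

def adc_map(s0, sb, b, eps=1e-12):
    """Per-voxel apparent diffusion coefficient from two DWI acquisitions:
    `s0` is the signal at b = 0 s/mm^2, `sb` the signal at b-value `b`.
    Solving S_b = S_0 * exp(-b * ADC) gives ADC = ln(S_0/S_b) / b."""
    s0 = np.asarray(s0, dtype=float)
    sb = np.asarray(sb, dtype=float)
    # clamp to eps so background (zero-signal) voxels do not produce NaN/inf
    return np.log(np.maximum(s0, eps) / np.maximum(sb, eps)) / b
```

For example, a voxel whose signal halves at b = 1000 s/mm^2 has ADC = ln(2)/1000, i.e. roughly 6.9e-4 mm^2/s.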
(11) The mp-MRI data that are acquired at block 12 are transformed with a hierarchical classification framework 14 to produce a classification value at block 16.
(13) The hierarchical classification framework 14 operates in stages. First, the acquired mp-MRI images are pre-processed with pixel intensity normalization, pixel spacing normalization and rescaling to produce standardized small image patches.
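As a rough illustration of such pre-processing (sketched in Python rather than the patent's Matlab; the target spacing, patch size and nearest-neighbor resampling are illustrative choices, not the patent's parameters), a standardized patch could be produced as follows:

```python
import numpy as np

def preprocess_patch(img, spacing, target_spacing=1.0, out_size=32):
    """Illustrative pre-processing: intensity normalization, pixel-spacing
    normalization, and rescaling to a fixed-size square patch.
    `spacing` is the acquired in-plane pixel spacing (mm/pixel)."""
    img = np.asarray(img, dtype=float)
    # 1) intensity normalization to [0, 1]
    lo, hi = img.min(), img.max()
    img = (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)
    # 2) pixel-spacing normalization: nearest-neighbor resample to target spacing
    zoom = spacing / target_spacing
    h = max(1, int(round(img.shape[0] * zoom)))
    w = max(1, int(round(img.shape[1] * zoom)))
    rows = np.minimum((np.arange(h) / zoom).astype(int), img.shape[0] - 1)
    cols = np.minimum((np.arange(w) / zoom).astype(int), img.shape[1] - 1)
    img = img[np.ix_(rows, cols)]
    # 3) rescale to a standardized out_size x out_size patch
    rows = np.minimum(np.arange(out_size) * img.shape[0] // out_size, img.shape[0] - 1)
    cols = np.minimum(np.arange(out_size) * img.shape[1] // out_size, img.shape[1] - 1)
    return img[np.ix_(rows, cols)]
```

A real pipeline would typically use a dedicated resampling routine (e.g. bilinear interpolation); nearest-neighbor indexing keeps the sketch dependency-free.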
(14) Second, the deep learning feature extractor 18 takes the standardized small image patches as input. A convolutional neural network is used to extract the deep features from the T2w, ADC and K^trans mp-MRI data. Here, the output of the 21st layer (the last convolutional layer) of the pre-trained CNN (e.g., OverFeat) can be used as the deep learning features.
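Any frozen feature extractor can stand in for the pre-trained CNN here. The sketch below is a loose Python illustration of the shape of that computation (convolve with a fixed filter bank, rectify, pool to a fixed-length feature vector); the randomly initialized filters are a stand-in for learned weights and are not the OverFeat network:

```python
import numpy as np

def deep_features(patch, n_filters=8, ksize=3, seed=0):
    """Toy stand-in for a frozen CNN layer: convolve the patch with a fixed
    filter bank, apply ReLU, then global max-pool each response map.
    A real system would instead load pre-trained weights and read out an
    intermediate layer's activations."""
    patch = np.asarray(patch, dtype=float)
    rng = np.random.default_rng(seed)
    filters = rng.standard_normal((n_filters, ksize, ksize))
    h, w = patch.shape
    feats = np.empty(n_filters)
    for k, f in enumerate(filters):
        # "valid" convolution computed as a sum of shifted, weighted views
        resp = np.zeros((h - ksize + 1, w - ksize + 1))
        for i in range(ksize):
            for j in range(ksize):
                resp += f[i, j] * patch[i:i + h - ksize + 1, j:j + w - ksize + 1]
        feats[k] = max(resp.max(), 0.0)  # ReLU followed by global max-pool
    return feats
```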
(15) Third, three linear support vector machine (SVM) classifiers (Classifier 1) 20 are used in the first-order classification to obtain the prediction score for each set of deep features. The first-order classification output is then combined with the other optional standard imaging features 24. Standard imaging features 24 may include: (a) skewness of intensity histograms in T2w images; (b) average ADC value; (c) lowest 10th percentile ADC value; (d) average K^trans; (e) highest 10th percentile K^trans value; and (f) ROI size in T2w images.
(16) The combined Classifier 1 and optional standard imaging features 24 are used as input for a Gaussian radial basis function kernel SVM classifier in the second-order classification (Classifier 2) 22, which outputs the final decision 16 (i.e. indolent vs. clinically significant prostate cancer).
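A minimal sketch of this two-order scheme on synthetic data, using scikit-learn's SVMs as stand-ins for the LibSVM classifiers described here (the feature dimensions, class-separation offsets and synthetic labels are arbitrary illustrative values):

```python
import numpy as np
from sklearn.svm import SVC, LinearSVC

rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, n)  # synthetic indolent (0) vs clinically significant (1) labels
# Three per-modality deep-feature sets (stand-ins for T2w, ADC and Ktrans features)
feats = [rng.standard_normal((n, 16)) + y[:, None] * s for s in (0.8, 0.6, 0.5)]

# First-order classification: one linear SVM per modality -> decision values
stage1 = [LinearSVC(C=1.0).fit(f, y) for f in feats]
dec = np.column_stack([clf.decision_function(f) for clf, f in zip(stage1, feats)])

# Optional hand-crafted statistical features would be appended to `dec` here.

# Second-order classification: Gaussian RBF-kernel SVM on the combined values
stage2 = SVC(kernel="rbf", gamma="scale").fit(dec, y)
final = stage2.predict(dec)
```

In practice the two stages are trained on disjoint subsets of the training data (as described in Example 1) so that the second-order SVM does not see optimistically biased decision values.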
(17) Table 2 illustrates an example embodiment of Matlab computer program instructions that may be used for implementing the technology.
(18) The technology described herein may be better understood with reference to the accompanying examples, which are intended for purposes of illustration only and should not be construed as in any sense limiting the scope of the technology described herein as defined in the claims appended hereto.
EXAMPLE 1
(19) In order to demonstrate the operational principles of the apparatus and the imaging and classification methods 30, a dataset of mp-MRI images was recorded for a total of 68 patients and processed using the processing steps of the illustrated embodiment.
(21) Two training stages (deep learning feature extraction 46 and first and second order classifications 54) were used to obtain the final decision in the embodiment 30.
(22) In the second stage 54, the decision values from the three classifiers 56, 58, 60 were combined with six statistical features 62 to train a Gaussian radial basis function (RBF) kernel SVM classifier 64, which produced an output of the final decision (indolent vs. CS). Statistical features (f_s) 62 included skewness of the intensity histogram in T2w images, average ADC value, lowest 10th percentile ADC value, average K^trans, highest 10th percentile K^trans value, and ROI size in T2w images.
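The six statistical features can be computed directly from the ROI pixel values; a Python sketch (the function name and the 1-D ROI representation are illustrative assumptions):

```python
import numpy as np

def statistical_features(t2_roi, adc_roi, ktrans_roi):
    """Six hand-crafted features combined with the SVM decision values.
    Each *_roi argument is a 1-D array of pixel values inside the lesion ROI."""
    t2 = np.asarray(t2_roi, dtype=float)
    adc = np.sort(np.asarray(adc_roi, dtype=float))      # ascending
    kt = np.sort(np.asarray(ktrans_roi, dtype=float))    # ascending
    n_adc = max(1, adc.size // 10)
    n_kt = max(1, kt.size // 10)
    m, s = t2.mean(), t2.std()
    skew = ((t2 - m) ** 3).mean() / s ** 3 if s > 0 else 0.0
    return np.array([
        skew,                 # skewness of T2w intensity histogram
        adc.mean(),           # average ADC value
        adc[:n_adc].mean(),   # lowest 10th-percentile ADC value
        kt.mean(),            # average Ktrans
        kt[-n_kt:].mean(),    # highest 10th-percentile Ktrans value
        float(t2.size),       # ROI size in T2w image
    ])
```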
(23) The training process was generally designed as follows. First, the whole dataset was randomly divided into five folds of similar size. One fold was then selected as the test set IMAGE_test and the other four folds formed the training set IMAGE_train. After this, IMAGE_train was equally and randomly divided into two parts, IMAGE_train1 and IMAGE_train2. IMAGE_train1 was employed to train the three linear SVMs in Stage 1, with leave-one-out cross-validation for selecting the optimal parameters. Once trained, the three classifiers were applied to IMAGE_train2 to generate prediction score vectors. With the prediction scores and f_s, IMAGE_train2 was used to train the RBF SVM in Stage 2, and the performance of the prediction was measured on IMAGE_test. The whole procedure was repeated five times (five-fold cross-validation), with each fold used as the test set once. The final classification results are the average performance over the five-fold cross-validation, as reported in Example 2.
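The split scheme just described (five outer folds, with the remaining data halved into a Stage-1 and a Stage-2 training set) can be sketched as follows (Python; the index handling and fold sizing are illustrative):

```python
import numpy as np

def nested_splits(n_cases, n_folds=5, seed=0):
    """Yield (train1, train2, test) index arrays per outer fold: one fold is
    held out for testing and the remaining cases are split in half, train1
    for the Stage-1 linear SVMs and train2 for the Stage-2 RBF SVM."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_cases)
    folds = np.array_split(order, n_folds)
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        half = len(train) // 2
        yield train[:half], train[half:], test

# For the 68-patient dataset, each fold leaves ~54 training cases split ~27/27.
for train1, train2, test in nested_splits(68):
    pass
```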
EXAMPLE 2
(24) To demonstrate the effectiveness of the system, four baseline classification models were built and compared. Specifically, four different SVMs were built using only f_s, f_T2, f_ADC or f_K, respectively. The performance of these models was also evaluated with five-fold cross-validation using the whole dataset. The results were measured using the mean area under the curve (AUC), mean accuracy, mean sensitivity and mean specificity, as shown in Table 1.
(26) It can also be seen that the system achieves significantly higher accuracy than the other models for distinguishing indolent vs. clinically significant PCa, without requiring precise segmentation of lesion boundaries or location-specific training. The method has the potential to improve upon subjective, radiologist-based performance in the detection and grading of suspicious areas on mp-MRI.
(27) From the description herein, it will be appreciated that the present disclosure encompasses multiple embodiments which include, but are not limited to, the following:
(28) 1. An apparatus for detecting and grading carcinoma, comprising: (a) a computer processor; and (b) a non-transitory computer-readable memory storing instructions executable by the computer processor; (c) wherein the instructions, when executed by the computer processor, perform steps comprising: (i) acquiring a plurality of multi-parametric MRI (mp-MRI) images of a subject; (ii) pre-processing the mp-MRI images to produce standardized small image patches; (iii) extracting deep learning features from T2-weighted (T2w), apparent diffusion coefficient (ADC) and K^trans data of the standardized small image patches with a convolution neural network (CNN) method; (iv) obtaining a prediction score for each set of deep learning features by applying a first order classification of support vector machine (SVM) classifiers; and (v) applying a second order classification of a Gaussian radial basis function kernel SVM classification of combined first order classification data to produce a final classification.
(29) 2. The apparatus of any preceding embodiment, wherein the pre-processing of mp-MRI images instructions further comprise pre-processing the mp-MRI images with pixel intensity normalization, pixel spacing normalization and rescaling to produce the standardized small image patches.
(30) 3. The apparatus of any preceding embodiment, wherein the convolution neural network (CNN) method is pre-trained.
(31) 4. The apparatus of any preceding embodiment, wherein the pre-trained convolution neural network (CNN) method comprises OverFeat.
(32) 5. The apparatus of any preceding embodiment, wherein the second order classification comprises a Gaussian radial basis function kernel SVM classification of combined first order classification data and one or more standard imaging features selected from the group of features consisting of: skewness of intensity histograms in T2w images; an average ADC value; lowest 10th percentile ADC value; an average K^trans; highest 10th percentile K^trans value; and region of interest size in T2w images.
(33) 6. A computer implemented method for detecting and grading carcinoma, the method comprising: (a) acquiring a plurality of magnetic resonance images of a subject; (b) pre-processing the acquired images; (c) applying a convolution neural network (CNN) method to extract deep learning features from the pre-processed images; (d) applying support vector machine (SVM) classifiers to the extracted deep learning features to produce SVM decision values; and (e) obtaining a Gaussian radial basis function (RBF) kernel SVM classification of combined support vector machine (SVM) decision values and statistical features to produce a final decision; and (f) wherein the method is performed by a computer processor executing instructions stored on a non-transitory computer-readable medium.
(34) 7. The method of any preceding embodiment, wherein the magnetic resonance images comprise multi-parametric MRI (mp-MRI) images.
(35) 8. The method of any preceding embodiment, wherein the pre-processing comprises: (a) pixel intensity normalization; (b) pixel spacing normalization; and (c) rescaling.
(36) 9. The method of any preceding embodiment, wherein the convolution neural network (CNN) method is pre-trained.
(37) 10. The method of any preceding embodiment, wherein the pre-trained convolution neural network (CNN) method comprises OverFeat.
(38) 11. The method of any preceding embodiment, wherein the applying a convolution neural network (CNN) method to extract deep learning features from the pre-processed images comprises extracting deep learning features from T2-weighted (T2w), apparent diffusion coefficient (ADC) and K^trans data of standardized small image patches.
(39) 12. The method of any preceding embodiment, wherein the applying support vector machine (SVM) classifiers to the extracted deep learning features to produce SVM decision values comprises obtaining a prediction score for each set of deep learning features by applying a first order classification of support vector machine (SVM) classifiers.
(40) 13. The method of any preceding embodiment, wherein the support vector machine (SVM) decision values are combined with one or more statistical features (f_s) from the group of statistical features consisting of: (a) skewness of intensity histograms in T2w images; (b) average ADC value; (c) lowest 10th percentile ADC value; (d) average K^trans; (e) highest 10th percentile K^trans value; and (f) ROI size in T2w images.
(41) 14. A computer readable non-transitory medium storing instructions executable by a computer processor, the instructions when executed by the computer processor performing steps comprising: (a) acquiring a plurality of multi-parametric MRI (mp-MRI) images of a subject; (b) pre-processing the images to produce standardized small image patches; (c) extracting deep learning features from T2-weighted (T2w), apparent diffusion coefficient (ADC) and K^trans data of the standardized small image patches with a convolution neural network (CNN); (d) obtaining a prediction score for each set of deep learning features by applying a first order classification of support vector machine (SVM) classifiers; and (e) applying a second order classification of a Gaussian radial basis function kernel SVM classification of combined first order classification data to produce a final classification.
(42) 15. The medium of any preceding embodiment, wherein the pre-processing of mp-MRI images instructions further comprise pre-processing the mp-MRI images with pixel intensity normalization, pixel spacing normalization and rescaling to produce the standardized small image patches.
(43) 16. The medium of any preceding embodiment, wherein the convolution neural network (CNN) method is pre-trained.
(44) 17. The medium of any preceding embodiment, wherein the pre-trained convolution neural network (CNN) method comprises OverFeat.
(45) 18. The medium of any preceding embodiment, wherein the second order classification comprises a Gaussian radial basis function kernel SVM classification of combined first order classification data and one or more standard imaging features selected from the group of features consisting of: skewness of intensity histograms in T2w images; an average ADC value; lowest 10th percentile ADC value; an average K^trans; highest 10th percentile K^trans value; and region of interest size in T2w images.
(46) Embodiments of the present technology may be described herein with reference to flowchart illustrations of methods and systems according to embodiments of the technology, and/or procedures, algorithms, steps, operations, formulae, or other computational depictions, which may also be implemented as computer program products. In this regard, each block or step of a flowchart, and combinations of blocks (and/or steps) in a flowchart, as well as any procedure, algorithm, step, operation, formula, or computational depiction can be implemented by various means, such as hardware, firmware, and/or software including one or more computer program instructions embodied in computer-readable program code. As will be appreciated, any such computer program instructions may be executed by one or more computer processors, including without limitation a general purpose computer or special purpose computer, or other programmable processing apparatus to produce a machine, such that the computer program instructions which execute on the computer processor(s) or other programmable processing apparatus create means for implementing the function(s) specified.
(47) Accordingly, blocks of the flowcharts, and procedures, algorithms, steps, operations, formulae, or computational depictions described herein support combinations of means for performing the specified function(s), combinations of steps for performing the specified function(s), and computer program instructions, such as embodied in computer-readable program code logic means, for performing the specified function(s). It will also be understood that each block of the flowchart illustrations, as well as any procedures, algorithms, steps, operations, formulae, or computational depictions and combinations thereof described herein, can be implemented by special purpose hardware-based computer systems which perform the specified function(s) or step(s), or combinations of special purpose hardware and computer-readable program code.
(48) Furthermore, these computer program instructions, such as embodied in computer-readable program code, may also be stored in one or more computer-readable memory or memory devices that can direct a computer processor or other programmable processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory or memory devices produce an article of manufacture including instruction means which implement the function specified in the block(s) of the flowchart(s). The computer program instructions may also be executed by a computer processor or other programmable processing apparatus to cause a series of operational steps to be performed on the computer processor or other programmable processing apparatus to produce a computer-implemented process such that the instructions which execute on the computer processor or other programmable processing apparatus provide steps for implementing the functions specified in the block(s) of the flowchart(s), procedure(s), algorithm(s), step(s), operation(s), formula(e), or computational depiction(s).
(49) It will further be appreciated that the terms programming or program executable as used herein refer to one or more instructions that can be executed by one or more computer processors to perform one or more functions as described herein. The instructions can be embodied in software, in firmware, or in a combination of software and firmware. The instructions can be stored local to the device in non-transitory media, or can be stored remotely such as on a server or all or a portion of the instructions can be stored locally and remotely. Instructions stored remotely can be downloaded (pushed) to the device by user initiation, or automatically based on one or more factors.
(50) It will further be appreciated that, as used herein, the terms processor, computer processor, central processing unit (CPU), and computer are used synonymously to denote a device capable of executing the instructions and communicating with input/output interfaces and/or peripheral devices, and that the terms processor, computer processor, CPU, and computer are intended to encompass single or multiple devices, single core and multicore devices, and variations thereof.
(51) Although the description herein contains many details, these should not be construed as limiting the scope of the disclosure but as merely providing illustrations of some of the presently preferred embodiments. Therefore, it will be appreciated that the scope of the disclosure fully encompasses other embodiments which may become obvious to those skilled in the art.
(52) In the claims, reference to an element in the singular is not intended to mean one and only one unless explicitly so stated, but rather one or more. All structural, chemical, and functional equivalents to the elements of the disclosed embodiments that are known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the present claims. Furthermore, no element, component, or method step in the present disclosure is intended to be dedicated to the public regardless of whether the element, component, or method step is explicitly recited in the claims. No claim element herein is to be construed as a means plus function element unless the element is expressly recited using the phrase means for. No claim element herein is to be construed as a step plus function element unless the element is expressly recited using the phrase step for.
(53) TABLE-US-00001 TABLE 1 Summary of Mean Classification Performance

Mean Performance   Disclosed Method   f_T2    f_ADC   f_K     f_s
AUC                0.922              0.926   0.890   0.899   0.660
Accuracy           0.904              0.827   0.821   0.830   0.617
Sensitivity        0.876              0.837   0.757   0.808   0.600
Specificity        0.955              0.833   0.923   0.875   0.665
(54) TABLE-US-00002 TABLE 2 Matlab Code

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This code requires Matlab and depends on two external libraries:
%  1) OverFeat  cilvr.nyu.edu/doku.php?id=code:start
%  2) LibSVM    www.csie.ntu.edu.tw/~cjlin/libsvm/
%
% Authors: Xinran Zhong, Shiwen Shen, William Hsu, Kyung Sung
% Radiological Sciences, UCLA
%
% The overall script runs as follows for each case:
%   FeatureToAdd = TakeFeature(T2, ADC, Ktrans);
%   for each modality
%     JPG = Prepro(DICOM)
%     Get deep feature from command = BashFileOverfeat(ImageName)
%   end
%
% (label: m*1 vector with m number of cases)
% (feature: m*n matrix with m number of cases and n number of
%  features for each case)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Step 1: Generate the region of interest (square)
%
% For each lesion, the input is a bounding box, which contains
% the lesion across each imaging (ADC, Ktrans, T2).
% All pixel values outside of the bounding box are set to zero.
%
% NOTE: Our method is generalizable to include any other imaging
% components in multi-parametric MRI (mp-MRI).
%
% Input:  ADC, Ktrans, T2 images
% Output: Region of interest masked images (0 value for background)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Step 2: Extract statistical features from each ROI
%
% For each lesion, generate statistical features:
%   skewness of the intensity histogram in T2w images,
%   average ADC value,
%   lowest 10th percentile ADC value,
%   average Ktrans,
%   highest 10th percentile Ktrans value,
%   ROI size in T2w images.
%
% NOTE: We illustrate and demonstrate our approach using the above
% statistical features, but our method is generalizable to other
% standard imaging features.
%
% Input:  Pixel values within the ROI defined in Step 1
% Output: Statistical features calculated from each ROI
function FeatureToAdd = TakeFeature(T2, ADC, Ktrans)
  % find_ROI is a function that identifies the non-zero regions in the image
  T2_ROI = find_ROI(T2);
  [x_T2, y_T2] = size(T2_ROI);
  size_T2 = x_T2*y_T2;
  % Skewness of T2
  T2_ROI = reshape(T2_ROI, x_T2*y_T2, 1);
  T2_skewness = skewness(T2_ROI);
  % Average ADC value
  ADC_ROI = find_ROI(ADC);
  [x_ADC, y_ADC] = size(ADC_ROI);
  ADC_ROI = reshape(ADC_ROI, x_ADC*y_ADC, 1);
  ADC_average = mean(ADC_ROI);
  % 10th percentile of the lowest ADC values
  ADC_ROI = sort(ADC_ROI);
  n = round(x_ADC*y_ADC/10);
  ADC_10 = ADC_ROI(n);
  % Average Ktrans value
  Ktrans_ROI = find_ROI(Ktrans);
  [x_Ktrans, y_Ktrans] = size(Ktrans_ROI);
  Ktrans_ROI = reshape(Ktrans_ROI, x_Ktrans*y_Ktrans, 1);
  Ktrans_average = mean(Ktrans_ROI);
  % 10th percentile of the highest Ktrans values
  Ktrans_ROI = sort(Ktrans_ROI, 'descend');
  n = round(x_Ktrans*y_Ktrans/10);
  Ktrans_10 = Ktrans_ROI(n);
  FeatureToAdd = [size_T2, T2_skewness, ADC_average, ADC_10, Ktrans_average, Ktrans_10];
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Step 3: Save each DICOM to png and pre-process them
%
% Preprocess each image into a format that can be processed by
% OverFeat (e.g., size 231*231, RGB channels and intensity range [0 255])
%
% Input:  Images of each ROI
% Output: Image in .png format
function X = Prepro(A)
  % resize
  A = imresize(A, [231, 231]);
  A = im2double(A);
  A(A < 0) = 0;
  % rescale
  A = round((A - min(A(:))) / (max(A(:)) - min(A(:))) * 255);
  X = A;
  X(:,:,2) = X(:,:,1);
  X(:,:,3) = X(:,:,1);
  X = uint8(X);
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Step 4: Use OverFeat to extract deep features for each image
%
% For each png image, run OverFeat commands
%
% NOTE: OverFeat is used in this implementation as an example.
% Our method is generalizable to any pre-trained or non-pre-trained
% convolutional neural network (CNN) method (or classifier, more
% broadly), depending on the availability of labeled training data
% that can attain sufficient feature learning.
%
% Input:  File path of exported PNG images
% Output: Command line statements to execute OverFeat
function BashFileOverfeat(ImageName)
  % fid is an open file handle for the generated bash script
  DirectPath = 'cd /Users/xinranzhong/Documents/phd/tools/Overfeat/overfeat/src/';
  fprintf(fid, '%s\n', DirectPath);
  OutputName = strcat(ImageName, '.txt');
  fprintf(fid, './overfeat -f %s -L 20 > %s;\n', ImageName, OutputName);
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Step 5: Use LibSVM to train the model
%
% Use the features generated by the deep classifier (i.e., OverFeat)
% to train the two-layer SVM classifier and evaluate it with
% five-fold cross validation.
%  1) Generate train1 for linear SVM training, train2 for RBF SVM
%     training and test for testing
%  2) Train three linear SVMs with leave-one-out cross validation
%     using train1 and find the best C for all three classifiers
%  3) Generate probability features from the three linear SVMs as
%     new features for train2 and test
%  4) Cross validate the RBF SVM
%
% Input:  Label vector for each case
% Output: Accuracy and area under curve
%% 1) Randomly split data into five folds:
%    train1 for linear SVM training
%    train2 for RBF SVM training
%    test   for testing
train_ind = randperm(length(LabelT2));
% Five-fold cross validation
partition = length(train_ind)/5;
for i = 1:5
  % split into different data sets
  ind(i).test = train_ind((i-1)*partition + 1 : i*partition);
  ind(i).train = setdiff(train_ind, ind(i).test);
  ind(i).train = ind(i).train(randperm(length(ind(i).train)));
  ind(i).train1 = ind(i).train(1 : 2*partition);
  ind(i).train2 = ind(i).train(2*partition + 1 : end);
end
%% 2) Train three linear SVMs with leave-one-out cross validation
%    and find the best C for each of the three classifiers
for i = 1:5
  v = 10; t = 0;  % v-fold setting and linear kernel (-t 0)
  % myCV_SVM is a function for cross-validation
  [bestcv_T2(i), cmd1] = myCV_SVM(LabelT2(ind(i).train1,1), FeatureT2(ind(i).train1,:), v, t);
  [bestcv_ADC(i), cmd2] = myCV_SVM(LabelADC(ind(i).train1,1), FeatureADC(ind(i).train1,:), v, t);
  [bestcv_Ktrans(i), cmd3] = myCV_SVM(LabelKtrans(ind(i).train1,1), FeatureKtrans(ind(i).train1,:), v, t);
  cmd_first(i).cmd1 = cmd1;
  cmd_first(i).cmd2 = cmd2;
  cmd_first(i).cmd3 = cmd3;
end
% Find optimal parameters
[~, T2_index] = max(bestcv_T2);
cmd_T2 = [cmd_first(T2_index).cmd1, ' -b 1'];          % Classifier 1
[~, ADC_index] = max(bestcv_ADC);
cmd_ADC = [cmd_first(ADC_index).cmd2, ' -b 1'];        % Classifier 2
[~, Ktrans_index] = max(bestcv_Ktrans);
cmd_Ktrans = [cmd_first(Ktrans_index).cmd3, ' -b 1'];  % Classifier 3
%% 3) Generate probability features from the three linear SVMs as new features
for i = 1:5
  model(i).T2 = svmtrain(LabelT2(ind(i).train1,1), FeatureT2(ind(i).train1,:), cmd_T2);
  [predicted_label, accuracy, prob_T2] = svmpredict(LabelT2(ind(i).train2,1), FeatureT2(ind(i).train2,:), model(i).T2, '-b 1');
  [predicted_label, accuracy, prob_T2_t] = svmpredict(LabelT2(ind(i).test,1), FeatureT2(ind(i).test,:), model(i).T2, '-b 1');
  model(i).ADC = svmtrain(LabelADC(ind(i).train1,1), FeatureADC(ind(i).train1,:), cmd_ADC);
  [predicted_label, accuracy, prob_ADC] = svmpredict(LabelADC(ind(i).train2,1), FeatureADC(ind(i).train2,:), model(i).ADC, '-b 1');
  [predicted_label, accuracy, prob_ADC_t] = svmpredict(LabelADC(ind(i).test,1), FeatureADC(ind(i).test,:), model(i).ADC, '-b 1');
  model(i).Ktrans = svmtrain(LabelKtrans(ind(i).train1,1), FeatureKtrans(ind(i).train1,:), cmd_Ktrans);
  [predicted_label, accuracy, prob_Ktrans] = svmpredict(LabelKtrans(ind(i).train2,1), FeatureKtrans(ind(i).train2,:), model(i).Ktrans, '-b 1');
  [predicted_label, accuracy, prob_Ktrans_t] = svmpredict(LabelKtrans(ind(i).test,1), FeatureKtrans(ind(i).test,:), model(i).Ktrans, '-b 1');
  ProbT2(:,i) = prob_T2(:,1);
  ProbADC(:,i) = prob_ADC(:,1);
  ProbKtrans(:,i) = prob_Ktrans(:,1);
  ProbT2Test(:,i) = prob_T2_t(:,1);
  ProbADCTest(:,i) = prob_ADC_t(:,1);
  ProbKtransTest(:,i) = prob_Ktrans_t(:,1);
end
%% 4) Five-fold cross validation for RBF SVM
bestcv = 0;
for log2c = -5:15        % Parameter searching range from LibSVM
  for log2g = 3:-1:-15   % Parameter searching range from LibSVM
    cmd = ['-c ', num2str(2^log2c), ' -g ', num2str(2^log2g), ' -t 2'];
    cv = 0;
    for i = 1:5
      instance_two_train = [ProbT2(:,i), ProbADC(:,i), ProbKtrans(:,i)];
      instance_two_test  = [ProbT2Test(:,i), ProbADCTest(:,i), ProbKtransTest(:,i)];
      model = svmtrain(LabelT2(ind(i).train2,1), instance_two_train, cmd);
      [predicted_label, cvOne, prob_estimates] = svmpredict(LabelT2(ind(i).test,1), instance_two_test, model);
      cv = cv + cvOne(1);
    end
    cv = cv / 5;
    if cv >= bestcv
      bestcv = cv;
      bestc = 2^log2c;
      bestg = 2^log2g;
      cmdout = ['-c ', num2str(bestc), ' -g ', num2str(bestg), ' -t 2'];
    end
  end
end