Method and apparatus for real-time detection of polyps in optical colonoscopy
11017526 · 2021-05-25
Assignee
- Centre National De La Recherche Scientifique (Paris, FR)
- UNIVERSITE DE CERGY PONTOISE (Cergy Pontoise, FR)
- ECOLE NAT SUP ELECTRONIQUE APPLICATION (Cergy Pontoise, FR)
- Assistance Publique Hopitaux De Paris (Paris, FR)
- Sorbonne Universite (Paris, FR)
Inventors
- Quentin Angermann (Cergy, FR)
- Xavier DRAY (Paris, FR)
- Olivier Romain (Montgeron, FR)
- Aymeric Histace (Conflans Sainte-Honorine, FR)
Cpc classification
- G06F18/254 (Physics)
- G06F18/217 (Physics)
- G06F18/2148 (Physics)
Abstract
A method for performing real-time detection and displaying of polyps in optical colonoscopy includes a) acquiring and displaying a plurality of real-time images within colon regions at a video stream frame rate, each real-time image comprising a plurality of color channels; b) selecting one single color channel per real-time image for obtaining single color pixels; c) scanning the single color pixels across each real-time image with a sliding sub-window; d) for each position of the sliding sub-window, extracting a plurality of local features from the single color pixels of the real-time image; e) passing the extracted local features through a classifier to determine if a polyp is present within the sliding sub-window; f) real-time framing on display of colon regions corresponding to positions of the sliding sub-window wherein polyps are detected. A system for carrying out such a method is also provided.
Claims
1. A method for performing real-time detection and displaying of polyps in optical video-colonoscopy, wherein the method comprises the steps of: a) acquiring and displaying a plurality of real-time images within colon regions at a video stream frame rate, each real-time image comprising a plurality of color channels; b) selecting only one single color channel for all the real-time images for obtaining single color pixels; c) scanning the single color pixels across each said real-time image with a sliding sub-window; d) for each position of said sliding sub-window, extracting local features from the single color pixels within the sliding sub-window of the real-time image, all the local features being based only on single-color pixels, a local feature being a function of neighboring single color pixels surrounding a given single color pixel; e) passing the extracted local features of single color pixels of each position of the sliding sub-window through a classifier to determine if a Region of Interest (ROI), containing a polyp, is detected for at least one series of n successive images which comprises an image I.sub.f intended to display the ROI, for each image of the series, spatial fusion of the sub-windows in which a same polyp is detected, for the sub-windows overlapping each other spatially on the image by at least x % of their size, to obtain a ROI.sub.final in each successive image, for each series of images, temporal fusion of the ROI.sub.final into only one ROI.sub.displayed, for the ROI.sub.final overlapping each other by at least y % of their size in the fixed referential of the images, real-time framing on display of colon regions corresponding to the position of ROI.sub.displayed in the image I.sub.f, x and y being non-zero numbers, and the ROI being delimited by at least one sub-window generated by the sliding sub-window; and f) real-time framing on display of colon regions corresponding to the Regions Of Interest of said sliding sub-window wherein polyps are detected.
2. The method of claim 1, wherein, for each image, the step c) of scanning is performed p times, p being an odd number greater than one, each time using a different size of the sliding window, the classifier then being applied to all p scans in order to decide, with a majority vote, whether a polyp is detected or not.
3. The method of claim 1, wherein x is greater than or equal to 50, and y is greater than or equal to 70.
4. The method of claim 2, wherein the ROI.sub.displayed is calculated for some series of n successive images, n being an odd number greater than or equal to 3, and a polyp is considered present in the ROI.sub.final if the polyp is detected at least (n+1)/2 times in the series.
5. The method of claim 1, wherein the image I.sub.f is the final image of the series.
6. The method of claim 1, wherein steps a), b), c), d), e) and f) together last less than 40 ms.
7. The method of claim 1, wherein the scanning of the step c) is realized without polyp boundary detection.
8. The method of claim 1, wherein said single color channel is blue.
9. The method of claim 1, wherein local features are chosen from the group comprising local binary patterns and Haar-like features.
10. The method of claim 1, wherein each local feature is associated with a respective classifier, called a weak classifier, the classifier used in step e) of the method comprising a sum of at least one hundred weak classifiers.
11. The method of claim 10, wherein the classifier is based on a boosting algorithm.
12. The method of claim 11, wherein said boosting algorithm is cascade Adaboost.
13. The method of claim 11, further comprising a preliminary step of creating said classifier by active learning.
14. The method of claim 13, wherein said active learning is carried out using a learning database or video comprising a sequence of images, wherein said images include ground truth images of known polyps, the active learning comprising the steps of: s1) selecting an initial set of sub-images with and without polyps, extracted from a set of said images for training, and another set of said images for testing; and selecting one single color channel from all of said images for obtaining single color pixels for the sub-images used for training and the images used for testing; s2) extracting local features for training from the initial set of single-color sub-images used for training and local features for testing from the set of images for testing; s3) computing a classifier based on the boosting algorithm applied on the local features of the initial set of sub-images used for training, and testing the classifier on the local features of the sub-set of images for testing; s4) for each sliding sub-window considered on the images used for testing, identifying false positive detection cases of polyps by said classifier during said testing of step s3), and creating an additional set of sub-images which present the false positive detection cases; and s5) using said false positive detection cases of polyps detected in the additional set of sub-images to re-compute the classifier based on the boosting algorithm applied on the local features of the initial set of sub-images and on the local features of the additional set of sub-images, steps s4) to s5) being repeated a plurality of times to create a final classifier, the classifier used in step e) of the method being said final classifier.
15. The method according to claim 1, wherein said real-time images are acquired at a minimum frame rate of 24 images per second.
16. The method of claim 1, wherein said sub-windows comprise n×m pixels, with n and m greater than or equal to 30, and wherein step d) comprises extracting at least 5 local features for each single color pixel of the sliding sub-window.
17. The method according to claim 1, wherein the plurality of real-time images forms a high-definition or a standard definition video.
18. A system for real-time image detection and displaying of polyps in optical video-colonoscopy, comprising an input port for receiving a video stream, an image processor for processing images from said video stream and an output port for outputting processed images, wherein the image processor is configured for: a) acquiring and displaying a plurality of real-time images within colon regions at a video stream frame rate, each real-time image comprising a plurality of color channels; b) selecting only one single color channel for all the real-time images for obtaining single color pixels; c) scanning the single color pixels across each said real-time image with a sliding sub-window; d) for each position of said sliding sub-window, extracting local features from the single color pixels within the sliding sub-window of the real-time image, all the local features being based only on single-color pixels from the selected single color channel, a local feature being a function of neighboring single color pixels surrounding a given single color pixel; e) passing the extracted local features of single color pixels of each sliding sub-window through a classifier to determine if a polyp is present within a region, called a Region Of Interest (ROI), for at least one series of n successive images with an image I.sub.f displaying the ROI of the sliding sub-windows, for each image of the series, spatial fusion of the sub-windows in which a same polyp is detected, in each successive image, for the sub-windows overlapping each other spatially on the image by at least x % of their size, to obtain a ROI.sub.final in each successive image, for each series of images, temporal fusion of the ROI.sub.final into only one ROI.sub.displayed, for the ROI.sub.final of the series overlapping each other by at least y % of their size in the fixed referential of the images, real-time framing on display of colon regions corresponding to the position of ROI.sub.displayed in the final image I.sub.f, x and y being non-zero numbers, the ROI being delimited by at least one sub-window generated by the sliding sub-window; and f) real-time framing on display of colon regions corresponding to the Regions Of Interest of said sliding sub-window wherein polyps are detected.
19. The system of claim 18, wherein the image processor is configured for performing, for each image, the step c) of scanning p times, p being an odd number greater than one, each time using a different size of the sliding window, the classifier then being applied to all p scans in order to decide, with a majority vote, whether a polyp is detected or not.
20. The system of claim 18, wherein x is greater than or equal to 50, and y is greater than or equal to 70.
21. The system of claim 19, wherein the ROI.sub.displayed is calculated for some series of n successive images, n being an odd number greater than or equal to 3, and a polyp is considered present in the ROI.sub.final if the polyp is detected at least (n+1)/2 times in the series of n images.
22. An optical colonoscopy apparatus comprising a system according to claim 18, and an optical colonoscopy probe connected to an input port of said system.
Description
(1) A more complete understanding of the present disclosure may be acquired by referring to the following description taken in conjunction with the accompanying drawings.
(7) While the present invention is susceptible to various modifications and alternative forms, specific example embodiments thereof have been shown in the drawings and are herein described in detail. It should be understood, however, that the description herein of specific example embodiments is not intended to limit the disclosure to the particular forms disclosed herein, but on the contrary, this disclosure is to cover all modifications and equivalents as defined by the appended claims.
(9) The video may be a High Definition (HD, 1440×1080 pixels) or a Standard Definition (SD, 720×480 pixels) video.
(12) Table 1 below shows the computational time required to obtain a non-reinforced classifier (1st line) and the 1st, 2nd and 3rd reinforced classifiers (2nd, 3rd and 4th lines, respectively). These classifiers were created on the same computer (64-bit Windows, 32 GB of RAM). For each image of the training database, the researchers identified and isolated the position of one polyp (positive example) and also isolated 5 sub-images without polyps (negative examples). To test the 'blue' component, the classifier was first trained with 550 positive examples and 3000 negative examples. Then, for the active learning reinforcement, the three reinforced classifiers were trained using 6000, 7500 and 8500 negative examples, respectively.
(13) TABLE 1
  Classifier                  Positive examples used   Negative examples used   Computational time
  Non-Reinforced Classifier   550                      3000                     30 minutes
  1st Reinforced Classifier   550                      6000                     1 hour
  2nd Reinforced Classifier   550                      7500                     2 hours
  3rd Reinforced Classifier   550                      8500                     6 hours
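The reinforcement procedure behind Table 1 (train, collect false positives on polyp-free test windows, retrain with the enlarged negative set) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `CentroidClassifier` is a hypothetical stand-in for the cascade AdaBoost classifier, and the random vectors stand in for real local features.

```python
import numpy as np

class CentroidClassifier:
    """Hypothetical nearest-centroid stand-in for the boosted classifier,
    used only to keep this sketch self-contained."""
    def fit(self, X, y):
        self.pos = X[y == 1].mean(axis=0)
        self.neg = X[y == 0].mean(axis=0)
        return self
    def predict(self, X):
        d_pos = np.linalg.norm(X - self.pos, axis=1)
        d_neg = np.linalg.norm(X - self.neg, axis=1)
        return (d_pos < d_neg).astype(int)

def reinforce(clf, X_pos, X_neg, test_windows, n_rounds=3):
    """Active-learning loop sketched from steps s3)-s5): after each round,
    sub-windows wrongly flagged as polyps on the test images (false
    positives) are added to the negative set and the classifier is
    retrained. `test_windows` holds feature vectors of polyp-free
    sub-windows, so every positive prediction on them is a false positive."""
    for _ in range(n_rounds):
        X = np.vstack([X_pos, X_neg])
        y = np.concatenate([np.ones(len(X_pos), int), np.zeros(len(X_neg), int)])
        clf.fit(X, y)
        fp = test_windows[clf.predict(test_windows) == 1]
        if len(fp) == 0:
            break
        X_neg = np.vstack([X_neg, fp])   # reinforce with hard negatives
    return clf

rng = np.random.default_rng(0)
clf = reinforce(CentroidClassifier(),
                X_pos=rng.normal(1.0, 1.0, (100, 8)),
                X_neg=rng.normal(-1.0, 1.0, (100, 8)),
                test_windows=rng.normal(-1.0, 1.0, (200, 8)))
```

Each round only grows the negative set, which mirrors the table: the positive count stays at 550 while negatives grow from 3000 to 8500.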
(14) Table 2 below compares the results of the first classification performed in step s3) for different visible single color channel selections, according to various embodiments of the present invention.
(15) TABLE 2
  Local Binary Pattern         Grayscale   Red Channel   Green Channel   Blue Channel
  Classifier                   Image       Image         Image           Image
  True Positive Detections     155         238           241             254
  False Positive Detections    117         867           898             1067
  False Negative Detections    118         35            32              19
  Recall (%)                   56.78       87.18         88.24           93.04
  Precision (%)                56.99       21.54         21.16           19.23
  F2 Score (%)                 56.82       54.16         54.01           52.63
  Average Detection Time       0.221       0.092         0.066           0.051
  for 1 Image (s)
(16) From Table 2, one can see that the blue single color channel not only correctly detects the highest number of polyps, but also requires the shortest average time to process a single image.
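The blue-channel selection and the local features extracted from it can be illustrated with a short sketch. The 8-neighbour Local Binary Pattern below is the textbook variant, assumed here for illustration; the patent's exact feature set may differ, and the BGR channel ordering is the OpenCV convention.

```python
import numpy as np

def lbp_image(channel):
    """Basic 8-neighbour Local Binary Pattern of a single-colour-channel
    image (uint8, H x W). Each pixel is compared to its 8 neighbours;
    border pixels are skipped."""
    c = channel.astype(np.int32)
    center = c[1:-1, 1:-1]
    # offsets of the 8 neighbours, clockwise from top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = c[1 + dy:c.shape[0] - 1 + dy, 1 + dx:c.shape[1] - 1 + dx]
        codes |= (neigh >= center).astype(np.int32) << bit
    return codes.astype(np.uint8)

# Select the blue channel of an SD-sized BGR frame (channel 0 in BGR order)
frame = np.random.randint(0, 256, (480, 720, 3), dtype=np.uint8)
blue = frame[:, :, 0]
codes = lbp_image(blue)
```

A sliding sub-window would then read its features from `codes` (e.g. as a histogram over the window), so the per-pixel work is done once per frame.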
(18) The present invention uses the same database as the previously cited prior art of J. Bernal et al., which makes it easier to compare the performance of both techniques. Table 3 shows that the best overall trade-off is obtained by the present invention. The results of the present invention, obtained after three reinforcements of the classifier using LBPs, are shown and compared to the most up-to-date reports at the time of writing.
(19) TABLE 3
  Authors             Performances                   Database      Real-time
  J. Bernal et al.    Sensitivity = 89%, F2 = 89%    CVC-ColonDB   No (19 s/image)
  Present invention   Sensitivity = 86%, F2 = 65%    CVC-ColonDB   Yes (35 ms/image)
(20) Both methods have approximately the same sensitivity: 89% for the prior art of J. Bernal et al. and 86% for the method of the present invention. Although the two methods are on par in terms of sensitivity, their F2 scores differ substantially in favor of the prior art. The difference is even greater for the average processing time per image, with a factor of nearly 550: the prior art is almost 550 times slower than the method of the present invention, which makes the object of the invention real-time compliant while keeping satisfactory detection results.
(21) Table 4 below shows the effect of the active learning strategy on the performances (recall, precision, F2 score, average detection time) of the inventive method. It can be seen that this strategy significantly improves the overall performance, particularly the precision and F2 score.
(22) TABLE 4
  Classifier using       Without active   1st Reinforced   2nd Reinforced   3rd Reinforced
  Local Binary Pattern   learning         Classifier       Classifier       Classifier
  Recall                 93.04%           93.77%           88.28%           86.21%
  Precision              19.23%           23.66%           30.70%           32.83%
  F2 Score               52.63%           58.88%           64.20%           65.33%
  Average detection      51 ms            44 ms            40 ms            39 ms
  time for 1 image
(23) The implementation of the real-time detection method is not limited to a computer system. In other embodiments, one can take advantage of the low computational power requirements and use GPUs (Graphics Processing Units), FPGAs (Field Programmable Gate Arrays) or even integrated computer systems such as Raspberry Pis to implement such a method.
(24) The AdaBoost algorithm of the present invention is developed with OpenCV. Other embodiments may use different means or databases to develop the algorithm.
(25) Other embodiments may also use a different boosting algorithm, such as LogitBoost.
(26) In one embodiment, the scanning is performed p times, p being odd and greater than one, each time using a different size of the sliding window. The classifier is then applied to all p scans in order to decide whether a polyp is detected or not, e.g. through a majority vote (in this case, a polyp is considered present if it is detected at least (p+1)/2 times).
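The multi-scale majority vote of this embodiment can be sketched as follows; `detect` is a hypothetical classifier callback, and the scale factors are illustrative, not taken from the patent.

```python
def multiscale_vote(detect, window, scales=(0.8, 1.0, 1.2)):
    """Majority vote over p scans at different sliding-window sizes
    (p odd, here p = 3). `detect(w, h)` is assumed to return True when
    the classifier reports a polyp for a w x h sub-window at the current
    position; a polyp is kept only if it is detected in at least
    (p + 1) / 2 of the p scans."""
    p = len(scales)
    w, h = window
    hits = sum(detect(int(w * s), int(h * s)) for s in scales)
    return hits >= (p + 1) // 2

# Hypothetical classifier that only fires on windows at least 30 px wide:
# the 1.0x and 1.2x scans of a 32 x 32 window hit, so the vote passes.
assert multiscale_vote(lambda w, h: w >= 30, (32, 32)) is True
assert multiscale_vote(lambda w, h: w >= 40, (32, 32)) is False
```

With p = 3, the (p+1)/2 threshold means two out of three scales must agree, which suppresses detections that only appear at a single window size.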
(27) When using videos, e.g. at a typical rate of 25 frames/sec., rather than sets of still images, a significant improvement in performance can be obtained through a "spatio-temporal coherence processing" stage. The idea is to improve the polyp detection rate and stability by combining "present" information, provided by the current frame, with "past" information, provided by previous frames showing the same region of the colon.
(28) This approach is a spatial block fusion strategy that reduces the number of candidates by selecting as final candidate ROIs only those with the highest degree of overlap among all the candidate boxes initially provided by the method. The spatial block fusion is applied to successive images of a video to confirm or invalidate the detection of a polyp by the method of the invention.
(29) More precisely, the final sub-windows identified as polyps are defined as ROIs (Regions Of Interest). If multiple ROIs are located on the same region of the image, a fusion strategy is used: ROIs that sufficiently overlap (e.g. by 50% or more of their surface) are merged into a single final ROI, denoted ROI.sub.final.
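A minimal sketch of this spatial fusion, assuming axis-aligned (x, y, w, h) sub-windows and measuring overlap against the smaller box; the patent does not fix the exact overlap measure or merge order, so both are illustrative choices here.

```python
def overlap_ratio(a, b):
    """Overlap area divided by the smaller box area; boxes are
    (x, y, w, h) tuples in pixel coordinates."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    return iw * ih / min(aw * ah, bw * bh)

def spatial_fusion(boxes, x_pct=50):
    """Greedy spatial fusion sketch: sub-windows overlapping by at least
    x % of their size are merged into one enclosing ROI_final."""
    merged = []
    for box in boxes:
        for i, roi in enumerate(merged):
            if overlap_ratio(box, roi) * 100 >= x_pct:
                rx, ry, rw, rh = roi
                bx, by, bw, bh = box
                nx, ny = min(rx, bx), min(ry, by)
                merged[i] = (nx, ny,
                             max(rx + rw, bx + bw) - nx,
                             max(ry + rh, by + bh) - ny)
                break
        else:
            merged.append(box)
    return merged

# Two overlapping windows merge into one ROI; the distant one stays separate.
rois = spatial_fusion([(10, 10, 40, 40), (20, 20, 40, 40), (200, 200, 30, 30)])
```

If two distinct polyps are present, their windows do not meet the x % criterion and two separate ROI.sub.final survive, as paragraph (32) requires.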
(30) According to this approach, a polyp detection is confirmed at time t.sub.i, corresponding to the i-th frame, if and only if the polyp has been detected, at the same location, in at least two among the (i−2)-th, (i−1)-th and i-th frames.
(31) Majority voting can also be performed on more than three frames, and other spatio-temporal coherence processing methods may be applied to the invention.
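The two-out-of-three temporal confirmation above can be sketched with a rolling window over the last n frames. Deciding which ROIs of successive frames correspond to "the same location" (the y % overlap test) is assumed to have been resolved upstream; the class below only implements the vote.

```python
from collections import deque

class TemporalVote:
    """Temporal coherence sketch over the last n frames (n odd): a ROI is
    displayed at frame i only if a polyp was detected at roughly the same
    location in at least (n + 1) / 2 of the last n frames."""
    def __init__(self, n=3):
        self.n = n
        self.history = deque(maxlen=n)

    def update(self, detected_here):
        """`detected_here` is True when the current frame has a ROI_final
        overlapping the tracked location; returns True when the ROI
        should be displayed for this frame."""
        self.history.append(bool(detected_here))
        return sum(self.history) >= (self.n + 1) // 2

vote = TemporalVote(n=3)
results = [vote.update(d) for d in [True, False, True, True, False]]
# -> [False, False, True, True, True]: isolated hits are suppressed until
#    two of the last three frames agree.
```

The deque drops the oldest frame automatically, so each `update` call is O(n) and compatible with the per-frame time budget discussed below.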
(32) In other terms, in step e) the method comprises: passing the extracted local features of single color pixels of each position of the sliding sub-window through a classifier to determine if a Region of Interest containing a polyp is detected, for at least one series of n successive images with an image I.sub.f displaying the Region of Interest; the image I.sub.f may or may not be the final image of the series; for each image of the series, spatial fusion of the sub-windows in which a same polyp is detected, for the sub-windows overlapping each other spatially on the image by at least x % of their size, to obtain a ROI.sub.final in each successive image where a polyp is detected; thus, if there is one polyp on the image, one ROI.sub.final is obtained, and if there are two polyps, two different ROI.sub.final are obtained; for each series of images, temporal fusion of the ROI.sub.final into only one ROI.sub.displayed per detected polyp, for the ROI.sub.final of successive images overlapping each other by at least y % of their size in the referential of the images, so as to keep only one ROI.sub.displayed per detected polyp;
(33) the referential of the images, which is common to all the images, means that the mask images (having the same size) are superposed (or stacked) and the position of the ROI.sub.final of each image is compared to the positions of the ROI.sub.final of the other images of the considered series; real-time framing on display of colon regions corresponding to the position of ROI.sub.displayed in the final image I.sub.f,
(34) x and y being non-zero numbers.
(35) When the image I.sub.f is the final image of the series, the method still performs fully real-time detection, because the calculation realized in the method of the invention takes less than 30 ms, allowing the Region of Interest ROI.sub.displayed to be displayed at the same time as the image, which takes 40 ms to appear in a video (at 25 images/second).
(36) In another embodiment, the image I.sub.f can be in the middle of the series of images. For instance, if n=5, I.sub.f is preceded by two images and followed by two images.
(37) Advantageously, x is greater than or equal to 50, and y is greater than or equal to 70. These values allow the ROI to encompass the polyp with some precision.
(38) Advantageously, the ROI.sub.displayed is calculated for some series of n successive images, n being an odd integer greater than or equal to 3, and a polyp is considered present in the ROI.sub.final if the polyp is detected at least (n+1)/2 times in the series of n images.
(39) To sum up, the method takes into account three successive majority votes: a first majority vote during step c), consisting in strengthening the classification of a sub-window using different scales of the same sub-window (for instance, the polyp must have been detected on at least two out of three scales of the same sub-window; the middle size of the polyp sub-window is then stored); a second majority vote during step e), for spatial coherence, consisting in comparing and merging the polyp sub-windows of step c) when the overlapping criterion x is met, to obtain a ROI.sub.final for each image of the sequence (for instance, a majority vote is realized on the polyp middle sub-windows to obtain the ROI.sub.final of the image); and a final majority vote during step e), for temporal coherence, consisting in comparing the ROI.sub.final of at least 3 successive images and merging them when the overlapping criterion y is met, to obtain the ROI.sub.displayed for the series of images.
(40) Table 5 below shows the results obtained using two different kinds of local features for polyp detection, LBP and Haar-like features, with (STC) and without spatio-temporal coherence processing. Active learning was not used ("N0" suffix).
(41) The following metrics were used to measure the performances obtained by the inventive method on videos: Prec: Precision; Rec: Recall; F1: score combining Precision and Recall; PDR: Polyp Detection Rate; MPT: Mean Processing Time per frame; MNFP: Mean Number of False Positives per frame; RT: Reaction Time (latency between the first detection of a polyp by the algorithm and its actual appearance in the ground truth).
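For reference, the precision, recall and F-scores reported in the tables follow the standard definitions, computable from the raw detection counts:

```python
def detection_metrics(tp, fp, fn, beta=1.0):
    """Precision, recall and F-beta score from detection counts.
    beta=1 gives the F1 used in Tables 5-6; beta=2 gives the F2 score
    used in Tables 2-4 (recall weighted four times heavier)."""
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    fbeta = (1 + beta**2) * prec * rec / (beta**2 * prec + rec)
    return prec, rec, fbeta

# Blue Channel column of Table 2: TP = 254, FP = 1067, FN = 19
prec, rec, f2 = detection_metrics(254, 1067, 19, beta=2.0)
# prec ~ 19.23 %, rec ~ 93.04 %, F2 ~ 52.63 %, matching the table
```

This also makes the trade-off in Table 2 explicit: the blue channel maximizes recall at the cost of precision, and the F2 score, by weighting recall more heavily, still leaves the channels close together.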
(42) TABLE 5
  Methods       PDR    MPT      MNFP   Prec     Rec      F1       RT
  LBPN0         100%   140 ms   3.5    12.42%   54.65%   20.24%   7.2 (0.3 s)
  HaarN0        100%   24 ms    1.4    23.29%   46.82%   31.10%   17.5 (0.7 s)
  LBPN0_STC     100%   140 ms   1.9    16.25%   41.25%   23.31%   35.0 (1.4 s)
  HaarN0_STC    100%   36 ms    0.9    27.02%   39.61%   32.12%   38.3 (1.5 s)
(43) In Table 5, it can be noticed that for all 18 considered videos, the polyp was detected in a significant number of frames, leading to a PDR of 100%. The Mean Processing Time per frame is only 24 ms using Haar-like features without spatio-temporal coherence and 36 ms with it, which is fully compatible with real-time use. It can also be observed that spatio-temporal coherence leads, for both local features, to an improvement of the global performance in terms of Precision and F1 score, as well as a reduction of the mean number of false positives. The Reaction Time also increases when using spatio-temporal coherence processing, with a mean delay of 1.5 s, which nevertheless remains compatible with clinical use.
(44) Table 6 shows the results obtained using both spatio-temporal coherence processing and active learning. It can be seen that the combined use of the active learning strategy and spatio-temporal coherence processing leads to a significant improvement of the overall performance in terms of Precision and Recall, without altering the 100% Polyp Detection Rate. HaarN1 appears to be the best local feature to use, with a MPT of only 21 ms and a Reaction Time of only 1.1 s.
(45) In Table 6:
(46) N0 represents no active reinforcement;
(47) N1 represents one active reinforcement;
(48) N2 represents two active reinforcements.
(49) TABLE 6
  Method        PDR    MPT      MNFP   Prec     Rec      F1       RT
  LBPN0_STC     100%   140 ms   1.9    16.25%   41.25%   23.31%   35.0 (1.4 s)
  LBPN1_STC     100%   160 ms   1.1    27.11%   46.02%   34.12%   43.7 (1.7 s)
  LBPN2_STC     100%   162 ms   0.7    29.88%   34.96%   32.22%   45.9 (1.8 s)
  HaarN0_STC    100%   36 ms    0.9    27.02%   39.61%   32.12%   38.3 (1.5 s)
  HaarN1_STC    100%   21 ms    0.6    39.14%   42.56%   48.78%   27.3 (1.1 s)