Endoscopic Guidance Using Neural Networks
20220151708 · 2022-05-19
Inventors
CPC classification
A61B10/04
HUMAN NECESSITIES
A61B34/20
HUMAN NECESSITIES
G16H20/40
PHYSICS
G16H50/20
PHYSICS
G06N3/082
PHYSICS
G06T2207/10101
PHYSICS
International classification
Abstract
A method comprises obtaining an endoscope; obtaining a needle; inserting the endoscope into the needle to obtain a system; inserting the system into an animal body; and distinguishing components of the animal body using the endoscope and while the system remains in the animal body. A system comprises a needle; and an endoscope inserted into the needle and configured to: store a convolutional neural network (CNN); distinguish among a cortex of a kidney of an animal body, a medulla of the kidney, and a calyx of the kidney using the CNN; and distinguish between vascular tissue and non-vascular tissue in the animal body using the CNN.
Claims
1. A method comprising: obtaining an endoscope; obtaining a needle; inserting the endoscope into the needle to obtain a system; inserting the system into an animal body; and distinguishing components of the animal body using the endoscope and while the system remains in the animal body.
2. The method of claim 1, further comprising training a convolutional neural network (CNN) to distinguish the components.
3. The method of claim 2, further comprising further training the CNN to distinguish among a cortex of a kidney, a medulla of the kidney, and a calyx of the kidney.
4. The method of claim 2, further comprising further training the CNN to distinguish blood vessels from other components.
5. The method of claim 2, further comprising incorporating the CNN into the endoscope.
6. The method of claim 2, wherein the CNN comprises an input layer, a convolutional layer, a max-pooling layer, a flatten layer, dense layers, and an output layer.
7. The method of claim 1, wherein the animal body is a human body.
8. The method of claim 1, further comprising: further distinguishing a calyx of a kidney from a cortex of the kidney and a medulla of the kidney; inserting, based on the distinguishing, the system into the calyx; and removing the endoscope from the system to obtain the needle.
9. The method of claim 8, further comprising: further distinguishing the calyx from a blood vessel; and avoiding contact between the system and the blood vessel.
10. The method of claim 8, further comprising removing kidney stones while the needle remains in the calyx.
11. The method of claim 1, further comprising: inserting, based on the distinguishing, the system into a kidney of the animal body; and obtaining a biopsy of the kidney.
12. The method of claim 1, wherein the system is a forward-view endoscopic optical coherence tomography (OCT) system.
13. A system comprising: a needle; and an endoscope inserted into the needle and configured to: store a convolutional neural network (CNN); distinguish among a cortex of a kidney of an animal body, a medulla of the kidney, and a calyx of the kidney using the CNN; and distinguish between vascular tissue and non-vascular tissue in the animal body using the CNN.
14. The system of claim 13, wherein the CNN comprises an input layer, a convolutional layer, a max-pooling layer, a flatten layer, dense layers, and an output layer.
15. The system of claim 13, wherein the animal body is a human body.
16. The system of claim 13, wherein the system is a forward-view endoscopic optical coherence tomography (OCT) system.
17. The system of claim 13, wherein the endoscope has a diameter of about 1.3 millimeters (mm).
18. The system of claim 13, wherein the endoscope has a length of about 138.0 millimeters (mm).
19. The system of claim 13, wherein the endoscope is configured to have a view angle of 11.0°.
20. The system of claim 13, wherein the needle is configured to remove a kidney stone from the kidney or obtain a biopsy of the kidney.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0007] For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.
DETAILED DESCRIPTION
[0029] Disclosed herein are embodiments for endoscopic guidance using neural networks. In an embodiment, a forward-view OCT endoscopic system images kidney tissues lying ahead of a PCN needle during PCN surgery to access the renal calyx. This may be done to remove kidney stones. In another embodiment, similar imaging is used for percutaneous renal biopsies, urine drainage, urine diversion, and other therapeutic interventions in the kidney. The embodiments provide for neural networks, for instance CNNs, which can distinguish types of renal tissue and other components. The types of renal tissue include the cortex, medulla, and calyx. Other components include blood vessels and diseased renal tissues. By distinguishing the types of renal tissue and other components, the embodiments provide for injection of a needle into the desired tissue and provide for avoidance of undesired components.
[0030] In an experiment, images of the renal cortex, medulla, and calyx were obtained from ten porcine kidneys using the OCT endoscope system. The tissue types were clearly distinguishable in the OCT endoscopic images owing to their morphological and tissue differences. To further improve the guidance efficacy and reduce the learning burden on clinical doctors, a deep-learning-based, computer-aided diagnosis platform automatically classified the OCT images by renal tissue type. A tissue type classifier was developed using the ResNet34, ResNet50, and MobileNetv2 CNN architectures. Nested cross-validation and testing were used for model selection and performance benchmarking to account for the large biological variability among kidneys through uncertainty quantification. The predictions from the CNNs were interpreted to identify the regions in representative OCT images that the CNNs relied on for classification.
[0031] ResNet50-based CNN models achieved an average classification accuracy of 82.6%±3.0%. The classification precisions were 79%±4% for cortex, 85%±6% for medulla, and 91%±5% for calyx, and the classification recalls were 68%±11% for cortex, 91%±4% for medulla, and 89%±3% for calyx. Interpretation of the CNN predictions showed the discriminative characteristics in the OCT images of the three renal tissue types. The results validated the technical feasibility of using this novel imaging platform to automatically recognize the images of renal tissue structures ahead of the PCN needle in PCN surgery.
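The per-class precision and recall figures reported above can be computed from a confusion matrix. The following is a minimal sketch with hypothetical counts, not the study's data:

```python
import numpy as np

# Rows = true class, columns = predicted class (hypothetical counts).
classes = ["cortex", "medulla", "calyx"]
cm = np.array([[80, 15,  5],
               [10, 85,  5],
               [ 5,  5, 90]])

precision = cm.diagonal() / cm.sum(axis=0)   # TP / (TP + FP), per predicted class
recall = cm.diagonal() / cm.sum(axis=1)      # TP / (TP + FN), per true class
accuracy = cm.diagonal().sum() / cm.sum()    # overall fraction labeled correctly

for name, p, r in zip(classes, precision, recall):
    print(f"{name}: precision={p:.2f}, recall={r:.2f}")
print(f"accuracy={accuracy:.3f}")
```

With these illustrative counts, accuracy is 255/300 = 0.85, mirroring how the 82.6% average in the experiment aggregates per-kidney classification results.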
[0032] The following abbreviations apply:
[0033] ASIC: application-specific integrated circuit
[0034] AUC: area under the ROC curve
[0035] BD: balanced detector
[0036] BE: Barrett's esophagus
[0037] CCD: charge-coupled device
[0038] CNN: convolutional neural network
[0039] CPU: central processing unit
[0040] CT: computed tomography
[0041] DAQ: data acquisition
[0042] dB: decibel(s)
[0043] DOCT: doppler optical coherence tomography
[0044] DSP: digital signal processor
[0045] EO: electrical-to-optical
[0046] FC: fiber coupler
[0047] FOV: field of view
[0048] FPGA: field-programmable gate array
[0049] GI: gastrointestinal
[0050] GRAD-CAM: gradient-weighted class activation mapping
[0051] GRIN: gradient-index
[0052] GSM: galvanometer scanning mirror
[0053] H&E: hematoxylin and eosin
[0054] kHz: kilohertz
[0055] MEMS: microelectromechanical systems
[0056] mIoU: mean intersection-over-union
[0057] mm: millimeter(s)
[0058] MRI: magnetic resonance imaging
[0059] mW: milliwatt(s)
[0060] MZI: Mach-Zehnder interferometer
[0061] nm: nanometer(s)
[0062] OCT: optical coherence tomography
[0063] OE: optical-to-electrical
[0064] PC: polarization controller
[0065] PCN: percutaneous nephrostomy
[0066] PCNL: percutaneous nephrolithotomy
[0067] PT: pre-trained
[0068] RAM: random-access memory
[0069] ResNet: residual neural network
[0070] RF: radio frequency
[0071] RI: randomly-initialized
[0072] ROM: read-only memory
[0073] ROC: receiver operating characteristic
[0074] RX: receiver unit
[0075] SGD: stochastic gradient descent
[0076] SRAM: static RAM
[0077] SS-OCT: swept-source OCT
[0078] TCAM: ternary content-addressable memory
[0079] TX: transmitter unit
[0080] 2D: two-dimensional
[0081] 3D: three-dimensional
[0082] μm: micrometer(s)
[0083] °: degree(s).
[0084] Before describing various embodiments of the present disclosure in more detail by way of exemplary description, examples, and results, it is to be understood that the present disclosure is not limited in application to the details of methods and compositions as set forth in the following description. The present disclosure is capable of other embodiments or of being practiced or carried out in various ways. As such, the language used herein is intended to be given the broadest possible scope and meaning; and the embodiments are meant to be exemplary, not exhaustive. Also, it is to be understood that the phraseology and terminology employed herein is for the purpose of description and should not be regarded as limiting unless otherwise indicated as so. Moreover, in the following detailed description, numerous specific details are set forth in order to provide a more thorough understanding of the disclosure. However, it will be apparent to a person having ordinary skill in the art that the embodiments of the present disclosure may be practiced without these specific details. In other instances, features which are well known to persons of ordinary skill in the art have not been described in detail to avoid unnecessary complication of the description.
[0085] Unless otherwise defined herein, scientific and technical terms used in connection with the present disclosure shall have the meanings that are commonly understood by those having ordinary skill in the art. Further, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular.
[0086] All patents, published patent applications, and non-patent publications mentioned in the specification are indicative of the level of skill of those skilled in the art to which the present disclosure pertains. All patents, published patent applications, and non-patent publications referenced in any portion of this application are herein expressly incorporated by reference in their entirety to the same extent as if each individual patent or publication was specifically and individually indicated to be incorporated by reference.
[0087] As utilized in accordance with the methods and compositions of the present disclosure, the following terms, unless otherwise indicated, shall be understood to have the following meanings:
[0088] The use of the word “a” or “an” when used in conjunction with the term “comprising” in the claims and/or the specification may mean “one,” but it is also consistent with the meaning of “one or more,” “at least one,” and “one or more than one.” The use of the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or when the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or.” The use of the term “at least one” will be understood to include one as well as any quantity more than one, including but not limited to, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 100, or any integer inclusive therein. The term “at least one” may extend up to 100 or 1000 or more, depending on the term to which it is attached; in addition, the quantities of 100/1000 are not to be considered limiting, as higher limits may also produce satisfactory results. In addition, the use of the term “at least one of X, Y and Z” will be understood to include X alone, Y alone, and Z alone, as well as any combination of X, Y and Z.
[0089] As used herein, all numerical values or ranges include fractions of the values and integers within such ranges and fractions of the integers within such ranges unless the context clearly indicates otherwise. Thus, to illustrate, reference to a numerical range, such as 1-10 includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, as well as 1.1, 1.2, 1.3, 1.4, 1.5, etc., and so forth. Reference to a range of 1-50 therefore includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, etc., up to and including 50, as well as 1.1, 1.2, 1.3, 1.4, 1.5, etc., 2.1, 2.2, 2.3, 2.4, 2.5, etc., and so forth. Reference to a series of ranges includes ranges which combine the values of the boundaries of different ranges within the series. Thus, to illustrate reference to a series of ranges, for example, of 1-10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 60-75, 75-100, 100-150, 150-200, 200-250, 250-300, 300-400, 400-500, 500-750, 750-1,000, includes ranges of 1-20, 10-50, 50-100, 100-500, and 500-1,000, for example. A reference to degrees such as 1 to 90 is intended to explicitly include all degrees in the range.
[0091] As used herein, the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.
[0092] The term “or combinations thereof” as used herein refers to all permutations and combinations of the listed items preceding the term. For example, “A, B, C, or combinations thereof” is intended to include at least one of: A, B, C, AB, AC, BC, or ABC, and if order is important in a particular context, also BA, CA, CB, CBA, BCA, ACB, BAC, or CAB. Continuing with this example, expressly included are combinations that contain repeats of one or more item or term, such as BB, AAA, AAB, BBC, AAABCCCC, CBBAAA, CABABB, and so forth. The skilled artisan will understand that typically there is no limit on the number of items or terms in any combination, unless otherwise apparent from the context.
[0093] Throughout this application, the terms “about” and “approximately” are used to indicate that a value includes the inherent variation of error. Further, in this detailed description, each numerical value (e.g., degrees or frequency) should be read once as modified by the term “about” (unless already expressly so modified), and then read again as not so modified unless otherwise indicated in context. As noted, any range listed or described herein is intended to include, implicitly or explicitly, any number within the range, particularly all integers, including the end points, and is to be considered as having been so stated. For example, “a range from 1 to 10” is to be read as indicating each possible number, particularly integers, along the continuum between about 1 and about 10. Thus, even if specific data points within the range, or even no data points within the range, are explicitly identified or specifically referred to, it is to be understood that any data points within the range are to be considered to have been specified, and that the inventors possessed knowledge of the entire range and the points within the range. The use of the term “about” may mean a range including ±10% of the subsequent number unless otherwise stated.
[0094] As used herein, the term “substantially” means that the subsequently described parameter, event, or circumstance completely occurs or that the subsequently described parameter, event, or circumstance occurs to a great extent or degree. For example, the term “substantially” means that the subsequently described parameter, event, or circumstance occurs at least 90% of the time, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, of the time, or means that the dimension or measurement is within at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, of the referenced dimension or measurement (e.g., degrees, frequency, width, length, etc.).
[0095] As used herein any reference to “one embodiment” or “an embodiment” means that a particular element, feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
System
[0097] The light source 105 generates a laser beam with a center wavelength of 1300 nm and a bandwidth of 100 nm. The wavelength-swept frequency (A-scan) rate is 200 kHz with an output power of ~25 mW. The FC 110 splits the laser beam into a first beam carrying 97% of the laser power on the top path 115 and a second beam carrying 3% of the laser power on the bottom path 120. The second beam is delivered into the MZI 125, which generates a frequency clock signal. The frequency clock signal triggers the OCT sampling procedure and passes to the DAQ board 135. The first beam passes to the circulator 145, which transmits light in only one direction: light entering port 1 exits only from port 2 and is then split evenly toward the reference arm 185 and the sample arm 190. Backscattered light from both the reference arm 185 and the sample arm 190 forms interference fringes at the FC 150, which are transmitted to the BD 140. The interference fringes from different depths received by the BD 140 are encoded with different frequencies. The BD 140 transmits an output signal to the DAQ board 135 and the computer 130 for processing. Cross-sectional information can be obtained through a Fourier transform of the interference fringes.
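The depth-encoding principle described above can be illustrated with a minimal simulation (assumed numeric values for demonstration only; this is not the system's actual processing code): a reflector at a given depth produces a fringe whose frequency is proportional to that depth, and a Fourier transform of the fringe recovers the depth profile (A-scan).

```python
import numpy as np

# Simulated SS-OCT A-scan reconstruction (illustrative values only).
# A reflector at optical path difference z yields an interference fringe
# that is sinusoidal in wavenumber k; the FFT of the fringe over k
# produces a peak at a bin proportional to z.

n_samples = 1024                       # spectral samples per A-scan
k = np.arange(n_samples) / n_samples   # normalized wavenumber axis
true_bin = 120                         # fringe frequency, proportional to depth

fringe = np.cos(2 * np.pi * true_bin * k)   # detected interference fringe
a_scan = np.abs(np.fft.rfft(fringe))        # magnitude = depth profile (A-scan)

peak_bin = int(np.argmax(a_scan[1:]) + 1)   # skip the DC bin
print(peak_bin)   # -> 120: peak location encodes the reflector depth
```

Deeper reflectors produce higher-frequency fringes, which is why the text notes that fringes from different depths are "encoded with different frequencies."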
[0098] In the experiment, the lenses 175, 180 were stabilized in front of the GSMs 195, 197. The proximal GRIN lens entrance of the endoscope was placed close to the focal plane of the objective lens. The GRIN lens preserves the spatial relationship between the entrance and the output (distal end) and further to the sample. Therefore, one- or two-directional scanning can be readily performed on the proximal GRIN lens surface to create 2D or 3D images. In addition, the same GRIN rod lens was placed in the light path of the reference arm 185 to compensate for light dispersion and expand the length of the reference arm 185. The PCs 155, 160 decreased background noise. The forward-view endoscopic OCT system 100 had an axial resolution of ~11 μm and a lateral resolution of ~20 μm in tissue. The lateral imaging FOV was around 1.25 mm. The sensitivity of the forward-view endoscopic OCT system 100 was optimized to 92 dB, calculated using a silver mirror with a calibrated attenuator.
Data Acquisition
[0099] Ten fresh porcine kidneys were obtained from a local slaughterhouse. The cortex, medulla, and calyx of the porcine kidneys were exposed and imaged in the experiment. Renal tissue types can be identified from their anatomic appearance. The forward-view endoscopic OCT system 100 was placed against different renal tissues for image acquisition. To mimic a clinical situation, some force was applied while imaging the ex-vivo kidney tissues to generate tissue compression. 3D images of 320×320×480 pixels on the X, Y, and Z axes (Z represents the depth direction) were obtained with a pixel size of 6.25 μm on all three axes. Therefore, the size of the original 3D images was 2.00 mm×2.00 mm×3.00 mm. For every kidney sample, at least 30 original 3D OCT images were obtained for each tissue type, and each 3D tissue scan took no more than 2 seconds. Afterwards, the original 3D images were separated into 2D cross-sectional images as shown in
[0100] Since the GRIN lens is cylindrical, the 3D OCT images obtained were also in the cylindrical shape. Therefore, not all of the 2D cross-sectional images contained the same structural signal of the kidney. Only the 2D images with sufficient tissue structural information (cross-sectional images close to the center of the 3D cylindrical structures) were subsequently selected and utilized for the image preprocessing. At the end of imaging, tissues of cortex, medulla, and calyx of the porcine kidneys were excised and processed for histology to compare with corresponding OCT results. The tissues were fixed with 10% formalin, embedded in paraffin, sectioned (4 μm thick) and stained with H&E for histological analysis. Images were taken by Keyence Microscope BZ-X800.
[0101] Although the three tissue types showed different imaging features for visual recognition, it would take time and expertise for doctors to differentiate them during surgeries. To improve the efficiency, we developed deep learning methods for automatic tissue classification based on the imaging data. In total, ten porcine kidneys were imaged in this study. For each kidney, 1,000 2D cross-sectional images were obtained for each of the cortex, medulla, and calyx. For convenient analysis and to increase the speed of deep-learning processing of the OCT images, a custom MATLAB algorithm was designed to recognize the surface of the kidney tissue on the 2D cross-sectional images. The algorithm automatically cropped the images from 320×480 pixels to 235×301 pixels. Therefore, all the 2D cross-sectional images had the same dimensions and covered the same FOV before deep-learning processing.
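The surface-based cropping step can be sketched as follows. This is a minimal illustration on a synthetic image with an assumed intensity-threshold surface detector; the study's actual MATLAB algorithm is not reproduced here:

```python
import numpy as np

# Illustrative tissue-surface detection + cropping for a 2D OCT cross-section
# (320 lateral x 480 depth pixels -> 235 x 301 crop, sizes from the study).
# The threshold and synthetic image below are assumptions for demonstration.

rng = np.random.default_rng(0)
img = rng.random((480, 320)) * 0.1     # depth x lateral, background noise
img[150:400, :] += 0.8                 # synthetic tissue starting at depth row 150

def crop_to_surface(image, out_depth=301, out_width=235, thresh=0.5):
    """Crop a fixed window starting at the shallowest tissue-surface row."""
    above = image > thresh
    # first bright row in each lateral column (columns with no signal -> last row)
    first_rows = np.where(above.any(axis=0), above.argmax(axis=0), image.shape[0])
    top = int(first_rows.min())        # shallowest detected surface position
    left = (image.shape[1] - out_width) // 2
    return image[top:top + out_depth, left:left + out_width]

cropped = crop_to_surface(img)
print(cropped.shape)   # (301, 235)
```

Anchoring the crop at the detected surface keeps the same depth extent of tissue in every image, which is what lets all cross-sections cover the same FOV before training.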
CNN Training
[0102] A CNN was used to classify the images of the renal cortex, medulla, and calyx. ResNet34, ResNet50, and MobileNetv2 were tested using TensorFlow 2.3 in Open-CE version 0.1.
[0103] Pre-trained ResNet50 and MobileNetv2 models on the ImageNet dataset were imported. The output layer of the models was changed to one containing three softmax output neurons for cortex, medulla, and calyx. The input images were preprocessed by resizing to 224×224 resolution, replicating the input channel to 3 channels, and scaling the pixel intensities to [−1, 1]. Model fine-tuning was conducted in two stages. First, the output layer was trained with all the other layers frozen. The SGD optimizer was used with a learning rate of 0.2, a momentum of 0.3, and a decay of 0.01. Then, the entire model was unfrozen and trained. The SGD with Nesterov momentum optimizer was used with a learning rate of 0.01, a momentum of 0.9, and a decay of 0.001. Early stopping with a patience of 10 and a maximum of 50 epochs was used for the pre-trained ResNet50. Early stopping with a patience of 20 and a maximum of 100 epochs was used for MobileNetv2.
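The input preprocessing described above can be sketched in a minimal, framework-free form (an illustration assuming 8-bit input intensities and nearest-neighbor resizing; the study's actual pipeline may differ): resize to 224×224, replicate the single channel to three, and scale intensities to [−1, 1].

```python
import numpy as np

def preprocess_oct(img_u8, size=224):
    """Prepare a single-channel OCT image for an ImageNet-pretrained CNN.

    Steps mirror the text: resize to size x size (nearest-neighbor here,
    as an assumption), replicate 1 channel to 3, scale [0, 255] -> [-1, 1].
    """
    h, w = img_u8.shape
    rows = np.arange(size) * h // size         # nearest-neighbor row indices
    cols = np.arange(size) * w // size         # nearest-neighbor column indices
    resized = img_u8[np.ix_(rows, cols)].astype(np.float32)
    three_ch = np.repeat(resized[..., None], 3, axis=-1)
    return three_ch / 127.5 - 1.0              # scale intensities to [-1, 1]

# Example on a synthetic 235 x 301 cropped image (sizes from the study).
img = np.random.default_rng(1).integers(0, 256, size=(235, 301), dtype=np.uint8)
x = preprocess_oct(img)
print(x.shape)   # (224, 224, 3), intensities within [-1, 1]
```

Channel replication is needed because ImageNet-pretrained backbones expect three input channels, while OCT cross-sections are single-channel intensity images.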
[0104] The ResNet34 and ResNet50 architectures were also trained using randomly initialized weights. The mean pixel value of the training dataset was used to center the training, validation, and test datasets. The input layer was modified to accept the single input channel of the OCT images, and the output layer was changed for the classification of the three tissue types. For ResNet50, the SGD optimizer with Nesterov momentum was used with a learning rate of 0.01, a momentum of 0.9, and a decay of 0.01. ResNet50 was trained with a maximum of 50 epochs, early stopping with a patience of 10, and a batch size of 32. For ResNet34, the Adam optimizer was used with a learning rate of 0.001, a beta1 of 0.9, a beta2 of 0.9999, and an epsilon of 1E-7. ResNet34 was trained with a maximum of 200 epochs, early stopping with a patience of 10, and a batch size of 512.
Validation and Testing
[0105] A nested cross-validation and testing procedure was used to estimate the validation performance and the test performance of the models across the 10 kidneys with uncertainty quantification. The pseudo-code of the nested cross-validation and testing is shown below.
TABLE-US-00001
# 10-fold cross-testing loop
for kidney i in the 10 kidneys do
    Hold out kidney i in the test set
    # model optimization loop
    for each model configuration do
        # 9-fold cross-validation loop
        for kidney j in the remaining 9 kidneys do
            Use kidney j as the validation set
            Train a model using the remaining 8 kidneys as the training set
            Benchmark the validation performance using kidney j
        end for
        Estimate the mean validation accuracy and its standard error
    end for
    Select the best model configuration based on the validation performance
    Train a model with the selected configuration using the 9 kidneys
    Benchmark the test performance of this model using kidney i
end for
Summarize the test performance of this procedure
[0106] In the 10-fold cross-testing, one kidney was selected in turn as the test set. In the 9-fold cross-validation, the remaining nine kidneys were partitioned 8:1 between the training set and the validation set. Each kidney had a total of 3,000 images, including 1,000 images for each tissue type. The validation performance of a model was tracked based on its classification accuracy on the validation kidney. The classification accuracy is the percentage of correctly labeled images out of all 3,000 images of a kidney.
[0107] The 9-fold cross-validation loop was used to compare the performance of ResNet34, ResNet50, and MobileNetv2 and to optimize the key hyperparameters of these models, such as pre-trained versus randomly initialized weights, learning rates, and numbers of epochs. The model configuration with the highest average validation accuracy was selected for the cross-testing loop. The cross-testing loop enabled iterative benchmarking of the selected model across all 10 kidneys, giving a better estimate of the generalization error with uncertainty quantification.
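The nested cross-validation and testing procedure can be sketched in plain Python as follows. This is an illustrative skeleton in which the train/evaluate step is a placeholder standing in for the actual CNN training:

```python
# Skeleton of the nested procedure: 10-fold cross-testing over kidneys,
# with a 9-fold cross-validation loop inside for model selection.

kidneys = list(range(10))                          # one group per kidney
configs = ["ResNet34", "ResNet50", "MobileNetv2"]  # candidate configurations

def train_and_score(config, train_ids, eval_id):
    """Placeholder: would train `config` on train_ids and return accuracy on eval_id."""
    return 0.8                                     # dummy score for illustration

test_scores = []
for test_id in kidneys:                            # 10-fold cross-testing loop
    inner = [k for k in kidneys if k != test_id]   # 9 remaining kidneys
    # Model optimization: 9-fold cross-validation over the remaining kidneys.
    mean_val = {}
    for config in configs:
        scores = []
        for val_id in inner:
            train_ids = [k for k in inner if k != val_id]  # 8 training kidneys
            scores.append(train_and_score(config, train_ids, val_id))
        mean_val[config] = sum(scores) / len(scores)
    best = max(mean_val, key=mean_val.get)         # best validation accuracy
    # Retrain on all 9 non-test kidneys, then benchmark on the held-out kidney.
    test_scores.append(train_and_score(best, inner, test_id))

print(len(test_scores))   # 10 test-performance estimates, one per kidney
```

Splitting at the kidney level, rather than the image level, is what accounts for the biological variability among kidneys: no kidney ever contributes images to both the training and evaluation sides of a split.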
[0108] GRAD-CAM was used to explain the predictions of a selected CNN model by highlighting the important regions in the image for the prediction outcome.
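The core GRAD-CAM computation can be sketched independently of any framework (a minimal illustration on assumed feature maps and gradients, not tied to the specific models above): the channel weights are the spatial average of the gradients, and the map is the ReLU of the weighted sum of feature maps.

```python
import numpy as np

# Minimal GRAD-CAM computation on assumed inputs:
#   feature_maps: (C, H, W) activations of the last convolutional layer
#   grads:        (C, H, W) gradients of the class score w.r.t. those activations

def grad_cam(feature_maps, grads):
    weights = grads.mean(axis=(1, 2))                  # alpha_k: spatial average of gradients
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted sum over channels -> (H, W)
    cam = np.maximum(cam, 0.0)                         # ReLU keeps positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()                          # normalize to [0, 1] for display
    return cam

rng = np.random.default_rng(2)
fmap = rng.random((8, 7, 7))    # hypothetical 8-channel 7x7 activations
grads = rng.random((8, 7, 7))   # hypothetical gradients of the class score
cam = grad_cam(fmap, grads)
print(cam.shape)                # (7, 7) heatmap, upsampled onto the image in practice
```

In practice the resulting low-resolution heatmap is upsampled to the input image size and overlaid on the OCT image to highlight the discriminative regions.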
OCT Imaging of Different Renal Tissues
[0111] The renal calyx in
CNN Development and Benchmarking Results
[0114] There was substantial variability in the test accuracy among different kidneys. While three kidneys had test accuracies higher than 92% (softmax score threshold of 0.333), the kidney in the sixth fold had the lowest test accuracy of 67.7%. Therefore, the current challenge in the image classification mainly comes from the anatomic differences among the samples.
Detecting Blood Vessels
[0120] Real-time blood vessel detection with the forward-imaging OCT/DOCT needle was demonstrated in another five perfused human kidneys. During insertion of the OCT needle into the kidney in the PCN procedure, the blood vessels in front of the needle tip were detected by Doppler OCT.
[0121] To improve the accuracy of image segmentation, a novel nnU-net framework was trained and tested using 100 2D Doppler OCT images. The blood vessels in these 100 images were first manually labeled to mark the blood vessel regions as shown in
[0122] After obtaining the predicted regions by nnU-net as shown in
[0123] These preliminary data clearly demonstrated at least three favorable outcomes. First, the thin-diameter forward-imaging OCT/DOCT needle can detect the blood vessels in front of the needle tip in real time in the human kidney. Second, the newly developed nnU-net model can achieve >88% mIoU for 2D Doppler OCT images. Third, the size and location of a blood vessel can be accurately predicted. Thus, this showed a viable approach to preventing accidental blood vessel ruptures.
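The mIoU metric cited above can be computed as follows (a minimal sketch on assumed binary vessel masks, not the nnU-net evaluation code itself):

```python
import numpy as np

def iou(pred, truth):
    """Intersection-over-union of two binary segmentation masks."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union if union else 1.0   # two empty masks count as a perfect match

# Hypothetical 4x4 vessel masks: the prediction covers 2 of the 4 true pixels.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True                       # 4 true vessel pixels
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:2] = True                        # predicts 2 of them, no false positives
print(iou(pred, truth))                      # intersection 2 / union 4 = 0.5

# mIoU averages the per-image IoU over the test images.
miou = float(np.mean([iou(pred, truth), iou(truth, truth)]))
print(miou)                                  # (0.5 + 1.0) / 2 = 0.75
```

An mIoU above 88%, as reported for the 2D Doppler OCT images, thus means the predicted vessel regions overlap the labeled regions almost completely on average.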
CONCLUSION
[0124] The feasibility of an OCT endoscopic system for PCN surgery guidance was investigated. Three porcine kidney tissue types, the cortex, medulla and calyx, were imaged. These three kidney tissues show different structural features, which can be further used for tissue type recognition. To increase the image recognition efficiency and reduce the learning burden of the clinical doctors, CNN methods were developed and evaluated for image classification and recognition. ResNet50 had the best performance compared to ResNet34 and PT MobileNetv2 and achieved an average classification accuracy of 82.6%±3.0%.
[0125] The porcine kidney samples were obtained from a local slaughterhouse without controlling the sample preservation or the time after death. Biological changes may have occurred in the ex-vivo kidneys, including collapse of some nephron structures such as the renal tubules. This may have made tissue recognition more difficult, especially the classification between the cortex and the medulla. Characteristic renal structures in the cortex can be clearly imaged by OCT in both well-preserved ex-vivo human kidneys and living kidneys, as verified in an ongoing laboratory study using well-preserved human kidneys. Additionally, the nephron structures distributed in the renal cortex and the medulla are different. These additional features in the renal cortex and the medulla will improve the recognition of these two tissue types and increase the classification accuracy of future CNN models when imaging in-vivo samples or well-preserved ex-vivo samples. The study established the feasibility of automatic tissue recognition using CNNs and provided information for model selection and hyperparameter optimization in future CNN model development using in-vivo pig kidneys and well-preserved ex-vivo human kidneys.
[0126] To translate the proposed OCT probe into the clinic, the endoscope will be assembled with an appropriate diameter and length into the clinically used PCN needle. In current PCN punctures, a trocar needle is inserted into the kidney. Since the trocar has a hollow structure, the endoscope can be fixed within the trocar needle. The OCT endoscope can then be inserted into the kidney together with the trocar needle. After the trocar needle tip arrives at the destination (such as the kidney pelvis), the OCT endoscope is withdrawn from the trocar needle and the other surgical processes can continue. The whole puncture therefore causes no additional invasiveness. Since the needle keeps moving during the puncture, there is tight contact between the needle tip and the tissue, so blood, if any, will not accumulate in front of the needle tip. From previous experience guiding epidural anesthesia with the OCT endoscope in an in-vivo pig experiment, the presence of blood is not a substantial issue. The diameter of the GRIN rod lens used in the study was 1.3 mm. In the future, the current setup will be improved with a smaller GRIN rod lens that can fit inside the 18-gauge PCN needle clinically used in the PCN puncture. Furthermore, the GSM device will be miniaturized based on MEMS technology, which will ease operation and is important for translating the OCT endoscope to clinical applications. The currently employed OCT system has a scanning speed of up to 200 kHz, so the 2D tissue images in front of the PCN needle can be provided to surgeons in real time. Using ultra-high-speed laser scanning and a data processing system, 3D images of the detected sample can be obtained in real time. In the next step, 3D images may be acquired to further improve classification accuracy because of their added information content.
Exemplary Method
[0127]
[0128] The method 1000 may comprise additional embodiments. For instance, the method 1000 further comprises training a CNN to distinguish the components. The method 1000 further comprises further training the CNN to distinguish among a cortex of a kidney, a medulla of the kidney, and a calyx of the kidney. The method 1000 further comprises further training the CNN to distinguish blood vessels from other components. The method 1000 further comprises incorporating the CNN into the endoscope. The CNN comprises an input layer, a convolutional layer, a max-pooling layer, a flatten layer, dense layers, and an output layer. The animal body is a human body. The method 1000 further comprises further distinguishing a calyx of a kidney from a cortex of the kidney and a medulla of the kidney; inserting, based on the distinguishing, the system into the calyx; and removing the endoscope from the system to obtain the needle. The method 1000 further comprises further distinguishing the calyx from a blood vessel; and avoiding contact between the system and the blood vessel. The method 1000 further comprises removing kidney stones while the needle remains in the calyx. The method 1000 further comprises inserting, based on the distinguishing, the system into a kidney of the animal body; and obtaining a biopsy of the kidney. The system is a forward-view endoscopic OCT system.
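The CNN recited above comprises an input layer, a convolutional layer, a max-pooling layer, a flatten layer, dense layers, and an output layer. As an illustrative sketch only (the layer sizes, kernel counts, random weights, and the use of plain NumPy rather than a deep-learning framework are assumptions for exposition, not the disclosed implementation), a minimal forward pass through such an architecture for three tissue classes (cortex, medulla, calyx) could look like:

```python
import numpy as np

def conv2d(x, kernels):
    # Valid (no-padding) convolution with ReLU; x: (H, W), kernels: (n, kh, kw)
    n, kh, kw = kernels.shape
    H, W = x.shape
    out = np.zeros((n, H - kh + 1, W - kw + 1))
    for k in range(n):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(x[i:i + kh, j:j + kw] * kernels[k])
    return np.maximum(out, 0)

def max_pool(x, size=2):
    # Non-overlapping max-pooling over each feature map
    n, H, W = x.shape
    H2, W2 = H // size, W // size
    out = np.zeros((n, H2, W2))
    for k in range(n):
        for i in range(H2):
            for j in range(W2):
                out[k, i, j] = x[k, i * size:(i + 1) * size,
                                 j * size:(j + 1) * size].max()
    return out

def softmax(z):
    # Numerically stable softmax over the class logits
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
image = rng.random((16, 16))                       # toy stand-in for an OCT image patch
kernels = rng.standard_normal((4, 3, 3))           # convolutional layer: 4 kernels
feat = max_pool(conv2d(image, kernels))            # conv + ReLU, then max-pooling
flat = feat.reshape(-1)                            # flatten layer
w1 = rng.standard_normal((flat.size, 8))           # dense (hidden) layer weights
hidden = np.maximum(flat @ w1, 0)                  # dense layer with ReLU
w2 = rng.standard_normal((8, 3))                   # output layer: 3 tissue classes
probs = softmax(hidden @ w2)                       # cortex / medulla / calyx probabilities
```

In a trained model the kernels and dense-layer weights would be learned from labeled OCT images rather than drawn at random; the sketch only shows how data flows through the claimed layer sequence.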
Exemplary Computing Apparatus
[0129]
[0130] The processor 1130 is any combination of hardware, middleware, firmware, or software. The processor 1130 comprises any combination of one or more CPU chips, cores, FPGAs, ASICs, or DSPs. The processor 1130 communicates with the ingress ports 1110, the RX 1120, the TX 1140, the egress ports 1150, and the memory 1160. The processor 1130 comprises an endoscopic guidance component 1170, which implements the disclosed embodiments. The inclusion of the endoscopic guidance component 1170 therefore provides a substantial improvement to the functionality of the apparatus 1100 and effects a transformation of the apparatus 1100 to a different state. Alternatively, the memory 1160 stores the endoscopic guidance component 1170 as instructions, and the processor 1130 executes those instructions.
[0131] The memory 1160 comprises any combination of disks, tape drives, or solid-state drives. The apparatus 1100 may use the memory 1160 as an overflow data storage device to store programs when the apparatus 1100 selects those programs for execution and to store instructions and data that the apparatus 1100 reads during execution of those programs. The memory 1160 may be volatile or non-volatile and may be any combination of ROM, RAM, TCAM, or SRAM.
[0132] A computer program product may comprise computer-executable instructions for storage on a non-transitory medium and that, when executed by a processor, cause an apparatus to perform any of the embodiments. The non-transitory medium may be the memory 1160, the processor may be the processor 1130, and the apparatus may be the apparatus 1100.
[0133] While several embodiments have been provided in the present disclosure, it should be understood that the disclosed systems and methods might be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system, or certain features may be omitted or not implemented.
[0134] In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, components, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled may be directly coupled, or may be indirectly coupled or communicating through some interface, device, or intermediate component, whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and may be made without departing from the spirit and scope disclosed herein.