Method for analyzing biological specimens by spectral imaging
10067051 · 2018-09-04
Assignee
Inventors
- Max Diem (Boston, MA, US)
- Benjamin Bird (Jamaica Plain, MA, US)
- Milos Miljkovic (Jamaica Plain, MA, US)
- Stanley H. Remiszewski (Spencer, MA, US)
- Clay M. Thompson (Camano Island, WA, US)
Cpc classification
G01N21/31
PHYSICS
G16Z99/00
PHYSICS
G16H50/20
PHYSICS
G06T2207/20101
PHYSICS
A61B5/0075
HUMAN NECESSITIES
G06T7/30
PHYSICS
International classification
G01N21/31
PHYSICS
A61B5/00
HUMAN NECESSITIES
G01N33/50
PHYSICS
G06T7/30
PHYSICS
Abstract
A method for analyzing biological specimens by spectral imaging to provide a medical diagnosis includes obtaining spectral and visual images of biological specimens and registering the images to detect cell abnormalities, pre-cancerous cells, and cancerous cells. This method eliminates the bias and unreliability of diagnoses that are inherent in standard histopathological and other spectral methods. In addition, a method for correcting confounding spectral contributions that are frequently observed in microscopically acquired infrared spectra of cells and tissue includes performing a phase correction on the spectral data. This phase correction method may be used to correct various types of absorption spectra that are contaminated by reflective components.
Claims
1. A method executed by a system for analyzing biological specimens by spectral imaging, comprising: acquiring a spectral image of the biological specimen; acquiring a visual image of the biological specimen; and registering the visual image and spectral image to generate a registered image, wherein acquiring the spectral image of the biological specimen further comprises: acquiring spectral data from the biological specimen; performing pre-processing on the spectral data by selecting a spectral range, computing a second derivative, performing reverse Fourier transformation, performing zero-filling and reverse Fourier transformation, and performing a phase correction; performing multivariate analysis on the spectral data; and preparing, by the system, a grayscale or pseudo-color spectral image based on the multivariate analysis of the spectral data.
2. The method of claim 1, further comprising: storing the registered image.
3. The method of claim 2, wherein registering the visual image and spectral image comprises: aligning corresponding control points on the spectral image and visual image.
4. The method of claim 1, wherein the biological specimen comprises cells or tissue.
5. The method of claim 1, wherein acquiring spectral data from the biological specimen comprises: performing infrared spectroscopy, Raman spectroscopy, visible, terahertz, or fluorescence spectroscopy on the biological specimen.
6. The method of claim 1, wherein multivariate analysis of the spectral data comprises: performing unsupervised analysis.
7. The method of claim 6, wherein performing unsupervised analysis comprises: performing hierarchical cluster analysis (HCA) or principal component analysis (PCA).
8. The method of claim 1, wherein multivariate analysis on the spectral data comprises: performing analysis of the data via a supervised algorithm.
9. The method of claim 8, wherein performing analysis of the data via a supervised algorithm comprises: performing analysis of the data via a machine learning algorithm selected from the group consisting of artificial neural networks (ANNs), hierarchical artificial neural networks (hANN), support vector machines (SVM), and random forest algorithms.
10. The method of claim 1, wherein acquiring a visual image of the biological specimen comprises: obtaining a digital image of the biological specimen.
11. The method of claim 1, further comprising: providing one or more of a diagnostic decision and prognostic decision.
12. The method of claim 11, further comprising: obtaining a selected region of a spectral image; comparing data for the selected region to data in a repository that is associated with a disease or condition; determining any correlation between the repository data and the data for the selected region; and outputting a classification of the selected region based on the determination, wherein the classification is used for one or more of the diagnostic decisions and prognostic decisions.
13. The method of claim 12, wherein the repository data is obtained for a plurality of images, and wherein each of the plurality of images in the repository is associated with a disease or condition.
14. A system for analyzing biological specimens by spectral imaging, comprising: a memory in communication with a processor, wherein the memory and the processor are cooperatively configured to: acquire a spectral image of the biological specimen; acquire a visual image of the biological specimen; and register the visual image and spectral image to generate a registered image, wherein acquiring the spectral image of the biological specimen further comprises: acquiring spectral data from the biological specimen; performing pre-processing on the spectral data by selecting a spectral range, computing a second derivative, performing reverse Fourier transformation, performing zero-filling and reverse Fourier transformation, and performing a phase correction; performing multivariate analysis on the spectral data; and preparing a grayscale or pseudo-color spectral image based on the multivariate analysis of the spectral data.
15. The system of claim 14, further comprising: storing the registered image.
16. The system of claim 15, wherein registering the visual image and spectral image further comprises: aligning corresponding control points on the spectral image and visual image.
17. The system of claim 14, wherein the biological specimen comprises cells or tissue.
18. The system of claim 14, wherein acquiring spectral data from the biological specimen comprises: performing infrared spectroscopy, Raman spectroscopy, visible, terahertz, or fluorescence spectroscopy on the biological specimen.
19. The system of claim 14, wherein multivariate analysis of the spectral data comprises: performing unsupervised analysis.
20. The system of claim 19, wherein performing unsupervised analysis comprises: performing hierarchical cluster analysis (HCA) or principal component analysis (PCA).
21. The system of claim 14, wherein multivariate analysis on the spectral data comprises: performing analysis of the data via a supervised algorithm.
22. The system of claim 21, wherein performing analysis of the data via a supervised algorithm further comprises: performing analysis of the data via a machine learning algorithm selected from the group consisting of artificial neural networks (ANNs), hierarchical artificial neural networks (hANN), support vector machines (SVM), and random forest algorithms.
23. The system of claim 14, wherein acquiring a visual image of the biological specimen comprises: obtaining a digital image of the biological specimen.
24. The system of claim 14, further comprises: obtaining a selected region of a spectral image; comparing data for the selected region to data in a repository that is associated with a disease or condition; determining any correlation between the repository data and the data for the selected region; and outputting a classification of the selected region based on the determination.
25. The system of claim 24, wherein the classification is used for one or more of diagnostic decisions and prognostic decisions.
26. The system of claim 24, wherein the repository data is obtained for a plurality of images, and wherein each of the plurality of images in the repository is associated with a disease or condition.
27. A non-transitory computer-readable medium storing instructions that when executed by a computer device, cause the computer device to: acquire a spectral image of the biological specimen; acquire a visual image of the biological specimen; and register the visual image and spectral image to generate a registered image, wherein acquiring the spectral image of the biological specimen further comprises: acquiring spectral data from the biological specimen; performing pre-processing on the spectral data by selecting a spectral range, computing a second derivative, performing reverse Fourier transformation, performing zero-filling and reverse Fourier transformation, and performing a phase correction; performing multivariate analysis on the spectral data; and preparing a grayscale or pseudo-color spectral image based on the multivariate analysis of the spectral data.
Description
DESCRIPTION OF THE DRAWINGS
(48) The file of this patent contains at least one drawing executed in color. Copies of this patent with color drawing(s) will be provided by the Patent and Trademark Office upon request and payment of the necessary fee.
DETAILED DESCRIPTION
(49) Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which aspects of this invention belong. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, this specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
(50) One aspect of the invention relates to a method for analyzing biological specimens by spectral imaging to provide a medical diagnosis. The biological specimens may be medical specimens obtained by surgical methods, biopsies, and cultured samples. The method includes obtaining spectral and visual images of biological specimens and registering the images to detect cell abnormalities, pre-cancerous cells, and cancerous cells. The biological specimens may include tissue or cellular samples, but tissue samples are preferred for some applications. This method identifies abnormal or cancerous conditions and other disorders including, but not limited to, breast, uterine, renal, testicular, ovarian, or prostate cancer, small cell lung carcinoma, non-small cell lung carcinoma, and melanoma, as well as non-cancerous effects including, but not limited to, inflammation, necrosis, and apoptosis.
(51) One method in accordance with aspects of the invention overcomes the obstacles discussed above in that it eliminates or generally reduces the bias and unreliability of diagnoses that are inherent in standard histopathological and other spectral methods. In addition, it allows access to a spectral database of tissue types that is produced by quantitative and reproducible measurements and is analyzed by an algorithm that is calibrated against classical histopathology. Via this method, for example, abnormal and cancerous cells may be detected earlier than they can be identified by the related art, including standard histopathological or other spectral techniques.
(52) A method in accordance with aspects of the invention is illustrated in the flowchart of
(53) Biological Section
(54) According to the example method of the invention shown in
(55) A tissue section that is to be subjected to spectral and visual image acquisition may be prepared from frozen or from paraffin embedded tissue blocks according to methods used in standard histopathology. The section may be mounted on a slide that may be used both for spectral data acquisition and visual pathology. For example, the tissue may be mounted either on infrared transparent microscope slides comprising a material including, but not limited to, calcium fluoride (CaF.sub.2) or on infrared reflective slides, such as commercially available low-e slides. After mounting, paraffin-embedded samples may be subjected to deparaffinization.
(56) Spectral Image
(57) According to aspects of the invention, the step of acquiring a spectral image of the biological section 302 shown in
(58) Spectral Data
(59) As set forth in
(60) The spectral data may be collected by methods including, but not limited to infrared, Raman, visible, terahertz, and fluorescence spectroscopy. Infrared spectroscopy may include, but is not limited to, attenuated total reflectance (ATR) and attenuated total reflectance Fourier transform infrared spectroscopy (ATR-FTIR). In general, infrared spectroscopy may be used because of its fingerprint sensitivity, which is also exhibited by Raman spectroscopy. Infrared spectroscopy may be used with larger tissue sections and to provide a dataset with a more manageable size than Raman spectroscopy. Furthermore, infrared spectroscopy data may be more amenable to fully automatic data acquisition and interpretation. Additionally, infrared spectroscopy may have the necessary sensitivity and specificity for the detection of various tissue structures and diagnosis of disease.
(61) The intensity axis of the spectral data, in general, expresses absorbance, reflectance, emittance, scattering intensity, or any other suitable measure of light power. The wavelength axis may relate to the actual wavelength, wavenumber, frequency, or energy of electromagnetic radiation.
(62) Infrared data acquisition may be carried out using presently available Fourier transform (FT) infrared imaging microspectrometers, tunable laser-based imaging instruments, such as quantum cascade or non-linear optical devices, or other functionally equivalent instruments based on different technologies. The acquisition of spectral data using a tunable laser is described further in U.S. patent application Ser. No. 13/084,287 titled, Tunable Laser-Based Infrared Imaging System and Method of Use Thereof, which is incorporated herein in its entirety by reference.
(63) According to one method in accordance with aspects of the invention, a pathologist or technician may select any region of a stained tissue section and receive a spectroscopy-based assessment of the tissue region in real-time, based on the hyperspectral dataset collected for the tissue before staining. Spectral data may be collected for each of the pixels in a selected unstained tissue sample. Each of the collected spectra contains a fingerprint of the chemical composition of each of the tissue pixels. Acquisition of spectral data is described in WO 2009/146425, which is incorporated herein in its entirety by reference.
(64) In general, the spectral data includes hyperspectral datasets, which are constructs including N = n·m individual spectra or spectral vectors (absorption, emission, reflectance, etc.), where n and m are the number of pixels in the x and y dimensions of the image, respectively. Each spectrum is associated with a distinct pixel of the sample, and can be located by its coordinates x and y, where 1 ≤ x ≤ n and 1 ≤ y ≤ m. Each vector has k intensity data points, which are usually equally spaced in the frequency or wavenumber domain.
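The dataset layout described above can be sketched as follows. This is a minimal illustration that assumes a plain nested-list representation; the names `make_hypercube` and `spectrum_at` are hypothetical and do not come from the source.

```python
def make_hypercube(n, m, k, fill=0.0):
    """Build an n-by-m grid of k-point spectra and report N = n * m."""
    cube = [[[fill] * k for _ in range(m)] for _ in range(n)]
    return cube, n * m


def spectrum_at(cube, x, y):
    """Return the spectral vector at 1-based pixel coordinates (x, y)."""
    n, m = len(cube), len(cube[0])
    if not (1 <= x <= n and 1 <= y <= m):
        raise IndexError("expected 1 <= x <= n and 1 <= y <= m")
    return cube[x - 1][y - 1]


# Example: a 4 x 3 pixel image with 5 intensity points per spectrum.
cube, total = make_hypercube(n=4, m=3, k=5)
```

The 1-based lookup mirrors the coordinate ranges given in the text; a production system would more likely hold the data in a single contiguous array.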
(65) The pixel size of the spectral image may generally be selected to be smaller than the size of a typical cell so that subcellular resolution may be obtained. The size may also be determined by the diffraction limit of the light, which is typically about 5 μm to about 7 μm for infrared light. Thus, for a 1 mm.sup.2 section of tissue, about 140.sup.2 to about 200.sup.2 individual pixel infrared spectra may be collected. For each of the N pixels of a spectral hypercube, its x and y coordinates and its intensity vector (intensity vs. wavelength) are stored.
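The pixel counts quoted above follow from dividing the side of a 1 mm (1000 micrometer) square section by the pixel size. A short check, with the helper name `pixels_per_side` being a hypothetical choice:

```python
def pixels_per_side(side_um, pixel_um):
    """Whole pixels that fit along one side of a square section."""
    return int(side_um // pixel_um)


# 1 mm = 1000 micrometers; pixel sizes of 7 and 5 micrometers.
for pixel_um in (7, 5):
    side = pixels_per_side(1000, pixel_um)
    print(pixel_um, side, side * side)
```

With 7 micrometer pixels this gives 142 pixels per side and with 5 micrometer pixels 200 per side, in line with the approximate figures in the text.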
(66) Pre-Processing
(67) Subjecting the spectral data to a form of pre-processing may be helpful to isolate the data pertaining to the cellular material of interest and to remove confounding spectral features. Referring to
(68) Pre-processing may involve creating a binary mask to separate diagnostic from non-diagnostic regions of the sampled area to isolate the cellular data of interest. Methods for creating a binary mask are disclosed in WO 2009/146425, which is incorporated by reference herein in its entirety.
(69) A method of pre-processing, according to another aspect of the invention, permits the correction of dispersive line shapes in observed absorption spectra by a phase correction algorithm that optimizes the separation of real and imaginary parts of the spectrum by adjusting the phase angle between them. This method, which is computationally fast, is based on a revised phase correction approach, in which no input data are required. Although phase correction is used in the pre-processing of raw interferograms in FTIR and NMR spectroscopy (in the latter case, the interferogram is usually referred to as the free induction decay, FID) where the proper phase angle can be determined experimentally, the method of this aspect of the invention differs from earlier phase correction approaches in that it takes into account mitigating factors, such as Mie, RMie and other effects based on the anomalous dispersion of the refractive index, and it may be applied to spectral datasets retroactively.
(70) The pre-processing method of this aspect of the invention transforms corrupted spectra into Fourier space by reverse Fourier transform (FT). The reverse FT results in a real and an imaginary interferogram. The second half of each interferogram is zero-filled and forward Fourier transformed individually. This process yields a real spectral part that exhibits the same dispersive band shapes obtained via numeric Kramers-Kronig (KK) transform, and an imaginary part that includes the absorptive line shapes. By recombining the real and imaginary parts with a correct phase angle between them, phase-corrected, artifact-free spectra are obtained.
(71) Since the phase required to correct the contaminated spectra cannot be determined experimentally and varies from spectrum to spectrum, phase angles are determined using a stepwise approach between −90° and 90° in user-selectable steps. The best spectrum is determined by analysis of peak position and intensity criteria, both of which vary during phase correction. The broad undulating Mie scattering contributions are not explicitly corrected for in this approach, but they disappear by performing the phase correction computation on second derivative spectra, which exhibit a scatter-free background.
(72) According to aspects of the invention, the pre-processing step 402 of
(73) Spectral Range
(74) In step 501, each spectrum in the hyperspectral dataset is pre-processed to select the most appropriate spectral range (fingerprint region). This range may be about 800 to about 1800 cm.sup.-1, for example, which includes heavy atom stretching as well as X-H (X: heavy atom with atomic number ≤ 12) deformation modes. A typical example spectrum, superimposed on a linear background, is shown in
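The range selection of step 501 can be sketched as a simple filter on the wavenumber axis. The function name `select_range` and the inclusive bounds are assumptions made for illustration; the default limits are the example values from the text.

```python
def select_range(wavenumbers, intensities, low=800.0, high=1800.0):
    """Keep only data points whose wavenumber lies in [low, high]."""
    kept = [(w, a) for w, a in zip(wavenumbers, intensities) if low <= w <= high]
    return [w for w, _ in kept], [a for _, a in kept]


# Hypothetical five-point spectrum spanning 600 to 2000 wavenumbers.
axis = [600.0, 800.0, 1200.0, 1800.0, 2000.0]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
fingerprint_axis, fingerprint_values = select_range(axis, values)
```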
(75) Second Derivative of Spectra
(76) The second derivative of each spectrum is then computed in step 502 of the flowchart of
(77) Second derivative spectra may have the advantage of being free of baseline slopes, including the slowly changing Mie scattering background. The second derivative spectra may be nearly completely devoid of baseline effects due to scattering and non-resonant Mie scattering, but still contain the effects of RMieS. The second derivative spectra may be vector normalized, if desired, to compensate for varying sample thickness. An example of a second derivative spectrum is shown in
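To illustrate why second derivative spectra are free of baseline slopes, the sketch below uses a central finite difference, which is an assumption; the text does not state which differentiation scheme is used. It also shows the optional vector normalization. The function names are hypothetical.

```python
import math


def second_derivative(y, step=1.0):
    """Central finite-difference second derivative; end points are dropped."""
    return [(y[i - 1] - 2.0 * y[i] + y[i + 1]) / (step * step)
            for i in range(1, len(y) - 1)]


def vector_normalize(v):
    """Scale a spectrum to unit Euclidean length (unchanged if all zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


# A purely linear baseline has a second derivative of zero everywhere.
baseline = [2.0 * i + 1.0 for i in range(6)]
flat = second_derivative(baseline)
```

Because any constant offset or linear slope differentiates to zero, only curved features such as absorption bands survive this step.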
(78) Reverse Fourier Transform
(79) In step 503 of the flowchart of
(80) Zero-Fill and Forward Fourier Transform
(81) The second half of both the real and imaginary interferogram for each spectrum is subsequently zero-filled in step 504. These zero-filled interferograms are subsequently forward Fourier transformed to yield a real and an imaginary spectral component with dispersive and absorptive band shapes, respectively.
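Steps 503 and 504 can be sketched with a naive discrete Fourier transform. This is an illustration under stated assumptions: the helper names are hypothetical, the real and imaginary interferograms are carried together as one complex sequence rather than transformed individually, and which output component carries the absorptive or dispersive shape depends on the transform convention chosen.

```python
import cmath


def dft(values, sign):
    """Naive O(N^2) DFT; sign=-1 is forward, sign=+1 is reverse (scaled by 1/N)."""
    n = len(values)
    scale = 1.0 / n if sign > 0 else 1.0
    return [scale * sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                        for t, v in enumerate(values))
            for k in range(n)]


def zero_filled_transform(spectrum):
    """Reverse FT, zero-fill the second half, then forward FT."""
    interferogram = dft(spectrum, sign=+1)
    half = len(interferogram) // 2
    # Zero the second half of both the real and the imaginary interferogram.
    truncated = interferogram[:half] + [0j] * (len(interferogram) - half)
    complex_spectrum = dft(truncated, sign=-1)
    real_part = [c.real for c in complex_spectrum]
    imag_part = [c.imag for c in complex_spectrum]
    return real_part, imag_part
```

A fast Fourier transform routine would replace `dft` in practice; the quadratic version is used here only to keep the sketch dependency-free.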
(82) Phase Correction
(83) The real (RE) and imaginary (IM) parts resulting from the Fourier analysis are subsequently phase corrected, as shown in step 505 of the flowchart of
(84)
where φ is the phase angle.
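Stated as an assumption rather than as the formula of this step, a conventional way to write a phase correction of the real and imaginary parts is a rotation by φ. The primed symbols RE′ and IM′ for the phase-shifted parts, and the sign convention, are illustrative choices and are not taken from the source.

```latex
\begin{pmatrix} RE' \\ IM' \end{pmatrix}
=
\begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}
\begin{pmatrix} RE \\ IM \end{pmatrix}
```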
(85) Since the phase angle for the phase correction is not known, the phase angle may be varied over −π/2 ≤ φ ≤ π/2 in user-defined increments, and a spectrum with the least residual dispersive line shape may be selected. The phase angle that produces the largest intensity after phase correction may be assumed to be the uncorrupted spectrum, as shown in
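The stepwise phase search can be sketched as below. Several points are assumptions made for illustration: the rotation form and its sign convention, the use of the rotated real part as the candidate spectrum, and the use of the largest absolute value as the "largest intensity" criterion. The function names are hypothetical.

```python
import math


def rotate(re, im, phi):
    """Rotate the (RE, IM) pair of every data point by the phase angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    re_rot = [r * c + i * s for r, i in zip(re, im)]
    im_rot = [-r * s + i * c for r, i in zip(re, im)]
    return re_rot, im_rot


def best_phase(re, im, steps=36):
    """Scan phi over [-pi/2, pi/2] and keep the angle with the largest peak."""
    best = None
    for n in range(steps + 1):
        phi = -math.pi / 2 + n * math.pi / steps
        re_rot, _ = rotate(re, im, phi)
        peak = max(abs(v) for v in re_rot)
        if best is None or peak > best[0]:
            best = (peak, phi, re_rot)
    return best[1], best[2]


# Hypothetical test case: a clean band mixed by a 30 degree phase error.
clean = [0.0, 1.0, 5.0, 1.0, 0.0]
mixed_re, mixed_im = rotate(clean, [0.0] * len(clean), math.pi / 6)
phi, corrected = best_phase(mixed_re, mixed_im)
```

The `steps` parameter plays the role of the user-defined increment; 36 steps correspond to increments of 5 degrees.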
(86) The phase correction method, in accordance with aspects of the invention described in steps 501-505, works well both with absorption and derivative spectra. This approach even solves a complication that may occur if absorption spectra are used, in that if absorption spectra are contaminated by scattering effects that mimic a baseline slope, as shown schematically in
Example 1: Operation of Phase Correction Algorithm
(87) An example of the operation of the phase correction algorithm is provided in
(88) When data segmentation by hierarchical cluster analysis (HCA) was first carried out on this example lymph node section, the image shown in
(89) The difficulties in segmenting this dataset can be gauged by inspection of
(90) The results from a subsequent HCA are shown in
(91) The advantages of the pre-processing method in accordance with aspects of the invention over previous methods of spectral correction include a fast execution rate of about 5000 spectra/second and the fact that no a priori information on the dataset is required. In addition, the phase correction algorithm can be incorporated into spectral imaging and digital staining diagnostic routines for automatic cancer detection and diagnosis in spectral cytopathology (SCP) and spectral histopathology (SHP). Further, phase correction greatly improves the quality of the image, which is helpful for image registration accuracy and in diagnostic alignment and boundary representations.
(92) Further, the pre-processing method in accordance with aspects of the invention may be used to correct a wide range of absorption spectra contaminated by reflective components. Such contamination occurs frequently in other types of spectroscopy in which band shapes are distorted by dispersive line shapes, such as Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS) and Attenuated Total Reflection (ATR), and in other forms of spectroscopy in which mixing of the real and imaginary parts of the complex refractive index, or dielectric susceptibility, occurs to a significant extent, such as may be present with Coherent Anti-Stokes Raman Spectroscopy (CARS).
(93) Multivariate Analysis
(94) Multivariate analysis may be performed on the pre-processed spectral data to detect spectral differences, as outlined in step 403 of the flowchart of
(95) For example, unsupervised methods, such as HCA and principal component analysis (PCA), or supervised methods, such as machine learning algorithms including, but not limited to, artificial neural networks (ANNs), hierarchical artificial neural networks (hANN), support vector machines (SVM), and/or random forest algorithms, may be used. Unsupervised methods are based on the similarity or variance in the dataset, respectively, and segment or cluster a dataset by these criteria, requiring no information except the dataset for the segmentation or clustering. Thus, these unsupervised methods create images that are based on the natural similarity or dissimilarity (variance) in the dataset. Supervised algorithms, on the other hand, require reference spectra, such as representative spectra of cancer, muscle, or bone, for example, and classify a dataset based on certain similarity criteria to these reference spectra.
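The contrast between the two families of methods can be sketched in a few lines. Both functions are simplified stand-ins chosen for illustration: `hca_labels` is a basic average-linkage agglomerative clustering with Euclidean distance, and `classify` assigns the class of the best-correlated reference spectrum. Neither the linkage, the distance, nor the correlation criterion is specified by the source, and all names are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hca_labels(spectra, n_clusters):
    """Unsupervised: merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(spectra))]
    while len(clusters) > max(n_clusters, 1):
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(euclidean(spectra[i], spectra[j])
                            for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    labels = [0] * len(spectra)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0


def classify(spectrum, references):
    """Supervised: return the name of the best-correlated reference spectrum."""
    return max(references, key=lambda name: pearson(spectrum, references[name]))
```

Note that `hca_labels` needs only the dataset itself, whereas `classify` cannot run without labeled reference spectra, which is the distinction drawn in the text.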
(96) HCA techniques are disclosed in Bird (Bird et al., Spectral detection of micro-metastases in lymph node histo-pathology, J. Biophoton. 2, No. 1-2, 37-46 (2009)), which is incorporated herein in its entirety. PCA is disclosed in WO 2009/146425, which is incorporated by reference herein in its entirety.
(97) Examples of supervised methods for use in accordance with aspects of the invention may be found in P. Lasch et al., Artificial neural networks as supervised techniques for FT-IR microspectroscopic imaging, J. Chemometrics 2006; 20: 209-220 (hereinafter Lasch); M. Miljkovic et al., Label-free imaging of human cells: algorithms for image reconstruction of Raman hyperspectral datasets, Analyst, 2010, xx, 1-13 (hereinafter Miljkovic); and A. Dupuy et al., Critical Review of Published Microarray Studies for Cancer Outcome and Guidelines on Statistical Analysis and Reporting, JNCI, Vol. 99, Issue 2, Jan. 17, 2007 (hereinafter Dupuy), each of which is incorporated by reference herein in its entirety.
(98) Grayscale or Pseudo-Color Spectral Image
(99) Similarly grouped data from the multivariate analysis may be assigned the same color code. The grouped data may be used to construct digitally stained grayscale or pseudo-color maps, as set forth in step 404 of the flowchart of
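Assigning one color code per group can be sketched as a lookup from cluster label to color. The row-by-row pixel ordering, the palette values, and the function name `pseudo_color_image` are assumptions made for illustration.

```python
def pseudo_color_image(labels, n, m, palette):
    """Arrange one cluster label per pixel into an n-by-m grid of RGB tuples."""
    if len(labels) != n * m:
        raise ValueError("expected one label per pixel")
    return [[palette[labels[x * m + y] % len(palette)] for y in range(m)]
            for x in range(n)]


# Hypothetical two-cluster result for a 2 x 3 pixel image.
palette = [(255, 0, 0), (0, 0, 255)]
image = pseudo_color_image([0, 0, 1, 1, 0, 1], n=2, m=3, palette=palette)
```

A grayscale map would use the same lookup with a palette of gray levels instead of RGB colors.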
(100) An example of a spectral image prepared after multivariate analysis by HCA is provided in
(101) The construction of pseudo-color spectral images by HCA analysis is discussed in Bird.
(102) An example of a spectral image prepared after analysis by ANN is provided in
(103) Visual Image
(104) A visual image of the same biological section obtained in step 302 may be acquired, as indicated by step 303 in
(105) A visual image of a histopathological sample may be obtained using a standard visual microscope, such as one commonly used in pathology laboratories. The microscope may be coupled to a high resolution digital camera that captures the field of view of the microscope digitally. This digital real-time image is based on the standard microscopic view of a stained piece of tissue, and is indicative of tissue architecture, cell morphology and staining patterns. The digital image may include many pixel tiles that are combined via image stitching, for example, to create a photograph. According to aspects of the invention, the digital image that is used for analysis may include an individual tile or many tiles that are stitched together into a photograph. This digital image may be saved and displayed on a computer screen.
(106) Registration of Spectral and Visual Images
(107) According to one method in accordance with aspects of the invention, once the spectral and visual images have been acquired, the visual image of the stained tissue may be registered with a digitally stained grayscale or pseudo-color spectral image, as indicated in step 304 in the flowchart of
(108) In accordance with aspects of the invention, image registration may be performed in a number of ways. For example, a common coordinate system may be established for the visual and spectral images. If establishing a common coordinate system is not possible or is not desired, the images may be registered by point mapping to bring an image into alignment with another image. In point mapping, control points on both of the images that identify the same feature or landmark in the images are selected. Based on the positions of the control points, spatial mapping of both images may be performed. For example, at least two control points may be used. To register the images, the control points in the visible image may be correlated to the corresponding control points in the spectral image and aligned together.
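Point mapping from two control points can be sketched with complex arithmetic. The sketch assumes the mapping is limited to translation, rotation, and uniform scaling, in which case two control-point pairs determine it exactly; the function names are hypothetical.

```python
def fit_similarity(src_pts, dst_pts):
    """Fit z' = a*z + b from two control-point pairs given as (x, y) tuples."""
    s1, s2 = (complex(*p) for p in src_pts)
    d1, d2 = (complex(*p) for p in dst_pts)
    if s1 == s2:
        raise ValueError("control points must be distinct")
    a = (d2 - d1) / (s2 - s1)  # combined rotation and scale
    b = d1 - a * s1            # translation
    return a, b


def map_point(transform, point):
    """Map a pixel coordinate from the source image into the target image."""
    a, b = transform
    z = a * complex(*point) + b
    return z.real, z.imag


# Hypothetical control points: spectral-image pixels and their visual-image matches.
transform = fit_similarity([(0, 0), (1, 0)], [(10, 20), (10, 22)])
mapped = map_point(transform, (0, 1))
```

With more than two control points, a least-squares fit would be the usual choice, since the points will rarely agree exactly.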
(109) In one variation according to aspects of the invention, control points may be selected by placing reference marks on the slide containing the biological specimen. Reference marks may include, but are not limited to, ink, paint, and a piece of a material, including, but not limited to polyethylene. The reference marks may have any suitable shape or size, and may be placed in the central portion, edges, or corners of the slide, as long as they are within the field of view. The reference mark may be added to the slide while the biological specimen is being prepared. If a material having known spectral patterns, including, but not limited to a chemical substance, such as polyethylene, and a biological substance, is used in a reference mark, it may be also used as a calibration mark to verify the accuracy of the spectral data of the biological specimen.
(110) In another variation according to aspects of the invention, a user, such as a pathologist, may select the control points in the spectral and visual images. The user may select the control points based on their knowledge of distinguishing features of the visual or spectral images including, but not limited to, edges and boundaries. For biological images such as cells and tissue, control points may be selected from any of the biological features in the image. For example, such biological features may include, but are not limited to, clumps of cells, mitotic features, cords or nests of cells, sample voids, such as alveoli and bronchi, and irregular sample edges. The user's selection of control points in the spectral and visual images may be saved to a repository that is used to provide a training correlation for personal and/or customized use. This approach may allow subjective best practices to be incorporated into the control point selection process.
(111) In another variation according to aspects of the invention, software-based recognition of distinguishing features in the spectral and visual images may be used to select control points. The software may detect at least one control point that corresponds to a distinguishing feature in the visual or spectral images. For example, control points in a particular cluster region may be selected in the spectral image. The cluster pattern may be used to identify similar features in the visual image. The features in both images may be aligned by translation, rotation, and scaling. Translation, rotation, and scaling may also be automated or semi-automated, for example, by developing mapping relationships or models after selecting the features. Such an automated process may provide an approximation of mapping relationships that may then be resampled and transformed to optimize registration, for example. Resampling techniques include, but are not limited to, nearest neighbor, linear, and cubic interpolation.
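Two of the resampling techniques named above can be sketched for a single-channel image stored as a nested list. The function names, the `image[x][y]` indexing, and the clamping of positions to the image border are assumptions made for illustration; cubic interpolation is not shown.

```python
import math


def _clamp(value, upper):
    return min(max(value, 0.0), float(upper))


def sample_nearest(image, x, y):
    """Value of the pixel closest to the fractional position (x, y)."""
    xi = int(math.floor(_clamp(x, len(image) - 1) + 0.5))
    yi = int(math.floor(_clamp(y, len(image[0]) - 1) + 0.5))
    return image[xi][yi]


def sample_linear(image, x, y):
    """Bilinear interpolation between the four surrounding pixels."""
    x = _clamp(x, len(image) - 1)
    y = _clamp(y, len(image[0]) - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1 = min(x0 + 1, len(image) - 1)
    y1 = min(y0 + 1, len(image[0]) - 1)
    fx, fy = x - x0, y - y0
    top = image[x0][y0] * (1 - fy) + image[x0][y1] * fy
    bottom = image[x1][y0] * (1 - fy) + image[x1][y1] * fy
    return top * (1 - fx) + bottom * fx


# Hypothetical 2 x 2 image sampled between its pixels.
image = [[0.0, 10.0], [20.0, 30.0]]
center = sample_linear(image, 0.5, 0.5)
```

Nearest neighbor preserves discrete values such as cluster labels, while linear interpolation is better suited to continuous intensities.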
(112) Once the control points are aligned, the pixels in the spectral image having coordinates P.sub.1 (x.sub.1, y.sub.1) may be aligned with the corresponding pixels in the visual image having coordinates P.sub.2 (x.sub.2, y.sub.2). This alignment process may be applied to all or a selected portion of the pixels in the spectral and visual images. Once aligned, the pixels in each of the spectral and visual images may be registered together. By this registration process, the pixels in each of the spectral image and visual images may be digitally joined with the pixels in the corresponding image. Since the method in accordance with aspects of the invention allows the same biological sample to be tested spectroscopically and visually, the visual and spectral images may be registered accurately.
(113) An identification mark, such as a numerical code or bar code, may be added to the slide to verify that the correct specimen is being accessed. The reference and identification marks may be recognized by a computer that displays or otherwise stores the visual image of the biological specimen. This computer may also contain software for use in image registration.
(114) An example of image registration according to an aspect of the invention is illustrated in
(115) Once the coordinates of the pixels in the spectral and visual images are registered, they may be digitally stored together. The entire images or a portion of the images may be stored. For example, the diagnostic regions may be digitally stored instead of the images of the entire sample. This may significantly reduce data storage requirements.
(116) A user who views a certain pixel region in either the spectral or visual image may immediately access the corresponding pixel region in the other image. For example, a pathologist may select any area of the spectral image, such as by clicking a mouse or with joystick control, and view the corresponding area of the visual image that is registered with the spectral image.
(117) In addition, as a pathologist moves or manipulates an image, he/she can also access the corresponding portion of the other image to which it is registered. For example, if a pathologist magnifies a specific portion of the spectral image, he/she may access the same portion in the visual image at the same level of magnification.
(118) Operational parameters of the visual microscope system, as well as microscope magnification, changes in magnification, etc., may also be stored in an instrument-specific log file. The log file may be accessed at a later time to select annotation records and corresponding spectral pixels for training the algorithm. Thus, a pathologist may manipulate the spectral image, and at a later time, the spectral image and the digital image that is registered to it are both displayed at the appropriate magnification. This feature may be useful, for example, since it allows a user to save a manipulated registered image digitally for later viewing or for electronic transmittal for remote viewing.
(119) Image registration may be used with a tissue section having a known diagnosis to extract training spectra during a training step of a method in accordance with aspects of the invention. During the training step, a visual image of stained tissue may be registered with an unsupervised spectral image, such as from HCA. Image registration may also be used when making a diagnosis on a tissue section. For example, a supervised spectral image of the tissue section may be registered with its corresponding visual image. Thus, a user may obtain a diagnosis based on any point in the registered images that has been selected.
(120) Image registration according to aspects of the invention provides numerous advantages over prior methods of analyzing biological samples. For example, it allows a pathologist to rely on a spectral image, which reflects the highly sensitive biochemical content of a biological sample, when analyzing biological material. As such, it provides significantly greater accuracy in detecting small abnormalities and pre-cancerous or cancerous cells, including micrometastases, than the related art. Thus, the pathologist does not have to base his/her analysis of a sample on his/her subjective observation of a visual image of the biological sample. For example, the pathologist may simply study the spectral image and may easily refer to the relevant portion in the registered visual image to verify his/her findings, as necessary.
(121) In addition, the image registration method in accordance with aspects of the invention provides greater accuracy than the prior method of Bird (Bird et al., Spectral detection of micro-metastases in lymph node histo-pathology, J. Biophoton. 2, No. 1-2, 37-46 (2009)) because it is based on correlation of digital data, i.e., the pixels in the spectral and visual images. Bird does not correlate any digital data from the images, and instead relies merely on the skill of the user to visually match spectral and visual images of adjacent tissue sections by physically overlaying the images. Thus, the image registration method in accordance with aspects of the invention provides more accurate and reproducible diagnoses with regard to abnormal or cancerous cells. This may be helpful, for example, in providing accurate diagnosis in the early stages of disease, when indicia of abnormalities and cancer are hard to detect.
(122) Training
(123) A training set may optionally be developed, as set forth in step 305 in the method provided in the flowchart of
(124) According to one aspect in accordance with the invention, in the training step, a training set may be developed by identifying a region of a visual image containing a disease or condition, correlating the region of the visual image to spectral data corresponding to the region, and storing the association between spectral data and the corresponding disease or condition. The training set may then be archived in a repository, such as a database, and made available for use in machine learning algorithms to provide a diagnostic algorithm with output derived from the training set. The diagnostic algorithm may also be archived in a repository, such as a database, for future use.
(125) For example, a visual image of a tissue section may be registered with a corresponding unsupervised spectral image, such as one prepared by HCA. Then, a user may select a characteristic region of the visual image. This region may be classified and/or annotated by a user to specify a disease or condition. The spectral data underlying the characteristic region in the corresponding registered unsupervised spectral image may be classified and/or annotated with the disease or condition.
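The annotation step described above can be sketched as follows. This is an illustrative outline only: it assumes the images are already registered so that a pixel coordinate refers to the same location in both, and the data layout (a dictionary keyed by pixel coordinate) and the name `extract_training_spectra` are hypothetical.

```python
def extract_training_spectra(spectra_by_pixel, region_pixels, label):
    """Pair each spectrum under an annotated region with the annotation label.

    Pixels with no stored spectrum (for example, pixels rejected by a
    quality test) are skipped.
    """
    return [
        (spectra_by_pixel[pixel], label)
        for pixel in region_pixels
        if pixel in spectra_by_pixel
    ]


# Hypothetical registered data: pixel (row, col) -> spectrum intensities.
spectra_by_pixel = {
    (0, 0): [0.10, 0.40, 0.20],
    (0, 1): [0.12, 0.38, 0.21],
    (1, 0): [0.30, 0.10, 0.05],
}
# Region selected in the visual image; (5, 5) has no spectrum.
annotated_region = [(0, 0), (0, 1), (5, 5)]
training_set = extract_training_spectra(
    spectra_by_pixel, annotated_region, "adenocarcinoma"
)
```

The resulting list of labeled spectra is the kind of input a supervised method would be trained on.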
(126) The spectral data that has been classified and/or annotated with a disease or condition provides a training set that may be used to train a supervised analysis method, such as an ANN. Such methods are also described, for example, in Lasch, Miljkovic, and Dupuy. The trained supervised analysis method may provide a diagnostic algorithm.
(127) Disease or condition information may be based on algorithms that are supplied with the instrument, algorithms trained by a user, or a combination of both. For example, an algorithm that is supplied with the instrument may be enhanced by the user.
(128) An advantage of the training step according to aspects of the invention is that the registered images may be trained against the best available, consensus-based gold standards, which evaluate spectral data by reproducible and repeatable criteria. Thus, after appropriate instrument validation and algorithm training, methods in accordance with aspects of the invention may produce similar results worldwide, rather than relying on visually-assigned criteria such as normal, atypical, low grade neoplasia, high grade neoplasia, and cancer. The results for each cell may be represented by an appropriately scaled numeric index or the results overall as a probability of a classification match. Thus, methods in accordance with aspects of the invention may have the necessary sensitivity and specificity for the detection of various biological structures, and diagnosis of disease.
(129) The diagnostic capability of a training set may be limited by the extent to which the spectral data are classified and/or annotated with diseases or conditions. As indicated above, this training set may be augmented by the user's own interest and expertise. For example, a user may prefer one stain over another, such as one or many IHC stains over an H&E stain. In addition, an algorithm may be trained to recognize a specific condition, such as breast cancer metastases in axillary lymph nodes, for example. The algorithm may be trained to indicate normal vs. abnormal tissue types or binary outputs, such as adenocarcinoma vs. not-adenocarcinoma only, and not to classify the different normal tissue types encountered, such as capsule, B- and T-lymphocytes. The regions of a particular tissue type, or states of disease, obtained by SHP, may be rendered as digital stains superimposed on real-time microscopic displays of the tissue sections.
(130) Diagnosis
(131) Once the spectral and visual images have been registered, they may be used to make a medical diagnosis, as outlined in step 306 in the flowchart of
(132) For example, spectral data and a visual image may be acquired from a biological specimen of unknown disease or condition. The spectral data may be analyzed by an unsupervised method, such as HCA, which may then be used along with spatial reference data to prepare an unsupervised spectral image. This unsupervised spectral image may be registered with the visual image, as discussed above. The spectral data that has been analyzed by an unsupervised method may then be input to a trained supervised algorithm. For example, the trained supervised algorithm may be an ANN, as described in the training step above. The output from the trained supervised algorithm may be spectral data that contains one or more labels that correspond to classifications and/or annotations of a disease or condition based on the training set.
(133) To extract a diagnosis based on the labels, the labeled spectral data may be used to prepare a supervised spectral image that may be registered with the visual image and/or the unsupervised spectral image of the biological specimen. For example, when the supervised spectral image is registered with the visual image and/or the unsupervised spectral image, through a GUI, a user may select a point of interest in the visual image or the unsupervised spectral image and be provided with a disease or condition corresponding to the label at that point in the supervised spectral image. As an alternative, a user may request a software program to search the registered image for a particular disease or condition, and the software may highlight the sections in any of the visual, unsupervised spectral, and supervised spectral images that are labeled with the particular disease or condition. This advantageously allows a user to obtain a diagnosis in real-time, and also allows the user to view a visual image, with which he/she is familiar, while accessing highly sensitive spectroscopically obtained data.
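The two access patterns described above, a point query and a search for a particular disease or condition, may be sketched as follows. This is a simplified illustration that assumes the supervised spectral image is stored as a grid of labels sharing pixel coordinates with the registered visual image; the names `diagnosis_at` and `find_label` and the example labels are hypothetical.

```python
def diagnosis_at(label_image, row, col):
    """Return the label at a user-selected point (None where no tissue was labeled)."""
    return label_image[row][col]


def find_label(label_image, wanted):
    """Return every (row, col) whose label matches the requested condition."""
    return [
        (r, c)
        for r, row in enumerate(label_image)
        for c, label in enumerate(row)
        if label == wanted
    ]


# Hypothetical supervised spectral image: one label per registered pixel.
label_image = [
    ["normal", "normal", None],
    ["normal", "adenocarcinoma", "adenocarcinoma"],
]
selected = diagnosis_at(label_image, 1, 1)
to_highlight = find_label(label_image, "adenocarcinoma")
```

Because the images are registered, the coordinates returned by the search can be highlighted in any of the registered images.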
(134) The diagnosis may include a binary output, such as an is/is not type output, that indicates the presence or lack of a disease or condition. In addition, the diagnosis may include, but is not limited to, an adjunctive report, such as a probability of a match to a disease or condition, an index, or a relative composition ratio.
(135) In accordance with aspects of the method of the invention, gross architectural features of a tissue section may be analyzed via spectral patterns to distinguish gross anatomical features that are not necessarily related to disease. Such procedures, known as global digital staining (GDS), may use a combination of supervised and unsupervised multivariate methods. GDS may be used to analyze anatomical features including, but not limited to, glandular and squamous epithelium, endothelium, connective tissue, bone, and fatty tissue.
(136) In GDS, a supervised diagnostic algorithm may be constructed from a training dataset that includes multiple samples of a given disease from different patients. Each individual tissue section from a patient may be analyzed as described above, using spectral image data acquisition, pre-processing of the resulting dataset, and analysis by an unsupervised algorithm, such as HCA. The HCA images may be registered with corresponding stained tissue, and may be annotated by a pathologist. This annotation step, indicated in
(137) According to the GDS method, the sample may be stained using classical stains or immuno-histochemical agents. When the pathologist receives the stained sample and inspects it using a computerized imaging microscope, the spectral results may be available to the computer controlling the visual microscope. The pathologist may select any tissue spot on the sample and receive a spectroscopy-based diagnosis. This diagnosis may overlay a grayscale or pseudo-color image onto the visual image that outlines all regions that have the same spectral diagnostic classification.
(138)
(139) Areas of these gross anatomical features, which are registered with the corresponding visual image, may be selected for analysis based on more sophisticated criteria in the spectral pattern dataset. This next level of diagnosis may be based on a diagnostic marker digital staining (DMDS) database, which may be solely based on SHP results, for example, or may contain spectral information collected using immuno-histochemical (IHC) results. For example, a section of epithelial tissue may be selected to analyze for the presence of spectral patterns indicative of abnormality and/or cancer, using a more diagnostic database to scan the selected area. An example of this approach is shown schematically in
(140) The relationship between GDS and DMDS is shown by the horizontal progression marked in dark blue and purple, respectively, in the schematic of
(141) According to an example method in accordance with aspects of the invention, a pathologist may provide certain inputs to ensure that an accurate diagnosis is achieved. For example, the pathologist may visually check the quality of the stained image. In addition, the pathologist may perform selective interrogation to change the magnification or field of view of the sample.
(142) The method according to aspects of the invention may be performed by a pathologist viewing the biological specimen and performing the image registration. Alternatively, since the registered image contains digital data that may be transmitted electronically, the method may be performed remotely.
(143) Methods may be demonstrated by the following non-limiting examples.
Example 2: Lymph Node Section
(144)
(145) Since the SHP-based digital stain is based on a trained and validated repository or database containing spectra and diagnoses, the digital stain rendered is directly relatable to a diagnostic category, such as metastatic breast cancer, in the case of
Example 3: Fine Needle Aspirate Sample of Lung Section
(146) Sample sections were cut from formalin fixed paraffin embedded cell blocks that were prepared from fine needle aspirates of suspicious lesions located in the lung. Cell blocks were selected based on the criteria that previous histological analysis had identified an adenocarcinoma, small cell carcinoma (SCC) or squamous cell carcinoma of the lung. Specimens were cut by use of a microtome to provide a thickness of about 5 µm and subsequently mounted onto low-e microscope slides (Kevley Technologies, Ohio, USA). Sections were then deparaffinized using standard protocols. Subsequent to spectroscopic data collection, the tissue sections were hematoxylin and eosin (H&E) stained to enable morphological interpretations by a histopathologist.
(147) A Perkin Elmer Spectrum 1/Spotlight 400 Imaging Spectrometer (Perkin Elmer Corp, Shelton, Conn., USA) was employed in this study. Infrared micro-spectral images were recorded from 1 mm × 1 mm tissue areas in transflection (transmission/reflection) mode, with a pixel resolution of 6.25 µm × 6.25 µm, a spectral resolution of 4 cm.sup.-1, and the co-addition of 8 interferograms, before Norton-Beer apodization (see, e.g., Naylor, et al. J Opt. Soc. Am., A24:3644-3648 (2007)) and Fourier transformation. An appropriate background spectrum was collected outside the sample area to ratio against the single beam spectra. The resulting ratioed spectra were then converted to absorbance. Each 1 mm × 1 mm infrared image contains 160 × 160, or 25,600 spectra.
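For illustration, the ratioing and absorbance conversion described above, and the stated spectra count, can be checked with a short sketch. It assumes the conventional definition of absorbance as the negative base-10 logarithm of the sample-to-background ratio; the function names and the sample values are hypothetical.

```python
import math


def absorbance(sample_single_beam, background_single_beam):
    """Ratio a sample single-beam spectrum against the background, point by
    point, and convert the ratio to absorbance: A = -log10(I / I0)."""
    return [
        -math.log10(sample / background)
        for sample, background in zip(sample_single_beam, background_single_beam)
    ]


def spectra_per_image(side_um, pixel_um):
    """Number of pixel spectra in a square image of the given side length."""
    pixels_per_side = round(side_um / pixel_um)
    return pixels_per_side * pixels_per_side


# 1 mm = 1000 um per side at 6.25 um per pixel gives 160 pixels per side.
count = spectra_per_image(1000.0, 6.25)

# Hypothetical single-beam intensities at three wavenumbers.
spectrum = absorbance([50.0, 80.0, 10.0], [100.0, 100.0, 100.0])
```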
(148) Initially, raw infrared micro-spectral data sets were imported into and processed using software written in Matlab (version R2009a, Mathworks, Natick, Mass., USA). A spectral quality test was performed to remove all spectra that were recorded from areas where no tissue existed, or that displayed poor signal to noise. All spectra that passed the test were then baseline off-set normalized (subtraction of the minimal absorbance intensity across the entire spectral vector), converted to second derivative (Savitzky-Golay algorithm (see, e.g., Savitzky, et al. Anal. Chem., 36:1627 (1964)), 13 smoothing points), cut to only include intensity values recorded in the 1350 cm.sup.-1 to 900 cm.sup.-1 spectral region, and finally vector normalized.
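The pre-processing chain described above can be sketched in Python as follows. This is not the Matlab software used in the study. The polynomial order of the Savitzky-Golay fit is not stated in the text, so a quadratic fit is assumed here; the spectral quality test is omitted because its criteria are not given; and all function names are hypothetical.

```python
import math


def second_derivative_weights(window):
    """Savitzky-Golay second-derivative weights for a quadratic least-squares
    fit over an odd window (assumed order), with unit point spacing."""
    if window % 2 == 0 or window < 5:
        raise ValueError("window must be odd and at least 5")
    m = window // 2
    offsets = range(-m, m + 1)
    s2 = sum(k ** 2 for k in offsets)
    s4 = sum(k ** 4 for k in offsets)
    denom = window * s4 - s2 ** 2
    return [2.0 * (window * k * k - s2) / denom for k in offsets]


def second_derivative(values, window=13):
    """Apply the weights at every interior point; m points are lost per edge."""
    weights = second_derivative_weights(window)
    m = window // 2
    return [
        sum(wt * values[i + k] for k, wt in zip(range(-m, m + 1), weights))
        for i in range(m, len(values) - m)
    ]


def preprocess(wavenumbers, absorbances, low=900.0, high=1350.0, window=13):
    """Offset-normalize, differentiate twice, cut to [low, high], vector-normalize."""
    m = window // 2
    floor = min(absorbances)
    shifted = [a - floor for a in absorbances]        # baseline offset
    d2 = second_derivative(shifted, window)
    axis = wavenumbers[m:len(wavenumbers) - m]        # axis matching d2
    cut = [v for wn, v in zip(axis, d2) if low <= wn <= high]
    norm = math.sqrt(sum(v * v for v in cut))
    if norm == 0:
        raise ValueError("spectrum is flat in the selected region")
    return [v / norm for v in cut]
```

The derivative is taken with respect to point index rather than wavenumber. For a uniformly sampled spectrum this differs only by a constant factor, which the final vector normalization removes.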
(149) Processed data sets were imported into a software system and HCA was performed using the Euclidean distance to define spectral similarity, and Ward's algorithm (see, e.g., Ward, J Am. Stat. Assoc., 58:236 (1963)) for clustering. Pseudo-color cluster images that describe pixel cluster membership were then assembled and compared directly with H&E images captured from the same sample. HCA images of between 2 and 15 clusters, which describe different clustering structures, were assembled by cutting the calculated HCA dendrogram at different levels. These cluster images were then provided to collaborating pathologists who confirmed the clustering structure that best replicated the morphological interpretations they made upon the H&E-stained tissue.
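As a small-scale illustration of Ward's agglomerative clustering with Euclidean distance, the following sketch merges, at each step, the pair of clusters whose union gives the smallest increase in within-cluster sum of squares. Stopping when a chosen number of clusters remains corresponds to cutting the dendrogram at that level. This naive version is far too slow for an image of 25,600 spectra and is not the software system used in the study; the function name and the sample vectors are hypothetical.

```python
def ward_clusters(vectors, n_clusters):
    """Return one cluster index per input vector using Ward's criterion."""
    if not 1 <= n_clusters <= len(vectors):
        raise ValueError("n_clusters must be between 1 and the number of vectors")
    clusters = [{"members": [i], "centroid": list(v)} for i, v in enumerate(vectors)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                na, nb = len(a["members"]), len(b["members"])
                dist2 = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * dist2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        a, b = clusters[i], clusters[j]
        na, nb = len(a["members"]), len(b["members"])
        merged = {
            "members": a["members"] + b["members"],
            "centroid": [
                (na * x + nb * y) / (na + nb)
                for x, y in zip(a["centroid"], b["centroid"])
            ],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    labels = [0] * len(vectors)
    for index, cluster in enumerate(clusters):
        for member in cluster["members"]:
            labels[member] = index
    return labels


# Hypothetical two-point "spectra"; three groups are expected.
spectra = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]]
labels = ward_clusters(spectra, 3)
```

Assigning one color per cluster index at each pixel position would yield a pseudo-color cluster image. For realistically sized data sets, an optimized library routine such as `scipy.cluster.hierarchy.linkage` with `method="ward"` is a more practical choice.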
(150) Infrared spectra contaminated by underlying base line shifts, unaccounted signal intensity variations, peak position shifts, or general features not arising from or obeying the Lambert-Beer law were corrected by a sub-space model version of EMSC for Mie scattering and reflection contributions to the recorded spectra (see B. Bird, M. Miljković and M. Diem, Two step resonant Mie scattering correction of infrared micro-spectral data: human lymph node tissue, J. Biophotonics, 3 (8-9) 597-608 (2010)). Initially, 1000 recorded spectra for each cancer type were pooled into separate data sets from the infrared images presented in
(151) These data sets were then searched for spectra with minimal scattering contributions, a mean for each cancer type was calculated to increase signal to noise, and KK transforms were calculated for each cell type, as shown in
(152) A sub-space model for Mie scattering contributions was constructed by calculating 340 Mie scattering curves that describe a nuclei sphere radius range of 6 µm to 40 µm, and a refractive index range of 1.1-1.5, using the Van de Hulst approximation formulae (see, e.g., Brussaard, et al., Rev. Mod. Phys., 34:507 (1962)). The first 10 principal components, which describe over 95% of the variance in these scattering curves, were then used, in addition to the KK transforms for each cancer type, as interferences in a one-step EMSC correction of the data sets. The EMSC calculation took approximately 1 sec per 1000 spectra.
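A family of scattering curves of this kind can be generated with the Van de Hulst approximation for the extinction efficiency of a sphere, Q = 2 - (4/ρ) sin ρ + (4/ρ²)(1 - cos ρ), with ρ = 4πr(n - 1)ν̃, where r is the radius, n the refractive index, and ν̃ the wavenumber. The sketch below is illustrative only. The text gives the total of 340 curves and the parameter ranges but not the sampling, so the 34 × 10 grid and the wavenumber axis used here are assumptions, as are the function names. The subsequent principal component analysis is not shown.

```python
import math


def van_de_hulst_q(wavenumber_cm, radius_um, refractive_index):
    """Approximate extinction efficiency of a sphere (Van de Hulst)."""
    radius_cm = radius_um * 1e-4
    rho = 4.0 * math.pi * radius_cm * (refractive_index - 1.0) * wavenumber_cm
    if rho == 0:
        raise ValueError("rho is zero; refractive index must differ from 1")
    return 2.0 - (4.0 / rho) * math.sin(rho) + (4.0 / rho ** 2) * (1.0 - math.cos(rho))


def linspace(start, stop, count):
    """Evenly spaced values from start to stop inclusive (count >= 2)."""
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


def scattering_curves(wavenumbers, radii_um, indices):
    """One curve (a list of Q values over the wavenumber axis) per (radius, index) pair."""
    return [
        [van_de_hulst_q(wn, radius, index) for wn in wavenumbers]
        for radius in radii_um
        for index in indices
    ]


# Assumed sampling: 34 radii x 10 refractive indices = 340 curves.
curves = scattering_curves(
    linspace(900.0, 1350.0, 113),   # assumed wavenumber axis in cm^-1
    linspace(6.0, 40.0, 34),        # radius range from the text, in um
    linspace(1.1, 1.5, 10),         # refractive index range from the text
)
```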
(153)
(154) Since the majority of sample area is composed of blood and non-diagnostic material, the data was pre-processed to only include diagnostic material and correct for scattering contributions. In addition, HCA was used to create a binary mask and finally classify the data. This result is shown in
(155) The results presented in the Examples above show that the analysis of raw measured spectral data enables the differentiation of SCC and non-small cell carcinoma (NSCC). After the raw measured spectra are corrected for scattering contributions according to methods in accordance with aspects of the invention, however, adenocarcinoma and squamous cell carcinoma, the two subtypes of NSCC, are clearly differentiated. Thus, these Examples provide strong evidence that this spectral imaging method may be used to identify and correctly classify the three main types of lung cancer.
(156)
(157) Information relating to a diagnosis, for example, may be transmitted between the analyst 101 and the server module 106 via a network 110, such as the Internet. Communications may be made, for example, via couplings 111, 113, such as wired, wireless, or fiberoptic links.
(158) Aspects of the invention may be implemented using hardware, software or a combination thereof and may be implemented in one or more computer systems or other processing systems. In one variation, aspects of the invention are directed toward one or more computer systems capable of carrying out the functionality described herein. An example of such a computer system 200 is shown in
(159) Computer system 200 includes one or more processors, such as processor 204. The processor 204 is connected to a communication infrastructure 206 (e.g., a communications bus, cross-over bar, or network). Various software aspects are described in terms of this exemplary computer system. After reading this description, it will become apparent to a person skilled in the relevant art(s) how to implement the aspects of the invention using other computer systems and/or architectures.
(160) Computer system 200 can include a display interface 202 that forwards graphics, text, and other data from the communication infrastructure 206 (or from a frame buffer not shown) for display on the display unit 230. Computer system 200 also includes a main memory 208, preferably random access memory (RAM), and may also include a secondary memory 210. The secondary memory 210 may include, for example, a hard disk drive 212 and/or a removable storage drive 214, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc. The removable storage drive 214 reads from and/or writes to a removable storage unit 218 in a well-known manner. Removable storage unit 218 represents a floppy disk, magnetic tape, optical disk, etc., which is read by and written to by removable storage drive 214. As will be appreciated, the removable storage unit 218 includes a computer usable storage medium having stored therein computer software and/or data.
(161) In alternative variations, secondary memory 210 may include other similar devices for allowing computer programs or other instructions to be loaded into computer system 200. Such devices may include, for example, a removable storage unit 222 and an interface 220. Examples of such may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an erasable programmable read only memory (EPROM), or programmable read only memory (PROM)) and associated socket, and other removable storage units 222 and interfaces 220, which allow software and data to be transferred from the removable storage unit 222 to computer system 200.
(162) Computer system 200 may also include a communications interface 224. Communications interface 224 allows software and data to be transferred between computer system 200 and external devices. Examples of communications interface 224 may include a modem, a network interface (such as an Ethernet card), a communications port, a Personal Computer Memory Card International Association (PCMCIA) slot and card, etc. Software and data transferred via communications interface 224 are in the form of signals 228, which may be electronic, electromagnetic, optical or other signals capable of being received by communications interface 224. These signals 228 are provided to communications interface 224 via a communications path (e.g., channel) 226. This path 226 carries signals 228 and may be implemented using wire or cable, fiber optics, a telephone line, a cellular link, a radio frequency (RF) link and/or other communications channels. In this document, the terms "computer program medium" and "computer usable medium" are used to refer generally to media such as a removable storage drive 214, a hard disk installed in hard disk drive 212, and signals 228. These computer program products provide software to the computer system 200. Aspects of the invention are directed to such computer program products.
(163) Computer programs (also referred to as computer control logic) are stored in main memory 208 and/or secondary memory 210. Computer programs may also be received via communications interface 224. Such computer programs, when executed, enable the computer system 200 to perform the features in accordance with aspects of the invention, as discussed herein. In particular, the computer programs, when executed, enable the processor 204 to perform such features. Accordingly, such computer programs represent controllers of the computer system 200.
(164) In a variation where aspects of the invention are implemented using software, the software may be stored in a computer program product and loaded into computer system 200 using removable storage drive 214, hard drive 212, or communications interface 224. The control logic (software), when executed by the processor 204, causes the processor 204 to perform the functions as described herein. In another variation, aspects of the invention are implemented primarily in hardware using, for example, hardware components, such as application specific integrated circuits (ASICs). Implementation of the hardware state machine so as to perform the functions described herein will be apparent to persons skilled in the relevant art(s).
(165) In yet another variation, aspects of the invention are implemented using a combination of both hardware and software.