Methods for Preparing Data from Tissue Sections for Machine Learning Using Both Brightfield and Fluorescent Imaging
20220138945 · 2022-05-05
Assignee
Inventors
CPC classification
G06V20/70; G16H50/20; G01N1/30; G06V10/774; G16H50/70; G06V20/69 (all PHYSICS)
International classification
G01N1/30; G06V10/774; G06V20/69; G06V20/70 (all PHYSICS)
Abstract
In digital pathology, obtaining a labeled data set for training, testing and/or validation of a machine learning model is expensive, because it requires manual annotations from a pathologist. In some cases, it can be difficult for the pathologist to produce correct annotations. The present invention allows the creation of labeled data sets using fluorescent dyes, which do not affect the appearance of the slide in the brightfield imaging modality. It thus becomes possible to add correct annotations to a brightfield slide without human intervention.
Claims
1. A method comprising: staining a tissue section with at least one brightfield stain, wherein the at least one brightfield stain includes staining for at least one tissue object; staining the tissue section with at least one fluorescent stain, wherein the at least one fluorescent stain includes staining for the at least one tissue object and identifies target cells; scanning the tissue section in brightfield to create a first image; scanning the tissue section in fluorescence to create a second image; processing the first image to identify and quantify cells within the tissue section; creating a data set of a subset of the identified cells within the tissue section; aligning the second image to the first image using the at least one tissue object; labeling the cells within the data set based on staining of the target cells; and using the labeled cells within the data set for machine learning.
2. The method of claim 1, wherein the subset of the identified cells is all identified cells.
3. The method of claim 1, wherein the machine learning is training a machine learning model to identify the target cells.
4. The method of claim 3, wherein the machine learning is testing the machine learning model.
5. The method of claim 4, wherein the machine learning is validating the machine learning model.
6. The method of claim 5, further comprising using the machine learning model to identify a patient status for a patient selected from the group consisting of a patient from whom the tissue section was taken and a patient unrelated to the tissue section used for training the machine learning model.
7. The method of claim 6, wherein the patient status for a patient unrelated to the tissue section used for training the machine learning model is determined via the use of a synthetic stain applied to a digital image of an unstained tissue section taken from that patient.
8. The method of claim 1, further comprising applying the machine learning to a digital image of an unstained tissue section to create a synthetic stain on the digital image to identify target cells within that digital image.
9. A method comprising: staining a tissue section with at least one brightfield stain; staining the tissue section with at least one fluorescent stain, wherein the at least one fluorescent stain identifies at least one target tissue region; scanning the tissue section in brightfield to create a first image; scanning the tissue section in fluorescence to create a second image; aligning the second image to the first image; identifying regions stained in the second image to create an annotation; and using the annotation and the first image for machine learning.
10. The method of claim 9, wherein the machine learning is training a machine learning model to identify the at least one target tissue region.
11. The method of claim 10, wherein the machine learning is testing the machine learning model.
12. The method of claim 11, wherein the machine learning is validating the machine learning model.
13. The method of claim 12, further comprising using the machine learning to identify a patient status for a patient selected from the group consisting of a patient from whom the tissue section was taken and a patient unrelated to the tissue section used for training the machine learning model.
14. The method of claim 13, wherein the patient status for a patient unrelated to the tissue section used for training the machine learning model is determined via the use of a synthetic stain applied to a digital image of an unstained tissue section taken from that patient.
15. The method of claim 9, further comprising applying the machine learning to a digital image of an unstained tissue section to create a synthetic stain on the digital image to identify target cells within that digital image.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0010]
[0011]
DETAILED DESCRIPTION OF EMBODIMENTS
[0012] In the following description, for purposes of explanation and not limitation, details and descriptions are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention may be practiced in other embodiments that depart from these details and descriptions without departing from the spirit and scope of the invention.
[0013] For purposes of definition, a tissue object is one or more of a cell (e.g., immune cell), a cell sub-compartment (e.g., nucleus, cytoplasm, membrane, organelle), a cell neighborhood, a tissue compartment (e.g., tumor, tumor microenvironment (TME), stroma, lymphoid follicle, healthy tissue), a blood vessel, a lymphatic vessel, a vacuole, collagen, regions of necrosis, extra-cellular matrix, a medical device (e.g., stent, implant), a gel, a parasitic body (e.g., virus, bacterium), a nanoparticle, a polymer, and/or a non-dyed object (e.g., metal particle, carbon particle). Tissue objects are visualized by histologic stains which highlight the presence and localization of a tissue object. Tissue objects can be identified directly by stains specifically applied to highlight the presence of said tissue object (e.g., hematoxylin to visualize nuclei, immunohistochemistry stain for a protein specifically found in a muscle fiber membrane), indirectly by stains which non-specifically highlight the tissue compartment (e.g., DAB background staining), or by biomarkers known to be localized to a specific tissue compartment (e.g., nuclear-expressed protein, carbohydrates only found in the cell membrane). Alternatively, tissue objects can be visualized without staining (e.g., carbon residue in lung tissue).
[0014] For the purpose of this disclosure, patient status includes diagnosis of inflammatory status, disease state, disease severity, disease progression, therapy efficacy, and changes in patient status over time. Other patient statuses are contemplated.
[0015] In an illustrative embodiment of the invention, the methods can be summarized in the following eight steps: (i) staining a tissue section with a brightfield stain, ensuring that a particular tissue object is stained; (ii) staining the same tissue section with a fluorescent stain, ensuring that the same tissue object is stained and that target cells are identified with the fluorescent stain; (iii) scanning the tissue section in brightfield and in fluorescence to create two images; (iv) identifying and quantifying cells within the brightfield image; (v) creating a data set using a subset of the identified cells; (vi) aligning the fluorescent image with the brightfield image using the tissue object that is stained in both brightfield and fluorescence; (vii) labeling the cells in the data set based on the staining of the target cells in fluorescence; and (viii) using the labeled cells within the data set for machine learning. This illustrative embodiment of the invention is summarized in
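Steps (vi) and (vii) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the alignment is assumed to be a pure integer translation, estimated here by phase correlation between the images of the shared tissue object (for example, a nuclear stain visible in both modalities), and a cell is labeled positive when the mean target-channel fluorescence in a small window around its centroid exceeds a threshold. The function names, the `radius` and `threshold` parameters, and the synthetic arrays are hypothetical.

```python
import numpy as np

def estimate_translation(reference, moving):
    """Phase correlation: integer (dy, dx) such that rolling `moving` by it matches `reference`."""
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(moving))
    cross_power /= np.abs(cross_power) + 1e-12  # keep only the phase information
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, correlation.shape))

def label_cells(centroids, target_channel, shift, threshold, radius=2):
    """Label each brightfield cell: 1 = target signal present, 0 = absent, -1 = not covered."""
    dy, dx = shift
    height, width = target_channel.shape
    labels = []
    for y, x in centroids:
        # Map the brightfield centroid into the fluorescent image frame.
        fy, fx = y - dy, x - dx
        y0, y1 = max(fy - radius, 0), min(fy + radius + 1, height)
        x0, x1 = max(fx - radius, 0), min(fx + radius + 1, width)
        if y0 >= y1 or x0 >= x1:
            labels.append(-1)  # centroid falls outside the fluorescent image
            continue
        window = target_channel[y0:y1, x0:x1]
        labels.append(int(window.mean() > threshold))
    return labels

# Synthetic demonstration (all values are made up for illustration).
rng = np.random.default_rng(0)
brightfield_object = rng.random((64, 64))            # shared tissue object, brightfield
fluorescent_object = np.roll(brightfield_object, (3, -5), axis=(0, 1))  # same object, offset
shift = estimate_translation(brightfield_object, fluorescent_object)

target_channel = np.zeros((64, 64))                  # fluorescent target-cell channel
target_channel[21:26, 23:28] = 1.0                   # one fluorescently stained cell
cell_centroids = [(20, 30), (40, 10)]                # cells detected in brightfield
cell_labels = label_cells(cell_centroids, target_channel, shift, threshold=0.5)
```

Real slide pairs may also differ by rotation, scale, or local deformation, in which case a fuller registration method would take the place of the translation estimate.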
[0016] In some embodiments the subset of identified cells is all of the identified cells. In other embodiments, the machine learning is training a machine learning model to identify the target cells, testing the machine learning model, or validating the trained machine learning model.
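As a hedged illustration of how one labeled data set might serve training, validation, and testing, the sketch below partitions cell indices at random. The split fractions, the fixed seed, and the function name `split_labeled_cells` are assumptions chosen for the example; the disclosure does not prescribe them.

```python
import numpy as np

def split_labeled_cells(num_cells, fractions=(0.7, 0.15, 0.15), seed=0):
    """Randomly partition cell indices into training, validation, and test subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_cells)
    n_train = round(fractions[0] * num_cells)
    n_validation = round(fractions[1] * num_cells)
    return {
        "train": order[:n_train],
        "validation": order[n_train:n_train + n_validation],
        "test": order[n_train + n_validation:],  # remainder goes to testing
    }

splits = split_labeled_cells(100)
```

Each subset holds indices into the labeled data set, so the brightfield image patch and the fluorescence-derived label for a cell always stay together.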
[0017] In further embodiments, the machine learning model is used to identify a patient status for a patient from whom the tissue section was taken or for a separate patient not associated with the tissue section used to train the machine learning model. This embodiment can be used to create a “synthetic stain”: a markup of a digital image of an unstained tissue section that causes the cells within that digital image to appear as if they had been stained.
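One way to picture a synthetic stain is as a color overlay driven by the model's output. The sketch below assumes the model has already produced a boolean mask of pixels predicted to belong to target cells, and blends a stain color into those pixels. The function name, the brown stain color, and the opacity are illustrative assumptions, not details from the disclosure.

```python
import numpy as np

def apply_synthetic_stain(rgb_image, predicted_mask, stain_rgb=(139, 69, 19), opacity=0.6):
    """Blend a stain color into the pixels that the model predicts are target cells."""
    stained = rgb_image.astype(np.float64)  # astype returns a copy; the input is untouched
    color = np.asarray(stain_rgb, dtype=np.float64)
    stained[predicted_mask] = (1.0 - opacity) * stained[predicted_mask] + opacity * color
    return np.clip(np.rint(stained), 0, 255).astype(np.uint8)

# Synthetic demonstration: a uniform gray "unstained" image with one predicted pixel.
unstained = np.full((4, 4, 3), 200, dtype=np.uint8)
predicted = np.zeros((4, 4), dtype=bool)
predicted[1, 1] = True
synthetic = apply_synthetic_stain(unstained, predicted)
```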
[0018] Another embodiment of the invention is illustrated in
[0019] In further embodiments, the machine learning is training a machine learning model to identify the target tissue region, testing the machine learning model, or validating the trained machine learning model.
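For the region-level method recited in claim 9, the annotation can be pictured as a binary mask derived from the aligned fluorescent image. The sketch below assumes the alignment is already known as an integer translation `(dy, dx)` and that a single intensity threshold separates stained from unstained regions. Both are simplifying assumptions, and the function name is hypothetical.

```python
import numpy as np

def annotation_from_fluorescence(fluorescent, shift, threshold):
    """Shift the fluorescent image into the brightfield frame and threshold it into a mask."""
    dy, dx = shift
    height, width = fluorescent.shape
    if abs(dy) >= height or abs(dx) >= width:
        return np.zeros(fluorescent.shape, dtype=bool)  # no overlap between the images
    aligned = np.zeros_like(fluorescent)
    # Copy only the overlapping area so pixels do not wrap around the border.
    src_rows = slice(max(-dy, 0), min(height - dy, height))
    dst_rows = slice(max(dy, 0), min(height + dy, height))
    src_cols = slice(max(-dx, 0), min(width - dx, width))
    dst_cols = slice(max(dx, 0), min(width + dx, width))
    aligned[dst_rows, dst_cols] = fluorescent[src_rows, src_cols]
    return aligned > threshold

# Synthetic demonstration: one bright pixel moved by the alignment shift.
fluorescent_image = np.zeros((6, 6))
fluorescent_image[1, 1] = 10.0
annotation = annotation_from_fluorescence(fluorescent_image, shift=(2, -1), threshold=5.0)
```

The resulting mask has the same pixel grid as the brightfield image, so the two can be paired directly as input and annotation.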
[0020] In further embodiments, the machine learning model is used to identify a patient status for a patient from whom the tissue section was taken or for a separate patient not associated with the tissue section used to train the machine learning model. This embodiment can be used to create a “synthetic stain”: a markup of a digital image of an unstained tissue section that causes the cells within that digital image to appear as if they had been stained.
[0021] Any of these embodiments can be used with multiple tissue sections to feed into the data set. Drawing labeled examples from multiple tissue sections enlarges the data set, which improves the accuracy and precision of the machine learning.
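Pooling several sections into one data set can be sketched as below. The sketch assumes each section has already yielded a feature array and a label array of equal length, and it records the originating section for every cell. The section identifiers, the feature layout, and the function name are hypothetical.

```python
import numpy as np

def pool_sections(sections):
    """Concatenate per-section (features, labels) pairs and record each cell's section."""
    features = np.concatenate([f for f, _ in sections.values()])
    labels = np.concatenate([l for _, l in sections.values()])
    section_ids = np.concatenate(
        [np.full(len(l), section_id) for section_id, (_, l) in sections.items()]
    )
    return features, labels, section_ids

# Synthetic demonstration: two sections with 3 and 2 labeled cells, 4 features per cell.
sections = {
    "section_A": (np.zeros((3, 4)), np.array([1, 0, 1])),
    "section_B": (np.ones((2, 4)), np.array([0, 0])),
}
pooled_features, pooled_labels, pooled_section_ids = pool_sections(sections)
```

Keeping the section identifier alongside each cell makes it possible to trace any labeled example back to its slide.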