METHOD AND SYSTEM FOR SCREENING NEOANTIGENS, AND USES THEREOF
20230047716 · 2023-02-16
Assignee
- Korea Advanced Institute Of Science And Technology (Daejeon, KR)
- PENTAMEDIX CO., LTD. (Gyeonggi-do, KR)
Inventors
- Jung Kyoon CHOI (Daejeon, KR)
- Hyo Eun BANG (Daejeon, KR)
- Jae Soon PARK (Daejeon, KR)
- Dae Yeon CHO (Gyeonggi-do, KR)
Cpc classification
G16B40/00
PHYSICS
C12Q2600/106
CHEMISTRY; METALLURGY
International classification
Abstract
Provided are a method and system for screening neoantigens and uses of the neoantigens. Specifically, provided are a method and system for screening neoantigens derived from a gene whose expression is essential for survival of a cancer cell and/or which is homogeneously expressed in all cells in cancer tissue, as a diagnostic and/or therapeutic target, and uses of the neoantigens.
Claims
1. A method of screening neoantigens, the method comprising: obtaining sequencing data of an exome, a transcriptome, a single cell transcriptome, a peptidome, or an entire genome from a cancer patient; identifying genes essential for cancer cell survival; and obtaining neoantigens derived from the genes essential for cancer cell survival, wherein the identifying of the genes essential for cancer cell survival comprises identifying genes essential for cancer cell survival by using a model for predicting cell survival dependency, wherein the model for predicting cell survival dependency is generated by learning a relationship between gene expression in a cell and cell apoptosis, and identifies as the gene essential for cancer cell survival, a gene of which expression reduction or removal causes cell apoptosis, wherein the obtaining of the neoantigens derived from the gene essential for cancer cell survival comprises: obtaining neoantigens of a cancer patient by comparing a sequence of a cancer cell and a sequence of a normal cell, based on sequencing data obtained from the cancer patient; and collecting neoantigens derived from the gene essential for cancer cell survival from the obtained neoantigens.
2. The method of claim 1, further comprising determining binding affinity of the neoantigens to HLA of an antigen-presenting cell.
3. (canceled)
4. The method of claim 1, wherein the identified gene essential for cancer cell survival causes cancer cell apoptosis when its expression level is decreased or removed, but does not affect survival of a normal cell.
5. The method of claim 1, wherein the relationship between gene expression in a cell and cell apoptosis is based on in vitro data or in silico data on cancer cell line apoptosis according to a reduction in expression or removal of a targeted gene.
6. The method of claim 1, wherein the identifying of the gene essential for cancer cell survival further comprises determining whether the gene essential for cancer cell survival identified by using the model for predicting cell survival dependency is homogeneously expressed in all cancer cells obtained from a cancer patient.
7. (canceled)
8. The method of claim 1, wherein the collecting of the neoantigens further comprises selecting nonsynonymous mutations in the genes essential for cancer cell survival.
9. The method of claim 1, wherein the neoantigen is specific to the cancer patient.
10. The method of claim 2, wherein the determining of binding affinity of the neoantigens to HLA of an antigen-presenting cell comprises obtaining prediction of the binding affinity by inputting a sequence of the neoantigen to a model for predicting binding affinity of a peptide to HLA of an antigen-presenting cell, and the model for predicting binding affinity of a peptide is generated by learning data regarding interaction between amino acids of a peptide and amino acids of an HLA, and the HLA is MHC class I or MHC class II.
11. The method of claim 10, wherein the antigen-presenting cell comprises a dendritic cell, a macrophage, a B cell, or a combination thereof.
12. (canceled)
13. The method of claim 10, wherein the neoantigen is determined to have binding affinity to the HLA, when a CNN-MHC value between the neoantigen and the HLA of the antigen-presenting cell is >0.5.
14. A system for screening neoantigens, the system comprising: a memory for storing at least one instruction; and at least one processor for executing the at least one instruction stored in the memory, wherein the at least one processor executes the at least one instruction to generate a model for predicting cell survival dependency that predicts cell survival dependency on gene expression by learning a relationship between gene expression level in a cell and cell apoptosis, wherein the model for predicting cell survival dependency is generated from learning of relationship between gene expression in a cell and cell apoptosis, and identifies as a gene essential for cancer cell survival, a gene of which expression reduction or removal causes cancer cell apoptosis, obtains neoantigens by comparing gene expression profile of a cancer patient and gene expression profile of a normal cell or normal control and collects neoantigens derived from the gene essential for cancer cell survival; to identify a gene essential for cancer cell survival and obtain a neoantigen derived from the gene essential for cancer cell survival by inputting a gene expression profile of a cancer patient to the model for predicting cell survival dependency, to generate a model for predicting binding affinity of a neoantigen which predicts binding affinity based on amino acid interactions between a peptide and an antigen-presenting cell, and to select a neoantigen having binding affinity to HLA of the antigen-presenting cell by using the model for predicting binding affinity of a neoantigen.
15. (canceled)
16. The system of claim 14, wherein the at least one processor executes the at least one instruction to select a neoantigen having binding affinity when a CNN-MHC value between the neoantigen and the HLA of the antigen-presenting cell is >0.5.
17. The system of claim 14, wherein the at least one processor executes the at least one instruction to learn relationship between gene expression and cell apoptosis, and relationship between a neoantigen and HLA of an antigen-presenting cell for binding affinity, respectively.
18. The system of claim 14, wherein the relationship between gene expression in a cell and cell apoptosis is based on in vitro data or in silico data on cancer cell apoptosis according to a reduction in expression or removal of a targeted gene.
19. The system of claim 14, wherein the model for predicting binding affinity of the neoantigen is generated from learning of data regarding interaction between amino acids of the peptide and amino acids of the HLA.
20. The system of claim 14, wherein the gene expression profile is sequencing data of an exome, a transcriptome, a single cell transcriptome, a peptidome, or an entire genome.
21. A method of preparing an anti-cancer vaccine, the method comprising: obtaining a neoantigen by the method of claim 1; and preparing an anti-cancer vaccine comprising the neoantigen, wherein the preparing an anti-cancer vaccine comprises obtaining peptide sequences comprising the neoantigen, the peptide sequences consisting of 9 to 30 amino acids; and selecting a peptide sequence having hydrophilicity and stability from the obtained peptide sequences.
22. (canceled)
23. The method of claim 21, wherein the selected peptide sequence has Kyte-Doolittle GRAVY<0 and InstaIndex<40.
24. (canceled)
25. A method of providing information to predict treatment prognosis of a cancer patient, the method comprising: obtaining a neoantigen by the method of claim 1; and measuring the neoantigen load in a sample from the cancer patient.
26. The method of claim 25, further comprising comparing the obtained neoantigen load and a neoantigen load obtained from a control group consisting of cancer patients for whom treatment prognosis has been confirmed.
27. (canceled)
Description
BRIEF DESCRIPTION OF DRAWINGS
DESCRIPTION OF THE EMBODIMENTS
[0095] The present disclosure will become apparent with reference to the embodiments described in detail below in conjunction with the accompanying drawings. However, the present disclosure is not limited to the embodiments below and may be implemented in various forms. The present embodiments serve only to complete the disclosure and are provided to fully convey the scope of the disclosure to those of ordinary skill in the art to which the present disclosure pertains; the present disclosure is defined by the claims.
[0096] The term “exome” as used herein refers to the collection of all of the exons present in a cell, a group of cells, or an organism.
[0097] The term “transcriptome” as used herein refers to the collection of all RNA transcripts present in a cell, a group of cells, or an organism.
[0098] The term “gene expression profile” as used herein refers to an analysis of genes expressed or transcribed from the genome of a cell, and also refers to a set of values indicating gene expression levels including the mRNA levels of one or more genes.
[0099] The term “dependency” as used herein refers to essentiality for proliferation or survival of a cell, and is used interchangeably with “essentiality”.
[0100] The term “dependency gene” as used herein refers to a gene essential for proliferation or survival of a cell. In particular, the dependency gene is a gene that causes reduced cell proliferation and/or cell apoptosis when expression of the gene is reduced or the gene is removed, that is, a gene on which a cell depends for its survival. The dependency gene may include a universal dependency gene identified as universally essential for survival of cancers or cancer cells of various types and/or sources, and/or a cancer patient-specific dependency gene identified as specifically essential for survival of cancer cells derived from an individual cancer patient. The dependency gene may refer to a gene that is constitutively expressed in a cell and that is homogeneously expressed in all individual cells.
[0101] The term “universal dependency gene” as used herein refers to a gene identified as essential for survival of a range of various cancer cells through in vitro data of a known cancer cell line.
[0102] The term “cancer patient-specific dependency gene” as used herein refers to a gene identified as essential for survival of a cancer cell derived from an individual cancer patient.
[0103] The term “neoantigen” as used herein refers to a peptide that causes an immune response. That is, the neoantigen may be an immunogenic peptide. The neoantigen may be generated by a cancer cell-specific mutation, and may appear as an epitope of a cancer cell. The neoantigen may be an antigen having at least one alteration that distinguishes it from a corresponding wild-type, parental antigen, either through mutation in a cancer cell or through post-translational modification specific to a cancer cell. The neoantigen may include an amino acid sequence or a nucleotide sequence. A mutation may include a frameshift or non-frameshift mutation, an indel, a missense or nonsense mutation, a splice site mutation, a genomic rearrangement or gene fusion, or any genomic or expressional alteration resulting in a new ORF.
[0104] Since a neoantigen derived from a universal dependency gene or a cancer patient-specific dependency gene is not affected by immune evasion of cancer cells via immunoediting, such a neoantigen may be an effective therapeutic target for a cancer patient-personalized cancer vaccine that can have high immunotherapeutic effects in the cancer patient, and may be an effective diagnostic target as a marker for prognosis of immunotherapy.
[0105] The term “immunotherapy” as used herein refers to therapy using an immune response. Immunotherapy may be used to treat cancer. For example, the immunotherapy may be a treatment using an immune checkpoint blockade such as an anti-CTLA4 blocker or an anti-PD-1/PD-L1 blocker, but is not limited thereto, and may include various types of immunotherapy.
[0106] The term “binding affinity” as used herein refers to the binding force between a neoantigen peptide and the MHC of an antigen-presenting cell, and may be expressed as a CNN-MHC value. The “CNN-MHC value” is a value obtained from a deep learning model built on experimental values in which the binding strength between respective amino acids of the neoantigen and the MHC is converted into a matrix form, and is a probability value between 0 and 1 obtained by use of a sigmoid activation function. Specifically, an immunogenic peptide capable of binding to an MHC class I or II protein may have a CNN-MHC value greater than or equal to 0.5. In addition, the closer the CNN-MHC value is to 1, the stronger the binding between the immunogenic peptide and the MHC class I or II protein.
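As a minimal illustration of the definition above, the following sketch maps a raw model score to a probability with a sigmoid function and applies the 0.5 cutoff stated in this paragraph (the claims recite a value >0.5). The function names and the example scores are hypothetical and are not part of the disclosed CNN-MHC model, which is not reproduced here.

```python
import math


def sigmoid(x: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def is_predicted_binder(cnn_mhc_value: float, threshold: float = 0.5) -> bool:
    """Apply the cutoff from this paragraph: a value >= 0.5 counts as binding."""
    return cnn_mhc_value >= threshold


# Hypothetical raw scores (logits) for three peptides.
for logit in (-2.0, 0.0, 3.0):
    value = sigmoid(logit)
    print(f"logit={logit:+.1f}  value={value:.3f}  binder={is_predicted_binder(value)}")
```

Because the sigmoid is monotonic, a cutoff of 0.5 on the probability is equivalent to a cutoff of 0 on the raw score.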
[0107] The term “an antigen-presenting cell” as used herein refers to a cell that internalizes and processes a protein antigen and presents on its surface an antigen-derived peptide fragment along with MHC class II to a T cell for activation, and examples thereof include a macrophage, a B cell, a dendritic cell, and the like.
[0108] The term “model for predicting cell survival dependency” as used herein refers to a model predicting the probability of cell survival or cell apoptosis in case of a reduction in expression or removal of an individual gene. Specifically, it is a model that predicts an effect of a specific gene on cell survival by learning the effect of knockout/knockdown of a gene on cell survival based on machine learning, and the machine learning may be performed based on in vitro data of the knockout/knockdown of a gene by RNAi or CRISPR/Cas9. When a gene expression profile or a sequence is input into the model for predicting cell survival dependency, cell survival dependency of a gene derived from the corresponding sequence may be predicted.
[0109] The term “model for predicting binding affinity of neoantigen” as used herein refers to a model predicting binding affinity of a neoantigen to an antigen-presenting cell, particularly, to MHC of an antigen-presenting cell. By learning the binding affinity resulting from the amino acid interactions between a peptide sequence and an HLA sequence based on the machine learning and inputting a peptide sequence of a neoantigen, the binding affinity of the neoantigen to the HLA of the antigen-presenting cell may be predicted, and the neoantigen may be classified according to the established scale of the binding affinity.
Example 1. Selection of Gene Essential for Cancer Cell
[0110] To determine whether functions of a gene are essential for survival of cancer or cancer cells, high-throughput screening (HTS) can be performed by using an RNAi or CRISPR library capable of performing knockout/knockdown of all genes. Specifically, genes essential for survival of cancer cells can be identified in vitro by transfecting cancer cells with an shRNA library or CRISPR sgRNA, conducting deep sequencing after a certain period of time, and comparing the result with the sequencing result from the cells in the initial state to obtain quantitative profiling showing which genes, when inactivated, led to cell apoptosis.
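The comparison of guide abundance between the initial and final time points described above can be sketched as a per-gene log2 fold change. This is only an illustrative sketch: the read-count normalization, the pseudocount, the averaging over guides, and all guide and gene names are assumptions made for the example and are not taken from the disclosure.

```python
import math
from collections import defaultdict


def gene_log2_fold_changes(initial, final, guide_to_gene, pseudocount=1.0):
    """Return the mean log2 fold change of guide abundance for each gene."""
    total_initial = sum(initial.values())
    total_final = sum(final.values())
    per_gene = defaultdict(list)
    for guide, gene in guide_to_gene.items():
        # Normalize to reads per million so that sequencing depth cancels out.
        rpm_initial = initial.get(guide, 0) / total_initial * 1e6
        rpm_final = final.get(guide, 0) / total_final * 1e6
        ratio = (rpm_final + pseudocount) / (rpm_initial + pseudocount)
        per_gene[gene].append(math.log2(ratio))
    return {gene: sum(vals) / len(vals) for gene, vals in per_gene.items()}


# Toy read counts for four hypothetical guides targeting two hypothetical genes.
guide_to_gene = {"g1": "GENE_A", "g2": "GENE_A", "g3": "GENE_B", "g4": "GENE_B"}
initial = {"g1": 500, "g2": 500, "g3": 500, "g4": 500}
final = {"g1": 50, "g2": 70, "g3": 900, "g4": 980}

lfc = gene_log2_fold_changes(initial, final, guide_to_gene)
# Genes whose guides are most depleted come first (candidate essential genes).
depleted_first = sorted(lfc, key=lfc.get)
print(depleted_first, lfc)
```

A strongly negative value means that cells carrying guides against that gene dropped out of the population, which is the signal used to call a gene essential in such screens.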
[0111] In this way, in vitro data on genes essential for survival of cancer cells are continually being produced with an increasing number of cell lines. Accordingly, for major solid cancers, such as lung cancer, ovarian cancer, colorectal cancer, stomach cancer, breast cancer, and the like, dependency data on a significant number of cancer cell lines have been established (https://depmap.org/portal/ and https://depmap.sanger.ac.uk/). Then, according to an embodiment of the present disclosure, these data were used as data on universal dependency genes, or used for the purpose of predicting cancer patient-specific dependency genes in silico based on deep learning.
[0112] The in vitro data on the cancer cell lines can be used to obtain universal dependency genes, whereas the data derived from cancer patients, such as transcriptome data of cancer patients, can be used to identify a cancer patient-specific dependency gene by applying the data to a model for predicting cell survival dependency, which was trained based on the in vitro data on the cancer cell lines to predict dependency of each patient sample.
[0113] In addition, applying single cell transcriptome data to the model for predicting cell survival dependency makes it possible to identify a dependency pattern of respective cells in tumor heterogeneity, and genes showing the same dependency among several cells may be selected as universal dependency genes. Genes that are essential for survival of cancer cells and homogeneously expressed at an individual cell level may be selected as dependency genes that may serve as effective diagnostic and/or therapeutic targets.
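One way to picture the selection described above is to combine a per-gene dependency score with the fraction of individual cells in which the gene is expressed. The sketch below is a simplified assumption: the disclosure does not state how homogeneity is quantified, so the detection threshold, both cutoffs, the dependency scores, and the gene names are hypothetical.

```python
def expression_homogeneity(cells_by_gene, min_expr=0.0):
    """Fraction of cells in which each gene is detected above min_expr."""
    return {
        gene: sum(1 for value in values if value > min_expr) / len(values)
        for gene, values in cells_by_gene.items()
    }


def homogeneous_dependency_genes(dependency, homogeneity, dep_cutoff=0.5, hom_cutoff=0.9):
    """Keep genes that are both predicted essential and expressed in most cells."""
    return sorted(
        gene
        for gene in dependency
        if dependency[gene] >= dep_cutoff and homogeneity.get(gene, 0.0) >= hom_cutoff
    )


# Toy single cell expression values (four cells) and hypothetical dependency scores.
cells_by_gene = {
    "GENE_A": [5.1, 4.8, 6.0, 5.5],   # expressed in every cell
    "GENE_B": [0.0, 7.2, 0.0, 0.0],   # expressed in one cell only
    "GENE_C": [2.0, 2.2, 1.9, 2.4],   # expressed in every cell
}
dependency = {"GENE_A": 0.90, "GENE_B": 0.95, "GENE_C": 0.20}

homogeneity = expression_homogeneity(cells_by_gene)
selected = homogeneous_dependency_genes(dependency, homogeneity)
print(homogeneity, selected)
```

In this toy example GENE_B is predicted essential but is expressed in only a quarter of the cells, so it is excluded even though its dependency score is the highest.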
[0114] The model for predicting cell survival dependency is a model based on neural networks of deep learning consisting of an input layer, multiple hidden layers, and an output layer. The neural networks are configured in a way that, when the aforementioned in vitro data are input into the input layer, the relationship between gene expression pattern in a cell and the cell apoptosis is learned in the one or more hidden layers, and then a predetermined probability value is output in the output layer. Here, the predetermined probability value includes a value indicating a probability of cell apoptosis and a value indicating a probability of cell growth.
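The layer structure described above can be sketched as a small feed-forward network whose output layer returns two probabilities (apoptosis and growth). This is a structural sketch only: the layer sizes, the ReLU and softmax activations, and the random, untrained weights are assumptions for illustration, and the expression profile is made up.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights; a real model would learn these from data."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(expression, layers):
    """Forward pass: hidden layers use ReLU, the output layer uses softmax."""
    hidden = expression
    for weights, bias in layers[:-1]:
        hidden = relu(dense(hidden, weights, bias))
    weights, bias = layers[-1]
    p_apoptosis, p_growth = softmax(dense(hidden, weights, bias))
    return {"apoptosis": p_apoptosis, "growth": p_growth}


rng = random.Random(0)
sizes = [6, 8, 4, 2]  # input layer, two hidden layers, output layer
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
profile = [0.2, 1.5, 0.0, 3.1, 0.7, 2.2]  # hypothetical expression levels of six genes
probs = predict(profile, layers)
print(probs)
```

With a softmax output the two values always sum to 1, which matches the description of one probability for apoptosis and one for growth.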
[0115] When the single cell transcriptome data on cancer patients are available, the cancer patient-specific dependency genes can be identified by using the model for predicting cell survival dependency. Otherwise, the universal dependency genes can be identified by using other published data on cancer patient samples.
Example 2. Selection of Neoantigens Derived from Universal Dependency Gene and Significance Thereof as Diagnostic/Therapeutic Target
[0116] A neoantigen refers to a peptide or protein fragment that binds to an MHC protein and is presented on the surface of a cancer cell, and is thus recognized as an antigen by immune cells; it is not present in a normal cell but is generated by cancer-specific mutations. The neoantigen is a key element of immunotherapy, and it is known that the greater the number of neoantigens, the better the responsiveness to immunotherapy. In this regard, a neoantigen load is utilized as a diagnostic marker. However, since neoantigens are derived from various genes and have different characteristics, the usefulness of neoantigens as diagnostic markers or therapeutic targets varies. In this Example, as described in Example 1, neoantigens derived from genes on which cancer cell survival depends, that is, genes essential for cancer cell survival, were selected as useful neoantigens capable of maximizing the effects of immunotherapy, and the selected neoantigens were found to be responsive to immunotherapy.
[0117] Specifically, the responsiveness to immunotherapy was compared by employing cohorts treated with immune checkpoint blockades listed in Table 1.
TABLE 1
Cohort name   Tumor type    Cohort size   Data availability   Reference
SMC           Lung cancer   122           Pre-therapy
Rizvi         Lung cancer   34            Pre-therapy         Science 348, 124-128
Hellmann      Lung cancer   75            Pre-therapy         Cancer Cell 33, 843-852
Hugo          Melanoma      38            Pre-therapy         Cell 165, 35-44
Van Allen     Melanoma      110           Pre-therapy         Science 350, 207-211
Snyder        Melanoma      64            Pre-therapy         N. Engl. J. Med. 371, 2189-2199
Riaz          Melanoma      68            Pre-/On-therapy     Cell 171, 934-949
[0118] 2-1. Pre-Therapy Cohort
[0119] By using the in vitro dependency data and expression homogeneity data of the published lung cancer and melanoma cohorts with only pre-therapy results, the usefulness of the screening method for neoantigens based on cancer cell survival dependency according to an embodiment of the present disclosure was investigated. The analysis results are shown in FIG. 5. Specifically, the genes from which the neoantigens were derived were ranked according to their essentiality for cancer cell survival, the top 500 to 2000 (high dependency) genes and the bottom 500 to 2000 (low dependency) genes were selected, and the differential neoantigen load between patients with good prognosis and patients with poor prognosis for immunotherapy was calculated for each gene group. The larger the difference, the better the explanatory power for the treatment prognosis is considered to be. Here, the gray dotted line indicates the difference in the number of neoantigens calculated for all genes according to an existing standard method. As a result, it was found that use of a small number of genes having high essentiality or expression homogeneity provides better explanatory power than use of all genes, while use of genes having low essentiality or expression heterogeneity provides poor explanatory power. Similarly, the differential neoantigen load was calculated according to expression homogeneity and gene expression level. Overall, it was found that the essentiality and expression homogeneity of a gene provide better explanatory power for the treatment prognosis than the expression level of the gene.
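The comparison described in this paragraph can be illustrated with a toy calculation. The exact formula used in the disclosure is not reproduced in this text, so the sketch below assumes the simplest reading, a difference between the mean neoantigen loads of the two patient groups, restricted to a chosen gene set. The gene names, patient records, and the value of `top_n` are hypothetical.

```python
def neoantigen_load(source_genes, gene_set):
    """Count a patient's neoantigens whose source gene is in gene_set."""
    return sum(1 for gene in source_genes if gene in gene_set)


def differential_load(patients, gene_set):
    """Mean load in good-prognosis patients minus mean load in poor-prognosis patients."""
    good = [neoantigen_load(p["genes"], gene_set) for p in patients if p["good_prognosis"]]
    poor = [neoantigen_load(p["genes"], gene_set) for p in patients if not p["good_prognosis"]]
    return sum(good) / len(good) - sum(poor) / len(poor)


# Genes ranked from most to least essential (toy ranking).
ranked_genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
# Each patient lists the source gene of every neoantigen that was called.
patients = [
    {"genes": ["G1", "G2", "G5"], "good_prognosis": True},
    {"genes": ["G1", "G2", "G2", "G6"], "good_prognosis": True},
    {"genes": ["G5", "G6", "G6"], "good_prognosis": False},
    {"genes": ["G1", "G4", "G5"], "good_prognosis": False},
]

top_n = 2
diff_high = differential_load(patients, set(ranked_genes[:top_n]))   # high dependency
diff_low = differential_load(patients, set(ranked_genes[-top_n:]))   # low dependency
diff_all = differential_load(patients, set(ranked_genes))            # all genes
print(diff_high, diff_low, diff_all)
```

In this toy data the high dependency gene set separates the two groups more clearly than the full gene set, which is the kind of pattern the paragraph reports for the real cohorts.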
[0120] These results suggest that the immune evasion may occur more actively when the neoantigens subject to immune response are derived from the genes not essential for the proliferation or survival of cancer.
[0122] 2-2. Pre-Therapy and On-Therapy Cohorts
[0123] The dependency of genes from which the neoantigens were derived and the immunotherapy responsiveness were analyzed by using the Riaz cohort data including both pre-therapy and on-therapy data of the immune checkpoint blockades.
[0124] The analysis results are shown in
[0125] Here, CR/PR and SD/PD refer to positive prognosis and negative prognosis, respectively. It was found that, in the patient group with a good treatment response (CR (complete remission)/PR (partial remission)), neoantigens derived from genes having expression homogeneity and genes essential for survival of cancer cells were mainly targeted by the anticancer immune reaction, leading to clonal contraction and a reduction in expression at the RNA level. Meanwhile, in the patient group with a poor treatment response (SD (stable disease)/PD (progressive disease)), neoantigens derived from genes having expression heterogeneity and genes not essential for survival of cancer cells were the main targets of immune attack, leading to reduced gene expression, successful immune evasion through immunoediting, and thus clonal expansion.
[0126] 2-3. MSK-Integrated Mutation Profiling of Actionable Cancer Targets (IMPACT)
[0127] MSK-IMPACT was used to verify the dependency of genes from which neoantigens were derived and immunotherapy responsiveness for various carcinomas.
[0128] In the previous study, for 468 genes included on the so-called Memorial Sloan Kettering Cancer Center (MSKCC) panel, a mutation load (proportional to the neoantigen load) in 1,662 people who received the checkpoint blockade treatment was measured over various cancer types, and the mutation load was found to be an important determinant of the immunotherapy responsiveness (Nat. Genet. 51:202-206, 2019).
[0129] In this Example, it was found that use of top 50% of 468 genes present on the MSKCC panel in terms of essentiality for cancer cell survival (High dependency) to measure a mutation load resulted in a higher association with the responsiveness of the immune checkpoint blockade treatment than use of all 468 genes.
[0130] The results are shown in the upper panel in
[0131] For each gene group, the patients were divided into those having a high mutation burden or neoantigen load and those having a low mutation burden or neoantigen load, and the post-treatment survival probabilities were compared, with HR values and p-values shown for each group. As a result, it was confirmed that the treatment prognosis was better explained by a small number of genes with high dependency for cancer cell survival than by all genes according to the standard method. Taken together, it was confirmed that the essentiality for cancer cells of the genes from which antigens (including neoantigens and surface antigens) were derived, and the homogeneity of their expression in each cell in a tissue, significantly affect the immune responsiveness of cancer.
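The grouping step in this section can be sketched as follows: rank the panel genes by dependency, keep the top 50%, count each patient's mutations in that subset, and split the cohort into high and low groups. The median split is an assumption, since the disclosure does not state how the high and low groups were defined, and the survival comparison (HR values and p-values) is not shown. The gene names, scores, and patient records are hypothetical.

```python
from statistics import median


def top_half_by_dependency(dependency):
    """Return the top 50% of panel genes ranked by dependency score."""
    ranked = sorted(dependency, key=dependency.get, reverse=True)
    return set(ranked[: len(ranked) // 2])


def mutation_load(mutated_genes, gene_set):
    """Count a patient's mutations that fall in gene_set."""
    return sum(1 for gene in mutated_genes if gene in gene_set)


def split_high_low(loads):
    """Label a patient 'high' if the load is above the cohort median, else 'low'."""
    cutoff = median(loads.values())
    return {pid: ("high" if load > cutoff else "low") for pid, load in loads.items()}


# Toy panel of four genes with hypothetical dependency scores.
dependency = {"GENE_1": 0.9, "GENE_2": 0.8, "GENE_3": 0.3, "GENE_4": 0.1}
# Toy cohort: each patient lists the panel genes in which a mutation was found.
cohort = {
    "pt1": ["GENE_1", "GENE_2", "GENE_3"],
    "pt2": ["GENE_3", "GENE_4"],
    "pt3": ["GENE_1"],
    "pt4": ["GENE_1", "GENE_2", "GENE_2"],
}

high_dependency = top_half_by_dependency(dependency)
loads = {pid: mutation_load(genes, high_dependency) for pid, genes in cohort.items()}
groups = split_high_low(loads)
print(loads, groups)
```

The resulting group labels would then be the input to a survival comparison between the high and low groups.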
[0132] It was also found that, in order to minimize the immune evasion of cancer, it is essential to target treatment to neoantigens derived from genes essential for cancer survival and from homogeneously expressed genes, and that the neoantigen load may be used as a marker for predicting immunotherapy prognosis.
Example 3. Selection of Neoantigen Derived from Cancer Patient-Specific Dependency Gene and Significance of the Neoantigen as Diagnostic/Therapeutic Target
[0133] In this Example, data obtained from a specific cancer patient sample was applied to a model for predicting cell survival dependency to obtain neoantigens derived from selected cancer patient-specific dependency genes, and then association between the neoantigens and the survival of cancer patients was investigated.
[0134] In Example 2, the association between the neoantigens derived from universal dependency gene and the survival of patients treated with immunotherapy was investigated.
[0135] The dependency data used in Example 2 were derived from the in vitro cancer cell line experiment described in Example 1, and accordingly, genes having universal dependency in several cancer cell lines, rather than dependency in a specific patient, were identified. As described above, a model for predicting cell survival dependency for predicting the cell survival dependency on gene expression may be generated by learning the relationship between the patterns of gene expression levels and the cell apoptosis based on the in vitro dependency data. When the gene expression profile of a cancer patient is input to the model, cancer patient-specific dependency genes may be identified. Then, the transcriptome data for lung cancer patients and breast cancer patients published by The Cancer Genome Atlas (TCGA) were used to find whether neoantigens obtained based on the cancer patient-specific in silico dependency data can serve as diagnostic/therapeutic targets of significance (https://portal.gdc.cancer.gov/). Since these patients did not actually receive the immunotherapy, the patients were divided into a sample having high penetration of immune cells (referred to as high leukocyte fraction) and a sample having low penetration of immune cells (referred to as low leukocyte fraction), assuming that an effect similar to immunotherapy would be seen in the sample having high penetration of immune cells. The results are shown in
[0136] The number of neoantigens derived from the cancer patient-specific genes essential for the proliferation or survival of cancer cells (High dependency) (shown in a black bar graph in
Example 4. Model for Predicting Binding Affinity Between Neoantigen and Antigen-Presenting Cell
[0137] For a neoantigen to be responsive to immunotherapy, the neoantigen must be processed by an antigen-presenting cell to be bound to HLA on a surface of the antigen-presenting cell.
[0138] Benchmark datasets from the Immune Epitope Database (IEDB) used in previous studies were employed to build and validate a model for predicting binding affinity between the neoantigen and the antigen-presenting cell and the results from the model were compared with the prediction power results from the existing machine learning algorithms published by the IEDB.
[0139] Here, the model for predicting binding affinity between the neoantigen and the antigen-presenting cell is a CNN model that includes multiple convolutional layers, a fully connected layer, and an output layer, but no pooling layer, so that all output values of the multiple convolutional layers are used for prediction.
[0140] The multiple convolutional layers extract interaction features from input data which are map data including parameters indicating binding affinity between amino acids of a peptide and amino acids of HLA.
[0141] The multiple convolutional layers may perform convolution by using a specific number of kernels or weight matrices. The fully connected layer receives and integrates the output values of the multiple convolutional layers as an input, and the output layer generates information about the binding probability by using a sigmoid function.
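The architecture described in the preceding paragraphs (convolutional layers, no pooling, a fully connected layer, and a sigmoid output) can be sketched in plain Python as below. This is a structural illustration only: the 9×12 map size, the 3×3 kernels, the number of channels, the ReLU activation, and the random, untrained weights are all assumptions, and the input map is filled with toy values rather than real amino acid interaction parameters. For brevity the fully connected layer here maps directly to a single score that the sigmoid output converts to a probability.

```python
import math
import random


def correlate(image, kernel):
    """'Valid' 2-D cross-correlation of one channel with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
         for j in range(len(image[0]) - kw + 1)]
        for i in range(len(image) - kh + 1)
    ]


def conv_layer(channels, kernels):
    """kernels[o][c] is the kernel linking input channel c to output channel o."""
    outputs = []
    for per_input in kernels:
        partial = [correlate(ch, k) for ch, k in zip(channels, per_input)]
        # Sum over input channels, then apply ReLU.
        outputs.append([
            [max(0.0, sum(p[i][j] for p in partial)) for j in range(len(partial[0][0]))]
            for i in range(len(partial[0]))
        ])
    return outputs


def predict_binding(interaction_map, conv_stack, fc_weights, fc_bias):
    channels = [interaction_map]
    for kernels in conv_stack:
        channels = conv_layer(channels, kernels)
    # No pooling: every convolution output value is flattened and passed on.
    flat = [v for ch in channels for row in ch for v in row]
    score = sum(w * v for w, v in zip(fc_weights, flat)) + fc_bias
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid output


rng = random.Random(0)


def rand_kernel():
    return [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]


# Toy map: 9 peptide positions x 12 HLA positions (hypothetical sizes and values).
interaction_map = [[rng.random() for _ in range(12)] for _ in range(9)]
conv_stack = [
    [[rand_kernel()] for _ in range(2)],                    # 1 -> 2 channels, 7x10
    [[rand_kernel() for _ in range(2)] for _ in range(2)],  # 2 -> 2 channels, 5x8
]
fc_weights = [rng.uniform(-0.1, 0.1) for _ in range(2 * 5 * 8)]
probability = predict_binding(interaction_map, conv_stack, fc_weights, 0.0)
print(probability)
```

Leaving out pooling means the flattened vector keeps one value per kernel position, so position-specific peptide and HLA contacts are not averaged away before the fully connected layer.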
[0142] The results are shown in
[0143] 4-1. Confirmation of Binding of Neoantigen to HLA
[0144] To confirm whether the neoantigen actually binds to the HLA molecule as predicted by the model (CNN-MHC) according to this Example, the binding ability of the neoantigen to the HLA was tested.
[0145] Specifically, the binding ability of peptides predicted to bind to HLA-A02 was analyzed by using a T2 cell line expressing HLA-A02 (ATCC CRL-1992). The peptides predicted to bind to HLA-A02 are shown in Table 2 (SEQ ID NOs: 1 to 50). In Table 2, the CNN-MHC values are obtained from a deep learning model built on experimental values in which the binding strength between respective amino acids of the neoantigen and the MHC is converted into a matrix form, and are probabilities between 0 and 1. The NetMHC values are predicted values of the binding between the MHC protein and the peptide calculated with NetMHCpan-4.1, the latest version of the NetMHC tool commonly used for binding prediction of neoantigens, which is a model trained on mass spectrometry data. For both values, a higher value is considered to indicate a higher binding probability. Specifically, a culture of the T2 cell line (1×10^6/ml) was treated with a peptide at a concentration of 50 μg/ml for 24 hours and then stained with an APC-labeled HLA-A2 monoclonal antibody. The resulting cells were analyzed by flow cytometry to determine the binding ability. Here, DMSO was used as a negative control, and Mart-1 and NY-ESO were used as positive controls.
TABLE 2
Columns: Peptide amino acid sequence; CNN-MHC value; NetMHC value; Mutation information
(Table entries: data missing or illegible when filed)
[0146]
[0147] Of the peptides predicted by the model (CNN-MHC) of this Example, 80% were found to actually bind to HLA-A02. Compared with the widely used predictions of NetMHCpan-4.1 (http://www.cbs.dtu.dk/services/NetMHCpan-4.1/), the model of this Example was found to have better predictive power. Meanwhile, the aforementioned Examples may be performed by a processor 120 executing at least one instruction stored in a memory 110 of a system shown in
[0148] Referring to
[0149] Prediction models generated and updated by the prediction model generation unit 122 may be stored in a memory.
Example 5. Anti-Cancer Vaccine Containing Neoantigen
[0150] As described in the aforementioned Examples, an anti-cancer vaccine containing a neoantigen derived from a gene essential for cancer cell survival may be prepared.
[0151] By performing exome sequencing on a mouse model into which cancer cells or tissues have been transplanted, mutations not found in normal cells may be discovered, and these mutations may be supplied to the model for predicting binding affinity to antigen-presenting cells to select candidate neoantigens.
[0152] The genes from which the neoantigens were derived are subjected to the model for predicting cell survival dependency to determine their essentiality for cancer cell survival based on the dependency data on the corresponding cancer, and the neoantigens are ranked by essentiality to select the top 5 and bottom 5 neoantigens. All possible peptide sequences that are 9 to 30 amino acids long and include any of the selected neoantigens are generated and tested to select sequences having chemical properties suitable for peptide synthesis, e.g., high hydrophilicity (Kyte-Doolittle GRAVY<0) and a low instability index (InstaIndex<40). Then, peptides having the selected sequences may be synthesized and used to prepare an anti-cancer vaccine containing the neoantigens.
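The peptide generation and hydrophilicity filter described above can be illustrated with a minimal sketch. It is not the implementation used in this Example: the function names (`gravy`, `windows_containing`, `synthesizable`) and the optional `instability` callback are hypothetical, and the instability index itself is not computed here (a library such as Biopython's `ProteinAnalysis`, which provides `gravy()` and `instability_index()`, could supply it). The hydropathy values are the published Kyte-Doolittle scale.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KD[aa] for aa in peptide) / len(peptide)


def windows_containing(protein, mut_index, min_len=9, max_len=30):
    """Yield every substring of length min_len..max_len that covers
    the mutated residue at 0-based position mut_index."""
    n = len(protein)
    for length in range(min_len, min(max_len, n) + 1):
        first = max(0, mut_index - length + 1)   # leftmost start covering the site
        last = min(mut_index, n - length)        # rightmost start that still fits
        for start in range(first, last + 1):
            yield protein[start:start + length]


def synthesizable(protein, mut_index, instability=None, max_instability=40.0):
    """Keep hydrophilic windows (GRAVY < 0). If a caller-supplied
    instability function is given (assumption: returns an instability
    index), also require a value below max_instability."""
    kept = []
    for pep in windows_containing(protein, mut_index):
        if gravy(pep) >= 0:
            continue
        if instability is not None and instability(pep) >= max_instability:
            continue
        kept.append(pep)
    return kept
```

Enumerating windows by start position keeps the mutated residue inside every candidate, so each retained peptide still carries the neoantigen.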
[0153] 5-1. Selection of Neoantigen
[0154] The exome and transcriptome information of cancer cell lines was analyzed by using the model for predicting cell survival dependency and the model for predicting binding affinity between the neoantigen and the antigen-presenting cell as described in Examples 1 to 4. Accordingly, mutations that were derived from dependency genes having high essentiality for cell survival and that had HLA-binding ability were selected.
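The two-criterion selection described above can be sketched as a filter followed by a ranking. This is an illustration under stated assumptions, not the procedure of this Example: the `Candidate` record, the `select_candidates` function, the binding cutoff of 0.5, and the convention that a larger `essentiality` score means a more essential gene are all hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    mutation: str        # hypothetical label for the mutation
    essentiality: float  # assumed: larger means more essential for survival
    binding_prob: float  # predicted binding probability in [0, 1]


def select_candidates(cands, min_binding=0.5, k=5):
    """Keep candidates predicted to bind, then return the k whose
    source genes are most essential for cell survival."""
    binders = [c for c in cands if c.binding_prob >= min_binding]
    ranked = sorted(binders, key=lambda c: c.essentiality, reverse=True)
    return ranked[:k]
```

Filtering on binding first and ranking on essentiality second keeps the two model outputs separate, so either cutoff can be changed without affecting the other.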
[0155] 5-2. Confirmation of Immune Response by Neoantigen
[0156] To confirm whether the neoantigens induce an immune response, a test was conducted to detect T cells that recognize neoantigen peptides with predicted or experimentally verified HLA-binding ability, including the neoantigen peptides selected in 5-1.
[0157] Specifically, mice transplanted with a mouse cancer cell line were used to identify a T cell pool having specificity for the neoantigens. To generate a mouse model, 1×10⁶ cells of LLC-1 (ATCC CRL-1642), a mouse cancer cell line, were subcutaneously injected into the flank of a 6-week-old male C57BL/6 mouse weighing 20 g. Through analysis of the exomes and transcriptomes of the cancer cell line, 15 mutations predicted to have high essentiality for cell survival and high binding force to the antigen-presenting cell were selected as a neoantigen group expected to bind to HLA and to induce reactive CD8+ T cells.
[0158] The information on the selected peptides is summarized in Table 3 (SEQ ID NOs: 51 to 65). The CNN-MHC values listed in Table 3 are outputs of a deep learning model built on experimental values in which the binding strength between the respective amino acids of the neoantigen and of the MHC is converted into matrix form, and each output is converted into a probability between 0 and 1. Here, a higher value indicates a higher probability of binding. To determine the reaction of CD8+ T cells to the neoantigens, an ELISpot analysis based on IFNγ secretion was performed by using spleen cells isolated from mice vaccinated with 9-mer or 15-mer peptides synthesized with the mutation sequences. Of the 15 candidate neoantigen peptides with expected reactivity, 10 peptides (66.7%) were found to induce IFNγ secretion from T cells. The results are shown in
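TABLE 3
Columns as filed (header partly illegible): peptide/neoantigen amino acid sequence; CNN-MHC value; Mutation information; HLA type.
Most entries were missing or illegible when filed. The surviving fragments are listed below in filed order, one line per row; the sequence and HLA type entries did not survive.

Row   Mutation information (fragments)   CNN-MHC value (fragments)
1     p. / _                             0.9
2     p. / _                             0.9
3     p. / _variant                      0.9
4     p. / _variant                      0.97
5     p. / _variant                      0.95 / 8
6     p. / _variant                      0.9350
7     p. / _variant                      0.9
8     p.                                 0.9
9     p. / _variant                      0.9158
10    p. / _variant                      (none)
11    p. / _variant                      0.8
12    p. / _variant                      0.89
13    p. / _variant                      0. / 74
14    p. / _variant                      0. / 22
15    p. / _variant                      0.7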
Example 6. Prediction of Treatment Prognosis for Cancer Patient by Using Neoantigen
[0159] As described in the aforementioned Examples, neoantigens derived from a gene essential for cancer cell survival may be used to predict treatment prognosis for a cancer patient.
[0160] By performing exome sequencing on a patient sample, mutations not found in normal cells but found only in cancer cells are identified, and these mutations may be applied to the model for predicting binding affinity with antigen-presenting cells to select candidate neoantigens.
[0161] The model for predicting cell survival dependency is used to determine, based on the dependency data on the corresponding cancer, the essentiality for cancer cell survival of the genes from which the neoantigens were derived, and the genes are ranked by essentiality for survival to select the same number of genes (e.g., 500 genes) from the top and from the bottom. The patients treated with immunotherapy are divided into a patient group who responded to the immunotherapy and a patient group who did not respond to the immunotherapy. Then, the patient groups are compared in terms of the number of neoantigens derived from the genes selected from the top (i.e., highest essentiality) and the number derived from the genes selected from the bottom (i.e., lowest essentiality) to determine whether the load of neoantigens derived from the genes of high essentiality is strongly associated with the treatment prognosis of a patient. This determination of the association between the number of neoantigens and the treatment prognosis of patients is repeated while varying the number of genes selected, thereby choosing the number of neoantigens derived from dependency genes that is necessary to predict the treatment prognosis of the patient.
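The comparison described above can be sketched as follows. This is a simplified illustration, not the analysis of this Example: the function names and input layout are hypothetical, and the difference of group means stands in for whatever statistical test is actually applied (for instance, a rank-based test could be substituted). Each patient is represented by a list holding the source gene of each of that patient's neoantigens.

```python
from statistics import mean


def neoantigen_load(source_genes, gene_set):
    """Count a patient's neoantigens whose source gene is in gene_set."""
    return sum(1 for g in source_genes if g in gene_set)


def load_gap(ranked_genes, patients, responded, n):
    """For the n most and n least essential genes, return the mean
    neoantigen load of responders minus that of non-responders.

    ranked_genes: genes ordered from most to least essential (assumed).
    patients:     dict of patient id -> list of neoantigen source genes.
    responded:    dict of patient id -> True if the patient responded.
    """
    top, bottom = set(ranked_genes[:n]), set(ranked_genes[-n:])

    def gap(gene_set):
        resp = [neoantigen_load(g, gene_set)
                for p, g in patients.items() if responded[p]]
        non = [neoantigen_load(g, gene_set)
               for p, g in patients.items() if not responded[p]]
        return mean(resp) - mean(non)

    return gap(top), gap(bottom)


def scan_gene_counts(ranked_genes, patients, responded, sizes):
    """Repeat the comparison for each candidate number of genes."""
    return {n: load_gap(ranked_genes, patients, responded, n) for n in sizes}
```

Scanning several values of `n` mirrors the repeated determination in the paragraph above: the value of `n` at which the top-gene load best separates the two patient groups would be the one retained.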
[0162] The foregoing descriptions are only for illustrating the present disclosure, and it will be apparent to a person having ordinary skill in the art to which the present invention pertains that the embodiments disclosed herein can be easily modified into other specific forms without changing the technical principle or essential features.
[0163] Therefore, it should be understood that the Examples described herein are illustrative in all respects and are not limiting. For example, each component described as being in a single form may be implemented in a distributed manner, and likewise components described as being distributed may be implemented in a combined form.