METHODS AND COMPOSITIONS FOR IDENTIFYING NEOANTIGENS FOR USE IN TREATING AND PREVENTING CANCER
20220257701 · 2022-08-18
Inventors
Cpc classification
A61K9/0019
HUMAN NECESSITIES
G01N33/57484
PHYSICS
C40B40/10
CHEMISTRY; METALLURGY
C12N15/1062
CHEMISTRY; METALLURGY
A61K38/16
HUMAN NECESSITIES
C12N15/1093
CHEMISTRY; METALLURGY
International classification
A61K38/16
HUMAN NECESSITIES
A61K39/00
HUMAN NECESSITIES
A61P35/00
HUMAN NECESSITIES
C12N15/10
CHEMISTRY; METALLURGY
Abstract
Provided herein are methods of identifying neoantigens for treating and preventing cancer. Also disclosed are methods and compositions for administering identified neoantigens for the treatment and prevention of cancer.
Claims
1. A composition comprising a plurality of frameshift variant peptides, (i) wherein the plurality of frameshift variant peptides comprise peptides encoded by genes having a variant in a microsatellite (MS) in a coding region of the gene; or (ii) wherein the plurality of frameshift variant peptides comprise peptides encoded by a mRNA having an RNA processing error.
2. The composition of claim 1, wherein the plurality of frameshift variant peptides comprise one or more peptides provided in any one of Tables 1 or 7.
3. The composition of claim 1, wherein the plurality of frameshift peptides comprise frameshift peptides with enriched activity in an individual tumor type.
4. The composition of claim 1, wherein the plurality of frameshift peptides comprise frameshift peptides found across tumor types.
5. The composition of claim 1, further comprising an adjuvant.
6. The composition of claim 5, wherein the adjuvant is selected from the group consisting of ABM2, AS01B, AS02, AS02A, Adjumer, Adjuvax, Algammulin, alum, aluminum phosphate, aluminum potassium sulfate, Bordetella pertussis, calcitriol, chitosan, cholera toxin, CpG, dibutyl phthalate, dimethyldioctadecylammonium bromide (DDA), Freund's adjuvant, Freund's complete, Freund's incomplete (IFA), GM-CSF, GMDP, gamma inulin, glycerol, HBSS (Hank's Balanced Salt Solution), polyinosinic-polycytidylic acid stabilized with polylysine and carboxymethylcellulose (poly-ICLC, also known as Hiltonol), IL-12, IL-2, imiquimod, interferon-gamma, ISCOM, lipid core peptide (LCP), lipofectin, lipopolysaccharide (LPS), liposomes, MF59, MLP+TDM, monophosphoryl lipid A, Montanide IMS-1313, Montanide ISA 206, Montanide ISA 720, Montanide ISA-51, Montanide ISA-50, nor-MDP, oil-in-water emulsion, P1005 (non-ionic copolymer), Pam3Cys (lipoprotein), pertussis toxin, poloxamer, QS21, RaLPS, Ribi, saponin, Seppic ISA 720, soybean oil, squalene, Syntex adjuvant formulation (SAF), synthetic polynucleotides (poly IC/poly AU), TiterMax tomatine, Vaxfectin, XtendIII, or Zymosan.
7. The composition of claim 1, wherein the plurality of frameshift variant peptides comprise two or more pooled frameshift peptides.
8. The composition of claim 1, wherein the RNA processing error is a splicing error.
9. The composition of claim 8, wherein the splicing error comprises intron retention.
10. A method of treating or preventing cancer in a subject comprising administering a composition according to claim 1.
11. The method of claim 10, wherein the subject is a mammal.
12. The method of claim 11, wherein the subject is a human, a dog, a cat, a mouse, a rat, a rabbit, a horse, a cow, or a pig.
13. The method of claim 10, wherein the cancer is selected from the group consisting of acute lymphoblastic leukemia, acute monocytic leukemia, acute myeloid leukemia, acute promyelocytic leukemia, adenocarcinoma, adult T-cell leukemia, astrocytoma, bladder cancer, bone cancer, brain tumor, breast cancer, Burkitt's lymphoma, carcinoma, cervical cancer, chronic lymphocytic leukemia, chronic myelogenous leukemia, colon cancer, colorectal cancer, endometrial cancer, glioblastoma multiforme, glioma, hepatocellular carcinoma, Hodgkin's lymphoma, inflammatory breast cancer, kidney cancer, leukemia, lung cancer, lymphoma, malignant mesothelioma, medulloblastoma, melanoma, multiple myeloma, neuroblastoma, non-Hodgkin lymphoma, non-small cell lung cancer, ovarian cancer, pancreatic cancer, pituitary tumor, prostate cancer, retinoblastoma, skin cancer, small cell lung cancer, squamous cell carcinoma, stomach cancer, T-cell leukemia, T-cell lymphoma, thyroid cancer, and Wilms' tumor.
14. A method of making a composition for use in treating or preventing cancer, comprising: (a) contacting a biological sample obtained from a subject to a peptide array comprising a plurality of frameshift variant peptides, (i) wherein the plurality of frameshift variant peptides comprise peptides encoded by genes having a variant in a microsatellite (MS) in a coding region of the gene; or (ii) wherein the plurality of frameshift variant peptides comprise peptides encoded by a mRNA having an RNA processing error; (b) detecting one or more immune reactive neoantigens in the biological sample, wherein the one or more immune reactive neoantigens bind to at least one peptide in the peptide array; (c) selecting one or more detected immune reactive neoantigens; and (d) producing a composition comprising the one or more selected immune reactive neoantigens.
15. The method of claim 14, wherein the plurality of frameshift variant peptides comprise one or more peptides provided in any one of Tables 1 or 7.
16. The method of claim 14, wherein the plurality of frameshift variant peptides are fixed on a substrate.
17. The method of claim 16, wherein the substrate comprises glass, composite, resin, or combination thereof.
18. The method of claim 14, wherein the peptide array is configured to detect binding by at least one of fluorescence, luminescence, calorimetry, chromatography, radioactivity, Bio-Layer Interferometry, and surface plasmon resonance.
19. The method of claim 14, wherein the peptide array comprises at least about 25000, about 50000, about 75000, about 100000, about 125000, about 150000, about 175000, about 200000, about 225000, about 250000, about 275000, about 300000, about 325000, about 350000, about 375000, or about 400000 frameshift variant peptides.
20. The method of claim 14, wherein the biological sample comprises blood, serum, plasma, cerebrospinal fluid, saliva, urine, or combinations thereof.
21. The method of claim 14, wherein the biological sample comprises an antibody.
22. The method of claim 14, wherein the subject is a human, a dog, a cat, a mouse, a rat, a rabbit, a horse, a cow, or a pig.
23. The method of claim 14, wherein the subject is suspected of having a cancer.
24. The method of claim 14, wherein the cancer is selected from the group consisting of acute lymphoblastic leukemia, acute monocytic leukemia, acute myeloid leukemia, acute promyelocytic leukemia, adenocarcinoma, adult T-cell leukemia, astrocytoma, bladder cancer, bone cancer, brain tumor, breast cancer, Burkitt's lymphoma, carcinoma, cervical cancer, chronic lymphocytic leukemia, chronic myelogenous leukemia, colon cancer, colorectal cancer, endometrial cancer, glioblastoma multiforme, glioma, hepatocellular carcinoma, Hodgkin's lymphoma, inflammatory breast cancer, kidney cancer, leukemia, lung cancer, lymphoma, malignant mesothelioma, medulloblastoma, melanoma, multiple myeloma, neuroblastoma, non-Hodgkin lymphoma, non-small cell lung cancer, ovarian cancer, pancreatic cancer, pituitary tumor, prostate cancer, retinoblastoma, skin cancer, small cell lung cancer, squamous cell carcinoma, stomach cancer, T-cell leukemia, T-cell lymphoma, thyroid cancer, and Wilms' tumor.
25. The method of claim 14, wherein the RNA processing error comprises intron retention.
26. The method of claim 14, wherein the composition comprises an adjuvant.
27. The method of claim 26, wherein the adjuvant is selected from the group consisting of ABM2, AS01B, AS02, AS02A, Adjumer, Adjuvax, Algammulin, alum, aluminum phosphate, aluminum potassium sulfate, Bordetella pertussis, calcitriol, chitosan, cholera toxin, CpG, dibutyl phthalate, dimethyldioctadecylammonium bromide (DDA), Freund's adjuvant, Freund's complete, Freund's incomplete (IFA), GM-CSF, GMDP, gamma inulin, glycerol, HBSS (Hank's Balanced Salt Solution), polyinosinic-polycytidylic acid stabilized with polylysine and carboxymethylcellulose (poly-ICLC, also known as Hiltonol), IL-12, IL-2, imiquimod, interferon-gamma, ISCOM, lipid core peptide (LCP), lipofectin, lipopolysaccharide (LPS), liposomes, MF59, MLP+TDM, monophosphoryl lipid A, Montanide IMS-1313, Montanide ISA 206, Montanide ISA 720, Montanide ISA-51, Montanide ISA-50, nor-MDP, oil-in-water emulsion, P1005 (non-ionic copolymer), Pam3Cys (lipoprotein), pertussis toxin, poloxamer, QS21, RaLPS, Ribi, saponin, Seppic ISA 720, soybean oil, squalene, Syntex adjuvant formulation (SAF), synthetic polynucleotides (poly IC/poly AU), TiterMax tomatine, Vaxfectin, XtendIII, or Zymosan.
28. The method of claim 14, further comprising preparing a vector encoding the one or more detected immune reactive neoantigens.
29. The method of claim 14, wherein selecting one or more detected immune reactive neoantigens comprises selecting immune reactive neoantigens with a high positive rate in a specific cancer type.
30. The method of claim 14, wherein selecting one or more detected immune reactive neoantigens comprises selecting immune reactive neoantigens with a high positive rate across multiple cancer types.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] The novel features of the disclosure are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings of which:
DETAILED DESCRIPTION
[0068] Provided herein are methods and compositions for preventing, treating, and diagnosing cancer comprising the use of neoantigens. Neoantigens herein comprise peptides encoded by nucleic acids having frameshift mutations, such as insertions or deletions, causing a frameshift in the mRNA and a long stretch of mutant amino acids that are, in some cases, recognized as a non-self peptide by the immune system.
[0069] The success of checkpoint inhibitors in cancer therapy is largely attributed to activating the patient's immune response to their tumor's neoantigens arising from DNA mutations. This realization has motivated the interest in personal cancer vaccines based on sequencing the patient's tumor DNA to discover neoantigens. Embodiments provided herein relate to an additional, unrecognized source of tumor neoantigens. In some embodiments, errors in transcription of microsatellites (MS) and mis-splicing of exons create highly immunogenic frameshift (FS) neoantigens in tumors. The sequences of these FS neoantigens are predictable, allowing creation of a peptide array representing all possible neoantigen FS peptides. This array can be used to detect the antibody response in a patient to the FS peptides. A survey of 5 types of cancers reveals peptides that are personally reactive for each patient. This source of neoantigens and the method to discover them may be useful in developing cancer vaccines.
[0070] Personal cancer vaccines are promising as a new therapeutic treatment. These vaccines are currently based on mutations in tumor DNA. In some embodiments, variants in RNA production create frameshift neoantigens that may be another source of neoantigens for personal vaccines. Because there are only ~220K of these antigens, a simple peptide array can be used for their detection. Checkpoint inhibitor immunotherapeutics are revolutionizing cancer therapy. However, even in the most responsive cancers a substantial portion (50%-80%) of the patients have poor to no positive response (1-5). A surprising finding in the analysis of these patients was that one of the best correlates of response has been the total number of neoantigens in the tumor (6-8). This is also the case for patients with high microsatellite instability (MSI), where the production of FS neoantigens drives the effective anti-tumor immune responses (9-11). The realization of the immunological importance of these DNA mutations has spawned the effort to develop personal vaccines (12). As promising as early studies of these vaccines are, a major problem is that the majority of tumors will not have enough neoantigen-generating mutations to sustain development of a personal vaccine (13-15). For example, melanoma tumors have a high mutational level with an average of 200 neoepitope mutations. This provides a large number to algorithmically screen for optimal antigenic presentation. In recent reports of two Phase I clinical trials of personal melanoma vaccines, starting with 90-2,000 personal neoantigens, 10 or 20 were identified for the vaccine (16, 17). However, in glioblastoma multiforme (GBM) only 3.5% of patients had a high tumor mutation load, and further analysis showed that only a very small subset of GBM patients would potentially benefit from checkpoint blockade treatment (18). This is also consistent with a lack of response in GBM patients to checkpoint inhibitors (19). Massive genomic sequencing results indicated that GBM, ovarian cancer, breast adenocarcinoma and many other cancer types had very low numbers of non-synonymous mutations, which will make these cancers difficult targets for personalized cancer vaccines (14, 20).
[0071] To solve this problem, methods and compositions are provided herein related to an alternative source of neoantigens that expands the scope of application and the efficacy of neoantigen-based cancer vaccines. In the process of becoming a tumor, not only does the DNA mutation rate increase with faster cell divisions, but there is also a disruption of basic cellular functions, including RNA transcription, splicing and the quality control system on peptides (21). The disrupted RNA processes increase the FS transcripts generated by RNA splicing errors and the insertions and deletions (INDELs) of MSs (22). Both of these processes, combined with the disrupted quality control system in tumor cells, can lead to the production of FS peptides and exposure of the FS epitopes to the immune system. Embodiments provided herein relate to FS variants produced by errors in RNA processing as a source of cancer neoantigens and a simple system to detect them.
[0072] Disclosed herein are models for how errors in transcription of microsatellites and mis-splicing of exons could create frameshift neoantigens. Embodiments provided herein include examples in the RNA of tumors for both mis-splicing and mis-transcription of an INDEL where the errors are present at the RNA but not the DNA level. Also provided are methods for analysis of the NCBI EST library to reveal other examples of FS variants. On an array comprising all predicted FS peptides with specific qualifications, sera from human patients with 5 different cancers show higher antibody reactivity than sera from people without cancer. Three different patterns of high antibody reactivity can be determined: pan-cancer, cancer-type focused and personal. Several examples are presented demonstrating that the FS variants offer at least partial protection in mouse models and that the protection is additive for each FS antigen.
[0073] The methods and compositions provided herein indicate that variants produced at the RNA level in tumor cells may be a good source of neoantigens for vaccines for several reasons. First, these FS variants produce neoantigens which are more likely to be immunogenic than neo-epitopes encoded by single nucleotide mutations (7). Second, FS from MS INDELs would be particularly attractive sources. There are a limited number of possible variants (~8600 homopolymers of ≥7 bp), which encode about 7,000 FS peptides longer than 10 aa, thus reducing the search space for neoantigens. Third, because of the predictable number of candidates it should be possible to use a peptide array to screen for immune reactive neoantigens. This approach would be much simpler than sequencing tumor DNA obtained from a biopsy. Fourth, because any expressed gene has the potential to produce neoantigens, it may not be necessary to limit the vaccine to oncological driver genes. Finally, it should be difficult for the tumor cells to evolve away from the vaccine since these FSs are variants, not heritable mutations. Particularly if the FS antigen was produced in RNA from an essential gene, the tumor cells would need to restrict MHC presentation (17, 52) or create an immune suppressive environment (53) to escape an immune response.
[0074] Elements of the model are supported by other published work. The immunological reactivity of FS neoantigens is the presumed basis of the effectiveness of PD-1 in most MSI-H cancers (54, 55). It also explains the responsiveness of renal cancer to CPI therapy: these cancers have low point mutation levels but high FS mutations (3, 7, 20). It has also been shown that cancer cells have much higher mis-splicing rates than normal cells (39, 41, 56). Recently, Andre et al. (56) showed informatically that cancer cells could make neofusion sites by mis-splicing. However, their analysis did not include fusions that created FS peptides. Also, Alicia et al. (57) analyzed intron retention in tumor databases. This process can also create FS neoepitopes, though apparently much less frequently than mis-splicing of exons. The only aspects of the model not independently confirmed are 1) that the FS peptides potentially generated at the RNA level are made in tumors, 2) that the RNA-generated FSPs can generate immune responses, and 3) that these peptides can be protective against tumors. However, the methods provided herein support these 3 remaining aspects of the model.
[0075] An important aspect of this source of neoantigens is that it may allow extending the personal vaccines to more patients and tumor types. Many tumors have relatively low numbers of DNA mutations and probably could not support constructing a vaccine (58). Estimates from published mutational surveys of various tumors (59) indicate that only 40% of patients could be treated with personal vaccines. However, the methods and compositions provided herein predict that the RNA FS variants would be produced in any cancer type, even if the DNA mutation level is low. This is substantiated, for example, in GBM (
[0076] The model also predicts that there may be recurrent FSs produced in different tumors. This is substantiated, for example, at the RNA level for SMC1 FS in breast cancers (
[0077] Sets of FS peptides were found that had enriched activity in individual tumor types. A collection of a set of these peptides could potentially be used to constitute a general, therapeutic vaccine or one focused on a particular tumor or set of tumors. Such vaccines would have an advantage over a personal vaccine of being pre-made but would have fewer antigens in common with the tumor. Conceivably, pan-cancer peptides could be used to create a prophylactic cancer vaccine, as has been proposed for cancer associated antigens (60). However, as shown in comparing late and early stage pancreatic cancer profiles, a prophylactic vaccine from FS neoantigens would be best constituted from peptides reactive at early stages of cancer. Clinical trials in dogs were recently initiated of a prophylactic vaccine that is designed to be broadly protective (data not shown).
[0078]
[0079] In
[0080] The vaccines tested did not produce complete protection by themselves in the models tested. However, it should be noted that both these models are very stringent and probably do not completely replicate natural tumors. One reason for this may be due to low level production of each FS neoantigen, consistent with the additive effects of the FS peptides in vaccines (
[0081] The arrays detect antibody responses to FS peptides. B-cell responses are not commonly considered important for an anti-tumor effect. It was recently shown that antibodies generated by dogs with cancer could be detected on an 800 FS peptide array. Peptides reactive on the dog array, whose homolog was also present in a mouse tumor line, were protective in the mouse models, while non-reactive peptides on the array did not confer protection. This study establishes that antibody response is an indicator of vaccine effectiveness. The level of antibody response correlated with protection in the mouse models. One explanation for this observation is that the IgG antibody response depends on CD4+ T-cell help. FS with good CD4+ T cell epitopes may also elicit tumor cell killing. It has been noted that CD4+ T cell responses to vaccines correlate with protection (66, 67).
[0082] In summary, the methods and compositions provided herein relate to another class of neoantigens that are useful in developing different types of cancer vaccines. Also provided herein are array formats for directly detecting immune responses to these tumor antigens. Dog and human clinical trials for use of the tumor antigens identified by the methods disclosed herein are underway.
[0083] As used herein, the term “detect,” “detection,” “detectable,” or “detecting” is understood both on a quantitative and a qualitative level, as well as a combination thereof. It thus includes quantitative, semi-quantitative, and qualitative measurements of a cancer in a subject, using the methods and compositions as disclosed herein.
[0084] As used herein, the expression “a subject in need thereof” means a human or non-human mammal that exhibits one or more symptoms or indications of cancer, and/or who has been diagnosed with cancer, including a solid tumor and who needs treatment for the same. In many embodiments, the term “subject” may be interchangeably used with the term “patient”. For example, a human subject may be diagnosed with a primary or a metastatic tumor and/or with one or more symptoms or indications including, but not limited to, unexplained weight loss, general weakness, persistent fatigue, loss of appetite, fever, night sweats, bone pain, shortness of breath, swollen abdomen, chest pain/pressure, enlargement of spleen, and elevation in the level of a cancer-related biomarker.
[0085] The term “malignancy” refers to a non-benign tumor or a cancer. As used herein, the term “cancer” includes a malignancy characterized by deregulated or uncontrolled cell growth. Exemplary cancers include: carcinomas, sarcomas, leukemias, and lymphomas. Cancer includes primary malignant tumors (e.g., those whose cells have not migrated to sites in the subject's body other than the site of the original tumor) and secondary malignant tumors (e.g., those arising from metastasis, the migration of tumor cells to secondary sites that are different from the site of the original tumor). A cancer may include, for example, gastric, myeloid, colon, nasopharyngeal, esophageal, and prostate tumors, glioma, neuroblastoma, breast cancer, lung cancer, ovarian cancer, colorectal cancer, thyroid cancer, leukemia (e.g., Adult T-cell leukemia, Acute monocytic leukemia, Acute myeloid leukemia, Acute promyelocytic leukemia, myelogenous leukemia, lymphocytic leukemia, acute myelogenous leukemia (AML), chronic myeloid leukemia (CML), acute lymphoblastic leukemia (ALL), T-lineage acute lymphoblastic leukemia or T-ALL, chronic lymphocytic leukemia (CLL), myelodysplastic syndrome (MDS), hairy cell leukemia), lymphoma (Hodgkin's lymphoma (HL), non-Hodgkin's lymphoma (NHL)), multiple myeloma, bladder, renal, gastric (e.g., gastrointestinal stromal tumors (GIST)), liver, melanoma and pancreatic cancer, sarcoma, Adenocarcinoma, Astrocytoma, Bone Cancer, Brain Tumor, Burkitt's lymphoma, Carcinoma, Cervical Cancer, Chronic Lymphocytic Leukemia, Chronic myelogenous leukemia, Endometrial cancer, Glioblastoma multiforme, Glioma, Hepatocellular carcinoma, Hodgkin's lymphoma, Inflammatory breast cancer, Kidney Cancer, Leukemia, Lymphoma, Malignant Mesothelioma, Medulloblastoma, Melanoma, Multiple myeloma, Neuroblastoma, Non-Hodgkin Lymphoma, Non-Small Cell Lung Cancer, Pancreatic Cancer, Pituitary tumor, Retinoblastoma, Skin Cancer, Small Cell Lung Cancer, Squamous cell carcinoma, Stomach cancer, T-cell leukemia, T-cell lymphoma, and Wilms' tumor.
[0086] As used herein the term “frameshift mutation” is a mutation causing a change in the frame of the protein. Thus, a frameshift variant peptide is a peptide in which a frame has changed due to a frameshift mutation. In some embodiments provided herein, a frameshift includes two or more pooled frameshifts. As used herein, the term “pooled” refers to a plurality of frameshift samples that have been combined to create a new composition.
[0087] As used herein, the term “microsatellite instability,” also known as “MSI,” refers to the changes in microsatellite repeats in tumor cells or genetic hypermutability caused by deficient DNA mismatch repair. Microsatellites, also known as simple sequence repeats, are repeated sequences of DNA comprising repeating units 1-6 base pairs in length. Although the length of microsatellites is highly variable from person to person and contributes to the DNA fingerprint, each individual has microsatellites of a set length. MSI results from the inability of the mismatch repair (MMR) proteins to fix a DNA replication error. MSI comprises DNA polymorphisms, wherein the replication errors vary in length instead of sequence. MSI comprises frame-shift mutations, either through insertions or deletions, or hypermethylation, leading to gene silencing. It is known in the art that microsatellite instability may result in colon cancer, gastric cancer, endometrial cancer, ovarian cancer, hepatobiliary tract cancer, urinary tract cancer, brain cancer, and skin cancers.
EXAMPLES
Example 1: Materials and Methods for Isolating Neoantigens
[0088] Cell Lines and Tissues
[0089] HEK293, B16-F10 and 4T1 cell lines were purchased from ATCC in 2006. Upon receipt, cells were cultured for three passages in RPMI medium (ATCC) with 10% FBS, 100 U/mL penicillin, and 100 µg/mL streptomycin and stored in aliquots under liquid nitrogen. Cells were maintained at 37° C. under humidified 5% CO₂, 95% air. Cells between 2 and 20 passages were used. Cell lines were not re-authenticated. Other cell lines are listed in Table 2 and were cultured in ATCC-recommended media.
[0090] Mice and Mouse Tumor Models
[0091] BALB/c and C57BL/6 mice were from Charles River Laboratories or Jackson Laboratories. For the tumor challenge, 5×10³ 4T1 cells were injected in the mammary pad at the right flank of the mice, or 1×10⁵ B16F10 cells were injected subcutaneously in the right flank of the mice. Tumor volumes were measured daily and calculated as (Length² × Width/2) once the size was larger than 1 mm³. Breeding pairs of BALB-neuT and FVB-neuN (FVB/N-Tg (MMTVneu) 202Mul) mice were obtained from Joseph Lustgarten, Mayo Clinic Arizona. Mice were monitored weekly for tumor incidence after tumor size reached 1 mm³. All experiments were performed in accordance with protocols approved by the Institutional Animal Care and Use Committee of Arizona State University.
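By way of illustration only, the volume calculation and the 1 mm³ measurement threshold stated above can be sketched as follows. The function names and the use of millimeters as the input unit are assumptions made for this example and are not part of the described protocol.

```python
def tumor_volume_mm3(length_mm: float, width_mm: float) -> float:
    """Volume estimate using the formula stated in the text: Length^2 x Width / 2."""
    if length_mm < 0 or width_mm < 0:
        raise ValueError("caliper measurements must be non-negative")
    return length_mm ** 2 * width_mm / 2.0


def is_recordable(volume_mm3: float, threshold_mm3: float = 1.0) -> bool:
    """Per the text, volumes are tracked once the tumor is larger than 1 mm^3."""
    return volume_mm3 > threshold_mm3


# Hypothetical caliper readings in millimeters (length, width).
readings = [(1.0, 1.0), (2.0, 1.0), (3.0, 2.0)]
volumes = [tumor_volume_mm3(length, width) for length, width in readings]
recordable = [v for v in volumes if is_recordable(v)]
```

In this sketch a 1.0 mm by 1.0 mm reading gives 0.5 mm³ and therefore falls below the recording threshold.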
[0092] Statistical significance of differences was analyzed by a Student t-test.
[0093] EST Analysis
[0094] To identify putative chimeric transcripts that, when translated, would result in a frame-shifted neo-peptide, two publicly available datasets were used and an algorithm for identifying chimeric transcripts was applied. Specifically, the sequences found within the Expressed Sequence Tag (EST) library and the Human RefSeq database (23) from the National Center for Biotechnology Information (NCBI) were used. Using the stand-alone BLAST program, all EST sequences were aligned to RefSeq. ESTs that aligned with 50-85 base pairs and had 95% homology to RefSeqs that have been previously annotated by the National Cancer Institute (NCI) were selected. The alignment data was filtered by eliminating the EST sequences that did not align to multiple RefSeqs or were aligned in the 3′-5′ orientation. Lastly, the sequences that aligned with non-coding sequence regions were eliminated. The remaining EST sequences were then used to identify the chimeric transcripts. Only the ESTs that aligned to two or more distinct RefSeqs in consecutive positions were considered to be potential candidates. To be defined as a coding chimeric transcript, the EST sequence had to be at least 100 bp long with sequence similarity greater than or equal to 95% to the RefSeq. Also, the junction point between the two genes had to occur within the coding sequence of the upstream gene, and the upstream gene alignment had to be in the positive (5′-3′) orientation. To eliminate false calls, all potential chimeric EST sequences had to be either present in more than one cDNA library or supported by three or more independent EST sequences. In addition, chimeric transcripts were classified based on the relative position of the two fusion genes on the chromosome. Specifically, genes found on different chromosomes resulted in inter-chromosomal fusions, while genes found on the same chromosome resulted in intra-chromosomal or read-through chimeric transcripts. Read-through chimeric transcripts resulted from two neighboring genes on the same strand; all other same-chromosome fusions were classified as intra-chromosomal.
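For illustration, the support filter and the classification rules described above can be expressed as a short sketch. The data structure, the function names, and the `are_neighbors` flag (how gene adjacency is determined is not specified in the text) are assumptions introduced for this example; the BLAST alignment step itself is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneLocus:
    """Hypothetical minimal description of a gene's genomic position."""
    name: str
    chromosome: str
    strand: str  # "+" or "-"


def passes_support_filter(est_length_bp: int, identity_pct: float,
                          n_cdna_libraries: int, n_independent_ests: int) -> bool:
    """Thresholds from the text: >=100 bp, >=95% similarity, and support from
    more than one cDNA library or from three or more independent ESTs."""
    well_aligned = est_length_bp >= 100 and identity_pct >= 95.0
    supported = n_cdna_libraries > 1 or n_independent_ests >= 3
    return well_aligned and supported


def classify_chimera(upstream: GeneLocus, downstream: GeneLocus,
                     are_neighbors: bool) -> str:
    """Classify a chimeric transcript by the relative position of its two genes."""
    if upstream.chromosome != downstream.chromosome:
        return "inter-chromosomal"
    if are_neighbors and upstream.strand == downstream.strand:
        return "read-through"
    return "intra-chromosomal"


# Hypothetical gene pairs used only to exercise the rules.
gene_a = GeneLocus("GENE_A", "chr1", "+")
gene_b = GeneLocus("GENE_B", "chr1", "+")
gene_c = GeneLocus("GENE_C", "chr7", "-")
labels = [
    classify_chimera(gene_a, gene_b, are_neighbors=True),
    classify_chimera(gene_a, gene_b, are_neighbors=False),
    classify_chimera(gene_a, gene_c, are_neighbors=False),
]
```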
[0095] PCR Screen for EST FS Candidates
[0096] The 50 human breast cancer cell lines were obtained from the American Type Culture Collection (ATCC) and were grown according to recommendations. Human breast cancer tissue specimens were acquired from Mayo Clinic with informed consent and approval by the Mayo Clinic Institutional Review Board. All specimens were coded and anonymized. All experiments were performed in accordance with the approved protocol. Total RNA was extracted from breast cancer cell lines and primary breast tissues using the TRIzol LS reagent (Life Technologies, Carlsbad, Calif.) following the manufacturer's protocol. RNA integrity was determined by gel electrophoresis and concentration was determined by measuring absorbance at 260/280 on the Nano-drop (NanoDrop Products, Wilmington, Del.). cDNA was prepared by using the SuperScript™ III First-Strand Synthesis SuperMix (Life Technologies, Carlsbad, Calif.), which includes random hexamers and oligo dT's, following the manufacturer's recommended protocol. cDNA integrity and quality were assessed by performing a β-actin control PCR. End-point PCR primers for each chimeric transcript were designed using Primer3 (24) so that the forward and reverse primers both bind 80 bp to 280 bp upstream/downstream from the junction point. End-point PCR reactions using approximately 25 ng of cDNA, reagents from Life Technologies (Carlsbad, Calif.) and 35 cycles were performed using a Mastercycler ep gradient S (Eppendorf, Hamburg, Germany). PCR products were analyzed on 1.5% agarose gels. PCR products were purified, and their sequences were confirmed by sequencing on an Applied Biosystems 3730 (Life Technologies, Carlsbad, Calif.).
[0097] End-Point RT-PCR
[0098] cDNAs from human primary breast tumors and normal mammary glands were from BioChain (Newark, Calif.). Total RNA from other sources was extracted with TRIzol (Life Technologies, Carlsbad, Calif.). cDNA was synthesized from total RNA using the SuperScript III First-Strand Synthesis SuperMix (Life Technologies). The primer sequences used for end-point RT-PCR were synthesized by Life Technologies or Sigma. End-point RT-PCR reactions (25 µL) used the GoTaq PCR kit (Promega, Madison, Wis.) and the following conditions: 95° C. for 2 min; 35 cycles of 95° C. for 30 sec, 60° C. for 30 sec (annealing), and 72° C. for 10 to 30 sec (extension); and 72° C. for 5 min. Exceptions were that mouse SMC1A primers used an annealing temperature of 55° C., and β-actin primers were run with 25 cycles and 30 sec of extension time. Sequence verification was performed on RT-PCR products in initial reactions and later during intermittent reactions. The following primers (from 5′ to 3′) for the PCR were used: SEC62 DNA human forward: TGCCATACCTGTTITITCCC (SEQ ID NO: 1); SEC62 human DNA reverse: AGTTATCTCAGGTAGGTGTTGC (SEQ ID NO: 2); SEC62 DNA dog forward: AAGGGAGTCTGTGGTTGA (SEQ ID NO: 3); SEC62 DNA dog reverse: CAAAGAGGGAAGAGAGTGG (SEQ ID NO: 4); SEC62 cDNA human forward: AAAGGAAAAGCTGAAAGTGGAA (SEQ ID NO: 5); SEC62 human cDNA reverse: GCAACAGCAAGGAGAAGAATAC (SEQ ID NO: 6); SEC62 cDNA dog forward: AAGGGAGTCTGTGGTTGA (SEQ ID NO: 7); SEC62 cDNA dog reverse: CAAAGAGGGAAGAGAGTGG (SEQ ID NO: 8); SMC1A mouse forward: CTGTCATGGGTTTCCTG (SEQ ID NO: 9); SMC1A mouse reverse: GAGCTGTCCTCTCCTTG (SEQ ID NO: 10); SMC1A human forward: CCTGAAACTGATTGAGATTGAG (SEQ ID NO: 11); SMC1A human reverse: TCTTCAGCCTTCACCATTTC (SEQ ID NO: 12); β-actin mouse forward: CCAACCGTGAAAAGATGACC (SEQ ID NO: 13); β-actin mouse reverse: TGCCAATAGTGATGACCTGG (SEQ ID NO: 14); β-actin human forward: CCAACCGCGAGAAGATGACC (SEQ ID NO: 15); β-actin human reverse: TGCCAATGGTGATGACCTGG (SEQ ID NO: 16); Rat Her-2 forward: ATCGGTGATGTCGGCGATAT (SEQ ID NO: 17); Rat Her-2 reverse: GTAACACAGGCAGATGTAGGA (SEQ ID NO: 18).
[0099] Sec62 Transfection and Flow Analysis
[0100] The HEK293 cell line was purchased from ATCC and cultured using standard protocols. Plasmids were transfected into the cells overnight with Lipofectamine 2000 Transfection Reagent (Thermo Fisher Scientific, MA). Cells were then prepared in FACS buffer and quantified by flow cytometry. The three open reading frames (ORFs) were assembled by PCR and inserted into the pCMVi vector at the EcoR I site of the multiple cloning site (MCS). Detailed sequences of the three ORFs are included in Table 6.
[0101] Gene Expression
[0102] Gene expression was measured with the TaqMan Gene Expression Assay (Life Technologies) according to the manufacturer's directions. The hSMC1A-specific labeled probe was 5′-CAATGGCTCTGGGTGCTGTGGAATC-3′ (SEQ ID NO: 19). The unlabeled forward and reverse primers were 5′-GGGTCGACAGATTATCGGACC-3′ (SEQ ID NO: 20) and 5′-GTCATACTCCTGCGCCAGCT-3′ (SEQ ID NO: 21), respectively. Results were normalized to human GAPDH.
Example 2: Human Frameshift Peptide Array Synthesis and Analysis
[0103] Microsatellite frameshift antigens: Human mRNA sequences were acquired from the NCBI CCDS database (25). Microsatellite regions (homopolymer runs of 7 or more repeats) were mapped to human coding genes. The 2.sup.nd and 3.sup.rd reading frame peptide sequences downstream of each MS region were predicted and stored in a microsatellite FS database. MS FS peptides 10 aa or longer were included in the human FS peptide array.
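The microsatellite frameshift prediction described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used to build the array: the function names and the toy coding sequence are hypothetical, and the 2.sup.nd and 3.sup.rd reading frames are modeled here as a one-base deletion and a one-base insertion at the end of each homopolymer run.

```python
import re
from itertools import product

# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate from position 0 until the first stop codon or the last full codon."""
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def find_homopolymers(cds, min_len=7):
    """Yield (start, end, base) for every homopolymer run of at least min_len bases."""
    for m in re.finditer(r"([ACGT])\1{%d,}" % (min_len - 1), cds):
        yield m.start(), m.end(), m.group(1)


def ms_frameshift_peptides(cds, min_run=7, min_pep=10):
    """Return (run_start, variant, peptide) for frameshift peptides downstream of each run."""
    peptides = []
    for start, end, base in find_homopolymers(cds, min_run):
        # Assumption: a frameshift is modeled as a 1-nt deletion or insertion in the run.
        variants = {
            "del1": (cds[:end - 1] + cds[end:], end - 1),
            "ins1": (cds[:end] + base + cds[end:], end + 1),
        }
        for label, (mutated, downstream) in variants.items():
            # Start at the codon containing the first base after the (mutated) run.
            codon_start = (downstream // 3) * 3
            pep = translate(mutated[codon_start:])
            if len(pep) >= min_pep:
                peptides.append((start, label, pep))
    return peptides


# Toy CDS (hypothetical): ATG GCC, a run of 7 A, then ten GCT codons and a stop.
toy_cds = "ATGGCC" + "A" * 7 + "GCT" * 10 + "TAA"
for hit in ms_frameshift_peptides(toy_cds):
    print(hit)
```

A production version would also need to handle ambiguous bases and would read the coding sequences from the CCDS records rather than from a literal string.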
[0104] Mis-splicing frameshift antigens: Human mRNA sequences and exon coordinates were acquired from the NCBI RefSeq database (23). The 2.sup.nd and 3.sup.rd reading frame FS peptide sequences were predicted from the start of every exon. All FS peptides were then aligned against the human proteome, and FS peptides with greater than 98% homology to the wild-type proteome were removed. FS peptides 10 aa or longer were then included in the human FS peptide array. Table 7 depicts exemplary variant FS peptides.
[0105] A total of 64 non-cancer control samples, 13 stage 1 pancreatic cancer samples, and 85 late stage cancer samples from 5 cancer types were tested on the FS array; detailed information is summarized in Table 5. All samples were acquired from collaborators and were collected with informed consent under each institute's own IRB. All samples were anonymized before receipt at Arizona State University (ASU) under Institutional Review Board (IRB) protocol No. STUDY00003722, ‘Receipt of Deidentified Human Serum for Immunosignature Analysis’ and protocol No. 0912004625, ‘Profiling Biological Sera for Unique Antibody Signatures’. All experiments were performed in accordance with the approved protocols.
[0106] 400K Frameshift Peptide Array Assay
[0107] Serum was diluted 1:100 in binding buffer (0.01M Tris-HCl, pH 7.4, 1% alkali-soluble casein, 0.05% Tween-20), and 150 μl of diluted sample was loaded into each compartment of the 12-plex array and incubated overnight at room temperature or 4° C. After sample binding, the arrays were washed 3× in wash buffer (1×TBS, 0.05% Tween-20), 10 min per wash. Primary sample binding was detected via Alexa Fluor® 647-conjugated goat anti-human IgG secondary antibody (Jackson ImmunoResearch #109-605-098). The secondary antibody was diluted 1:10,000 (final concentration 0.15 ng/μl) in secondary binding buffer (1×TBS, 1% alkali-soluble casein, 0.05% Tween-20). Arrays were incubated with secondary antibody for 3 h at room temperature, washed 3× in wash buffer (10 min per wash), rinsed for 30 sec in reagent-grade water, and then dried by centrifuging at 690 RPM for 5 min. All washes and centrifugations were done on a Little Dipper 650C Microarray Processor (SciGene) with preset programs. The fluorescent signal of the secondary antibody was detected by scanning at 635 nm at 2 μm resolution and 15% gain, using an MS200 microarray scanner (Roche NimbleGen).
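As a quick arithmetic check on the dilutions in this paragraph, the stated 1:10,000 dilution and 0.15 ng/μl final concentration together imply a secondary antibody stock of roughly 1.5 mg/ml, and each 150 μl load at 1:100 contains 1.5 μl of serum. The helper names below are illustrative only and are not part of the described protocol.

```python
def stock_concentration(final_conc, dilution_factor):
    """Concentration before dilution, in the same units as final_conc."""
    return final_conc * dilution_factor


def undiluted_volume(total_volume, dilution_factor):
    """Volume of undiluted sample contained in total_volume of a 1:dilution_factor mix."""
    return total_volume / dilution_factor


# Secondary antibody: 0.15 ng/ul after a 1:10,000 dilution.
stock_ng_per_ul = stock_concentration(0.15, 10_000)
# Serum: 150 ul of a 1:100 dilution per array compartment.
serum_ul = undiluted_volume(150, 100)

# 1000 ng/ul equals 1 ug/ul, which equals 1 mg/ml.
print(f"implied secondary stock: {stock_ng_per_ul / 1000:.2f} mg/ml")
print(f"serum per compartment: {serum_ul:.1f} ul")
```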
Example 3: Genetic Immunization
[0108] Plasmids for Genetic Immunization
[0109] The DNA fragments encoding FS peptides were cloned as C-terminal fusions into the genetic immunization vectors pCMVi-UB (26) and pCMVi-LSrCOMPTT (27, 28) using the Bgl II and Hind III sites, and the two constructs were mixed at a 1:1 ratio as the vaccine antigen. Three adjuvants were encoded by genetic immunization vectors. The pCMVi-mGM-CSF vector expresses the adjuvant mouse granulocyte/macrophage colony-stimulating factor (mGM-CSF) under control of the human cytomegalovirus (CMV) promoter (27). LTAB indicates immunization with a 1:5 ratio by weight of two plasmids, pCMVi-LTA and pCMVi-LTB, expressing the heat-labile enterotoxins LTA and LTB from Escherichia coli. These plasmids express LTA and LTB as C-terminal fusions to the secretion leader sequence from the human α1 antitrypsin gene (29). Vectors pCMVi-UB, pCMVi-LSrCOMPTT, pCMVi-LTA (also called pCMVi-LS-LTA-R192G) and pCMVi-LTB are available from the PSI:Biology-Materials Repository DNASU (dnasu.org) at Arizona State University. Additional adjuvants were the class A CpG 2216 single-stranded oligodeoxynucleotide obtained from Sigma and alum from Pierce.
[0110] Bullet Preparation for Genetic Immunization with Gene Gun
[0111] Bullets for biolistic genetic immunization used the gold micronanoplex approach and were prepared as described (30) with the following changes. Two grams of 1-micron gold was used. Prior to addition of N-hydroxysuccinimide and N-(3-dimethylaminopropyl)-N′-ethylcarbodiimide hydrochloride, the gold was resuspended in 20 mL of a 0.1 M solution of 2-(N-morpholino) ethanesulfonic acid (MES), pH 6.0. DNA-gold micronanoplexes were prepared by combining, per bullet, 57 μL of cysteamine-gold solution with precipitated DNA (≤10 μg) that had been resuspended in ≤15 μL of water, and then vortexing for 10 min. To the DNA-cysteamine-gold was added 6 μL/bullet of a freshly made solution of PEI-micron gold (167 mg/mL in 0.1 M MES, pH 6, without NaCl). The pelleted micronanoplexes were washed with ethanol prior to resuspension in n-butanol (55 μL/bullet), followed by bullet formation under nitrogen gas.
[0112] Immunization Dosage and Regime and Tumor Challenge
[0113] C57BL/6-B16-F10 Mouse Melanoma Model
[0114] Six-week-old mice (n=10 per group) received one genetic immunization with the Gene Gun in the pinna of the ear (4 shots/mouse) with 20 ng of antigen (SMC1A-1^4 and non-protective Cowpox viral antigen CPV 172 (31)) in pCMVi vectors plus the adjuvants pCMVi-mGM-CSF (0.5 μg) and CpG 2216 (5 μg) for each shot. All of the mice were challenged with 1×10.sup.5 B16-F10 cells 4 weeks after the immunization.
[0115] BALB/C-4T1 Mouse Breast Tumor Model
[0116] For the three MS FS experiments, all mice (n=10 per group) were genetically immunized in the ear by Gene Gun at 8 weeks of age (2 shots/mouse, 60 ng pooled antigens plus 0.25 μg LTAB and 2.5 μg CpG2216 as the adjuvant for each shot) and boosted twice (two days apart) three weeks later with 1 μg pooled antigens plus the same adjuvant dosage. All mice were boosted again two weeks later with 50 μg KLH-conjugated MS FS peptides with 50 μg CpG 2216 and 50 μl alum in a total of 100 μl PBS. The negative groups were immunized with the empty vectors and KLH protein at the same dosage. All mice were challenged with 5×10.sup.3 4T1 cells two weeks after the last immunization.
[0117] For the mSMC1A-1^4 experiment, all mice (n=10 per group) were genetically immunized in the ear by Gene Gun at 8 weeks of age (2 shots/mouse, 1 μg antigen plus 0.25 μg LTAB and 2.5 μg CpG2216 as the adjuvant for each shot), and boosted two weeks later with KLH-conjugated SMC1A-1^4 peptide plus 50 μg Poly:IC (Sigma) in 100 μl PBS. The same regime was repeated two weeks later. The negative groups were immunized with the empty vectors and KLH protein at the same dosage. All mice were challenged with 5×10.sup.3 4T1 cells 4 weeks after the last immunization. CD8 and CD4 T cell depletion started 2 weeks after the last immunization by i.p. injection of 100 μg antibody (anti CD8, clone 2.43; anti CD4, clone GK 1.5; BioXCell, West Lebanon, N.H.) every 3 days until the end of the experiment.
[0118] BALB-neuT Mice
[0119] Mice were genetically immunized by Gene Gun at 4-6 weeks with 100 ng of antigen(s) in pCMVi vectors, boosted twice (3-4 days apart) at 9-10 weeks with 1 μg of the same antigen(s), and boosted once at 13-14 weeks with protein. Genetic immunizations included adjuvants LTAB (0.5 μg) and CpG 2216 (5 μg). Protein boosts were 50 μg of KLH-conjugated FS peptides (SMC1A-1^4, n=32; RBM FS, n=22; SLAIN2 FS, n=14; and a pool of three FS neoantigens, n=37). The protein boost included 50 μg CpG 2216 and 50 μl alum in 100 μl PBS as the adjuvant. The negative groups (n=30) were immunized with the empty vectors and GST or KLH protein with the same adjuvants and dosage.
[0120] ELISPOT
[0121] Peptides used in the ELISPOT assays were synthesized in-house. The Mouse IFN-γ ELISPOT Set (BD Biosciences) was used according to the manufacturer's directions except that blocking was at 37° C. A total of 10.sup.6 fresh mouse splenocytes were added to each well, followed by co-culturing for 48 hr with 20 μg of peptide in a volume of 200 μl RPMI medium. The plate was scanned and spots were analyzed by the AID EliSpot Reader System (Autoimmun Diagnostika GmbH, Germany).
[0122] Statistical Analysis
[0123] Statistical calculations were performed with GraphPad Prism 7 (GraphPad Software, San Diego, Calif.) and JMP Pro (SAS Institute, NC). The data presentation and the statistical tests for each experiment are indicated in the legend of the corresponding figure, together with the sample size and p-values.
Example 4: Model for the Production of RNA-Based Frameshift Variants
[0124] Errors in RNA splicing and transcription in cancer cells, particularly INDELs at MSs in coding regions, may also be a source of neoantigens.
[0125] As seen in
Example 5: Detection of Frameshift Transcripts
[0126] This model makes several specific predictions. First, frequent FS variants in different cancers will be produced by errors in RNA splicing and transcription, not by DNA mutations. As an example of a splicing error, substantial levels of an FS transcript, SMC1A 1^4 (exon 1 to exon 4), from the gene SMC1A were found in different mouse and human tumors (
[0127] The analysis of RNA-generated FS variants was expanded by comparing NCBI tumor EST libraries to normal EST libraries. To simplify the analysis, it focused on FS variants caused by exon skipping or by trans-splicing, i.e., splicing of exons from different genes. A total of 12,456 exon skipping variants and 5,234 trans-splicing variants were found (
[0128] Another source of FS transcripts in tumors predicted by embodiments of the model provided herein is INDELs in MSs generated in transcription. As an example, the microsatellite region in the Sec62 gene contains 9 and 11 repeats of Adenine in human and dog, respectively. The sequence of Sec62 and the corresponding INDEL frameshift peptides are shown in
[0129] To further validate INDELs arising in transcription and translation of the resulting FS peptide, three plasmids based on the dog Sec62 gene were constructed. In the first, eGFP was fused in the 3.sup.rd reading frame to the MS region of 11 A in the dog Sec62 CDS, so that the eGFP protein will be correctly translated if there is one A insertion during transcription. In the second, the negative control, the 11 A run was replaced with 11 nucleotides of non-MS sequence, so that there is no MS-related INDEL in transcription and no expression of eGFP. In the third, the positive control, the 11 A run was replaced with 12 A, so that eGFP is in the 1st reading frame and is translated with the upstream dog Sec62 gene. (
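The reading-frame logic behind the three reporter plasmids can be sketched as simple modular arithmetic. This is an illustration only: the function name is hypothetical, and the convention that the 3.sup.rd reading frame corresponds to an offset of two nucleotides is an assumption chosen to agree with the statement that a single A insertion restores eGFP translation.

```python
def reporter_in_frame(reporter_frame, indel):
    """Return True if the reporter is read in frame with the upstream ORF.

    reporter_frame: designed frame of the reporter (1, 2, or 3) relative to the
        upstream coding sequence; frame 1 means already in frame.
    indel: net nucleotides inserted (+) or deleted (-) in the transcript
        upstream of the reporter.
    """
    offset = reporter_frame - 1  # assumption: frame 3 corresponds to an offset of 2 nt
    return (offset + indel) % 3 == 0


# Test construct: eGFP in the 3rd frame after the 11 A run.
print(reporter_in_frame(3, 0))    # faithful transcription: reporter out of frame
print(reporter_in_frame(3, +1))   # one A inserted: reporter in frame
# Positive control: 12 A places eGFP in the 1st frame without any INDEL.
print(reporter_in_frame(1, 0))
```

Under the same arithmetic, a two-nucleotide deletion would also bring a 3.sup.rd-frame reporter into frame, so reporter signal alone does not distinguish these two events.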
Example 6: Detection of Antibodies to Frameshift Peptides
[0130] The model also predicts that the increased expression of FS variants, combined with other aberrant proteins, would overwhelm the quality control system and could potentially elicit immune responses to these FS peptides. To test this, an array was designed containing all predicted RNA-defined frameshift peptides, meeting specific qualifications, that a tumor cell could produce from INDELs in coding MSs and from mis-splicing of exons.
[0131] There are over 8000 MSs in the coding region of the human genome that are homopolymer runs of 7 or more repeats. The majority of MS regions meeting the selection criteria are A runs, and the number of MS candidates decreases exponentially as the repeat length or frameshift peptide length criteria increase. Each MS could generate 2 predictable FS peptides, depending on whether there was an insertion or a deletion. In addition, there are ~200,000 possible FS peptides that could be generated by mis-splicing of exons in the human genome, such as the examples of mis-splicing FSs. Similar to MS FSs, the number of mis-splicing FSs decreases exponentially as the FS peptide length requirement increases. Most mis-splicing FSs are generated from the first 10 exons of human genes. For both sources of FS, peptides were required to be longer than 10 amino acids. By these criteria there are over 220,000 possible FS antigens. Each FS antigen longer than 15 aa was divided into 15 aa, non-overlapping peptides. This produced a total of ~400,000 peptides. Peptides sharing more than 10 aa of identical sequence with any human reference protein were excluded. Finally, each FS array was designed to contain a total of 392,318 FS peptides (
[0132] NimbleGen (Roche, Madison, Wis.) synthesized the FS peptide array, processed the array assay and summarized the IgG signals of each array with their standard protocol (51). Specific IgG reactivity to these FSPs was analyzed in 64 non-cancer control samples, a total of 85 cancer samples from 5 different late stage cancer types with 17 samples each (LC: lung cancer, BC: breast cancer, GBM: glioblastoma, GC: gastric cancer, PC: pancreatic cancer), and 12 stage I pancreatic cancer samples.
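The tiling and exclusion steps in this paragraph can be sketched as follows. This is a hedged illustration, not the design software used for the array: the function names and example sequences are hypothetical, "more than 10 aa identical" is interpreted here as a contiguous identical stretch of at least 11 residues, and the trailing remainder of each antigen is kept because the text does not state how it was handled.

```python
def tile_peptide(antigen, size=15):
    """Split an antigen into consecutive, non-overlapping windows of `size` residues."""
    if len(antigen) <= size:
        return [antigen]
    return [antigen[i:i + size] for i in range(0, len(antigen), size)]


def reference_kmers(proteins, k=11):
    """Collect every contiguous k-mer found in the reference proteins."""
    return {p[i:i + k] for p in proteins for i in range(len(p) - k + 1)}


def shares_long_match(peptide, kmers, k=11):
    """True if the peptide contains any k-mer that also occurs in the reference set."""
    return any(peptide[i:i + k] in kmers for i in range(len(peptide) - k + 1))


def design_array(antigens, reference_proteins, size=15, k=11):
    """Tile each antigen and drop tiles that share a long identical stretch with the reference."""
    kmers = reference_kmers(reference_proteins, k)
    tiles = [t for a in antigens for t in tile_peptide(a, size)]
    return [t for t in tiles if not shares_long_match(t, kmers, k)]


# Hypothetical reference protein and FS antigens, for illustration only.
reference = ["MKTAYIAKQRQISFVKSHFSRQ"]
antigens = ["WWTAYIAKQRQISWW",        # contains an 11-residue match: excluded
            "TAYIAKQRQIWWWWW",        # only a 10-residue match: kept
            "GHWPLDCNGHWPLDCNGHWP"]   # 20 aa: tiled into 15 aa + 5 aa
print(design_array(antigens, reference))
```

Building the k-mer set once keeps the exclusion test to a set lookup per window, which matters when several hundred thousand candidate peptides are screened against a full proteome.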
[0133] Each array was normalized to its median fluorescence for analysis. Three patterns of FS feature reactivity that were higher in cancer than non-cancer were found: common reactivity against FS peptides across all 5 cancer types; cancer type-specific reactivity; and personal reactivity. Reactivity against ~7000 selected peptides is shown in
[0134] Total reactivity on the 400K arrays was evaluated in the 5 cancer types and non-cancer samples with two methods. The first method compares the number of significant peptides in the cancer and control samples using fold change and p-values. By this method, BC, GC, PC and LC cancer samples had significantly more FS peptides than control samples that met the fold change and p-value criteria described in
[0135] Analysis of individual cancer samples within the same cancer type using the scoring method showed three patterns of reactivity. Most of the positive FS peptides (69% to 80%) were personal to that individual. However, 16% to 19% of the positive peptides were shared between two samples in that cancer type, and 1.5% to 6.9% were shared among 3 or more. The distribution of these classes is shown in
[0136] Embodiments of the model provided herein predict that an FS peptide with high antibody reactivity is highly immunogenic and/or highly expressed in the tumor cells. These FS peptides could be cancer vaccine candidates. Analysis of the distribution of positive peptides allows the formulation of 3 types of potential vaccines. One type is a personal vaccine. As an example, the personal vaccines for the 17 GBM patients are shown. Each patient had ~5800 positive FS peptides using the 6SD cut-off criterion, with ~4500 positive FS peptides being unique to that patient (
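Per-array median normalization can be sketched in a few lines. This is a minimal illustration that assumes normalization means dividing every feature intensity by the array's own median; the function name and the example intensities are hypothetical.

```python
from statistics import median


def median_normalize(intensities):
    """Divide every feature intensity on one array by that array's median intensity."""
    m = median(intensities)
    if m <= 0:
        raise ValueError("median intensity must be positive")
    return [x / m for x in intensities]


# Two hypothetical arrays with different overall brightness.
array_a = [100, 200, 400]
array_b = [1000, 2000, 4000]
print(median_normalize(array_a))
print(median_normalize(array_b))
```

After this step a value of 1.0 means a feature sits at its own array's median, so arrays scanned at different overall brightness become directly comparable.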
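The three sharing classes can be tallied with a simple count of how many samples are positive for each peptide. This sketch uses hypothetical sample and peptide identifiers and classifies peptides across one cohort; it does not reproduce the scoring method or the per-sample percentages reported above.

```python
from collections import Counter


def classify_sharing(positives_by_sample):
    """Group positive peptides by the number of samples in which they are positive."""
    counts = Counter(p for peps in positives_by_sample.values() for p in set(peps))
    classes = {"personal": set(), "shared_by_2": set(), "shared_by_3_plus": set()}
    for peptide, n in counts.items():
        if n == 1:
            classes["personal"].add(peptide)
        elif n == 2:
            classes["shared_by_2"].add(peptide)
        else:
            classes["shared_by_3_plus"].add(peptide)
    return classes


# Hypothetical positive-peptide calls for three samples of one cancer type.
cohort = {
    "sample_1": {"pepA", "pepB", "pepC"},
    "sample_2": {"pepB", "pepC", "pepD"},
    "sample_3": {"pepC", "pepE"},
}
for name, members in classify_sharing(cohort).items():
    print(name, sorted(members))
```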
[0137] As noted in
[0138] Finally, it was determined whether there were FS peptides common across all 5 cancer types that met the p-value and frequency requirements. In
[0139] All of the samples used for this analysis were from patients with late stage cancer. Cancer vaccines could also potentially be used for treatment of early stage cancers, and it is unclear whether early and late stage cancer vaccines would require different components. The 20,000 most reactive and recurrent peptides were compared to non-cancer for both late stage and stage 1 pancreatic cancer. As evident in
Example 7: Frameshift Peptides Offer Partial Protection as Vaccines
[0140] The data presented herein show that FS variants are present at the RNA level in tumors and that antibody responses to these FS peptides are present in cancer patients. However, the clinically relevant question is whether these FS variants can afford therapeutic value as vaccines; this question is explored using mouse tumor models.
[0141] It was determined whether the SMC1A 1^4 FS peptide confers protection in the B16F10 mouse melanoma cancer model and/or the 4T1 mouse breast cancer model. This FS variant was shown to be common in both human and these mouse tumors (
[0142] It was tested whether the detection of FS variants in the RNA correlated with protection. The SLAIN2 and ZDHHC17 FSs had been identified in sequencing B16F10 cDNA. The SLAIN2 FS was present in the 4T1 mammary cancer cell line, but ZDHHC17 FS was not (
[0143] The model (
[0144] Embodiments of the model provided herein also predict that each tumor cell will present multiple FS neoantigens. These peptides could be presented at low levels, as only a fraction of each RNA would be defective. Therefore, multiplexing neoantigens in a vaccine would be predicted to be more protective. To test this prediction, three FS neoantigens were tested individually and pooled together as vaccines in the BALB-NeuT transgenic mouse mammary cancer model. Each FS neoantigen-based vaccine individually showed similar protection, significantly delaying tumor growth. As predicted, the pooled neoantigen vaccine produced a significant additive increase in the delay of tumor initiation and growth (
[0145] Furthermore, as shown in
REFERENCES
[0146] 1. J. W. Riess, P. N. Lara, Jr., D. R. Gandara, Theory Meets Practice for Immune Checkpoint Blockade in Small-Cell Lung Cancer. J Clin Oncol, (2016). [0147] 2. D. Schadendorf et al., Pooled Analysis of Long-Term Survival Data From Phase II and Phase III Trials of Ipilimumab in Unresectable or Metastatic Melanoma. J Clin Oncol 33, 1889-1894 (2015). [0148] 3. R. J. Motzer et al., Nivolumab versus Everolimus in Advanced Renal-Cell Carcinoma. N Engl J Med 373, 1803-1813 (2015). [0149] 4. E. B. Garon et al., Pembrolizumab for the treatment of non-small-cell lung cancer. N Engl J Med 372, 2018-2028 (2015). [0150] 5. J. Larkin et al., Combined Nivolumab and Ipilimumab or Monotherapy in Untreated Melanoma. N Engl J Med 373, 23-34 (2015). [0151] 6. A. M. Goodman et al., Tumor Mutational Burden as an Independent Predictor of Response to Immunotherapy in Diverse Cancers. Mol Cancer Ther 16, 2598-2608 (2017). [0152] 7. S. Turajlic et al., Insertion-and-deletion-derived tumour-specific neoantigens and the immunogenic phenotype: a pan-cancer analysis. Lancet Oncol 18, 1009-1021 (2017). [0153] 8. N. A. Rizvi et al., Cancer immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science 348, 124-128 (2015). [0154] 9. S. Bae, J. Tie, J. Desai, P. Gibbs, Microsatellite instability status is critical to analysis of survival in stage II colon cancer. J Clin Oncol 30, 675-676; author reply 676-677 (2012). [0155] 10. K. Bauer et al., T cell responses against microsatellite instability-induced frameshift peptides and influence of regulatory T cells in colorectal cancer. Cancer Immunol Immunother 62, 27-37 (2013). [0156] 11. J. C. Dudley, M. T. Lin, D. T. Le, J. R. Eshleman, Microsatellite Instability as a Biomarker for PD-1 Blockade. Clin Cancer Res 22, 813-820 (2016). [0157] 12. R. H. Vonderheide, K. L. Nathanson, Immunotherapy at large: the road to personalized cancer vaccines. Nat Med 19, 1098-1100 (2013). [0158] 13. A. 
Vitiello, M. Zanetti, Neoantigen prediction and the need for validation. Nat Biotechnol 35, 815-817 (2017). [0159] 14. Z. R. Chalmers et al., Analysis of 100,000 human cancer genomes reveals the landscape of tumor mutational burden. Genome Med 9, 34 (2017). [0160] 15. B. Vogelstein et al., Cancer genome landscapes. Science 339, 1546-1558 (2013). [0161] 16. P. A. Ott et al., An immunogenic personal neoantigen vaccine for patients with melanoma. Nature 547, 217-221 (2017). [0162] 17. U. Sahin et al., Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. Nature 547, 222-226 (2017). [0163] 18. T. R. Hodges et al., Mutational burden, immune checkpoint expression, and mismatch repair in glioma: implications for immune checkpoint immunotherapy. Neuro Oncol 19, 1047-1057 (2017). [0164] 19. A. C. Filley, M. Henriquez, M. Dey, Recurrent glioma clinical trial, CheckMate-143: the game is not over yet. Oncotarget 8, 91779-91794 (2017). [0165] 20. C. Kandoth et al., Mutational landscape and significance across 12 major cancer types. Nature 502, 333-339 (2013). [0166] 21. D. Hanahan, R. A. Weinberg, Hallmarks of cancer: the next generation. Cell 144, 646-674 (2011). [0167] 22. J. F. Gout et al., The landscape of transcription errors in eukaryotic cells. Sci Adv 3, e1701484 (2017). [0168] 23. N. A. O'Leary et al., Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res 44, D733-745 (2016). [0169] 24. A. Untergasser et al., Primer3Plus, an enhanced web interface to Primer3. Nucleic Acids Res 35, W71-74 (2007). [0170] 25. K. D. Pruitt et al., The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes. Genome Res 19, 1316-1323 (2009). [0171] 26. K. F. Sykes, S. A. Johnston, Genetic live vaccines mimic the antigenicity but not pathogenicity of live viruses. DNA Cell Biol 18, 521-531 (1999). [0172] 27. R. S. 
Chambers, S. A. Johnston, High-level generation of polyclonal antibodies by genetic immunization. Nat Biotechnol 21, 1088-1092 (2003). [0173] 28. D. T. Hansen et al., Polyclonal Antibody Production for Membrane Proteins via Genetic Immunization. Sci Rep 6, 21925 (2016). [0174] 29. G. C. Whitlock et al., Protective antigens against glanders identified by expression library immunization. Front Microbiol 2, 227 (2011). [0175] 30. S. A. Svarovsky, M. J. Gonzalez-Moa, M. D. Robida, A. Y. Borovkov, K. Sykes, Self-assembled micronanoplexes for improved biolistic delivery of nucleic acids. Mol Pharm 6, 1927-1933 (2009). [0176] 31. A. Borovkov et al., New classes of orthopoxvirus vaccine candidates by functionally screening a synthetic library for protective antigens. Virology 395, 97-113 (2009). [0177] 32. J. F. Gout, W. K. Thomas, Z. Smith, K. Okamoto, M. Lynch, Large-scale detection of in vivo transcription errors. Proc Natl Acad Sci USA 110, 18584-18589 (2013). [0178] 33. B. Schwanhausser et al., Global quantification of mammalian gene expression control. Nature 473, 337-342 (2011). [0179] 34. M. Imashimizu, T. Oshima, L. Lubkowska, M. Kashlev, Direct assessment of transcription fidelity by high-resolution RNA sequencing. Nucleic Acids Res 41, 9090-9104 (2013). [0180] 35. H. S. Zaher, R. Green, Fidelity at the molecular level: lessons from protein synthesis. Cell 136, 746-762 (2009). [0181] 36. S. Lykke-Andersen, T. H. Jensen, Nonsense-mediated mRNA decay: an intricate machinery that shapes transcriptomes. Nat Rev Mol Cell Biol 16, 665-677 (2015). [0182] 37. A. Ruggiano, O. Foresti, P. Carvalho, Quality control: ER-associated degradation: protein quality control and beyond. J Cell Biol 204, 869-879 (2014). [0183] 38. J. E. Bradner, D. Hnisz, R. A. Young, Transcriptional Addiction in Cancer. Cell 168, 629-643 (2017). [0184] 39. S. C. Lee, O. Abdel-Wahab, Therapeutic targeting of splicing in cancer. Nat Med 22, 976-986 (2016). [0185] 40. T. I. Lee, R. A. 
Young, Transcriptional regulation and its misregulation in disease. Cell 152, 1237-1251 (2013). [0186] 41. S. Oltean, D. O. Bates, Hallmarks of alternative splicing in cancer. Oncogene 33, 5311-5318 (2014). [0187] 42. S. Negrini, V. G. Gorgoulis, T. D. Halazonetis, Genomic instability—an evolving hallmark of cancer. Nat Rev Mol Cell Biol 11, 220-228 (2010). [0188] 43. C. Y. Lin et al., Transcriptional amplification in tumor cells with elevated c-Myc. Cell 151, 56-67 (2012). [0189] 44. D. Silvera, S. C. Formenti, R. J. Schneider, Translational control in cancer. Nat Rev Cancer 10, 254-266 (2010). [0190] 45. P. L. Lollini et al., Vaccines and other immunological approaches for cancer immunoprevention. Curr Drug Targets 12, 1957-1973 (2011). [0191] 46. M. Goldman et al., The UCSC Cancer Genomics Browser: update 2015. Nucleic Acids Res 43, D812-817 (2015). [0192] 47. C. A. Maher et al., Transcriptome sequencing to detect gene fusions in cancer. Nature 458, 97-101 (2009). [0193] 48. C. A. Maher et al., Chimeric transcript discovery by paired-end transcriptome sequencing. Proc Natl Acad Sci USA 106, 12353-12358 (2009). [0194] 49. M. T. Chang et al., Identifying recurrent mutations in cancer reveals widespread lineage diversity and mutational specificity. Nat Biotechnol 34, 155-163 (2016). [0195] 50. R. J. Hause, C. C. Pritchard, J. Shendure, S. J. Salipante, Classification and characterization of microsatellite instability across 18 cancer types. Nat Med 22, 1342-1350 (2016). [0196] 51. B. Forsstrom et al., Proteome-wide epitope mapping of antibodies using ultra-dense peptide arrays. Mol Cell Proteomics 13, 1585-1597 (2014). [0197] 52. M. Sade-Feldman et al., Resistance to checkpoint blockade therapy through inactivation of antigen presentation. Nat Commun 8, 1136 (2017). [0198] 53. M. D. Vesely, R. D. Schreiber, Cancer immunoediting: antigens, mechanisms, and implications to cancer immunotherapy. Ann N Y Acad Sci 1284, 1-5 (2013). [0199] 54. D. T. 
Le et al., Mismatch repair deficiency predicts response of solid tumors to PD-1 blockade. Science 357, 409-413 (2017). [0200] 55. D. T. Le et al., PD-1 Blockade in Tumors with Mismatch-Repair Deficiency. N Engl J Med 372, 2509-2520 (2015). [0201] 56. A. Kahles et al., Comprehensive Analysis of Alternative Splicing Across Tumors from 8,705 Patients. Cancer Cell 34, 211-224 e216 (2018). [0202] 57. A. C. Smart et al., Intron retention is a source of neoepitopes in cancer. Nat Biotechnol 36, 1056-1058 (2018). [0203] 58. S. D. Martin et al., Low Mutation Burden in Ovarian Cancer May Limit the Utility of Neoantigen-Targeted Vaccines. PLoS One 11, e0155189 (2016). [0204] 59. T. N. Schumacher, R. D. Schreiber, Neoantigens in cancer immunotherapy. Science 348, 69-74 (2015). [0205] 60. T. Kimura et al., MUC1 vaccine for individuals with advanced adenoma of the colon: a cancer immunoprevention feasibility study. Cancer Prev Res (Phila) 6, 18-26 (2013). [0206] 61. L. A. Vella et al., Healthy individuals have T-cell and antibody responses to the tumor antigen cyclin B1 that when elicited in mice protect from cancer. Proc Natl Acad Sci USA 106, 14010-14015 (2009). [0207] 62. D. W. Cramer et al., Conditions associated with antibodies against the tumor-associated antigen MUC1 and their relationship to risk for ovarian cancer. Cancer Epidemiol Biomarkers Prev 14, 1125-1131 (2005). [0208] 63. P. Stafford et al., Physical characterization of the “immunosignaturing effect”. Mol Cell Proteomics 11, M111 011593 (2012). [0209] 64. G. P. Dunn, A. T. Bruce, H. Ikeda, L. J. Old, R. D. Schreiber, Cancer immunoediting: from immunosurveillance to tumor escape. Nat Immunol 3, 991-998 (2002). [0210] 65. D. B. Keskin et al., Neoantigen vaccine generates intratumoral T cell responses in phase Ib glioblastoma trial. Nature 565, 234-239 (2019). [0211] 66. S. Kreiter et al., Mutant MHC class II epitopes drive therapeutic immune responses to cancer. Nature 520, 692-696 (2015). [0212] 67. C. 
Linnemann et al., High-throughput epitope discovery reveals frequent recognition of neo-antigens by CD4+ T cells in human melanoma. Nat Med 21, 81-85 (2015).
TABLE-US-00001 TABLE 1
RefSeq_ID | Encoded FS peptide | Joint_pos | #Total_EST | EST_Ids | #Total_Lib | #Tumor_lib | #Normal_lib
NM_001640.3 | SPSQAMWATRM (SEQ ID NO: 22) | 1940-2047 | 7 | 14679393, 16524005, 18802412, 18807797, 19365353, 19366001, 33261912 | 3 | 3 | 0
NM_199002.1 | GVGGGILPPETPPVSAWGELCPPAWLHL (SEQ ID NO: 23) | 2623-2788 | 3 | 10264060, 19733507, 23301501 | 3 | 3 | 0
NM_014154.2 | RHEKCCNWKQQAESQSHCFRSCSKIVVLASARNLKHRAEN (SEQ ID NO: 24) | 370-448 | 5 | 20492217, 22518928, 45367569, 146009855, 146104793 | 3 | 3 | 0
NM_001686.3 | TTNPSRISLPSWVWMNFLRKTS (SEQ ID NO: 25) | 1183-1398 | 4 | 11106585, 12431398, 19143008, 20486863 | 4 | 3 | 1
NM_004217.2 | DHGGVGRCSNVLPWEEGDSQRHKARKSALRAQGRAEDC (SEQ ID NO: 26) | 317-599 | 3 | 10342556, 14654109, 22671315 | 3 | 3 | 0
NM_016561.2 | WSCSSITGAAGNLNTTSWSTRLWPNGRRKKLSSGWSSWALGHLFTGKGFYLNE (SEQ ID NO: 27) | 545-751 | 6 | 19376801, 28113628, 45652559, 47036548, 52114251, 52114353 | 4 | 3 | 1
NM_024808.2 | FSLKMSSYPLLGLIMKGNSFHNVIPVNALT (SEQ ID NO: 28) | 379-426 | 4 | 9808442, 17166915, 146059308, 146063843 | 4 | 3 | 0
NM_013265.2 | PCTGLSLHPMAPRIWSRWSFPAGRCQDRPNKHVWPPQKKKKKKKKKKK (SEQ ID NO: 29) | 2168-2397 | 4 | 4311385, 46230323, 46834109, 47020765 | 4 | 4 | 0
NM_020314.5 | GSADRDDGKV (SEQ ID NO: 30) | 1339-1540 | 4 | 8407623, 9889142, 10213802, 80934926 | 4 | 3 | 1
NM_018553.3 | CYQHPFPKKSQFPGAYWTSFEGEEEGSGQLTLPGP (SEQ ID NO: 31) | 1845-2308 | 3 | 8618242, 14448310, 14469670 | 3 | 3 | 0
NM_134447.1 | GFAASWLFKKPRPSECHTVIFKEESYMN (SEQ ID NO: 32) | 1419-2068 | 20 | 2111082, 3151384, 3405187, 3801503, 5395116, 5446288, 5636075, 6451167, 7152982, 7319964, 8634237, 8634238, 19587294, 19753219, 21251126, 23295375, 24791739, 24792974, 154727570, 154730372 | 18 | 12 | 4
NM_152266.3 | DAAFFMSPKLIWWQEMATERGLFGLEIPIILKEL (SEQ ID NO: 33) | 224-283 | 4 | 10744663, 11064241, 22668651, 32210516 | 4 | 3 | 1
NM_080571.1 | CFTSSPLRW (SEQ ID NO: 34) | 241-360 | 7 | 12272400, 20501581, 22824741, 45697997, 46272730, 146043981, 146121376 | 7 | 4 | 0
NM_178448.3 | RVQGTLVHCPTRHLSQRRGPGRQRGNSLPEPSSMLTCPQQPHRATFPAAPGLQGCPRTGPSQPSMQLPSYPEDGSGLSRGHKDVRPGPPGQERVQVLRACAPQPQHQVDCSAVGGPVAAREKPPVSRLGSAHQGLPTSAFEGACHALGDPGIFTGLEAGDRTVSVPG (SEQ ID NO: 35) | 2485-2522 | 4 | 10217199, 13329041, 14652514, 71054789 | 4 | 3 | 1
NM_000070.2 | CLQKHLPVALSTSLC (SEQ ID NO: 36) | 2741-3083 | 3 | 2222976, 4124403, 7038190 | 3 | 3 | 0
NM_032830.2 | MTSLLSSHHPLKRRNLEP (SEQ ID NO: 37) | 1977-2102 | 13 | 1720716, 2269339, 4332045, 5397085, 5638770, 5769282, 7317235, 11450365, 13719026, 13734654, 24787788, 24808260, 45860690 | 9 | 6 | 2
NM_032830.2 | LLSSHHPLKRRNLEP (SEQ ID NO: 38) | 1986-2111 | 4 | 13908790, 18392074, 46257227, 92180377 | 4 | 3 | 0
NM_001040648.1 | TSASQIQAILVP (SEQ ID NO: 39) | 1865-2258 | 3 | 4630123, 4899627, 5676137 | 3 | 3 | 0
NM_001161452.1 | LLLQLRPGSRPFPVTYVSVTGRQPYKSW (SEQ ID NO: 40) | 889-1669 | 4 | 1940552, 3933437, 13402321, 14509526 | 4 | 4 | 0
NM_001039712.1 | AAAAAHHHSPRPAALRHPQEETGCVP (SEQ ID NO: 41) | 225-429 | 3 | 13914233, 21175318, 45699401 | 3 | 3 | 0
NM_015954.2 | LLQPPFVFIPPGCVML (SEQ ID NO: 42) | 263-412 | 5 | 10202290, 11101998, 13284397, 15434305, 52108714 | 5 | 3 | 1
NM_213566.1 | SPKLPLVRRWMQ (SEQ ID NO: 43) | 540-731 | 3 | 9183529, 10729953, 13583484 | 3 | 3 | 0
NM_001384.4 | LPCSSLTSYWEMLWLWLHDWRRRQGQRCSFWVTQPTAAAAWMCWVLSKLELRLSYILALPA (SEQ ID NO: 44) | 292-345 | 9 | 9141503, 9341726, 9720673, 11614383, 12102395, 13326770, 22703054, 22813642, 56794883 | 7 | 6 | 1
NM_130443.2 | HFPACQLLPLCDLISSALPYVE (SEQ ID NO: 45) | 2342-2439 | 4 | 6594041, 6974193, 24809933, 31153484 | 4 | 3 | 0
NM_001402.5 | CLQNWWYWYCSCWPSGDWCSQTRYGGHLCSSQRYNGSKICRNAP (SEQ ID NO: 46) | 119-818 | 4 | 10201484, 16001157, 19093438, 19204512 | 3 | 3 | 0
NM_014285.5 | GFWSRFPPPW (SEQ ID NO: 47) | 448-528 | 5 | 9137001, 46278258, 145993595, 146042851, 146123968 | 5 | 3 | 1
NM_001113378.1 | VSPGVSELRRNSKKYGKAGEAVWFSSDPPVLFFHFLRTE (SEQ ID NO: 48) | 3439-3628 | 4 | 6444477, 6870295, 6870449, 83195477 | 3 | 3 | 0
NM_001018078.1 | VLGSQRHPGQGSCGSCPWHLCSSPHPTCGSGFGTRSGRAGRRCCGAGPSPGTWTVRTPPAARRPACAGSARRCRAARGRAVAPRFESCSSMLPGTGTRRPC (SEQ ID NO: 49) | 860-1009 | 3 | 10218110, 19144710, 46186123 | 3 | 3 | 0
NM_006098.4 | GWPGHVMGSQRRQTPLHARWWGHHQRPVLQP (SEQ ID NO: 50) | 637-748 | 8 | 2574599, 9807168, 13524413, 33203609, 52715305, 58413416, 58566171, 90906220 | 7 | 5 | 1
NM_015666.3 | GPRGHAGEGGRQSCGRPVLRGR (SEQ ID NO: 51) | 390-507 | 4 | 13133604, 145997763, 146023828, 146095508 | 4 | 3 | 1
NM_016426.6 | VQMKMMKSSSDPLDIKKDVLLPAWN (SEQ ID NO: 52) | 291-350 | 24 | 14072238, 14079103, 14080406, 14176079, 52197802, 52282171, 52282469, 52282506, 52282657, 84914016, 145998391, 146023882, 146039486, 146039586, 146040214, 146050613, 146052038, 146057369, 146057491, 146062991, 146072037, 146080660, 146102605, 146107434 | 15 | 9 | 3
NM_031243.2 | EGVLLQVTNEEVVNHRVFKK (SEQ ID NO: 53) | 1300-3179 | 4 | 12422802, 13033025, 13047121, 24132471 | 4 | 3 | 1
NM_031243.2 | KEGVLLQVTNEEVVNHRVFKK (SEQ ID NO: 54) | 2581-3176 | 3 | 2466855, 4569115, 5659331 | 3 | 3 | 0
NM_006644.2 | DSCGIVNSY (SEQ ID NO: 55) | 2925-3176 | 6 | 2077398, 10153160, 10993881, 12672555, 13911640, 51668448 | 5 | 4 | 1
NM_006644.2 | DSCGIVNSY (SEQ ID NO: 56) | 2924-3175 | 6 | 4074102, 10032700, 10153621, 19588875, 19608035, 45695863 | 4 | 3 | 1
NM_024660.2 | NCPVWRHNPCLASWMSWRCWKS (SEQ ID NO: 57) | 441-821 | 6 | 2033361, 9124825, 9141800, 9332671, 10216854, 23253517 | 6 | 4 | 1
NM_014761.2 | IVGPGPKPEASAKLPSRPADNYDNFVLPELPSVPDTLPTASAGASTSASEDIDFDDLSRRFEEL (SEQ ID NO: 58) | 916-955 | 5 | 19137983, 19193502, 28140121, 46922603, 283449919 | 5 | 3 | 1
NM_001130089.1 | VGSMPKELLGESSSSMIFEERG (SEQ ID NO: 59) | 450-617 | 7 | 9329185, 9335633, 9336682, 14810814, 22365213, 45711902, 45715554 | 5 | 4 | 1
NM_199187.1 | HRDSRGSGRNGRHPEREGDHAKPERPPGLLPGQSEEPGDREPEAGEQNPGALGEEGTPGQRLEPLLQDHRGPEGSDLRKYCGQCPHRSAD (SEQ ID NO: 60) | 197-260 | 18 | 10141571, 10142544, 10402934, 10586887, 12758550, 14177331, 19893549, 21768308, 21774629, 21774763, 21777572, 21811780, 21815923, 22682079, 22908399, 24042754, 24045349, 56795793 | 10 | 8 | 2
NM_002273.3 | LLRSRHSTRILPTAAGLRLRACTRSSMRSCRAWLGSTGMTCGAQRLRSLR (SEQ ID NO: 61) | 831-881 | 9 | 9340416, 9759824, 9759932, 9897110, 9897831, 10156714, 21813841, 21814354, 21816557 | 4 | 4 | 0
NM_ RCQPDRHSHIW 1731- 6 2054843, 4 3 1 177433.1 
ALRWPWWSWC 1809 5511019, QHQWQLWCLW 5673765, FLLQV (SEQ ID 5853954, NO: 62) 20203884, 23531396, NM_ ETPSDSDHKKK 590- 8 3151481, 5 3 1 153450.1 KKKKEEDPERK 830 3750732, RKKKEKKKKK 4223069, VE (SEQ ID NO: 6139460, 63) 11444683, 11451179, 11452422, 18988750, NM_ AGNVRSNSRPSI 749- 4 2252141, 4 3 1 015950.3 QR (SEQ ID NO: 824 3277351, 64) 19588584, 23291327, NM_ PASGGSDLVNH 541- 3 12308492, 3 3 0 032112.2 SFLCKWHP 717 12339978, (SEQ ID NO: 65) 22813610, NM_ CLLLGAVTL 599- 5 10154760, 3 3 0 032112.2 (SEQ ID NO: 66) 713 13408766, 20201885, 20493143, 21494992, NM_ EIPERNQGPVAA 237- 5 1295506, 5 4 1 014018.2 IRS (SEQ ID NO: 420 6898484, 67) 10246880, 33209502, 34555226, NM_ LHWGSTKVHLL 415- 4 10738994, 4 3 1 001145839.1 LI (SEQ ID NO: 801 80835964, 68) 146091479, 146109603, NM_ GGPRRIWS 410- 8 10147163, 4 4 0 001114185.1 (SEQ ID NO: 69) 712 10989026, 16773154, 16776609, 16779119, 22853771, 22902798, 145986212, NM_ GGPRRIWS 419- 5 16773347, 4 3 1 000431.2 (SEQ ID NO: 70) 721 16777501, 28132078, 47402601, 146062357, NM_ RSVKWSPNTMQ 452- 5 9151226, 5 4 0 003491.2 MGRTPMP (SEQ 499 9345658, ID NO: 71) 19210146, 27947049, 126672362, NM_ VPTACCRCCFC 788- 16 9141107, 12 9 1 024313.2 WDV (SEQ ID 2696 9803380, NO: 72) 12427492, 13328739, 13908466, 14678515, 21780385, 21785028, 22345418, 22361309, 22361754, 46290768, 68292178, 82116561, 90837311, 92186397, NM_ SGKTSSILCRRG 391- 13 2054860, 10 8 2 016391.4 RWRWS (SEQ ID 483 2932939, NO: 73) 2942143, 3601044, 4535246, 5425877, 5438639, 5596079, 5659519, 7151152, 19723340, 19738445, 24795292, NM_ AGDAVLGAHTQ 151- 3 12766042, 3 3 0 007243.1 RPCVVGGSG 349 46616730, (SEQ ID NO: 74) 145998555, NM_00 GAKPGGLALGA 533- 3 3887573, 3 3 0 1042549.1 V (SEQ ID NO: 12194 4991027, 75) 6451223, NM_ DEVFALPLAHL 426- 5 10326854, 5 4 1 181843.1 LQTQNQGYTHF 498 11970552, CRGGHFRYTLP 18510936, VFLHGPHRVWG 19030548, LTAVITEFALQL 46555631, LAPGTYQPRLA GLTCSGAEGLA RPKQPLASPCQ ASSTPGLNKGL (SEQ ID NO: 76) NM_ QENCSNPGGRG 2448- 4 1192583, 4 3 0 198887.1 
CSDPRSCHFTPA 3467 3280105, WAKEQNAISKN 5636736, IHI (SEQ ID NO: 24792671, 77) NM_ AKFCPTFNKSM 704- 5 11617690, 5 3 1 007342.2 EEQGK (SEQ ID 782 52065044, NO: 78) 52097801, 52298172, 80768446, NM_ GLWLFRPQNVL 257- 9 3988478, 5 4 0 001199462.1 QMPQSILLQQG 383 4076572, ASDPRLEIGT 4268320, (SEQ ID NO: 79) 4268335, 6700534, 10373888, 10984780, 11512824, 11512860, NM_ DYRRLPPGPAN 2760- 4 9808150, 4 3 1 002618.3 FFCIFSRDGVSP 3281 11159219, CYPGWSPSPDL 13459444, VMSPLRSPKVL 22920343, GLQA (SEQ ID NO: 80) NM_ PLRRPCTRSCW 465- 3 14807581, 3 3 0 031948.3 GQGS (SEQ ID 629 19210482, NO: 81) 146069312, NM_ CDLNSLCIFVAI 1630- 5 2159346, 4 3 1 004577.3 FHTKCFKCGESI 1940 13709277, KHLYS (SEQ ID 13742243, NO: 82) 14506129, 27939669, NM_ GTIVVQWGPSW 269- 18 2277936, 14 10 3 020387.2 CLT (SEQ ID NO: 466 9146588, 83) 10156678, 10742718, 14380528, 14511202, 19128358, 19180556, 19196633, 19199578, 19199919, 23272326, 24184393, 38619719, 52187412, 52187724, 52259400, 52288970, NM_ GLWMVVRSVW 338- 6 10885369, 4 3 1 006743.4 IMQASLLGEPEE 445 12600212, VALGPMGVVA 12600293, ATLEVVGTRAM 13460579, GVAGIMTVDLE 19132700, GMDMDMDVPE 21168881, TIMAETRVVMT ATQEEITETIMTT (SEQ ID NO: 84) NM_ SLPPNPSAARET 165- 3 1679208, 3 3 0 016026.3 KGISPIKDSKCV 546 22269010, FPRTSPGKDPLP 80545142, (SEQ ID NO: 85) NM_ GLFVFPIYCLC 1017- 5 10400124, 5 3 1 152553.2 (SEQ ID NO: 86) 1133 13908341, 14428408, 52261877, 83255255, NM_ EVWRHLLGRPH 427- 5 21985536, 4 3 1 198486.2 S (SEQ ID NO: 538 21986341, 87) 145986153, 145999838, 146106725, NM_ IRELCHRYLPQP 225- 8 1154529, 6 5 1 000973.3 (SEQ ID NO: 88) 453 6937038, 9128356, 19091430, 19200294, 20486488, 22907262, 24044064, NM_ GVRQWQHLQP 646- 13 9124850, 8 7 1 001002.3 (SEQ ID NO: 89) 754 10205674, 13031883, 13403621, 13466151, 13666955, 14173427, 14175419, 19817898, 19895213, 21816494, 22689525, 47384119, NM_ GLLWCAAVHH 285- 3 9125003, 3 3 0 001005.3 GEWGQRLRGC 381 9139471, GVWETPRTEG 22695855, (SEQ ID NO: 90) NM_ FGKAHGASW 614- 6 10160942, 4 4 0 001006.3 (SEQ ID NO: 91) 
725 12602739, 19378611, 21773234, 22849872, 22908519, NM_ GDGGSGSKGRP 1088- 3 1801795, 3 3 0 138421.2 VEQTEVFLCISK 1286 7155873, PSSFL (SEQ ID 16771906, NO: 92) NM_ LHARAPGPRGP 1708- 4 4890586, 4 4 0 017827.3 PLLCPCCLRVSH 1833 5746185, (SEQ ID NO: 93) 13915028, 23284022, NM_ LPQQDLWHLQF 1164- 3 9896956, 3 3 0 001005914.1 HQGLPRRCHPV 1377 52185731, CAEPPPHVQLCP 80585087, AHWGAPSFPTS WSQLHLHSNCR GPGCSR (SEQ ID NO: 94) NM_ GIFELFIL (SEQ 328- 4 19184218, 4 3 1 021627.2 ID NO: 95) 463 52117054, 80576973, 82328796, NM_ GIGAVCMDWW 1086- 4 19211503, 4 3 0 001193342.1 AAAPPGECAPR 1200 146039032, PGCAAHHCGHR 146045087, LLH (SEQ ID NO: 146056161, 96) NM_ SPCPSSPPSQPW 1096- 4 21176693, 4 3 1 001532.2 (SEQ ID NO: 97) 1137 24044445, 28133989, 80539035, NM_ VLSDLGCAAGK 343- 4 13997158, 4 3 1 178148.2 SDDPQLWGHSH 499 46283786, ITG (SEQ ID NO: 78233770, 98) 80883909, NM_ CCGIYCHEEPQR 179- 6 10204155, 4 3 1 006306.2 EDSSI (SEQ ID 482 10350966, NO: 99) 20396212, 20413818, 52288176, 84940096, NM_ HFPDGEVTAER 1593- 10 1162267, 10 7 2 030918.5 CGHLAFPYPLPF 2370 2324233, PSPPSSYSFHVP 2356934, FQTE (SEQ ID 2552335, NO: 100) 2557157, 3765160, 4328216, 12300356, 24781036, 24803854, NM_ ISVSIMWTQRRK 269- 5 24952240, 5 3 0 006461.3 L (SEQ ID NO: 862 45703140, 101) 46182693, 46185076, 52109618, NM_ VKGVLHSLTAA 1055- 6 2952696, 6 5 1 006925.3 GQTH (SEQ ID 1428 4286279, NO: 102) 18979142, 21477426, 21982089, 24787231, NM_ KHQAMDHHGV 426- 7 9183882, 5 4 1 006374.3 PGRRLSTGLA 477 11256565, (SEQ ID NO: 103) 17161793, 17163262, 17174422, 22286625, 24120773, NM_ GDQQPDRTQAG 2877- 8 6883317, 8 6 2 014760.3 LKSVSQVEDVF 2907 10991109, RELIGTQKTRTG 12385448, CFPPSGS (SEQ 21770848, ID NO: 104) 46184886, 58050995, 82074179, 91879091, NM_ CSAQARNRSED 2451- 8 9149080, 3 3 0 006521.4 ETQPLPLGTLLA 2492 9330710, F (SEQ ID NO: 9331155, 105) 9336773, 9344551, 9344576, 10734097, 10734771, NM_ HQALGAVPSCE 112- 6 16526130, 4 3 1 199293.2 GV (SEQ ID NO: 370 45700010, 106) 45704764, 45705693, 45717940, 46847261, NM_ 
QFRTPGWPLKA 543- 10 3933593, 4 3 0 207379.1 LAGRGWPEDAS 1313 3933605, PGQEPSKGAGR 4111770, GWA (SEQ ID 4312229, NO: 107) 4684269, 6504772, 6838403, 10031991, 10940483, 11083896, NM_ PRAAVSGIQQW 2087- 11 9176343, 5 4 1 006291.2 WNGRQNWKRK 2545 10210944, KEKMSSRLAGA 11290536, FRVLWRAVSTA 19369027, SIRRHIQVAPRP 22342759, LQAGPAMGP 22374168, (SEQ ID NO: 108) 22662093, 22852902, 22853464, 22853646, 2902765, NM_ LIVGGGAPDRK 2096- 10 21980643, 5 4 1 015140.3 GFQ (SEQ ID 2811 46551962, NO: 109) 46552370, 46845450, 46876330, 46920760, 46925643, 46929343, 46951310, 47021176, NM_ CQRCPLCWP 343- 3 12687717, 3 3 0 012473.3 (SEQ ID NO: 110) 468 21780390, 28088991, NM_ GVRCLIHSIHGFL 308- 6 11265100, 6 4 1 001184977.1 (SEQ ID NO: 382 18775927, 111) 19897757, 51485275, 81213059, 82161427, NM_ WPQLLLEPNSG 567- 4 8608901, 3 3 0 003370.3 KSASRRRPQGG 1019 14173570, PQPPKLRVVEA 46181698, EVGDSWKR 46269629, (SEQ ID NO: 112) NM_ VAARAWAQPPL 828- 7 10145344, 5 4 1 052844.3 PGAECGHRREG 939 10147104, ATLAGHRGRPA 16526305, AAHRGLRPGHA 21773170, AAATEHQAQEA 21777139, SPRGDRGGRHG 31447502, SGLLQL (SEQ ID 46265826, NO: 113) NM_ RYGRCVHCREI 290- 5 10391746, 4 3 1 001033519.1 VLQQPSGHRQP 374 10393365, (SEQ ID NO: 114) 12339226, 14653998, 78233952, NM_ GLMASDYSEEV 674- 3 11158199, 3 3 0 152858.1 ATSEKFPF (SEQ 895 12338537, ID NO: 115) 21118493, NM_ DRKRGCCPTSSS 1312- 4 22340486, 4 3 1 182969.1 LPISLRVRLS 1480 27841540, (SEQ ID NO: 116) 27878857, 83526847, NM_ SHSQSGGPRHP 722- 4 9155377, 3 3 0 005741.4 GGTRRKAMGSQ 1106 16534738, CPELQGGPEPQR 16535238, PSSRRREI(SEQ 22701945, ID NO: 117)
TABLE-US-00002 TABLE 2 The 50 human breast cancer cell lines.
No. | Cell Line | ATCC_Name | Tissue
1 | MCF-10A | CRL-10317 | Breast
2 | BT-474 | HTB-20 | Breast
3 | Hs 319.T | CRL-7236 | Breast
4 | HCC1428 | CRL-2327 | Breast
5 | HCC1599 | CRL-2331 | Breast
6 | Hs 605.T | CRL-7365 | Breast
7 | Hs 362.T | CRL-7253 | Breast
8 | ZR-75-1 | CRL-1500 | Breast
9 | MCF-7 | HTB-22 | Breast
10 | Hs 281.T | CRL-7227 | Breast
11 | HCC1500 | CRL-2329 | Breast
12 | BT-20 | HTB-19 | Breast
13 | HCC1143 | CRL-2321 | Breast
14 | UACC-812 | CRL-1897 | Breast
15 | SW527 | CRL-7940 | Breast
16 | MDA-MB-453 | HTB-131 | Breast
17 | ZR-75-30 | CRL-1504 | Breast
18 | MDA-MB-468 | HTB-132 | Breast
19 | HCC1187 | CRL-2322 | Breast
20 | SK-BR-3 | HTB-30 | Breast
21 | MDA-MB-175-VII | HTB-25 | Breast
22 | Hs 574.T | CRL-7345 | Breast
23 | HCC 1008 | CRL-2320 | Breast
24 | Hs 742.T | CRL-7482 | Breast
25 | Hs 748.T | CRL-7486 | Breast
26 | BT-483 | HTB-121 | Breast
27 | HCC202 | CRL-2316 | Breast
28 | HCC 2157 | CRL-2340 | Breast
29 | BT-549 | HTB-122 | Breast
30 | MDA-MB-415 | HTB-128 | Breast
31 | HCC1395 | CRL-2324 | Breast
32 |  | HTB-127 | Breast
33 | MDA-MB-231 | HTB-26 | Breast
34 | CAMA-1 | HTB-21 | Breast
35 | MDA-MB-134-VI | HTB-23 | Breast
36 | Hs 606.T | CRL-7368 | Breast
37 | HCC1806 | CRL-2335 | Breast
38 | HCC1419 | CRL-2326 | Breast
39 | AU565 | CRL-2351 | Breast
40 | HCC1937 | CRL-2336 | Breast
41 | Hs 578T | HTB-126 | Breast
42 | Hs 739.T | CRL-7477 | Breast
43 | DU4475 | HTB-123 | Breast
44 | HCC70 | CRL-2315 | Breast
45 | HCC38 | CRL-2314 | Breast
46 | HCC1954 | CRL-2338 | Breast
47 | MB 157 | CRL-7721 | Breast
48 | HCC2218 | CRL-2343 | Breast
49 | Hs 343.T | CRL-7245 | Breast
50 | UACC-893 | CRL-1902 | Breast
TABLE-US-00003 TABLE 3 Mouse mis-splicing FS antigens in the vaccine
Antigen Name | Peptide size | Peptide sequence
ZDHHC17 FS | 21 | AVLLMCQLYQPWMCKEYYRLL (SEQ ID NO: 118)
SLAIN2 FS | 21 | IPRMQPQASANHCQLLKVMVA (SEQ ID NO: 119)
mSMC1A1^4 | 27 | TAIIGPNGSGCSGVYCHEEPQGEDSSV (SEQ ID NO: 120)
RBM FS | 45 | GRVIECDVVKGSCQDGEAVHWKSAPGGHRAGDPLTLRAVREGAGM (SEQ ID NO: 121)
TABLE-US-00004 TABLE 4 Three mouse MS FS antigens with predicted H2-D epitope
Antigen ID | Access # | MS type | INDEL | Peptide size | Peptide sequence (Kd/Ld epitope score > 20)
MS927 | NM_053009.3 | 9_A | Del | 33 | ICMSPPLLWATLQAPETTSAACKASYRPEGLYL (SEQ ID NO: 122)
MS255 | NM_010086.4 | 9_A | In | 24 | YFSCDKRCIKHYAGNKSLLTFSGY (SEQ ID NO: 123)
MS518 | NM_153511.3 | 10_A | Del | 59 | TLCMEVMLRWNTRELGYLYLQLCFLNTHFLHTSQEEKLLTLGRFLTWTSRCGSFVIRPL (SEQ ID NO: 124)
TABLE-US-00005 TABLE 5 Samples tested on Human 400K FS array
Sample Type | Number of Samples | Source
Breast Cancer | 17 | UT Southwestern
Lung Cancer | 17 | UT Southwestern
GBM | 17 | Barrows Neurological Institute
Pancreatic Cancer | 17 | TGEN
Pancreatic Cancer Stage 1 | 13 | TGEN
Gastric Cancer | 17 | Japan
Control | 64 | Varied Sources
TABLE-US-00006 TABLE 6 Three ORFs of Sec62 gene Sec62-12A: (SEQ ID NO: 125) atggcggagcgcaggagacacaagaagcggatccaggaagttggtgaacc atctaaagaagagaaggctgtagccaagtatcttcgatttaactgtccaa caaagtctaccaatatgatggggcaccgagagattatacattgcttcaaa agcagtggattgccattggattcaaagtgggcaaaggccaagaaaggaga ggaagcatatttacaacaagggagtctgtggagactactgcaacaggcat taaagaagcagattacaccgggcactaaaagtaatgaaaatgaagtatga taaagacataaaaaaagaaaaagagaaaggaaaggccgaaagtggaaaag aagaagataaaaagagcaggaaagaaaatctaaaggatgaaaagacgaaa aaggagaaagaaaaaaaaaaaagatggggaaaaggaagaggattacaagg acgacgacgacaagtgaaattcatggtgagcaagggcgaggagctgacac cggggtggtgcccatcctggtcgagctggacggcgacgtaaacggccaca agacagcgtgtccggcgagggcgagggcgatgccacctacggcaagctga ccctgaagacatctgcaccaccggcaagctgcccgtgccctggcccaccc tcgtgaccaccctgacctacggcgtgcagtgcttcagccactaccccgac cacatgaagcagcacgacttcttcaagtccgccatgcccgaaggctacgt ccaggagcgcaccatcacttcaaggacgacggcaactacaagacccgcgc cgaggtgaagttcgagggcgacaccctggtgaaccgcatcgagctgaagg gcatcgacttcaaggaggacggcaacatcctggggcacaagctggagtac aactacaacagccacaacgtctatatcatggccgacaagcagaagaacgg catcaaggtgaacttcaagatccgccacaacatcgaggacggcagcgtgc agctcgccgaccactaccagcagaacacccccatcggcgacggccccgtg ctgctgcccgacaaccactacctgagcacccagtccgccctgagcaaaga ccccaacgagaagcgcgatcacatggtcctgctggagacgtgaccgccgc cgggatcactctcggcatggacgagctgtacaagagatctggtaccacgc gtatcgataagcttgcatgcctgcaggtcgactctagaggatcgtga; Sec62-11A: (SEQ ID NO: 126) atggcggagcgcaggagacacaagaagcggatccaggaagaggtgaacca tctaaagaagagaaggctgtagccaagtatatcgatttaactgtccaaca aagtctaccaatatgatggggcaccgagagattatacattgcttcaaaag cagtggattgccattggattcaaagtgggcaaaggccaagaaaggagagg aagattatttacaacaagggagtctgtggagactactgcaacaggcatta aagaagcagattacaccgggcactaaaagtaatgaaaatgaagtatgata aagacataaaaaaagaaaaagagaaaggaaaggccgaaagtggaaaagaa gaagataaaaagagcaggaaagaaaatctaaaggatgaaaagacgaaaaa ggagaaagaaaaaaaaaaaagatggggaaaaggaagaggattacaaggac gacgacgacaagtgaaattcatggtgagcaagggcgaggagctgacaccg gggtggtgcccatcctggtcgagctggacggcgacgtaaacggccacaag acagcgtgtccggcgagggcgagggcgatgccacctacggcaagctgacc 
ctgaagacatctgcaccaccggcaagctgcccgtgccctggcccaccctc gtgaccaccctgacctacggcgtgcagtgcttcagccactaccccgacca catgaagcagcacgacttcttcaagtccgccatgcccgaaggctacgtcc aggagcgcaccatcttcttcaaggacgacggcaactacaagacccgcgcc gaggtgaagttcgagggcgacaccctggtgaaccgcatcgagctgaaggg catcgacttcaaggaggacggcaacatcctggggcacaagctggagtaca actacaacagccacaacgtctatatcatggccgacaagcagaagaacggc atcaaggtgaacttcaagatccgccacaacatcgaggacggcagcgtgca gctcgccgaccactaccagcagaacacccccatcggcgacggccccgtgc tgctgcccgacaaccactacctgagcacccagtccgccctgagcaaagac cccaacgagaagcgcgatcacatggtcctgctggagttcgtgaccgccgc cgggatcactctcggcatggacgagctgtacaagagatctggtaccacgc gtatcgataagcttgcatgcctgcaggtcgactctagaggatcgtga; Sec62-Non MS: (SEQ ID NO: 127) atggcggagcgcaggagacacaagaagcggatccaggaagttggtgaacc atctaaagaagagaaggctgtagccaagtatcttcgatttaactgtccaa caaagtctaccaatatgatggggcaccgagttgattatttcattgcttca aaagcagtggattgccttttggattcaaagtgggcaaaggccaagaaagg agaggaagctttatttacaacaagggagtctgtggttgactactgcaaca ggcttttaaagaagcagttttttcaccgggcactaaaagtaatgaaaatg aagtatgataaagacataaaaaaagaaaaagagaaaggaaaggccgaaag tggaaaagaagaagataaaaagagcaggaaagaaaatctaaaggatgaaa agacgaaaaaggagaaagagaggaagagagatggggaaaaggaagaggat tacaaggacgacgacgacaagtgaaattcatggtgagcaagggcgaggag ctgttcaccggggtggtgcccatcctggtcgagctggacggcgacgtaaa cggccacaagttcagcgtgtccggcgagggcgagggcgatgccacctacg gcaagctgaccctgaagttcatctgcaccaccggcaagctgcccgtgccc tggcccaccctcgtgaccaccctgacctacggcgtgcagtgcttcagcca ctaccccgaccacatgaagcagcacgacttcttcaagtccgccatgcccg aaggctacgtccaggagcgcaccatcttcttcaaggacgacggcaactac aagacccgcgccgaggtgaagttcgagggcgacaccctggtgaaccgcat cgagctgaagggcatcgacttcaaggaggacggcaacatcctggggcaca agctggagtacaactacaacagccacaacgtctatatcatggccgacaag cagaagaacggcatcaaggtgaacttcaagatccgccacaacatcgagga cggcagcgtgcagctcgccgaccactaccagcagaacacccccatcggcg acggccccgtgctgctgcccgacaaccactacctgagcacccagtccgcc ctgagcaaagaccccaacgagaagcgcgatcacatggtcctgctggagtt cgtgaccgccgccgggatcactctcggcatggacgagctgtacaagagat ctggtaccacgcgtatcgataagcttgcatgcctgcaggtcgactctaga ggatcgtga.
TABLE-US-00007 TABLE 7 up- down- Trans- stream stream splicing gene gene ID ACC# ACC# up WT sequence down stream FS sequence BOLA2_ NM_ NM_ MASAKSLDRWKARLLEGGST LLNR (SEQ ID NO: 129) Exon_ 001031827.1 015092.3 ALTYALVRAEVSFPAEVAPV SMG1_Exon12 RQQGSVAGARAGVVSLLGCR SSWTAAMELSAEYLREKLQR DLEAEHVEVEDTTLNRCSCSF RVLVVSAKFEGKPLLQRHR (SEQ ID NO: 128) GFOD1_ NM_ NM_ MLPGVGVFGTSLTARVIIPLL EPGHQRKKISRQKNTGEKKMP Exon1_ 018988.2 033069.2 KDEGFAVKALWGRTQEEAEE RGSVQLSFCSLQHPHMGHLFTP C6orf114_ LAKEMSVPFYTSRIDEVLLHQ HDAALGESQGTGFKPLGMQPV Exon2 DVDLVCINLPPPLTRQIAVKT (SEQ ID NO: 131) L (SEQ ID NO: 130) MDS1_ NM_ NM_ MRSKGRARKLATNNECVYG ILDEFYNVKFCIDASQPDVGSW Exon2_EVI1_ 004991.2 001105078.2 NYPEIPLEEMPDADGVASTPS LKYIRFAGCYDQHNLVACQIND Exon4 LNIQEPCSPATSSEAFTPKEGS QIFYRVVADIAPGEELLLFMKS PYKAPIYIPDDIPIPAEFELRES EDYPHETMAPDIHEERQYRCED NMPGAGLGIWTKRKIEVGEK CDQLFESKAELADHQKFPCSTP FGPYVGEQRSNLKDPSYGWE HSAFSMVEEDFQQKLESENDLQ (SEQ ID NO: 132) EIHTIQECKECDQVFPDLQSLEK HMLSHTEEREYKCDQCPKAFN WKSNLIRHQMSHDSGKHYECE NCAKVFTDPSNLQRHIRSQHVG ARAHACPECGKTFATSSGLKQ HKHIHSSVKPFICEV (SEQ ID NO: 133) C11orf79_ NM_ NM_ MAVSTVFSTSSLMLALSRHSL GPEGPFRHPGARASGHHGAGA Exon3_ 017841.1 145017.1 LSPLLSVTSFRRFYRGDSPTDS QGSASAPPAAGPGPAGAGELPT C11orf66_ QKDMIEIPLPPWQERTDESIET WPTLHDVGVQFQVSQGPSRPA Exon5 KRARLLYESRKRGMLENCILL RFLAEEIDRRKGGEWLHQTVPP SLFAKEHLQHMTEKQLNLYD EPHCLPTALTGPPWGPCPPPRPE RLINEPSNDWDIYYWAT CHQVRLPPQDSPTWR (SEQ ID (SEQ ID NO: 134) NO: 135) ABHD14A_ NM_ NM_ MVGALCGCWFRLGGARPLIP AHHAQRHDQQGSRGGAPIGDA Exon3_ 015407.3 000666.1 LGPTVVQTSMSQSQVALLGL LPPVPAYPHCPAQA (SEQ ID ACY1_ SLLLMLLLYVGLPGPPEQTSC NO: 137) Exon2 LWGDPNVTVLAGLTPGNSPIF YREVLPLNQAHRVEVVLLHG KAFNSHTWEQLGTLQLLSQR GYRAVALDLP (SEQ ID NO: 136) RBM14_ NM_ NM_ MKIFVGNVDGADTTPEELAA GSCQDGEAVHRKPAPGGYRAG NA_RBM4_ 006328.3 002896.2 LFAPYGTVMSCAVMKQFAFV DSLTLRAVWEGAGM (SEQ ID Exon2 HMRENAGALRAIEALHGHEL NO: 139) RPGRALVVEMSRPRPLNTWK IFVGNVSAACTSQELRSLFER RGRVIECDVVK (SEQ ID NO: 138) C20orf29_ NM_ NM_ MVHAFLIHTLRAPNTEDTGLC SLVSSQSIHPSWGQSPLSRI Exon2_ 
018347.1 020746.3 RVLYSCVFGAEKSPDDPRPH (SEQ ID NO: 141) VISA_ GAERDRLLRKEQILAVA Exon2 (SEQ ID NO: 140) RRM2_ NM_ NM_ MLSLRVPLAPITDPQQLQLSP LGDREVQSRWSPGPRGDSTPVR Exon9_ 001034.1 182626.1 LKGLSLVDKENTPPALSGTRV EMETNHPPSVRG (SEQ ID NO: C2orf48_ LASKTARRIFQEPTEPKTKAA 143) Exon2 APGVEDEPLLRENPRRFVIFPI EYHDIWQMYKKAEASFWTA EEVDLSKDIQHWESLKPEERY FISHVLAFFAASDGIVNENLV ERFSQEVQITEARCFYGFQIA MENIHSEMYSLLIDTYIKDPK EREFLFNAIETMPCVKKKAD WALRWIGDKEATYGERVVA FAAVEGIFFSGSFASIFWLKKR GLMPGLTFSNELISRDEGLHC DFACLMFKHLVHKPSEERVR EIIINAVRIEQEFLTEALPVKLI GMNCTLMKQYIEFVADRLML ELGFSKV (SEQ ID NO: 142) ELACl_ NM_ NM_ MSMDVTFLGTGAAYPSPTRG YPEYMSNNFPCNVSCCFSLFPK Exon2_ 018696.2 005359.5 ASAVVLRCEGECWLFDCGEG DQNCFRNWRHI (SEQ ID NO: SMAD4_ TQTQLMKSQLKAG (SEQ ID 145) Exon2 NO: 144) BCAS4_ NM_ NM_ MQRTGGGAPRPGRNHGLPGS VPLTGA (SEQ ID NO: 147) Exon1_ 001010974.1 001099432.1 LRQPDPVALLMLLVDADQPE BCAS3_ PMRSGARELALFLTPEPGAE Exon24 (SEQ ID NO: 146) C22orf39_ NM_ NM_ MADGSGWQPPRPCEAYRAE ASRFFQLIFTLTGPSSQLEDKGR Exon2_ 173793.3 003325.3 WKLCRSARHFLHHYYVHGE ILGRL (SEQ ID NO: 149) HIRA_ RPACEQWQRDLASCRDWEE Exon2 RRNAEAQ (SEQ ID NO: 148) PMF1_ NM_ NM_ MAEASSANLGSGCEEKRHEG VRSPAVQSPAKVQPLCPSRRAA Exon4_ 007221.2 199173.3 SSSESVPPGTTISRVKLLDTM R (SEQ ID NO: 151) BGLAP_ VDTFLQKLVAAGSYQRFTDC {circumflex over ( )}Exon4 YKCFYQLQPAMTQQIYDKFI AQLQTSIREEISDIKEEGNLEA VLNALDKIVEEGKVRKEPAW RPSGIPEKDLHSVMAPYFLQQ RDTLRRHVQKQEAENQQLAD AVLAGRRQVEELQLQVQAQ QQAWQ (SEQ ID NO: 150) SDHD_ NM_ NM_ MAVLWRLSAVCGALGGRAL CLQCQIVHSCPLLENQIHLSLKF Exon3_ 003002.1 031275.4 LLRTPVVRPAHISAFLQDRPIP PDYFIKMKPWRKI (SEQ ID TEX12_ EWCGVQHIHLSPSHHSGSKA NO: 153) Exon3 ASLHWTSERVVSVLLLGLLP AAYLNPCSAMDYSLAAALTL HGH (SEQ ID NO: 152) PRR13_ NM_ NM_ MWNPNAGGPPHPVPQPGYPG FLAFTPNQ (SEQ ID NO: 155) Exon3b_ 001005354.2 001128914.1 CQPLGPYPPPYPPPAPGIPPVN PCBP2_ PLAPGMVGPAVIVDKKMQK Exon2 KMKKAHKKMHKHQKHHKY HKHGK (SEQ ID NO: 154) RMND5A_ NM_ NM_ MDQCVTVERELEKVLHKFSG DSL (SEQ ID NO: 157) Exon2_ 022780.2 022662.2 YGQLCERGLEELIDYTGGLK ANAPC1_ 
HEILQSHGQDAELSGTLSLVL Exon25 TQCCKRIKDTVQKLASDHKDI HSSVSRVGKAIDK (SEQ ID NO: 156) TYMP_ NM_ NM_ MAALMTPGTGAPPAPGDFSG ASDPCCC (SEQ ID NO: 159) Exon9_ 001113756.1 005138.2 EGSQGLPDPSPEPKQLPELIR SCO2_ MKRDGGRLSEADIRGFVAAV Exon2 VNGSAQGAQIGAMLMAIRLR GMDLEETSVLTQALAQSGQQ LEWPEAWRQQLVDKHSTGG VGDKVSLVLAPALAACGCKV PMISGRGLGHTGGTLDKLESI PGFNVIQSPEQMQVLLDQAG CCIVGQSEQLVPADGILYAAR DVTATVDSLPLITASILSKKLV EGLSALVVDVKFGGAAVFPN QEQARELAKTLVGVGASLGL RVAAALTAMDKPLGRCVGH ALEVEEALLCMDGAGPPDLR DLVTTLGGALLWLSGHAGTQ AQGAARVAAALDDGSALGR FERMLAAQGVDPGLARALCS GSPAERRQLLPRAREQEELLA PADGTVELVRALPLALVLHE LGAGRSRAGEPLRLGVGAEL LVDVGQRLRRG (SEQ ID NO: 158) NAIP_ NM_ NM_ MATQQKASDERISQFDHNLL G (SEQ ID NO: 161) Exon13_ 004536.2 002538.2 PELSALLGLDAVQLAKELEEE OCLN_ EQKERAKMQKGYNSQMRSE Exon5 AKRLKTFVTYEPYSSWIPQEM AAAGFYFTGVKSGIQCFCCSL ILFGAGLTRLPIEDHKRFHPDC GFLLNKDVGNIAKYDIRVKN LKSRLRGGKMRYQEEEARLA SFRNWPFYVQGISPCVLSEAG FVFTGKQDTVQCFSCGGCLG NWEEGDDPWKEHAKWFPKC EFLRSKKSSEEITQYIQSYKGF VDITGEHFVNSWVQRELPMA SAYCNDSIFAYEELRLDSFKD WPRESAVGVAALAKAGLFYT GIKDIVQCFSCGGCLEKWQE GDDPLDDHTRCFPNCPFLQN MKSSAEVTPDLQSRGELCELL ETTSESNLEDSIAVGPIVPEMA QGEAQWFQEAKNLNEQLRA AYTSASFRHMSLLDISSDLAT DHLLGCDLSIASKHISKPVQE PLVLPEVFGNLNSVMCVEGE AGSGKTVLLKKIAFLWASGC CPLLNRFQLVFYLSLSSTRPD EGLASIICDQLLEKEGSVTEM CVRNIIQQLKNQVLFLLDDYK EICSIPQVIGKLIQKNHLSRTC LLIAVRTNRARDIRRYLETILE IKAFPFYNTVCILRKLFSHNM TRLRKFMVYFGKNQSLQKIQ KTPLFVAAICAHWFQYPFDPS FDDVAVFKSYMERLSLRNKA TAEILKATVSSCGELALKGFF SCCFEFNDDDLAEAGVDEDE DLTMCLMSKFTAQRLRPFYR FLSPAFQEFLAGMRLIELLDS DRQEHQDLGLYHLKQINSPM MTVSAYNNFLNYVSSLPSTK AGPKIVSHLLHLVDNKESLEN ISENDDYLKHQPEISLQMQLL RGLWQICPQAYFSMVSEHLL VLALKTAYQSNTVAACSPFV LQFLQGRTLTLGALNLQYFFD HPESLSLLRSIHFPIRGNKTSP RAHFSVLETCFDKSQVPTIDQ DYASAFEPMNEWERNLAEKE DNVKSYMDMQRRASPDLST GYWKLSPKQYKIPCLEVDVN DIDVVGQDMLEILMTVFSAS QRIELHLNHSRGFIESIRPALE LSKASVTKCSISKLELSAAEQ ELLLTLPSLESLEVSGTIQSQD QIFPNLDKFLCLKELSVDLEG NINVFSVIPEEFPNFHHMEKLL IQISAEYDPSKL (SEQ ID NO: 160) C1orf151_ NM_ NM_ MSESELGRKWDRCLADAVV LWRPRA (SEQ ID NO: 163) Exon1_ 
001032363.1 182744.2 KIG (SEQ ID NO: 162) NBL1_ Exon3 DDIT3_ NM _ NM_ MAAESLPFSFGTLSSWELEA LPLGASGGFPSATANCFFRSKSF {circumflex over ( )}Exon3_ 004083.4 004990.2 WYEDLQEVLSSDENGGTYVS ATSAATSFLSAFCAFSSRTMFPC MARS_ PP (SEQ ID NO: 164) FVTSSISACICCGLAVVTVSTTA {circumflex over ( )}Exon21 GFGDVFAWPPPKRCLKLSIWSF SNFWNKGLTVPIWCPAGKVHR KFVSRILQAGGGSCSWAWIVAL TVGM (SEQ ID NO: 165) RIPK3_ NM_ NM_ MSCVKLWPSGAPAPLVSIEEL ADLRPELPDHCAVRAGRLLAA Exon9_ 006871.3 139247.2 ENQELVGKGGFGTVFRAQHR AGPRFPGAATAALDASPVRLG ADCY4_ KWGYDVAVKIVNSKAISREV MGRAASARPRLPVHRGRGERL Exon2 KAMASLDNEFVLRLEGVIEK GPGVLFSLRHLHGVCHAALGH VNWDQDPKPALVTKFMENG AGRRRRGPRLLTLASAGPRAVS SLSGLLQSQCPRPWPLLCRLL WATAGLTACTAAAVGSKRSAV KEVVLGMFYLHDQNPVLLHR PVRERGRSVPQGADGARPAGH DLKPSNVLLDPELHVKLADF VPGGTQLPALTPAAGHREEAPG GLSTFQGGSQSGTGSGEPGGT TPSLVHPSCLPGPRDEGRDHGT LGYLAPELFVNVNRKASTAS AAGRTGVTAREH (SEQ ID NO: DVYSFGILMWAVLAGREVEL 167) PTEPSLVYEAVCNRQNRPSLA ELPQAGPETPGLEGLKELMQL CWSSEPKDRPSFQECLPKTDE VFQMVENNMNAAVSTVKDF LSQLRSSNRRFSIPESGQGGTE MDGFRRTIENQHSRNDVMVS EWLNKLNLEEPPSSVPKKCPS LTKRSRAQEEQVPQAWTAGT SSDSMAQPPQTPETSTFRNQM PSPTSTGTPSPGPRGNQGAER QGMNWSCRTPEPNPVTG (SEQ ID NO: 166) COMMD3_ NM_ NM_ MELSESVQKGFQMLADPRSF GFFIKQKCIEQRESRSLS (SEQ Exon1_ 012071.2 005180.5 DSNAFTLLLRAAFQSLLDAQ ID NO: 169) BMI1_Exon2 ADEAVL (SEQ ID NO: 168) MED8_ NM_ NM_ MQREEKQLEASLDALLSQVA VLSQDGGCCELVPRGDEARRSP Exon7c_ 052877.3 022821.2 DLKNSLGSFICKLENEYGRLT DPGLPSDGVPLANDLHSPDLRV ELOVL1_ WPSVLDSFALLSGQLNTLNK LRSLTWASHHG (SEQ ID NO: Exon2 VLKHEKTPLFRNQVIIPLVLSP 171) DRDEDLMRQTEGRVPVFSHE VVPDHLRTKPDPEVEEQEKQ LTTDAARIGADAAQKQIQSLN KMCSNLLEKISKEERESESGG LRPNKQTFNPTDTNALVAAV AFGKGLSNWRPSGSSGPGQA GQPGAGTILAGTSGLQQVQM AGAPSQQQPMLSGVQMAQA GQPGKMPSGIKTNIKSASMHP YQR (SEQ ID NO: 170) POLR2J3- NM_ XM_ MNAPPAFESFLLFEGEKITINK RACFPFAFCRDCQFPEASPATLS {circumflex over ( )}Exon2_ 001097615.1 001717094.1 DTKVPNACLFTMNKEDHTLG VQPAEL (SEQ ID NO: 173) UPK38_ NIIKS (SEQ ID NO: 172) {circumflex over ( )}Exon7 BGLAP_ NM_ NM_ MAEASSANLGSGCEEKRHEG 
VRSPAVQSPAKVQPLCPSRRAA {circumflex over ( )}Exon2_ 199173.3 007221.2 SSSESVPPGTTISRVKLLDTM R (SEQ ID NO: 175) PMF1_ VDTFLQKLVAAGSYQRFTDC {circumflex over ( )}Exon5 YKCFYQLQPAMTQQIYDKFI AQLQTSIREEISDIKEEGNLEA VLNALDKIVEEGKVRKEPAW RPSGIPEKDLHSVMAPYFLQQ RDTLRRHVQKQEAENQQLAD AVLAGRRQVEELQLQVQAQ QQAWQ (SEQ ID NO: 174) TMEM199_ NM_ NM_ MASSLLAGERLVRALGPGGE PRGAHWAGRDPEPGEGTRTRR Exon5_ 152464.1 015077.2 LEPERLPRKLRAELEAALGKK AGAERGRHLGAHVQAFGGDM SARM1_ HKGGDSSSGPQRLVSFRLIRD PEAGGGRRPGRGAVLVPPHGP Exon2 LHQHLRERDSKLYLHELLEGS RAAAPLRAGAGQLRAARGPGG EIYLPEVVKPPRNPELVARLE AATHGREARSRVALPARLLQG KIKIQLANEEYKRITRNVTCQ GRAASAARLPRSSGVGD (SEQ DTRHGGTLSDLGKQVRSLKA ID NO: 177) LVITIFNFIVTVVAAFVCTYLG SQYIFTEMASR (SEQ ID NO: 176) C1QTNF6 NM_ NM_ MQWLRVRESPGEATGHRVT LPSSAPPCGCNGGPCSVLASAPP Exon2_ 182486.1 000878.2 MGTAALGPVWAALLLFLLM HPPPAPGYLLGICSGEWHFPVH IL2RB_ CEIPMVELTFDRAVASGCQRC MLLQLESQHLLCLEPRWGSAG Exon2 CDSEDPLDPAHVSSASSSGRP HFLPSPCLAGQTAVEPNL (SEQ HALPEIRPYINITILKG (SEQ ID NO: 179) ID NO: 178) LOC100131434_ XM_ XM_ MDPASRGCLGPTPAFRHRKE RPSTPCLHGAALHLHSGHGSGS NA_ 001713865.1 001714058.1 QSSASPRPSEATGARTMGSQA RLTNSSCFPGTRRLLALQFTQQ FLJ44451_ RRPPVIPFTKNETLFSLPGPDA TGTVGHPTWQPVIR (SEQ ID NA RQPTRPRPGDLETGSLDEEPE NO: 181) GGKGTGGRKISRIDFITKFWV PASGVPDETKRLLVLHPRCYF QNSGLVVWSLHCSMSLLSNL ESSVFLPSVRCAYFSLEKLEE AGMLEM (SEQ ID NO: 180) COX19_ NM_ NM_ MSTAMNFGTKSFQPRPPDKG SRLGLLHSGRLHLPELLGNPPE Exon2_ 001031617.2 006869.2 SFPLDHLGECKSFKEKFMKCL YPPGQQGEVRPPGRLGGGPSGV CENTAl_ HNNNFENALCRKESKEYLEC HGLPRERRRESQV (SEQ ID Exon2 RMER (SEQ ID NO: 182) NO: 183) ACSF2_ NM_ NM_ MAVYVGMLRLGRLCAGSSG RNLRKKLQHGKMDSKAPMSC Exon10_ 025149.4 001267.2 VLGARAALSRSWQEARLQGV (SEQ ID NO: 185) CHAD_ RFLSSREVDRMVSTPIGGLSY {circumflex over ( )}Exon4 VQGCTKKHLNSKTVGQCLET TAQRVPEREALVVLHEDVRL TFAQLKEEVDKAASGLLSIGL CKGDRLGMWGPNSYAWVL MQLATAQAGIILVSVNPAYQ AMELEYVLKKVGCKALVFPK QFKTQQYYNVLKQICPEVEN AQPGALKSQRLPDLTTVISVD APLPGTLLLDEVVAAGSTRQ HLDQLQYNQQFLSCHDPINIQ FTSGTTGSPKGATLSHYNIVN NSNILGERLKLHEKTPEQLRM 
ILPNPLYHCLGSVAGTMMCL MYGATLILASPIFNGKKALEA IS RERGTFLYGTPTMFVDILN QPDFSSYDISTMCGGVIAGSP APPELIRAIINKINMKDLV (SEQ ID NO: 184) TIMM23B_ XM_ XM_ MEGGGGSGNKTTGGLAGFFG VSEMALDSPFCVLLSGS (SEQ NA_ 928114.3 001719607.1 AGGAGYSHADLAGVPLTGM ID NO: 187) LOC100132418_ NPLSPYLNVDPRYLVQDTDEF NA ILPTGANKTRGRFELAFFTIGG CCMTGAAFGAMNGLRLGLK ETQNMAWSKPRNVQILNMV TRQGALWANTLGSLALLYSA FGVIIEKTRGAEDDLNTVAAG TMTGMLYKCT (SEQ ID NO: 186) NDUFA13_ NM_ NM_ MQEPRRVTPCLGKRGVKTPQ GLGAAAPTCRHGKSGA (SEQ Exon4_ 015965.5 198537.2 LQPGSAFLPRVRRQSFPARSD ID NO: 189) YJEFN3_ SYTTVRDFLAVPRTISSASATL Exon2 IMAVAVSHFRPGPEVWDTAS MAASKVKQDMPPPGGYGPID YKRNLPRRGLSGYSMLAIGIG TLIYGHWSIMKWNRERRRLQ IEDFEARIALLPLLQAETDRRT LQMLRENLEEEAIIMKDVPD WK (SEQ ID NO: 188) ADHFE1_ NM_ NM_ MAAAARARVAYLLRQLQRA YPVQPEEEPKALSTS (SEQ ID Exon13_ 144650.2 152765.3 ACQCPTHSHTYSQAPGLSPSG NO: 191) C8orf46_ KTTDYAFEMAVSNIRYGAAV NA TKEVGMDLKNMGAKNVCLM TDKNLSKLPPVQVAMDSLVK NGIPFTVYDNVRVEPTDSSFM EAIEFAQKGAFDAYVAVGGG STMDTCKAANLYASSPHSDF LDYVSAPIGKGKPVSVPLKPL IAVPTTSGTGSETTGVAIFDYE HLKVKIGITSRAIKPTLGLIDP LHTLHMPARVVANSGFDVLC HALESYTTLPYHLRSPCPSNPI TRPAYQGSNPISDIWAIHALRI VAKYLKRAVRNPDDLEARSH MHLASAFAGIGFGNAGVHLC HGMSYPISGLVKMYKAKDY NVDHPLVPHGLSVVLTSPAVF TFTAQMFPERHLEMAEILGA DTRTARIQDAGLVLADTLRK FLFDLDVDDGLAAVGYSKAD IPALVKGTLPQ (SEQ ID NO: 190) HPS4_ NM_ NM_ MATSTSTEAKSASWWNYFFL SNSCTS (SEQ ID NO: 193) Exon13_ 022081.4 020437.4 YDGSKVKEEGDPTRAGICYF ASPHD2_ YPSQTLLDQQELLCGQIAGVV {circumflex over ( )}Exon4 RCVSDISDSPPTLVRLRKLKF AIKVDGDYLWVLGCAVELPD VSCKRFLDQLVGFFNFYNGP VSLAYENCS QEELSTEWDTFI EQILKNTSDLHKIFNSLWNLD QTKVEPLLLLKAARILQTCQR SPHILAGCILYKGLIVSTQLPP SLTAKVLLHRTAPQEQRLPTG EDAPQEHGAALPPNVQIIPVF VTKEEAISLHEFPVEQMTRSL ASPAGLQDGSAQHHPKGGST SALKENATGHVESMAWTTPD PTSPDEACPDGRKENGCLSGH DLESIRPAGLHNSARGEVLGL SSSLGKELVFLQEELDLSEIHI PEAQEVEMASGHFAFLHVPV PDGRAPYCKASLSASS SLEPT PPEDTAISSLRPPSAPEMLTQH GAQEQLEDHPGHSSQAPIPRA DPLPRRTRRPLLLPRLDPGQR GNKLPTGEQGLDEDVDGVCE SHAAPGLECSSGSANCQGAG PSADGISSRLTPAESCMGLVR MNLYTHCVKGLVLSLLAEEP LLGDSAAIEEVYHSSLASLNG 
LEVHLKETLPRDEAAS TS STY NFTHYDRIQSLLMANLPQVA TPQDRRFLQAVSLMHSEFAQ LPALYEMTV (SEQ ID NO: 192) KIAA1267_ NM_ NM_ MAAMAPALTDAAAEAHHIRF VSVWRQ (SEQ ID NO: 195) Exon2_ 015443.2 001113738.1 KLAPPSSTLSPGSAENNGNAN ARL17P1_ ILIAANGTKRKAIAAEDPSLDF Exon3 RNNPTKEDLGKLQPLVASYL CSDVTSVPSKESLKLQGVFSK QTVLKSHPLLSQSYELRAELL GRQPVLEFSLENLRTMNTSG QTALPQAPVNGLAKKLTKSS THSDHDNSTSLNGGKRALTSS ALHGGEMGGSESGDLKGGM TNCTLPHRSLDVEHTTLYSNN STANKSSVNSMEQPALQGSS RLSPGTDSSSNLGGVKLEGKK SPLSSILFSALDSDTRITALLRR QADIESRARRLQKRLQVVQA KQVERHIQHQLGGFLEKTLSK LPNLESLRPRSQLMLTRKAEA ALRKAASETTTSEGLSNFLKS NSISEELERFTASGIANLRCSE QAFDSDVTDSSSGGESDIEEE ELTRADPEQRHVPLRRRSEW KWAADRAAIVSRWNWLQAH VSDLEYRIRQQTDIYKQIRAN K (SEQ ID NO: 194) L0C100129406_ XM_ NM_ MAGRPGSQEQSKDRGTGSLP SIGHISTMLMAF (SEQ ID NO: NA_ 001722372.1 018704.2 PPSQRPLGPSPEGAGPSPPPPG 197) CTTNBP2NL_ IPRGGGSSSSEGPHSYFLSLVD NA SQLLRRGFPLTPLIQRHLPPRT SALAERTH (SEQ ID NO: 196) RNF216_ NM_ NM_ MEEGNNNEEVIHLNNFHCHR VYQPQSLHVSKSSRK (SEQ ID Exon7_ 207116.1 021163.3 GQEWINLRDGPITISDSSDEER NO: 199) RBAK_ IPMLVTPAPQQHEEEDLDDD Exon2 VILTEDDSEDDYGEFLDLGPP GISEFTKPSGQTEREPKPGPSH NQAANDIVNPRSEQKVIILEE GSLLYTESDPLETQNQSSEDS ETELLSNLGESAALADDQAIE EDCWLDHPYFQSLNQQPREIT NQVVPQERQPEAELGRLLFQ HEFPGPAFPRPEPQQGGISGPS SPQPAHPLGEFEDQQLASDDE EPGPAFPMQESQEPNLENIWG QEAAEVDQELVELLVKETEA RFPDVANGFIEEIIHFKNYYDL NVLCNFLLENPDYPKREDRIII NPSSSLLASQDETKLPKIDFFD YSKLTPLDQRCFIQAADLLM ADFKVLSSQDIKWALHELKG HYAITRK (SEQ ID NO: 198) DEDD_ NM_ NM_ MAGLKRRASQVWPEEHGEQ APSGLGL (SEQ ID NO: 201) Exon4_ 032998.2 005600.1 EHGLYSLHRMFDIVGTHLTH NIT1_Exon6 RDVRVLSFLFVDVIDDHERGL IRNGRDFLLALERQGRCDESN FRQVLQLLRIITRHDLLPYVTL KRRRA (SEQ ID NO: 200) RAD54B_ NM_ XM_ MRRSAAPSQLQGNSFKKPKFI QTWMRRHRLVPVHYR (SEQ Exon3_ 012415.2 001722896.1 PPGRSNPGLNEEITKLNPDIKL ID NO: 203) LOC100128414_ FEGVAINNTFLPSQNDLRICSL NA NLPSEESTREINNRDNCSGKY CFEAPTLATLDPPHTV (SEQ ID NO: 202) TOPORS_ NM_ NM_ MGSQPPLGSPLSREEGEAPPP KRCSIFRLRKTTRAQWRLPHFF Exon2_ 005802.2 014314.3 APASEGRRRSRRVRLRGSCR SSSCWSSRRKAGSVAFWMP DDX58_ 
HRPSFLGCRELAASAPARPAP (SEQ ID NO: 205) Exon2 ASSE (SEQ ID NO: 204) NDUFC2_ NM_ NM_ MIARRNPEPLRFLPDEARSLPP VYCCGAERRG (SEQ ID NO: Exon2_ 004549.4 023930.3 PKLTDPRLLYIGFLGYCSGLID 207) KCTD14_ NLIRRRPIATAGLHRQLLYITA Exon2 FFFAGYYLVKREDYLYAVRD REMFGYMKLHPEDFPEED (SEQ ID NO: 206) LRRC57_ NM_ NM_ MGNSALRAHVETAQKTGVF SALSVIRFICGF (SEQ ID NO: {circumflex over ( )}Exon5_ 153260.2 003825.2 QLKDRGLTEFPADLQKLTSNL 209) SNAP23_ RTIDLSNNKIESLPPLLIGKFTL Exon8 LKSLSLNNNKLTVLPDEICNL KKLETLSLNNNHLRELPSTFG QLSALKTLSLSGNQLGALPPQ LCSLRHLDVMDLSKNQIRSIP DSVGELQVIELNLNQNQISQIS VKISCCPRLKILRL (SEQ ID NO: 208) IPO11_ NM_ NM_ MVQPIIHLGYVVYSLLYLGY LASKGP (SEQ ID NO: 211) NA_SLRN_ 001134779.1 181506.4 KPVQHVTALNTVSSCHKMVS NA MDLNSASTVVLQVLTQATSQ DTAVLKPAEEQLKQWETQPG FYSVLLNIFTNHTLDINVRWL AVLYFKHGIDRYWRRVAPHA LSEEEKTTLRAGLITNFNEPIN QIATQIAVLIAKVARLDCPRQ WPELIPTLIESVKVQDDLRQH RALLTFYHVTKTLASKRLAA DRKLFYDLASGIYNFACSLW NHHTDTFLQEVSSGNEAAILS SLERTLLSLKVLRKLTVNGFV EPHKNMEVMGFLHGIFERLK QFLECSRSIGTDNVCRDRLEK TIILFTKVLLDFLDQHPFSFTP LIQRSLEFSVSYVFTEVGEGV TFERFIVQCMNLIKMIVKNYA YKPSKNFEDSSPETLEAHKIK MAFFTYPTLTEICRRLVSHYF LLTEEELTMWEEDPEGFTVEE TGGDSWKYSLRPCTEVLFIDI FHEYNQTLTPVLLEMMQTLQ GPTNVEDMNALLIKDAVYNA VGLAAYELFDSVDFDQWFKN QLLPELQVIHNRYKPLRRRVI WLIGQWISVKFKSDLRPMLY EAICNLLQDQDLVVRIETATT LKLTVDDFEFRTDQFLPYLET MFTLLFQLLQQVTECDTKMH VLHVLSCVIERVNMQIRPYVG CLVQYLPLLWKQSEEHNMLR CAILTTLIHLVQGLGADSKNL YPFLLPVIQLSTDVSQPPHVY LLEDGLELWLVTLENSPCITP ELLRIFQNMSPLLELSSENLRT CFKIINGYIFLSSTEFLQTYAV GLCQSFCELLKEITTEGQVQV LKVVENALKVNPILGPQMFQ PILPYVFKGIIEGERYPVVMST YLGVMGRVLLQNTSFFSSLL NEMAHKFNQEMDQLLGNMI EMWVDRMDNITQPERRKLSA LALLSLLPSDNS (SEQ ID NO: 210) SNRPF_ NM_ NM_ MSLPLNPKPFLNGLTGKPVM QDFHLHLGNIETK (SEQ ID Exon2_ 003095.2 182496.1 VKLKWGMEYKGYLVSVDGY NO: 213) CCDC38_ MNMQ (SEQ ID NO: 212) {circumflex over ( )}Exon12 RNF139_ NM_ NM_ MAAVGPPQQQVRMAHQQV ETNTDTLLV (SEQ ID NO: 215) Exon1_ 007218.3 005005.2 WAALEVALRVPCLYIIDAIFN NDUFB9_ SYPDSSQSRFCIVLQIFLRLF Exon2 (SEQ ID NO: 214) NDUFB8_ NM_ NM_ MAVARAGVLGVQWLQRASR DRP 
(SEQ ID NO: 217) Exon4_ 005004.2 015490.3 NVMPLGARTASHMTKDMFP SEC31B_ GPYPRTPEERAAAAKKYNMR {circumflex over ( )}Exon2 VEDYEPYPDDGMGYGDYPK LPDRSQHERDPWYSWDQPGL RLNWGEPMHWHLDMYNRN RVDTSPTPVSWHVMCMQLFG FLAFMIFMCWVGDVYPVYQP V (SEQ ID NO: 216) MIA_ NM_ NM_ MARSLVCLGVIILLSAFSGPG TSSSNSW (SEQ ID NO: 219) Exon3_ 006533.2 016154.3 VRGGPMPKLADRKLCADQEC RAB4B_Exon2 SHPISMAVALQDYMAPDCRF LTIHRGQVVYVFSKLKGRGR LFWGGSVQGDYYGDLAARL GYFPSSIVREDQTLKPGKVDV KTD (SEQ ID NO: 218) THAP2_ NM_ NM_ MPTNCAAAGCATTYNKHINI VTYDLFLRGVGCFLLLFLF Exon2_ 031435.2 018279.3 SFHRFPLDPKRRKEWVRLVR (SEQ ID NO: 221) TMEM19_ RKNFVPGKHTFLCSKHFEASC Exon2 FDLTGQTRRLKMDAVPTIFDF CTHIKSM (SEQ ID NO: 220) NITl_ NM_ NM_ MLGFITRPPHRFLSLLCPGLRI QPVSS (SEQ ID NO: 223) Exon6_DEDD_ 005600.1 032998.2 PQLSVLCAQPRPRAMAISSSS Exon4 CELPLVAVCQVTSTPDKQQN FKTCAELVREAARLGACLAF LPEAFDFIARDPAETLHLSEPL GGKLLEEYTQLARECGLWLS LGGFHERGQDWEQTQKIYNC HVLLNSKGAVVATYRKTHLC DVEIPGQGPMCESNSTMPGPS LESPVSTPAGKIGLAVCYDMR FPELSLALAQAGAEILTYPSAF GSITGPAHWE (SEQ ID NO: 222)
[0213] While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.