COMPOSITION FOR PREDICTING RESPONSE TO STANDARD PREOPERATIVE CHEMORADIATION THERAPY AND PROGNOSIS FOLLOWING TREATMENT, AND METHOD AND COMPOSITION FOR PREDICTING PATIENTS WITH VERY UNSATISFACTORY PROGNOSES FOLLOWING STANDARD THERAPY
20230021094 · 2023-01-19
CPC classification
G01N33/57484 (PHYSICS)
G01N2800/60 (PHYSICS)
C12Q2600/106 (CHEMISTRY; METALLURGY)
G01N2800/52 (PHYSICS)
Abstract
The present invention relates to a biomarker composition for predicting the prognosis of a cancer patient, the biomarker composition including a first molecular subtype or a protein transcribed and translated from the first molecular subtype. The present invention also relates to a biomarker composition for predicting the prognosis of a cancer patient, the biomarker composition further including a second molecular subtype or a protein transcribed and translated from the second molecular subtype.
Claims
1. A composition for predicting a therapeutic response to anticancer therapy or a prognosis after anticancer therapy, or identifying a target patient for neoadjuvant therapy prior to anticancer therapy in cancer patients, comprising: an agent that measures the expression level of at least one gene of a first molecular subtype and a second molecular subtype or a protein encoded thereby, wherein the first molecular subtype comprises one or more types of genes selected from PMP2, AGTR1, PLCXD3, TCEAL6, ANKRD1, and ARHGAP26-AS1, and the second molecular subtype comprises one or more types of genes selected from PGP, SLC26A3, HIST1H4C, RUVBL2, RAB19, HIST2H2AC, and SNORD69.
2. The composition of claim 1, wherein the first molecular subtype further comprises one or more types of genes selected from the group consisting of AADACL2, ABCA6, ABCA8, ABCA9, ABCB5, ABI3BP, ACADL, ACSM5, ACTG2, ADAMTS9-AS1, ADAMTS9-AS2, ADAMTSL3, ADCYAP1R1, ADGRB3, ADH1B, ADIPOQ, ADRA1A, AFF3, AGTR1, AICDA, ALB, ANGPTL1, ANGPTL5, ANGPTL7, ANK2, ANKS1B, ANXA8L1, APOA2, APOB, APOC3, AQP4, AQP8, ARPP21, ART4, ASB5, ASPA, ASTN1, ATCAY, ATP1A2, ATP2B2, ATP2B3, AVPR1B, B3GALT5-AS1, BCHE, BEST4, BHMT2, BLOC1S5-TXNDC5, BMP3, BRINP3, BVES, BVES-AS1, C14orf180, C1QTNF7, C7, C8orf88, CA1, CA2, CA7, CACNA2D1, CADM2, CADM3, CALN1, CARTPT, CASQ2, CAVIN2, CCBE1, CCDC144B, CCDC158, CCDC160, CCDC169, CCN5, CD300LG, CDH10, CDH19, CDKN2B-AS1, CDO1, CHRDL1, CHRM2, CHST9, CIDEA, CILP, CLCA4, CLCNKB, CLDN8, CLEC3B, CLEC4M, CLVS2, CMA1, CNGA3, CNN1, CNR1, CNTN1, CNTN2, CNTNAP4, COL19A1, CP, CPEB1, CPXM2, CR2, CRP, CTNNA3, CTSG, CYP1B1, DAO, DCLK1, DDR2, DES, DHRS7C, DIRAS2, DPP6, DPT, EBF2, ECRG4, ELAVL4, EPHA5, EPHA6, EPHA7, ERICH3, EVX2, FABP4, FAM106A, FAM133A, FAM135B, FAM180B, FDCSP, FGF10, FGF13-AS1, FGF14, FGFBP2, FGG, FGL1, FHL1, FILIP1, FLNC, FMO2, FRMD6-AS2, FRMPD4, FUT9, GABRA5, GABRG2, GALR1, GAP43, GAS1RR, GC, GCG, GDF6, GFRA1, GNAO1, GPM6A, GPR119, GPR12, GPRACR, GRIA2, GRIN2A, GTF2IP1, GUCA2B, HAND1, HAND2, HAND2-AS1, HEPACAM, HP, HPCAL4, HRG, HRK, HSPB8, HTR2B, IGSF10, IGSF11, IRX6, ISM1, KCNA1, KCNB1, KCNC2, KCNK2, KCNMA1, KCNMB1, KCNQ5, KCNT2, KCTD8, KERA, KHDRBS2, KIAA0408, KIF1A, KRT222, KRT24, KRTAP13-2, LCN10, LDB3, LEP, LGI1, LIFR, LINC00504, LINC00507, LINC00682, LINC00924, LINC01266, LINC01352, LINC01474, LINC01505, LINC01697, LINC01798, LINC01829, LINC02015, LINC02023, LINC02185, LINC02268, LINC02408, LINC02544, LIX1, LMO3, LMOD1, LOC100506289, LOC101928731, LOC102724050, LOC107986321, LOC283856, LOC440434, LOC729558, LONRF2, LRAT, LRCH2, LRRC3B, LRRC4C, LRRTM4, LVRN, LYVE1, MAB21L1, MAB21L2, MAGEE2, MAMDC2, MAPK4, MASP1, MEF2C-AS1, MEOX2, METTL24, MFAP5, 
MGAT4C, MGP, MICU3, MIR133A1HG, MIR8071-1, MMRN1, MORN5, MPPED2, MRGPRE, MS4A1, MS4A12, MSRB3, MUSK, MYH11, MYH2, MYLK, MYO3A, MYOC, MYOCD, MYOM1, MYOT, MYT1L, NALCN, NAP1L2, NBEA, NECAB1, NEFL, NEFM, NEGR1, NETO1, NEUROD1, NEXMIF, NEXN, NGB, NIBAN1, NLGN1, NOS1, NOVA1, NPR3, NPTX1, NPY2R, NRG3, NRK, NRSN1, NRXN1, NSG2, NTNG1, NTRK3, NUDT10, OGN, OLFM3, OMD, OTOP2, OTOP3, P2RX2, P2RY12, PAK3, PAPPA2, PCDH10, PCDH11X, PCDH9, PCOLCE2, PCP4L1, PCSK2, PDZRN4, PEG3, PENK, PGM5, PGM5-AS1, PGM5P4-AS1, PGR, PHOX2B, PI16, PIK3C2G, PIRT, PKHD1L1, PLAAT5, PLCXD3, PLD5, PLIN1, PLIN4, PLN, PLP1, PMP2, POPDC2, POU3F4, PPP1R1A, PRDM6, PRELP, PRG4, PRIMA1, PROKR1, PTCHD1, PTGIS, PTPRQ, PTPRZ1, PYGM, PYY, RANBP3L, RBFOX3, RBM20, RELN, RERGL, RGS13, RGS22, RIC3, RIMS4, RNF150, RNF180, RORB, RSPO2, SCARA5, SCGN, SCN2B, SCN7A, SCN9A, SCNN1G, SCRG1, SEMA3E, SERTM1, SERTM2, SFRP1, SFRP2, SFTPA1, SGCG, SHISAL1, SLC13A5, SLC17A8, SLC30A10, SLC4A4, SLC5A7, SLC6A2, SLC7A14, SLIT2, SLITRK2, SLITRK3, SLITRK4, SMIM28, SMYD1, SNAP25, SNAP91, SORCS1, SORCS3, SPHKAP, SPIB, SPOCK3, SST, ST8SIA3, STMN2, STMN4, STON1-GTF2A1L, STUM, SV2B, SYNM, SYNPO2, SYT10, SYT4, SYT6, TACR1, TAFA4, TCEAL2, TCEAL5, TCF23, TENM1, THBS4, TLL1, TMEFF2, TMEM100, TMEM35A, TMIGD1, TMOD1, TNNT3, TNS1, TNXB, TRARG1, TRDN, UGT2B10, UGT2B4, UNC80, VEGFD, VGLL3, VIT, VSTM2A, VXN, WSCD2, XKR4, ZBTB16, ZDHHC22, ZFHX4, ZMAT4, ZNF385B, ZNF676, and ZNF728.
3. The composition of claim 1, wherein the second molecular subtype further comprises one or more types of genes selected from the group consisting of ADAT3, ANP32D, BHLHA9, BOD1L2, C4orf48, CCDC85B, CDH16, CLMAT3, CSNK1A1L, CTU1, DBET, DDC-AS1, DEFA5, EIF3IP1, FAM173A, FEZF2, FOXI3, FRMD8P1, GALR3, GJD3, GPR25, HBA1, HES4, HIST1H4A, HIST1H4L, HLA-L, IGFBP7-AS1, ITLN2, KCNE1B, LCN15, LKAAEAR1, LOC101927795, LOC101927972, LOC101928372, LOC344967, LRRC26, MAGEA10, MESP1, MIR203A, MIR324, MIR3661, MIR4449, MIR4479, MIR4665, MIR4737, MIR4767, MIR6807, MIR6858, MIR6891, MIR8075, NACA2, NOXO1, ONECUT3, PCSK1N, PDF, PITPNM2-AS1, PNMA5, PRR7, PRSS2, PRSS56, PTGER1, PTTG3P, REG3A, RNA5S9, RNU4-1, RNU5A-1, RNU5B-1, RNU5E-1, RNU6ATAC, RNY1, RPL29P2, RPRML, SBF1P1, SHISAL2B, SKOR2, SLC32A1, SMARCA5-AS1, SMCR5, SNHG25, SNORA36A, SNORD30, SNORD38A, SNORD3B-2, SNORD41, SNORD48, TMEM160, TMEM238, TPGS1, TRAPPC5, UBE2NL, WBP11P1, and ZAR1.
4. The composition of claim 1, wherein the anticancer therapy is chemotherapy, radiation therapy, surgical treatment or a combination thereof.
5. The composition of claim 1, wherein the anticancer therapy is standard neoadjuvant chemoradiotherapy or surgical treatment after standard neoadjuvant chemoradiotherapy.
6. The composition of claim 1, wherein the cancer is rectal cancer.
7. A kit for predicting a therapeutic response to anticancer therapy or a prognosis after anticancer therapy, or identifying a target patient for neoadjuvant therapy prior to anticancer treatment in cancer patients, comprising: the composition of claim 1.
8. The kit of claim 7, wherein the kit is an RT-PCR kit, a DNA chip kit, an ELISA kit, a protein chip kit, a rapid kit or a multiple reaction monitoring (MRM) kit.
9. A biomarker composition for predicting a therapeutic response to anticancer therapy or a prognosis after anticancer therapy, or identifying a target patient for neoadjuvant therapy prior to anticancer therapy in cancer patients, comprising: at least one gene of a first molecular subtype and a second molecular subtype or a protein encoded thereby, wherein the first molecular subtype comprises one or more types of genes selected from PMP2, AGTR1, PLCXD3, TCEAL6, ANKRD1, and ARHGAP26-AS1, and the second molecular subtype comprises one or more types of genes selected from PGP, SLC26A3, HIST1H4C, RUVBL2, RAB19, HIST2H2AC, and SNORD69.
10. A method of providing information for predicting a therapeutic response to anticancer therapy or a prognosis after anticancer therapy, or identifying a target patient for neoadjuvant therapy prior to anticancer therapy, comprising: measuring the expression level of at least one gene of a first molecular subtype and a second molecular subtype or a protein encoded thereby in a biological sample isolated from a target subject, wherein the first molecular subtype comprises one or more types of genes selected from PMP2, AGTR1, PLCXD3, TCEAL6, ANKRD1, and ARHGAP26-AS1, and the second molecular subtype comprises one or more types of genes selected from PGP, SLC26A3, HIST1H4C, RUVBL2, RAB19, HIST2H2AC, and SNORD69.
11. The method of claim 10, wherein the first molecular subtype further comprises one or more types of genes selected from the group consisting of AADACL2, ABCA6, ABCA8, ABCA9, ABCB5, ABI3BP, ACADL, ACSM5, ACTG2, ADAMTS9-AS1, ADAMTS9-AS2, ADAMTSL3, ADCYAP1R1, ADGRB3, ADH1B, ADIPOQ, ADRA1A, AFF3, AGTR1, AICDA, ALB, ANGPTL1, ANGPTL5, ANGPTL7, ANK2, ANKS1B, ANXA8L1, APOA2, APOB, APOC3, AQP4, AQP8, ARPP21, ART4, ASB5, ASPA, ASTN1, ATCAY, ATP1A2, ATP2B2, ATP2B3, AVPR1B, B3GALT5-AS1, BCHE, BEST4, BHMT2, BLOC1S5-TXNDC5, BMP3, BRINP3, BVES, BVES-AS1, C14orf180, C1QTNF7, C7, C8orf88, CA1, CA2, CA7, CACNA2D1, CADM2, CADM3, CALN1, CARTPT, CASQ2, CAVIN2, CCBE1, CCDC144B, CCDC158, CCDC160, CCDC169, CCN5, CD300LG, CDH10, CDH19, CDKN2B-AS1, CDO1, CHRDL1, CHRM2, CHST9, CIDEA, CILP, CLCA4, CLCNKB, CLDN8, CLEC3B, CLEC4M, CLVS2, CMA1, CNGA3, CNN1, CNR1, CNTN1, CNTN2, CNTNAP4, COL19A1, CP, CPEB1, CPXM2, CR2, CRP, CTNNA3, CTSG, CYP1B1, DAO, DCLK1, DDR2, DES, DHRS7C, DIRAS2, DPP6, DPT, EBF2, ECRG4, ELAVL4, EPHA5, EPHA6, EPHA7, ERICH3, EVX2, FABP4, FAM106A, FAM133A, FAM135B, FAM180B, FDCSP, FGF10, FGF13-AS1, FGF14, FGFBP2, FGG, FGL1, FHL1, FILIP1, FLNC, FMO2, FRMD6-AS2, FRMPD4, FUT9, GABRA5, GABRG2, GALR1, GAP43, GAS1RR, GC, GCG, GDF6, GFRA1, GNAO1, GPM6A, GPR119, GPR12, GPRACR, GRIA2, GRIN2A, GTF2IP1, GUCA2B, HAND1, HAND2, HAND2-AS1, HEPACAM, HP, HPCAL4, HRG, HRK, HSPB8, HTR2B, IGSF10, IGSF11, IRX6, ISM1, KCNA1, KCNB1, KCNC2, KCNK2, KCNMA1, KCNMB1, KCNQ5, KCNT2, KCTD8, KERA, KHDRBS2, KIAA0408, KIF1A, KRT222, KRT24, KRTAP13-2, LCN10, LDB3, LEP, LGI1, LIFR, LINC00504, LINC00507, LINC00682, LINC00924, LINC01266, LINC01352, LINC01474, LINC01505, LINC01697, LINC01798, LINC01829, LINC02015, LINC02023, LINC02185, LINC02268, LINC02408, LINC02544, LIX1, LMO3, LMOD1, LOC100506289, LOC101928731, LOC102724050, LOC107986321, LOC283856, LOC440434, LOC729558, LONRF2, LRAT, LRCH2, LRRC3B, LRRC4C, LRRTM4, LVRN, LYVE1, MAB21L1, MAB21L2, MAGEE2, MAMDC2, MAPK4, MASP1, MEF2C-AS1, MEOX2, METTL24, MFAP5, 
MGAT4C, MGP, MICU3, MIR133A1HG, MIR8071-1, MMRN1, MORN5, MPPED2, MRGPRE, MS4A1, MS4A12, MSRB3, MUSK, MYH11, MYH2, MYLK, MYO3A, MYOC, MYOCD, MYOM1, MYOT, MYT1L, NALCN, NAP1L2, NBEA, NECAB1, NEFL, NEFM, NEGR1, NETO1, NEUROD1, NEXMIF, NEXN, NGB, NIBAN1, NLGN1, NOS1, NOVA1, NPR3, NPTX1, NPY2R, NRG3, NRK, NRSN1, NRXN1, NSG2, NTNG1, NTRK3, NUDT10, OGN, OLFM3, OMD, OTOP2, OTOP3, P2RX2, P2RY12, PAK3, PAPPA2, PCDH10, PCDH11X, PCDH9, PCOLCE2, PCP4L1, PCSK2, PDZRN4, PEG3, PENK, PGM5, PGM5-AS1, PGM5P4-AS1, PGR, PHOX2B, PI16, PIK3C2G, PIRT, PKHD1L1, PLAAT5, PLCXD3, PLD5, PLIN1, PLIN4, PLN, PLP1, PMP2, POPDC2, POU3F4, PPP1R1A, PRDM6, PRELP, PRG4, PRIMA1, PROKR1, PTCHD1, PTGIS, PTPRQ, PTPRZ1, PYGM, PYY, RANBP3L, RBFOX3, RBM20, RELN, RERGL, RGS13, RGS22, RIC3, RIMS4, RNF150, RNF180, RORB, RSPO2, SCARA5, SCGN, SCN2B, SCN7A, SCN9A, SCNN1G, SCRG1, SEMA3E, SERTM1, SERTM2, SFRP1, SFRP2, SFTPA1, SGCG, SHISAL1, SLC13A5, SLC17A8, SLC30A10, SLC4A4, SLC5A7, SLC6A2, SLC7A14, SLIT2, SLITRK2, SLITRK3, SLITRK4, SMIM28, SMYD1, SNAP25, SNAP91, SORCS1, SORCS3, SPHKAP, SPIB, SPOCK3, SST, ST8SIA3, STMN2, STMN4, STON1-GTF2A1L, STUM, SV2B, SYNM, SYNPO2, SYT10, SYT4, SYT6, TACR1, TAFA4, TCEAL2, TCEAL5, TCF23, TENM1, THBS4, TLL1, TMEFF2, TMEM100, TMEM35A, TMIGD1, TMOD1, TNNT3, TNS1, TNXB, TRARG1, TRDN, UGT2B10, UGT2B4, UNC80, VEGFD, VGLL3, VIT, VSTM2A, VXN, WSCD2, XKR4, ZBTB16, ZDHHC22, ZFHX4, ZMAT4, ZNF385B, ZNF676, and ZNF728.
12. The method of claim 10, wherein the second molecular subtype further comprises one or more types of genes selected from the group consisting of ADAT3, ANP32D, BHLHA9, BOD1L2, C4orf48, CCDC85B, CDH16, CLMAT3, CSNK1A1L, CTU1, DBET, DDC-AS1, DEFA5, EIF3IP1, FAM173A, FEZF2, FOXI3, FRMD8P1, GALR3, GJD3, GPR25, HBA1, HES4, HIST1H4A, HIST1H4L, HLA-L, IGFBP7-AS1, ITLN2, KCNE1B, LCN15, LKAAEAR1, LOC101927795, LOC101927972, LOC101928372, LOC344967, LRRC26, MAGEA10, MESP1, MIR203A, MIR324, MIR3661, MIR4449, MIR4479, MIR4665, MIR4737, MIR4767, MIR6807, MIR6858, MIR6891, MIR8075, NACA2, NOXO1, ONECUT3, PCSK1N, PDF, PITPNM2-AS1, PNMA5, PRR7, PRSS2, PRSS56, PTGER1, PTTG3P, REG3A, RNA5S9, RNU4-1, RNU5A-1, RNU5B-1, RNU5E-1, RNU6ATAC, RNY1, RPL29P2, RPRML, SBF1P1, SHISAL2B, SKOR2, SLC32A1, SMARCA5-AS1, SMCR5, SNHG25, SNORA36A, SNORD30, SNORD38A, SNORD3B-2, SNORD41, SNORD48, TMEM160, TMEM238, TPGS1, TRAPPC5, UBE2NL, WBP11P1, and ZAR1.
13. The method of claim 10, wherein the anticancer therapy is chemotherapy, radiation therapy, surgical treatment or a combination thereof.
14. The method of claim 10, wherein the anticancer therapy is standard neoadjuvant chemoradiotherapy or surgical treatment after standard neoadjuvant chemoradiotherapy.
15. The method of claim 10, wherein when the first molecular subtype is expressed in the biological sample isolated from a target subject, or the expression level thereof is higher than a control, it is predicted that a therapeutic response to the anticancer therapy or a prognosis after the anticancer therapy is poor.
16. The method of claim 10, wherein when the second molecular subtype is expressed in the biological sample isolated from a target subject, or the expression level thereof is higher than a control, it is predicted that a therapeutic response to the anticancer therapy or a prognosis after the anticancer therapy is good.
17. The method of claim 10, further comprising: confirming the subject's TNM stage, age, sex, a pathologic complete response (pCR) or combined information thereof.
18. The method of claim 17, wherein when the expression level of the first molecular subtype of the subject is higher than a control, and the TNM stage of the subject is T3 or T4, it is predicted that the prognosis after anticancer therapy is poor.
19. The method of claim 17, wherein when the expression level of the first molecular subtype of the subject is higher than a control, and the TNM stage of the subject is N1 or N2, it is predicted that the prognosis after anticancer therapy is poor.
20. The method of claim 17, wherein when the expression level of the second molecular subtype of the subject is higher than a control, and pCR is achieved after the anticancer therapy, it is predicted that a prognosis after anticancer therapy is good.
21. The method of claim 17, wherein when the expression level of the second molecular subtype of the subject is higher than a control, and the TNM stage of the subject is T0, T1 or T2, it is predicted that a prognosis after the anticancer therapy is good.
22. The method of claim 17, wherein when the expression level of the second molecular subtype of the subject is higher than a control, and the TNM stage of the subject is N0, it is predicted that a prognosis after anticancer therapy is good.
23. The method of claim 10, wherein the cancer is one or more types of cancer selected from the group consisting of breast cancer, uterine cancer, esophageal cancer, stomach cancer, brain cancer, rectal cancer, colon cancer, lung cancer, skin cancer, ovarian cancer, cervical cancer, kidney cancer, blood cancer, pancreatic cancer, prostate cancer, testicular cancer, laryngeal cancer, oral cancer, head and neck cancer, thyroid cancer, liver cancer, bladder cancer, osteosarcoma, lymphoma, and leukemia.
24. A device for predicting a therapeutic response to anticancer therapy or a prognosis after anticancer therapy, or identifying a target patient for total neoadjuvant therapy prior to anticancer therapy, comprising: a measurement unit for measuring the expression level of one or more genes of a first molecular subtype and a second molecular subtype or a protein encoded thereby in a biological sample isolated from a target subject; and a calculation unit that provides information for predicting a therapeutic response to anticancer therapy or a prognosis after anticancer therapy, and identifying a target patient for total neoadjuvant therapy from the expression level in the subject, wherein the first molecular subtype comprises one or more types of genes selected from PMP2, AGTR1, PLCXD3, TCEAL6, ANKRD1, and ARHGAP26-AS1, and the second molecular subtype comprises one or more types of genes selected from PGP, SLC26A3, HIST1H4C, RUVBL2, RAB19, HIST2H2AC, and SNORD69.
25. The device of claim 24, wherein the anticancer therapy is chemotherapy, radiation therapy, surgical treatment or a combination thereof.
26. The device of claim 24, wherein the anticancer therapy is standard neoadjuvant chemoradiotherapy or surgical treatment after standard neoadjuvant chemoradiotherapy.
27. The device of claim 24, further comprising: an input unit for receiving the TNM stage, age or sex of the subject, a pathologic complete response (pCR) or combined information thereof.
Description
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
[0155] One object of the present invention is to provide a biomarker composition that can accurately and simply predict a therapeutic response to anticancer therapy or a prognosis after anticancer therapy.
[0156] Hereinafter, the present invention will be described in further detail with reference to examples. These examples are only for illustrating the present invention in further detail, and it will be apparent to those of ordinary skill in the art that the scope of the present invention is not limited by these examples according to the gist of the present invention.
EXAMPLES
[Preparation Example 1] Study Cohort
[0157] To develop a rectal cancer-specific molecular subtype classifier, two clinical cohorts were used: publicly available RNAseq data from 177 rectal cancer patients downloaded from The Cancer Genome Atlas (TCGA) project, and RNAseq data from a validation cohort of 230 rectal cancer cases from the Yonsei Cancer Center.
[Preparation Example 2] Study Design
[0159] To find intrinsic rectal cancer molecular subtypes, non-negative matrix factorization (NMF) was applied, and genes differentially expressed between the identified subtypes were identified using the DESeq2 package. Based on the gene set enrichment analysis of the DESeq2 data, the inventors established a clinical hypothesis on the role of the molecular subtypes in predicting the response to chemotherapy and the prognosis, and prospectively tested it on 230 cases of rectal cancer diagnosed and treated at the Yonsei Cancer Center. To classify the rectal cancer patient group from the Yonsei Cancer Center into molecular subtypes, a molecular subtyping gene list was constructed from the 522 genes that showed a two-fold or greater difference in expression level and a p-value of less than 10⁻⁵ in the DESeq2 analysis. In addition, to develop an optimal classification gene list for the molecular subtypes found, Prediction Analysis for Microarrays (PAM) was used.
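The selection rule described in paragraph [0159] (a two-fold or greater expression difference combined with a small p-value) can be illustrated with a minimal Python sketch. It assumes that a DESeq2-style result table has already been exported. The `DEResult` type, the `select_template_genes` function, the sign convention of the fold change, and the placeholder gene names and values are all hypothetical and are not taken from the inventors' pipeline.

```python
import math
from typing import NamedTuple


class DEResult(NamedTuple):
    gene: str
    log2_fold_change: float  # assumed sign: positive = higher in subtype 1
    p_value: float


def select_template_genes(results, min_fold=2.0, max_p=1e-5):
    """Keep genes with >= min_fold difference and p < max_p, labeled by direction."""
    min_log2 = math.log2(min_fold)
    templates = {}
    for r in results:
        if r.p_value < max_p and abs(r.log2_fold_change) >= min_log2:
            templates[r.gene] = 1 if r.log2_fold_change > 0 else 2
    return templates


# Invented toy values, for illustration only.
toy_results = [
    DEResult("GENE_A", 2.3, 1e-9),    # kept, assigned to subtype 1
    DEResult("GENE_B", -1.4, 3e-7),   # kept, assigned to subtype 2
    DEResult("GENE_C", 0.6, 1e-12),   # rejected: less than two-fold
    DEResult("GENE_D", 3.0, 0.01),    # rejected: p-value too large
]
templates = select_template_genes(toy_results)
print(templates)
```

Both cut-offs are exposed as parameters because the description uses different significance levels in different paragraphs.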
[Preparation Example 3] Validation Cohort
[0160] To predict the prognosis of rectal cancer patients, the case histories of rectal adenocarcinoma patients treated at the Yonsei Cancer Center (Seoul, Korea) from 1995 to 2012 were reviewed, and 264 cases were identified. According to the inclusion criteria, cases in which 1) total mesorectal excision (TME) was performed following preoperative chemoradiotherapy (preop-CRT) and 2) a formalin-fixed paraffin-embedded (FFPE) pretreatment biopsy sample was available were included. Cases in which 3) the patient received incomplete CRT, had metastatic disease, or received palliative treatment were excluded.
[0161] CRT consisted of a total of 45 Gy delivered to the pelvis in 25 fractions of 180 cGy, 5 times a week, using a 3D conformal technique, and a boost of 540 cGy was given in 3 fractions. The concurrent chemotherapy was 5-fluorouracil combined with leucovorin, or capecitabine. TME was performed within 6 to 8 weeks after the completion of CRT. Fluorouracil-based postoperative radiation (postop-CRT) was given unless the postoperative pathological examination reported pCR.
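The fractionation figures in paragraph [0161] can be checked with simple arithmetic. The short Python sketch below uses only the numbers given in the text. The cumulative total assumes that the boost is added to the pelvic dose, which the text suggests but does not state explicitly, and the variable names are illustrative.

```python
CGY_PER_GY = 100

pelvis_fractions = 25
dose_per_fraction_cgy = 180
pelvis_cgy = pelvis_fractions * dose_per_fraction_cgy   # 25 x 180 cGy

boost_cgy = 540
boost_fractions = 3
boost_per_fraction_cgy = boost_cgy / boost_fractions

# Assumption: the boost is delivered in addition to the pelvic dose.
total_gy = (pelvis_cgy + boost_cgy) / CGY_PER_GY

print(pelvis_cgy / CGY_PER_GY, boost_per_fraction_cgy, total_gy)
```

The pelvic dose works out to the stated 45 Gy, and the boost uses the same 180 cGy fraction size as the pelvic course.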
[0162] After surgery, patients were followed at 3-month intervals for the first 3 years, at 6-month intervals for the next 2 years, and annually thereafter. Routine surveillance included physical examination, endoscopy, serum carcinoembryonic antigen (CEA) measurement, chest and abdominopelvic CT scans, and toxicity assessment. Histological confirmation, MRI or FDG-PET was performed for further evaluation when recurrence was suspected. Intrapelvic recurrence was defined as local recurrence, and all other recurrences were defined as distant recurrence.
[Preparation Example 4] RNAseq and Quality Control
[0163] RNA was extracted from formalin-fixed paraffin-embedded biopsy tissues of the 264 cases using the Qiagen AllPrep DNA/RNA FFPE kit (Qiagen, Valencia, Calif., USA). Tumor-rich areas were taken from 1 to 17 tissue sections, each 5 microns thick, depending on the amount of available tissue. The RNA concentration was quantified by fluorometric analysis (Qubit RNA Assay kit, ThermoFisher Scientific, USA). RNAseq was performed using the Ion Proton platform according to the manufacturer's instructions. The 34 cases with poor data quality were excluded, leaving 230 cases for analysis.
[Preparation Example 5] RNAseq Data and Gene Set Enrichment Analysis
[0164] Gene expression values were quantified using HTSeq with the Ensembl GRCh37 gene model. Count data were normalized by the DESeq2 variance stabilizing transformation (VST). Cases were assigned to the NMF-derived intrinsic subtypes using Nearest Template Prediction with the 522 classifier gene templates derived from DESeq2. In addition, enrichment analysis of colon cancer-related and cancer-related gene sets was performed using the “CMSgsa” gene set enrichment analysis function of the CMScaller package, a consensus molecular subtype (CMS) classifier for colorectal cancer (CRC), and the “fgsea” package.
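The idea behind Nearest Template Prediction in paragraph [0164] can be sketched in a few lines of Python. This is a simplified illustration rather than the implementation used by the inventors: it assumes that expression values have already been centered per gene, it uses a 0/1 template for each class with cosine distance, and it omits the permutation-based significance estimate. The function names, gene names, and values are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def nearest_template(sample, templates):
    """sample: {gene: centered expression}; templates: {gene: class label}.

    Returns the class whose 0/1 template is closest to the sample,
    together with the distance to every class template.
    """
    genes = [g for g in templates if g in sample]
    classes = sorted(set(templates[g] for g in genes))
    x = [sample[g] for g in genes]
    distances = {}
    for k in classes:
        template = [1.0 if templates[g] == k else 0.0 for g in genes]
        distances[k] = cosine_distance(x, template)
    best = min(distances, key=distances.get)
    return best, distances


# Invented toy data: two marker genes per subtype.
toy_templates = {"G1": 1, "G2": 1, "G3": 2, "G4": 2}
sample_a = {"G1": 1.8, "G2": 1.1, "G3": -0.9, "G4": -1.4}
sample_b = {"G1": -1.2, "G2": -0.7, "G3": 1.5, "G4": 0.9}

best_a, dist_a = nearest_template(sample_a, toy_templates)
best_b, dist_b = nearest_template(sample_b, toy_templates)
print(best_a, best_b)
```

A sample is assigned to the subtype whose marker genes are collectively elevated, so no trained model parameters beyond the gene list are needed.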
[Preparation Example 6] Statistical Analysis of Clinical Results by Intrinsic Subtypes
[0165] All statistical analyses were performed using the R statistical programming environment, version 3.6.3, and RStudio, version 1.2.5033. The primary endpoint was disease-free survival (DFS), defined as the period from the date of surgery to the first local or distant recurrence event or death, or to the last follow-up when censored. The secondary endpoints were distant recurrence-free survival (DRFS), local recurrence-free survival (LRFS), and overall survival (OS). For the LRFS and DRFS analyses, data were censored at the time of a competing event. OS was defined as the period from the date of surgery to the date of death, or to the last follow-up when censored. The R survival package was used for survival analysis, and the survminer and ggplot2 packages were used to generate Kaplan-Meier plots. Survival differences between intrinsic subtypes were estimated by the Kaplan-Meier method and tested with a log-rank test. Factors related to DFS and OS were analyzed by Cox proportional hazards regression analysis. A two-sided P-value of less than 0.05 was considered statistically significant. The relative accuracy of each prognostic model was evaluated using the log-likelihood ratio and the concordance index (C-index).
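To make the handling of censored follow-up in paragraph [0165] concrete, the following is a minimal Python sketch of the Kaplan-Meier (product-limit) estimator. It is not the R survival package used in the study. The function name and the toy follow-up data are invented for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. recurrence or death) occurred, 0 if censored
    Returns a list of (time, survival probability), one entry per event time.
    """
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        n_events = 0
        n_leaving = 0
        # Process every record that shares this time point together.
        while i < len(records) and records[i][0] == t:
            n_events += records[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Censored and event cases both leave the risk set after time t.
        at_risk -= n_leaving
    return curve


# Invented toy follow-up (months): 1 = event observed, 0 = censored.
toy_times = [6, 12, 12, 20, 30, 36]
toy_events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to their last follow-up but never lower the survival estimate themselves, which is why censoring at a competing event changes the LRFS and DRFS curves.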
[Preparation Example 7] Discovery of Rectal Cancer Molecular Subtype Using RNAseq Data from TCGA-READ Cohort
[0168] RNAseq data (TCGA-READ.htseq_fpkm-uq.tsv) of 177 rectal cancer samples were downloaded from The Cancer Genome Atlas project. To identify the optimal number of subtypes, a non-negative matrix factorization analysis was performed with ranks from 2 to 5 using the R package “NMF.” The cophenetic index and silhouette values obtained through consensus clustering suggested that dividing rectal cancer into two molecular subtypes was the best option.
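The factorization step in paragraph [0168] can be illustrated with a small pure-Python sketch. It assumes the classic multiplicative-update rule for the squared-error objective, which may differ from the algorithm selected in the R “NMF” package, and it shows a single run at rank 2 only. The repeated runs, consensus clustering, and the cophenetic and silhouette measures used to compare ranks 2 to 5 are omitted. All function names and the toy matrix are hypothetical.

```python
import random


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative genes x samples matrix V into W (genes x rank)
    and H (rank x samples) with multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def assign_clusters(h):
    """Each sample (column of H) joins the factor with the largest coefficient."""
    return [max(range(len(h)), key=lambda k: h[k][j]) for j in range(len(h[0]))]


# Invented toy matrix: 4 genes x 6 samples with two visible sample groups.
v = [
    [5.0, 6.0, 5.0, 0.5, 0.4, 0.6],
    [4.0, 5.0, 5.0, 0.3, 0.5, 0.4],
    [0.4, 0.5, 0.3, 6.0, 5.0, 6.0],
    [0.6, 0.4, 0.5, 5.0, 6.0, 5.0],
]
w, h = nmf(v, rank=2)
clusters = assign_clusters(h)
print(clusters, frob_error(v, w, h))
```

In a rank-selection study, this factorization would be repeated many times per rank with different seeds, and the stability of the resulting sample assignments would be compared across ranks.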
[Preparation Example 8] Properties of Two Types of Intrinsic Molecular Subtypes of Rectal Cancer
[0170] To find the biological properties of the two molecular subtypes found, a gene set enrichment analysis was performed to identify significantly different biological pathways between two subtypes using the “fgsea” package of R and the “CMSgsa” package. As shown in
[Preparation Example 9] Difference Between Rectal Cancer-Intrinsic Molecular Subtype and Colon Cancer Molecular Subtype
[0172] Conventionally, rectal cancer has been included in the molecular subtype classification of colorectal cancer because it has been regarded as a part of colorectal cancer. By global consensus, colon cancer is classified into four molecular subtypes, Consensus Molecular Subtypes (CMS) 1 to 4. The CMS4 subtype has a poor prognosis and shows activation of epithelial-mesenchymal transition pathway genes, so it could be expected to be the same as the first molecular subtype found by the inventors. Accordingly, the TCGA-READ RNAseq data were classified into CMS molecular subtypes using the CMScaller package, and the correlation with the two molecular subtypes found by the inventors was examined. Only 58.8% of the first molecular subtype was classified as CMS4, while 17.5% of the second molecular subtype was also classified as CMS4. Thus, the two classifications show a statistically significant correlation but do not match.
TABLE 1
                          CMS1  CMS2  CMS3  CMS4        Total
First molecular subtype      8    13    12  47 (58.8%)     80
Second molecular subtype    12    46    22  17 (17.5%)     97
Total                       20    59    34  64            177
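The percentages and the association reported for Table 1 can be reproduced from the counts. The text does not name the statistical test that was used, so the Python sketch below assumes Pearson's chi-square test of independence and compares the statistic with the standard 5% critical value for 3 degrees of freedom (7.815). The function and variable names are illustrative.

```python
def pearson_chi_square(rows):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


# Counts from Table 1; columns are CMS1, CMS2, CMS3, CMS4.
counts = [
    [8, 13, 12, 47],   # first molecular subtype
    [12, 46, 22, 17],  # second molecular subtype
]
cms4_share_first = 100 * counts[0][3] / sum(counts[0])
cms4_share_second = 100 * counts[1][3] / sum(counts[1])

stat, dof = pearson_chi_square(counts)
CRITICAL_5_PERCENT_DF3 = 7.815  # chi-square critical value, alpha = 0.05, df = 3
significant = stat > CRITICAL_5_PERCENT_DF3

print(cms4_share_first, cms4_share_second, stat, dof, significant)
```

A large statistic alongside a CMS4 share well below 100% in the first subtype is consistent with the conclusion that the two classifications are related but not interchangeable.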
[0173] As a result of a gene set enrichment analysis for 8 types of rectal cancer classified by two classification methods using the “CMSgsa” function of the “CMScaller” package in order to understand the difference between CMS4 and the first molecular subtype, as shown in
[0174] In addition, based on the above data, it was hypothesized that the first molecular subtype was associated with a worse prognosis and a lower response to preoperative chemoradiotherapy, compared to the second molecular subtype.
[Preparation Example 10] Development of Classifier for Newly Found Molecular Subtype (1)
[0176] The Prediction Analysis for Microarrays R package (pamr) was used to develop a classifier for the newly found molecular subtypes. For the analysis, a threshold of 6 and a prop-selected-in-cv threshold of 0.6 were used. For reference, changing the threshold and the prop-selected-in-cv threshold affects the number of selected genes and the final performance of the classifier, but the top classifier genes remain the same and the clinical performance is similar, with only a slight change in p-value. As a result of this analysis, 94 genes were primarily selected as templates for subtype classification, as shown in Table 2. In the molecular subtype column of Table 2, 1 denotes the first molecular subtype, and 2 denotes the second molecular subtype.
TABLE 2
No.  Gene  Molecular subtype
1  ZNF728  1
2  ZNF676  1
3  TVP23C-CDRT4  1
4  TCEAL2  1
5  TBC1D3L  1
6  SYT4  1
7  SLITRK4  1
8  SEMA3E  1
9  SCN9A  1
10  SCN7A  1
11  RANBP3L  1
12  PLN  1
13  PLGLB2  1
14  PLCXD3  1
15  PGM5P3-AS1  1
16  PGM5-AS1  1
17  PCDH10  1
18  OR7E12P  1
19  NLGN1  1
20  NEXN  1
21  MYH8  1
22  MIR4477B  1
23  MIR3911  1
24  MIR186  1
25  MIR133A1HG  1
26  MEIS1-AS2  1
27  LONRF2  1
28  LOC644838  1
29  LOC642131  1
30  LOC440434  1
31  LOC101929607  1
32  LOC101928509  1
33  LOC100507387  1
34  LOC100507073  1
35  LINGO2  1
36  LINC01537  1
37  LINC01489  1
38  LINC01352  1
39  LINC01266  1
40  LINC00504  1
41  LGI1  1
42  KRT222  1
43  KIAA2022  1
44  KIAA0408  1
45  KCTD8  1
46  HNRNPA1P33  1
47  HLX-AS1  1
48  HIST2H3C  1
49  HCG23  1
50  GTF2IP1  1
51  GRIN2A  1
52  GRIA2  1
53  GOLGA8K  1
54  GAS1RR  1
55  FILIP1  1
56  FAM47E-STBD1  1
57  FAM35BP  1
58  FAM133A  1
59  EPHA6  1
60  CTAGE8  1
61  CDH19  1
62  CCDC144B  1
63  C10orf131  1
64  BVES-AS1  1
65  BLOC1S5-TXNDC5  1
66  BCHE  1
67  ARHGEF18  1
68  ADAMTS9-AS1  1
69  ACADL  1
70  TRAPPC5  2
71  TPGS1  2
72  TMEM160  2
73  SNORD38A  2
74  SNORD30  2
75  SNHG25  2
76  PRR7  2
77  PDF  2
78  NOXO1  2
79  MIR3661  2
80  LOC440311  2
81  FEZF2  2
82  FAM173A  2
83  EIF3IP1  2
84  CTU1  2
85  C4orf48  2
[0177] As shown in Table 2, classification genes relatively over-expressed in the primarily selected first molecular subtype are ACADL, ADAMTS9-AS1, ARHGEF18, BCHE, BLOC1S5-TXNDC5, BVES-AS1, C10orf131, CCDC144B, CDH19, CTAGE8, EPHA6, FAM133A, FAM35BP, FAM47E-STBD1, FILIP1, GAS1RR, GOLGA8K, GRIA2, GRIN2A, GTF2IP1, HCG23, HIST2H3C, HLX-AS1, HNRNPA1P33, KCTD8, KIAA0408, KIAA2022, KRT222, LGI1, LINC00504, LINC01266, LINC01352, LINC01489, LINC01537, LINGO2, LOC100507073, LOC100507387, LOC101928509, LOC101929607, LOC440434, LOC642131, LOC644838, LONRF2, MEIS1-AS2, MIR133A1HG, MIR186, MIR3911, MIR4477B, MYH8, NEXN, NLGN1, OR7E12P, PCDH10, PGM5-AS1, PGM5P3-AS1, PLCXD3, PLGLB2, PLN, RANBP3L, SCN7A, SCN9A, SEMA3E, SLITRK4, SYT4, TBC1D3L, TCEAL2, TVP23C-CDRT4, ZNF676, and ZNF728.
[0178] Meanwhile, classification genes relatively over-expressed in the primarily-selected second molecular subtype are C4orf48, CTU1, EIF3IP1, FAM173A, FEZF2, LOC440311, MIR3661, NOXO1, PDF, PRR7, SNHG25, SNORD30, SNORD38A, TMEM160, TPGS1, and TRAPPC5.
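The role of the threshold in the PAM analysis of paragraph [0176] can be illustrated with the soft-thresholding step at the core of the nearest shrunken centroid method. The Python sketch below assumes that the standardized differences between each class centroid and the overall centroid have already been computed, and it is not the pamr implementation. The function names, gene names, and scores are hypothetical.

```python
def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within +/- delta become 0."""
    if d > delta:
        return d - delta
    if d < -delta:
        return d + delta
    return 0.0


def surviving_genes(scores, delta):
    """scores: {gene: [standardized centroid difference for each class]}.

    A gene stays in the classifier only if at least one of its shrunken
    differences is non-zero.
    """
    kept = {}
    for gene, diffs in scores.items():
        shrunk = [soft_threshold(d, delta) for d in diffs]
        if any(s != 0.0 for s in shrunk):
            kept[gene] = shrunk
    return kept


# Invented toy scores for two classes.
toy_scores = {
    "GENE_A": [7.5, -7.5],
    "GENE_B": [2.0, -2.0],
    "GENE_C": [-6.4, 6.4],
}
kept = surviving_genes(toy_scores, delta=6.0)
print(sorted(kept))
```

Raising the threshold removes genes with weaker class separation first, which is consistent with the observation that the top classifier genes stay the same when the threshold is changed.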
[0179] When gene expression is analyzed by RNAseq, the expression levels of many genes can be measured at the same time, so it is possible to apply a panel consisting of more genes than the 94-gene panel. Differentially expressed genes between the two subtypes were identified using the “DESeq2” package in R. At a statistical significance level of p<10⁻⁷, there were 4877 differentially expressed genes between the two molecular subtypes. A molecular subtype classifier can be developed by selecting and combining some of these 4877 genes in various ways; in one example, as shown in Table 3, classification is possible using as templates the 522 genes whose expression level differs by 2-fold or more between the two molecular subtypes. In the molecular subtype column of Table 3 below, 1 denotes the first molecular subtype, and 2 denotes the second molecular subtype.
TABLE 3
No.  Gene  Molecular subtype
1  ADAT3  2
2  ANP32D  2
3  BHLHA9  2
4  BOD1L2  2
5  C4orf48  2
6  CCDC85B  2
7  CDH16  2
8  CLMAT3  2
9  CSNK1A1L  2
10  CTU1  2
11  DBET  2
12  DDC-AS1  2
13  DEFA5  2
14  EIF3IP1  2
15  FAM173A  2
16  FEZF2  2
17  FOXI3  2
18  FRMD8P1  2
19  GALR3  2
20  GJD3  2
21  GPR25  2
22  HBA1  2
23  HES4  2
24  HIST1H4A  2
25  HIST1H4L  2
26  HLA-L  2
27  IGFBP7-AS1  2
28  ITLN2  2
29  KCNE1B  2
30  LCN15  2
31  LKAAEAR1  2
32  LOC101927795  2
33  LOC101927972  2
34  LOC101928372  2
35  LOC344967  2
36  LRRC26  2
37  MAGEA10  2
38  MESP1  2
39  MIR203A  2
40  MIR324  2
41  MIR3661  2
42  MIR4449  2
43  MIR4479  2
44  MIR4665  2
45  MIR4737  2
46  MIR4767  2
47  MIR6807  2
48  MIR6858  2
49  MIR6891  2
50  MIR8075  2
51  NACA2  2
52  NOXO1  2
53  ONECUT3  2
54  PCSK1N  2
55  PDF  2
56  PITPNM2-AS1  2
57  PNMA5  2
58  PRR7  2
59  PRSS2  2
60  PRSS56  2
61  PTGER1  2
62  PTTG3P  2
63  REG3A  2
64  RNA5S9  2
65  RNU4-1  2
66  RNU5A-1  2
67  RNU5B-1  2
68  RNU5E-1  2
69  RNU6ATAC  2
70  RNY1  2
71  RPL29P2  2
72  RPRML  2
73  SBF1P1  2
74  SHISAL2B  2
75  SKOR2  2
76  SLC32A1  2
77  SMARCA5-AS1  2
78  SMCR5  2
79  SNHG25  2
80  SNORA36A  2
81  SNORD30  2
82  SNORD38A  2
83  SNORD3B-2  2
84  SNORD41  2
85  SNORD48  2
86  TMEM160  2
87  TMEM238  2
88  TPGS1  2
89  TRAPPC5  2
90  UBE2NL  2
91  WBP11P1  2
92  ZAR1  2
93  AADACL2  1
94  ABCA6  1
95  ABCA8  1
96  ABCA9  1
97  ABCB5  1
98  ABI3BP  1
99  ACADL  1
100  ACSM5  1
101  ACTG2  1
102  ADAMTS9-AS1  1
103  ADAMTS9-AS2  1
104  ADAMTSL3  1
105  ADCYAP1R1  1
106  ADGRB3  1
107  ADH1B  1
108  ADIPOQ  1
109  ADRA1A  1
110  AFF3  1
111  AGTR1  1
112  AICDA  1
113  ALB  1
114  ANGPTL1  1
115  ANGPTL5  1
116  ANGPTL7  1
117  ANK2  1
118  ANKS1B  1
119  ANXA8L1  1
120  APOA2  1
121  APOB  1
122  APOC3  1
123  AQP4  1
124  AQP8  1
125  ARPP21  1
126  ART4  1
127  ASB5  1
128  ASPA  1
129  ASTN1  1
130  ATCAY  1
131  ATP1A2  1
132  ATP2B2  1
133  ATP2B3  1
134  AVPR1B  1
135  B3GALT5-AS1  1
136  BCHE  1
137  BEST4  1
138  BHMT2  1
139  BLOC1S5-TXNDC5  1
140  BMP3  1
141  BRINP3  1
142  BVES  1
143  BVES-AS1  1
144  C14orf180  1
145  C1QTNF7  1
146  C7  1
147  C8orf88  1
148  CA1  1
149  CA2  1
150  CA7  1
151  CACNA2D1  1
152  CADM2  1
153  CADM3  1
154  CALN1  1
155  CARTPT  1
156  CASQ2  1
157  CAVIN2  1
158  CCBE1  1
159  CCDC144B  1
160  CCDC158  1
161  CCDC160  1
162  CCDC169  1
163  CCN5  1
164  CD300LG  1
165  CDH10  1
166  CDH19  1
167  CDKN2B-AS1  1
168  CDO1  1
169  CHRDL1  1
170  CHRM2  1
171  CHST9  1
172  CIDEA  1
173  CILP  1
174  CLCA4  1
175  CLCNKB  1
176  CLDN8  1
177  CLEC3B  1
178  CLEC4M  1
179  CLVS2  1
180  CMA1  1
181  CNGA3  1
182  CNN1  1
183  CNR1  1
184  CNTN1  1
185  CNTN2  1
186  CNTNAP4  1
187  COL19A1  1
188  CP  1
189  CPEB1  1
190  CPXM2  1
191  CR2  1
192  CRP  1
193  CTNNA3  1
194  CTSG  1
195  CYP1B1  1
196  DAO  1
197  DCLK1  1
198  DDR2  1
199  DES  1
200  DHRS7C  1
201  DIRAS2  1
202  DPP6  1
203  DPT  1
204  EBF2  1
205  ECRG4  1
206  ELAVL4  1
207  EPHA5  1
208  EPHA6  1
209  EPHA7  1
210  ERICH3  1
211  EVX2  1
212  FABP4  1
213  FAM106A  1
214  FAM133A  1
215  FAM135B  1
216  FAM180B  1
217  FDCSP  1
218  FGF10  1
219  FGF13-AS1  1
220  FGF14  1
221  FGFBP2  1
222  FGG  1
223  FGL1  1
224  FHL1  1
225  FILIP1  1
226  FLNC  1
227  FMO2  1
228  FRMD6-AS2  1
229  FRMPD4  1
230  FUT9  1
231  GABRA5  1
232  GABRG2  1
233  GALR1  1
234  GAP43  1
235  GAS1RR  1
236  GC  1
237  GCG  1
238  GDF6  1
239  GFRA1  1
240  GNAO1  1
241  GPM6A  1
242  GPR119  1
243  GPR12  1
244  GPRACR  1
245  GRIA2  1
246  GRIN2A  1
247  GTF2IP1  1
248  GUCA2B  1
249  HAND1  1
250  HAND2  1
251  HAND2-AS1  1
252  HEPACAM  1
253  HP  1
254  HPCAL4  1
255  HRG  1
256  HRK  1
257  HSPB8  1
258  HTR2B  1
259  IGSF10  1
260  IGSF11  1
261  IRX6  1
262  ISM1  1
263  KCNA1  1
264  KCNB1  1
265  KCNC2  1
266  KCNK2  1
267  KCNMA1  1
268  KCNMB1  1
269  KCNQ5  1
270  KCNT2  1
271  KCTD8  1
272  KERA  1
273  KHDRBS2  1
274  KIAA0408  1
275  KIF1A  1
276  KRT222  1
277  KRT24  1
278  KRTAP13-2  1
279  LCN10  1
280  LDB3  1
281  LEP  1
282  LGI1  1
283  LIFR  1
284  LINC00504  1
285  LINC00507  1
286  LINC00682  1
287  LINC00924  1
288  LINC01266  1
289  LINC01352  1
290  LINC01474  1
291  LINC01505  1
292  LINC01697  1
293  LINC01798  1
294  LINC01829  1
295  LINC02015  1
296  LINC02023  1
297  LINC02185  1
298  LINC02268  1
299  LINC02408  1
300  LINC02544  1
301  LIX1  1
302  LMO3  1
303  LMOD1  1
304  LOC100506289  1
305  LOC101928731  1
306  LOC102724050  1
307  LOC107986321  1
308  LOC283856  1
309  LOC440434  1
310  LOC729558  1
311  LONRF2  1
312  LRAT  1
313  LRCH2  1
314  LRRC3B  1
315  LRRC4C  1
316  LRRTM4  1
317  LVRN  1
318  LYVE1  1
319  MAB21L1  1
320  MAB21L2  1
321  MAGEE2  1
322  MAMDC2  1
323  MAPK4  1
324  MASP1  1
325  MEF2C-AS1  1
326  MEOX2  1
327  METTL24  1
328  MFAP5  1
329  MGAT4C  1
330  MGP  1
331  MICU3  1
332  MIR133A1HG  1
333  MIR8071-1  1
334  MMRN1  1
335  MORN5  1
336  MPPED2  1
337  MRGPRE  1
338  MS4A1  1
339  MS4A12  1
340  MSRB3  1
341  MUSK  1
342  MYH11  1
343  MYH2  1
344  MYLK  1
345  MYO3A  1
346  MYOC  1
347  MYOCD  1
348  MYOM1  1
349  MYOT  1
350  MYT1L  1
351  NALCN  1
352  NAP1L2  1
353  NBEA  1
354  NECAB1  1
355  NEFL  1
356  NEFM  1
357  NEGR1  1
358  NETO1  1
359  NEUROD1  1
360  NEXMIF  1
361  NEXN  1
362  NGB  1
363  NIBAN1  1
364  NLGN1  1
365  NOS1  1
366  NOVA1  1
367  NPR3  1
368  NPTX1  1
369  NPY2R  1
370  NRG3  1
371  NRK  1
372  NRSN1  1
373  NRXN1  1
374  NSG2  1
375  NTNG1  1
376  NTRK3  1
377  NUDT10  1
378  OGN  1
379  OLFM3  1
380  OMD  1
381  OTOP2  1
382  OTOP3  1
383  P2RX2  1
384  P2RY12  1
385  PAK3  1
386  PAPPA2  1
387  PCDH10  1
388  PCDH11X  1
389  PCDH9  1
390  PCOLCE2  1
391  PCP4L1  1
392  PCSK2  1
393  PDZRN4  1
394  PEG3  1
395  PENK  1
396  PGM5  1
397  PGM5-AS1  1
398  PGM5P4-AS1  1
399  PGR  1
400  PHOX2B  1
401  PI16  1
402  PIK3C2G  1
403  PIRT  1
404  PKHD1L1  1
405  PLAAT5  1
406  PLCXD3  1
407  PLD5  1
408  PLIN1  1
409  PLIN4  1
410  PLN  1
411  PLP1  1
412  PMP2  1
413  POPDC2  1
414  POU3F4  1
415  PPP1R1A  1
416  PRDM6  1
417  PRELP  1
418  PRG4  1
419  PRIMA1  1
420  PROKR1  1
421  PTCHD1  1
422  PTGIS  1
423  PTPRQ  1
424  PTPRZ1  1
425  PYGM  1
426  PYY  1
427  RANBP3L  1
428  RBFOX3  1
429  RBM20  1
430  RELN  1
431  RERGL  1
432  RGS13  1
433  RGS22  1
434  RIC3  1
435  RIMS4  1
436  RNF150  1
437  RNF180  1
438  RORB  1
439  RSPO2  1
440  SCARA5  1
441  SCGN  1
442  SCN2B  1
443  SCN7A  1
444  SCN9A  1
445  SCNN1G  1
446  SCRG1  1
447  SEMA3E  1
448  SERTM1  1
449  SERTM2  1
450  SFRP1  1
451  SFRP2  1
452  SFTPA1  1
453  SGCG  1
454  SHISAL1  1
455  SLC13A5  1
456  SLC17A8  1
457  SLC30A10  1
458  SLC4A4  1
459  SLC5A7  1
460  SLC6A2  1
461  SLC7A14  1
462  SLIT2  1
463  SLITRK2  1
464  SLITRK3  1
465  SLITRK4  1
466  SMIM28  1
467  SMYD1  1
468  SNAP25  1
469  SNAP91  1
470  SORCS1  1
471  SORCS3  1
472  SPHKAP  1
473  SPIB  1
474  SPOCK3  1
475  SST  1
476  ST8SIA3  1
477  STMN2  1
478  STMN4  1
479  STON1-GTF2A1L  1
480  STUM  1
481  SV2B  1
482  SYNM  1
483  SYNPO2  1
484  SYT10  1
485  SYT4  1
486  SYT6  1
487  TACR1  1
488  TAFA4  1
489  TCEAL2  1
490  TCEAL5  1
491  TCEAL6  1
492  TCF23  1
493  TENM1  1
494  THBS4  1
495  TLL1  1
496  TMEFF2  1
497  TMEM100  1
498  TMEM35A  1
499  TMIGD1  1
500  TMOD1  1
501  TNNT3  1
502  TNS1  1
503  TNXB  1
504  TRARG1  1
505  TRDN  1
506  UGT2B10  1
507  UGT2B4  1
508  UNC80  1
509  VEGFD  1
510  VGLL3  1
511  VIT  1
512  VSTM2A  1
513  VXN  1
514  WSCD2  1
515  XKR4  1
516  ZBTB16  1
517  ZDHHC22  1
518  ZFHX4  1
519  ZMAT4  1
520  ZNF385B  1
521  ZNF676  1
522  ZNF728  1
[0180] As shown in Table 3, secondly selected first molecular subtypes are AADACL2, ABCA6, ABCA8, ABCA9, ABCB5, ABI3BP, ACADL, ACSM5, ACTG2, ADAMTS9-AS1, ADAMTS9-AS2, ADAMTSL3, ADCYAP1R1, ADGRB3, ADH1B, ADIPOQ, ADRA1A, AFF3, AGTR1, AICDA, ALB, ANGPTL1, ANGPTL5, ANGPTL7, ANK2, ANKS1B, ANXA8L1, APOA2, APOB, APOC3, AQP4, AQP8, ARPP21, ART4, ASB5, ASPA, ASTN1, ATCAY, ATP1A2, ATP2B2, ATP2B3, AVPR1B, B3GALT5-AS1, BCHE, BEST4, BHMT2, BLOC1S5-TXNDC5, BMP3, BRINP3, BVES, BVES-AS1, C14orf180, C1QTNF7, C7, C8orf88, CA1, CA2, CA7, CACNA2D1, CADM2, CADM3, CALN1, CARTPT, CASQ2, CAVIN2, CCBE1, CCDC144B, CCDC158, CCDC160, CCDC169, CCN5, CD300LG, CDH10, CDH19, CDKN2B-AS1, CDO1, CHRDL1, CHRM2, CHST9, CIDEA, CILP, CLCA4, CLCNKB, CLDN8, CLEC3B, CLEC4M, CLVS2, CMA1, CNGA3, CNN1, CNR1, CNTN1, CNTN2, CNTNAP4, COL19A1, CP, CPEB1, CPXM2, CR2, CRP, CTNNA3, CTSG, CYP1B1, DAO, DCLK1, DDR2, DES, DHRS7C, DIRAS2, DPP6, DPT, EBF2, ECRG4, ELAVL4, EPHA5, EPHA6, EPHA7, ERICH3, EVX2, FABP4, FAM106A, FAM133A, FAM135B, FAM180B, FDCSP, FGF10, FGF13-AS1, FGF14, FGFBP2, FGG, FGL1, FHL1, FILIP1, FLNC, FMO2, FRMD6-AS2, FRMPD4, FUT9, GABRA5, GABRG2, GALR1, GAP43, GAS1RR, GC, GCG, GDF6, GFRA1, GNAO1, GPM6A, GPR119, GPR12, GPRACR, GRIA2, GRIN2A, GTF2IP1, GUCA2B, HAND1, HAND2, HAND2-AS1, HEPACAM, HP, HPCAL4, HRG, HRK, HSPB8, HTR2B, IGSF10, IGSF11, IRX6, ISM1, KCNA1, KCNB1, KCNC2, KCNK2, KCNMA1, KCNMB1, KCNQ5, KCNT2, KCTD8, KERA, KHDRBS2, KIAA0408, KIF1A, KRT222, KRT24, KRTAP13-2, LCN10, LDB3, LEP, LGI1, LIFR, LINC00504, LINC00507, LINC00682, LINC00924, LINC01266, LINC01352, LINC01474, LINC01505, LINC01697, LINC01798, LINC01829, LINC02015, LINC02023, LINC02185, LINC02268, LINC02408, LINC02544, LIX1, LMO3, LMOD1, LOC100506289, LOC101928731, LOC102724050, LOC107986321, LOC283856, LOC440434, LOC729558, LONRF2, LRAT, LRCH2, LRRC3B, LRRC4C, LRRTM4, LVRN, LYVE1, MAB21L1, MAB21L2, MAGEE2, MAMDC2, MAPK4, MASP1, MEF2C-AS1, MEOX2, METTL24, MFAP5, MGAT4C, MGP, MICU3, MIR133A1HG, MIR8071-1, MMRN1, MORN5, MPPED2, MRGPRE, 
MS4A1, MS4A12, MSRB3, MUSK, MYH11, MYH2, MYLK, MYO3A, MYOC, MYOCD, MYOM1, MYOT, MYT1L, NALCN, NAP1L2, NBEA, NECAB1, NEFL, NEFM, NEGR1, NETO1, NEUROD1, NEXMIF, NEXN, NGB, NIBAN1, NLGN1, NOS1, NOVA1, NPR3, NPTX1, NPY2R, NRG3, NRK, NRSN1, NRXN1, NSG2, NTNG1, NTRK3, NUDT10, OGN, OLFM3, OMD, OTOP2, OTOP3, P2RX2, P2RY12, PAK3, PAPPA2, PCDH10, PCDH11X, PCDH9, PCOLCE2, PCP4L1, PCSK2, PDZRN4, PEG3, PENK, PGM5, PGM5-AS1, PGM5P4-AS1, PGR, PHOX2B, PI16, PIK3C2G, PIRT, PKHD1L1, PLAAT5, PLCXD3, PLD5, PLIN1, PLIN4, PLN, PLP1, PMP2, POPDC2, POU3F4, PPP1R1A, PRDM6, PRELP, PRG4, PRIMA1, PROKR1, PTCHD1, PTGIS, PTPRQ, PTPRZ1, PYGM, PYY, RANBP3L, RBFOX3, RBM20, RELN, RERGL, RGS13, RGS22, RIC3, RIMS4, RNF150, RNF180, RORB, RSPO2, SCARA5, SCGN, SCN2B, SCN7A, SCN9A, SCNN1G, SCRG1, SEMA3E, SERTM1, SERTM2, SFRP1, SFRP2, SFTPA1, SGCG, SHISAL1, SLC13A5, SLC17A8, SLC30A10, SLC4A4, SLC5A7, SLC6A2, SLC7A14, SLIT2, SLITRK2, SLITRK3, SLITRK4, SMIM28, SMYD1, SNAP25, SNAP91, SORCS1, SORCS3, SPHKAP, SPIB, SPOCK3, SST, ST8SIA3, STMN2, STMN4, STON1-GTF2A1L, STUM, SV2B, SYNM, SYNPO2, SYT10, SYT4, SYT6, TACR1, TAFA4, TCEAL2, TCEAL5, TCF23, TENM1, THBS4, TLL1, TMEFF2, TMEM100, TMEM35A, TMIGD1, TMOD1, TNNT3, TNS1, TNXB, TRARG1, TRDN, UGT2B10, UGT2B4, UNC80, VEGFD, VGLL3, VIT, VSTM2A, VXN, WSCD2, XKR4, ZBTB16, ZDHHC22, ZFHX4, ZMAT4, ZNF385B, ZNF676, and ZNF728.
[0181] Meanwhile, secondly selected second molecular subtypes are ADAT3, ANP32D, BHLHA9, BOD1L2, C4orf48, CCDC85B, CDH16, CLMAT3, CSNK1A1L, CTU1, DBET, DDC-AS1, DEFA5, EIF3IP1, FAM173A, FEZF2, FOXI3, FRMD8P1, GALR3, GJD3, GPR25, HBA1, HES4, HIST1H4A, HIST1H4L, HLA-L, IGFBP7-AS1, ITLN2, KCNE1B, LCN15, LKAAEAR1, LOC101927795, LOC101927972, LOC101928372, LOC344967, LRRC26, MAGEA10, MESP1, MIR203A, MIR324, MIR3661, MIR4449, MIR4479, MIR4665, MIR4737, MIR4767, MIR6807, MIR6858, MIR6891, MIR8075, NACA2, NOXO1, ONECUT3, PCSK1N, PDF, PITPNM2-AS1, PNMA5, PRR7, PRSS2, PRSS56, PTGER1, PTTG3P, REG3A, RNA5S9, RNU4-1, RNU5A-1, RNU5B-1, RNU5E-1, RNU6ATAC, RNY1, RPL29P2, RPRML, SBF1P1, SHISAL2B, SKOR2, SLC32A1, SMARCA5-AS1, SMCR5, SNHG25, SNORA36A, SNORD30, SNORD38A, SNORD3B-2, SNORD41, SNORD48, TMEM160, TMEM238, TPGS1, TRAPPC5, UBE2NL, WBP11P1, and ZAR1.
[0182] Notably, the classifier gene templates include pseudogenes, miRNAs, and non-coding genes, which are generally excluded from this type of analysis. This may explain why robust subtype classifiers have not been reported so far.
[Preparation Example 11] Development of Classifier for Newly Found Molecular Subtype (2)
[0183] Depending on the thresholds used in the PAM analysis, other versions of the classifier gene templates can be obtained: slightly different lists of template genes that share similar major contributing genes were found. Such template genes can be used with similar clinical utility. Tables 4 to 7 are templates that can replace the primarily or secondly selected gene templates. In the molecular subtype column of Tables 4 to 7 below, 1 denotes the first molecular subtype, and 2 denotes the second molecular subtype.
TABLE-US-00004 TABLE 4 Gene Molecular subtype 1 GTF2IP1 2 2 TBC1D3L 2 3 MIR4477B 2 5 BLOC1S5-TXNDC5 2 6 HIST2H3C 2 7 CTAGE8 2 8 HNRNPA1P33 2 9 LOC440434 2 10 GOLGA8K 2 11 TMEM160 1 12 FEZF2 1 13 C10orf131 2 14 TRAPPC5 1 15 KRT222 2 16 ACADL 2 17 LOC101929607 2 18 SNHG25 1 19 SNORD38A 1 20 LOC644838 2 21 KIAA0408 2 22 TCEAL2 2 23 C4orf48 1 24 LOC642131 2 25 PLGLB2 2 26 FAM47E-STBD1 2 27 MIR186 2 28 ADAMTS9-AS1 2 29 TVP23C-CDRT4 2 30 PGM5-AS1 2 31 SLITRK4 2 32 MIR3661 1 33 SEMA3E 2 34 ZNF676 2 35 PRR7 1 36 PGM5P3-AS1 2 37 KIAA2022 2 38 LONRF2 2 39 PLCXD3 2 40 NLGN1 2 41 LOC440311 1 42 EPHA6 2 43 LOC100507387 2 44 PDF 1 45 GRIN2A 2 46 LOC105369187 2 47 LINC01537 2 48 EIF3IP1 1 49 FAM35BP 2 50 BCHE 2 51 OPA1-AS1 2 52 TPGS1 1 53 GAS1RR 2 54 NOL12 2 55 LINC01266 2 56 LINC00504 2 57 COL25A1 2 58 LOC101928509 2 59 SNORD30 1 60 ATP2B2 2 61 NOXO1 1 62 MIR4449 1 63 LINC01489 2 64 FRMPD4 2 65 LINC00670 2 66 CCDC158 2 67 HCG23 2 68 CTU1 1 69 AGTR1 2 70 LOC102467147 2 71 FAM173A 1 72 GOLGA8N 2 73 PCDH10 2 74 MIR3911 2 75 TICAM2 2 76 LGI1 2 77 MYOC 2 78 SCN7A 2 79 MEF2C-AS1 2 80 SNORD3A 2 82 KCNQ5 2 83 CCL16 2 84 NEXN 2 85 MYH8 2 86 LOC100507073 2 87 SIAH3 2 90 GRAPL 2 92 FILIP1 2
TABLE-US-00005 TABLE 5 Gene Molecular subtype 1 GTF2IP1 2 2 TBC1D3L 2 3 MIR4477B 2 5 BLOC1S5-TXNDC5 2 6 HIST2H3C 2 7 CTAGE8 2 8 HNRNPA1P33 2 9 LOC440434 2 10 GOLGA8K 2 11 TMEM160 1 12 FEZF2 1 13 C10orf131 2 14 TRAPPC5 1 15 KRT222 2 16 ACADL 2 17 LOC101929607 2 18 SNHG25 1 19 SNORD38A 1 20 LOC644838 2 21 KIAA0408 2 22 TCEAL2 2 23 C4orf48 1 24 LOC642131 2 25 PLGLB2 2 26 FAM47E-STBD1 2 27 MIR186 2 28 ADAMTS9-AS1 2 29 TVP23C-CDRT4 2 30 PGM5-AS1 2 31 SLITRK4 2 32 MIR3661 1 33 SEMA3E 2 34 ZNF676 2 35 PRR7 1 36 PGM5P3-AS1 2 37 KIAA2022 2 38 LONRF2 2 39 PLCXD3 2 40 NLGN1 2 41 LOC440311 1 42 EPHA6 2 43 LOC100507387 2 44 PDF 1 45 GRIN2A 2 46 LOC105369187 2 47 LINC01537 2 48 EIF3IP1 1 49 FAM35BP 2 50 BCHE 2 51 OPA1-AS1 2 52 TPGS1 1 53 GAS1RR 2 54 NOL12 2 55 LINC01266 2 56 LINC00504 2 57 COL25A1 2 58 LOC101928509 2 59 SNORD30 1 60 ATP2B2 2 61 NOXO1 1 62 MIR4449 1 63 LINC01489 2 64 FRMPD4 2 65 LINC00670 2 66 CCDC158 2 67 HCG23 2 68 CTU1 1 69 AGTR1 2 70 LOC102467147 2 71 FAM173A 1 72 GOLGA8N 2 73 PCDH10 2 74 MIR3911 2 75 TICAM2 2 76 LGI1 2 77 MYOC 2 78 SCN7A 2 79 MEF2C-AS1 2 80 SNORD3A 2 81 LCN10 2 82 KCNQ5 2 83 CCL16 2 84 NEXN 2 85 MYH8 2 86 LOC100507073 2 87 SIAH3 2 88 CCDC85B 1 89 MIR133A1HG 2 90 GRAPL 2 91 SFTPA1 2 92 FILIP1 2 93 ADGRB3 2 94 CCDC144B 2 95 SYT4 2 96 BVES-AS1 2 97 CFHR1 2 98 RAB6C 2 99 ADAT3 1 100 SPOCK3 2 101 CTAGE9 2 102 SLC35F4 2 103 SEMA3D 2 104 GLUD1P7 2 105 GRIA2 2 106 KCTD8 2 107 LINC01352 2 108 MEIS1-AS2 2 109 MROH7-TTC4 2 110 MIR4668 2 111 LOC729558 2 112 OR7E12P 2 113 RANBP3L 2 114 SCN9A 2 115 EIF1AX-AS1 2 116 FGF13-AS1 2 117 ZNF727 2 118 LOC102724663 2 119 LOC283856 2 120 BRDT 2 121 SGCG 2 122 SLC26A5 2 123 TCEAL6 2 124 LINGO2 2 125 LRRC3B 2 126 PLN 2 127 CCDC54 2 128 FOXI3 1 129 CFHR3 2 130 ANKRD20A1 2 131 ARHGEF18 2 132 EPHA5 2 133 MIR6858 1 134 ZCCHC5 2 135 ZNF728 2 136 KCNB1 2 137 ZNF157 2 138 LOC283683 2 139 LOC100129216 2 140 SLITRK2 2 141 TCEAL5 2 142 CLVS2 2 143 C11orf88 2 144 FAM133A 2 145 CDH19 2 146 MORN5 2 147 RBAK-RBAKDN 2 148 ZEB2-AS1 
2 149 ST3GAL6-AS1 2 150 NRG3 2 151 LEP 2 152 ANO3 2 153 PGM5P3-AS1.1 2 154 HLX-AS1 2 155 LINC01505 2 156 MACC1-AS1 2 157 RALGAPA1P1 2 158 MIR103A2 2 159 DDC-AS1 1 160 LOC101927588 2 161 TMEM238 1 162 HSPE1-MOB4 2 163 GDF5 2 164 BOLL 2 165 LINC01449 2 166 GAP43 2 167 LOC102724050 2 168 FGF10-AS1 2 169 TGFB2-AS1 2 170 LINC01474 2 171 GJD4 2 172 LOC100506289 2 173 C6orf58 2 174 CIDEB 2 175 FRMD6-AS2 2 176 USP32P2 2 177 VGLL3 2 178 LINC00862 2 179 MUM1L1 2 180 NKAPL 2 181 DPYS 2 182 SNURF 2 183 HFM1 2 184 PDZRN4 2 185 MIR8075 1 186 SCRG1 2 187 LOC101929595 2 189 SLITRK3 2 190 NUDT10 2 191 LOC105373878 2 192 PGP 1 193 SORCS3 2 194 DBIL5P2 2 195 SPECC1L-ADORA2A 2 196 MIR8071-1 2 197 NDUFB8 2 199 CNTN6 2 200 CCBE1 2 201 ACSM5 2 203 HES4 1 204 ASTN1 2 205 PMP2 2 206 EEF1G 2 207 ANGPTL1 2 209 GALR1 2 210 CNTN1 2 211 SYT16 2 212 MYH2 2 213 MUSTN1 2 214 MIR519A2 2 215 ENDOG 1 216 LOC440895 2 217 LOC102724488 2 218 MIR3149 2 219 RBM27 2 220 LOC441666 2 221 COMTD1 1 222 ABCB5 2 223 SOGA3.1 2 224 ZNF747 2 225 RAET1E-AS1.1 1 227 IL12A-AS1 2 228 MIR325HG 2 229 ADRA1A 2 232 NRXN1 2 233 LRRC26 1 236 CELF4 2 237 CCDC144A 2 238 SYNPO2 2 239 ZNF771 1 240 KLF17 2 242 SFTA1P 2 243 ZSCAN23 2 244 CYP8B1 2 245 CASQ2 2 247 MYH11 2 248 PRH1-PRR4 2 249 GPR21 2 253 MIR573 2 255 SPAG6 2 257 MIR4665 1 261 LOC101926940 2 262 ST8SIA3 2 265 PALM2.1 2 269 LOC101929095 2 270 GOLGA8R 2 272 MIR659 2 276 MIR4645 2 282 RIC3 2 285 TMEFF2 2 289 AKAP12 2 303 ABCA9 2
TABLE-US-00006 TABLE 6 Gene Molecular subtype 1 GTF2IP1 2 2 TBC1D3L 2 3 MIR4477B 2 5 BLOC1S5-TXNDC5 2 6 HIST2H3C 2 7 CTAGE8 2 8 HNRNPA1P33 2 9 LOC440434 2 10 GOLGA8K 2 11 TMEM160 1 12 FEZF2 1 13 C10orf131 2 14 TRAPPC5 1 15 KRT222 2 16 ACADL 2 17 LOC101929607 2 18 SNHG25 1 19 SNORD38A 1 20 LOC644838 2 21 KIAA0408 2 22 TCEAL2 2 23 C4orf48 1 24 LOC642131 2 25 PLGLB2 2 26 FAM47E-STBD1 2 27 MIR186 2 28 ADAMTS9-AS1 2 29 TVP23C-CDRT4 2 30 PGM5-AS1 2 31 SLITRK4 2 32 MIR3661 1 33 SEMA3E 2 34 ZNF676 2 35 PRR7 1 36 PGM5P3-AS1 2 37 KIAA2022 2 38 LONRF2 2 39 PLCXD3 2 40 NLGN1 2 41 LOC440311 1 42 EPHA6 2 43 LOC100507387 2 44 PDF 1 45 GRIN2A 2 46 LOC105369187 2 47 LINC01537 2 50 BCHE 2 51 OPA1-AS1 2 52 TPGS1 1 57 COL25A1 2 66 CCDC158 2 68 CTU1 1 71 FAM173A 1
TABLE-US-00007 TABLE 7 Gene Molecular subtype 1 GTF2IP1 2 2 TBC1D3L 2 4 MIR4477B 2 5 BLOC1S5-TXNDC5 2 6 HIST2H3C 2 7 CTAGE8 2 8 HNRNPA1P33 2 9 GOLGA8K 2 10 LOC440434 2 11 TMEM160 1 12 KRT222 2 13 TRAPPC5 1 14 C10orf131 2 15 FEZF2 1 16 LOC101929607 2 17 SNHG25 1 18 SNORD38A 1 19 ACADL 2 20 LOC642131 2 21 C4orf48 1 22 PLGLB2 2 23 SEMA3E 2 24 PGM5-AS1 2 25 PLCXD3 2 26 ZNF676 2 27 LOC644838 2 28 KIAA0408 2 29 TCEAL2 2 30 PGM5P3-AS1 2 31 FAM47E-STBD1 2 32 SLITRK4 2 33 ADAMTS9-AS1 2 34 MIR186 2 35 TVP23C-CDRT4 2 36 LOC100507387 2 37 KIAA2022 2 38 LONRF2 2 39 MIR3661 1 40 PRR7 1 41 NLGN1 2 42 GAS1RR 2 43 FAM35BP 2 44 LOC440311 1 45 PDF 1 46 LINC01266 2 47 EIF3IP1 1 48 LINC01537 2 49 GRIN2A 2 50 SNORD30 1 51 LOC105369187 2 52 EPHA6 2 53 LINC01489 2 54 TPGS1 1 55 BCHE 2 56 LGI1 2 57 OPA1-AS1 2 58 MYOC 2 59 CCDC144B 2 60 NEXN 2 61 FAM173A 1 62 CTU1 1 63 SCN7A 2 64 LINC00504 2 65 SYT4 2 66 LOC100507073 2 67 ATP2B2 2 68 NOL12 2 69 MIR133A1HG 2 70 COL25A1 2 71 BVES-AS1 2 72 MYH8 2 73 FRMPD4 2 74 SPOCK3 2 76 FILIP1 2 77 MIR4449 1 78 LOC102467147 2 79 KCNQ5 2 80 MEF2C-AS1 2 81 LINC01352 2 82 HCG23 2 83 CCDC158 2 84 LINC00670 2 85 CCDC85B 1 86 PCDH10 2 87 CFHR1 2 88 TICAM2 2 89 KCTD8 2 90 NOXO1 1 91 GRIA2 2 92 ADGRB3 2 93 OR7E12P 2 94 ZNF727 2 96 GOLGA8N 2 97 MIR4668 2 99 AGTR1 2 101 SCN9A 2
[Preparation Example 12] Development of Classifier for Newly Found Molecular Subtype (3)
[0184] In addition, Table 8 below is a template that can replace a gene subtype corresponding to the first molecular subtype, and Table 9 is a template that can replace a gene subtype corresponding to the second molecular subtype.
TABLE-US-00008 TABLE 8
No. | Gene | Ensembl | Protein | Synonym
1 | PMP2 | ENSG00000087245 | Myelin P2 protein | FABP8, M-FABP, MP2, P2, peripheral myelin protein 2, Myelin P2 protein, CMT1G
2 | AGTR1 | ENSG00000144891 | Angiotensin II receptor type 1 | AG2S, AGTR1B, AT1, AT1AR, AT1B, AT1BR, AT1R, AT2R1, HAT1R
3 | PLCXD3 | ENSG00000182836 | PI-PLC X domain-containing protein 3 | Phosphatidylinositol Specific Phospholipase C X Domain Containing 3, Phosphatidylinositol-Specific Phospholipase C, X Domain Containing, PLCXD3
4 | ARHGAP26-AS1 | ENSG00000226272 | - | ARHGAP26 Antisense RNA 1, NONHSAG041808.2 91, HSALNG0045520, ENSG00000226272
5 | TCEAL6 | ENSG00000204071 | Transcription elongation factor A (SII)-like 6 | Transcription Elongation Factor S-II Protein-Like 6, Transcription Elongation Factor A Protein-Like 6, Transcription Elongation Factor A (SII)-Like 6, TCEA-Like Protein 6, WEX2, Transcription Elongation Factor A (SII)-Like 3, Tceal3
6 | ANKRD1 | ENSG00000148677 | Ankyrin repeat domain-containing protein 1 | Ankyrin Repeat Domain 1, CARP, Ankyrin Repeat Domain 1 (Cardiac Muscle), Cytokine-Inducible Gene C-193 Protein, Cytokine-Inducible Nuclear Protein, Cardiac Ankyrin Repeat Protein, C-193, CVARP, MCARP, ALRP, Epididymis Secretory Sperm Binding Protein, Liver Ankyrin Repeat Domain 1, BA320F15.2, ANKRD1, HA1A2, C193
TABLE-US-00009 TABLE 9
No. | Gene | Ensembl | Protein | Synonym
1 | PGP | ENSG00000184207 | P-glycoprotein 1 | ABCB1, ABC20, CD243, CLCS, GP170, MDR1, PGY1, ATP binding cassette subfamily B member 1, P-glycoprotein, P-gp
2 | SLC26A3 | ENSG00000091138 | Chloride anion exchanger | CLD, DRA, solute carrier family 26 member 3
3 | HIST1H4C | ENSG00000197061 | Histone H4 | H4C3, H4/g, H4FG, dJ221C16.1, histone cluster 1, H4c, histone cluster 1 H4 family member c, H4 clustered histone 3, H4C5, H4C4, H4C9, H4C12, H4-16, H4C13, H4C11, H4C1, H4C14, H4C15, H4C8, H4C6, H4C2
4 | SNORD69 | ENSG00000212452 | - | snoRNA HBII-210, RF00574
5 | RUVBL2 | ENSG00000183207 | RuvB-like 2 | ECP51, INO80J, REPTIN, RVB2, TIH2, TIP48, TIP49B, CGI-46, ECP-51, TAP54-beta, RuvB like AAA ATPase 2
6 | RAB19 | ENSG00000146955 | Ras-related protein Rab-19 | Member RAS Oncogene Family, RAB19B, GTP-Binding Protein RAB19B
7 | HIST2H2AC | ENSG00000184260 | Histone H2A type 2-C | H2AC20, H2A, H2A-GL101, H2A/q, H2AFQ, histone cluster 2, H2ac, histone cluster 2 H2A family member c, H2A clustered histone 20
[Experimental Example 1] Verification of Clinical Usefulness of Newly-Developed Molecular Subtype Classifier (1)
[0185]
[0186] To classify pretreatment biopsy samples from 230 rectal cancer patients treated at the Yonsei Cancer Center, a Nearest Template Prediction (NTP) method was used. Table 10 below shows the correlation between the molecular subtypes classified by the primarily-selected 94-gene set and the response to preoperative chemoradiotherapy.
TABLE-US-00010 TABLE 10
Response | First molecular subtype | Second molecular subtype | Total
Pathologic incomplete response | 33 | 48 | 81
Pathologic complete response | 6 (15.4%) | 26 (35.1%) | 32 (28.3%)
Total | 39 | 74 | 113
[0187] The 230 rectal cancer patients were classified by applying the primarily-selected 94-gene set using the NTP method. 113 patients were reliably classified (false discovery rate <0.2), but it was impossible to accurately classify 97 patients. Among the 113 patients who could be classified, the pCR rate of the first molecular subtype was 15.4% (6 of 39 patients), whereas that of the second molecular subtype was more than twofold higher, at 35.1% (26 of 74 patients) (chi-squared=3.98, p=0.046).
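The NTP classification step described above can be illustrated with a simplified sketch. It is not the implementation used in this specification: the function names `ntp_classify` and `benjamini_hochberg`, the +1/-1 template coding, the assumption that expression values are already standardized, and the toy expression values (including the `BG…` background genes) are all assumptions made for illustration. Only the gene symbols are taken from Tables 8 and 9.

```python
import math
import random

def cosine_distance(x, y):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return 1.0 if nx == 0 or ny == 0 else 1.0 - dot / (nx * ny)

def ntp_classify(sample, template, n_perm=1000, seed=0):
    """sample: gene -> standardized expression (assumed z-scored).
    template: gene -> subtype label (1 or 2).
    Returns (predicted subtype, distance, permutation p-value)."""
    genes = [g for g in template if g in sample]
    values = [sample[g] for g in genes]
    labels = sorted({template[g] for g in genes})
    # One indicator vector per subtype: +1 for its own genes, -1 otherwise
    # (this sign coding is an assumption of the sketch).
    vectors = {k: [1.0 if template[g] == k else -1.0 for g in genes]
               for k in labels}
    dists = {k: cosine_distance(values, v) for k, v in vectors.items()}
    best = min(dists, key=dists.get)
    # Null distribution: draw the same number of genes at random
    # from the whole profile and record the nearest-template distance.
    rng = random.Random(seed)
    pool = list(sample.values())
    hits = 0
    for _ in range(n_perm):
        rand = rng.sample(pool, len(genes))
        if min(cosine_distance(rand, v) for v in vectors.values()) <= dists[best]:
            hits += 1
    return best, dists[best], (hits + 1) / (n_perm + 1)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (FDR), in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        adjusted[i] = running
    return adjusted

# Toy profile: hypothetical values, not patient data.
rng = random.Random(1)
profile = {f"BG{i}": rng.gauss(0, 1) for i in range(200)}
profile.update({"PMP2": 2.0, "AGTR1": 2.0, "PLCXD3": 2.0,
                "PGP": -2.0, "SLC26A3": -2.0, "RUVBL2": -2.0})
template = {"PMP2": 1, "AGTR1": 1, "PLCXD3": 1,
            "PGP": 2, "SLC26A3": 2, "RUVBL2": 2}
label, dist, p = ntp_classify(profile, template, n_perm=200)
print("predicted subtype:", label, "p-value:", p)
```

In a cohort setting, the per-sample p-values would be passed through `benjamini_hochberg`, and samples with an adjusted value below the chosen cut-off (0.2 in the paragraph above) would be treated as reliably classified.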
[0188]
[0189] Table 11 below shows the correlation between subtypes classified by a secondly-selected 522-gene set and a response to preoperative chemoradiotherapy.
TABLE-US-00011 TABLE 11
Response | First molecular subtype | Second molecular subtype | Total
Pathologic incomplete response | 57 | 78 | 135
Pathologic complete response | 6 (9.5%) | 45 (36.6%) | 51
Total | 63 (33.9%) | 123 (66.1%) | 186
[0190] The 230 rectal cancer patients were classified by applying the secondly-selected 522-gene set using the NTP method. 186 patients were reliably classified (false discovery rate <0.2), but it was impossible to accurately classify 44 patients. Among the 186 patients who could be classified, the pCR rate of the first molecular subtype was 9.5% (6 of 63 patients), whereas that of the second molecular subtype was more than twofold higher, at 36.6% (45 of 123 patients) (chi-squared=14.0, p=0.0002).
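The chi-squared statistics reported for Tables 10 and 11 can be checked directly from the 2x2 counts. The sketch below assumes a Pearson chi-squared test with Yates' continuity correction (an assumption, since the specification does not name the variant); the helper name `chi2_yates_2x2` is hypothetical. For one degree of freedom, the p-value is obtained from the complementary error function.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Chi-squared test with Yates' continuity correction for the
    2x2 table [[a, b], [c, d]]; returns (statistic, p-value), 1 d.f."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 d.f.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: incomplete / complete response; columns: first / second subtype.
chi2_94, p_94 = chi2_yates_2x2(33, 48, 6, 26)     # Table 10 (94-gene set)
chi2_522, p_522 = chi2_yates_2x2(57, 78, 6, 45)   # Table 11 (522-gene set)
print(f"94-gene set:  chi-squared={chi2_94:.2f}, p={p_94:.3f}")
print(f"522-gene set: chi-squared={chi2_522:.1f}, p={p_522:.4f}")
```

Under this assumption the computed values agree with the statistics quoted in the text for both gene sets.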
[0191]
[Experimental Example 2] Verification of Clinical Usefulness of Newly Developed Molecular Subtype Classifier (2)
[0192] The ability to predict a rectal cancer prognosis according to the first molecular subtype in Table 8 and the second molecular subtype in Table 9 in surgery after preoperative chemoradiotherapy was confirmed, and indicated as DFS (
[0193] As shown in
[Experimental Example 3] Confirmation of Ability to Predict Prognosis in Rectal Cancer Patients Before Treatment According to Molecular Subtype and Pathological Characteristics
[0194] Since rectal cancer is diagnosed by pathological examination of a small tissue biopsy and by radiological imaging such as CT and MRI, it is not easy to predict a patient's prognosis before the initiation of treatment. Table 12 shows the results of univariate and multivariate analyses performed on candidate prognostic factors that can be assessed or measured before the initiation of treatment. cN_stage indicates the clinically determined degree of lymph node metastasis, and cT_stage indicates the clinically determined tumor size. In Table 12, OR indicates an odds ratio, and CI indicates a confidence interval.
[0195] The analysis results clearly show that, among these factors, only the molecular subtype can predict the prognosis of patients before the initiation of treatment (p<0.001). This gives molecular subtypes considerable clinical significance. The treatment of rectal cancer has recently been shifting toward total neoadjuvant therapy (TNT), in which all possible treatments are performed before surgery. In this setting, different therapeutic agents may be considered depending on the patient's predicted prognosis. That is, for the first molecular subtype, a more intensive treatment can be considered, so the molecular subtype can play an important role in identifying the target group for a clinical trial of a novel drug under development.
TABLE-US-00012 TABLE 12
Prognosis | Variate | Category | Univariate OR | Univariate 95% CI | Univariate P | Multivariate OR | Multivariate 95% CI | Multivariate P
DFS | cT_stage | continuous | 0.86 | 0.49 to 1.58 | 0.629 | 0.77 | 0.39 to 1.50 | 0.442
DFS | cN_stage | continuous | 1.28 | 0.81 to 2.0 | 0.288 | 1.41 | 0.87 to 2.26 | 0.159
DFS | Age | >60 vs <=60 (ref) | 1.01 | 0.57 to 1.77 | 0.980 | 1.04 | 0.58 to 1.85 | 0.894
DFS | Sex | male vs female (ref) | 0.84 | 0.47 to 1.50 | 0.547 | 0.77 | 0.43 to 1.41 | 0.402
DFS | Molecular subtype | 1 vs 2 (ref) | 2.4 | 1.38 to 4.19 | 0.002 | 2.51 | 1.43 to 4.41 | 0.001
OS | cT_stage | continuous | 1.06 | 0.54 to 2.08 | 0.860 | 1.06 | 0.51 to 2.21 | 0.863
OS | cN_stage | continuous | 1.25 | 0.77 to 2.04 | 0.373 | 1.27 | 0.76 to 2.13 | 0.355
OS | Age | >60 vs <=60 (ref) | 1.21 | 0.66 to 2.21 | 0.534 | 1.22 | 0.66 to 2.27 | 0.521
OS | Sex | male vs female (ref) | 0.77 | 0.41 to 1.44 | 0.415 | 0.74 | 0.39 to 1.42 | 0.368
OS | Molecular subtype | 1 vs 2 (ref) | 1.88 | 1.03 to 3.43 | 0.039 | 1.95 | 1.07 to 3.57 | 0.030
[Experimental Example 4] Confirmation of Ability to Predict Prognosis of Rectal Cancer Patient after Neoadjuvant Chemoradiotherapy According to Molecular Subtype and Pathological Characteristics
[0196] Table 13 shows the result of investigating the correlation between candidate factors and molecular subtypes that can be used to determine the prognosis of rectal cancer patients after neoadjuvant chemoradiotherapy and surgery. It shows that molecular subtypes are statistically significantly correlated with the size of cancer after treatment (ypT stage) and pCR (in the case of the first molecular subtype, the size is large and pCR is low), but are not associated with the degree of lymph node metastasis (ypN stage), or a patient's age and sex.
TABLE-US-00013 TABLE 13
Classification | Category | First molecular subtype (EMT subtype) | Second molecular subtype (MYC subtype) | Chi square | P value
ypT | T0 | 6 | 45 | 20.288 | 0.0000438
ypT | T1 | 0 | 3 | |
ypT | T2 | 15 | 15 | |
ypT | T3 | 41 | 55 | |
ypT | T4 | 1 | 5 | |
ypN | N0 | 41 | 90 | 3.7365 | 0.1544
ypN | N1 | 17 | 19 | |
ypN | N2 | 5 | 14 | |
pCR | No-pCR | 57 | 78 | 14.001 | 0.0001827
pCR | pCR | 6 | 45 | |
Age | <=60 | 38 | 66 | 0.50362 | 0.4479
Age | >60 | 25 | 57 | |
Sex | Female | 18 | 45 | 0.86355 | 0.3527
Sex | Male | 45 | 78 | |
[0197] Table 14 shows the results of univariate and multivariate statistical analyses on candidate factors that can be used to determine the prognosis of rectal cancer patients after neoadjuvant chemoradiotherapy and surgery.
TABLE-US-00014 TABLE 14
Prognosis | Variate | Category | Univariate OR | Univariate 95% CI | Univariate P | Multivariate OR | Multivariate 95% CI | Multivariate P
DFS | ypT3 | ypT3/4 vs ypT0/1/2 (ref) | 2.15 | 1.15 to 3.98 | 0.010 | 1.16 | 0.53 to 2.55 | 0.704
DFS | ypN | ypN0 vs ypN1/2 (ref) | 3.92 | 2.23 to 6.89 | <0.001 | 3.82 | 2.00 to 7.28 | <0.001
DFS | pCR | pCR vs no pCR (ref) | 2.59 | 1.10 to 6.10 | 0.010 | 0.88 | 0.29 to 2.71 | 0.834
DFS | Molecular subtype | EMT vs MYC (ref) | 2.40 | 1.37 to 4.19 | 0.002 | 2.37 | 1.33 to 4.21 | 0.003
OS | ypT3 | ypT3/4 vs ypT0/1/2 (ref) | 2.31 | 1.16 to 4.60 | 0.010 | 1.20 | 0.53 to 2.67 | 0.655
OS | ypN | ypN0 vs ypN1/2 (ref) | 3.16 | 1.72 to 5.78 | <0.001 | 2.67 | 1.38 to 5.18 | 0.003
OS | pCR | pCR vs no pCR (ref) | 4.01 | 1.24 to 12.98 | 0.005 | 1.78 | 0.45 to 7.07 | 0.408
OS | Molecular subtype | EMT vs MYC (ref) | 1.89 | 1.03 to 3.42 | 0.040 | 1.77 | 0.95 to 3.29 | 0.068
[0198] As shown in Table 14, in the univariate analysis, the cancer size after treatment (ypT stage), the degree of lymph node metastasis (ypN stage), pCR, and the molecular subtype were all statistically significant predictors of DFS and OS. In the multivariate analysis of DFS, however, only the ypN stage and the molecular subtype remained significant. That is, since the ypN stage and the molecular subtype each independently affect DFS, a prognosis can be predicted more accurately when the two factors are used together. To confirm this, DFS and OS according to ypN stage and molecular subtype were investigated using Kaplan-Meier plots.
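A Kaplan-Meier curve of the kind used for this comparison is obtained from the product-limit estimator. The following is a minimal pure-Python sketch for illustration only; the function name `kaplan_meier`, the group labels, and the follow-up times are hypothetical and are not taken from the cohort described here.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.
    times: follow-up times; events: 1 = event observed, 0 = censored.
    Returns a list of (time, survival probability) at each event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve

# Hypothetical follow-up data (months, event flag) for two of the four
# groups formed by combining ypN status with molecular subtype.
groups = {
    "ypN0 / second subtype": ([12, 30, 45, 60, 60], [0, 1, 0, 0, 0]),
    "ypN+ / first subtype": ([6, 10, 18, 24, 40], [1, 1, 0, 1, 1]),
}
for name, (t, e) in groups.items():
    print(name, [(time, round(s, 3)) for time, s in kaplan_meier(t, e)])
```

Plotting the resulting step functions for each combination of ypN status and molecular subtype gives curves that can be compared between groups, for example with a log-rank test.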
[0199]
[0200]
[0201]
[0202]
[0203] When ypN stage and a molecular subtype are used together as shown in
[Experimental Example 5] Investigation of Rectal Cancer Predicting Ability of Conventionally Developed CMS Molecular Subtype Classifier
[0204] On the other hand, to investigate the rectal cancer predicting ability of a conventional classifier for CMS and CRIS subtypes, which are conventional molecular subtypes for predicting the prognosis of colorectal cancer (CRC), using NTP together with a classifier gene template provided by the CMScaller package, DFS and OS in a rectal cancer cohort when the CMS molecular subtypes were used are shown in
[0205] As shown in
[Example 1] Rectal Cancer Treatment Protocol According to Molecular Subtype and Pathological Characteristics
[0206] Based on Experimental Examples 1 to 4, a method of predicting the prognosis of rectal cancer according to a first molecular subtype and a second molecular subtype is shown in
[0207] As shown in
[0208] Specific parts of the specification have been described in detail above. It will be clear to those skilled in the art that these specific descriptions are merely preferred embodiments and that the scope of the specification is not limited thereto. Thus, the substantial scope of the specification will be defined by the accompanying claims and their equivalents.
[0209] The present invention relates to a composition for predicting a response to neoadjuvant chemoradiotherapy for rectal cancer or a prognosis after treatment and a prediction method using the same.