METHOD FOR DIAGNOSING COLORECTAL CANCER BY DETECTING INTRAGENIC METHYLATION
20230015571 · 2023-01-19
Abstract
The present invention relates to a method of diagnosing or predicting the prognosis of colorectal cancer by measuring the methylation level in the intragenic region of PDX1, EN2 and/or MSX1. The present invention provides highly reliable biomarkers for colorectal cancer by identifying CpG regions in genes that are hypermethylated specifically in colorectal cancer patients, and also provides optimized methylation-specific PCR (MSP) primers capable of efficiently detecting the identified CpG regions. Accordingly, the present invention may provide important clinical information that makes it possible to accurately predict not only the onset of colorectal cancer, but also overall prognosis including the degree of invasion of cancer tissue, the likelihood of metastasis, and the survival rate of the patient, thereby establishing a treatment strategy early and significantly improving the survival rate of colorectal cancer patients. The present invention also provides, as guidelines for the design of primers capable of accurately detecting DNA methylation, optimal parameters for the amplicon length, the total number of CpGs in target gene-binding regions of the primers, and the range of Tm values.
Claims
1. A method for diagnosing or predicting prognosis of colorectal cancer, comprising measuring a methylation level in an intragenic region of at least one gene selected from the group consisting of PDX1, EN2 and MSX1 genes.
2. The method of claim 1, wherein measuring the methylation level in the intragenic region of PDX1 is carried out by using a methylation-specific PCR (MSP) primer set that specifically recognizes the intragenic CpG island of PDX1.
3. The method of claim 2, wherein the intragenic CpG island of PDX1 comprises the nucleotide sequence of SEQ ID NO: 1.
4. The method of claim 3, wherein the MSP primer set is a pair of primers comprising the nucleotide sequence of SEQ ID NO: 4 and the nucleotide sequence of SEQ ID NO: 5.
5. The method of claim 1, wherein measuring the methylation level in the intragenic region of EN2 is carried out by using a methylation-specific PCR (MSP) primer set that specifically recognizes the intragenic CpG island of EN2.
6. The method of claim 5, wherein the intragenic CpG island of EN2 comprises the nucleotide sequence of SEQ ID NO: 2.
7. The method of claim 6, wherein the MSP primer is a pair of primers comprising the nucleotide sequence of SEQ ID NO: 6 and the nucleotide sequence of SEQ ID NO: 7.
8. The method of claim 1, wherein measuring the methylation level in the intragenic region of MSX1 is carried out by using a methylation-specific PCR (MSP) primer set that specifically recognizes the intragenic CpG island of MSX1.
9. The method of claim 8, wherein the intragenic CpG island of MSX1 comprises the nucleotide sequence of SEQ ID NO: 3.
10. The method of claim 9, wherein the MSP primer is a pair of primers comprising the nucleotide sequence of SEQ ID NO: 8 and the nucleotide sequence of SEQ ID NO: 9.
11. A method for treating colorectal cancer in a subject, the method comprising: diagnosing or predicting prognosis of colorectal cancer in the subject by the method of claim 1; and then based on the diagnosis or prediction of prognosis, treating the subject for colorectal cancer.
12. A method for diagnosing colorectal cancer, comprising measuring an expression level of at least one gene selected from the group consisting of PDX1, GRIN2D, PITX1, TFAP2A, EN2 and MSX1 genes.
13. A method for treating colorectal cancer in a subject, the method comprising: diagnosing colorectal cancer in the subject by the method of claim 12; and then based on the diagnosis, treating the subject for colorectal cancer.
14. A method for predicting prognosis of colorectal cancer, comprising measuring an expression level of at least one gene selected from the group consisting of PDX1, EN2 and MSX1 genes.
15. The method of claim 14, wherein the prognosis comprises metastasis of colorectal cancer.
16. A method for treating colorectal cancer in a subject, the method comprising: predicting prognosis of colorectal cancer in the subject by the method of claim 14; and then based on the predicted prognosis, treating the subject for colorectal cancer.
17. A nucleic acid molecule for detecting methylation in a target gene, comprising forward and reverse primers which form an amplicon having a length of 90 bp to 170 bp, comprise a total of 6 to 9 CpG sites in target gene-binding regions thereof, and have a melting temperature (Tm) of 53 to 62° C.
18. The nucleic acid molecule of claim 17, wherein the forward and reverse primers have a length of 20 bp to 35 bp.
19. The nucleic acid molecule of claim 17, wherein a melting temperature (Tm) difference between the forward and reverse primers is less than 2° C.
20. The nucleic acid molecule of claim 17, wherein the primers are used to measure a methylation level in an intragenic region of the target gene.
21. The nucleic acid molecule of claim 17, wherein the primers are methylation-specific PCR (MSP) primers that specifically recognize an intragenic CpG island.
22. The nucleic acid molecule of claim 21, wherein the methylation level in the intragenic CpG island is different between a patient with a cancer and a normal person.
23. A methylation-specific PCR (MSP) primer set selected from the group consisting of: (a) a primer set comprising the nucleotide sequence of SEQ ID NO: 4 and the nucleotide sequence of SEQ ID NO: 5; (b) a primer set comprising the nucleotide sequence of SEQ ID NO: 6 and the nucleotide sequence of SEQ ID NO: 7; and (c) a primer set comprising the nucleotide sequence of SEQ ID NO: 8 and the nucleotide sequence of SEQ ID NO: 9.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0074]
[0075]
[0076]
[0077] EN2 and MSX1. Overexpression of PDX1, EN2 and MSX1 was found to accelerate migration and confer invasive properties.
[0078]
[0079]
[0080]
[0081]
[0082]
[0083]
[0084]
[0085]
[0086]
[0087]
[0088]
[0089]
[0090]
[0091]
[0092]
[0093]
DETAILED DESCRIPTION
[0094] Hereinafter, the present invention will be described in more detail with reference to examples. These examples serve merely to illustrate the present invention in more detail, and it will be apparent to those skilled in the art that the scope of the present invention according to the subject matter of the present invention is not limited by these examples.
Examples
Experimental Methods
[0095] Analysis of Infinium HumanMethylation450 BeadChip Data from TCGA
[0096] To select candidate genomic DNA regions for targeted bisulfite sequencing, Infinium HumanMethylation450 BeadChip data from TCGA were downloaded from the repository of five major gastrointestinal cancers, namely, colon adenocarcinoma (COAD), rectal adenocarcinoma (READ), liver hepatocellular carcinoma (LIHC), stomach adenocarcinoma (STAD), and pancreatic adenocarcinoma (PAAD), via the Genomic Data Commons (GDC) Data Portal (https://portal.gdc.cancer.gov/). The beta value of each CpG site was averaged to represent the methylation value of their matched CpG island in accordance with the human genome reference 19 (hg19). The CpG island methylation values of healthy tissue samples were then averaged, and methylation differences between the tumor samples and the average of the healthy tissue samples were tabulated. Finally, the present inventors shortlisted CpG islands that displayed methylation differences between normal and tumor tissues greater than or equal to 20% in more than 20% of the total patients.
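The shortlisting rule described above can be sketched as follows. This is a minimal illustration and not the inventors' actual pipeline: the function names, the dictionary layout, and the use of an absolute (direction-independent) difference are assumptions made for the example, and beta values are expressed as fractions between 0 and 1.

```python
from statistics import mean


def island_methylation(site_betas):
    """Average the beta values of the CpG sites mapped to one CpG island."""
    return mean(site_betas)


def shortlist_islands(tumor, healthy, min_diff=0.20, min_fraction=0.20):
    """Return islands that differ from the healthy average in enough patients.

    tumor:   {island: [island-level beta for each tumor sample]}
    healthy: {island: [island-level beta for each healthy sample]}
    An island is kept when |tumor - healthy mean| >= min_diff in more than
    min_fraction of the tumor samples (absolute difference is an assumption).
    """
    selected = []
    for island, tumor_values in tumor.items():
        healthy_mean = mean(healthy[island])
        n_changed = sum(abs(v - healthy_mean) >= min_diff for v in tumor_values)
        if n_changed / len(tumor_values) > min_fraction:
            selected.append(island)
    return selected
```

With the default thresholds, an island is retained only when more than one in five tumor samples departs from the healthy average by at least 0.20 in beta value.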
[0097] Design of Hybridizing Probe Pool
[0098] The probe pool was designed according to the manufacturer's instructions. Basic information regarding the target genome is as follows: Application - SeqCap Epi, Organism - Homo sapiens, Genomic builds - hg19/GRCh37. This was followed by data input in an appropriate BED (browser extensible data) format into NimbleDesign Software (version 4.3; Roche Diagnostics, Rotkreuz, Switzerland). The total number of target regions was 18,834, and the total length of the regions was 23,533,457 bp (probe design No.: IRN4000028910).
[0099] Colorectal Tumor and Adjacent Healthy Specimens
[0100] A total of 104 colorectal tumors and their adjacent healthy tissues were obtained from Seoul National University Hospital (SNUH; Seoul, Korea). The use of samples was approved by the Institutional Review Board of Seoul National University Hospital and carried out in accordance with the ethical standards and guidelines of the institution (IRB number: 1608-040-784).
[0101] Sample Preparation for Targeted Bisulfite Sequencing
[0102] Genomic DNA (1 μg) was used to prepare a single targeted bisulfite sequencing library. All genomic DNA of healthy and tumor samples were sheared using a focused ultrasonicator (M220; Covaris, Massachusetts, USA). The quality, quantity, and fragment size (major peak in 250-300 bp) of sheared genomic DNA were verified using a 2100 Bioanalyzer system (G2939BA; Agilent Technologies, California, USA) prior to library preparation. Sheared genomic DNA was then processed through end repair, A-tailing (Kapa Library Prep Kit for Illumina NGS Platform, 7137974001; Roche Diagnostics), and sequencing adaptor ligation steps (SeqCap Adapter Kit A, 7141530001; Roche Diagnostics).
[0103] After clean-up with Agencourt AMPure XP beads (A63880, Beckman Coulter, California, USA), the DNA library was bisulfite-converted using the EZ DNA Methylation-Lightning Kit (D5031; Zymo Research, California, USA) and amplified via precapture polymerase chain reaction (PCR) using KAPA HiFi HotStart Uracil+ReadyMix (NG SeqCap Epi Accessory Kit, 7145519001; Roche Diagnostics) with Pre-LM-PCR Oligo. The quality of the amplified, bisulfite-converted library samples and their sizes (main peak in 250-300 bp) were verified using a Bioanalyzer. 1 μg of each amplified, bisulfite-converted library was then combined in sets of SeqCap Epi universal and indexing oligos and bisulfite capture enhancer (SeqCap EZ HE-Oligo Kit A, 6777287001; Roche Diagnostics). Each pool was subsequently lyophilized using a DNA vacuum concentrator (Modulspin 31; Hanil Science Co., Ltd., Daejeon, South Korea).
[0104] The dried components were resuspended in hybridization buffer (SeqCap Epi Hybridization and Wash Kit, 5634253001; Roche Diagnostics) and then hybridized with the probe pool (SeqCap Epi Choice S, 7138938001; Roche Diagnostics) for 72 hours at 47° C. Following incubation, libraries were captured (SeqCap Pure Capture Bead Kit, 6977952001; Roche Diagnostics) in a 47° C. water bath and purified at room temperature. Captured bisulfite-converted libraries were amplified via post-capture PCR and then washed with AMPure XP beads. The quality and size (single peak in 250-300 bp) of the libraries were checked using a Bioanalyzer, and samples that passed quality control were sequenced on a HiSeq 2500 instrument (Illumina, California, USA) in paired-end mode.
[0105] Preprocessing and Preliminary Screening of Targeted Bisulfite Sequencing Data
[0106] Trim Galore (version 0.5.0) was used to remove the adaptor sequences from the targeted bisulfite sequencing data. Based on the human CpG island reference hg19 file, Bismark was used to align sequencing reads with Bowtie2. The sort and index commands from SAMtools were used to sort and index the aligned reads. The number of methylated and unmethylated cytosines at each CpG site was listed using the Bismark methylation extractor, and only CpG sites with a read depth of 10x or higher were selected and used for downstream analysis.
[0107] Finally, the methylation values of CpG sites included in the same CpG island were calculated by averaging the methylation value based on the hg19 reference file. The following analyses were performed based on the assumption that the averaged value represents each respective CpG island.
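The coverage filter and the per-island averaging can be illustrated with a short sketch. It assumes that the methylation extractor output has already been parsed into tuples of island identifier, methylated read count, and unmethylated read count; that input layout and the function name are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def island_means(cpg_calls, min_coverage=10):
    """Average CpG-site methylation per island after a read-depth filter.

    cpg_calls: iterable of (island_id, methylated_reads, unmethylated_reads),
    one tuple per CpG site (hypothetical, pre-parsed layout).
    Sites covered by fewer than min_coverage reads are ignored.
    """
    per_island = defaultdict(list)
    for island_id, n_meth, n_unmeth in cpg_calls:
        depth = n_meth + n_unmeth
        if depth < min_coverage:
            continue  # drop CpG sites below 10x coverage
        per_island[island_id].append(n_meth / depth)
    # The island value is the mean of the retained site-level fractions.
    return {island: sum(v) / len(v) for island, v in per_island.items()}
```

An island whose CpG sites all fall below the depth threshold is simply absent from the returned dictionary.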
[0108] Targeted bisulfite sequencing data were screened for targets in which DNA methylation increased or decreased by more than 30% in tumor samples compared with healthy tissue samples in 50% or more of the 90 patients. In addition, hypermethylated CpG islands in tumor samples were further filtered to retrieve regions that showed less than 30% DNA methylation in the healthy tissue samples and 50% or greater DNA methylation in the tumor samples. Conversely, hypomethylated CpG islands, in which the average DNA methylation was less than 30% in the tumor samples and more than 50% in the healthy tissue samples, were selected. Finally, the present inventors selected CpG islands where the mean DNA methylation in healthy tissue samples and tumor samples differed by more than 30%.
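The successive screening rules in this paragraph can be combined into one illustrative classifier. This is a sketch under stated assumptions: methylation is given in percent, each patient contributes a paired (healthy, tumor) value, and the function name and return labels are invented for the example.

```python
from statistics import mean


def classify_island(pairs, diff=30.0, min_fraction=0.5, low=30.0, high=50.0):
    """Classify one CpG island as 'hyper', 'hypo', or None.

    pairs: list of (healthy %, tumor %) methylation values, one per patient.
    """
    healthy_mean = mean(h for h, _ in pairs)
    tumor_mean = mean(t for _, t in pairs)
    # Fraction of patients whose tumor gained or lost more than `diff` points.
    up = sum(t - h > diff for h, t in pairs) / len(pairs)
    down = sum(h - t > diff for h, t in pairs) / len(pairs)
    # Final rule: group means must differ by more than `diff` points.
    if abs(tumor_mean - healthy_mean) <= diff:
        return None
    if up >= min_fraction and healthy_mean < low and tumor_mean >= high:
        return "hyper"
    if down >= min_fraction and tumor_mean < low and healthy_mean > high:
        return "hypo"
    return None
```

The thresholds default to the values stated in the text (30 percentage points, 50% of patients, 30% and 50% mean methylation).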
[0109] Analysis of Targeted Bisulfite Sequencing Data
[0110] To analyze the CpG site methylation levels in candidate CpG islands from healthy tissue and tumor samples, beta values of CpG sites in candidate CpG islands were extracted using the tabix program of SAMtools (version 1.9), and only the beta values of cytosines in the same strand of adjacent genes were used in the subsequent analysis to identify the optimal MSP target sites. To filter out the low-quality sequencing data, only sequencing data in which the methylation levels of CpG sites were present in 1/3 or more of the total CpG sites in each CpG island were used. Hierarchical clustering with Canberra distance was applied to the methylation level of each sample. Line graphs were also drawn with the same methylation data using ggplot2 (version 3.3.3) and ggsci (version 2.9) in R software. To display the methylation differences of candidate CpG islands between healthy tissue and tumor samples, hierarchical clustering with Manhattan distance was conducted using pheatmap. Clustering of CRC patients was performed with the methylation data of the three candidate CpG islands in PDX1, EN2, and MSX1. Using IGV, the data regarding the average methylation levels of genes in healthy and tumor tissues were visualized.
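For readers unfamiliar with the two distance measures and the one-third completeness filter, the following sketch shows how each could be computed. It is written in Python for illustration only (the analysis described above was carried out in R), and the function names and the use of None for a missing CpG value are assumptions.

```python
def has_enough_sites(betas):
    """True when at least 1/3 of the CpG sites in an island have a value.

    betas: per-CpG beta values for one sample; None marks a missing site.
    """
    observed = sum(b is not None for b in betas)
    return 3 * observed >= len(betas)  # integer form of observed/total >= 1/3


def canberra(x, y):
    """Canberra distance; positions where both values are zero contribute 0."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total


def manhattan(x, y):
    """Manhattan (city-block) distance between two methylation profiles."""
    return sum(abs(a - b) for a, b in zip(x, y))
```

Canberra distance weights each difference by the magnitude of the two values, so it is more sensitive than Manhattan distance to changes at sites with low methylation.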
[0111] Analysis of TCGA Colon Adenocarcinoma RNA Sequencing Data
[0112] A total of 320 read count files (healthy tissue samples=41, tumor samples=279), which had been quantified with HTSeq, were used to analyze the CRC gene expression pattern of healthy tissue and tumor samples. Each read count was integrated into a matrix format, and the list of differentially expressed genes between healthy tissue and tumors was generated using the DESeq2 package (version 3.12) in R software. Meanwhile, the TPM value of each gene was derived by using the scaled-estimate value from TCGA RNA-seq V2 data. Genes that showed a greater than 2-fold change with statistical significance (adjusted p-value <0.05) between normal and tumor samples were selected as final candidates.
[0113] Kaplan-Meier Survival Estimation
[0114] To investigate patient survival according to the expression level of the candidate genes, the present inventors utilized the UALCAN database (http://ualcan.path.uab.edu/index.html). Genes of interest were tabulated in a specified format, and the appropriate cancer type for analysis was preselected. UALCAN results culminated in the categorization of two groups: (1) high expression of queried genes (upper 25%) and (2) low/medium expression of queried genes (lower 75%). To evaluate whether methylation in the intragenic regions of PDX1, EN2, and MSX1 has the potential as a prognostic marker in CRC, the survival (version 3.2-7), survminer (version 0.4.8), and ggplot2 (version 3.3.3) packages in R software were used. Progression-free survival in cancer recurrence analysis and overall survival (OS) in the survival analysis of CRC patients were evaluated. The statistical significance of the survival ratio was calculated using the log-rank test.
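The final candidate filter can be sketched as below. The example assumes that the differential expression results are already available as (gene, log2 fold change, adjusted p-value) tuples and that a change in either direction counts; the gene names in any usage are placeholders, not results from the source.

```python
import math


def select_candidates(results, min_fold=2.0, max_padj=0.05):
    """Keep genes with more than a min_fold change and adjusted p < max_padj.

    results: iterable of (gene, log2_fold_change, adjusted_p_value).
    A 2-fold change corresponds to |log2 fold change| > 1; treating both
    directions as qualifying is an assumption of this sketch.
    """
    threshold = math.log2(min_fold)
    return [
        gene
        for gene, log2_fc, padj in results
        if abs(log2_fc) > threshold and padj < max_padj
    ]
```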
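As background for the survival analysis, a Kaplan-Meier estimate can be computed in a few lines. The sketch below is a plain Python stand-in written for illustration; it is not the R survival package used above, and the function name and input layout are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an event.

    times:  follow-up time for each patient.
    events: 1 if the event (e.g., death or recurrence) was observed,
            0 if the patient was censored at that time.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(order) and order[i][0] == t:
            deaths += order[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored patients leave the risk set
    return curve
```

Running the function separately on two patient groups (for example, high versus low methylation) gives the two curves that a log-rank test would then compare.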
[0115] Overexpression Construct
[0116] Each overexpression construct of the candidate genes was subcloned from the pcDNA3-NFlag-NLRP3 vector. To obtain insert fragments, the present inventors designed PCR primers that specifically amplified target sequences on HCT116 and SW480 cDNA with reference to the National Center for Biotechnology Information. As the target genes have numerous CpG sites, the melting temperature (Tm) of the target amplicon naturally increases, hindering the PCR reaction. Thus, the present inventors pre-boiled HCT116 and SW480 cDNA for 10 min before commencing PCR to completely separate the double-stranded structure of the template DNA.
[0117] Cell Cultures
[0118] The colon cancer cell lines HCT116, LoVo, and SW480 were kindly gifted by Prof. Sungsoon Fang of Yonsei University, South Korea, and a healthy colon fibroblast cell line, CCD-18Co, was purchased from the Korean Cell Line Bank (KCLB). HCT116, LoVo, and SW480 cells were maintained in RPMI 1640 medium (11875119; Gibco) supplemented with 10% fetal bovine serum (FBS; SH30084.03; Hyclone), and CCD-18Co was grown in DMEM (DMEM/High glucose with L-glutamine, sodium pyruvate with phenol Red, SH30243.01; Hyclone) with 10% FBS. All cell lines were incubated at 37° C. and 5% CO2 in a humidified incubator. For overexpression of candidate genes in vitro, HCT116 cells were seeded in 60-mm culture plates and transfected with either an empty vector or a construct with the candidate genes using Lipofectamine 2000 (11668019; Thermo Fisher Scientific, Massachusetts, USA). The transfection efficiency of each overexpression construct was confirmed by western blotting of the tags. SW480 cells were transfected with the dCas9-TET1 construct using Lipofectamine 3000 (L3000015; Thermo Fisher Scientific) to enhance transfection efficiency.
[0119] Transfection efficiencies were verified using GFP as detected by fluorescence microscopy (Cell Imaging System, fl_AMF-4306; EVOS). To detect DNA methylation status and mRNA expression level simultaneously, both genomic DNA and total RNA were extracted from a single sample using AllPrep DNA/RNA mini kit (80204; Qiagen) and subjected to qMSP and qPCR, respectively.
[0120] Western Blotting
[0121] To confirm the overexpression of candidate genes compared to the empty vector, Western blotting was conducted by immunoblotting FLAG-tags at the N-terminus of each construct using antibodies against α-FLAG (F7425-.2MG; Sigma-Aldrich) and α-GAPDH (SC-25778; Santa Cruz, Texas, USA).
[0122] Cell Proliferation Assay
[0123] A total of 1×10.sup.5 HCT116 cells were transfected with the gene construct for 24 h, followed by seeding in 24-well plates. Cell viability was determined at the indicated time points by measuring the absorbance at 450 nm using Cell Counting kit-8 (CK04-11; Dojindo, Kumamoto, Japan) and a microplate reader (Molecular Devices, LLC).
[0124] Invasion Assay
[0125] Invasion assays were performed in 24-well transwell plates (8-μm pore size, 3422; Costar). For invasion assays, 2×10.sup.5 HCT116 cells were transfected with the gene construct for 24 hours, followed by seeding in Matrigel-coated upper chambers. The upper chamber was filled with serum-free RPMI medium, while the lower chamber was filled with RPMI medium supplemented with serum as a chemoattractant. After incubation for 48 hours, the cells that had not invaded through the membrane were removed, and the invaded cells were stained with crystal violet and counted.
[0126] MSP Primer Design
[0127] To validate DNA hypermethylation in candidate CpG islands in vitro, the present inventors used the following criteria to design MSP primers. First, the Tm difference between the forward and reverse primers was less than 2° C. Tm, which was calculated using Oligo Calc (version 3.27), was set between 55° C. and 60° C. Primer length was designated as 22 bp to 33 bp, with the expected PCR amplicon size set between 100 bp and 160 bp.sup.[25]. Additionally, with reference to the DNA methylation status in the targeted bisulfite sequencing data, the present inventors designed MSP primers to include at least 6 CpG sites in the primer binding regions. Finally, regions where 2/3 or more of the CpG sites are methylated by less than 20% in healthy tissues, and are methylated by more than 50% in tumors were selected as primer binding targets. MSP primer sets that bind to methylated (Met) or unmethylated (Unmet) CpG sites were designed manually using the above-described criteria. Additionally, the present inventors also included primers that bind to partially methylated CpG sites (Half-Met).
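The numeric design criteria listed above lend themselves to a simple automated check. The sketch below is illustrative only: Tm values are taken as inputs rather than recomputed (the Oligo Calc formula is not reproduced here), the primer-binding regions are supplied as genomic sequence before bisulfite conversion, and the function names are hypothetical.

```python
def count_cpg(seq):
    """Count CpG dinucleotides in a 5'->3' genomic sequence."""
    return seq.upper().count("CG")


def meets_msp_criteria(fwd_site, rev_site, fwd_tm, rev_tm, amplicon_len):
    """Check a primer pair against the stated MSP design criteria.

    fwd_site / rev_site: genomic sequence of each primer-binding region.
    fwd_tm / rev_tm:     melting temperatures in degrees C (precomputed).
    Returns (all criteria met, per-criterion results).
    """
    checks = {
        "tm_range": all(55.0 <= tm <= 60.0 for tm in (fwd_tm, rev_tm)),
        "tm_difference": abs(fwd_tm - rev_tm) < 2.0,
        "primer_length": all(22 <= len(s) <= 33 for s in (fwd_site, rev_site)),
        "amplicon_length": 100 <= amplicon_len <= 160,
        "cpg_sites": count_cpg(fwd_site) + count_cpg(rev_site) >= 6,
    }
    return all(checks.values()), checks
```

Because the CpG dinucleotide is its own reverse complement, the count is the same whichever strand of the binding region is supplied. The tissue-level requirement (at least 2/3 of the CpG sites below 20% methylation in healthy tissue and above 50% in tumors) depends on sequencing data and is not covered by this check.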
[0128] Quantitative Methylation-Specific PCR (qMSP)
[0129] Prior to measuring DNA methylation levels of target genes, 500 ng of genomic DNA extracted from colorectal cell lines or CRC patients was treated with sodium bisulfite (EZ DNA Methylation-Lightning Kits, D5031; Zymo Research). Concentration of bisulfite-converted genomic DNA was quantified using a UV spectrophotometer (Nanodrop 2000; Thermo Fisher Scientific). In the qMSP reaction, KAPA SYBR FAST qPCR Master Mix (2X) (KK4608; Kapa Biosystems) was used to enhance the GC-rich PCR with a PCR cycler (LightCycler 480 II; Roche Diagnostics). The crossing point (Cp) value was calculated by directly adjusting the signal threshold. The DNA methylation level of each CpG island was calculated using the following equation:
[0130] $\text{Methylation Level} = 2^{(\text{Cp of Unmet}) - (\text{Cp of Met})}$
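As a worked illustration of this equation, reading the exponent as the difference between the two crossing points, the calculation can be written as a one-line function. The function and argument names are chosen for the example only.

```python
def methylation_level(cp_unmet, cp_met):
    """Relative methylation level: 2 ** (Cp of Unmet - Cp of Met).

    A methylated-primer reaction that crosses the threshold earlier
    (lower Cp) than the unmethylated-primer reaction gives a value above 1.
    """
    return 2.0 ** (cp_unmet - cp_met)
```

For example, a Cp of 27 with the Met primers and a Cp of 30 with the Unmet primers gives 2 to the power 3, that is, a methylation level of 8, whereas equal Cp values give a level of 1.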
[0131] CRISPR/dCas9-TET1 construct
[0132] gRNA targeting sites within 100 bp of the MSP primer binding site were selected through CHOPCHOP v2 and further filtered for the least number of off-target sites and best targeting efficiency (Labun et al., 2016). The cloning process was conducted according to the gRNA cloning protocol of Mali P (Mali et al., 2013; Morita et al., 2016). Gibson ligation was performed using a cloning kit (639649; Takara Bio Inc., Shiga, Japan), and the cloned gRNA sequence was confirmed by pyrosequencing.
[0133] Quantitative PCR (qPCR)
[0134] To check the expression of each candidate gene upon demethylation via the dCas9 system, complementary DNA was synthesized from the total RNA using reverse transcriptase (18090050; Invitrogen).
Experimental Results
[0135] Identification of Differentially Methylated Regions in CRC Tissues by Targeted Bisulfite Sequencing
[0136] To observe methylation levels in CRC and other types of cancers, 450 K microarray data of five cancer types (COAD, READ, LIHC, STAD, and PAAD) were collected from TCGA (
[0137] Next, the present inventors performed bisulfite sequencing using the probe pool in CRC tissues. To this end, genomic DNA was obtained from the tissues of 104 Korean CRC patients (90 paired tumors and adjacent healthy tissues, an additional two healthy tissues, and 12 tumor tissues). Targeted bisulfite sequencing libraries were prepared according to the manufacturer's instructions (Roche) (
[0138] Thus, the present inventors ultimately identified 40 differentially methylated CpG islands (35 hypermethylated regions+5 hypomethylated regions) in tumor tissues. For instance, the genomic location of chromosome 7:27,147,589-27,148,389 is the intragenic region of HOXA3, where 67 CpG sites are located. On average, the methylation level in this region was 29% in healthy tissues but was 78.7% in tumor tissues. This difference was observed in 83.3% of CRC patients (75 out of 90) (Table 1).
[0139] [Table 1] Candidate CpG islands and their matched genes selected from targeted bisulfite sequencing data of 90 CRC patients
TABLE-US-00001
CGI_location | CGI_information | Gene | 30%_Diff | McoM | McaM | (McaM − McoM)
chr7:27147589-27148389 | Intragenic | HOXA3 | 83.3% (75/90) | 29.0 | 78.7 | 49.7
chr7:27146069-27146600 | Intragenic | HOXA3 | 82.2% (74/90) | 26.0 | 74.0 | 48.0
chr19:49669275-49669552 | Intragenic | TRPM4 | 81.1% (73/90) | 24.2 | 73.7 | 49.5
chr2:54086776-54087266 | promoter | GPR75-ASB3 | 80% (72/90) | 23.9 | 74.3 | 50.3
chr1:200010625-200010832 | Intragenic | NR5A2 | 78.9% (71/90) | 9.1 | 57.7 | 48.7
chr13:28498226-28499046 | Intragenic | PDX1 | 72.2% (65/90) | 9.1 | 55.0 | 45.9
chr5:140857864-140858065 | Intragenic | PCDHGA2 | 72.2% (65/90) | 17.3 | 62.8 | 45.5
chr7:27182613-27185562 | promoter | HOXA-AS3 | 71.1% (64/90) | 21.4 | 62.6 | 41.2
chr19:48918115-48918340 | Intragenic | GRIN2D | 69.9% (58/83) | 10.7 | 53.1 | 46.2
chr5:140864527-140864748 | promoter | PCDHGA2 | 68.9% (62/90) | 9.1 | 52.3 | 43.1
chr5:134363092-134365146 | Intragenic | PITX1 | 67.8% (61/90) | 21.5 | 59.8 | 38.3
chr7:158936507-158938492 | promoter | VIPR2 | 65.6% (59/90) | 12.4 | 50.1 | 37.7
chr6:62995855-62996228 | promoter | KHDRBS2 | 63.3% (57/90) | 11.7 | 51.3 | 39.6
chr6:10398573-10398812 | Intragenic | TFAP2A | 63.3% (57/90) | 16.1 | 53.0 | 36.9
chr7:27143181-27143479 | Intergenic | - | 63.3% (57/90) | 26.0 | 62.6 | 36.7
chr7:24323558-24325080 | promoter | NPY | 63.3% (57/90) | 16.5 | 52.7 | 36.2
chr8:97171805-97172022 | promoter | GDF6 | 63.3% (57/90) | 19.8 | 53.5 | 33.7
chr13:53313127-53314045 | promoter | CNMD | 62.2% (56/90) | 15.6 | 50.9 | 35.3
chrX:142721410-142722958 | promoter | SLITRK4 | 60.7% (54/89) | 19.2 | 54.8 | 35.5
chr7:155255098-155255311 | Intragenic | EN2 | 60% (54/90) | 17.0 | 52.2 | 35.2
chr13:102568425-102569495 | promoter | FGF14 | 60% (54/90) | 15.6 | 50.6 | 35.0
chrX:66766037-66766279 | Intragenic | AR | 58.9% (53/90) | 20.3 | 55.8 | 35.5
chr9:37002489-37002957 | promoter | PAX5 | 58.9% (53/90) | 22.1 | 56.3 | 34.1
chrX:101906001-101907017 | promoter | ARMCX5-GPRASP2 | 57.8% (52/90) | 21.6 | 58.2 | 36.6
chr4:111549879-111550203 | Intragenic | PITX2 | 57.8% (52/90) | 22.9 | 53.7 | 30.8
chr4:4864456-4864834 | Intragenic | MSX1 | 57.3% (51/89) | 29.7 | 64.3 | 35.3
chr8:72753874-72754755 | promoter | MSC | 56.7% (51/90) | 26.7 | 58.7 | 32.0
chr19:46915311-46915802 | Intragenic | CCDC8 | 55.6% (50/90) | 17.7 | 52.1 | 34.5
chr8:130995921-130996149 | Intragenic | FAM49B | 54.4% (49/90) | 20.9 | 53.1 | 32.1
chr2:98962873-98964187 | promoter | CNGA3 | 54.4% (49/90) | 19.6 | 51.7 | 32.1
chr2:5836068-5837643 | Intragenic | SOX11 | 54.4% (49/90) | 20.8 | 51.7 | 30.9
chr11:65359292-65360328 | Intragenic | EHBP1L1 | 53.3% (48/90) | 26.6 | 58.0 | 31.4
chr6:108495654-108495986 | Intragenic | NR2E1 | 53.3% (48/90) | 21.5 | 52.0 | 30.5
chr1:120905971-120906396 | promoter | HIST2H2BA (H2BP1) | 53.3% (48/90) | 28.8 | 59.1 | 30.3
chr13:70681732-70682219 | promoter | KLHL1 | 50% (45/90) | 25.1 | 55.5 | 30.4
chr16:87441387-87441671 | Intragenic | ZCCHC14 | 78.9% (71/90) | 77.98 | 28.81 | −49.17
chr7:5342299-5342599 | Intragenic | SLC29A4 | 77.8% (70/90) | 73.15 | 26.40 | −46.75
chr20:33762403-33762774 | Intragenic | PROCR | 66.7% (60/90) | 68.94 | 29.90 | −39.04
chr1:235805318-235805771 | Intragenic | GNG4 | 56.7% (51/90) | 62.69 | 29.03 | −33.66
chr2:233925091-233925318 | promoter | INPP5D | 57.8% (52/90) | 52.94 | 20.31 | −32.63
[0140] *McoM: the mean of control (healthy) sample methylation, **McaM: the mean of patient (cancer) methylation.
[0141] Selection of Candidate Genes for Developing CRC Biomarkers
[0142] The methylation location plays an important role in the correlation between methylation states and gene expression.sup.[19, 26-28]. However, while it is well known that hypermethylation in the promoter region inhibits gene expression.sup.[29], the effect of methylation of the intragenic regions on gene expression is still controversial.sup.[30-36].
[0143] As a result of analyzing the locations of 40 differentially methylated CpG islands, it was observed that, among the 35 hypermethylated regions in the tumor, 16 CpG islands were in the promoter region, 18 were in the intragenic region, and 1 was in the intergenic region. Among the five hypomethylated regions, one was in the promoter region, and four were in the intragenic region (
[0144] The present inventors next wanted to develop a system to detect methylation states in the 40 differentially methylated CpG islands. To this end, the present inventors examined the regions whose methylation changes have a direct correlation with the expression changes of the related genes. The present inventors speculated that it would be much easier to detect the changes if both methylation and gene expression are increased in tumor tissues compared with healthy tissues. Therefore, the present inventors were interested in the hypermethylated regions, particularly in intragenic regions, because it is difficult to connect the intergenic region to gene expression, and hypermethylation in the promoter is well known to be related to decreased gene expression. To examine gene expression, the present inventors used the TCGA RNA-seq dataset of colon adenocarcinoma (
[0145] Next, the present inventors examined the relationship between the expression of the six genes and the survival rate of CRC patients. According to UALCAN analysis.sup.[37], high expression of PDX1, EN2, and MSX1 was negatively correlated with patient survival (
[0146] Overexpression of PDX1, EN2, or MSX1 Promotes Cell Proliferation and Invasion in Human Colon Cancer Cells
[0147] Pancreatic and duodenal homeobox 1 (PDX1) is a critical transcription factor for pancreatic development and beta-cell maturation.sup.[38]. PDX1 is overexpressed in pancreatic cancer cells, but its role is different at each cancer stage.sup.[39-41]. Although PDX1 has already been reported as a potential cancer marker in CRC, it is based on the observation of PDX1 expression in cancer cells, and its role has not been studied in detail. Homeobox protein engrailed-2 (EN2) is a homeobox-containing transcription factor regulating many developmental stages.sup.[42]. Recently, EN2 was reported to play an oncogenic role in tumor progression via CCL20 in CRC.sup.[43]. Msh homeobox 1 (MSX1) is also a homeobox-containing transcription factor. MSX1 has been suggested as an mRNA biomarker for CRC, but this suggestion was based on expression pattern observations, and its role has never been demonstrated at the cellular level in CRC.sup.[44].
[0148] The present inventors transiently transfected each gene into the HCT116 colon cancer cell line and then determined cell proliferation using CCK-8. As a result, overexpression of PDX1, EN2, and MSX1 increased cell proliferation (
[0149] Overall, it was concluded that, since the overexpression of PDX1, EN2, and MSX1 is directly related to the proliferation and migration of CRC cells, if the methylation changes in the intragenic regions of these genes are correlated with changes in gene expression, the detection of methylation changes in the marker regions of the present invention would be able to predict cellular conditions.
Design of MSP Primers for Optimal Detection of Methylation Changes
[0150] To detect the methylation changes in the marker regions of the present invention, the present inventors decided to set up a qMSP assay for each region. Since MSP is a PCR-based experiment, the choice of primer region is very important. The more CpG sites each of the forward and reverse primers contains, the larger the detectable methylation difference between healthy and tumor tissue. However, because it is preferable to run PCR with the methylated primers and the unmethylated primers in the same machine, an excessive number of CpG sites may cause a Tm difference between the methylated and unmethylated primers. Finally, the present inventors attempted to make the amplicon length 100 to 160 bp for efficient amplification. Overall, after much trial and error, the present inventors decided that the forward and reverse primers should have at least six CpG sites in total, the Tm of each primer should be 55 to 60° C., and the amplicon length should be 100 to 160 bp.
[0151] To design MSP primers specific for the intragenic CpG island of PDX1 (chr13:28,498,226-28,499,046), the present inventors examined the methylation changes of 80 individual CpG sites in that region. Although most CpG sites had large differences in methylation changes between tumor and healthy tissues, in an effort to identify the region that satisfies the criteria of the present invention, the present inventors designed MSP primers based on the heatmap and the line graph of the methylation level for each CpG site in the candidate CpG islands (
[0152] The forward primer for PDX1 has four CpG sites, and the reverse primer has three CpG sites. The beta value of these seven CpG sites was approximately 10% in normal tissues but 70% in tumor tissues on average. The amplicon size was 126 bp and 123 bp, and the Tm was 55 to 57° C. (
[0153] For EN2 and MSX1, MSP primers were also designed in a similar manner. In brief, the forward primer and the reverse primer for EN2 each had three CpG sites. The beta value of the six CpG sites was approximately 10% in healthy tissues but 70% in tumor tissues on average.
[0154] The amplicon sizes were 127 bp and 112 bp, and the Tm was 57 to 58° C. (
[0155] MSP Primers Efficiently Detect the Methylation States of the Region of Interest
[0156] Since the MSP primers of various embodiments of the present invention had a total of six or seven CpG sites, the present inventors not only made a primer set that retained cytosine (methylation primers) or changed all cytosine to thymine (unmethylated primers) but also created a primer set that changed only half of the cytosine to thymine (half-methylation primers). Using these primers, qPCR was performed with bisulfite-treated genomic DNA from the CCD-18Co normal colon cell line and the SW480, LoVo, and HCT116 colon cancer cell lines.
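The three primer variants described above can be derived in silico from a genomic primer-binding sequence. The sketch below is illustrative only and makes several assumptions: the input sequence is hypothetical, it covers only a sense-strand (forward) primer, non-CpG cytosines are treated as fully converted to thymine by bisulfite, and the half-methylation variant retains every other CpG cytosine, since the text does not specify which half was changed.

```python
from typing import Dict, Iterable, List


def cpg_positions(seq: str) -> List[int]:
    """Return the 0-based index of the C in every CpG dinucleotide."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def bisulfite_primer(seq: str, retained_cpgs: Iterable[int]) -> str:
    """Convert every C to T except CpG cytosines listed in retained_cpgs.

    Note: a C at the last position is treated as non-CpG because the
    following genomic base is unknown to this function.
    """
    keep = set(retained_cpgs)
    s = seq.upper()
    return "".join(
        base if base != "C" or i in keep else "T"
        for i, base in enumerate(s)
    )


def primer_variants(seq: str) -> Dict[str, str]:
    """Build methylated, unmethylated, and half-methylation primer sequences."""
    cpgs = cpg_positions(seq)
    return {
        "methylated": bisulfite_primer(seq, cpgs),        # all CpG C retained
        "unmethylated": bisulfite_primer(seq, []),        # all C changed to T
        "half": bisulfite_primer(seq, cpgs[::2]),         # assumed: every other CpG retained
    }


# Hypothetical genomic sequence with four CpG sites and one non-CpG cytosine.
variants = primer_variants("ACGTCCGGACGTTCGA")
for name, primer in variants.items():
    print(name, primer)
```

A reverse primer would need the complementary treatment (G to A on the reverse complement), which is omitted here for brevity.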
[0157] In each CpG island, the methylation primers gave a PCR product in SW480, LoVo, and HCT116 cells but not in CCD-18Co cells. Conversely, the unmethylated primers gave a PCR product in CCD-18Co cells but not in SW480, LoVo, and HCT116 cells. The half-methylation primers failed to show clear differences among CCD-18Co, SW480, LoVo, and HCT116 cells (
[0158] From these results, it was confirmed that the MSP primers of the present invention could distinguish cancer cells from normal cells very efficiently. Although the half-methylation primers also have four CpG sites at which methylation levels differ between healthy and cancer cells, they could not produce clear differences when MSP was performed, suggesting that only the full MSP primers have enough CpG sites to provide substantially different results.
[0159] MSP Primers of the Present Invention Could Detect Dynamic Changes in Methylation States
[0160] Next, the present inventors examined whether the MSP primers of the present invention could detect dynamic changes in methylation levels, out of concern that data from cell lines, whose methylation values are fixed, might not sufficiently reflect physiological methylation changes. To induce methylation changes, the present inventors used the CRISPR/dCas9-TET1 system (hereafter the dCas9-TET system), which makes it possible to decrease methylation levels in a location-specific manner (
[0161] After introducing the dCas9-TET system into the PDX1 genomic region (
[0162] Methylation Levels of PDX1, EN2 and MSX1 Predict CRC Metastasis
[0163] Next, the present inventors examined whether the methylation levels of the intragenic CpG regions of PDX1, EN2 and MSX1 have clinical implications. The present inventors classified patients based on the methylation levels of these regions by conducting hierarchical clustering with the Manhattan distance. Consequently, the present inventors obtained two groups: the hypermethylated group (Group 1, n=26) and the intermediate-methylation and hypomethylated group (Group 2, n=61) (
[0164] Finally, the present inventors examined whether the MSP system of the present invention could distinguish between these two patient groups. When qMSP was performed using bisulfite-treated genomic DNA from the tumor tissues of seven patients, it was confirmed that two patients in Group 1 showed higher methylation levels in the intragenic regions of PDX1, EN2 and MSX1.
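The patient grouping referred to above relies on hierarchical clustering with the Manhattan distance. A minimal sketch of such a grouping is given below. It assumes average linkage, which the text does not specify, and uses invented beta values for six hypothetical samples purely for illustration.

```python
from typing import List, Sequence


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two methylation profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def average_linkage(c1: List[int], c2: List[int],
                    data: Sequence[Sequence[float]]) -> float:
    """Mean pairwise Manhattan distance between two clusters (assumed linkage)."""
    total = sum(manhattan(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def hierarchical_clusters(data: Sequence[Sequence[float]],
                          n_clusters: int = 2) -> List[List[int]]:
    """Agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], data)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical beta values per sample: (PDX1, EN2, MSX1). Not patient data.
samples = [
    (0.80, 0.70, 0.75), (0.70, 0.75, 0.80), (0.85, 0.80, 0.70),  # high methylation
    (0.10, 0.20, 0.15), (0.30, 0.25, 0.20), (0.15, 0.10, 0.10),  # low methylation
]
groups = sorted(sorted(c) for c in hierarchical_clusters(samples, n_clusters=2))
print(groups)
```

With these illustrative values the three high-methylation samples and the three low-methylation samples fall into separate clusters.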
[0165] [Table 2] Clinical data of the subgroups classified by the methylation level of the intragenic CpG island of PDX1, EN2, and MSX1
TABLE-US-00002
Parameter                  Subgroup 1      Subgroup 2      P
N                          25              61
Age, mean (range), year    58.2 (40-74)    63.2 (36-83)    0.0343, *
Gender (male:female)       13:12           39:22           0.304, ns
Stage                      n = 25          n = 61
  I                        0% (0)          1.64% (1)
  II                       8% (2)          0% (0)          2.113E−06, ***
  III                      20% (5)         78.7% (48)
  IV                       72% (18)        26.2% (12)
Invasion                   n = 25          n = 61
  Lymphatic                56% (14)        45.9% (19)      0.0314, *
  Vascular                 44% (11)        19.6% (8)       0.00172, **
  Perineural               80% (20)        50.8% (31)      0.0124, *
Differentiation            n = 24          n = 58
  Well                     0% (0)          1.7% (1)
  Moderately               91.7% (22)      93.1% (54)      0.706, ns
  Poorly                   8.3% (2)        5.2% (3)
Microsatellite             n = 23          n = 58
  Stable                   91.3% (21)      93.1% (54)
  Instable - Low           4.3% (1)        5.2% (3)        0.969, ns
  Instable - High          4.3% (1)        5.2% (3)
Site of Tumor              n = 25          n = 58
  Ascending                20% (5)         25.9% (15)
  Descending               4% (1)          0% (0)
  Transverse               4% (1)          1.7% (1)        0.667, ns
  Sigmoid                  40% (10)        36.2% (21)
  Rectal                   16% (4)         20.7% (12)
  Rectosigmoid Junction    16% (4)         15.5% (9)
[0166] The age of the two subgroups was compared via a two-tailed t test, and the chi-square test was used to analyze the other parameters.
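As an illustration of the chi-square analysis, the gender row of Table 2 (13:12 versus 39:22) can be tested as a 2x2 contingency table. The sketch below uses the uncorrected Pearson statistic, which is an assumption, since the text does not state whether a continuity correction was applied. For one degree of freedom, the P value follows from the complementary error function.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p) with one degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with df = 1.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Gender counts from Table 2: Subgroup 1 = 13 male, 12 female; Subgroup 2 = 39 male, 22 female.
chi2, p = chi_square_2x2(13, 12, 39, 22)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")  # p is approximately 0.304
```

Under this assumption the computed P value agrees with the value of 0.304 reported for gender in Table 2.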
[0167] Although the present invention has been described in detail with reference to the specific features, it will be apparent to those skilled in the art that this description is only of a preferred embodiment thereof, and does not limit the scope of the present invention. Thus, the substantial scope of the present invention will be defined by the appended claims and equivalents thereto.