Method and kit for non-invasively detecting EGFR gene mutations
10648037 · 2020-05-12
Assignee
Inventors
- Lina Wang (Beijing, CN)
- Lanlan Ru (Beijing, CN)
- Tiancheng Li (Beijing, CN)
- Yaxi ZHANG (Beijing, CN)
- Jianguang Zhang (Beijing, CN)
Cpc classification
C12Q2539/105
CHEMISTRY; METALLURGY
C12Q2537/159
CHEMISTRY; METALLURGY
International classification
Abstract
The present invention discloses a method for non-invasively detecting EGFR gene mutations in subjects, comprising the following steps: designing primers according to EGFR gene exons; extracting plasma DNAs in subjects; connecting the extracted plasma DNAs with tagging linkers; PCR pre-amplifying the tagging linkers connected plasma DNAs; cyclising the pre-amplified DNAs to obtain cyclised DNAs; PCR amplifying the cyclised DNAs using the designed primers; and high throughput sequencing the PCR amplified product and analyzing the EGFR gene mutations. The present invention also discloses a corresponding kit.
Claims
1. A method for non-invasively detecting an EGFR gene mutation in a subject, comprising the following steps: designing a pair of forward and reverse primers that are adjacent and backward extended, targeting an exon of the EGFR gene wherein the exon is selected from the group consisting of exon 18, exon 19, exon 20, or exon 21, wherein the forward and reverse primer pairs that target exon 18 are selected from a group consisting of E18-1-3Forward: CCCAGCTTGTGGAGCCTC [SEQ ID NO.:1]; E18-1-3Reverse: GACAAGAACACAGAGACAAGGGT [SEQ ID NO.:2] and E18-2-Forward: GCAGGGCCTCTCATGGTC [SEQ ID NO.:3]; E18-2-Reverse: CCTGTGCCAGGGACCTTAC [SEQ ID NO.:4], wherein the forward and reverse primer pairs that target exon 19 are selected from a group consisting of E19-1-Forward: ACGTCTTCCTTCTCTCTCTGTCAT [SEQ ID NO.:5]; E19-1-Reverse: GTGAGATGGTGCCACATGCT [SEQ ID NO.:6] and E19-2-Forward: GTCCATGGCTCTGAACCTCA [SEQ ID NO.:7]; E19-2-Reverse: CCACACAGCAAAGCAGAAAC [SEQ ID NO.:8], wherein the forward and reverse primers that target exon 20 are E20-2-1 Forward: CCTCCCCGTATCTCCCT [SEQ ID NO.:11]; E20-2-1 Reverse: GGAGATAAGGAGCCAGGAT [SEQ ID NO.:12], or wherein the forward and reverse primer pairs that target exon 21 are selected from a group consisting of E21-1-Forward: AGCAGGGTCTTCTCTGTTTCA [SEQ ID NO.:13]; E21-1-Reverse: GAGGGACAGATCATCATGGG [SEQ ID NO.:14] and E21-2-Forward: TTTCCTGACACCAGGGACC [SEQ ID NO.:15]; E21-2-Reverse: TGACCTAAAGCCACCTCCTT [SEQ ID NO.:16]; extracting plasma DNAs from the subject; connecting the extracted plasma DNAs with tagging linkers; PCR pre-amplifying the tagging linkers connected plasma DNAs; cyclizing the pre-amplified DNAs to obtain cyclized DNAs, wherein the cyclization is a splint mediated single strand DNA cyclization; PCR amplifying the cyclized DNAs using the designed primers; and high throughput sequencing the PCR amplified product and analyzing the EGFR gene mutations.
2. The method according to claim 1, characterized in that primers of the backward extended primer pair are located on 5′ or 3′ end of the EGFR gene exons.
3. The method according to claim 2, characterized in that 5′ end of the backward extended primers contains linker sequences for high throughput sequencing library.
4. The method according to claim 1, characterized in that the EGFR genes in plasma DNAs have insertion, deletion, substitution or gene fusion mutations.
Description
BRIEF DESCRIPTION OF FIGURES
(1)
(2)
(3)
(4)
EMBODIMENTS
(5) With improvements in sequencing technology, traditional Sanger sequencing can no longer fully meet the requirements of research. Thus, second generation sequencing technology, with lower cost, higher throughput, faster speed, and the capability to sequence whole genomes, has emerged. The main principle of second generation sequencing is high throughput sequencing by synthesis, namely, determining DNA sequences by capturing the labels of newly synthesized ends. The available technology platforms mainly include Roche/454 FLX, Illumina/Genome Analyzer/Hiseq/Miseq, Applied Biosystems SOLiD, Life Technologies/Ion Torrent, and the like. Taking Illumina products as an example, the HiSeq 2000 can reach a sequencing throughput of 30× coverage of 6 human genomes per run, i.e., about 600 G/run, and the operation time is reduced to 30 minutes. Furthermore, as second generation sequencing technology has matured, investigation of its clinical applications has developed quickly. Research shows that fetal genetic health can be judged by sequencing maternal plasma DNAs, and that sequencing plasma DNAs in subjects can be used for early cancer screening, which will have wide application in the future.
(6) Plasma DNAs, also known as circulating DNAs, are extracellular DNAs in the blood and are tens to hundreds of nucleotides in length (with a main peak at about 167 bp). They are present in the form of DNA-protein complexes or as free DNA fragments. Normally, plasma DNAs are derived from DNA released by a small number of senescent and dead cells. Under healthy conditions, the generation and removal of circulating DNAs are in dynamic equilibrium and are maintained at a relatively steady, low level. 1 mL of plasma from a normal person contains about 2000 genomic DNAs. Circulating DNAs can reflect the metabolic condition of cells in the human body and are thus an important index for judging health. Changes in the quantity and quality of circulating DNAs in peripheral blood are closely related to several diseases (including tumours, complex severe trauma, organ transplantation, pregnancy-related diseases, infectious diseases, organ failure, and the like). As a non-invasive detection index, plasma DNA is expected to be an important molecular marker for early diagnosis of some diseases, for monitoring their course, and for evaluating therapeutic effects and prognosis. For example, research shows that EGFR regulates the cell cycle progression, repair and survival of tumor cells, and is also related to tumor metastasis. Recently, molecular targeted therapy using EGFR as the therapeutic target has received widespread attention from cancer communities both at home and abroad, and an EGFR tyrosine kinase inhibitor, Iressa (gefitinib), has been approved by the U.S. Food and Drug Administration (FDA) for treating advanced NSCLC. The prominent feature of molecular targeted drugs is that their therapeutic effect strongly depends on the target: the therapeutic effect is significantly strong in patients with the target, while it is weak or absent in patients without the target, which delays other treatments and worsens the condition. Therefore, blind administration without target detection may not only result in high economic loss, but may also waste valuable treatment time, or even aggravate the condition. It is crucial to judge quickly and accurately whether the patient has the specific target for the targeted drug treatment. Traditional EGFR detection mainly examines lesion tissue sections by FISH or qPCR. However, it has been found that there are more free DNAs in the plasma of NSCLC patients, about 10 times those of normal people. A large quantity of free DNAs in plasma derives from DNA released by senescent and dead tumor cells. They are similar to tumor genomic DNAs in genetic characteristics, and their mutations include deletion, point mutation, and increased copy number. EGFR gene mutations can be detected by examining plasma DNAs in NSCLC patients, which makes it possible to detect EGFR expression non-invasively. The present invention detects EGFR expression and mutations in plasma DNAs by second generation sequencing technology quickly, accurately, non-invasively and with high sensitivity, and thus provides a diagnostic basis for patients.
(7) In view of the clinical significance of non-invasive detection by plasma DNA sequencing and the rapid development of second generation high throughput sequencing, the inventor found that large-scale sequencing of plasma DNAs can detect EGFR gene expression and mutations more quickly, accurately, and non-invasively. The method is applicable to a variety of second generation high throughput sequencing platforms, including but not limited to, Roche/454 FLX, Illumina/Genome Analyzer/Hiseq/Miseq, Applied Biosystems SOLiD, Life Technologies/Ion Torrent, and the like.
(8) The present invention is based on the following two facts: 1) plasma free DNAs in patients are similar to genomic DNAs in genetic characteristics. The plasma free DNAs of patients are higher in content than those of normal people, and often contain many mutations, although each mutation may be of low frequency; 2) second generation high throughput sequencing can obtain the information of plasma free DNAs quickly, accurately, and with high throughput. Combining these two facts enables non-invasive, large scale detection in specific regions of the genome. Research shows that plasma DNAs exist as fragments in low amount (1 mL plasma contains about 2000 genomes) and of short length (mainly about 167 bp), which makes it difficult for traditional PCR to enrich mutations effectively using plasma DNAs as templates, resulting in a rapid decrease in detection sensitivity. The present invention differs from the traditional methods in that the DNA fragments connected with sequence tagging linkers are amplified and single strand cyclized; then, by means of back-to-back primer amplification, the templates are used maximally and the library is sequenced by high throughput paired-end sequencing. The original amplified templates are assembled from the raw sequencing reads, and the tagging sequences are recorded. Sequences with the same position on the genome and the same tagging sequences are counted as one template. The number of templates amplified by every primer pair is calculated, and the number of mutated templates is counted and recorded. The present invention improves the cyclization method and optimizes primers for the EGFR genes. The unique design of the tagging sequences reduces background and prevents contamination effectively. The unique template set is counted by accurately restoring the templates in the system, and thus single molecule detection with high accuracy is finally achieved.
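The template-counting rule described above (same genomic position plus same tagging sequence counts as one template) can be sketched as follows. This is a minimal illustration, not the inventors' software: the `Read` record, its field names, and the majority rule used to call a collapsed template mutant are all assumptions introduced for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical record for one restored template read."""
    chrom: str
    start: int       # start coordinate of the restored template on the genome
    end: int         # end coordinate of the restored template on the genome
    tag: str         # tagging sequence read from the linker
    is_mutant: bool  # whether this read carries the mutation of interest


def count_templates(reads):
    """Collapse reads sharing position and tag into one template.

    A collapsed template is called mutant only if more than half of its
    reads are mutant (this majority rule is an assumption of the sketch).
    Returns (total_templates, mutant_templates).
    """
    groups = defaultdict(list)
    for r in reads:
        groups[(r.chrom, r.start, r.end, r.tag)].append(r.is_mutant)
    total = len(groups)
    mutant = sum(1 for calls in groups.values() if 2 * sum(calls) > len(calls))
    return total, mutant
```

Grouping on coordinates and tag together means that two different original molecules that happen to share the same fragment ends are still counted separately, as long as their tags differ.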
(9) According to one specific embodiment of the present invention, it provides a method for non-invasively detecting EGFR gene mutations in subjects, comprising the following steps: (1) designing primers according to EGFR gene exons; (2) extracting plasma DNAs from the subjects; (3) connecting the extracted plasma DNAs with tagging linkers; (4) PCR pre-amplifying the tagging linkers connected plasma DNA; (5) cyclizing the amplified DNAs to obtain cyclised DNAs; (6) PCR amplifying the cyclised DNAs using the designed primers; and (7) high throughput sequencing the PCR amplified product and analyzing the EGFR gene mutations. Non-invasive detection in the present invention means that, in contrast to routine histological detection methods such as surgery, tissue biopsy and the like, which act directly on cancer tissue and cause bodily damage to subjects, the present invention only tests a blood sample from the subject. Traditional methods for detecting DNAs or gene fragments require PCR amplification of the regions to be tested before detection, and thus the DNAs or gene fragments to be tested should be complete. However, most plasma DNA fragments are incomplete, so the DNA fragments that can be used as templates in PCR amplification are few in number and are difficult to detect by routine PCR. Therefore, PCR amplification in the present invention adopts a DNA cyclization technology to transform fragment DNAs into cyclic DNAs using linker sequences and enzymes. Primers based on the regions to be tested are designed; a sequencing library is amplified and constructed, and is then sequenced by a high throughput sequencing technology; and the EGFR gene mutations are analyzed.
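To illustrate why cyclization lets a pair of adjacent, outward-facing primers recover nearly a whole fragment, the following sketch simulates amplification on a circular template. It is a simplified model under stated assumptions: the template is an excerpt of the exon 18 region listed in Example 1, treated as if it were the cyclized fragment itself (linkers omitted), the primers are E18-1-3F and E18-1-3R, and primer binding is modelled as exact string matching. The function names are hypothetical.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Excerpt of the exon 18 region from Example 1, used here as the circle.
FRAGMENT = ("AGGGCTGAGGTGACCCTTGTCTCTGTGTTCTTGTCCCCCCCAGCTTGTGGAGCCTC"
            "TTACACCCAGTGGAGAAGCTCCCAACCAAGC")
E18_1_3F = "CCCAGCTTGTGGAGCCTC"         # SEQ ID NO.:1
E18_1_3R = "GACAAGAACACAGAGACAAGGGT"    # SEQ ID NO.:2


def inverse_pcr_product(circle: str, fwd: str, rev: str) -> str:
    """Top-strand product of outward-facing primers on a circular template.

    `circle` is the circular template written as a linear string. The
    product runs from the forward primer, around the circle, to the end
    of the reverse primer's binding site.
    """
    n = len(circle)
    doubled = circle + circle        # lets a primer site span the junction
    f = doubled.find(fwd)            # forward primer site on the top strand
    r = doubled.find(revcomp(rev))   # reverse primer site seen on the top strand
    if min(f, r) == -1:
        raise ValueError("primer site not found on the circle")
    length = (r + len(rev) - f) % n or n
    return doubled[f:f + length]


product = inverse_pcr_product(FRAGMENT, E18_1_3F, E18_1_3R)
print(len(FRAGMENT), len(product))
```

Because the two primer sites sit next to each other and point away from one another, the product covers the circle except for the few bases between the primers, regardless of where the original fragment's ends fell.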
(10) According to another further specific embodiment of the present invention, primers of the backward extended primer pair are located on the 5′ or 3′ end of the EGFR gene exons. Further, space of the backward extended primer pair is 0- of total base pairs of the fragment DNAs. Further, the backward extended primers aim at exon 18, exon 19, exon 20, or exon 21 of the EGFR genes. Further, the 5′ end of the backward extended primers contains linker sequences for high throughput sequencing library. Further, the backward extended primers aim at exon 18, exon 19, exon 20, or exon 21 of EGFR genes, and sequences of the primers are as follows:
(11) TABLE-US-00004
E18-1-3F: CCCAGCTTGTGGAGCCTC [SEQ ID NO.:1]
E18-1-3R: GACAAGAACACAGAGACAAGGGT [SEQ ID NO.:2]
E18-2-F: GCAGGGCCTCTCATGGTC [SEQ ID NO.:3]
E18-2-R: CCTGTGCCAGGGACCTTAC [SEQ ID NO.:4]
E19-1-F: ACGTCTTCCTTCTCTCTCTGTCAT [SEQ ID NO.:5]
E19-1-R: GTGAGATGGTGCCACATGCT [SEQ ID NO.:6]
E19-2-F: GTCCATGGCTCTGAACCTCA [SEQ ID NO.:7]
E19-2-R: CCACACAGCAAAGCAGAAAC [SEQ ID NO.:8]
E20-1-F: CACACTGACGTGCCTCTCC [SEQ ID NO.:9]
E20-1-R: CTTCGCATGGTGGCCAGA [SEQ ID NO.:10]
E20-2-1F: CCTCCCCGTATCTCCCT [SEQ ID NO.:11]
E20-2-1R: GGAGATAAGGAGCCAGGAT [SEQ ID NO.:12]
E21-1-F: AGCAGGGTCTTCTCTGTTTCA [SEQ ID NO.:13]
E21-1-R: GAGGGACAGATCATCATGGG [SEQ ID NO.:14]
E21-2-F: TTTCCTGACACCAGGGACC [SEQ ID NO.:15]
E21-2-R: TGACCTAAAGCCACCTCCTT [SEQ ID NO.:16]
(12) Further, the cyclization is a splint mediated single strand DNA cyclization.
(13) Further, the high throughput sequencing technologies are selected from Roche/454 FLX, Illumina/Hiseq/Miseq, Applied Biosystems SOLiD and Life Technologies/Ion Torrent/Proton. Illumina technology is used in the present invention.
(14) According to another specific embodiment of the present invention, it provides a kit for non-invasively detecting EGFR gene mutations, comprising: reagents for extracting plasma DNAs, a DNA cyclase, primers and reagents for amplifying target DNAs. Conventional reagents or commercially available kits can be used in the extraction of plasma DNAs. Further, the amplification primers of the regions to be tested in the EGFR genes are a pair of primers that are adjacent and backward extended. Further, primers of the backward extended primer pair are located on the 5′ or 3′ end of the sites or regions to be tested in the EGFR genes. Further, space of the backward extended primer pair is 0- of total base pairs of the plasma DNAs. Further, the backward extended primers aim at exon 18, exon 19, exon 20, or exon 21 of the EGFR genes. Further, the backward extended primers aim at exon 18, exon 19, exon 20, or exon 21 of the EGFR genes, and sequences of the primers are as follows:
(15) TABLE-US-00005
E18-1-3F: CCCAGCTTGTGGAGCCTC [SEQ ID NO.:1]
E18-1-3R: GACAAGAACACAGAGACAAGGGT [SEQ ID NO.:2]
E18-2-F: GCAGGGCCTCTCATGGTC [SEQ ID NO.:3]
E18-2-R: CCTGTGCCAGGGACCTTAC [SEQ ID NO.:4]
E19-1-F: ACGTCTTCCTTCTCTCTCTGTCAT [SEQ ID NO.:5]
E19-1-R: GTGAGATGGTGCCACATGCT [SEQ ID NO.:6]
E19-2-F: GTCCATGGCTCTGAACCTCA [SEQ ID NO.:7]
E19-2-R: CCACACAGCAAAGCAGAAAC [SEQ ID NO.:8]
E20-1-F: CACACTGACGTGCCTCTCC [SEQ ID NO.:9]
E20-1-R: CTTCGCATGGTGGCCAGA [SEQ ID NO.:10]
E20-2-1F: CCTCCCCGTATCTCCCT [SEQ ID NO.:11]
E20-2-1R: GGAGATAAGGAGCCAGGAT [SEQ ID NO.:12]
E21-1-F: AGCAGGGTCTTCTCTGTTTCA [SEQ ID NO.:13]
E21-1-R: GAGGGACAGATCATCATGGG [SEQ ID NO.:14]
E21-2-F: TTTCCTGACACCAGGGACC [SEQ ID NO.:15]
E21-2-R: TGACCTAAAGCCACCTCCTT [SEQ ID NO.:16]
(16) Further, the kit comprises primers and reagents for pre-amplifying the regions to be tested in the EGFR genes. Specifically, the reagents and primers for pre-amplification include Taq DNA polymerase and its buffer, and primers for pre-amplification that are complementary to the Y-shape linkers.
(17) Further, the kit comprises reagents for high throughput sequencing. Further, the reagents for high throughput sequencing are applicable to the following high throughput sequencing technologies: Roche/454 FLX, Illumina/Hiseq/Miseq, Applied Biosystems SOLiD and Life Technologies/Ion Torrent/Proton. Further, the plasma DNA connection linkers contain tagging sequences. Further, the plasma DNAs are pre-amplified before they are cyclised. Further, the cyclization is a splint mediated single strand DNA cyclization.
(18) According to yet another specific embodiment of the present invention, it provides a use of primers according to EGFR gene exons in the preparation of diagnosing reagents or kits for non-invasively detecting EGFR gene mutations in subjects, characterized in that the diagnosing reagents or kits are applicable to a method for non-invasively detecting EGFR gene mutations in subjects comprising the following steps: extracting plasma DNAs in subjects; connecting the extracted plasma DNAs with tagging linkers; PCR pre-amplifying the tagging linkers connected plasma DNAs; cyclizing the amplified DNAs to obtain cyclised DNAs; PCR amplifying the cyclised DNAs using the designed primers; and high throughput sequencing the PCR amplified product and analyzing the EGFR gene mutations.
(19) Further, the primers are a pair of primers that are adjacent and backward extended. Further, primers of the backward extended primer pair are located on the 5′ or 3′ end of the EGFR gene exons. Further, space of the backward extended primer pair is 0- of total base pairs of the fragment DNAs. Further, the backward extended primers aim at exon 18, exon 19, exon 20, or exon 21 of the EGFR genes. Further, the 5′ end of the backward extended primers contains linker sequences for high throughput sequencing library. Further, the backward extended primers aim at exon 18, exon 19, exon 20, or exon 21 of the EGFR genes, and sequences of the primers are as follows:
(20) TABLE-US-00006
E18-1-3F: CCCAGCTTGTGGAGCCTC [SEQ ID NO.:1]
E18-1-3R: GACAAGAACACAGAGACAAGGGT [SEQ ID NO.:2]
E18-2-F: GCAGGGCCTCTCATGGTC [SEQ ID NO.:3]
E18-2-R: CCTGTGCCAGGGACCTTAC [SEQ ID NO.:4]
E19-1-F: ACGTCTTCCTTCTCTCTCTGTCAT [SEQ ID NO.:5]
E19-1-R: GTGAGATGGTGCCACATGCT [SEQ ID NO.:6]
E19-2-F: GTCCATGGCTCTGAACCTCA [SEQ ID NO.:7]
E19-2-R: CCACACAGCAAAGCAGAAAC [SEQ ID NO.:8]
E20-1-F: CACACTGACGTGCCTCTCC [SEQ ID NO.:9]
E20-1-R: CTTCGCATGGTGGCCAGA [SEQ ID NO.:10]
E20-2-1F: CCTCCCCGTATCTCCCT [SEQ ID NO.:11]
E20-2-1R: GGAGATAAGGAGCCAGGAT [SEQ ID NO.:12]
E21-1-F: AGCAGGGTCTTCTCTGTTTCA [SEQ ID NO.:13]
E21-1-R: GAGGGACAGATCATCATGGG [SEQ ID NO.:14]
E21-2-F: TTTCCTGACACCAGGGACC [SEQ ID NO.:15]
E21-2-R: TGACCTAAAGCCACCTCCTT [SEQ ID NO.:16]
(21) Further, the cyclization is a splint mediated single strand DNA cyclization. Further, the high throughput sequencing technologies are selected from Roche/454 FLX, Illumina/Hiseq/Miseq, Applied Biosystems SOLiD and Life Technologies/Ion Torrent/Proton.
EXAMPLES
Example 1
(22) The plasma DNA template was amplified using self-designed linkers according to a method for constructing a plasma DNA high throughput sequencing library (that is, PCR pre-amplification using phosphorylated primers after linker connection). The PCR product was purified by gel excision and cyclised by splint connection. The cyclised product was digested by Exo III, purified, and screened by multiplex PCR with 8 pairs of back-to-back primers (the primers contain universal sequences for constructing a sequencing library). The mutation sites should be close to the forward primer or the reverse primer. The library was finally obtained by purifying the product amplified with the universal primers.
(23) 1. Linker design. The two oligonucleotides below are annealed to form a double strand, wherein X denotes a tagging sequence:
(24) TABLE-US-00007
ssCycADT-1: GTCTCATCCCTGCGTGXXXXT
ssCycADT-2: pXXXXCACGCAGGGTACGTGT
(25) The structure of connection product:
(26) TABLE-US-00008
Top:    GTCTCATCCCTGCGTGXXXXTNNNAXXXXCACGCAGGGTACGTGT
Bottom: TGTGCATGGGACGCACXXXXANNNTXXXXGTGCGTCCCTACTCTG
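As an illustration of how the tagging sequences and the original insert could be recovered from a top strand with the structure shown above, the following sketch uses a regular expression built from the fixed linker portions. It is a hedged example only: the four-base tag length follows the four X positions drawn in the linker, the function name `split_top_strand` is hypothetical, and exact matching without sequencing errors is assumed.

```python
import re

LEFT = "GTCTCATCCCTGCGTG"    # fixed part of ssCycADT-1 (before the tag and the T)
RIGHT = "CACGCAGGGTACGTGT"   # fixed part of ssCycADT-2 (after the tag)

# linker | tag | T | insert (NNN) | A | tag | linker
PATTERN = re.compile(
    rf"^{LEFT}([ACGT]{{4}})T([ACGT]+)A([ACGT]{{4}}){RIGHT}$"
)


def split_top_strand(seq: str):
    """Return (left_tag, insert, right_tag), or None if the linkers are absent."""
    m = PATTERN.match(seq)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)


example = LEFT + "ACGT" + "T" + "GGGCCCAT" + "A" + "TTAG" + RIGHT
print(split_top_strand(example))
```

Anchoring the pattern at both ends of the strand makes the split unambiguous even when the insert itself contains A or T next to the linker junctions.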
(27) Primers:
(28) TABLE-US-00009
ssCycUniprimer-F: pGTCTCATCCCTGCGTG
ssCycUniprimer-R: pACACGTACCCTGCGTG
(29) The library structure after pre-amplification:
(30) TABLE-US-00010
pGTCTCATCCCTGCGTGXXXXTNNNAXXXXCACGCAGGGTACGTGT
 CAGAGTAGGGACGCACXXXXANNNTXXXXGTGCGTCCCATGCACAp
(31) Back-to-Back Primers for Amplification in Target Zones:
(32) TABLE-US-00011
EXON 18 (123 bp) [SEQ ID NO.:17]
CAAGTGCCGTGTCCTGGCACCCAAGCCCATGCCGTGGCTGCTGGTCCCCCTGCTGGGCCATGTCTGGCACTGCTTTCCAGCATGGTG
AGGGCTGAGGTGACCCTTGTCTCTGTGTTCTTGTCCCCCCCAGCTTGTGGAGCCTCTTACACCCAGTGGAGAAGCTCCCAACCAAGC
TCTCTTGAGGATCTTGAAGGAAACTGAATTCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGTTCGGCACGGTGTATAAGGTAAGGTC
CCTGGCACAGGCCTCTGGGCTGGGCCGCAGGGCCTCTCATGGTCTGGTGGGGAGCCCAGAGTCCTTGCAAGCTGTATATTTCCATCA
TCTACTTTACTCTTTGTTTCACTGAGTGTTTGG
E18-1-3F: CCCAGCTTGTGGAGCCTC [SEQ ID NO.:1]
E18-1-3R: GACAAGAACACAGAGACAAGGGT [SEQ ID NO.:2]
E18-2-F: GCAGGGCCTCTCATGGTC [SEQ ID NO.:3]
E18-2-R: CCTGTGCCAGGGACCTTAC [SEQ ID NO.:4]
EXON 19 (99 bp) [SEQ ID NO.:18]
GCAATATCAGCCTTAGGTGCGGCTCCACAGCCCCAGTGTCCCTCACCTTCGGGGTGCATCGCTGGTAACATCCACCCAGATCACTGG
GCAGCATGTGGCACCATCTCACAATTGCCAGTTAACGTCTTCCTTCTCTCTCTGTCATAGGGACTCTGGATCCCAGAAGGTGAGAAA
GTTAAAATTCCCGTCGCTATCAAGGAATTAAGAGAAGCAACATCTCCGAAAGCCAACAAGGAAATCCTCGATGTGAGTTTCTGCTTT
GCTGTGTGGGGGTCCATGGCTCTGAACCTCAGGCCCACCTTTTCTCATGTCTGGCAGCTGCTCTGCTCTAGACCCTGCTCATCTCCA
CATCCTAAATGTTCACTTTCTATG
E19-1-F: ACGTCTTCCTTCTCTCTCTGTCAT [SEQ ID NO.:5]
E19-1-R: GTGAGATGGTGCCACATGCT [SEQ ID NO.:6]
E19-2-F: GTCCATGGCTCTGAACCTCA [SEQ ID NO.:7]
E19-2-R: CCACACAGCAAAGCAGAAAC [SEQ ID NO.:8]
EXON 20 (186 bp) [SEQ ID NO.:19]
CCATGAGTACGTATTTTGAAACTCAAGATCGCATTCATGCGTCTTCACCTGGAAGGGGTCCATGTGCCCCTCCTTCTGGCCACCATG
CGAAGCCACACTGACGTGCCTCTCCCTCCCTCCAGGAAGCCTACGTGATGGCCAGCGTGGACAACCCCCACGTGTGCCGCCTGCTGG
GCATCTGCCTCACCTCCACCGTGCAGCTCATCACGCAGCTCATGCCCTTCGGCTGCCTCCTGGACTATGTCCGGGAACACAAAGACA
ATATTGGCTCCCAGTACCTGCTCAACTGGTGTGTGCAGATCGCAAAGGTAATCAGGGAAGGGAGATACGGGGAGGGGAGATAAGGAG
CCAGGATCCTCACATGCGGTCTGCGCTCCTGGGATAGCAAGAGTTTGCCATGGGGATATG
E20-1-F: CACACTGACGTGCCTCTCC [SEQ ID NO.:9]
E20-1-R: CTTCGCATGGTGGCCAGA [SEQ ID NO.:10]
E20-2-1F: CCTCCCCGTATCTCCCT [SEQ ID NO.:11]
E20-2-1R: GGAGATAAGGAGCCAGGAT [SEQ ID NO.:12]
EXON 21 (156 bp) [SEQ ID NO.:20]
CTAACGTTCGCCAGCCATAAGTCCTCGACGTGGAGAGGCTCAGAGCCTGGCATGAACATGACCCTGAATTCGGATGCAGAGCTTCTT
CCCATGATGATCTGTCCCTCACAGCAGGGTCTTCTCTGTTTCAGGGCATGAACTACTTGGAGGACCGTCGCTTGGTGCACCGCGACC
TGGCAGCCAGGAACGTACTGGTGAAAACACCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCAAACTGCTGGGTGCGGAAGAGA
AAGAATACCATGCAGAAGGAGGCAAAGTAAGGAGGTGGCTTTAGGTCAGCCAGCATTTTCCTGACACCAGGGACCAGGCTGCCTTCC
CACTAGCTGTATTGTTTAACACATGCAGGGGAGGATGCTCTCCAGACATTCTGGGTGAGCTCGCAGC
E21-1-F: AGCAGGGTCTTCTCTGTTTCA [SEQ ID NO.:13]
E21-1-R: GAGGGACAGATCATCATGGG [SEQ ID NO.:14]
E21-2-F: TTTCCTGACACCAGGGACC [SEQ ID NO.:15]
E21-2-R: TGACCTAAAGCCACCTCCTT [SEQ ID NO.:16]
(33) 2. Plasma free DNAs were extracted from 2 mL plasma.
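To give a sense of how few mutant molecules such an extraction may contain, the following back-of-the-envelope sketch multiplies the plasma volume by the figure of about 2000 genomes per mL quoted earlier for a normal person. The mutant fraction of 0.1% and the Poisson sampling model for the chance of drawing no mutant molecule at all are assumptions made for illustration, and the function name is hypothetical.

```python
import math


def expected_templates(plasma_ml, genomes_per_ml=2000.0, mutant_fraction=0.001):
    """Expected genome copies, expected mutant copies, and P(no mutant copy).

    The zero-mutant probability assumes Poisson sampling of mutant
    molecules (an assumption of this sketch, not a statement of the source).
    """
    total = plasma_ml * genomes_per_ml
    mutant = total * mutant_fraction
    p_no_mutant = math.exp(-mutant)
    return total, mutant, p_no_mutant


total, mutant, p0 = expected_templates(2.0)
print(f"{total:.0f} genome copies, {mutant:.1f} expected mutant copies, "
      f"P(no mutant copy) = {p0:.3f}")
```

With so few mutant copies expected, every template lost during library construction matters, which is the motivation for using each fragment as a template after cyclization.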
(34) 3. End-filling:
(35) The reaction mixture was prepared as follows:
(36) TABLE-US-00012 TABLE 1
Plasma DNA solution                    38.5 µl
T4 DNA phosphorylation buffer (10X)    5 µl
10 mM dNTP mixture                     2 µl
T4 DNA polymerase                      2 µl
T4 DNA phosphorylase                   2 µl
Klenow enzyme                          0.5 µl
Sterile H2O                            0 µl
Total volume                           50 µl
(37) The mixture was placed in a 20° C. warm bath for 30 min. The DNA sample was purified on a purification column and eluted with 42 µl sterile dH2O or an elution buffer.
(38) 4. Adding a poly-adenine tail to the 3′ end of the DNA fragments:
(39) The reaction mixture was prepared as follows:
(40) TABLE-US-00013 TABLE 2
End-filled DNA                                              32 µl
Klenow reaction buffer (10X)                                5 µl
dATP solution                                               10 µl
Klenow exo- enzyme (lacking 3′-5′ exonuclease activity)     3 µl
Sterile H2O                                                 0 µl
Total volume                                                50 µl
(41) The mixture was placed in a 37° C. warm bath for 30 min. The DNA sample was purified on a column and eluted with 25 µl sterile dH2O or an elution buffer.
(42) 5. Connecting linkers to the DNA fragments
(43) The reaction mixture was prepared as follows:
(44) TABLE-US-00014 TABLE 3
End-filled dA-tailed DNA                      33 µl
Reaction buffer for quick connection (5X)     10 µl
5 µM DNA linker                               2 µl
Quick T4 DNA ligase (NEB)                     5 µl
Total volume                                  50 µl
(45) The mixture was placed in a 20° C. warm bath for 15 min. The DNA sample was purified on a Qiagen column and eluted with 25 µl sterile dH2O or an elution buffer.
(46) 6. Enriching the linker-modified DNA fragments by PCR pre-amplification
(47) The PCR reaction mixture was prepared as follows:
(48) TABLE-US-00015
Buffers EB                 14 uL
10X Taq ligase buffer      5 uL
Split Oligo (10 µM)        4 uL
Pre-lib                    25 uL
Taq ligase                 2 uL
Total volume               50 uL
(49) PCR programs:
(50) TABLE-US-00016
95° C.    30 s     30 cycles
50° C.    2 min
4° C.     stop
(51) TABLE-US-00017 TABLE 4
DNA                                                          12.5 µl
Phusion DNA polymerase (Phusion DNA polymerase mixture)      25 µl
PCR primer mixture                                           2 µl
Ultrapure water                                              10.5 µl
Total volume                                                 50 µl
(52) Amplification using the following PCR programs:
(53) a. 98° C. 30 s;
(54) b. 18 cycles as follows:
(55) 98° C. 10 s, 65° C. 30 s, 72° C. 30 s;
(56) c. 72° C. 5 min;
(57) d. maintained at 4° C.
(58) 6. PCR product was analyzed by electrophoresis on 2% agarose gel, and the results were shown in
(59) 7. Cyclization
(60) The cyclization system was prepared as follows (Table 5)
(61) TABLE-US-00018
DNA template                            12 ul
Circ Ligase II 10X reaction buffer      2 ul
50 mM MnCl2                             1 ul
5 M Betaine (optional)                  4 ul
Circ Ligase II ssDNA ligase (100 U)     1 ul
Total volume                            20 ul
(62) Reaction conditions
(63) TABLE-US-00019
60° C.    1 h
80° C.    10 min
4° C.     stop
(64) 8. Enzyme digestion:
(65) All the cyclization products were digested by Exo III, and the digestion system was as follows:
(66) TABLE-US-00020
10× NE buffer            1.2 ul
Exo III                  1 ul
Cyclization product      10 ul
(67) The digestion system was placed in a PCR machine and reacted for 30 min at 37° C.
(68) The digested product was purified on a purification column and dissolved in 30 ul EB buffer. The concentration was measured by Qubit, and the results were as follows:
(69) TABLE-US-00021
Sample Number    Concentration
2                4 ng/ul
3                0.3 ng/ul
4                2.33 ng/ul
5                5.74 ng/ul
(70) Examples of the present invention used a splint-mediated cyclization, which has a high rate of cyclization. Detailed tracking and detecting results of every step during the cyclization process were shown in
(71) 9. Reverse PCR screening of the target zone using back-to-back primers
(72) PCR reaction system was prepared as follows
(73) TABLE-US-00022 TABLE 6
ddH2O                                  13 or 18 ul
AmpliTaq Gold 360 Master Mix (2×)      25 ul
CycEGFR18-F1                           1 ul
CycEGFR18-R1                           1 ul
DNA                                    10 or 5 ul
Total volume                           50 ul
Note: the controls were P (with primers and without template) and N (template is un-cyclized ssCyc Lib)
(74) PCR Reaction Conditions
(75) TABLE-US-00023 TABLE 7
95° C.    10 min    1 cycle
95° C.    30 s      30 cycles
55° C.    30 s
72° C.    30 s
72° C.    5 min     1 cycle
(76) 10. The second round PCR
(77) The second round PCR was performed using products of the reverse PCR as templates. The reaction system was as follows:
(78) TABLE-US-00024 TABLE 8
Phusion PCR Master Mix (2×)        25 uL
P5-B1-F (10 mM)                    1 uL
Primer 2 -index 1-2 (10 uM)        1 uL
Products of reverse PCR            5 uL
ddH2O                              add to 50 uL
(79) Programs:
(80) TABLE-US-00025
98° C.    30 s     1 cycle
98° C.    30 s     12 cycles
65° C.    30 s
72° C.    30 s
72° C.    5 min    1 cycle
(81) 10 uL PCR product from the second round PCR was analyzed by electrophoresis on 2% agarose gel, and the results were shown in
(82) 11. The remaining 40 uL PCR product from the second round PCR was purified on a QIAGEN column, and dissolved in 20 uL EB buffer to generate the final library.
(83) 12. After quality control, the generated library was subjected to 250 bp paired-end sequencing on an Illumina MiSeq.
(84) 13. Each pair of paired-end reads from high throughput sequencing was assembled into one sequence based on the repeated regions. Linkers were removed, and the sequence was restored to the original template sequence, which was then aligned to the human genome (hg19). The unique template sequence set was counted by comparing the start and end coordinates of each template sequence on the genome together with its tagging sequences. Using the unique template sequences, genome coverage was then calculated, which can be used for evaluating the specificity of the library and for calling somatic mutations in the EGFR region.
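The assembly of a read pair into one sequence can be sketched as a search for the longest overlap between read 1 and the reverse complement of read 2. This is a deliberately simple illustration: interpreting the "repeated regions" as the overlap between the two mates is an assumption, matching is exact (no tolerance for sequencing errors or quality scores), and the function names and the `min_overlap` default are hypothetical.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def merge_pair(read1: str, read2: str, min_overlap: int = 10):
    """Merge a read pair into one sequence via the region they share.

    `read2` is given as sequenced, i.e. from the opposite strand.
    Returns the merged sequence, or None if no exact overlap of at
    least `min_overlap` bases is found.
    """
    r2 = revcomp(read2)
    # Try the longest possible overlap first, then shorter ones.
    for k in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        if read1[-k:] == r2[:k]:
            return read1 + r2[k:]
    return None


template = "ACGTTGCAAGGCTTACCGATAGCTTGACCATGGCAATTCG"
read1 = template[:25]
read2 = revcomp(template[15:])
print(merge_pair(read1, read2) == template)
```

In practice, dedicated read-merging tools handle mismatches and base qualities; the sketch only shows the principle that the shared region determines how the two mates are joined.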
(85) The results were compared with an EGFR gene mutation detection kit from AmoyDx (directed to the same cancer tissue) and a digital PCR. The comparison results were:
(86) TABLE-US-00026
Sample Number    ARMS Results (AmoyDx kit)    ddPCR Results (positive oil droplets/total oil droplets)    Sequencing Results (positive templates/total templates)
LC113            19-del                       19-del (2/723)                                              19-del c.2239_2251>C (7/1192)
LC314            WT                           WT (del: 0/34; L858R: 0/178; T790: 0/208)                   WT (0/876)
LC320            19-del                       19-del (1/161)                                              19-del c.2236_2250del15 (9/909)
LC2              WT                           n/a                                                         WT (0/668)
LC3              L858R                        L858R (589/858)                                             c.2573T>G; p.L858R (1599/2510)
(87) The 5 samples tested by the method of the present invention showed results highly consistent with those obtained by the other methods. ARMS-PCR (amplification refractory mutation system) relies on the fact that Taq DNA polymerase lacks 3′-5′ exonuclease activity. Under certain conditions, effective amplification occurs only when the last base on the 3′ end of a PCR primer is complementary to the template DNA. Mutated genes and wild type genes can therefore be directly distinguished by PCR using suitable primers directed to different known mutations. This method is mainly used for biopsy and FFPE samples. 5 mL of peripheral blood taken before surgery and FFPE samples taken after surgery from the same patient were provided. The mutation type of the FFPE samples at the known sites was detected by ARMS-PCR, and then plasma DNA was tested by digital PCR and by the method of the present invention, respectively.
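The comparison above can be summarized with a short sketch that checks whether the calls of the three methods agree for each sample and converts the reported counts into mutant fractions. The counts and calls are transcribed from the comparison table; the data layout, the treatment of "n/a" as a missing call, and the function names are assumptions of this example.

```python
# (sample, ARMS call, ddPCR call, ddPCR (positive, total),
#  sequencing call, sequencing (positive, total))
ROWS = [
    ("LC113", "19-del", "19-del", (2, 723), "19-del", (7, 1192)),
    ("LC314", "WT", "WT", None, "WT", (0, 876)),
    ("LC320", "19-del", "19-del", (1, 161), "19-del", (9, 909)),
    ("LC2", "WT", None, None, "WT", (0, 668)),      # ddPCR: n/a
    ("LC3", "L858R", "L858R", (589, 858), "L858R", (1599, 2510)),
]


def percent(positive: int, total: int) -> float:
    """Mutant fraction in percent."""
    return 100.0 * positive / total


def concordant(row) -> bool:
    """True if all available calls for a sample agree."""
    calls = [c for c in (row[1], row[2], row[4]) if c is not None]
    return len(set(calls)) == 1


for row in ROWS:
    name, seq_counts = row[0], row[5]
    print(name, concordant(row), f"{percent(*seq_counts):.2f}% by sequencing")
```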
Example 2
(88) Reliability of the present method was verified by detecting known cancer mutation sites. The method was the same as that disclosed in Example 1.
(89) Cancer cell line DNA: cell DNAs containing the heterozygous c.2235_2249del15 (exon 19) mutation were fragmented by ultrasonication; 166±10 bp fragments were recovered and mixed with plasma DNAs from a normal person at defined ratios. The sensitivity and stability of the method were then examined.
(90) The detection results of c.2235_2249del15 were as follows:
(91) TABLE-US-00027
Sample Nos.    Prediction        Nos. of Total templates    Nos. of Non-Del templates    Nos. of Del templates    Ratio
Sample 1       0% mutation       862                        862                          0                        0.00%
Sample 2       0.1% mutation     438                        433                          5                        1.09%
Sample 3       1% mutation       905                        843                          62                       6.78%
Sample 4       5% mutation       880                        786                          94                       10.67%
Sample 5       25% mutation      1631                       1017                         614                      37.65%
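The internal consistency of the table above can be checked with a short sketch that verifies that the non-deletion and deletion template counts add up to the total and recomputes the deletion ratio as deletion templates divided by total templates. The values are transcribed from the table; the choice of denominator is an assumption, since the source does not state how the printed ratio was derived. Under this assumption the recomputed ratio matches the printed one for Samples 1 and 5 and comes out slightly higher for Samples 2 to 4, so the sketch simply reports both values side by side.

```python
# (sample, total templates, non-Del templates, Del templates, printed ratio in %)
ROWS = [
    ("Sample 1", 862, 862, 0, 0.00),
    ("Sample 2", 438, 433, 5, 1.09),
    ("Sample 3", 905, 843, 62, 6.78),
    ("Sample 4", 880, 786, 94, 10.67),
    ("Sample 5", 1631, 1017, 614, 37.65),
]


def counts_consistent(rows) -> bool:
    """True if non-Del + Del equals the total for every sample."""
    return all(non_del + dele == total for _, total, non_del, dele, _ in rows)


def recomputed_ratios(rows):
    """Return (sample, Del/total in %, printed ratio in %) for each row."""
    return [
        (name, round(100.0 * dele / total, 2), printed)
        for name, total, _, dele, printed in rows
    ]


print(counts_consistent(ROWS))
for name, recomputed, printed in recomputed_ratios(ROWS):
    print(f"{name}: recomputed {recomputed:.2f}%, printed {printed:.2f}%")
```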