A METHOD FOR DETECTING THE MUTATION AND METHYLATION OF TUMOR-SPECIFIC GENES IN CTDNA
20230272475 · 2023-08-31
Inventors
- Yuchen JIAO (Beijing, CN)
- Chunfeng QU (Beijing, CN)
- Yuting WANG (Beijing, CN)
- Pei WANG (Beijing, CN)
- Kun Chen (Beijing, CN)
- Qianqian SONG (Beijing, CN)
- Hui Liu (Beijing, CN)
- Jingjing Wang (Beijing, CN)
- Sizhen WANG (Beijing, CN)
Cpc classification
- C12Q2525/30 (CHEMISTRY; METALLURGY)
- C12Q2537/143 (CHEMISTRY; METALLURGY)
- C40B50/06 (CHEMISTRY; METALLURGY)
- C12Q2525/155 (CHEMISTRY; METALLURGY)
- C12N15/1065 (CHEMISTRY; METALLURGY)
- C12N15/1093 (CHEMISTRY; METALLURGY)
- C12Q1/6806 (CHEMISTRY; METALLURGY)
- C40B40/06 (CHEMISTRY; METALLURGY)
Abstract
The present invention discloses a method for detecting the mutation and methylation of tumor-specific genes in ctDNA. The method can simultaneously detect mutations (including point mutations, insertion-deletion mutations, HBV integration and other mutation forms) and/or methylation of tumor-specific genes in ctDNA in a single sample. Not only is the sample size requirement low, but the MC library prepared by this method can also support 10-20 subsequent detections. The results of each test represent the mutation status of all the original ctDNA specimens and the methylation modification status of the regions covered by the restriction sites, without reducing sensitivity or specificity. The present invention has important clinical significance for early tumor screening, disease tracking, efficacy evaluation, prognosis prediction and the like, and has great application value.
Claims
1. A method for constructing a sequencing library, comprising the following steps sequentially: (1) taking a DNA sample and digesting it with a methylation-sensitive restriction endonuclease; (2) the DNA sample digested in step (1) is subjected to end repair and adding A treatment at the 3′ end sequentially; (3) ligating the DNA sample processed in step (2) with the adapter in the adapter mixture, and obtaining a library after PCR amplification; the adapter mixture consists of n adapters; each adapter is formed by an upstream primer A and a downstream primer A to form a partial double-stranded structure; the upstream primer A has a sequencing adapter A, a random tag, an anchor sequence A and a base T at the end; the downstream primer A has an anchor sequence B and a sequencing adapter B; the partial double-stranded structure is formed by the reverse complementation of the anchor sequence A and the anchor sequence B; the sequencing adapter A and sequencing adapter B are corresponding sequencing adapters selected according to different sequencing platforms; the random tag is a random base of 8-14 bp; the anchor sequence A has a length of 12-20 bp, and has ≤3 consecutive repeating bases; the n adapters use n different anchor sequences A(s), and the four bases in each anchor sequence A are balanced, and the number of mismatched bases ≥ 3; n is any natural number ≥8.
2. The construction method according to claim 1, wherein: the upstream primer A includes the sequencing adapter A, the random tag, the anchor sequence A and the base T sequentially from the 5′ end; the downstream primer A includes the anchor sequence B and the sequencing adapter B sequentially from the 5′ end.
3. The construction method according to claim 1, wherein: the number of mismatched bases ≥ 3 means that the adapter mixture contains n anchor sequences A(s), and there are at least 3 differences in the bases between each anchor sequence A; the difference is different positions or different sequences.
4. The construction method according to claim 1, wherein: the DNA sample is a genomic DNA, cDNA, ctDNA or cfDNA sample.
5. The DNA library constructed by the method according to claim 1.
6. A kit for constructing a sequencing library, comprising the adapter mixture and methylation-sensitive restriction endonucleases described in claim 1.
7. A kit for detecting tumor mutation and/or methylation in DNA samples, comprising the adapter mixture and primer combinations described in claim 1; the primer combinations include primer set I, primer set II, primer set III, primer set IV, primer set V, primer set VI, primer set VII and primer set VIII; each primer in the primer set I and the primer set II is a specific primer designed according to the region related to tumor mutation, and its function is to locate at a specific position in the genome to achieve PCR enrichment of the target region; the primer set I and the primer set II are respectively used to detect the mutation sites of the DNA positive strand and the negative strand; each primer in the primer set III and the primer set IV is a specific primer designed according to the tumor-specific hypermethylated region, and its function is to locate at a specific position in the genome to achieve PCR enrichment of the target region; the primer set III and the primer set IV are respectively used to detect the methylation sites of the DNA positive strand and the negative strand; each primer in the primer set V, the primer set VI, the primer set VII and the primer set VIII includes an adapter sequence and a specific sequence, and the specific sequence is used for further enrichment of the target region; in the primer set V and the primer set I, the two primers designed for the same mutation site are in a “nested” relationship; in the primer set VI and the primer set II, the two primers designed for the same mutation site are in a “nested” relationship; in the primer set VII and the primer set III, the two primers designed for the same methylation site are in a “nested” relationship; in the primer set VIII and the primer set IV, the two primers designed for the same methylation site are in a “nested” relationship.
8. The kit according to claim 7, wherein the tumor is a liver malignant tumor.
9. The kit according to claim 8, wherein: the primer set I includes 78 single-stranded DNA molecules, and the nucleotide sequences of the 78 single-stranded DNA molecules are shown in SEQ ID NO.28 to 105 in the sequence listing sequentially; the primer set II includes 82 single-stranded DNA molecules, and the nucleotide sequences of the 82 single-stranded DNA molecules are shown in SEQ ID NO. 106 to 187 in the sequence listing sequentially; the primer set III includes 14 single-stranded DNA molecules, and the nucleotide sequences of the 14 single-stranded DNA molecules are shown in SEQ ID NO.188 to 201 in the sequence listing sequentially; the primer set IV includes 15 single-stranded DNA molecules, and the nucleotide sequences of the 15 single-stranded DNA molecules are shown in SEQ ID NO.202 to 216 in the sequence listing sequentially; the primer set V includes 75 single-stranded DNA molecules, and the 75 single-stranded DNA molecules sequentially include the nucleotide sequences shown in SEQ ID NO.220 to SEQ ID NO.294 of the sequence listing from the 16th position from the 5′ end to the 3′ end; the primer set VI includes 79 single-stranded DNA molecules, and the 79 single-stranded DNA molecules sequentially include the nucleotide sequences shown in SEQ ID NO.295 to SEQ ID NO.373 of the sequence listing from the 16th position from the 5′ end to the 3′ end; the primer set VII includes 14 single-stranded DNA molecules, and the 14 single-stranded DNA molecules sequentially include the nucleotide sequences shown in SEQ ID NO.374 to SEQ ID NO.387 of the sequence listing from the 16th position from the 5′ end to the 3′ end; the primer set VIII includes 15 single-stranded DNA molecules, and the 15 single-stranded DNA molecules sequentially include the nucleotide sequences shown in SEQ ID NO.388 to SEQ ID NO.402 of the sequence listing from the 16th position from the 5′ end to the 3′ end.
10. (canceled)
11. (canceled)
12. A method for detecting target mutation and/or methylation in a DNA sample, comprising the following steps: (1) constructing a library according to the method according to claim 1; (2) performing two rounds of nested PCR amplification to the library obtained in step (1), sequencing the product, and analyzing the occurrence of target mutation and/or methylation in the DNA sample according to the sequencing result; in the step (2), primer combination A is used to carry out the first round of PCR amplification; primer combination A consists of upstream primer A and downstream primer combination A; the upstream primer A is a library amplification primer used for library amplification in step (1); the downstream primer combination A is a combination of Y primers designed according to X target sites; X and Y are both natural numbers greater than 1, and X≤Y; using the product of the first round of PCR as a template, carrying out the second round of PCR amplification with primer combination B; primer combination B consists of upstream primer B, downstream primer combination B and index primer; the upstream primer B is a library amplification primer and the 3′ end is the same as that of the upstream primer A, and is used for the amplification of the product of the first round of PCR; the index primer includes a segment A for sequencing, an index sequence for distinguishing samples, and a segment B for sequencing from the 5′ end; the primer in the downstream primer combination B has the segment B and form a nested relationship with the primer detecting the same target site in the downstream primer combination A.
13. The method according to claim 12, wherein: the method for analyzing the target mutation in the DNA sample is: DNA molecules whose sequencing data meet the criterion A are traced back to a molecular cluster; the molecular clusters which meet the criterion B are labeled as a pair of duplex molecular clusters; for a mutation, if the following (a1) or (a2) is satisfied, the mutation is a true mutation from the original DNA sample: (a1) supported by at least one pair of duplex molecular clusters; (a2) supported by at least 4 molecular clusters; criterion A means satisfying ①, ② and ③ at the same time; ①the length of the DNA inserts is the same and the sequences are the same except for the mutation sites; ②the random tag sequences are the same; ③the anchor sequences are the same; criterion B means satisfying both ④ and ⑤; ④the length of the DNA inserts is the same and the sequences are the same except for the mutation sites; ⑤the anchor sequences at both ends of the molecular cluster are the same but in opposite positions; the method for analyzing methylation in the DNA sample is: the DNA molecules whose sequencing data meet the criterion C are labeled as a cluster, and the number of clusters whose ends are the restriction sites of interest is calculated respectively, and recorded as unmethylated fragments; the number of all the clusters whose amplified fragments reach or exceed the first restriction site is calculated, and recorded as the total number of fragments; the average methylation level of the corresponding region is calculated according to the number of two fragments; the methylation level of the region = (1 - the number of unmethylated fragments / the total number of fragments) × 100%; criterion C means satisfying ⑥, ⑦ and ⑧ at the same time; ⑥the random tag sequences are the same; ⑦the anchor sequences are the same; ⑧the length of the DNA inserts is the same and the sequences are the same except for the mutation sites.
14. A method for detecting multiple target mutations and/or methylation in a DNA sample, comprising the following steps: (1) constructing a library according to the method described in claim 1; (2) enriching and sequencing the target region of the library of step (1), and analyzing the occurrence of target mutation and/or methylation in the DNA sample according to the sequencing result.
15. The method according to claim 14, wherein: the method for analyzing the target mutation in the DNA sample is: DNA molecules whose sequencing data meet the criterion A are traced back to a molecular cluster; the molecular clusters which meet the criterion B are labeled as a pair of duplex molecular clusters; for a mutation, if the following (a1) or (a2) is satisfied, the mutation is a true mutation from the original DNA sample: (a1) supported by at least one pair of duplex molecular clusters; (a2) supported by at least 4 molecular clusters; criterion A means satisfying ①, ② and ③ at the same time; ①the length of the DNA inserts is the same and the sequences are the same except for the mutation sites; ②the random tag sequences are the same; ③the anchor sequences are the same; criterion B means satisfying both ④ and ⑤; ④the length of the DNA inserts is the same and the sequences are the same except for the mutation sites; ⑤the anchor sequences at both ends of the molecular cluster are the same but in opposite positions; the method for analyzing methylation in the DNA sample is: the DNA molecules whose sequencing data meet the criterion C are labeled as a cluster, and the number of clusters whose ends are the restriction sites of interest is calculated respectively, and recorded as unmethylated fragments; the number of all the clusters whose amplified fragments reach or exceed the first restriction site is calculated, and recorded as the total number of fragments; the average methylation level of the corresponding region is calculated according to the number of two fragments; the methylation level of the region = (1 - the number of unmethylated fragments / the total number of fragments) × 100%; criterion C means satisfying ⑥, ⑦ and ⑧ at the same time; ⑥the random tag sequences are the same; ⑦the anchor sequences are the same; ⑧the length of the DNA inserts is the same and the sequences are the same except for the mutation sites.
16. A method for distinguishing blood samples from tumor patients and blood samples from non-tumor patients, comprising the following steps: constructing a library according to the method described in claim 1; enriching and sequencing the target region of the library, and analyzing the occurrence of target mutation and/or methylation in the DNA sample according to the sequencing result; distinguishing blood samples from tumor patients and blood samples from non-tumor patients according to occurrence of target mutation and/or methylation in the DNA sample.
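The counting rules recited in claims 13 and 15 can be illustrated with a minimal Python sketch. The function names and the example counts are hypothetical and chosen only for illustration; the grouping of reads into clusters under criteria A to C is assumed to have been done beforehand and is not shown.

```python
def methylation_level(unmethylated_fragments: int, total_fragments: int) -> float:
    """Methylation level (%) = (1 - unmethylated / total) x 100.

    unmethylated_fragments: clusters whose ends are the restriction sites
    of interest (i.e. cut by the methylation-sensitive enzyme).
    total_fragments: clusters whose amplified fragments reach or exceed
    the first restriction site.
    """
    if total_fragments <= 0:
        raise ValueError("total_fragments must be positive")
    if not 0 <= unmethylated_fragments <= total_fragments:
        raise ValueError("unmethylated_fragments must lie in [0, total_fragments]")
    return (1 - unmethylated_fragments / total_fragments) * 100


def is_true_mutation(duplex_pairs: int, supporting_clusters: int) -> bool:
    """Rule (a1): at least one pair of duplex molecular clusters, or
    rule (a2): at least 4 supporting molecular clusters."""
    return duplex_pairs >= 1 or supporting_clusters >= 4


# Hypothetical counts: 25 cut (unmethylated) clusters out of 100 in total.
print(methylation_level(25, 100))   # 75.0
print(is_true_mutation(0, 3))       # False: neither (a1) nor (a2) is met
```

In this sketch a mutation seen in three single-strand clusters without duplex support is rejected, whereas one duplex pair alone is sufficient.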
Description
EMBODIMENTS
[0115] The following examples facilitate a better understanding of the present invention, but do not limit the present invention.
[0116] The experimental methods in the following examples, unless otherwise specified, are all conventional methods.
[0117] The experimental materials used in the following examples, unless otherwise specified, are all purchased from conventional biochemical reagent stores.
[0118] Each quantitative experiment in the following examples was repeated three times, and the results were averaged.
[0119] The TE buffer in the following examples is a product of ThermoFisher Company (catalog number 12090015).
[0120] In the following examples, patients with hepatocellular carcinoma gave informed consent to the content of the present invention.
Example 1. Construction of MC Library
1. Methylation-Sensitive Restriction Endonuclease Digestion
[0121] 5-40 ng of cfDNA was taken to configure the reaction system as shown in Table 1, and enzyme digestion was then performed in a PCR machine according to the procedure in Table 2 to obtain the enzyme digestion product (stored at 4° C.).
[0122] Both the Restriction Enzyme and the Restriction Enzyme 10× Buffer are products of ThermoFisher Company. They can be selected according to the target regions to be tested; the selection criterion is that the region to be tested contains at least one cleavage site of the methylation-sensitive restriction enzyme.
TABLE 1. Reaction system
Composition | Volume
cfDNA | 16.8 µl
Restriction Enzyme 10× Buffer | 2 µl
Acetylated BSA (concentration: 10 µg/µl) | 0.2 µl
Restriction Enzyme (concentration: 10 U/µl) | 1 µl
Total volume | 20 µl

TABLE 2. Reaction procedure
Temperature | Time
37° C. | 2 h
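The enzyme selection criterion in paragraph [0122] (the region to be tested must contain at least one cleavage site) can be checked with a short Python sketch. The text does not name a specific enzyme; the recognition sites used below (CCGG for HpaII and GCGC for HhaI, two common methylation-sensitive enzymes) and the example region are assumptions for illustration only.

```python
def count_sites(region: str, site: str) -> int:
    """Count occurrences (overlapping allowed) of a recognition site.

    Both example sites are palindromic, so scanning one strand is enough.
    """
    region, site = region.upper(), site.upper()
    return sum(
        1
        for i in range(len(region) - len(site) + 1)
        if region[i:i + len(site)] == site
    )


def enzyme_is_usable(region: str, site: str) -> bool:
    """Selection criterion: the region contains at least one cleavage site."""
    return count_sites(region, site) >= 1


# Hypothetical target region, not taken from the patent.
region = "ATGCCGGTTAGCGCAA"
print(count_sites(region, "CCGG"))        # 1
print(enzyme_is_usable(region, "GCGC"))   # True
```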
2. Purification of Enzyme Digestion Products
[0123] The enzyme digestion product obtained in step 1 was purified and enriched with the Apostle MiniMax™ high-efficiency free DNA enrichment and isolation kit (standard version) (a product of Apostle Company, catalog number A17622-50) to obtain a purified product.
3. End Repair and A-Tailing of the Purified Product
[0124] The purified product obtained in step 2 was taken to configure the reaction system as shown in Table 3, and end repair and addition of A at the 3′ end were then performed in a PCR machine according to the reaction procedure in Table 4 to obtain a reaction product (stored at 4° C.).
TABLE 3. Reaction system
Composition | Volume
Purified product | 50 µl
End Repair & A-Tailing Buffer (KAPA KK8505) | 7 µl
End Repair & A-Tailing Enzyme Mix (KAPA KK8505) | 3 µl
Total volume | 60 µl

TABLE 4. Reaction procedure
Temperature | Time
20° C. | 30 min
65° C. | 30 min
4. Ligation of the Reaction Product to the Adapter
[0125] The reaction system was configured according to Table 5, and the reaction was carried out at 20° C. for 15 min to obtain a ligation product (stored at 4° C.).
TABLE 5. Reaction system
Composition | Volume
Reaction product obtained in step 3 | 60 µl
Adapter Mix (50 µM) | 1.5 µl
DNase/RNase-Free Water | 8.5 µl
Ligation Buffer (KAPA KK8505) | 30 µl
DNA Ligase (KAPA KK8505) | 10 µl
Total volume | 110 µl
[0126] Adapter sequence information is shown in Table 6.
[0127] The single-stranded DNA molecules in Table 6 were each dissolved in TE buffer and diluted to a concentration of 100 µM. The two single-stranded DNA molecules in the same group were mixed in equal volumes (50 µl each) and then annealed (annealing program: 95° C., 15 min; 25° C., 2 h) to obtain 12 sets of DNA solutions. The 12 sets of DNA solutions were mixed in equal volumes to obtain the Adapter Mix.
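The concentrations implied by this preparation can be worked through in a few lines of Python. This is only an arithmetic sketch: it assumes complete annealing of the two strands, and the function name and return layout are illustrative rather than part of the described method.

```python
def adapter_mix_concentrations(stock_um=100.0, n_groups=12, mix_volume_ul=1.5):
    """Return (duplex µM, per-adapter µM, total pmol added to the ligation)."""
    # Mixing the two strands in equal volumes halves each strand's
    # concentration; with complete annealing (assumed) this is also the
    # duplex adapter concentration.
    duplex_um = stock_um / 2
    # Pooling the groups in equal volumes keeps the total duplex
    # concentration and divides it evenly among the adapters.
    per_adapter_um = duplex_um / n_groups
    # 1 µM equals 1 pmol/µl, so µM x µl gives pmol.
    total_pmol = duplex_um * mix_volume_ul
    return duplex_um, per_adapter_um, total_pmol


duplex, per_adapter, pmol = adapter_mix_concentrations()
print(duplex)                  # 50.0, matching "Adapter Mix (50 µM)" in Table 5
print(round(per_adapter, 2))   # 4.17
print(pmol)                    # 75.0
```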
TABLE 6. Adapter sequence information
Group | Number | Name | Nucleotide sequence (5′-3′) | SEQ ID NO.
1 | 1 | R21_F | GACACGACGCTCTTCCGATCTNNNNNNNNCCACTAGTAGCCT | 1
1 | 2 | R21_R | GGCTACTAGTGGCTGTCTCTTATACACATCTCCGAGCCCAC | 2
2 | 3 | R22_F | GACACGACGCTCTTCCGATCTNNNNNNNNGGACTGTGTCGGT | 3
2 | 4 | R22_R | CCGACACAGTCCCTGTCTCTTATACACATCTCCGAGCCCAC | 4
3 | 5 | R23_F | GACACGACGCTCTTCCGATCTNNNNNNNNGGTACTGACAGGT | 5
3 | 6 | R23_R | CCTGTCAGTACCCTGTCTCTTATACACATCTCCGAGCCCAC | 6
4 | 7 | R24_F | GACACGACGCTCTTCCGATCTNNNNNNNNCCTAGTACAGCCT | 7
4 | 8 | R24_R | GGCTGTACTAGGCTGTCTCTTATACACATCTCCGAGCCCAC | 8
5 | 9 | R25_F | GACACGACGCTCTTCCGATCTNNNNNNNNGGTAGTCAGAGGT | 9
5 | 10 | R25_R | CCTCTGACTACCCTGTCTCTTATACACATCTCCGAGCCCAC | 10
6 | 11 | R26_F | GACACGACGCTCTTCCGATCTNNNNNNNNTTCTCACGTGTTT | 11
6 | 12 | R26_R | AACACGTGAGAACTGTCTCTTATACACATCTCCGAGCCCAC | 12
7 | 13 | R27_F | GACACGACGCTCTTCCGATCTNNNNNNNNAACTCCACGTAAT | 13
7 | 14 | R27_R | TTACGTGGAGTTCTGTCTCTTATACACATCTCCGAGCCCAC | 14
8 | 15 | R28_F | GACACGACGCTCTTCCGATCTNNNNNNNNTTCTCGAGAATTT | 15
8 | 16 | R28_R | AATTCTCGAGAACTGTCTCTTATACACATCTCCGAGCCCAC | 16
9 | 17 | R29_F | GACACGACGCTCTTCCGATCTNNNNNNNNAAACTCTTCCAAT | 17
9 | 18 | R29_R | TTGGAAGAGTTTCTGTCTCTTATACACATCTCCGAGCCCAC | 18
10 | 19 | R30_F | GACACGACGCTCTTCCGATCTNNNNNNNNTTGGAACGTCTTT | 19
10 | 20 | R30_R | AAGACGTTCCAACTGTCTCTTATACACATCTCCGAGCCCAC | 20
11 | 21 | R31_F | GACACGACGCTCTTCCGATCTNNNNNNNNCCGGACTCCTCCT | 21
11 | 22 | R31_R | GGAGGAGTCCGGCTGTCTCTTATACACATCTCCGAGCCCAC | 22
12 | 23 | R32_F | GACACGACGCTCTTCCGATCTNNNNNNNNAAGGAGGAGTAAT | 23
12 | 24 | R32_R | TTACTCCTCCTTCTGTCTCTTATACACATCTCCGAGCCCAC | 24
[0128] In Table 6, 8 Ns represent an 8-bp random tag. In practical applications, the random tag length can be 8-14 bp.
[0129] The underlined part is the 12-bp anchor sequence (in the upstream sequence, the 12 bases immediately following the random tag; in the downstream sequence, the first 12 bases from the 5′ end). In each group, the underlined parts of the upstream sequence (names containing “F”) and the downstream sequence (names containing “R”) are reverse complementary, so the upstream and downstream sequences can be annealed together to form an adapter. At the same time, the anchor sequence serves as a sequence-fixed built-in tag for labeling the original template molecule. In practical applications, the anchor sequence can be 12-20 bp long, with no more than 3 consecutive repeating bases, and it must not interact with other parts of the primer (for example by forming hairpin structures or dimers); across the 12 groups, the bases are balanced at each position (i.e., A, T, C, and G are evenly distributed), and the number of mismatched bases is ≥3 (that is, any two anchor sequences differ by at least 3 bases, and the difference can be in position or in order).
[0130] The bold T at the 3′ end of the upstream sequence is complementary to the “A” added to the end of the original molecule, which enables TA ligation.
[0131] In the upstream sequence, positions 1 to 21 from the 5′ end (from the TruSeq sequencing kit of Illumina) are the sequencing primer binding sequence, and positions 1 to 19 from the 5′ end are the library amplification primer portion.
[0132] In the downstream sequence, the non-underlined part (from the Nextera sequencing kit of Illumina) is the sequencing primer binding sequence, and positions 1 to 22 from the 3′ end are the library amplification primer portion.
[0133] Table 6 contains a total of 12 sets of adapters, which can form 12 × 12 = 144 kinds of marker combinations. Combined with the sequence information of the molecule itself, this is enough to distinguish all molecules in the original sample. In practical applications, the number of groups can be appropriately increased (at a higher synthesis cost) or decreased (with a slightly weaker differentiation effect).
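The anchor design rules stated above can be checked against the 12 anchor sequences of Table 6 with a short Python sketch. The anchors below are read from the upstream sequences (the 12 bases between the random tag and the terminal T); interpreting "number of mismatched bases" as the pairwise Hamming distance is an assumption of this sketch, and the helper names are illustrative.

```python
from itertools import combinations

# 12-bp anchor sequences taken from the upstream ("F") oligos in Table 6.
ANCHORS = [
    "CCACTAGTAGCC", "GGACTGTGTCGG", "GGTACTGACAGG", "CCTAGTACAGCC",
    "GGTAGTCAGAGG", "TTCTCACGTGTT", "AACTCCACGTAA", "TTCTCGAGAATT",
    "AAACTCTTCCAA", "TTGGAACGTCTT", "CCGGACTCCTCC", "AAGGAGGAGTAA",
]


def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def longest_run(seq: str) -> int:
    """Length of the longest stretch of identical consecutive bases."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


min_distance = min(hamming(a, b) for a, b in combinations(ANCHORS, 2))
max_run = max(longest_run(a) for a in ANCHORS)
n_combinations = len(ANCHORS) ** 2  # one anchor at each end of a molecule

print(min_distance >= 3)   # True: every pair differs at 3 or more positions
print(max_run <= 3)        # True: no run longer than 3 identical bases
print(n_combinations)      # 144
print(reverse_complement(ANCHORS[0]))  # GGCTACTAGTGG, the start of R21_R
```

The same checks could be reused when designing a larger or smaller anchor set, as the text notes is possible in practice.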
[0134] The structure of the ligation product is shown in
5. Purification of Ligation Products
[0135] 110-220 µl (i.e. 1-2 times the volume) of AMPure XP magnetic beads (Beckman A63880) was added to the ligation product obtained in step 4, mixed by vortexing, left at room temperature for 10 min, and then placed on a magnetic stand for 5 min. After the solution became clear, the supernatant was discarded, and the beads were washed twice with 200 µl of 80% (volume percent) aqueous ethanol, after which the supernatant was discarded. After the ethanol had air-dried, 30 µl of DNase/RNase-Free Water was added, mixed by vortexing, left at room temperature for 10 min, and placed on a magnetic stand for 5 min. The supernatant was pipetted into a PCR tube as the PCR template.
6. Library Amplification and Purification
[0136] The PCR template obtained in step 5 was taken to configure the reaction system according to Table 7, and PCR amplification was performed according to Table 8 to obtain PCR amplification products (stored at 4° C.).
TABLE 7. Reaction system
Composition | Volume
HIFI (KAPA KK8505) | 35 µl
MC_F (33 µM) | 2.5 µl
MC_R (33 µM) | 2.5 µl
PCR template | 30 µl
Total volume | 70 µl

In Table 7, the primer information is as follows:
MC_F (SEQ ID NO.25): 5′-GACACGACGCTCTTCCGAT-3′;
MC_R (SEQ ID NO.26): 5′-GTGGGCTCGGAGATGTGTATAA-3′.

TABLE 8. Reaction procedure
Temperature | Time | Number of cycles
98° C. | 45 s |
98° C. | 15 s | 7-10 cycles
57-60° C. | 30 s | 7-10 cycles
72° C. | 30 s | 7-10 cycles
72° C. | 5 min |
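The relationship between the library amplification primers and the adapter oligos described in paragraphs [0131] and [0132] can be verified with a small Python sketch. The sequences are copied from Table 6 (group 1) and from the primer information for Table 7; the variable and function names are illustrative only.

```python
UPSTREAM_R21_F = "GACACGACGCTCTTCCGATCTNNNNNNNNCCACTAGTAGCCT"
DOWNSTREAM_R21_R = "GGCTACTAGTGGCTGTCTCTTATACACATCTCCGAGCCCAC"
MC_F = "GACACGACGCTCTTCCGAT"
MC_R = "GTGGGCTCGGAGATGTGTATAA"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def split_upstream(oligo: str, tag_len: int = 8, anchor_len: int = 12):
    """Split an upstream oligo into (sequencing site, tag, anchor, 3' tail).

    Layout per the description: 21-nt sequencing primer binding site,
    random tag, anchor sequence, then the terminal T used for TA ligation.
    """
    site = oligo[:21]
    tag = oligo[21:21 + tag_len]
    anchor = oligo[21 + tag_len:21 + tag_len + anchor_len]
    tail = oligo[21 + tag_len + anchor_len:]
    return site, tag, anchor, tail


# MC_F is identical to positions 1-19 (from the 5' end) of the upstream oligo.
print(UPSTREAM_R21_F[:19] == MC_F)                          # True
# MC_R is the reverse complement of the last 22 nt of the downstream oligo.
print(reverse_complement(DOWNSTREAM_R21_R[-22:]) == MC_R)   # True
print(split_upstream(UPSTREAM_R21_F))
# ('GACACGACGCTCTTCCGATCT', 'NNNNNNNN', 'CCACTAGTAGCC', 'T')
```

Because both primers bind only to the common adapter portions, they amplify every ligated molecule regardless of its random tag or anchor.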
[0137] 70-140 µl (i.e. 1-2 times the volume) of AMPure XP magnetic beads was added to the PCR amplification product obtained above, mixed by vortexing, left at room temperature for 10 min, and then placed on a magnetic stand for 5 min. After the solution became clear, the supernatant was discarded, and the beads were washed twice with 200 µl of 80% (volume percent) aqueous ethanol, after which the supernatant was discarded. After the ethanol had air-dried, 100 µl of DNase/RNase-Free Water was added, mixed by vortexing, left at room temperature for 10 min, and placed on a magnetic stand for 5 min. The supernatant was collected as the product (stored at -20° C.). This product is the MC library, which can be stored for a long time and used repeatedly.
[0138] After testing, the MC library could support 10-20 subsequent tests, and the results of each test could represent the mutation status of all the original samples and the methylation modification status in the areas covered by the restriction sites, without reducing the sensitivity and specificity. At the same time, the library construction method is not only applicable to cfDNA samples, but also to genomic DNA or cDNA samples.
Example 2. RaceSeq Target Region Enrichment and Construction of a Sequencing Library
[0142] 1. 300 ng of the MC library prepared in Example 1 was taken and divided into two parts, and each part was used to configure the reaction system of Table 9 (GSP1A mix was added to one part and GSP1B mix to the other). The first round of PCR amplification was carried out according to the reaction procedure of Table 11 to obtain the first-round amplification products (two in total: one amplified with the GSP1A mix and the other with the GSP1B mix).
TABLE 9. Reaction system
Composition | Volume
Hifi (KAPA KK8505) | 15 µl
Upstream primer 1355 | 3 µl
GSP1A mix/GSP1B mix | 2 µl
MC library | 10 µl
Total volume | 30 µl

[0143] In Table 9, the primer information is as follows:
Upstream primer 1355 (SEQ ID NO.27): 5′-TCTTTCCCTACACGACGCTCTTCCGAT-3′.
[0144] GSP1A mix: each primer in the primer pool GSP1A in Table 10 was dissolved and diluted to a concentration of 100 µM with TE buffer, then mixed in equal volumes, and diluted to 0.3 µM with TE buffer. The primers in primer pool GSP1A were used to amplify the positive strand of the template.
[0145] GSP1B mix: each primer in the primer pool GSP1B in Table 10 was dissolved and diluted to a concentration of 100 µM with TE buffer, then mixed in equal volumes, and diluted to 0.3 µM with TE buffer. The primers in primer pool GSP1B were used to amplify the negative strand of the template.
[0146] In the primer pool GSP1A and the primer pool GSP1B, the primers with the same number (that is, the last four digits of the primer number are the same) detect the same mutation site from both positive and negative directions, and simultaneous use can maximize the enrichment of original molecular information.
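The pairing rule in paragraph [0146] (primers whose numbers share the same last four digits target the same site from opposite strands) can be sketched in Python as follows. The function name and data layout are hypothetical; the primer numbers in the example are taken from Table 10.

```python
from collections import defaultdict


def pair_by_number(primers):
    """Pair GSP1A and GSP1B primers that share the last four digits.

    primers: iterable of (pool, primer_number) tuples,
    e.g. ("GSP1A", "HA1009").
    Returns {four_digit_number: (gsp1a_primer, gsp1b_primer)} for every
    number that is present in both pools.
    """
    by_number = defaultdict(dict)
    for pool, primer_number in primers:
        by_number[primer_number[-4:]][pool] = primer_number
    return {
        number: (pools["GSP1A"], pools["GSP1B"])
        for number, pools in by_number.items()
        if "GSP1A" in pools and "GSP1B" in pools
    }


example = [
    ("GSP1A", "HA1009"), ("GSP1B", "HB1009"),   # AXIN1
    ("GSP1A", "HA1018"), ("GSP1B", "HB1018"),   # CTNNB1
    ("GSP1A", "HA1071"), ("GSP1B", "HB1071"),   # TP53
    ("GSP1B", "HB1066"),                        # no GSP1A partner listed
]
print(pair_by_number(example))
# {'1009': ('HA1009', 'HB1009'), '1018': ('HA1018', 'HB1018'), '1071': ('HA1071', 'HB1071')}
```

Numbers present in only one pool are left unpaired in this sketch rather than treated as errors.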
TABLE 10. Primer information
Gene name | Primer pool | Primer number | Nucleotide sequence (5′-3′) | SEQ ID NO.
AXIN1 | GSP1A | HA1009 | TGTATTAGGGTGCAGCGCTC | 28
AXIN1 | GSP1A | HA1010 | CGCTCGGATCTGGACCTG | 29
AXIN1 | GSP1A | HA1011 | TGGAGCCCTGTGACTCGAA | 30
AXIN1 | GSP1A | HA1012 | GTGACCAGGACATGGATGAGG | 31
AXIN1 | GSP1A | HA1013 | TCCTCCAGTAGACGGTACAGC | 32
AXIN1 | GSP1A | HA1014 | TGCTGCTTGTCCCCACAC | 33
AXIN1 | GSP1A | HA1015 | CCGCTTGGCACCACTTCC | 34
AXIN1 | GSP1A | HA1016 | GGCACGGGAAGCACGTAC | 35
AXIN1 | GSP1A | HA1017 | CCTTGCAGTGGGAAGGTG | 36
CTNNB1 | GSP1A | HA1018 | GACAGAAAAGCGGCTGTTAGTCA | 37
TERT | GSP1A | HA1019 | CCGACCTCAGCTACAGCAT | 38
TERT | GSP1A | HA1020 | ACTTGAGCAACCCGGAGTCTG | 39
TERT | GSP1A | HA1021 | CTCCTAGCTCTGCAGTCCGA | 40
TERT | GSP1A | HA1022 | GCGCCTGGCTCCATTTCC | 41
TERT | GSP1A | HA1023 | CGCCTGAGAACCTGCAAAGAG | 42
TERT | GSP1A | HA1024 | GTCCAGGGAGCAATGCGT | 43
TERT | GSP1A | HA1025 | CGGGTTACCCCACAGCCTA | 44
TERT | GSP1A | HA1026 | GGCTCCCAGTGGATTCGC | 45
TERT | GSP1A | HA1027 | GTCCTGCCCCTTCACCTT | 46
HBV-C | GSP1A | HA1028 | CCGACTACTGCCTCACCCATAT | 47
HBV-C | GSP1A | HA1029 | GGGTTTTTCTTGTTGACAAGAATCCT | 48
HBV-C | GSP1A | HA1030 | CCAACCTCCAATCACTCACCAA | 49
HBV-C | GSP1A | HA1031 | GGCGTTTTATCATATTCCTCTTCATCCT | 50
HBV-C | GSP1A | HA1032 | CTACTTCCAGGAACATCAACTACCAG | 51
HBV-C | GSP1A | HA1033 | CTGCACTTGTATTCCCATCCCAT | 52
HBV-C | GSP1A | HA1034 | TCAGTTTACTAGTGCCATTTGTTCAGT | 53
HBV-C | GSP1A | HA1035 | TACAACATCTTGAGTCCCTTTTTACCTC | 54
HBV-C | GSP1A | HA1036 | AGAATTGTGGGTCTTTTGGGCTT | 55
HBV-C | GSP1A | HA1037 | TGTAAACAATATCTGAACCTTTACCCTGTT | 56
HBV-C | GSP1A | HA1038 | GCATGCGTGGAACCTTTGTG | 57
HBV-C | GSP1A | HA1039 | AACTCTGTTGTCCTCTCTCGGAA | 58
HBV-C | GSP1A | HA1040 | CTGAATCCCGCGGACGAC | 59
HBV-C | GSP1A | HA1041 | CCGTCTGTGCCTTCTCATCTG | 60
HBV-C | GSP1A | HA1042 | GAACGCCCACCAGGTCTTG | 61
HBV-C | GSP1A | HA1043 | CCTTGAGGCGTACTTCAAAGACTG | 62
HBV-C | GSP1A | HA1044 | GGAGGCTGTAGGCATAAATTGGT | 63
HBV-C | GSP1A | HA1045 | GTCCTACTGTTCAAGCCTCCAA | 64
HBV-C | GSP1A | HA1046 | GGGCTTCTGTGGAGTTACTCTC | 65
HBV-C | GSP1A | HA1047 | TTGTATCGGGAGGCCTTAGAGT | 66
HBV-C | GSP1A | HA1048 | TTCTGTGTTGGGGTGAGTTGA | 67
HBV-C | GSP1A | HA1049 | CCAGCATCCAGGGAATTAGTAGTCA | 68
HBV-C | GSP1A | HA1050 | TTCCTGTCTTACCTTTGGAAGAGAAAC | 69
HBV-C | GSP1A | HA1051 | CCGGAAACTACTGTTGTTAGACGTA | 70
HBV-C | GSP1A | HA1052 | CGTCGCAGAAGATCTCAATCTCG | 71
HBV-C | GSP1A | HA1053 | AAACTCCCTCCTTTCCTAACATTCATTT | 72
HBV-C | GSP1A | HA1054 | TATGCCTGCTAGGTTCTATCCTAACC | 73
HBV-C | GSP1A | HA1055 | GGCATTATTTACATACTCTGTGGAAGG | 74
HBV-C | GSP1A | HA1056 | GTTGGTCTTCCAAACCTCGACA | 75
HBV-C | GSP1A | HA1057 | TTCAACCCCAACAAGGATCACT | 76
HBV-C | GSP1A | HA1058 | TTCCACCAATCGGCAGTCAG | 77
HBV-B | GSP1A | HA1059 | GCCCTGCTCAGAATACTGTCT | 78
HBV-B | GSP1A | HA1060 | ATTCGCAGTCCCAAATCTCC | 79
HBV-B | GSP1A | HA1061 | CATCTTCCTCTGCATCCTGCT | 80
HBV-B | GSP1A | HA1062 | TTCCAGGATCATCAACCACCAG | 81
HBV-B | GSP1A | HA1063 | GTCCCTTTATGCCGCTGT | 82
HBV-B | GSP1A | HA1064 | ACCCTTATAAAGAATTTGGAGCTACTGTG | 83
HBV-B | GSP1A | HA1065 | CTCCTGAACATTGCTCACCTCA | 84
TP53 | GSP1A | HA1071 | AGACTGCCTTCCGGGTCA | 85
TP53 | GSP1A | HA1072 | CCTGTGGGAAGCGAAAATTCCA | 86
TP53 | GSP1A | HA1073 | ACCTGGTCCTCTGACTGCT | 87
TP53 | GSP1A | HA1074 | AAGCAATGGATGATTTGATGCTGT | 88
TP53 | GSP1A | HA1075 | GACCCAGGTCCAGATGAAGC | 89
TP53 | GSP1A | HA1076 | TCCTGGCCCCTGTCATCT | 90
TP53 | GSP1A | HA1077 | GTGCCCTGACTTTCAACTCTGT | 91
TP53 | GSP1A | HA1078 | CAACTGGCCAAGACCTGC | 92
TP53 | GSP1A | HA1079 | CGCCATGGCCATCTACAAGC | 93
TP53 | GSP1A | HA1080 | GGTCCCCAGGCCTCTGAT | 94
TP53 | GSP1A | HA1081 | GAGTGGAAGGAAATTTGCGTGT | 95
TP53 | GSP1A | HA1082 | GCACTGGCCTCATCTTGGG | 96
TP53 | GSP1A | HA1083 | CCATCCACTACAACTACATGTGTAAC | 97
TP53 | GSP1A | HA1084 | TTTCCTTACTGCCTCTTGCTTCTC | 98
TP53 | GSP1A | HA1085 | GGGACGGAACAGCTTTGAGG | 99
TP53 | GSP1A | HA1086 | CACAGAGGAAGAGAATCTCCGCA | 100
TP53 | GSP1A | HA1087 | TGCCTCAGATTCACTTTTATCACCTT | 101
TP53 | GSP1A | HA1088 | CTCAGGTACTGTGTATATACTTACTTCTCC | 102
TP53 | GSP1A | HA1089 | CGTGAGCGCTTCGAGATGT | 103
TP53 | GSP1A | HA1090 | GTGATGTCATCTCTCCTCCCTG | 104
TP53 | GSP1A | HA1091 | TGAAGTCCAAAAAGGGTCAGTCTAC | 105
AXIN1 | GSP1B | HB1009 | GGGAGCATCTTCGGTGAAAC | 106
AXIN1 | GSP1B | HB1010 | CAGGCTTATCCCATCTTGGTCA | 107
AXIN1 | GSP1B | HB1011 | TTGGTGGCTGGCTTGGTC | 108
AXIN1 | GSP1B | HB1012 | GCTGTACCGTCTACTGGAGGA | 109
AXIN1 | GSP1B | HB1013 | GCTTGTTCTCCAGCTCTCGGA | 110
AXIN1 | GSP1B | HB1014 | GGGAAGTGGTGCCAAGCG | 111
AXIN1 | GSP1B | HB1015 | GCACACGCTGTACGTGCT | 112
AXIN1 | GSP1B | HB1016 | GCCTCCACCTGCTCCTTG | 113
AXIN1 | GSP1B | HB1017 | CCCTCAATGATCCACTGCATGA | 114
CTNNB1 | GSP1B | HB1018 | CTCATACAGGACTTGGGAGGTATC | 115
TERT | GSP1B | HB1019 | CACAACCGCAGGACAGCT | 116
TERT | GSP1B | HB1020 | CTCCAAGCCTCGGACTGC | 117
TERT | GSP1B | HB1021 | GCCTCACACCAGCCACAAC | 118
TERT | GSP1B | HB1022 | TCCCCACCATGAGCAAACCA | 119
TERT | GSP1B | HB1023 | GTGCCTCCCTGCAACACT | 120
TERT | GSP1B | HB1024 | GCACCACGAATGCCGGAC | 121
TERT | GSP1B | HB1025 | GTGGGGTAACCCGAGGGA | 122
TERT | GSP1B | HB1026 | GAGGAGGCGGAGCTGGAA | 123
TERT | GSP1B | HB1027 | AGCGCTGCCTGAAACTCG | 124
TERT | GSP1B | HB1028 | CGCACGAACGTGGCCAG | 125
HBV-C | GSP1B | HB1029 | GAGCCACCAGCAGGAAAGT | 126
HBV-C | GSP1B | HB1030 | CTAGGAATCCTGATGTTGTGCTCT | 127
HBV-C | GSP1B | HB1031 | CGCGAGTCTAGACTCTGTGGTA | 128
HBV-C | GSP1B | HB1032 | ATAGCCAGGACAAATTGGAGGACA | 129
HBV-C | GSP1B | HB1033 | GACAAACGGGCAACATACCTT | 130
HBV-C | GSP1B | HB1034 | CCGAAGGTTTTGTACAGCAACAA | 131
HBV-C | GSP1B | HB1035 | CTGAGCCAGGAGAAACGGACTGA | 132
HBV-C | GSP1B | HB1036 | GGGACTCAAGATGTTGTACAGACTTG | 133
HBV-C | GSP1B | HB1037 | GTTAAGGGAGTAGCCCCAACG | 134
HBV-C | GSP1B | HB1038 | CAGGCAGTTTTCGAAAACATTGCTT | 135
HBV-C | GSP1B | HB1039 | TTAAAGCAGGATAGCCACATTGTGTAA | 136
HBV-C | GSP1B | HB1040 | GGCAACAGGGTAAAGGTTCAGATAT | 137
HBV-C | GSP1B | HB1041 | CCACAAAGGTTCCACGCAT | 138
HBV-C | GSP1B | HB1042 | TGGAAAGGAAGTGTACTTCCGAGA | 139
HBV-C | GSP1B | HB1043 | GTCGTCCGCGGGATTCAG | 140
HBV-C | GSP1B | HB1044 | AAGGCACAGACGGGGAGA | 141
HBV-C | GSP1B | HB1045 | TCACGGTGGTCTCCATGC | 142
HBV-C | GSP1B | HB1046 | GGTCGTTGACATTGCTGAGAGT | 143
HBV-C | GSP1B | HB1047 | AACCTAATCTCCTCCCCCAACT | 144
HBV-C | GSP1B | HB1048 | GCAGAGGTGAAAAAGTTGCATGG | 145
HBV-C | GSP1B | HB1049 | CCACCCAAGGCACAGCTT | 146
HBV-C | GSP1B | HB1050 | ACTCCACAGAAGCCCCAA | 147
HBV-C | GSP1B | HB1051 | GCCTCCCGATACAAAGCAGA | 148
HBV-C | GSP1B | HB1052 | GATTCATCAACTCACCCCAACACA | 149
HBV-C | GSP1B | HB1053 | ACATAGCTGACTACTAATTCCCTGGAT | 150
HBV-C | GSP1B | HB1054 | ATCCACACTCCAAAAGACACCAAAT | 151
HBV-C | GSP1B | HB1055 | GCGAGGGAGTTCTTCTTCTAGG | 152
HBV-C | GSP1B | HB1056 | CAGTAAAGTTTCCCACCTTGTGAGT | 153
HBV-C | GSP1B | HB1057 | CCTCCTGTAAATGAATGTTAGGAAAGG | 154
HBV-C | GSP1B | HB1058 | GTTTAATGCCTTTATCCAAGGGCAAA | 155
HBV-C | GSP1B | HB1059 | CTCTTATATAGAATCCCAGCCTTCCAC | 156
HBV-C | GSP1B | HB1060 | CTTGTCGAGGTTTGGAAGACCA | 157
HBV-C | GSP1B | HB1061 | GTTTGAGTTGGCTCCGAACG | 158
HBV-C | GSP1B | HB1062 | CTGAGGGCTCCACCCCAA | 159
HBV-C | GSP1B | HB1063 | GTGAAGAGATGGGAGTAGGCTGT | 160
HBV-B | GSP1B | HB1064 | CCCATCTTTTTGTTTTGTGAGGGTTT | 161
HBV-B | GSP1B | HB1065 | TTAAAGCAGGATATCCACATTGCGTA | 162
HBV-B | GSP1B | HB1066 | TTGCTGAAAGTCCAAGAGTCCT | 163
HBV-B | GSP1B | HB1067 | GGTGAGCAATGTTCAGGAGATTC | 164
HBV-B | GSP1B | HB1068 | ACTACTAGATCCCTGGACGCTG | 165
HBV-B | GSP1B | HB1069 | GGTGGAGATAAGGGAGTAGGCTG | 166
TP53 | GSP1B | HB1071 | TGCCCTTCCAATGGATCCAC | 167
TP53 | GSP1B | HB1072 | GTCCCCAGCCCAACCCTT | 168
TP53 | GSP1B | HB1073 | CTCTGGCATTCTGGGAGCTT | 169
TP53 | GSP1B | HB1074 | TGGTAGGTTTTCTGGGAAGGGA | 170
TP53 | GSP1B | HB1075 | TGTCCCAGAATGCAAGAAGCC | 171
TP53 | GSP1B | HB1076 | GGCATTGAAGTCTCATGGAAGCCA | 172
TP53 | GSP1B | HB1077 | ACCTCCGTCATGTGCTGTGA | 173
TP53 | GSP1B | HB1078 | CTCACCATCGCTATCTGAGCA | 174
TP53 | GSP1B | HB1079 | GCAACCAGCCCTGTCGTC | 175
TP53 | GSP1B | HB1080 | GCACCACCACACTATGTCGAA | 176
TP53 | GSP1B | HB1081 | TTAACCCCTCCTCCCAGAGAC | 177
TP53 | GSP1B | HB1082 | TTCCAGTGTGATGATGGTGAGGAT | 178
TP53 | GSP1B | HB1083 | CAGCAGGCCAGTGTGCAG | 179
TP53 | GSP1B | HB1084 | CCGGTCTCTCCCAGGACA | 180
TP53 | GSP1B | HB1085 | GTGAGGCTCCCCTTTCTTGC | 181
TP53 | GSP1B | HB1086 | TGGTCTCCTCCACCGCTTC | 182
TP53 | GSP1B | HB1087 | GAAACTTTCCACTTGATAAGAGGTCC | 183
TP53 | GSP1B | HB1088 | CTCCCCCCTGGCTCCTTC | 184
TP53 | GSP1B | HB1089 | GGGGAGTAGGGCCAGGAAG | 185
TP53 | GSP1B | HB1090 | GCCCTTCTGTCTTGAACATGAGT | 186
TP53 | GSP1B | HB1091 | GTGGGAGGCTGTCAGTGG | 187
BDH1 | GSP1A | CA1001 | GCCACCCGGACGCTTC | 188
EMX1 | GSP1A | CA1002 | CAAACGAAACCCCACACGAAC | 189
LRRC4 | GSP1A | CA1003 | GCGGAGGGAGCGAGTTC | 190
LRRC4 | GSP1A | CA1004 | AACATAGTCCCCGCTGGCTA | 191
LRRC4 | GSP1A | CA1005 | GGAGCGCTCAAACCCACA | 192
LRRC4 | GSP1A | CA1006 | TACAACTGGCCCGTGTGG | 193
BDH1 | GSP1A | CA1007 | GTCCTTCTTCGCCTGGCATC | 194
CLEC11A | GSP1A | CA1008 | TGGGCTGGGAGACCGTG | 195
CLEC11A | GSP1A | CA1009 | CCACCGGCTCTTCAAGCTC | 196
CLEC11A | GSP1A | CA1010 | CATCGTCGCCGCTGCA | 197
HOXA1 | GSP1A | CA1011 | AACGCATAGGAGGGGTGGAA | 198
HOXA1 | GSP1A | CA1012 | CCTTTGGGTTGGGAGAAGAAAA | 199
EMX1 | GSP1A | CA1013 | CACCCGCCGTGTACGTTT | 200
AK055957 | GSP1A | CA1014 | CGGAATCGGGGTCTAAGTGG | 201
COTL1 | GSP1B | CB1001 |
CCTAGCGATCAGGGCACC (SEQ ID NO.202) COTL1 GSP1B CB1002 GATGAGAGAGCAGTCTGCGT (SEQ ID NO.203) COTL1 GSP1B CB1003 CGTTCTCGCGCTCTGCTTAC (SEQ ID NO.204) ACP1 GSP1B CB1004 GACCCCCGCTGCTCAC (SEQ ID NO.205) ACP1 GSP1B CB1005 CCCCCTAAGCCGCTGTT (SEQ ID NO.206) DAB2IP GSP1B CB1006 CCACACGGGCCAGTTGTA (SEQ ID NO.207) DAB2IP GSP1B CB1007 TGGCCGTTTTCGAAGAGGTAGA (SEQ ID NO.208) DAB2IP GSP1B CB1008 CACCGTTGGGCTGGTCC (SEQ ID NO.209) ACTB GSP1B CB1009 CGAGCTTGAAGAGCCGGTG (SEQ ID NO.210) BDH1 GSP1B CB1010 CGCCCACCCGAGTTCCT (SEQ ID NO.211) BDH1 GSP1B CB1011 TGGCCGGGACTGGAGG (SEQ ID NO.212) LRRC4 GSP1B CB1012 GGTAATACGTTCCGGCACTTCG (SEQ ID NO.213) LRRC4 GSP1B CB1013 GCCCCCACTTTCCAACTCC (SEQ ID NO.214) BDH1 GSP1B CB1014 GCGGTTCCGAAGTCCCTG (SEQ ID NO.215) LRRC4 GSP1B CB1015 CTCTCCAGCCCTCGGTG (SEQ ID NO.216)
TABLE-US-00015 Reaction Procedure
Temperature   Time      Number of cycles
98° C.        3 min
98° C.        15 s      6-10 cycles
57-60° C.     60-90 s
72° C.        120 s
72° C.        10 min
[0147] 2. The two first-round amplification products obtained in step 1 were each purified with 30-60 μl (i.e. 1-2 times the volume) of AMPure XP magnetic beads and then eluted with 25 μl of DNase/RNase-Free Water to obtain the first-round purification products.
[0148] 3. The first-round purification products obtained in step 2 were used as templates to set up the reaction system of Table 12 (when the GSP1A mix amplification product was the template, GSP2A mix was used for amplification; when the GSP1B mix amplification product was the template, GSP2B mix was used for amplification). The second round of PCR amplification was carried out according to the reaction procedure in Table 14 to obtain the second-round amplification products (stored at 4° C.).
TABLE-US-00016 Reaction system
Composition                       Volume
KapaHifi                          15 μl
Upstream primer 3355              2 μl
GSP2A mix/GSP2B mix               1 μl
Index primer (10 μM)              2 μl
Template (GSP1A mix/GSP1B mix)    10 μl
Total volume                      30 μl
[0149] In Table 12, the primer information is as follows:
[0150] Upstream primer 3355 (SEQ ID NO.217):
[0151] 5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCT-3′. The underlined part is identical to part of the first-round upstream primer 1355; 3355 and 1355 are fixed sequences for sequencing on the Illumina sequencing platform (they can be replaced with sequences suitable for other sequencing platforms).
[0152] GSP2A mix: Each primer in the primer pool GSP2A in Table 13 was dissolved and diluted to a concentration of 100 μM with TE buffer; the primers were then mixed in equal volumes and diluted to 0.3 μM with TE buffer. The primers in the primer pool GSP2A were used to amplify the positive strand of the template.
[0153] GSP2B mix: Each primer in the primer pool GSP2B in Table 13 was dissolved and diluted to a concentration of 100 μM with TE buffer; the primers were then mixed in equal volumes and diluted to 0.3 μM with TE buffer. The primers in the primer pool GSP2B were used to amplify the negative strand of the template.
[0154] In Table 13, positions 1 to 15 from the 5′ end are the parts that bind to the Index primer.
[0155] Primers with the same primer number in GSP2A mix and GSP1A mix (that is, the last three digits of the primer number are the same) are designed for the same mutation site, and the two primers form a nested relationship.
[0156] Primers with the same primer number in GSP2B mix and GSP1B mix (that is, the last three digits of the primer number are the same) are designed for the same mutation site, and the two primers form a nested relationship.
[0157] Index primer:
TABLE-US-00017 5′-CAAGCAGAAGACGGCATACGAGAT (SEQ ID NO.218)
TABLE-US-00018 ********GTGACTGGAGTTCCTTGGCACCCGAGAA-3′ (SEQ ID NO.219);
the underlined part is the part that binds to GSP2 mix. ******** marks the position of the index sequence; the index is 6-8 bp long and serves to distinguish the sequences of different samples, so that multiple samples can be pooled and sequenced together. Apart from the index sequence, the rest of the primer consists of fixed sequences from the Illumina small RNA sequencing kit.
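The relationship between the Index primer and the GSP2 primers can be checked with a short script. The following is a minimal sketch, not part of the described method: the function names are hypothetical, the sample index "ACGTACGT" is invented for illustration, and the check only confirms that positions 1 to 15 of a GSP2 primer are identical to the 3′-terminal 15 bases of the Index primer, which is what allows the Index primer to anneal to the complement of that tail once the tail has been copied.

```python
# Fixed parts of the Index primer as given in the text (SEQ ID NO.218 and SEQ ID NO.219).
INDEX_PRIMER_5P = "CAAGCAGAAGACGGCATACGAGAT"
INDEX_PRIMER_3P = "GTGACTGGAGTTCCTTGGCACCCGAGAA"


def build_index_primer(index_seq: str) -> str:
    """Assemble a full Index primer around a 6-8 bp sample index (hypothetical helper)."""
    if not 6 <= len(index_seq) <= 8:
        raise ValueError("index must be 6-8 bp long")
    if set(index_seq) - set("ACGT"):
        raise ValueError("index must contain only A, C, G and T")
    return INDEX_PRIMER_5P + index_seq + INDEX_PRIMER_3P


def shares_index_tail(gsp2_primer: str, tail_len: int = 15) -> bool:
    """True if the first tail_len bases of a GSP2 primer equal the 3' end of the Index primer."""
    return gsp2_primer[:tail_len] == INDEX_PRIMER_3P[-tail_len:]


# HA2009 (SEQ ID NO.220) is copied from the primer table; the index is made up.
ha2009 = "CTTGGCACCCGAGAATTCCATTGTTCCTTGACGCAGAG"
index_primer = build_index_primer("ACGTACGT")
print(len(index_primer), shares_index_tail(ha2009))
```

A check of this kind is mainly useful as a sanity test when a primer pool is re-ordered or extended with new targets.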
TABLE-US-00019 Primer Information Gene name Primer pool Primer number SEQ ID NO.Primer sequence (5′ -3′ ) AXIN1 GSP2A HA2009 CTTGGCACCCGAGAATTCCATTGTTCCTTGACGCAGAG (SEQ ID NO.220) AXIN1 GSP2A HA2010 CTTGGCACCCGAGAATTCCAGACCTGGGGTATGAGCCTGA (SEQ ID NO.221) AXIN1 GSP2A HA2011 CTTGGCACCCGAGAATTCCAAGGCTGAAGCTGGCGAGA (SEQ ID NO.222) AXIN1 GSP2A HA2012 CTTGGCACCCGAGAATTCCATGAGGACGATGGCAGAGACG (SEQ ID NO.223) AXIN1 GSP2A HA2013 CTTGGCACCCGAGAATTCCAGTACAGCGAAGGCAGAGAGT (SEQ ID NO.224) AXIN1 GSP2A HA2014 CTTGGCACCCGAGAATTCCACACACAGGAGGAGGAAGGTGA (SEQ ID NO.225) AXIN1 GSP2A HA2015 CTTGGCACCCGAGAATTCCATGTGTGGACATGGGCTGTG (SEQ ID NO.226) AXIN1 GSP2A HA2016 CTTGGCACCCGAGAATTCCAACCCAAGTCAGGGGCGAA (SEQ ID NO.227) AXIN1 GSP2A HA2017 CTTGGCACCCGAGAATTCCAGCGTGCAAAAGAAATGCCAAGAAG (SEQ ID NO.228) CTNNB1 GSP2A HA2018 CTTGGCACCCGAGAATTCCATAGTCACTGGCAGCAACAGTC (SEQ ID NO.229) TERT GSP2A HA2019 CTTGGCACCCGAGAATTCCACTGCAAGGCCTCGGGAGA (SEQ ID NO.230) TERT GSP2A HA2020 CTTGGCACCCGAGAATTCCAATTCCTGGGAAGTCCTCAGCT (SEQ ID NO.231) TERT GSP2A HA2021 CTTGGCACCCGAGAATTCCAGCTTGGAGCCAGGTGCCT (SEQ ID NO.232) TERT GSP2A HA2022 CTTGGCACCCGAGAATTCCACATTTCCCACCCTTTCTCGACGG (SEQ ID NO.233) TERT GSP2A HA2023 CTTGGCACCCGAGAATTCCAACGGGCCTGTGTCAAGGA (SEQ ID NO.234) TERT GSP2A HA2024 CTTGGCACCCGAGAATTCCAATGCGTCCTCGGGTTCGT (SEQ ID NO.235) TERT GSP2A HA2025 CTTGGCACCCGAGAATTCCAAGCCTAGGCCGATTCGAC (SEQ ID NO.236) TERT GSP2A HA2026 CTTGGCACCCGAGAATTCCAGATTCGCGGGCACAGACG (SEQ ID NO.237) TERT GSP2A HA2027 CTTGGCACCCGAGAATTCCATTCCAGCTCCGCCTCCTC (SEQ ID NO. 
238) HBV-C GSP2A HA2028 CTTGGCACCCGAGAATTCCACCCATATCGTCAATCTTCTCGAGG (SEQ ID NO.239) HBV-C GSP2A HA2029 CTTGGCACCCGAGAATTCCATCACAGTACCACAGAGTCTAGACTC (SEQ ID NO.240) HBV-C GSP2A HA2030 CTTGGCACCCGAGAATTCCAAACCTCTTGTCCTCCAATTTGTCC (SEQ ID NO.241) HBV-C GSP2A HA2031 CTTGGCACCCGAGAATTCCACCTGCTGCTATGCCTCATCTTC (SEQ ID NO.242) HBV-C GSP2A HA2032 CTTGGCACCCGAGAATTCCACACGGGACCATGCAAGACC (SEQ ID NO.243) HBV-C GSP2A HA2033 CTTGGCACCCGAGAATTCCATGGGCTTTCGCAAGATTCCTAT (SEQ ID NO.244) HBV-C GSP2A HA2034 CTTGGCACCCGAGAATTCCACGTAGGGCTTTCCCCCACT (SEQ ID NO.245) HBV-C GSP2A HA2035 CTTGGCACCCGAGAATTCCACCTCTATTACCAATTTTCTTTTGTCTTTGGG (SEQ ID NO.246) HBV-C GSP2A HA2036 CTTGGCACCCGAGAATTCCAACACAATGTGGCTATCCTGCTT (SEQ ID NO.247) HBV-C GSP2A HA2037 CTTGGCACCCGAGAATTCCAGGCAACGGTCAGGTCTCT (SEQ ID NO.248) HBV-C GSP2A HA2038 CTTGGCACCCGAGAATTCCACTCTGCCGATCCATACTGCGGAA (SEQ ID NO.249) HBV-C GSP2A HA2039 CTTGGCACCCGAGAATTCCACACTTCCTTTCCATGGCTGCTA (SEQ ID NO.250) HBV-C GSP2A HA2040 CTTGGCACCCGAGAATTCCACCGTTTGGGACTCTACCGT (SEQ ID NO.251) HBV-C GSP2A HA2041 CTTGGCACCCGAGAATTCCACGTGTGCACTTCGCTTCA (SEQ ID NO.252) HBV-C GSP2A HA2042 CTTGGCACCCGAGAATTCCATTGCCCAAGGTCTTACATAAGAGG (SEQ ID NO.253) HBV-C GSP2A HA2043 CTTGGCACCCGAGAATTCCAGTTTGTTTAAGGACTGGGAGGAGTT (SEQ ID NO.254) HBV-C GSP2A HA2044 CTTGGCACCCGAGAATTCCAGGTCTGTTCACCAGCACCATG (SEQ ID NO.255) HBV-C GSP2A HA2045 CTTGGCACCCGAGAATTCCACTGTGCCTTGGGTGGCTT (SEQ ID NO.256) HBV-C GSP2A HA2046 CTTGGCACCCGAGAATTCCATTGCCTTCTGATTTCTTTCCTTCTATT (SEQ ID NO.257) HBV-C GSP2A HA2047 CTTGGCACCCGAGAATTCCAGAGTCTCCGGAACATTGTTCACC (SEQ ID NO. 
258) HBV-C GSP2A HA2048 CTTGGCACCCGAGAATTCCAAGTTGATGAATCTGGCCACCT (SEQ ID NO.259) HBV-C GSP2A HA2049 CTTGGCACCCGAGAATTCCACAGCTATGTTAATGTTAATATGGGCCTA (SEQ ID NO.260) HBV-C GSP2A HA2050 CTTGGCACCCGAGAATTCCATATTTGGTGTCTTTTGGAGTGTGGAT (SEQ ID NO.261) HBV-C GSP2A HA2051 CTTGGCACCCGAGAATTCCATAGAGGCAGGTCCCCTAGAAG (SEQ ID NO.262) HBV-C GSP2A HA2052 CTTGGCACCCGAGAATTCCACAATGTTAGTATCCCTTGGACTCACA (SEQ ID NO.263) HBV-C GSP2A HA2053 CTTGGCACCCGAGAATTCCAACAGGAGGACATTATTGATAGATGTCA(SEQ ID NO.264) HBV-C GSP2A HA2054 CTTGGCACCCGAGAATTCCAAACCTTACCAAGTATTTGCCCTT (SEQ ID NO.265) HBV-C GSP2A HA2055 CTTGGCACCCGAGAATTCCATCTGTGGAAGGCTGGGATTCTATAT (SEQ ID NO.266) HBV-C GSP2A HA2056 CTTGGCACCCGAGAATTCCAGGGACAAATCTTTCTGTTCCCA (SEQ ID NO.267) HBV-C GSP2A HA2057 CTTGGCACCCGAGAATTCCAGGCCAGAGGCAAATCAGGT (SEQ ID NO. 268) HBV-C GSP2A HA2058 CTTGGCACCCGAGAATTCCACAGTCAGGAAGACAGCCTACTC (SEQ ID NO.269) HBV-B GSP2A HA2059 CTTGGCACCCGAGAATTCCAAATACTGTCTCTGCCATATCGTCA (SEQ ID NO.270) HBV-B GSP2A HA2060 CTTGGCACCCGAGAATTCCAGTGTGTTTCATGAGTGGGAGGA (SEQ ID NO.271) HBV-B GSP2A HA2061 NA HBV-B GSP2A HA2062 NA HBV-B GSP2A HA2063 NA HBV-B GSP2A HA2064 CTTGGCACCCGAGAATTCCATTTGCCTTCTGACTTCTTTCCGTC (SEQ ID NO.272) HBV-B GSP2A HA2065 CTTGGCACCCGAGAATTCCACACAGCACTCAGGCAAGCTA (SEQ ID NO.273) TP53 GSP2A HA2071 CTTGGCACCCGAGAATTCCAGTCACTGCCATGGAGGAGC (SEQ ID NO.274) TP53 GSP2A HA2072 CTTGGCACCCGAGAATTCCACCATGGGACTGACTTTCTGC (SEQ ID NO.275) TP53 GSP2A HA2073 CTTGGCACCCGAGAATTCCAACTGCTCTTTTCACCCATCTACA (SEQ ID NO.276) TP53 GSP2A HA2074 CTTGGCACCCGAGAATTCCATGTCCCCGGACGATATTGAAC (SEQ ID NO.277) TP53 GSP2A HA2075 CTTGGCACCCGAGAATTCCACAGATGAAGCTCCCAGAATGCC (SEQ ID NO.278) TP53 GSP2A HA2076 CTTGGCACCCGAGAATTCCATGTCATCTTCTGTCCCTTCCCA (SEQ ID NO.279) TP53 GSP2A HA2077 CTTGGCACCCGAGAATTCCACAACTCTGTCTCCTTCCTCTTCCT (SEQ ID NO.280) TP53 GSP2A HA2078 CTTGGCACCCGAGAATTCCATGTGCAGCTGTGGGTTGAT (SEQ ID NO.281) TP53 GSP2A HA2079 CTTGGCACCCGAGAATTCCACAAGCAGTCACAGCACATGACG (SEQ ID NO. 
282) TP53 GSP2A HA2080 CTTGGCACCCGAGAATTCCACCTCTGATTCCTCACTGATTGCT (SEQ ID NO.283) TP53 GSP2A HA2081 CTTGGCACCCGAGAATTCCATTGCGTGTGGAGTATTTGGATG (SEQ ID NO. 284) TP53 GSP2A HA2082 CTTGGCACCCGAGAATTCCATCTTGGGCCTGTGTTATCTCCT (SEQ ID NO. 285) TP53 GSP2A HA2083 CTTGGCACCCGAGAATTCCAACATGTGTAACAGTTCCTGCATGG (SEQ ID NO.286) TP53 GSP2A HA2084 CTTGGCACCCGAGAATTCCACTTGCTTCTCTTTTCCTATCCTGAGT (SEQ ID NO.287) TP53 GSP2A HA2085 CTTGGCACCCGAGAATTCCACTTTGAGGTGCGTGTTTGTGC (SEQ ID NO.288) TP53 GSP2A HA2086 CTTGGCACCCGAGAATTCCAGCAAGAAAGGGGAGCCTCA (SEQ ID NO. 289) TP53 GSP2A HA2087 CTTGGCACCCGAGAATTCCAATCACCTTTCCTTGCCTCTTTCC (SEQ ID NO.290) TP53 GSP2A HA2088 CTTGGCACCCGAGAATTCCATTCTCCCCCTCCTCTGTTGC (SEQ ID NO.291) TP53 GSP2A HA2089 CTTGGCACCCGAGAATTCCACTTCGAGATGTTCCGAGAGCT (SEQ ID NO.292) TP53 GSP2A HA2090 CTTGGCACCCGAGAATTCCACCTCCCTGCTTCTGTCTCCTA (SEQ ID NO.293) TP53 GSP2A HA2091 CTTGGCACCCGAGAATTCCATCAGTCTACCTCCCGCCATA (SEQ ID NO.294) AXIN1 GSP2B HB2009 CTTGGCACCCGAGAATTCCAGAAACTTGCTCCGAGGTCCA (SEQ ID NO.295) AXIN1 GSP2B HB2010 CTTGGCACCCGAGAATTCCACATCCAGCAGGGAATGCAGT (SEQ ID NO.296) AXIN1 GSP2B HB2011 CTTGGCACCCGAGAATTCCAGACACGATGCCATTGTTATCAAGASEQ ID NO. 297) AXIN1 GSP2B HB2012 CTTGGCACCCGAGAATTCCACTGTCTCCAGGAGCAGCTTC (SEQ ID NO. 298) AXIN1 GSP2B HB2013 CTTGGCACCCGAGAATTCCACGGAGGTGAGTACAGAAAGTGG (SEQ ID NO.299) AXIN1 GSP2B HB2014 CTTGGCACCCGAGAATTCCAGGAGGCAGCTTGTGACACG (SEQ ID NO.300) AXIN1 GSP2B HB2015 CTTGGCACCCGAGAATTCCACTCGTCCAGGATGCTCTCAG (SEQ ID NO.301) AXIN1 GSP2B HB2016 CTTGGCACCCGAGAATTCCAGTGGTGGACGTGGTGGTG (SEQ ID NO.302) AXIN1 GSP2B HB2017 CTTGGCACCCGAGAATTCCATGATTTTCTGGTTCTTCTCCGCAT (SEQ ID NO.303) CTNNB1 GSP2B HB2018 CTTGGCACCCGAGAATTCCAGAGGTATCCACATCCTCTTCCTCA (SEQ ID NO.304) TERT GSP2B HB2019 CTTGGCACCCGAGAATTCCAAGGACTTCCCAGGAATCCAG (SEQ ID NO. 
305) TERT GSP2B HB2020 CTTGGCACCCGAGAATTCCAAGCTAGGAGGCCCGACTT (SEQ ID NO.306) TERT GSP2B HB2021 CTTGGCACCCGAGAATTCCAACAACGGCCTTGACCCTG (SEQ ID NO.307) TERT GSP2B HB2022 CTTGGCACCCGAGAATTCCACCACCCCAAATCTGTTAATCACC (SEQ ID NO.308) TERT GSP2B HB2023 CTTGGCACCCGAGAATTCCAAACACTTCCCCGCGACTTGG (SEQ ID NO.309) TERT GSP2B HB2024 CTTGGCACCCGAGAATTCCACGTGAAGGGGAGGACGGA (SEQ ID NO.310) TERT GSP2B HB2025 CTTGGCACCCGAGAATTCCAGGGGCCATGATGTGGAGG (SEQ ID NO.311) TERT GSP2B HB2026 CTTGGCACCCGAGAATTCCAAAGGTGAAGGGGCAGGAC (SEQ ID NO.312) TERT GSP2B HB2027 CTTGGCACCCGAGAATTCCAGCGGAAAGGAAGGGGAGG (SEQ ID NO.313) TERT GSP2B HB2028 CTTGGCACCCGAGAATTCCAGCAGCACCTCGCGGTAG (SEQ ID NO.314) HBV-C GSP2B HB2029 CTTGGCACCCGAGAATTCCAGGAAAGTATAGGCCCCTCACTC (SEQ ID NO.315) HBV-C GSP2B HB2030 CTTGGCACCCGAGAATTCCACTCTCCATGTTCGGGGCA (SEQ ID NO.316) HBV-C GSP2B HB2031 CTTGGCACCCGAGAATTCCAGAGGATTCTTGTCAACAAGAAAAACCC (SEQ ID NO. 317) HBV-C GSP2B HB2032 CTTGGCACCCGAGAATTCCAACAAGAGGTTGGTGAGTGATTGG (SEQ ID NO.318) HBV-C GSP2B HB2033 CTTGGCACCCGAGAATTCCAGTCCAGAAGAACCAACAAGAAGATGA (SEQ ID NO.319) HBV-C GSP2B HB2034 CTTGGCACCCGAGAATTCCACATAGAGGTTCCTTGAGCAGGAATC (SEQ ID NO.320) HBV-C GSP2B HB2035 CTTGGCACCCGAGAATTCCACACTCCCATAGGAATCTTGCGAA (SEQ ID NO.321) HBV-C GSP2B HB2036 CTTGGCACCCGAGAATTCCACCCCCAATACCACATCATCCATA (SEQ ID NO.322) HBV-C GSP2B HB2037 CTTGGCACCCGAGAATTCCAAGGGTTCAAATGTATACCCAAAGACAA (SEQ ID NO.323) HBV-C GSP2B HB2038 CTTGGCACCCGAGAATTCCAAGTTTTAGTACAATATGTTCTTGCGGTA (SEQ ID NO. 
324) HBV-C GSP2B HB2039 CTTGGCACCCGAGAATTCCACATTGTGTAAAAGGGGCAGCA (SEQ ID NO.325) HBV-C GSP2B HB2040 CTTGGCACCCGAGAATTCCATGTTTACACAGAAAGGCCTTGTAAGT (SEQ ID NO.326) HBV-C GSP2B HB2041 CTTGGCACCCGAGAATTCCACATGCGGCGATGGCCAATA (SEQ ID NO.327) HBV-C GSP2B HB2042 CTTGGCACCCGAGAATTCCATTCCGAGAGAGGACAACAGAGTTGT (SEQ ID NO.328) HBV-C GSP2B HB2043 CTTGGCACCCGAGAATTCCAGACGGGACGTAAACAAAGGAC (SEQ ID NO.329) HBV-C GSP2B HB2044 CTTGGCACCCGAGAATTCCAGGAGACCGCGTAAAGAGAGG (SEQ ID NO.330) HBV-C GSP2B HB2045 CTTGGCACCCGAGAATTCCAGTGCAGAGGTGAAGCGAAGT (SEQ ID NO.331) HBV-C GSP2B HB2046 CTTGGCACCCGAGAATTCCATCCAAGAGTCCTCTTATGTAAGACC (SEQ ID NO.332) HBV-C GSP2B HB2047 CTTGGCACCCGAGAATTCCACAACTCCTCCCAGTCCTTAAACA (SEQ ID NO.333) HBV-C GSP2B HB2048 CTTGGCACCCGAGAATTCCAGGTGCTGGTGAACAGACCAA (SEQ ID NO.334) HBV-C GSP2B HB2049 CTTGGCACCCGAGAATTCCACTTGGAGGCTTGAACAGTAGGA (SEQ ID NO.335) HBV-C GSP2B HB2050 CTTGGCACCCGAGAATTCCAAATTCTTTATACGGGTCAATGTCCA (SEQ ID NO.336) HBV-C GSP2B HB2051 CTTGGCACCCGAGAATTCCACAGAGGCGGTGTCGAGGA (SEQ ID NO.337) HBV-C GSP2B HB2052 CTTGGCACCCGAGAATTCCAACACAGAACAGCTTGCCTGA (SEQ ID NO. 338) HBV-C GSP2B HB2053 CTTGGCACCCGAGAATTCCACTGGGTCTTCCAAATTACTTCCCA (SEQ ID NO.339) HBV-C GSP2B HB2054 CTTGGCACCCGAGAATTCCAGTTTCTCTTCCAAAGGTAAGACAGGA (SEQ ID NO.340) HBV-C GSP2B HB2055 CTTGGCACCCGAGAATTCCAACCTGCCTCTACGTCTAACAACA (SEQ ID NO.341) HBV-C GSP2B HB2056 CTTGGCACCCGAGAATTCCATTGTGAGTCCAAGGGATACTAACATTG (SEQ ID NO.342) HBV-C GSP2B HB2057 CTTGGCACCCGAGAATTCCAGGGAGTTTGCCACTCAGGATTAAA (SEQ ID NO.343) HBV-C GSP2B HB2058 CTTGGCACCCGAGAATTCCAGGGCAAATACTTGGTAAGGTTAGGATA(SEQ ID NO.344) HBV-C GSP2B HB2059 CTTGGCACCCGAGAATTCCACCTTCCACAGAGTATGTAAATAATGCCTA (SEQ ID NO.345) HBV-C GSP2B HB2060 CTTGGCACCCGAGAATTCCACTCCCATGCTGTAGCTCTTGTT (SEQ ID NO.346) HBV-C GSP2B HB2061 CTTGGCACCCGAGAATTCCAGCTGGGTCCAACTGGTGATC (SEQ ID NO.347) HBV-C GSP2B HB2062 CTTGGCACCCGAGAATTCCACCCCAAAAGACCACCGTGTG (SEQ ID NO. 
348) HBV-C GSP2B HB2063 CTTGGCACCCGAGAATTCCATCTTCCTGACTGCCGATTGGT (SEQ ID NO.349) HBV-B GSP2B HB2064 NA HBV-B GSP2B HB2065 NA HBV-B GSP2B HB2066 CTTGGCACCCGAGAATTCCACAAGACCTTGGGCAGGTTCC (SEQ ID NO.350) HBV-B GSP2B HB2067 CTTGGCACCCGAGAATTCCAATTCTAAGGCTTCCCGATACAGA (SEQ ID NO.351) HBV-B GSP2B HB2068 CTTGGCACCCGAGAATTCCAACGCTGGATCTTCTAAATTATTACCC (SEQ ID NO.352) HBV-B GSP2B HB2069 NA TP53 GSP2B HB2071 CTTGGCACCCGAGAATTCCAGATCCACTCACAGTTTCCATAGG (SEQ ID NO.353) TP53 GSP2B HB2072 CTTGGCACCCGAGAATTCCACAGCCCAACCCTTGTCCTTA (SEQ ID NO.354) TP53 GSP2B HB2073 CTTGGCACCCGAGAATTCCATGGGAGCTTCATCTGGACCTG (SEQ ID NO.355) TP53 GSP2B HB2074 CTTGGCACCCGAGAATTCCAGAAGGGACAGAAGATGACAGG (SEQ ID NO.356) TP53 GSP2B HB2075 CTTGGCACCCGAGAATTCCACAAGAAGCCCAGACGGAAACC (SEQ ID NO.357) TP53 GSP2B HB2076 CTTGGCACCCGAGAATTCCACCCCTCAGGGCAACTGAC (SEQ ID NO.358) TP53 GSP2B HB2077 CTTGGCACCCGAGAATTCCAGTGCTGTGACTGCTTGTAGATGGC (SEQ ID NO.359) TP53 GSP2B HB2078 CTTGGCACCCGAGAATTCCAATCTGAGCAGCGCTCATGGTG (SEQ ID NO.360) TP53 GSP2B HB2079 CTTGGCACCCGAGAATTCCACCCTGTCGTCTCTCCAGC (SEQ ID NO.361) TP53 GSP2B HB2080 CTTGGCACCCGAGAATTCCACTATGTCGAAAAGTGTTTCTGTCATCC (SEQ ID NO.362) TP53 GSP2B HB2081 CTTGGCACCCGAGAATTCCAGAGACCCCAGTTGCAAACCAG (SEQ ID NO.363) TP53 GSP2B HB2082 CTTGGCACCCGAGAATTCCATGGGCCTCCGGTTCATGC (SEQ ID NO.364) TP53 GSP2B HB2083 CTTGGCACCCGAGAATTCCAGTGCAGGGTGGCAAGTGG (SEQ ID NO.365) TP53 GSP2B HB2084 CTTGGCACCCGAGAATTCCAGACAGGCACAAACACGCAC (SEQ ID NO.366) TP53 GSP2B HB2085 CTTGGCACCCGAGAATTCCATTCTTGCGGAGATTCTCTTCCTCT (SEQ ID NO.367) TP53 GSP2B HB2086 CTTGGCACCCGAGAATTCCACGCTTCTTGTCCTGCTTGCT (SEQ ID NO. 
368) TP53 GSP2B HB2087 CTTGGCACCCGAGAATTCCAACTTGATAAGAGGTCCCAAGACTTAG (SEQ ID NO.369) TP53 GSP2B HB2088 CTTGGCACCCGAGAATTCCAAGCCTGGGCATCCTTGAG (SEQ ID NO.370) TP53 GSP2B HB2089 CTTGGCACCCGAGAATTCCACAGGAAGGGGCTGAGGTC (SEQ ID NO.371) TP53 GSP2B HB2090 CTTGGCACCCGAGAATTCCACATGAGTTTTTTATGGCGGGAGGT (SEQ ID NO.372) TP53 GSP2B HB2091 CTTGGCACCCGAGAATTCCACAGTGGGGAACAAGAAGTGGA (SEQ ID NO.373) BDH1 GSP2A CA2001 CTTGGCACCCGAGAAGGACGCTTCTACACGCGAA (SEQ ID NO.374) EMX1 GSP2A CA2002 CTTGGCACCCGAGAACACGAACGAAAAGGAACATGTCT (SEQ ID NO.375) LRRC4 GSP2A CA2003 CTTGGCACCCGAGAACGAGTTCGCGGCTTCGG (SEQ ID NO.376) LRRC4 GSP2A CA2004 CTTGGCACCCGAGAACAGCAGCAGCAGCGGG (SEQ ID NO.377) LRRC4 GSP2A CA2005 CTTGGCACCCGAGAACAAACCCACAGGGTATCTATCAGG (SEQ ID NO. 378) LRRC4 GSP2A CA2006 CTTGGCACCCGAGAAGCTGGGCGTGCACGATC (SEQ ID NO.379) BDH1 GSP2A CA2007 CTTGGCACCCGAGAACCTGGCATCGCTCACCC (SEQ ID NO.380) CLEC11A GSP2A CA2008 CTTGGCACCCGAGAAGACCGTGGGGCTGTGAG (SEQ ID NO.381) CLEC11A GSP2A CA2009 CTTGGCACCCGAGAACTCTTCAAGCTCGGAATGGA (SEQ ID NO.382) CLEC11A GSP2A CA2010 CTTGGCACCCGAGAAGCCGCTGCAGACGGAT (SEQ ID NO.383) HOXA1 GSP2A CA2011 CTTGGCACCCGAGAAAGGAGGGGTGGAACCCAG (SEQ ID NO.384) HOXA1 GSP2A CA2012 CTTGGCACCCGAGAATGGGAGAAGAAAAAAACACACACAC (SEQ ID NO.385) EMX1 GSP2A CA2013 CTTGGCACCCGAGAATTTCGCGGGACAAAAACCAC (SEQ ID NO.386) AK055957 GSP2A CA2014 CTTGGCACCCGAGAATCTAAGTGGCCAGGGCACTG (SEQ ID NO.387) COTL1 GSP2B CB2001 CTTGGCACCCGAGAAGATCAGGGCACCTTGGGC (SEQ ID NO.388) COTL1 GSP2B CB2002 CTTGGCACCCGAGAACTGCAACACCGCGAGCC (SEQ ID NO. 
389) COTL1 GSP2B CB2003 CTTGGCACCCGAGAACGCTCTGCTTACGTGCTGAC (SEQ ID NO.390) ACP1 GSP2B CB2004 CTTGGCACCCGAGAAGCCGCTGCAGCAGTCC (SEQ ID NO.391) ACP1 GSP2B CB2005 CTTGGCACCCGAGAACGCTGTTGCCTTGGCGA (SEQ ID NO.392) DAB2IP GSP2B CB2006 CTTGGCACCCGAGAAGCCAGTTGTAGGGAGCGA (SEQ ID NO.393) DAB2IP GSP2B CB2007 CTTGGCACCCGAGAACGAAGAGGTAGAGGCCCTCG (SEQ ID NO.394) DAB2IP GSP2B CB2008 CTTGGCACCCGAGAAGTCCGGGCTGAGCGGAT (SEQ ID NO.395) ACTB GSP2B CB2009 CTTGGCACCCGAGAAGCCCTCCACCACGGTTCTAT (SEQ ID NO.396) BDH1 GSP2B CB2010 CTTGGCACCCGAGAAGAGTTCCTCCCAGCCAGC (SEQ ID NO.397) BDH1 GSP2B CB2011 CTTGGCACCCGAGAAGGGACTGGAGGGCGTAGAG (SEQ ID NO.398) LRRC4 GSP2B CB2012 CTTGGCACCCGAGAAACTTCGCGGCGGCTCA (SEQ ID NO.399) LRRC4 GSP2B CB2013 CTTGGCACCCGAGAACCAACTCCACGGTTCCTGC (SEQ ID NO.400) BDH1 GSP2B CB2014 CTTGGCACCCGAGAATGAGGGCGAAGGCCTGA (SEQ ID NO.401) LRRC4 GSP2B CB2015 CTTGGCACCCGAGAAGGTGGTACCGATGAGAGCG (SEQ ID NO. 402) Note: NA means no primer.
TABLE-US-00020 Reaction Procedure
Temperature   Time      Number of cycles
98° C.        3 min
98° C.        15 s      6-10 cycles
57-60° C.     60-90 s
72° C.        90 s
98° C.        15 s      6-10 cycles
57-60° C.     30-60 s
72° C.        30 s
72° C.        10 min
[0158] 4. The second-round amplification product obtained with GSP2A mix in step 3 and the second-round amplification product obtained with GSP2B mix were mixed in equal volumes, purified with AMPure XP magnetic beads at a ratio of 1:(1-2), and then eluted with 50 μl of DNase/RNase-Free Water to obtain the second-round purification product, which was the sequencing library ready for sequencing on the Illumina HiSeq X platform.
[0159] The DNA random tags of the MC library were carried into the sequencing library together with the cfDNA sequences, downstream of the Read1 sequence. During sequencing, DNA random tag sequence, anchor sequence, and cfDNA sequence (c, d, and e sequences in
[0160] The analysis method of hepatocellular carcinoma-specific gene variation was as follows: DNA molecules whose sequencing data met criterion A were traced back to one molecular cluster; molecular clusters that met criterion B were labeled as a pair of duplex molecular clusters; a mutation is regarded as a true mutation from the original DNA sample if (a1) or (a2) is satisfied: (a1) it is supported by at least one pair of duplex molecular clusters; (a2) it is supported by at least 4 molecular clusters. Criterion A means satisfying ①, ② and ③ at the same time: ① the DNA inserts have the same length and the same sequence except at the mutation sites; ② the random tag sequences are the same; ③ the anchor sequences are the same. Criterion B means satisfying both ④ and ⑤: ④ the DNA inserts have the same length and the same sequence except at the mutation sites; ⑤ the anchor sequences at both ends of the molecular clusters are the same but in opposite positions.
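The clustering and calling rules above can be illustrated in code. This is a minimal sketch under stated assumptions rather than the analysis pipeline itself: the `Read` record, the function names, the anchor and tag strings, and the coordinate in the example are all hypothetical; "same insert length and same sequence except at the mutation sites" is approximated by identical mapped insert coordinates; a cluster is taken to carry a mutation only when every read in it does; and a duplex pair is only recognised when the two anchors differ from each other.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Read:
    insert_start: int          # mapped start of the cfDNA insert (assumed proxy for criterion 1/4)
    insert_end: int            # mapped end of the cfDNA insert
    tag: str                   # random tag sequence
    anchor_5p: str             # anchor observed at one end
    anchor_3p: str             # anchor observed at the other end
    mutations: FrozenSet[str]  # mutation labels seen in this read


def build_clusters(reads):
    """Criterion A: group reads sharing insert, random tag and anchors into one cluster."""
    grouped = defaultdict(list)
    for r in reads:
        key = (r.insert_start, r.insert_end, r.tag, r.anchor_5p, r.anchor_3p)
        grouped[key].append(r)
    # Assumption: a cluster supports a mutation only if all of its reads carry it.
    return {k: frozenset.intersection(*(r.mutations for r in v)) for k, v in grouped.items()}


def is_duplex_pair(a, b):
    """Criterion B: same insert, same two anchors, but at opposite ends."""
    same_insert = a[0] == b[0] and a[1] == b[1]
    swapped = a[3] == b[4] and a[4] == b[3] and a[3] != a[4]
    return same_insert and swapped


def call_mutation(mutation, clusters, min_clusters=4):
    """(a1) at least one duplex pair supports the mutation, or (a2) at least 4 clusters do."""
    supporting = [k for k, muts in clusters.items() if mutation in muts]
    if len(supporting) >= min_clusters:
        return True
    return any(
        is_duplex_pair(a, b)
        for i, a in enumerate(supporting)
        for b in supporting[i + 1:]
    )


# Hypothetical example: the two strands of one original molecule.
mut = "chr17:7577121:G>A"
top = Read(100, 266, "ACGTACGTAC", "ACGTTGCAGTCA", "TGCAACGTCAGT", frozenset({mut}))
bottom = Read(100, 266, "TTGACCATGA", "TGCAACGTCAGT", "ACGTTGCAGTCA", frozenset({mut}))
print(call_mutation(mut, build_clusters([top, top, bottom])))
```

Note that criterion B does not require the random tags of the two strands to match, so the sketch ignores the tag when pairing clusters.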
[0161] The analysis method for the degree of hepatocellular carcinoma-specific methylation modification was as follows: the DNA molecules whose sequencing data met criterion C were labeled as one cluster; the number of clusters whose ends were the restriction sites of interest was counted and recorded as the number of unmethylated fragments; the number of all clusters whose amplified fragments reached or exceeded the first restriction site was counted and recorded as the total number of fragments. The average methylation level of the corresponding region was calculated from these two counts: methylation level of the region = (1 - number of unmethylated fragments / total number of fragments) × 100%. Criterion C means satisfying ⑥, ⑦ and ⑧ at the same time: ⑥ the random tag sequences are the same; ⑦ the anchor sequences are the same; ⑧ the DNA inserts have the same length and the same sequence except at the mutation sites.
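The methylation formula above is simple arithmetic, and a short worked example may make the two counts easier to follow. This is only a sketch: the function names are hypothetical, fragment ends are represented as distances from the gene-specific primer in the direction of extension (an assumption made for illustration), and the numbers are invented.

```python
def count_fragments(cluster_ends, site_pos):
    """Count unmethylated and total clusters for one restriction site.

    cluster_ends: one end position per cluster (assumed distance from the primer).
    A cluster ending exactly at the site was cut there, so it counts as unmethylated;
    every cluster reaching or passing the site counts towards the total.
    """
    unmethylated = sum(1 for end in cluster_ends if end == site_pos)
    total = sum(1 for end in cluster_ends if end >= site_pos)
    return unmethylated, total


def methylation_level(unmethylated_fragments, total_fragments):
    """Methylation level in percent: (1 - unmethylated / total) x 100."""
    if total_fragments <= 0:
        raise ValueError("total number of fragments must be positive")
    if not 0 <= unmethylated_fragments <= total_fragments:
        raise ValueError("unmethylated count must lie between 0 and the total")
    return (1 - unmethylated_fragments / total_fragments) * 100.0


# Invented data: nine clusters, restriction site 40 bases from the primer.
ends = [40, 40, 55, 90, 120, 75, 60, 200, 30]
unmeth, total = count_fragments(ends, site_pos=40)
print(unmeth, total, methylation_level(unmeth, total))
```

In this invented example the cluster ending at position 30 never reaches the site, so it is excluded from both counts.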
Example 3. Capture and Sequencing of MC Library
[0162] As shown in
The upstream primer is:
TABLE-US-00021 5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3′ (SEQ ID NO.403)
(“a” in
[0163] The downstream primer is:
TABLE-US-00022 5′-CAAGCAGAAGACGGCATACGAGAT (SEQ ID NO.404)
TABLE-US-00023 GTCTCGTGGGCTCGGAGATGTGTATAA-3′ (SEQ ID NO.405)
(“b” in
[0164] The captured library has the same DNA random tag sequence, anchor sequence and cfDNA sequence as the MC library, which are located downstream of Read1 sequentially.
[0165] DNA molecules whose sequencing data met criterion A were traced back to one molecular cluster. Criterion A means satisfying ①, ② and ③ at the same time: ① the DNA inserts have the same length and the same sequence except at the mutation sites; ② the random tag sequences are the same; ③ the anchor sequences are the same. Molecular clusters that met criterion B were labeled as a pair of duplex molecular clusters. Criterion B means satisfying both ④ and ⑤: ④ the DNA inserts have the same length and the same sequence except at the mutation sites; ⑤ the anchor sequences at both ends of the molecular clusters are the same but in opposite positions. A mutation is regarded as a true mutation from the original DNA sample if (a1) or (a2) is satisfied: (a1) it is supported by at least one pair of duplex molecular clusters; (a2) it is supported by at least 4 molecular clusters. Mutations supported by a pair of duplex molecular clusters are more reliable, and requiring such support can reduce false-positive mutations by 90%.
[0166] The DNA molecules whose sequencing data met criterion C were labeled as one cluster; the number of clusters whose ends were the restriction sites of interest was counted and recorded as the number of unmethylated fragments; the number of all clusters whose amplified fragments reached or exceeded the first restriction site was counted and recorded as the total number of fragments. The average methylation level of the corresponding region was calculated from these two counts: methylation level of the region = (1 - number of unmethylated fragments / total number of fragments) × 100%. Criterion C means satisfying ⑥, ⑦ and ⑧ at the same time: ⑥ the random tag sequences are the same; ⑦ the anchor sequences are the same; ⑧ the DNA inserts have the same length and the same sequence except at the mutation sites.
Example 4. Comparison of Detection Method
1. Comparison 1 of Detection Methods
[0167] cfDNA specimens from 21 hepatocellular carcinoma patients were collected.
[0168] After completing step 1, each cfDNA sample was taken, and the MC library was constructed according to the method in Example 1. Then, the RaceSeq target region was enriched and sequenced according to the method in Example 2 to obtain the methylation level of the AK055957 gene.
[0169] After completing step 1, each cfDNA specimen was taken, and the Padlock method (Xu R H, Wei W, Krawczyk M, et al. Circulating tumour DNA methylation markers for diagnosis and prognosis of hepatocellular carcinoma[J]. Nature Materials, 2017, 16(11):1155.) was used to detect the methylation level of the AK055957 gene. Padlock is a methylation-targeted sequencing technology whose probe conformation resembles a padlock. It can be applied to high-throughput methylation-targeted sequencing and is an efficient library construction method used after bisulfite conversion, known as “BSPP”. After the cfDNA is converted by bisulfite, it can be amplified and ligated into a circle when it pairs complementarily with the capture arms of a bisulfite padlock probe (BSPP). Padlock probes ligated into circles can be selected with exonuclease, and the corresponding DNA methylation information can be obtained by sequencing the amplified products.
[0170] The test results are shown in
2. Comparison 2 of Detection Methods
[0171] Mutation and mutation frequency detected by mutation/methylation co-detection method
[0172] ①cfDNA of a hepatocellular carcinoma patient was collected.
[0173] ② After completing step ①, 5-40 ng of cfDNA was taken to set up the reaction system shown in Table 1, and enzyme digestion was then performed in a PCR machine to obtain the enzyme-digested product (stored at 4° C.). The enzyme digestion time was 0 h, 0.2 h, 0.4 h, 0.6 h, 0.8 h or 1 h.
[0174] ③ After completing step ②, the enzyme digestion product was taken to construct the MC library according to the methods of 2 to 6 in Example 1; RaceSeq target region enrichment and sequencing were then performed according to the method in Example 2. During data analysis, the sequencing data of DNA molecules with the same random tag sequence, the same DNA insert length, and the same sequence except at the mutation sites were traced back to one molecular cluster. If the number of molecules in a cluster is >5, the concordance rate of the mutation among the molecules within the cluster is >80%, and the number of clusters is ≥5, the mutation is regarded as a true mutation from the original DNA sample. The proportion of clusters containing the mutation is the mutation frequency.
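The cluster-based frequency calculation described above can be sketched as follows. This is an illustrative reading, not the authors' software: the function name and input layout are hypothetical, the minimum number of supporting clusters is exposed as a parameter (set to 5 here as an assumption), the threshold is assumed to apply to the clusters that carry the mutation, and the denominator is assumed to be all usable clusters covering the site.

```python
def mutation_frequency(clusters, min_supporting_clusters=5):
    """Return (is_true_mutation, frequency) for one candidate mutation.

    clusters: list of (n_molecules, n_mutant_molecules) pairs, one per
    molecular cluster covering the site (hypothetical input layout).
    """
    # Only clusters with more than 5 molecules are considered.
    usable = [(n, m) for n, m in clusters if n > 5]
    # A cluster carries the mutation when more than 80% of its molecules agree.
    supporting = sum(1 for n, m in usable if m / n > 0.8)
    is_true = supporting >= min_supporting_clusters
    frequency = supporting / len(usable) if usable else 0.0
    return is_true, frequency


# Invented data: 100 clusters of 10 molecules each, 6 of them mutant in 9/10 molecules.
example = [(10, 9)] * 6 + [(10, 0)] * 94
print(mutation_frequency(example))
```

Small clusters (5 molecules or fewer) are dropped before the frequency is computed, so they affect neither the numerator nor the denominator in this sketch.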
[0175] Detection of mutation and mutation frequency by single mutation detection method
[0176] ① cfDNA of a hepatocellular carcinoma patient was collected.
[0177] ②After completing step ①, 5-40 ng of cfDNA was taken to configure the reaction system as shown in Table 3, and then end repair and adding A treatment at the 3′ end in a PCR machine were performed according to the reaction procedure in Table 4 to obtain a reaction product (stored at 4° C.).
[0178] ③ After completing step ②, the enzyme digestion product was taken to construct the MC library according to the methods of 2 to 6 in Example 1; RaceSeq target region enrichment and sequencing were then performed according to the method in Example 2. During data analysis, the sequencing data of DNA molecules with the same random tag sequence, the same DNA insert length, and the same sequence except at the mutation sites were traced back to one molecular cluster. If the number of molecules in a cluster is >5, the concordance rate of the mutation among the molecules within the cluster is >80%, and the number of clusters is ≥5, the mutation is regarded as a true mutation from the original DNA sample. The proportion of clusters containing the mutation is the mutation frequency.
[0179] The mutation frequency of each mutation site obtained by the mutation/methylation co-detection method was taken as the abscissa and the mutation frequency obtained by the single mutation detection method was taken as the ordinate; a scatter plot was drawn, and a linear fitting curve and the correlation coefficient R² were added.
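The comparison described above amounts to an ordinary least-squares line through paired frequencies and the squared Pearson correlation. The sketch below shows one way to compute both in plain Python; the function name is hypothetical and the data points are invented, not results from the experiment.

```python
def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept and R^2 for paired values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("values must vary in both x and y")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a straight-line fit with intercept, R^2 equals the squared Pearson r.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Invented mutation frequencies: co-detection (x) versus single detection (y).
co_detection = [0.010, 0.025, 0.050, 0.120, 0.300]
single_detection = [0.012, 0.024, 0.047, 0.125, 0.290]
print(linear_fit(co_detection, single_detection))
```

A slope close to 1 with an R² close to 1 would indicate that the two methods report similar frequencies for the same sites.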
[0180] The test results are shown in
Example 5. Accuracy Experiment
[0181] The mutation standard is a product of Horizon Discovery Company, catalog number HD701.
[0182] 1. Accuracy experiment 1
[0183] (1) The mutation standard was taken to construct the MC library according to the methods of to 6 in Example 1; then, RaceSeq target region enrichment and sequencing were performed according to the method in Example 2, except that in step 3 the GSP2A mix was replaced with GSP2A mix-1 and the GSP2B mix was replaced with GSP2B mix-1.
[0184] GSP2A mix-1: Each primer in the primer pool GSP2A in Table 15 was dissolved and diluted to a concentration of 100 μM with TE buffer, then mixed in equal volumes, and diluted to 0.3 μM with TE buffer. The primers in the primer pool GSP2A were used to amplify the positive strand of the template.
[0185] GSP2B mix-1: Each primer in the primer pool GSP2B in Table 15 was dissolved and diluted to a concentration of 100 μM with TE buffer, then mixed in equal volumes, and diluted to 0.3 μM with TE buffer. The primers in the primer pool GSP2B were used to amplify the negative strand of the template.
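The dilution arithmetic for these primer mixes can be illustrated with the short sketch below. It assumes that the 0.3 μM refers to the total primer concentration of the pooled mix, which the text does not specify, and it uses a hypothetical final volume of 1000 μL. Each pool in Table 15 contains 12 primers.

```python
def stock_volume_ul(stock_um, target_um, final_volume_ul):
    """Volume of stock needed for a dilution, from C1 * V1 = C2 * V2."""
    if not 0 < target_um <= stock_um:
        raise ValueError("target concentration must be in (0, stock concentration]")
    return final_volume_ul * target_um / stock_um


N_PRIMERS = 12           # primers per pool in Table 15
STOCK_UM = 100.0         # each primer stock, in uM
TARGET_UM = 0.3          # assumption: total primer concentration of the mix
FINAL_VOLUME_UL = 1000.0 # hypothetical final volume

# Mixing equal volumes of 100 uM stocks keeps the total at 100 uM,
# while each individual primer is diluted 12-fold.
per_primer_in_pool_um = STOCK_UM / N_PRIMERS

pool_ul = stock_volume_ul(STOCK_UM, TARGET_UM, FINAL_VOLUME_UL)
te_buffer_ul = FINAL_VOLUME_UL - pool_ul
per_primer_final_um = TARGET_UM / N_PRIMERS

print(round(per_primer_in_pool_um, 2), pool_ul, te_buffer_ul, per_primer_final_um)
```

Under these assumptions, about 3 μL of the pooled primers is brought to 1000 μL with TE buffer, and each primer ends up at roughly 0.025 μM. If 0.3 μM were instead meant per primer, the required pool volume would be 12 times larger.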
TABLE-US-00024 Primer sequences (Table 15)

| Gene name | Chromosome | Mutation site | Primer pool | Primer number | Primer sequence (5′-3′) |
|---|---|---|---|---|---|
| PIK3CA | 3 | 178916875 | GSP2A | HA2094 | Cagaaagggaagaattttttgatgaaaca (SEQ ID NO: 406) |
| PIK3CA | 3 | 178921551 | GSP2A | HA2095 | ctcagaataaaaattctttgtgcaacctac (SEQ ID NO: 407) |
| PIK3CA | 3 | 178936082 | GSP2A | HA2096 | gctcaaagcaatttctacacgagatc (SEQ ID NO: 408) |
| PIK3CA | 3 | 178952072 | GSP2A | HA2097 | gcaagaggctttggagtatttcatg (SEQ ID NO: 409) |
| KRAS | 12 | 25398285 | GSP2A | HA2115 | tgactgaatataaacttgtggtagttgg (SEQ ID NO: 410) |
| KRAS | 12 | 25380277 | GSP2A | HA2116 | cctgtctcttggatattctcgacac (SEQ ID NO: 411) |
| KRAS | 12 | 25378562 | GSP2A | HA2117 | gcaagaagttatggaattccttttattgaa (SEQ ID NO: 412) |
| EGFR | 7 | 55241707 | GSP2A | HA2121 | ttgaggatcttgaaggaaactgaatt (SEQ ID NO: 413) |
| EGFR | 7 | 55242463 | GSP2A | HA2122 | tgagaaagttaaaattcccgtcgcta (SEQ ID NO: 414) |
| EGFR | 7 | 55249004 | GSP2A | HA2123 | ctccaggaagcctacgtgatg (SEQ ID NO: 415) |
| EGFR | 7 | 55249071 | GSP2A | HA2124 | acctccaccgtgcagctc (SEQ ID NO: 416) |
| EGFR | 7 | 55259514 | GSP2A | HA2125 | ccgcagcatgtcaagatcacag (SEQ ID NO: 417) |
| PIK3CA | 3 | 178916875 | GSP2B | HB2094 | ggttgaaaaagccgaaggtcac (SEQ ID NO: 418) |
| PIK3CA | 3 | 178921551 | GSP2B | HB2095 | catttgactttaccttatcaatgtctcgaa (SEQ ID NO: 419) |
| PIK3CA | 3 | 178936082 | GSP2B | HB2096 | acttacctgtgactccatagaaaatctt (SEQ ID NO: 420) |
| PIK3CA | 3 | 178952072 | GSP2B | HB2097 | caatccatttttgttgtccagcc (SEQ ID NO: 421) |
| KRAS | 12 | 25398285 | GSP2B | HB2115 | tagctgtatcgtcaaggcactc (SEQ ID NO: 422) |
| KRAS | 12 | 25380277 | GSP2B | HB2116 | ggtccctcattgcactgtact (SEQ ID NO: 423) |
| KRAS | 12 | 25378562 | GSP2B | HB2117 | tgtatttatttcagtgttacttacctgtcttg (SEQ ID NO: 424) |
| EGFR | 7 | 55241707 | GSP2B | HB2121 | accttatacaccgtgccgaa (SEQ ID NO: 425) |
| EGFR | 7 | 55242463 | GSP2B | HB2122 | actcacatcgaggatttccttgtt (SEQ ID NO: 426) |
| EGFR | 7 | 55249004 | GSP2B | HB2123 | cggtggaggtgaggcagat (SEQ ID NO: 427) |
| EGFR | 7 | 55249071 | GSP2B | HB2124 | gtccaggaggcagccgaa (SEQ ID NO: 428) |
| EGFR | 7 | 55259514 | GSP2B | HB2125 | gtattctttctcttccgcaccca (SEQ ID NO: 429) |
[0186] According to the sequencing results, the mutation frequency of the mutation site was obtained.
[0187] The test results are shown in Table 16. The results show that, when the mutation standard is tested by the mutation/methylation co-detection method, the measured mutation frequency of each mutation site is close to the theoretical value. It can be seen that the mutation/methylation co-detection method has high accuracy for the mutation detection of hepatocellular carcinoma-specific genes (such as the CTNNB1 gene, TP53 gene, and AXIN1 gene).
TABLE-US-00025 Accuracy experiment (Table 16)

| Gene name | geneID | Co-detection: sequencing depth | Co-detection: mutation frequency | Mutation frequency of mutation standard | Mutation type | Ref | Alt |
|---|---|---|---|---|---|---|---|
| EGFR | ENSG00000146648 | 10191 | 0.0147 | 0.01 | INS | - | C |
| PIK3CA | ENSG00000121879 | 5020 | 0.07749 | 0.09 | SNP | G | A |
| PIK3CA | ENSG00000121879 | 9192 | 0.19093 | 0.175 | SNP | A | G |
| EGFR | ENSG00000146648 | 3988 | 0.27282 | 0.245 | SNP | G | A |
| EGFR | ENSG00000146648 | 10147 | 0.00581 | 0.02 | SNP | C | T |
| EGFR | ENSG00000146648 | 12716 | 0.03374 | 0.03 | SNP | T | G |
| KRAS | ENSG00000133703 | 12604 | 0.14392 | 0.15 | SNP | C | T |
| KRAS | ENSG00000133703 | 12609 | 0.06138 | 0.06 | SNP | C | T |

Note: geneID represents the gene number in the Ensembl database, Ref is the normal type, Alt is the type after gene mutation, INS stands for insertion, DEL for deletion, and SNP for single nucleotide polymorphism.
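To quantify how close the measured frequencies in Table 16 are to the theoretical values, the following sketch computes the absolute deviation and the observed/expected ratio for each site. The values are copied from Table 16; the choice of summary metrics and all variable names are illustrative assumptions, not part of the described method.

```python
# (gene, sequencing depth, measured frequency, standard frequency, type)
table16 = [
    ("EGFR",   10191, 0.0147,  0.01,  "INS"),
    ("PIK3CA",  5020, 0.07749, 0.09,  "SNP"),
    ("PIK3CA",  9192, 0.19093, 0.175, "SNP"),
    ("EGFR",    3988, 0.27282, 0.245, "SNP"),
    ("EGFR",   10147, 0.00581, 0.02,  "SNP"),
    ("EGFR",   12716, 0.03374, 0.03,  "SNP"),
    ("KRAS",   12604, 0.14392, 0.15,  "SNP"),
    ("KRAS",   12609, 0.06138, 0.06,  "SNP"),
]

# Absolute deviation and observed/expected ratio per mutation site.
abs_dev = [abs(measured - expected) for _, _, measured, expected, _ in table16]
ratios = [measured / expected for _, _, measured, expected, _ in table16]

mean_abs_dev = sum(abs_dev) / len(abs_dev)
largest_abs_idx = max(range(len(table16)), key=lambda i: abs_dev[i])
lowest_ratio_idx = min(range(len(table16)), key=lambda i: ratios[i])

for row, dev, ratio in zip(table16, abs_dev, ratios):
    print(f"{row[0]:7s} depth={row[1]:6d} |dev|={dev:.5f} obs/exp={ratio:.2f}")
print(f"mean absolute deviation = {mean_abs_dev:.5f}")
```

On these values the mean absolute deviation is about 0.011. The largest absolute deviation is at the EGFR site with a standard frequency of 0.245, and the lowest observed/expected ratio (about 0.29) is at the EGFR site with a standard frequency of 0.02.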
2. Accuracy Experiment 2
[0188] Human methylation and non-methylation standards are products of Zymo Research, Catalog No. D5014.
[0189] (1) The methylation standard and the non-methylation standard in the human methylation and non-methylation standards were mixed in different ratios to obtain the samples to be tested. In the samples to be tested, the proportion of methylation standard was 0%, 20% or 100%; that is, the tumor-specific genes (BDH1 gene, EMX1 gene, LRRC4 gene, CLEC11A gene, HOXA1 gene, AK055957 gene, COTL1 gene, ACP1 gene or DAB2IP gene) were methylated at 0%, 20% or 100%.
[0190] (2) The sample to be tested was taken, the MC library was constructed according to the method in Example 1, and then RaceSeq target region enrichment and sequencing were performed according to the method in Example 2 to obtain the detection value of the methylation site.
[0191] The test results are shown in Table 17 and Table 18 (the part of each sample type after the primer number is the name of the tumor-specific gene). When the methylation standards were tested by the mutation/methylation co-detection method, the detected values were close to the theoretical values. It can be seen that the mutation/methylation co-detection method has high accuracy in the detection of methylation levels of tumor-specific genes (such as the BDH1 gene, EMX1 gene, LRRC4 gene, CLEC11A gene, HOXA1 gene, AK055957 gene, COTL1 gene, ACP1 gene, and DAB2IP gene).
TABLE-US-00026 Accuracy test results for methylation standards (positive strand) (Table 17)

| Sample type | 0% methylation standard | 20% methylation standard | 100% methylation standard |
|---|---|---|---|
| CA2001_BDH1 | 2% | 18% | 97% |
| CA2002_EMX1 | 3% | 19% | 96% |
| CA2003_LRRC4 | 2% | 9% | 100% |
| CA2004_LRRC4 | 3% | 32% | 97% |
| CA2006_CLEC11A | 2% | 20% | 97% |
| CA2007_CLEC11A | 2% | 25% | 99% |
| CA2008_HOXA1 | 3% | 20% | 99% |
| CA2009_HOXA1 | 3% | 23% | 99% |
| CA2010_EMX1 | 3% | 32% | 99% |
| CA2011_AK055957 | 3% | 23% | 99% |
| CA2012_COTL1 | 3% | 18% | 98% |
| CA2013_ACP1 | 4% | 27% | 98% |
| CA2014_DAB2IP | 2% | 21% | 98% |
TABLE-US-00027 Accuracy test results for methylation standards (negative strand) (Table 18)

| Sample type | 0% methylation standard | 20% methylation standard | 100% methylation standard |
|---|---|---|---|
| CB2001_BDH1 | 3% | 21% | 96% |
| CB2002_LRRC4 | 3% | 17% | 98% |
| CB2004_LRRC4 | 2% | 9% | 96% |
| CB2005_DAB2IP | 2% | 3% | 99% |
| CB2007_CLEC11A | 4% | 50% | 94% |
| CB2008_CLEC11A | 3% | 18% | 97% |
| CB2009_HOXA1 | 2% | 20% | 98% |
| CB2011_EMX1 | 3% | 23% | 99% |
| CB2012_AK055957 | 4% | 19% | 100% |
| CB2013_RASSF2 | 7% | 60% | 94% |
| CB2015_DAB2IP | 3% | 23% | 99% |
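As a worked check of the positive-strand results in Table 17, the following sketch computes the mean measured value and the mean absolute error at each theoretical methylation level. The measured values are copied from Table 17 in row order; the summary statistics and variable names are illustrative assumptions, not part of the described method.

```python
# Measured methylation (%) per positive-strand target, keyed by theoretical level (%).
table17_positive = {
    0:   [2, 3, 2, 3, 2, 2, 3, 3, 3, 3, 3, 4, 2],
    20:  [18, 19, 9, 32, 20, 25, 20, 23, 32, 23, 18, 27, 21],
    100: [97, 96, 100, 97, 97, 99, 99, 99, 99, 99, 98, 98, 98],
}

summary = {}
for expected, measured in table17_positive.items():
    mean_measured = sum(measured) / len(measured)
    mean_abs_error = sum(abs(m - expected) for m in measured) / len(measured)
    summary[expected] = (mean_measured, mean_abs_error)
    print(f"expected {expected:3d}%: mean measured {mean_measured:.1f}%, "
          f"mean absolute error {mean_abs_error:.1f} points")
```

On these values the mean measured levels are about 2.7%, 22.1% and 98.2% for the 0%, 20% and 100% standards, with the largest spread at the 20% level.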
Example 6. Application of Mutation/Methylation Co-Detection Method in cfDNA of Patients with Hepatocellular Carcinoma
[0192] 1. Blood samples from 1 normal person, 1 patient with liver cirrhosis and 3 patients with hepatocellular carcinoma were collected, and cfDNA was extracted.
[0193] 2. 5-40 ng of cfDNA was taken to construct the MC library according to Example 1, and RaceSeq target region enrichment and sequencing was performed according to the method in Example 2.
[0194] 3. The methylation detection results are shown in Table 19 and Table 20. The results showed that HCC-specific hypermethylated genes had higher methylation levels in the examined HCC samples than in the non-HCC samples. The mutation/methylation co-detection method can therefore be applied to the detection of hepatocellular carcinoma cfDNA samples.
TABLE-US-00028 Detection results of methylation levels in target regions of cfDNA samples (positive strand) (Table 19)

| Sample type | Normal | Cirrhosis | HCC1 | HCC2 | HCC3 |
|---|---|---|---|---|---|
| CA2001_BDH1 | 3% | 3% | 28% | 25% | 47% |
| CA2002_EMX1 | 4% | 6% | 11% | 26% | 4% |
| CA2003_LRRC4 | 3% | 5% | 16% | 28% | 28% |
| CA2004_LRRC4 | 3% | 6% | 29% | 46% | 48% |
| CA2006_CLEC11A | 3% | 4% | 11% | 20% | 2% |
| CA2007_CLEC11A | 3% | 5% | 22% | 25% | 10% |
| CA2008_HOXA1 | 4% | 4% | 24% | 33% | 5% |
| CA2009_HOXA1 | 8% | 7% | 10% | 11% | 11% |
| CA2010_EMX1 | 7% | 9% | 21% | 47% | 8% |
| CA2011_AK055957 | 5% | 9% | 40% | 43% | 45% |
| CA2012_COTL1 | 5% | 9% | 17% | 19% | 5% |
| CA2013_ACP1 | 1% | 3% | 5% | 5% | 14% |
| CA2014_DAB2IP | 5% | 7% | 19% | 27% | 50% |
TABLE-US-00029 Detection results of methylation levels in target regions of cfDNA samples (negative strand) (Table 20)

| Sample type | Normal | Cirrhosis | HCC1 | HCC2 | HCC3 |
|---|---|---|---|---|---|
| CB2001_BDH1 | 5% | 5% | 24% | 23% | 56% |
| CB2002_LRRC4 | 4% | 13% | 40% | 47% | 50% |
| CB2004_LRRC4 | 1% | 4% | 11% | 17% | 28% |
| CB2005_DAB2IP | 4% | 5% | 10% | 16% | 27% |
| CB2007_CLEC11A | 11% | 8% | 17% | 38% | 6% |
| CB2008_CLEC11A | 2% | 5% | 22% | 23% | 7% |
| CB2009_HOXA1 | 4% | 2% | 10% | 21% | 3% |
| CB2011_EMX1 | 12% | 11% | 20% | 39% | 7% |
| CB2012_AK055957 | 3% | 9% | 39% | 38% | 43% |
| CB2013_RASSF2 | 5% | 1% | 4% | 18% | 4% |
| CB2015_DAB2IP | 9% | 6% | 18% | 31% | 57% |
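The comparison between HCC and non-HCC samples can be made concrete with the positive-strand values of Table 19. The sketch below lists the targets for which all three HCC samples exceed both the normal and the cirrhosis sample. This separation criterion and the names used are illustrative assumptions, not a diagnostic rule from the described method.

```python
# Positive-strand methylation (%) per target: (Normal, Cirrhosis, HCC1, HCC2, HCC3)
table19_positive = {
    "CA2001_BDH1":     (3, 3, 28, 25, 47),
    "CA2002_EMX1":     (4, 6, 11, 26, 4),
    "CA2003_LRRC4":    (3, 5, 16, 28, 28),
    "CA2004_LRRC4":    (3, 6, 29, 46, 48),
    "CA2006_CLEC11A":  (3, 4, 11, 20, 2),
    "CA2007_CLEC11A":  (3, 5, 22, 25, 10),
    "CA2008_HOXA1":    (4, 4, 24, 33, 5),
    "CA2009_HOXA1":    (8, 7, 10, 11, 11),
    "CA2010_EMX1":     (7, 9, 21, 47, 8),
    "CA2011_AK055957": (5, 9, 40, 43, 45),
    "CA2012_COTL1":    (5, 9, 17, 19, 5),
    "CA2013_ACP1":     (1, 3, 5, 5, 14),
    "CA2014_DAB2IP":   (5, 7, 19, 27, 50),
}


def separates_all_hcc(values):
    """True if every HCC sample is above both non-HCC samples for this target."""
    normal, cirrhosis, *hcc = values
    return min(hcc) > max(normal, cirrhosis)


consistent = [name for name, v in table19_positive.items() if separates_all_hcc(v)]
exceptions = [name for name in table19_positive if name not in consistent]

print(f"{len(consistent)} of {len(table19_positive)} targets separate all HCC samples")
print("exceptions:", ", ".join(exceptions))
```

On these values, 9 of the 13 positive-strand targets are higher in all three HCC samples than in both non-HCC samples. In each of the four exceptions, HCC1 and HCC2 are still elevated and only HCC3 falls to or below the non-HCC level.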
Industrial Application
[0195] The present invention discloses a method for simultaneously detecting the mutation (including point mutation, insertion-deletion mutation, HBV integration and other mutation forms) and/or methylation of tumor-specific genes in ctDNA in one sample. Not only is the sample size requirement low, but the MC library prepared by this method can also support 10-20 subsequent detections. The results of each test can represent the mutation status of all the original ctDNA specimens and the methylation modification status of the region covered by the restriction sites, without reducing the sensitivity and specificity. In addition, the library construction method is applicable not only to cfDNA samples but also to genomic DNA or cDNA samples. The invention has important clinical significance for early tumor screening, disease tracking, efficacy evaluation, prognosis prediction and the like, and has great application value.