METHOD AND SYSTEM FOR CONSTRUCTING SEQUENCING LIBRARY ON THE BASIS OF METHYLATED DNA TARGET REGION, AND USE THEREOF

Abstract

A method and system for constructing a sequencing library include: obtaining a transformed DNA sample with a universal sequence; performing amplification using a first specific primer located upstream of the target region and a first universal primer at least partially matching or overlapping the universal sequence; and performing amplification using a second specific primer, a second universal primer and a tagged primer. The second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least a partial sequence of the second specific primer, and the tagged primer overlaps a partial sequence of the first universal primer. Alternatively, the second specific primer is located downstream of the target region, the second universal primer overlaps at least a partial sequence of the first specific primer, and the tagged primer overlaps a partial sequence of the second specific primer.

Claims

1. A method for constructing a sequencing library based on a target region of a methylated DNA, the method comprising: step 1 of obtaining a transformed DNA sample with universal sequence based on a methylated DNA sample by ligating a universal sequence to at least one end of the methylated DNA sample and treating the methylated DNA sample with bisulfite; step 2 of performing, by using a first specific primer and a first universal primer, a first amplification on the transformed DNA sample with universal sequence to obtain a first amplification product, wherein the first specific primer is located upstream of the target region, the first universal primer at least partially matches or overlaps the universal sequence, and the first universal primer is located downstream of the target region; and step 3 of performing, by using a second specific primer, a second universal primer and a tagged primer, a second amplification on the first amplification product to obtain a second amplification product and obtain the sequencing library, wherein the second specific primer, the second universal primer, and the tagged primer are as set forth in i or ii: i: the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least a partial sequence of the second specific primer, the tagged primer contains a tag sequence, and the tagged primer overlaps a partial sequence of the first universal primer; or ii: the second specific primer is located downstream of the target region, the second universal primer overlaps at least a partial sequence of the first specific primer, the tagged primer contains a tag sequence, and the tagged primer overlaps a partial sequence of the second specific primer.

2. The method according to claim 1, wherein the first specific primer and the second specific primer are designed for only one strand of the DNA sample.

3. The method according to claim 1, wherein in step 3, a 5′-end of the second specific primer overlaps at least a partial sequence of a 3′-end of the second universal primer, and a 3′-end of the tagged primer overlaps a partial sequence of a 5′-end of the first universal primer.

4. The method according to claim 1, wherein in step 3, a 5′-end of the second specific primer overlaps at least a partial sequence of a 3′-end of the tagged primer, and a 3′-end of the second universal primer overlaps a partial sequence of a 5′-end of the first specific primer.

5. The method according to claim 1, wherein step 1 comprises: sub-step 1-a of treating the methylated DNA sample with bisulfite to obtain a transformed DNA sample; and sub-step 1-b of replicating the transformed DNA sample by using DNA polymerase and the first sequencing primer to obtain the transformed DNA sample with universal sequence, wherein a 3′-end of the first sequencing primer comprises random bases, and a 5′-end of the first sequencing primer is the universal sequence.

6. The method according to claim 5, wherein the number of the random bases is 6 to 12, and the random bases are A, T, or C.

7. The method according to claim 5, wherein the universal sequence is a sequencing adapter sequence or a known sequence; and optionally, cytosine in the sequencing adapter sequence or the known sequence is methylated cytosine.

8. The method according to claim 1, wherein step 1 further comprises: sub-step 1-1 of performing end repair by adding A-tailing to the methylated DNA sample to obtain a repaired DNA sample; sub-step 1-2 of ligating the universal sequence to the at least one end of the repaired DNA sample to obtain a DNA sample with universal sequence; and sub-step 1-3 of treating, by using bisulfite, the DNA sample with universal sequence to obtain the transformed DNA sample with universal sequence.

9. The method according to claim 8, wherein the universal sequence is at least one selected from a sequencing adapter sequence or a modified sequencing adapter sequence; optionally, the modified sequencing adapter sequence is a sequencing adapter sequence in which cytosines on one strand are methylated and cytosines on the other strand are unmethylated; a sequencing adapter sequence with a known sequence and a random sequence, a base at a 3′-end of one strand of the sequencing adapter being not modified with a non-hydroxy group; or a sequencing adapter sequence with a known sequence and a random sequence, a base at a 3′-end of one strand of the sequencing adapter being modified with a non-hydroxy group; and optionally, the random sequence is a molecular tag sequence.

10. The method according to claim 1, wherein step 1 further comprises: sub-step ① of interrupting and transposing the DNA sample by using a transposase to obtain a DNA sample with universal sequence, wherein the transposase is embedded with the universal sequence; and sub-step ② of treating the DNA sample with universal sequence by using bisulfite to obtain the transformed DNA sample with universal sequence.

11. The method according to claim 10, wherein the universal sequence is a transposase effector sequence or a Tn5 transposase effector sequence with sequencing adapter, preferably the transposase effector sequence; and preferably, cytosine in the transposase effector sequence is methylated cytosine.

12. A method for sequencing a methylated DNA sample, the method comprising: constructing and obtaining a sequencing library based on the methylated DNA sample by the method according to claim 1; and performing a high-throughput sequencing on the sequencing library to obtain sequencing results.

13. A method for determining a methylation status of a methylated DNA sample, the method comprising: constructing and obtaining a sequencing library based on the methylated DNA sample by the method according to claim 1; performing a high-throughput sequencing on the sequencing library to obtain sequencing results; and aligning the sequencing results to a reference genome to determine the methylation status of the methylated DNA sample.

14. A kit configured to construct a sequencing library based on a target region of a methylated DNA by the method according to claim 1, the kit comprising a universal sequence, a tagged primer, a first universal primer, a second universal primer, a methylation detection reagent, a first specific primer, and a second specific primer.

15. The kit according to claim 14, wherein the tagged primer contains a tag sequence, the first universal primer matches or overlaps at least a part of the universal sequence, and the first specific primer and the second specific primer are designed for only one strand of the DNA sample.

16. The kit according to claim 14, wherein a 5′-end of the second specific primer overlaps at least a partial sequence of a 3′-end of the second universal primer, and a 3′-end of the tagged primer overlaps a partial sequence of a 5′-end of the first universal primer.

17. The kit according to claim 14, wherein a 5′-end of the second specific primer overlaps at least a partial sequence of a 3′-end of the tagged primer, and a 3′-end of the second universal primer overlaps a partial sequence of a 5′-end of the first specific primer.

18. The kit according to claim 14, wherein a 3′-end of the first sequencing primer comprises random bases, and a 5′-end of the first sequencing primer is the universal sequence.

19. The kit according to claim 18, wherein the number of the random bases is 6 to 12, and the random bases are A, T, or C.

20. The kit according to claim 18, wherein the universal sequence is a sequencing adapter sequence or a known sequence; and optionally, cytosine in the sequencing adapter sequence or the known sequence is methylated cytosine.

Description

BRIEF DESCRIPTION OF DRAWINGS

[0059] The above and/or additional aspects and advantages of the present disclosure will become apparent and easy to understand in conjunction with the description of the embodiments with reference to the following drawings, in which:

[0060] FIG. 1A and FIG. 1B are flow charts of random primer library construction according to an embodiment of the present disclosure.

[0061] FIG. 2A and FIG. 2B are flow charts of adapter connection library construction according to an embodiment of the present disclosure.

[0062] FIG. 3 is a flow chart of transposon library construction according to an embodiment of the present disclosure.

[0063] FIG. 4 is a schematic diagram of sequences with different adapters according to an embodiment of the present disclosure.

[0064] FIG. 5 is a quality inspection graph of a sequencing library according to an embodiment of the present disclosure.

[0065] FIG. 6 is a diagram illustrating results of sequencing depths of respective amplicons according to an embodiment of the present disclosure.

[0066] FIG. 7 is a quality inspection graph of a sequencing library according to an embodiment of the present disclosure.

[0067] FIG. 8 is a diagram illustrating results of sequencing depths of respective amplicons according to an embodiment of the present disclosure.

[0068] FIG. 9 is a schematic structural diagram of a system for constructing a sequencing library based on a target region of methylated DNA according to an embodiment of the present disclosure.

[0069] FIG. 10 is a schematic structural diagram of a universal transformation module according to an embodiment of the present disclosure.

[0070] FIG. 11 is a schematic structural diagram of a universal transformation module according to an embodiment of the present disclosure.

[0071] FIG. 12 is a schematic structural diagram of a universal transformation module according to an embodiment of the present disclosure.

DESCRIPTION OF EMBODIMENTS

[0072] The embodiments of the present disclosure are described in detail below. Examples of the embodiments are illustrated in the accompanying drawings, in which the same or similar reference signs indicate the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the accompanying drawings are exemplary; they are intended to explain the present disclosure and should not be construed as limiting it.

[0073] In order to have a more intuitive understanding of the present disclosure, the terms included in the present disclosure are explained and described below. Those skilled in the art should understand that these explanations and descriptions are only for more convenient understanding, but should not be regarded as limitations of the protection scope of the present disclosure. Herein, unless otherwise specified, where two nucleic acid sequences are described as being connected, it means that they are connected via a 3′, 5′-phosphodiester linkage. Unless otherwise specified herein, where a base is mentioned, the base N or n represents any one of bases A, T, C, or G.

[0074] Herein, the terms “upstream” and “downstream” refer to that, when comparing two or more nucleic acid sequences according to the order of nucleotides from 5′-end to 3′-end, the nucleic acid sequence located upstream can recognize or match a region closer to the 5′-end of the template sequence than the nucleic acid sequence located downstream. Since different nucleic acid sequences may have different lengths, the regions to be recognized or matched by them may also have different lengths. When it is described that an A nucleic acid sequence is located downstream of a B nucleic acid sequence, it means that a site recognized by or paired with the 3′-end of the A nucleic acid sequence is closer to the 3′-end of the template sequence than a site recognized by or paired with the 3′-end of the B nucleic acid sequence.

[0075] Herein, when two nucleic acid sequences are described to “match with each other”, it means that bases of one of the two nucleic acid sequences are complementarily paired with bases of the other nucleic acid sequence. When two nucleic acid sequences are described as at least partially overlapping, it means that the two nucleic acid sequences have at least one fragment of identical nucleic acid sequence.
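The two definitions above can be illustrated with a minimal Python sketch. The function names and the example sequences are hypothetical and are not part of the disclosure; "match" is modelled as full antiparallel complementarity, and "overlap" as sharing an identical contiguous fragment of at least a chosen length.

```python
# Illustrative sketch only; function names and sequences are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/T/C/G only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def matches(a: str, b: str) -> bool:
    """'Match': every base of a pairs with a base of b (antiparallel)."""
    return a.upper() == reverse_complement(b)


def longest_shared_fragment(a: str, b: str) -> str:
    """Longest contiguous fragment that is identical in a and b."""
    a, b = a.upper(), b.upper()
    best = ""
    for i in range(len(a)):
        # Only try fragments longer than the best one found so far.
        for j in range(i + len(best) + 1, len(a) + 1):
            if a[i:j] in b:
                best = a[i:j]
            else:
                break  # extending a fragment that is absent cannot help
    return best


def overlaps(a: str, b: str, min_len: int = 1) -> bool:
    """'Overlap': a and b share an identical fragment of >= min_len bases."""
    return len(longest_shared_fragment(a, b)) >= min_len


if __name__ == "__main__":
    print(matches("ATGC", "GCAT"))
    print(longest_shared_fragment("AATTGG", "TTGGCC"))
```

In this sketch the 3′-end of one primer overlapping the 5′-end of another corresponds to a shared fragment located at those ends; the position check is omitted for brevity.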

[0076] Herein, either the “bisulfite-” or “sulfite-” treatment refers to a reagent or process that deaminates cytosine in DNA into uracil. Therefore, the bisulfite treatment and the sulfite treatment are included in the protection scope of the present disclosure.

[0077] In order to solve the problem of primer dimers between multiple pairs of methylated specific primers in the process of amplifying methylated DNA, the present disclosure creatively provides a one-way primer amplification method, that is, only primers for one strand of the DNA template are designed. In this regard, the designed specific primers each only contain A, T, and G, or A, T, and C, and they can hardly form primer dimers. At the same time, in order to ensure the specificity of primer amplification, during the second round of PCR amplification, specific primers for amplification are designed on the product of the first round of amplification to further ensure the specificity of amplification. The sequencing library prepared in such manner meets the requirements of sequencing.
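The three-letter composition of the one-way primers can be checked with a short Python sketch. The helper names and primer sequences below are hypothetical. The check only tests for the absence of G:C pairing between primers drawn from the same reduced alphabet; A:T pairing remains possible, which is consistent with the statement that such primers can "hardly" form dimers.

```python
# Illustrative sketch only; helper names and primer sequences are hypothetical.

def is_three_letter(primer: str) -> bool:
    """True if the primer uses only A/T/G or only A/T/C."""
    letters = set(primer.upper())
    return letters <= set("ATG") or letters <= set("ATC")


def gc_pairing_possible(primer_a: str, primer_b: str) -> bool:
    """True if a G in one primer could pair with a C in the other."""
    a, b = primer_a.upper(), primer_b.upper()
    return ("G" in a and "C" in b) or ("C" in a and "G" in b)


if __name__ == "__main__":
    pool = ["GATTGTAGGTTTAG", "TTAGGATTGGAAGT"]  # hypothetical A/T/G primers
    print(all(is_three_letter(p) for p in pool))
    print(any(gc_pairing_possible(p, q) for p in pool for q in pool))
```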

[0078] In detail, either the genomic DNA (gDNA) is transposed by a Tn5 transposon, or a universal sequence is introduced to the interrupted gDNA or cell-free DNA (cfDNA) molecules (the original DNAs) through adapter connection or random DNA replication. The DNA introduced with the universal sequence is subjected to a bisulfite treatment (BS treatment) to obtain a bisulfite-transformed DNA sequence, in which the unmethylated cytosine (C) of the original DNA is converted to uracil (U). A universal primer is designed based on the introduced universal sequence, a specific primer is designed to be located upstream of the target region of the transformed DNA sequence, and the specific primer is designed for only one strand of the DNA template. PCR amplification is performed by using the universal primer and the specific primer to obtain the PCR product. At the same time, in order to increase the specificity of amplification, a nested primer is designed to be located downstream of the above-mentioned specific primer, or a specific primer is designed to be located downstream of the target region, and either the nested primer or the downstream specific primer is designed for only one strand of the DNA template. A second-step amplification is performed on the product of the first-step PCR by using the nested primer or the downstream specific primer and the universal primer, to finally obtain a product of PCR amplification on the bisulfite-treated template (BS-PCR).
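The effect of the bisulfite treatment on a single strand can be sketched in Python as follows. This is an in silico illustration under simplifying assumptions (complete conversion, positions of methylated cytosines known in advance); the function names and the example sequence are hypothetical.

```python
# Illustrative sketch only; assumes complete conversion of unmethylated C.

def bisulfite_convert(seq: str, methylated=frozenset()) -> str:
    """Convert unmethylated C to U; methylated C (0-based positions) is kept."""
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def after_pcr(converted: str) -> str:
    """During PCR, U in the template is copied as T in the product strand."""
    return converted.replace("U", "T")


if __name__ == "__main__":
    original = "ACGTCCGA"      # hypothetical template strand
    methylated = {1}           # the C at index 1 is methylated
    converted = bisulfite_convert(original, methylated)
    print(converted)           # ACGTUUGA
    print(after_pcr(converted))  # ACGTTTGA
```

Because only methylated cytosines survive, a primer written against the converted strand is drawn almost entirely from three bases, which is what makes the one-strand primer design described above possible.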

[0079] In one aspect of the present disclosure, the present disclosure provides a method for constructing a sequencing library based on a target region of a methylated DNA, the method including: (1) obtaining a transformed DNA sample with universal sequence based on a methylated DNA sample by constructing a bisulfite-treated DNA sample with a universal sequence ligated to at least one end of the methylated DNA sample; (2) performing, by using a first specific primer and a first universal primer, a first amplification on the transformed DNA sample with universal sequence to obtain a first amplification product, wherein the first specific primer is located upstream of the target region, the first universal primer at least partially overlaps or matches the universal sequence, and the universal sequence is located downstream of the target region; and (3) performing, by using a second specific primer, a second universal primer and a tagged primer, a second amplification on the first amplification product to obtain a second amplification product and obtain a sequencing library, wherein the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least a partial sequence of the second specific primer, the tagged primer contains a tag sequence, and the tagged primer overlaps a partial sequence of the first universal primer; or wherein the second specific primer is located downstream of the target region, the second universal primer overlaps at least a partial sequence of the first specific primer, the tagged primer contains a tag sequence, and the tagged primer overlaps a partial sequence of the second specific primer.

[0080] In the process of obtaining the transformed DNA sample with universal sequence, different methods can be adopted depending upon the precedence order of the universal sequence treatment and the bisulfite treatment.

[0081] In at least some embodiments of the present disclosure, the universal sequence is introduced by the following method:

[0082] 1. DNA molecules of gDNA, interrupted gDNA or cfDNA are first treated with bisulfite, and then the template is replicated by using a first sequencing primer and DNA polymerase to obtain a bisulfite-treated DNA template with universal sequence (as shown in FIG. 1). The first sequencing primer is a primer that has 6-12 random N bases (degenerate bases composed of A/T/C/G) or 6-12 random H bases (degenerate bases composed of A/T/C) at a 3′-end, and a partial or complete sequencing adapter sequence or a known sequence (in which cytosine is preferably methylated cytosine) at a 5′-end. Suitable sequencing adapter sequences include, but are not limited to, the sequencing adapters of the MGI platform as well as the sequencing adapter sequences of the Illumina and Proton platforms. In at least some embodiments, the suitable DNA polymerase can be conventional rTaq or Fusion, or can be Bst, phi29, etc.

[0083] In at least some embodiments of the present disclosure, the universal sequence is introduced by the following method:

[0084] The interrupted gDNA or cfDNA is end-repaired by adding A-tailing, and then a specific adapter sequence is added, which can be partial or complete sequencing adapter sequences or modified sequencing adapter sequences. These modified sequencing adapter sequences each can be a sequencing adapter sequence having a known sequence and one strand with non-hydroxyl modified base at 3′-end, or a sequencing adapter sequence having a known sequence and one strand without non-hydroxyl modified base at 3′-end, for example, No. 1, No. 2, No. 3, and No. 4 shown in FIG. 4. After purification, the product added with the universal sequence is treated with sulfite to obtain the transformed DNA template (FIG. 2).

[0085] In some other embodiments of the present disclosure, the universal sequence is introduced by the following method.

[0086] An adapter sequence is embedded in Tn5 transposase. The adapter can be the effective 19 bp specific sequence of the Tn5 transposase itself, or a combination of the effective sequence and other sequences (such as a sequencing adapter sequence), preferably the 19 bp specific sequence. The cytosine in the 19 bp specific sequence is preferably methylated cytosine. The gDNA is transposed by Tn5 transposition so that a specific adapter is added. After purification, the product with the added specific adapters is treated with bisulfite to obtain the transformed DNA template (as shown in FIG. 3).

[0087] After the above-mentioned transformed DNA sample with universal sequence is obtained, a sequencing library is obtained by performing PCR amplification with one-way specific primers, and the amplification method can be any one of the following.

[0088] In at least some embodiments of the present disclosure, the sequencing library is obtained by performing PCR amplification by the following method.

[0089] A first-step PCR amplification is performed on the sulfite-treated DNA by using a first specific primer and a first universal primer. A sequence of the 3′-end of the first universal primer is partially or completely complementary to, or partially or completely overlaps, the added universal sequence. For example, the 5′-end of the first universal sequence is a partial or complete sequencing adapter sequence (preferably a partial sequence). The binding site of the first specific primer sequence is located upstream of the target region to be amplified, and is designed for the bisulfite-treated DNA template sequence. The obtained product is purified and is then subjected to a second-step PCR amplification by using a second specific primer (also referred to as the nested primer in the following examples), a second universal primer, and a tagged primer. In a first cycle of the second-step PCR, the second specific primer and the tagged primer are first subjected to PCR, and the subsequent cycles are performed with the second specific primer, the second universal primer and the tagged primer together, so as to perform multiple rounds of PCR. The 5′-end of the second specific primer overlaps a partial or complete sequence of the 3′-end of the second universal primer. The 3′-end of the second specific primer is a specific sequence, and the specific sequence is designed to be located between the first specific primer and the target region. The second universal primer can be a partial or complete sequence of the sequencing universal adapter, and a 3′-end thereof is identical to a partial or complete sequence of the 5′-end of the second specific primer. The 3′-end of the tagged primer is identical to a partial or complete sequence of the 5′-end of the first universal primer, and a known tag sequence of 8-12 bp is present in the middle of the tagged primer (the tag sequence that each platform uses to distinguish mixed samples), which is used for subsequent multi-sample mixed sequencing (FIG. 1A, FIG. 2A, path A of FIG. 3).
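The layout of the tagged primer described above (a tag of 8-12 bp in the middle, and a 3′-end identical to the 5′-end of the first universal primer) can be sketched in Python. All sequences, the content of the 5′ part, and the function names are hypothetical placeholders; the disclosure does not give these sequences in this passage.

```python
# Illustrative sketch only; all sequences below are hypothetical placeholders.

def build_tagged_primer(five_prime_part: str, tag: str,
                        first_universal_primer: str, overlap_len: int) -> str:
    """Assemble 5'-[assumed 5' part][tag][start of first universal primer]-3'."""
    if not 8 <= len(tag) <= 12:
        raise ValueError("tag sequence must be 8-12 bases long")
    if not 0 < overlap_len <= len(first_universal_primer):
        raise ValueError("overlap must be a partial or complete 5' segment")
    # The 3'-end of the tagged primer is identical to the 5'-end of the
    # first universal primer over overlap_len bases.
    return five_prime_part + tag + first_universal_primer[:overlap_len]


def extract_tag(tagged_primer: str, five_prime_len: int, tag_len: int) -> str:
    """Read the tag back, for example when sorting reads of mixed samples."""
    return tagged_primer[five_prime_len:five_prime_len + tag_len]


if __name__ == "__main__":
    universal = "GATCCTGAGTCCAATGC"   # hypothetical first universal primer
    primer = build_tagged_primer("AAGTCGGA", "ACGTACGTAC", universal, 10)
    print(primer)
    print(extract_tag(primer, 8, 10))
```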

[0090] In some other embodiments of the present disclosure, the sequencing library is obtained by performing PCR amplification by the following method.

[0091] A first-step PCR amplification is performed on the sulfite-treated DNA by using a first specific primer (also referred to as the upstream specific primer in the following examples) and a first universal primer. A sequence of the 3′-end of the first universal primer is partially or completely complementary to, or partially or completely overlaps, the introduced universal sequence (the universal sequence preferably uses a known sequence other than the sequencing adapter sequence). The specific sequence of the 3′-end of the first specific primer is designed to be located upstream of the target region to be amplified, and is designed specifically for the bisulfite-treated DNA template sequence, and the 5′-end of the first specific primer is a partial or complete sequencing adapter sequence (preferably a partial sequence). After the obtained product is purified, a second-step PCR amplification is performed using a second specific primer (accordingly referred to as the downstream specific primer in the following embodiments), a second universal primer, and a tagged primer. In a first cycle of the second-step PCR, the second specific primer and the second universal primer are first subjected to PCR amplification, and in the subsequent cycles, the second specific primer, the second universal primer and the tagged primer together are subjected to multiple rounds of PCR. The 5′-end of the downstream specific primer overlaps a partial or complete sequence of the 3′-end of the tagged primer, and the 3′-end of the second specific primer is a specific sequence. The specific sequence is designed to be located downstream of the target region. The second universal primer can be a partial or complete sequencing adapter sequence, which has a 3′-end overlapping a partial or complete sequence of the 5′-end of the first specific primer. The 3′-end of the tagged primer is identical to a partial or complete sequence of the 5′-end of the second specific primer, and the tagged primer has a known tag sequence of 8-12 bp in the middle (the tag sequence that each platform uses to distinguish mixed samples), which is used for subsequent multi-sample mixed sequencing (FIG. 1B, FIG. 2B, path B of FIG. 3).

[0092] According to another aspect of the present disclosure, the present disclosure provides a system for constructing a sequencing library based on a target region of a methylated DNA. As illustrated in FIG. 9, the system includes a universal transformation module, a first amplification module, and a second amplification module that are connected in sequence. The universal transformation module is configured to obtain a transformed DNA sample with universal sequence based on a methylated DNA sample by constructing a DNA sample with universal sequence ligated to at least one end thereof and treated with bisulfite. The first amplification module is configured to perform the first amplification on the transformed DNA sample with universal sequence by using the first specific primer and the first universal primer, to obtain a first amplification product. The first specific primer is located upstream of the target region, and the first universal primer at least partially matches or overlaps the universal sequence. The second amplification module is configured to perform a second amplification on the first amplification product by using a second specific primer, a second universal primer, and a tagged primer to obtain a second amplification product and obtain a sequencing library. The second specific primer, the second universal primer and the tagged primer are as set forth in (i) or (ii): (i) the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least a partial sequence of the second specific primer, the tagged primer contains a tag sequence, and the tagged primer overlaps a partial sequence of the first universal primer; or (ii) the second specific primer is located downstream of the target region, the second universal primer overlaps at least a partial sequence of the first specific primer, the tagged primer contains a tag sequence, and the tagged primer overlaps a partial sequence of the second specific primer.

[0093] In at least some embodiments of the present disclosure, the universal transformation module, as shown in FIG. 10, includes a transformation unit and an amplification unit connected to the transformation unit. The transformation unit is configured to treat the methylated DNA sample with bisulfite to obtain a transformed DNA sample. The amplification unit is configured to replicate the transformed DNA sample by using a DNA polymerase and a first sequencing primer, to obtain the transformed DNA sample with universal sequence. The 3′-end of the first sequencing primer is random bases, and the 5′-end of the first sequencing primer is a universal sequence.

[0094] In at least some embodiments of the present disclosure, the universal transformation module, as shown in FIG. 11, includes a repair unit, a connection unit, and a transformation unit that are connected in sequence. The repair unit is configured to perform end repair by adding A-tailing on the methylated DNA sample, to obtain a repaired DNA sample. The connection unit is configured to ligate the universal sequence to at least one end of the repaired DNA sample, to obtain a DNA sample with universal sequence. The transformation unit is configured to treat the DNA sample with universal sequence by using bisulfite, so as to obtain the transformed DNA sample with universal sequence.

[0095] In at least some embodiments of the present disclosure, the universal transformation module, as shown in FIG. 12, includes a transposition unit and a transformation unit connected to the transposition unit. The transposition unit is configured to interrupt and transpose the DNA sample by using a transposase (embedded with a universal sequence), to obtain the DNA sample with universal sequence. The transformation unit is configured to treat the DNA sample with universal sequence by using bisulfite, to obtain the transformed DNA sample with universal sequence.

[0096] The solutions of the present disclosure will be explained below in conjunction with embodiments. Those skilled in the art can understand that the following embodiments are only used to illustrate the present disclosure, and should not be regarded as limiting the scope of the present disclosure. Wherever specific techniques or conditions are not indicated in the embodiments, the procedures shall be carried out in accordance with the techniques or conditions described in the literatures in the field or in accordance with the product instructions. The reagents or instruments used without indication of the manufacturer are all conventional products that are commercially available.

Example 1: Library Construction and Sequencing Based on Methylation Multiplex-PCR

[0097] Experimental design: 100 ng of Yanhuang genomic DNA was subjected to bisulfite treatment, a DNA target methylation library was then prepared by following the steps of the present disclosure, and the library was loaded on an MGISEQ-2000 sequencer for sequencing, with sequencing type PE100. The data was then analyzed for data utilization, mappability, amplicon specificity, uniformity, and other properties.

[0098] 1. Bisulfite Treatment

[0099] Using an EZ DNA Methylation-Gold Kit™ (ZYMO, USA, catalog number D5005), the above-mentioned DNA was treated with bisulfite.

[0100] Solution Preparation:

[0101] Preparation of CT conversion reagent solution: CT conversion reagent (solid mixture) was taken out from the kit. 900 μL of water, 300 μL of M-dilution buffer, and 50 μL of M-dissolving buffer were added, respectively. Then, the mixture was dissolved at room temperature by vortexing for 10 minutes or by shaking on a shaker for 10 minutes.

[0102] Preparation of M-washing buffer: 24 mL of 100% ethanol was added to the M-Washing Buffer for use.

[0103] Specific steps are as follows:

[0104] (1) 130 μL of the CT conversion reagent solution and the above DNAs were added to a PCR tube, followed by flicking or pipetting to suspend the mixed sample;

[0105] Then, the sample tube was placed on the PCR machine to perform the following steps: 5 minutes at 98° C. and 2.5 hours at 64° C.

[0106] After the above operations were completed, the next step was carried out immediately.

[0107] (2) the Zymo-Spin IC™ Column was placed into the collection tube, and 600 μL of M-binding buffer was added.

[0108] Then, the bisulfite-treated sample was added into the Zymo-Spin IC™ Column containing the M-binding buffer, followed by closing the lid and mixing upside down.

[0109] Centrifugation was performed at a full speed (>10,000×g) for 30 seconds, the collection solution in the collection tube was discarded, 100 μL of the M-washing Buffer was added into the column, followed by centrifuging at a full speed (>10,000×g) for 30 seconds and discarding the liquid in the collection tube.

[0110] 200 μL of M-Desulphonation Buffer was added into the column, and the column was left to stand at room temperature for 15 min, followed by centrifuging at a full speed (>10,000×g) for 30 seconds and discarding the liquid in the collection tube.

[0111] 200 μL of the M-washing buffer was added into the column, followed by centrifuging at a full speed (>10,000×g) for 30 seconds and discarding the liquid in the collection tube. This step was repeated one more time.

[0112] The Zymo-Spin IC™ Column was placed in a new 1.5 mL EP tube, 40 μL of M-elution buffer was added onto the column matrix, and the column was left to stand at room temperature for 2 min, followed by centrifuging at a full speed (>10,000×g) to elute the target fragment DNA.

[0113] 2. DNA Replication

[0114] (1) DNA replication was performed on the bisulfite-treated DNA in the PCR tube according to the following reaction system.

TABLE-US-00001
Bisulfite-treated DNA from the previous step     38 μL
5× Bst buffer                                     5 μL
Random primer (10 μM)                             5 μL
Bst enzyme (NEB, USA, catalog No. M0538)          1 μL
dNTP mix (10 mM)                                  1 μL
Total volume                                     50 μL

[0115] The sequence of the random primer (i.e., the first sequencing primer mentioned in this disclosure) is CGCTTGGCCTCCGACTTNNNNNNNN (SEQ ID NO: 24), where each N is independently any one of the four bases A, T, C, and G.

[0116] (2) The above reaction system was placed on the PCR machine, and reacted at 65° C. for 10 minutes.

[0117] (3) After the reaction was complete, purification was performed using 1.5×AMPure magnetic beads (Beckman, AMPure XP, catalog No. A63881), and the purified product was finally dissolved in 22 μl of elution buffer.
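The composition of the random primer described above can be illustrated with a short sketch. It assumes only what the sequence itself shows: a fixed 17-nt 5′ portion (identical to the first universal primer of Table 5, SEQ ID NO: 21) followed by eight random bases. The function name, the seed, and the variable names are hypothetical and serve illustration only.

```python
import random

# Fixed 5' portion and number of random 3' bases, both read from SEQ ID NO: 24.
FIXED_PART = "CGCTTGGCCTCCGACTT"
N_RANDOM = 8


def draw_random_primer(rng):
    """Return one concrete member of the CGCTTGGCCTCCGACTT + N8 primer pool."""
    tail = "".join(rng.choice("ACGT") for _ in range(N_RANDOM))
    return FIXED_PART + tail


# Each of the 8 N positions can be any of 4 bases.
POOL_SIZE = 4 ** N_RANDOM

# The seed is arbitrary; it only makes the illustration reproducible.
example = draw_random_primer(random.Random(0))
print(POOL_SIZE, example)
```

Under these assumptions the pool contains 4^8 = 65,536 distinct primer sequences, all sharing the same 5′ portion.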

[0118] 3. A First Round of PCR

[0119] (1) The PCR system in the PCR tube was configured according to the following reaction system

TABLE-US-00002
Treated DNA from the previous step                 20 μL
2× KAPA2G Fast ReadyMix (Kapa, USA, KK5102)        25 μL
First specific primer pool (10 μM)                2.5 μL
First universal primer (10 μM)                    2.5 μL
Total volume                                       50 μL

[0120] (2) Reaction conditions of PCR

TABLE-US-00003
94° C.   1 min
94° C.   30 s    }
58° C.   2 min   } 15 cycles
72° C.   30 s    }
72° C.   5 min
12° C.   maintained

[0121] (3) Purification was performed with 1.5×AMPure magnetic beads after the reaction was complete, and finally the purified product was dissolved in 22 μl of elution buffer.
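As a rough planning aid, the programmed hold time of the cycling conditions in step (2) above can be added up as follows. This is a minimal sketch: it counts only the programmed holds, ignores temperature ramping and the final 12° C. hold, and the function and variable names are hypothetical.

```python
# (temperature in deg C, hold time in seconds) for the three cycled steps
CYCLE_STEPS = [(94, 30), (58, 120), (72, 30)]
INITIAL_DENATURATION = (94, 60)   # 94 deg C for 1 min
FINAL_EXTENSION = (72, 300)       # 72 deg C for 5 min


def programmed_hold_seconds(cycles):
    """Sum of programmed holds; ramp times and the 12 deg C hold are excluded."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + cycles * per_cycle + FINAL_EXTENSION[1]


first_round_minutes = programmed_hold_seconds(15) / 60
print(first_round_minutes)
```

Passing a different cycle count to `programmed_hold_seconds` gives the corresponding figure for a program with the same steps but more or fewer cycles.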

[0122] 3. A Second Round of PCR

[0123] (1) The PCR system in the PCR tube was configured according to the following reaction system. The nested primer pool is shown in Table 4 below, and the tagged primer is shown in Table 5 below.

TABLE-US-00004
Treated DNA from the previous step               17.5 μL
2× KAPA2G Fast ReadyMix                            25 μL
Second specific primer pool (10 μM)               2.5 μL
Second universal primer (10 μM)                   2.5 μL
Tagged primer (10 μM)                             2.5 μL
Total volume                                       50 μL

[0124] (2) Reaction conditions of PCR

TABLE-US-00005
94° C.   1 min
94° C.   30 s    }
58° C.   2 min   } 20 cycles
72° C.   30 s    }
72° C.   5 min
12° C.   maintained

[0125] (3) Purification was performed with 1.0×AMPure magnetic beads after the reaction was finished, and finally the purified product was dissolved in 22 μl of elution buffer.
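The final concentration of each primer component in the 50 μL second-round reaction follows from simple dilution arithmetic (C1·V1 = C2·V2). The sketch below assumes that each of the three primer components is added as 2.5 μL of a 10 μM stock, with the tagged primer volume read as 2.5 μL; the helper name is hypothetical. Whether the 10 μM of the primer pool refers to the pool as a whole or to each primer in it is not stated in the text, so the figure for the pool is the concentration of the pool as supplied.

```python
def final_concentration_um(stock_um, added_ul, total_ul):
    """Dilution arithmetic: C_final = C_stock * V_added / V_total."""
    return stock_um * added_ul / total_ul


# Volumes (in uL) of the 10 uM primer stocks added to the 50 uL reaction.
primer_volumes_ul = {
    "second specific primer pool": 2.5,
    "second universal primer": 2.5,
    "tagged primer": 2.5,  # unit assumed to be uL
}

final_um = {
    name: final_concentration_um(10.0, volume, 50.0)
    for name, volume in primer_volumes_ul.items()
}
print(final_um)
```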

[0126] 4. Library Detection:

[0127] A Bioanalyzer analysis system (Agilent, Santa Clara, USA) was used to detect the size and content of the inserts in the library, and the results are shown in FIG. 5.

[0128] 5. On-Machine Sequencing

[0129] High-throughput sequencing was performed on the obtained library on the MGISEQ-2000 platform (sequencing type PE100). After the sequencing data were aligned, basic parameters including off-machine data, available data, mappability, and GC content were statistically analyzed. The results are shown in Table 1 below. The depth of each amplicon is shown in FIG. 6, in which the abscissa represents the different CpG sites.

TABLE-US-00006
TABLE 1 Sequencing test results
No.        Off-machine data   Adapter filtration ratio   Mappability   Specificity   0.1X uniformity
Sample 1   136227             1.3%                       89.6%         78.6%         90%
Sample 2   115298             1.0%                       88.5%         77.5%         90%
Sample 3   114045             0.9%                       88.1%         78.7%         90%

[0130] In Table 1, Sample 1 to Sample 3 represent three replicates of the same sample; the mappability refers to the ratio of reads mapped to the genome; the specificity refers to the ratio of reads of the target regions to the total reads of the whole sequencing; and the uniformity refers to the proportion of target regions whose depth is greater than 0.1 times the average depth of the target regions, relative to the total number of target regions.
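The specificity and uniformity definitions given above can be written out as a short calculation. This is a sketch under stated assumptions: the uniformity threshold is read as 0.1 times the mean depth of the target regions, the function names are hypothetical, and the depth values are invented for illustration and are not data from this disclosure.

```python
def specificity(on_target_reads, total_reads):
    """Fraction of all sequenced reads that fall in the target regions."""
    return on_target_reads / total_reads


def uniformity(depths, fraction=0.1):
    """Fraction of target regions whose depth exceeds fraction * mean depth."""
    mean_depth = sum(depths) / len(depths)
    passing = sum(1 for depth in depths if depth > fraction * mean_depth)
    return passing / len(depths)


# Invented depths for ten target regions (illustration only, not measured data).
hypothetical_depths = [1200, 950, 40, 1500, 1100, 800, 1300, 1000, 900, 1210]

print(uniformity(hypothetical_depths))
print(specificity(786, 1000))
```

With ten target regions, as in the primer pools of this example, a single region falling at or below the threshold lowers the uniformity to nine tenths.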

[0131] It can be seen from Table 1 that the adapter filtration ratio of each sample is around 1%, which, in conjunction with the library quality inspection results shown in FIG. 5, indicates that only a very small amount of primer dimers was formed. The mappabilities are all within a range of 88% to 90% and the specificities are within a range of 77% to 79%, demonstrating good performance. In addition, the depth uniformity of the respective amplicons is good.

Example 2: Library Construction and Sequencing Based on Methylation Multiplex-PCR

[0132] Experimental design: Yanhuang genomic DNA fragmented to 200-300 bp was used, a DNA target methylation library was prepared according to the method provided by the present disclosure, and the library was loaded on an MGISEQ-2000 sequencer for sequencing (sequencing type PE100). Data analysis was then performed, covering data utilization, mappability, amplicon specificity, uniformity and other properties.

[0133] 1. End Repair

[0134] (1) An end repair reaction system was prepared with the DNA fragments obtained in the previous step in a 1.5 mL centrifuge tube according to the following table:

TABLE-US-00007
DNA fragment                                                          30 μL
H₂O                                                                   45 μL
10× Polynucleotide kinase buffer (Enzymatics, catalog No. Y9040L)     10 μL
dNTPs (10 mM each) (Enzymatics, catalog No. N2010L)                    4 μL
T4 DNA polymerase (Enzymatics, catalog No. P7080L)                     5 μL
Klenow fragment (Enzymatics, catalog No. P7060L)                       1 μL
T4 polynucleotide kinase (Enzymatics, catalog No. Y9040L)              5 μL
Total volume                                                         100 μL

[0135] (2) The above reaction system was placed on a Thermomixer (Eppendorf) at 20° C. and reacted for 30 min. After the reaction, purification was performed with AMPure magnetic beads, and the purified product was finally dissolved in 34 μL of elution buffer. All of the above reagents were purchased from Enzymatics.
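When a reaction system such as the end repair mixture above is transcribed into a worksheet or script, a quick consistency check of the component volumes against the stated total can catch copying errors. The following is a minimal sketch; the dictionary keys are shortened labels and the helper name is hypothetical.

```python
# Component volumes (uL) of the end repair reaction, as listed in the table above.
END_REPAIR_UL = {
    "DNA fragment": 30,
    "H2O": 45,
    "10x polynucleotide kinase buffer": 10,
    "dNTPs": 4,
    "T4 DNA polymerase": 5,
    "Klenow fragment": 1,
    "T4 polynucleotide kinase": 5,
}
STATED_TOTAL_UL = 100


def volumes_match(components, stated_total):
    """True when the listed component volumes add up to the stated total."""
    return sum(components.values()) == stated_total


print(volumes_match(END_REPAIR_UL, STATED_TOTAL_UL))
```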

[0136] 2. A-Tailing (Addition of Base A)

[0137] (1) A reaction system for adding base A was prepared in a 1.5 mL centrifuge tube from the DNA obtained in the previous step according to the following table.

TABLE-US-00008
DNA                                                                32 μL
10× Klenow buffer (Enzymatics, catalog No. P7010-HC-L)              5 μL
dATP (diluted to 1 mM, Enzymatics, catalog No. N2010-A-L)          10 μL
Klenow (3′-5′ exo-, Enzymatics, catalog No. P7010-HC-L)             3 μL
Total volume                                                       50 μL

[0138] (2) The above reaction system was placed on a Thermomixer (Eppendorf) at 37° C. and reacted for 30 min. After the reaction was complete, purification was performed with AMPure magnetic beads, and the purified product was finally dissolved in 20 μl of elution buffer.

[0139] 2. Ligation of Methylated Adapters:

[0140] (1) A ligation reaction system for the methylated adapters (also referred to as “methylated tag adapter”) was prepared with the DNA obtained in the previous step according to the following table:

TABLE-US-00009
DNA                                                18 μL
2× Rapid ligation buffer                           25 μL
Methylated tag adapter*                             4 μL
T4 DNA ligase (Rapid, Enzymatics, L603-HC-L)        3 μL
Total volume                                       50 μL

[0141] The sequences of the methylated adapters* are as below:

TABLE-US-00010
Adapter 1 (SEQ ID NO: 25): 5′/5Phos/AGTCGGAGGCCAAGCGGT
Adapter 2 (SEQ ID NO: 26): 5′ACATGGCTACGATCCGACTddT

[0142] Each cytosine in the sequence of adapter 1 was methylated for protection, the cytosines in adapter 2 were either methylated for protection or left unmethylated, and the last base at the 3′-end of adapter 2 carried a blocking modification (i.e., a dideoxy modification) to prevent ligation to the template.

[0143] (2) The above reaction system was placed on a Thermomixer (Eppendorf) at 20° C., and reacted for 15 minutes to obtain a ligated product. After the reaction, purification was performed with AMPure magnetic beads, and the purified product was finally dissolved in 22 μl of elution buffer.
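The purpose of methylating the cytosines of adapter 1 can be illustrated with a simple in-silico model of bisulfite conversion, in which every unmethylated cytosine is read as thymine after conversion and amplification while methylated cytosines are retained. This is a sketch under simplifying assumptions (a single strand, complete conversion); the function name and the position-set argument are hypothetical.

```python
ADAPTER_1 = "AGTCGGAGGCCAAGCGGT"  # SEQ ID NO: 25


def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Read every unmethylated C as T; C at a methylated 0-based position is kept."""
    return "".join(
        "T" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(seq)
    )


# Adapter 1 with every cytosine methylated, as described for the methylated adapter.
all_c_positions = {i for i, base in enumerate(ADAPTER_1) if base == "C"}
protected = bisulfite_convert(ADAPTER_1, all_c_positions)

# The same sequence without any methylation, for comparison.
unprotected = bisulfite_convert(ADAPTER_1)

print(protected)
print(unprotected)
```

In this model the fully methylated adapter keeps its sequence through conversion, so a universal primer designed against the original adapter sequence can still anneal, whereas the unmethylated version would no longer match.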

[0144] 3. Bisulfite Treatment

[0145] The above-mentioned ligated DNA was treated with bisulfite using an EZ DNA Methylation-Gold Kit™ (ZYMO).

[0146] (1) Reagent preparation:

[0147] Preparation of CT conversion reagent solution: the CT conversion reagent (solid mixture) was taken out of the kit, and 900 μL of water, 300 μL of M-dilution buffer, and 50 μL of M-dissolving buffer were added. The mixture was dissolved at room temperature by vortexing or shaking on a shaker for 10 minutes.

[0148] Preparation of M-washing buffer: 24 mL of 100% ethanol was added to the M-washing buffer for use.

[0149] (2) 130 μL of the CT conversion reagent solution and the above-mentioned ligated DNA were added to a PCR tube, followed by flicking or pipetting to suspend the mixed sample.

[0150] Then, the sample tube was placed on a PCR machine to operate according to the steps: 5 minutes at 98° C., and 2.5 hours at 64° C.

[0151] After the above operations were finished, the next operation was carried out immediately, or the sample was stored at 4° C. (up to 20 hours) for later use.

[0152] (3) A Zymo-Spin IC™ Column was placed into a collection tube, and 600 μL of M-binding buffer was added;

[0153] Then, the above bisulfite-treated sample was added to the Zymo-Spin IC™ Column containing the M-binding buffer, followed by closing the lid and mixing upside down;

[0154] centrifuging at a full speed (>10,000×g) for 30 seconds and discarding the collection solution in the collection tube;

[0155] adding 100 μL of the M-washing buffer to the column, centrifuging at a full speed (>10,000×g) for 30 seconds, and discarding the liquid in the collection tube;

[0156] adding 200 μL of M-Desulphonation Buffer to the column, leaving it at room temperature for 15 min, centrifuging at a full speed (>10,000×g) for 30 seconds, and discarding the liquid in the collection tube;

[0157] adding 200 μL of the M-washing buffer to the column, centrifuging at a full speed (>10,000×g) for 30 seconds, discarding the liquid in the collection tube, and repeating this step one more time;

[0158] placing the Zymo-Spin IC™ Column in a new 1.5 mL EP tube, adding 18 μL of M-elution buffer to the column matrix, leaving it at room temperature for 2 min, and centrifuging at a full speed (>10,000×g) to elute the target fragmented DNA.

[0159] 4. First Round of PCR

[0160] (1) A PCR system was prepared in a PCR tube according to the following reaction system. The primers contained in the first specific primer pool are shown in Table 3 below, and the first universal primer is shown in Table 5 below.

TABLE-US-00011
Treated DNA from the previous step                 20 μL
2× KAPA2G Fast ReadyMix                            25 μL
First specific primer pool (10 μM)                2.5 μL
First universal primer (10 μM)                    2.5 μL
Total volume                                       50 μL

[0161] (2) Reaction conditions for PCR

TABLE-US-00012
94° C.   1 min
94° C.   30 s    }
58° C.   2 min   } 15 cycles
72° C.   30 s    }
72° C.   5 min
12° C.   maintained

[0162] After the reaction was finished, purification was performed with 1.5×AMPure magnetic beads, and finally the purified product was dissolved in 22 μl of elution buffer.

[0163] 5. Second Round of PCR

[0164] (1) A PCR system was prepared in the PCR tube according to the following reaction system. The primers contained in the Nested primer pool are shown in Table 4 below, and the second universal primer and the tagged primer are shown in Table 5 below.

TABLE-US-00013
Treated DNA from the previous step               17.5 μL
2× KAPA2G Fast ReadyMix                            25 μL
Second specific primer pool (10 μM)               2.5 μL
Second universal primer (10 μM)                   2.5 μL
Tagged primer (10 μM)                             2.5 μL
Total volume                                       50 μL

[0165] (2) Reaction conditions for PCR

TABLE-US-00014
94° C.   1 min
94° C.   30 s    }
58° C.   2 min   } 20 cycles
72° C.   30 s    }
72° C.   5 min
12° C.   maintained

[0166] After the reaction was finished, purification was performed with 1.0×AMPure magnetic beads, and finally the purified product was dissolved in 22 μl of elution buffer.

[0167] 6. Library Detection:

[0168] A Bioanalyzer analysis system (Agilent, Santa Clara, USA) was used to detect the size and content of the inserts in the library, and the results are shown in FIG. 7.

[0169] 7. On-Machine Sequencing

[0170] High-throughput sequencing was performed on the obtained library using the sequencing platform MGISEQ-2000 (MGI, sequencing type PE100). After alignment of the sequencing data, the respective basic parameters were statistically analyzed, including off-machine data, available data, mappability, and specificity. The results are shown in Table 2. The sequencing depth of each amplicon is shown in FIG. 8.

TABLE-US-00015
TABLE 2 Sequencing results
No.        Raw data   Adapter filtration rate   Mappability   Specificity   Uniformity
Sample 1   112792     0.8%                      84.3%         89.3%         100%
Sample 2   131590     1.1%                      85.6%         90.8%         100%
Sample 3   120311     0.9%                      86.1%         90.7%         100%

[0171] In Table 2, Sample 1 to Sample 3 represent three replicates of the same sample; the mappability refers to the ratio of reads mapped to the genome; the specificity refers to the ratio of reads of the target regions to the total reads of the whole sequencing; and the uniformity refers to the ratio of the number of target regions whose depth is greater than 0.1 times the average depth of the target regions to the total number of the target regions.

[0172] It can be seen from the results in Table 2, FIG. 7 and FIG. 8 that, using the amplification method provided by the present disclosure, the adapter filtration rate is around 1%, with few primer dimers; the mappability is in a range of 84% to 86% and the specificity is in a range of 89% to 91%, demonstrating good performance and uniform coverage depth between the amplicons.

TABLE-US-00016
TABLE 3 First specific primer pool
Target CpG site   No.                        Sequence
cg21646186        First specific primer 01   GGAGGYSTAGYGATTTTAG (SEQ ID NO: 1)
cg19426625        First specific primer 02   GGGAGAATTTTGAAAATGAAATATATTTTT (SEQ ID NO: 2)
cg00960700        First specific primer 03   TTTTYGTTTTTYGTTTTYGTTTTT (SEQ ID NO: 3)
cg06310157        First specific primer 04   TTTTTGAATTYGAGGTATYGGTT (SEQ ID NO: 4)
cg15025536        First specific primer 05   TTTTAATTTAGAATTTATTATTATTTGAAGTTTTA (SEQ ID NO: 5)
cg12743416        First specific primer 06   ATTTGGATYGTATTTTTAAGATATTTAATTATTAA (SEQ ID NO: 6)
cg07382129        First specific primer 07   TGTGTTTYTATAAAGGTTAGGAGTTT (SEQ ID NO: 7)
cg24084681        First specific primer 08   GGGTGGTTGATTTATGTAYGG (SEQ ID NO: 8)
cg06837426        First specific primer 09   AGATTGTGYGGTAGTAAGTTTTT (SEQ ID NO: 9)
cg00648301        First specific primer 10   GTTTGTTTGYGYGTTTG (SEQ ID NO: 10)

[0173] The first specific primer pool was an equimolar mixture of the above-mentioned primers, and the Y base is a C/T degenerate base.
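Because Y stands for C or T, each primer containing Y is in practice a mixture of several concrete sequences. The sketch below enumerates them for two primers of Table 3; it handles only the Y symbol (any other symbol is passed through unchanged), and the function name is hypothetical.

```python
from itertools import product


def expand_y(primer):
    """List every concrete sequence encoded by a primer whose Y means C or T."""
    choices = [("C", "T") if base == "Y" else (base,) for base in primer]
    return ["".join(combo) for combo in product(*choices)]


# First specific primer 08 (SEQ ID NO: 8) contains one Y.
variants_08 = expand_y("GGGTGGTTGATTTATGTAYGG")

# First specific primer 03 (SEQ ID NO: 3) contains three Y positions.
variants_03 = expand_y("TTTTYGTTTTTYGTTTTYGTTTTT")

print(variants_08)
print(len(variants_03))
```

A primer with k degenerate Y positions therefore corresponds to 2^k sequences. In the primers of Table 3 each Y is followed by G, which is consistent with Y marking a CpG cytosine that remains C when methylated and reads as T when unmethylated after bisulfite conversion.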

TABLE-US-00017
TABLE 4 Nested primer pool
Target CpG site   No.                         Sequence
cg21646186        Second specific primer 01   ACATGGCTACGATCCGACTTGGAGTTTYGGGGYGYGTG (SEQ ID NO: 11)
cg19426625        Second specific primer 02   ACATGGCTACGATCCGACTTTTTTTGATATTGAAAATGTAATTGGTTTTT (SEQ ID NO: 12)
cg00960700        Second specific primer 03   ACATGGCTACGATCCGACTTGGTYTYGGTTGGYGTTTT (SEQ ID NO: 13)
cg06310157        Second specific primer 04   ACATGGCTACGATCCGACTTGGAGTATTTTATTTTTGTTGTTTATTATTATTTTT (SEQ ID NO: 14)
cg15025536        Second specific primer 05   ACATGGCTACGATCCGACTTGTTGAAGTGAGAATGTGATTATTAATTTTT (SEQ ID NO: 15)
cg12743416        Second specific primer 06   ACATGGCTACGATCCGACTTGTGTGTGTGTGTGTATTTATATATTTATATAAAA (SEQ ID NO: 16)
cg07382129        Second specific primer 07   ACATGGCTACGATCCGACTTTTAGAATTGAGATTAGAGAGGTAAGTAATG (SEQ ID NO: 17)
cg24084681        Second specific primer 08   ACATGGCTACGATCCGACTTGTTAAGTTGAAAAGTTGAATTTGTTTTT (SEQ ID NO: 18)
cg06837426        Second specific primer 09   ACATGGCTACGATCCGACTTYGGGTTGTTTTTGTATTTATTGTTG (SEQ ID NO: 19)
cg00648301        Second specific primer 10   ACATGGCTACGATCCGACTTGTATTTYGGTAATTTYGAGGTTG (SEQ ID NO: 20)

[0174] The second specific primer pool is an equimolar mixture of the above-mentioned primers, and the Y base is a C/T degenerate base.

TABLE-US-00018
TABLE 5 Universal primers
No.                       Sequence
First universal primer    CGCTTGGCCTCCGACTT (SEQ ID NO: 21)
Second universal primer   /5Phos/#GAACGACATGGCTACGATCCGACTT (SEQ ID NO: 22)
Tagged primer             TGTGAGCCAAGGAGTTNNNNNNNNNNTTGTCTTCCTAAGACCGCTTGGCCTCCGACTT (SEQ ID NO: 23)

[0175] In the above table, the run of N bases represents the barcode sequence used on the MGI sequencing platform.
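The overlap relationships among the primers and adapters listed in this example can be checked directly on the sequences. The sketch below is illustrative only: the 5′ modifications of the second universal primer are omitted, second specific primer 01 is written as a single string joined from its wrapped table cell, and the helper names are hypothetical.

```python
FIRST_UNIVERSAL = "CGCTTGGCCTCCGACTT"           # SEQ ID NO: 21
SECOND_UNIVERSAL = "GAACGACATGGCTACGATCCGACTT"  # SEQ ID NO: 22, modifications omitted
TAGGED = "TGTGAGCCAAGGAGTTNNNNNNNNNNTTGTCTTCCTAAGACCGCTTGGCCTCCGACTT"  # SEQ ID NO: 23
NESTED_01 = "ACATGGCTACGATCCGACTTGGAGTTTYGGGGYGYGTG"  # SEQ ID NO: 11, joined from table
ADAPTER_1 = "AGTCGGAGGCCAAGCGGT"                # SEQ ID NO: 25


def reverse_complement(seq):
    """Reverse complement of a plain A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def suffix_prefix_overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


# 3' end of the tagged primer versus the first universal primer.
tagged_vs_first = suffix_prefix_overlap(TAGGED, FIRST_UNIVERSAL)

# 3' end of the second universal primer versus the 5' tail of a nested primer.
second_vs_nested = suffix_prefix_overlap(SECOND_UNIVERSAL, NESTED_01)

# First universal primer versus the reverse complement of adapter 1.
adapter_1_rc = reverse_complement(ADAPTER_1)

print(tagged_vs_first, second_vs_nested, adapter_1_rc)
```

On these sequences, the 3′ end of the tagged primer carries the complete first universal primer (17 nt), the second universal primer shares a 20-nt overlap with the common 5′ tail of the nested primers, and the first 16 nt of the first universal primer match the 3′ end of the reverse complement of adapter 1.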

[0176] In the description of the present disclosure, the terms “first”, “second”, etc. are only used for descriptive purposes, and cannot be construed as indicating or implying relative importance or implicitly indicating the number of indicated technical features. Therefore, the features defined with “first” and “second” may explicitly or implicitly include at least one of the features. In the description of the present disclosure, “plurality” means at least two, such as two, three, etc., unless otherwise specifically defined.

[0177] In the present disclosure, unless otherwise clearly specified and limited, the terms “connected”, “fixed” and other similar terms should be understood in a broad sense. For example, a connection can be a fixed connection, a detachable connection, or an integral formation; a mechanical connection, an electrical connection, or mutual communication; a direct connection, or an indirect connection through an intermediate medium; or an internal communication between two components or a mutual interaction between two components, unless otherwise specified. For those skilled in the art, the specific meaning of the above-mentioned terms in the present disclosure can be understood according to specific circumstances.

[0178] In the description of this specification, descriptions with reference to the terms “one embodiment”, “some embodiments”, “examples”, “specific examples”, or “some examples” etc. mean that the specific features, structures, materials or characteristics described in conjunction with the embodiment or example are included in at least one embodiment or example of the present disclosure. In this specification, the schematic representations of the above-mentioned terms are not necessarily directed to the same embodiment or example. Moreover, the described specific features, structures, materials or characteristics can be combined in any one or more embodiments or examples in a suitable manner. In addition, those skilled in the art can combine different embodiments or examples and the features of the different embodiments or examples described in this specification, provided that they do not contradict each other.

[0179] Although the embodiments of the present disclosure have been illustrated and described above, it can be understood that the above-mentioned embodiments are exemplary and should not be construed as limiting the present disclosure. Those skilled in the art can make changes, modifications, replacements and variations to the above-mentioned embodiments within the scope of the present disclosure.
