Methods for multiplex PCR
11203781 · 2021-12-21
Assignee
Inventors
Cpc classification
C12Q2525/186
CHEMISTRY; METALLURGY
C12N15/66
CHEMISTRY; METALLURGY
C12Q1/6874
CHEMISTRY; METALLURGY
International classification
C12P19/34
CHEMISTRY; METALLURGY
C12N15/66
CHEMISTRY; METALLURGY
Abstract
Methods for performing multiplex PCR-based enrichment of a target substrate are provided. Systems and methods for generating a sequencing library are also provided.
Claims
1. A method of multiplex PCR amplification of a specific target locus on a nucleic acid substrate for preparing a targeted next generation sequencing library comprising the steps of: (i) combining a plurality of target-specific primers with the nucleic acid substrate to yield a single polymerase chain reaction (PCR) reaction mixture, wherein the plurality of target-specific primers comprise a first forward primer, a second forward primer, a first reverse primer and a second reverse primer, wherein each of the first and second forward and reverse primers comprise a 3′ complementary sequence that is fully complementary to a sequence of the specific target locus and a 5′ noncomplementary sequence that is not complementary to a sequence of the nucleic acid substrate, wherein the 3′ complementary sequence for each of the first and second forward and reverse primers is different, wherein the 3′ complementary sequence is between 10 and 40 bases in length, wherein the nucleic acid substrate is human genomic DNA, and wherein the specific target locus is a gene known to have clinical relevance in oncology; (ii) subjecting the PCR reaction mixture to a multiplex polymerase chain reaction thereby generating at least three amplicons within the specific target locus, wherein the at least three amplicons comprise a first amplicon produced by the first forward primer and the first reverse primer, a second amplicon produced by the second forward primer and the second reverse primer, and a third amplicon produced by the second forward primer and the first reverse primer, wherein the third amplicon is shorter in length than the first and second amplicons, wherein at least a portion of the 5′ noncomplementary sequence of the second forward primer and the first reverse primer is the same such that each strand of the third amplicon comprises a 3′ sequence and a 5′ sequence that are complementary to each other, wherein the third amplicon possesses overlapping sequence with the first and second amplicons, wherein the first amplicon possesses overlapping sequence with the second amplicon, wherein when the third amplicon is denatured, each strand of the third amplicon forms a secondary structure as a result of the 3′ sequence being complementary to the 5′ sequence, wherein the secondary structure is stable during a primer annealing step of the multiplex polymerase chain reaction.
2. The method of claim 1, wherein the 3′ complementary sequence is between 16 and 30 bases in length.
3. The method of claim 1, wherein the gene known to have clinical relevance in oncology is selected from the group consisting of ABI1, ABL1, ABL2, ACSL3, AF15Q14, AF1Q, AF3p21, AF5q31, AKAP9, AKT1, AKT2, ALDH2, ALK, ALO17, AMER1, APC, ARHGEF12, ARHH, ARID1A, ARID2, ARNT, ASPSCR1, ASXL1, ATF1, ATIC, ATM, ATP1A1, ATP2B3, ATRX, AXIN1, BAP1, BCL10, BCL11A, BCL11B, BCL2, BCL3, BCL5, BCL6, BCL7A, BCL9, BCOR, BCR, BHD, BIRC3, BLM, BMPR1A, BRAF, BRCA1, BRCA2, BRD3, BRD4, BRIP1, BTG1, BUB1B, C12orf9, C15orf21, C15orf55, C16orf75, C2orf44, CACNA1D, CALR, CAMTA1, CANT1, CARD11, CARS, CASP8, CBFA2T1, CBFA2T3, CBFB, CBL, CBLB, CBLC, CCDC6, CCNB1IP1, CCND1, CCND2, CCND3, CCNE1, CD273, CD274, CD74, CD79A, CD79B, CDC73, CDH1, CDH11, CDK12, CDK4, CDK6, CDKN2A, CDKN2C, CDKN2a(p14), CDX2, CEBPA, CEP1, CEP89, CHCHD7, CHEK2, CHIC2, CHN1, CIC, CIITA, CLIP1, CLTC, CLTCL1, CMKOR1, CNOT3, COL1A1, COL2A1, COPEB, COX6C, CREB1, CREB3L1, CREB3L2, CREBBP, CRLF2, CRTC3, CSF3R, CTNNB1, CUX1, CYLD, D10S170, DAXX, DCTN1, DDB2, DDIT3, DDX10, DDX5, DDX6, DEK, DICER1, DNM2, DNMT3A, DUX4, EBF1, ECT2L, EGFR, EIF3E, EIF4A2, ELF4, ELK4, ELKS, ELL, ELN, EML4, EP300, EPS15, ERBB2, ERC1, ERCC2, ERCC3, ERCC4, ERCC5, ERG, ETV1, ETV4, ETV5, ETV6, EVI1, EWSR1, EXT1, EXT2, EZH2, EZR, FACL6, FAM46C, FANCA, FANCC, FANCD2, FANCE, FANCF, FANCG, FAS, FBXO11, FBXW7, FCGR2B, FEV, FGFR1, FGFR1OP, FGFR2, FGFR3, FH, FHIT, FIP1L1, FLI1, FLJ27352, FLT3, FNBP1, FOXA1, FOXL2, FOXO1A, FOXO3A, FOXO4, FOXP1, FSTL3, FUBP1, FUS, FVT1, GAS7, GATA1, GATA2, GATA3, GMPS, GNA11, GNAQ, GNAS, GOLGA5, GOPC, GPC3, GPHN, GRAF, H3F3A, H3F3B, HCMOGT-1, HEAB, HERPUD1, HEY1, HIP1, HIST1H3B, HIST1H4I, HLA-A, HLF, HLXB9, HMGA1, HMGA2, HNRNPA2B1, HOOK3, HOXA11, HOXA13, HOXA9, HOXC11, HOXC13, HOXD11, HOXD13, HRAS, HSPCA, HSPCB, IDH1, IDH2, IGH, IGK, IGL, IKZF1, IL2, IL21R, IL6ST, IL7R, IRF4, IRTA1, ITK, JAK1, JAK2, JAK3, JAZF1, JUN, KCNJ5, KDM5A, KDM5C, KDM6A, KDR, KIAA1549, KIAA1598, KIF5B, KIT, KLF4, KLK2, KMT2D, KRAS, KTN1, LAF4, LASP1, LCK, LCP1, LCX, LHFP, LIFR, LMNA, LMO1, LMO2, LPP, LRIG3, LSM14A, LYL1, MAF, MAFB, MALAT1, MALT1, MAML2, MAP2K1, MAP2K2, MAP2K4, MAX, MDM2, MDM4, MDS1, MDS2, MECT1, MED12, MEN1, MET, MITF, MKL1, MLF1, MLH1, MLL, MLL3, MLLT1, MLLT10, MLLT2, MLLT3, MLLT4, MLLT6, MLLT7, MN1, MPL, MSF, MSH2, MSH6, MSI2, MSN, MTCP1, MUC1, MUTYH, MYB, MYC, MYCL1, MYCN, MYD88, MYH11, MYH9, MYO5A, MYST4, NAB2, NACA, NBS1, NCOA1, NCOA2, NCOA4, NDRG1, NF1, NF2, NFATC2, NFE2L2, NFIB, NFKB2, NIN, NKX2-1, NONO, NOTCH1, NOTCH2, NPM1, NR4A3, NRAS, NRG1, NSD1, NT5C2, NTRK1, NTRK3, NUMA1, NUP214, NUP98, NUTM2A, NUTM2B, OLIG2, OMD, P2RY8, PAFAH1B2, PALB2, PAX3, PAX5, PAX7, PAX8, PBRM1, PBX1, PCM1, PCSK7, PDE4DIP, PDGFB, PDGFRA, PDGFRB, PER1, PHF6, PHOX2B, PICALM, PIK3CA, PIK3R1, PIM1, PLAG1, PLCG1, PML, PMS1, PMS2, PMX1, PNUTL1, POT1, POU2AF1, POU5F1, PPARG, PPFIBP1, PPP2R1A, PRCC, PRDM1, PRDM16, PRF1, PRKAR1A, PSIP1, PTCH1, PTEN, PTPN11, PTPRB, PTPRC, PTPRK, PWWP2A, RAB5EP, RAC1, RAD21, RAD51L1, RAF1, RALGDS, RANBP17, RAP1GDS1, RARA, RB1, RBM15, RECQL4, REL, RET, RNF43, ROS1, RPL10, RPL22, RPL5, RPN1, RSPO2, RSPO3, RUNDC2A, RUNX1, RUNXBP2, SBDS, SDC4, SDH5, SDHB, SDHC, SDHD, SEPT6, SET, SETBP1, SETD2, SF3B1, SFPQ, SFRS3, SH2B3, SH3GL1, SIL, SLC34A2, SLC45A3, SMAD4, SMARCA4, SMARCB1, SMARCE1, SMO, SOCS1, SOX2, SRGAP3, SRSF2, SS18, SS18L1, SSX1, SSX2, SSX4, STAG2, STAT3, STAT5B, STAT6, STK11, STL, SUFU, SUZ12, SYK, TAF15, TAL1, TAL2, TBL1XR1, TCEA1, TCF1, TCF12, TCF3, TCF7L2, TCL1A, TCL6, TERT, TET2, TFE3, TFEB, TFG, TFPT, TFRC, THRAP3, TIF1, TLX1, TLX3, TMPRSS2, TNFAIP3, TNFRSF14, TNFRSF17, TOP1, TP53, TPM3, TPM4, TPR, TRA, TRAF7, TRB, TRD, TRIM27, TRIM33, TRIP11, TRRAP, TSC1, TSC2, TSHR, TTL, U2AF1, UBR5, USP6, VHL, VTI1A, WAS, WHSC1, WHSC1L1, WIF1, WRN, WT1, WWTR1, XPA, XPC, XPO1, YWHAE, ZCCHC8, ZNF145, ZNF198, ZNF278, ZNF331, ZNF384, ZNF521, ZNF9 and ZRSR2.
4. The method of claim 1, wherein the gene known to have clinical relevance in oncology is BRCA1 or BRCA2.
5. The method of claim 1, wherein the gene known to have clinical relevance in oncology is TP53.
6. The method of claim 1, further comprising (iii) incubating a sample comprising the first and second amplicons with a 3′ adaptor, a 5′ adaptor and a ligase under conditions sufficient to permit ligation of the 3′ adaptor to a 3′ end of the first and second amplicons and 5′ adaptor to a 5′ end of the first and second amplicons thereby yielding the targeted next generation sequencing library.
7. The method of claim 6, further comprising (iv) sequencing the targeted next generation sequencing library.
8. The method of claim 1, wherein the melting temperature of the target-specific primers is from about 60.5° C. to about 65.5° C.
9. A method of multiplex PCR amplification of a specific target locus on a nucleic acid substrate for preparing a targeted next generation sequencing library comprising the steps of: (i) combining a plurality of target-specific primers with the nucleic acid substrate to yield a single polymerase chain reaction (PCR) reaction mixture, wherein the plurality of target-specific primers comprise a first forward primer, a second forward primer, a first reverse primer and a second reverse primer, wherein each of the first and second forward and reverse primers comprise a 3′ complementary sequence that is fully complementary to a sequence of the specific target locus and a 5′ noncomplementary sequence that is not complementary to a sequence of the nucleic acid substrate, wherein the 3′ complementary sequence for each of the first and second forward and reverse primers is different, wherein the 3′ complementary sequence is between 16 and 30 bases in length, and wherein the specific target locus is a gene known to have clinical relevance in oncology; (ii) subjecting the PCR reaction mixture to a multiplex polymerase chain reaction thereby generating at least three amplicons within the specific target locus, wherein the at least three amplicons comprise a first amplicon produced by the first forward primer and the first reverse primer, a second amplicon produced by the second forward primer and the second reverse primer, and a third amplicon produced by the second forward primer and the first reverse primer, wherein the third amplicon is shorter in length than the first and second amplicons, wherein at least a portion of the 5′ noncomplementary sequence of the second forward primer and the first reverse primer is the same such that each strand of the third amplicon comprises a 3′ sequence and a 5′ sequence that are complementary to each other, wherein the third amplicon possesses overlapping sequence with the first and second amplicons, wherein the first amplicon possesses overlapping sequence with the second amplicon, wherein when the third amplicon is denatured, each strand of the third amplicon forms a secondary structure as a result of the 3′ sequence being complementary to the 5′ sequence, wherein at the end of the multiplex polymerase chain reaction, the first and second amplicons are each present at a greater amount than the third amplicon, and wherein the nucleic acid substrate is human genomic DNA.
10. The method of claim 9, wherein the gene known to have clinical relevance in oncology is selected from the group consisting of ABI1, ABL1, ABL2, ACSL3, AF15Q14, AF1Q, AF3p21, AF5q31, AKAP9, AKT1, AKT2, ALDH2, ALK, ALO17, AMER1, APC, ARHGEF12, ARHH, ARID1A, ARID2, ARNT, ASPSCR1, ASXL1, ATF1, ATIC, ATM, ATP1A1, ATP2B3, ATRX, AXIN1, BAP1, BCL10, BCL11A, BCL11B, BCL2, BCL3, BCL5, BCL6, BCL7A, BCL9, BCOR, BCR, BHD, BIRC3, BLM, BMPR1A, BRAF, BRCA1, BRCA2, BRD3, BRD4, BRIP1, BTG1, BUB1B, C12orf9, C15orf21, C15orf55, C16orf75, C2orf44, CACNA1D, CALR, CAMTA1, CANT1, CARD11, CARS, CASP8, CBFA2T1, CBFA2T3, CBFB, CBL, CBLB, CBLC, CCDC6, CCNB1IP1, CCND1, CCND2, CCND3, CCNE1, CD273, CD274, CD74, CD79A, CD79B, CDC73, CDH1, CDH11, CDK12, CDK4, CDK6, CDKN2A, CDKN2C, CDKN2a(p14), CDX2, CEBPA, CEP1, CEP89, CHCHD7, CHEK2, CHIC2, CHN1, CIC, CIITA, CLIP1, CLTC, CLTCL1, CMKOR1, CNOT3, COL1A1, COL2A1, COPEB, COX6C, CREB1, CREB3L1, CREB3L2, CREBBP, CRLF2, CRTC3, CSF3R, CTNNB1, CUX1, CYLD, D10S170, DAXX, DCTN1, DDB2, DDIT3, DDX10, DDX5, DDX6, DEK, DICER1, DNM2, DNMT3A, DUX4, EBF1, ECT2L, EGFR, EIF3E, EIF4A2, ELF4, ELK4, ELKS, ELL, ELN, EML4, EP300, EPS15, ERBB2, ERC1, ERCC2, ERCC3, ERCC4, ERCC5, ERG, ETV1, ETV4, ETV5, ETV6, EVI1, EWSR1, EXT1, EXT2, EZH2, EZR, FACL6, FAM46C, FANCA, FANCC, FANCD2, FANCE, FANCF, FANCG, FAS, FBXO11, FBXW7, FCGR2B, FEV, FGFR1, FGFR1OP, FGFR2, FGFR3, FH, FHIT, FIP1L1, FLI1, FLJ27352, FLT3, FNBP1, FOXA1, FOXL2, FOXO1A, FOXO3A, FOXO4, FOXP1, FSTL3, FUBP1, FUS, FVT1, GAS7, GATA1, GATA2, GATA3, GMPS, GNA11, GNAQ, GNAS, GOLGA5, GOPC, GPC3, GPHN, GRAF, H3F3A, H3F3B, HCMOGT-1, HEAB, HERPUD1, HEY1, HIP1, HIST1H3B, HIST1H4I, HLA-A, HLF, HLXB9, HMGA1, HMGA2, HNRNPA2B1, HOOK3, HOXA11, HOXA13, HOXA9, HOXC11, HOXC13, HOXD11, HOXD13, HRAS, HSPCA, HSPCB, IDH1, IDH2, IGH, IGK, IGL, IKZF1, IL2, IL21R, IL6ST, IL7R, IRF4, IRTA1, ITK, JAK1, JAK2, JAK3, JAZF1, JUN, KCNJ5, KDM5A, KDM5C, KDM6A, KDR, KIAA1549, KIAA1598, KIF5B, KIT, KLF4, KLK2, KMT2D, KRAS, KTN1, LAF4, LASP1, LCK, LCP1, LCX, LHFP, LIFR, LMNA, LMO1, LMO2, LPP, LRIG3, LSM14A, LYL1, MAF, MAFB, MALAT1, MALT1, MAML2, MAP2K1, MAP2K2, MAP2K4, MAX, MDM2, MDM4, MDS1, MDS2, MECT1, MED12, MEN1, MET, MITF, MKL1, MLF1, MLH1, MLL, MLL3, MLLT1, MLLT10, MLLT2, MLLT3, MLLT4, MLLT6, MLLT7, MN1, MPL, MSF, MSH2, MSH6, MSI2, MSN, MTCP1, MUC1, MUTYH, MYB, MYC, MYCL1, MYCN, MYD88, MYH11, MYH9, MYO5A, MYST4, NAB2, NACA, NBS1, NCOA1, NCOA2, NCOA4, NDRG1, NF1, NF2, NFATC2, NFE2L2, NFIB, NFKB2, NIN, NKX2-1, NONO, NOTCH1, NOTCH2, NPM1, NR4A3, NRAS, NRG1, NSD1, NT5C2, NTRK1, NTRK3, NUMA1, NUP214, NUP98, NUTM2A, NUTM2B, OLIG2, OMD, P2RY8, PAFAH1B2, PALB2, PAX3, PAX5, PAX7, PAX8, PBRM1, PBX1, PCM1, PCSK7, PDE4DIP, PDGFB, PDGFRA, PDGFRB, PER1, PHF6, PHOX2B, PICALM, PIK3CA, PIK3R1, PIM1, PLAG1, PLCG1, PML, PMS1, PMS2, PMX1, PNUTL1, POT1, POU2AF1, POU5F1, PPARG, PPFIBP1, PPP2R1A, PRCC, PRDM1, PRDM16, PRF1, PRKAR1A, PSIP1, PTCH1, PTEN, PTPN11, PTPRB, PTPRC, PTPRK, PWWP2A, RAB5EP, RAC1, RAD21, RAD51L1, RAF1, RALGDS, RANBP17, RAP1GDS1, RARA, RB1, RBM15, RECQL4, REL, RET, RNF43, ROS1, RPL10, RPL22, RPL5, RPN1, RSPO2, RSPO3, RUNDC2A, RUNX1, RUNXBP2, SBDS, SDC4, SDH5, SDHB, SDHC, SDHD, SEPT6, SET, SETBP1, SETD2, SF3B1, SFPQ, SFRS3, SH2B3, SH3GL1, SIL, SLC34A2, SLC45A3, SMAD4, SMARCA4, SMARCB1, SMARCE1, SMO, SOCS1, SOX2, SRGAP3, SRSF2, SS18, SS18L1, SSX1, SSX2, SSX4, STAG2, STAT3, STAT5B, STAT6, STK11, STL, SUFU, SUZ12, SYK, TAF15, TAL1, TAL2, TBL1XR1, TCEA1, TCF1, TCF12, TCF3, TCF7L2, TCL1A, TCL6, TERT, TET2, TFE3, TFEB, TFG, TFPT, TFRC, THRAP3, TIF1, TLX1, TLX3, TMPRSS2, TNFAIP3, TNFRSF14, TNFRSF17, TOP1, TP53, TPM3, TPM4, TPR, TRA, TRAF7, TRB, TRD, TRIM27, TRIM33, TRIP11, TRRAP, TSC1, TSC2, TSHR, TTL, U2AF1, UBR5, USP6, VHL, VTI1A, WAS, WHSC1, WHSC1L1, WIF1, WRN, WT1, WWTR1, XPA, XPC, XPO1, YWHAE, ZCCHC8, ZNF145, ZNF198, ZNF278, ZNF331, ZNF384, ZNF521, ZNF9 and ZRSR2.
11. The method of claim 9, wherein the melting temperature of the target-specific primers is from about 60.5° C. to about 65.5° C.
12. The method of claim 9, wherein the gene known to have clinical relevance in oncology is BRCA1 or BRCA2.
13. The method of claim 9, wherein the gene known to have clinical relevance in oncology is TP53.
14. A method of multiplex PCR amplification of a specific target locus on a nucleic acid substrate for preparing a targeted next generation sequencing library comprising the steps of: (i) combining a plurality of target-specific primers with the nucleic acid substrate to yield a single polymerase chain reaction (PCR) reaction mixture, wherein the plurality of target-specific primers comprise a first forward primer, a second forward primer, a first reverse primer and a second reverse primer, wherein each of the first and second forward and reverse primers comprise a 3′ complementary sequence that is fully complementary to a sequence of the specific target locus and a 5′ noncomplementary sequence that is not complementary to a sequence of the nucleic acid substrate, wherein the 3′ complementary sequence for each of the first and second forward and reverse primers is different, wherein the 3′ complementary sequence is between 16 and 30 bases in length; (ii) subjecting the PCR reaction mixture to a multiplex polymerase chain reaction thereby generating at least three amplicons within the specific target locus, wherein the at least three amplicons comprise a first amplicon produced by the first forward primer and the first reverse primer, a second amplicon produced by the second forward primer and the second reverse primer, and a third amplicon produced by the second forward primer and the first reverse primer, wherein the third amplicon is shorter in length than the first and second amplicons, wherein at least a portion of the 5′ noncomplementary sequence of the second forward primer and the first reverse primer is the same such that each strand of the third amplicon comprises a 3′ sequence and a 5′ sequence that are complementary to each other, wherein the third amplicon possesses overlapping sequence with the first and second amplicons, wherein the first amplicon possesses overlapping sequence with the second amplicon, wherein when the third amplicon is denatured, each strand of the third amplicon forms a secondary structure as a result of the 3′ sequence being complementary to the 5′ sequence, wherein the secondary structure is stable during a primer annealing step of the multiplex polymerase chain reaction.
15. The method of claim 14, wherein the specific target locus is selected from the group consisting of a gene known to have relevance in oncology, a gene associated with drug resistance, inherited or infectious disease, a bacterial gene, a viral gene, and a fungal gene.
16. The method of claim 14, wherein the melting temperature of the target-specific primers is from about 60.5° C. to about 65.5° C.
17. The method of claim 14, wherein the specific target locus is a gene known to have clinical relevance in oncology.
18. The method of claim 17, wherein the gene known to have clinical relevance in oncology is selected from the group consisting of ABI1, ABL1, ABL2, ACSL3, AF15Q14, AF1Q, AF3p21, AF5q31, AKAP9, AKT1, AKT2, ALDH2, ALK, ALO17, AMER1, APC, ARHGEF12, ARHH, ARID1A, ARID2, ARNT, ASPSCR1, ASXL1, ATF1, ATIC, ATM, ATP1A1, ATP2B3, ATRX, AXIN1, BAP1, BCL10, BCL11A, BCL11B, BCL2, BCL3, BCL5, BCL6, BCL7A, BCL9, BCOR, BCR, BHD, BIRC3, BLM, BMPR1A, BRAF, BRCA1, BRCA2, BRD3, BRD4, BRIP1, BTG1, BUB1B, C12orf9, C15orf21, C15orf55, C16orf75, C2orf44, CACNA1D, CALR, CAMTA1, CANT1, CARD11, CARS, CASP8, CBFA2T1, CBFA2T3, CBFB, CBL, CBLB, CBLC, CCDC6, CCNB1IP1, CCND1, CCND2, CCND3, CCNE1, CD273, CD274, CD74, CD79A, CD79B, CDC73, CDH1, CDH11, CDK12, CDK4, CDK6, CDKN2A, CDKN2C, CDKN2a(p14), CDX2, CEBPA, CEP1, CEP89, CHCHD7, CHEK2, CHIC2, CHN1, CIC, CIITA, CLIP1, CLTC, CLTCL1, CMKOR1, CNOT3, COL1A1, COL2A1, COPEB, COX6C, CREB1, CREB3L1, CREB3L2, CREBBP, CRLF2, CRTC3, CSF3R, CTNNB1, CUX1, CYLD, D10S170, DAXX, DCTN1, DDB2, DDIT3, DDX10, DDX5, DDX6, DEK, DICER1, DNM2, DNMT3A, DUX4, EBF1, ECT2L, EGFR, EIF3E, EIF4A2, ELF4, ELK4, ELKS, ELL, ELN, EML4, EP300, EPS15, ERBB2, ERC1, ERCC2, ERCC3, ERCC4, ERCC5, ERG, ETV1, ETV4, ETV5, ETV6, EVI1, EWSR1, EXT1, EXT2, EZH2, EZR, FACL6, FAM46C, FANCA, FANCC, FANCD2, FANCE, FANCF, FANCG, FAS, FBXO11, FBXW7, FCGR2B, FEV, FGFR1, FGFR1OP, FGFR2, FGFR3, FH, FHIT, FIP1L1, FLI1, FLJ27352, FLT3, FNBP1, FOXA1, FOXL2, FOXO1A, FOXO3A, FOXO4, FOXP1, FSTL3, FUBP1, FUS, FVT1, GAS7, GATA1, GATA2, GATA3, GMPS, GNA11, GNAQ, GNAS, GOLGA5, GOPC, GPC3, GPHN, GRAF, H3F3A, H3F3B, HCMOGT-1, HEAB, HERPUD1, HEY1, HIP1, HIST1H3B, HIST1H4I, HLA-A, HLF, HLXB9, HMGA1, HMGA2, HNRNPA2B1, HOOK3, HOXA11, HOXA13, HOXA9, HOXC11, HOXC13, HOXD11, HOXD13, HRAS, HSPCA, HSPCB, IDH1, IDH2, IGH, IGK, IGL, IKZF1, IL2, IL21R, IL6ST, IL7R, IRF4, IRTA1, ITK, JAK1, JAK2, JAK3, JAZF1, JUN, KCNJ5, KDM5A, KDM5C, KDM6A, KDR, KIAA1549, KIAA1598, KIF5B, KIT, KLF4, KLK2, KMT2D, KRAS, KTN1, LAF4, LASP1, LCK, LCP1, LCX, LHFP, LIFR, LMNA, LMO1, LMO2, LPP, LRIG3, LSM14A, LYL1, MAF, MAFB, MALAT1, MALT1, MAML2, MAP2K1, MAP2K2, MAP2K4, MAX, MDM2, MDM4, MDS1, MDS2, MECT1, MED12, MEN1, MET, MITF, MKL1, MLF1, MLH1, MLL, MLL3, MLLT1, MLLT10, MLLT2, MLLT3, MLLT4, MLLT6, MLLT7, MN1, MPL, MSF, MSH2, MSH6, MSI2, MSN, MTCP1, MUC1, MUTYH, MYB, MYC, MYCL1, MYCN, MYD88, MYH11, MYH9, MYO5A, MYST4, NAB2, NACA, NBS1, NCOA1, NCOA2, NCOA4, NDRG1, NF1, NF2, NFATC2, NFE2L2, NFIB, NFKB2, NIN, NKX2-1, NONO, NOTCH1, NOTCH2, NPM1, NR4A3, NRAS, NRG1, NSD1, NT5C2, NTRK1, NTRK3, NUMA1, NUP214, NUP98, NUTM2A, NUTM2B, OLIG2, OMD, P2RY8, PAFAH1B2, PALB2, PAX3, PAX5, PAX7, PAX8, PBRM1, PBX1, PCM1, PCSK7, PDE4DIP, PDGFB, PDGFRA, PDGFRB, PER1, PHF6, PHOX2B, PICALM, PIK3CA, PIK3R1, PIM1, PLAG1, PLCG1, PML, PMS1, PMS2, PMX1, PNUTL1, POT1, POU2AF1, POU5F1, PPARG, PPFIBP1, PPP2R1A, PRCC, PRDM1, PRDM16, PRF1, PRKAR1A, PSIP1, PTCH1, PTEN, PTPN11, PTPRB, PTPRC, PTPRK, PWWP2A, RAB5EP, RAC1, RAD21, RAD51L1, RAF1, RALGDS, RANBP17, RAP1GDS1, RARA, RB1, RBM15, RECQL4, REL, RET, RNF43, ROS1, RPL10, RPL22, RPL5, RPN1, RSPO2, RSPO3, RUNDC2A, RUNX1, RUNXBP2, SBDS, SDC4, SDH5, SDHB, SDHC, SDHD, SEPT6, SET, SETBP1, SETD2, SF3B1, SFPQ, SFRS3, SH2B3, SH3GL1, SIL, SLC34A2, SLC45A3, SMAD4, SMARCA4, SMARCB1, SMARCE1, SMO, SOCS1, SOX2, SRGAP3, SRSF2, SS18, SS18L1, SSX1, SSX2, SSX4, STAG2, STAT3, STAT5B, STAT6, STK11, STL, SUFU, SUZ12, SYK, TAF15, TAL1, TAL2, TBL1XR1, TCEA1, TCF1, TCF12, TCF3, TCF7L2, TCL1A, TCL6, TERT, TET2, TFE3, TFEB, TFG, TFPT, TFRC, THRAP3, TIF1, TLX1, TLX3, TMPRSS2, TNFAIP3, TNFRSF14, TNFRSF17, TOP1, TP53, TPM3, TPM4, TPR, TRA, TRAF7, TRB, TRD, TRIM27, TRIM33, TRIP11, TRRAP, TSC1, TSC2, TSHR, TTL, U2AF1, UBR5, USP6, VHL, VTI1A, WAS, WHSC1, WHSC1L1, WIF1, WRN, WT1, WWTR1, XPA, XPC, XPO1, YWHAE, ZCCHC8, ZNF145, ZNF198, ZNF278, ZNF331, ZNF384, ZNF521, ZNF9 and ZRSR2.
19. The method of claim 17, wherein the gene known to have clinical relevance in oncology is BRCA1 or BRCA2.
20. The method of claim 17, wherein the gene known to have clinical relevance in oncology is TP53.
Description
BRIEF DESCRIPTION OF THE FIGURES
(1)-(63) [Figure descriptions not reproduced.]
DETAILED DESCRIPTION OF THE INVENTION
(64) In one aspect, the invention describes a highly efficient method of adapter ligation to the ends of fragmented double-stranded DNA molecules. Such DNA molecules are referred to herein as “substrate molecules.” In one aspect, the method comprises a single incubation that includes (1) annealing of a 5′ adapter to a pre-existing 3′ overhang on a substrate molecule, preferably a 3′ adapter, (2) removal of a damaged base from the 5′-termini of the substrate molecules, which enables (3) efficient ligation of the 5′ adapter to the exposed 5′-phosphate of the substrate molecules. In another aspect, the method comprises two incubations, where in the first incubation a 3′ adapter is ligated to the substrate molecule, and in the second incubation the 5′ adapter is ligated to the substrate molecule, as described above (see
(65) In another aspect, the disclosure describes a highly efficient method of multiplex amplicon NGS library preparation. In one aspect, the method allows synthesis and amplification of multiple overlapping amplicons in a single tube. In another aspect, it describes a novel, highly efficient method of adapter ligation to the ends of PCR amplicons that is free of chimeric amplicons and adapter-dimers. In one aspect, it allows incorporation of unique degenerate sequence tags to identify individual amplicons. In another aspect, the method comprises a single incubation that includes degradation of the 5′ termini of the amplicons followed by simultaneous ligation of the second adapter B and linker-mediated ligation of the remainder of the first adapter A to the substrate amplicons. In various embodiments, the disclosure further provides methods that comprise additional steps that occur prior to the ligation step, including: (i) a multiplexed PCR reaction, (ii) a purification step, and (iii) a universal single primer amplification step. Alternatively, additional steps that occur prior to the ligation step include: (i) a combined multiplex PCR reaction with universal single primer amplification, followed by (ii) a purification step. Various options for these steps are contemplated by the disclosure and are discussed in further detail below.
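As an illustration of overlapping amplicons in a single tube, the following minimal Python sketch enumerates the products that two tiled primer pairs could form on one template and checks that a strand of the short cross-product carries mutually complementary 5′ and 3′ ends. It assumes, as recited in the claims, that the primers share a common 5′ noncomplementary tail. The coordinates, the tail sequence `TAIL`, the template and all function names are hypothetical and chosen only for illustration. The sketch does not model the thermodynamics that determine whether the resulting secondary structure is stable at the annealing temperature.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical shared 5' noncomplementary tail carried by every primer.
TAIL = "ACGTCATGGATCCTAGTCAG"

# Hypothetical target coordinates (0-based, top strand).
# Forward primers: position where the amplified target region starts.
# Reverse primers: position where the amplified target region ends (exclusive).
FORWARD = {"F1": 0, "F2": 80}
REVERSE = {"R1": 120, "R2": 200}


def enumerate_amplicons(forward, reverse):
    """List every (forward, reverse, target_length) product, shortest first."""
    amplicons = []
    for (f_name, start), (r_name, end) in product(forward.items(), reverse.items()):
        if end > start:  # the primers must face each other
            amplicons.append((f_name, r_name, end - start))
    return sorted(amplicons, key=lambda a: a[2])


def top_strand(template: str, start: int, end: int, tail: str = TAIL) -> str:
    """Top strand of an amplicon: 5' tail + target insert + reverse complement of the tail."""
    return tail + template[start:end] + reverse_complement(tail)


def ends_self_complementary(strand: str, n: int) -> bool:
    """True if the first n bases can pair with the last n bases of the same strand."""
    return strand[:n] == reverse_complement(strand[-n:])


if __name__ == "__main__":
    template = "ACGTTGCA" * 25  # hypothetical 200-base target region
    for f_name, r_name, length in enumerate_amplicons(FORWARD, REVERSE):
        print(f_name, r_name, length)
    mini = top_strand(template, FORWARD["F2"], REVERSE["R1"])
    print(ends_self_complementary(mini, len(TAIL)))
```

In this toy layout the F2/R1 product is the shortest. Because every product carries the same tail, all strands have complementary termini; the assumption behind the sketch is that the short product, whose ends are closest together, is the one most affected.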
(66) The term “reaction conditions” or “standard reaction conditions” as used herein means conditions according to manufacturer's instructions. It is understood that all enzymes herein disclosed are used under standard reaction conditions, unless indicated otherwise. The term “first polynucleotide” as used herein is used interchangeably with “3′ adapter,” “first adapter,” or “Adapter A” and the term “second polynucleotide” as used herein is used interchangeably with “5′ adapter,” “second adapter” or “Adapter B.” In certain instances, when Adapter A is used in reference to IonTorrent™ technology, e.g.,
(67) A “3′ adapter” as used herein ligates to a 3′ end of a substrate molecule, and a “5′ adapter” ligates to a 5′ end of a substrate molecule.
(68) As used herein, a “damaged” 5′ terminus is one that lacks a 5′ phosphate.
(69) As used herein, a “processed” substrate molecule is one to which a 5′ adapter has been attached.
(70) As used herein, a “high fidelity polymerase” is one that possesses 3′-5′ exonuclease (i.e., proofreading) activity.
(71) The term “tolerant,” as used herein, refers to a property of a polymerase that can extend through a template containing a cleavable base (e.g., uracil, inosine, and RNA).
(72) As used herein, the term “asymmetric” refers to a double stranded molecule with both adapters at both termini instead of a single adapter at both termini. Thus, the asymmetry arises from the fact that both adapters are largely non-complementary to each other and have single stranded portions.
(73) As used herein, a “universal primer” is an oligonucleotide used in an amplification reaction to incorporate a universal adapter sequence. A “universal adapter” as used herein is a portion of the amplification product that corresponds to the universal primer sequence and its reverse complement.
(74) It will be understood that a modification that decreases the binding stability of two nucleic acids includes, but is not limited to, a nucleotide mismatch, a deoxyinosine, an inosine or a universal base.
(75) It will also be understood that a modification that increases the binding stability of two nucleic acids includes, but is not limited to, a locked nucleic acid (LNA), spermine, spermidine or other polyamines, and cytosine methylation.
(76) As used herein, a “universal base” is one that can base pair with all four naturally occurring bases without hydrogen bonding and is less destabilizing than a mismatch, and includes but is not limited to 5-nitroindole.
(77) A “molecular identification tag” as used herein is a sequence of between 4 and 16 bases in length, where the optimal length is between 8 and 12 degenerate N bases.
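As a rough illustration of how tag length relates to the number of distinguishable tags, the following Python sketch draws a random degenerate tag and computes the number of possible sequences for a given length. The function names and the uniform base composition are assumptions made for illustration; the disclosure does not specify how degenerate positions are synthesized or sampled.

```python
import random
from typing import Optional

BASES = "ACGT"
MIN_TAG_LEN = 4   # lower bound stated in the definition
MAX_TAG_LEN = 16  # upper bound stated in the definition


def random_tag(length: int = 10, rng: Optional[random.Random] = None) -> str:
    """Draw one degenerate (N) tag, assuming each base is equally likely."""
    if not MIN_TAG_LEN <= length <= MAX_TAG_LEN:
        raise ValueError("tag length must be between 4 and 16 bases")
    rng = rng or random.Random()
    return "".join(rng.choice(BASES) for _ in range(length))


def tag_diversity(length: int) -> int:
    """Number of distinct sequences available for a fully degenerate tag."""
    return 4 ** length


if __name__ == "__main__":
    for n in (8, 10, 12):
        print(n, tag_diversity(n))
    print(random_tag(10, random.Random(0)))
```

Under these assumptions, an 8-base tag offers 4^8 = 65,536 possible sequences and a 12-base tag offers 4^12 = 16,777,216.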
Substrate Molecule
(78) It is contemplated that a substrate molecule is obtained from a naturally occurring source or it can be synthetic. The naturally occurring sources include but are not limited to genomic DNA, cDNA, DNA produced by whole genome amplification, primer extension products comprising at least one double-stranded terminus, and a PCR amplicon. The naturally occurring source is, in various embodiments, a prokaryotic source or a eukaryotic source. For example and without limitation, the source can be a human, mouse, virus, plant or bacteria or a mixture comprising a plurality of genomes.
(79) As used herein, an “amplicon” is understood to mean a portion of a polynucleotide that has been synthesized using amplification techniques.
(80) If the source of the substrate molecule is genomic DNA, it is contemplated that in some embodiments the genomic DNA is fragmented. Fragmenting of genomic DNA is a general procedure known to those of skill in the art and is performed, for example and without limitation in vitro by shearing (nebulizing) the DNA, cleaving the DNA with an endonuclease, sonicating the DNA, by heating the DNA, by irradiation of DNA using alpha, beta, gamma or other radioactive sources, by light, by chemical cleavage of DNA in the presence of metal ions, by radical cleavage and combinations thereof. Fragmenting of genomic DNA can also occur in vivo, for example and without limitation due to apoptosis, radiation and/or exposure to asbestos. According to the methods provided herein, a population of substrate molecules is not required to be of a uniform size. Thus, the methods of the disclosure are effective for use with a population of differently-sized substrate polynucleotide fragments.
(81) The substrate molecule, as disclosed herein, is at least partially double stranded and comprises a 3′ overhang (see
(82) Some applications of the current invention involve attachment of adapter sequences not to original or native double stranded DNA substrate molecules but to a double stranded DNA produced by primer extension synthesis. One example of such an application is a DNA library produced by (a) attachment of an oligonucleotide comprising a primer-binding sequence to the 3′ end of single-stranded or double-stranded DNA to enable primer extension, (b) extension of the primer annealed to the oligonucleotide, and (c) attachment of the 3′ and 5′ adapters to the double-stranded DNA ends produced by the primer-extension.
(83) The length of either a double-stranded portion or a single-stranded portion of a substrate molecule is contemplated to be between about 3 and about 1×10⁶ nucleotides. In some aspects, the length of the substrate molecule is between about 10 and about 3000 nucleotides, or between about 40 and about 2000 nucleotides, or between about 50 and about 1000 nucleotides, or between about 100 and about 500 nucleotides, or between about 1000 and about 5000 nucleotides, or between about 10,000 and 50,000 nucleotides, or between about 100,000 and 1×10⁶ nucleotides. In further aspects, the length of the substrate molecule is at least 3 and up to about 50, 100 or 1000 nucleotides; or at least 10 and up to about 50, 100 or 1000 nucleotides; or at least 100 and up to about 1000, 5000 or 10000 nucleotides; or at least 1000 and up to about 10000, 20000 or 50000 nucleotides; or at least 10000 and up to about 20000, 50000 or 100,000 nucleotides; or at least 20000 and up to about 100,000, 200,000 or 500,000 nucleotides; or at least 200,000 and up to about 500,000, 700,000 or 1,000,000 nucleotides.
In various aspects, the length of the substrate molecule is about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, about 31, about 32, about 33, about 34, about 35, about 36, about 37, about 38, about 39, about 40, about 41, about 42, about 43, about 44, about 45, about 46, about 47, about 48, about 49, about 50, about 51, about 52, about 53, about 54, about 55, about 56, about 57, about 58, about 59, about 60, about 61, about 62, about 63, about 64, about 65, about 66, about 67, about 68, about 69, about 70, about 71, about 72, about 73, about 74, about 75, about 76, about 77, about 78, about 79, about 80, about 81, about 82, about 83, about 84, about 85, about 86, about 87, about 88, about 89, about 90, about 91, about 92, about 93, about 94, about 95, about 96, about 97, about 98, about 99, about 100, about 110, about 120, about 130, about 140, about 150, about 160, about 170, about 180, about 190, about 200, about 210, about 220, about 230, about 240, about 250, about 260, about 270, about 280, about 290, about 300, about 310, about 320, about 330, about 340, about 350, about 360, about 370, about 380, about 390, about 400, about 410, about 420, about 430, about 440, about 450, about 460, about 470, about 480, about 490, about 500, about 510, about 520, about 530, about 540, about 550, about 560, about 570, about 580, about 590, about 600, about 610, about 620, about 630, about 640, about 650, about 660, about 670, about 680, about 690, about 700, about 710, about 720, about 730, about 740, about 750, about 760, about 770, about 780, about 790, about 800, about 810, about 820, about 830, about 840, about 850, about 860, about 870, about 880, about 890, about 900, about 910, about 920, about 930, about 940, about 950, about 960, about 970, about 980, about 990, about 1000, 
about 1100, about 1200, about 1300, about 1400, about 1500, about 1600, about 1700, about 1800, about 1900, about 2000, about 2100, about 2200, about 2300, about 2400, about 2500, about 2600, about 2700, about 2800, about 2900, about 3000, about 3100, about 3200, about 3300, about 3400, about 3500, about 3600, about 3700, about 3800, about 3900, about 4000, about 4100, about 4200, about 4300, about 4400, about 4500, about 4600, about 4700, about 4800, about 4900, about 5000, 10,000, 15,000, 20,000, 50,000, 100,000, 150,000, 200,000, 250,000, 300,000, 350,000, 400,000, 450,000, 500,000, 550,000, 600,000, 650,000, 700,000, 750,000, 800,000, 850,000, 900,000, 950,000, 1,000,000 or more nucleotides.
Amplicon Molecules
(84) As used herein, an “amplicon” is understood to mean a portion of a polynucleotide that has been synthesized using amplification techniques.
(85) The length of an amplicon is contemplated to be between about 10 bp and 175 bp, where the desired amplicon size is significantly shorter than circulating cell-free DNA fragments (~165 bp) and small enough not to span formalin-induced cross-linked DNA from preserved samples, ideally <150 bp in length. It is contemplated the amplicon can be 15 bp, 20 bp, 25 bp, 30 bp, 35 bp, 40 bp, 45 bp, 50 bp, 51 bp, 52 bp, 53 bp, 54 bp, 55 bp, 56 bp, 57 bp, 58 bp, 59 bp, 60 bp, 61 bp, 62 bp, 63 bp, 64 bp, 65 bp, 66 bp, 67 bp, 68 bp, 69 bp, 70 bp, 71 bp, 72 bp, 73 bp, 74 bp, 75 bp, 76 bp, 77 bp, 78 bp, 79 bp, 80 bp, 81 bp, 82 bp, 83 bp, 84 bp, 85 bp, 86 bp, 87 bp, 88 bp, 89 bp, 90 bp, 91 bp, 92 bp, 93 bp, 94 bp, 95 bp, 96 bp, 97 bp, 98 bp, 99 bp, 100 bp, 101 bp, 102 bp, 103 bp, 104 bp, 105 bp, 106 bp, 107 bp, 108 bp, 109 bp, 110 bp, 111 bp, 112 bp, 113 bp, 114 bp, 115 bp, 116 bp, 117 bp, 118 bp, 119 bp, 120 bp, 121 bp, 122 bp, 123 bp, 124 bp, 125 bp, 126 bp, 127 bp, 128 bp, 129 bp, 130 bp, 131 bp, 132 bp, 133 bp, 134 bp, 135 bp, 136 bp, 137 bp, 138 bp, 139 bp, 140 bp, 141 bp, 142 bp, 143 bp, 144 bp, 145 bp, 146 bp, 147 bp, 148 bp, 149 bp, 150 bp, 151 bp, 152 bp, 153 bp, 154 bp, 155 bp, 156 bp, 157 bp, 158 bp, 159 bp, 160 bp, 161 bp, 162 bp, 163 bp, 164 bp, 165 bp, 166 bp, 167 bp, 168 bp, 169 bp, 170 bp, 171 bp, 172 bp, 173 bp, 174 bp, 175 bp or more in length.
(86) Alternatively, for longer reads, particularly for long-read sequencing technologies capable of providing multi-kilobase reads that provide haplotyping information or span repetitive or other difficult sequences (PacBio), amplicon length is contemplated to be between 150 bp and 150,000 bp or more, when high molecular weight DNA is utilized as the input DNA for the amplification reaction. It is contemplated the amplicon can be 150 bp, 200 bp, 300 bp, 400 bp, 500 bp, 600 bp, 700 bp, 800 bp, 900 bp, 1,000 bp, 2,000 bp, 3,000 bp, 4,000 bp, 5,000 bp, 6,000 bp, 7,000 bp, 8,000 bp, 9,000 bp, 10,000 bp, 11,000 bp, 12,000 bp, 13,000 bp, 14,000 bp, 15,000 bp, 16,000 bp, 17,000 bp, 18,000 bp, 19,000 bp, 20,000 bp, 30,000 bp, 40,000 bp, 50,000 bp, 100,000 bp, 150,000 bp or more in length.
(87) In any of the methods disclosed herein, it is contemplated that the target loci chosen for multiplexed amplification correspond to any of a variety of applications, including but not limited to oncology specific targets, drug resistance specific targets, drug metabolism and absorption targets (e.g. CYP2D6), targets for inherited disease (e.g. cystic fibrosis CFTR gene, Lynch syndrome MLH1, MSH2, MSH6, PMS2 and EPCAM genes), targets from infectious pathogens, targets for pathogen host loci, species-specific targets, and any clinically actionable targets. In one aspect, the target loci are chosen from a set of oncology targets including but not limited to BRAF, KRAS, EGFR, KIT, HRAS, NRAS, MET, RET, GNA11, GNAQ, NOTCH1, ALK, PIK3CA, JAK2, AKT1, DNMT3A, IDH2, ERBB2 and TP53. In another aspect, the oncology targets include 400-600 genes, including but not limited to the following subset of genes: ACVRL1, AKT1, APC, APEX1, AR, ATM, ATP11B, BAP1, BCL2L1, BCL9, BIRC2, BIRC3, BRCA1, BRCA2, CCND1, CCNE1, CD274, CD44, CDH1, CDK4, CDK6, CDKN2A, CSNK2A1, DCUN1D1, EGFR, ERBB2, FBXW7, FGFR1, FGFR2, FGFR3, FGFR4, FLT3, GAS6, GATA3, IGF1R, IL6, KIT, KRAS, MCL1, MDM2, MET, MSH2, MYC, MYCL, MYCN, MYO18A, NF1, NF2, NKX2-1, NKX2-8, NOTCH1, PDCD1LG2, PDGFRA, PIK3CA, PIK3R1, PNP, PPARG, PTCH1, PTEN, RB1, RPS6KB1, SMAD4, SMARCB1, SOX2, STK11, TERT, TET2, TIAF1, TP53, TSC1, TSC2, VHL, WT1 and ZNF217.
In further embodiments, the target loci are chosen from a subset of genes known to have clinical relevance in oncology, including but not limited to ABI1, ABL1, ABL2, ACSL3, AF15Q14, AF1Q, AF3p21, AF5q31, AKAP9, AKT1, AKT2, ALDH2, ALK, ALO17, AMER1, APC, ARHGEF12, ARHH, ARID1A, ARID2, ARNT, ASPSCR1, ASXL1, ATF1, ATIC, ATM, ATP1A1, ATP2B3, ATRX, AXIN1, BAP1, BCL10, BCL11A, BCL11B, BCL2, BCL3, BCL5, BCL6, BCL7A, BCL9, BCOR, BCR, BHD, BIRC3, BLM, BMPR1A, BRAF, BRCA1, BRCA2, BRD3, BRD4, BRIP1, BTG1, BUB1B, C12orf9, C15orf21, C15orf55, C16orf75, C2orf44, CACNA1D, CALR, CAMTA1, CANT1, CARD11, CARS, CASP8, CBFA2T1, CBFA2T3, CBFB, CBL, CBLB, CBLC, CCDC6, CCNB1IP1, CCND1, CCND2, CCND3, CCNE1, CD273, CD274, CD74, CD79A, CD79B, CDC73, CDH1, CDH11, CDK12, CDK4, CDK6, CDKN2A, CDKN2C, CDKN2a(p14), CDX2, CEBPA, CEP1, CEP89, CHCHD7, CHEK2, CHIC2, CHN1, CIC, CIITA, CLIP1, CLTC, CLTCL1, CMKOR1, CNOT3, COL1A1, COL2A1, COPEB, COX6C, CREB1, CREB3L1, CREB3L2, CREBBP, CRLF2, CRTC3, CSF3R, CTNNB1, CUX1, CYLD, D10S170, DAXX, DCTN1, DDB2, DDIT3, DDX10, DDX5, DDX6, DEK, DICER1, DNM2, DNMT3A, DUX4, EBF1, ECT2L, EGFR, EIF3E, EIF4A2, ELF4, ELK4, ELKS, ELL, ELN, EML4, EP300, EPS15, ERBB2, ERC1, ERCC2, ERCC3, ERCC4, ERCC5, ERG, ETV1, ETV4, ETV5, ETV6, EVI1, EWSR1, EXT1, EXT2, EZH2, EZR, FACL6, FAM46C, FANCA, FANCC, FANCD2, FANCE, FANCF, FANCG, FAS, FBXO11, FBXW7, FCGR2B, FEV, FGFR1, FGFR1OP, FGFR2, FGFR3, FH, FHIT, FIP1L1, FLI1, FLJ27352, FLT3, FNBP1, FOXA1, FOXL2, FOXO1A, FOXO3A, FOXO4, FOXP1, FSTL3, FUBP1, FUS, FVT1, GAS7, GATA1, GATA2, GATA3, GMPS, GNA11, GNAQ, GNAS, GOLGA5, GOPC, GPC3, GPHN, GRAF, H3F3A, H3F3B, HCMOGT-1, HEAB, HERPUD1, HEY1, HIP1, HIST1H3B, HIST1H4I, HLA-A, HLF, HLXB9, HMGA1, HMGA2, HNRNPA2B1, HOOK3, HOXA11, HOXA13, HOXA9, HOXC11, HOXC13, HOXD11, HOXD13, HRAS, HSPCA, HSPCB, IDH1, IDH2, IGH, IGK, IGL, IKZF1, IL2, IL21R, IL6ST, IL7R, IRF4, IRTA1, ITK, JAK1, JAK2, JAK3, JAZF1, JUN, KCNJ5, KDM5A, KDM5C, KDM6A, KDR, KIAA1549, KIAA1598, KIF5B, KIT, KLF4, KLK2, KMT2D, KRAS, KTN1,
LAF4, LASP1, LCK, LCP1, LCX, LHFP, LIFR, LMNA, LMO1, LMO2, LPP, LRIG3, LSM14A, LYL1, MAF, MAFB, MALAT1, MALT1, MAML2, MAP2K1, MAP2K2, MAP2K4, MAX, MDM2, MDM4, MDS1, MDS2, MECT1, MED12, MEN1, MET, MITF, MKL1, MLF1, MLH1, MLL, MLL3, MLLT1, MLLT10, MLLT2, MLLT3, MLLT4, MLLT6, MLLT7, MN1, MPL, MSF, MSH2, MSH6, MSI2, MSN, MTCP1, MUC1, MUTYH, MYB, MYC, MYCL1, MYCN, MYD88, MYH11, MYH9, MYO5A, MYST4, NAB2, NACA, NBS1, NCOA1, NCOA2, NCOA4, NDRG1, NF1, NF2, NFATC2, NFE2L2, NFIB, NFKB2, NIN, NKX2-1, NONO, NOTCH1, NOTCH2, NPM1, NR4A3, NRAS, NRG1, NSD1, NT5C2, NTRK1, NTRK3, NUMA1, NUP214, NUP98, NUTM2A, NUTM2B, OLIG2, OMD, P2RY8, PAFAH1B2, PALB2, PAX3, PAX5, PAX7, PAX8, PBRM1, PBX1, PCM1, PCSK7, PDE4DIP, PDGFB, PDGFRA, PDGFRB, PER1, PHF6, PHOX2B, PICALM, PIK3CA, PIK3R1, PIM1, PLAG1, PLCG1, PML, PMS1, PMS2, PMX1, PNUTL1, POT1, POU2AF1, POU5F1, PPARG, PPFIBP1, PPP2R1A, PRCC, PRDM1, PRDM16, PRF1, PRKAR1A, PSIP1, PTCH1, PTEN, PTPN11, PTPRB, PTPRC, PTPRK, PWWP2A, RAB5EP, RAC1, RAD21, RAD51L1, RAF1, RALGDS, RANBP17, RAP1GDS1, RARA, RB1, RBM15, RECQL4, REL, RET, RNF43, ROS1, RPL10, RPL22, RPL5, RPN1, RSPO2, RSPO3, RUNDC2A, RUNX1, RUNXBP2, SBDS, SDC4, SDH5, SDHB, SDHC, SDHD, SEPT6, SET, SETBP1, SETD2, SF3B1, SFPQ, SFRS3, SH2B3, SH3GL1, SIL, SLC34A2, SLC45A3, SMAD4, SMARCA4, SMARCB1, SMARCE1, SMO, SOCS1, SOX2, SRGAP3, SRSF2, SS18, SS18L1, SSX1, SSX2, SSX4, STAG2, STAT3, STAT5B, STAT6, STK11, STL, SUFU, SUZ12, SYK, TAF15, TAL1, TAL2, TBL1XR1, TCEA1, TCF1, TCF12, TCF3, TCF7L2, TCL1A, TCL6, TERT, TET2, TFE3, TFEB, TFG, TFPT, TFRC, THRAP3, TIF1, TLX1, TLX3, TMPRSS2, TNFAIP3, TNFRSF14, TNFRSF17, TOP1, TP53, TPM3, TPM4, TPR, TRA, TRAF7, TRB, TRD, TRIM27, TRIM33, TRIP11, TRRAP, TSC1, TSC2, TSHR, TTL, U2AF1, UBR5, USP6, VHL, VTI1A, WAS, WHSC1, WHSC1L1, WIF1, WRN, WT1, WWTR1, XPA, XPC, XPO1, YWHAE, ZCCHC8, ZNF145, ZNF198, ZNF278, ZNF331, ZNF384, ZNF521, ZNF9 and ZRSR2.
(88) In another aspect, the targets are specific to drug resistance loci, including loci conferring resistance to tyrosine kinase inhibitors used as targeted anti-tumor agents, other targeted loci related to targeted anti-tumor agents, antibiotic resistance loci, and anti-viral resistance loci.
(89) In another aspect, detection of enteric, blood-borne, CNS, respiratory, sexually transmitted, and urinary tract pathogens including bacteria, fungi, yeasts, viruses, or parasites can be performed. Pathogens causing infections of the ear, dermis, or eyes could also be detected. Differentiation between pathovars of bacteria or viruses could be conducted as well as genes promoting antibiotic resistance or encoding toxins.
(90) The types of genetic lesions that can be detected from sequence analysis of the resulting amplicons include SNV (single nucleotide variants), point mutations, transitions, transversions, nonsense mutations, missense mutations, single base insertions and deletions, larger insertions and deletions that map between a primer pair, known chromosomal rearrangements such as translocations, gene fusions, deletions, insertions where primer pairs are designed to flank the breakpoint of such known rearrangements; copy number variations that include amplification events, deletions and loss of heterozygosity (LOH), aneuploidy, uniparental disomies, and other inherited or acquired chromosomal abnormalities. In addition, if bisulfite conversion is performed prior to multiplexed PCR and primers are designed to bisulfite converted DNA and optionally do not overlap with CpG dinucleotides which can result in various modified sequence states making primer design more difficult, methylation changes can also be detected using the disclosed method.
(91) For amplification of the target loci, the optimal length of the 3′ target-specific portion of the primer is between 15 and 30 bases but not limited to this range, where the target-specific portion of the primer is 5 to 50 bases or 10 to 40 bases or any length in between. The desired Tm defined at 2.5 mM Mg²⁺, 50 mM NaCl and 0.25 μM of oligonucleotides is 63° C., where variation in Tm among multiplexed primers is not more than ±2.5° C. to ensure even amplification under fixed reaction conditions. Desired GC content of the target-specific portion of the primers is ideally 50% but can vary between 30% and 70%. The target-specific primers are designed to avoid overlap with repetitive, non-unique sequences or common SNP polymorphisms or known mutations for the condition being assayed, in order to ensure specific, unbiased amplification from DNA samples from diverse genetic backgrounds. Additionally, target sequences and their complementary primers should not be prone to secondary structure formation, which would reduce performance.
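The design criteria in the preceding paragraph can be expressed as a simple screening check. The following Python sketch is illustrative only and is not part of the disclosed method: the function names, the dictionary layout, and the choice to treat the optimal 15 to 30 base range as a hard filter are assumptions, and the Tm values are assumed to be supplied by the user from a separate calculation under the stated salt and oligonucleotide conditions.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a primer's target-specific portion."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_primer_panel(primers, target_tm=63.0, tm_tolerance=2.5,
                       min_len=15, max_len=30, min_gc=0.30, max_gc=0.70):
    """Flag primers that fall outside the stated design windows.

    primers: dict mapping a primer name to (target_specific_sequence, tm_celsius).
    The Tm is taken as given; this sketch does not compute it.
    Returns a dict of primer name -> list of failed criteria.
    """
    problems = {}
    for name, (seq, tm) in primers.items():
        issues = []
        if not min_len <= len(seq) <= max_len:
            issues.append("length")
        if not min_gc <= gc_fraction(seq) <= max_gc:
            issues.append("gc")
        if abs(tm - target_tm) > tm_tolerance:
            issues.append("tm")
        if issues:
            problems[name] = issues
    return problems


# Hypothetical example sequences, not taken from the disclosure.
panel = {
    "fwd_1": ("ACGTACGTACGTACGTACGT", 63.5),
    "rev_1": ("ATATATATATATATATAT", 60.0),
}
flagged = check_primer_panel(panel)
```

A check of this kind would typically be run before the repeat, SNP and secondary-structure screens, which require external data and are not modeled here.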
(92) The universal primer comprises cleavable bases including but not limited to deoxyuridine, deoxyinosine or RNA, and can contain one, two, three, four, five or more cleavable bases. Additionally, the target-specific primers and the universal primer comprise 1, 2, 3, 4 or more nuclease resistant moieties at their 3′ termini.
Adapter Molecule
(93) The disclosure contemplates the use of a 5′ adapter and a 3′ adapter (see
(94) In further embodiments, the 5′ adapter is single stranded. In embodiments wherein the 5′ adapter hybridizes to oligonucleotide 1 of the 3′ adapter, it is contemplated in further embodiments that such annealing results in either a nick, gap or in an overlapping base or bases between the 5′ adapter and the substrate molecule (see
(95) The disclosure also contemplates the use of a universal adapter incorporated by PCR, a single stranded 5′ adapter and the remainder of a 3′ adapter that is ligated to one strand of the universal adapter on partially processed amplicon substrates. According to the disclosure, ligation of the remainder of the 3′ adapter is mediated by a linker. For the linker molecule, any length complementary to the universal adapter and the remainder of the 3′ adapter is contemplated as long as the three oligonucleotides are capable of annealing to each other under standard reaction conditions. Thus, the complementarity is such that they can anneal to each other. In various embodiments, the complementarity is from about 70%, 75%, 80%, 85%, 90%, 95% to about 100%, or from about 70%, 75%, 80%, 85%, 90%, to about 95%, or from about 70%, 75%, 80%, 85% to about 90%.
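As an illustration of the complementarity percentages recited above, the following Python sketch scores two equal-length oligonucleotides for antiparallel Watson-Crick pairing. It is a simplified assumption-laden example: the function name is hypothetical, both sequences are assumed to be written 5′ to 3′, and gaps, partial overlaps and modified bases are not handled.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(oligo: str, partner: str) -> float:
    """Percent of positions that pair when two 5'->3' oligos anneal antiparallel.

    The partner is read in reverse so that position i of `oligo` faces the
    base it would pair with in a duplex.
    """
    if not oligo or len(oligo) != len(partner):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1
        for a, b in zip(oligo.upper(), reversed(partner.upper()))
        if _COMPLEMENT.get(a) == b
    )
    return 100.0 * paired / len(oligo)


# Hypothetical sequences: a perfect partner and one with two mismatches.
full = percent_complementarity("AAGGCCTTAC", "GTAAGGCCTT")
partial = percent_complementarity("AAGGCCTTAC", "CAAAGGCCTT")
```

Under this scoring, the second pair falls within the contemplated range of about 70% to about 100% complementarity.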
(96) In further embodiments, the 5′ adapter is single stranded. In embodiments wherein the 5′ adapter hybridizes to the 3′ overhang of the universal adapter on the amplicon termini, it is contemplated in further embodiments that such annealing results in either a nick or gap between the 5′ adapter and the amplicon substrate. In various embodiments, the gap is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 bases in length.
(97) The length of either a universal adapter, 5′ adapter B or remainder of the 3′ adapter A is contemplated to be between about 5 and about 200 nucleotides. In some aspects, the length of the universal adapter, 5′ adapter or the 3′ adapter is between about 5 and about 200 nucleotides, or between about 5 and about 150 nucleotides, or between about 5 and about 100 nucleotides, or between about 5 and about 50 nucleotides, or between about 5 and about 25 nucleotides, or between about 10 and 200 nucleotides, or between about 10 and 100 nucleotides. In further aspects, the length of the 5′ adapter or the 3′ adapter is at least 5 and up to about 50, 100 or 200 nucleotides; or at least 10 and up to about 50, 100 or 200 nucleotides; or at least 15 and up to about 50, 100, or 200 nucleotides; or at least 20 and up to about 50, 100 or 200 nucleotides; or at least 30 and up to about 50, 100 or 200 nucleotides; or at least 40 and up to about 50, 100 or 200 nucleotides. In various aspects, the length of the substrate molecule is about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, about 31, about 32, about 33, about 34, about 35, about 36, about 37, about 38, about 39, about 40, about 41, about 42, about 43, about 44, about 45, about 46, about 47, about 48, about 49, about 50, about 51, about 52, about 53, about 54, about 55, about 56, about 57, about 58, about 59, about 60, about 61, about 62, about 63, about 64, about 65, about 66, about 67, about 68, about 69, about 70, about 71, about 72, about 73, about 74, about 75, about 76, about 77, about 78, about 79, about 80, about 81, about 82, about 83, about 84, about 85, about 86, about 87, about 88, about 89, about 90, about 91, about 92, about 93, about 94, about 95, about 96, about 97, about 98, about 99, about 100, about 110, 
about 120, about 130, about 140, about 150, about 160, about 170, about 180, about 190, about 200, about 300, about 400, about 500, about 600, about 700, about 800, about 900, about 1000, about 1100, about 1200, about 1300, about 1400, about 1500, about 1600, about 1700, about 1800, about 1900, about 2000, about 2100, about 2200, about 2300, about 2400, about 2500, about 2600, about 2700, about 2800, about 2900, about 3000, about 3100, about 3200, about 3300, about 3400, about 3500, about 3600, about 3700, about 3800, about 3900, about 4000, about 4100, about 4200, about 4300, about 4400, about 4500, about 4600, about 4700, about 4800, about 4900, about 5000, about 5100, about 5200, about 5300, about 5400, about 5500, about 5600, about 5700, about 5800, about 5900, about 6000, about 6100, about 6200, about 6300, about 6400, about 6500, about 6600, about 6700, about 6800, about 6900, about 7000, about 7100, about 7200, about 7300, about 7400, about 7500, about 7600, about 7700, about 7800, about 7900, about 8000, about 8100, about 8200, about 8300, about 8400, about 8500, about 8600, about 8700, about 8800, about 8900, about 9000, about 9100, about 9200, about 9300, about 9400, about 9500, about 9600, about 9700, about 9800, about 9900, about 10000, about 10500, about 11000, about 11500, about 12000, about 12500, about 13000, about 13500, about 14000, about 14500, about 15000, about 15500, about 16000, about 16500, about 17000, about 17500, about 18000, about 18500, about 19000, about 19500, about 20000, about 20500, about 21000, about 21500, about 22000, about 22500, about 23000, about 23500, about 24000, about 24500, about 25000, about 25500, about 26000, about 26500, about 27000, about 27500, about 28000, about 28500, about 29000, about 29500, about 30000, about 30500, about 31000, about 31500, about 32000, about 32500, about 33000, about 33500, about 34000, about 34500, about 35000, about 35500, about 36000, about 36500, about 37000, about 37500, about 38000, 
about 38500, about 39000, about 39500, about 40000, about 40500, about 41000, about 41500, about 42000, about 42500, about 43000, about 43500, about 44000, about 44500, about 45000, about 45500, about 46000, about 46500, about 47000, about 47500, about 48000, about 48500, about 49000, about 49500, about 50000, about 60000, about 70000, about 80000, about 90000, about 100000 or more nucleotides in length.
(98) To complete NGS adapter ligation, the universal adapter primer additionally comprises modified bases and/or linkages that can be destroyed enzymatically, chemically or physically. Modifications include but are not limited to dU-bases, deoxyinosine and RNA bases. Annealing of the single-stranded 5′ adapter to the 3′ overhang of the amplicons occurs as a result of degradation of one strand of the universal adapter that corresponds to the incorporated universal primer with cleavable bases. In some embodiments, degradation is achieved enzymatically, more specifically, by using uracil-DNA glycosylase (UDG), or a combination of UDG and apurinic/apyrimidinic endonuclease if the oligonucleotide contains deoxyuracil bases, or by endonuclease V if the oligonucleotide contains deoxyinosine bases. Degradation can also be performed by incubation with RNase H1 or RNase H2 if the incorporated primer contains RNA bases. In some applications, degradation of the incorporated primer can be performed chemically or physically, for example, by light. Alternatively, the 3′ overhang of the amplicon can be produced by limited exonuclease digestion of the 5′ end of the amplicon. Such limited digestion can be achieved enzymatically, more specifically, by using T7 Gene 6 exonuclease or lambda 5′→3′ exonuclease if the primer oligonucleotide contains nuclease-resistant base(s) at the 3′ end, specifically, a base(s) with phosphorothioate linkage. In this case, the exonuclease reaction stops at the modified base and produces a 3′ overhang.
Method—Steps
(99) The first three incubations of the method are pre-ligation steps, and include (i) dephosphorylation, (ii) polishing and (iii) optional adenylation. The remaining 2 incubations of the method include (1) 3′ adapter ligation, and (2) 5′ adapter ligation which comprises (a) 5′ adapter annealing (b) removal of the 5′ base from the substrate molecule and (c) 5′ adapter ligation (see
(100) Within the amplification reaction, the number of multiplexed cycles is limited to a minimum of 2 or can be performed as 3 cycles, 4 cycles, 5 cycles or more, up to N cycles prior to switching to the non-multiplexed universal adapter single primer amplification. The number of universal cycles can be varied from 1 cycle to 40 or more cycles, depending on the DNA input and desired library yield. Following multiplex PCR amplification, a purification step is performed, then the simultaneous adapter ligation step is performed.
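The cycling constraints described above (at least 2 multiplexed cycles, followed by 1 to 40 or more universal cycles) can be captured in a small validation helper. The Python sketch below is purely illustrative; the function name and returned dictionary are assumptions, and no upper bound is enforced because the text allows more than 40 universal cycles.

```python
def plan_cycles(multiplex_cycles: int, universal_cycles: int) -> dict:
    """Validate a two-phase cycling plan and report the total cycle count.

    multiplex_cycles: target-specific multiplexed cycles (minimum of 2).
    universal_cycles: single-primer universal cycles (minimum of 1).
    """
    if multiplex_cycles < 2:
        raise ValueError("at least 2 multiplexed cycles are required")
    if universal_cycles < 1:
        raise ValueError("at least 1 universal cycle is required")
    return {
        "multiplex": multiplex_cycles,
        "universal": universal_cycles,
        "total": multiplex_cycles + universal_cycles,
    }


# Hypothetical plan: 4 multiplexed cycles followed by 20 universal cycles.
plan = plan_cycles(4, 20)
```

In practice the number of universal cycles would be chosen from the DNA input and desired library yield, which this sketch does not model.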
Pre-Ligation Steps
(i) Dephosphorylation
(101) Prior to adapter ligation, the DNA ends are optionally processed to improve efficiency of the adapter ligation reaction. DNA end processing in existing methods typically uses two enzymatic reactions: (a) incubation with a proofreading DNA polymerase(s) to polish DNA ends by removing the 3′-overhangs and filling-in the recessed 3′ ends and (b) incubation with a polynucleotide kinase to add a phosphate group to the 5′ termini. When processing DNA ends some methods also adenylate blunt-ended DNA at the 3′ termini by incubation of polished DNA with a non-proofreading DNA polymerase. Adenylation helps to prevent DNA self-ligation and formation of chimeric products. It also minimizes formation of adapter-dimers due to the presence of dT at the 3′ end of corresponding adapters. The current invention addresses these issues in a completely different way. Rather than adding a phosphate group to the 5′ ends of the DNA fragments, the method of the invention implements an optional complete removal of the phosphate group from the 5′ ends of the DNA fragments. Dephosphorylation of DNA ends is achieved by incubation of DNA fragments with an enzyme capable of removing a phosphate from a DNA terminus. Examples of enzymes useful in the methods of the disclosure to remove a 5′ or a 3′ phosphate include, but are not limited to, any phosphatase enzyme, such as calf intestinal alkaline phosphatase, bacterial alkaline phosphatase, shrimp alkaline phosphatase, Antarctic phosphatase, and placental alkaline phosphatase, each used according to standard conditions.
(ii) Polishing
(102) After removal of the alkaline phosphatase or its inactivation by heat, DNA substrate molecules are optionally subjected to incubation with a proofreading DNA polymerase in the presence of dNTPs to create blunt ends. The reactions are performed according to standard conditions. Dephosphorylated and polished DNA fragments are good substrates for attachment of the 3′ adapter but they are poor substrates for DNA fragment concatamer ligation and chimera formation. They are also poor substrates for ligation of a conventional adapter.
(103) In some applications of the current invention, 5′ end dephosphorylation by a phosphatase enzyme can be omitted but the addition of an enzyme such as T4 polynucleotide kinase to the DNA polishing mix is preferable in this case to assure removal of the phosphate group from the 3′ termini prior to DNA polishing. Alternatively, the first two pre-ligation reactions described above, dephosphorylation and polishing, can be executed in any order and result in blunt-ended, double-stranded DNA lacking 5′ phosphate groups at their termini.
(iii) Adenylation
(104) The current invention also contemplates the use of adenylation of the 3′ terminus of the blunt-end DNA fragments using DNA polymerases with non-template polymerase activity including but not limited to (exo-) Klenow fragment of DNA polymerase I, and Taq DNA polymerase. Both alkaline phosphatase treatment and adenylation reduce the propensity of DNA fragment self-ligation and formation of chimeric library molecules. In the case of including an adenylation step, the 3′ adapter used in the subsequent step would require a single T overhang.
Ligation Steps
(1) 3′ Adapter Ligation, or, Generation of a Single-Stranded 3′ Overhang on DNA Substrates
(105) The options are depicted in
Option 1a: 3′ Blocked Oligonucleotide 2 as Part of a Double Stranded 3′ Adapter (FIG. 7a)
(106) Existing NGS library preparation protocols rely on ligation between the 3′OH group of the adapter and the 5′ phosphate group at the termini of the DNA fragments. For this reason, adapters used in conventional methods typically have one functional double-stranded end with a 3′ hydroxyl group and optional 5′ phosphate group (see
Option 1b: 3′ Hydroxyl Oligonucleotide 2 as Part of a Double Stranded 3′ Adapter (FIG. 7a)
(107) In an alternative method, a 3′-adapter that is lacking a blocked, unligatable base at the 3′ terminus of oligonucleotide 2 can be used. Ligation of a non-blocked oligonucleotide 2 to the substrate molecule will still be prevented by the lack of 5′ phosphate on the substrate molecule as a result of the dephosphorylation reaction. The advantage of using a non-blocked oligonucleotide 2 is that the 3′ end of oligonucleotide 2 can be extended by a single base using a dideoxy nucleotide mix and a DNA polymerase capable of nick-translation DNA synthesis. This enables an alternate method to perform 5′ base excision from the substrate molecule; see the subsequent steps described below. The disadvantage of using a non-blocked 3′-adapter is the creation of adapter-dimers during the ligation reaction, which reduces adapter concentration and, as a result, may decrease adapter ligation efficiency. Also for this option, oligonucleotide 2 additionally comprises modified bases and/or linkages that can be destroyed enzymatically, chemically or physically.
Option 2: Single Stranded 3′ Adapter (FIG. 7a)
(108) In the presence of a ligase (DNA or RNA) capable of covalently attaching a single stranded adapter to a double stranded (or single stranded) substrate molecule, oligonucleotide 2 can be omitted from the reaction.
Option 3: Homopolymer 3′ Adapter (FIG. 7a)
(109) In the presence of a nucleotide and a template-independent polymerase such as terminal deoxynucleotidyl transferase (TdT), poly(A) polymerase, poly(U) polymerase or DNA polymerases that lack 3′-exonuclease proofreading activity, a homopolymer or other tail can be incorporated on the 3′ termini of the substrate molecules that can serve as a 3′ adapter sequence.
Option 4: Controlled Tailing and Simultaneous 3′ Adapter Ligation (FIG. 7a)
(110) In the presence of a template independent polymerase such as TdT, nucleotides, and additionally comprising a ligase and an attenuator-adapter molecule, a synthetic tail and defined 3′ adapter sequence can be incorporated on the 3′ termini of the substrate molecules. See International patent application number PCT/US13/31104, filed Mar. 13, 2013, incorporated herein by reference in its entirety.
Option 5: Omit 3′ Adapter Ligation Step (FIG. 7b)
(111) In the case of substrate molecules that comprise a pre-existing 3′ overhang that is naturally occurring or resulting from a previous enzymatic or other treatment, either as a defined or random sequence, a separate 3′ adapter ligation step is not required and can be omitted, wherein the pre-existing 3′ overhang can serve as the 3′ adapter.
(112) In an alternative embodiment, a phosphatase enzyme with Zinc and other reaction components can be added to the 3′ adapter ligation reaction at its completion. Performing a phosphatase reaction following 3′ adapter ligation is a means of rendering any non-ligated 3′ adapter molecules incapable of subsequent ligation, which prevents adapter dimers from forming in subsequent steps when the 5′ adapter is present.
(2) 5′ Adapter Ligation, which is Comprised of Three Steps that Occur in a Single Incubation
(I) Annealing of the 5′ Adapter
(113) In the case of single stranded 3′ adapter ligation (option 2), homopolymer addition (option 3) or use of pre-existing 3′ overhang as 3′ adapter (option 5), annealing of the 5′ adapter can be performed directly without other consideration as there is no oligonucleotide 2 to degrade or displace.
(114) When ligation of a double-stranded 3′-adapter is used to create a single-stranded 3′ overhang at the ends of double-stranded DNA (options 1a, 1b and 4 above), the 5′-adapter can be annealed to the 3′-adapter using any of five different options, each of which is discussed below and depicted in
Option i
(115) Oligonucleotide 2 of the 3′ adapter additionally comprises modified bases and/or linkages that can be destroyed enzymatically, chemically or physically. Modifications include but are not limited to dU-bases, deoxyinosine and RNA bases. Annealing of the single-stranded 5′ adapter to the 5′ portion of oligonucleotide 1 of the 3′ adapter occurs as a result of partial degradation of the 3′ adapter, specifically, of oligonucleotide 2. In some embodiments, degradation of oligonucleotide 2 is achieved enzymatically, more specifically, by using uracil-DNA glycosylase (UDG), or a combination of UDG and apurinic/apyrimidinic endonuclease if the second oligonucleotide contains deoxyuracil bases, or by endonuclease V if the second oligonucleotide contains deoxyinosine bases. Degradation of oligonucleotide 2 can also be performed by incubation with RNase H1 or RNase H2 if the second oligonucleotide contains RNA bases. In some applications, degradation of the second oligonucleotide can be done chemically or physically, for example, by light.
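The pairing of each cleavable modification with its enzymatic degradation route, as described above, can be summarized as a lookup table. The Python sketch below is only a restatement of that paragraph in code form; the dictionary keys and function name are hypothetical labels, and chemical or physical degradation (for example, by light) is deliberately left out.

```python
# Enzymatic routes named in the text for each cleavable modification.
DEGRADATION_OPTIONS = {
    "deoxyuracil": [
        "UDG",
        "UDG + apurinic/apyrimidinic endonuclease",
    ],
    "deoxyinosine": ["endonuclease V"],
    "RNA": ["RNase H1", "RNase H2"],
}


def degradation_options(modification: str) -> list:
    """Return the enzymatic treatments listed for a given modification."""
    try:
        return DEGRADATION_OPTIONS[modification]
    except KeyError:
        raise ValueError(f"no enzymatic route listed for {modification!r}")


rna_routes = degradation_options("RNA")
```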
Option ii
(116) In some applications, annealing of the 5′ adapter to oligonucleotide 1 of the 3′adapter occurs without degradation of oligonucleotide 2. In this case, replacement of oligonucleotide 2 with the single-stranded 5′ adapter can be facilitated by higher affinity of the 5′ adapter over that of oligonucleotide 2 either due to increased complementarity between oligonucleotide 1 and the 5′ adapter sequence or due to base modifications within the 5′ adapter that increase its melting temperature (for example, LNA bases). Depending on the design of the 5′ adapter, annealing to oligonucleotide 1 of the 3′ adapter could either result in a nick or gap between the 3′ end of the 5′ adapter and the 5′ end of the DNA substrate molecule, or in overlap of the 3′ and 5′ bases of the 5′ adapter and DNA substrate molecule, correspondingly.
Option iii
(117) In this case, neither degradable modifications nor competitive displacement of oligonucleotide 2 is used. Instead, the 5′ adapter replaces oligonucleotide 2 by annealing to the 3′ adapter further 3′ on oligonucleotide 1 relative to the annealing site of oligonucleotide 2, followed by limited nick-translation “chewing forward” which results in degradation or partial degradation of oligonucleotide 2.
Options iv and v
(118) In these cases, the 5′ adapter constitutes a part of the 3′ adapter and it is present during ligation of the 3′ adapter to the DNA substrate. In option iv, the 5′ adapter is pre-annealed to the 3′ adapter further 3′ on oligonucleotide 1 relative to the annealing site of oligonucleotide 2 (similar to option iii). In option v, the 5′ adapter has a blocking group at the 3′ end and it is pre-annealed to the 3′ adapter instead of oligonucleotide 2. After ligation of the 3′ adapter, the blocking group at the 3′ end of the 5′ adapter is removed enzymatically to allow its extension by a DNA polymerase.
(II) 5′-Base Removal from the Substrate Molecule Resulting in Exposure of a 5′ Phosphate
(119) In this step, creation of a ligation-compatible 5′ terminal phosphate group on the substrate molecule is achieved by removal of the damaged 5′ terminal base of the DNA substrate molecules either by nick-translation of the 5′ adapter oligonucleotide using a DNA polymerase and nucleotides (option i), by a displacement-cleavage reaction using the 5′ adapter and a 5′-flap endonuclease in the absence of nucleotides (option ii), or by single dideoxy base extension from oligonucleotide 2 followed by displacement-cleavage using a 5′-flap endonuclease in the absence of nucleotides (option iii). For the third option, 5′ base excision of the substrate molecule occurs prior to 5′ adapter annealing, because it is alternately performed using the annealed oligonucleotide 2 instead of the 5′ adapter, but is included in this section to simplify description of the method (see
Option i
(120) Nick-translation DNA synthesis is initiated at the nick or gap between the 3′ end of the 5′ adapter oligonucleotide and the 5′ end of the DNA substrate molecules and stops when the ligation reaction seals the nick (see
(121) The reaction conditions contemplated for this step include those where (i) both a polymerase with endogenous 5′ exonuclease activity and a ligase are active; (ii) a strand-displacement polymerase, a flap endonuclease and a ligase are active; (iii) a flap endonuclease and a ligase are active; (iv) simultaneous activity of both a thermostable enzyme and a thermolabile enzyme occurs; or (v) activity of only thermostable or only thermolabile enzymes can occur. In some embodiments, conditions (i) and (ii) are each performed with dNTPs for nick translation. In a specific embodiment, Taq polymerase and E. coli ligase are used at a reaction temperature of 40° C. In various embodiments, however, a range of reaction temperatures from 10° C. to 75° C. is contemplated.
(122) The nick-translation reaction results in removal of one, two or more bases from the 5′ end of the DNA substrate molecules prior to the ligation reaction, which occurs between the 5′ adapter extension product and the DNA substrate molecule. Nick-translation synthesis can occur in the presence of all four nucleotides dGTP, dCTP, dTTP and dATP or their restricted combinations. Restricted combinations include but are not limited to three-nucleotide combinations such as dGTP, dCTP and dATP, or dGTP, dCTP and dTTP, or dGTP, dATP and dTTP, or dCTP, dATP and dTTP; two-nucleotide combinations such as dGTP and dCTP, or dGTP and dATP, or dGTP and dTTP, or dCTP and dATP, or dCTP and dTTP, or dATP and dTTP; or just one nucleotide such as dGTP, or dCTP, or dATP, or dTTP.
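The restricted combinations listed above are simply the non-empty proper subsets of the four dNTPs. The following Python sketch enumerates them for illustration only; the function name and the grouping by subset size are assumptions of this example and are not part of the disclosed method.

```python
from itertools import combinations

# The four nucleotides named in the text.
DNTPS = ("dGTP", "dCTP", "dATP", "dTTP")


def restricted_combinations(dntps=DNTPS):
    """Return every non-empty proper subset of the dNTPs, keyed by subset size."""
    return {size: list(combinations(dntps, size)) for size in range(1, len(dntps))}


if __name__ == "__main__":
    # Print the three-, two- and one-nucleotide combinations in that order.
    for size, combos in sorted(restricted_combinations().items(), reverse=True):
        print(f"{size}-nucleotide combinations ({len(combos)}):")
        for combo in combos:
            print("  " + ", ".join(combo))
```

The sketch yields four three-nucleotide, six two-nucleotide and four single-nucleotide combinations, matching the lists given in the paragraph.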
Option ii
(123) The displacement-cleavage reaction does not require dNTPs but requires that the 5′ adapter sequence comprises one, two or more random bases at the 3′ terminus to create an overlap with the substrate molecule, and which comprises a plurality of 5′ adapters in the reaction (see
Option iii
(124) An alternative embodiment to the 5′ adapter participating in the 5′ base excision of the substrate molecules is to instead, in a previous step, have oligonucleotide 2 of the 3′ adapter participate in the 5′ base excision of the substrate molecules (see
(125) In one approach (
(126) Alternatively, a 5′ adapter oligonucleotide that lacks a random dN base at its 3′ terminus can be used (
(127) In another alternative (see
(III) Ligation of the 5′ Adapter
(128) Covalent attachment of the 5′ adapter to the substrate molecule involves ligation between the 5′ adapter or its extension product and the exposed 5′ phosphate of the substrate molecules. When excision of the 5′ base(s) of DNA substrate molecules is achieved by a nick-translation reaction, the ligation reaction seals the nick between the polymerase-extended 5′ adapter and the excised 5′ end of the DNA substrate molecule. When excision of the 5′ base of DNA substrate molecules is achieved through the displacement-cleavage reaction, the ligation occurs between the original 5′ adapter oligonucleotide and the excised 5′ end of the DNA substrate molecule. The standard conditions with respect to the ligation reaction in this step comprise, in various embodiments, use of any DNA ligase that is capable of sealing nicks or gaps in DNA. In one embodiment, the ligase is E. coli DNA ligase and the reaction occurs in the temperature interval between 10° C. and 50° C. In some embodiments, the ligase is a thermostable DNA ligase such as Taq DNA ligase or Ampligase, and the reaction occurs in the temperature interval between 30° C. and 75° C.
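As an illustration of the two temperature intervals just described, the following Python sketch reports which ligase class falls within its stated interval at a given reaction temperature. The dictionary keys and the function name are hypothetical labels chosen for this example; only the numeric intervals come from the paragraph above, and the sketch says nothing about actual enzyme efficiency at any temperature.

```python
# Temperature intervals (degrees Celsius) quoted in the paragraph above.
LIGASE_WINDOWS = {
    "E. coli DNA ligase": (10, 50),
    "thermostable DNA ligase": (30, 75),
}


def ligases_for_temperature(temp_c, windows=LIGASE_WINDOWS):
    """List the ligase classes whose stated interval includes temp_c."""
    return [name for name, (low, high) in windows.items() if low <= temp_c <= high]


if __name__ == "__main__":
    for temp in (25, 40, 60):
        print(temp, ligases_for_temperature(temp))
```

Under these intervals, both ligase classes are candidates only between 30° C. and 50° C.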
(129) In various aspects of the current invention, the three steps (I), (II) and (III) of the 5′ adapter ligation step are performed simultaneously in a single incubation by mixing and incubating the 3′-adapted substrate DNA with (i) an optional degradation endonuclease (e.g., UDG, endonuclease V, RNase H, or their combination); (ii) a nick-translation DNA polymerase or a 5′-flap endonuclease; and (iii) a DNA ligase (see
Construction of NGS Libraries
(130) Synthesis of an Illumina NGS library can be performed using the disclosed methods. As shown in
(131) The disclosed methods can be used to construct NGS libraries for a variety of sequencing platforms, and another example is presented in
(132) Alternatively in
Applications for Target Selected NGS Libraries
(133) The disclosed methods can be used to construct NGS libraries where specific targets can be selected and enriched, as a way to reduce complexity and sequencing requirements relative to whole genome sequencing. An example of such an application would be attachment of the 3′ adapter and 5′ adapter to randomly fragmented, denatured and primer-extended DNA substrates, where the primer or plurality of primers anneal to known targeted DNA regions. In this case, only the targeted loci would comprise a double stranded terminus, where non-selected loci would remain single stranded and adapter ligation would not occur on their termini.
(134) In other applications, the 5′ adapter of the current invention can be used to select and enrich a small fraction of DNA fragments with known terminal sequences. Pre-selected DNA sequences could contain one, two, three or more terminal DNA bases. To achieve such selection the 5′ adapter sequence should contain selected invasion bases or base combinations at the 3′ end. As a result, only DNA fragments with selected terminal sequences will be ligated to the 5′ adapter and amplified. As shown in
(135) Alternatively, target selection can be performed following library construction using the methods disclosed within (see
Alternative Adapter Designs and Applications
(136) Several alternative adapter designs and ligation methods using the disclosed methods are also presented. In
(137) In
(138) Sometimes it is useful to generate circular DNA libraries, such as an intermediate structure for the construction of mate-pair NGS libraries. As shown in
Enzymes
(139) Ligases that may be used according to standard reaction conditions to practice the methods of the disclosure include but are not limited to T4 DNA ligase, T4 RNA ligase, T3 DNA ligase or T7 DNA ligase, Taq DNA ligase, Ampligase, E. coli DNA ligase and E. coli RNA ligase. The disclosure contemplates, in various embodiments, reaction conditions appropriate for a blunt end or a cohesive (“sticky”) end ligation. The cohesive end, in some embodiments, comprises either a 5′ overhang or a 3′ overhang.
(140) Examples of enzymes useful in the methods of the disclosure to remove a 5′ or a 3′ phosphate include, but are not limited to, any phosphatase enzyme, such as calf intestinal alkaline phosphatase, bacterial alkaline phosphatase, shrimp alkaline phosphatase, Antarctic phosphatase, and placental alkaline phosphatase, each used according to standard conditions. Additionally, the phosphatase activity of T4 polynucleotide kinase can be used to remove 3′ phosphate groups.
(141) The polymerase enzymes useful in the practice of the invention include but are not limited to a DNA polymerase (which can include a thermostable DNA polymerase, e.g., a Taq DNA polymerase), RNA polymerase, DNA polymerase I and reverse transcriptase. Non-limiting examples of enzymes that may be used to practice the present invention include but are not limited to KAPA HiFi and KAPA HiFi Uracil+, VeraSeq Ultra DNA Polymerase, VeraSeq 2.0 High Fidelity DNA Polymerase, Takara PrimeSTAR DNA Polymerase, Agilent Pfu Turbo CX Polymerase, Phusion U DNA Polymerase, Deep VentR™ DNA Polymerase, LongAmp™ Taq DNA Polymerase, Phusion™ High-Fidelity DNA Polymerase, Phusion™ Hot Start High-Fidelity DNA Polymerase, Kapa High-Fidelity DNA Polymerase, Q5 High-Fidelity DNA Polymerase, Platinum Pfx High-Fidelity Polymerase, Pfu High-Fidelity DNA Polymerase, Pfu Ultra High-Fidelity DNA Polymerase, KOD High-Fidelity DNA Polymerase, iProof High-Fidelity Polymerase, High-Fidelity 2 DNA Polymerase, Velocity High-Fidelity DNA Polymerase, ProofStart High-Fidelity DNA Polymerase, Tigo High-Fidelity DNA Polymerase, Accuzyme High-Fidelity DNA Polymerase, VentR® DNA Polymerase, DyNAzyme™ II Hot Start DNA Polymerase, Phire™ Hot Start DNA Polymerase, Phusion™ Hot Start High-Fidelity DNA Polymerase, Crimson LongAmp™ Taq DNA Polymerase, DyNAzyme™ EXT DNA Polymerase, LongAmp™ Taq DNA Polymerase, Phusion™ High-Fidelity DNA Polymerase, Taq DNA Polymerase with Standard Taq (Mg-free) Buffer, Taq DNA Polymerase with Standard Taq Buffer, Taq DNA Polymerase with ThermoPol II (Mg-free) Buffer, Taq DNA Polymerase with ThermoPol Buffer, Crimson Taq™ DNA Polymerase, Crimson Taq™ DNA Polymerase with (Mg-free) Buffer, Phire™ Hot Start DNA Polymerase, VentR® (exo-) DNA Polymerase, Hemo KlenTaq™, Deep VentR™ (exo-) DNA Polymerase, Deep VentR™ DNA Polymerase, DyNAzyme™ EXT DNA Polymerase, Hemo KlenTaq™, LongAmp™ Taq DNA Polymerase, ProtoScript® AMV First Strand cDNA Synthesis Kit, ProtoScript® M-MuLV First 
Strand cDNA Synthesis Kit, Bst DNA Polymerase, Full Length, Bst DNA Polymerase, Large Fragment, 9° Nm DNA Polymerase, DyNAzyme™ II Hot Start DNA Polymerase, Hemo KlenTaq™, Sulfolobus DNA Polymerase IV, Therminator™ γ DNA Polymerase, Therminator™ DNA Polymerase, Therminator™ II DNA Polymerase, Therminator™ III DNA Polymerase, Bsu DNA Polymerase, Large Fragment, DNA Polymerase I (E. coli), DNA Polymerase I, Large (Klenow) Fragment, Klenow Fragment (3′→5′ exo−), phi29 DNA Polymerase, T4 DNA Polymerase, T7 DNA Polymerase (unmodified), Terminal Transferase, Reverse Transcriptases and RNA Polymerases, E. coli Poly(A) Polymerase, AMV Reverse Transcriptase, M-MuLV Reverse Transcriptase, phi6 RNA Polymerase (RdRP), Poly(U) Polymerase, SP6 RNA Polymerase, and T7 RNA Polymerase.
(142) The enzymes possessing flap endonuclease activity that are useful in the disclosure include but are not limited to flap endonuclease 1 (FEN1), T5 exonuclease, Taq DNA polymerase, Bst polymerase, Tth polymerase, DNA polymerase I and their derivatives.
EXAMPLES
Example 1
Comparison of Conventional Adapter Ligation to 3′ Adapter Ligation with FAM-Labeled Oligonucleotides
(143) Rationale: Using a FAM-labeled oligonucleotide system, blunt ligation using fill-in adapters (
(144) Materials: Fill-in adapter contains oligonucleotides 12-900 and 13-426 (Table 1) 3′Adapter; 1st oligonucleotide 13-340 (Table 1) 3′Adapter; 2nd oligonucleotide option 1 (with a blocking 3′ deoxythymidine base at the 3′ terminus) 13-559 (Table 1) 3′Adapter; 2nd oligonucleotide option 2 (a phosphate group at the 3′ terminus) 13-558 (Table 1) FAM substrate A composed of oligonucleotides 13-562 and 13-563, where the FAM group labels ligation to the 5′ Phosphate of the substrate (Table 1) FAM substrate B composed of oligonucleotides 13-561 and 13-564, where the FAM group labels ligation to the 3′ OH of the substrate and where the corresponding 5′ terminus of the substrate has a phosphate (Table 1) FAM substrate C composed of oligonucleotides 13-560 and 13-564, where the FAM group labels ligation to the 3′ OH of the substrate and where the corresponding 5′ terminus of the substrate lacks a phosphate (Table 1) T4 DNA Ligase (Rapid) (Enzymatics, Cat #L6030-HC-L) 10×T4 DNA Ligase Buffer (Enzymatics, Cat #B6030)
Method
(145) Conventional adapter ligation reactions were assembled in a total volume of 10 μl, comprising 1×T4 DNA Ligase Buffer, 10 pmoles of FAM substrate A, 20 or 200 pmoles of Fill-in adapter, 600 units T4 DNA Ligase (Rapid) or no ligase.
(146) 3′ adapter ligation reactions were assembled in a total volume of 10 μl, containing 1×T4 DNA Ligase Buffer, 10 pmoles of FAM substrate B or 10 pmoles of FAM substrate C, 20 or 200 pmoles of 3′Adapter option 1 or 20 or 200 pmoles of 3′Adapter option 2 and 600 units T4 DNA Ligase (Rapid) or no T4 DNA ligase.
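The adapter and substrate amounts in the two reaction setups above correspond to two adapter:substrate molar ratios. The following Python sketch works out those ratios; the function name is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction


def molar_ratio(adapter_pmol, substrate_pmol):
    """Express adapter:substrate as a reduced ratio string, e.g. '20:1'."""
    ratio = Fraction(adapter_pmol, substrate_pmol)
    return f"{ratio.numerator}:{ratio.denominator}"


if __name__ == "__main__":
    substrate_pmol = 10  # pmoles of FAM substrate per 10 μl reaction
    for adapter_pmol in (20, 200):  # pmoles of adapter tested
        print(adapter_pmol, "pmoles adapter ->", molar_ratio(adapter_pmol, substrate_pmol))
```

With 10 pmoles of FAM substrate, 20 pmoles of adapter gives a 2:1 ratio and 200 pmoles gives a 20:1 ratio.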
(147) All ligation reactions were performed at 25° C. for 30 minutes. The total ligation reaction volume (10 μl) was mixed with 10 μl of 2× formamide loading buffer (97% formamide, 10 mM EDTA, 0.01% bromophenol blue and 0.01% xylene cyanol), heated at 95° C. for 5 minutes and subsequently run on a pre-cast 15% polyacrylamide gel, TBE-Urea (Invitrogen, Cat #S11494) in an oven at 65° C., visualized on a Dark reader light box (Clare Chemical Research) and photographed using a digital camera. Subsequently the gel was stained with SYBR® Gold nucleic acid gel stain (Invitrogen, Cat #S11494) (not shown).
(148) Results
(149) FAM substrate A was converted into ligation product in the presence of the fill-in adapter and T4 DNA ligase (
(150) Different scenarios of 3′ adapter ligation were tested in lanes 4 to 12 (
(151) Conclusion
(152) Conventional adapter ligation required a 5′-phosphate on the FAM substrate, which led to the formation of chimeras if the fill-in adapters were not in excess. Ligation of the 3′ Adapter was more efficient and produced fewer chimeras when the FAM substrate had a 5′ hydroxyl group and the 3′ Adapter had a blocking 3′-deoxythymidine base (option 1), which prevented ligation between adapter molecules and favored the ligation between substrate and adapter. In both cases, an adapter:substrate ratio of 20:1 was favored for ligation product formation.
Example 2
Comparison of Conventional Adapter Ligation to 3′ Adapter Ligation with Sheared, Size-Selected Genomic DNA
(153) Rationale: This experiment was performed to test the effect of polishing of physically sheared genomic DNA on the efficiency of conventional or 3′ adapter ligation.
(154) Materials: Fill-in adapter contains oligonucleotides 13-489 and 13-426 (Table 1) 3′Adapter; 1st oligonucleotide 13-340 (Table 1) and 2nd oligonucleotide option 1 (containing a blocking 3′ deoxythymidine base at the 3′ terminus) 13-559 (Table 1) NEBuffer 2 (New England Biolabs, cat #B7002S) 100 mM 2′-deoxynucleoside 5′-triphosphate (dNTP) Set, PCR Grade (Invitrogen (Life technologies), cat #10297-018) Adenosine 5′-Triphosphate (ATP) (New England Biolabs, cat #P0756S) DNA Polymerase I, Large (Klenow) Fragment (New England Biolabs, cat #M0210S) T4 DNA polymerase (New England Biolabs, cat #M0203S) T4 Polynucleotide Kinase (New England Biolabs, cat #M0201S) Exonuclease III (E. coli) (New England Biolabs, cat #M0293S) Antarctic Phosphatase (New England Biolabs, cat #M0289S) Antarctic Phosphatase reaction buffer (New England Biolabs, cat #B0289S) T4 DNA Ligase (Rapid) (Enzymatics, cat #L6030-HC-L) 10×T4 DNA Ligase Buffer (Enzymatics, cat #B6030) E. coli genomic DNA ATCC 11303 strain (Affymetrix, cat #14380) M220 Focused-ultrasonicator, (Covaris, cat #PN 500295) Pippin Prep (Sage Science) CDF2010 2% agarose, dye free w/internal standards (Sage Science) DNA Clean & Concentrator-5 (Zymo research, cat #D4004) 25 bp ladder DNA size marker (Invitrogen (Life technologies), cat #10488-022)
Method
(155) E. coli genomic DNA (gDNA) was resuspended in DNA suspension buffer (Teknova, cat #T0227) at a concentration of 100 ng/μl. The DNA was fragmented with the M220 Focused-ultrasonicator to 150 base pairs average size. A tight size distribution of fragmented DNA from approximately 150 bp to approximately 185 bp was subsequently isolated on a 2% agarose gel using Pippin Prep.
(156) 200 ng of the size-selected DNA was subjected to the activity of different enzymes. The reactions were assembled in a total volume of 30 μl, comprising a final concentration of 1×NEBuffer 2, 100 μM of each dNTP, 3 units T4 DNA polymerase or 5 units DNA Polymerase I, Large (Klenow) Fragment or 3 units T4 DNA polymerase and 5 units DNA Polymerase I, Large (Klenow) Fragment or 3 units T4 DNA polymerase and 5 units DNA Polymerase I, Large (Klenow) Fragment and 1 unit of Exonuclease III. Another reaction was assembled in a total volume of 30 μl comprising a final concentration 1×NEBuffer 2, 1 mM ATP, 10 units of T4 Polynucleotide Kinase. Another reaction was assembled in a total volume of 30 μl comprising a final concentration 1× Antarctic Phosphatase reaction buffer and 5 units of Antarctic phosphatase. A control reaction was assembled with 200 ng of the size-selected DNA with 1×NEBuffer 2. All reactions were incubated at 37° C. for 30 minutes and the DNA purified using the DNA Clean & Concentrator-5 columns. DNA was eluted in 30 μl of DNA suspension buffer and divided into 2 tubes of 15 μl for subsequent conventional adapter ligation or 3′ adapter ligation. The conventional adapter ligations were assembled in a total volume of 30 μl comprising 1×T4 DNA Ligase Buffer, Fill-in adapter containing oligonucleotides 13-489 (220 pmoles) and 13-426 (440 pmoles), and 1200 units of T4 DNA Ligase (Rapid). The 3′ adapter ligation reactions were assembled in a total volume of 30 μl, containing 1×T4 DNA Ligase Buffer, 220 pmoles of 3′ Adapter 1st oligonucleotide, 440 pmoles of 3′Adapter 2nd oligonucleotide and 1200 units T4 DNA Ligase (Rapid). All reactions were purified using DNA Clean & Concentrator-5-columns. The DNA was resuspended in 10 μl of DNA suspension buffer and was mixed with 10 μl of 2× formamide loading buffer (97% formamide, 10 mM EDTA, 0.01% bromophenol blue and 0.01% xylene cyanol), heated at 95° C. 
for 5 minutes and subsequently run on a pre-cast 6% polyacrylamide gel, TBE-Urea (Invitrogen, Cat #S11494) in an oven at 65° C. The gel was stained with SYBR® Gold nucleic acid gel stain (Invitrogen, Cat #S11494), visualized on a Dark reader light box (Clare Chemical Research) and photographed using a digital camera.
(157) Results
(158) The conventional adapter ligation reactions (
(159) Conclusion
(160) Ligation of blunt adapters to sheared DNA depends heavily on the polishing of this DNA. DNA polymerases such as T4 DNA polymerase, which has a strong 3′ to 5′ exonuclease activity and a 5′ to 3′ polymerase activity, are well suited for this purpose. The conventional adapter ligation reaction depends on the presence of an intact 5′ phosphate on the substrate's blunt end. Ligation of the 3′ adapter does not, since the ligation occurs at the 3′ hydroxyl terminus of the fragmented DNA. Because the 5′ termini of sheared DNA are not enzymatic substrates for T4 DNA polymerase, the 3′ adapter was ligated more successfully than the fill-in adapter (lane 3). The combination of T4 DNA Polymerase plus Klenow and Exonuclease III significantly enhanced the blunt ligation. Exonuclease III activity produced the blunt ends required for ligation of blunt adapters by removing 3′ termini of the DNA that could be damaged. Exonuclease III also possesses a 3′ phosphatase activity, which makes the 3′ terminus accessible to DNA polymerase polishing activity.
Example 3
Temperature Optimization for 5′ Adapter Ligation Using a FAM-Labeled Oligonucleotide Substrate
(161) Rationale: This experiment assessed the effect of reaction temperature and dNTP composition on nick translation mediated 5′ adapter ligation.
(162) Materials: 5′ adapter oligonucleotide for nick-translation (13-144) (Table 1) FAM oligonucleotide substrate (13-581) (Table 1) Oligonucleotide template (13-582) (Table 1) 100 mM 2′-deoxynucleoside 5′-triphosphate (dNTP) Set, PCR Grade (Invitrogen (Life technologies), cat #10297-018) E. coli DNA ligase (New England BioLabs, cat #M0205S) 10× E. coli DNA Ligase Reaction Buffer (New England BioLabs) Taq DNA polymerase, concentrated 25 U/ul (Genscript, cat #E00012) 25 bp ladder DNA size marker (Invitrogen (Life technologies), cat #10488-022)
Method
(163) A first set of nick translation reactions was assembled in a total volume of 30 μl, comprising a final concentration of 1× E. coli DNA ligase Buffer, 30 pmoles of FAM oligonucleotide substrate, 45 pmoles of 5′ adapter oligonucleotide for nick-translation and 45 pmoles of oligonucleotide template, 200 μM of dTTP or a mix of 200 μM of each dTTP/dGTP or 200 μM of each dATP/dTTP/dGTP and 2.5 units of Taq DNA polymerase or no Taq DNA polymerase. The reactions were incubated at 30° C., 40° C. or 50° C. for 30 minutes.
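For orientation, the amounts stated for this reaction can be converted into final concentrations, and the dNTP stock volume can be derived from the 100 mM stock named in the Materials. The following Python sketch is illustrative arithmetic only; the function names are hypothetical, and the source does not state how the dNTP stock was actually diluted or pipetted.

```python
def final_concentration_um(amount_pmol, volume_ul):
    """pmol per μl is numerically equal to μM."""
    return amount_pmol / volume_ul


def stock_volume_ul(final_um, final_volume_ul, stock_um):
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    return final_um * final_volume_ul / stock_um


if __name__ == "__main__":
    volume_ul = 30
    print("FAM substrate:", final_concentration_um(30, volume_ul), "μM")
    print("5' adapter / template:", final_concentration_um(45, volume_ul), "μM")
    # 100 mM dNTP stock = 100,000 μM; target 200 μM of each dNTP in 30 μl.
    print("dNTP stock per reaction:", stock_volume_ul(200, volume_ul, 100_000), "μl")
```

The result of 0.06 μl of undiluted 100 mM stock per nucleotide suggests that an intermediate dilution or master mix would presumably be used in practice, although the source does not describe one.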
(164) A second set of nick translation reactions followed by ligations were assembled in 30 μl comprising a final concentration of 1× E. coli DNA ligase Buffer, 30 pmoles of FAM oligonucleotide substrate, 45 pmoles of 5′ adapter oligonucleotide for nick-translation and 45 pmoles of oligonucleotide template, 200 μM of each dATP/dTTP/dGTP, and 2.5 units of Taq DNA polymerase. The reactions were incubated at 50° C., 53° C., 56° C. or 60° C. for 30 minutes. 10 μl of those reactions were taken for gel analysis. 10 units of E. coli ligase were added to the 20 μl left and incubated at 25° C. for 15 minutes. An additional control reaction was assembled in 30 μl comprising a final concentration of 1× E. coli DNA ligase Buffer, and 30 pmoles of FAM oligonucleotide substrate. 10 μl of those reactions were mixed with 10 μl of 2× formamide loading buffer (97% formamide, 10 mM EDTA, 0.01% bromophenol blue and 0.01% xylene cyanol), heated at 95° C. for 5 minutes and subsequently run on a pre-cast 15% polyacrylamide gel, TBE-Urea (Invitrogen, cat #S11494) in an oven at 65° C., visualized on a Dark reader light box (Clare Chemical Research) and photographed using a digital camera.
(165) Results
(166) As shown in
(167) The efficiency of nick translation and the amount of FAM oligonucleotide substrate cleaved was highly dependent on the temperature of the reaction. At 60° C., the FAM oligonucleotide substrate was almost entirely processed to smaller species (
(168) Conclusion
(169) During nick translation, the number of bases cleaved from the FAM oligonucleotide substrate depended on the complementary dNTPs introduced in the reaction and the temperature at which the reactions took place. During the nick translation reaction, Taq DNA polymerase cleaves the 5′ terminus of the FAM oligonucleotide substrate and generates a terminal 5′ phosphate that is essential for E. coli ligase to ligate two fragments. FAM oligonucleotide substrates cleaved by nick translation at higher temperatures were poor substrates for ligation by E. coli ligase because of a potential gap formed between the 3′ terminus of the 5′ adapter oligonucleotide and the 5′ terminus of the FAM oligonucleotide substrate.
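The dependence of the extent of nick translation on the dNTPs supplied can be sketched with a deliberately simplified model. The Python example below assumes that the substrate is perfectly base-paired to the template, so that each newly incorporated nucleotide is identical to the substrate base it replaces, and that extension halts at the first position whose nucleotide is absent. The sequence is hypothetical (Table 1 is not reproduced here), and temperature, ligation and the exact flap cleavage position are ignored.

```python
def bases_replaced(substrate_5_to_3, available_dntps):
    """Count the leading substrate bases that extension could replace.

    Simplified model: extension proceeds base by base from the substrate's
    5' end and stops at the first base whose dNTP is not in the reaction.
    """
    allowed = {name[1] for name in available_dntps}  # "dATP" -> "A"
    count = 0
    for base in substrate_5_to_3.upper():
        if base not in allowed:
            break
        count += 1
    return count


if __name__ == "__main__":
    hypothetical_substrate = "TGACCTGA"  # illustrative only, not from Table 1
    for mix in (["dTTP"], ["dTTP", "dGTP"], ["dATP", "dTTP", "dGTP"]):
        print("/".join(mix), "->", bases_replaced(hypothetical_substrate, mix))
```

For this hypothetical sequence the three dNTP mixes used in the first set of reactions would permit replacement of one, two and three bases, respectively, which illustrates how a restricted dNTP mix can limit the extent of nick translation.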
Example 4
Analysis of dNTP Composition Effects on 5′ Adapter Ligation
(170) Rationale: This experiment was performed to assess the degree of nick-translation that occurs in the presence of varied dNTP composition and the effect on the coupled ligation reaction.
(171) Materials: 5′ adapter oligonucleotide for nick-translation (13-144) (Table 1) FAM oligonucleotide substrate (13-581) (Table 1) Oligonucleotide template (13-582) (Table 1) 100 mM 2′-deoxynucleoside 5′-triphosphate (dNTP) Set, PCR Grade (Invitrogen (Life technologies), cat #10297-018) 25 bp ladder DNA size marker (Invitrogen (Life technologies), cat #10488-022) E. coli DNA ligase (Enzymatics, cat #L6090L) 10× E. coli DNA ligase Buffer (Enzymatics, cat #B6090) Taq-B DNA polymerase (Enzymatics, cat #P7250L)
Method
(172) The reactions were assembled in a total volume of 30 μl, comprising a final concentration of 1× E. coli DNA ligase Buffer, 30 pmoles of FAM oligonucleotide substrate, 45 pmoles of 5′ adapter oligonucleotide for nick-translation and 45 pmoles of oligonucleotide template, 200 μM of each of the 4 dNTPs or a mix of 200 μM of each: dCTP, dTTP, dGTP or dATP, dTTP, dGTP or dATP, dCTP, dGTP or dATP, dTTP, dCTP or no dNTP, 10 units of E. coli ligase and 10 units of Taq-B DNA polymerase. All reactions were incubated at 40° C. for 30 minutes. 10 μl of those reactions were mixed with 10 μl of 2× formamide loading buffer (97% formamide, 10 mM EDTA, 0.01% bromophenol blue and 0.01% xylene cyanol), heated at 95° C. for 5 minutes and subsequently run on a pre-cast 15% polyacrylamide gel, TBE-Urea (Invitrogen, Cat #S11494) in an oven at 65° C., visualized on a Dark reader light box (Clare Chemical Research) and photographed using a digital camera (lower panel). Subsequently the gel was stained with SYBR® Gold nucleic acid gel stain (Invitrogen, Cat #S11494), visualized on a Dark reader light box (Clare Chemical Research) and photographed using a digital camera (upper panel).
(173) Results
(174) The first two lanes of
(175) Conclusion
(176) Phosphorylation of the 5′ terminus of the FAM oligonucleotide substrate is required for ligation. The polymerase activity of Taq DNA polymerase in the presence of dNTPs is required to perform the extension of the 5′ adapter, which creates a flap at the 5′ terminus of the FAM oligonucleotide substrate. This flap is a good substrate for the 5′ flap endonuclease activity of Taq DNA polymerase, generating a perfect 5′ phosphate substrate for ligation by E. coli ligase. The ligation occurs even if the flap is only formed by one base. The ligation also occurs when all four dNTPs are present, which does not restrict the length of the flap or the extent of nick translation, suggesting that the ligation occurs immediately after a 5′ phosphate is created at the 5′ terminus of the FAM oligonucleotide substrate.
Example 5
Coupled Nick Translation-Ligation Reaction with Thermostable Enzymes
(177) Rationale: This experiment was performed to assess the effect of reaction temperature and number of units of Taq DNA ligase enzyme in the coupled reaction.
(178) Materials: 5′ adapter oligonucleotide for nick-translation (13-144) (Table 1) FAM oligonucleotide substrate (13-581) (Table 1) Oligonucleotide template (13-582) (Table 1) 100 mM 2′-deoxynucleoside 5′-triphosphate (dNTP) Set, PCR Grade (Invitrogen (Life technologies), cat #10297-018) Taq DNA ligase (New England BioLabs, cat #M0208S) 10×Taq DNA ligase Reaction Buffer (New England BioLabs) Taq DNA polymerase, concentrated 25 U/ul (Genscript, cat #E00012)
Method
(179) The reactions were assembled in a total volume of 30 μl, comprising a final concentration of 1× Taq DNA ligase reaction Buffer, 30 pmoles of FAM oligonucleotide substrate, 45 pmoles of 5′ adapter oligonucleotide for nick-translation and 45 pmoles of oligonucleotide template, 200 μM of each: dATP, dTTP, dGTP or dTTP, 40 units of Taq DNA ligase, or 80 units Taq DNA ligase, or 120 units Taq DNA ligase and 10 units of Taq DNA polymerase. Reactions were incubated at 45° C., 50° C., 55° C., or 60° C., for 30 minutes. 10 μl of those reactions were mixed with 10 μl of 2× formamide loading buffer (97% formamide, 10 mM EDTA, 0.01% bromophenol blue and 0.01% xylene cyanol), heated at 95° C. for 5 minutes and subsequently run on a pre-cast 15% polyacrylamide gel, TBE-Urea (Invitrogen, Cat #S11494) in an oven at 65° C., visualized on a Dark reader light box (Clare Chemical Research) and photographed using a digital camera.
(180) Results
(181) Taq DNA polymerase elongated the 3′ hydroxyl terminus of the 5′ adapter oligonucleotide, removing nucleotides on the FAM oligonucleotide substrate by its 5′ flap endonuclease activity. Adding dTTP/dGTP/dATP (
(182) Conclusion
(183) During the nick translation reaction, Taq DNA polymerase cleaves the 5′ terminus of the FAM oligonucleotide substrate and generates a 5′ phosphate terminus essential for Taq DNA ligase between 45° C. and 60° C. to perform ligation. The ligation was reduced at 60° C. The concentration of Taq DNA ligase in the reaction also affected the efficiency of the ligation, as more product was observed in the presence of 120 U enzyme compared to 80 U and 40 U.
Example 6
Coupled Displacement-Cleavage-Ligation Reaction
(184) Rationale: This experiment was performed to demonstrate that either thermostable Taq DNA ligase or thermolabile E. coli ligase can be combined with Taq DNA Polymerase in the coupled displacement-cleavage ligation reaction.
(185) Materials: 5′ adapter oligonucleotide for displacement-cleavage (13-156) (Table 1) FAM oligonucleotide substrate (13-581) (Table 1) Oligonucleotide template (13-582) (Table 1) Taq DNA ligase (New England BioLabs, cat #M0208S) 10×Taq DNA ligase Reaction Buffer (New England BioLabs) Taq DNA polymerase, concentrated 25 U/ul (Genscript, cat #E00012) E. coli DNA ligase (New England BioLabs, cat #M0205S) 10× E. coli DNA Ligase Reaction Buffer (New England BioLabs)
Method
(186) The reactions were assembled in a total volume of 30 μl, comprising a final concentration of 1× E. coli DNA ligase reaction Buffer or 1×Taq DNA ligase reaction Buffer, 30 pmoles of FAM oligonucleotide substrate, 45 pmoles of 5′ adapter oligonucleotide for displacement-cleavage and 45 pmoles of oligonucleotide template, 10 units of E. coli DNA ligase or 40 units Taq DNA ligase, and 10 units of Taq DNA polymerase. Reactions were incubated at 40° C. or 45° C. for 30 minutes. 10 μl of those reactions were mixed with 10 μl of 2× formamide loading buffer (97% formamide, 10 mM EDTA, 0.01% bromophenol blue and 0.01% xylene cyanol), heated at 95° C. for 5 minutes and subsequently run on a pre-cast 15% polyacrylamide gel, TBE-Urea (Invitrogen, Cat #S11494) in an oven at 65° C., visualized on a Dark reader light box (Clare Chemical Research) and photographed using a digital camera.
(187) Results
(188) The 5′ adapter oligonucleotide for displacement-cleavage has an extra matching base “T” at its 3′ terminus, which overlaps with the 5′ terminus of the FAM oligonucleotide substrate. When the 3′ terminus of the 5′ adapter oligonucleotide displaces the 5′ terminus of the FAM oligonucleotide substrate, the 5′ flap endonuclease activity of Taq DNA polymerase cleaves the 5′ terminus of the FAM oligonucleotide substrate to create a 5′ phosphate which is essential for the ligation with E. coli ligase (
(189) Conclusion
(190) In the absence of dNTPs, no extension of the 5′ adapter occurs. However, Taq DNA polymerase can cleave the 5′ terminus of the FAM oligonucleotide substrate and generate a terminal 5′ phosphate that is essential for E. coli DNA ligase or Taq DNA ligase to perform ligation.
Example 7
Coupled Displacement-Cleavage-Ligation Reaction with Either “N” Universal/Degenerate or “T” Substrate-Specific 5′ Adapter 3′ Overhang
(191) Rationale: This experiment demonstrates that 5′ adapter ligation using a flap endonuclease can be performed if either the 5′ adapter 3′ terminal overhang is a sequence-specific match or if it is composed of a degenerate non sequence-specific ‘N’.
(192) Materials: 5′ adapter oligonucleotide for displacement-cleavage “T” (13-607) (Table 1) 5′ adapter oligonucleotide for displacement-cleavage “N” (13-596) (Table 1) FAM oligonucleotide substrate (13-581) (Table 1) Oligonucleotide template (13-582) (Table 1) Taq DNA ligase (New England BioLabs, cat #M0208S) 10×Taq DNA ligase Reaction Buffer (New England BioLabs) Taq DNA polymerase, concentrated 25 U/ul (Genscript, cat #E00012) E. coli DNA ligase (New England BioLabs, cat #M0205S) 10× E. coli DNA Ligase Reaction Buffer (New England BioLabs)
Method
(193) The reactions were assembled in a total volume of 30 μl, comprising a final concentration of 1×Taq DNA ligase reaction buffer, 30 pmoles of FAM oligonucleotide substrate, 45 pmoles of 5′ adapter oligonucleotide “T” or 45 pmoles of 5′ adapter oligonucleotide “N” or 180 pmoles of 5′ adapter oligonucleotide “N” or 450 pmoles of 5′ adapter oligonucleotide “N” and 45 pmoles of oligonucleotide template, 40 units Taq DNA ligase, and 10 units of Taq DNA polymerase. Reactions were incubated at 45° C. or 50° C. or 55° C. for 30 minutes or cycling 8 times between 45° C. for 3 minutes, 65° C. for 15 seconds. 10 μl of those reactions were mixed with 10 μl of 2× formamide loading buffer (97% formamide, 10 mM EDTA, 0.01% bromophenol blue and 0.01% xylene cyanol), heated at 95° C. for 5 minutes and subsequently run on a pre-cast 15% polyacrylamide gel, TBE-Urea (Invitrogen, Cat #S11494) in an oven at 65° C., visualized on a Dark reader light box (Clare Chemical Research) and photographed using a digital camera.
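One way to read the three amounts of 5′ adapter oligonucleotide “N” is in terms of how much of each pool would carry the matching 3′ terminal base. The Python sketch below assumes, purely for illustration, that the degenerate position is an equimolar mix of the four bases; the source does not state the synthesis ratio, and the function name is hypothetical.

```python
def matching_adapter_pmol(total_pmol, degenerate_positions=1):
    """Amount of a degenerate adapter pool expected to match one specific
    sequence, assuming equimolar A/C/G/T at each degenerate position."""
    return total_pmol / 4 ** degenerate_positions


if __name__ == "__main__":
    for total in (45, 180, 450):  # pmoles of adapter "N" used in the reactions
        print(total, "pmoles N ->", matching_adapter_pmol(total), "pmoles matching")
```

Under that assumption, 180 pmoles of the “N” adapter would contain about 45 pmoles of the matching species, the same amount as the 45 pmoles of adapter “T” used in the sequence-specific reaction.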
(194) Results
(195) When the 5′ adapter oligonucleotide for displacement-cleavage has a “T” at its 3′ terminus matching the oligonucleotide template (
(196) Conclusion
(197) To allow efficient 5′ adapter ligation coupled to displacement-cleavage using the 5′ adapter oligonucleotide “N”, cycling between a first temperature for Taq DNA ligase to operate and a second temperature where the duplex between the oligonucleotide template and the 5′ adapter oligonucleotide “N” could dissociate was critical. The cycling conditions permitted multiple associations between the 5′ adapter oligonucleotide “N” and the oligonucleotide template where the displacement-cleavage reaction occurred only if the 3′ terminal base of the 5′ adapter oligonucleotide is a perfect match to the template and can displace the 5′ terminus of the FAM oligonucleotide substrate.
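The cycling condition used in this example (8 cycles of 45° C. for 3 minutes and 65° C. for 15 seconds) can be written out as an explicit program. The following Python sketch builds that step list and sums its hold times; the function names and the tuple representation are assumptions of this illustration, and instrument ramp times are not included.

```python
def cycling_program(cycles, steps):
    """Expand a per-cycle list of (temperature_c, seconds) steps."""
    return [step for _ in range(cycles) for step in steps]


def total_seconds(program):
    """Sum the hold times of all steps, ignoring ramp times."""
    return sum(seconds for _, seconds in program)


if __name__ == "__main__":
    # 45 C for 3 minutes (ligation), then 65 C for 15 seconds (duplex dissociation).
    program = cycling_program(8, [(45, 180), (65, 15)])
    print(len(program), "steps,", total_seconds(program) / 60, "minutes of hold time")
```

The hold times add up to 26 minutes, which is comparable to the 30 minute isothermal incubations used in the other reactions of this example.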
Example 8
Coupled Nick-Translation-Ligation Reaction Using DNA Polymerase I
(198) Rationale: This experiment demonstrates that a DNA polymerase I, which possesses 5′-3′ exonuclease activity, can also participate in the nick translation coupled adapter ligation method.
(199) Materials: 5′ adapter oligonucleotide for nick-translation (13-144) (Table 1) FAM oligonucleotide substrate (13-581) (Table 1) Oligonucleotide template (13-582) (Table 1) 100 mM 2′-deoxynucleoside 5′-triphosphate (dNTP) Set, PCR Grade (Invitrogen (Life technologies), cat #10297-018) 25 bp ladder DNA size marker (Invitrogen (Life technologies), cat #10488-022) E. coli DNA ligase (Enzymatics, cat #L6090L) 10× E. coli DNA ligase Buffer (Enzymatics, cat #B6090) Taq-B DNA polymerase (Enzymatics, cat #P7250L) DNA polymerase I (New England Biolabs, cat #M0209S)
Method
(200) The reactions were assembled in a total volume of 30 μl, comprising a final concentration of 1× E. coli DNA ligase Buffer, 30 pmoles of FAM oligonucleotide substrate, 45 pmoles of 5′ adapter oligonucleotide for nick-translation and 45 pmoles of oligonucleotide template, 200 μM of each of the 4 dNTPs, 10 units of E. coli ligase and 10 units of Taq-B DNA polymerase or 5 units of DNA polymerase I or 1 unit of DNA polymerase I. Reactions were incubated at 40° C., 18° C., 16° C. or 14° C. for 30 minutes. 10 μl of each reaction was mixed with 10 μl of 2× formamide loading buffer (97% formamide, 10 mM EDTA, 0.01% bromophenol blue and 0.01% xylene cyanol), heated at 95° C. for 5 minutes and subsequently run on a pre-cast 15% polyacrylamide gel, TBE-Urea (Invitrogen, Cat #S11494) in an oven at 65° C., visualized on a Dark reader light box (Clare Chemical Research) with and without SYBR Gold (upper panel and lower panel, respectively), and photographed using a digital camera.
(201) Results
(202) The first lane of
(203) Conclusion
(204) Taq-B DNA polymerase (thermophilic polymerase) and DNA polymerase I (mesophilic polymerase) can both be used to perform the nick translation mediated ligation but they require different conditions to be fully active. They both generated a 69 base product which was the result of excision of the 5′ end followed by ligation but they use different mechanisms. While Taq-B created a flap that was cut to produce the required 5′ phosphorylated end for the ligation by E. coli ligase, DNA polymerase I removed nucleotides one by one in front of the growing strand and generated the 5′ phosphorylated nucleotide which was the perfect substrate for E. coli ligase to join the two fragments. DNA polymerase I can be used to perform 5′ adapter ligation mediated by nick translation.
Example 9
Polishing is Required for Blunt Ligation of Physically Sheared DNA and Dephosphorylation Prevents the Formation of Chimeric Ligation Products
(205) Rationale: This experiment demonstrates the importance of end polishing and dephosphorylation for blunt ligation of adapters to physically sheared DNA substrates.
(206) Materials: Blue Buffer (Enzymatics, cat #B0110) T4 DNA Ligase (Rapid) (Enzymatics, cat #L6030-HC-L) 10×T4 DNA Ligase Buffer (Enzymatics, cat #B6030) 100 mM 2′-deoxynucleoside 5′-triphosphate (dNTP) Set, PCR Grade (Invitrogen (Life technologies), cat #10297-018) Adenosine 5′-Triphosphate (ATP) (New England Biolabs, cat #P0756S) DNA Polymerase I, Large (Klenow) Fragment (New England Biolabs, cat #M0210S) T4 DNA polymerase (New England Biolabs, cat #M0203S) T4 Polynucleotide Kinase (New England Biolabs, cat #M0201S) Shrimp alkaline phosphatase (Affymetrix, cat #78390) T4 DNA Ligase (Rapid) (Enzymatics, cat #L6030-HC-L) 10×T4 DNA Ligase Buffer (Enzymatics, cat #B6030) E. coli genomic DNA ATCC 11303 strain (Affymetrix, cat #14380) M220 Focused-ultrasonicator, (Covaris, cat #PN 500295) Pippin Prep (Sage Science) DNA Clean & Concentrator-5—(Zymo research, cat #D4004) CDF2010 2% agarose, dye free w/internal stds (Sage Science)
Method
(207) E. coli gDNA was resuspended in DNA suspension buffer (Teknova, cat #T0227) at a concentration of 100 ng/μl. The DNA was fragmented with the M220 Focused-ultrasonicator to 150 base pairs average size. A tight distribution of fragmented DNA from ˜150 bp to ˜185 bp was subsequently size-selected from a 2% agarose gel using the Pippin Prep.
(208) In a set of reactions A, 100 ng or 500 ng of the size-selected DNA was subjected to the activity of polishing enzymes. The reactions were assembled in a total volume of 30 μl, comprising a final concentration of 1× Blue buffer, 100 μM of each dNTP, 3 units T4 DNA Polymerase, 5 units DNA Polymerase I, Large (Klenow) Fragment, 1 mM ATP, and 10 units of T4 Polynucleotide Kinase. The reactions were incubated at 30° C. for 20 minutes. The DNA was purified using the DNA Clean & Concentrator-5 columns. The DNA was eluted in 15 μl of DNA suspension buffer and was then either subjected to a dephosphorylation reaction (reactions B) followed by adapter ligation, or placed directly into the ligation reaction without dephosphorylation. The dephosphorylation reactions were assembled in a 30 μl final volume, including the processed DNA, 1× Blue buffer, and 1 unit of shrimp alkaline phosphatase. The reactions were incubated at 37° C. for 10 minutes. The DNA was purified using the DNA Clean & Concentrator-5 columns and eluted in 15 μl of DNA suspension buffer.
(209) In a set of reactions C, 100 ng of the size-selected DNA was subjected to dephosphorylation followed by polishing; in a set of reactions D, the DNA was subjected directly to polishing. The dephosphorylation reactions were assembled in a 30 μl final volume, including the processed DNA, 1× Blue buffer, and 1 unit of shrimp alkaline phosphatase. The reactions were incubated at 37° C. for 10 minutes. The DNA was purified using the DNA Clean & Concentrator-5 columns and eluted in 15 μl of DNA suspension buffer. The polishing reactions D were assembled in a total volume of 30 μl, comprising a final concentration of 1× Blue buffer, 100 μM of each dNTP, 3 units T4 DNA polymerase, and 5 units DNA Polymerase I, Large (Klenow) Fragment (lanes 6 to 7). The DNA was purified using the DNA Clean & Concentrator-5 columns and eluted in 15 μl of DNA suspension buffer.
(210) After purification, all the previous reactions were subject to ligation reactions. Reactions were assembled in a final volume of 30 μl, comprising the processed DNA, 1×T4 DNA ligase reaction buffer and 1200 units of T4 DNA ligase. The reactions were incubated at 25° C., for 15 minutes. 33 ng of DNA from each ligation was mixed with 2× formamide loading buffer (97% formamide, 10 mM EDTA, 0.01% bromophenol blue and 0.01% xylene cyanol), heated at 95° C. for 5 minutes and subsequently run on a pre-cast 15% polyacrylamide gel, TBE-Urea (Invitrogen, Cat #S11494) in an oven at 65° C., stained with SYBR Gold, visualized on a Dark reader light box (Clare Chemical Research) and photographed using a digital camera.
(211) Results
(212) Before polishing, physically sheared DNA was not a suitable substrate for ligation to blunt ended adapters by T4 DNA ligase (
(213) Conclusion
(214) Blunt ligation efficiency of physically sheared DNA depended on end polishing by DNA polymerases. The ligation was also improved by the addition of T4 Polynucleotide Kinase, which phosphorylated the 5′ terminus of the DNA fragments and dephosphorylated the 3′ terminus. The concentration of DNA also influenced the amount of ligation and the formation of chimeric products. At higher concentration, DNA is more likely to form chimeric products in the presence of T4 DNA ligase. Alkaline phosphatases remove 5′ phosphates (which are required for ligation) and prevent the formation of chimeric ligation products (concatamers).
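To make the concentration effect concrete, the 100 ng and 500 ng inputs used in this example can be converted to approximate molar amounts of fragment ends. The following Python sketch is illustrative only. It assumes an average mass of about 660 g/mol per base pair of double-stranded DNA (a common approximation, not a value stated in this disclosure) and a hypothetical mean fragment length of 165 bp, chosen as a rough midpoint of the ˜150 bp to ˜185 bp size selection. The function names are invented for this illustration.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair; textbook approximation, assumed here


def dsdna_pmol(mass_ng, length_bp):
    """Approximate picomoles of double-stranded fragments in a given mass."""
    # ng -> g is 1e-9, mol -> pmol is 1e12, so the net factor is 1e3.
    return mass_ng * 1e3 / (length_bp * AVG_BP_MASS)


def end_pmol(mass_ng, length_bp):
    """Each double-stranded fragment contributes two ligatable ends."""
    return 2 * dsdna_pmol(mass_ng, length_bp)


if __name__ == "__main__":
    mean_length_bp = 165  # hypothetical midpoint of the size-selected range
    for mass_ng in (100, 500):
        fragments = dsdna_pmol(mass_ng, mean_length_bp)
        ends = end_pmol(mass_ng, mean_length_bp)
        print(f"{mass_ng} ng: ~{fragments:.2f} pmol fragments, ~{ends:.2f} pmol ends")
```

Under these assumptions, a five-fold increase in input mass gives a five-fold increase in the concentration of ends in the same 30 μl reaction, which is consistent with the observation that chimeric products are more likely at higher DNA concentration.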
Example 10
NGS Libraries have Increased Yield when Prepared Using 5′ Base Trimming Coupled to Adapter Ligation Reaction
(215) Rationale: This experiment demonstrates the utility of the reactions presented in their exemplary application to NGS library construction, particularly the increase in library yield that results from including 5′ base trimming coupled to 5′ adapter ligation. Libraries were constructed from size-selected sheared DNA so library products could be easily visualized by gel electrophoresis.
(216) Materials: Blue Buffer (Enzymatics, cat #B0110) T4 DNA Ligase (Rapid) (Enzymatics, cat #L6030-HC-L) 10×T4 DNA Ligase Buffer (Enzymatics, cat #B6030) 100 mM 2′-deoxynucleoside 5′-triphosphate (dNTP) Set, PCR Grade (Invitrogen (Life technologies), cat #10297-018) Adenosine 5′-Triphosphate (ATP) (New England Biolabs, cat #P0756S) Klenow Fragment (Enzymatics, cat #P7060L) T4 DNA polymerase (Enzymatics, cat #P7080L) T4 Polynucleotide Kinase (Enzymatics, cat #Y904L) Shrimp alkaline phosphatase (Affymetrix, cat #78390) T4 DNA Ligase (Rapid) (Enzymatics, cat #L6030-HC-L) 10×T4 DNA Ligase Buffer (Enzymatics, cat #B6030) 3′Adapter; 1st oligonucleotide 13-501 (Table 1) 3′Adapter; 2nd oligonucleotide 13-712 (Table 1) E. coli genomic DNA ATCC 11303 strain (Affymetrix, cat #14380) M220 Focused-ultrasonicator, (Covaris, cat #PN 500295) E. coli DNA ligase (Enzymatics, cat #L6090L) E. coli DNA ligase buffer (Enzymatics, cat #B6090) Uracil-DNA glycosylase (Enzymatics, cat #G5010L) Taq-B DNA polymerase (Enzymatics, cat #P7250L) 5′ adapter oligonucleotide for nick-translation (13-489) (Table 1) 5′ adapter oligonucleotide for displacement-cleavage (13-595) (Table 1) Taq DNA ligase (Enzymatics, cat #L6060L) SPRIselect (Beckman coulter, cat #B23419)
Methods
(217) E. coli genomic DNA was resuspended in DNA suspension buffer (Teknova, cat #T0227) at a concentration of 100 ng/μl. The DNA was fragmented with the M220 Focused-ultrasonicator to 150 base pairs average size. A tight distribution of fragmented DNA from ˜150 bp to ˜185 bp was subsequently size-selected on a 2% agarose gel using the Pippin Prep.
(218) 100 ng of the size-selected E. coli genomic DNA was used to prepare a library with the enhanced adapter ligation method. The polishing reaction was assembled in 30 μl, comprising a final concentration of 1× Blue buffer, 100 μM of each dNTP, 3 units T4 DNA polymerase, 5 units DNA Polymerase I, Large (Klenow) Fragment, 10 units of T4 Polynucleotide Kinase. The reaction was incubated at 37° C. for 20 minutes. The DNA was purified using the DNA Clean & Concentrator-5 and eluted in 15 μl with DNA suspension buffer. The 3′ Adapter ligation reaction was assembled in 30 μl including, 1×T4 DNA ligase buffer, 220 pmoles of the 3′ Adapter 1st oligonucleotide, 440 pmoles of the 3′ Adapter 2nd oligonucleotide, the 15 μl of DNA purified and 1200 units of T4 DNA ligase. The reaction was incubated at 25° C. for 15 minutes. The DNA was brought up to a 50 μl volume and purified and size selected using 70 μl SPRIselect beads (ratio 1.4×). DNA was eluted in 15 μl of DNA resuspension buffer. The partial degradation of the 3′ adapter, annealing of the 5′ adapter, 5′-end trimming and ligation of the 5′ adapter all took place in the next reaction which was assembled in a final volume of 30 μl containing 1× E. coli DNA ligase buffer or 1×Taq DNA ligase buffer, 200 μM of each dNTPs or 200 μM of each dATP, dTTP, dGTP or no dNTPs, 200 pmoles of 5′ adapter oligonucleotide for nick-translation or 5′ adapter oligonucleotide for displacement-cleavage, 10 units of E. coli ligase or 40 units of Taq DNA ligase, 2 units of uracil-DNA glycosylase, 10 units of Taq-B DNA polymerase and 15 μl of the DNA purified after the 3′ Adapter ligation reaction. The reaction was incubated at 40° C. or 45° C. for 10 minutes or with 30 cycles of (45° C. for 45 seconds-65° C. for 5 seconds)(library 5). The DNA was brought up to a 50 μl volume and purified and size selected using 40 μl of SPRIselect beads (ratio 0.8×). 
The DNA was eluted in 20 μl and quantified by qPCR using the Kapa Library Quantification Kit—Illumina/Universal (cat #KK4824).
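The bead ratios quoted in this method (70 μl of beads for a 1.4× ratio and 40 μl for a 0.8× ratio, each added to a 50 μl sample) follow from dividing the bead volume by the sample volume. The short Python sketch below shows that arithmetic. The function names are hypothetical and are used only for illustration.

```python
def spri_ratio(bead_volume_ul, sample_volume_ul):
    """Bead-to-sample volume ratio, e.g. 70 ul of beads on 50 ul of sample."""
    if sample_volume_ul <= 0:
        raise ValueError("sample volume must be positive")
    return bead_volume_ul / sample_volume_ul


def bead_volume_for_ratio(ratio, sample_volume_ul):
    """Bead volume (ul) needed to reach a target ratio on a given sample volume."""
    return ratio * sample_volume_ul


if __name__ == "__main__":
    # Bead and sample volumes taken from the method text above.
    for beads_ul in (70, 40):
        print(f"{beads_ul} ul beads / 50 ul sample = {spri_ratio(beads_ul, 50):.2f}x")
```

The same helper can be used in reverse, for example to confirm that a 0.8× cleanup of a 50 μl sample calls for 40 μl of beads.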
(219) Results
(220) The library concentrations were reported on the plot (
(221) Conclusion
(222) Libraries were successfully made with the disclosed adapter ligation method. The 5′-end DNA trimming by Taq DNA polymerase allows a three-fold increase in the yield of 5′ adapter ligation product when compared to libraries that have no 5′ end processing step (libraries 1 vs 2). Both Taq DNA ligase (library 4) and E. coli ligase (library 3) efficiently ligated the 5′ adapter after the nick-translation. Taq DNA ligase also ligated the 5′ adapter after the displacement-cleavage (library 5). Using 4 dNTPs (library 2) instead of 3 (libraries 3 and 4) during the nick-translation may allow the ligation of more DNA substrate to the 5′ adapter.
Example 11
Sequence Analysis of NGS Libraries Prepared Using 5′ Base Trimming Coupled to Adapter Ligation
(223) Rationale: This experiment demonstrates the utility of the reactions presented in their exemplary application to NGS library construction. Libraries were constructed from sheared E. coli DNA and then sequenced in order to demonstrate the superior evenness of coverage obtained over a wide base composition of the genome.
(224) Materials: Blue Buffer (Enzymatics, cat #B0110) T4 DNA Ligase (Rapid) (Enzymatics, cat #L6030-HC-L) 10×T4 DNA Ligase Buffer (Enzymatics, cat #B6030) 100 mM 2′-deoxynucleoside 5′-triphosphate (dNTP) Set, PCR Grade (Invitrogen (Life technologies), cat #10297-018) Adenosine 5′-Triphosphate (ATP) (New England Biolabs, cat #P0756S) Klenow Fragment (Enzymatics, cat #P7060L) T4 DNA polymerase (Enzymatics, cat #P7080L) T4 Polynucleotide Kinase (Enzymatics, cat #Y904L) Shrimp alkaline phosphatase (Affymetrix, cat #78390) T4 DNA Ligase (Rapid) (Enzymatics, cat #L6030-HC-L) 10×T4 DNA Ligase Buffer (Enzymatics, cat #B6030) 3′Adapter; 1st oligonucleotide 13-510 (Table 1) 3′Adapter; 2nd oligonucleotide 13-712 (Table 1) E. coli genomic DNA ATCC 11303 strain (Affymetrix, cat #14380) M220 Focused-ultrasonicator, (Covaris, cat #PN 500295) E. coli DNA ligase (Enzymatics, cat #L6090L) E. coli DNA ligase buffer (Enzymatics, cat #B6090) Uracil-DNA glycosylase (Enzymatics, cat #G5010L) Taq-B DNA polymerase (Enzymatics, cat #P7250L) 5′ adapter oligonucleotide for nick-translation (13-489) SPRIselect (Beckman coulter, cat #B23419)
Method
(225) E. coli genomic DNA was resuspended in DNA suspension buffer (Teknova, cat #T0227) at a concentration of 100 ng/μl. The DNA was fragmented with the M220 Focused-ultrasonicator to 150 base pairs average size. 100 ng of the Covaris-fragmented E. coli genomic DNA was used to prepare a library. A first reaction of dephosphorylation was assembled in a total volume of 15 μl, comprising a final concentration of 1× Blue buffer, 100 ng of fragmented E. coli genomic DNA and 1 unit of shrimp alkaline phosphatase. The reaction was incubated at 37° C. for 10 minutes. The shrimp alkaline phosphatase was inactivated for 5 minutes at 65° C. The polishing reaction was assembled in 30 μl, comprising a final concentration of 1× Blue buffer, 100 μM of each dNTP, 3 units T4 DNA polymerase, 5 units DNA Polymerase I, Large (Klenow) Fragment and 15 μl of the dephosphorylation reaction. The reaction was incubated at 20° C. for 30 minutes. The DNA was purified using the DNA Clean & Concentrator-5. The DNA was eluted in 15 μl with DNA suspension buffer. The 3′ Adapter ligation reaction was assembled in 30 μl including 1×T4 DNA ligase buffer, 220 pmoles of the 3′ Adapter 1st oligonucleotide, 440 pmoles of the 3′ Adapter 2nd oligonucleotide, the 15 μl of DNA purified after polishing and 1200 units of T4 DNA ligase. The reaction was incubated at 25° C. for 15 minutes. After adjusting the volume to 50 μl, the DNA was purified and size selected using 45 μl SPRIselect beads (ratio 0.9×). DNA was eluted in 15 μl of DNA resuspension buffer. The partial degradation of the 3′ adapter, annealing of the 5′ adapter, 5′-end DNA trimming and ligation of the 5′ adapter all took place in the next reaction, which was assembled in a final volume of 30 μl containing 1× E. coli DNA ligase buffer, 200 μM of each dNTP, 200 pmoles of 5′ adapter oligonucleotide for nick-translation, 10 units of E. coli ligase, 2 units of uracil-DNA glycosylase, 10 units of Taq-B DNA polymerase and 15 μl of the DNA purified after the 3′ Adapter ligation reaction. The reaction was incubated at 40° C. for 10 minutes. After adjusting the volume to 50 μl, the DNA was purified using 70 μl of SPRIselect beads (ratio 1.4×). The DNA was eluted in 20 μl, and quantified by qPCR using the Kapa Library Quantification Kit, Illumina/Universal (cat #KK4824). DNA was denatured for 5 minutes with a final concentration of 0.1 mM of sodium hydroxide and 600 μl of 10 pM library was loaded on a MiSeq (Illumina).
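The loading step above implies a dilution from the qPCR-measured library stock down to 600 μl at 10 pM. The Python sketch below applies the standard C1·V1 = C2·V2 relation to that step. It is a simplified illustration: it treats the dilution as a single step and ignores the intermediate denaturation volumes, which the text does not specify, and the function name is hypothetical. The 2.8 nM stock value is the concentration reported in the Results for this example.

```python
def stock_volume_ul(stock_nM, target_pM, final_volume_ul):
    """Volume of stock (ul) needed so final_volume_ul ends up at target_pM."""
    stock_pM = stock_nM * 1000.0  # 1 nM = 1000 pM
    if target_pM > stock_pM:
        raise ValueError("target concentration exceeds stock concentration")
    return target_pM * final_volume_ul / stock_pM


if __name__ == "__main__":
    volume = stock_volume_ul(stock_nM=2.8, target_pM=10, final_volume_ul=600)
    print(f"~{volume:.2f} ul of a 2.8 nM library brought to 600 ul gives 10 pM")
```

Because only about 2 μl of stock is needed in this single-step view, a practical workflow would typically use an intermediate dilution to avoid pipetting very small volumes; that choice is outside what the method text states.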
(226) Results
(227) The library concentration as quantified by qPCR was 2.8 nM. Paired-end reads of 76 bases were generated by the v2 chemistry of the Illumina MiSeq. 928K/mm.sup.2 clusters were generated and the Q30 scores were 97.8% and 96.9% for the first and second read, respectively. The sequence data quality was assessed using the FastQC report (Babraham Bioinformatics). A summary of the analysis showed 9 green check marks and 2 yellow exclamation points (warning), but no red X (failed) was observed (
(228) Conclusion
(229) A library was successfully made using fragmented E. coli genomic DNA. The sequencing demonstrated high quality data and no bias in the coverage throughout the range of GC content.
Example 12
Oncology Hotspot Panel Combined with Comprehensive Coverage of the TP53 Gene
(230) Rationale: A total of 51 amplicons were designed to cover the entire coding region of the TP53 gene as well as 30 hotspot loci representing clinically actionable mutations in oncology.
(231) Rationale: This amplicon panel provides proof of concept for the disclosed method, where the 51 amplicons have significant overlap to demonstrate the absence of the mini-amplicon dominating the reaction, as well as the evenness of coverage among amplicons that can be achieved using limited multiplex cycle number. In addition, the high percentage of on target reads demonstrates the specificity of priming because primer dimers and non-specific off target amplification products do not appear in the sequenced library.
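The "mini-amplicon" referred to above arises when two overlapping amplicons are amplified in the same tube: the forward primer of the downstream amplicon and the reverse primer of the upstream amplicon face each other across the overlap and can produce a short third product. The Python sketch below enumerates the possible products for a pair of overlapping amplicons. The coordinates and primer names (F1, R1, F2, R2) are hypothetical and are not taken from Table 2, and the lengths ignore the 5′ noncomplementary tails.

```python
from itertools import product


def possible_amplicons(forward_starts, reverse_ends):
    """Return {(forward, reverse): length} for every inward-facing primer pair.

    forward_starts maps a forward primer name to the coordinate of its 5' end;
    reverse_ends maps a reverse primer name to the coordinate of its 5' end
    on the same reference axis.
    """
    lengths = {}
    for (fwd, start), (rev, end) in product(forward_starts.items(), reverse_ends.items()):
        if start < end:  # primers point toward each other
            lengths[(fwd, rev)] = end - start + 1
    return lengths


if __name__ == "__main__":
    # Hypothetical layout: amplicon 1 spans 100-250, amplicon 2 spans 200-350.
    forward_starts = {"F1": 100, "F2": 200}
    reverse_ends = {"R1": 250, "R2": 350}
    amplicons = possible_amplicons(forward_starts, reverse_ends)
    for pair, length in sorted(amplicons.items(), key=lambda item: item[1]):
        print(pair, length)
    print("shortest product:", min(amplicons, key=amplicons.get))
```

In this toy layout the F2/R1 pair gives the shortest product, which is why a short overlap product would be expected to amplify preferentially unless it is suppressed, for example by the complementary 5′ and 3′ ends described in the claims.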
(232) Materials: Human HapMap genomic DNA (Coriell Institute, NA12878) KAPA HiFi HotStart Uracil+ ReadyMix (KAPA Biosystems, cat #KK2802) 102 Target-specific primers (Table 2) Universal primer containing a 3′ adapter oligonucleotide truncated sequence and cleavable bases 14-882 (Table 2) E. coli DNA ligase buffer (Enzymatics, cat #B6090) 5′ adapter oligonucleotide for adapter ligation step (14-571) 5′ part of the 3′ adapter oligonucleotide for adapter ligation step (14-877) Linker oligonucleotide for adapter ligation step 14-382 (Table 2) E. coli DNA ligase (Enzymatics, cat #L6090L) Uracil-DNA glycosylase (Enzymatics, cat #G5010L) Endonuclease VIII (Enzymatics, cat #Y9080L) Taq-B DNA polymerase (Enzymatics, cat #P7250L) SPRIselect (Beckman coulter, cat #B23419) 20% PEG-8000/2.5M NaCl solution for purification steps
Method
(233) Human genomic DNA was diluted in DNA suspension buffer (Teknova, cat #T0227) at a concentration of 2 ng/μl. The DNA was slightly sheared by vortexing for 2 minutes. 10 ng of this sheared genomic DNA was used to prepare a library. A first reaction of amplification was assembled in a total volume of 30 μl, comprising a final concentration of 1×KAPA HiFi HotStart Uracil+ ReadyMix, 10 ng of sheared human genomic DNA, 300 pmol of the universal primer and a final concentration of 0.85 μM of a mix of the 102 target-specific primers present in different ratios. The following cycling program was run on this reaction: 3 minutes at 95° C., followed by 4 cycles of 20 seconds at 98° C., 5 minutes at 63° C. and 1 minute at 72° C. to generate target-specific amplicons, and then 23 cycles of 20 seconds at 98° C. and 1 minute at 64° C. to produce multiple copies of the target-specific amplicons. After adjusting the volume to 50 μl, the DNA product was purified using 60 μl of SPRIselect beads (ratio 1.2×). The beads were resuspended in 50 μl of a 1× reaction mix containing 1× E. coli ligase buffer, 100 pmol of the linker oligonucleotide, 10 units of E. coli ligase, 10 units of endonuclease VIII, 2 units of uracil-DNA glycosylase, 20 units of Taq-B DNA polymerase, 100 pmol of the 5′ adapter oligonucleotide and 100 pmol of the 5′ part of the 3′ adapter oligonucleotide. The reaction was incubated at 37° C. for 10 minutes and then purified by adding 42.5 μl of a 20% PEG-8000/2.5M NaCl solution (ratio 0.85×). The DNA was eluted in 20 μl, and quantified by qPCR using the Kapa Library Quantification Kit, Illumina/Universal (cat #KK4824). DNA was denatured for 5 minutes with a final concentration of 0.1 mM of sodium hydroxide and 600 μl of 10 pM library was loaded on a MiSeq (Illumina).
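The two-stage cycling program described above (a few long-anneal cycles with target-specific priming, then a larger number of shorter cycles) can be written down as data to check the total cycle count and programmed hold time. The Python sketch below does this. The data layout and function names are hypothetical, and the total excludes instrument ramp times, which the text does not give.

```python
# Each stage is (repeats, [(temperature_C, hold_seconds), ...]), as described in the method.
PROGRAM = [
    (1, [(95, 180)]),                        # initial denaturation, 3 minutes
    (4, [(98, 20), (63, 300), (72, 60)]),    # 4 cycles generating target-specific amplicons
    (23, [(98, 20), (64, 60)]),              # 23 cycles copying the amplicons
]


def total_hold_seconds(program):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    return sum(repeats * sum(seconds for _, seconds in steps) for repeats, steps in program)


def total_cycles(program):
    """Count repeats of multi-step stages only (single holds are not cycles)."""
    return sum(repeats for repeats, steps in program if len(steps) > 1)


if __name__ == "__main__":
    seconds = total_hold_seconds(PROGRAM)
    print(f"{total_cycles(PROGRAM)} cycles, {seconds} s (~{seconds / 60:.0f} min) of hold time")
```

Written this way, the limited multiplex stage (4 cycles) and the subsequent stage (23 cycles) are easy to vary independently when exploring how cycle number affects evenness of coverage.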
(234) Results
(235) The library concentration as quantified by qPCR was 19.1 nM. Paired-end reads of 101 bases were generated by the v2 chemistry of the Illumina MiSeq. Prior to data analysis, sequence-specific trimming from the 5′ end of both read 1 and read 2 was performed to remove synthetic primer sequences using the Cutadapt program. The alignment of the paired reads to the human genome and to the targeted regions using the BWA-MEM tool showed exceptional data quality, with 98% aligning to targeted regions. Coverage data were also obtained using BEDtools. The coverage uniformity was 100%, meaning that each of the 51 amplicons was represented in the final library. The coverage of each individual base in each amplicon was also calculated and was higher than 20% of the mean per-base coverage, meaning that none of the 51 amplicons was underrepresented in the final product.
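The per-base criterion used above (every base covered at more than 20% of the mean per-base coverage) can be expressed as a small calculation on a list of depths. The Python sketch below is one way to compute it. The depth values are invented for illustration and are not data from this example, and the function name and return structure are hypothetical. In practice the depths would come from a coverage tool such as BEDtools.

```python
def coverage_summary(depths, threshold_fraction=0.2):
    """Summarize per-base depths against a fraction-of-mean cutoff."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    cutoff = threshold_fraction * mean_depth
    above = sum(1 for depth in depths if depth > cutoff)
    return {
        "mean": mean_depth,
        "cutoff": cutoff,
        "fraction_above": above / len(depths),
    }


if __name__ == "__main__":
    even = [1000, 1200, 800, 950, 1100]   # hypothetical, evenly covered bases
    uneven = [1000, 1000, 1000, 100]      # hypothetical, one weakly covered base
    print(coverage_summary(even))
    print(coverage_summary(uneven))
```

A `fraction_above` of 1.0 corresponds to the situation reported in the Results, where no base falls below the 20%-of-mean cutoff.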
(236) Conclusion
(237) A targeted amplicon library was successfully made using human genomic DNA. The sequencing demonstrated high quality data.
(238) TABLE 1
Sequence name | SEQ ID NO. | Sequence
12-900 | 1 | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
13-426 | 2 | AGATCGGAAGAGCGTCGTGTAG/3SpC3/
13-340 | 3 | /5PHOS/AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT/3SpC3/
13-559 | 4 | ACACGACGCTCTTCCGATCddT
13-558 | 5 | ACACGACGCTCTTCCGATCT/3PHOS/
13-562 | 6 | /5PHOS/TGTACCTCACTTCTCATCACTGCT/3FAM/
13-563 | 7 | AGCAGTGATGAGAAGTGAGGTACA
13-561 | 8 | /5PHOS/TGTACCTCACTTCTCATCACTGCT
13-564 | 9 | /5FAM/AGCAGTGATGAGAAGTGAGGTACA
13-560 | 10 | TGTACCTCACTTCTCATCACTGCT
13-144 | 11 | GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
13-581 | 12 | TGTACCTCACTTCTCATCACTGCTGTCATCCGAT/3FAM/
13-582 | 13 | AGCAGTGATGAGAAGTGAGGTACAAGATCGGAAGAGCGTCGTGTAG/3SpC3/
13-156 | 14 | GACTGGAGTTCAGACGTGTGCTCTTCCGATCTT
13-607 | 15 | CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTT
13-596 | 16 | /5SpC3/C*A*AGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTN
13-501 | 17 | /5PHOS/AGATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCT*T*G/3SpC3/
13-712 | 18 | AGACGUGUGCUCUTCCGATCddT
13-489 | 19 | /5SpC3/A*A*TGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
13-595 | 20 | /5SpC3/A*A*TGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTN
13-510 | 21 | /5PHOS/AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCT*T*G/3SpC3/
*Phosphorothioated DNA bases
/5SpC3/: 5′ C3 spacer (IDT)
/3SpC3/: 3′ C3 spacer (IDT)
/5PHOS/: 5′ Phosphorylation (IDT)
/3PHOS/: 3′ Phosphorylation (IDT)
/5FAM/: 5′ 6-carboxyfluorescein (IDT)
/3FAM/: 3′ 6-carboxyfluorescein (IDT)
ddT: 2′, 3′-Dideoxythymidine (TriLink)
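The modification codes in Table 1 (slash-delimited terminal modifications, asterisks marking phosphorothioated bases, and a "dd" prefix for the dideoxy terminator) can be separated from the base sequence with a few string operations. The Python sketch below is a minimal, illustrative parser for that notation as it appears in this table only. It is not a vendor tool, and the function name and returned field names are hypothetical.

```python
import re

# Terminal modifications are written between slashes, e.g. /5PHOS/ or /3SpC3/.
MODIFICATION = re.compile(r"/([^/]+)/")


def parse_oligo(notation):
    """Split a Table 1 style oligonucleotide string into bases and annotations."""
    modifications = MODIFICATION.findall(notation)
    core = MODIFICATION.sub("", notation)
    terminal_dideoxy = "dd" in core          # e.g. a 3' ddT terminator
    phosphorothioates = core.count("*")      # one asterisk per marked base
    bases = core.replace("*", "").replace("dd", "")
    return {
        "modifications": modifications,
        "bases": bases,
        "length": len(bases),
        "phosphorothioates": phosphorothioates,
        "terminal_dideoxy": terminal_dideoxy,
    }


if __name__ == "__main__":
    # Entries copied from Table 1 (13-562 and 13-559).
    print(parse_oligo("/5PHOS/TGTACCTCACTTCTCATCACTGCT/3FAM/"))
    print(parse_oligo("ACACGACGCTCTTCCGATCddT"))
```

Separating the annotations in this way makes it straightforward to compare the underlying base sequences, for example to see that 13-561 and 13-562 share the same bases and differ only in their terminal modifications.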
(239) TABLE 2 Oligonucleotides used in Example 12.
Sequence name | SEQ ID NO | Sequence (5′-3′) | Final concentration in PCR
14-758 | 22 | TCAGACGTGTGCTCTTCCGATCTTCTTGCAGCAGCCAGA*C*T | 10 nM
14-759 | 23 | TCAGACGTGTGCTCTTCCGATCTCCTGCCCTTCCAATGGA*T*C | 10 nM
14-760 | 24 | TCAGACGTGTGCTCTTCCGATCTCCCCTAGCAGAGACCT*G*T | 5 nM
14-864 | 25 | TCAGACGTGTGCTCTTCCGATCTGCCCAACCCTTGTCCTT*A*C | 20 nM
14-762 | 26 | TCAGACGTGTGCTCTTCCGATCTCTGACTGCTCTTTTCACCC*A*T | 5 nM
14-763 | 27 | TCAGACGTGTGCTCTTCCGATCTGAGCAGCCTCTGGCATTC*T*G | 5 nM
14-764 | 28 | TCAGACGTGTGCTCTTCCGATCTTGAAGACCCAGGTCCAGAT*G*A | 5 nM
14-765 | 29 | TCAGACGTGTGCTCTTCCGATCTGCTGCCCTGGTAGGTTTTC*T*G | 5 nM
14-766 | 30 | TCAGACGTGTGCTCTTCCGATCTCTGGCCCCTGTCATCTTC*T*G | 15 nM
14-767 | 31 | TCAGACGTGTGCTCTTCCGATCTCAGGCATTGAAGTCTCATG*G*A | 15 nM
14-768 | 32 | TCAGACGTGTGCTCTTCCGATCTTCCTCCCTGCTTCTGTC*T*C | 10 nM
14-769 | 33 | TCAGACGTGTGCTCTTCCGATCTCTGTCAGTGGGGAACAAGA*A*G | 10 nM
14-885 | 34 | TCAGACGTGTGCTCTTCCGATCTGTGCTGTGACTGCTTGTA*G*A | 10 nM
14-886 | 35 | TCAGACGTGTGCTCTTCCGATCTCTCTGTCTCCTTCCTCTTCCT*A*C | 10 nM
14-869 | 36 | TCAGACGTGTGCTCTTCCGATCTCTGTGCAGCTGTGGGTT*G*A | 10 nM
14-773 | 37 | TCAGACGTGTGCTCTTCCGATCTGCTCACCATCGCTATCTG*A*G | 10 nM
14-865 | 38 | TCAGACGTGTGCTCTTCCGATCTCATGACGGAGGTTGTGA*G*G | 5 nM
14-775 | 39 | TCAGACGTGTGCTCTTCCGATCTAGCAATCAGTGAGGAATCAG*A*G | 5 nM
14-776 | 40 | TCAGACGTGTGCTCTTCCGATCTAGCTGGGGCTGGAGA*G*A | 5 nM
14-777 | 41 | TCAGACGTGTGCTCTTCCGATCTGTCATCCAAATACTCCACACG*C*A | 5 nM
14-778 | 42 | TCAGACGTGTGCTCTTCCGATCTGCATCTTATCCGAGTGGAA*G*G | 5 nM
14-779 | 43 | TCAGACGTGTGCTCTTCCGATCTCACTGACAACCACCCTTAA*C*C | 5 nM
14-780 | 44 | TCAGACGTGTGCTCTTCCGATCTCAGGTAGGACCTGATTTCCTT*A*C | 5 nM
14-781 | 45 | TCAGACGTGTGCTCTTCCGATCTTTCTTGCGGAGATTCTCTT*C*C | 5 nM
14-782 | 46 | TCAGACGTGTGCTCTTCCGATCTTGGGACGGAACAGCTTTG*A*G | 5 nM
14-783 | 47 | TCAGACGTGTGCTCTTCCGATCTCCACCGCTTCTTGTCC*T*G | 5 nM
14-784 | 48 | TCAGACGTGTGCTCTTCCGATCTGGGTGCAGTTATGCCTC*A*G | 5 nM
14-785 | 49 | TCAGACGTGTGCTCTTCCGATCTAGACTTAGTACCTGAAGGGT*G*A | 5 nM
14-786 | 50 | TCAGACGTGTGCTCTTCCGATCTTAGCACTGCCCAACAACA*C*C | 5 nM
14-787 | 51 | TCAGACGTGTGCTCTTCCGATCTCGGCATTTTGAGTGTTAGACT*G*G | 5 nM
14-788 | 52 | TCAGACGTGTGCTCTTCCGATCTCCTGGTTGTAGCTAACTAACT*T*C | 10 nM
14-789 | 53 | TCAGACGTGTGCTCTTCCGATCTACCATCGTAAGTCAAGTAGCA*T*C | 10 nM
14-790 | 54 | TCAGACGTGTGCTCTTCCGATCTATGGTTCTATGACTTTGCCT*G*A | 5 nM
14-791 | 55 | TCAGACGTGTGCTCTTCCGATCTAGCAGGCTAGGCTAAGCTA*T*G | 5 nM
14-792 | 56 | TCAGACGTGTGCTCTTCCGATCTCCTGCTGAAAATGACTGAATATAAACT*T*G | 10 nM
14-793 | 57 | TCAGACGTGTGCTCTTCCGATCTGGTCCTGCACCAGTAATAT*G*C | 10 nM
14-794 | 58 | TCAGACGTGTGCTCTTCCGATCTTGCTTGCTCTGATAGGAAAATG*A*G | 10 nM
14-795 | 59 | TCAGACGTGTGCTCTTCCGATCTGGATCCAGACAACTGTTCAAAC*T*G | 10 nM
14-796 | 60 | TCAGACGTGTGCTCTTCCGATCTCCAGAAACTGCCTCTTGA*C*C | 3.75 nM
14-797 | 61 | TCAGACGTGTGCTCTTCCGATCTGATGTAAGGGACAAGCAG*C*C | 3.75 nM
14-798 | 62 | TCAGACGTGTGCTCTTCCGATCTGAACCAATGGATCGATCTG*C*C | 5 nM
14-799 | 63 | TCAGACGTGTGCTCTTCCGATCTGGGGAACTGATGTGACTTA*C*C | 5 nM
14-800 | 64 | TCAGACGTGTGCTCTTCCGATCTCTGAGCAAGAGGCTTTGG*A*G | 5 nM
14-801 | 65 | TCAGACGTGTGCTCTTCCGATCTAACAGTGCAGTGTGGAAT*C*C | 5 nM
14-802 | 66 | TCAGACGTGTGCTCTTCCGATCTCCACAGAAACCCATGTATGAAG*T*A | 5 nM
14-803 | 67 | TCAGACGTGTGCTCTTCCGATCTGTACCCAAAAAGGTGACATG*G*A | 5 nM
14-804 | 68 | TCAGACGTGTGCTCTTCCGATCTTTTCAGTGTTACTTACCTGTCTTG*T*C | 10 nM
14-805 | 69 | TCAGACGTGTGCTCTTCCGATCTGGACTCTGAAGATGTACCTATGG*T*C | 10 nM
14-806 | 70 | TCAGACGTGTGCTCTTCCGATCTCTCACCATGTCCTGACTG*T*G | 10 nM
14-807 | 71 | TCAGACGTGTGCTCTTCCGATCTGTGGCACTCTGGAAG*C*A | 10 nM
14-808 | 72 | TCAGACGTGTGCTCTTCCGATCTGTTACTGAAAGCTCAGGGAT*A*G | 10 nM
14-809 | 73 | TCAGACGTGTGCTCTTCCGATCTCCACACTTACACATCACTTT*G*C | 10 nM
14-810 | 74 | TCAGACGTGTGCTCTTCCGATCTTAGTCTTTCTTTGAAGCAGCA*A*G | 10 nM
14-811 | 75 | TCAGACGTGTGCTCTTCCGATCTCTAGCTGTGATCCTGAAACTG*A*A | 10 nM
14-812 | 76 | TCAGACGTGTGCTCTTCCGATCTTCCTCCTGCAGGATTCCT*A*C | 20 nM
14-813 | 77 | TCAGACGTGTGCTCTTCCGATCTTGGTGGATGTCCTCAAAAG*A*C | 20 nM
14-814 | 78 | TCAGACGTGTGCTCTTCCGATCTCAGGATTCTTACAGAAAACAAGTG*G*T | 15 nM
14-815 | 79 | TCAGACGTGTGCTCTTCCGATCTTGATGGCAAATACACAGAGGA*A*G | 15 nM
14-816 | 80 | TCAGACGTGTGCTCTTCCGATCTGACGGGTAGAGTGTGCG*T*G | 5 nM
14-817 | 81 | TCAGACGTGTGCTCTTCCGATCTCGCCACAGAGAAGTTGTTG*A*G | 5 nM
14-818 | 82 | TCAGACGTGTGCTCTTCCGATCTCGCACTGGCCTCATCT*T*G | 10 nM
14-819 | 83 | TCAGACGTGTGCTCTTCCGATCTCTTCCAGTGTGATGATGGTG*A*G | 10 nM
14-820 | 84 | TCAGACGTGTGCTCTTCCGATCTCATGTGTAACAGTTCCTGCA*T*G | 5 nM
14-821 | 85 | TCAGACGTGTGCTCTTCCGATCTGGTCAGAGGCAAGCAG*A*G | 5 nM
14-822 | 86 | TCAGACGTGTGCTCTTCCGATCTTTACTTCTCCCCCTCCTC*T*G | 10 nM
14-823 | 87 | TCAGACGTGTGCTCTTCCGATCTCTTCCCAGCCTGGGCA*T*C | 10 nM
14-824 | 88 | TCAGACGTGTGCTCTTCCGATCTGCTGAATGAGGCCTTGGA*A*C | 8 nM
14-825 | 89 | TCAGACGTGTGCTCTTCCGATCTCTTTCCAACCTAGGAAGGC*A*G | 8 nM
14-826 | 90 | TCAGACGTGTGCTCTTCCGATCTGCACTGTAATAATCCAGACTGT*G*T | 5 nM
14-827 | 91 | TCAGACGTGTGCTCTTCCGATCTCATGTACTGGTCCCTCATT*G*C | 5 nM
14-828 | 92 | TCAGACGTGTGCTCTTCCGATCTCCTTTCAGGATGGTGGATG*T*G | 20 nM
14-829 | 93 | TCAGACGTGTGCTCTTCCGATCTCGACTCCACCAGGACT*T*G | 20 nM
14-830 | 94 | TCAGACGTGTGCTCTTCCGATCTGTTAACCTTGCAGAATGGTCG*A*T | 5 nM
14-831 | 95 | TCAGACGTGTGCTCTTCCGATCTCCACGAGAACTTGATCATATTC*A*C | 5 nM
14-832 | 96 | TCAGACGTGTGCTCTTCCGATCTCAACAGGTTCTTGCTGGTG*T*G | 5 nM
14-833 | 97 | TCAGACGTGTGCTCTTCCGATCTATGGTGGGATCATATTCATCTA*C*A | 5 nM
14-836 | 98 | TCAGACGTGTGCTCTTCCGATCTAGCTTGTGGAGCCTCTTA*C*A | 5 nM
14-837 | 99 | TCAGACGTGTGCTCTTCCGATCTGGGACCTTACCTTATACACC*G*T | 5 nM
14-838 | 100 | TCAGACGTGTGCTCTTCCGATCTCACCATCTCACAATTGCCA*G*T | 5 nM
14-839 | 101 | TCAGACGTGTGCTCTTCCGATCTGCTTTCGGAGATGTTGCTTC*T*C | 5 nM
14-840 | 102 | TCAGACGTGTGCTCTTCCGATCTGATCCCAGAAGGTGAGAAAG*T*T | 5 nM
14-841 | 103 | TCAGACGTGTGCTCTTCCGATCTTGAGGTTCAGAGCCATG*G*A | 5 nM
14-842 | 104 | TCAGACGTGTGCTCTTCCGATCTCTCCAGGAAGCCTACGT*G*A | 10 nM
14-843 | 105 | TCAGACGTGTGCTCTTCCGATCTGGACATAGTCCAGGAGG*C*A | 10 nM
14-844 | 106 | TCAGACGTGTGCTCTTCCGATCTCACCGCAGCATGTCAAGA*T*C | 10 nM
14-845 | 107 | TCAGACGTGTGCTCTTCCGATCTGACCTAAAGCCACCTCCTT*A*C | 10 nM
14-846 | 108 | TCAGACGTGTGCTCTTCCGATCTTCCACTATACTGACGTCTCCA*A*C | 15 nM
14-847 | 109 | TCAGACGTGTGCTCTTCCGATCTACACACGCAAAATACTCCTTC*A*G | 15 nM
14-850 | 110 | TCAGACGTGTGCTCTTCCGATCTCTGTCCTCACAGAGTTCAA*G*C | 5 nM
14-851 | 111 | TCAGACGTGTGCTCTTCCGATCTGTTTTTGCAGATGATGGGCT*C*C | 5 nM
14-852 | 112 | TCAGACGTGTGCTCTTCCGATCTCTGGACCAAGCCCATC*A*C | 5 nM
14-853 | 113 | TCAGACGTGTGCTCTTCCGATCTTGTGGCCTTGTACTGCA*G*A | 5 nM
14-854 | 114 | TCAGACGTGTGCTCTTCCGATCTCAGTGTGTTCACAGAGACC*T*G | 5 nM
14-855 | 115 | TCAGACGTGTGCTCTTCCGATCTGTAGGAAATAGCAGCCTCAC*A*T | 5 nM
14-856 | 116 | TCAGACGTGTGCTCTTCCGATCTTGTTCCTGATCTCCTTAGACA*A*C | 15 nM
14-857 | 117 | TCAGACGTGTGCTCTTCCGATCTCTTGCTGCACTTCTCACA*C*C | 15 nM
14-858 | 118 | TCAGACGTGTGCTCTTCCGATCTTGAAAATTCCAGTGGCCAT*C*A | 7.5 nM
14-859 | 119 | TCAGACGTGTGCTCTTCCGATCTCAATGAAGAGAGACCAGA*G*C | 7.5 nM
14-860 | 120 | TCAGACGTGTGCTCTTCCGATCTCCCATACCCTCTCAGCGT*A*C | 5 nM
14-861 | 121 | TCAGACGTGTGCTCTTCCGATCTGTGGATGTCAGGCAGAT*G*C | 5 nM
14-862 | 122 | TCAGACGTGTGCTCTTCCGATCTCCCTCCCAGAAGGTCTAC*A*T | 15 nM
14-863 | 123 | TCAGACGTGTGCTCTTCCGATCTTTTTGACATGGTTGGGACTCT*T*G | 15 nM
14-882 | 124 | TCAGACGUGUGCUCUUCCGAU*C*U | 10 μM
14-382 | 125 | GTGACTGGAGTTCAGACGTGT/3PHOS/ | -
14-877 | 126 | AACTCCAGTCACTAATGCGCATCTCGTATGCCGTCTTCTGCTTG/3PHOS/ | -
14-571 | 127 | AATGATACGGCGACCACCGAGATCTACACAGGCGAAGACACTCTTTCCCTACACGACGCTCTTCCGATCT | -
*Phosphorothioated DNA bases (IDT)
/3PHOS/: 3′ Phosphorylation (IDT)
(240) The preceding disclosure is supplemented by the following description of various aspects and embodiments of the disclosure, as provided in the following enumerated paragraphs.
(241) A method of producing a processed substrate molecule, the method comprising: (i) ligating a first polynucleotide to a 3′ terminus of a substrate molecule that is at least partially double stranded; (ii) annealing a second polynucleotide to the first polynucleotide under conditions that promote the annealing; (iii) excising at least one nucleotide from the 5′ terminus of the substrate molecule; and then (iv) ligating the second polynucleotide to the 5′ terminus of the double stranded substrate molecule to produce the processed substrate molecule.
(242) In one embodiment, the method further comprises the step, prior to step (i), of contacting the substrate molecule with a phosphatase enzyme.
(243) In one embodiment, the method further comprises the step of making the substrate molecule blunt-ended by contacting the substrate molecule with a polymerase enzyme possessing 3′-5′ exonuclease activity.
(244) In one embodiment, the method further comprises the step of contacting the substrate molecule with a template-independent polymerase to adenylate the 3′ end of the substrate molecule.
(245) In one embodiment, the substrate molecule is naturally occurring or the substrate molecule is synthetic.
(246) In one embodiment, the substrate molecule is naturally occurring.
(247) In one embodiment, the substrate molecule is genomic DNA.
(248) In one embodiment, the genomic DNA is eukaryotic or prokaryotic.
(249) In one embodiment, the genomic DNA is fragmented in vitro or in vivo.
(250) In one embodiment, the in vitro fragmenting is performed by a process selected from the group consisting of shearing, cleaving with an endonuclease, sonication, heating, irradiation using an alpha, beta, or gamma source, chemical cleavage in the presence of metal ions, radical cleavage, and a combination thereof.
(251) In one embodiment, the in vivo fragmenting occurs by a process selected from the group consisting of apoptosis, radiation, and exposure to asbestos.
(252) In one embodiment, the substrate molecule is synthetic and is selected from the group consisting of cDNA, DNA produced by whole genome amplification, primer extension products comprising at least one double-stranded terminus, and a PCR amplicon.
(253) The method of any of the preceding embodiments wherein the first polynucleotide is at least partially double stranded and comprises oligonucleotide 1 and oligonucleotide 2.
(254) In one embodiment, the second polynucleotide anneals to oligonucleotide 1.
(255) In one embodiment, the annealing results in a nick, a gap, or an overlapping base between the second polynucleotide and the substrate molecule.
(256) In one embodiment, the second polynucleotide is contacted with a polymerase, resulting in degradation of oligonucleotide 2.
(257) In one embodiment, oligonucleotide 2 comprises a base that is susceptible to degradation.
(258) In one embodiment, oligonucleotide 2 comprises a blocking group at its 3′ end that prevents ligation.
(259) The method of any of the preceding embodiments wherein the second polynucleotide comprises a modified base.
(260) In one embodiment, the annealing results in dehybridization of oligonucleotide 1 and oligonucleotide 2.
(261) The method of any of the preceding embodiments, further comprising: (i) ligating a third polynucleotide to a 3′ terminus of an additional substrate molecule that is at least partially double stranded; (ii) annealing a fourth polynucleotide to the third polynucleotide under conditions that promote the annealing; (iii) excising at least one nucleotide from the 5′ terminus of the additional substrate molecule; and then (iv) ligating the fourth polynucleotide to the 5′ terminus of the double stranded additional substrate molecule to produce a processed additional substrate molecule.
(262) In one embodiment, the first polynucleotide and the third polynucleotide are the same.
(263) In one embodiment, the second polynucleotide and the fourth polynucleotide are the same.