LINEAR DNA ASSEMBLY FOR NANOPORE SEQUENCING
20220411863 · 2022-12-29
Assignee
Inventors
- David Yu Zhang (Houston, TX)
- Deepak THIRUNAVUKARASU (Houston, TX, US)
- Yuxuan CHENG (Houston, TX, US)
- Ping SONG (Houston, TX, US)
Cpc classification
C12Q1/6806
CHEMISTRY; METALLURGY
International classification
Abstract
Provided herein are compositions and methods for assembling multiple DNA molecules into a linear concatemer, with applications to nanopore sequencing of DNA sequence variations.
Claims
1. An aqueous solution for DNA monomer assembly, the solution comprising: a plurality of double-stranded DNA monomer species, each monomer species comprising, from 5′ to 3′: a type IIS restriction site in the (+) orientation (S1), a designed Left sticky end DNA sequence (1), an insert sequence (A), a second designed Right sticky end DNA sequence (1*), and a type IIS restriction site in the (−) orientation (S1*), wherein at least two different DNA monomers comprise the same Left sticky end DNA sequence, wherein at least two different DNA monomers comprise the same Right sticky end DNA sequence, and wherein the Left sticky end DNA sequence and the Right sticky end DNA sequence are complementary to and can form Watson-Crick base pairs with each other; a type IIS DNA restriction enzyme; a DNA ligase enzyme; and a chemical buffer suitable for the enzymatic functions of the type IIS DNA restriction enzyme and the DNA ligase enzyme.
2. The solution of claim 1, further comprising a partially double-stranded DNA seed molecule, the seed molecule comprising, from 5′ to 3′: a single-stranded Left sticky end DNA sequence (1); and a double stranded DNA region devoid of a type IIS restriction site (C).
3. The solution of claim 1, further comprising a partially double-stranded DNA seed molecule, the seed molecule comprising, from 5′ to 3′: a Left sticky end DNA sequence (1); a double stranded DNA region devoid of a type IIS restriction site (C); and a Left sticky end DNA sequence (1).
4. The solution of any one of claims 1-3, wherein the chemical buffer comprises between 20 mM and 150 mM Tris-HCl, between 2 mM and 50 mM MgCl2, between 0 mM and 50 mM DTT, and between 0.1 mM and 10 mM ATP, wherein the buffer exhibits a pH between 5.5 and 9.5 at 25° C.
5. The solution of any one of claims 1-3, wherein the type IIS DNA restriction enzyme is selected from BsaI, BbsI, BsmBI, BtgZI, Esp3I, and SapI, wherein the S1 and S1* restriction sites correspond to the recognition site of the type IIS DNA restriction enzyme selected, and wherein the concentration of the type IIS DNA restriction enzyme is between 0.15 U/μL and 15 U/μL.
6. The solution of any one of claims 1-3, wherein the DNA ligase enzyme is selected from T4 DNA ligase, T7 DNA ligase, T3 DNA ligase, Taq DNA ligase, and E. coli DNA ligase, and wherein the concentration of the DNA ligase is between 5 U/μL and 500 U/μL.
7. The solution of any one of claims 1-3, wherein the Left sticky end DNA sequence and the Right sticky end DNA sequence each have a length of 2-6 nucleotides.
8. The solution of any one of claims 1-3, wherein the insert sequence of each monomer has a length between 40 nt and 2,000 nt.
9. The solution of any one of claims 1-3, wherein the total concentration of all DNA monomers is between 5 nM and 5 μM.
10. The solution of claim 2 or 3, wherein the total concentration of all DNA monomers is 1x to 1000x the concentration of partially double-stranded DNA seed molecules.
11. A method for linear assembly of DNA concatemers from a plurality of double-stranded DNA monomers, each monomer species comprising, from 5′ to 3′: a type IIS restriction site in the (+) orientation (S1), a designed Left sticky end DNA sequence (1), an insert sequence (A), a second designed Right sticky end DNA sequence (1*), and a type IIS restriction site in the (−) orientation (S1*); wherein at least two different DNA monomers comprise the same Left sticky end DNA sequence, wherein at least two different DNA monomers comprise the same Right sticky end DNA sequence, and wherein the Left sticky end DNA sequence and the Right sticky end DNA sequence are complementary to and can form Watson-Crick base pairs with each other; the method comprising: mixing the DNA monomers with a type IIS DNA restriction enzyme, a DNA ligase enzyme, and a chemical buffer suitable for the enzymatic functions of the type IIS DNA restriction enzyme and the DNA ligase enzyme; and thermal cycling the solution between 5 cycles and 100 cycles, with each cycle comprising between 5 seconds and 5 minutes at a temperature between 30° C. and 45° C., and between 30 seconds and 30 minutes at a temperature between 10° C. and 25° C.
12. The method of claim 11, wherein a partially double-stranded DNA seed molecule is mixed with the monomer molecules before thermal cycling, the seed molecule comprising, from 5′ to 3′: a single-stranded Left sticky end DNA sequence (1); and a double stranded DNA region devoid of a type IIS restriction site (C).
13. The method of claim 11, wherein a partially double-stranded DNA seed molecule is mixed with the monomer molecules before thermal cycling, the seed molecule comprising, from 5′ to 3′: a Left sticky end DNA sequence (1); a double stranded DNA region devoid of a type IIS restriction site (C); and a Left sticky end DNA sequence (1).
14. The method of claim 11, wherein a partially double-stranded DNA seed molecule is mixed with the monomer molecules before thermal cycling, the seed molecule comprising, from 5′ to 3′: a Left sticky end DNA sequence (1); a double stranded DNA region devoid of a type IIS restriction site (C) and a unique barcode; and a sticky end DNA sequence (2) for appending adapters for nanopore sequencing.
15. The method of claim 11, wherein the DNA monomers are generated by a method comprising: amplifying a DNA template by multiplex polymerase chain reaction (PCR) amplification, comprising: adding to a DNA template solution: a set of forward DNA primers comprising, from 5′ to 3′: a type IIS restriction site in the (+) orientation (S1), a designed Left sticky end DNA sequence (1), and a gene-specific sequence; a set of reverse DNA primers comprising, from 5′ to 3′: a type IIS restriction site in the (+) orientation (S1), a designed Right sticky end DNA sequence (1*), and a gene-specific sequence; a DNA polymerase; and a chemical buffer suitable for PCR amplification; thermal cycling the solution between 5 cycles and 60 cycles, with each cycle comprising between 5 seconds and 1 minute at a temperature between 90° C. and 100° C., and between 30 seconds and 2 minutes at a temperature between 55° C. and 72° C.
16. The method of claim 15, wherein a set of gene-specific DNA Blockers are additionally added to the DNA template solution, wherein the region of the DNA template that the Blockers bind overlaps with that of the forward DNA primers by between 4 and 15 nucleotides, and wherein the standard free energy of the forward primer displacing the Blocker at 60° C. in 5 mM Mg2+ is between 0 kcal/mol and +5 kcal/mol.
17. A method of generating DNA monomers for linear assembly, the method comprising: obtaining a DNA sample solution that comprises a DNA template; amplifying the DNA template by multiplex polymerase chain reaction (PCR) amplification, comprising: adding to the DNA solution: a set of forward DNA primers comprising, from 5′ to 3′: a type IIS restriction site in the (+) orientation (S1), a designed Left sticky end DNA sequence (1), and a gene-specific sequence; a set of reverse DNA primers comprising, from 5′ to 3′: a type IIS restriction site in the (+) orientation (S1), a designed Right sticky end DNA sequence (1*), and a gene-specific sequence; a DNA polymerase; and a chemical buffer suitable for PCR amplification; thermal cycling the solution between 5 cycles and 60 cycles, with each cycle comprising between 5 seconds and 1 minute at a temperature between 90° C. and 100° C., and between 30 seconds and 2 minutes at a temperature between 55° C. and 72° C.
18. The method of claim 17, wherein the forward and/or reverse primers further comprise a UMI barcode.
19. The method of claim 17, wherein a set of gene-specific DNA Blockers are additionally added to the DNA template solution, wherein the region of the DNA template that the Blockers bind overlaps with that of the forward DNA primers by between 4 and 15 nucleotides, and wherein the standard free energy of the forward primer displacing the Blocker at 60° C. in 5 mM Mg2+ is between 0 kcal/mol and +5 kcal/mol.
20. A method for preparing a solution of heterogeneous DNA concatemers, the method comprising: preparing a set of DNA monomers from a DNA template sample according to the method of any one of claims 17-19; purifying the monomers to remove unreacted primers and enzymes; and performing linear DNA assembly according to the method of claim 11 or 12.
21. The method of claim 20, wherein a set of gene-specific DNA Blockers are additionally added to the DNA template solution, wherein the region of the DNA template that the Blockers bind overlaps with that of the forward DNA primers by between 4 and 15 nucleotides, and wherein the standard free energy of the forward primer displacing the Blocker at 60° C. in 5 mM Mg2+ is between 0 kcal/mol and +5 kcal/mol.
22. The method of claim 20, wherein purifying the monomers comprises using either an affinity column or magnetic beads.
23. A method for targeted nanopore sequencing of gene regions of interest, the method comprising: obtaining a DNA sample of interest comprising a DNA template; preparing a set of DNA monomers from the DNA template according to the method of any one of claims 17-19; purifying the monomers to remove unreacted primers and enzymes; performing linear DNA assembly according to the method of any one of claims 11-13; purifying the concatemers to remove unreacted monomers, Type IIS reaction side products, and enzymes; appending adapters for nanopore sequencing to the purified concatemers; purifying the adapter-appended concatemers to remove excess adapters and enzymes; and performing nanopore sequencing.
24. A method for constructing a monomer species comprising, from 5′ to 3′: a type IIS restriction site in the (+) orientation (S1), a designed Left sticky end DNA sequence (1), an insert sequence (A), a second designed Right sticky end DNA sequence (1*), and a type IIS restriction site in the (−) orientation (S1*); wherein at least two different DNA monomers comprise the same Left sticky end DNA sequence, wherein at least two different DNA monomers comprise the same Right sticky end DNA sequence, and wherein the Left sticky end DNA sequence and the Right sticky end DNA sequence are complementary to and can form Watson-Crick base pairs with each other; the method comprising: obtaining a solution of double-stranded DNA inserts of interest; performing a first ligation reaction on a first portion of the solution with a double stranded DNA adaptor comprising: a type IIS restriction site in the (+) orientation (S1), and a designed Left sticky end DNA sequence (1); performing a second ligation reaction on a second portion of the solution with a double stranded DNA adaptor comprising: a type IIS restriction site in the (+) orientation (S1), and a designed Right sticky end DNA sequence (1*); and mixing the products of the first and second ligation reactions in a solution in a chemical buffer conducive to ligation.
25. The method of claim 24, wherein the double-stranded DNA inserts are dA-tailed prior to performing the ligation.
26. A method for targeted nanopore sequencing of gene regions of interest, the method comprising: obtaining a DNA sample of interest comprising a DNA template; preparing a set of DNA monomers from the DNA template according to the method of claim 24 or 25; purifying the monomers to remove unreacted primers and enzymes; performing linear DNA assembly according to the method of any one of claims 11-13; purifying the concatemers to remove unreacted monomers, Type IIS reaction side products, and enzymes; appending adapters for nanopore sequencing to the purified concatemers; purifying the adapter-appended concatemers to remove excess adapters and enzymes; and performing nanopore sequencing.
27. The methods of any one of claims 11-26, wherein the step of mixing the DNA monomers further comprises mixing with two single-stranded destructive probes, the first single-stranded destructive probe comprising, from 5′ to 3′, a type IIS recognition sequence (S1), and a Left sticky end DNA sequence (1); and the second single-stranded destructive probe comprising, from 5′ to 3′: a type IIS recognition sequence (S1), and the Right sticky end DNA sequence (1*).
28. The method of claim 27, wherein the concentration of the destructive probe is between 1x and 100x of the total concentration of the DNA monomers.
29. The methods of claim 27 or 28, wherein the destructive probes have chemical modifications that prevent restriction digestion.
30. The method of claim 29, wherein the modifications are selected from phosphorothioate-substituted backbone, sugar modified nucleotides (e.g., 2′Fluoro, 2′-OMe), inverted DNA nucleotides, methylated bases, DNA with carbon spacers, or DNA with polyethylene glycol (PEG) spacers.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0037] The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.
[0038]
[0039] Hetero-polymer assembly using LDA. Monomers of different family (A and B) have the same sticky ends 1 and 1*. Hybridization between family A and B monomers can occur during the ligation step leading to assembly of hetero-polymers containing monomers from different families. S1 is the type IIS restriction enzyme recognition site in the (+) orientation and S1* is the recognition site in the (−) orientation.
[0040]
[0041]
[0042] Preparation of DNA monomers by dA-tailing and ligation of LDA-adapters. LDA-adapter1 and LDA-adapter2 are ligated to dA-tailed insert DNA in two separate ligation reactions. The ligated monomers from the two reactions are mixed and included in the LDA reaction. Monomers with LDA-adapter1 have sticky end 1, while monomers with LDA-adapter2 have sticky end 1* after the restriction step. These two monomer populations can ligate with each other during the ligation step to form linear assemblies.
[0043]
[0044]
[0045]
[0046] NS read throughput for DNA assembled by LDA (mean size of 1577 nt) is comparable to throughput of DNA monomers without assembly (mean size of 317 nt).
[0047]
[0048]
[0049]
[0050]
[0051] NA18537 and 0.1% variant sample is 0.1% human gDNA NA18562 in NA18537. gDNA sample NA18562 bears the two SNPs that were detected. BDA probes were designed for the SNP rs3789806(C>G), while the SNP rs9648696(T>C) occurs in cis. In the top panel, fraction of reads at each nucleotide position corresponding to the wildtype (NA18537 homozygous) allele is plotted. The two SNPs can be clearly detected in the 0.1% variant sample. The bottom panel shows the ΔVariant allele %, which is the fraction of reads mapped to the highest frequency variant allele in the 0.1% variant sample, minus the variant allele frequency in the matched normal 0% variant sample.
[0052]
[0053]
[0054]
[0055]
[0056]
[0057]
DETAILED DESCRIPTION
[0058] Provided herein are methods to assemble short DNA molecules (e.g., PCR amplicons) into long, linear concatemers using type IIS restriction enzyme digestion and ligation by DNA ligase. The provided methods and reagents improve assembly length. In nanopore sequencing, the number of DNA molecules that can be sequenced by a flow cell is similar regardless of the length of each DNA molecule, so the provided methods greatly improve the effective throughput of nanopore sequencing. The higher effective sequencing depth can also improve the limit of detection for mutations including single nucleotide variants and small insertions/deletions.
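By way of illustration only, the throughput argument above can be expressed as simple arithmetic. The sketch below assumes that the number of reads per flow cell is unchanged by assembly, and uses the mean sizes stated in the description of the drawings (1577 nt for LDA assemblies and 317 nt for unassembled monomers). The function name `effective_throughput_gain` is hypothetical and is not part of the disclosed methods.

```python
def effective_throughput_gain(assembled_mean_nt: float, monomer_mean_nt: float) -> float:
    """Fold-change in sequenced bases per read, assuming the read count is unchanged."""
    if monomer_mean_nt <= 0:
        raise ValueError("monomer_mean_nt must be positive")
    return assembled_mean_nt / monomer_mean_nt


# Mean sizes taken from the description of the drawings (assumed comparable runs).
gain = effective_throughput_gain(1577, 317)
print(f"Approximate gain in bases per read: {gain:.1f}x")
```

Under these assumptions, each read of an assembled concatemer carries roughly five times as many bases as a read of an unassembled monomer.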
[0059] The provided Linear DNA Assembly (LDA) methods use Type IIS restriction enzyme digestion and DNA ligation to assemble many DNA monomers into a long linear concatemer. These methods (1) discourage the formation of circular concatemer products (which cannot be sequenced by NS), (2) do not require assembly of different components in a pre-determined order and are suitable for a variety of NS panels with variable panel sizes, and (3) include several molecular innovations to increase the average length of the assembled concatemer.
[0060] I. Linear DNA Assembly
[0061] Linear DNA assembly (LDA) using type IIS restriction digestion and ligation requires each DNA monomer molecule to have one end with a type IIS restriction site in the plus (+) orientation (S1) followed by a designed base region (being 2-6 nucleotides in length, e.g., 4 bases) that serves as a sticky end sequence referred to as the Left sticky end sequence (1). The other end of the monomer has the type IIS restriction site in the minus (−) orientation (S1*) followed by a designed base region (being 2-6 nucleotides in length, e.g., 4 bases) that serves as a sticky end sequence referred to as the Right sticky end sequence (1*). The Left and Right sticky end sequences are designed to be complementary to each other.
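The design constraints described above (sticky ends of 2-6 nucleotides that are complementary to each other) can be checked programmatically. The following is a minimal sketch, assuming both sticky ends are written 5′ to 3′ so that complementarity means one is the reverse complement of the other. The function names are hypothetical; the example values AGTG and CACT are the 4-base sequences that follow the BsaI site in the LDA-adapter primers of Table 3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def valid_sticky_end_pair(left: str, right: str, min_len: int = 2, max_len: int = 6) -> bool:
    """Check that both ends are 2-6 nt long and can base-pair with each other."""
    if not (min_len <= len(left) <= max_len and len(left) == len(right)):
        return False
    return reverse_complement(left) == right.upper()


# Sticky ends taken from the LDA-adapter primers in Table 3.
print(valid_sticky_end_pair("AGTG", "CACT"))
```

A pair that fails this check would not hybridize during the ligation step and therefore would not be expected to support assembly.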
[0062] A typical one pot assembly reaction contains 3 pmol to 7 pmol of DNA monomers, 30 U to 60 U of a type IIS restriction enzyme (e.g., BsaI), and 1000 U to 2000 U of DNA ligase (e.g., T4 DNA ligase) in buffer containing 50 mM Tris-HCl, 10 mM MgCl2, 10 mM DTT, and 1 mM ATP at pH 7.5. As shown in
[0063] In an exemplary homopolymer assembly reaction (
II. Preparation of DNA Monomers for Assembly
[0064] Double-stranded DNA inserts of any size can be assembled by linear DNA assembly, if the required end sequences as mentioned above are present. Adapters containing the end sequences can be added to any DNA insert by dA-tailing and ligation, as shown in
[0065] End sequences for assembly can be added to any DNA insert by PCR, as shown in
III. Directional Assembly
[0066] As mentioned above, self-hybridization of a molecule during assembly can lead to circularization (
[0067] IV. Blocking Side Product Assembly
[0068] During the restriction step of the assembly reaction, in addition to monomers with sticky ends (1 and 1*) two side products, SP1 and SP2 with Right sticky end sequence (1*) and Left sticky end sequence (1) respectively are formed (
[0069] Preliminary NS analysis of read length was performed for a 182 bp DNA assembled by LDA on a DNA seed by bi-directional assembly and using destructive probes (Table 1). The use of a DNA seed and destructive probes improved the length of the assembled DNA by around 56% compared to normal LDA.
[0070] Table 1. NS data showing increase in read length by LDA with seed and destructive probes
Metric | LDA | LDA with seed | LDA with seed and destructive probes
No. of reads with seed | n/a | 551 | 420
No. of reads without seed | 1969 | 1499 | 1425
Avg. length with seed | n/a | 1170 | 1238
Avg. length without seed | 801 | 685 | 727
V. Operation of LDA for Long-read Sequencing
[0071] Long-read sequencing platforms based on nanopores, such as Oxford Nanopore Sequencing (NS), have several advantages over short-read sequencers (e.g., Illumina), including real-time data production, rapid library preparation, portability, and low capital cost. However, NS suffers from a higher intrinsic error rate of roughly 10%, compared to 0.2% for Illumina, and also produces fewer reads than Illumina. This prevents the use of NS for rare variant detection. Variant enrichment strategies that can enrich rare variants above the intrinsic error rate of NS can potentially enable the use of NS for rare variant detection. Variant enrichment methods such as Blocker Displacement Amplification (BDA), ICE COLD PCR, or PNA-blocker PCR produce short amplicons of 100 bp to 300 bp in length. PCR methods that produce short amplicons are routinely used in a number of diagnostic assays, such as cell-free DNA (cfDNA) analysis, and in assays designed for short-read sequencing platforms. However, sequencing short DNA (<300 bp) on NS produces reads of low quality and yield. Using linear DNA assembly to join short DNA into long assemblies can enable NS to produce reads of higher quality and yield for short amplicon sequencing.
[0072] NS can sequence ultra-long reads up to several Mb in size. Therefore, higher-order assemblies of short amplicons are needed to utilize the full potential of NS. In molecular cloning, assembly by type IIS restriction and ligation (i.e., Golden Gate assembly) is one method for cloning up to 20 inserts into vectors. Gibson assembly, another cloning method, is used for cloning only up to 5 inserts because assembly efficiency drops for higher numbers of inserts. Gibson assembly also requires longer sticky ends, around 20 bases in length. This requires two separate PCR reactions to attach the end sequences for assembly onto DNA inserts, since forward and reverse primers with 20-base complementarity will form primer dimers even at the elevated temperatures of around 55° C.-72° C. that are typically used during the annealing and extension steps of PCR, impairing PCR amplification of the desired DNA insert. In the LDA method provided herein, sticky ends (1 and 1*) are only 4 bases long and hence will not form primer dimers during the annealing and extension steps of PCR. As such, LDA-adapter forward and reverse primers designed as depicted in
[0073] Circularization of short DNA and Rolling Circle Amplification (RCA) of the circular DNA can generate long single stranded DNA (ssDNA) composed of multiple copies (up to 50 copies) of the same DNA sequence. But NS cannot sequence ssDNA directly, since dsDNA sequencing adaptors containing bound motor proteins are ligated to ends of DNA to be sequenced. The motor proteins are needed for translocation of DNA through the nanopore for sequencing. Even if the ends of the DNA are made double stranded by hybridizing short oligos to the ends, the presence of significant structure in the ssDNA region of the RCA product interferes with NS. To generate dsDNA from RCA, random hexamers can be used during RCA. But this method generates highly branched DNA that needs to be debranched before NS. The presence of random hexamers also generates non-specific amplification products. Therefore, though RCA can generate long DNA fragments from short DNA, it is laborious involving multiple steps (circularization, exonuclease digestion to remove linear DNA, RCA and conversion to dsDNA/debranching) making it incompatible for rapid library preparation for NS. Therefore, the LDA method provided herein, which can rapidly assemble short amplicons into relatively long assemblies, is ideally suited for NS.
VI. Shortening NS Library Preparation Time by LDA with Barcode Adapter Seeds
[0074] Library preparation for NS involves ligating barcodes for sample identification followed by ligation of an NS adapter containing a motor protein. The motor protein on the NS adapter is necessary to regulate the speed of DNA translocation through the nanopore for proper interpretation of the DNA sequence. As depicted in
[0078] Here, the provided methods shorten NS library preparation time. The methods involve use of a barcode adapter seed that contains a single stranded Left sticky end sequence (1), a double stranded DNA region devoid of type IIS restriction site but containing a unique barcode sequence for sample identification (BC), and a Right sticky end sequence (2) for NS adapter ligation (
Table 2. Comparison of NS by normal library prep after LDA and shortened library prep after LDA with barcode adapter seed
Metric | LDA | LDA with Barcode adapter seed
No. of reads in 1 hour | 258,000 | 307,290
Average Q-score | 10.1 | 9.98
Average length | 1135 | 1017
% Reads with Barcode | 96% | 82%
VII. Application Considerations for Low VAF Detection Using NS
[0079] Low VAF detection is essential for diagnostic applications in cancer.
[0080] Commercial tests based on the Illumina platform, such as FoundationOne and whole exome sequencing, for analysis of tumor mutation burden provide detailed information on potential pathogenic mutations for guiding therapy selection. However, short-read sequencers like Illumina are less suitable for the analysis of large deletions, fusions, and copy number variations. In addition, library preparation for Illumina sequencing typically takes 24 hours, with the sequencing run taking another 2 days and bioinformatic interpretation taking 1-2 days. Consequently, analysis of cancer samples can take a minimum of 4 days from sample to answer. Illumina instruments also require significant capital investment. As such, samples have to be sent to a centralized location for sequencing, which adds additional time for sample processing. These limitations can be overcome by enabling low VAF detection on the NS platform. NS is already well-suited for the analysis of DNA structural variants and copy number variants due to its long-read capability. Adding the capability of low VAF detection to NS will make it the preferred platform for rapid and comprehensive analysis of cancer genomics.
[0081] The variant enrichment method, BDA (Wu et al., 2017; US 2017/0067090; and WO 2019/164885, each of which is incorporated herein by reference in its entirety), was combined with LDA (
VIII. Application Considerations for Mutation Detection in Cancer
[0082] Low VAF detection on NS demonstrated above and in Example 3 makes NS suitable for mutation detection in cancer. Acute Myeloid Leukemia (AML) is a type of blood cancer in which the bone marrow produces abnormal red blood cells, platelets or myeloblasts. Previously, NS could detect only mutations with >20% VAF because of its high error rate. A 7-plex NS AML panel was designed for detecting mutations in 6 genes at 7 loci, which are involved in AML with a sensitivity of 1% VAF. Mutations in all 7 loci are detected in a single multiplex-reaction following the workflow in
[0083] Melanoma is a type of skin cancer in which pigment producing cells called melanocytes become mutated causing cancer. A 15-plex NS melanoma panel was designed for detecting mutations in 9 genes at 15 loci, which are involved in melanoma with sensitivity of 1% VAF in a single reaction. The panel can detect mutations in MAP2K1, MAP2K2, AKT1, AKT3, NRAS, KRAS, PIK3CA, and BRAF genes.
[0084] The panel was further tested on genomic DNA extracted from a fresh frozen melanoma clinical tissue sample (
[0085] Next, the melanoma panel was applied to 25 clinical melanoma tissue samples, including both fresh/frozen (FF) and FFPE tissue (
[0086] Importantly, many of the 153 discordantly called variants based on a 20% VRF threshold could be real mutations missed by NGS. To confirm the discordant NS mutation calls, droplet digital PCR (ddPCR) was performed on 6 FFPE samples at 4 mutation loci (BRAF p.V600, KRAS p.G13, KRAS p.E62, and MAP2K1 p.P124). Of these 24, 11 mutations were called positive by NS, and 13 were called negative by NS. NS was concordant with ddPCR for 10 positive samples and 11 negative samples (
[0087] Next, the reproducibility and robustness of the NS panel was characterized on different types of nanopore sequencing instruments and flow cells. The Oxford Nanopore Flongle flow cell, in particular, is relatively inexpensive at $90, and can further reduce turnaround time relative to MinION by reducing the need for sample batching before sequencing. The NS panel was performed on all 25 melanoma samples on the Flongle. Highly quantitatively similar VRFs were observed as compared to the MinION (
[0088] These results show that BDA combined with LDA can enable NS to be used for cancer mutation profiling in the clinic.
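For the ddPCR comparison described above, the level of agreement can be summarized as percentages. The sketch below is illustrative arithmetic only, using the counts stated in the text (24 sample/locus combinations; 11 NS-positive calls of which 10 were concordant with ddPCR; 13 NS-negative calls of which 11 were concordant). The function name `percent_agreement` is hypothetical.

```python
def percent_agreement(concordant: int, total: int) -> float:
    """Percentage of calls on which NS and ddPCR agree."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * concordant / total


# Counts taken from the ddPCR comparison described in the text.
ns_positive, ns_negative = 11, 13
concordant_positive, concordant_negative = 10, 11

overall = percent_agreement(concordant_positive + concordant_negative,
                            ns_positive + ns_negative)
positive = percent_agreement(concordant_positive, ns_positive)
negative = percent_agreement(concordant_negative, ns_negative)

print(f"overall {overall:.1f}%, positive {positive:.1f}%, negative {negative:.1f}%")
# prints: overall 87.5%, positive 90.9%, negative 84.6%
```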
IX. Interpretation of NS Data that Utilize BDA and LDA
[0089] An embodiment of an algorithm to analyze NS reads from FAST5 files is described below. Similar algorithms starting from FAST5 or FASTQ files can be constructed by one of ordinary skill in the art of bioinformatic processing of sequencing data.
[0090] 1. Remove reads with low quality score (e.g., Q score <7)
[0091] 2. Convert FAST5 to FASTQ file
[0092] 3. For each read, look for the junction site of assembly and deconcatenate reads at the assembly site to generate individual sequences
[0093] 4. Align individual sequences to reference amplicon sequence to generate a sam/bam file
[0094] 5. Analyze the number of reads mapped to the wild type sequence at each position and calculate the variant percent at each position
[0095] 6. Call variant at a position if the variant percentage at that position is above a certain threshold that depends on the error rate of NS for that amplicon sequence (e.g., >40%)
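Steps 3, 5, and 6 of the algorithm above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used to generate the reported data: the junction is assumed to be a known motif that is matched exactly (real NS reads contain errors and would call for approximate matching), and the individual sequences are assumed to be already aligned to, and the same length as, the reference, so that the alignment of step 4 is not shown. The function names, the toy junction motif, and the toy sequences are all hypothetical.

```python
def deconcatenate(read: str, junction: str, min_len: int = 20) -> list[str]:
    """Step 3: split a concatemer read at every exact occurrence of the junction motif."""
    parts = read.upper().split(junction.upper())
    return [p for p in parts if len(p) >= min_len]


def variant_percent(sequences: list[str], reference: str) -> list[float]:
    """Step 5: per-position percentage of sequences that differ from the reference.

    Assumes each sequence is already aligned to the reference with no indels;
    sequences of a different length are skipped.
    """
    ref = reference.upper()
    aligned = [s.upper() for s in sequences if len(s) == len(ref)]
    if not aligned:
        return [0.0] * len(ref)
    percentages = []
    for i, ref_base in enumerate(ref):
        mismatches = sum(1 for s in aligned if s[i] != ref_base)
        percentages.append(100.0 * mismatches / len(aligned))
    return percentages


def call_variants(percentages: list[float], threshold: float = 40.0) -> list[int]:
    """Step 6: return 0-based positions whose variant percentage exceeds the threshold."""
    return [i for i, p in enumerate(percentages) if p > threshold]


# Toy data (hypothetical): one wild-type unit and two units with an A>T change at index 4.
reference = "ACGTACGTAC"
junction = "GGGGGGGG"
read = junction.join(["ACGTACGTAC", "ACGTTCGTAC", "ACGTTCGTAC"])

units = deconcatenate(read, junction, min_len=5)
calls = call_variants(variant_percent(units, reference))
print(units, calls)
```

In this toy case two of the three deconcatenated units differ from the reference at index 4, so that position exceeds the example 40% threshold and is called. In practice the threshold would be chosen per amplicon according to the NS error rate, as stated in step 6.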
X. Examples
[0096] The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.
Example 1— LDA Improves NS Read Throughput and Quality
[0097] The results presented in
Example 2— Variant Allele Detection by NS after LDA
[0098]
Example 3— Detection of 0.1% VAF by NS after LDA using BDA
[0099] Amplicons from a typical BDA reaction were used as the template for PCR with LDA-adapter primers as shown in
Example 4— AML 7-Plex Panel
[0100] Low VAF detection on NS demonstrated above and in Example 3 makes NS suitable for mutation detection in cancer. Acute Myeloid Leukemia (AML) is a type of blood cancer in which the bone marrow produces abnormal red blood cells, platelets or myeloblasts. Previously, NS could detect only mutations with >20% VAF because of its high error rate. A 7-plex NS AML panel was designed for detecting mutations in 6 genes at 7 loci, which are involved in AML with a sensitivity of 1% VAF. Mutations in all 7 loci are detected in a single multiplex-reaction following the workflow in
Example 5— Melanoma 15-Plex Panel
[0101] Melanoma is a type of skin cancer in which pigment producing cells called melanocytes become mutated causing cancer. A 15-plex NS melanoma panel (Table 3) was designed for detecting mutations in 9 genes at 15 loci, which are involved in melanoma with sensitivity of 1% VAF in a single reaction. The panel can detect mutations in MAP2K1, MAP2K2, AKT1, AKT3, NRAS, KRAS, PIK3CA, and BRAF genes.
[0102] Next, the melanoma panel was applied to 25 clinical melanoma tissue samples, including both fresh/frozen (FF) and FFPE tissue (
[0103] Importantly, many of the 153 discordantly called variants based on a 20% VRF threshold could be real mutations missed by NGS. To confirm the discordant NS mutation calls, droplet digital PCR (ddPCR) was performed on 6 FFPE samples at 4 mutation loci (BRAF p.V600, KRAS p.G13, KRAS p.E62, and MAP2K1 p.P124). Of these 24, 11 mutations were called positive by NS, and 13 were called negative by NS. NS was concordant with ddPCR for 10 positive samples and 11 negative samples (
[0104] Next, the reproducibility and robustness of the NS panel was characterized on different types of nanopore sequencing instruments and flow cells. The NS panel was performed on all 25 melanoma samples on the Oxford Nanopore Flongle flow cell. Highly quantitatively similar VRFs were observed as compared to the MinION (
[0105] These results show that BDA combined with LDA can enable NS to be used for cancer mutation profiling in the clinic.
TABLE-US-00003 TABLE 3 Oligos used for NS using combined BDA and LDA

Gene         Design  SEQ ID NO  Sequence
rs3789806    FP      1          CTTGTATATAGACGGTAAAATAAACACCAAGA
             RP      2          AGGCACCAGAAGTCATCAGAATG
             B       3          TAAACACCAAGACGTGGTAAATATTTACCTGGT/iSpC3//iSpC3/CG
FLT3         FP      4          CCCTGACAACATAGTTGGAATCA
             RP      5          ACTCCAGGATAATACACATCACAGT
             B       6          TGGAATCACTCATGATATCTCGAGCCA/SpC3//iSpC3/CG
DNMT3A       FP      7          AGCAGTCTCTGCCTCGC
             RP      8          AGAAGATTCGGCAGAACTAAGCA
             B       9          CCTCGCCAAGCGGCTCATGTT/iSpC3//iSpC3/AC
IDH1         FP      10         GCTTGTGAGTGGATGGGTAAAAC
             RP      11         TGTGTTGAGATGGACGCCTATT
             B       12         TGGGTAAAACCTATCATCATAGGTCGTCATG/iSpC3//iSpC3/TC
IDH2_140     FP      13         AGAAGATGTGGAAAAGTCCCAATG
             RP      14         GTGCCCAGGTCAGTGGAT
             B       15         TCCCAATGGAACTATCCGGAACATCC/iSpC3//iSpC3/GC
IDH2_172     FP      16         CTGGTCGCCATGGGCGT
             RP      17         TGAAGAAGATGTGGAAAAGTCCCA
             B       18         GGCGTGCCTGCCAATGGTGA/iSpC3//iSpC3/GA
KIT          FP      19         TCCTTTAACCACATAATTAGAATCATTCTTGA
             RP      20         AGTTAGTTTTCACTCTTTACAAGTTAAAATGA
             B       21         ATCATTCTTGATGTCTCTGGCTAGACCAAA/iSpC3//iSpC3/CT
NPM1         FP      22         GTTTAAACTATTTTCTTAAAGAGACTTCCTCC
             RP      23         TTAAAGTGTTTGGAATTAAATTACATCTGAGT
             B       24         ACTTCCTCCACTGCCAGAGATCTTGAA/SpC3//iSpC3/GG
MAP2K1_57    FP      25         TTGAGGCCTTTCTTACCCA
             RP      26         GGCTTGTGGGAGACCTTGA
             B       27         TCTTACCCAGAAGCAGAAGGTGGGA/iSpC3//iSpC3/AA
MAP2K1_121   FP      28         AGCTGCAGGTTCTGCAT
             RP      29         AGCCACCCAACTCTTAAGGC
             B       30         CTGCATGAGTGCAACTCTCCGTACA/iSpC3//iSpC3/GC
MAP2K1_203   FP      31         CGCTGACCCCAAAGTCACA
             RP      32         AGTTCCCTCCTTTTCTATTTTCTCTTC
             B       33         AAAGTCACAGAGCTTGATCTCCCCAC/iSpC3//iSpC3/GC
MAP2K2_57    FP      34         TTCGCCGACCTTGGCT
             RP      35         AGTCTCCCTAGGTAGCTAACCC
             B       36         TTGGCTTTCTGGGTGAGAAAGGCTT/iSpC3//iSpC3/AC
MAP2K2_125   FP      37         CTGCAGGTCCTGCACGA
             RP      38         GGACGCACTCACCATGTGT
             B       39         CTGCACGAATGCAACTCGCCGTA/iSpC3//iSpC3/TA
MAP2K2_207   FP      40         CATCCTCGTGAACTCTAGAGG
             RP      41         GGGACTCACAGCCATGTAGG
             B       42         TCTAGAGGGGAGATCAAGCTGTGTGA/iSpC3//iSpC3/AA
AKT1         FP      43         AACACCTTCATCATCCGCT
             RP      44         CCATCCCCGTGTCCCTC
             B       45         CGCTGCCTGCAGTGGACCACT/iSpC3//iSpC3/CC
AKT3         FP      46         TCAAAAGGAAGTATCTTGGCCTCC
             RP      47         CCAGTGTTGTAGGACATATATTGTACC
             B       48         GCCTCCAGTTTTTTATATATTCTCCTACATGAGG/iSpC3//iSpC3/AA
NRAS_12      FP      49         CAGTGCGCTTTTCCCAACA
             RP      50         GCTTTAAAGTACTGTAGATGTGGCTC
             B       51         CCCAACACCACCTGCTCCAACC/iSpC3//iSpC3/CT
NRAS_61      FP      52         TTGTTGGACATACTGGATACAGC
             RP      53         GGTTAATATCCGCAAATGACTTGC
             B       54         GATACAGCTGGACAAGAAGAGTACAGTG/iSpC3//iSpC3/AC
KRAS_12      FP      55         GTCAAGGCACTCTTGCCTAC
             RP      56         TGTATTAACCTTATGTGTGACATGTTCTAA
             B       57         TGCCTACGCCACCAGCTCCA/iSpC3//iSpC3/TT
KRAS_61      FP      58         TCTTGGATATTCTCGACACAGCA
             RP      59         TTATATTCAATTTAAACCCACCTATAATGGTG
             B       60         CACAGCAGGTCAAGAGGAGTACAGTG/iSpC3//iSpC3/AC
PIK3CA_542   FP      61         CAATTTCTACACGAGATCCTCTCT
             RP      62         GGTATGGTAAAAACATGCTGAGATCA
             B       63         ATCCTCTCTCTGAAATCACTGAGCAGG/iSpC3//iSpC3/AC
PIK3CA_1047  FP      64         TTGGAGTATTTCATGAAACAAATGAATGAT
             RP      65         CAGTGCAGTGTGGAATCCAG
             B       66         CAAATGAATGATGCACATCATGGTGGC/iSpC3//iSpC3/GT
BRAF_600     FP      67         GGACCCACTCCATCGAGAT
             RP      68         TTACTTACTACACCTCAGATATATTTCTTCATG
             B       69         CCATCGAGATTTCACTGTAGCTAGACCAAAA/iSpC3/iSpC3/AA
LDA-adapter  FP      70         CAATTCGGTCTCCAGTG-Gene specific sequence
             RP      71         CAATTCGGTCTCCCACT-Gene specific sequence
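The two LDA-adapter prefixes in Table 3 (SEQ ID NO 70 and 71) each carry the sequence GGTCTC followed by one base and a 4-nt block. The sketch below extracts those blocks and checks that they are reverse complements of each other, as claim 1 requires of the Left and Right sticky ends. It assumes that GGTCTC is cut like a BsaI-type type IIS site (1 nt downstream on the primer strand, leaving a 4-nt 5′ overhang); this section does not name the enzyme, and the helper names are illustrative only.

```python
# Assumed cut geometry (BsaI-type): site, 1-nt spacer, then a 4-nt overhang.
SITE = "GGTCTC"
SPACER = 1
OVERHANG_LEN = 4

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def sticky_end(prefix: str) -> str:
    """Return the 4-nt block that would form the overhang after cutting."""
    idx = prefix.find(SITE)
    if idx < 0:
        raise ValueError(f"no {SITE} site in {prefix!r}")
    start = idx + len(SITE) + SPACER
    end = start + OVERHANG_LEN
    if end > len(prefix):
        raise ValueError(f"{prefix!r} is too short to contain the overhang")
    return prefix[start:end]


# Adapter prefixes copied from Table 3 (gene-specific part omitted).
FP_PREFIX = "CAATTCGGTCTCCAGTG"
RP_PREFIX = "CAATTCGGTCTCCCACT"

left = sticky_end(FP_PREFIX)
right = sticky_end(RP_PREFIX)

# Head-to-tail ligation needs the left end to pair with the right end.
head_to_tail = left == reverse_complement(right)
# A palindromic overhang would also allow head-to-head ligation.
self_complementary = left == reverse_complement(left)

print(left, right, head_to_tail, self_complementary)
# prints: AGTG CACT True False
```

Under the assumed cut geometry, the overhangs pair with each other but not with themselves, which is consistent with monomers joining head to tail into a linear concatemer.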
[0106] All of the methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
REFERENCES
[0107] The following references, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.
[0108] US 2017/0067090
[0109] WO 2019/164885
[0110] Wu et al. (2017). Multiplexed enrichment of rare DNA variants via sequence-selective and temperature-robust amplification. Nature Biomedical Engineering, 1(9), 714-723.