METHOD FOR SELECTION OF CORRECT NUCLEIC ACIDS

20210017591 · 2021-01-21

    Abstract

    Selective removal of erroneous nucleic acids or the selective retrieval of correct nucleic acids is enabled by controlled complementary strand synthesis using compositions of nucleotides at each cycle of the synthesis that facilitate the extension of correctly templated complementary strands and the termination of incorrectly templated complementary strands to the effect of allowing sufficient biochemical discrimination between correct and erroneous nucleic acids, for example, based on the completeness of the complementary strand synthesis.

    Claims

    1. A method of retrieving at least one nucleic acid from a plurality of template nucleic acids, comprising: a controlled cyclical synthesis of nucleic acid strands complementary to said template nucleic acids, wherein, at each cycle of said controlled cyclical synthesis, compositions of substrate nucleotides are provided in a way that corresponds to a desired nucleic acid sequence or desired nucleic acid sequence(s) to the effect of selectively extending complementary nucleic acid strands of template nucleic acids comprising said desired sequence(s); and a subsequent retrieval method comprising: the selective retrieval of template nucleic acids comprising said desired sequence(s); or the selective removal of template nucleic acids which do not comprise said desired sequence(s).

    2. The method of claim 1, wherein said compositions of substrate nucleotides comprise at least one type of natural nucleotide.

    3. The method of claim 1, wherein said compositions of substrate nucleotides comprise at least one type of reversibly terminated nucleotide.

    4. The method of claim 1, wherein said compositions of substrate nucleotides comprise at least one type of irreversibly terminated nucleotide.

    5. The method of claim 1, wherein said compositions of substrate nucleotides comprise at least one type of natural nucleotide and one type of reversibly terminated nucleotide.

    6. The method of claim 1, wherein said compositions of substrate nucleotides comprise at least one type of natural nucleotide and at least one type of irreversibly terminated nucleotide.

    7. The method of claim 1, wherein said compositions of substrate nucleotides comprise at least one type of reversibly terminated nucleotide and at least one type of irreversibly terminated nucleotide.

    8. The method of claim 1, wherein said compositions of substrate nucleotides comprise at least one type of natural nucleotide, one type of reversibly terminated nucleotide and one type of irreversibly terminated nucleotide.

    9. The method of claim 1, wherein said compositions of substrate nucleotides comprise at least one type of reversibly terminated nucleotide and said controlled cyclical synthesis of nucleic acid strands complementary to said template nucleic acids comprises a step to terminate complementary strand synthesis of template nucleic acids that failed to provide a template base complementary to the at least one type of reversibly terminated nucleotide provided in the previous extension step.

    10. The method of claim 1, wherein said template nucleic acids are initially single-stranded.

    11. The method of claim 1, wherein said template nucleic acids are initially double-stranded.

    12. The method of claim 1, wherein at least one strand of each said template nucleic acids is immobilised on a solid surface.

    13. The method of claim 1, wherein said subsequent selective retrieval method is enabled by the fully or partially single-stranded nature of template nucleic acids not comprising said desired sequence.

    14. The method of claim 1, wherein said subsequent selective retrieval method is enabled by the fully or partially single-stranded nature of template nucleic acids not comprising said desired sequence.

    15. The method of claim 1, wherein said subsequent selective retrieval method is enabled by the full or partial absence of a primer binding site in the synthesized complementary strands of template nucleic acids not comprising said desired sequence.

    16. The method of claim 1, wherein said controlled cyclical synthesis of nucleic acid strands complementary to said template nucleic acids and said selective retrieval method is performed repeatedly in a recurrent manner.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0014] Some embodiments of the present invention are illustrated as an example and are not limited by the figures of the accompanying drawings, in which like references may indicate similar elements and in which:

    [0015] FIG. 1 depicts an overview of one example of a workflow according to various embodiments of the invention.

    [0016] FIG. 2 illustrates one example cycle of controlled complementary strand synthesis according to various embodiments of the invention described herein, detailing the use of a mixture of reversibly terminated nucleotides and irreversibly terminated nucleotides.

    [0017] FIG. 3 illustrates one example cycle of controlled complementary strand synthesis according to various embodiments of the invention described herein, detailing the use of reversibly terminated nucleotides and capping of unextended 3′-OH groups.

    [0018] FIG. 4 illustrates example cycles of controlled complementary strand synthesis according to various embodiments of the invention described herein, detailing the use of natural nucleotides.

    [0019] FIG. 5 depicts an overview of one example of a workflow according to various embodiments of the invention, detailing immobilisation of nucleic acids at the 5′ end.

    [0020] FIG. 6 depicts an overview of one example of a workflow according to various embodiments of the invention, detailing immobilisation of nucleic acids at the 5′ end and the use of hairpin loops for self-priming of complementary strand synthesis.

    [0021] FIG. 7 depicts one example of a workflow to retrieve error-free nucleic acids after controlled complementary strand synthesis according to various embodiments of the invention, detailing the use of a double-strand specific restriction enzyme [SEQ ID NO:8] for the release from a solid support.

    [0022] FIG. 8 depicts one example of a workflow to retrieve error-free nucleic acids after controlled complementary strand synthesis according to various embodiments of the invention, detailing the use of denaturing conditions to release complementary strands from a solid support and targeted degradation of single-stranded nucleic acids originating from erroneous strands.

    [0023] FIG. 9 depicts one example of a workflow to retrieve error-free nucleic acids after controlled complementary strand synthesis according to various embodiments of the invention, detailing a cyclical process in which denaturing conditions are used to release complementary strands from a solid support followed by an optional amplification step, binding onto a solid support based on a primer binding site only present in previously completely synthesized strands and subsequent controlled complementary strand synthesis.

    [0024] FIG. 10 depicts one example of a workflow to retrieve error-free nucleic acids after controlled complementary strand synthesis according to various embodiments of the invention, detailing the use of denaturing conditions to release complementary strands from a solid support and polymerase chain reaction-based amplification of error-free nucleic acids.

    [0025] FIG. 11 depicts one example of a workflow to selectively remove erroneous nucleic acids from a solid support after controlled complementary strand synthesis according to various embodiments of the invention, detailing 5′ end immobilised nucleic acids and the use of single-strand specific endonucleases [SEQ ID NO:9].

    [0026] FIG. 12 depicts one example of a workflow to selectively remove erroneous nucleic acids from a solid support after controlled complementary strand synthesis according to various embodiments of the invention, detailing 3′ end immobilised nucleic acids, 5′ protected ends and the use of single-strand specific endonucleases [SEQ ID NO:9] and 5′ specific exonucleases [SEQ ID NO:1].

    [0027] FIG. 13 depicts one example of a workflow according to various embodiments of the invention, detailing the use of strand-displacing polymerases.

    DETAILED DESCRIPTION

    [0028] Nucleic acids of various sources may be subjected to the technique described herein and, despite using DNA as an example of a nucleic acid for the subsequent description, it will be appreciated that the technique could also be applied to other types of nucleic acid, such as XNA (xeno nucleic acid) or RNA.

    [0029] The technique described herein provides a method of retrieving or enriching for error-free DNA from a mixture of erroneous and error-free DNA, which may be a product of de novo nucleic acid synthesis or other processes and which may contain any type of errors such as deletions (where one or multiple nucleotides are missing), insertions (where one or multiple additional nucleotides are inserted), substitutions (where one or multiple nucleobases are exchanged for other nucleobases) or chemical alterations of the structure of the nucleic acid.

    [0030] The technique described herein addresses the demand for accurately synthesized DNA in various technological fields, such as biotechnology, nanotechnology or data storage, and avoids expensive and time-consuming techniques, such as molecular cloning, hybridisation-based error correction or barcode-based retrieval of sequence-confirmed DNA.

    [0031] In the present invention, retrieval of or enrichment for error-free DNA from a heterogeneous population of error-free and erroneous DNA is enabled by controlled complementary strand synthesis. Since specific compositions of nucleotides are provided at each cycle of the synthesis according to the expected sequence of the interrogated template strand, only templates that present the correct template base at each cycle (or in case of deletions or insertions in homopolymer regions the correct number of the same base in a row) will ultimately be able to follow this dictated synthesis while an erroneous template strand will fail the complete synthesis of its corresponding complementary strand. The resulting difference between error-free and erroneous templates in the success of complementary strand synthesis can then be leveraged for selective degradation or inactivation of erroneous strands or selective amplification or elution of error-free strands.

    [0032] Use of this method may be envisioned at multiple stages in the process of DNA synthesis, for example at the stage of short oligonucleotide synthesis or after the stage of fragment assembly from short oligonucleotides. The method is not limited to any specific synthesis method or source of nucleic acid, for example it may be applied in the currently most common nucleic acid synthesis technique, phosphoramidite-based oligonucleotide synthesis, or in enzymatic synthesis techniques based on terminal deoxynucleotidyl transferases [SEQ ID NO:10].

    [0033] FIG. 1 schematically illustrates an example of a workflow of the method described herein. The workflow starts from various sources of nucleic acids (e.g. microarray-based oligonucleotide synthesis or longer assemblies of DNA), which may be immobilized on a solid support and which, for double-stranded DNA, may undergo treatment to remove one of the two strands. At the next step, one error-free and one erroneous strand are shown schematically, with each respective 3′ end immobilized on a solid support and with a small circle indicating an error of any of the types mentioned above. It should be obvious that in reality many molecules will be present at once and, moreover, for example in parallelized synthesis approaches such as microarray-based oligonucleotide synthesis, different target oligonucleotides may be present on different spots of a chip with a certain spatial patterning and means to address each spot individually (e.g. through differential illumination, differential temperature regulation or differential control of acidity). The immobilized nucleic acids which will serve as synthesis templates are shown with an annealed primer at the next step of the flowchart, the binding site of which may be present on all immobilized template strands. Alternatively, a range of different primer sequences may be used, for example to allow for interrogation of nucleic acids on certain spots of a microarray. At the next step an exemplified output of the controlled complementary strand synthesis is shown, which, in the illustrated case, led to a completed complementary strand for the error-free template strand and to an incompletely synthesized complementary strand for the erroneous template due to termination at the site of the error, leaving the template strand partially single-stranded. FIG. 1 indicates that the output of said complementary strand synthesis may be further processed to selectively degrade erroneous and thus incompletely synthesized strands or to selectively amplify or release error-free and thus completely synthesized strands. The workflow in FIG. 1 may be controlled partially or completely with control devices, such as computer programs, fluid control systems and apparatuses to address individual spots of immobilized nucleic acids as mentioned above.

    [0034] A detailed view of one cycle of said controlled complementary strand synthesis according to a preferred embodiment of the present invention is shown in FIG. 2. For clarity, a very short stretch of DNA sequence is shown [SEQ ID NO:14-18], although obviously the method is not limited to short DNA sequences. At the beginning of said cycle, all complementary strands shown (error-free and three erroneous strands each with a different type of error) have been extended by six nucleotides in previous cycles according to the 3′-immobilized template strand and, at this point, further extension of all strands is reversibly blocked by protection moieties, for example by the presence of a modification at the terminal nucleotide or because of the use of single-turnover variants of DNA polymerases, which sterically or chemically block further extension at the terminal nucleotide. Continuation of complementary strand synthesis thus requires a deprotection step, which, in the example, leads to the generation of a free 3′-OH at the terminal nucleotides as an acceptor site for the next nucleotide. After deprotection and exposure of 3′-OH groups at the termini, a mixture of modified nucleotides is provided, for example through a fluidic control system. In the example, a G is expected at the interrogated site in the template strand and thus reversibly terminated dCTP (rt-dCTP) is provided together with a DNA polymerase, leading to extension of the correctly templated complementary strand by said nucleotide (rt-dCTP). In the example, simultaneous or sequential provision of irreversibly terminated (in this case dideoxy) nucleotides ddATP, ddGTP and ddTTP leads to the extension of incorrectly templated complementary strands by one of these irreversibly terminated nucleotides and thus to blockage of any further extension in subsequent cycles. After this extension step, the cycle is completed and a new cycle can be started by deprotection as described above. From the example, it is obvious that only the correct strand would undergo deprotection and further extension.
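    The cycle described for FIG. 2 may be pictured with a minimal Python sketch. It is an idealised illustration only, not part of the disclosure: the function name `terminator_synthesis`, the example sequences and the assumptions of perfect polymerase fidelity and complete incorporation are hypothetical. Templates are written in the order in which the polymerase reads them, and an irreversibly terminated nucleotide is shown in lower case.

```python
# Hypothetical sketch: one rt-nucleotide per cycle plus the three other ddNTPs.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def terminator_synthesis(template, expected):
    """Return (complementary_strand, completed) for one template strand.

    template -- actual template bases, in the order the polymerase reads them
    expected -- desired template bases, in the same order
    """
    strand = []
    for cycle, expected_base in enumerate(expected):
        if cycle >= len(template):
            # Template ended early: nothing left to copy.
            return "".join(strand), False
        template_base = template[cycle]
        if template_base == expected_base:
            # The reversibly terminated nucleotide is incorporated and
            # deprotected again at the start of the next cycle.
            strand.append(COMPLEMENT[expected_base])
        else:
            # A dideoxy nucleotide matching the wrong template base is
            # incorporated instead; extension stops for good.
            strand.append(COMPLEMENT[template_base].lower())
            return "".join(strand), False
    # Complete only if no template bases remain single-stranded.
    return "".join(strand), len(template) == len(expected)


if __name__ == "__main__":
    expected = "ACGGGTCA"
    pool = {
        "error-free": "ACGGGTCA",
        "substitution": "ACGAGTCA",
        "deletion": "ACGGTCA",
        "insertion": "ACGGGGTCA",
    }
    for name, template in pool.items():
        print(name, terminator_synthesis(template, expected))
```

    In this sketch only the error-free template yields a complete complementary strand; each erroneous template ends in a lower-case terminator at the first position where it deviates from the expected sequence.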

    [0035] FIG. 3 is analogous to the cycle described in FIG. 2 but represents another embodiment of the invention, where, instead of irreversible terminators, 3′-OH capping is used to terminate incorrectly templated complementary strands [SEQ ID NO:14-18]. In the example shown in FIG. 3, after deprotection of the 3′-O-blocked termini of complementary strands, a 3′-O-blocked rt-dCTP and a DNA polymerase are provided, as a template G is expected at the upcoming position. As a consequence, only the correctly templated strand undergoes extension and bears a reversibly terminated 3′ end, while others expose a 3′-OH group. In the next step, a treatment is performed that irreversibly caps said 3′-OH groups so that no further extension can occur in subsequent cycles. For example, this may be achieved by means of chemical or enzymatic modification or by incubation with a DNA ligase [SEQ ID NO:11] and a pool of short random oligonucleotides (e.g. 4N) bearing a dideoxy 3′ end to the effect of hybridising and ligating said oligonucleotides to the free 3′-OH of incorrectly templated complementary strands. After capping is complete, the next cycle of the complementary strand synthesis can be performed, starting again with deprotection.
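    The three steps of the capping cycle (deprotection, extension, capping) can be sketched as a small state machine. This is a hypothetical illustration under idealised assumptions (every step runs to completion, no misincorporation); the class `Strand`, the function `run_cycle` and the state names are invented for the example and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Strand:
    template: str         # template bases in the order the polymerase reads them
    length: int = 0       # nucleotides added to the complementary strand so far
    end: str = "blocked"  # "blocked" (reversible), "free_oh" or "capped"


def run_cycle(strand, expected_base):
    """Apply one deprotect / extend / cap cycle to a strand in place."""
    # 1. Deprotection: a reversibly blocked terminus becomes a free 3'-OH.
    if strand.end == "blocked":
        strand.end = "free_oh"
    # 2. Extension: the reversibly terminated nucleotide is incorporated
    #    only opposite the expected template base.
    if (
        strand.end == "free_oh"
        and strand.length < len(strand.template)
        and strand.template[strand.length] == expected_base
    ):
        strand.length += 1
        strand.end = "blocked"
    # 3. Capping: any 3'-OH still exposed is blocked irreversibly.
    if strand.end == "free_oh":
        strand.end = "capped"


def synthesize(template, expected):
    """Run one cycle per expected template base and return the final strand."""
    strand = Strand(template)
    for expected_base in expected:
        run_cycle(strand, expected_base)
    return strand


if __name__ == "__main__":
    for template in ("ACGGGTCA", "ACGAGTCA"):
        print(template, synthesize(template, "ACGGGTCA"))
```

    A strand that reaches the "capped" state never leaves it, which is the property the capping step is meant to provide: once a cycle is missed, later cycles cannot rescue the strand.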

    [0036] Another embodiment of the present invention is illustrated in FIG. 4, where natural nucleotides (dNTPs) are used for controlled complementary strand synthesis and three cycles of the synthesis process are representatively shown [SEQ ID NO:14-18]. In the first cycle a G is expected as the next template base and thus a DNA polymerase is provided together with dCTP. In the example, provision of dCTP leads to extension of the correctly templated complementary strand by three Cs due to the presence of a homopolymer stretch of three Gs in the template. Likewise, the erroneous strands each get their complementary strands extended by, in this case, two Cs. From the example, as synthesis continues over the next two cycles, it is obvious that the complementary strands for the erroneous templates with the substitution or the insertion will fail to extend and will each lag behind by two bases after two further cycles due to the absence of a correct template base for the provided nucleotide. Moreover, it is obvious that this lag effect will become more pronounced over continued cycles, leading to incomplete complementary strand synthesis for substitution or insertion errors, whereas deletion errors will not influence the synthesis completeness of the respective complementary strands over continued cycles.
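    The behaviour of the natural-nucleotide embodiment, including the homopolymer effect and the differing fate of substitution, insertion and deletion errors, can be reproduced with a short sketch. It is a simplified, hypothetical model: the function name `natural_dntp_synthesis` and the example sequences are assumptions, one dNTP is provided per homopolymer run of the expected sequence, and the polymerase is assumed to extend without error for as long as the template base matches.

```python
from itertools import groupby


def natural_dntp_synthesis(template, expected):
    """Return (bases_copied, completed) when one natural dNTP is given per cycle.

    template -- actual template bases, in the order the polymerase reads them
    expected -- desired template bases, in the same order
    """
    # One cycle per homopolymer run of the expected sequence, e.g.
    # "ACGGGTCA" -> cycles for template bases A, C, G, T, C, A.
    cycles = [base for base, _ in groupby(expected)]
    position = 0
    for base in cycles:
        # With an unblocked natural nucleotide the polymerase keeps
        # extending across the whole run of matching template bases.
        while position < len(template) and template[position] == base:
            position += 1
    return position, position == len(template)


if __name__ == "__main__":
    expected = "ACGGGTCA"
    for name, template in [
        ("error-free", "ACGGGTCA"),
        ("deletion in homopolymer", "ACGGTCA"),
        ("substitution", "ACGAGTCA"),
        ("insertion", "ACGGAGTCA"),
    ]:
        print(name, natural_dntp_synthesis(template, expected))
```

    In this model the substitution and insertion templates fall behind and never finish, whereas the template with a deletion is still copied to its end, in line with the paragraph above.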

    [0037] In different embodiments of the invention, the nucleic acids to be interrogated for errors may be immobilized at the 5′ or the 3′ end. For example, 3′-end immobilization is common in phosphoramidite synthesis whereas for enzymatic de novo synthesis 5′-end immobilization is commonly used (illustrated in FIG. 5). While FIGS. 1-4 illustrate different embodiments of the invention described herein based on immobilization at the 3′ end, FIG. 5 shows that 5′-end immobilization may equally be applied in a manner analogous to the previous descriptions herein, with the starting point of the complementary strand synthesis moved from solid phase-proximal to solid phase-distal. In FIG. 6, an embodiment of the invention is shown in which 5′-end immobilization of the nucleic acids to be interrogated is used in combination with a terminal hairpin with priming activity for the controlled complementary strand synthesis.

    [0038] The different embodiments of the invention illustrated in FIGS. 2, 3 and 4 enable the differentiation of correct nucleic acids based on the progress of complementary strand synthesis. An example of how said differentiation may be exploited is shown in FIG. 7, where a site-specific endonuclease (e.g. Type II or Type IIs restriction enzymes such as NotI [SEQ ID NO:12] or BsaI [SEQ ID NO:13], respectively) with exclusive activity on double-stranded nucleic acids is used to release strands that have undergone complete complementary strand synthesis due to the absence of errors in the respective template strands. Released strands may be collected and further processed or used for the intended application. It is not necessary for the accuracy of the complementary strand synthesis and the selective release to be 100% perfect. An enrichment of correct nucleic acids over erroneous nucleic acids already yields significant cost and time advantages.

    [0039] FIG. 8 shows another embodiment of the invention, where a release method employing denaturing of the double-stranded nucleic acids after complementary strand synthesis is performed. Only non-immobilized strands (which correspond to the complementary strands) will be eluted and may subsequently be annealed to a primer, which can prime the uncontrolled polymerization (i.e. provision of all four natural dNTPs) of a new complementary strand corresponding to the template strand of the first controlled synthesis. Next, a treatment with a single-strand specific exo- or endonuclease (e.g. Exonuclease I [SEQ ID NO:6] or Mung Bean Nuclease [SEQ ID NO:9], respectively) may be used to degrade single-stranded nucleic acids, which may result from incompletely synthesized complementary strands from the first controlled synthesis and which were hence unable to bind to the primer prior to the uncontrolled polymerization in the previous step.

    [0040] FIG. 9 extends the concept shown in FIG. 8 to a cyclical process that may also be applied to other embodiments of the invention. In the example in FIG. 9, denaturing conditions are used to release complementary strands from a solid support, which may be followed by an amplification process to improve yields of this process. The amplification may for example be performed in solution, in solution in a small fluidic droplet, or in a droplet on a microchip to allow spatial separation from other nucleic acids being synthesized. The amplification may be carried out using polymerase chain reaction or an isothermal amplification process. The optional amplification step may then be followed by binding onto a solid support based on a primer binding site and subsequent controlled complementary strand synthesis. Instead of post-hoc immobilization, said amplification process may already be carried out with immobilized primers (e.g. Bridge PCR or isothermal template walking/WildFire). An example of an amplification-based enrichment for correct nucleic acids is shown in FIG. 10, where it is obvious that terminated strands resulting from the controlled complementary strand synthesis cannot be amplified due to the lack of the corresponding primer binding site and, for example, the presence of a dideoxy 3′ end.
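    Why an imperfect selection step can still be useful, and why repeating it in a cyclical manner helps, can be made concrete with a small calculation. The sketch below is illustrative only: the function `purity_after_rounds`, its parameters and the numerical values are assumptions chosen for the example, not measured data. It assumes that each round retains a fixed fraction of correct strands and a smaller fixed fraction of erroneous strands, and that any amplification between rounds does not change their ratio.

```python
def purity_after_rounds(initial_purity, keep_correct, keep_erroneous, rounds):
    """Return the fraction of correct strands after 0, 1, ..., `rounds` rounds.

    initial_purity -- starting fraction of correct strands (0..1)
    keep_correct   -- fraction of correct strands retained per round
    keep_erroneous -- fraction of erroneous strands retained per round
    At least one of the two retained fractions must be non-zero.
    """
    purity = initial_purity
    history = [purity]
    for _ in range(rounds):
        correct = purity * keep_correct
        erroneous = (1.0 - purity) * keep_erroneous
        purity = correct / (correct + erroneous)
        history.append(purity)
    return history


if __name__ == "__main__":
    # Hypothetical numbers: half of the pool is correct; each round keeps
    # 90% of correct strands and 10% of erroneous strands.
    for round_number, purity in enumerate(purity_after_rounds(0.5, 0.9, 0.1, 3)):
        print(round_number, round(purity, 4))
```

    With these assumed values the correct fraction rises from 0.5 to 0.9 after a single round and continues to increase with each further round, although by a smaller margin each time.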

    [0041] In one embodiment of the invention, instead of selectively releasing correct nucleic acids after controlled complementary strand synthesis, selective detachment of erroneous strands may be performed. FIG. 11 shows how treatment with a single-strand specific endonuclease (e.g. Mung Bean Nuclease [SEQ ID NO:9]) followed by a wash may be used to remove partially single-stranded DNA stemming from incompletely synthesized complementary strands.

    [0042] Another method of selective detachment/degradation of erroneous strands is illustrated in FIG. 12, where controlled complementary strand synthesis was carried out on templates with 3′ end immobilisation and where a combination of a single-strand specific endonuclease [SEQ ID NO:9] and a 5′ exonuclease with activity on double-stranded nucleic acids is used for degradation of partially single-stranded DNA, the latter only acting on 5′ ends generated by said single-strand specific endonuclease. For example, before said nuclease treatment all 5′ ends may be non-phosphorylated whereas 5′ ends generated by the single-strand specific endonuclease will be phosphorylated. A 5′ phosphorylation-specific exonuclease [SEQ ID NO:1] may be used to also degrade double-stranded regions of the erroneous nucleic acids, which in turn expose new single-stranded regions, serving as a substrate for the single-strand specific endonuclease [SEQ ID NO:9].

    [0043] In one embodiment of the present invention, controlled complementary strand synthesis may be initiated on double-stranded nucleic acids. This may be advantageous, for example, if the template strands to be interrogated for errors are expected to form unwanted secondary structures. FIG. 13 shows an example of said process initiated on immobilized double-stranded nucleic acids where the base-by-base complementary strand synthesis is carried out with a polymerase having strand-displacing activity [SEQ ID NO:14] and which is primed through a specifically introduced nick in the non-immobilized strand. After said synthesis process, treatment with a single-strand specific endonuclease [SEQ ID NO:9] and a 5′ exonuclease [SEQ ID NO:1] is carried out in a manner analogous to the example in FIG. 12.

    [0044] In one embodiment of the invention one or more ambiguities may be tolerated or preferred in the synthesis outcome, for example for the purpose of mutagenesis experiments. Multiple versions of complementary strands may be synthesized by providing compositions of nucleotides during base-by-base complementary strand synthesis that allow incorporation of more than one type of base at a certain position.
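    One possible way to plan nucleotide compositions for positions with tolerated ambiguity is sketched below. It is a hypothetical illustration: expressing the desired template sequence with IUPAC ambiguity codes, the function names `nucleotide_mix` and `passes`, and the pairing of extendable nucleotides with terminators for all other bases are assumptions made for the example rather than features stated in the disclosure.

```python
# Standard IUPAC nucleotide ambiguity codes (template bases allowed per code).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def nucleotide_mix(expected_code):
    """Return (extendable, terminating) nucleotides for one cycle.

    expected_code -- IUPAC code for the template base(s) accepted at this site
    """
    allowed = IUPAC[expected_code]
    extendable = sorted(COMPLEMENT[base] for base in allowed)
    terminating = sorted(COMPLEMENT[base] for base in "ACGT" if base not in allowed)
    return extendable, terminating


def passes(template, expected_codes):
    """True if every template base is accepted at its position."""
    return len(template) == len(expected_codes) and all(
        base in IUPAC[code] for base, code in zip(template, expected_codes)
    )


if __name__ == "__main__":
    print(nucleotide_mix("R"))     # site where template A or G is accepted
    print(passes("ACGT", "ACRT"))  # G at the ambiguous site
    print(passes("ACCT", "ACRT"))  # C at the ambiguous site
```

    Under these assumptions, a position written as R would be interrogated with extendable C and T nucleotides, while templates presenting C or T at that position would receive a terminating nucleotide.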

    REFERENCE TO SEQUENCE LISTING SUBMITTED ELECTRONICALLY

    [0045] The official copy of the sequence listing is submitted electronically via EFS-Web as an ASCII-formatted sequence listing with a file named 16946376_SL.txt created on Sep. 29, 2020, and having a size of 52.7 kilobytes, and is filed concurrently with the specification. The sequence listing contained in this ASCII-formatted document is part of the specification and is herein incorporated by reference in its entirety.