MACROMOLECULES ENGINEERED FOR NANOELECTRONIC MEASUREMENT
20230070226 · 2023-03-09
Assignee
Inventors
- Sanjay B. Hari (Sharon, MA, US)
- Peiming Zhang (Gilbert, AZ)
- Barrett Duan (Reading, MA, US)
- Ming Lei (Sharon, MA, US)
Cpc classification
G01N33/48721
PHYSICS
C12Q2563/116
CHEMISTRY; METALLURGY
C12Q1/6876
CHEMISTRY; METALLURGY
International classification
C12Q1/6876
CHEMISTRY; METALLURGY
G01N33/543
PHYSICS
Abstract
The present invention provides methods to engineer enzymes for their integration into a molecular nanowire as a functional component for biopolymer sequencing/identification. The enzymes include but are not limited to DNA polymerase, RNA polymerase, DNA helicase, DNA ligase, DNA exonuclease, reverse transcriptase, RNA primase, ribosome, sucrase, or lactase, which are either natural, mutated, or synthesized.
Claims
1. A system for identification, characterization, or sequencing of a biopolymer comprising, a. a nanogap formed by a first electrode and a second electrode placed next to each other; b. a nucleic acid molecular wire with a length comparable to the nanogap that bridges the nanogap by attaching one end of the molecular wire to the first electrode and another end of the molecular wire to the second electrode each through a chemical bond, wherein two internal nucleosides within the molecular wire at pre-defined positions are functionalized, allowing the attachment of a protein or a sensing molecule, and wherein the molecular wire has one or more attachment sites at each end; and c. a sensing probe with two attachment sites attached to the two corresponding functionalized sites on the molecular wire that can interact or perform a chemical or a biochemical reaction with the biopolymer, wherein the two attachment sites interact with the two functionalized sites on the molecular wire and control the orientation of the sensing probe.
2. The system of claim 1, further comprising, a. a bias voltage that is applied between the first electrode and the second electrode; b. a device that records a current fluctuation through the molecular wire caused by the interaction between the sensing probe and the biopolymer; and c. a software for data analysis that identifies or characterizes the biopolymer or a subunit of the biopolymer.
3. The system of claim 1, wherein the biopolymer is selected from the group consisting of a DNA, an RNA, a protein, a carbohydrate, a polypeptide, an oligonucleotide, a polysaccharide, and their analogues, either natural, synthesized, modified, and a combination thereof.
4. The system of claim 1, wherein the sensing probe is selected from the group consisting of a nucleic acid probe, an enzyme, a receptor, a ligand, an antigen and an antibody, either native, mutated, synthesized, and a combination thereof.
5. The system of claim 4, wherein the enzyme is selected from the group consisting of a DNA polymerase, an RNA polymerase, a DNA helicase, a DNA ligase, a DNA exonuclease, a reverse transcriptase, an RNA primase, a ribosome, a sucrase, lactase, either natural, mutated, synthesized, and a combination thereof.
6. The system of claim 4, wherein the enzyme is engineered to comprise an unnatural amino acid at a pre-defined site.
7. The system of claim 6, wherein the unnatural amino acid used for protein engineering is a selenocysteine or a phenylalanine or a lysine or a derivative thereof, either natural, synthesized, mutated, or a combination thereof.
8. The system of claim 5, wherein the two engineered sites on the DNA or RNA polymerase are configured with one site in a finger domain, and the other site in either an exonuclease, or a palm, or a thumb, or a TPR1 or a DTPR2 domain.
9. The system of claim 5, wherein the DNA or RNA polymerase is engineered to comprise only one or two cysteine residues for attachment to the molecular wire.
10. The system of claim 5, wherein the DNA or RNA polymerase is engineered to comprise at least a selenocysteine or wherein at least one cysteine therein is replaced with a selenocysteine.
11. The system of claim 1, wherein the molecular wire is selected from the group consisting of a single nucleic acid duplex, a nucleic acid duplex duo, a nucleic acid triplex, a nucleic acid quadruplex, a nucleic acid origami structure, and a combination thereof wherein the nucleic acid strand is either in an A-form, a B-form or a Z-form and the nucleic acid bases are either natural or unnatural.
12. The system of claim 11, wherein the single nucleic acid duplex comprises a functionalized nucleic acid base at a pre-defined position on each strand and one attachment site at the end of each duplex or at the end of each strand; and the nucleic acid duplex duo has one functionalized nucleic base on each duplex and one attachment site at the end of each duplex.
13. The system of claim 11, wherein the sequence of a nucleic acid duplex is palindromic.
14. The system of claim 1, wherein the nucleic acid molecular wire comprises an amino function at one of its internal bases at a pre-defined position.
15. The system of claim 14, wherein the base with amino function is further functionalized with a moiety carrying an activated carboxylate, including but not limited to an azide, a maleimide, an exocyclic olefinic maleimide, a furan, a dibenzocyclooctane, a tetrazine, a triazine, an oxadiazole sulfone.
16. The system of claim 11, wherein the nucleic acid duplex duo comprises two double-stranded PNA, XNA or a hybrid of DNA/RNA, DNA/PNA, DNA/XNA, RNA/PNA, RNA/XNA or PNA/XNA, either natural, modified, synthesized, or a combination thereof, or is replaced by two double-stranded PNA, XNA or a hybrid of DNA/RNA, DNA/PNA, DNA/XNA, RNA/PNA, RNA/XNA or PNA/XNA, either natural, modified, synthesized, or a combination thereof.
17. The system of claim 1, wherein the nucleic acid molecular wire comprises at least 50% of GC base pairs.
18. The system of claim 1, wherein the nanogap size or the distance between the ends of the two electrodes is about 2 to 1000 nm, or about 5 to 100 nm, or about 5 to 30 nm.
19. The system of claim 1, wherein the nanogap comprises a plurality of nanogaps, each comprising a pair of electrodes, a molecular wire, a sensing probe, and any feature associated with a single nanogap.
20. A method for identification, characterization, or sequencing of a biopolymer comprising, a. forming a nanogap by placing a first electrode and a second electrode next to each other; b. providing a nucleic acid molecular wire with a length comparable to the nanogap, wherein two internal nucleosides of the molecular wire at pre-defined positions are functionalized, allowing the attachment of a protein or a sensing molecule, and wherein the molecular wire has one or more attachment sites at each end; c. providing a sensing probe that can interact or perform a chemical or a biochemical reaction with the biopolymer, wherein the sensing probe has two attachment sites that can interact with the two functionalized sites on the molecular wire; d. attaching one end of the molecular wire to the first electrode and another end of the molecular wire to the second electrode through attachment sites at the end of the molecular wire; and e. attaching the sensing probe to the molecular wire through the two attachment sites on the sensing probe and the two functionalized sites on the molecular wire, wherein step “e” could occur before step “d” or vice versa.
21-38. (canceled)
Description
BRIEF DESCRIPTION OF DRAWINGS
SUMMARY OF THE INVENTION
[0028] The present invention provides methods to engineer enzymes for their integration into a molecular nanowire as a functional component for biopolymer sequencing/identification. The said enzymes include but are not limited to DNA polymerase, RNA polymerase, DNA helicase, DNA ligase, DNA exonuclease, reverse transcriptase, RNA primase, ribosome, sucrase, or lactase, which are either natural, mutated, or synthesized.
[0029] The biopolymer includes but is not limited to DNA, RNA, oligonucleotides, proteins, peptides, polysaccharides, etc., which are either natural or synthesized; and the molecular nanowire includes, but is not limited to, a double-stranded DNA (dsDNA or DNA duplex), a DNA duo (two dsDNA), a DNA nanostructure as disclosed in [24], or a combination thereof. The DNA duo is a simple DNA nanostructure and has an increased conductivity compared to a single DNA duplex. Below, we use the DNA duo and DNA polymerase to illustrate the method of engineering an enzyme. The same approach or principle applies to a single DNA duplex and a DNA nanostructure, and to sequencing and/or identifying different biopolymers using enzymes as sensors.
DETAILED DESCRIPTION OF THE INVENTION
[0031] In one embodiment of the present invention, the said enzyme is an engineered DNA polymerase that carries unnatural amino acid residues containing an orthogonal functional group at two predefined positions (201,
[0032] In some embodiments of the present invention, the said enzyme is a wild-type DNA polymerase engineered with unnatural amino acids at the pre-selected sites (702,
[0033] In some embodiments, the mutant DNA polymerase includes a fused, genetically-encoded protein conveying enhanced solubility and activity (701) (Sequence ID #1). The fused polymerase is engineered to contain only one or two cysteine residues (
[0034] In some embodiments, the fused polymerase is engineered by replacing some of its cysteines with selenocysteine (301,
[0035] In some embodiments, the unnatural amino acid used for protein engineering is a derivative of selenocysteine (shown in
[0036] In some embodiments, the said unnatural amino acid is a derivative of natural phenylalanine, which is incorporated into the said protein and mutants according to the cloning method stated in Methodology. Some of the phenylalanine derivatives are shown in
[0037] In some embodiments, the said unnatural amino acid is a derivative of natural lysine, which is incorporated into the said protein and mutants according to the cloning method stated in Methodology. Some of the lysine derivatives are shown in
[0038] In some embodiments, this invention provides a DNA duo to form a molecular junction as a medium for incorporating the said protein or a mutant and conveying the protein's movement to electrical signals. Each DNA duplex has one nucleoside functionalized (N.sup.m), able to react with one of the said unnatural amino acids in the engineered protein or polymerase in the case of DNA/RNA sequencing, and two functional groups (B.sup.m) at its two ends for attaching to the two electrodes at the nanogap respectively (
[0039] In some embodiments, the said DNA junction is a single DNA duplex (dsDNA), each strand of which has one nucleoside functionalized (N.sup.m), able to react with the said noncanonical and unnatural amino acids engineered into the said protein or polymerase in the case of DNA/RNA sequencing, and one or two functional groups (B.sup.m) at each end of the duplex for attaching to the two electrodes at the nanogap (
[0040] In some embodiments, the said DNA junction is a DNA nanostructure as disclosed in [24, 25] and two predefined locations in the nanostructure have nucleosides functionalized (N.sup.m), able to react with the said noncanonical and unnatural amino acids engineered into the said protein or polymerase in the case of DNA/RNA sequencing, and one or two functional groups (B.sup.m) at each end of the DNA nanostructure for attaching to the two electrodes at the nanogap (
[0041] In some embodiments, the double-stranded DNA has an amino function at one of its internal bases. For example, an amino group is situated at the 5-position of a pyrimidine base or the 7-position of a purine base. Some of these nucleosides are shown in
[0042] The aminated DNA is further functionalized with functional groups that can specifically react with the said unnatural amino acids engineered into the said protein or polymerase in the case of DNA/RNA sequencing. Some of these functional groups are shown in
[0043] In some embodiments of the present invention, the DNA duo generally comprises two double-stranded DNA molecules with a length that can bridge two electrodes separated by a distance ranging from 3 to 50 nanometers. In some other embodiments, the DNA duo is replaced by two double-stranded RNA, PNA, XNA, or hybrids of DNA to RNA, DNA to PNA, DNA to XNA, RNA to PNA, RNA to XNA, or PNA to XNA.
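As a rough illustration of how a duplex length might be matched to an electrode separation in the 3 to 50 nm range, the following minimal sketch converts a gap size into a base-pair count. It assumes an ideal B-form helix with a rise of about 0.34 nm per base pair and ignores the end linkers; that rise value and the function names are assumptions for illustration and are not taken from this disclosure.

```python
import math

# Assumption: canonical B-form rise per base pair (about 0.34 nm).
# This value is not stated in the disclosure; A-form or Z-form helices differ.
B_FORM_RISE_NM = 0.34


def base_pairs_to_span(gap_nm: float, rise_nm: float = B_FORM_RISE_NM) -> int:
    """Smallest whole number of base pairs whose helix length reaches gap_nm."""
    if gap_nm <= 0 or rise_nm <= 0:
        raise ValueError("gap_nm and rise_nm must be positive")
    # Round before ceil so floating-point noise does not add a spurious base pair.
    return math.ceil(round(gap_nm / rise_nm, 9))


def duplex_length_nm(n_bp: int, rise_nm: float = B_FORM_RISE_NM) -> float:
    """Approximate end-to-end length of a straight duplex of n_bp base pairs."""
    return n_bp * rise_nm


if __name__ == "__main__":
    for gap in (3, 10, 50):
        print(gap, "nm ->", base_pairs_to_span(gap), "bp")
```

Under this assumption, the two ends of the stated gap range correspond to roughly 9 and 148 base pairs; any linker length would reduce the number of base pairs required.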
[0044] In some embodiments, the sequence of a DNA duplex, either alone or being part of a DNA duo or a DNA nanostructure, contains at least 50% of GC base pairs with a length ranging from 10 to 150 base pairs. Besides the canonical bases, the DNA duplex also includes modified nucleobases and/or base analogs for improving its conductivity.
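The two sequence constraints stated above (at least 50% GC base pairs and a length of 10 to 150 base pairs) can be screened with a short script. The sketch below is illustrative only: it handles canonical bases A, C, G, and T, treats one strand of a perfectly complementary duplex as representative of its base pairs, and uses hypothetical function names.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C in one strand; equals the GC base-pair fraction
    of a perfectly complementary duplex."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_duplex_criteria(seq: str, min_gc: float = 0.5,
                          min_len: int = 10, max_len: int = 150) -> bool:
    """Check the length window and minimum GC content described in the text."""
    return min_len <= len(seq) <= max_len and gc_fraction(seq) >= min_gc


if __name__ == "__main__":
    for candidate in ("GCGCATGCGC", "ATATATATAT", "GCGC"):
        print(candidate, meets_duplex_criteria(candidate))
```

Modified nucleobases and base analogs, which the disclosure also contemplates, would need an extended alphabet and are outside this sketch.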
[0045] In some embodiments, the DNA duo comprises palindromic double-stranded DNA that forms spontaneously in solution from a single-stranded oligonucleotide with a self-complementary sequence. Both double-stranded DNA molecules in the DNA duo have the same symmetry, without polarity along their helical axes. When the DNA duo is used as a molecular wire to bridge the nanogap, either of its two ends can be attached to either of the two electrodes without introducing electrical polarity.
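A self-complementary (palindromic) oligonucleotide is one whose sequence equals its own reverse complement, which is why two copies can pair into a duplex that reads identically from either end. The following minimal sketch checks that property for canonical DNA bases; the function names and example sequences are illustrative and are not taken from this disclosure.

```python
# Watson-Crick complement table for canonical DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_self_complementary(seq: str) -> bool:
    """True if the strand can pair with a second copy of itself end to end."""
    seq = seq.upper()
    return bool(seq) and seq == reverse_complement(seq)


if __name__ == "__main__":
    for oligo in ("CGCGAATTCGCG", "GGGCCC", "GATTACA"):
        print(oligo, is_self_complementary(oligo))
```

A sequence of odd length can never pass this check, because its central base would have to be its own complement.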
Methodology
[0046] Cloning. A gene cassette harboring sequences encoding a fusion protein and wild-type DNA polymerase from phi29 (phi29pol) was inserted into a T7-based plasmid such as pET21a and expressed in E. coli. Point mutations were made using PCR with oligonucleotide primers containing desired mutations [23]. The recombinant protein was purified using Ni-NTA agarose. Typical yields are approximately 30 mg per liter of culture (
[0047] Activity assay. In a typical, non-limiting reaction, enzyme (100 ng) is incubated in a buffered solution containing plasmid DNA (20 ng), dNTPs, and single-stranded DNA primer at 30° C. Products are digested with EcoRI, separated by agarose gel electrophoresis, and visualized by fluorescence (
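To anticipate the band pattern from the EcoRI digestion step, fragment sizes can be predicted from the template sequence. The sketch below locates EcoRI sites (recognition sequence GAATTC, cut after the first G) and returns fragment lengths for a circular or linear molecule. The example sequence is a made-up placeholder, not the plasmid used in the assay, and the function names are hypothetical.

```python
from __future__ import annotations

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
CUT_OFFSET = 1         # top-strand cut falls after the first G (G^AATTC)


def ecori_cut_positions(seq: str, circular: bool = True) -> list[int]:
    """Return sorted top-strand cut positions (0-based index of the base after the cut)."""
    seq = seq.upper()
    n = len(seq)
    if n < len(ECORI_SITE):
        return []
    # For a circular molecule, wrap the start onto the end to catch sites spanning the origin.
    search = seq + seq[: len(ECORI_SITE) - 1] if circular else seq
    cuts = set()
    start = search.find(ECORI_SITE)
    while start != -1:
        cuts.add((start + CUT_OFFSET) % n)
        start = search.find(ECORI_SITE, start + 1)
    return sorted(cuts)


def fragment_lengths(seq: str, circular: bool = True) -> list[int]:
    """Lengths of fragments after complete digestion; an uncut molecule is returned whole."""
    n = len(seq)
    cuts = ecori_cut_positions(seq, circular)
    if not cuts:
        return [n]
    if circular:
        # Distance from each cut to the next one around the circle.
        return [((cuts[(i + 1) % len(cuts)] - c) % n) or n for i, c in enumerate(cuts)]
    bounds = [0] + cuts + [n]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    demo = "AAGAATTCAATTGAATTCTT"  # placeholder sequence with two EcoRI sites
    print(fragment_lengths(demo, circular=True))
    print(fragment_lengths(demo, circular=False))
```

The sketch reports top-strand lengths only and ignores the four-base overhangs that EcoRI leaves, which is adequate for estimating gel band positions.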
[0048] DNA-functionalization with DBCO. In a typical, non-limiting reaction, single-stranded DNA containing an amino function (50 μM) is incubated with DBCO-PEG5-TFP ester (2.5 mM) in sodium tetraborate buffer (pH 9) overnight at 25° C. Any unreacted linker is removed by ethanol precipitation.
[0049] Macromolecule-enzyme conjugation. In a typical, non-limiting reaction, enzyme (30 μM) containing a p-azidophenylalanine residue is incubated in a buffered solution containing DBCO-conjugated macromolecules (150 μM) at 20° C. (
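The concentrations given for the two typical reactions above imply a 50-fold molar excess of DBCO-PEG5-TFP ester over aminated DNA (2.5 mM versus 50 μM) and a 5-fold excess of DBCO-conjugated macromolecule over enzyme (150 μM versus 30 μM). The sketch below reproduces that arithmetic and shows a dilution calculation; the stock concentration and reaction volume in the usage example are hypothetical values chosen for illustration.

```python
def molar_excess(reagent_uM: float, substrate_uM: float) -> float:
    """Ratio of reagent to substrate concentration (both in micromolar)."""
    if substrate_uM <= 0:
        raise ValueError("substrate concentration must be positive")
    return reagent_uM / substrate_uM


def stock_volume_uL(final_uM: float, final_volume_uL: float, stock_uM: float) -> float:
    """Volume of stock needed to reach final_uM in final_volume_uL (C1*V1 = C2*V2)."""
    if stock_uM < final_uM:
        raise ValueError("stock must be at least as concentrated as the final solution")
    return final_uM * final_volume_uL / stock_uM


if __name__ == "__main__":
    # Concentrations from the typical reactions described in the text.
    print(molar_excess(2500, 50))   # DBCO ester over aminated DNA
    print(molar_excess(150, 30))    # DBCO-macromolecule over azide-bearing enzyme
    # Hypothetical: 100 mM linker stock, 20 uL reaction volume.
    print(stock_volume_uL(2500, 20, 100_000))
```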
TABLE-US-00001
Sequence listing

Sequence ID #1
Type: protein
Organism: synthetic sequence
Other information: DNA polymerase fusion protein
MGHHHHHHHDSEVNQEAKPEVKPEVKPETHINLKVSDGSSEIFFKIKKTTPLRRLMEAFAKRQGK
EMDSLRFLYDGIRIQADQTPEDLDMEDNDIIEAHREQIGGNGSKHMPRKMYSCDFETTTKVEDCR
VWAYGYMNIEDHSEYKIGNSLDEFMAWVLKVQADLYFHNLKFDGAFIINWLERNGFKWSADGLPN
TYNTIISRMGQWYMIDICLGYKGKRKIHTVIYDSLKKLPFPVKKIAKDFKLTVLKGDIDYHKERP
VGYKITPEEYAYIKNDIQIIAEALLIQFKQGLDRMTAGSDSLKGFKDIITTKKFKKVFPTLSLGL
DKEVRYAYRGGFTWLNDRFKEKEIGEGMVFDVNSLYPAQMYSRLLPYGEPIVFEGKYVWDEDYPL
HIQHIRCEFELKEGYIPTIQIKRSRFYKGNEYLKSSGGEIADLWLSNVDLELMKEHYDLYNVEYI
SGLKFKATTGLFKDFIDKWTYIKTTSEGAIKQLAKLMLNSLYGKFASNPDVTGKVPYLKENGALG
FRLGEEETKDPVYTPMGVFITAWARYTTITAAQACYDRIIYCDTDSIHLTGTEIPDVIKDIVDPK
KLGYWAHESTFKRAKYLRQKTYIQDIYMKEVDGKLVEGSPDDYTDIKFSVKCAGMTDKIKKEVTF
ENFKVGFSRKMKPKPVQVPGGVVLVDDTFTIKSA

Sequence ID #2
Type: protein
Organism: synthetic sequence
Other information: DNA polymerase fusion protein with a single cysteine
MGHHHHHHHDSEVNQEAKPEVKPEVKPETHINLKVSDGSSEIFFKIKKTTPLRRLMEAFAKRQGK
EMDSLRFLYDGIRIQADQTPEDLDMEDNDIIEAHREQIGGNGSKHMPRKMYSADFETTTKVEDAR
VWAYGYMNIEDHSEYKIGNSLDEFMAWVLKVQADLYFHNLKFDGAFIINWLERNGFKWSADGLPN
TYNTIISRMGQWYMIDIALGYKGKRKIHTVIYDSLKKLPFPVKKIAKDFKLTVLKGDIDYHKERP
VGYKITPEEYAYIKNDIQIIAEALLIQFKQGLDRMTAGSDSLKGFKDIITTKKFKKVFPTLSLGL
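Comparing the two listed sequences shows cysteine residues in Sequence ID #1 replaced by alanine in Sequence ID #2. The following sketch reports such substitutions between two aligned sequences of equal length and counts the remaining cysteines. It uses a 17-residue excerpt that appears in both listings, so the reported positions are relative to the excerpt, not to the full protein; the function names are hypothetical.

```python
from __future__ import annotations


def count_residue(seq: str, residue: str = "C") -> int:
    """Number of occurrences of a one-letter residue code (cysteine by default)."""
    return seq.upper().count(residue.upper())


def substitutions(reference: str, variant: str) -> list[str]:
    """List point substitutions as e.g. 'C5A' (1-based position within the inputs)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [
        f"{ref}{pos}{var}"
        for pos, (ref, var) in enumerate(zip(reference.upper(), variant.upper()), start=1)
        if ref != var
    ]


if __name__ == "__main__":
    # Excerpts copied from Sequence ID #1 and Sequence ID #2, respectively.
    excerpt_1 = "KMYSCDFETTTKVEDCR"
    excerpt_2 = "KMYSADFETTTKVEDAR"
    print(substitutions(excerpt_1, excerpt_2))
    print(count_residue(excerpt_1), count_residue(excerpt_2))
```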
[0050] Claimable items of this invention include, but are not limited to, the following:
[0051] An embodiment is a DNA duplex or a DNA duo that bridges a nanogap between two electrodes. The said DNA duplex or DNA duo comprises:
[0052] a. Double-stranded DNA molecules, either in A-form, or B-form or Z-form.
[0053] b. Double-stranded nucleic acid helices including those natural and non-natural.
[0054] c. Double-stranded molecules connected through a biomolecule.
[0055] d. Double-stranded molecules containing linkers at their ends.
[0056] e. Double-stranded molecules containing internal functional groups for attaching recognition molecules including those with a molecular weight ranging from 100 to 200,000 Da.
[0057] f. Double-stranded DNA containing modified nucleotides that increase the conductivity of the double-stranded DNA, as disclosed in [25], such as a single nucleic acid duplex (double strands), a nucleic acid triplex, a nucleic acid quadruplex, a nucleic acid origami structure, and a combination thereof, wherein the nucleic acid bases are either natural, modified or synthesized or the combination thereof.
[0058] An embodiment is a functional protein engineered to at least contain one of the above said noncanonical amino acid residues at predefined positions.
[0059] a. The said protein fused to another protein with enhanced solubility and stability.
[0060] b. The said protein spontaneously and precisely forming covalent connections with an engineered molecular wire.
[0061] An embodiment is a functional protein engineered to contain two of the above said noncanonical amino acid residues at the predefined positions, and the said protein spontaneously and precisely forming covalent connections at two predefined positions on an engineered molecular wire.
[0062] An embodiment is a method to label enzymes with biomolecules and organic molecules.
[0063] An embodiment is the DNA duplex or DNA duo or DNA nanostructure internally carrying a nucleophile capable of reacting with the above said NHS, PFP, or TFP esters of functional molecules or other chemically active species.
[0064] a. The said molecular wire has a length ranging from 2 to 1000 nm, preferably 5 to 100 nm, most preferably 5 to 30 nm.
[0065] b. The said molecular wires spontaneously and precisely forming covalent connections with engineered proteins.
[0066] An embodiment is a method to engineer DNA with different functional groups at predetermined locations.
General Remarks
[0067] All publications, patents, and other documents mentioned herein are incorporated by reference in their entirety.
[0068] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as those commonly understood by one of ordinary skill in the art to which this invention belongs. While the present invention has been illustrated by a description of various embodiments and while these embodiments have been described in considerable detail, it is not the intention of the applicants to restrict or in any way limit the scope of the applications. Additional advantages and modifications will readily appear to those skilled in the art. The invention in its broader aspects is therefore not limited to the specific details, representative device, apparatus and method, and illustrative example shown and described. Accordingly, departures may be made from such details without departing from the spirit of the applicant's general inventive concept.
References
[0069] 1. Smith L M, Sanders J Z, Kaiser R J, Hughes P, Dodd C, Connell C R, et al. Fluorescence detection in automated DNA sequence analysis. Nature. 1986; 321: 674-9.
[0070] 2. Lander E S, Linton L M, Birren B, Nusbaum C, Zody M C, Baldwin J, et al. Initial sequencing and analysis of the human genome. Nature. 2001; 409: 860-921.
[0071] 3. Venter J C, Adams M D, Myers E W, Li P W, Mural R J, Sutton G G, et al. The sequence of the human genome. Science. 2001; 291: 1304-51.
[0072] 4. Margulies M, Egholm M, Altman W E, Attiya S, Bader J S, Bemben L A, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005; 437: 376-80.
[0073] 5. Turcatti G, Romieu A, Fedurco M, Tairi A P. A new class of cleavable fluorescent nucleotides: synthesis and optimization as reversible terminators for DNA sequencing by synthesis. Nucleic Acids Res. 2008; 36: e25.
[0074] 6. Previte M J, Zhou C, Kellinger M, Pantoja R, Chen C Y, Shi J, et al. DNA sequencing using polymerase substrate-binding kinetics. Nat Commun. 2015; 6: 5936.
[0075] 7. Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009; 323: 133-8.
[0076] 8. Stoddart D, Heron A J, Mikhailova E, Maglia G, Bayley H. Single-nucleotide discrimination in immobilized DNA oligonucleotides with a biological nanopore. Proc Natl Acad Sci USA. 2009; 106: 7702-7.
[0077] 9. Dekker C. Solid-state nanopores. Nat Nanotechnol. 2007; 2: 209-15.
[0078] 10. Mandell J G, Gunderson K L, Gundlach J H. Compositions, systems, and methods for detecting events using tethers anchored to or adjacent to nanopores. The United States Patent Application No. 20190376135, 2019.
[0079] 11. Merriman B L S D, Mola P W. Biomolecular sensors and methods. The United States Patent Application No. 20180340220, 2018.
[0080] 12. Merriman B L, Govindaraj V A, Mola P., Geiser T. ENZYMATIC CIRCUITS FOR MOLECULAR SENSORS. The United States Patent Application No. 20180305727, 2018.
[0081] 13. Merriman B L, Govindaraj V A, Mola P., Geiser T., Costa G. BINDING PROBE CIRCUITS FOR MOLECULAR SENSORS. The United States Patent Application No. 20190004003, 2019.
[0082] 14. Merriman B L S D, Mola P., Choi C. MOLECULAR SENSORS AND RELATED METHODS. The United States Patent Application No. 20190094175, 2019.
[0083] 15. Matthews B W. Studies on protein stability with T4 lysozyme. Adv Protein Chem. 1995; 46: 249-78.
[0084] 16. Yutani K, Ogasahara K, Tsujita T, Sugino Y. Dependence of conformational stability on hydrophobicity of the amino acid residue in a series of variant proteins substituted at a unique position of tryptophan synthase alpha subunit. Proc Natl Acad Sci USA. 1987; 84: 4441-4.
[0085] 17. Klein I B, Kirsch J F. The activation of papain and the inhibition of the active enzyme by carbonyl reagents. J. Biol. Chem. 1969; 244: 5928-35.
[0086] 18. Liu H, May K. Disulfide bond structures of IgG molecules: structural variations, chemical modifications and possible impacts to stability and biological function. MAbs. 2012; 4: 17-23.
[0087] 19. Costa S, Almeida A, Castro A, Domingues L. Fusion tags for protein solubility, purification and immunogenicity in Escherichia coli: the novel Fh8 system. Front Microbiol. 2014; 5: 63.
[0088] 20. Wang Y, Prosen D E, Mei L, Sullivan J C, Finney M, Vander Horn P B. A novel strategy to engineer DNA polymerases for enhanced processivity and improved performance in vitro. Nucleic Acids Res. 2004; 32: 1197-207.
[0089] 21. Takahashi H, Yamazaki H, Akanuma S, Kanahara H, Saito T, Chimuro T, et al. Preparation of Phi29 DNA polymerase free of amplifiable DNA using ethidium monoazide, an ultraviolet-free light-emitting diode lamp and trehalose. PLoS One. 2014; 9: e82624.
[0090] 22. Chen B, Long Q, Zhao Y, Wu Y, Ge S, Wang P, et al. Sulfone-Based Probes Unraveled Dihydrolipoamide S-Succinyltransferase as an Unprecedented Target in Phytopathogens. Journal of Agricultural and Food Chemistry. 2019; 67: 6962-9.
[0091] 23. Liu H, Naismith J H. An efficient one-step site-directed deletion, insertion, single and multiple-site plasmid mutagenesis protocol. BMC Biotechnol. 2008; 8: 91.
[0092] 24. Zhang P, Lei M. Devices, methods and chemical reagents for biopolymer sequencing. U.S. Patent Application No. 62/794,096, 2019.
[0093] 25. Zhang P, Krstic P, Lei M. Engineered DNA for molecular electronics. U.S. Patent Application No. 62/938,084, 2019.