NON-LTR-RETROELEMENT REVERSE TRANSCRIPTASE AND USES THEREOF
20220372453 · 2022-11-24
Inventors
- Jennifer L. STAMOS (Austin, TX, US)
- Alfred M. LENTZSCH (Austin, TX, US)
- Seung Kuk PARK (Austin, TX, US)
- Georg Mohr (Austin, TX)
- Alan M. Lambowitz (Austin, TX)
Cpc classification
G16B15/00
PHYSICS
C12Y207/07049
CHEMISTRY; METALLURGY
C30B7/00
CHEMISTRY; METALLURGY
C07K2299/00
CHEMISTRY; METALLURGY
International classification
C12N9/12
CHEMISTRY; METALLURGY
Abstract
A crystal structure of a Non-LTR-retroelement reverse transcriptase and methods of using the same to identify enzymes with improved activity are provided. Mutant reverse transcriptase enzymes and methods of using the same are also provided.
Claims
1-99. (canceled)
100. A non-LTR-retroelement reverse transcriptase comprising an amino acid substitution at an amino acid position corresponding to a position of SEQ ID NO: 1 that (a) contacts a template nucleic acid, a primer oligonucleotide, and/or an incoming dNTP; or (b) is on the surface of the non-LTR-retroelement reverse transcriptase.
101. The non-LTR-retroelement reverse transcriptase of claim 100, comprising an amino acid substitution at an amino acid position corresponding to a position of SEQ ID NO: 1 that (a) contacts a template nucleic acid, a primer oligonucleotide and/or an incoming dNTP.
102. The non-LTR-retroelement reverse transcriptase of claim 101, wherein the amino acid position that contacts a template nucleic acid, a primer oligonucleotide and/or an incoming dNTP is identified as such based on a crystal comprising a substantially pure non-LTR-retroelement reverse transcriptase comprised of at least a reverse transcriptase and a thumb domain in complex with template and primer oligonucleotide and incoming dNTP, wherein the non-LTR-retroelement reverse transcriptase has at least 95% sequence identity to SEQ ID NO: 1, and wherein the crystal has a space group of C 1 2 1.
103. The non-LTR-retroelement reverse transcriptase of claim 101, wherein the amino acid substitution is at a position or set of positions selected from the group consisting of: (i) N23; (ii) N23, Q24, G25, A26, P27, G28, I29, D30, and G31; (iii) I29; (iv) R63; (v) L77 and I79; (vi) R85; and (vii) F143.
104. The non-LTR-retroelement reverse transcriptase of claim 103, wherein the amino acid substitution is at a substitution or set of substitutions selected from the group consisting of: (i) N23A; (ii) replacement of the set of residues N23, Q24, G25, A26, P27, G28, I29, D30, and G31 with GGGG; (iii) I29R; (iv) R63A; (v) L77A and I79A; (vi) R85A; and (vii) F143A.
105. The non-LTR-retroelement reverse transcriptase of claim 100, comprising an amino acid substitution at an amino acid position corresponding to a position of SEQ ID NO: 1 that (b) is on the surface of the non-LTR-retroelement reverse transcriptase.
106. The non-LTR-retroelement reverse transcriptase of claim 105, wherein the amino acid position that is on the surface of the RT is identified as such based on a crystal comprising a substantially pure non-LTR-retroelement reverse transcriptase comprised of at least a reverse transcriptase and a thumb domain in complex with template and primer oligonucleotide and incoming dNTP, wherein the non-LTR-retroelement reverse transcriptase has at least 95% sequence identity to SEQ ID NO: 1, and wherein the crystal has a space group of C 1 2 1.
107. The non-LTR-retroelement reverse transcriptase of claim 105, wherein the amino acid substitution is at a position that does not contact a template nucleic acid, a primer oligonucleotide, and/or an incoming dNTP.
108. The non-LTR-retroelement reverse transcriptase of claim 105, wherein the amino acid substitution is at a position or set of positions selected from the group consisting of: (i) R58 and K160; (ii) K213, R214, and K217; (iii) Q290, Q294, and Q298; (iv) K293 and R297; (v) R327; (vi) R343; (vii) R343 and K339; (viii) R343, R381, K382, R386, K389, and K399; (ix) R345; (x) R360; (xi) R381; (xii) R381, K382, R386, and K389; (xiii) K382; (xiv) R386; and (xv) R413.
109. The non-LTR-retroelement reverse transcriptase of claim 108, wherein the amino acid substitution is at a substitution or set of substitutions selected from the group consisting of: (i) R58A and K160A; (ii) K213A, R214E, and K217A; (iii) Q290A, Q294A, and Q298A; (iv) K293A and R297A; (v) R327L; (vi) R343L; (vii) R343L and K339D; (viii) R343L, R381D, K382D, R386D, K389D, and K399D; (ix) R345L; (x) R360A; (xi) R381A; (xii) R381A/D, K382A, R386A/D, and K389A; (xiii) K382A; (xiv) R386A; and (xv) R413A.
110. The non-LTR-retroelement reverse transcriptase of claim 100, wherein the non-LTR-retroelement reverse transcriptase comprises: increased or decreased template switching activity; increased or decreased processivity; increased or decreased strand displacement activity; or increased or decreased fidelity.
111. The non-LTR-retroelement reverse transcriptase of claim 100, wherein the non-LTR-retroelement reverse transcriptase comprises increased or decreased template switching activity; increased processivity; increased strand displacement activity; or increased fidelity.
112. The non-LTR-retroelement reverse transcriptase of claim 100, wherein the non-LTR-retroelement reverse transcriptase comprises: improved stability, improved solubility, decreased non-specific nucleic acid binding or improved ability to be purified.
113. The non-LTR-retroelement reverse transcriptase of claim 100, wherein the non-LTR-retroelement reverse transcriptase exhibits increased yield during recombinant production.
114. The non-LTR-retroelement reverse transcriptase of claim 100, wherein the non-LTR-retroelement reverse transcriptase comprises a bacterial reverse transcriptase.
115. The non-LTR-retroelement reverse transcriptase of claim 100, wherein the non-LTR-retroelement reverse transcriptase comprises a group II intron reverse transcriptase.
116. The non-LTR-retroelement reverse transcriptase of claim 100, wherein the non-LTR-retroelement reverse transcriptase further comprises a stability tag.
117. The non-LTR-retroelement reverse transcriptase of claim 116, wherein the stability tag comprises MalE.
118. A method for reverse transcribing a template comprising contacting the template with a non-LTR-retroelement reverse transcriptase (RT) in accordance with claim 100 under conditions permissible for reverse transcription.
119. A kit comprising a non-LTR-retroelement reverse transcriptase (RT) in accordance with claim 100.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0037] The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein. The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
[0038]
[0039]
[0040]
[0041]
[0042] HIV-1 RT makes only 6 polar contacts and one peptide backbone H-bond. HCV RdRP preserves the general shape of RT2a insert region but contacts primarily n−1 to n+3, leaving the remaining RNA nucleotides free of polar interactions (the crystallized HCV RdRP construct lacks a large C-terminal segment, which may make additional contacts to the RNA template).
[0043]
[0044]
[0045]
[0046]
[0047]
[0048]
[0049]
[0050]
[0051]
[0052]
[0053]
[0054]
[0055]
DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
I. THE PRESENT EMBODIMENTS
[0056] Bacterial group II intron reverse transcriptases (RTs) function in both intron mobility and RNA splicing and are evolutionary predecessors of retrotransposon, telomerase, and retroviral RTs, as well as spliceosomal protein Prp8 in eukaryotes. The present studies determined a crystal structure of a full-length thermostable group II intron RT in complex with an RNA template/DNA primer and incoming dNTP at 3.0-Å resolution. It was found that the binding of template/primer and key aspects of the RT active site are surprisingly different from retroviral RTs, but remarkably similar to viral RNA-dependent RNA polymerases. The structure reveals a host of features not seen previously in RTs that may contribute to the distinctive biochemical properties of group II intron RTs, and it provides a prototype for many related bacterial and eukaryotic non-LTR-retroelement RTs. It also reveals how protein structural features used for reverse transcription evolved to promote the splicing of both group II and spliceosomal introns.
II. EXAMPLES
[0057] The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.
Example 1
Thermostable Group II Intron Reverse Transcriptase Characterization
[0058] Comparison of Group II Intron RTs and Related Proteins:
[0059] Appended to their common conserved RT and thumb domains, RTs and RdRPs have various additional N- and/or C-terminal domains that specialize the proteins for different functions. Group II intron RTs lack the RNase H domain present in retroviral RTs and instead have a C-terminal DNA binding domain (D), which contributes to recognition of the DNA target site for intron integration during retrohoming (San Filippo and Lambowitz, 2002). Most subgroup IIA and IIB intron RTs, including the Ll.LtrB RT, have an additional DNA endonuclease (En) domain, which nicks the DNA target site to generate a primer for reverse transcription of the intron RNA, whereas group IIC intron RTs, such as GsI-IIC RT, lack an En domain and instead use a nascent strand at a DNA replication fork to prime reverse transcription (reviewed in Lambowitz and Belfort, 2015). Prp8 and some non-LTR retrotransposon RTs have a different appended DNA endonuclease domain (REL), which has not been found associated with group II intron RTs, indicating multiple independent acquisitions of an En domain by this RT family.
[0060] Structure Determination and Overview: The full-length GsI-IIC RT (420 amino acids) with a C-terminal 8x histidine tag (i.e., 8 consecutive histidine residues added after amino acid position 420 of SEQ ID NO:1) was co-crystallized with an 11-bp RNA/DNA heteroduplex with a three nucleotide 5′ overhang of the RNA template strand and a dideoxy nucleotide at the 3′ end of the DNA primer in the presence of dATP and Mg²⁺ to mimic an active polymerase conformation just prior to polymerization of an incoming dNTP (
[0061] The GsI-IIC RT structure follows the canonical right-handed domain organization present in all other RTs of known structure, with 20 α-helices and 9 β-strands (
[0062] Unlike HIV-1 RT, which functions as a heterodimer comprised of a catalytic p66 subunit and a structural p51 subunit, the group II intron RT binds the template/primer substrate as a monomer. The RT2a and RT3a inserts provide structural support for formation of the catalytic right-hand-like structure in the same manner as p51 in HIV-1 RT (
[0063] Surprisingly, the orientation of the fingers and thumb domain relative to the nucleic acid duplex differs markedly between GsI-IIC RT and HIV-1 RT (
[0064] RT Active Site: Overall, the structure of the GsI-IIC RT active site is similar to that of HIV-1 and other RTs, but with key differences that may enhance fidelity (
[0065] In the previous Roseburia intestinalis (R.i.) group II intron RT fragment structure, the RT3a insertion loop occupies the active site forcing the PQG motif near the beginning of RT4 into an inactive conformation (Zhao and Pyle, 2016). By contrast, in the GsI-IIC RT structure with bound template/primer, RT3a is flipped out of the active site, allowing the PQG loop to adopt an active conformation similar to that of HIV-1 RT (
[0066] As in all nucleic acid polymerases, the remainder of the dNTP-binding pocket in GsI-IIC RT is formed by the C-terminal end of a β strand (β5) and the following small helix (α5) of the palm (
[0067] Finally, GsI-IIC RT makes a series of novel contacts to the templating RNA nucleotide (n−1), which may deter misalignment of the template base. These include polar contacts to the phosphates on either side of n−1 from R85 of the fingers and the amide of A26 of RT0 and more extensive hydrophobic contact to the n−1 base by I79 (
[0068] Interactions of the GsI-IIC RT Fingers with the RNA Template Strand: In HIV-1 RT, contacts with the incoming RNA template strand are made by residues near αA, which is not present in GsI-IIC RT, and by the fingertips loop, which is present in GsI-IIC RT and interacts similarly with the bases of n−1 and n+1 through a series of hydrophobic residues (V65, I67, L77 in GsI-IIC RT or F61, I63, L74 in HIV-1 RT) (
[0069] Functions of the NTE/RT0 and RT2a Insertions: Strikingly, the GsI-IIC RT structure shows that the NTE/RT0 and RT2a insertions of group II intron RTs contribute extensive template/primer interactions around the RT active site, which are not present in retroviral RTs but comparable to those in viral RdRPs. For retroviral RTs, the major groove of the bound template/primer duplex is devoid of protein contacts and completely exposed to solvent (
[0070] The RT2a insertion of GsI-IIC RT consists of α3 and α4 joined by an extended loop (
[0071] Notably, the positioning of the first helix of the GsI-IIC RT NTE sterically precludes the RNA/DNA duplex in our structure from adopting the same conformation as RNA/DNA duplexes in HIV-1 and TERT (Das et al., 2014; Mitchell et al., 2010). Because of the location of the α1′ helix of the NTE, the second turn of the duplex is displaced toward the thumb domain, resulting in a distortion of the duplex shape and a widening of the major groove. As a result, the distance between the n+2 and p+9 phosphates widens to 13.5 Å compared to 11.2 Å in the HIV-1 RT structure (PDB:4PQU,
[0072] Thumb Domain: The thumb domain of GsI-IIC RT adopts an elongated parallel three-helix structure similar to those seen in Prp8 structures and the first 3 helices of the thumb of RdRPs (Appleby et al., 2015; Galej et al., 2013). The second helix of the GsI-IIC RT thumb is homologous to the first thumb helix in retroviral RTs and similarly occupies the minor groove of the template/primer duplex. The prominent conserved motif of the thumb of group II intron RTs, G-x-x-x-Y/F-Y/F, serves the same function as the G-x-x-x-W motif in HIV-1 and other retroviral RTs, whereby the glycine allows close approach of the helix to the minor groove, and the conserved aromatic side chain at the fifth position of the motif (Y325 in the case of GsI-IIC RT) forms a pi-pi stacking interaction with the DNA primer and presumably disfavors polymerization of RNA bases by clashing with their 2′-OH group (
[0073] The conserved FLG loop in RT7 of group II intron RTs is located in the palm at the base of the thumb in the same general position as the WMG loop (“primer grip”) of HIV-1-RT and forms a similar pocket for proper positioning of the primer during catalysis (
[0074] The first and third helices of the thumb provide a structural scaffold for the second helix and harbor a number of positively charged residues, which form a strong basic patch on the side of the thumb opposite the polymerizing duplex (
[0075] DNA-Binding Domain: The remainder of the GsI-IIC RT protein is composed of the C-terminal DNA binding (D) domain. This domain forms a small globular structure, which caps the tip of the thumb in a series of α-helix hairpin structures, reminiscent of short α-helical hairpin domains often observed in canonical DNA-binding domains (Sawaya et al., 1997)(
[0076] Notably, the packing of the D domain along the outer surface of the thumb forms a highly positively charged cleft on the face of the thumb opposite that which binds the template/primer duplex (
[0077] RT0 Loop Mutations Affect Template-Switching Activity: The GsI-IIC RT structure will enable comprehensive structure-function analysis of group II intron RTs, as well as their engineering for biotechnological applications. It was noticed that the NTE/RT0 lid forms a large pocket that could contribute to the potent template-switching activity of group II intron RTs by capturing the 3′ end of an incoming template strand (
[0078] Strikingly, it was found that replacing the entire RT0 loop (positions 23-31) with four glycines (23-31/4G) strongly decreased template-switching activity, while leaving high primer extension activity. Further dissection of the loop showed that replacement of the first half with 6 glycines (23-28/6G) had only a minimal effect on template-switching activity, possibly reflecting that the RT0 lid binds the RNA template primarily via peptide backbone interactions.
[0079] By contrast, the mutations I29R, at an anchoring residue at the end of the lid, and R85A, at a residue that structurally stabilizes the conformation of the lid, strongly decreased template-switching activity. The I29R mutant retained high primer extension activity, while the R85A mutant had some decrease in primer extension activity, possibly reflecting that the hydrophobic side chain stem of R85 forms part of the active site cavity below the n−1 templating base. A D30A mutation in the second half of the loop had a relatively mild effect on template-switching activity. The severe loss of template-switching activity when mutating anchoring residues of the RT0 loop highlights the importance of the structural integrity of the lid for trapping incoming templates for polymerization.
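The substitution shorthand used for these mutants (for example I29R, or 23-31/4G for replacement of positions 23-31 with four glycines) can be applied to a sequence mechanically. The following is a minimal Python sketch under stated assumptions: the toy sequence is hypothetical and only places the RT0 loop residues N23 to G31 recited in the claims at their numbered positions, with every other position filled by a placeholder X. It is not SEQ ID NO: 1, and the function names are illustrative only.

```python
import re

# Hypothetical toy sequence (NOT SEQ ID NO: 1): 'X' placeholders everywhere
# except positions 23-31, which carry the RT0 loop residues N, Q, G, A, P, G, I, D, G.
RT0_LOOP = "NQGAPGIDG"
TOY_SEQ = "X" * 22 + RT0_LOOP + "X" * 9  # 40 residues, 1-based numbering


def apply_point_mutation(seq: str, mutation: str) -> str:
    """Apply a substitution written as <wild type><position><new>, e.g. 'I29R'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, pos, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1] != wild_type:
        # Guard against applying a mutation to the wrong numbering scheme.
        raise ValueError(f"expected {wild_type} at {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + new + seq[pos:]


def replace_segment(seq: str, start: int, end: int, replacement: str) -> str:
    """Replace the 1-based inclusive range start..end with a new segment."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("segment is outside the sequence")
    return seq[:start - 1] + replacement + seq[end:]


if __name__ == "__main__":
    print(apply_point_mutation(TOY_SEQ, "I29R"))        # point mutant
    print(replace_segment(TOY_SEQ, 23, 31, "G" * 4))    # 23-31/4G
    print(replace_segment(TOY_SEQ, 23, 28, "G" * 6))    # 23-28/6G
```

Checking the wild-type residue before substituting is a deliberate design choice: it catches off-by-one numbering errors, such as numbering that includes or excludes a fusion tag, before a mutant is designed.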
[0080] Adaptation of Template/Primer-Binding Regions for RNA splicing: The crystal structure enabled modeling of the binding of the full-length GsI-IIC RT to a group IIC intron lariat RNA (Costa et al., 2016) by using positioning information from the cryo-EM structures of the Ll.LtrB RT bound to a group IIA intron RNA and spliceosomal protein Prp8 bound to snRNAs. In order to mimic the configuration of the complex during reverse splicing of the intron lariat RNA into a DNA target site during retrohoming, a DNA strand containing the exon junction (EJ) and 5′-exon hairpin recognized by group IIC introns was added to the model based on the position of an RNA bound at the exon-binding site in a group IIC intron structure (PDB:3IGI) (
[0081] Comparison of the GsI-IIC RT model with the Ll.LtrB and Prp8 spliceosome cryo-EM structures showed that key regions of the group II intron RT involved in binding the template/primer are dually functional and use non-overlapping regions to bind template/primer and RNA splicing substrates (
[0082] Remarkably, the GsI-IIC RT model indicates that after reverse splicing of the intron RNA into a DNA strand, the RT active-site cleft is precisely positioned to initiate reverse transcription by using an incoming nascent DNA strand as primer at or just downstream of the 3′ end of the intron sequence (
[0083] Thus, the present studies solved the crystal structure of the full-length thermostable group IIC intron RT (GsI-IIC RT) in complex with an RNA template/DNA primer duplex and dATP, the first non-LTR-retroelement RT for which such a structure has been determined. In addition to providing a prototype for a large group of related but structurally uncharacterized Non-LTR-retroelement RTs, the structure reveals remarkably close structural and mechanistic similarities between group II intron RTs and viral RdRPs, provides insight into the structural basis for the distinctive enzymatic properties of group II intron RTs, and suggests how RT structural features that initially evolved to promote reverse transcription have been adapted to bind group II intron and spliceosomal RNAs for RNA splicing.
Example 2
Methods and Materials
[0084] Constructs: Full length His-tag GsI-IIC RT used for crystallization contained the native N-terminus and was constructed by adding a non-cleavable 8× His tag directly to the C-terminus by PCR from a GsI-IIC RT pMal fusion vector described previously (Mohr et al., 2013). The PCR product was ligated into the pET14b expression vector (Millipore) using NcoI and PstI restriction sites. The vector was transformed into BL21-CodonPlus (DE3)-RIPL chemically competent cells (Agilent) and plated onto LB plates containing 100 μg/mL ampicillin and 25 μg/mL chloramphenicol.
[0085] Wild-type and mutant GsI-IIC RT proteins used in biochemical assays were expressed as maltose-binding protein rigid fusions from pMRF-GsI-IIC (Mohr et al., 2013). GsI-IIC RT mutants were constructed in pMRF-GsI-IIC by site-directed mutagenesis using a Q5 Site Directed Mutagenesis Kit (New England Biolabs) with primers listed in Table S2. Constructs were transformed into Rosetta 2 (DE3) (EMD Millipore) chemically competent cells and plated onto LB plates containing 100 μg/mL ampicillin and 25 μg/mL chloramphenicol. All constructs were verified by sequencing.
[0086] Protein Expression, Purification, and Crystallization: For His-tag GsI-IIC RT protein expression, starter cultures were prepared by inoculating 25 mL of LB containing 100 μg/mL ampicillin and 25 μg/mL chloramphenicol with a single colony and grown at 37° C. shaking overnight. 20 mL of starter culture was added to 1 L of LB shaking culture containing 100 μg/mL ampicillin and grown at 37° C. to an OD₅₉₅ of 0.6-0.8. Cells were induced by the addition of 1 mM isopropyl β-D-thiogalactoside (IPTG) and incubated at 37° C. for a further 2.5 hr. Cell pellets were collected by centrifugation and stored at −80° C. prior to purification.
[0087] For seleno-methionine substituted His-tag GsI-IIC RT production, metabolic inhibition was used to incorporate seleno-methionine. After starter culture growth as described above, cells were collected by centrifugation and resuspended in 1 mL M9 minimal media and then inoculated into 1 L of M9 minimal media and grown to an OD₅₉₅ of 0.3 at 37° C. shaking incubation. Solid amino acid supplements were added at 100 mg L-lysine, 100 mg L-phenylalanine, 100 mg L-threonine, 50 mg L-isoleucine, 50 mg L-leucine, 50 mg L-valine, and 50 mg L-seleno-methionine. Cells were incubated with shaking for an additional 20 min and then induced with 1 mM IPTG for 2 hr at 37° C. followed by 14 hr at 18° C. Cell pellets were collected by centrifugation and stored at −80° C. prior to purification.
[0088] For protein purification, cells from 1 L of culture of either native or seleno-methionine substituted His-tag GsI-IIC RT were resuspended in 40 mL of cold Lysis Buffer, containing 20 mM Tris-HCl pH 8.5, 100 mM NaCl, 10% glycerol, 5 mM imidazole, 0.1% β-mercapto-ethanol (β-ME), 0.2 mM phenylmethylsulfonyl fluoride (PMSF), and one EDTA-free Complete protease inhibitor cocktail tablet (Roche).
[0089] Cells were lysed by sonication at 4° C. Lysate was clarified by centrifugation at 40,000×g for 1 hr at 4° C. Clarified lysate was combined with 10 mL bed volume of Ni-NTA Agarose beads (Invitrogen) and incubated by slow rotation at 4° C. for 2 hr. The beads were washed with 250 mL Wash Buffer, containing 20 mM Tris-HCl pH 8.5, 100 mM NaCl, 10% glycerol, 5 mM imidazole, and 0.1% β-ME, under gravity flow. The beads were further washed with 150 mL of Wash Buffer containing an additional 50 mM imidazole. GsI-IIC RT was eluted from the beads by adding 5×5 mL of Elution Buffer, containing 20 mM Tris-HCl pH 8.5, 100 mM NaCl, 10% glycerol, 0.1% β-ME, and 250 mM imidazole. Eluted fractions were analyzed by SDS-PAGE, and fractions containing GsI-IIC RT were pooled and applied to a 5 mL HiTrap Heparin HP column (GE Healthcare) pre-equilibrated in Heparin Buffer A, containing 20 mM Tris pH 8.5, 100 mM NaCl, 10% glycerol, and 0.1% β-ME, at a flow-rate of 1 mL/min. The column was washed with 5 column volumes (CVs) of Heparin Buffer A. A 10 CV gradient was applied from Heparin Buffer A to 50% of Heparin Buffer B, containing 20 mM Tris-HCl pH 8.5, 2 M NaCl, 10% glycerol, and 0.1% β-ME. A final 5 CVs of 100% Heparin Buffer B was applied. GsI-IIC RT typically eluted during the final step at a purity of >98% by SDS-PAGE. Fractions containing GsI-IIC RT were pooled and incubated in the presence of 65-70% saturating ammonium sulfate for 2-3 hr on ice, as the protein could not be successfully concentrated by other means. The precipitated protein was then pelleted by centrifugation at 40,000×g for 1 hr at 4° C., with utmost care taken during aspiration of the supernatant to remove as much ammonium sulfate solution as possible. The protein was then resuspended in Crystallization Buffer, consisting of 20 mM Tris-HCl pH 8.5, 500 mM NaCl, 10% glycerol, and 5 mM DTT, to a final concentration of 2-3 mg/mL.
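The NaCl concentration reached at each stage of the heparin gradient follows from linear mixing of Heparin Buffer A (100 mM NaCl) and Heparin Buffer B (2 M NaCl). The sketch below is illustrative only: it assumes ideal linear mixing with no delay volume, and the function names are hypothetical rather than part of any chromatography software.

```python
# Values taken from the purification description above.
COLUMN_VOLUME_ML = 5.0          # 5 mL HiTrap Heparin HP column
NACL_A_MM = 100.0               # Heparin Buffer A
NACL_B_MM = 2000.0              # Heparin Buffer B (2 M)
GRADIENT_CV = 10.0              # gradient length in column volumes
GRADIENT_END_FRACTION_B = 0.5   # gradient runs from 0% to 50% B


def nacl_mm(fraction_b: float) -> float:
    """NaCl concentration (mM) for a given fraction of Buffer B (0 to 1)."""
    if not 0.0 <= fraction_b <= 1.0:
        raise ValueError("fraction_b must be between 0 and 1")
    return NACL_A_MM + (NACL_B_MM - NACL_A_MM) * fraction_b


def fraction_b_during_gradient(cv_elapsed: float) -> float:
    """Fraction of Buffer B after cv_elapsed column volumes of the linear gradient."""
    if not 0.0 <= cv_elapsed <= GRADIENT_CV:
        raise ValueError("cv_elapsed is outside the gradient")
    return GRADIENT_END_FRACTION_B * cv_elapsed / GRADIENT_CV


if __name__ == "__main__":
    print("gradient volume (mL):", GRADIENT_CV * COLUMN_VOLUME_ML)
    for cv in (0.0, 5.0, 10.0):
        print(cv, "CV:", nacl_mm(fraction_b_during_gradient(cv)), "mM NaCl")
    print("final step:", nacl_mm(1.0), "mM NaCl")
```

Under these assumptions the gradient ends at 1050 mM NaCl, so a protein that elutes only during the final 100% Buffer B step is released somewhere between that value and 2 M NaCl.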
[0090] Annealed RNA/DNA duplex for crystallization trials was produced by combining the single-stranded RNA and DNA oligonucleotides (Integrated DNA Technologies) at a 1:1 molar ratio, heating to 82° C. for 2 min, and then slowly cooling to room temperature. The annealed duplex was then combined with the concentrated GsI-IIC RT at a 1:1.2 protein:nucleic acid molar ratio in Crystallization Buffer also containing 2 mM MgCl₂ and 1 mM dATP and incubated on ice for 30 min. Crystals were grown by the hanging drop vapor diffusion method, with the drop containing 0.5 μL of GsI-IIC RT/duplex combined with 0.5 μL of a well solution containing 0.1 M Tris-HCl pH 7.5-8.5 and 1.2-1.4 M sodium citrate tribasic dihydrate. Crystals grew as thin plates over the course of 1 to 2 weeks with dimensions of approximately 20 μm×50 μm×100 μm for seleno-methionine protein or 25 μm×100 μm×200 μm for native protein.
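Because the hanging drop combines equal volumes of complex and well solution, each component starts at half its stock concentration before vapor diffusion begins. A minimal sketch of that dilution arithmetic follows; it assumes ideal, additive mixing and considers only the starting state of the drop, and the function name is hypothetical.

```python
def mixed_concentration(stock_conc: float, stock_volume_ul: float,
                        total_volume_ul: float) -> float:
    """Concentration of a component after its stock is diluted into the drop."""
    if stock_volume_ul < 0 or total_volume_ul <= 0 or stock_volume_ul > total_volume_ul:
        raise ValueError("volumes are inconsistent")
    return stock_conc * stock_volume_ul / total_volume_ul


if __name__ == "__main__":
    drop_ul = 0.5 + 0.5  # 0.5 uL complex + 0.5 uL well solution
    # Sodium citrate comes only from the well solution (1.2-1.4 M stock).
    print("citrate (M):", mixed_concentration(1.2, 0.5, drop_ul),
          "to", mixed_concentration(1.4, 0.5, drop_ul))
    # Protein, MgCl2 and dATP come only from the complex solution.
    print("protein (mg/mL):", mixed_concentration(2.0, 0.5, drop_ul),
          "to", mixed_concentration(3.0, 0.5, drop_ul))
    print("MgCl2 (mM):", mixed_concentration(2.0, 0.5, drop_ul))
    print("dATP (mM):", mixed_concentration(1.0, 0.5, drop_ul))
```

As water leaves the drop during equilibration, these concentrations rise toward those of the well solution, which is what drives crystal growth in vapor diffusion.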
[0091] The RNA/DNA duplex-only crystals were obtained from hanging drop vapor diffusion experiments performed with a GsI-IIC RT/nucleic acid duplex complex prepared as described above, using a well solution containing 1.2 M sodium malonate pH 7.0 and 0.6 M ammonium citrate tribasic pH 7.0. Crystals grew to a 125 μm radius in 2-3 weeks.
[0092] Crystals were harvested with a cryoloop (Hampton Research), and immersed briefly in Al's oil (Hampton Research) for the GsI-IIC RT containing crystals, or paraffin oil for the duplex-only crystals, before flash freezing into liquid nitrogen.
[0093] Data Collection, Analysis, and Structure Determination: Diffraction data were collected at 100K at the Advanced Light Source (ALS) on beamline 5.0.3. Images were integrated using the XDS package (Kabsch, 2010) and scaled with Aimless (Evans and Murshudov, 2013). For the GsI-IIC RT/duplex complex, the initial molecular replacement model was obtained using the program EPMR (Kissinger et al., 1999). Initial refinement was carried out in the Phenix package (Adams et al., 2010), with subsequent refinement also incorporating Buster (Bricogne et al., 2016) and Refmac5 (Murshudov et al., 2011). Data collection and refinement parameters are reported in Table 1.
TABLE 1. Crystallographic Data Collection and Refinement Statistics, Related to Methods.
Statistic | RT/Duplex (Nat) | RT/Duplex (Se-SAD) | Duplex only
Data Collection
PDB ID | 6AR1 | 6AR3 | 6AR5
Wavelength | 0.9765 | 0.9765 | 0.9765
Resolution range | 47.5-3.0 (3.1-3.0) | 48.7-3.4 (3.5-3.4) | 36.1-2.4 (2.5-2.4)
Space group | C 1 2 1 | C 1 2 1 | P 31 2 1
Unit cell: a, b, c (Å) | 179.2, 95.1, 71.6 | 179.5, 109.0, 72.5 | 46.4, 46.4, 82.2
Unit cell: α, β, γ (°) | 90, 113.5, 90 | 90, 113.8, 90 | 90, 90, 120
Total reflections | 166553 (16758) | 132407 (13028) | 88600 (8691)
Unique reflections | 21940 (2211) | 17492 (1724) | 4250 (407)
Multiplicity | 7.6 (7.6) | 7.6 (7.6) | 20.8 (21.4)
Completeness (%) | 99.8 (98.9) | 99.9 (99.7) | 99.9 (100.0)
Mean I/sigma(I) | 16.9 (2.0) | 9.1 (2.0) | 49.8 (6.6)
Wilson B-factor | 78.8 | 84.9 | 46.3
R-merge | 0.112 (0.958) | 0.218 (1.01) | 0.0531 (0.557)
R-meas | 0.120 (1.03) | 0.234 (1.08) | 0.0545 (0.570)
R-pim | 0.0434 (0.371) | 0.0848 (0.393) | 0.0119 (0.123)
CC½ | 0.999 (0.874) | 0.996 (0.855) | 1 (0.977)
CC* | 1 (0.966) | 0.999 (0.96) | 1 (0.994)
Refinement
Reflections used in refinement | 21940 (2211) | 17492 (1724) | 4249 (407)
Reflections used for R-free | 1092 (105) | 1736 (169) | 212 (15)
R-work | 0.216 | 0.273 | 0.185
R-free | 0.255 | 0.324 | 0.212
CC(work) | 0.884 | 0.781 | 0.967
CC(free) | 0.934 | 0.703 | 0.986
Number of non-hydrogen atoms | 7873 | 7660 | 577
  macromolecules | 7801 | 7598 | 556
  ligands | 72 | 62 | -
  solvent | - | - | 21
Protein residues | 832 | 837 | 0
RMS(bonds) | 0.007 | 0.008 | 0.007
RMS(angles) | 1.16 | 1.24 | 0.95
Ramachandran favored (%) | 97.1 | 96.3 | -
Ramachandran allowed (%) | 2.7 | 3.5 | -
Ramachandran outliers (%) | 0.2 | 0.2 | -
Rotamer outliers (%) | 1.1 | 0.9 | -
Clashscore | 2.0 | 5.0 | 1.2
Average B-factor | 87.3 | 99.5 | 39.8
  macromolecules | 87.3 | 99.6 | 39.9
  ligands | 81.7 | 87.2 | -
  solvent | - | - | 38.7
Number of TLS groups | 10 | - | -
Statistics for the highest-resolution shell are shown in parentheses.
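Several entries in Table 1 are related by simple formulas, which allows a quick internal consistency check. The sketch below copies values from Table 1 and applies three standard relations: multiplicity as total over unique reflections, CC* = sqrt(2·CC½ / (1 + CC½)), and the approximation R-pim ≈ R-meas / sqrt(multiplicity), which is exact only if every reflection has the same multiplicity. The dictionary layout and function names are illustrative and are not part of any crystallographic package.

```python
import math

# Values copied from Table 1 (overall statistics; CC values for the outer shell).
TABLE_1 = {
    "RT/Duplex (Nat)": dict(total=166553, unique=21940, multiplicity=7.6,
                            r_meas=0.120, r_pim=0.0434,
                            cc_half_outer=0.874, cc_star_outer=0.966),
    "RT/Duplex (Se-SAD)": dict(total=132407, unique=17492, multiplicity=7.6,
                               r_meas=0.234, r_pim=0.0848,
                               cc_half_outer=0.855, cc_star_outer=0.96),
    "Duplex only": dict(total=88600, unique=4250, multiplicity=20.8,
                        r_meas=0.0545, r_pim=0.0119,
                        cc_half_outer=0.977, cc_star_outer=0.994),
}


def multiplicity(total: int, unique: int) -> float:
    """Average number of observations per unique reflection."""
    return total / unique


def cc_star(cc_half: float) -> float:
    """Estimate of the correlation with the true signal, from CC1/2."""
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


def approx_r_pim(r_meas: float, n: float) -> float:
    """Approximate R-pim from R-meas, assuming uniform multiplicity n."""
    return r_meas / math.sqrt(n)


if __name__ == "__main__":
    for name, row in TABLE_1.items():
        n = multiplicity(row["total"], row["unique"])
        print(name,
              "| multiplicity:", round(n, 2), "vs", row["multiplicity"],
              "| CC*:", round(cc_star(row["cc_half_outer"]), 3),
              "vs", row["cc_star_outer"],
              "| R-pim:", round(approx_r_pim(row["r_meas"], n), 4),
              "vs", row["r_pim"])
```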
[0094] The protein/nucleic acid complex crystallized in space group C2, and the structure was solved by a combination of multi-domain molecular replacement and seleno-methionine single-wavelength anomalous diffraction (SAD) phasing. The data display anisotropy, with CC₁/₂=0.5 to 2.6 Å in the h direction, but only to 3.4 Å in the k direction (with the l direction intermediate). Including higher resolution data did not enhance the electron density; therefore, 3.0 Å resolution was chosen according to the criterion of the mean I/sigma(I)=2.0 in the highest resolution shell. A combination of the molecular replacement solution obtained in EPMR using the R.i. fingers/palm domain (PDB:5IRF) and the TERT RNA/DNA hairpin (PDB:3KYL) with Se-Met SAD phases using the MR-SAD feature from the Phenix package in the GsI-IIC RT/Duplex (Se-SAD) dataset (a=179.5, b=109.0, c=72.5 Å, β=113.8°; PDB:6AR3) yielded initial density for the missing thumb and D domain, with the seleno-methionine locations matching the positions expected from homology to known RT structures.
[0095] However, refinement stalled at high R-factors (˜40% R-free), with many side chains absent and much of the thumb and D domain backbone trace unclear. After the crystals had remained in the drop for several months, C2 crystals with a slightly compacted unit cell were obtained (GsI-IIC RT/Duplex (Nat); PDB:6AR1)(a=179.2, b=95.1, c=71.6 Å, β=113.8°; PDB:6AR3). Molecular replacement followed by rigid body refinement was carried out using the model from the GsI-IIC RT/duplex (Se-SAD) data set, with more difference density features apparent. Once most of the model had been built (bulk solvent parameters, NCS restraints, TLS, and individual temperature factors applied during refinement), rigid body twin refinement (twin law h+2*l, -k, -l, twin fraction=0.5) with the fingers, palm, thumb, and duplex domains as independent bodies (carried out in Refmac5) revealed the different domain conformations between the two monomers in the asymmetric unit and allowed completion of the refinement. Note that the pseudo-merohedral twin law combined with the pseudo-symmetry of the two molecules in the C2 asymmetric unit causes the data to assume an apparent I222/I2₁2₁2₁ symmetry. The structure may also be partially solved and refined in the I222 space group, but with similarly poor density and stalled R-factors as described above. The R-free set was chosen to match both pseudo-symmetry related and twin related reflections. Model building was performed in Coot (Emsley et al., 2010).
The monomer composed of chains A, B, and C was used throughout this work for depictions and structure analysis because it has more visible density for the loop region between helices α10 and α11 and for the non-protein-bound end of the nucleic acid duplex.
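The action of a twin operator on reflection indices can be illustrated directly. The sketch below reads the twin law given above as (h, k, l) → (h + 2l, −k, −l); that reading and the function names are assumptions made for illustration. It shows two properties: the operator is its own inverse, and it maps reflections allowed by C-centering (h + k even) onto other allowed reflections, so every observed reflection has a twin mate in the same data set.

```python
from typing import Tuple

HKL = Tuple[int, int, int]


def twin_related(hkl: HKL) -> HKL:
    """Apply the twin operator (h, k, l) -> (h + 2l, -k, -l)."""
    h, k, l = hkl
    return (h + 2 * l, -k, -l)


def allowed_by_c_centering(hkl: HKL) -> bool:
    """C-centered lattices only have reflections with h + k even."""
    h, k, _ = hkl
    return (h + k) % 2 == 0


if __name__ == "__main__":
    for hkl in [(1, 3, 2), (2, 0, 5), (4, -2, -1)]:
        mate = twin_related(hkl)
        print(hkl, "->", mate,
              "| back:", twin_related(mate),
              "| allowed:", allowed_by_c_centering(hkl), allowed_by_c_centering(mate))
```

With a twin fraction of 0.5, as stated above, each measured intensity is an equal mixture of a reflection and its twin mate. This is why the R-free set needs to keep twin-related reflections together.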
[0096] The nucleic acid duplex-only crystallized in space group P3.sub.121 with unit cell constants as in Table 1 (PDB:6AR5). The structure was solved by molecular replacement in Phaser using the isolated RNA/DNA duplex chains from the native data set. Refinement was carried out in Phenix and model building in Coot, applying bulk solvent parameters and individual temperature factors.
[0097] Biochemical Methods: Wild-type and mutant GsI-IIC RT proteins were expressed from pMRF-GsI-IIC (see above) with maltose-binding protein fused to their N-termini via a non-cleavable rigid linker, with minor modifications of the previously described procedure (Mohr et al., 2013). Briefly, transformed single colonies of Rosetta 2 (DE3) cells (EMD Millipore) were inoculated into 100 mL LB+ampicillin/chloramphenicol media and grown overnight with shaking at 37° C. The starter culture was added to 1 L of LB+ampicillin media (ratio of 1:50) and grown at 37° C. to an OD.sub.600 of 0.6-0.7. Protein expression was induced by adding 1 mM IPTG and incubating at 37° C. for 2 hr. Cells were pelleted by centrifugation and stored at −80° C. overnight. Cells were thawed and then lysed by sonication in 20 mM Tris-HCl pH 7.5, 500 mM KCl, 20% glycerol, 1 mg/mL lysozyme, 0.2 μM. The lysate was clarified by centrifugation at 24,000×g for 1 hr at 4° C. Polyethyleneimine (PEI) was added slowly to the clarified lysate to a final concentration of 0.4% in order to precipitate nucleic acids, and the mixture was centrifuged at 24,000×g for 25 min at 4° C. Nucleic-acid-free GsI-IIC RT was precipitated from the supernatant with 60% saturating ammonium sulfate and resuspended in A1 buffer (25 mM Tris-HCl pH 7.5, 300 mM KCl, 10% glycerol). The protein was then loaded onto an MBPTrap HP column (GE Healthcare), washed with 10 CVs of A1 buffer, 6 CVs of A2 buffer (25 mM Tris pH 7.5, 1.5 M KCl, 10% glycerol), and again with 6 CVs of A1 buffer. GsI-IIC RT was eluted with 10 CVs of 25 mM Tris-HCl pH 7.5, 500 mM KCl, 10% glycerol containing 10 mM maltose. Fractions containing protein were diluted to 100 mM KCl, loaded onto a HiTrap Heparin HP column (GE Healthcare), and eluted with a 12 CV gradient from buffer A1 to A2. Fractions containing GsI-IIC RT were identified by SDS-PAGE, pooled, and dialyzed into 20 mM Tris-HCl pH 7.5, 500 mM KCl, 50% glycerol.
RT aliquots were flash frozen using liquid nitrogen and stored at −80° C.
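The two dilution steps in the purification above (the 1:50 starter culture and the drop from 500 mM to 100 mM KCl before the heparin column) come down to simple C1·V1 = C2·V2 arithmetic. The sketch below is only an illustration: it assumes the diluent contains no KCl, and the 10 mL pooled fraction volume is a hypothetical value that does not come from the source.

```python
def diluent_volume(c_start, c_target, v_start):
    """Volume of salt-free diluent needed to bring v_start from c_start to c_target.

    Assumes the diluent contributes none of the solute (an assumption here).
    """
    if not 0 < c_target <= c_start:
        raise ValueError("target concentration must be positive and not exceed the start")
    return v_start * (c_start / c_target - 1.0)


# Starter culture added at 1:50 into 1 L (1000 mL) of medium.
starter_ml = 1000.0 / 50

# Hypothetical 10 mL of pooled fractions in 500 mM KCl, diluted to 100 mM KCl.
pooled_ml = 10.0
kcl_free_buffer_ml = diluent_volume(500.0, 100.0, pooled_ml)

print(f"starter culture: {starter_ml:.0f} mL")
print(f"KCl-free buffer to add: {kcl_free_buffer_ml:.0f} mL")
```

Under these assumptions, each volume of eluate needs four volumes of KCl-free buffer, i.e. a five-fold dilution.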
[0098] Template-switching and primer extension assays were carried out using GsI-IIC-MRF RT, as described (Mohr et al., 2013). The initial template-primer substrate used for template switching reactions was the same as that used for RNA-seq adapter addition in RNA-seq protocols (Nottingham et al., 2016). It consists of a 34-nt RNA oligonucleotide containing an Illumina R2 sequence (R2 RNA; Table 2) with a 3′-blocking group (3SpC3; Integrated DNA Technologies) annealed to a 35-nt 5′.sup.32P-labeled DNA primer ([γ-P.sup.32]-ATP, Perkin Elmer), which contains the reverse complement of the R2 sequence and leaves a single nucleotide 3′ G overhang (R2RG DNA; Table 2). The oligonucleotides were annealed at a ratio of 1:1.2 to yield a final duplex concentration of 250 nM by heating to 82° C. for 2 min and then slowly cooling to room temperature. GsI-IIC RT (400 nM) was preincubated with the annealed R2/R2R-G heteroduplex (50 nM) and 50-nt acceptor template RNA (100 nM) in a final volume of 10 μL of reaction medium containing 200 mM NaCl, 5 mM MgCl.sub.2, 20 mM Tris-HCl pH 7.5, and 5 mM DTT for 30 min at room temperature, and the template-switching reverse transcription reactions were initiated by adding 0.4 μL of 25 mM dNTPs (an equimolar mix of 25 mM dATP, dCTP, dGTP, and dTTP, Promega). Reactions were incubated at 60° C. for 15 min and stopped by adding 5 μL of the reaction mixture to 15 μL of 0.25 M EDTA. The RNA templates were then degraded by adding 1 μL of 5 N NaOH and heating to 95° C. for 3 min, followed by cooling to room temperature and neutralization with 1 μL of 5 N HCl. Then 10 μL of formamide loading dye (95% formamide, 0.025% xylene cyanol, 0.025% bromophenol blue, 6.25 mM EDTA) was added, and products were denatured by heating to 99° C. for 10 min and placed on ice prior to electrophoresis in a denaturing 8% TBE-Urea polyacrylamide gel. The gel was dried, exposed to a phosphor screen, and scanned using a Typhoon phosphorimager at a PMT of 1000.
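The working concentrations implied by the volumes above can be estimated with a short calculation. This is a sketch under stated assumptions: the 10 μL is taken to be the reaction volume before dNTP addition, volumes are treated as additive, and the function name is hypothetical.

```python
def diluted(stock_conc, stock_vol, other_vol):
    """Concentration after mixing stock_vol of stock into other_vol (additive volumes)."""
    return stock_conc * stock_vol / (stock_vol + other_vol)


# 0.4 uL of a mix that is 25 mM in each dNTP, added to a 10 uL reaction.
dntp_each_mM = diluted(25.0, 0.4, 10.0)

# 15 uL of 0.25 M (250 mM) EDTA receiving 5 uL of reaction mixture.
edta_mM = diluted(250.0, 15.0, 5.0)

# 1 uL of 5 N NaOH added to the 20 uL stopped reaction.
naoh_N = diluted(5.0, 1.0, 20.0)

print(f"each dNTP: {dntp_each_mM:.2f} mM")
print(f"EDTA after stop: {edta_mM:.1f} mM")
print(f"NaOH during hydrolysis: {naoh_N:.3f} N")
```

On these assumptions each dNTP ends up at roughly 1 mM, and the EDTA is in large excess over the Mg.sup.2+ carried over from the reaction.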
TABLE-US-00002 TABLE 2
Biochemical Assay and Site-directed Mutagenesis Oligonucleotides, Related to Methods.
Primer | Sequence
Biochemical Assays
AML1 | TCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTC (SEQ ID NO: 2)
R2RG | GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTG (SEQ ID NO: 3)
R2 | rArGrArUrCrGrGrArArGrArGrCrArCrArCrGrUrCrUrGrArArCrUrCrCrArGrUrCrArC/3SpC3/ (SEQ ID NO: 4)
Acceptor RNA oligo | rCrGrCrCrGrGrArCrCrGrUrGrCrArCrCrArUrCrUrGrGrArGrUrUrArUrArGrArGrArUrGrArGrUrCrUCrArCrArUrArGrArCrC (SEQ ID NO: 5)
Site-directed Mutagenesis
23-28 F′ | GGCGGCGGCATCGACGGAGTATCAACCG (SEQ ID NO: 6)
23-28 R′ | GCCGCCGCCGGCTTCGACCCGTTTGAG (SEQ ID NO: 7)
23-31 F′ | GGCGGCGTATCAACCGATCAACTCCG (SEQ ID NO: 8)
23-31 R′ | GCCGCCGGCTTCGACCCGTTTGAG (SEQ ID NO: 9)
I29R F′ | AGCACCGGGACGAGACGGAGTATCAACC (SEQ ID NO: 10)
I29R R′ | CCTTGGTTGGCTTCGACC (SEQ ID NO: 11)
D30A F′ | ACCGGGAATCGCTGGAGTATCAACC (SEQ ID NO: 12)
D30A R′ | GCTCCTTGGTTGGCTTCG (SEQ ID NO: 13)
R85A F′ | CGTGGTGGACGCACTGATCCAACAAGC (SEQ ID NO: 14)
R85A R′ | GTGGGAATGCCTAGCTGC (SEQ ID NO: 15)
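The relationship between the R2 RNA and the R2RG DNA primer stated in the Methods (reverse complement plus a single 3′ G overhang) can be checked directly from the Table 2 sequences. The sketch below is illustrative only: the parser assumes ribonucleotides are written with an "r" prefix and modifications are enclosed in slashes, as in the table, and the function names are hypothetical.

```python
import re


def parse_r_prefixed_rna(oligo):
    """Strip /modification/ tags, spaces, and 'r' prefixes; return plain bases."""
    oligo = re.sub(r"/[^/]*/", "", oligo)  # e.g. the /3SpC3/ blocking group
    return oligo.replace(" ", "").replace("r", "")


def reverse_complement_as_dna(seq):
    """Reverse complement of an RNA or DNA sequence, written as DNA."""
    pairs = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


# Sequences copied from Table 2 (SEQ ID NO: 4 and SEQ ID NO: 3).
R2_RNA = ("rArGrArUrCrGrGrArArGrArGrCrArCrAr"
          "CrGrUrCrUrGrArArCrUrCrCrArGrUrCrA"
          "rC/3SpC3/")
R2RG_DNA = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTG"

r2_bases = parse_r_prefixed_rna(R2_RNA)
expected_primer = reverse_complement_as_dna(r2_bases) + "G"

print(len(r2_bases), len(R2RG_DNA))
print(expected_primer == R2RG_DNA)
```

The same two helpers can be reused to sanity-check the lengths of the other oligonucleotides in the table against the lengths quoted in the text.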
[0099] Primer extension reactions were carried out similarly using a 50-nt 5′.sup.32P end-labeled DNA primer (AML1; Table 2) annealed near the 3′ end of a 1.1 kb in vitro transcribed RNA. The transcript was generated by T3 runoff transcription (T3 MEGAscript kit, Thermo Fisher Scientific) of pBluescript KS (+) (Agilent) linearized using XmnI (New England Biolabs) and cleaned up using a MEGAclear kit (Thermo Fisher Scientific). The labeled DNA primer was annealed to the RNA template at a ratio of 1:1.2 to yield a final duplex concentration of 250 nM by heating to 82° C. for 2 min followed by slowly cooling to room temperature. GsI-IIC RT (400 nM) was preincubated with 50 nM of the annealed template/primer in a final volume of 10 μL of reaction medium containing 200 mM NaCl, 5 mM MgCl.sub.2, 20 mM Tris-HCl pH 7.5, and 5 mM DTT for 30 min at room temperature, and reverse transcription was initiated by adding 0.4 μL of the 25 mM dNTP mix. After incubating at 60° C. for 15 min, the reaction was terminated, processed, and analyzed by electrophoresis in a denaturing 8% TBE-Urea polyacrylamide gel, as described above for template-switching reactions.
Example 3
Further Amino Acid Substitutions into the RT
[0100] Further studies were undertaken to generate and study additional amino acid substitutions in the surface of the non-LTR-retroelement reverse transcriptase. Details of the additional substitutions are shown in Table 3 below. Several of the additional substituted RT proteins were further tested to determine effects on protein yield during recombinant expression (
[0101] GsI-IIC RT mutants were expressed as recombinant proteins in E. coli, and those that could be expressed were purified and tested for RT activity in primer extension assays. All proteins contain an N-terminal solubility tag (MalE protein) fused to the GsI-IIC RT via a non-cleavable linker (Mohr et al., 2013). In general, all proteins expressed similarly to WT; the only exception was the R311L mutant (
[0102] In addition, we verified that the purified recombinant GsI-IIC RT mutant proteins had RT activity. Since the mutated regions were remote from the RT active site, we expected that thumb and D-domain mutants would not affect RT activity of GsI-IIC RT. To test for RT activity we used a primer extension assay with purified mutant protein and a 1.1-kb RNA template with an annealed DNA primer. Reverse transcription was initiated from a .sup.32P-labeled 50-nt DNA primer that annealed near the 3′ end of RNA. The reaction mixture was incubated at 60° C. in the presence of dNTPs for up to 60 minutes (
[0103] The GsI-IIC protein has patches of positively charged amino acid residues in the RT fingers and palm domains, such as R58, K160, K213, R214, and K217. These residues can bind nucleic acids nonspecifically. To reduce this nonspecific binding and the positive charge, we made two mutants, one having the mutations R58A/K160A and the other having the mutations K213A/R214E/K217A. The R58A/K160A and K213A/R214E/K217A mutants could be concentrated to very high concentrations (12.50 mg/mL and 3.5 mg/mL, respectively) by Amicon centrifugal filtration (
TABLE-US-00003 TABLE 3
Further substitutions in a reverse transcriptase of SEQ ID NO: 1.
Number | Mutants | Plasmid | Assay | Effect
1 | K18A | pMal | Primer extension, Template switching |
2 | K18E | pMal | Primer extension, Template switching, processivity | Processivity defect
3 | R19A | pMal | Primer extension, Template switching, processivity | Processivity defect
4 | R19E | pMal | Primer extension |
5 | R19A/R63A | pMal | Primer extension, Template switching | Template switching defect
6 | R19E/R63E | pMal | Primer extension |
7 | E21A | pMal | Primer extension, Template switching, processivity | Processivity defect
8 | N23A | pMal | Primer extension, Template switching |
9 | I29R | pMal, pDonor | Primer extension, Template switching, Mobility | Template switching defect
10 | D30A | pMal | Primer extension, Template switching |
11 | 23-28/6G | pMal | Primer extension, Template switching | Template switching defect
12 | 23-31/4G | pMal, pDonor | Primer extension, Template switching, Mobility | Template switching and mobility defect
13 | 23-31/polyG | pMal | Primer extension, Template switching | Template switching defect
14 | 23-33/4G | pMal | Primer extension |
15 | N23A/L92A/P112G/R114A/P194G | pMal | Primer extension, Template switching | Primer extension and template switching defect
16 | R63A | pMal, pDonor | Primer extension, Template switching, Mobility |
17 | P68A | pMal, pDonor | Primer extension, Template switching, Mobility |
18 | R85A | pMal, pDonor | Primer extension, Template switching, Mobility | Template switching defect
19 | L92A | pMal | Primer extension |
20 | L92A/N23A | pMal | Primer extension |
21 | P112G | pMal | Primer extension |
22 | P112G/L92A/N23A | pMal | Primer extension, Template switching |
23 | R114A | pMal | Primer extension |
24 | F143A | pMal, pDonor | Primer extension, Template switching, Mobility, nontemplated addition | Template switching, mobility, NTA defect
25 | K141A | pMal | Primer extension |
26 | D144A | pMal, pDonor | Primer extension, Template switching, Mobility, nontemplated addition |
27 | 175-184/polyG | pMal | Primer extension, Template switching |
28 | P194G | pMal | Primer extension |
29 | R291A | pMal | Primer extension |
30 | Q294A | pMal | Primer extension |
31 | R297A | pMal | Primer extension |
32 | Q298A | pMal | Primer extension |
33 | Y318A | pMal | Primer extension, Processivity assay |
34 | Y318A/W322A | pMal | Primer extension |
35 | W322A | pMal | Primer extension |
36 | Y325A | pMal | Primer extension |
37 | Y325F | pMal | Primer extension |
38 | F326A | pMal | Primer extension |
39 | F415A | pMal | Primer extension, Template switching |
40 | F415A/P68A | pMal, pDonor | Primer extension, Template switching, Mobility |
41 | YAAA | pDonor | Mobility | Active site defect/control
42 | R58A/K160A | pMal | Primer extension | greatly increased protein yield
43 | K213A/R214E/K217A | pMal | Primer extension | increased protein yield

Number | Mutant | Tested? (Y/N) | NTP or dNTP? | Effect
44 | P68A/F415A | Y | - | In mobility assay showed no effect when compared to WT.
45 | F110A | Y | dNTP | Slightly slower than WT
46 | F142A | N | - | -
47 | F143A | Y | NTP | Incorporated ~4 bases. Essentially no extension
48 | F143V | N | - | -
49 | F143V/Y325V | N | - | -
50 | D144F | Y | dNTP | Slightly slower than WT
51 | Y325A | N | - | -
52 | Y325V | N | - | -

Number | Mutation position | Mutant name | Effect/protein yield
53 | Q290A, Q294A, Q298A | 3Q/A |
54 | K293A, R297A | KR/2A |
55 | N301H | |
56 | N303H | |
57 | N301H, N303H | NN/2H |
58 | N301H, N303H, S305H, S307H | NNSS/4H |
59 | N301G, N303G, S305G, S307G | NNSS/4G |
60 | N301H, N303H, S305H, S307H, Q353R | NNSSQ/4H1R |
61 | N301H, N303H, S305H, S307H, Q353H | NNSSQ/5H |
62 | W304A, I306A | WI/A |
63 | S305H, S307H | SS/2H |
64 | R311L | |
65 | R327L | | increased protein yield
66 | R343L | |
67 | R343L, K399D | RK/LD | increased protein yield
68 | R343L, R381D, K382D, R386D, K389D, K399D | 3RK/1L5D | greatly increased protein yield
69 | R344L | |
70 | R345L | |
71 | R344L, R345L | RR/L |
72 | R347L | |
73 | Q353H | |
74 | Q353R | |
75 | K355D, R356L | KR/DL | increased protein yield
76 | K355G, R356G, R358A, R360A, R381A, K382A, R386A, K389A | 3K5R/2G6A | increased protein yield
77 | R356G | |
78 | R360A | |
79 | N379A | |
80 | T380A | |
81 | R381A | |
82 | K382A | | increased protein yield
83 | R386A | | increased protein yield
84 | K389A | |
85 | R381A, K382A | RK/A-1 |
86 | R386A, K389A | RK/A-2 |
87 | R381A, K382A, R386A, K389A | RKRK/A | increased protein yield
88 | R381D, K382A, R386D, K389A | RKRK/DADA | increased protein yield
89 | K399D | | increased protein yield
90 | H394A | | increased protein yield
91 | R413A | |
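Because several of the substitutions above were designed to reduce positive surface charge, it can be useful to tally the nominal charge change for a given mutant name. The following is a minimal sketch, not the inventors' method: it assumes formal side-chain charges of +1 for K and R, -1 for D and E, and 0 for every other residue (including H) near neutral pH, and the function names are hypothetical.

```python
import re

# Assumed formal side-chain charges near neutral pH; all other residues count as 0.
SIDE_CHAIN_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(name):
    """Split a name such as 'K213A/R214E/K217A' into (wild type, position, mutant) tuples."""
    parsed = []
    for token in re.split(r"[/,]\s*", name.strip()):
        match = SUBSTITUTION.match(token)
        if match is None:
            raise ValueError(f"not a point substitution: {token!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def net_charge_change(name):
    """Sum of (mutant charge - wild-type charge) over all substitutions."""
    return sum(
        SIDE_CHAIN_CHARGE.get(mutant, 0) - SIDE_CHAIN_CHARGE.get(wild_type, 0)
        for wild_type, _position, mutant in parse_substitutions(name)
    )


for mutant_name in ("R58A/K160A", "K213A/R214E/K217A",
                    "R343L, R381D, K382D, R386D, K389D, K399D"):
    print(mutant_name, net_charge_change(mutant_name))
```

Names that are not simple point substitutions, such as 23-31/4G or YAAA, are deliberately rejected by this parser rather than guessed at.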
[0104] All of the methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
REFERENCES
[0105] The following references, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.
[0106] U.S. Pat. No. 7,670,807
[0107] Adams, P. D., Afonine, P. V., Bunkoczi, G., Chen, V. B., Davis, I. W., Echols, N., Headd, J. J., Hung, L. W., Kapral, G. J., Grosse-Kunstleve, R. W., et al. (2010). PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213-221.
[0108] Aizawa, Y., Xiang, Q., Lambowitz, A. M., and Pyle, A. M. (2003). The pathway for DNA recognition and RNA integration by a group II intron retrotransposon. Mol. Cell 11, 795-805.
Appleby, T. C., Perry, J. K., Murakami, E., Barauskas, O., Feng, J., Cho, A., Fox, D., 3rd, Wetmore, D. R., McGrath, M. E., Ray, A. S., et al. (2015). Viral replication. Structural basis for RNA replication by the hepatitis C virus polymerase. Science 347, 771-775.
Arnold, J. J., Vignuzzi, M., Stone, J. K., Andino, R., and Cameron, C. E. (2005). Remote site control of an active site fidelity checkpoint in a viral RNA-dependent RNA polymerase. J. Biol. Chem. 280, 25706-25716.
[0109] Blocker, F. J. H., Mohr, G., Conlan, L. H., Qi, L., Belfort, M., and Lambowitz, A. M. (2005). Domain structure and three-dimensional model of a group II intron-encoded reverse transcriptase. RNA 11, 14-28.
[0110] Bricogne G., Blanc E., Brandl M., Flensburg C., Keller P., Paciorek W., Roversi P., Sharff A., Smart O. S., Vonrhein C., Womack T. O. (2016). BUSTER version 2.10.2. Cambridge, United Kingdom: Global Phasing Ltd.
[0111] Carignani, G., Groudinsky, O., Frezza, D., Schiavon, E., Bergantino, E., and Slonimski, P. P. (1983). An mRNA maturase is encoded by the first intron of the mitochondrial gene for the subunit I of cytochrome oxidase in S. cerevisiae. Cell 35, 733-742.
[0112] Cavalier-Smith, T. (1991). Intron phylogeny: a new hypothesis. TIG 7, 145-148.
Clark, W. C., Evans, M. E., Dominissini, D., Zheng, G., and Pan, T. (2016). tRNA base methylation identification and quantification via high-throughput sequencing. RNA 22, 1771-1784.
[0113] Costa, M., Walbott, H., Monachello, D., Westhof, E., and Michel, F. (2016). Crystal structures of a group II intron lariat primed for reverse splicing. Science 354, aaf9258.
[0114] Cousineau, B., Smith, D., Lawrence-Cavanagh, S., Mueller, J. E., Yang, J., Mills, D., Manias, D., Dunny, G., Lambowitz, A. M., and Belfort, M. (1998). Retrohoming of a bacterial group II intron: mobility via complete reverse splicing, independent of homologous DNA recombination. Cell 94, 451-462.
[0115] Das, K., Martinez, S. E., Bandwar, R. P., and Arnold, E. (2014). Structures of HIV-1 RT-RNA/DNA ternary complexes with dATP and nevirapine reveal conformational flexibility of RNA/DNA: insights into requirements for RNase H cleavage. Nucleic Acids Res. 42, 8125-8137.
[0116] Emsley, P., Lohkamp, B., Scott, W. G., and Cowtan, K. (2010). Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66, 486-501.
[0117] Evans, P. R., and Murshudov, G. N. (2013). How good are my data and what is the resolution? Acta Crystallogr. D Biol. Crystallogr. 69, 1204-1214.
[0119] Fica, S. M., Tuttle, N., Novak, T., Li, N. S., Lu, J., Koodathingal, P., Dai, Q., Staley, J. P., and Piccirilli, J. A. (2013). RNA catalyses nuclear pre-mRNA splicing. Nature 503, 229-234.
Fisher, T. S., Darden, T., and Prasad, V. R. (2003). Substitutions at Phe61 in the beta3-beta4 hairpin of HIV-1 reverse transcriptase reveal a role for the Fingers subdomain in strand displacement DNA synthesis. J. Mol. Biol. 325, 443-459.
[0120] Galej, W. P., Nguyen, T. H., Newman, A. J., and Nagai, K. (2014). Structural studies of the spliceosome: zooming into the heart of the machine. Curr. Opin. Struct. Biol. 25, 57-66.
[0121] Galej, W. P., Oubridge, C., Newman, A. J., and Nagai, K. (2013). Crystal structure of Prp8 reveals active site cavity of the spliceosome. Nature 493, 638-643.
[0122] Gao, G., Orlova, M., Georgiadis, M. M., Hendrickson, W. A., and Goff, S. P. (1997). Conferring RNA polymerase activity to a DNA polymerase: a single residue in reverse transcriptase controls substrate selection. Proc. Natl. Acad. Sci. USA 94, 407-411.
[0123] Gillis, A. J., Schuller, A. P., and Skordalakes, E. (2008). Structure of the Tribolium castaneum telomerase catalytic subunit TERT. Nature 455, 633-637.
[0124] Kabsch, W. (2010). Xds. Acta Crystallogr. D Biol. Crystallogr. 66, 125-132.
[0125] Kennell, J. C., Moran, J. V., Perlman, P. S., Butow, R. A., and Lambowitz, A. M. (1993). Reverse transcriptase activity associated with maturase-encoding group II introns in yeast mitochondria. Cell 73, 133-146.
[0126] Kissinger, C. R., Gehlhaar, D. K., and Fogel, D. B. (1999). Rapid automated molecular replacement by evolutionary search. Acta Crystallogr. D Biol. Crystallogr. 55, 484-491.
[0127] Koonin, E. V., Dolja, V. V., and Krupovic, M. (2015). Origins and evolution of viruses of eukaryotes: The ultimate modularity. Virology 479-480, 2-25.
[0128] Lambowitz, A., and Belfort, M. (2015). Mobile bacterial group II introns at the crux of eukaryotic evolution. Microbiol. Spectrum 3.
[0129] Lambowitz, A. M., and Zimmerly, S. (2011). Group II introns: mobile ribozymes that invade DNA. Cold Spring Harb. Perspect. Biol. 3, a003616.
[0130] Malik, H. S., Burke, W. D., and Eickbush, T. H. (1999). The age and evolution of non-LTR retrotransposable elements. Mol. Biol. Evol. 16, 793-805.
[0131] Marcia, M., and Pyle, A. M. (2012). Visualizing group II intron catalysis through the stages of splicing. Cell 151, 497-507.
[0132] Martin, W., and Koonin, E. V. (2006). Introns and the origin of nucleus-cytosol compartmentalization. Nature 440, 41-45.
[0133] Michel, F., and Ferat, J. L. (1995). Structure and activities of group II introns. Annu. Rev. Biochem. 64, 435-461.
[0134] Mitchell, M., Gillis, A., Futahashi, M., Fujiwara, H., and Skordalakes, E. (2010). Structural basis for telomerase catalytic subunit TERT binding to RNA template and telomeric DNA. Nat. Struct. Mol. Biol. 17, 513-518.
[0135] Mohr, S., Ghanem, E., Smith, W., Sheeter, D., Qin, Y., King, O., Polioudakis, D., Iyer, V. R., Hunicke-Smith, S., Swamy, S., et al. (2013). Thermostable group II intron reverse transcriptase fusion proteins and their use in cDNA synthesis and next-generation RNA sequencing. RNA 19, 958-970.
[0136] Murshudov, G. N., Skubak, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F., and Vagin, A. A. (2011). REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr. D Biol. Crystallogr. 67, 355-367.
[0137] Nguyen, T. H., Galej, W. P., Fica, S. M., Lin, P. C., Newman, A. J., and Nagai, K. (2016). CryoEM structures of two spliceosomal complexes: starter and dessert at the spliceosome feast. Curr. Opin. Struct. Biol. 36, 48-57.
[0138] Noah, J. W., Park, S., Whitt, J. T., Perutka, J., Frey, W., and Lambowitz, A. M. (2006). Atomic force microscopy reveals DNA bending during group II intron ribonucleoprotein particle integration into double-stranded DNA. Biochemistry 45, 12424-12435.
[0139] Nottingham, R. M., Wu, D. C., Qin, Y., Yao, J., Hunicke-Smith, S., and Lambowitz, A. M. (2016). RNA-seq of human reference RNA samples using a thermostable group II intron reverse transcriptase. RNA 22, 597-613.
[0140] Paukstelis, P. J., Chen, J. H., Chase, E., Lambowitz, A. M., and Golden, B. L. (2008). Structure of a tyrosyl-tRNA synthetase splicing factor bound to a group I intron RNA. Nature 451, 94-97.
[0141] Peebles, C. L., Perlman, P. S., Mecklenburg, K. L., Petrillo, M. L., Tabor, J. H., Jarrell, K. A., and Cheng, H. L. (1986). A self-splicing RNA excises an intron lariat. Cell 44, 213-223.
[0142] Qu, G., Kaushal, P. S., Wang, J., Shigematsu, H., Piazza, C. L., Agrawal, R. K., Belfort, M., and Wang, H. W. (2016). Structure of a group II intron in complex with its reverse transcriptase. Nat. Struct. Mol. Biol. 23, 549-557.
[0143] Saldanha, R., Chen, B., Wank, H., Matsuura, M., Edwards, J., and Lambowitz, A. M. (1999). RNA and protein catalysis in group II intron splicing and mobility reactions using purified components. Biochemistry 38, 9069-9083.
[0144] San Filippo, J., and Lambowitz, A. M. (2002). Characterization of the C-terminal DNA-binding/DNA endonuclease region of a group II intron-encoded protein. J. Mol. Biol. 324, 933-951.
[0145] Sawaya, M. R., Prasad, R., Wilson, S. H., Kraut, J., and Pelletier, H. (1997). Crystal structures of human DNA polymerase beta complexed with gapped and nicked DNA: evidence for an induced fit mechanism. Biochemistry 36, 11205-11215.
[0146] Sharp, P. A. (1985). On the origin of RNA splicing and introns. Cell 42, 397-400.
[0147] Sontheimer, E. J., Gordon, P. M., and Piccirilli, J. A. (1999). Metal ion catalysis during group II intron self-splicing: parallels with the spliceosome. Genes Dev. 13, 1729-1741.
[0148] Toro, N., and Nisa-Martinez, R. (2014). Comprehensive phylogenetic analysis of bacterial reverse transcriptases. PLoS ONE 9, e114083.
[0149] Wang, H., and Lambowitz, A. M. (1993). The Mauriceville plasmid reverse transcriptase can initiate cDNA synthesis de novo and may be related to reverse transcriptase and DNA polymerase progenitor. Cell 75, 1071-1081.
[0150] Wu, X., and Bartel, D. P. (2017). Widespread influence of 3′-end structures on mammalian mRNA processing and stability. Cell 169, 905-917 e911.
[0151] Yang, J., Zimmerly, S., Perlman, P. S., and Lambowitz, A. M. (1996). Efficient integration of an intron RNA into double-stranded DNA by reverse splicing. Nature 381, 332-335.
[0152] Zhao, C., and Pyle, A. M. (2016). Crystal structures of a group II intron maturase reveal a missing link in spliceosome evolution. Nat. Struct. Mol. Biol. 23, 558-565.
[0153] Zheng, G., Qin, Y., Clark, W. C., Dai, Q., Yi, C., He, C., Lambowitz, A. M., and Pan, T. (2015). Efficient and quantitative high-throughput tRNA sequencing. Nat. Methods 12, 835-837.
[0154] Zimmerly, S., Guo, H., Eskes, R., Yang, J., Perlman, P. S., and Lambowitz, A. M. (1995a). A group II intron RNA is a catalytic component of a DNA endonuclease involved in intron mobility. Cell 83, 529-538.
[0155] Zimmerly, S., Guo, H., Perlman, P. S., and Lambowitz, A. M. (1995b). Group II intron mobility occurs by target DNA-primed reverse transcription. Cell 82, 545-554.
[0156] Zimmerly, S., and Wu, L. (2015). An unexplored diversity of reverse transcriptases in bacteria. Microbiol. Spectr. 3, 1253-1269.
[0157] Zubradt, M., Gupta, P., Persad, S., Lambowitz, A. M., Weissman, J. S., and Rouskin, S. (2016). DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo. Nat. Methods 14, 75-82.