NUCLEIC ACID MODIFICATION WITH TOOLS FROM OXYTRICHA
20210163900 · 2021-06-03
Inventors
Cpc classification
C12Y201/01072
CHEMISTRY; METALLURGY
A61K45/06
HUMAN NECESSITIES
G01N33/6842
PHYSICS
C12Y302/02009
CHEMISTRY; METALLURGY
C12P19/34
CHEMISTRY; METALLURGY
International classification
A61K45/06
HUMAN NECESSITIES
C12P19/34
CHEMISTRY; METALLURGY
Abstract
The present disclosure provides, inter alia, methods for treating a disease characterized by an abnormal level of m6dA in a subject, such as cancer, methods of modifying a nucleic acid from a cell, methods for identifying protein binding sites on DNA, methods of mediating DNA N6-adenine methylation, methods of modulating nucleosome organization and/or transcription in a cell, using MTA1c or any components thereof. The present disclosure also provides methods of generating a synthetic chromosome and synthetic chromosomes made by such methods. Pharmaceutical compositions comprising MTA1c or any components thereof and kits containing such compositions or for carrying out such processes are further provided. Eukaryotic cells, vectors and transgenic organisms comprising MTA1c or any components thereof are also provided. Synthetic chromosomes and methods of making same are also provided.
Claims
1. A method of treating or ameliorating the effects of a disease characterized by an abnormal level of m6dA in a subject, comprising administering to the subject an amount of MTA1c or any components thereof effective to modulate m6dA levels in the subject.
2. The method according to claim 1, wherein the modulation comprises restoring m6dA levels to normal or near-normal ranges in the subject.
3. The method according to claim 1, wherein the disease is a cancer.
4. The method according to claim 3, wherein the cancer is gastric cancer or liver cancer.
5. The method according to claim 4, further comprising administering to the subject one or more of anti-gastric cancer and anti-liver cancer drugs.
6. The method according to claim 1, further comprising co-administering to the subject an epigenetic agent.
7. The method according to claim 6, wherein the epigenetic agent is selected from the group consisting of methylation inhibiting drugs, Bromodomain inhibitors, histone acetylase (HAT) inhibitors, protein methyltransferase inhibitors, histone methylation inhibitors, histone deacetylase (HDAC) inhibitors, histone acetylases, histone deacetylases, and combinations thereof.
8. A pharmaceutical composition comprising MTA1c or any components thereof that is effective to modulate m6dA levels in a subject in need thereof and a pharmaceutically acceptable carrier, diluent, adjuvant or vehicle.
9. A method of modifying a nucleic acid from a cell, the cell derived from a multicellular eukaryote, comprising the steps of: (a) obtaining the nucleic acid from the cell; and (b) contacting the nucleic acid with MTA1c or any components thereof under conditions effective to methylate the nucleic acid.
10. The method according to claim 9, wherein the methylated nucleic acid is effective to modulate nucleosome organization and transcription.
11. The method according to claim 9, wherein the modification is a DNA N6-adenine methylation.
12. The method according to claim 11, wherein the DNA N6-adenine methylation is one or more of dimethylated AT (5′-A*T-3′/3′-TA*-5′), dimethylated TA (5′-TA*-3′/3′-A*T-5′), dimethylated AA (5′-A*A*-3′/3′-TT-5′), methylated AT (5′-A*T-3′/3′-TA-5′), methylated AA (5′-A*A-3′/3′-TT-5′), methylated AC (5′-A*C-3′/3′-TG-5′), methylated AG (5′-A*G-3′/3′-TC-5′), methylated TA (5′-TA*-3′/3′-AT-5′), methylated AA (5′-AA*-3′/3′-TT-5′), methylated CA (5′-CA*-3′/3′-GT-5′), and methylated GA (5′-GA*-3′/3′-CT-5′).
13. The method according to claim 9, wherein the MTA1c or any components thereof comprises a mutation effective to abrogate dimethylation of the nucleic acid.
14. The method according to claim 13, wherein the mutation comprises loss of a C-terminal methyltransferase domain.
15. The method according to claim 9, wherein the MTA1c or any components thereof is obtained from ciliates, algae, or basal fungi.
16. The method according to claim 9, wherein the MTA1c or any components thereof is obtained from Oxytricha or Tetrahymena.
17. A cell line obtained from a multicellular eukaryote comprising a nucleic acid encoding MTA1c or any components thereof and/or an MTA1c protein complex or any components thereof.
18. The eukaryotic cell according to claim 17, wherein the nucleic acid encoding MTA1c or any components thereof is operably linked to a recombinant expression vector.
19. A method of identifying protein binding sites on DNA comprising the steps of: (a) providing DNA; (b) contacting the DNA with MTA1c or any components thereof under conditions effective to methylate the DNA; (c) contacting the DNA with one or more proteins; (d) contacting the DNA with an enzyme effective to hydrolyze the DNA in positions where no protein binding occurs; (e) removing the DNA bound protein; and (f) isolating and sequencing the DNA fragments.
20. The method according to claim 19, wherein the one or more proteins comprise histone octamers.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0023] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
[0024] The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present disclosure. The disclosure may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.
DETAILED DESCRIPTION OF THE DISCLOSURE
[0130] DNA N6-adenine methylation (6mA) has recently been described in diverse eukaryotes, spanning unicellular organisms to metazoa. The present disclosure reports a DNA 6mA methyltransferase complex in ciliates, termed MTA1c. It consists of two MT-A70 proteins and two homeobox-like DNA-binding proteins and specifically methylates dsDNA. Disruption of the catalytic subunit, MTA1, in the ciliate Oxytricha leads to genome-wide loss of 6mA and abolishment of the consensus ApT dimethylated motif. Mutants fail to complete the sexual cycle, which normally coincides with peak MTA1 expression. The present disclosure investigates the impact of 6mA on nucleosome occupancy in vitro by reconstructing complete, full-length Oxytricha chromosomes harboring 6mA in native or ectopic positions. 6mA is shown to directly disfavor nucleosomes in vitro in a local, quantitative manner, independent of DNA sequence. Furthermore, the chromatin remodeler ACF can overcome this effect. The present disclosure identifies a diverged DNA N6-adenine methyltransferase and defines the role of 6mA in chromatin organization.
[0131] One embodiment of the present disclosure is a method of modifying a nucleic acid from a cell, the cell derived from a multicellular eukaryote. This method comprises the steps of: (a) obtaining the nucleic acid from the cell; and (b) contacting the nucleic acid with MTA1c or any components thereof under conditions effective to methylate the nucleic acid.
[0132] In some embodiments, the nucleic acid is RNA or DNA. In some embodiments, the eukaryotic cell is mammalian. In some embodiments, the multicellular eukaryote is a human. In some embodiments, the modification is a DNA N6-adenine methylation including one or more of the following motifs: dimethylated AT (5′-A*T-3′/3′-TA*-5′), dimethylated TA (5′-TA*-3′/3′-A*T-5′), dimethylated AA (5′-A*A*-3′/3′-TT-5′), methylated AT (5′-A*T-3′/3′-TA-5′), methylated AA (5′-A*A-3′/3′-TT-5′), methylated AC (5′-A*C-3′/3′-TG-5′), methylated AG (5′-A*G-3′/3′-TC-5′), methylated TA (5′-TA*-3′/3′-AT-5′), methylated AA (5′-AA*-3′/3′-TT-5′), methylated CA (5′-CA*-3′/3′-GT-5′), and methylated GA (5′-GA*-3′/3′-CT-5′). In certain embodiments, the MTA1 or an ortholog thereof comprises a mutation effective to abrogate dimethylation of the nucleic acid. Preferably, the mutation comprises loss of a C-terminal methyltransferase domain. In some embodiments, the MTA1c or any components thereof is obtained from ciliates, algae, or basal fungi. Preferably, the MTA1c or any components thereof is obtained from Oxytricha or Tetrahymena.
[0133] As used herein, an “ortholog,” or orthologous gene, is a gene with a sequence that has a portion with similarity to a portion of the sequence of a known gene, but found in a different species than the known gene. An ortholog and the known gene originated by vertical descent from a single gene of a common ancestor. As used herein, an ortholog encodes a protein that has a portion of at least about 50%, such as at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, or at least about 80% of the total length of the sequence of the encoded protein that is similar to a portion of a length of at least about 50%, such as at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, or at least about 80% of a known protein. The respective portion of the ortholog and the respective portion of the known protein to which it is similar may be a continuous sequence or be fragmented into a number of individual regions, for example 1 to about 3, including 2, within the sequence of the respective protein. For example, the 1 to about 3 regions are arranged in the same order in the amino acid sequence of the ortholog and the amino acid sequence of the known protein. Such a portion of an ortholog has an amino acid sequence that has at least about 40%, at least about 45%, such as at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75% or at least about 80% sequence identity to the amino acid sequence of the known protein encoded by a MTA1 gene.
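By way of non-limiting illustration, the sequence-identity thresholds recited above could be evaluated as in the following minimal sketch. It assumes the two protein sequences have already been aligned (gaps written as "-") and that identity is computed only over columns in which neither sequence has a gap; that convention, and the function names `percent_identity` and `meets_identity_threshold`, are assumptions made for illustration and are not defined in the present disclosure.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are skipped (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=40.0):
    """True if identity is at least `threshold` percent (40% is the lowest value recited)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other identity conventions (for example, dividing by the full alignment length) would give different values, so the convention used should be stated alongside any reported percentage.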
[0134] As used herein, an asterisk “*” indicates the presence of a methylated base. For example, “A*” represents a methylated adenine.
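By way of non-limiting illustration, the asterisk notation can be read programmatically as in the following minimal sketch. It assumes each strand is supplied as a bare string such as "A*T" (without the 5′/3′ labels) and labels a site by the total number of marked bases on both strands, matching the "methylated" (one mark) and "dimethylated" (two marks) motifs listed above. The function names are hypothetical.

```python
def methylated_positions(strand):
    """Return 0-based indices of bases that are followed by '*' in a strand string."""
    positions = []
    index = -1  # index of the most recently read base
    for ch in strand:
        if ch == "*":
            if index < 0:
                raise ValueError("'*' must follow a base")
            positions.append(index)
        else:
            index += 1
    return positions


def classify_site(top, bottom):
    """Label a double-stranded motif by the total number of methylated bases."""
    total = len(methylated_positions(top)) + len(methylated_positions(bottom))
    labels = {0: "unmethylated", 1: "methylated", 2: "dimethylated"}
    if total not in labels:
        raise ValueError("more than two methylated bases in the motif")
    return labels[total]
```

For example, `classify_site("A*T", "TA*")` corresponds to the dimethylated AT motif (5′-A*T-3′/3′-TA*-5′).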
[0135] The modified base, m6dA, has been discovered in a wide range of eukaryotes, including humans. m6dA levels are significantly reduced in gastric and liver cancer tissues, and disruption of m6dA promotes tumor formation (Xiao et al. 2018). As disclosed herein, MTA1 is a novel m6dA “writer”, paving the way for cost-effective methods to understand mechanisms of m6dA function in biomedically relevant models.
[0136] Accordingly, another embodiment of the present disclosure is a method of treating or ameliorating the effects of a disease characterized by an abnormal level of m6dA in a subject. This method comprises administering to the subject an amount of MTA1c or any components thereof effective to modulate m6dA levels in the subject. In some embodiments, the modulation comprises restoring m6dA levels to normal or near-normal ranges in the subject.
[0137] In some embodiments, the subject is a mammal that can be selected from the group consisting of humans, veterinary animals, and agricultural animals. Preferably, the subject is a human.
[0138] In some embodiments, the disease is a cancer, e.g., gastric cancer or liver cancer. In certain embodiments, the method further comprises administering to the subject one or more of anti-gastric cancer and anti-liver cancer drugs. Non-limiting examples of anti-liver cancer drugs include Nexavar™ (Sorafenib Tosylate) and Stivarga™ (Regorafenib). Non-limiting examples of anti-gastric cancer drugs include Cyramza™ (Ramucirumab), Doxorubicin Hydrochloride, 5-FU (Fluorouracil Injection), Fluorouracil Injection, Herceptin™ (Trastuzumab), Mitomycin C, Taxotere™ (Docetaxel), Trastuzumab, Afinitor™ (Everolimus), Somatuline Depot™ (Lanreotide Acetate), FU-LV, TPF, and XELIRI.
[0139] In some embodiments, the method further comprises co-administering to the subject an epigenetic agent that is selected from the group consisting of methylation inhibiting drugs, Bromodomain inhibitors, histone acetylase (HAT) inhibitors, protein methyltransferase inhibitors, histone methylation inhibitors, histone deacetylase (HDAC) inhibitors, histone acetylases, histone deacetylases, and combinations thereof.
[0140] Another embodiment of the present disclosure is a pharmaceutical composition comprising MTA1c or any components thereof that is effective to modulate m6dA levels in a subject in need thereof and a pharmaceutically acceptable carrier, diluent, adjuvant or vehicle.
[0141] Yet another embodiment of the present disclosure is a kit for treating or ameliorating the effects of a disease characterized by an abnormal level of m6dA in a subject, such as, e.g., cancer, comprising an effective amount of MTA1c or any components thereof, packaged together with instructions for its use.
[0142] Another embodiment of the present disclosure is a cell line obtained from a multicellular eukaryote comprising a nucleic acid encoding MTA1c or any components thereof and/or an MTA1c protein complex or any components thereof. As used herein, a “cell line” refers to all types of cell lines such as, e.g., immortalized cell lines and primary cell lines. In certain embodiments, the nucleic acid encoding MTA1c or any components thereof is operably linked to a recombinant expression vector.
[0143] Another embodiment of the present disclosure is a recombinant expression vector comprising a polynucleotide encoding MTA1c or any components thereof.
[0144] Still another embodiment of the present disclosure is a transgenic organism whose genome comprises a transgene comprising a nucleotide sequence encoding MTA1c or any components thereof. Non-limiting examples of possible organisms include an archaeon, a bacterium, a eukaryotic single-cell organism, algae, a plant, an animal, an invertebrate, a fly, a worm, a cnidarian, a vertebrate, a fish, a frog, a bird, a mammal, an ungulate, a rodent, a rat, a mouse, and a non-human primate.
[0145] The present disclosure also provides a method of identifying protein binding sites on DNA. This method comprises the steps of: (a) providing DNA; (b) contacting the DNA with MTA1c or any components thereof under conditions effective to methylate the DNA; (c) contacting the DNA with one or more proteins; (d) contacting the DNA with an enzyme effective to hydrolyze the DNA in positions where no protein binding occurs; (e) removing the DNA bound protein; and (f) isolating and sequencing the DNA fragments. In certain embodiments, the one or more proteins in step (c) comprise histone octamers.
[0146] Another embodiment of the present disclosure is a method of mediating DNA N6-adenine methylation. This method comprises the steps of: (a) providing DNA; and (b) contacting the DNA with MTA1c or any components thereof under conditions effective to methylate the DNA.
[0147] Another embodiment of the present disclosure is a method of modulating nucleosome organization and/or transcription in a cell, comprising providing to the cell an agent that is effective to modulate the expression of MTA1c or any components thereof.
[0148] The present disclosure also provides a method of generating a synthetic chromosome. This method comprises the steps of: (a) generating chromosome segments containing terminal restriction sites, wherein the chromosome segments comprise one or more m6dA bases; (b) digesting the chromosome segments with a restriction enzyme; and (c) purifying and ligating the digested chromosome segments to form a synthetic chromosome. In some embodiments, the method further comprises enriching the synthetic chromosome. A synthetic chromosome made by the method above is also provided.
[0149] The following examples are provided to further illustrate certain aspects of the present disclosure. These examples are illustrative only and are not intended to limit the scope of the disclosure in any way.
EXAMPLES
Example 1
Materials and Methods
[0150]
TABLE-US-00001 KEY RESOURCES TABLE
REAGENT or RESOURCE | SOURCE | IDENTIFIER

Antibodies
Anti-H2A | Active Motif | Cat #: 39111
Anti-H2B | Abcam | Cat #: 1790
Anti-H3 | Abcam | Cat #: 1791
Anti-H4 | Active Motif | Cat #: 39269
Anti-N6-methyladenosine antibody | Cedarlane Labs/Synaptic Systems | Cat #: 202003(SY)
Goat Anti-Rabbit IgG (H + L)-HRP Conjugate | Bio-Rad | 1706515

Bacterial and Virus Strains
One Shot TOP10 chemically competent E. coli | Thermo Fisher | Cat #: C404006
BL21(DE3) pLysS | Thermo Fisher | Cat #: 70-236-4
SHuffle T7 Express Competent E. coli | NEB | Cat #: C3029J
Lemo21 (DE3) Competent E. coli | NEB | Cat #: C2528J

Chemicals, Peptides, and Recombinant Proteins
Micrococcal nuclease | NEB | Cat #: M0247S
Q5 Site-Directed Mutagenesis Kit | NEB | Cat #: E0554S
ProBlock Gold bacterial protease inhibitor cocktail | GoldBio | Cat #: GB-330-5
Proteinase K | Roche | Cat #: 3113879001
Phenol:Chloroform:IAA, 25:24:1 | Thermo Fisher | Cat #: AM9732
TRIzol reagent | Thermo Fisher | Cat #: 15596026
DNA Polymerase I, Large (Klenow) Fragment | NEB | Cat #: M0210S
Klenow Fragment (3′->5′ exo-) | NEB | Cat #: M0212S
BsaI | NEB | Cat #: R3535S
EcoGII | NEB | Cat #: M0603S
T4 DNA ligase | NEB | Cat #: M0202M
Phusion DNA polymerase | NEB | Cat #: M0530L
S-adenosyl-L-methionine | NEB | Cat #: B9003S
Mouse NAP1 | This study | N/A
Drosophila ACF complex | Active Motif | Cat #: 31509
Xenopus histones | This study | N/A
Polyvinyl alcohol | Sigma Aldrich | Cat #: P8136
Polyethylene glycol 8000 | Sigma Aldrich | Cat #: P2139
Adenosine 5′-triphosphate (ATP) | Sigma Aldrich | Cat #: A6559-25UMO
Creatine phosphate | Sigma Aldrich | Cat #: 10621714001
Creatine kinase | Sigma Aldrich | Cat #: 10127566001
Power SYBR Green PCR master mix | Thermo Fisher | Cat #: 4367659
Gum Arabic | Sigma Aldrich | Cat #: G9752-1KG
3H-labeled S-adenosyl-L-methionine ([3H]SAM) | PerkinElmer | Cat #: NET155V250UC
Ultima Gold | PerkinElmer | Cat #: 6013326
DNA degradase plus enzyme | Zymo Research | Cat #: E2020
.sup.15N.sub.5-dA nucleoside | Cambridge Isotope Laboratories | Cat #: NLM-3895-25
D.sub.3-6mA | Synthesized in this study | N/A

Critical Commercial Assays
QIAquick gel extraction kit | QIAGEN | Cat #: 28706
NEBNext Poly(A) mRNA Magnetic Isolation Module | NEB | Cat #: E7490S
ScriptSeq v2 RNA-Seq Library Prep Kit | Illumina | Cat #: SSV21124
Nucleospin Tissue Kit | Takara Bio USA | Cat #: 740952.250
MinElute Reaction Cleanup Kit | QIAGEN | Cat #: 28206
NEBNext Ultra II DNA Library Prep Kit | NEB | Cat #: E7645S
Hi-Scribe T7 High Yield RNA Synthesis Kit | NEB | Cat #: E2040S
Dynabeads Protein A | Thermo Fisher | Cat #: 10001D
TOPO TA cloning kit | Thermo Fisher | Cat #: K457501

Deposited Data
Oxytricha trifallax SMRT-seq | This study | SRA: SRX2335608 and SRX2335607
Tetrahymena thermophila SMRT-seq | This study | GEO: GSE94421
Oxytricha trifallax, all Illumina data (RNA-seq, 6mA-IP-seq, MNase-seq, gDNA-seq) | This study | GEO: GSE94421

Experimental Models: Organisms/Strains
Oxytricha trifallax cells, strain JRB310 | Lab collection | N/A
Oxytricha trifallax cells, strain JRB510 | Lab collection | N/A
Oxytricha trifallax cells, mta1 mutant | Lab collection | N/A
Tetrahymena thermophila cells, strain SB210 | Tetrahymena stock center | Cat #: SD00703

Oligonucleotides
All are listed in Table S4 | IDT | N/A

Recombinant DNA
pET-His-NAP1 (expression vector for recombinant NAP1) | This study | N/A
pET-XenH2A (expression vector for recombinant Xenopus histone H2A) | This study | N/A
pET-XenH2B (expression vector for recombinant Xenopus histone H2B) | This study | N/A
pET-XenH3 (expression vector for recombinant Xenopus histone H3) | This study | N/A
pET-XenH4 (expression vector for recombinant Xenopus histone H4) | This study | N/A
pET-HisSUMO-MTA1 (expression vector for recombinant Tetrahymena MTA1) | This study | N/A
pET-HisSUMO-MTA7 (expression vector for recombinant Tetrahymena MTA7) | This study | N/A
pET-HisSUMO-p1 (expression vector for recombinant Tetrahymena p1) | This study | N/A
pET-HisSUMO-p2 (expression vector for recombinant Tetrahymena p2) | This study | N/A
pCR-TOPO-syntheticChromosome (cloned synthetic chromosomes to verify accuracy of ligation of component DNA building blocks) | This study | N/A

Software and Algorithms
Galaxy | Galaxy Community Hub | https://usegalaxy.org/
Bowtie2 | Langmead and Salzberg, 2012 | http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
TopHat2 | TopHat2 (Mortazavi et al., 2008) | https://ccb.jhu.edu/software/tophat/index.shtml
Python 2.7.10 | Python Software Foundation | https://www.python.org/download/releases/2.7/
CAGEr | Haberle et al., 2015 | https://bioconductor.org/packages/release/bioc/html/CAGEr.html
SMRT Analysis 2.3.0 | Pacific Biosciences | https://www.pacb.com/documentation/smrt-analysis-software-installation-v2-3-0/
PSI-BLAST | NCBI/NIH | https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE=Proteins&PROGRAM=blastp&RUN_PSIBLAST=on
CD-HIT | Huang et al., 2010 | http://weizhong-lab.ucsd.edu/cdhit-web-server/cgi-bin/index.cgi
MAFFT | Katoh et al., 2017; Kuraku et al., 2013 | https://mafft.cbrc.jp/alignment/software/
MrBayes/CIPRES Science Gateway | Miller et al., 2010 | https://www.phylo.org/
R (v3.2.5) | The R Foundation | https://www.r-project.org/
hmmscan | Finn et al., 2015 | https://www.ebi.ac.uk/Tools/hmmer/search/hmmscan

Other
Agencourt Ampure XP beads | Beckman Coulter | Cat #: A63880
Acid-extracted Oxytricha histones | This study | N/A
Slide-A-Lyzer 3.5K MWCO cassette | Thermo Fisher | Cat #: PI66110
Amersham Hybond-XL membrane | GE Healthcare | Cat #: RPN303S
Amersham Hybond-N+ membrane | GE Healthcare | Cat #: RPN119B
Volvic water | Amazon | https://www.amazon.com/Volvic-500m1-6-Pack/dp/B013PCK8M4/ ref=sr_1_1_a_it?_ie=UTF8&qid=1538873999&sr=8- 1&keyword_s=volvic&dpID=418qEyu6yrUpreST=_SY300 QL70 &dpSrc=srch
Oxytricha trifallax
[0151] Vegetative Oxytricha trifallax strain JRB310 was cultured at a density of 1.5×10.sup.7 cells/L to 2.5×10.sup.7 cells/L in Pringsheim media (0.11 mM Na.sub.2HPO.sub.4, 0.08 mM MgSO.sub.4, 0.85 mM Ca(NO.sub.3).sub.2, 0.35 mM KCl, pH 7.0) and fed daily with Chlamydomonas reinhardtii. Cells were filtered through cheesecloth to remove debris and collected on a 10 μm Nitex mesh for subsequent experiments.
Tetrahymena thermophila
[0152] Stock cultures of vegetative Tetrahymena thermophila strain SB210 were maintained in Neff medium (0.25% w/v proteose peptone, 0.25% w/v yeast extract, 0.5% glucose, 33.3 μM FeCl.sub.3). These cultures were inoculated into SSP medium (2% w/v proteose peptone, 0.1% w/v yeast extract, 0.2% w/v glucose, 33 μM FeCl.sub.3) and grown to log-phase (˜3.5×10.sup.5 cells/mL) with constant shaking at 125 rpm at 30° C.
In Vivo MNase-Seq
[0153] 3×10.sup.5 vegetative Oxytricha cells were fixed in 1% w/v formaldehyde for 10 min at room temperature with gentle shaking, and then quenched with 125 mM glycine. Cells were lysed by dounce homogenization in lysis buffer (20 mM Tris pH 6.8, 3% w/v sucrose, 0.2% v/v Triton X-100, 0.01% w/v spermidine trihydrochloride) and centrifuged in a 10%-40% discontinuous sucrose gradient (Lauth et al., 1976) to purify macronuclei. The resulting macronuclear preparation was pelleted by centrifugation at 4000×g, washed in 50 ml TMS buffer (10 mM Tris pH 7.5, 10 mM MgCl.sub.2, 3 mM CaCl.sub.2, 0.25 M sucrose), resuspended in a final volume of 300 μL, and equilibrated at 37° C. for 5 min. Chromatin was then digested with MNase (New England Biolabs) at a final concentration of 15.7 Kunitz Units/μL at 37° C. for 1 min 15 s, 3 min, 5 min, 7 min 30 s, 10 min 30 s, and 15 min, respectively. Reactions were stopped by adding ½ volume of PK buffer (300 mM NaCl, 30 mM Tris pH 8, 75 mM EDTA pH 8, 1.5% w/v SDS, 0.5 mg/mL Proteinase K). Each sample was incubated at 65° C. overnight to reverse crosslinks and deproteinate samples. Subsequently, nucleosomal DNA was purified through phenol:chloroform:isoamyl alcohol extraction and ethanol precipitation. Each sample was loaded on a 2% agarose-TAE gel to check the extent of MNase digestion. The sample exhibiting ˜80% mononucleosomal species was selected for MNase-seq analysis, in accordance with previous guidelines (Zhang and Pugh, 2011). Mononucleosome-sized DNA was gel-purified using a QIAquick gel extraction kit (QIAGEN). Illumina libraries were prepared using an NEBNext Ultra II DNA Library Prep Kit (New England Biolabs) and subjected to paired-end sequencing on an Illumina HiSeq 2500 according to manufacturer's instructions. All vegetative Tetrahymena MNase-seq data were obtained from (Beh et al., 2015).
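By way of non-limiting illustration, the effect of stopping a reaction with ½ volume of PK buffer can be worked through as follows. The sketch assumes that the PK buffer concentrations listed above are those of the buffer before addition (the text does not state this explicitly) and that volumes are additive; the function name `final_concentration` is hypothetical. Under these assumptions every buffer component is diluted three-fold regardless of the sample volume.

```python
def final_concentration(stock, sample_volume, added_volume):
    """Concentration of a buffer component after it is mixed into a sample.

    Assumes the component is absent from the sample and volumes are additive.
    """
    return stock * added_volume / (sample_volume + added_volume)


# PK buffer components as listed in the text (assumed to be pre-addition values).
PK_BUFFER = {
    "NaCl (mM)": 300.0,
    "Tris pH 8 (mM)": 30.0,
    "EDTA pH 8 (mM)": 75.0,
    "SDS (% w/v)": 1.5,
    "Proteinase K (mg/mL)": 0.5,
}

if __name__ == "__main__":
    # One volume of digest plus one half volume of PK buffer.
    for name, stock in PK_BUFFER.items():
        print(name, round(final_concentration(stock, 1.0, 0.5), 3))
```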
Poly(A).sup.+ RNA-Seq and TSS Sequencing
[0154] Oxytricha cells were lysed in TRIzol reagent (Thermo Fisher Scientific) for total RNA isolation according to manufacturer's instructions. Poly(A).sup.+ RNA was then purified using the NEBNext Poly(A) mRNA Magnetic Isolation Module (New England Biolabs). Oxytricha poly(A).sup.+ RNA was prepared for RNA-seq using the ScriptSeq v2 RNA-Seq Library Preparation Kit (Illumina). Tetrahymena poly(A).sup.+ RNA-seq data were obtained from (Xiong et al., 2012). The 5′ ends of capped RNAs were enriched from vegetative Oxytricha total RNA using the RAMPAGE protocol (Batut et al., 2013), and used for library preparation, Illumina sequencing and subsequent transcription start site determination (i.e., “TSS-seq”). These data were used to plot the distribution of Oxytricha TSS positions in
Immunoprecipitation and Illumina Sequencing of Methylated DNA (6mA IP-Seq)
[0155] Genomic DNA was isolated from vegetative Oxytricha cells using the Nucleospin Tissue Kit (Takara Bio USA, Inc.). DNA was sheared into 150 bp fragments using a Covaris LE220 ultra-sonicator (Covaris). Samples were gel-purified on a 2% agarose-TAE gel, blunted with DNA polymerase I (New England Biolabs), and purified using MinElute spin columns (QIAGEN). The fragmented DNA was dA-tailed using Klenow Fragment (3′->5′ exo-) (New England Biolabs) and ligated to Illumina adaptors following manufacturer's instructions. Subsequently, 2.2 μg of adaptor-ligated DNA containing 6mA was immunoprecipitated using an anti-N6-methyladenosine antibody (Cedarlane Labs) conjugated to Dynabeads Protein A (Invitrogen). The anti-6mA antibody is commonly used for RNA applications, but has also been demonstrated to recognize 6mA in DNA (Fioravanti et al., 2013; Xiao and Moore, 2011). The immunoprecipitated and input libraries were treated with proteinase K, extracted with phenol:chloroform, and ethanol precipitated. Finally, they were PCR-amplified using Phusion Hot Start polymerase (New England Biolabs) and used for Illumina sequencing.
Sample Preparation for SMRT-Seq
[0156] Vegetative Oxytricha macronuclei were isolated as described in the subheading “in vivo MNase-seq” of this study. Vegetative Tetrahymena macronuclei were isolated by differential centrifugation (Beh et al., 2015). Oxytricha and Tetrahymena cells were not fixed prior to nuclear isolation. Genomic DNA was isolated from Oxytricha and Tetrahymena macronuclei using the Nucleospin Tissue Kit (Macherey-Nagel). Alternatively, whole Oxytricha cells instead of macronuclei were used. SMRT-seq was performed according to manufacturer's instructions, using P5-C3 and P6-C4 chemistry, as in (Chen et al., 2014). Oxytricha and Tetrahymena macronuclear DNA were used for SMRT-seq in
Illumina Data Processing
[0157] Reads from all biological replicates were merged before downstream processing. All Illumina sequencing data were quality trimmed (minimum quality score=20) and length-filtered (minimum read length=40 nt) using Galaxy (Blankenberg et al., 2010; Giardine et al., 2005; Goecks et al., 2010). MNase-seq and 6mA IP-seq reads were mapped to complete chromosomes in the Oxytricha trifallax JRB310 (August 2013 build) or Tetrahymena thermophila SB210 macronuclear reference genomes (June 2014 build) using Bowtie2 (Langmead and Salzberg, 2012) with default settings, while poly(A).sup.+ RNA-seq and TSS-seq reads were mapped using TopHat2 (Mortazavi et al., 2008) with August 2013 Oxytricha gene models or June 2014 Tetrahymena gene models, with default settings.
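By way of non-limiting illustration, the quality trimming and length filtering described above can be sketched as follows. The disclosure specifies only the thresholds (minimum quality score of 20, minimum read length of 40 nt) and that Galaxy was used; trimming from the 3′ end, Phred+33 quality encoding, and the function names below are assumptions, and the Galaxy tools actually used may trim differently.

```python
def trim_3prime(seq, qual, min_quality=20, offset=33):
    """Remove trailing bases whose Phred score is below `min_quality`.

    `qual` is a FASTQ quality string; Phred+33 encoding is assumed.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_quality:
        end -= 1
    return seq[:end], qual[:end]


def keep_read(seq, qual, min_quality=20, min_length=40):
    """Return the trimmed (seq, qual) pair, or None if the read is too short."""
    trimmed_seq, trimmed_qual = trim_3prime(seq, qual, min_quality)
    if len(trimmed_seq) < min_length:
        return None
    return trimmed_seq, trimmed_qual
```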
[0158] MNase-seq datasets were generated by paired-end sequencing. Within each MNase-seq dataset, the read pair length of highest frequency was identified. All read pairs with length±25 bp from this maximum were used for downstream analysis. On the other hand, 6mA IP-seq datasets were generated by single-read sequencing. 6mA IP-seq single-end reads were extended to the mean fragment size, computed using cross-correlation analysis (Kharchenko et al., 2008). The per-basepair coverage of Oxytricha MNase-seq read pair centers and extended 6mA IP-seq reads were respectively computed across the genome. Subsequently, the per-basepair coverage values were normalized by the average coverage within each chromosome to account for differences in DNA copy number (and hence, read depth) between Oxytricha chromosomes (Swart et al., 2013). The per-basepair coverage values were then smoothed using a Gaussian filter of standard deviation=15. These smoothed data are denoted as “normalized coverage” or “nucleosome occupancy.” Tetrahymena MNase-seq data were processed similarly to Oxytricha, except that DNA copy number normalization was omitted as Tetrahymena chromosomes have uniform copy number (Eisen et al., 2006).
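By way of non-limiting illustration, the per-chromosome normalization and Gaussian smoothing described above can be sketched as follows. The sketch assumes the standard deviation of 15 is expressed in base pairs, truncates the kernel at four standard deviations, and renormalizes the kernel at chromosome ends; these choices and the function names are assumptions, as the disclosure does not specify the implementation.

```python
import math


def normalize_by_mean(coverage):
    """Divide per-basepair coverage by the mean coverage of the chromosome."""
    mean = sum(coverage) / len(coverage)
    if mean == 0:
        return [0.0] * len(coverage)
    return [c / mean for c in coverage]


def gaussian_kernel(sigma=15.0, truncate=4.0):
    """Discrete Gaussian weights, truncated at `truncate` standard deviations."""
    radius = int(truncate * sigma + 0.5)
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_smooth(values, sigma=15.0):
    """Smooth a coverage track; weights are renormalized near the ends."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        weight_sum = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                acc += w * values[j]
                weight_sum += w
        smoothed.append(acc / weight_sum)
    return smoothed


def normalized_coverage(coverage, sigma=15.0):
    """Normalize by chromosome mean, then smooth, in the order described."""
    return gaussian_smooth(normalize_by_mean(coverage), sigma)
```

Normalizing before smoothing follows the order given in the text; because both operations are linear, the order affects only the treatment of chromosome ends.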
[0159] For the MNase-seq analysis in
[0160] Nucleosome positions were iteratively called as local maxima in normalized MNase-seq coverage, as previously described (Beh et al., 2015). “Consensus”+1, +2, +3 nucleosome positions downstream of the TSS were inferred from aggregate MNase-seq profiles across the genome (
[0161] RNA-seq and TSS-seq read coverage were calculated without normalization by DNA copy number since there is no correlation between Oxytricha DNA and transcript levels (Swart et al., 2013).
[0162] Oxytricha TSSs were called from TSS-seq data using CAGEr (Haberle et al., 2015), with clusterCTSS parameters (threshold=1.6, thresholdIsTpm=TRUE, nrPassThreshold=1, method=“paraclu,” removeSingletons=TRUE, keepSingletonsAbove=5). Only TSSs with tags per million counts>0.1 were used for downstream analysis. Tetrahymena TSSs were obtained from (Beh et al., 2015).
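By way of non-limiting illustration, the tags-per-million filter described above can be sketched as follows. The sketch covers only the final filtering step and does not reproduce the clustering performed by CAGEr, which is an R package; the input format (a mapping from a TSS identifier to its raw tag count) and the function names are assumptions.

```python
def tags_per_million(tag_counts):
    """Convert raw tag counts per TSS into tags per million (TPM)."""
    total = sum(tag_counts.values())
    if total == 0:
        return {}
    return {tss: count * 1e6 / total for tss, count in tag_counts.items()}


def filter_tss(tag_counts, min_tpm=0.1):
    """Keep only TSSs whose TPM is strictly greater than `min_tpm`."""
    tpm = tags_per_million(tag_counts)
    return {tss: value for tss, value in tpm.items() if value > min_tpm}
```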
SMRT-Seq Data Processing
[0163] We processed SMRT-seq data with SMRTPipe v1.87.139483 in the SMRT Analysis 2.3.0 environment using, in order, the P_Fetch, P_Filter (with minLength=50, minSubreadLength=50, readScore=0.75, and artifact=−1000), P_FilterReports, P_Mapping (with gff2Bed=True, pulsemetrics=DeletionQV, IPD, InsertionQV, PulseWidth, QualityValue, MergeQV, SubstitutionQV, DeletionTag, and loadPulseOpts=byread), P_MappingReports, P_GenomicConsensus (with algorithm=quiver, outputConsensus=True, and enableMapQVFilter=True), P_ConsensusReports, and P_ModificationDetection (with identifyModifications=True, enableMapQVFilter=False, and mapQvThreshold=10) modules. All other parameters were set to the default. The Oxytricha August 2013 reference genome build was used for mapping Oxytricha SMRT-seq reads, with Contig10040.0.1, Contig1527.0.1, Contig4330.0.1, and Contig54.0.1 removed, as they are perfect duplicates of other Contigs in the assembly. Tetrahymena SMRT-seq reads were mapped to the June 2014 reference genome build. Only chromosomes with high SMRT-seq coverage (>=80× for Oxytricha; >=100× for Tetrahymena) were used for all 6mA-related analyses.
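By way of non-limiting illustration, the coverage cutoff described above can be sketched as follows. The sketch assumes a per-chromosome coverage value has already been computed (the disclosure does not state how coverage was summarized per chromosome); the contig names in the usage example and the function name are hypothetical.

```python
# Minimum SMRT-seq coverage per species, as stated in the text.
MIN_COVERAGE = {"Oxytricha": 80.0, "Tetrahymena": 100.0}


def high_coverage_chromosomes(coverage_by_chromosome, species):
    """Return the sorted names of chromosomes meeting the species cutoff."""
    threshold = MIN_COVERAGE[species]
    return sorted(
        name
        for name, coverage in coverage_by_chromosome.items()
        if coverage >= threshold
    )


if __name__ == "__main__":
    # Hypothetical contig names and coverage values, for illustration only.
    example = {"contig_x": 79.9, "contig_y": 80.0, "contig_z": 250.0}
    print(high_coverage_chromosomes(example, "Oxytricha"))
```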
Chromosome Synthesis
[0164] Synthetic Contig1781.0 chromosomes were constructed from “building blocks” of native chromosome sequence (
Verification of Synthetic Chromosome Sequences
[0165] All chromosomes were dA-tailed using Klenow Fragment (3′->5′ exo-) (New England Biolabs), cloned using a TOPO TA cloning kit (Thermo Fisher) or StrataClone PCR Cloning Kit (Agilent Technologies), transformed into One Shot TOP10 chemically competent E. coli, and sequenced using flanking T7, T3, M13F, or M13R primers.
Preparation of Oxytricha Histones
[0166] Vegetative Oxytricha trifallax strain JRB310 was cultured as described in the subheading: “Experimental model and subject details” of this study. Cells were starved for 14 hr and subsequently harvested for macronuclear isolation as described in the subheading: “in vivo MNase-seq” of this study. However, formaldehyde fixation was omitted. Purified nuclei were pelleted by centrifugation at 4000×g, resuspended in 0.421 mL 0.4N H.sub.2SO.sub.4 per 10.sup.6 input cells, and nutated for 3 hr at 4° C. to extract histones. Subsequently, the acid-extracted mixture was centrifuged at 21,000×g for 15 min to remove debris. Proteins were precipitated from the cleared supernatant using trichloroacetic acid (TCA), washed with cold acetone, then dried and resuspended in 2.5% v/v acetic acid. Individual core histone fractions were purified from crude acid-extracts using semi-preparative RP-HPLC (Vydac C18, 12 micron, 10 mm×250 mm) with 40%-65% HPLC solvent B over 50 min (
Preparation of Recombinant Xenopus Histones
[0167] All RP-HPLC analyses were performed using 0.1% TFA in water (HPLC solvent A), and 90% acetonitrile, 0.1% TFA in water (HPLC solvent B) as the mobile phases. Wild-type Xenopus H4, H3 C110A, H2B and H2A proteins were expressed in BL21(DE3) pLysS E. coli and purified from inclusion bodies through ion exchange chromatography (Debelouchina et al., 2017). Purified histones were characterized by ESI-MS using a MicrOTOF-Q II ESI-Qq-TOF mass spectrometer (Bruker Daltonics). H4: calculated 11,236 Da, observed 11,236.1 Da; H3 C110A: calculated 15,239 Da, observed 15,238.7 Da; H2A: calculated 13,950 Da, observed 13,949.8 Da; H2B: calculated 13,817 Da, observed 13,816.8 Da.
Preparation of Histone Octamers
[0168] Oxytricha and Xenopus histone octamers were respectively refolded from core histones using established protocols (Beh et al., 2015; Debelouchina et al., 2017). Briefly, lyophilized histone proteins (Xenopus modified or wild-type; Oxytricha acid-extracted) were combined in equimolar amounts in 6 M guanidine hydrochloride, 20 mM Tris pH 7.5 and the final concentration was adjusted to 1 mg/mL. The solution was dialyzed against 2M NaCl, 10 mM Tris, 1 mM EDTA, and the octamers were purified from tetramer and dimer species using size-exclusion chromatography on a Superdex 200 10/300 column (GE Healthcare Life Sciences). The purity of each fraction was analyzed by SDS-PAGE. Pure fractions were combined, concentrated and stored in 50% v/v glycerol at −20° C.
Preparation of Mini-Genome DNA
[0169] 98 full-length chromosomes were individually amplified from Oxytricha trifallax strain JRB310 genomic DNA using Phusion DNA polymerase (New England Biolabs). Primer pairs are listed in Table 2. Amplified chromosomes were separately purified using a MinElute PCR purification kit (QIAGEN), and then mixed in equimolar ratios to obtain “mini-genome” DNA. The sample was concentrated by ethanol precipitation and adjusted to a final concentration of ˜1.6 mg/mL.
Preparation of Native Genomic DNA for Chromatin Assembly
[0170] Macronuclei were isolated from vegetative Oxytricha trifallax strain JRB310 as described in the subheading “in vivo MNase-seq” of this study. However, cells were not fixed prior to nuclear isolation. Genomic DNA was purified using the Nucleospin Tissue kit (Macherey-Nagel). Approximately 200 μg of genomic DNA was loaded on a 15%-40% linear sucrose gradient and centrifuged in a SW 40 Ti rotor (Beckman Coulter) at 160,070×g for 22.5 hr at 20° C. Sucrose solutions were in 1M NaCl, 20 mM Tris pH 7.5, 5 mM EDTA. Individual fractions from the sucrose gradient were analyzed on 0.9% agarose-TAE gels. Fractions containing high molecular weight DNA that migrated at the mobility limit were discarded as such DNA species were found to interfere with downstream chromatin assembly. All other fractions were pooled, ethanol precipitated, and adjusted to 0.5 mg/mL DNA.
Chromatin Assembly and Preparation of Mononucleosomal DNA
[0171] Chromatin assemblies were prepared by salt gradient dialysis as previously described (Beh et al., 2015; Luger et al., 1999), or using mouse NAP1 histone chaperone and Drosophila ACF chromatin remodeler as previously described (An and Roeder, 2004; Fyodorov and Kadonaga, 2003). Details of each chromatin assembly procedure are listed below. To reduce sample requirements while maintaining adequate DNA concentrations for chromatin assembly, synthetic chromosomes were first mixed with a hundred-fold excess of “buffer” DNA (PCR-amplified Oxytricha Contig17535.0). We verified that nucleosome occupancy in the methylated region (qPCR primer pairs 6 and 7) of the synthetic chromosome is unaffected by the presence of buffer DNA (
[0172] For chromatin assembly through salt dialysis: histone octamers and (synthetic chromosome+buffer) DNA were mixed in a 0.8:1 mass ratio, while histone octamers and (native or mini-genome) DNA were mixed in a 1.3:1 mass ratio, each in a 50 μL total volume. Samples were first dialyzed into start buffer (10 mM Tris pH 7.5, 1.4M KCl, 0.1 mM EDTA pH 7.5, 1 mM DTT) for 1 hr at 4° C. Then, 350 mL end buffer (10 mM Tris pH 7.5, 10 mM KCl, 0.1 mM EDTA, 1 mM DTT) was added at a rate of 1 mL/min with stirring. The assembled chromatin was dialyzed overnight at 4° C. into 200 mL end buffer, followed by a final round of dialysis in fresh 200 mL end buffer for 1 hr at 4° C. The assembled chromatin was then adjusted to 50 mM Tris pH 7.9, 5 mM CaCl.sub.2 and digested with MNase (New England Biolabs) to mainly mononucleosomal DNA as previously described (Beh et al., 2015).
[0173] For chromatin assembly using mouse NAP1 and Drosophila ACF: NAP1 was recombinantly expressed and purified as described in (An and Roeder, 2004). ACF was purchased from Active Motif. 0.49 μM NAP1 and 58 nM histone octamer were first mixed in a 302 μl reaction volume containing 62 mM KCl, 1.2% w/v polyvinyl alcohol (Sigma Aldrich), 1.2% w/v polyethylene glycol 8000 (Sigma Aldrich), 25 mM HEPES-KOH pH 7.5, 0.1 mM EDTA-KOH, 10% v/v glycerol, and 0.01% v/v NP-40. The NAP1-histone mix was incubated on ice for 30 min. Meanwhile, “AM” mix was prepared, consisting of 20 mM ATP (Sigma Aldrich), 200 mM creatine phosphate (Sigma Aldrich), 33.3 mM MgCl.sub.2, and 33.3 μg/μl creatine kinase (Sigma Aldrich) in a 56 μl reaction volume. After the 30 min incubation, 5.29 μl of 1.7 μM ACF complex (Active Motif) and the “AM” mix were sequentially added to the NAP1-histone mix. Then, 10.63 μl of native or mini-genome DNA (2.66 μg) was added, resulting in a 374 μl reaction volume. The final mixture was incubated at 27° C. for 2.5 hr to allow for chromatin assembly. Subsequently, CaCl.sub.2 was added to a final concentration of 5 mM, and the chromatin was digested with MNase (New England Biolabs) to mainly mononucleosomal DNA as previously described (Beh et al., 2015).
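As a worked check of the volumes above, the following Python sketch sums the four components (taking the NAP1-histone mix as 302 μl and the “AM” mix as 56 μl) and derives the resulting final concentrations by simple dilution. The variable and function names are illustrative, and the calculation assumes that volumes are additive.

```python
# Component volumes in microliters, taken from the paragraph above.
volumes_ul = {
    "NAP1-histone mix": 302.0,
    "AM mix": 56.0,
    "ACF complex": 5.29,
    "DNA": 10.63,
}
total_ul = sum(volumes_ul.values())  # about 373.92, reported as 374 in the text


def final_conc(stock_conc, volume_ul, total_volume_ul):
    """Concentration after dilution into the final volume (same units as stock)."""
    return stock_conc * volume_ul / total_volume_ul


acf_final_nM = final_conc(1700.0, volumes_ul["ACF complex"], total_ul)  # 1.7 uM stock
atp_final_mM = final_conc(20.0, volumes_ul["AM mix"], total_ul)         # 20 mM in AM mix
dna_final_ng_per_ul = 2660.0 / total_ul                                 # 2.66 ug of DNA
```

Under these assumptions the ACF complex ends up at roughly 24 nM, ATP at roughly 3 mM, and DNA at roughly 7 ng/μl.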
[0174] Mononucleosome-sized DNA from MNase-digested chromatin was gel-purified and used for tiling qPCR on a Viia 7 Real-Time PCR System with Power SYBR Green PCR master mix (Thermo Fisher), or in vitro MNase-seq on an Illumina HiSeq 2500, according to the manufacturer's instructions. qPCR primer sequences are listed in Table 2.
Tiling qPCR Analysis of Nucleosome Occupancy
[0175] qPCR data were analyzed using the ΔΔCt method (Livak and Schmittgen, 2001). At each locus along the synthetic chromosome, ΔCt=(Ct at locus of interest)−(Ct at qPCR primer pair 22, far from the methylated region). See
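A minimal Python sketch of the ΔΔCt calculation follows. The ΔCt definition is the one given above; the choice of control sample for the second Δ (for example, a differently treated chromosome) and the assumption of 100% amplification efficiency (a factor of 2 per cycle) are illustrative assumptions, and all Ct values are hypothetical.

```python
def delta_ct(ct_locus, ct_reference):
    """dCt = Ct at the locus of interest minus Ct at the reference primer pair."""
    return ct_locus - ct_reference


def relative_occupancy(ct_locus, ct_ref, ct_locus_ctrl, ct_ref_ctrl):
    """Fold change 2^(-ddCt) of a sample relative to a control sample.

    Assumes perfect doubling per PCR cycle (Livak and Schmittgen, 2001).
    """
    ddct = delta_ct(ct_locus, ct_ref) - delta_ct(ct_locus_ctrl, ct_ref_ctrl)
    return 2.0 ** (-ddct)


# Hypothetical Ct values: sample dCt = 2, control dCt = 1, so ddCt = 1.
fold = relative_occupancy(24.0, 22.0, 23.0, 22.0)
```

With these example values the locus is half as abundant in the sample as in the control (fold = 0.5).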
ACF Spacing Assay
[0176] ATP-dependent nucleosome spacing was performed in accordance with a previous study (Lieleg et al., 2015). Chromatin was assembled by salt gradient dialysis as described above, and then adjusted to 20 mM HEPES-KOH pH 7.5, 80 mM KCl, 0.5 mM EGTA, 12% v/v glycerol, 10 mM (NH.sub.4).sub.2SO.sub.4, 2.5 mM DTT. Samples were then incubated for 2.5 hr at 27° C. with 3 mM ATP, 30 mM creatine phosphate, 4 mM MgCl.sub.2, 5 ng/μL creatine kinase, and 11 ng/μL ACF complex (Active Motif). Remodeled chromatin was then adjusted to 5 mM CaCl.sub.2 and subjected to MNase digestion, mononucleosomal DNA purification, and qPCR analysis as described above.
Phylogenetic Analysis
[0177] The MTA1 amino acid sequence (UniProt ID: J9IF92_9SPIT) was queried against the NCBI nr database using PSI-BLAST (Altschul et al., 1997; Schaffer et al., 2001) (maximum e-value=1e.sup.−4; enable short queries and filtering of low complexity regions). Retrieved hits were collapsed using CD-HIT (Huang et al., 2010) with minimum sequence identity=0.97 to remove redundant sequences. The resulting sequences were added to existing MT-A70 alignments from (Greer et al., 2015) using MAFFT (--add) (Katoh et al., 2017; Kuraku et al., 2013). Gaps and duplicate sequences were removed from the merged alignment. Only sequences corresponding to the taxa in
[0178] The above procedure was also used for constructing phylogenetic trees from p1 (UniProt ID: Q22VV9_TETTS) and p2 (UniProt ID: I7M8B9_TETTS). However, protein sequences were aligned using MAFFT without adding to an existing alignment.
Preparation of Nuclear Extracts with DNA Methyltransferase Activity
[0179] Vegetative Tetrahymena cells were grown in SSP medium to log-phase (˜3.5×10.sup.6 cells/mL) and collected by centrifugation at 2,300×g for 5 min in an SLA-3000 rotor. The supernatant was discarded, and cells were resuspended in medium B (10 mM Tris pH 6.75, 2 mM MgCl.sub.2, 0.1M sucrose, 0.05% w/v spermidine trihydrochloride, 4% w/v gum arabic, 0.63% w/v 1-octanol, and 1 mM PMSF). Gum arabic (Sigma Aldrich) was prepared as a 20% w/v stock and centrifuged at 7,000×g for 30 min to remove undissolved clumps. For each volume of cell culture, one-third volume of medium B was added to the Tetrahymena cell pellet. Cells were resuspended and homogenized in a chilled Waring Blender (Waring PBB212) at high speed for 40 s. The resulting lysate was subsequently centrifuged at 2,750×g for 5 min in an SLA-3000 rotor to pellet macronuclei. The nuclear pellet was washed twice with medium B and then five times in MM medium (10 mM Tris-HCl pH 7.8, 0.25M sucrose, 15 mM MgCl.sub.2, 0.1% w/v spermidine trihydrochloride, 1 mM DTT, 1 mM PMSF). Macronuclei were pelleted between wash steps by centrifuging at 2,500×g for 5 min in an SLA-3000 rotor. Finally, the total number of washed macronuclei was counted with a hemocytometer using a Zeiss ID03 microscope. Nuclear proteins were extracted by vigorously resuspending the pellet in MM salt buffer (10 mM Tris-HCl pH 7.8, 0.25M sucrose, 15 mM MgCl.sub.2, 350 mM NaCl, 0.1% w/v spermidine trihydrochloride, 1 mM DTT, 1 mM PMSF). 1 mL MM salt buffer was added per 2.33×10.sup.8 macronuclei. The viscous mixture was nutated for 45 min at 4° C., and then cleared at 175,000×g for 30 min at 4° C. in a SW 41 Ti rotor. Following this, the supernatant was dialyzed in a Slide-A-Lyzer 3.5K MWCO cassette (Thermo Fisher) overnight at 4° C. against two changes of MM minus medium (10 mM Tris-HCl pH 7.8, 15 mM MgCl.sub.2, 1 mM DTT, 0.5 mM PMSF). The dialysate was then centrifuged at 7,197×g for 1 hr at 4° C. to remove precipitates, and dialyzed overnight in a Slide-A-Lyzer 3.5K MWCO cassette (Thermo Fisher) at 4° C. against two changes of MN3 buffer (30 mM Tris-HCl pH 7.8, 1 mM EDTA, 15 mM NaCl, 20% v/v glycerol, 1 mM DTT, 0.5 mM PMSF). The final dialysate was cleared by centrifugation at 7,197×g for 1.5 hr at 4° C., flash frozen, and stored at −80° C. This nuclear extract was used for all subsequent biochemical fractionation and 6mA methylation assays.
Partial Purification of MTA1c from Nuclear Extracts
[0180] Tetrahymena nuclear extracts were passed through a HiTrap Q HP column (GE Healthcare) and eluted using a linear gradient of 15 mM to 650 mM NaCl in 30 mM Tris-HCl pH 7.8, 1 mM EDTA, 20% v/v glycerol, 1 mM DTT, 0.5 mM PMSF, over 30 column volumes. Each fraction was assayed for DNA methyltransferase activity using radiolabeled SAM as described in the next section. The DNA methyltransferase activity eluted in two peaks, at ˜60 mM and ˜365 mM NaCl, termed the “low salt sample” and “high salt sample.” Fractions corresponding to each peak were pooled and passed through a HiTrap Heparin HP column (GE Healthcare). Bound proteins were eluted using a linear gradient of 60 mM to 1M NaCl (for the low salt sample) or 350 mM to 1M NaCl (for the high salt sample) over 30 column volumes. Fractions with DNA methyltransferase activity were respectively pooled and dialyzed into 10 mM sodium phosphate pH 6.8, 100 mM NaCl, 10% v/v glycerol, 0.3 mM CaCl.sub.2, 0.5 mM DTT (for the low salt sample); or 30 mM Tris-HCl pH 7.8, 1 mM EDTA, 200 mM NaCl, 10% v/v glycerol, 1 mM DTT, 0.2 mM PMSF (for the high salt sample). The dialyzed low salt sample was passed through a Nuvia cPrime column (Bio-Rad) and eluted using a linear gradient of 100 mM to 1M NaCl in 50 mM sodium phosphate pH 6.8, 10% v/v glycerol, 0.5 mM DTT. Separately, the dialyzed high salt sample was fractionated using a Superdex 200 10/300 GL column (GE Healthcare) in 30 mM Tris-HCl pH 7.8, 1 mM EDTA, 200 mM NaCl, 10% v/v glycerol, 1 mM DTT. Fractions from the Nuvia cPrime and Superdex 200 columns were dialyzed into 30 mM Tris-HCl pH 7.8, 1 mM EDTA, 15 mM NaCl, 20% v/v glycerol, 1 mM DTT, 0.5 mM PMSF and assayed for DNA methyltransferase activity. Those with qualitatively low, medium, and high activity were subjected to mass spectrometry to identify candidate methyltransferase proteins (
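To relate the reported elution salt concentrations to positions in a linear gradient, a small Python sketch is given here. It assumes an ideal linear gradient with no system delay volume, which real chromatography systems only approximate; the function names are illustrative.

```python
def salt_at(cv, start_mM, end_mM, gradient_cv=30.0):
    """NaCl concentration (mM) after `cv` column volumes of an ideal linear gradient."""
    if not 0.0 <= cv <= gradient_cv:
        raise ValueError("cv must lie within the gradient")
    return start_mM + (end_mM - start_mM) * cv / gradient_cv


def cv_at(salt_mM, start_mM, end_mM, gradient_cv=30.0):
    """Column volumes into the gradient at which a given NaCl concentration is reached."""
    return gradient_cv * (salt_mM - start_mM) / (end_mM - start_mM)


# First anion-exchange step: 15 mM to 650 mM NaCl over 30 column volumes.
low_salt_peak_cv = cv_at(60.0, 15.0, 650.0)    # about 2.1 column volumes
high_salt_peak_cv = cv_at(365.0, 15.0, 650.0)  # about 16.5 column volumes
```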
Recombinant Expression of MTA1, MTA9, p1, and p2 Proteins
[0181] Full length MTA1, MTA9, p1, and p2 open reading frames were codon-optimized for bacterial expression and cloned into a pET-His6-SUMO vector using ligation independent cloning. Protein sequences are listed in Table 3. The vector was a gift from Scott Gradia (Addgene plasmid #29659; http://addgene.org/29659; RRID: Addgene_29659). Mutations in the MTA1 open reading frame were introduced using the Q5® Site-Directed Mutagenesis Kit (New England Biolabs). For recombinant expression, pET-His6-SUMO-MTA1 (wild-type and mutant) was transformed into SHuffle T7 competent E. coli (New England Biolabs); pET-His6-SUMO-MTA9 was transformed into Lemo (DE3) competent E. coli (New England Biolabs); pET-His6-SUMO-p1 and pET-His6-SUMO-p2 were transformed into BL21(DE3) competent E. coli (New England Biolabs). IPTG induction was performed at 16° C. overnight. Induced cells were resuspended in 25 ml of lysis buffer B (50 mM Tris pH 7.8, 300 mM NaCl, 5% v/v glycerol, 10 mM imidazole, 5 mM BME, 1 mM PMSF, 0.5× ProBlock Gold Bacterial protease inhibitor cocktail [GoldBio]). The cells were sonicated at 35% amplitude for a total of 4 minutes, with a 10 s off, 10 s on cycle using a Model 505 Sonic Dismembrator (Fisherbrand). Lysates were cleared by centrifugation at 30,000×g for 30 min at 4° C., mixed with pre-washed Ni-NTA agarose (Invitrogen), and nutated for 45 min at 4° C. The resin was subsequently washed with lysis buffer and eluted in 50 mM Tris pH 7.8, 300 mM NaCl, 5% v/v glycerol, 400 mM glycerol, 5 mM BME, 1× ProBlock Gold bacterial protease inhibitor cocktail [GoldBio]. Eluates were dialyzed into lysis buffer B and then digested with TEV protease (gift from S.H. Sternberg) at 4° C. overnight. The resulting mixture was passed through a fresh batch of Ni-NTA agarose (Invitrogen) to remove cleaved affinity tags. The flow-through containing each recombinant protein was flash frozen and used for all downstream methyltransferase assays.
Methyltransferase Assays
Generation of DNA and RNA Substrates
[0182] A 954 bp dsDNA PCR product was used in all assays involving Tetrahymena nuclear extract. This substrate was amplified by PCR from Tetrahymena thermophila strain SB210 macronuclear genomic DNA using PCR primers metGATC_F2 and metGATC_R2 (Table 2). The resulting product was purified using Ampure XP beads (Beckman Coulter). This 954 bp region of the genome contains a high level of 6mA in vivo. Thus, the underlying DNA sequence may be intrinsically amenable to methylation by Tetrahymena MTA1. Note that the amplified 954 bp product is devoid of DNA methylation as unmodified dNTPs were used for PCR. Separately, a 350 bp dsDNA PCR product was used in all assays involving recombinant MTA1, MTA9, p1 and p2. This sequence lacks 5′-NATC-3′ motifs, and was used to reduce background DNA methylation from contaminating Dam methyltransferase in recombinant protein preparations. The 350 bp dsDNA PCR product was amplified from Tetrahymena thermophila strain SB210 macronuclear genomic DNA using the PCR primers noGATC2_F and noGATC2_R (Table 2), and purified using Ampure XP beads (Beckman Coulter).
[0183] For short DNA substrates (<50 bp), oligonucleotides were purchased from Integrated DNA Technologies and either directly used as ssDNA, or annealed with its complementary sequence to obtain dsDNA. To prepare hemimethylated 27 bp dsDNA in
[0184] To generate ˜350 nt ssRNA and ˜350 bp dsRNA, the aforementioned 350 bp dsDNA was first PCR-amplified using primers containing T7 overhangs (primer pairs T7noGATC2_F2/noGATC2_R and T7noGATC2_F2/T7noGATC2_R2 respectively; see Table 2 for primer sequences). Each PCR product was used as a template for in vitro transcription using the HiScribe T7 High Yield RNA Synthesis Kit (New England Biolabs). The synthesized RNA was rigorously treated with DNase (Thermo Fisher) and purified using acid phenol:chloroform extraction, followed by two rounds of chloroform extraction. Each sample was subsequently ethanol precipitated and resuspended in water for use in methyltransferase assays.
Radioactive Methyltransferase Assay
[0185] For experiments involving nuclear extract, 2.18 μg of 954 bp dsDNA substrate was mixed with 4-8 μl nuclear extract and 0.64 μM .sup.3H-labeled S-adenosyl-L-methionine ([.sup.3H]SAM) in 33 mM Tris-HCl pH 7.5, 6 mM EDTA, 4.3 mM BME, in a 15 μl reaction volume. For experiments involving recombinant MTA1c protein components (i.e., MTA1, MTA9, p1, and/or p2), ˜3 μM oligonucleotide ssDNA or annealed dsDNA was used. Alternatively, 1.3 μg of 350 bp dsDNA substrate (or an equimolar amount of ˜350 nt ssRNA or ˜350 bp dsRNA) was used in place of DNA oligonucleotide substrates. ssRNA was heated at 90° C. for 2 min and snap cooled to minimize secondary structures before mixing with other components of the methyltransferase assay. All samples were incubated overnight at 37° C., and subsequently spotted onto 1 cm×1 cm squares of Hybond-XL membrane (GE Healthcare). Membranes were then washed thrice with 0.2M ammonium bicarbonate, once with distilled water, twice with 100% ethanol, and finally air-dried for 1 hr. Each membrane was immersed in 5 mL Ultima Gold (PerkinElmer) and used for scintillation counting on a TriCarb 2910 TR (PerkinElmer).
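Because substrates are compared on an equimolar basis, it can help to convert the stated DNA masses to molar amounts. The sketch below uses the common approximation of about 650 g/mol per base pair of dsDNA; this value is an assumption that is not stated in the text, and RNA substrates would require a different per-nucleotide mass.

```python
AVG_MASS_PER_BP = 650.0  # g/mol per base pair of dsDNA (approximation, assumed here)


def dsdna_pmol(mass_ug, length_bp, mass_per_bp=AVG_MASS_PER_BP):
    """Convert a mass of dsDNA (micrograms) into picomoles of molecules."""
    grams = mass_ug * 1e-6
    moles = grams / (length_bp * mass_per_bp)
    return moles * 1e12


substrate_350bp_pmol = dsdna_pmol(1.3, 350)   # roughly 5.7 pmol
substrate_954bp_pmol = dsdna_pmol(2.18, 954)  # roughly 3.5 pmol
```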
Non-Radioactive Methyltransferase Assay
[0186] For assays involving nuclear extract: 5.5 μg of 954 bp DNA substrate was mixed with 20 nuclear extract and 0.2 mM S-adenosyl-L-methionine (NEB) in 33 mM Tris-HCl pH 7.5, 6 mM EDTA, 4.3 mM BME in a 15 μl reaction volume. For assays involving recombinant MTA1c protein components (i.e., MTA1, MTA9, p1, and/or p2), 2.6 μg of 350 bp DNA substrate was mixed with 540 nM MTA1, 90 nM MTA9, 1.5 μM p1, and 1.0 μM p2 proteins. The band of expected size in each recombinant protein preparation was compared against a series of BSA standards to calculate protein concentration. All methylation reactions were incubated at 37° C. overnight, then purified using a MinElute purification kit (QIAGEN), denatured at 95° C. for 10 min, and snap cooled in an ice water bath. Samples were spotted on a Hybond N+ membrane (GE Healthcare), air-dried for 5 min and UV-cross-linked with 120,000 μJ/cm.sup.2 exposure using an Ultra-Lum UVC-515 Ultraviolet Multilinker. The cross-linked membrane was blocked in 5% milk in TBST (containing 0.1% v/v Tween) and incubated with 1:1,000 anti-N6-methyladenosine antibody (Synaptic Systems) at 4° C. overnight. The membrane was then washed three times with TBST, incubated with 1:3,000 Goat anti-rabbit HRP antibody (Bio-Rad) at room temperature for 1 hr, washed another three times with 1×TBST, and developed using Amersham ECL Western Blotting Detection Kit (GE Healthcare). This dot blot assay was used to measure 6mA levels in
Quantitative Mass Spectrometry Analysis of dA and 6mA
[0187] 10.5 μg Oxytricha or Tetrahymena macronuclear genomic DNA was first digested to nucleosides by mixing with 14 μl DNA degradase plus enzyme (Zymo Research) in a 262.5 μl reaction volume. Samples were incubated at 37° C. overnight, then 70° C. for 20 min to deactivate the enzyme.
[0188] The internal nucleoside standards .sup.15N.sub.5-dA and D.sub.3-6mA were used to quantify endogenous dA and 6mA levels in ciliate DNA. .sup.15N.sub.5-dA was purchased from Cambridge Isotope Laboratories, while D.sub.3-6mA was synthesized as described in the following section. Nucleoside samples were spiked with 1 ng/μl .sup.15N.sub.5-dA and 200 pg/μl D.sub.3-6mA in an autosampler vial. Samples were loaded onto a 1 mm×100 mm C18 column (Ace C18-AR, Mac-Mod) using a Shimadzu HPLC system and PAL auto-sampler (20 μl/injection) at a flow rate of 70 μl/min. The column was connected inline to an electrospray source coupled to an LTQ-Orbitrap XL mass spectrometer (Thermo Fisher). Caffeine (2 pmol/μl in 50% Acetonitrile with 0.1% FA) was injected as a lock mass through a tee at the column outlet using a syringe pump at 0.5 μl/min (Harvard PHD 2000). Chromatographic separation was achieved with a linear gradient from 10% to 99% B (A: 0.1% Formic Acid, B: 0.1% Formic Acid in Acetonitrile) in 5 min, followed by 5 min wash at 100% B and equilibration for 10 min with 1% B (total 20 min program). Electrospray ionization was achieved using a spray voltage of 4.50 kV aided by sheath gas (Nitrogen) flow rate of 18 (arbitrary units) and auxiliary gas (Nitrogen) flow rate of 2 (arbitrary units). Full scan MS data were acquired in the Orbitrap at a resolution of 60,000 in profile mode from the m/z range of 190-290. A parent mass list was utilized to acquire MS/MS spectra at a resolution of 7500 in the Orbitrap. LC-MS data were manually interpreted in Xcalibur's Qual browser (Thermo, Version 2.1) to visualize nucleoside mass spectra and to generate extracted ion chromatograms by using the theoretical [M+H].sup.+ within a range of ±2 ppm. Peak areas were extracted in Skyline (Ver. 3.5.0.9319).
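The ±2 ppm extraction window and the use of spiked isotope-labeled standards can be sketched in Python as follows. This is an illustration under stated assumptions rather than the workflow actually run in Xcalibur and Skyline: the m/z value and peak areas are hypothetical, and the quantification assumes that the labeled standard and the endogenous nucleoside give equal signal per unit amount.

```python
def ppm_window(theoretical_mz, tolerance_ppm=2.0):
    """Return (low, high) m/z bounds for an extracted ion chromatogram."""
    delta = theoretical_mz * tolerance_ppm * 1e-6
    return theoretical_mz - delta, theoretical_mz + delta


def quantify_by_internal_standard(area_analyte, area_standard, standard_conc):
    """Estimate analyte concentration from the analyte/standard peak-area ratio.

    Assumes equal response of the analyte and its isotope-labeled standard.
    """
    return standard_conc * area_analyte / area_standard


# Hypothetical inputs for illustration only.
low, high = ppm_window(266.1248)  # example m/z chosen for the sketch
conc_pg_per_ul = quantify_by_internal_standard(5.0e5, 1.0e6, 200.0)  # 200 pg/ul spike
```

With these example numbers, an analyte peak half the size of the 200 pg/μl standard peak corresponds to an estimated 100 pg/μl.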
Synthesis of D.sub.3-6mA Nucleoside
[0189] 2′-Deoxyadenosine and CD3I were purchased from Sigma Aldrich. Flash chromatography was performed on a Biotage Isolera using silica columns (Biotage SNAP Ultra, HP-Sphere 25 μm). Semi-preparative RP-HPLC was performed on a Hewlett-Packard 1200 series instrument equipped with a Waters XBridge BEH C18 column (5 μm, 10×250 mm) at a flow rate of 4 mL/min, eluting using A (0.1% formic acid in H.sub.2O) and B (0.1% formic acid in 9:1 MeCN/H.sub.2O). .sup.1H NMR spectra were recorded on a Bruker UltraShield Plus 500 MHz instrument. Data for .sup.1H NMR are reported as follows: chemical shift (δ ppm), multiplicity (s=singlet, br=broad signal, d=doublet, dd=doublet of doublets) and coupling constant (Hz) where possible. .sup.13C NMR spectra were recorded on a Bruker UltraShield Plus 500 MHz.
[0190] D.sub.3-6mA (2′-Deoxy-6-[D3]-methyladenosine) were synthesized and purified according to (Schiffers et al., 2017). After an initial purification by flash column chromatography, the methylated compounds were further purified by semipreparative RP-HPLC (linear gradient of 0% to 20% B over 30 min) affording the desired compounds in 14% and 10% yields respectively after lyophilization.
2′-Deoxy-6-[D3]-methyladenosine
[0191] .sup.1H NMR (500 MHz, D.sub.2O) δ 7.98 (s, 1H), 7.77 (s, 1H), 6.17 (m, 1H), 4.54 (m, 1H), 4.10 (m, 1H), 3.79 (dd, J=12.7, 3.2 Hz, 1H), 3.71 (dd, J=12.7, 4.3 Hz, 1H), 2.60 (m, 1H), 2.44 (ddd, J=14.0, 6.3, 3.3 Hz, 1H).
[0192] .sup.13C NMR (126 MHz, D.sub.2O) δ 154.0, 151.5, 146.1, 138.9, 118.4, 87.3, 84.3, 71.1, 61.6, 39.2, 26.4 ppm. (Peak at 26.4 ppm appears as a broad signal. C-D coupling is not resolved).
[0193] HR-MS (ESI+): m/z calculated for [C.sub.11H.sub.13D.sub.3N.sub.5O.sub.3].sup.+ ([M+H].sup.+): 269.1436, found 269.1421.
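The calculated m/z above can be reproduced from monoisotopic atomic masses, as in the following Python sketch. The atomic masses are standard literature values rounded to seven or eight decimal places, and the helper names are illustrative.

```python
# Monoisotopic atomic masses (u) and the electron mass (u).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "D": 2.01410178,
    "N": 14.0030740,
    "O": 15.9949146,
}
ELECTRON = 0.00054858


def cation_mz(formula, charge=1):
    """m/z of a cation whose formula already includes the added proton(s)."""
    mass = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return (mass - charge * ELECTRON) / charge


def ppm_error(observed, theoretical):
    """Mass error in parts per million."""
    return 1e6 * (observed - theoretical) / theoretical


ion = {"C": 11, "H": 13, "D": 3, "N": 5, "O": 3}  # [C11H13D3N5O3]+
mz = cation_mz(ion)             # approximately 269.1436
err = ppm_error(269.1421, mz)   # approximately -5.6 ppm
```

The computed value agrees with the calculated m/z reported above, and the observed value lies about 5.6 ppm below it.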
Mass Spectrometry Analysis of Proteins in Tetrahymena Nuclear Extracts
[0194] Samples were topped up to 200 μl with 50 mM ammonium bicarbonate pH 8. TCEP was added to 5 mM final concentration and left to incubate at 60° C. for 10 min. 15 mM chloroacetamide was then added and left to incubate in the dark at room temperature for 30 min. 1 μg of Trypsin Gold (Promega) was added to each sample and incubated end-over-end at 37° C. for 16 hr. An additional 0.25 μg of Trypsin Gold was added and incubated end-over-end at 37° C. for 3 hr. Samples were acidified by adding TFA to 0.2% final concentration, and desalted using SDB stage-tips (Rappsilber et al., 2007). Samples were dried completely in a speedvac and resuspended in 20 μl of 0.1% formic acid pH 3. 5 μl was injected per run using an Easy-nLC 1200 UPLC system. Samples were loaded directly onto a 45 cm long 75 μm inner diameter nano capillary column packed with 1.9 μm C18-AQ (Dr. Maisch, Germany) mated to metal emitter in-line with an Orbitrap Fusion Lumos (Thermo Scientific, USA). The mass spectrometer was operated in data dependent mode with the 120,000 resolution MS1 scan (AGC 4e5, Max IT 50 ms, 400-1500 m/z) in the Orbitrap followed by up to 20 MS/MS scans with CID fragmentation in the ion trap. Dynamic exclusion list was invoked to exclude previously sequenced peptides for 60 s if sequenced within the last 30 s, and maximum cycle time of 3 s was used. Peptides were isolated for fragmentation using the quadrupole (1.6 Da window). Ns was utilized. Ion-trap was operated in Rapid mode with AGC 2e3, maximum IT of 300 msec and minimum of 5000 ions.
[0195] Raw files were searched using Byonic (Bern et al., 2012) and Sequest HT algorithms (Eng et al., 1994) within the Proteome Discoverer 2.1 suite (Thermo Scientific, USA). 10 ppm MS1 and 0.4 Da MS2 mass tolerances were specified. Carbamidomethylation of cysteine was used as fixed modification, while oxidation of methionine, pyro-Glu from Gln and deamidation of asparagine were specified as dynamic modifications. Trypsin digestion with a maximum of 2 missed cleavages was allowed. Files were searched against the Tetrahymena thermophila macronuclear reference proteome (June 2014 build), supplemented with common contaminants (27,099 total entries).
[0196] Scaffold (version Scaffold 4.8.7, Proteome Software Inc., Portland, Oreg.) was used to validate MS/MS based peptide and protein identifications. Peptide identifications were accepted if they could be established at greater than 93.0% probability. Peptide Probabilities from Sequest and Byonic were assigned by the Scaffold Local FDR algorithm. Protein identifications were accepted if they could be established at greater than 99.0% probability to achieve an FDR less than 1.0% and contained at least 3 identified peptides. Protein probabilities were assigned by the Protein Prophet algorithm (Nesvizhskii et al., 2003). Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony.
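The acceptance criteria above can be expressed as a simple filter, sketched here in Python. This is a simplified illustration and not Scaffold's implementation: the data structure, the protein names, and the probabilities are hypothetical, and probabilities are written as fractions rather than percentages.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinHit:
    """Hypothetical container for one protein identification."""
    name: str
    protein_probability: float
    peptide_probabilities: list = field(default_factory=list)


def accepted_peptides(hit, min_peptide_prob=0.93):
    """Peptides identified at greater than 93.0% probability."""
    return [p for p in hit.peptide_probabilities if p > min_peptide_prob]


def is_accepted(hit, min_protein_prob=0.99, min_peptides=3):
    """Protein accepted at >99.0% probability with at least 3 accepted peptides."""
    return (hit.protein_probability > min_protein_prob
            and len(accepted_peptides(hit)) >= min_peptides)


hits = [
    ProteinHit("protein_x", 0.999, [0.99, 0.97, 0.95, 0.60]),  # passes
    ProteinHit("protein_y", 0.999, [0.99, 0.97, 0.90]),        # only 2 good peptides
    ProteinHit("protein_z", 0.950, [0.99, 0.98, 0.97]),        # protein probability too low
]
accepted_names = [hit.name for hit in hits if is_accepted(hit)]
```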
Generation of Mta1 Mutant Lines
[0197] A frameshift mutation in the MTA1 gene was created by inserting a small non-coding DNA segment immediately downstream of the MTA1 start codon (
[0198] ssRNA was generated by in vitro transcription using a Hi-Scribe T7 High Yield RNA Synthesis Kit (New England Biolabs). The DNA template for in vitro transcription consists of the ectopic DNA segment flanked by 100-200 bp cognate MTA1 sequence. Following DNase treatment, ssRNA was acid-phenol:chloroform extracted and ethanol precipitated. After precipitation, ssRNA was resuspended in nuclease-free water (Ambion) to a final concentration of 1 to 3 mg/mL for injection.
ssRNA Microinjections
[0199] Oxytricha cells were mated by mixing 3 mL of each mating type, JRB310 and JRB510, along with 6 mL of fresh Pringsheim media. At 10 to 12 hr post mixing, pairs were isolated and placed in Volvic water with 0.2% bovine serum albumin (Jackson ImmunoResearch Laboratories) (Fang et al., 2012). ssRNA constructs were injected into the macronuclei of paired cells under a light microscope as previously described with DNA constructs (Nowacki et al., 2008). After injection, cells were pooled in Volvic water. At 60 to 72 hr post mixing, the pooled cells were singled out to grow clonal injected cell lines. As clonal population size grew, lines were transferred to 10 cm Petri dishes and grown in Pringsheim media. Only water from the “Volvic” brand has been empirically tested in our laboratory to support Oxytricha growth. Similar products from other vendors have not been tested.
Survival Analysis of Oxytricha Mta1 Mutants
[0200] This experiment was performed in
Quantification and Statistical Analysis
[0201] All statistical tests were performed in Python (v2.7.10) or R (v3.2.5), and described in the respective Figure and Table legends.
Data and Software Availability
[0202] Oxytricha SMRT-seq data are deposited in SRA under the accession numbers SRA: SRX2335608 and SRX2335607, and GEO: GSE94421. Tetrahymena SMRT-seq and all Oxytricha Illumina data are deposited in NCBI GEO under accession number GEO: GSE94421.
TABLE-US-00002 TABLE 1 Protein sequences for phylogenetic tree construction. Protein sequences for phylogenetic analysis of MT-A70 proteins (including MTA1 and MTA9) >NP_495127.1 DNA N6-methyl methyltransferase [Caenorhabditis elegans] (SEQ ID No: 1) MDTEFAILDEEKYYDSVFKELNLKTRSELYEISSKFMPDSQFEAIKRRGISNRKRKIKETSENSNRMEQMALKIKNVG TELKIFKKKSILDNNLKSRKAAETALNVSIPSASASSEQIIEFQKSESLSNLMSNGMINNWVRCSGDKPGIIENSDGTK FYIPPKSTFHVGDVKDIEQYSRAHDLLFDLIIADPPWFSKSVKRKRTYQMDEEVLDCLDIPVILTHDALIAFWITNRIGI EEEMIERFDKWGMEVVATWKLLKITTQGDPVYDFDNQKHKVPFESLMLAKKKDSMRKFELPENFVFASVPMSVHS HKPPLLDLLRHFGIEFTEPLELFARSLLPSTHSVGYEPFLLQSEHVFTRNISL >NP_564080.1 Methyltransferase MT-A70 family protein [Arabidopsis thaliana] (SEQ ID No: 2) MAKTDKLAQFLDSGIYESDEFNWFFLDTVRITNRSYTRFKVSPSAYYSRFFNSKQLNQHSSESNPKKRKRKQKNSS FHLPSVGEQASNLRHQEARLFLSKAHESFLKEIELLSLTKGLSDDNDDDDSSLLNKCCDDEVSFIELGGVWQAPFYE ITLSFNLHCDNEGESCNEQRVFQVFNNLVVNEIGEEVEAEFSNRRYIMPRNSCFYMSDLHHIRNLVPAKSEEGYNLI VIDPPWENASAHQKSKYPTLPNQYFLSLPIKQLAHAEGALVALWVTNREKLLSFVEKELFPAWGIKYVATMYWLKV KPDGTLICDLDLVHHKPYEYLLLGYHFTELAGSEKRSDFKLLDKNQIIMSIPGDFSRKPPIGDILLKHTPGSQPARCLE >ORY94237.1 MT-A70-domain-containing protein [Syncephalastrum racemosum] (SEQ ID No: 3) MIVASSDTCDIVDCEAAFGIDGTVRLRPGDFSLGTPYFTSRLGQKRPRPDDDTLDNTPSDTIHAIVQQLPVMAPDY WHDRPMEAVVMNAHVHFPSLVSLAEASLRFDPDNDEDEDNRQILRPDMALESLQVFYRHFEHPKDSPILIRVQDAY YWIPPRTAFMMGSLENIHLPTLGKFDCIVMDPPWPNKSVRRSAHYETQEDIYDLFAIPLPQLAQPNCLVAVWVTNK PKFIRFVQKLFAAWDVEPLTTWYWLKVTTHGEPVCPIDSPHRKPYEHLILGRKRPVKININDPPALPRVLVSVPSKH HSRKPPLNDILMRYLPSDARRLELFARCLTPGWTSWGNECLKFQHVDYFYDTNEAMEEGKQK >ORX58127.1 MT-A70-domain-containing protein [Hesseltinella vesiculosa] (SEQ ID No: 4) MANAARRFAQQDELPLDVSQDLQDLPLLDLFNRKVINDSDQCSSLHVASFGQYLVPRHTKFVMSDLDNIDLLRSEN DVFDLIVMDPPWPNKSVHRSTDYETQDIYDLFHLPIKSLIKNQGLVAVWVTNKPKYRRFILDKLFKAWQMTCVGEW LWLKVTSSGEPVFPLDSPHRKPYEQLILGRYQPDDTSPTLPNPPQQHVLISVPSIRHSRKPPLGEVLADFLPKQPAC LELFARCLTPGWTSWGNECLKFQHESYFISNDTPHSPSAS >ORZ15132.1 MT-A70-like protein, partial [Absidia repens] (SEQ ID No: 5) 
YDLVVMDPPWPNKSVHRSSHYETQDIYDLYQIPLTSLVHKNSLVAVWITNKPKYRRFVMDKLFKSWHVDCVAEWT WLKVTNDGEPVFPLNSTHRKPYEQLIIGRYNGGSGGGNDNNDSIQEESEVKPIPYQHSIVSVPSKRHSRKPPLQDL LQPYLPAKPRCLELFARCLTPGWSSWGNECLKFQNEYYYTRIENPLHIDRSDV >XP_021679935.1 MT-A70-domain-containing protein [Lobosporangium transversale] (SEQ ID No: 6) MLHESTVSVLDRLILISHISLQTYLLAKDREGFDIIVMDPPWQNASVDRMSHYRTMDLYELFKIPIPDLLKANGSNVG GIVAVWITNKAKVKRVVVEKLFPAWGLDLVAHWFWLKVTTKGEPVLSLSNSHRRAYEGVLIGRQRQGSKLSNKTM HETSASNPVNRLLVSIPAQHSRKPSLNALIEEEFFTSKLESRADRDRNAYVDSEALVKKPLYRLELFARNLEEGVLS WGNEPLRYQYCGRGASNSQVVQDGYLIPCPIQSELVSQ >XP_689178.3 methyltransferase-like protein 4 isoform X1 [Danio rerio] (SEQ ID No: 7) MSVVCCNSWGWLLDSSSHIDKDFQRCVCYNEANGLEENTHFTCCFKRQYFNILMPHMQQSTAMSGFPLDSGKH DSAEHEKIELQTRKKRKRKHHDLNTGEIEANIYHDKVRSVVLEGSRALLEAGRQCGYFTEALTESQTISTPSESTSA HECQLAAFCDLAKQLPLSEESPVHTLSRDGQNPALDLFSSITENPFDCACEITFMRERYLLPPRCRFLLSDVTRMDP LVNSGDKFDLIVLDPPWENKSVKRSNRYSSLPSSQLKKLPVPALAAPGGLVVTWVTNRAKHRRFVREELYPHWAV EVLAEWLWVKVTRSGEFVFPLDSQHKKPYEVLVLGRCRSTSDHTDRCSAVNELPDQRLLVSVPSTLHSHKPSLAA VLKPYIRREPRCLELFARSLQSDWSCWGNEVLKFQHCSYFSRHTDQEPTSDTLQRTHSHLQSTGLLETPETAR >NP_073751.3 methyltransferase-like protein 4 isoform 1 [Homo sapiens] (SEQ ID No: 8) MSVVHQLSAGWLLDHLSFINKINYQLHQHHEPCCRKKEFTTSVHFESLQMDSVSSSGVCAAFIASDSSTKPENDDG GNYEMFTRKFVFRPELFDVTKPYITPAVHKECQQSNEKEDLMNGVKKEISISIIGKKRKRCVVFNQGELDAMEYHTKI RELILDGSLQLIQEGLKSGFLYPLFEKQDKGSKPITLPLDACSLSELCEMAKHLPSLNEMEHQTLQLVEEDTSVTEQD LFLRVVENNSSFTKVITLMGQKYLLPPKSSFLLSDISCMQPLLNYRKTFDVIVIDPPWQNKSVKRSNRYSYLSPLQIQ QIPIPKLAAPNCLLVTWVTNRQKHLRFIKEELYPSWSVEVVAEWHWVKITNSGEFVFPLDSPHKKPYEGLILGRVQE KTALPLRNADVNVLPIPDHKLIVSVPCTLHSHKPPLAEVLKDYIKPDGEYLELFARNLQPGWTSWGNEVLKFQHVDY FIAVESGS >XP_020951799.1 methyltransferase-like protein 4 isoform X1 [Sus scrofa] (SEQ ID No: 9) MSVVHQLSSGWLLDHLSFINKISYELHQHHEPCCSKNEPTSVHLDSLHKDSVFSFGASPAFIASSSKPENDDGGNR EMSMQKYVFRSELFDVTKPYITSAIHKECQQSNEKEDLANDVKKEASISIKRKKRKRCVVFNQGELDAMEYHTKIRG 
LILDGSSQLIQEGLKSGFLHPLSEKCDKCSKPVTLPLDTCSLSELCEMAKHVPSLNEMELQTLQLMEDDISVTEQDLF SRIVENNSSFTKMITLMGQKYLLPPKSSFLLSDISCIYPLLNCRKTYDVIVIDPPWQNKSVKRSNRYSYLSPLQIKQIPI PKLAAPNCLVVTWVTNRQKHLRFVKEELYPSWSVEIVAEWHWVKITNSGEFVFPIDSPHKKPYEVLVLGRVRERAA LLLSRNAEVKELSIPDRKLIVSVPCILHSHKPPLAEVLKDYIKPEGEYLELFARNLQPGWTSWGNEVLKFQHMDYFVA LESRS >XP_011245012.1 PREDICTED: methyltransferase-like protein 4 isoform X2 [Mus musculus] (SEQ ID No: 10) MSVVHHLPPGWLLDHLSFINKVNYQLCQHQESFCSKNNPTSSVYMDSLQLDPGSPFGAPAMCFAPDFTTVSGND DEGSCEVITEKYVFRSELFNVTKPYIVPAVHKERQQSNKNENLVTDYKQEVSVSVGKKRKRCIAFNQGELDAMEYH TKIRELILDGSSKLIQEGLRSGFLYPLVEKQDGSSGCITLPLDACNLSELCEMAKHLPSLNEMELQTLQLMGDDVSVI ELDLSSQIIENNSSFSKMITLMGQKYLLPPQSSFLLSDISCMQPLLNCGKTFDAIVIDPPWENKSVKRSNRYSSLSPQ QIKRMPIPKLAAADCLIVTWVTNRQKHLCFVKEELYPSWSVEVVAEWYWVKITNSGEFVFPLDSPHKKPYECLVLG RVKEKTPLALRNPDVRIPPVPDQKLIVSVPCVLHSHKPPLTGYLNSSFATLIPRVSNNMEYCRVVRTAFIA >XP_018079135.1 PREDICTED: methyltransferase-like protein 4 [Xenopus laevis] (SEQ ID No: 11) MSVVCETSAGWLVDELSLLRKWYQHSTSCQDAAHKKQLYDIKEDLFLILRPHIPVQSTPAPLPILCPETNPGTINQR KKRKRSCAFNQGELDAMEYHKKIIDFIMEGTQPLLQEGFKRLFLRPVLVNDDDHSQTEPRLCNNPCQLAELCNMAK CMPLLNPGEHAVQVLERGIYLPQETNVLSCITENKSECPEVIQFMGEKYIIPPKSTFLMSDVSCMEPLLHYKRYNIIVM DPPWENKSVKRSKRYSSLSPNEIQQLPVPVLAAPDCLVITWVTNKQKHLRFVKEDLYPHWSVKTLGEWHWVKITR SGEFVFPLDSTHKKPYEVLIIGRFKGAGNSTARKSEICLPPIPERKLIVSVPCKLHSHKPPLSEILKEYVKPDLECLELF ARNLQPGWTSWGNEVLKFQHIDYFTPVDVED >NP_650573.1 uncharacterized protein Dmel_CG14906 [Drosophila melanogaster] (SEQ ID No: 12) MLKLQKKTEDSKFAVFLDHKTLINEAYDEFKLKSELFQFHAKKTDKGIEEDKTRKRKRKAGVEDASSLEDLHLVNEY LELLSKPVEPEDSSPMKRHWEDGYNVPQLHGANESGRMQRFLRVDGSRGVYLIPNQSRFFNHNVDNLPALLHQLL PAYDLIVLDPPWRNKYIRRLKRAKPELGYSMLSNEQLSHIPLSKLTHPRSLVAIWCTNSTLHQLALEQQLLPSWNLR LLHKLRWYKLSTDHELIAPPQSDLTQKQPYEMLYVACRSDASENYGKDIQQTELIFSVPSIVHSHKPPLLSWLREHLL LDKDQLEPNCLELFARYLHPHFTSIGLEVLKLMDERLYEVRKVEHCNQEEVN >tr|A8J2E1|A8J2E1_CHLRE Predicted protein OS = Chlamydomonas reinhardtii OX = 3055 GN = CHLREDRAFT_174824 PE = 3 SV = 1 (SEQ ID No: 13) 
MATLPGAAAAAPGANAEVGVPEPSLEPQDALQQRIALAEGLLALNEADAMQAWQQLPREALLEQVAKYRGAVRD MASALRSSTLPGGVPPHCVPIHANVTTFDWPSLYSHAQFDVIMMDPPWQLATANPTRGVALGYSQLNDDHISRLP VPQLQRQGGYLFVWVINAKYKWTLDLFDRWGYRLVDEVVWVKMTVNRRLAKSHGYYLQHAKEVCLVAKRGNPP VPPGCEGGVGSDIIFSERRGQSQKPEEIYHLIEQLVPNGRYLEIFARKNNLRNYWVSIGNEVTGTGLPDEDMQALRD LHHIPGAVYGKNAPHLVSKLFLYAPNSSREEG >XP_021880122.1 MT-A70-domain-containing protein [Lobosporangium transversale] (SEQ ID No: 14) MLDQINIDIEQLEASLDIDEGKAHSNNASGTGCLIGTGTSSGNASNGAGVADEDLEEEVDDLEEFEAPEWCVPIKAN VMTYDWDSLAAECQFDVILMDPPWQLATHAPTRGVAIAYQQLPDICIEELPVPKLSSNGFIFIWVINNKYAKAFDLM RRWGYSYVDDITWVKQTVNRRMAKGHGYYLQHAKETCLVGKKGEDPPGCRHSIGSDVIFSERRGQSQKPEELYE LIEELVPNGRYLEIFGRKNNLRDYWVTVGNEL >ORX69627.1 MT-A70-domain-containing protein [Linderina pennispora] (SEQ ID No: 15) MDVDSSSPAVVLQALRQREQKIRSRILVLEQEISDLEKRCGVEGSGDAANKVTEADLEEFKAPEWSVPIRANVMNF DWEKLAQACQFDVILMDPPWQLASQAPTRGVAIAYQQLPDVCIESLPIDLLQTSGFIFIWVINNKYTKAFQLMKQWG YKYVDDIAWVKQTVNRRMAKGHGYYLQHAKETCLVGKKGPDPPNLRRSVASDVIFSERRGQSQKPEELYEIIEQLV PGGRYLEIFGRKNNLRDYWVTVGNEL >ORX98979.1 allantoinase [Basidiobolus meristosporus CBS 931.73] (SEQ ID No: 16) MSAIIFTGNRVLFDSTSKVEPATIHVDPWTGRIVKITNKRSTKADFPGIEDKDFVDAGDDLIMPGVIDAHVHLNEPGR TDWEGFDTATRAAAAGGLTTVIDMPLNSIPPTTTLENLNTKKEAAKPQAWVDVGFYGGVIPGNADQLRPMIAAGVC GFKCFLIESGVDEFPCVNEEEVRKAFAEFDGTDNVFMFHAEMECDDHSHETAAPQSTDPSAYQTFLQSRPHALEV KAIEMIIRVCKDFPNVRAHIVHLSSAEALPMIRKAKAEGVKLTVETCYHYLTLNAEDIINGATHFKCCPPIREGSNRELL WEALLDGTIDYVVSDHSPCTPELKRFDSGDFTAAWGGISSLQFGLSLLWTEAKRRGCTLQDLTRWLSQNTARHAG ILNRKGRLQIGSDADIVIWSPEETFVVDKKMIHFKNKVTPYENMTLHGAVKKTFVRGRNVYDKSTAQLFSAKPLGNL LARFQVYSNPITAMPSYAQPPSSDNGDFEEESEDYIESDEVDEDLRELLAKETSLRLRIDSLKEEILKLEREQRGETD GSKNEGEGGEEEIDLEEFEAPEWCVPIKANVMTFEWKRLAEAAQFDVILMDPPWQLATHAPTRGVAIGYQQLPDV CIEELPIPLLQKNGFIFIWVINNKYVKAFELMAKWGYRYVDDITWVKQTVNRRMAKGHGYYLQHAKETCLIGKKGED PPNCRHSVCSDVIFSERRGQSQKPEELYEMIEQLVPNGKYLEIFGRKNNLRDYWVTIGNEL >ORZ00623.1 MT-A70-domain-containing protein [Syncephalastrum racemosum] (SEQ ID No: 17) 
MSSREESPSSVSGFDLDTIDESTVTDTTLKNLLRREIELQLQIDALQTEILQIEESTAAGKNNKNDEELDPQDLEEFEA PEWCVPIKANVMTFDWEALASEVQFDVIVADPPWQLATHAPTRGVAIGYQQLPDVCIEEIPIQKLQKNGFIFIWVINN KYAKAFELMERWGYHYVDDITWVKQTVNRRMAKGHGYYLQHAKETCLVGKKGEDPPNCRHSVGSDVIFSERRG QSQKPEELYELIEELVPNGKYLEIFGRKNNLRDYWVTVGNEL >ORZ06213.1 MT-A70-domain-containing protein [Absidia repens] (SEQ ID No: 18) MTSDTSAMTADVLNRKRKRSPAMNGDDLSNNSDEADNNTTTGTTTSVDSNENDYQEQDREPILRLPRLNDAKLLE EVVDDVDYEDQPERYDFDFKKLWLQERGLMERIDGLLKDIARLTDFKGHYRDMVIPSDDEDDLDDEDSKAQYDAP EWCVPIKANVMTFDWESLGKEVQFDVIMADPPWQLATHAPTRGVAISYQQLPDVCIEDLPLEKLQTNGFLFIWVIN NKYAKAFEMMEKWGYKYVDDITWVKQTVNRRMAKGHGYYLQHAKETCLVGVKGTLPPYCRRSVGSDVIYSERRG QSQKPEQIYELIEEMMPGGKYLEIFGRKNNLRDYWITVGNEL >ORX43344.1 MT-A70-domain-containing protein [Hesseltinella vesiculosa] (SEQ ID No: 19) MASESNISRESSPASISSTNSESGIENVQSLTDEDLKQLILKEMNLKEHIEQLQRKISKLTANDLSTNQDSSDADDDLL NGDETMDDDSSSGSDSEVSGNEDIASVKSSPHAADKSESESESESDEGSSEDGNDEEDEFEAPKWCVPIKANVM TFDWEKLASETQFDVIVADPPWQLATHAPTRGVAIAYQQLPDVCIEDLPIEKLQTNGFIFIWVINNKYAKAFELMEKM GYTYVDDITWVKQTVNRRMAKGHGYYLQHAKETCLVGKKGVDPPSCRHSVGSDIIFSERRGQSQKPEELYELIEEL VPNGKYLEIFGRKNNLRDYWVTVGNEL >ORX52920.1 MT-A70 protein [Piromyces finnis] (SEQ ID No: 20) MMIVANEIDYEEFTAPEWCIPIKANVIDFEWDKLASECQFDAILMDPPWQLATHAPTRGVAIAYQQLPDQFIEELPIE KLQKNGFIFIWVINNKYVKAFELMKKWGYTFVDDITWVKQTVNRRMAKGHGYYLQHAKETCLVGKKGEDPVGCKH SISSDVIYSVRRGQSQKPEELYEMIEELIPNGKYLEIFGRKNNLRDYWVTIGNEL >ORX86973.1 MT-A70-domain-containing protein [Anaeromyces robustus] (SEQ ID No: 21) MDEKEVENSVLDSSNIEKSNATTSNMDVDETSNNETSTAIIKSEDGANSYDDFLKLDFTPEEEKDEVLKKLIERETEL KLKIEKEIEGIKNLELKGFSALTQKDEDVQDIDYEEFTAPEWCIPIKANVIDFEWDKLASECQFDAILMDPPWQLATHA PTRGVAIAYQQLPDQFIEELPIEKLQKNGFIFIWVINNKYVKAFELMKKWGYTFVDDITWVKQTVNRRMAKGHGYYL QHAKETCLVGKKGDDPVGCRHKISSDVIYSVRRGQSQKPEELYEMIEELIPNGKYLEIFGRKNNLRDYWVTIGNEL >XP_001032074.3 MT-a70 family protein [Tetrahymena thermophila SB210] (SEQ ID No: 22) MKKEQQFLIFKKSLIIAQKRKEINIKQLKQQFKNFLFVQIFSIIKLKLQDIIIKFKMSKAVNKKGLRPRKSDSILDHIKNKLD 
QEFLEDNENGEQSDEDYDQKSLNKAKKPYKKRQTQNGSELVISQQKTKAKASANNKKSAKNSQKLDEEEKIVEEE DLSPQKNGAVSEDDQQQEASTQEDDYLDRLPKSKKGLQGLLQDIEKRILHYKQLFFKEQNEIANGKRSMVPDNSIPI CSDVTKLNFQALIDAQMRHAGKMFDVIMMDPPWQLSSSQPSRGVAIAYDSLSDEKIQNMPIQSLQQDGFIFVWAIN AKYRVTIKMIENWGYKLVDEITWVKKTVNGKIAKGHGFYLQHAKESCLIGVKGDVDNGRFKKNIASDVIFSERRGQS QKPEEIYQYINQLCPNGNYLEIFARRNNLHDNWVSIGNEL >EJY88228.1 MT-A70 family protein [Oxytricha trifallax] (SEQ ID No: 23) MNQSSQDITTQKSSNGFNPQTQPETLIQVIRKESTFIFKYRKNPYYVPPPISSQTSPNLEVETSNDLNQMSDYEGQI PNNYEINRNSTQFTNNDDQSDNDFYDNNSITTMQIDTSTAKILNNGPLEYNPDLPNKEQKLKDSQVMQNQPPTATS TNSQQRTLQELINIMPSIEDISQQCKQQQQLKIQAKANSTQSASTANAANGGKGRKRGRTVRFDQPLLGKVRQRN GDASDDEEPDEIEMLIRRLHTDILNDARNDPVEQAKKIRQARESQSDQTNSTTQLSVYERMILGSASQQSTDHQPG EFSNMFRTLEDEQIEINQNFLFDEYDSEDDSIADDKVEIASDDEQMLLQEHKKRGKKYLQDEIVKEEDFDEDDDSDE DIHMDDLENESLSFDRNNRKSHKPVCKRTREENILDADLGDEKDDEDTIFIDNLPSDEFSIRRQLQDVKSYIKQFEML FFEEEDSDKEEQLKQITNVQKHEEALQNFKDRSHLKNFWCIPLSSDVREIDWDVLIARQQEHTNGQLFDVITCDPP WQLSSANPTRGVAIAYETLNDGEILKIPWGRLQKDGFLFIWVINAKYRFALDMMGAHGYRVVDEIQWVKQTCNGKI AKGHGYYLQHAKEVCLVGCKGDPAILAKKCRSNIESDVIFSERRGQSQKPEEIYELVEALVPNGYYMEIFGRRNNLH NGWVTVGNEL >EJY79437.1 MT-A70 family protein [Oxytricha trifallax] (SEQ ID No: 24) MHLPMQIITQNMFRQGNQHSCLNRTEILRTPRLTRSTKTELQEQTHFSKLPRRNYLKLQIDMREIQSLVDKKVKESA AAQQQLSQSGIEDSAIKRSLRPRKVENYKNMLEGDEITLKTIQDEQIEVKRKKREASSQNRLEDEDEDEDMLEVGQ QIERASDDEDDDDFPISTRRSARKRTRRQDVDEDEEAIEVNQVESSDAEVEIPANDIDTESYTEGTNKRKQKLKAKK QVLDKKKNKTEGDIDKEDAVEEEETVFIDNLPNDEFEIRRMLKEVKKHIKSLEKQFFEEEDSEKEEELKQINNNSKHE EALQAFKETSHLKQFWCIPLSVNVTTLDFDLLAKSQMKQGGRLFDVITIDPPWQLSSANPTRGVAIAYDTLNDKEILN MPFEKVQTDGFLFIWVINAKYRFALEMMEKFGYKLVDEIAWVKQTVNGKIAKGHGYYLQHAKETCLVGVKGNVKGK ARYNIESDVIFSQRRGQSQKPEEIYEIAEALVPNGYYLEIFGRRNNLHNGWVTIGNEL >NP_066012.1 N6-adenosine-methyltransferase non-catalytic subunit [Homo sapiens] (SEQ ID No: 25) MDSRLQEIRERQKLRRQLLAQQLGAESADSIGAVLNSKDEQREIAETRETCRASYDTSAPNAKRKYLDEGETDEDK MEEYKDELEMQQDEENLPYEEEIYKDSSTFLKGTQSLNPHNDYCQHFVDTGHRPQNFIRDVGLADRFEEYPKLREL 
IRLKDELIAKSNTPPMYLQADIEAFDIRELTPKFDVILLEPPLEEYYRETGITANEKCWTWDDIMKLEIDEIAAPRSFIFL WCGSGEGLDLGRVCLRKWGYRRCEDICWIKTNKNNPGKTKTLDPKAVFQRTKEHCLMGIKGTVKRSTDGDFIHA NVDIDLIITEEPEIGNIEKPVEIFHIIEHFCLGRRRLHLFGRDSTIRPGWLTVGPTLTNSNYNAETYASYFSAPNSYLTG CTEEIERLRPKSPPPKSKSDRGGGAPRGGGRGGTSAGRGRERNRSNFRGERGGFRGGRGGAHRGGFPPR >NP_964000.2 N6-adenosine-methyltransferase non-catalytic [Mus musculus] (SEQ ID No: 26) MDSRLQEIRERQKLRRQLLAQQLGAESADSIGAVLNSKDEQREIAETRETCRASYDTSAPNSKRKCLDEGETDEDK VEEYKDELEMQQEEENLPYEEEIYKDSSTFLKGTQSLNPHNDYCQHFVDTGHRPQNFIRDVGLADRFEEYPKLRELI RLKDELIAKSNTPPMYLQADIEAFDIRELTPKFDVILLEPPLEEYYRETGITANEKCWTWDDIMKLEIDEIAAPRSFIFL WCGSGEGLDLGRVCLRKWGYRRCEDICWIKTNKNNPGKTKTLDPKAVFQRTKEHCLMGIKGTVKRSTDGDFIHA NVDIDLIITEEPEIGNIEKPVEIFHIIEHFCLGRRRLHLFGRDSTIRPGWLTVGPTLTNSNYNAETYASYFSAPNSYLTG CTEEIERLRPKSPPPKSKSDRGGGAPRGGGRGGTSAGRGRERNRSNFRGERGGFRGGRGGTHRGGFTPR >XP_003129279.3 N6-adenosine-methyltransferase subunit METTL14 [Sus scrofa] (SEQ ID No: 27) MDSRLQEIRERQKLRRQLLAQQLGAESADSIGAVLNSKDEQREIAETRETCRASYDTSTPNAKRKYQDEGETDEDK IEEYKDELEMQQEEENLPYEEEIYKDSSTFLKGTQSLNPHNDYCQHFVDTGHRPQNFIRDVGLADRFEEYPKLRELI RLKDELIAKSNTPPMYLQADIEAFDIRELTPKFDVILLEPPLEEYYRETGITANEKCWTWDDIMKLEIDEIAAPRSFIFL WCGSGEGLDLGRVCLRKWGYRRCEDICWIKTNKNNPGKTKTLDPKAVFQRTKEHCLMGIKGTVKRSTDGDFIHA NVDIDLIITEEPEIGNIEKPVEIFHIIEHFCLGRRRLHLFGRDSTIRPGWLTVGPTLTNSNYNAETYASYFSAPNSYLTG CTEEIERLRPKSPPPKSKSDRGGGAPRGGGRGGTSAGRGRERNRSNFRGERGGFRGGRGGAHRGGFPPR >XP_018099063.1 PREDICTED: N6-adenosine-methyltransferase subunit METTL14 isoform X2 [Xenopus laevis] (SEQ ID No: 28) MNSRLQEIRARQTLRRKLLAQQLGAESADSIGAVLNSKDEQREIAETRETSRASYDTSAAVSKRKLPEEGKADEEV VQECKDSVEPQKEEENLPYREEIYKDSSTFLKGTQSLNPHNDYCQHFVDTGHRPQNFIRDVGLADRFEEYPKLREL IRLKDELIAKSNTPPMYLQADLENFDLRELKSEFDVILLEPPLEEYFRETGIAANEKWWTWEDIMKLDIEGIAGSRAFV FLWCGSGEGLDFGRMCLRKWGFRRSEDICWIKTNKDNPGKTKTLDPKAIFQRTKEHCLMGIKGTVHRSTDGDFIH ANVDIDLIITEEPEIGNIEKPVEIFHIIEHFCLGRRRLHLFGRDSTIRPDQSWEERLANSGGLREKEFLVGLLLGLLLPTA 
TLIQRLMLLTLTLQIHLLLDAQRRSKDSVPKLHLLSQIVALGHREEEDEVEHLQVAERGAGKGTEAVLGETEGISEDV EDHIGVSLLPVDFKCF >NP_996954.1 N6-adenosine-methyltransferase non-catalytic subunit [Danio rerio] (SEQ ID No: 29) MNSRLQEIRERQKLRRQLLAQQLGAESPDSIGAVLNSKDEQKEIEETRETCRASFDISVPGAKRKCLNEGEDPEED VEEQKEDVEPQHQEESGPYEEVYKDSSTFLKGTQSLNPHNDYCQHFVDTGHRPQNFIRDGGLADRFEEYPKQRE LIRLKDELISATNTPPMYLQADPDTFDLRELKCKFDVILIEPPLEEYYRESGIIANERFWNWDDIMKLNIEEISSIRSFVF LWCGSGEGLDLGRMCLRKWGFRRCEDICWIKTNKNNPGKTKTLDPKAVFQRTKEHCLMGIKGTVRRSTDGDFIH ANVDIDLIITEEPEMGNIEKPVEIFHIIEHFCLGRRRLHLFGRDSTIRPGWLTVGPTLTNSNFNIEVYSTHFSEPNSYLS GCTEEIERLRPKSPPPKSMAERGGGAPRGGRGGPAAGRGDRGRERNRPNFRGDRGGFRGRGGPHRGFPPR >NP_609205.1 methyltransferase like 14 [Drosophila melanogaster] (SEQ ID No: 30) MSDVLKSSQERSRKRRLLLAQTLGLSSVDDLKKALGNAEDINSSRQLNSGGQREEEDGGASSSKKTPNEIIYRDSS TFLKGTQSSNPHNDYCQHFVDTGQRPQNFIRDVGLADRFEEYPKLRELIKLKDKLIQDTASAPMYLKADLKSLDVKT LGAKFDVILIEPPLEEYARAAPSVATVGGAPRVFWNWDDILNLDVGEIAAHRSFVFLWCGSSEGLDMGRNCLKKW GFRRCEDICWIRTNINKPGHSKQLEPKAVFQRTKEHCLMGIKGTVRRSTDGDFIHANVDIDLIISEEEEFGSFEKPIEI FHIIEHFCLGRRRLHLFGRDSSIRPGWLTVGPELTNSNFNSELYQTYFAEAPATGCTSRIELLRPKSPPPNSKVLRG RGRGFPRGRGRPR >NP_567348.2 Methyltransferase MT-A70 family protein [Arabidopsis thaliana] (SEQ ID No: 31) MKKKQEESSLEKLSTWYQDGEQDGGDRSEKRRMSLKASDFESSSRSGGSKSKEDNKSVVDVEHQDRDSKRERD GRERTHGSSSDSSKRKRWDEAGGLVNDGDHKSSKLSDSRHDSGGERVSVSNEHGESRRDLKSDRSLKTSSRDE KSKSRGVKDDDRGSPLKKTSGKDGSEVVREVGRSNRSKTPDADYEKEKYSRKDERSRGRDDGWSDRDRDQEGL KDNWKRRHSSSGDKDQKDGDLLYDRGREREFPRQGRERSEGERSHGRLGGRKDGNRGEAVKALSSGGVSNEN YDVIEIQTKPHDYVRGESGPNFARMTESGQQPPKKPSNNEEEWAHNQEGRQRSETFGFGSYGEDSRDEAGEASS DYSGAKARNQRGSTPGRTNFVQTPNRGYQTPQGTRGNRPLRGGKGRPAGGRENQQGAIPMPIMGSPFANLGMP PPSPIHSLTPGMSPIPGTSVTPVFMPPFAPTLIWPGARGVDGNMLPVPPVLSPLPPGPSGPRFPSIGTPPNPNMFFT PPGSDRGGPPNFPGSNISGQMGRGMPSDKTSGGWVPPRGGGPPGKAPSRGEQNDYSQNFVDTGMRPQNFIRE LELTNVEDYPKLRELIQKKDEIVSNSASAPMYLKGDLHEVELSPELFGTKFDVILVDPPWEEYVHRAPGVSDSMEYW TFEDIINLKIEAIADTPSFLFLWVGDGVGLEQGRQCLKKWGFRRCEDICWVKTNKSNAAPTLRHDSRTVFQRSKEH 
CLMGIKGTVRRSTDGHIIHANIDTDVIIAEEPPYGSTQKPEDMYRIIEHFALGRRRLELFGEDHNIRAGWLTVGKGLSS SNFEPQAYVRNFADKEGKVWLGGGGRNPPPDAPHLVVTTPDIESLRPKSPMKNQQQQSYPSSLASANSSNRRTT GNSPQANPNVVVLHQEASGSNFSVPTTPHWVPPTAPAAAGPPPMDSFRVPEGGNNTRPPDDKSFDMYGFN >PNW88915.1 hypothetical protein CHLRE_01g050600v5 [Chlamydomonas reinhardtii] (SEQ ID No: 32) MQDGQGPPGDGRGRGRGRSRGGRIMFAREGGRGPRPMHSDMGPPPPPMGMFPHDPSAMMGGPMPGMPPM DFTPEMLLTMMGAGLGGPMGLAGPMGMMMPDFGAAAAGAPGGMMVPPGAMMPPPPQPPSGGPGGMGGGGM GGMGGMMGHQQGMGGAGGPMGLPGGGMGMGMGGGGGGGGGGGYGGRGGHGEAGGGGGGGGRAGGAG GGGGAGGAAEHLSNDYSQNFVDTGLRPQNFLRDTHLTDRYEEYPKLKELIVRKDRQVSAHATPPLFLRTDLRSTRL SPELFGTKFDVILVDPPWEEYVRRAPGMVADPEVWSWQDIQALDIEAVADNPCFLFLWCGAEEGLEAGRVCMQK WGFRRVEDICWIKTNKEGGKGPGGGRRPYLTAANQHPESMLVHTKEHCLMGIKGSVRRATDGHIIHTNVDTDVIV SEEPELGSTRKPEEMYHIIERFCNGRRRLELFGEDHNIRNGWVTVGRSLTSSNFSAKAYADHFRNRDGSVWVQNT YGPKPPPGSVILVPTTDEIEDLRPKSPTGPHGGSSFHHSR >XP_001022374.1 MT-a70 family protein [Tetrahymena thermophila SB210] (SEQ ID No: 33) MQPQQNQNQQQQQQQQSQQQQQQNQQLPQLQQSMSSQQQQNQQQEKQIIIKRGTTSKRNDYCQNFVNTHER PQNFIMNIRPEERFIEYPKLQDLIKFKDDLIKKRNHPPVYLKADLKYYDLSKLGKFDVIMMDPPWKEYEERVQGLPIYS QYPEKFNSWDLNEIAALPIDEISDKPSFLFLWVGSDHLDQGRELFRKWGYKRCEDIVWVKTNKDKTKEYIELPHSNL LVRVKEHCLVGLRGDVKRASDSHFIHANIDTDVIVAEEPPLGSTQKPAEIYDIIERFCLGRKRLELFGEVHNVRQGWL TIGKLLDESNFNQDEYNSWFDGDKTYPQIQTYRGGRYVGTTPDIEQLRPKSPTKNNQMNSNQNMSGSQVSEFDL GIQQKQQKLNQQF >NP_009876.1 Kar4p [Saccharomyces cerevisiae S288C] (SEQ ID No: 34) MAFQDPTYDQNKSRHINNSHLQGPNQETIEMKSKHVSFKPSRDFHTNDYSNNYIHGKSLPQQHVTNIENRVDGYP KLQKLFQAKAKQINQFATTPFGCKIGIDSIVPTLNHWIQNENLTFDVVMIGCLTENQFIYPILTQLPLDRLISKPGFLFI WANSQKINELTKLLNNEIWAKKFRRSEELVFVPIDKKSPFYPGLDQDDETLMEKMQWHCWMCITGTVRRSTDGHLI HCNVDTDLSIETKDTTNGAVPSHLYRIAENFSTATRRLHIIPARTGYETPVKVRPGWVIVSPDVMLDNFSPKRYKEEI ANLGSNIPLKNEIELLRPRSPVQKAQ >XP_001691478.1 predicted protein [Chlamydomonas reinhardtii] (SEQ ID No: 35) MRLGGGPGGSELDDLLGKRSVKEKVKVEKGSELLDILSKPTARESARVEQFRTAGGSAIREHCPHLTKDECRRVN GVPLACHRLHFLRVVQPHTDVALGNCSYLDTCRNMRTCKYVHYRPDPEPDVPGMGSEMARLRASVPKKPVGDG 
QTSRGALDPQWINCDVRSFDMTVLGKFGVIMADPPWEIHQDLPYGTMKDDEMVNLNVGCLQDNGVLFLWVTGRA MELARECMAKWGYKRVDELIWVKTNQLQRLIRTGRTGHWLNHSKEHCLVGIKGSPQLNRYVDTDVVVAEVRETS RKPDEMYSLLERLSPGTRKLEIFARVHNCKPGWVGLGNQLKNVNLIEPEVRQRFAARYGFEPDASKDCFVN >NP_192814.1 mRNAadenosine methylase [Arabidopsis thaliana] (SEQ ID No: 36) METESDDATITVVKDMRVRLENRIRTQHDAHLDLLSSLQSIVPDIVPSLDLSLKLISSFTNRPFVATPPLPEPKVEKKH HPIVKLGTQLQQLHGHDSKSMLVDSNQRDAEADGSSGSPMALVRAMVAECLLQRVPFSPTDSSTVLRKLENDQNA RPAEKAALRDLGGECGPILAVETALKSMAEENGSVELEEFEVSGKPRIMVLAIDRTRLLKELPESFQGNNESNRVVE TPNSIENATVSGGGFGVSGSGNFPRPEMWGGDPNMGFRPMMNAPRGMQMMGMHHPMGIMGRPPPFPLPLPLP VPSNQKLRSEEEDLKDVEALLSKKSFKEKQQSRTGEELLDLIHRPTAKEAATAAKFKSKGGSQVKYYCRYLTKEDC RLQSGSHIACNKRHFRRLIASHTDVSLGDCSFLDTCRHMKTCKYVHYELDMADAMMAGPDKALKPLRADYCSEAE LGEAQWINCDIRSFRMDILGTFGVVMADPPWDIHMELPYGTMADDEMRTLNVPSLQTDGLIFLWVTGRAMELGRE CLELWGYKRVEEIIWVKTNQLQRIIRTGRTGHWLNHSKEHCLVGIKGNPEVNRNIDTDVIVAEVRETSRKPDEMYA MLERIMPRARKLELFARMHNAHAGWLSLGNQLNGVRLINEGLRARFKASYPEIDVQPPSPPRASAMETDNEPMAID SITA >EAS00013.2 N6-adenosine-methyltransferase 70 kDa subunit [Tetrahymena thermophila SB210] (SEQ ID No: 37) MGSSVKDQEISNKKHKARNSSSGANNNSNSSNYQSSKRDIHQDRSYSKDDSQSRQYNSNNGGGGSSSKNSNRN SSQQGYNQNSSSNQGQNSEYGGSGSGKNSQANSQRNSSQQGLQQLNQQQQSQQQQQQMLQNQMNSMGMM NQFQNSFGLMGMQPSQPLQLLNPSMIIPSGKKQKYDFLEFPPSSQHEFRAILLDYFLSDLFDYPMHSAELFENFIEA FSDIKDSSSFIKKLELIPLLQELNDKKAIKLETCAVGTKLFDFIVDINKDKIKQLSREFSKDRPKFMPILDKKPQPSSSKT NSSSTTAPPKQAISKREIEDLLKKETGLQKEVITQSKEKSNLLNKISAAEESALAIFRKQGSRRIDYCDCGTRDKCIQIR NSTVPCNKAHFRKIIRPHTDENLGNCSYLDTCRHMDYCKFVHYELDVDINNMNNDNLLLDGIEKKLNPQWINCDLR QIDFNILGKFNCIMADPPWDIHMTLPYGTLKDREMKAMRVDLLQEEGVIFLWVTGRAMELGRECLTNWGYRRVEEI IWVKTNQLQRIIRTGRTGHWLNHSKEHCLVGIKGNPKINRKIDCDVIVSEVRETSRKPDEIYNLIERMCPGGKKIELFG RPHNTMPGWLTLGNQLPGIYLEDEEIIERYMDAYPDQDISRETMERNRIRMKNENDIDHIYNSHIQNIPPFKTKQLTK DLQLQQQSSSMQTTQQQSSSQMMPQMQQQQSSQSINSNTDLQMHGNGLYEQE >ORX92345.1 MT-A70-domain-containing protein [Basidiobolus meristosporus CBS 931.73] (SEQ ID No: 38) 
MKLERALFKMADMWGYNTIGIKREYDNDKSAISVIYFDPRNLRNVQHIEKTLEDICDVDSIDPDIFLDKTTSAQVPSTY IPNEEARFSEDAEIEKLLSKPSFLEMEAFSSLIGVTELIERKTFREQEAEEMFKAQGNGGFREFCEYLIKEDCKKMNT SGQPCAMTASILLTNMKLHFRRIMRPQTDLELGDCSYLNTCHRMDTCKYVHYELDDFEHPSSANITKTTIPTSLIFRP PKKVLPAQWINCDVRKFDFSILGKFSVIMADPPWDIHMTLPYGTMTDDEMKAMAIHKLQDEGLIFLWVTARAMELG RECLATWGYDRVDEVVWIKTNQLQRLIRTGRTGHWLNHSKEHCLVGIKGDPSRFNIGLACDVLVAEVRETSRKPD QIYGMIDRLSPGTRKIEIFGRQHNTRPGWFTLGNQLKDVRIVEPEVLEAYNQRYPECPAQLSAIPES >AJR96662.1 Ime4p [Saccharomyces cerevisiae YJM1248] (SEQ ID No: 39) MINDKLVHFLIQNYDDILRAPLSGQLKDVYSLYISGGYDDEMQKLRNDKDEVLQFEQFWNDLQDIIFATPQSIQFDQN LLVADRPEKIVYLDVFSLKILYNKFHAFYYTLKSSSSSCEEKVSSLTTKPEADSEKDQLLGRLLGVLNWDVNVSNQGL PREQLSNRLQNLLREKPSSFQLAKERAKYTTEVIEYIPICSDYSHASLLSTAVYIVNNKIVSLQWSKISACQENHPGLI ECIQSKIHFIPNIKPQTDISLGDCSYLDTCHKLNMCRYIHYLQYIPSCLQERADRETAIENKRIRSNVSIPFYTLGNCSA HCIKKALPAQWIRCDVRKFDFRVLGKFSVVIADPAWNIHMNLPYGTCNDIELLGLPLHELQDEGIIFLWVTGRAIELG KESLNNWGYNVINEVSWIKTNQLGRTIVTGRTGHWLNHSKEHLLVGLKGNPKWINKHIDVDLIVSMTRETSRKPDE LYGIAERLAGTHARKLEIFGRDHNTRPGWFTIGNQLTGNCIYEMDVERKYQEFMKSKTGTSHTGTKKIDKKQPSKL QQQHQQQYWNNMDMGSGKYYAEAKQNPMNQKHTPFESKQQQKQQFQTLNNLYFAQ >NP_651204.1 methyltransferase like 3 [Drosophila melanogaster] (SEQ ID No: 40) MADAWDIKSLKTKRNTLREKLEKRKKERIEILSDIQEDLTNPKKELVEADLEVQKEVLQALSSCSLALPIVSTQVVEKI AGSSLEMVNFILGKLANQGAIVIRNVTIGTEAGCEIISVQPKELKEILEDTNDTCQQKEEEAKRKLEVDDVDQPQEKTI KLESTVARKESTSLDAPDDIMMLLSMPSTREKQSKQVGEEILELLTKPTAKERSVAEKFKSHGGAQVMEFCSHGTK VECLKAQQATAEMAAKKKQERRDEKELRPDVDAGENVTGKVPKTESAAEDGEIIAEVINNCEAESQESTDGSDTCS SETTDKCTKLHFKKIIQAHTDESLGDCSFLNTCFHMATCKYVHYEVDTLPHINTNKPTDVKTKLSLKRSVDSSCTLYP PQWIQCDLRFLDMTVLGKFAVVMADPPWDIHMELPYGTMSDDEMRALGVPALQDDGLIFLWVTGRAMELGRDCL KLWGYERVDELIWVKTNQLQRIIRTGRTGHWLNHGKEHCLVGMKGNPTNLNRGLDCDVIVAEVRATSHKPDEIYGI IERLSPGTRKIELFGRPHNIQPNWITLGNQLDGIRLVDPELITQFQKRYPDGNCMSPASANAASINGIQK >NP_001084701.1 methyltransferase like 3 L homeolog [Xenopus laevis] (SEQ ID No: 41) MSDTWSSIQAHKKQLDNLRERLQRRRKDATSQLALDLQSSEGGIAPTFRSDSPVPSASSQPLKGPSGSAEVTPDP 
ELEKKLLHHLSDLSLVLPADSVSIQLAITTPDFPVTRQGVESLLQKFAAQELIEVKGWGQEDDDRPTVVTFADYSKLS AMMGAVAERKGTTIPTGAKKRRLQEADPSASSLSSSLSASASREKKTSEPQKKARKHASHLDLEIESLLSQQSTKE QQSKKVSQEILELLSTSTAKEQSIVEKFRSRGRAQVQEFCDFGTKEECMKAAGADTPCRKLHFRRIINMHTDESLG DCSFLNTCFHMDTCKYVHYEIDAWVEPGGTAMGTEAIASLDTPLAKAVGDSSVGRLFPAQWIRCDIRYLDVSILGKF SVVMADPPWDIHMELPYGTLTDDEMRKLQIPVLQDDGFLFLWVTGRAMELGRECLKLWGYERVDEIIWVKTNQLQ RIIRTGRTGHWLNHGKEHCLVGVKGSPQGFNRGLDCDVIVAEVRSTSHKPDEIYGMIERLSPGTRKIELFGRPHNIQ PNWITLGNQLDGIHLLDPDVVAQFKQKYPDGVIGMPKNM >sp|F1R777.1|MTA70_DANRE RecName: Full = N6-adenosine-methyltransferase subunit METTL3; AltName: Full = N6-adenosine-methyltransferase 70 kDa subunit; Short = MT-A70 (SEQ ID No: 416) MSDTWSHIQAHKKQLDSLRERLQRRRKDPTQLGTEVGSVESGSARSDSPGPAIQSPPQVEVEHPPDPELEKRLLG YLSELSLSLPTDSLTITNQLNTSESPVSHSCIQSLLLKFSAQELIEVRQPSITSSSSSTLVTSVDHTKLWAMIGSAGQS QRTAVKRKADDITHQKRALGSSPSIQAPPSPPRKSSVSLATASISQLTASSGGGGGGADKKGRSNKVQASHLDMEI ESLLSQQSTKEQQSKKVSQEILELLNTSSAKEQSIVEKFRSRGRAQVQEFCDYGTKEECVQSGDTPQPCTKLHFRR IINKHTDESLGDCSFLNTCFHMDTCKYVHYEIDSPPEAEGDALGPQAGAAELGLHSTVGDSNVGKLFPSQWICCDIR YLDVSILGKFAVVMADPPWDIHMELPYGTLTDDEMRKLNIPILQDDGFLFLWVTGRAMELGRECLSLWGYDRVDEII WVKTNQLQRIIRTGRTGHWLNHGKEHCLVGVKGNPQGFNRGLDCDVIVAEVRSTSHKPDEIYGMIERLSPGTRKIE LFGRPHNVQPNWITLGNQLDGIHLLDPEVVARFKKRYPDGVISKPKNM >NP_062826.2 N6-adenosine-methyltransferase catalytic subunit [Homo sapiens] (SEQ ID No: 42) MSDTWSSIQAHKKQLDSLRERLQRRRKQDSGHLDLRNPEAALSPTFRSDSPVPTAPTSGGPKPSTASAVPELATD PELEKKLLHHLSDLALTLPTDAVSICLAISTPDAPATQDGVESLLQKFAAQELIEVKRGLLQDDAHPTLVTYADHSKLS AMMGAVAEKKGPGEVAGTVTGQKRRAEQDSTTVAAFASSLVSGLNSSASEPAKEPAKKSRKHAASDVDLEIESLL NQQSTKEQQSKKVSQEILELLNTTTAKEQSIVEKFRSRGRAQVQEFCDYGTKEECMKASDADRPCRKLHFRRIINK HTDESLGDCSFLNTCFHMDTCKYVHYEIDACMDSEAPGSKDHTPSQELALTQSVGGDSSADRLFPPQWICCDRY LDVSILGKFAVVMADPPWDIHMELPYGTLTDDEMRRLNIPVLQDDGFLFLWVTGRAMELGRECLNLWGYERVDEII WVKTNQLQRIIRTGRTGHWLNHGKEHCLVGVKGNPQGFNQGLDCDVIVAEVRSTSHKPDEIYGMIERLSPGTRKIE LFGRPHNVQPNWITLGNQLDGIHLLDPDVVARFKQRYPDGIISKPKNL >sp|Q8C3P7.2|MTA70_MOUSE RecName: Full = 
N6-adenosine-methyltransferase subunit METTL3; AltName: Full = Methyltransferase-like protein 3; AltName: Full = N6-adenosine- methyltransferase 70 kDa subunit; Short = MT-A70 (SEQ ID No: 43) MSDTWSSIQAHKKQLDSLRERLQRRRKQDSGHLDLRNPEAALSPTFRSDSPVPTAPTSSGPKPSTTSVAPELATD PELEKKLLHHLSDLALTLPTDAVSIRLAISTPDAPATQDGVESLLQKFAAQELIEVKRGLLQDDAHPTLVTYADHSKLS AMMGAVADKKGLGEVAGTIAGQKRRAEQDLTTVTTFASSLASGLASSASEPAKEPAKKSRKHAASDVDLEIESLLN QQSTKEQQSKKVSQEILELLNTTTAKEQSIVEKFRSRGRAQVQEFCDYGTKEECMKASDADRPCRKLHFRRIINKH TDESLGDCSFLNTCFHMDTCKYVHYEIDACVDSESPGSKEHMPSQELALTQSVGGDSSADRLFPPQWICCDIRYL DVSILGKFAVVMADPPWDIHMELPYGTLTDDEMRRLNIPVLQDDGFLFLWVTGRAMELGRECLNLWGYERVDEIIW VKTNQLQRIIRTGRTGHWLNHGKEHCLVGVKGNPQGFNQGLDCDVIVAEVRSTSHKPDEIYGMIERLSPGTRKIEL FGRPHNVQPNWITLGNQLDGIHLLDPDVVARFKQRYPDGIISKPKNL >XP_003128628.1 N6-adenosine-methyltransferase 70 kDa subunit [Sus scrofa] (SEQ ID No: 44) MSDTTWSSIQAHKKQLDSLRERLRRRRKQDSGHLDLRNPEAALSPTFRSDSPVPTVPTSGGPKPSTASAVPELATD PELEKKLLHHLSDLALTLPTDAVSIRLAISTPDAPATQDGVESLLQKFAAQELIEVKRSLLQDDAHPTLVTYADHSKLS AMMGAVAEKKGPGEVAGTITGQKRRAEQDSTTVAAFASSLTSSLASSASEVAKEPTKKSRKHAASDVDLEIESLLN QQSTKEQQSKKVSQEILELLNTTTAKEQSIVEKFRSRGRAQVQEFCDYGTKEECMKASDADRPCRKLHFRRIINKH TDESLGDCSFLNTCFHMDTCKYVHYEIDACMDSEAPGSKDHTPSQELALTQSVGGDSNADRLFPPQWICCDIRYL DVSILGKFAVVMADPPWDIHMELPYGTLTDDEMRRLNIPVLQDDGFLFLWVTGRAMELGRECLNLWGYERVDEIIW VKTNQLQRIIRTGRTGHWLNHGKEHCLVGVKGNPQGNQGLDCDVIVAEVRSTSHKPDEIYGMIERLSPGTRKIEL FGRPHNVQPNWITLGNQLDGIHLLDPDVVARFKQRYPDGIISKPKNL >WP_009339935.1 MULTISPECIES: S-adenosylmethionine-binding protein [Afipia] (SEQ ID No: 45) MTLPAKDLLSFAGQRRFSTILADPPWQFTNKTGKVAPEHKRLSRYGTMKLDEIMMLPVADIAAPTSHLYLWCPNAL LPEGLAVMKAWGFNYKSNIVWHKVRKDGGSDGRGVGFYFRNVTEVILFGVRGKNARTLAPGRRQVNLLATRKRE HSRKPDEQYEIIESCSPGPFLELFARGTRKNWATWGNQADDDYKPTWKTYAHHSRAGLVAAE >WP_013485562.1 S-adenosylmethionine-binding protein [Ethanoligenens harbinense] (SEQ ID No: 46) MSTAKETANNLLQFCGEKKYATVYADPPWRFQNRTGKVAPENKKLNRYPTMDLEDIKALPVGKIAAEKSHLYLWVP 
NALLPDGLEVMKAWGFEYKGNIIWEKVRKDGEPDGRGVGFYFRNVTEILLFGIRGGNNRTLAPARSQVNLIRTQKR EHSRKPDEIITIIESCSPGPYLELFARGDRENWDMWGNQATAEYEPTWNTYKNHTTKETTSGVSGSQSET >WP_016343787.1 adenine-specific DNA methyltransferase [Mycobacteroides abscessus] (SEQ ID No: 47) MAAPLREVNEPPPLPVTDGGFSTILADPPWRFTNRTGKVAPEHRRLDRYSTLSLDEICALGVSDVTADNAHLYLWV PNALLPDGLRVMEEWGFRYVSNIVWSKVRRDGLPDGRGVGFYFRNTTELLLFGVRGSMRTLQPARSQVNQIVTR KREHSRKPDEQYELIEACSPGPYLEMFGRYRRPNWAVWGDEANEDVEPRGQTHKGYGGGEITRLPALEPHSRIP QWLAKPIAAAIKSAYDDGMSIDAIAAETGYSISRVRHLLDQAGAKKRGRGRPAKA >WP_023133224.1 MULTISPECIES: MT-A70 protein [Rothia] (SEQ ID No: 48) MLDPMNTNEEFAPLPTVEGGFQTVLADPPWRFTNRTGKVAPEHHRLGRYGTMSLDEIKALRVGDVTADNAHLYL WVPNALLPEGLEVMQAWGFRYVSNIIWAKRRKDGGPDGRGVGFYFRNVTEPILFGVKGSMRTLAPGRSTVNMIET RKREHSRKPDEQYDLIEACSPGPYLELFARYARPGWSVWGNEASNEIEPRGKAQKGYGGGEIDRLPILEPNERMS EWLSGRVGELLAEEYTKGASVQELANQSGYSIARVRTLLTHSGVPLRGRGRPKKGQVAS >ETW92643.1 S-adenosylmethionine-binding protein [Candidatus Entotheonella factor] (SEQ ID No: 49) MSNSPHSAADDLLACGFPPHSFSTVLADPPWRFTNRTGKMAPEHRRLSRYPTLTLEEIADLPLAQLVQPDSHLYLW VPNALLAEGLDVMRRWGFTYKTNLVWYKIRRDGGPDRRGVGFYFRNVTELVLFGVRGRMRTLAPGRRQENLLAS QKQEHSRKPDTFYDLIERCSPGPYLELFARHPRPGWHQFGNEPLVSSS >AHJ63281.1 Adenine-specific methyltransferase [Granulibacter bethesdensis] (SEQ ID No: 50) MTKQPDPIAEFRNQLNGGNFATVLADPPWRFQNRTGKMAPEHRRLSRYGTMELPEIMALPVSEVTAKTAHLYLWV PNALLPEGLAVMQAWGFNYKSNLVWHKIRKDGGSDGRGVGFYFRNVTELVLFGVKGKNARTEAPGRRQVNLLAT QKREHSRKPDEFYDIVEACSPGPYLELFARGTRPGWCAWGNQAEEYDITWDTYSHHSQRQSLWVAE >WP_017364718.1 S-adenosylmethionine-binding protein [Methylococcus capsulatus] (SEQ ID No: 51) MTENTLDPAADLLERLGDKRFRTILADPPWQFQNRTGKMAPEHKRLNRYGTMSLEAIAGLPVERLTADTAHLYLWV PNALLLEGLKVMEAWGFTYKTNLVWHKIRKDGGPDGRGVGFYFRNVTELVLFGVRGKNARTLAAGRRQVNFLAT RKREHSRKPDEMYGIIEACSPGPYLELFARGARDRWSVWGNEADENYYPRWNTYANHSQAEICPFE >WP_027700599.1 S-adenosylmethionine-binding protein [Xylella fastidiosa] (SEQ ID No: 52) MTKHKANTASDVGRDLLARHGGQRFHTILADPPWQFQNRTGKMAPEHKRLSRYGTMTLDDIMMLPVEQLVTDTA 
HLYLWVPNALLPEGIKVLEAWGFSYKSNIVWHKVRKDGGPDGRGVGFYFRNVTELVLFGVRGKNARTLAPGRRQ VNFLATQKREHSRKPDEFYDIVESCSPGPFLELFARGPRDGWKVWGNQADKYYPTWPTYSNHSQAECELGRVE MIAQRLLSV >WP_027488351.1 S-adenosylmethionine-binding protein [Rhizobium undicola] (SEQ ID No: 53) MLNRNTDAPSPSDDFTNFISGRKFATIMADPPWQFMNRTGKVAPEHKRLNRYGTMELDAIKALPVATACAPTAHLY LWVPNALLPEGLEVMKAWGFNYKANIVWHKLRKDGGSDGRGVGFYFRNVTELILFGTRGKNARTLPPGRSQVNYI GTRKREHSRKPDEQYPLIESCSPGPYLEMFGRGLRKGWTTWGNQADETYEPTWKTYGHNSSTDRLEAAE >ESK34829.1 hypothetical protein G966_02949 [Escherichia coli UMEA 3323-1] (SEQ ID No: 54) MGWFMTKKYTLIYADPPWVYRDKAADGNRGAGFKYPVMSVLDICRLPVWDLADENCLLAMWWVPTQPLEALKVV EAWGFRLMTMKGFTWIKCGSRQPDKLVMGMGHMTRANSEDCLFAVKGKLPTRINAGIVQSFTAPRLEHSRKPDIV REKLVQLLGDVSRIELFARQTSHGFDVWGNQCEDPAVQLHPGYALDIGGLTNAFSNAPLSPTDIQGRERAA >AIF94871.1 Adenine DNA methyltransferase, phage-associated [Escherichia coli O157:H7 str. SS17] (SEQ ID No: 55) MTKKYTLIYADPPWTFRDKATDGQRGASFKYPVMSLLDICRLPVWELAADNCLLAMWWVPTQPLEALKVVEAWG FRLVTMKGLTWNKCGKRQTDKLVMGMGSTTRANSEDCLFAVKGNLPERINAGIIQSFTAPRLDHSRKPDMAREKL VQLLGDVPRIELFARHTSHGFDVWGNQCGTPSIEMVPGIVKFLEKTNERKNDVDKGITS >WP_032715146.1 adenine methylase [Klebsiella aerogenes] (SEQ ID No: 56) MTGKYTLIYADPPWSYRDKAADGDRGAGFKYPVMNVMDICRLPVWELSADDCLLAMWWVPTQPVEALKVVEAW GFRLMTMKGFTWHKINKHKGNSAIGMGHMTRANSEDCLFAVRGKLPERMDASICQHVTAPRLENSRKPDVIREKL VQLLGDVPRIELFARQSSHGFDVWGNQCIAPAVELLPGCAVPVVKTEAA >AIA43360.1 DNA methyltransferase [Klebsiella pneumoniae subsp. 
pneumoniae KPNIH27] (SEQ ID No: 57) MNYDLIYCDPPWEYGNRISNGAACNHYSTMSIDDLKFLPVRKLAADNAVLAMWYTGTHNREAVELAESWGFRVRT MKGFTWVKLNQNAADRFNKALSTGELVDFNDLLEMLDRETRMNGGNHTRSNTEDVLIATRGTGLPRASASVKQV VHTCLGEHSAKPWEVRNRLEQLYGDVKRIELFAREEWKGWDRWGNQCNNSIEIITGLIKEVNHAA >WP_009320301.1 DNA methyltransferase [Clostridioides difficile] (SEQ ID No: 58) MPAVLFLLELHRRRKGGYKIENNQKYNIIYADPPWRYQQKRLSGAAEHHYPTMSVKDICGLKVEEIAAKDCVLFLWA TFPQLPEALRVIKAWGFQYKTVAFVWLKQNKSGKGWFFGLGFWTRGNAEICLLAIKGKPHRNSNRVHQFLISPIRG HSQKPEEAREKIVELMGDLPRVELFAREKTEGWDAWGNEVESDIEISSDTEKEWR >WP_012115592.1 MT-A70 family protein [Xanthobacter autotrophicus] (SEQ ID No: 59) MNGLWQFGDLKMFGYDLIVADPPWDFELYSEAGEGKSAKAHYGTMKLDEIAALRVGDLARGDCLLLLWCCEWMP PAARQRVLDAWGFTYKTTIIWRKVTRAGKVRMGPGYRARTMHEPVIVATVGNPKHTPFSSVFDGVAREHSRKPEA FYRMVEAAAPKAARADLFSRQRRDGWDAFGNEVEKFDQPPAEAAE >KFL31466.1 DNA methyltransferase [Devosia riboflavina] (SEQ ID No: 60) MTAWPFGAMPMFSFDVVMADPPWSFDNWSEGGNAKNAKAQYDCMPTPDIKRLPVGHLAAGDCWLWLWATYP MLPDAIEVMDAWGFRYVTAGPWVKRGTSGKLAMGTGYVLRSCSEIFLIGKNGEPKTHARDVRNVLEAPRREHSRK PDEAYAMAEKLFGPGRRADLFSRETRPGWTSWGNESTKFDEVAA >WP_016734162.1 DNA methyltransferase [Rhizobium phaseoli] (SEQ ID No: 61) MRLFPDLWPFGDLQPHSFDFIMADPPWKMQEWSDNGDKSKSTQSKYRLMPLDEIKAMPVLDLAAPNCLLWLWAT NPMLPQALDVLHAWGFTFATAGSWMKTTRNGKQAFGTGYIFRTSNEPILIGKRGEPKTTRSVRSSFPGLAREHSR KPEEGYREAERLMPRARRLELFSRTNRVGWTTWGDEVGKFGDVA >KFB10357.1 Adenine-specific methyltransferase [Nitratireductor basaltis] (SEQ ID No: 62) MHLFDWPFGDLNPHSFDLIMADPPWAFELRSDKGEGKSAQSHYKCQTLDEIKALPVLDLAAPDCLLWLWATNPML PQAFEVMAAWGFTFKTAGAWGKTTVNGKLAFGTGYIFRSAHEPILIGTRGEPRTTKSVRSLIMGQVREHSRKPEEA YAAAEKLIPNARRLELFSRTDRAGWEVWGDEAGKFGEAA Protein sequences for phylogenetic analysis of p1 proteins >XP_001009903.1 [Tetrahymena thermophila SB210] (SEQ ID No: 63) MSLKKGKFQHNQSKSLWNYTLSPGWREEEVKILKSALQLFGIGKWKKIMESGCLPGKSIGQIY MQTQRLLGQQSLGDFMGLQIDLEAVFNQNMKKQDVLRKNNCIINTGDNPTKEERKRRIEQNR KIYGLSAKQIAEIKLPKVKKHAPQYMTLEDIENEKFTNLEILTHLYNLKAEIVRRLAEQGETIAQPS 
IIKSLNNLNHNLEQNQNSNSSTETKVTLEQSGKKKYKVLAIEETELQNGPIATNSQKKSINGKRK NNRKINSDSEGNEEDISLEDIDSQESEINSEEIVEDDEEDEQIEEPSKIKKRKKNPEQESEEDDI EEDQEEDELVVNEEEIFEDDDDDEDNQDSSEDDDDDED >EJY79729.1 [Oxytricha trifallax] (SEQ ID No: 64) MSSSISAAIIAGNQNKKIAESKSLWNYALSPGWTQQEVEILKIALMKFGVGRWKTIEQSQCLPT KTMSQMYLQTQRLVGQQSLAEFMGLHLDLEQIFIKNAERQGAGVFRKNGCIINTGDNMTKVQI AKLRKKNSKIFGLTQPFVQSLHLPKAKVKEWLKVLTLDQILSAKSNFSTAEKIHYLKILENALER KLKKILRLQELVSIYRPCNIGIVVQKRLGSSIGDEYFEYVDCVKIEEKSVGNLDFALPNRNTDSTS LNEDFSFLDSTQKPQKLKAGSGRENKRKKMRDGLKDERAQRQSLMEALDEQEFDETKFQDS >EJY78001.1 [Oxytricha trifallax] (SEQ ID No: 65) MSVHHKMADSKSLHNYTLSPGWTREEVDILKIALMKFGIGKWKKIQKSGCLPSKTISQMNLQT QRLLGQQSLAEFMGLHVYLDRVFRDNSLKTGPEIQRKNNFIINTGNNLTQPEKEKRLRLNKQK YGLDLAFIKTLRLPKPESATGGKREAILSMDQIFAQKSHFTVVEKLKHLEALKNALCSKLGKIER RRRNKELSKIYRPLGQLIVVQKNADDQYEFVDIIDENE >ORX69504.1 [Linderina pennispora] (SEQ ID No: 66) MSSATPYAPRSMPTGQRNVVRSNDSASLWNCTLSPGWTQEEVQVLRKALMKFGVGNWMKII ESECLPGKTIAQMNLQTQRMLGQQSTAEFNGLHLDAFVIGELNSKKQGPGIKRKNNCIVNTGG KLTRDEVVKRQQKHREQYEVKAEVWRAIVLPKPDNPLILLEKKREELKKVRLELEEIMKQIEET >ORX78557.1 [Basidiobolus meristosporus CBS 931.73] (SEQ ID No: 67) MTDVYKPRSMPVGARNVLRSNDSASLWNCTLSPGWTEPEVHILRKAVMKFGIGNWAKIIESQ CLFGKTIAQMNLQLQRMLGQQSTAEFAGLHLDPFVIGEINSKKQGPGIKRKNNCIVNTGGKLTR EEIKRRLLEHKRTYEISEEEWRSIELPKPEDPGAVLIAKKDELKMLEDELLRVVQKIQKAREERR SKSVDSSSVDGSVDDEARETKRRRK >EJY73777.1 [Oxytricha trifallax] (SEQ ID No: 68) MSHATSHGNSTEKDKKNSGNMVAESKSLWNYALSPQWTPQEVDVLKIALMKFGIGKWTIIDK SGILPTKTIQQCYLQTQRILGQQSLAEFMGLHVDIDKIALDNRRKNGIRKMGFLVNQGGKLTPE EKAHYQEINRQKYGLSPEEVETIKLPPPCSVEIYDINKIINPKSKLTTIEKINHCIKLQDALLEKLEN IKNKKIPTGAGFSSSRVYENMRGYDPQLLLNSHVTGQLDHSMQDLTIDERYSDLDEEEDPLAM ASIIDSQATPQPQKIKSSVPNKASTTPSAKEMNQIKDIIDSVIAENSAQQSKNLAQEKPKLKFSLV KATESNLLQSAAQNSDDVVMEEDSKLQHIETFSTVTQTATDQSNSQSKSQNNIASDSLKDSLE QNDLSKSLTDSLEMQQYSAEKKLNQAPMSKNSDKPKKKRLNKRKLPSDDEFETL >XP_021883515.1 [Lobosporangium transversale] (SEQ ID No: 69) MSSGSTPRSMTAGARNILRSNDSASLWNYTVAPGWSMKEAEILRKALMKFGIGNWSKIIESN 
CLVGKTNAQMNLQTQRMLGQQSTAEFAGLHIDPRVIGQKNSLIQGDHIRRKNGCIVNTGAKLS REEIRRRVAENKEQYELPEEEWSSIELPLPDDPHLLLEAKKSEKVRLELELKNVQRQIAMLRKV GRKFETGSESPKTELDDDERDEFIEDQPLGKRARIEA >EJY81929.1 [Oxytricha trifallax] (SEQ ID No: 70) MSSSISAAIMAGNQNKKIAESKSLWNYALSPGWTQQEVEILKIALMKFGVGRWSAINKSGVLP TKQIQQCYLQTQRLIGQQSLAEFMGLHLDIDRIAADNKQKRGIRKQGFLVNQGCKLTPEEKDEL RKINQEKYGLSAEHVEAIKLPAPCHLVEIFQIDKIMHPRSTLSTMDKIKHLIKLEDALKSKLEMIRE GKRQQKFEQLQQKLKTTEASGRGSVTRVQRQMSDLHLGSSHQNRNSDLDEENDESVMIIDE SQQENLTPKGKAQAMLTHQKYNEVTQTMIKQGDDSRQQQHLPLDSTSASVSNPSSTSKSST MKSNSMKQSETAIASMKPSSIGKKTKVDSSFVTKQSNQQSTAPIQKQAHQQNLDRNRSELGS TFAQQASVDTQNSNNQGTSTASGNFISQSDDEEALMPKLKRRRVEDSE >EJY76686.1 [Oxytricha trifallax] (SEQ ID No: 71) MRVYLKFCNRKQIHYTHTMSSSISAAIMAGNQNKKIAESKSLWNYALSPGWTQQEVEILKIALM KFGVGRWSAINKSGVLPTKQIQQCYLQTQRLIGQQSLAEFMGLHLDIDRIAADNKQKRGIRKQ GFLVNQGCKLTPEEKDELRKINQEKYGLTAEHVEAIKLPAPCHLVEIFQIDKIMHPRSTLSTMDK IKHLIKLEDALKSKLEMIREGKRQQKFEQLQQKLKTTEASGRGSVTRVQRQMSDLHLGSAHQN RNSDLDEENDQSVMIIDESQQQNLTPKGKAQTMLTNQTQTMKKQADDSRDEQHLPLISTSAS VSNPSSTSKSSALKLNSMKQSDTAIASMKPSSSGKKTKVDSSFVSKQSNQQSTSYSETNVDT QNSNNQGTSTASGNFISQSDDEEALMPKLKRRRVEDSE >EJY80746.1 [Oxytricha trifallax] (SEQ ID No: 72) MRVYLKFCNRKQIHYTHTMSSSISAAIMAGNQNKKIAESKSLWNYALSPGWTQQEVEILKIALM KFGVGRWSAINKSGVLPTKQIQQCYLQTQRLIGQQSLAEFMGLHLDIDRIAADNKQKRGIRKQ GFLVNQGCKLTPEEKDELRKINQEKYGLTAEHVEAIKLPAPCHLVEIFQIDKIMHPRSTLSTMDK IKHLIKLEDALKSKLEMIREGKRQQKFEQLQQKLKTTEASGRGSVTRVQRQMSDLHLGSAHQN RNSDLDEENDQSVMIIDESQQQNLTPKGKAQTMLTNQTQTMKKQADDSREEQHLPLNSTSAS VSNPSSTSKSSALKLNSMKQSDTAIASMKPSSSGKKTKVDSSFVSKQSNQQSTGPIQKQAHQ QNLDRNRSELGSTFAQQTNVDTQNSNNQGTSTASGNFISQSDDEEALMPKLKRRRVKDSE >ORX56566.1 [Piromyces finnis] (SEQ ID No: 73) MSIPKPRSMPVGFRNILRPNDSTSLWNCTLSPGWTQEESDILRDALIFYGIGNWKDIIEHGCLP DKTNAQMNLQLQRMLGQQSTAEFQNLHIDPYEIGKINSQKQGPNIRRKNGFIINTGGKLSREDI KRKIQENKENYELPEEVWSKIVLPNREVVTINEKRQKLNKLEEELDSVLKQIVNRRRELRGMTP LKETEMKSIVNRSNQNDTKTEEKEIKEEESTTVNEEKIENTETSSISIISTNENEQSENISSSSPIV KSEQKKKRVVSRRKNKRRVNSDDEDFLPPGKSRSKRTRRTPKKSSN >ORX79686.1 
[Anaeromyces robustus] (SEQ ID No: 74) MSIPKPRSMPTGFRNILRPNDSTSLWNCTLSPGWTQEESDILRDALIYYGIGNWKDIIEHGCLP DKTNAQMNLQLQRMLGQQSTAEFQNLHIDPYVIGKINSQKQGPNIRRKNGFIINTGGKLSREDI RRKIQENKENYELPKEEWSKIVLPNREVVIKNKVQEAINEKREKLNKLEDELDSVLKAIVNRRR ELRGMIPLKDSEMKSLVNRSAKNEGENKTETTNNEESNNTNNSDDIKDENNETSTSSHIFTNN DNELSENNSSSSSSNSISNKKKRFLRREVRRGKRRYNYDDDDFMPSGNRSRKSRKI >ORZ01404.1 [Syncephalastrum racemosum] (SEQ ID No: 75) MSNNKENNVNKPRSMTAGARNVLRSNDSTSLWNCTLSPGWTQDESEVLRKALMKFGVGNW AKIIESGCLPGKTNAQMNLQLQRLLGQQSTAEFAGLHIDPKVIGEKNSKIQGPHIKRKNNCIVNT GDKLSRDKLRARVMSNKEEYELPEEVWKNIELPKVKDPLMLLEGKKEEMRKLKTELEKVQAKI QQLRQAQPARVQELQSQIEVARSPSPSAPDSPALSV >XP_001698763.1 [Chlamydomonas reinhardtii] (SEQ ID No: 76) MAFAAALAEKRGPRVGDAASLWNFTPAPGWSREEVQILRLCLMKHGVGQWMQILSTGLLPG KLIQQLNGQTQRLLGQQSLAAYTGLKVDVDRIRVDNETRTDATRKAGLIINDGPNLTKEMKEK MRQDAVAKYGLTPEQVAEVDEQLAEIAAAFNPASTSAAAGAGSGAAAAGQAAAAGSGAGGS GNLMAQPTEQLSAEQLGQLLLRLRNRLACLVDRARGRAGLPPRTAPRWATEAAAAACLAAM AAAEASAPQAPAAAAGGQEGAAGPVMVSVPFSREVLAEATACRVRSGTAAGARGNAPGAQ GGVRKRTSKGGKAKGGDREWSPEGEENTAPQPRGGGKRKSGAVAGGEEADGVASGRAKR ASRPKRGSSKHDPYVDDNDYGDEGIDPFDVGDDLDDMNPHGRYGNGGGRRADPSEAISALT AMGFTQSKARGALRECNFNVELAVEWLFANCL >PNW76495.1 [Chlamydomonas reinhardtii] (SEQ ID No: 77) MAFAAALAEKRGPRVGDAASLWNFTPAPGWSREEVQILRLCLMKHGVGQWMQILSTGLLPG KLIQQLNGQTQRLLGQQSLAAYTGLKVDVDRIRVDNETRTDATRKAGLIINDGPNLTKEMKEK MRQDAVAKYGLTPEQVAEVDEQLAEIAAAFNPASTSAAAGAGSGAAAAGQAAAAGSGAGGS GQAATAADAGGAAGRGTGSAGGAAAAAPPRNALAISTGVLAATLLDASLGNLMAQPTEQLSA EQLGQLLLRLRNRLACLVDRARGRAGLPPRTAPRWATEAAAAACLAAMAAAEASAPQAPAAA AGGQEGAAGPVMVSVPFSREVLAEATACRVRSGTAAGARGNAPGAQGGVRKRTSKGGKAK GGDREWSPEGEENTAPQPRGGGKRKSGAVAGGEEADGVASGRAKRASRPKRGSSKHDPY VDDNDYGDEGIDPFDVGDDLDDMNPHGRYGNGGGRRADPSEAISALTAMGFTQSKARGALR ECNFNVELAVEWLFANCL >ORZ17038.1 [Absidia repens] (SEQ ID No: 78) MSSPSSPSPIKPRSMLTGSRNVVRSNDSASLWNCTLSPGWNEEQSETLRHAVMKYGIGNWA KIIDSGYLPGKTNAQMNLQLQRLLGQQSTAEFAGLHIDPKVIGEQNSRIQGPEIRRKNNTIVNTG DKLSREALRERILRNKEKYELPESVWQAIELEHVTDEDALLEEKKKTLREMKSQLKVVQRQIKN 
LEFMHPLHAAKLKFELEKLAPSSSTSSSSSSPSPSSSSSPSSSSSKPSVSGTEEEMREAVDEE RGSDEEIDELVEETDEEETSVSPKVGTRTKKVRTN >ORX56339.1 [Hesseltinella vesiculosa] (SEQ ID No: 79) MIANSTATPKPRSMKAGARNVLRSNDSASLWNCTLSPGWTEQESEILRQLAIKFGIGNWAKIIE SDCLPGKTNAQMNLQLQRLLGQQSTAEFAGLHIDPKVIGEKNSKIQGPHIKRKNTTIVNTGGKL SREELRERQAKNKEMYEMPKSAWDSIDLDELRDMNSLKLKKKEDKDALKKQKLTQLKTKLTK SQNNLKKVQAELKQIAMVDPERVAELKKELSRASSPLSNEVSVIEESPAKKQRTS >ORX54764.1 [Piromyces finnis] (SEQ ID No: 80) MVVEKDLAQENKIKEELNKKHEWVKEMRKKFCVRKEFENTKNLILEDGTLNQEYFRLSKGTVL KTNEVRKWTSIERNLLIKGIEKYGIGHFREISESLLPKWSGNDLRIKTIHLIGRQNLKLYKDWKG GEEDIKREYNRNKEIGLKCNAWKNNCLIDDGNGKVKEMIEATEPKH >ORX84766.1 [Anaeromyces robustus] (SEQ ID No: 81) MVVEKETNKENIKNIKEELDKKHAWVKEMRKKFCVRKEFENTKILILEDGTLNQDYFRLSKGTV LKTNEVRKWTSIERGLLIKGIEKYGIGHFREISENLLPKWSGNDLRIKTIHLIGRQNLKLYKDWK GNEEDIKREYNRNKEIGLKCNAWKNNCLVDDGHGKVKAMIEATENN >ORY98423.1 [Syncephalastrum racemosum] (SEQ ID No: 82) MMTATDEDVDMKDVDIKLESNQETEQKILTPEEQKEKEKQDWIRQLRLKFCIRPEYEITKNMIF PDGTLNQDYFRPPKGAKVEEARKWTEVEKELLIQGIEKYGIGNFGEVSKALLPAWSTNDLRIK CIRLIGRQNLQLYRGWKGNADDIAREYNRNKELGLKYGTWKQGVLVYDDDGLVEKEILAQDA AAKGEDVDMN >XP_021886199.1 [Lobosporangium transversale] (SEQ ID No: 83) MEINQEQLPSSSSILHPTSTSSSSSPSPSPSPASPKPERVFDARQRRINEIRLKFCIRDEFPITK NMIHPDGTLNQDYFRPPRGSKPVEVARKWTDKERELLIKGIEKYGIGHFREISEEFLPLWSGN DLRIKTMRLVGRQNLQLYKDWKGNEQDLAREFELNKAIGLKYGAWKAGTLVADDDGLVAKAI EEQWPGSNSGTGKTTAVIGISSEENSEVSTPLNDEDVDME >ORY01319.1 [Basidiobolus meristosporus CBS 931.73] (SEQ ID No: 84) MEVDQNDSSVAKETAEQPETPEISKELLERQEWIKNMRLQFCVRPEFEVTKNIIHEDGMLNQE YFLPPKGAKLEAEPERKWTETERNLLIQGIQQYGIGHFREISEALLPQWSGNDLRVKSMRLMG RQNLQLYKDWKGSIEDIEREYERNKAIGLKYNTWKNSTLVYDDAGLVLKAIEASEPKP >ORZ26026.1 [Absidia repens] (SEQ ID No: 85) MAIDSLQDTEDDRTNDQNDESRESSPTPLSPEEQAQKERHEDWINQIRLKFCIRPEFEVTKNIIH PDGRLNQEYFHPPKGYKPEDARKWTETEKQLLIKGIEEHGIGNFGLISKESLPKWSTNDLRVK CIRLIGRQNLQLYRGWKGNADDITREYERNKEIGLKYGTWKQGVLVYDDDGMVEKELLATAAT PADSMSMEEDEDMATD >ORX67568.1 [Linderina pennispora] (SEQ ID No: 86) 
MDTASPDDGAIAQPMLGVEDADFWRQKQEWVKQMRLQFSRRPEFPETHNMIDDEGMLNQE YFQPPKDAVAPKERKWGDDEKRRLLEGIEKHGIGHFREISEESLPEWSGNDLRMKAIRLMGR QNLQLYKGWKGDAAAIGLKHGTWKGGALVYDDDGVVLKAIQESNRANPP >XP_001699352.1 [Chlamydomonas reinhardtii] (SEQ ID No: 87) MAACSAACDSHVVPQPSPGSWGMPEDRDNYIVQMRRRYSPAGMLNADGSINQDFFKPRRV VLVADRAKWGDAEREGLYKGLEVHGVGKWREINRDYLKGQWDDQQVRIRAARLLGSQSLVR YMGWKGSKAKVDAEYAKNKAIGEATGCWKAGQLVEDDHGSVRKYFEAQQAGGEQ

Protein sequences for phylogenetic analysis of p2 proteins

>XP_001017830.3 [Tetrahymena thermophila SB210] (SEQ ID No: 88) MNQMGVIAIKRKQSYQLNVKINYINTAHQIKKPCQYIQKCILFRLLYKFCKQLIPLNFNLFLIFYFY HLLFHLIFNYLLKFAKKINKLIRNQRKNREKKEAFKHKKIQININHYNYLKQNIQQVGIIFQNKKSK LTLKLVQKKSLSEYYRKIKMKKNGKSQNQPLDFTQYAKNMRKDLSNQDICLEDGALNHSYFLT KKGQYWTPLNQKALQRGIELFGVGNWKEINYDEFSGKANIVELELRTCMILGINDITEYYGKKIS EEEQEEIKKSNIAKGKKENKLKDNIYQKLQQMQ >XP_001699352.1 [Chlamydomonas reinhardtii] (SEQ ID No: 89) MAACSAACDSHVVPQPSPGSWGMPEDRDNYIVQMRRRYSPAGMLNADGSINQDFFKPRRV VLVADRAKWGDAEREGLYKGLEVHGVGKWREINRDYLKGQWDDQQVRIRAARLLGSQSLVR YMGWKGSKAKVDAEYAKNKAIGEATGCWKAGQLVEDDHGSVRKYFEAQQAGGEQ >EJY77156.1 [Oxytricha trifallax] (SEQ ID No: 90) MSTAKQQQAQQHLLPKHSNMRVGSVSNELDYAKRNYIIKMRQSFIEVNKNIYFEDGSLNFKYF NVKKGHYWSKEINEELIKGVIKYGATNYKDIKNKMEIFKKEWSETEIRLRICRLLKCYNLKVYEG HKFNSREEILEQATLNKEEAIKQKKICGGILYNPPHEQDDGIMSSYFNLKNKNNTPVKASAQ >ORZ26026.1 [Absidia repens] (SEQ ID No: 91) MAIDSLQDTEDDRTNDQNDESRESSPTPLSPEEQAQKERHDWINQIRLKFCIRPEFEVTKNIIH PDGRLNQEYFHPPKGYKPEDARKWTETEKQLLIKGIEEHGIGNFGLISKESLPKWSTNDLRVK CIRLIGRQNLQLYRGWKGNADDITREYERNKEIGLKYGTWKQGVLVYDDDGMVEKELLATAAT PADSMSMEEDEDMATD >ORY96423.1 [Syncephalastrum racemosum] (SEQ ID No: 92) MMTATDEDVDMKDVDIKLESNQETEQKILTPEEQKEKEKQDWIRQLRLKFCIRPEYEITKNMIF PDGTLNQDYFRPPKGAKVEEARKWTEVEKELLIQGIEKYGIGNFGEVSKALLPAWSTNDLRIK CIRLIGRQNLQLYRGWKGNADDIAREYNRNKELGLKYGTWKQGVLVYDDDGLVEKEILAQDA AAKGEDVDMN >XP_021886199.1 [Lobosporangium transversale] (SEQ ID No: 93) MEINQEQLPSSSSILHPTSTSSSSSPSPSPSPASPKPERVFDARQRRINEIRLKFCIRDEFPITK
NMIHPDGTLNQDYFRPPRGSKPVEVARKWTDKERELLIKGIEKYGIGHFREISEEFLPLWSGN DLRIKTMRLVGRQNLQLYKDWKGNEQDLAREFELNKAIGLKYGAWKAGTLVADDDGLVAKAI EEQWPGSNSGTGKTTAVIGISSEENSEVSTPLNDEDVDME >ORY01319.1 [Basidiobolus meristosporus CBS 931.73] (SEQ ID No: 94) MEVDQNDSSVAKETAEQPETPEISKELLERQEWIKNMRLQFCVRPEFEVTKNIIHEDGMLNQE YFLPPKGAKLEAEPERKWTETERNLLIQGIQQYGIGHFREISEALLPQWSGNDLRVKSMRLMG RQNLQLYKDWKGSIEDIEREYERNKAIGLKYNTWKNSTLVYDDAQLVLKAIEASEPKP >ORX67568.1 [Linderina pennispora] (SEQ ID No: 95) MDTASPDDGAIAQPMLGVEDADFWRQKQEWVKQMRLQFSRRPEFPETHNMIDDEGMLNQE YFQPPKDAVAPKERKWGDDEKRRLLEGIEKHGIGHFREISEESLPEWSGNDLRMKAIRLMGR QNLQLYKGWKGDAAAIGLKHGTWKGGALVYDDDGVVLKAIQESNRANPP >ORX84766.1 [Anaeromyces robustus] (SEQ ID No: 96) MVVEKETNKENIKNIKEELDKKHAWVKEMRKKFCVRKEFENTKILILEDGTLNQDYFRLSKGTV LKTNEVRKWTSIERGLLIKGIEKYGIGHFREISENLLPKWSGNDLRIKTIHLIGRQNLKLYKDWK GNEEDIKREYNRNKEIGLKCNAWKNNCLVDDGHGKVKAMIEATENN >ORX54764.1 [Piromyces finnis] (SEQ ID No: 97) MVVEKDLAQENKIKEELNKKHEWVKEMRKKFCVRKEFENTKNLILEDGTLNQEYFRLSKGTVL KTNEVRKWTSIERNLLIKGIEKYGIGHFREISESLLPKWSGNDLRIKTIHLIGRQNLKLYKDWKG GEEDIKREYNRNKEIGLKCNAWKNNCLIDDGNGKVKEMIEATEPKH >ORX56334.1 [Hesseltinella vesiculosa] (SEQ ID No: 98) MLAGDAELVEKPHNALNAEDTEMEDVDHSSHPDTTVDLSPEQLRLQEKQAWINQMRLKFCV REEFEITKNMIHPDGILNQDYFKPPKKSKKKKSKSKSKGTDETKDDTEAKGEDNKEDEDME >PNW76495.1 [Chlamydomonas reinhardtii] (SEQ ID No: 99) MAFAAALAEKRGPRVGDAASLWNFTPAPGWSREEVQILRLCLMKHGVGQWMQILSTGLLPG KLIQQLNGQTQRLLGQQSLAAYTGLKVDVDRIRVDNETRTDATRKAGLIINDGPNLTKEMKEK MRQDAVAKYGLTPEQVAEVDEQLAEIAAAFNPASTSAAAGAGSGAAAAGQAAAAGSGAGGS GQAATAADAGGAAGRGTGSAGGAAAAAPPRNALAISTGVLAATLLDASLGNLMAQPTEQLSA EQLGQLLLRLRNRLACLVDRARGRAGLPPRTAPRWATEAAAAACLAAMAAAEASAPQAPAAA AGGQEGAAGPVMVSVPFSREVLAEATACRVRSGTAAGARGNAPGAQGGVRKRTSKGGKAK GGDREWSPEGEENTAPQPRGGGKRKSGAVAGGEEADGVASGRAKRASRPKRGSSKHDPY VDDNDYGDEGIDPFDVGDDLDDMNPHGRYGNGGGRRADPSEAISALTAMGFTQSKARGALR ECNFNVELAVEWLFANCL >XP_001698763.1 [Chlamydomonas reinhardtii] (SEQ ID No: 100) MAFAAALAEKRGPRVGDAASLWNFTPAPGWSREEVQILRLCLMKHGVGQWMQILSTGLLPG 
KLIQQLNGQTQRLLGQQSLAAYTGLKVDVDRIRVDNETRTDATRKAGLIINDGPNLTKEMKEK MRQDAVAKYGLTPEQVAEVDEQLAEIAAAFNPASTSAAAGAGSGAAAAGQAAAAGSGAGGS GNLMAQPTEQLSAEQLGQLLLRLRNRLACLVDRARGRAGLPPRTAPRWATEAAAAACLAAM AAAEASAPQAPAAAAGGQEGAAGPVMVSVPFSREVLAEATACRVRSGTAAGARGNAPGAQ GGVRKRTSKGGKAKGGDREWSPEGEENTAPQPRGGGKRKSGAVAGGEEADGVASGRAKR ASRPKRGSSKHDPYVDDNDYGDEGIDPFDVGDDLDDMNPHGRYGNGGGRRADPSEAISALT AMGFTQSKARGALRECNFNVELAVEWLFANCL >XP_011237366.1 [Mus musculus] (SEQ ID No: 101) MPRRQAEAMDIDAEREKITQEIQELERILYPGSTSVHFEVSESSLSSDSEADSLPDEDLETAGA PILEEEGSSESSNDEEDPKDKALPEDPETCLQLNMVYQEVIREKLAEVSQLLAQNQEQQEEILF DLSGTKCPKVKDGRSLPSYMYIGHFLKPYFKDKVTGVGPPANEETREKATQGIKAFEQLLVTK WKHWEKALLRKSVVSDRLQRLLQPKLLKLEYLHEKQSRVSSELERQALEKQIKEAEKEIQDIN QLPEEALLGNRLDSHDWEKISNINFEGARSAEEIRKFWQSSEHPSISKQEWSTEEVERLKAIA ATHGHLEWHLVAEELGTSRSAFQCLQKFQQYNKTLKRKEWTEEEDHMLTQLVQEMRVGNHI PYRKIVYFMEGRDSMQLIYRWTKSLDPSLKRGFWAPEEDAKLLQAVAKYGAQDWFKIREEVP GRSDAQCRDRYIRRLHFSLKKGRWNAKEEQQLIQLIEKYGVGHWARIASELPHRSGSQCLSK WKILARKKQHLQRKRGQRPRHSSQWSSSGSSSSSSEDYGSSSGSDGSSGSENSDVELEAS LEKSRALTPQQYRVPDIDLWVPTRLITSQSQREGTGCYPQHPAVSCCTQDASQNHHKEGSTT VSAAEKNQLQVPYETHSTVPRGDRFLHFSDTHSASLKDPACKPVLKVPLEKMPKLIRTRPPTQ SHTLMKERPKQPLLPSSRSGSDPGNNTAGPHLRQLWHGTYQNKQRRKRQALHRRLLKHRLL LAVIPWVGDINLACTQAPRRPATVQTKADSIRMQLECARLASTPVFTLLIQLLQIDTAGMEVV RERKSQPPALLQPGTRNTQPHLLQASSNAKNNTGCLPSMTGEQTAKRASHKGRPRLGSCRT EATPFQVPVAAPRGLRPKPKTVSELLREKRLRESHAKKATQALGLNSQLLVSSPVILQPPLLPV PHGSPVVGPATSSVELSVPVAPVMVSSSPSGSWPVGGISATDKQPPNLQTISLNPPHKGTQV AAPAAFRSLALAPGQVPTGGHLSTLGQTSTTSQKQSLPKVLPILRAAPSLTQLSVQPPVSGQP LATKSSLPVNWVLTTQKLLSVQVPAVVGLPQSVMTPETIGLQAKQLPSPAKTPAFLEQPPAST DTEPKGPQGQEIPPTPGPEKAALDLSLLSQESEAAIVTWLKGCQGAFVPPLGSRMPYHPPSL CSLRALSSLLLQKQDLEQKASSLAASQAAGAQPDPKAGALQASLELVQRQFRDNPAYLLLKTR FLAIFSLPAFLATLPPNSIPTTLSPDVAVVSESDSEDLGDLELKDRARQLDCMACRVQASPAAP DPVQSHLVSPGQRAPSPGEVSAPSPLDASDGLDDLNVLRTRRARHSRR >XP_006497966.1 [Mus musculus] (SEQ ID No: 102) MPRRQAEAMDIDAEREKITQEIQELERILYPGSTSVHFEVSESSLSSDSEADSLPDEDLETAGA 
PILEEEGSSESSNDEEDPKDKALPEDPETCLQLNMVYQEVIREKLAEVSQLLAQNQEQQEEILF DLSGTKCPKVKDGRSLPSYMYIGHFLKPYFKDKVTGVGPPANEETREKATQGIKAFEQLLVTK WKHWEKALLRKSVVSDRLQRLLQPKLLKLEYLHEKQSRVSSELERQALEKQIKEAEKEIQDIN QLPEEALLGNRLDSHDWEKISNINFEGARSAEEIRKFWQSSEHPSISKQEWSTEEVERLKAIA ATHGHLEWHLVAEELGTSRSAFQCLQKFQQYNKTLKRKEWTEEEDHMLTQLVQEMRVGNHI PYRKIVYFMEGRDSMQLIYRWTKSLDPSLKRGFWAPEEDAKLLQAVAKYGAQDWFKIREEVP GRSDAQCRDRYIRRLHFSLKKGRWNAKEEQQLIQLIEKYGVGHWARIASELPHRSGSQCLSK WKILARKKQHLQRKRGQRPRHSSQWSSSGSSSSSSEDYGSSSGSDGSSGSENSDVELEAS LEKSRALTPQQYRVPDIDLWVPTRLITSQSQREGTGCYPQHPAVSCCTQDASQNHHKEGSTT VSAAEKNQLQVPYETHSTVPRGDRFLHFSDTHSASLKDPACKSHTLMKERPKQPLLPSSRSG SDPGNNTAGPHLRQLWHGTYQNKQRRKRQALHRRLLKHRLLLAVIPWVGDINLACTQAPRRP ATVQTKADSIRMQLECARLASTPVFTLLIQLLQIDTAGCMEVVRERKSQPPALLQPGTRNTQP HLLQASSNAKNNTGCLPSMTGEQTAKRASHKGRPRLGSCRTEATPFQVPVAAPRGLRPKPK TVSELLREKRLRESHAKKATQALGLNSQLLVSSPVILQPPLLPVPHGSPVVGPATSSVELSVPV APVMVSSSPSGSWPVGGISATDKQPPNLQTISLNPPHKGTQVAAPAAFRSLALAPGQVPTGG HLSTLGQTSTTSQKQSLPKVLPILRAAPSLTQLSVQPPVSGQPLATKSSLPVNWVLTTQKLLSV QVPAVVGLPQSVMTPETIGLQAKQLPSPAKTPAFLEQPPASTDTEPKGPQGQEIPPTPGPEKA ALDLSLLSQESEAAIVTWLKGCQGAFVPPLGSRMPYHPPSLCSLRALSSLLLQKQDLEQKASS LAASQAAGAQPDPKAGALQASLELVQRQFRDNPAYLLLKTRFLAIFSLPAFLATLPPNSIPTTLS PDVAVVSESDSEDLGDLELKDRARQLDCMACRVQASPAAPDPVQSHLVSPGQRAPSPGEVS APSPLDASDGLDDLNVLRTRRARHSRR >EJY86254.1 [Oxytricha trifallax] (SEQ ID No: 103) MSVHHKMADSKSLHNYTLSPGWTREEVDILKIALMKFGIGKWKKIQKSGCLPSKTISQMNLQT QRLLGQQSLAEFMGLHVYLDRVFRDNSLKTGPEIQRKNNFIINTGNNLTQPEKEKRLRLNKQK YGLDLAFIKTLRLPKPESATGGKREAILSMDQIFAQKSHFTVVEKLKHLEALKNALCSKLGKIER RRRNKELSKIYRPLCQLIVVQKNADDQYEFVDIIDENE >ORX69504.1 [Linderina pennispora] (SEQ ID No: 104) MSSATPYAPRSMPTGQRNVVRSNDSASLWNCTLSPGWTQEEVQVLRKALMKFGVGNWMKII ESECLPGKTIAQMNLQTQRMLGQQSTAEFNGLHLDAFVIGELNSKKQGPGIKRKNNCIVNTGG KLTRDEVVKRQQKHREQYEVKAEVWRAIVLPKPDNPLILLEKKREELKKVRLELEEIMKQIEET EKLVDVPEHAPGTKRARE >NP_003077.2 [Homo sapiens] (SEQ ID No: 105) MDVDAEREKITQEIKELERILDPGSSGSHVEISESSLESDSEADSLPSEDLDPADPPISEEERW 
GEASNDEDDPKDKTLPEDPETCLQLNMVYQEVIQEKLAEANLLLAQNREQQEELMRDLAGSK GTKVKDGKSLPPSTYMGHFMKPYFKDKVTGVGPPANEDTREKAAQGIKAFEELLVTKWKNW EKALLRKSVVSDRLQRLLQPKLLKLEYLHQKQSKVSSELERQALEKQGREAEKEIQDINQLPEE ALLGNRLDSHDWEKISNINFEGSRSAEEIRKFWQNSEHPSINKQEWSREEEERLQAIAAAHGH LEWQKIAEELGTSRSAFQCLQKFQQHNKALKRKEWTEEEDRMLTQLVQEMRVGSHIPYRRIV YYMEGRDSMQLIYRWTKSLDPGLKKGYWAPEEDAKLLQAVAKYGEQDWFKIREEVPGRSDA QCRDRYLRRLHFSLKKGRWNLKEEEQLIELIEKYGVGHWAKIASELPHRSGSQCLSKWKIMM GKKQGLRRRRRRARHSVRWSSTSSSGSSSGSSGGSSSSSSSSSEEDEPEQAQAGEGDRAL LSPQYMVPDMDLWVPARQSTSQPWRGGAGAWLGGPAASLSPPKGSSASQGGSKEASTTA AAPGEETSPVQVPARAHGPVPRSAQASHSADTRPAGAEKQALEGGRRLLTVPVETVLRVLRA NTAARSCTQKEQLRQPPLPTSSPGVSSGDSVARSHVQWLRHRATQSGQRRWRHALHRRLL NRRLLLAVTPWVGDVVVPCTQASQRPAVVQTQADGLREQLQQARLASTPVFTLFTQLFHIDT AGCLEVVRERKALPPRLPQAGARDPPVHLLQASSSAQSTPGHLFPNVPAQEASKSASHKGSR RLASSRVERTLPQASLLASTGPRPKPKTVSELLQEKRLQEARAREATRGPVVLPSQLLVSSSVI LQPPLPHTPHGRPAPGPTVLNVPLSGPGAPAAAKPGTSGSWQEAGTSAKDKRLSTMQALPL APVFSEAEGTAPAASQAPALGPGQISVSCPESGLGQSQAPAASRKQGLPEAPPFLPAAPSPT PLPVQPLSLTHIGGPHVATSVPLPVTWVLTAQGLLPVPVPAVVSLPRPAGTPGPAGLLATLLPP LTETRAAQGPRAPALSSSWQPPANMNREPEPSCRTDTPAPPTHALSQSPAEADGSVAFVPG EAQVAREIPEPRTSSHADPPEAEPPWSGRLPAFGGVIPATEPRGTPGSPSGTQEPRGPLGLE KLPLRQPGPEKGALDLEKPPLPQPGPEKGALDLGLLSQEGEAATQQWLGGQRGVRVPLLGS RLPYQPPALCSLRALSGLLLHKKALEHKATSLVVGGEAERPAGALQASLGLVRGQLQDNPAYL LLRARFLAAFTLPALLATLAPQGVRTTLSVPSRVGSESEDEDLLSELELADRDGQPGCTTATC PIQGAPDSGKCSASSCLDTSNDPDDLDVLRTRHARHTRKRRRLV >XP_016870547.1 [Homo sapiens] (SEQ ID No: 106) MDVDAEREKITQEIKELERILDPGSSGSHVEISESSLESDSEADSLPSEDLDPADPPISEEERW GEASNDEDDPKDKTLPEDPETCLQLNMVYQEVIQEKLAEANLLLAQNREQQEELMRDLAGSK GTKVKDGKSLPPSTYMGHFMKPYFKDKVTGVGPPANEDTREKAAQGIKAFEELLVTKWKNW EKALLRKSVVSDRLQRLLQPKLLKLEYLHQKQSKVSSELERQALEKQGREAEKEIQDINQLPEE ALLGNRLDSHDWEKISNINFEGSRSAEEIRKFWQNSEHPSINKQEWSREEEERLQAIAAAHGH LEWQKIAEELGTSRSAFQCLQKFQQHNKALKRKEWTEEEDRMLTQLVQEMRVGSHIPYRRIV YYMEGRDSMQLIYRWTKSLDPGLKKGYWAPEEDAKLLQAVAKYGEQDWFKIREEVPGRSDA QCRDRYLRRLHFSLKKGRWNLKEEEQLIELIEKYGVGHWAKIASELPHRSGSQCLSKWKIMM 
GKKQGLRRRRRRARHSVRWSSTSSSGSSSGSSGGSSSSSSSSSEEDEPEQAQAGEGDRAL LSPQYMVPDMDLWVPARQSTSQPWRGGAGAWLGGPAASLSPPKGSSASQGGSKEASTTA AAPGEETSPVQVPARAHGPVPRSAQASHSADTRPAGAEKQALEGGRRLLTVPVETVLRVLRA NTAARSCTQWLRHRATQSGQRRWRHALHRRLLNRRLLLAVTPWVGDVVVPCTQASQRPAV VQTQADGLREQLQQARLASTPVFTLFTQLFHIDTAGCLEVVRERKALPPRLPQAGARDPPVHL LQASSSAQSTPGHLFPNVPAQEASKSASHKGSRRLASSRVERTLPQASLLASTGPRPKPKTV SELLQEKRLQEARAREATRGPVVLPSQLLVSSSVILQPPLPHTPHGRPAPGPTVLNVPLSGPG APAAAKPGTSGSWQEAGTSAKDKRLSTMQALPLAPVFSEAEGTAPAASQAPALGPGQISVSC PESGLGQSQAPAASRKQGLPEAPPFLPAAPSPTPLPVQPLSLTHIGGPHVATSVPLPVTWVLT AQGLLPVPVPAVVSLPRPAGTPGPAGLLATLLPPLTETRAAQGPRAPALSSSWQPPANMNRE PEPSCRTDTPAPPTHALSQSPAEADGSVAFVPGEAQVAREIPEPRTSSHADPPEAEPPWSGR LPAFGGVIPATEPRGTPGSPSGTQEPRGPLGLEKLPLRQPGPEKGALDLEKPPLPQPGPEKG ALDLGLLSQEGEAATQQWLGGQRGVRVPLLGSRLPYQPPALCSLRALSGLLLHKKALEHKAT SLVVGGEAERPAGALQASLGLVRGQLQDNPAYLLLRARFLAAFTLPALLATLAPQGVRTTLSV PSRVGSESEDEDLLSELELADRDGQPGCTTATCPIQGAPDSGKCSASSCLDTSNDPDDLDVL RTRHARHTRKRRRLV >XP_020936800.1 [Sus scrofa] (SEQ ID No: 107) MDVDAEREKISKEIKELERILDPGSSGINDDVSESSLDSDSEAESLPDDDADATGPLLSEDERW GDASNDEDDAKERALPEDPETCLQLNMVYQEVVREKLAEVSLLLAQNREQQEEVSWALAGS GGRRVKDGRSPPARLYVGHFMKPYFKDKVTGAGPPANEDTREKAAQGVKAFEELLVTKWKS WEKALLRKAVVSDRLQRLLQPKLLKLEYLQQKQSRATSDAERQALEKQVREAEKEVQDISQL PEEALLGHRLDSHDWEKIANVNFEGGRSAEETRKFWQNHEHPSINKQEWSAQEVDRLKAIAA KHGHLRWQEIAEELGTRRSAFQCLQKYQQHNAALKRREWTQEEDRMLTQLVQAMGVGSHIP YRRIAYYMEGRDSTQLIYRWTKSLDPALKKGLWAPEEDAKLLQAVAKYGEQDWFKIREEVPG RSDAQCRDRYLRRLRLSLKKGRWSAQEEERLLELIGKHGVGHWAKIASELPHRTDSQCLSK WKIMARKQQSRGRRRRRPLRRVCWSSSSEDSEDSGDSGGSSSSSSSSEDVEPEGAPEARA DGPAPPSAQHPVPDMDLWVPTRQSARVPWGVGPGAWPGHRSASPRPPEGSDVAPGEEAG RAQAPSETPSASLRGGGCPRSADARPSGSEGLADEGPRRPLTVPLETVLRVLRTNTAALCRA LKEKLRRPRLLGSPLGPSPSDGSVARPRVQPRWRRRHALQRRLLERQLLMAVSPWVGDVTL PCAPWRPAVLHRRADGIGKQLQGARLASTPVFTLLIQLFRIDTAGCMEVVRERRAQPPALPSG GRVPSSARNSPGHLFQNGSARGAAKKSASHSGGGGPQSAPAPSGPRPKPKTVSELLREKRL REARARKAAQGPAVLPPQGLLSSPAILQPLPPQQLPVSGAVLSGPGGPAVASPGAPGPWAS 
AKEGPPSLHALALAPASMAAGVTPAAPRAPALGPSQVPASCHLSSLGQSQAPATSRKQGLPE APPFLPAAPSPIQLPVQPRSLTPALAAHTGASHVVASTPLPVTWVLTAQGLLPVPAVVGLPRP AGPPDPEGLSGTPPPSLTETRAGRGPKQPPAHVSVGPDPPAKTPPTAQSPAEGDGDVAHGP GGPSCPGEAQVAGEASVPRTLSPAKPLADHPEAEPCGSSQLPLPGGLSPGGAPTRHQGLER PPPPWPGPEKGAPDLRLLSQESEAAVRGWLTGQRGVCVPPLASRLPYQPPTLCSLRALSGLL LHKKALEHRAASLVPSGAAGAQQAPLGQVRERLQSSPAYLLLKARFLAAFALPALLATLPPHG VPTTLSAAAGVDSESDDDSLDELELADNGGPLGGWPSGRQAGPAAPTPTQGAPGEGSAAP GLDSDDLDILRTRHAWHARKRRRLV >XP_021883515.1 [Lobosporangium transversale] (SEQ ID No: 108) MSSGSTPRSMTAGARNILRSNDSASLWNYTVAPGWSMKEAEILRKALMKFGIGNWSKIIESN CLVGKTNAQMNLQTQRMLGQQSTAEFAGLHIDPRVIGQKNSLIQGDHIRRKNGCIVNTGAKLS REEIRRRVAENKEQYELPEEEWSSIELPLPDDPHLLLEAKKSEKVRLELELKNVQRQIAMLRKV GRKFETGSESPKTELDDDERDEFIEDQPLGKRARIEA >EJY76686.1 [Oxytricha trifallax] (SEQ ID No: 109) MRVYLKFCNRKQIHYTHTMSSSISAAIMAGNQNKKIAESKSLWNYALSPGWTQQEVEILKIALM KFGVGRWSAINKSGVLPTKQIQQCYLQTQRLIGQQSLAEFMGLHLDIDRIAADNKQKRGIRKQ GFLVNQGCKLTPEEKDELRKINQEKYGLTAEHVEAIKLPAPCHLVEIFQIDKIMHPRSTLSTMDK IKHLIKLEDALKSKLEMIREGKRQQKFEQLQQKLKTTEASGRGSVTRVQRQMSDLHLGSAHQN RNSDLDEENDQSVMIIDESQQQNLTPKGKAQTMLTNQTQTMKKQADDSRDEQHLPLISTSAS VSNPSSTSKSSALKLNSMKQSDTAIASMKPSSSGKKTKVDSSFVSKQSNQQSTSYSETNVDT QNSNNQGTSTASGNFISQSDDEEALMPKLKRRRVEDSE >EJY73777.1 [Oxytricha trifallax] (SEQ ID No: 110) MSHATSHGNSTEKDKKNSGNMVAESKSLWNYALSPGWTPQEVDVLKIALMKFGIGKWTIIDK SGILPTKTIQQCYLQTQRILGQQSLAEFMGLHVDIDKIALDNRRKNGIRKMGFLVNQGGKLTPE EKAHYQEINRQKYGLSPEEVETIKLPPPCSVEIYDINKIINPKSKLTTIEKINHCIKLQDALLEKLEN IKNKKIPTGAGFSSSRVYENMRGYDPQLLLNSHVTGQLDHSMQDLTIDERYSDLDEEEDPLAM ASIIDSQATPQPQKIKSSVPNKASTTPSAKEMNQIKDIIDSVIAENSAQQSKNLAQEKPKLKFSLV KATESNLLQSAAQNSDDVVMEEDSKLQHIETFSTVTQTATDQSNSQSKSQNNIASDSLKDSLE QNDLSKSLTDSLEMQQYSAEKKLNQAPMSKNSDKPKKKRLNKRKLPSDDEFETL >EJY79729.1 [Oxytricha trifallax] (SEQ ID No: 111) MSSSISAAIIAGNQNKKIAESKSLWNYALSPGWTQQEVEILKIALMKFGVGRWKTIEQSQCLPT KTMSQMYLQTQRLVGQQSLAEFMGLHLDLEQIFIKNAERQGAGVFRKNGCIINTGDNMTKVQI AKLRKKNSKIFGLTQPFVQSLHLPKAKVKEWLKVLTLDQILSAKSNFSTAEKIHYLKILENALER 
KLKKILRLQELVSIYRPCNIGIVVQKRLGSSIGDEYFEYVDCVKIEEKSVGNLDFALPNRNTDSTS LNEDFSFLDSTQKPQKLKAGSGRENKRKKMRDGLKDERAQRQSLMEALDEQEFDETKFQDS DGEMPDLNM >EJY81929.1 [Oxytricha trifallax] (SEQ ID No: 112) MSSSISAAIMAGNQNKKIAESKSLWNYALSPGWTQQEVEILKIALMKFGVGRWSAINKSGVLP TKQIQQCYLQTQRLIGQQSLAEFMGLHLDIDRIAADNKQKRGIRKQGFLVNQGCKLTPEEKDEL RKINQEKYGLSAEHVEAIKLPAPCHLVEIFQIDKIMHPRSTLSTMDKIKHLIKLEDALKSKLEMIRE GKRQQKFEQLQQKLKTTEASGRGSVTRVQRQMSDLHLGSSHQNRNSDLDEENDESVMIIDE SQQENLTPKGKAQAMLTHQKYNEVTQTMIKQGDDSRQQQHLPLDSTSASVSNPSSTSKSST MKSNSMKQSETAIASMKPSSIGKKTKVDSSFVTKQSNQQSTAPIQKQAHQQNLDRNRSELGS TFAQQASVDTQNSNNQGTSTASGNFISQSDDEEALMPKLKRRRVEDSE >EJY80746.1 [Oxytricha trifallax] (SEQ ID No: 113) MRVYLKFCNRKQIHYTHTMSSSISAAIMAGNQNKKIAESKSLWNYALSPGWTQQEVEILKIALM KFGVGRWSAINKSGVLPTKQIQQCYLQTQRLIGQQSLAEFMGLHLDIDRIAADNKQKRGIRKQ GFLVNQGCKLTPEEKDELRKINQEKYGLTAEHVEAIKLPAPCHLVEIFQIDKIMHPRSTLSTMDK IKHLIKLEDALKSKLEMIREGKRQQKFEQLQQKLKTTEASGRGSVTRVQRQMSDLHLGSAHQN RNSDLDEENDQSVMIIDESQQQNLTPKGKAQTMLTNQTQTMKKQADDSREEQHLPLNSTSAS VSNPSSTSKSSALKLNSMKQSDTAIASMKPSSSGKKTKVDSSFVSKQSNQQSTGPIQKQAHQ QNLDRNRSELGSTFAQQTNVDTQNSNNQGTSTASGNFISQSDDEEALMPKLKRRRVKDSE >ORX78557.1 [Basidiobolus meristosporus CBS 931.73] (SEQ ID No: 114) MTDVYKPRSMPVGARNVLRSNDSASLWNCTLSPGWTEPEVHILRKAVMKFGIGNWAKIIESQ CLFGKTIAQMNLQLQRMLGQQSTAEFAGLHLDPFVIGEINSKKQGPGIKRKNNCIVNTGGKLTR EEIKRRLLEHKRTYEISEEEWRSIELPKPEDPGAVLIAKKDELKMLEDELLRVVQKIQKAREERR SKSVDSSSVDGSVDDEARETKRRRK >ORX79686.1 [Anaeromyces robustus] (SEQ ID No: 115) MSIPKPRSMPTGFRNILRPNDSTSLWNCTLSPGWTQEESDILRDALIYYGIGNWKDIIEHGCLP DKTNAQMNLQLQRMLGQQSTAEFQNLHIDPYVIGKINSQKQGPNIRRKNGFIINTGGKLSREDI RRKIQENKENYELPKEEWSKIVLPNREVVIKNKVQEAINEKREKLNKLEDELDSVLKAIVNRRR ELRGMIPLKDSEMKSLVNRSAKNEGENKTETTNNEESNNTNNSDDIKDENNETSTSSHIFTNN DNELSENNSSSSSSNSISNKKKRFLRREVRRGKRRYNYDDDDFMPSGNRSRKSRKI >ORX56566.1 [Piromyces finnis] (SEQ ID No: 116) MSIPKPRSMPVGFRNILRPNDSTSLWNCTLSPGWTQEESDILRDALIFYGIGNWKDIIEHGCLP DKTNAQMNLQLQRMLGQQSTAEFQNLHIDPYEIGKINSQKQGPNIRRKNGFIINTGGKLSREDI 
KRKIQENKENYELPEEVWSKIVLPNREVVTINEKRQKLNKLEEELDSVLKQIVNRRRELRGMTP LKETEMKSIVNRSNQNDTKTEEKEIKEEESTTVNEEKIENTETSSISIISTNENEQSENISSSSPIV KSEQKKKRVVSRRKNKRRVNSDDEDFLPPGKSRSKRTRRTPKKSSN >XP_001009903.1 [Tetrahymena thermophila SB210] (SEQ ID No: 117) MSLKKGKFQHNQSKSLWNYTLSPGWREEEVKILKSALQLFGIGKWKKIMESGCLPGKSIGQIY MQTQRLLGQQSLGDFMGLQIDLEAVFNQNMKKQDVLRKNNCIINTGDNPTKEERKRRIEQNR KIYGLSAKQIAEIKLPKVKKHAPQYMTLEDIENEKFTNLEILTHLYNLKAEIVRRLAEQGETIAQPS IIKSLNNLNHNLEQNQNSNSSTETKVTLEQSGKKKYKVLAIEETELQNGPIATNSQKKSINGKRK NNRKINSDSEGNEEDISLEDIDSQESEINSEEIVEDDEEDEQIEEPSKIKKRKKNPEQESEEDDI EEDQEEDELVVNEEEIFEDDDDDEDNQDSSEDDDDDED >XP_020936799.1 [Sus scrofa] (SEQ ID No: 118) MDVDAEREKISKEIKELERILDPGSSGINDDVSESSLDSDSEAESLPDDDADATGPLLSEDERW GDASNDEDDAKERALPEDPETCLQLNMVYQEVVREKLAEVSLLLAQNREQQEEVSWALAGS GGRRVKDGRSPPARLYVGHFMKPYFKDKVTGAGPPANEDTREKAAQGVKAFEELLVTKWKS WEKALLRKAVVSDRLQRLLQPKLLKLEYLQQKQSRATSDAERQALEKQVREAEKEVQDISQL PEEALLGHRLDSHDWEKIANVNFEGGRSAEETRKFWQNHEHPSINKQEWSAQEVDRLKAIAA KHGHLRWQEIAEELGTRRSAFQCLQKYQQHNAALKRREWTQEEDRMLTQLVQAMGVGSHIP YRRIAYYMEGRDSTQLIYRWTKSLDPALKKGLWAPEEDAKLLQAVAKYGEQDWFKIREEVPG VTFEARAFPASRQRTSLPCAPLWPPALWVSRLGNRRGGRQPRGFSRTPRSVCRRYLRRLRL SLKKGRWSAQEEERLLELIGKHGVGHWAKIASELPHRTDSQCLSKWKIMARKQQSRGRRRR RPLRRVCWSSSSEDSEDSGDSGGSSSSSSSSEDVEPEGAPEARADGPAPPSAQHPVPDMD LWVPTRQSARVPWGVGPGAWPGHRSASPRPPEGSDVAPGEEAGRAQAPSETPSASLRGG GCPRSADARPSGSEGLADEGPRRPLTVPLETVLRVLRTNTAALCRALKEKLRRPRLLGSPLGP SPSDGSVARPRVQPRWRRRHALQRRLLERQLLMAVSPWVGDVTLPCAPWRPAVLHRRADG IGKQLQGARLASTPVFTLLIQLFRIDTAGCMEVVRERRAQPPALPSGGRVPSSARNSPGHLFQ NGSARGAAKKSASHSGGGGPQSAPAPSGPRPKPKTVSELLREKRLREARARKAAQGPAVLP PQGLLSSPAILQPLPPQQLPVSGAVLSGPGGPAVASPGAPGPWASAKEGPPSLHALALAPAS MAAGVTPAAPRAPALGPSQVPASCHLSSLGQSQAPATSRKQGLPEAPPFLPAAPSPIQLPVQ PRSLTPALAAHTGASHVVASTPLPVTWVLTAQGLLPVPAVVGLPRPAGPPDPEGLSGTPPPSL TETRAGRGPKQPPAHVSVGPDPPAKTPPTAQSPAEGDGDVAHGPGGPSCPGEAQVAGEAS VPRTLSPAKPLADHPEAEPCGSSQLPLPGGLSPGGAPTRHQGLERPPPPWPGPEKGAPDLR LLSQESEAAVRGWLTGQRGVCVPPLASRLPYQPPTLCSLRALSGLLLHKKALEHRAASLVPS 
GAAGAQQAPLGQVRERLQSSPAYLLLKARFLAAFALPALLATLPPHGVPTTLSAAAGVDSESD DDSLDELELADNGGPLGGWPSGRQAGPAAPTPTQGAPGEGSAAPGLDSDDLDILRTRHAWH ARKRRRLV >XP_009300052.1 [Danio rerio] (SEQ ID No: 119) MKCLSVNMTHLSRDSWLYTHDVQVTYNSFIKVSPCPKMASDDLRAQRDKIQREILALESTLGA DSSIADQLSSDNSSDYESDDSGPTVKRVERDDLETERLRIQREIEELENALGADAALENVLQD SDHDTDSSEDSADDLELPQNVETCLQMNLVYQEVLKEKLAELEQLLIENQQQQKEIEVQLSGP GNSIFSVPGVPPQKQFLGYFLKPYFKDKLTGLGPPANEETKERMKHGSIPVDNLKIKRWEGW QKTLLTNAVARDTMKRMLQPKLSKMEYLSNKLCRAEGEEKEQLKAQIELIEKQIAEIRTLKDDQ LLGDLQDDHDWDKISNIDFEGLRQADDLKRFWQNFLHPSINKSVWKQDEIYKLQAVAEEFKM CHWDKIAEALGTNRTAFMCFQTYQRYISKTFRRTHWTEEEDDLLRELVEKMRIGNFIPYIQMS HFMVGRDGSQLAYRWTSVLDPSLKKGPWSKEEDQLLRNAVAKYGTREWGRIRTEVPGRTD SACRDRYLDCLRETVKKGTWSYAEMELLKEKVAKYGVGKWAKIASEIPNRVDAQCLHKWKL MTRSKKPLKRPLSSITTSYPRNKRQKLLKTVKEEMFFNSSSDDESQINYMNSDESDDLAEDEN LEIPQKEYVQTEMKEWIPRNAMVWTITPGSFRTLWVRLPTNEEELRESTKESGLGSDSSENS ACPNDEPIMERNTILDRFGDVERTYVGMNTVVLHRRTDDEKAMFKVCMSDVKQFIQMKATEF AVKKKKKIKNKKRTLRDVFSLNTDLQKAVIPWIGNVIISTPANEAIFCEGDIVGIKAASIRLQKTSV FTFFIKAFHVDVNGCRTVIEIHKKLDIKMPLAINGNPKPTPISTSPKTVAVLLQQSKAASEHKKPA EPSQQPSLPPSQKPSLPPAQQPTQPPSLPPSVPPSQQPTLPPPSQPSQPPPQPPSLPPSQPP AQQPPQQPSLPPPQPPSLPPPQPPSLPTSQQQSLPPSQQHSLPPFQNPSLPPSQQPSLPPS KQPPQPLPVRQITTPTLIYPNNLVITNPNMEGEVQHLVFKGLLLPQQPSKAVSHIPLPVMQPKT PAQPIVVSKSPSVQDSNSVKSSKRICKPTKKAQALMEQSKVKSRKKEPQKQNQGNKNVVFPT VTLQTSPVIKILSPARLVQVTGLSPNFSSNQTINMPDKSLTIKSPQPCSSGNLHQSAPVVVHSS TNPTFVHSSVSNVSRDNLNVSSTINISPRVSRDALNPTSFLNSTTFPLPQNLSVQQSVQIVPQIP INVVHKATCTKAAKTSSDSSSDESVVKQHQLSPSTGRSIPPAVFNIQPNPSTPPTLSSGPVIFN PNNKVVAPKLCGLNVSSSQLPTVSTQKTKYRPIRPLGPLPVVAPPSRKVTSMSRIRAQSEGEP LISLRDLPAAGVNFDSHLIFPEKSSEVDDWMDGKGGIPLPHLDTSLPYLPPSAATIKTMTDLLRA KQPLLLAAKKVLPAQYQDECNEEVEVEAIRKVVAERFASNPAYLLCKARFLSCFTLPALLATINP CEERQLLSEDDEEDDHLATINPSEEHQSSTEDDEEDLQTNERSQPPTARTELNMNENEASAK QFSGIGPKRQRNQRIKRLIK
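The sequence records above share one flattened pattern: a `>` accession, a bracketed organism name, a `(SEQ ID No: N)` tag, and then residues wrapped with spaces. The following is a minimal sketch, assuming that pattern holds for every record, of how such text might be split back into individual records. The names `Record` and `parse_flat_records` and the sample string are hypothetical illustrations, not part of the disclosure.

```python
import re
from typing import NamedTuple


class Record(NamedTuple):
    accession: str
    organism: str
    seq_id: int
    sequence: str


# Header pattern assumed from the listing: ">ACC [Organism] (SEQ ID No: N)"
_HEADER = re.compile(r">(\S+)\s+\[([^\]]+)\]\s+\(SEQ ID No:\s*(\d+)\)")


def parse_flat_records(text: str) -> list[Record]:
    """Split flattened FASTA-like text into records (hypothetical helper)."""
    matches = list(_HEADER.finditer(text))
    records = []
    for i, m in enumerate(matches):
        # The body of a record runs until the next header or the end of text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[m.end():end]
        # Keep only all-uppercase alphabetic tokens so that inline prose
        # (for example a section heading) is not merged into the residues.
        residues = "".join(
            tok for tok in body.split() if tok.isalpha() and tok.isupper()
        )
        records.append(Record(m.group(1), m.group(2), int(m.group(3)), residues))
    return records


# Made-up toy input, not taken from the sequence listing.
sample = (
    ">AB1.1 [Genus species] (SEQ ID No: 1) MKV LLA Protein heading "
    ">CD2.1 [Other genus] (SEQ ID No: 2) MAA"
)
for rec in parse_flat_records(sample):
    print(rec.seq_id, rec.accession, rec.organism, rec.sequence)
```

The token filter is a heuristic: an all-uppercase word that is not sequence data (such as a table label) would be mistaken for residues, so the output should be checked against the published listing before any downstream use.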
TABLE 2 Primer sequences.

Name | Sequence (5′ to 3′) | Description
1781.0_qF24 | ACTAGTCTTAAATATGAGAAAGATGATTTGAATAAGAT (SEQ ID No: 120) | Contig1781.0 tiling qPCR primers
1781.0_qR24 | ATCCTAGCAATATTATCTACTTATAATTCTATTGACTATTAG (SEQ ID No: 121) | Contig1781.0 tiling qPCR primers
1781.0_qF23 | CTAATTAACTAATAGTCAATAGAATTATAAGTAGATAATATTGCT (SEQ ID No: 122) | Contig1781.0 tiling qPCR primers
1781.0_qR23 | CATTAAATCATTAACAGAGTAATGTCGTCATATATTTGTC (SEQ ID No: 123) | Contig1781.0 tiling qPCR primers
1781.0_qF22 | TTTAGTGAGCATAGACAAATATATGACGACATTACTC (SEQ ID No: 124) | Contig1781.0 tiling qPCR primers
1781.0_qR22 | GCGGAGATGTCTTTTTGACCTTTTGATAG (SEQ ID No: 125) | Contig1781.0 tiling qPCR primers
1781.0_qF21 | ATGTTAACATGCTTATTATTACTATCAAAAGGTCAA (SEQ ID No: 126) | Contig1781.0 tiling qPCR primers
1781.0_qR21 | GGCTGCTACTGATATTTATGTTCTTTATGTTTA (SEQ ID No: 127) | Contig1781.0 tiling qPCR primers
1781.0_qF20 | CAAAGAACACGAAGCTCATAAACATAAAGAACAT (SEQ ID No: 128) | Contig1781.0 tiling qPCR primers
1781.0_qR20 | TGGAGCAAATGCTGCTAATAACGAG (SEQ ID No: 129) | Contig1781.0 tiling qPCR primers
1781.0_qF19 | ACCTCCAGCAGCTCCGTTTCTATTATTTG (SEQ ID No: 130) | Contig1781.0 tiling qPCR primers
1781.0_qR19 | GGCCTGGGTATTTTCCCTGCTTTA (SEQ ID No: 131) | Contig1781.0 tiling qPCR primers
1781.0_qF18 | CTTCCCAGGTAAAATTTAAGGTAAATAAAGCAGG (SEQ ID No: 132) | Contig1781.0 tiling qPCR primers
1781.0_qR18 | TCAAGGTGGAGGACTCTTCGGTAAC (SEQ ID No: 133) | Contig1781.0 tiling qPCR primers
1781.0_qF17 | ATTACGAACCCACTACCTGAATTATTGTTACCG (SEQ ID No: 134) | Contig1781.0 tiling qPCR primers
1781.0_qR17 | AAACGTCCTGCAGGACAACGC (SEQ ID No: 135) | Contig1781.0 tiling qPCR primers
1781.0_qF16 | TTGATTGAAGTTTTAATTTGGTACTGGGC (SEQ ID No: 136) | Contig1781.0 tiling qPCR primers
1781.0_qR16 | TTGGATGCTGATCTGTTTTGTTTAGAAAG (SEQ ID No: 137) | Contig1781.0 tiling qPCR primers
1781.0_qF15 | TTGGGATTTCTTAACTGGATTTCTTTCTAAAC (SEQ ID No: 138) | Contig1781.0 tiling qPCR primers
1781.0_qR15 | CTGCTTAAATTAAGTACTTCTATGTTTGAAATTAATGTTC (SEQ ID No: 139) | Contig1781.0 tiling qPCR primers
1781.0_qF14 | CAATTAAAACACGTTGAACATTAATTTCAAACATAG (SEQ ID No: 140) | Contig1781.0 tiling qPCR primers
1781.0_qR14 | TGAGGATCCAAGGTAAATTTCATACAATC (SEQ ID No: 141) | Contig1781.0 tiling qPCR primers
1781.0_qF13 | GACTGCATGTATATGCTAATGATTGTATGAAATTTAC (SEQ ID No: 142) | Contig1781.0 tiling qPCR primers
1781.0_qR13 | AGTGGCATTTCCAAGGAAACATTAATAC (SEQ ID No: 143) | Contig1781.0 tiling qPCR primers
1781.0_qF12 | CAGTGTTTCCCTTTGTGTAAATGGG (SEQ ID No: 144) | Contig1781.0 tiling qPCR primers
1781.0_qR12 | TCAGTGGATAAACTAGCCTAAGGAAACAC (SEQ ID No: 145) | Contig1781.0 tiling qPCR primers
1781.0_qF11 | TTTTACAGACTGGACACAGTAGTGTTTCC (SEQ ID No: 146) | Contig1781.0 tiling qPCR primers
1781.0_qR11 | CCAGTGGTATCAACATGCGGTCATC (SEQ ID No: 147) | Contig1781.0 tiling qPCR primers
1781.0_qF10 | GATATATACACTCCCAGCAGTAAAGATGACC (SEQ ID No: 148) | Contig1781.0 tiling qPCR primers
1781.0_qR10 | GAATAGGCTCACTCTAAATTCGAGTGC (SEQ ID No: 149) | Contig1781.0 tiling qPCR primers
1781.0_qF9 | ATTCGCTAGGTCTAAGCAAATATTGCAC (SEQ ID No: 150) | Contig1781.0 tiling qPCR primers
1781.0_qR9 | TAAATAGCCAAAACAACCAATAAAATTAACAATAACCTC (SEQ ID No: 151) | Contig1781.0 tiling qPCR primers
1781.0_qF8 | CTTTTTGAGGGCGAGGTTATTGTTAATTTTATTG (SEQ ID No: 152) | Contig1781.0 tiling qPCR primers
1781.0_qR8 | GATCCATTAATTACAGAAATAAATAATAGGCAGCATA (SEQ ID No: 153) | Contig1781.0 tiling qPCR primers
1781.0_qF7 | ATATTGCCTGAATTATTATGCTGCCTATTATTTATT (SEQ ID No: 154) | Contig1781.0 tiling qPCR primers
1781.0_qR7 | AAATGTGCACCGTCATCAAATACC (SEQ ID No: 155) | Contig1781.0 tiling qPCR primers
1781.0_qF6 | GGATCACTATAATCATCTGGATGACTATTGG (SEQ ID No: 156) | Contig1781.0 tiling qPCR primers
1781.0_qR6 | AAGTGTAATGTAGTTTCAATGGTAGTGATGTG (SEQ ID No: 157) | Contig1781.0 tiling qPCR primers
1781.0_qF5 | TGACTTCTTCCAGTGGATTCACATC (SEQ ID No: 158) | Contig1781.0 tiling qPCR primers
1781.0_qR5 | GCCAATTAATTCATTTGTTCGTAGAGATATGTAA (SEQ ID No: 159) | Contig1781.0 tiling qPCR primers
1781.0_qF4 | CACTTTATAATAAATAAGAATTATTACATATCTCTACGAACAA (SEQ ID No: 160) | Contig1781.0 tiling qPCR primers
1781.0_qR4 | CTCACCAGTAATTTGCAGACACC (SEQ ID No: 161) | Contig1781.0 tiling qPCR primers
1781.0_qF3 | GGCTGACTGGGGTTGAGTTAATC (SEQ ID No: 162) | Contig1781.0 tiling qPCR primers
1781.0_qR3 | AATATAAACAAAATGGAATATACAAAACTTGAATAAGAAATAG (SEQ ID No: 163) | Contig1781.0 tiling qPCR primers
1781.0_qF2 | GAGACTGAGGATCTATTTCTTATTCAAGTTTTG (SEQ ID No: 164) | Contig1781.0 tiling qPCR primers
1781.0_qR2 | ATTAATACATTATTAACTTAAATATAAATATTTAAAGAATTATGAACAATAAT (SEQ ID No: 165) | Contig1781.0 tiling qPCR primers
1781.0_qF1 | CATTTTGTTTATATTATTGTTCATAATTCTTTAAATATTTATATTTAAGTTAAT (SEQ ID No: 166) | Contig1781.0 tiling qPCR primers
1781.0_qR1 | ACAAGATAACATTGCTAATTTTCAATAAATTAAATTAATACATT (SEQ ID No: 167) | Contig1781.0 tiling qPCR primers
1781.0_F | CCCCAAAACCCCAAAACCCCACTAGTCTTAAATATGAGAAAGATGATTTGAATAAG (SEQ ID No: 168) | Primer pair for amplifying chromosome, to be added to mini-genome
1781.0_R | CCCCAAAACCCCAAAACCCCACAAGATAACATTGCTAATTTTCAATAAATTAAAT (SEQ ID No: 169) |
15118.0_F | CCCCAAAACCCCAAAACCCCGATTTATGAAAGTGCTGTATTATTAAGGAATG (SEQ ID No: 170) | Primer pair for amplifying chromosome, to be added to mini-genome
15118.0_R | CCCCAAAACCCCAAAACCCCATTATTCCTACTTTTAGCTATATTAGAAATTCG (SEQ ID No: 171) |
1339.1_F | CCCCAAAACCCCAAAACCCCATGATGATACATAGATTCATTAAAATAAAAAAAAG (SEQ ID No: 172) | Primer pair for amplifying chromosome, to be added to mini-genome
1339.1_R | CCCCAAAACCCCAAAACCCCTTAGATGAATTAAATAAAGAATTCAAATAAATAC (SEQ ID No: 173) |
20718.0_F | CCCCAAAACCCCAAAACCCCATGAATCTGAAATCGGGCAGTTGAATACG (SEQ ID No: 174) | Primer pair for amplifying chromosome, to be added to mini-genome
20718.0_R | CCCCAAAACCCCAAAACCCCATTTATCATAATTATAGAGAAGATAGTGATGC (SEQ ID No: 175) |
20822.0_F | CCCCAAAACCCCAAAACCCCATGAGAGTTTGTGAAAAATTAAGTTTG (SEQ ID No: 176) | Primer pair for amplifying chromosome, to be added to mini-genome
20822.0_R | CCCCAAAACCCCAAAACCCCTATATTAAATATCAAGAAAAAGTAAAAAGACAG (SEQ ID No: 177) |
21162.0_F | CCCCAAAACCCCAAAACCCCAAGTCTCATTTTGGTTAGTGATGTTTGGATTG (SEQ ID No: 178) | Primer pair for amplifying chromosome, to be added to mini-genome
21162.0_R | CCCCAAAACCCCAAAACCCCGTATGATCGATGAATACAAAATCAAGTTGGAAG (SEQ ID No: 179) |
11991.0_F | CCCCAAAACCCCAAAACCCCACTTAAAAGGATTGCATGATTGTAAGGGAAATGTG (SEQ ID No: 180) | Primer pair for amplifying chromosome, to be added to mini-genome
11991.0_R | CCCCAAAACCCCAAAACCCCAATAATCGCACTTACATTATATCTGGAGAAATG (SEQ ID No: 181) |
5079.0_F | CCCCAAAACCCCAAAACCCCTTCTACTAAATTTCATTGATTTTTTTCAATTTC (SEQ ID No: 182) | Primer pair for amplifying chromosome, to be added to mini-genome
5079.0_R | CCCCAAAACCCCAAAACCCCATTTGATAGAATAGAAGAGAAATTATGGAATG (SEQ ID No: 183) |
13665.0_F | CCCCAAAACCCCAAAACCCCAAGTATAAATAAGGGAGTTGATATATAATATACTT (SEQ ID No: 184) | Primer pair for amplifying chromosome, to be added to mini-genome
13665.0_R | CCCCAAAACCCCAAAACCCCATGAGAATTCCTATTCAAAAATGAAAAAGTAGATTG (SEQ ID No: 185) |
22365.0_F | CCCCAAAACCCCAAAACCCCATAAGGTAGTATATTTTTATTAAGGATTGGAAATTA (SEQ ID No: 186) | Primer pair for amplifying chromosome, to be added to mini-genome
22365.0_R | CCCCAAAACCCCAAAACCCCATAAGACTAAATTTATTGAAATTATCTTGTTAATAG (SEQ ID No: 187) |
21620.0_F | CCCCAAAACCCCAAAACCCCTTGAGCCAATACTGAAAAGGATGATAGTGAATAGTG (SEQ ID No: 188) | Primer pair for amplifying chromosome, to be added to mini-genome
21620.0_R | CCCCAAAACCCCAAAACCCCTCATTTTTTAAATTGGATAGTAAGAAAAATTATAATAAAG (SEQ ID No: 189) |
15049.0_F | CCCCAAAACCCCAAAACCCCAAGGAATAAAATTCAATTCCAAAATGTAAGGTGAG (SEQ ID No: 190) | Primer pair for amplifying chromosome, to be added to mini-genome
15049.0_R | CCCCAAAACCCCAAAACCCCGTTAAAAGAACCAAGTGATATATTATAAGCCA (SEQ ID No: 191) |
16562.0_F | CCCCAAAACCCCAAAACCCCTTTATCAATTATAAATAAAAAGTTTTAAGTCTATTTTTAA (SEQ ID No: 192) | Primer pair for amplifying chromosome, to be added to mini-genome
16562.0_R | CCCCAAAACCCCAAAACCCCATAAGACAAATGCAACTTTATAAAGTAAATAAATTATC (SEQ ID No: 193) |
22360.0_F | CCCCAAAACCCCAAAACCCCAATGCAACATTTACTTTTAACATTAGAGATTATC (SEQ ID No: 194) | Primer pair for amplifying chromosome, to be added to mini-genome
22360.0_R | CCCCAAAACCCCAAAACCCCATAAGAGCAAAAGTTAATATAAAAATTCAAGGTG (SEQ ID No: 195) |
15836.0_F | CCCCAAAACCCCAAAACCCCGATTTGCACAGTTAATTTGAATTTGGTATTTG (SEQ ID No: 196) | Primer pair for amplifying chromosome, to be added to mini-genome
15836.0_R | CCCCAAAACCCCAAAACCCCTCATTTTTAGTATTTTAAATATCATTTAGTTTTAAGTAA (SEQ ID No: 197) |
2324.0_F | CCCCAAAACCCCAAAACCCCTTGATTGATTCCTGAATACAAATGAAATAATATAAAG (SEQ ID No: 198) | Primer pair for amplifying chromosome, to be added to mini-genome
2324.0_R | CCCCAAAACCCCAAAACCCCAAGACCAAAATAAAGAGGAATAATGAGAAGTAC (SEQ ID No: 199) |
22404.0_F | CCCCAAAACCCCAAAACCCCATGTAGAATTAATATGAGAACATCATTTTTTAAGC (SEQ ID No: 200) | Primer pair for amplifying chromosome, to be added to mini-genome
22404.0_R | CCCCAAAACCCCAAAACCCCATAATGTAAGAAATCTGATACAATAGAGAGATAAAC (SEQ ID No: 201) |
15403.0_F | CCCCAAAACCCCAAAACCCCGAATGGAAAATTTGTATGAAGTTCAGAGAGAAAG (SEQ ID No: 202) | Primer pair for amplifying chromosome, to be added to mini-genome
15403.0_R | CCCCAAAACCCCAAAACCCCATAAGATTATCAGTTATAAAAATTGATAGGGGATG (SEQ ID No: 203) |
17795.0_F | CCCCAAAACCCCAAAACCCCATCATACGATATCTTAAGTGTTGATCTGAATTAAAT (SEQ ID No: 204) | Primer pair for amplifying chromosome, to be added to mini-genome
17795.0_R | CCCCAAAACCCCAAAACCCCGTTAGGTTTAAGAGTAGAAATAAAAGGAGATAAG (SEQ ID No: 205) |
11141.0_F | CCCCAAAACCCCAAAACCCCTCTCACTATCTTTTGTAAAAAGTTGGTAGAT (SEQ ID No: 206) | Primer pair for amplifying chromosome, to be added to mini-genome
11141.0_R | CCCCAAAACCCCAAAACCCCGTTGGTTTAGAATAAAGAATTGTATTAACCAAATTTAT (SEQ ID No: 207) |
22342.0_F | CCCCAAAACCCCAAAACCCCGTGAATTAAAATATAAACGAATAAGATATAAAGATTG (SEQ ID No: 208) | Primer pair for amplifying chromosome, to be added to mini-genome
22342.0_R | CCCCAAAACCCCAAAACCCCTTAATTACTGAATTGTTTATTATAAGATTATAAG (SEQ ID No: 209) |
2240.0_F | CCCCAAAACCCCAAAACCCCGTAATGAATAAATTGTAAAGGTAAATTGCAA (SEQ ID No: 210) | Primer pair for amplifying chromosome, to be added to mini-genome
2240.0_R | CCCCAAAACCCCAAAACCCCAATGGCAAACATTTAAAATAAATATTAATATAAATTAC (SEQ ID No: 211) |
3531.0_F | CCCCAAAACCCCAAAACCCCTAAAAGGAAAACAAATAGAAGAAACTGAA (SEQ ID No: 212) | Primer pair for amplifying chromosome, to be added to mini-genome
3531.0_R | CCCCAAAACCCCAAAACCCCATTTGGATATTATGATTAGCAGTTTAGTG (SEQ ID No: 213) |
4701.0_F | CCCCAAAACCCCAAAACCCCTTTAAATAAAAATCGCATGAATTAAATGCAAG (SEQ ID No: 214) | Primer pair for amplifying chromosome, to be added to mini-genome
4701.0_R | CCCCAAAACCCCAAAACCCCTAGGTAAATGCAAATTGGAGAATTTCCAATAG (SEQ ID No: 215) |
20883.0_F | CCCCAAAACCCCAAAACCCCATATTAAGAATTGTGTAATTTTTGAGTAAATTG (SEQ ID No: 216) | Primer pair for amplifying chromosome, to be added to mini-genome
20883.0_R | CCCCAAAACCCCAAAACCCCATTTAGTAGAATCTTCAATAAATAAGCGTTATTG (SEQ ID No: 217) |
15191.0_F | CCCCAAAACCCCAAAACCCCTAGCATTAAATTTGTAAAAAGAATGAAATTTAATAT (SEQ ID No: 218) | Primer pair for amplifying chromosome, to be added to mini-genome
15191.0_R | CCCCAAAACCCCAAAACCCCAATATACATGATTTTAGATAAACAACAAATAAT (SEQ ID No: 219) |
19342.0_F | CCCCAAAACCCCAAAACCCCATCAAGAATGGATTAGAATTTTTAATGCTTTGC (SEQ ID No: 220) | Primer pair for amplifying chromosome, to be added to mini-genome
19342.0_R | CCCCAAAACCCCAAAACCCCGAGGAACTAGGGATTACTCATTTTACTTCAG (SEQ ID No: 221) |
15245.0_F | CCCCAAAACCCCAAAACCCCATGCATGTAATTTTCTGTCAAAATTGAGTAAATAG (SEQ ID No: 222) | Primer pair for amplifying chromosome, to be added to mini-genome
15245.0_R | CCCCAAAACCCCAAAACCCCGTAAGCTAAATAAGTAGACTAAATAGGTAG (SEQ ID No: 223) |
6109.0_F | CCCCAAAACCCCAAAACCCCAACCGCAAATAGAATATATAAAGGATAATTTA (SEQ ID No: 224) | Primer pair for amplifying chromosome, to be added to mini-genome
6109.0_R | CCCCAAAACCCCAAAACCCCGAAGTACTAAAAATAAAAAGTAAAGTATTAAAATAAAATC (SEQ ID No: 225) |
22610.0_F | CCCCAAAACCCCAAAACCCCGTAGACAGATTTTCCAGTTTATAGCTGTGTTTG (SEQ ID No: 226) | Primer pair for amplifying chromosome, to be added to mini-genome
22610.0_R | CCCCAAAACCCCAAAACCCCTTTATGAATTTTCTTAAATCTGTAAATAAATAAAATAAT (SEQ ID No: 227) |
11875.0_F | CCCCAAAACCCCAAAACCCCGTATGTTAATTTTATGCTTTAAATGATAGTTTA (SEQ ID No: 228) | Primer pair for amplifying chromosome, to be added to mini-genome
11875.0_R | CCCCAAAACCCCAAAACCCCTGGATTCCATTTTGAAGAATAATTTATTAAC (SEQ ID No: 229) |
15329.0_F | CCCCAAAACCCCAAAACCCCTTGTTTCGATTATATTCAAAATAGGAAATTTAG (SEQ ID No: 230) | Primer pair for amplifying chromosome, to be added to mini-genome
15329.0_R | CCCCAAAACCCCAAAACCCCATGAATTTCAATAACTTTTTATGAAAATGAATTTA (SEQ ID No: 231) |
20179.0_F | CCCCAAAACCCCAAAACCCCTAGGAAGAAAATCTTGTGTGCAATTTGAGATTAAC (SEQ ID No: 232) | Primer pair for amplifying chromosome, to be added to mini-genome
20179.0_R | CCCCAAAACCCCAAAACCCCTTGATAAAAACATAGATTAAATACTAGTGTATAAA (SEQ ID No: 233) |
9936.0_F | CCCCAAAACCCCAAAACCCCATATGGAATATTTAATTTGATTTAAATGAAACGAAATA (SEQ ID No: 234) | Primer pair for amplifying chromosome, to be added to mini-genome
9936.0_R | CCCCAAAACCCCAAAACCCCTTGTAACAGTAAATAGAATATTTTAATTACCAAAAC (SEQ ID No: 235) |
16267.0_F | CCCCAAAACCCCAAAACCCCTCATTTTAGAATTATCTGTACTTAATTATTTTG (SEQ ID No: 236) | Primer pair for amplifying chromosome, to be added to mini-genome
16267.0_R | CCCCAAAACCCCAAAACCCCATGAGCATGTTATTTTACTTCATTAGTCAATTTG (SEQ ID No: 237) |
4488.0_F | CCCCAAAACCCCAAAACCCCATGAAATGAATTCTAAGATTGAATTGCATG (SEQ ID No: 238) | Primer pair for amplifying chromosome, to be added to mini-genome
4488.0_R | CCCCAAAACCCCAAAACCCCAGAAGAGATCAATAAATTGAGAAGGAATTG (SEQ ID No: 239) |
8551.0_F | CCCCAAAACCCCAAAACCCCGTGTTACAATTTGCGTTTGAAATAGTTGGTTGATA (SEQ ID No: 240) | Primer pair for amplifying chromosome, to be added to mini-genome
8551.0_R | CCCCAAAACCCCAAAACCCCATATGGTAAAAATTGAAGAAAGAAATTCAAGAGAA (SEQ ID No: 241) |
11746.0_F | CCCCAAAACCCCAAAACCCCGTATTGATGATAAAATTGTATACAAGTTGATAG (SEQ ID No: 242) | Primer pair for amplifying chromosome, to be added to mini-genome
11746.0_R | CCCCAAAACCCCAAAACCCCTAGATGCTTAATTATTAAGAAGATTCTGGAATG (SEQ ID No: 243) |
22291.0_F | CCCCAAAACCCCAAAACCCCATAAACCAATGTAATTAATTTATTGGGTGTGTTG (SEQ ID No: 244) | Primer pair for amplifying chromosome, to be added to mini-genome
22291.0_R | CCCCAAAACCCCAAAACCCCTTAGATTAAATTTAGAGAGTTATAGAAATGTAGTAAAT (SEQ ID No: 245) |
17535.0_F | CCCCAAAACCCCAAAACCCCATCTCAATTTATAAAATCAGAATAAGAGATTGTC (SEQ ID No: 246) | Primer pair for amplifying chromosome, to be added to mini-genome
17535.0_R | CCCCAAAACCCCAAAACCCCAGAATAAAACAACTGAAGTAAATATGAGTTAC (SEQ ID No: 247) |
15372.0_F | CCCCAAAACCCCAAAACCCCTTTCAAATATAAAATAAACAGAAGAATGGCAAACG (SEQ ID No: 248) | Primer pair for amplifying chromosome, to be added to mini-genome
15372.0_R | CCCCAAAACCCCAAAACCCCAAATTCAATATTAAATGAAATAATTTTCAAAAGTG (SEQ ID No: 249) |
13537.0_F | CCCCAAAACCCCAAAACCCCATGAGATCAAATTTTTTTATTAAAATTCTTC (SEQ ID No: 250) | Primer pair for amplifying chromosome, to be added to mini-genome
13537.0_R | CCCCAAAACCCCAAAACCCCTTGGATTCATATTTTTGTTTAAGGCTTAGATA (SEQ ID No: 251) |
22613.0_F | CCCCAAAACCCCAAAACCCCATTAGAAAAGAGGATTTCAATAAAAGCAAATAT (SEQ ID No: 252) | Primer pair for amplifying chromosome, to be added to mini-genome
22613.0_R | CCCCAAAACCCCAAAACCCCATCGATTTATTATTGTTGAATTTAAAAGTATTGAA (SEQ ID No: 253) |
12585.0_F | CCCCAAAACCCCAAAACCCCGAGAGGTTTGATAAGTAGAATTAGTAAAATCTATAAAG (SEQ ID No: 254) | Primer pair for amplifying chromosome, to be added to mini-genome
12585.0_R | CCCCAAAACCCCAAAACCCCATTAGTACTATTTTCATAGATCTATGTATAAATTGAA (SEQ ID No: 255) |
5317.0_F | CCCCAAAACCCCAAAACCCCAATGGAAAGATAAACAGATTTTAATTTGGAAATAAAAT (SEQ ID No: 256) | Primer pair for amplifying chromosome, to be added to mini-genome
5317.0_R | CCCCAAAACCCCAAAACCCCTTTAAGCAGTATTTCTAAAATGTTGATGAAATAAAAAT (SEQ ID No: 257) |
17894.0_F | CCCCAAAACCCCAAAACCCCATAAGATAAAATTTAACGAAAAAAAGTTAAGTC (SEQ ID No: 258) | Primer pair for amplifying chromosome, to be added to mini-genome
17894.0_R | CCCCAAAACCCCAAAACCCCATAAGATGAAATATAGAGATAATTGAGCCTA (SEQ ID No: 259) |
3513.0_F | CCCCAAAACCCCAAAACCCCAATTACATATTAATGTACTTATGATAGAATG (SEQ ID No: 260) | Primer pair for amplifying chromosome, to be added to mini-genome
3513.0_R | CCCCAAAACCCCAAAACCCCTAATGATCAAATAACCTGAGTTAAAGAAG (SEQ ID No: 261) |
16420.0_F | CCCCAAAACCCCAAAACCCCAAATTATGAAAATAGACACTAATTGGATGTTC (SEQ ID No: 262) | Primer pair for amplifying chromosome, to be added to mini-genome
16420.0_R | CCCCAAAACCCCAAAACCCCTGATTCGTCATATGAAATTGAAAAGGAGTAAAT (SEQ ID No: 263) |
1084.1_F | CCCCAAAACCCCAAAACCCCAGCGCCATGAATCTGATGCATTTATTTTAAG (SEQ ID No: 264) | Primer pair for amplifying chromosome, to be added to
mini-genome 1084.1_R CCCCAAAACCCCAAAACCCCGTAGATCATTT ATGTAAAAGATTTTGAGAGATG (SEQ ID No: 265) 22651.0_F CCCCAAAACCCCAAAACCCCATACAATTATTA Primer pair for amplifying TAAATGAAAAAGCGCACTAATC (SEQ ID No: chromosome, to be added to 266) mini-genome 22651.0_R CCCCAAAACCCCAAAACCCCATAGTTACTAT GAAAGGACTGGTACATAGAAATAATAG (SEQ ID No: 267) 8670.0_F CCCCAAAACCCCAAAACCCCTTAAGTCAATA Primer pair for amplifying TCTAAATCAAATATTAGTAGTATAAT (SEQ ID chromosome, to be added to No: 268) mini-genome 8670.0_R CCCCAAAACCCCAAAACCCCGTCATATGGTT TTATAAAATAAAATTGAGATTTTTTTG (SEQ ID No: 269) 19107.0_F CCCCAAAACCCCAAAACCCCATAAGGATAAA Primer pair for amplifying TTCTATCATATAAGTGGAAGTGC (SEQ ID chromosome, to be added to No: 270) mini-genome 19107.0_R CCCCAAAACCCCAAAACCCCATTCTTGAATA TTGATTATGCATATTGTGTAAAATAG (SEQ ID No: 271) 21021.0_F CCCCAAAACCCCAAAACCCCAAGCGTTGAAT Primer pair for amplifying TTTTTATAATATATGATAAAC (SEQ ID chromosome, to be added to No: 272) mini-genome 21021.0_R CCCCAAAACCCCAAAACCCCTTAATGCCAAT AAACAGATGAAAGTAGAGTTATAG (SEQ ID No: 273) 15004.0_F CCCCAAAACCCCAAAACCCCATAGAGAGTGT Primer pair for amplifying TTTATTGAAGGACAGAGAATATTG (SEQ ID chromosome, to be added to No: 274) mini-genome 15004.0_R CCCCAAAACCCCAAAACCCCGAGCGTAAGAA ATATTCTTAGATAAATGGAAACTG (SEQ ID No: 275) 18789.0_F CCCCAAAACCCCAAAACCCCATGGCAATATC Primer pair for amplifying TTTGCGTGTTTCTGGC (SEQ ID chromosome, to be added to No: 276) mini-genome 18789.0_R CCCCAAAACCCCAAAACCCCATAAGAATAAA TTAAAGAAGATTTGAGAAAGATATGC (SEQ ID No: 277) 1335.1_F CCCCAAAACCCCAAAACCCCAAATGCTAAAA Primer pair for amplifying ATAATGAAAAATCTGAGGG (SEQ ID chromosome, to be added to No: 278) mini-genome 1335.1_R CCCCAAAACCCCAAAACCCCTAATGACAGGT TTAGTAATAATTTAGCTG (SEQ ID No: 279) 17286.0_F CCCCAAAACCCCAAAACCCCACGACTTAACA Primer pair for amplifying TTGCTGTTAAATATTCAGAAAT (SEQ ID No: chromosome, to be added to 280) mini-genome 17286.0_R CCCCAAAACCCCAAAACCCCTAAAATTGGAA AGGGGCAAATTTGCTTATGA (SEQ ID No: 281) 7278.0_F CCCCAAAACCCCAAAACCCCATGAGTAATAT Primer pair for 
amplifying ATACAAATTTTAAATGTATTTTGATTTA (SEQ chromosome, to be added to ID No: 282) mini-genome 7278.0_R CCCCAAAACCCCAAAACCCCATTGAGTGAGT ATTTTTATATTTATTGCGAGTTA (SEQ ID No: 283) 7752.0_F CCCCAAAACCCCAAAACCCCACAATAGGCAT Primer pair for amplifying ATTTAATAATTAATTGTTAAAG (SEQ ID No: chromosome, to be added to 284) mini-genome 7752.0_R CCCCAAAACCCCAAAACCCCACTCATTATAT AAGGCTGAAAAAATCAGAGG (SEQ ID No: 285) 244.1_F CCCCAAAACCCCAAAACCCCTAAATGTAAGA Primer pair for amplifying GTAAACTATCATATGAAAG (SEQ ID chromosome, to be added to No: 286) mini-genome 244.1_R CCCCAAAACCCCAAAACCCCATAATGCGAAA TATTCATCAGAGTAAATAATG (SEQ ID No: 287) 20383.0_F CCCCAAAACCCCAAAACCCCATACGTCATGA Primer pair for amplifying TTATAAGATTATTATAGAATGCTTAC (SEQ ID chromosome, to be added to No: 288) mini-genome 20383.0_R CCCCAAAACCCCAAAACCCCTCTTGTAAAAT AATAAGTTTAAGAAATTGAATTTAG (SEQ ID No: 289) 331.1_F CCCCAAAACCCCAAAACCCCATAATATCAAA Primer pair for amplifying TTAATGAATATTTATCAATTTTATTAAT (SEQ chromosome, to be added to ID No: 290) mini-genome 331.1_R CCCCAAAACCCCAAAACCCCCCCTAATGTCC ATAATTTATGTATCAAATAAGG (SEQ ID No: 291) 22208.0_F CCCCAAAACCCCAAAACCCCATGATGGTGGA Primer pair for amplifying GGAGTGAAGATAAATTAGAATG (SEQ ID No: chromosome, to be added to 292) mini-genome 22208.0_R CCCCAAAACCCCAAAACCCCAAAGTGCAATA AAAAGAGTGAAAATAAATTTTTG (SEQ ID No: 293) 21398.0_F CCCCAAAACCCCAAAACCCCATATACCAATG Primer pair for amplifying TTAAAAATGAATATTGATATAGAATAG (SEQ chromosome, to be added to ID No: 294) mini-genome 21398.0_R CCCCAAAACCCCAAAACCCCATAATACAAAG TAAAATTGTTTTTTATAGTTCATAA (SEQ ID No: 295) 11890.0_F CCCCAAAACCCCAAAACCCCACATAGTGAAT Primer pair for amplifying GAATTAATGAATAAGTTTGAG (SEQ ID No: chromosome, to be added to 296) mini-genome 11890.0_R CCCCAAAACCCCAAAACCCCGTGATAATAAA TTCCTGAGTATATAGTTTAAGAAG (SEQ ID No: 297) 13521.0_F CCCCAAAACCCCAAAACCCCGTGATTGCATT Primer pair for amplifying TTTTTGCGAAATATTTGC (SEQ ID No: chromosome, to be added to 298) mini-genome 13521.0_R CCCCAAAACCCCAAAACCCCTGAGTTCTCAT GTAATAAAAGAATCCATG 
(SEQ ID No: 299) 3511.0_F CCCCAAAACCCCAAAACCCCATGATGCTACA Primer pair for amplifying AAAACGCTATATAATCTATAAC (SEQ ID No: chromosome, to be added to 300) mini-genome 3511.0_R CCCCAAAACCCCAAAACCCCTTGAACTTTCA ATAGATGTTTGATTAAATTC (SEQ ID No: 301) 22209.0_F CCCCAAAACCCCAAAACCCCAAAGATATGTG Primer pair for amplifying GCTGGATTTTAAAATATGGTTG (SEQ ID No: chromosome, to be added to 302) mini-genome 22209.0_R CCCCAAAACCCCAAAACCCCAAGACTAATGA ATTTGAGAATTATAAAATAATGAATC (SEQ ID No: 303) 18924.0_F CCCCAAAACCCCAAAACCCCATCAACTTTAA Primer pair for amplifying TTCATTGTAGGAATTAAAGATGTAATAC (SEQ chromosome, to be added to ID No: 304) mini-genome 18924.0_R CCCCAAAACCCCAAAACCCCGTGAGAACAAA TAATAATAAAAATAAAGGAATTAA (SEQ ID No: 305) 14977.0_F CCCCAAAACCCCAAAACCCCAATTCTTTATCT Primer pair for amplifying GAATTAGATAAGAATTCATAAGC (SEQ ID chromosome, to be added to No: 306) mini-genome 14977.0_R CCCCAAAACCCCAAAACCCCGTGAGTATGCA ATAGATTGTTAATTAAATTTG (SEQ ID No: 307) 18694.0_F CCCCAAAACCCCAAAACCCCAAGTTGCTAAA Primer pair for amplifying AATAGTTGATAGCAACAAGTTAT (SEQ ID chromosome, to be added to No: 308) mini-genome 18694.0_R CCCCAAAACCCCAAAACCCCTGGATGTGTTT TTTTCCAAATTAATGAACAAAAATTAAA (SEQ ID No: 309) 13237.0_F CCCCAAAACCCCAAAACCCCAACATTCTAAA Primer pair for amplifying TTTCTTCTTTATAAGATTATTG (SEQ ID No: chromosome, to be added to 310) mini-genome 13237.0_R CCCCAAAACCCCAAAACCCCATCTAAACTAA TCTGAAACCAAAGATAGTATG (SEQ ID No: 311) 21338.0_F CCCCAAAACCCCAAAACCCCGTTATCCATAT Primer pair for amplifying ATACGTAAGCATTTTGCGATTG (SEQ ID No: chromosome, to be added to 312) mini-genome 21338.0_R CCCCAAAACCCCAAAACCCCGAAACCTATGC ATTATTTTTAAAGAAATATTAAATTAA (SEQ ID No: 313) 215.1_F CCCCAAAACCCCAAAACCCCTCGTACATTAA Primer pair for amplifying TAGTTGAAATTGCTTTTATTAAATTG (SEQ ID chromosome, to be added to No: 314) mini-genome 215.1_R CCCCAAAACCCCAAAACCCCGTAGTCTAAAA TAAATTTTATTTTGGGTTTTAA (SEQ ID No: 315) 13236.0_F CCCCAAAACCCCAAAACCCCGTTAAATGATA Primer pair for amplifying ATCATAGCAAAATTGCGGTAT (SEQ ID No: chromosome, to be 
added to 316) mini-genome 13236.0_R CCCCAAAACCCCAAAACCCCAAGGATAAATA TTGAAAGTAAATGTTCTAATTAATTTGC (SEQ ID No: 317) 16827.0_F CCCCAAAACCCCAAAACCCCAGAAATGAAAA Primer pair for amplifying GAATGATTTTTGAGGGGATTC (SEQ ID No: chromosome, to be added to 318) mini-genome 16827.0_R CCCCAAAACCCCAAAACCCCTAAAGGCAAAA GTCGATTTAAATGCTCAGTTTC (SEQ ID No: 319) 15136.0_F CCCCAAAACCCCAAAACCCCTTAAGGCTAAA Primer pair for amplifying ATACTTGTTTTACTAGAGAAC (SEQ ID No: chromosome, to be added to 320) mini-genome 15136.0_R CCCCAAAACCCCAAAACCCCATAAATCAAAT TAAATTGCATAACATGAAC (SEQ ID No: 321) 115.1_F CCCCAAAACCCCAAAACCCCAGAGGATGTAA Primer pair for amplifying ATTACAATAAATCGTAAAAAC (SEQ ID No: chromosome, to be added to 322) mini-genome 115.1_R CCCCAAAACCCCAAAACCCCTTCTAAAAAAT ATAAAGATAAATTGACGTC (SEQ ID No: 323) 21295.0_F CCCCAAAACCCCAAAACCCCATCCAGTTGAA Primer pair for amplifying ATCTAAAACAATTTTGTATATTTAAAG (SEQ chromosome, to be added to ID No: 324) mini-genome 21295.0_R CCCCAAAACCCCAAAACCCCTTAAGAGATTG CATTATAAATAAGATAGGATTC (SEQ ID No: 325) 16269.0_F CCCCAAAACCCCAAAACCCCATTGATTGATA Primer pair for amplifying AACTTGGAAGTTAAGAAAGATTTG (SEQ ID chromosome, to be added to No: 326) mini-genome 16269.0_R CCCCAAAACCCCAAAACCCCATGAATAACAG ATGGAATGCTTCAAGATATG (SEQ ID No: 327) 644.1_F CCCCAAAACCCCAAAACCCCAAATGTTAGTA Primer pair for amplifying TTTGAATTAAAGAGAGGTAAAAC (SEQ ID chromosome, to be added to No: 328) mini-genome 644.1_R CCCCAAAACCCCAAAACCCCTTATGAAAATG AAATGGTTTTGATTGGCTAATAA (SEQ ID No: 329) 5586.0_F CCCCAAAACCCCAAAACCCCATGAGTAAAAT Primer pair for amplifying TTAGCTTAAGTAATGTAAGAATC (SEQ ID chromosome, to be added to No: 330) mini-genome 5586.0_R CCCCAAAACCCCAAAACCCCATATATCAAAA TATCAACATTTTTTTGTGTGATTGTTAC (SEQ ID No: 331) 13085.0_F CCCCAAAACCCCAAAACCCCTTGATGAAATT Primer pair for amplifying TGAAAATGAATAGAGAGTAC (SEQ ID No: chromosome, to be added to 332) mini-genome 13085.0_R CCCCAAAACCCCAAAACCCCGTAATGCTACA TTTGCAAAAAAGTACAAACAG (SEQ ID No: 333) 13838.0_F CCCCAAAACCCCAAAACCCCGTAAGGCCAGA Primer pair 
for amplifying ATCAATGAATAAAAAGGTC (SEQ ID chromosome, to be added to No: 334) mini-genome 13838.0_R CCCCAAAACCCCAAAACCCCGAAAAGGGAG ATTTACAAAAATTTGTAGATGTTATATTG (SEQ ID No: 335) 1415.1_F CCCCAAAACCCCAAAACCCCATTGATCATTA Primer pair for amplifying ATAAAGAAGAATTGCTAATAT (SEQ ID No: chromosome, to be added to 336) mini-genome 1415.1_R CCCCAAAACCCCAAAACCCCAATGCGATGAA ATGTTTTTTATTATGAAAAG (SEQ ID No: 337) 19468.0_F CCCCAAAACCCCAAAACCCCAAGGAAGTTCA Primer pair for amplifying ATGCTATTTAGCAAATTAGG (SEQ ID chromosome, to be added to No: 338) mini-genome 19468.0_R CCCCAAAACCCCAAAACCCCTTGATTCAAAA TATGCACAAGATTAAAAATTCAC (SEQ ID No: 339) 20407.0_F CCCCAAAACCCCAAAACCCCATAAGAAAGAT Primer pair for amplifying AAGTTGCAATTAAATAATAAGG (SEQ ID No: chromosome, to be added to 340) mini-genome 20407.0_R CCCCAAAACCCCAAAACCCCATGAAGACAAG TCTGATGAAAATAGAATGG (SEQ ID No: 341) 19922.0_F CCCCAAAACCCCAAAACCCCATAGTCTTAAA Primer pair for amplifying ATTTTATACTATCATGAAATAATATTAAG (SEQ chromosome, to be added to ID No: 342) mini-genome 19922.0_R CCCCAAAACCCCAAAACCCCGTAAGTCTAAA GTTTAACAGTTTTTAGTAAATATC (SEQ ID No: 343) 20459.0_F CCCCAAAACCCCAAAACCCCTTATGCTAGTT Primer pair for amplifying GAGTGATTGAAAATATATTTGTGC (SEQ ID chromosome, to be added to No: 344) mini-genome 20459.0_R CCCCAAAACCCCAAAACCCCTTGACGTAGAA TAATGGGCTTATAGAAG (SEQ ID No: 345) 20493.0_F CCCCAAAACCCCAAAACCCCTTAATCAACTC Primer pair for amplifying ACTTTACCCACTAATCAAACAC (SEQ ID No: chromosome, to be added to 346) mini-genome 20493.0_R CCCCAAAACCCCAAAACCCCATATTTAAGAT ATACAGAAATATAGAGAATACAAC (SEQ ID No: 347) 9925.0_F CCCCAAAACCCCAAAACCCCATTGGATCAAT Primer pair for amplifying TTTGAAGAGAATTCATGGAAAAT (SEQ ID chromosome, to be added to No: 348) mini-genome 9925.0_R CCCCAAAACCCCAAAACCCCATCAGAAAAAA TATTTGAAAATTCGATAAAGC (SEQ ID No: 349) 22456.0_F CCCCAAAACCCCAAAACCCCATTTCACTTTAT Primer pair for amplifying TTATATATAGATTTGAAATTAAAGTT (SEQ ID chromosome, to be added to No: 350) mini-genome 22456.0_R CCCCAAAACCCCAAAACCCCAGTTGACATGT 
TATTTCCAAATTTTCATGGATA (SEQ ID No: 351) 17712.0_F CCCCAAAACCCCAAAACCCCATGATAACAGG Primer pair for amplifying AATATTTTATAAAATAGTTAAG (SEQ ID No: chromosome, to be added to 352) mini-genome 17712.0_R CCCCAAAACCCCAAAACCCCTCACTCTATGC AATAAATTTGTTGATATATT (SEQ ID No: 353) 11116.0_F CCCCAAAACCCCAAAACCCCTTAAAAAAAGA Primer pair for amplifying ATAGTTGGAATAAAAATGAATTT (SEQ ID chromosome, to be added to No: 354) mini-genome 11116.0_R CCCCAAAACCCCAAAACCCCAATAGATAAAG ATGCCTTTTTTAATAAGTATTTAAC (SEQ ID No: 355) 19275.0_F CCCCAAAACCCCAAAACCCCGAGAGGATAAA Primer pair for amplifying TTTATATGAAAATAAAAATAAAGC (SEQ ID chromosome, to be added to No: 356) mini-genome 19275.0_R CCCCAAAACCCCAAAACCCCATAAATAAGAA ATTTTAAGAATAACGGGCAAATTAG (SEQ ID No: 357) 21217.0_F CCCCAAAACCCCAAAACCCCTTGAATTTTAAA Primer pair for amplifying TAAACTTCTTTGTATGATTTAAATG (SEQ ID chromosome, to be added to No: 358) mini-genome 21217.0_R CCCCAAAACCCCAAAACCCCATAGATTACTT TTCAAAGAATTTCTTGACATTC (SEQ ID No: 359) 10537.0_F CCCCAAAACCCCAAAACCCCAAAGCAAAGAA Primer pair for amplifying ATCTGATGTTTTATTAGAAAAAGTG (SEQ ID chromosome, to be added to No: 360) mini-genome 10537.0_R CCCCAAAACCCCAAAACCCCATGAGATGATA ATATTGCCTTTTTGCATATAAT (SEQ ID No: 361) 22670.0_F CCCCAAAACCCCAAAACCCCATCCTTATACA Primer pair for amplifying AATTCAGAAAACTTAGCAAAT (SEQ ID No: chromosome, to be added to 362) mini-genome 22670.0_R CCCCAAAACCCCAAAACCCCGTGGAGAATTT TCTAAAGAATTTTCGGAAATTTG (SEQ ID No: 363) 1781.0_F CCCCAAAACCCCAAAACCCCACTAGTCTTAA PCR primers for amplifying ATATGAGAAAGATGATTTGAATAAG (SEQ ID synthetic chromosome 1 and 6 No: 364) in FIG. 5B 1781.0_R CCCCAAAACCCCAAAACCCCACAAGATAACA TTGCTAATTTTCAATAAATTAAAT (SEQ ID No: 365) 1781.0_Purple_F GTCAGTGGTCTCAGTATGAAATTTACCTTGG PCR primers for amplifying ATCCTCAGTGTTTCCCTTTGTG (SEQ ID No: purple DNA building block in 366) synthetic chromosomes 2-4 in 1781.0_Purple_R AACGCTCGGTCTCGCAGAAATAAATAATAGG FIG. 
5B CAGCATAATAATTCAGG (SEQ ID No: 367) 1781.0_red_F GTCAGTGGTCTCTCCAGTGGATTCACATCAC PCR primers for amplifying TACCATTG (SEQ ID No: 368) red DNA building block in 1781.0_red_R CCCCAAAACCCCAAAACCCCACAAGATAACA synthetic chromosomes 2-4 TTGCTAATTTTCAATAAATTAAAT (SEQ ID in FIG. 5B No: 369) 1781.0_turquoise_F CCCCAAAACCCCAAAACCCCACTAGTCTTAA PCR primers for amplifying ATATGAGAAAGATGATTTGAATAAG (SEQ ID turquoise DNA building block No: 370) in synthetic chromosomes 2-4 1781.0_turquoise_R ACGCTCGGTCTCGATACAATCATTAGCATAT in FIG. 5B ACATGCAGTCTGCTTAAATTAAG (SEQ ID No: 371) DarkBlue_6mA_top TCTGTAATTAATGGATCACTATAATCATCTGG Oligos for annealing to make ATGACTATTGGTATTTGATGACGGTGCACAT blue DNA building block in TTGACTTCTT (SEQ ID No: 372) synthetic chromosomes 2-4 in DarkBlue_6mA_bottom ATTAATTACCTAGTGATATTAGTAGACCTACT FIG. 5B. Bold red nucleotides GATAACCATAAACTACTGCCACGTGTAAACT represent 6mA. GAAGAAGGTC (SEQ ID No: 373) 1781.0_red2_F AGCCTAGGTCTCGTTCTTTTTGAGGGCGAGG PCR primers for amplifying TTATTGTTAAT (SEQ ID No: 374) red DNA building block in 1781.0_red2_R CCCCAAAACCCCAAAACCCCACAAGATAACA synthetic chromosome 5 in T (SEQ ID No: 375) FIG. 5B 1781.0_orange_F TAGTCAGGTCTCTAGAATAGGCTCACTCTAA PCR primers for amplifying ATTCGAGTGCAAT (SEQ ID No: 376) orange DNA building block in 1781.0_orange_R TCTACTGGTCTCAGTATGAAATTTACCTTGGA synthetic chromosome 5 in FIG. TCCTCAGTGT (SEQ ID No: 377) 5B 1781.0_emerald_F ATCGTAGGTCTCAATACAATCATTAGCATATA PCR primers for amplifying CATGCAGT (SEQ ID No: 378) emerald DNA building block in 1781.0_emerald_R CCCCAAAACCCCAAAACCCCACTAGTCTTAA synthetic chromosome 5 in FIG. 
AT (SEQ ID No: 379) 5B 17535.0_F CCCCAAAACCCCAAAACCCCATCTCAATTTAT PCR primers for amplifying AAAATCAGAATAAGAGATTGTC (SEQ ID No: “buffer” chromosome 380) (Contig17535.0) for use in 17535.0_R CCCCAAAACCCCAAAACCCCAGAATAAAACA chromatin assembly ACTGAAGTAAATATGAGTTAC (SEQ ID No: 381) 12701assay_F AAGAAGAACTAGCCAGCTCTCACTCAGTTC PCR primers for assaying the (SEQ ID No: 382) presence of ectopic DNA 12701assay_R TGTCTATCTCATCAGGCTCATCAGCATAGG insertion in mta1 mutants (SEQ ID No: 383) 12701_firstround_T7_F AAGAAGAACTAGCCAGCTCTCACTCAGTTC PCR primers to generate DNA (SEQ ID No: 384) template for ssRNA in vitro 12701_firstround_T7_R CCTCTCTGCCCACTAAATTATTCTGACAGC transcription. This ssRNA is (SEQ ID No: 385) injected into Oxytricha cells to induce ectopic DNA retention in MTA1 gene. PCR product is amplified from Oxytricha gDNA of cell strain JRB310. The resulting PCR product is subjected to a second round of PCR amplification using primers “12701_secondround_T7_F” and “12701_secondround_T7_R”. The final, second round PCR product is then used for ssRNA in vitro transcription. 12701_secondround_T7_F CTACTTGATATAATACGACTCACTATAGGGAA PCR primers for second round TTCCTAAGGGGAGTGAAGCCAACAACAG amplification of DNA template, (SEQ ID No: 386) to be used for ssRNA in vitro 12701_secondround_T7_R TGTCTATCTCATCAGGCTCATCAGCATAGG transcription. Forward primer (SEQ ID No: 387) contains T7 promoter sequence, which is required for subsequent in vitro transcription. metGATC_F2 GTGCTATGCATTTTAAATTTATTCGCATTGAA PCR primers for amplification GA (SEQ ID No: 388) of DNA substrate for use in metGATC_R2 ATTCAGAATTTTAGTGTGTGGAGTATGATAGT 6mA methyltransferase assay A (SEQ ID No: 389) involving Tetrahymena nuclear noGATC2_F GGTCTATATTATTTTAGTATTCTTTCTATAAAT PCR primers for amplifying G (SEQ ID No: 390) 350 bp dsDNA substrate for noGATC2_R GTTACAAGAATATAAGAAAAGAAAGGGTGAA methyltransferase assays TAGG (SEQ ID No: 391) involving recombinant proteins (in FIGS. 
2E, 2F, and 10H) T7noGATC2_F2 TAATACGACTCACTATAGGG PCR primers for amplifying GGTCTATATTATTTTAGTATTCTTTC (SEQ ID DNA ~350 bp dsDNA template No: 392) with T7 overhangs at one end, noGATC2_R GTTACAAGAATATAAGAAAAGAAAGGGTGAA for subsequent ssRNA TAGG (SEQ ID No: 393) production by in vitro transcription T7noGATC2_F2 TAATACGACTCACTATAGGG PCR primers for amplifying GGTCTATATTATTTTAGTATTCTTTC (SEQ ID DNA ~350 bp dsDNA template No: 394) with T7 overhangs at the 5′ T7noGATC2_R2 TAATACGACTCACTATAGGG and 3′ ends, for subsequent GTTACAAGAATATAAGAAAAG (SEQ ID No: dsRNA production by in vitro 395) transcription noGATC_F3 AACTTCTGTCATTACATTAAGCTTTAA (SEQ DNA oligonucleotides for use ID No: 396) in DNA methyltransferase noGATC_R3 TTAAAGCTTAATGTAATGACAGAAGTT (SEQ assays in FIGS. 2G, 10I, ID No: 397) 10J, 10K, 10L noGATC_F12 AACTTCTGTCATTAACTTAAGCTTTAA (SEQ ID No: 398) noGATC_R12 TTAAAGCTTAAGTTAATGACAGAAGTT (SEQ ID No: 399) noGATC_F13 AACTTCTGTACTTACATTAAGCTTTAA (SEQ ID No: 400) noGATC_R13 TTAAAGCTTAATGTAAGTACAGAAGTT (SEQ ID No: 401) noGATC_F14 AACTTCTGTACTTAACTTAAGCTTTAA (SEQ ID No: 402) noGATC_R14 TTAAAGCTTAAGTTAAGTACAGAAGTT (SEQ ID No: 403) noGATC_F1 AACTTCTGTCATTACATTAAGCTTTAAAAAAT TCAATTCCTTTTATT (SEQ ID No: 404) noGATC_R1 AATAAAAGGAATTGAATTTTTTAAAGCTTAAT GTAATGACAGAAGTT (SEQ ID No: 405) noGATC_F2 TGTCATTACATTAAGCTTTAAAAAATTCAATT CCT (SEQ ID No: 406) noGATC_R2 AGGAATTGAATTTTTTAAAGCTTAATGTAATG ACA (SEQ ID No: 407) noGATC_F3 AACTTCTGTCATTACATTAAGCTTTAA (SEQ ID No: 408) noGATC_R3 TTAAAGCTTAATGTAATGACAGAAGTT (SEQ ID No: 409) noGATC_F8 TATTAGAATTATGTTCTTCATGAAATT (SEQ ID No: 410) noGATC_R8 AATTTCATGAAGAACATAATTCTAATA (SEQ ID No: 411)
TABLE-US-00004 TABLE 3 Recombinant protein sequences.

>MTA1 (manually curated from Tetrahymena DB gene ID: TTHERM_00704040) (SEQ ID No: 412)
MSKAVNKKGLRPRKSDSILDHIKNKLDQEFLEDNENGEQSDEDYDQKS
LNKAKKPYKKRQTQNGSELVISQQKTKAKASANNKKSAKNSQKLDEEE
KIVEEEDLSPQKNGAVSEDDQQQEASTQEDDYLDRLPKSKKGLQGLLQ
DIEKRILHYKQLFFKEQNEIANGKRSMVPDNSIPICSDVTKLNFQALI
DAQMRHAGKMFDVIMMDPPWQLSSSQPSRGVAIAYDSLSDEKIQNMPI
QSLQQDGFIFVWAINAKYRVTIKMIENWGYKLVDEITWVKKTVNGKIA
KGHGFYLQHAKESCLIGVKGDVDNGRFKKNIASDVIFSERRGQSQKPE
EIYQYINQLCPNGNYLEIFARRNNLHDNWVSIGNEL

>MTA9 (manually curated from Tetrahymena DB gene ID: TTHERM_00301770) (SEQ ID No: 413)
MAPKKQEQEPIRLSTRTASKKVDYLQLSNGKLEDFFDDLEEDNKPARN
RSRSKKRGRKPLKKADSRSKTPSRVSNARGRSKSLGPRKTYPRKKNLS
PDNQLSLLLKWRNDKIPLKSASETDNKCKVVNVKNIFKSDLSKYGANL
QALFINALWKVKSRKEKEGLNINDLSNLKIPLSLMKNGILFIWSEKEI
LGQIVEIMEQKGFTYIENFSIMFLGLNKCLQSINHKDEDSQNSTASTN
NTNNEAITSDLTLKDTSKFSDQIQDNHSEDSDQARKQQTPDDITQKKN
KLLKKSSVPSIQKLFEEDPVQTPSVNKPIEKSIEQVTQEKKFVMNNLD
ILKSTDINNLFLRNNYPYFKKTRHTLLMFRRIGDKNQKLELRHQRTSD
VVFEVTDEQDPSKVDTMMKEYVYQMIETLLPKAQFIPGVDKHLKMMEL
FASTDNYRPGWISVIEK

>p1 (manually curated from Tetrahymena DB gene ID: TTHERM_00161750) (SEQ ID No: 414)
MSLKKGKFQHNQSKSLWNYTLSPGWREEEVKILKSALQLFGIGKWKKI
MESGCLPGKSIGQIYMQTQRLLGQQSLGDFMGLQIDLEAVFNQNMKKQ
DVLRKNNCIINTGDNPTKEERKRRIEQNRKIYGLSAKQIAEKLPKVKK
HAPQYMTLEDIENEKFTNLEILTHLYNLKAEIVRRLAEQGETIAQPSI
IKSLNNLNHNLEQNQNSNSSTETKVTLEQSGKKKYKVLAIEETELQNG
PIATNSQKKSINGKRKNNRKINSDSEGNEEDISLEDIDSQESEINSEE
IVEDDEEDEQIEEPSKIKKRKKNPEQESEEDDIEEDQEEDELVVNEEE
IFEDDDDDEDNQDSSEDDDDDED

>p2 (manually curated from Tetrahymena DB gene ID: TTHERM_00439330) (SEQ ID No: 415)
MKKNSKSQNQPLDFTQYAKNMRKDLSNQDICLEDGALNHSYFLTKKGQ
YWTPLNQKALQRGIELFGVGNWKEINYDEFSGKANIVELELRICMILG
INDITEYYGKKISEEEQEEIKKSNIAKGKKENKLKDNIYQKLQQMQ
Sequences were manually curated by mapping RNaseq reads to reference gene annotations and verifying the accuracy of predicted exon boundaries.
Example 2
Epigenomic Profiles of Chromatin and Transcription in Oxytricha
[0203] We generated genome-wide in vivo maps of nucleosome positioning, transcription, and 6mA in the macronuclei of asexually growing (vegetative) Oxytricha trifallax cells using MNase sequencing (MNase-seq), poly(A)+ RNA sequencing (RNA-seq), transcriptional start site sequencing (TSS-seq), and single-molecule real-time sequencing (SMRT-seq) (
TABLE-US-00005 TABLE 4 Descriptive statistics of 6mA distribution in the genome.

Number of 6mA sites

                  Oxytricha                                              Tetrahymena
                  Minimum  Maximum  Median  Mean  Standard Deviation     Minimum  Maximum  Median  Mean  Standard Deviation
Methyl Cluster 1  0        14       2       2.03  2.27                   0        27       10      9.66  6.10
Methyl Cluster 2  0        24       6       5.99  4.24                   0        26       9       8.78  5.78
Methyl Cluster 3  0        16       2       2.49  2.91                   0        25       5       5.75  5.53
[0204] Properties of 6mA distribution in nucleosome linkers. In Oxytricha, methyl cluster 1 = between the 5′ chromosome end and the +1 nucleosome; methyl cluster 2 = between the +1 and +2 nucleosomes; methyl cluster 3 = between the +2 and +3 nucleosomes. In Tetrahymena, methyl cluster 1 = between the +1 and +2 nucleosomes; methyl cluster 2 = between the +2 and +3 nucleosomes; methyl cluster 3 = between the +3 and +4 nucleosomes. Consensus +1/+2/+3/+4 nucleosome positions: 193, 402, 618, 837 bp downstream of Oxytricha 5′ chromosome ends; 112, 304, 497, 698 bp downstream of Tetrahymena TSSs.
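The cluster definitions above amount to binning 6mA positions between consecutive consensus nucleosome positions. The following is a minimal sketch of that binning, not the procedure actually used to produce Table 4. It assumes that positions are given in bp downstream of the 5′ chromosome end (Oxytricha) or the TSS (Tetrahymena), and that each cluster is the half-open interval between two consensus positions; the function name `count_by_cluster` and the interval convention are illustrative assumptions.

```python
from bisect import bisect_right

# Consensus boundaries quoted in the text (bp downstream of the reference point).
# Oxytricha: 0 is the 5' chromosome end, followed by the +1/+2/+3 nucleosomes.
# Tetrahymena: the +1/+2/+3/+4 nucleosomes relative to the TSS.
CLUSTER_BOUNDS = {
    "oxytricha": [0, 193, 402, 618],
    "tetrahymena": [112, 304, 497, 698],
}


def count_by_cluster(positions, species):
    """Count 6mA sites per methyl cluster (1, 2, 3).

    Assumption: cluster k is the half-open interval
    [bounds[k-1], bounds[k]); sites outside all three intervals are ignored.
    """
    bounds = CLUSTER_BOUNDS[species]
    counts = {1: 0, 2: 0, 3: 0}
    for pos in positions:
        idx = bisect_right(bounds, pos)  # number of boundaries <= pos
        if 1 <= idx <= 3:
            counts[idx] += 1
    return counts


# Hypothetical 6mA coordinates, for illustration only.
example = count_by_cluster([50, 200, 300, 500, 700], "oxytricha")
```

Per-chromosome counts obtained this way could then be summarized (minimum, maximum, median, mean, standard deviation) to give statistics of the kind listed in Table 4.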
Example 3
Purification and Reconstitution of the Ciliate 6mA Methyltransferase, MTA1c
[0205] To uncover the functions of 6mA in vivo, we set out to identify and disrupt putative 6mA methyltransferases (MTases). The Oxytricha genome encodes a large number of candidate methyltransferases (Table 5), making it impractical to test gene function one at a time or in combination. To identify the ciliate 6mA MTase, we took a biochemical approach, fractionating nuclear extracts and identifying candidate proteins that co-purified with DNA methylase activity. The organism of choice for this experiment was Tetrahymena thermophila, a ciliate that divides significantly faster than Oxytricha (~2 h versus 18 h; Cassidy-Hanley, 2012; Laughlin et al., 1983). This shorter doubling time made it feasible to culture large amounts of Tetrahymena cells for nuclear extract preparation. Tetrahymena and Oxytricha exhibit similar genomic localization and 6mA abundance (
[0206] We prepared nuclear extracts from log-phase Tetrahymena cells, since 6mA could be readily detected at this developmental stage through quantitative MS and PacBio sequencing (
[0207] We next investigated the phylogenetic relationship of MTA1 and MTA9 to other eukaryotic MT-A70 domain-containing proteins. Two widely studied mammalian MT-A70 proteins, METTL3 and METTL14 (Ime4 and Kar4 in yeast), form a heterodimeric complex that is responsible for m6A methylation on mRNA. METTL3 is the catalytically active subunit, while METTL14 functions as an RNA-binding scaffold protein (Śledź and Jinek, 2016; Wang et al., 2016a, 2016b). MTA1 and MTA9 derive from distinct monophyletic clades, outside of those that contain mammalian METTL3, METTL14, and C. elegans DAMT-1 (METTL4) (
[0208] We then sought to determine whether MTA1 and/or MTA9 are bona fide 6mA methyltransferases. MTA1, but not MTA9, contains a catalytic DPPW motif (
[0209] Purification of the MTA1c proteins from an E. coli overexpression system raises the possibility that the observed methyltransferase activity arises from contaminating Dam methylase; however, we exclude this possibility for three reasons. (1) The DNA substrate used in this assay does not contain 5′-GATC-3′ sites, which are recognized and methylated by Dam methylase (Horton et al., 2006). (2) Methyltransferase activity was only observed when all four recombinant proteins were incubated with DNA. If contaminating Dam methylase were present in one or more of these protein preparations, then background activity should be observed when subsets of these proteins are used in the assay. (3) Mutation of MTA1 catalytic residues leads to loss of methylation, which is also inconsistent with contaminating methyltransferase activity.
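The first point, absence of Dam recognition sites in the substrate, can be checked computationally. The sketch below is illustrative only: the sequence of the ~350 bp substrate is not reproduced here, so the example uses the 27 nt oligonucleotide pair noGATC_F3/noGATC_R3 listed among the assay oligonucleotides above. The helper names `reverse_complement` and `find_sites` are assumptions introduced for this example.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_sites(seq, motif="GATC"):
    """Return 0-based start positions of every occurrence of motif in seq."""
    seq = seq.upper()
    width = len(motif)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == motif]


# Oligonucleotide pair taken from the primer table (SEQ ID Nos: 396 and 397).
NOGATC_F3 = "AACTTCTGTCATTACATTAAGCTTTAA"
NOGATC_R3 = "TTAAAGCTTAATGTAATGACAGAAGTT"

# GATC is palindromic, so scanning one strand is sufficient in principle;
# both strands are scanned here so the same check also works for other motifs.
top_hits = find_sites(NOGATC_F3)
bottom_hits = find_sites(reverse_complement(NOGATC_F3))
```

For this oligonucleotide pair both hit lists are empty, and the annotated reverse oligonucleotide is the exact reverse complement of the forward one.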
TABLE-US-00006 TABLE 5 Candidate genes in ciliates.

MT-A70 genes in Oxytricha trifallax
UniProt ID      Gene name in this study   OxyDB gene name
J9IF92_9SPIT    MTA1                      Contig12701.0.0.g16
J9IGS7_9SPIT    TAMT-1                    Contig17486.0.g100
J9J9V7_9SPIT    MTA1-B                    Contig16314.0.g25
J9HW68_9SPIT    MTA9                      Contig1237.1.g126
J9IMU5_9SPIT    MTA9-B                    Contig17413.0.g36

MT-A70 genes in Tetrahymena thermophila
UniProt ID      Gene name in this study   Tetrahymena Genome Database gene name
Q22GC0_TETTS    MTA1                      TTHERM_00704040
Q23TW8_TETTS    MTA2                      TTHERM_00962190
I7LVP8_TETTS    MTA3/TAMT-1-B             TTHERM_00136470
I7MGX6_TETTS    MTA4                      TTHERM_00558100
Q23RE0_TETTS    MTA5/TAMT-1               TTHERM_00388490
I7MIF9_TETTS    MTA9                      TTHERM_00301770
Q22XT1_TETTS    MTA9-B                    TTHERM_01005150

METTL16 homologs in Oxytricha trifallax
UniProt ID      OxyDB gene name
J9F3J7_9SPIT    Contig11945.0.g48
J9J5P9_9SPIT    Contig7462.0.g41
J9III0_9SPIT    Contig4244.0.g39

N6AMT1 homologs in Oxytricha trifallax
UniProt ID      OxyDB gene name
J9IFV1_9SPIT    Contig7751.0.g12

Accessory factor genes in Tetrahymena thermophila
UniProt ID      Gene name in this study   Tetrahymena Genome Database gene name
Q22VV9_TETTS    p1                        TTHERM_00161750
I7M8B9_TETTS    p2                        TTHERM_00439330

ISWI homologs in Oxytricha trifallax and Tetrahymena thermophila
UniProt ID      OxyDB gene name           Tetrahymena Genome Database gene name
I7M280_TETTS                              TTHERM_00137610
J9FBJ2_9SPIT    Contig11737.0.g12
The UniProt ID of each gene is listed. The Oxytricha macronuclear genome encodes five genes belonging to the MT-A70 family (Iyer et al., 2016; Swart et al., 2013). Such genes commonly function as RNA m6A MTases in eukaryotes, having evolved from M.MunI-like MTases in bacterial restriction-modification systems (Iyer et al., 2016). An MT-A70 gene belonging to the METTL4 subclade, DAMT-1, is a putative 6mA methyltransferase in C. elegans (Greer et al., 2015). However, none of the Oxytricha MT-A70 genes in this table cluster together with METTL4 on a phylogenetic tree (
TABLE-US-00007 TABLE 6 Mass spectrometry analysis of MTA1, MTA9, p1, and p2 proteins.

Data from Low Salt Fraction
UniProt ID      Gene name in this study   % of protein covered by peptide data from LC-MS/MS experiment
Q22GC0_TETTS    MTA1                      78.8%
I7MIF9_TETTS    MTA9                      46.3%
Q22VV9_TETTS    p1                        41.9%
I7M8B9_TETTS    p2                        81.7%

Data from High Salt Fraction
UniProt ID      Gene name in this study   % of protein covered by peptide data from LC-MS/MS experiment
Q22GC0_TETTS    MTA1                      69.9%
I7MIF9_TETTS    MTA9                      72.2%
Q22VV9_TETTS    p1                        55.3%
I7M8B9_TETTS    p2                        93.4%
Percentage of each polypeptide that is covered by peptide data is calculated. “Low Salt Sample” and “High Salt Sample” correspond to partially purified nuclear extracts that elute as two distinct peaks of activity from a Q sepharose anion exchange column (
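As an illustration of how a coverage percentage of this kind is typically computed, the sketch below marks every residue of a protein that falls inside at least one identified peptide and reports the covered fraction. It is a simplified assumption about the calculation, not the software used for Table 6: peptides are matched by exact substring search, and the toy protein and peptides are hypothetical.

```python
def percent_coverage(protein, peptides):
    """Percentage of residues in protein covered by at least one peptide.

    Assumption: a peptide covers every position at which it occurs as an
    exact substring; overlapping peptides are not double counted.
    """
    if not protein:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Toy example: two overlapping peptides cover residues A-G of a 10-residue protein.
toy_coverage = percent_coverage("ABCDEFGHIJ", ["ABCDE", "DEFG"])
```

In this toy case seven of ten residues are covered, so the function returns 70.0.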
Example 4
[0210] MTA1c Preferentially Methylates ApT Dinucleotides in dsDNA
[0211] We next investigated the substrate preferences of MTA1c. First, in vitro transcription was performed to generate double-stranded RNA (dsRNA) and single-stranded RNA (ssRNA) from the input dsDNA substrate. We found that MTA1c methylates dsDNA but not dsRNA or ssRNA of the same sequence, indicating that it is selective for DNA over RNA (
[0212] Since 6mA methylation mainly lies in ApT dinucleotides in vivo (
[0213] Given that 6mA occurs on both strands of genomic DNA in vivo (
[0214] We then asked whether MTA1c activity is modulated not only by the dinucleotide motif sequence per se, but also by flanking sequences. This may manifest as the wide variation in frequency of DNA 4-mers containing a methylated ApT dinucleotide (5′-NA*TN-3′) in vivo (
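To make the 5′-NA*TN-3′ notation concrete, the following sketch tabulates, for each 4-mer centered on an ApT dinucleotide, the fraction of occurrences in which the adenine is methylated. It is an illustrative assumption about how such frequencies could be tallied on a single strand, not the analysis underlying the in vivo data; the function name `natn_methylation` and the example methylation calls are hypothetical.

```python
from collections import Counter


def natn_methylation(seq, methylated_positions):
    """Fraction of methylated adenines for each 5'-NATN-3' context.

    seq: top-strand DNA sequence.
    methylated_positions: set of 0-based indices of adenines called as 6mA
    on that strand (hypothetical input for this sketch).
    """
    seq = seq.upper()
    total = Counter()
    methylated = Counter()
    # i indexes the A of each ApT; one flanking base is required on each side.
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "AT":
            context = seq[i - 1:i + 3]
            total[context] += 1
            if i in methylated_positions:
                methylated[context] += 1
    return {context: methylated[context] / total[context] for context in total}


# Hypothetical example: two CATT contexts, one of which is methylated.
fractions = natn_methylation("CATTACATTAAG", {1})
```

Only the top strand is considered here; a full analysis would repeat the tally on the reverse complement, since 6mA is found on both strands.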
Example 5
MTA1 is Necessary for 6mA Methylation In Vivo
[0215] Having established that MTA1c is a 6mA methyltransferase, we tested the role of MTA1c in mediating 6mA methylation in vivo in Oxytricha, in which mutants can be readily generated. The genome-wide localization of 6mA is conserved between Oxytricha and Tetrahymena (
[0216] What are the phenotypic consequences of 6mA loss in vivo? It has been proposed that DNA methylation, including 6mA and cytosine methylation, is involved in nucleosome organization (Fu et al., 2015; Huff and Zilberman, 2014). We thus asked whether nucleosome organization is altered in mta1 mutants. We quantified nucleosome “fuzziness,” defined as the standard deviation (SD) of MNase-seq read locations surrounding the called nucleosome peak (Lai and Pugh, 2017; Mavrich et al., 2008). A poorly positioned nucleosome produces a shallow and wide peak of MNase-seq reads, which is reflected in a high fuzziness score. Nucleosomes were first grouped according to the change in flanking 6mA between wild-type and mta1 mutant cells (
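The fuzziness definition given above can be written down directly. The sketch below computes the standard deviation of read midpoints lying within a fixed distance of a called nucleosome peak. The window half-width of 73 bp, the use of read midpoints, and the population form of the standard deviation are assumptions made for illustration; they are not parameters stated in this disclosure.

```python
import statistics


def fuzziness(read_midpoints, peak, half_window=73):
    """Standard deviation of read midpoints within half_window bp of peak.

    Returns None when fewer than two reads fall in the window, since a
    spread cannot be estimated from a single read.
    """
    nearby = [m for m in read_midpoints if abs(m - peak) <= half_window]
    if len(nearby) < 2:
        return None
    return statistics.pstdev(nearby)


# Hypothetical read midpoints around a peak called at position 100.
well_positioned = fuzziness([98, 99, 100, 101, 102], peak=100)
poorly_positioned = fuzziness([60, 80, 100, 120, 140], peak=100)
```

With these toy inputs the tightly clustered reads give a lower score than the widely spread reads, matching the interpretation that higher fuzziness indicates a less well positioned nucleosome.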
Example 6
6mA Disfavors Nucleosome Occupancy Across the Genome In Vitro but Not In Vivo
[0217] Multiple factors, including 6mA, DNA sequence, and chromatin remodeling complexes, may collectively contribute to nucleosome organization in vivo. The effect of 6mA could therefore be masked by these elements. We next sought to determine whether 6mA directly impacts nucleosome organization. To this end, we assembled chromatin in vitro using Oxytricha gDNA, which contains cognate 6mA. To obtain a matched negative control lacking DNA methylation, 98 complete chromosomes were amplified using PCR (
[0218] We then directly compared the impact of 6 mA on nucleosome occupancy in vitro and in vivo. Loss of 6 mA in vitro is achieved by mini-genome construction, while loss in vivo is achieved by the mta1 mutation. For each overlapping DNA window, we calculated the difference in nucleosome occupancy: (1) between native genome and mini-genome DNA in vitro, and (2) between wild-type and mta1 mutants in vivo (
TABLE 7
Descriptive statistics of reference genomes.

Property                                 Native genomic DNA    Mini-genome DNA
Chromosome length (bp)                   2449 +/− 742          2107 +/− 778
                                         Min = 1155            Min = 1201
                                         Max = 6494            Max = 4659
SMRT-seq coverage (x)                    177.4 +/− 117.0       205.3 +/− 136.1
                                         Min = 75.1            Min = 77.8
                                         Max = 1392.6          Max = 918.4
Total number of 6mA marks in genome      46,322                2,344
6mA sites per chromosome                 12 +/− 8              24 +/− 16
                                         Min = 0               Min = 0
                                         Max = 73              Max = 73
AT content (%)                           67.8 +/− 3.0          66.5 +/− 2.7
                                         Min = 55.7            Min = 60.2
                                         Max = 76.2            Max = 72.2
RNAseq (FPKM)                            34.4 +/− 75.2         53.7 +/− 71.5
                                         Min = 0.0             Min = 0.1
                                         Max = 1444.5          Max = 424.8
Properties of Oxytricha chromosomes in native genomic DNA and mini-genome DNA. “+/−” indicates one standard deviation above or below the mean.
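The window-wise comparison described in paragraph [0218] (difference in nucleosome occupancy between a methylated and an unmethylated condition, computed over overlapping DNA windows) might be sketched as below. The window size, step, and per-base coverage input are hypothetical choices for this example and are not values from the disclosure.

```python
def window_means(coverage, window, step):
    """Mean occupancy in overlapping windows (windows overlap when step < window)."""
    return [
        sum(coverage[start:start + window]) / window
        for start in range(0, len(coverage) - window + 1, step)
    ]


def occupancy_difference(with_6ma, without_6ma, window=4, step=2):
    """Per-window occupancy difference: condition with 6mA minus condition without.

    with_6ma / without_6ma: per-base nucleosome occupancy tracks over the
    same chromosome (hypothetical input format), e.g. native vs mini-genome
    DNA in vitro, or wild-type vs mta1 mutant in vivo.
    """
    if len(with_6ma) != len(without_6ma):
        raise ValueError("coverage tracks must have the same length")
    a = window_means(with_6ma, window, step)
    b = window_means(without_6ma, window, step)
    return [x - y for x, y in zip(a, b)]
```

Negative values mark windows where occupancy is lower in the 6mA-containing condition, which is the direction expected if 6mA disfavors nucleosomes.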
Example 7
Modular Synthesis of Epigenetically Defined Chromosomes
[0219] The above experiments used kinetic signatures from SMRT-seq data to infer the presence of 6 mA marks in genomic DNA. We next sought to confirm that 6 mA is directly responsible for disfavoring nucleosomes in vitro, and to understand how this effect could be overcome by cellular factors. 6 mA-containing oligonucleotides were annealed and subsequently ligated with DNA building blocks to form full-length chromosomes. Importantly, these chromosomes contain 6 mA at all locations identified by SMRT-seq in vivo. The representative chromosome, Contig1781.0, is 1.3 kb, contains a clearly defined TSS, and encodes a single highly transcribed gene with a predicted RING finger domain. The length and gene structure are characteristic of typical Oxytricha chromosomes (
[0220] Four chromosome variants were synthesized, with cognate 6 mA sites on neither, one, or both DNA strands (chromosomes 1-4 in
Example 8
Chromatin Remodelers Restore Nucleosome Occupancy Over 6 mA Sites
[0221] Nucleosome occupancy in vivo is influenced not only by DNA sequences but also by trans-acting factors such as ATP-dependent chromatin remodeling factors (Struhl and Segal, 2013). We used synthetic, methylated chromosomes to test how the well-studied chromatin remodeler ACF responds to 6 mA in native DNA. ACF generates regularly spaced nucleosome arrays in vitro and in vivo (Clapier and Cairns, 2009; Ito et al., 1997). Its catalytic subunit ISWI is conserved across eukaryotes, including Oxytricha and Tetrahymena (Table 5). Synthetic chromosomes were assembled into chromatin by salt dialysis as before and then incubated with ACF in the presence of ATP (
Example 9
Disruption of MTA1 Impacts Gene Expression and Sexual Development
[0222] Since mta1 mutants exhibit genome-wide loss of 6 mA, we assayed these cells for transcriptional changes by poly(A)+ RNA-seq. Only a small minority of genes show significant changes in gene expression (10% false discovery rate [FDR];
[0223] Because the aforementioned phenotypic changes were assayed in vegetative Oxytricha cells, we asked whether MTA1 may play roles outside of this developmental state. MTA1 transcript levels are markedly upregulated in the sexual cycle, as assayed by poly(A)+ RNA-seq (
Example 10
Discussion
[0224] The present disclosure has identified MTA1c as a conserved, hitherto undescribed 6 mA methyltransferase. It consists of two MT-A70 proteins (MTA1/MTA9) and two homeobox-like proteins (p1/p2). The composition of MTA1c provides immediate insights into how it specifically methylates DNA (
[0225] The observation that MTA1c is more active in the presence of pre-methylated DNA templates is reminiscent of the CpG methyltransferase DNMT1. Yet, MTA1c and DNMT1 exhibit distinct protein domain architectures. Further biochemical studies are required to elucidate the molecular basis of this property. A distinct MT-A70 protein, named TAMT-1, was recently reported to act as a 6 mA methyltransferase in Tetrahymena (Luo et al., 2018), suggesting that multiple enzymes mediate 6 mA deposition. It remains to be determined how MTA1c and TAMT-1 collectively mediate DNA methylation at various developmental stages, and whether cross-talk occurs between these enzymes.
[0226] In addition to identifying the ciliate 6 mA methyltransferase, we investigated the function of 6 mA in vitro by building epigenetically defined chromosomes. We show that 6 mA directly disfavors nucleosome occupancy in a local, quantitative manner, independent of DNA sequence (
[0227] Intriguingly, nucleosome organization exhibits only subtle changes after genome-wide loss of 6 mA (
[0228] More broadly, our work showcases the utility of Oxytricha chromosomes for advancing chromatin biology. By extending current technologies (Muller et al., 2016), it should be feasible to introduce both modified nucleosomes and DNA methylation in a site-specific manner on full-length chromosomes. Such “designer” chromosomes will serve as powerful tools for studying DNA-templated processes such as transcription within a fully native DNA environment.
REFERENCES
[0229] The following references, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.
[0230] Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.
[0231] Ammermann, D., Steinbruck, G., Baur, R., and Wohlert, H. (1981). Methylated bases in the DNA of the ciliate Stylonychia mytilus. Eur. J. Cell Biol. 24, 154-156.
[0232] An, W., and Roeder, R. G. (2004). Reconstitution and transcriptional analysis of chromatin in vitro. Methods Enzymol. 377, 460-474.
[0233] Batut, P., Dobin, A., Plessy, C., Carninci, P., and Gingeras, T. R. (2013). High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven developmental gene expression. Genome Res. 23, 169-180.
[0234] Beh, L. Y., Muller, M. M., Muir, T. W., Kaplan, N., and Landweber, L. F. (2015). DNA-guided establishment of nucleosome patterns within coding regions of a eukaryotic genome. Genome Res. 25, 1727-1738.
[0235] Beh et al., Identification of a DNA N6-Adenine Methyltransferase Complex and Its Impact on Chromatin Organization, Cell (2019), https://doi.org/10.1016/j.cell.2019.04.028.
[0236] Bern, M., Kil, Y. J., and Becker, C. (2012). Byonic: Advanced Peptide and Protein Identification Software. Curr. Protoc. Bioinformatics. 13, 13.20.
[0237] Blankenberg, D., Von Kuster, G., Coraor, N., Ananda, G., Lazarus, R., Mangan, M., Nekrutenko, A., and Taylor, J. (2010). Galaxy: a web-based genome analysis tool for experimentalists. Curr. Protoc. Mol. Biol. 19, 19.10.1-19.10.21.
[0238] Bracht, J. R., Fang, W., Goldman, A. D., Dolzhenko, E., Stein, E. M., and Landweber, L. F. (2013). Genomes on the edge: programmed genome instability in ciliates. Cell 152, 406-416.
[0239] Bromberg, S., Pratt, K., and Hattman, S. (1982). Sequence specificity of DNA adenine methylase in the protozoan Tetrahymena thermophila. J. Bacteriol. 150, 993-996.
[0240] Brownell, J. E., Zhou, J., Ranalli, T., Kobayashi, R., Edmondson, D. G., Roth, S. Y., and Allis, C. D. (1996). Tetrahymena histone acetyltransferase A: a homolog to yeast Gcn5p linking histone acetylation to gene activation. Cell 84, 843-851.
[0241] Cassidy-Hanley, D. M. (2012). Tetrahymena in the Laboratory: Strain Resources, Methods for Culture, Maintenance, and Storage. Methods Cell Biol. 109, 237-276.
[0242] Chen, X., Bracht, J. R., Goldman, A. D., Dolzhenko, E., Clay, D. M., Swart, E. C., Perlman, D. H., Doak, T. G., Stuart, A., Amemiya, C. T., et al. (2014). The architecture of a scrambled genome reveals massive levels of genomic rearrangement during development. Cell 158, 1187-1198.
[0243] Clapier, C. R., and Cairns, B. R. (2009). The biology of chromatin remodeling complexes. Annu. Rev. Biochem. 78, 273-304.
[0244] Cummings, D. J., Tait, A., and Goddard, J. M. (1974). Methylated bases in DNA from Paramecium aurelia. Biochim. Biophys. Acta 374, 1-11.
[0245] Debelouchina, G. T., Gerecht, K., and Muir, T. W. (2017). Ubiquitin utilizes an acidic surface patch to alter chromatin structure. Nat. Chem. Biol. 13, 105-110.
[0246] Eisen, J. A., Coyne, R. S., Wu, M., Wu, D., Thiagarajan, M., Wortman, J. R., Badger, J. H., Ren, Q., Amedeo, P., Jones, K. M., et al. (2006). Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote. PLoS Biol. 4, e286.
[0247] Eng, J. K., McCormack, A. L., and Yates, J. R. (1994). An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 5, 976-989.
[0248] Engel, J. D., and von Hippel, P. H. (1978). Effects of methylation on the stability of nucleic acid conformations. Studies at the polymer level. J. Biol. Chem. 253, 927-934.
[0249] Fang, W., Wang, X., Bracht, J. R., Nowacki, M., and Landweber, L. F. (2012). Piwi-interacting RNAs protect DNA against loss during Oxytricha genome rearrangement. Cell 151, 1243-1255.
[0250] Finn, R. D., Clements, J., Arndt, W., Miller, B. L., Wheeler, T. J., Schreiber, F., Bateman, A., and Eddy, S. R. (2015). HMMER web server 2015 update. Nucleic Acids Res. 43 (W1), W30-W38.
[0251] Fioravanti, A., Fumeaux, C., Mohapatra, S. S., Bompard, C., Brilli, M., Frandi, A., Castric, V., Villeret, V., Viollier, P. H. P., and Biondi, E. G. (2013). DNA binding of the cell cycle transcriptional regulator GcrA depends on N6-adenosine methylation in Caulobacter crescentus and other Alphaproteobacteria. PLoS Genet. 9, e1003541.
[0252] Fu, Y., Luo, G.-Z., Chen, K., Deng, X., Yu, M., Han, D., Hao, Z., Liu, J., Lu, X., Dore, L. C., et al. (2015). N6-methyldeoxyadenosine marks active transcription start sites in Chlamydomonas. Cell 161, 879-892.
[0253] Fyodorov, D. V., and Kadonaga, J. T. (2003). Chromatin assembly in vitro with purified recombinant ACF and NAP-1. Methods Enzymol. 371, 499-515.
[0254] Giardine, B., Riemer, C., Hardison, R. C., Burhans, R., Elnitski, L., Shah, P., Zhang, Y., Blankenberg, D., Albert, I., Taylor, J., et al. (2005). Galaxy: a platform for interactive large-scale genome analysis. Genome Res. 15, 1451-1455.
[0255] Goecks, J., Nekrutenko, A., and Taylor, J.; Galaxy Team (2010). Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11, R86.
[0256] Gorovsky, M. A., Hattman, S., and Pleger, G. L. (1973). (6N)methyl adenine in the nuclear DNA of a eucaryote, Tetrahymena pyriformis. J. Cell Biol. 56, 697-701.
[0257] Gottschling, D. E., and Cech, T. R. (1984). Chromatin structure of the molecular ends of Oxytricha macronuclear DNA: phased nucleosomes and a telomeric complex. Cell 38, 501-510.
[0258] Greer, E. L., Blanco, M. A., Gu, L., Sendinc, E., Liu, J., Aristizabal-Corrales, D., Hsu, C.-H., Aravind, L., He, C., and Shi, Y. (2015). DNA Methylation on N6-Adenine in C. elegans. Cell 161, 868-878.
[0259] Haberle, V., Forrest, A. R. R., Hayashizaki, Y., Carninci, P., and Lenhard, B. (2015). CAGEr: precise TSS data retrieval and high-resolution promoterome mining for integrative analyses. Nucleic Acids Res. 43, e51.
[0260] Hattman, S., Kenny, C., Berger, L., and Pratt, K. (1978). Comparative study of DNA methylation in three unicellular eucaryotes. J. Bacteriol. 135, 1156-1157.
[0261] Horton, J. R., Liebert, K., Bekes, M., Jeltsch, A., and Cheng, X. (2006). Structure and substrate recognition of the Escherichia coli DNA adenine methyltransferase. J. Mol. Biol. 358, 559-570.
[0262] Huang, Y., Niu, B., Gao, Y., Fu, L., and Li, W. (2010). CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 26, 680-682.
[0263] Huang, J., Dong, X., Gong, Z., Qin, L.-Y., Yang, S., Zhu, Y.-L., Wang, X., Zhang, D., Zou, T., Yin, P., et al. (2019). Solution structure of the RNA recognition domain of METTL3-METTL14 N6-methyladenosine methyltransferase. Protein Cell 10, 272-284.
[0264] Huff, J. T., and Zilberman, D. (2014). Dnmt1-independent CG methylation contributes to nucleosome positioning in diverse eukaryotes. Cell 156, 1286-1297.
[0265] Ito, T., Bulger, M., Pazin, M. J., Kobayashi, R., and Kadonaga, J. T. (1997). ACF, an ISWI-containing and ATP-utilizing chromatin assembly and remodeling factor. Cell 90, 145-155.
[0266] Iyer, L. M., Zhang, D., and Aravind, L. (2016). Adenine methylation in eukaryotes: Apprehending the complex evolutionary history and functional potential of an epigenetic modification. BioEssays 38, 27-40.
[0267] Karrer, K. M., and VanNuland, T. A. (1999). Nucleosome positioning is independent of histone H1 in vivo. J. Biol. Chem. 274, 33020-33024.
[0268] Katoh, K., Rozewicki, J., and Yamada, K. D. (2017). MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief. Bioinform.
[0269] Kharchenko, P. V., Tolstorukov, M. Y., and Park, P. J. (2008). Design and analysis of ChIP-seq experiments for DNA-binding proteins. Nat. Biotechnol. 26, 1351-1359.
[0270] Khurana, J. S., Wang, X., Chen, X., Perlman, D. H., and Landweber, L. F. (2014). Transcription-independent functions of an RNA polymerase II subunit, Rpb2, during genome rearrangement in the ciliate, Oxytricha trifallax. Genetics 197, 839-849.
[0271] Khurana, J. S., Clay, D. M., Moreira, S., Wang, X., and Landweber, L. F. (2018). Small RNA-mediated regulation of DNA dosage in the ciliate Oxytricha. RNA 24, 18-29.
[0272] Koziol, M. J., Bradshaw, C. R., Allen, G. E., Costa, A. S. H., Frezza, C., and Gurdon, J. B. (2016). Identification of methylated deoxyadenosines in vertebrates reveals diversity in DNA modifications. Nat. Struct. Mol. Biol. 23, 24-30.
[0273] Kuraku, S., Zmasek, C. M., Nishimura, O., and Katoh, K. (2013). aLeaves facilitates on-demand exploration of metazoan gene family trees on MAFFT sequence alignment server with enhanced interactivity. Nucleic Acids Res. 41, W22-W28.
[0274] Lai, W. K. M., and Pugh, B. F. (2017). Understanding nucleosome dynamics and their links to gene expression and DNA replication. Nat. Rev. Mol. Cell Biol. 18, 548-562.
[0275] Langmead, B., and Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357-359.
[0276] Laughlin, T. J., Henry, J. M., Phares, E. F., Long, M. V., and Olins, D. E. (1983). Methods for the Large-Scale Cultivation of an Oxytricha (Ciliophora: Hypotrichide). J. Protozool. 30, 63-64.
[0277] Lauth, M. R., Spear, B. B., Neumann, J., and Prescott, D. M. (1976). DNA of ciliated protozoa: DNA sequence diminution during macronuclear development of Oxytricha. Cell 7, 67-74.
[0278] Lawn, R. M., Neumann, J. M., Herrick, G., and Prescott, D. M. (1978). The gene-size DNA molecules in Oxytricha. Cold Spring Harb. Symp. Quant. Biol. 42, 483-492.
[0279] Liang, Z., Shen, L., Cui, X., Bao, S., Geng, Y., Yu, G., Liang, F., Xie, S., Lu, T., Gu, X., and Yu, H. (2018). DNA N6-Adenine Methylation in Arabidopsis thaliana. Dev. Cell 45, 406-416.
[0280] Lieleg, C., Ketterer, P., Nuebler, J., Ludwigsen, J., Gerland, U., Dietz, H., Mueller-Planitz, F., and Korber, P. (2015). Nucleosome spacing generated by ISWI and CHD1 remodelers is constant regardless of nucleosome density. Mol. Cell. Biol. 35, 1588-1605.
[0281] Liu, Y., Taverna, S. D., Muratore, T. L., Shabanowitz, J., Hunt, D. F., and Allis, C. D. (2007). RNAi-dependent H3K27 methylation is required for heterochromatin formation and DNA elimination in Tetrahymena. Genes Dev. 21, 1530-1545.
[0282] Liu, J., Yue, Y., Han, D., Wang, X., Fu, Y., Zhang, L., Jia, G., Yu, M., Lu, Z., Deng, X., et al. (2014). A METTL3-METTL14 complex mediates mammalian nuclear RNA N6-adenosine methylation. Nat. Chem. Biol. 10, 93-95.
[0283] Liu, J., Zhu, Y., Luo, G.-Z., Wang, X., Yue, Y., Wang, X., Zong, X., Chen, K., Yin, H., Fu, Y., et al. (2016). Abundant DNA 6mA methylation during early embryogenesis of zebrafish and pig. Nat. Commun. 7, 13052.
[0284] Livak, K. J., and Schmittgen, T. D. (2001). Analysis of relative gene expression data using real-time quantitative PCR and the 2(−Delta Delta C(T)) Method. Methods 25, 402-408.
[0285] Luger, K., Rechsteiner, T. J., and Richmond, T. J. (1999). Preparation of nucleosome core particle from recombinant histones. Methods Enzymol. 304, 3-19.
[0286] Luo, G.-Z., Blanco, M. A., Greer, E. L., He, C., and Shi, Y. (2015). DNA N(6)-methyladenine: a new epigenetic mark in eukaryotes? Nat. Rev. Mol. Cell Biol. 16, 705-710.
[0287] Luo, G.-Z., Hao, Z., Luo, L., Shen, M., Sparvoli, D., Zheng, Y., Zhang, Z., Weng, X., Chen, K., Cui, Q., et al. (2018). N6-methyldeoxyadenosine directs nucleosome positioning in Tetrahymena DNA. Genome Biol. 19, 200.
[0288] Mavrich, T. N., Ioshikhes, I. P., Venters, B. J., Jiang, C., Tomsho, L. P., Qi, J., Schuster, S. C., Albert, I., and Pugh, B. F. (2008). A barrier nucleosome model for statistical positioning of nucleosomes throughout the yeast genome. Genome Res. 18, 1073-1083.
[0289] Miao, W., Xiong, J., Bowen, J., Wang, W., Liu, Y., Braguinets, O., Grigull, J., Pearlman, R. E., Orias, E., and Gorovsky, M. A. (2009). Microarray analyses of gene expression during the Tetrahymena thermophila life cycle. PLoS ONE 4, e4429.
[0290] Miller, M. A., Pfeiffer, W., and Schwartz, T. (2010). Creating the CIPRES Science Gateway for Inference of Large Phylogenetic Trees. Proceedings of the Gateway Computing Environments Workshop (GCE), 14 Nov. 2010, New Orleans, La. pp. 1-8.
[0291] Mondo, S. J., Dannebaum, R. O., Kuo, R. C., Louie, K. B., Bewick, A. J., LaButti, K., Haridas, S., Kuo, A., Salamov, A., Ahrendt, S. R., et al. (2017). Widespread adenine N6-methylation of active genes in fungi. Nat. Genet. 49, 964-968.
[0292] Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L., and Wold, B. (2008). Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621-628.
[0293] Muller, M. M., Fierz, B., Bittova, L., Liszczak, G., and Muir, T. W. (2016). A two-state activation mechanism controls the histone methyltransferase Suv39h1. Nat. Chem. Biol. 12, 188-193.
[0294] Murray, I. A., Morgan, R. D., Luyten, Y., Fomenkov, A., Correa, I. R., Jr., Dai, N., Allaw, M. B., Zhang, X., Cheng, X., and Roberts, R. J. (2018). The non-specific adenine DNA methyltransferase M.EcoGII. Nucleic Acids Res. 46, 840-848.
[0295] Nesvizhskii, A. I., Keller, A., Kolker, E., and Aebersold, R. (2003). A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 75, 4646-4658.
[0296] Nowacki, M., Vijayan, V., Zhou, Y., Schotanus, K., Doak, T. G., and Landweber, L. F. (2008). RNA-mediated epigenetic programming of a genome-rearrangement pathway. Nature 451, 153-158.
[0297] Pendleton, K. E., Chen, B., Liu, K., Hunter, O. V., Xie, Y., Tu, B. P., and Conrad, N. K. (2017). The U6 snRNA m6A Methyltransferase METTL16 Regulates SAM Synthetase Intron Retention. Cell 169, 824-835.
[0298] Pratt, K., and Hattman, S. (1981). Deoxyribonucleic acid methylation and chromatin organization in Tetrahymena thermophila. Mol. Cell. Biol. 1, 600-608.
[0299] Prescott, D. M. (1994). The DNA of ciliated protozoa. Microbiol. Rev. 58, 233-267.
[0300] Rae, P. M., and Spear, B. B. (1978). Macronuclear DNA of the hypotrichous ciliate Oxytricha fallax. Proc. Natl. Acad. Sci. USA 75, 4992-4996.
[0301] Rappsilber, J., Mann, M., and Ishihama, Y. (2007). Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nat. Protoc. 2, 1896-1906.
[0302] Schaffer, A. A., Aravind, L., Madden, T. L., Shavirin, S., Spouge, J. L., Wolf, Y. I., Koonin, E. V., and Altschul, S. F. (2001). Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 29, 2994-3005.
[0303] Schiffers, S., Ebert, C., Rahimoff, R., Kosmatchev, O., Steinbacher, J., Bohne, A.-V., Spada, F., Michalakis, S., Nickelsen, J., Muller, M., and Carell, T. (2017). Quantitative LC-MS Provides No Evidence for m6dA or m4dC in the Genome of Mouse Embryonic Stem Cells and Tissues. Angew. Chem. Int. Ed. Engl. 56, 11268-11271.
[0304] Sledz, P., and Jinek, M. (2016). Structural insights into the molecular mechanism of the m(6)A writer complex. eLife 5. Published online Sep. 14, 2016. https://doi.org/10.7554/eLife.18434.
[0305] Strahl, B. D., Ohba, R., Cook, R. G., and Allis, C. D. (1999). Methylation of histone H3 at lysine 4 is highly conserved and correlates with transcriptionally active nuclei in Tetrahymena. Proc. Natl. Acad. Sci. USA 96, 14967-14972.
[0306] Struhl, K., and Segal, E. (2013). Determinants of nucleosome positioning. Nat. Struct. Mol. Biol. 20, 267-273.
[0307] Swart, E. C., Bracht, J. R., Magrini, V., Minx, P., Chen, X., Zhou, Y., Khurana, J. S., Goldman, A. D., Nowacki, M., Schotanus, K., et al. (2013). The Oxytricha trifallax macronuclear genome: a complex eukaryotic genome with 16,000 tiny chromosomes. PLoS Biol. 11, e1001473.
[0308] Taverna, S. D., Coyne, R. S., and Allis, C. D. (2002). Methylation of histone h3 at lysine 9 targets programmed DNA elimination in tetrahymena. Cell 110, 701-711.
[0309] Wada, R. K., and Spear, B. B. (1980). Nucleosomal organization of macronuclear chromatin in Oxytricha fallax. Cell Differ. 9, 261-268.
[0310] Wang, P., Doxtader, K. A., and Nam, Y. (2016a). Structural Basis for Cooperative Function of Mettl3 and Mettl14 Methyltransferases. Mol. Cell 63, 306-317.
[0311] Wang, X., Feng, J., Xue, Y., Guan, Z., Zhang, D., Liu, Z., Gong, Z., Wang, Q., Huang, J., Tang, C., et al. (2016b). Structural basis of N(6)-adenosine methylation by the METTL3-METTL14 complex. Nature 534, 575-578.
[0312] Wang, Y., Chen, X., Sheng, Y., Liu, Y., and Gao, S. (2017). N6-adenine DNA methylation is associated with the linker DNA of H2A.Z-containing well-positioned nucleosomes in Pol II-transcribed genes in Tetrahymena. Nucleic Acids Res. 45, 11594-11606.
[0313] Warda, A. S., Kretschmer, J., Heckert, P., Lenz, C., Urlaub, H., Hobartner, C., Sloan, K. E., and Bohnsack, M. T. (2017). Human METTL16 is a N6-methyladenosine (m6A) methyltransferase that targets pre-mRNAs and various noncoding RNAs. EMBO Rep. 18, 2004-2014.
[0314] Wei, Y., Mizzen, C. A., Cook, R. G., Gorovsky, M. A., and Allis, C. D. (1998). Phosphorylation of histone H3 at serine 10 is correlated with chromosome condensation during mitosis and meiosis in Tetrahymena. Proc. Natl. Acad. Sci. USA 95, 7480-7484.
[0315] Wu, T. P., Wang, T., Seetin, M. G., Lai, Y., Zhu, S., Lin, K., Liu, Y., Byrum, S. D., Mackintosh, S. G., Zhong, M., et al. (2016). DNA methylation on N(6)-adenine in mammalian embryonic stem cells. Nature 532, 329-333.
[0316] Xiao, R., and Moore, D. D. (2011). DamIP: Using Mutant DNA Adenine Methyltransferase to Study DNA-Protein Interactions In Vivo. Curr. Protoc. Mol. Biol. 21. https://doi.org/10.1002/0471142727.mb2121s94.
[0317] Xiao, C.-L., Zhu, S., He, M., Chen, D., Zhang, Q., Chen, Y., Yu, G., Liu, J., Xie, S.-Q., Luo, F., et al. (2018). N6-Methyladenine DNA Modification in the Human Genome. Mol. Cell 71, 306-318.
[0318] Xiong, J., Lu, X., Lu, Y., Zeng, H., Yuan, D., Feng, L., Chang, Y., Bowen, J., Gorovsky, M., Fu, C., and Miao, W. (2011). Tetrahymena Gene Expression Database (TGED): a resource of microarray data and co-expression analyses for Tetrahymena. Sci. China Life Sci. 54, 65-67.
[0319] Xiong, J., Lu, X., Zhou, Z., Chang, Y., Yuan, D., Tian, M., Zhou, Z., Wang, L., Fu, C., Orias, E., and Miao, W. (2012). Transcriptome analysis of the model protozoan, Tetrahymena thermophila, using Deep RNA sequencing. PLoS ONE 7, e30630.
[0320] Yao, B., Cheng, Y., Wang, Z., Li, Y., Chen, L., Huang, L., Zhang, W., Chen, D., Wu, H., Tang, B., and Jin, P. (2017). DNA N6-methyladenine is dynamically regulated in the mouse brain following environmental stress. Nat. Commun. 8, 1122.
[0321] Yao, B., Li, Y., Wang, Z., Chen, L., Poidevin, M., Zhang, C., Lin, L., Wang, F., Bao, H., Jiao, B., et al. (2018). Active N6-Methyladenine Demethylation by DMAD Regulates Gene Expression by Coordinating with Polycomb Protein in Neurons. Mol. Cell 71, 848-857.
[0322] Yerlici, V. T., and Landweber, L. F. (2014). Programmed Genome Rearrangements in the Ciliate Oxytricha. Microbiol. Spectr. 2. Published online December 2014. 10.1128/microbiolspec.MDNA3-0025-2014.
[0323] Zhang, Z., and Pugh, B. F. (2011). High-resolution genome-wide mapping of the primary structure of chromatin. Cell 144, 175-186.
[0324] Zhang, G., Huang, H., Liu, D., Cheng, Y., Liu, X., Zhang, W., Yin, R., Zhang, D., Zhang, P., Liu, J., et al. (2015). N6-methyladenine DNA modification in Drosophila. Cell 161, 893-906.
[0325] Zhou, C., Wang, C., Liu, H., Zhou, Q., Liu, Q., Guo, Y., Peng, T., Song, J., Zhang, J., Chen, L., et al. (2018). Identification and analysis of adenine N6-methylation sites in the rice genome. Nat. Plants 4, 554-563.
[0327] The embodiments described in this disclosure can be combined in various ways. Any aspect or feature that is described for one embodiment can be incorporated into any other embodiment mentioned in this disclosure. While various novel features of the inventive principles have been shown, described and pointed out as applied to particular embodiments thereof, it should be understood that various omissions and substitutions and changes may be made by those skilled in the art without departing from the spirit of this disclosure. Those skilled in the art will appreciate that the inventive principles can be practiced in other than the described embodiments, which are presented for purposes of illustration and not limitation.