COMPOSITIONS AND METHODS FOR TREATING OR AMELIORATING A MYCOBACTERIUM TUBERCULOSIS INFECTION
20230076063 · 2023-03-09
Inventors
- Faramarz VALAFAR (San Diego, CA, US)
- Samuel MODLIN (San Diego, CA, US)
- Derek CONKLE-GUTIERREZ (San Diego, CA, US)
Cpc classification
A61P31/00
HUMAN NECESSITIES
A61P43/00
HUMAN NECESSITIES
C12Y201/01072
CHEMISTRY; METALLURGY
C12Q2600/106
CHEMISTRY; METALLURGY
International classification
C12N15/113
CHEMISTRY; METALLURGY
Abstract
In alternative embodiments, provided are products of manufacture and kits, and methods, for treating or ameliorating a Mycobacterium tuberculosis (TB) or a Mycobacterium africanum infection. In alternative embodiments, provided are products of manufacture and kits, and methods, that comprise or comprise use of DNA methylation inhibitory molecules for treating or ameliorating a Mycobacterium tuberculosis (TB) infection. In alternative embodiments, provided are methods and devices for classifying drug-resistance phenotype, or diagnosing multi-drug resistant tuberculosis (MDR-TB) or extensively drug-resistant (XDR) tuberculosis, or for clinical decision support. In alternative embodiments, provided are kits for treating, diagnosing drug resistance of, prognosing, or assisting in clinical decision making for a Mycobacterium tuberculosis (TB) or a Mycobacterium africanum infection.
Claims
1: A method for treating or ameliorating a Mycobacterium tuberculosis (TB) or a Mycobacterium africanum infection, comprising inhibiting DNA methylation in an infecting Mycobacterium tuberculosis (TB) or a Mycobacterium africanum bacterium or bacterial population, the method comprising administering to an individual in need thereof a DNA methylation inhibitory molecule capable of inhibiting a Mycobacterium tuberculosis or a Mycobacterium africanum DNA methyltransferase, wherein optionally the DNA methylation inhibitory molecule is formulated as a pharmaceutical composition, or is formulated for administration in vivo; or formulated for enteral or parenteral administration, or for oral, intravenous (IV) or intrathecal (IT) administration, wherein optionally the compound or formulation is administered orally, parenterally, by inhalation spray, nasally, topically, intrathecally, intracerebrally, epidurally, intracranially or rectally, and optionally the DNA methylation inhibitory molecule or the formulation or pharmaceutical composition is contained in or carried in a nanoparticle, a particle, a micelle or a liposome or lipoplex, a polymersome, a polyplex or a dendrimer, and optionally the DNA methylation inhibitory molecule, or the formulation or pharmaceutical composition, is formulated as, or contained in, a nanoparticle, a liposome, a tablet, a pill, a capsule, a gel, a geltab, a liquid, a powder, an emulsion, a lotion, an aerosol, a spray, a lozenge, an aqueous or a sterile or an injectable solution, or an implant, and optionally the DNA methylation inhibitory molecule is an inhibitory nucleic acid, and optionally the inhibitory nucleic acid is contained in a nucleic acid construct or a chimeric or a recombinant nucleic acid, or an expression cassette, vector, plasmid, phagemid or artificial chromosome, optionally stably integrated into a TB cell's chromosome, or optionally stably episomally expressed in a TB cell, and optionally the inhibitory nucleic acid is or comprises: an RNAi inhibitory nucleic acid molecule, a double-stranded RNA (dsRNA) molecule, a microRNA (miRNA), a small interfering RNA (siRNA), an antisense RNA, a short hairpin RNA (shRNA), or a ribozyme.
2: The method of claim 1, wherein the Mycobacterium tuberculosis or the Mycobacterium africanum DNA methyltransferase is a methyltransferase selected from the group consisting of MamA, MamB and HsdM.
3: The method of claim 1, wherein the DNA methylation inhibitory molecule capable of inhibiting a Mycobacterium tuberculosis or Mycobacterium africanum DNA methyltransferase is or comprises a small molecule, an inhibitory nucleic acid (optionally an miRNA or antisense molecule), a polypeptide or peptide (optionally an antibody capable of specifically binding to the Mycobacterium tuberculosis or Mycobacterium africanum DNA methyltransferase and inhibiting its expression or activity), a lipid or a polysaccharide.
4: A kit for treating or ameliorating a tuberculosis (TB) infection, wherein optionally Mycobacterium tuberculosis (TB) or Mycobacterium africanum is the mycobacterial agent of infection, comprising a DNA methylation inhibitory molecule capable of inhibiting a Mycobacterium tuberculosis or Mycobacterium africanum DNA methyltransferase, wherein optionally the DNA methylation inhibitory molecule is or comprises a DNA methylation inhibitory molecule used to practice a method of claim 1, and optionally the kit further comprises instructions for practicing a method of any of the preceding claims.
5: A method for treating or ameliorating a tuberculosis (TB) infection, wherein optionally Mycobacterium tuberculosis (TB) or Mycobacterium africanum is the mycobacterial agent of infection, comprising inhibiting expression of at least one gene as set forth in Table 1 (
6: The method of claim 5, wherein the molecule capable of inhibiting expression of the gene or a polypeptide encoded by the gene is or comprises a small molecule, an inhibitory nucleic acid (optionally an miRNA or antisense molecule), a polypeptide or peptide (optionally an antibody capable of specifically binding to the Mycobacterium tuberculosis or the Mycobacterium africanum DNA methyltransferase and inhibiting its expression or activity), a lipid or a polysaccharide.
7: A kit for treating or ameliorating a Mycobacterium tuberculosis (TB) or a Mycobacterium africanum infection, comprising a molecule capable of inhibiting expression of at least one gene as set forth in Table 1 (
8: A method for identifying targets for treating, ameliorating, diagnosing, or prognosing infection by a microbial agent, the method comprising an analysis of single-molecule sequencing data, wherein the analysis comprises deducing knowledge of a DNA sequence and the boundaries of genetic elements encoded therein and deducing knowledge of the base modification status of bases comprising the deduced DNA sequence.
9: The method of claim 8, wherein the method provides evidence of druggability and/or utility to a user for helping to clear microbial infection, and the method further comprising a series of single-molecule sequencing data processing steps that incorporate signals of DNA sequence order and DNA sequence modification, such that their coincidence is inferred and coincidences between base modification and identified genetic elements of the sequence that evidence druggability and/or utility for helping to clear microbial infection are returned to the user.
10: The method of claim 8, wherein the genetic elements encoding a plurality of base modifying enzymes are deduced and/or prior knowledge of the identity of a plurality of genetic elements encoding base modifying enzymes is collated and correlated to sequencing kinetics of sequence contexts that are known/deduced to methylate, in order to deduce the presence or absence of the phenomenon of intercellular mosaic methylation in the analyzed sample.
11: The method of claim 8, wherein the single-molecule sequencing data is processed through a series of analyses and returns estimates of the likelihood of prognostic outcomes based on the presence, absence, or contingencies dictating the presence/absence of the phenomenon of intercellular mosaic methylation to the user of the embodiment.
12-32. (canceled)
Description
DESCRIPTION OF DRAWINGS
[0046] The drawings set forth herein are illustrative of exemplary embodiments provided herein and are not meant to limit the scope of the invention as encompassed by the claims.
[0047]-[0135] Brief descriptions of the individual drawings; the drawings are described in further detail in Examples 1 and 2, below.
[0136] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
[0137] Like reference symbols in the various drawings indicate like elements.
DETAILED DESCRIPTION
[0138] In alternative embodiments, provided are compositions, including products of manufacture and kits, and methods, for treating or ameliorating a Mycobacterium tuberculosis (TB) or a Mycobacterium africanum infection.
[0139] We have identified 371 M. tuberculosis genomic loci in gene promoters that are modified heterogeneously within clinical isolates. These modifications affect gene expression through their interaction with effectors of transcription. Of the identified loci, 63 fall within promoters of 33 genes whose products influence clinically important phenotypes (drug persistence, resistance, and tolerance). Heterogeneous modification at these loci can cause differential expression of these genes across bacilli of a population. Bacilli whose differential expression offers an advantage under environmental pressures (e.g., drug pressure) will survive and propagate, reducing drug treatment efficacy in infections that appear genetically susceptible to the prescribed antibiotics.
[0140] The fully de novo assembled set of DNA methylomes and genomes we have generated comprises the largest such set for M. tuberculosis and, to our knowledge, for any pathogen. This analysis was enabled by our unique methods of assembly and annotation, which allowed comprehensive identification of these modification sites for the first time. We have verified that these sites are present across global strains of M. tuberculosis and identified a subset of the sites described above that are positioned to modulate transcription of genes that are present across all or most clinical isolates examined, a challenging key step toward demonstrating their utility as therapeutic and diagnostic targets. Moreover, we have created methods for measuring their propensity to vary across members of the bacterial population isolated from a patient, demonstrating utility in prognostics that assess the capacity of a sample to persist until more permanent mechanisms of resistance to therapeutics emerge. We have identified a specific set of 33 genes whose expression influences key clinical phenotypes and is affected by methylation. These genes include efflux pumps and their regulators that influence drug tolerance, regulators of metabolic downshifts shown to induce a dormant phenotype, toxin-antitoxin modules that induce persistence through post-transcriptional mechanisms, and genes encoding products known to dictate resistance levels through both intrinsic and acquired mechanisms, and their regulation.
[0141] Described herein are molecular targets in the M. tuberculosis genome for the development of drugs, diagnostic tools, prognostic indicators, and clinical decision support for TB infection. These sites have been invisible to scientists despite widespread DNA sequencing because they operate above the DNA level, through the addition of chemical tags to DNA. These tags can change how much certain genes in the bacteria are used. Altered use of these genes can help the bacteria withstand drug treatments without dying, and is undetectable with existing diagnostics. Prioritizing these sites as targets for developing drugs, diagnostics, and prognostics holds promise to improve the toolkit available to doctors to effectively treat TB patients, and to epidemiologists to better control the TB pandemic, which kills more adults than malaria, AIDS, and all tropical diseases combined. This improved toolkit will enable doctors to make more informed treatment decisions and infectious disease scientists to more effectively control TB outbreaks.
[0142] Provided herein is a method of epigenetic diagnostics that can measure markers that change rapidly (within a day) in response to environmental and drug pressures. As such, our approach can be used as early as the first day of treatment to detect drug resistance and persistence that genetic markers can detect only months later.
[0143] Most if not all bacteria, including pathogens, possess nucleic acid modifying enzymes. Nucleic acid (base) modification refers to the addition of chemical species to a DNA base. When “epigenetic mosaicism” is occurring, these enzymes modify their target nucleic acids incompletely within each cell, giving rise to a mosaic of modified and unmodified DNA bases across the bacteria within infected tissues. The modification status at these bases can alter the phenotypes of infecting bacteria in clinically meaningful ways, affecting treatment outcome. “Epigenetic mosaicism” is our discovery, and we coin the term for the first time in Example 1.
[0144] We use DNA methylation as the example base modification throughout this document to demonstrate the principle of this invention. However, methods as described herein are not exclusive to DNA methylation, but can apply to other DNA base modifications as well. When we use the term “intercellular mosaic methylation”, the principle can be similarly applied to mosaic base modifications of other chemical species. The portion of the method involving inferring intercellular mosaic methylation from genotype, however, is restricted to base modifications conferred by genetically encoded mechanisms. We use sequencing kinetics as the signature for measuring modification from Pacific Biosciences SMRT-sequencing data; the same principles apply to the measured current changes in Oxford Nanopore data.
[0145] Intercellular mosaic methylation is a previously undescribed form of epigenetic heterogeneity where the base modification is DNA methylation. Intercellular mosaic methylation emerges from the previously described “intracellular stochastic methylation”, with the additional knowledge that the reads mapping to a particular site display average kinetics that resemble neither invariable methylation nor invariable non-methylation. This knowledge, in combination with observed intracellular stochastic methylation, implies a diverse array of combinations of methylated and nonmethylated sites across the cells from which the DNA sequences originated. This diversity of combinations of methylated bases is what we refer to as “intercellular mosaic methylation.” Intercellular mosaic methylation is notably distinct from the comparatively well-described phenomenon of phase variant methylation (
[0146] For example, see
[0147] These different methylation combinations cause differential phenotypes between members of the infecting bacterial population. Based on this realization, provided herein are methods to:
[0148] 1. Identify “intercellular mosaic methylation” from sequencing kinetics of DNA isolated from an organism, and the genetic or environmental factors that result in intercellular mosaic methylation. Subsequent steps are explained using the sequencing kinetics data, but the principle holds for Nanopore data as well, using current instead of kinetics.
[0149] a. Identify the presence or absence of intercellular mosaic methylation from sequencing kinetics;
[0150] b. Incorporate genomic data with (1a) to identify gene alleles for base-modifying enzymes that confer constitutive epigenetic mosaicism;
[0151] c. Apply environmental or genetic manipulation prior to sequencing to identify mutations, nutrient constraints, and stressors that cause or induce intercellular mosaic methylation.
[0152] 2. Identify DNA bases (loci) that will be differentially affected (because they are methylated in some cells and not in others) across cells of the sequenced population.
[0153] 3. Incorporate genomic annotation data to identify loci from (2) that are phenotypically consequential. For pathogens, the focus is on loci affecting clinically important phenotypes (e.g., drug tolerance, resistance, persistence, and entry to dormancy).
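The steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method as implemented: the two IPD-ratio cut-offs, the `Promoter` record, the gene names, and all numeric values are hypothetical. An intermediate average at a site is treated here only as a candidate signal, since the description also relies on within-read heterogeneity to establish mosaicism.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical cut-offs on the mean IPD ratio (not taken from the source).
UNMETHYLATED_MAX = 1.5
METHYLATED_MIN = 3.0


@dataclass(frozen=True)
class Promoter:
    gene: str   # placeholder gene name
    start: int  # first promoter coordinate, inclusive
    end: int    # last promoter coordinate, inclusive


def classify_site(read_ipd_ratios):
    """Label a motif site from the mean IPD ratio of the reads covering it."""
    average = mean(read_ipd_ratios)
    if average <= UNMETHYLATED_MAX:
        return "unmethylated"
    if average >= METHYLATED_MIN:
        return "methylated"
    return "intermediate"


def mosaic_candidate_loci(ipd_by_position):
    """Steps 1a and 2: positions whose average kinetics are intermediate."""
    return sorted(
        pos for pos, ratios in ipd_by_position.items()
        if classify_site(ratios) == "intermediate"
    )


def consequential_loci(loci, promoters):
    """Step 3: keep candidate loci that fall inside an annotated promoter."""
    hits = {}
    for pos in loci:
        genes = [p.gene for p in promoters if p.start <= pos <= p.end]
        if genes:
            hits[pos] = genes
    return hits


# Toy data: genome position -> per-read IPD ratios (invented values).
ipd_by_position = {
    100: [4.1, 3.8, 4.4],
    250: [1.0, 0.9, 1.1],
    400: [1.1, 4.0, 0.9, 3.9],
    900: [3.9, 1.0, 1.2, 4.2],
}
promoters = [Promoter("geneA", 380, 420), Promoter("geneB", 90, 110)]

candidates = mosaic_candidate_loci(ipd_by_position)
annotated = consequential_loci(candidates, promoters)
print(candidates)
print(annotated)
```

In this toy input, positions 400 and 900 have intermediate averages, and only position 400 lies inside an annotated promoter, so it is the single locus carried forward to step 3.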
[0154] We have applied methods as provided herein to Mycobacterium tuberculosis (M. tuberculosis), the primary bacterial cause of tuberculosis, which killed more humans (1.5 million) than any other infectious disease in 2018. Applying our methods to M. tuberculosis revealed several dozen loci that are targeted by base-modifying enzymes of M. tuberculosis and are positioned to alter expression of genes responsible for differential resistance within M. tuberculosis bacteria (in Example 1, see Table 1,
[0155] Example 1 describes intercellular mosaic methylation affecting areas of the genome in a bacterial pathogen, Mycobacterium tuberculosis, that are positioned such that they likely alter the level at which genes are expressed (Table 2, see
[0156] One of the key technical advancements described herein is using the dual presence of within-read methylation heterogeneity and the kinetic average at a single site to demonstrate mosaicism in patterns throughout the colony. We further innovated by developing methods that map the genotype of the modifying enzymes to basal epigenetic mosaicism and by demonstrating that such mosaicism is inducible by nutritional stress in bacterial strains with base-modifying enzymes that do not cause epigenetic mosaicism at baseline. Methods as provided herein also incorporate genome annotations to infer loci where modification status is most likely phenotypically consequential.
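To illustrate why both signals are needed, the following Python sketch simulates two hypothetical populations. In the first, every motif site in every cell is methylated independently (the stochastic case that yields a mosaic). In the second, each cell is either fully methylated or fully unmethylated (a phase-variant-like case). The cell count, site count, probability, and random seed are arbitrary assumptions chosen for illustration; nothing here is taken from the source data.

```python
import random


def simulate_mosaic(n_cells, n_sites, p_methylated, rng):
    """Each site in each cell is methylated independently."""
    return [
        tuple(rng.random() < p_methylated for _ in range(n_sites))
        for _ in range(n_cells)
    ]


def simulate_phase_variant(n_cells, n_sites, p_on, rng):
    """Each cell is entirely methylated or entirely unmethylated."""
    cells = []
    for _ in range(n_cells):
        enzyme_on = rng.random() < p_on
        cells.append((enzyme_on,) * n_sites)
    return cells


def site_fractions(cells):
    """Fraction of cells methylated at each site (the per-site average)."""
    n_cells = len(cells)
    n_sites = len(cells[0])
    return [sum(cell[i] for cell in cells) / n_cells for i in range(n_sites)]


rng = random.Random(0)  # fixed seed so the run is repeatable
mosaic = simulate_mosaic(500, 8, 0.5, rng)
phase = simulate_phase_variant(500, 8, 0.5, rng)

# Per-site averages are intermediate in both populations...
mosaic_fractions = site_fractions(mosaic)
phase_fractions = site_fractions(phase)

# ...but the number of distinct per-cell patterns differs sharply.
distinct_mosaic = len(set(mosaic))
distinct_phase = len(set(phase))
print(distinct_mosaic, distinct_phase)
```

The per-site average alone cannot separate the two cases, because both give roughly half-methylated sites. Looking within single cells (or, by analogy, within single reads) separates them: the phase-variant population contains at most two patterns, while the stochastic population contains many.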
[0157] We show that intercellular mosaic methylation 1) is detectable through analysis of sequencing kinetics data; 2) is likely to affect expression of genes mediating survival probability under drug treatment in Mycobacterium tuberculosis, the bacterial pathogen responsible for the most deaths of any infectious disease agent in the world; and 3) can be caused by the genotype of base-modifying enzymes or induced by nutrient starvation.
[0158] We have applied a computational pipeline to Mycobacterium tuberculosis. Through this, we discovered intercellular mosaic methylation (a form of epigenetic heterogeneity), and that it is constitutively present in some strains of M. tuberculosis isolated from patients, and absent in others. Moreover, we have discovered that this constitutive intercellular mosaic methylation is determined by genotype (See
[0159] We describe several hundred loci that are targeted by MTases and positioned to alter expression of genes through their effect on promoter strength and interaction with various molecular effectors of M. tuberculosis transcription. These include influencers of persistence, drug resistance, and drug tolerance in M. tuberculosis. This catalog relating MTase alleles to constitutive intercellular mosaic methylation allows development of diagnostic and prognostic tools of heteroresistance and persistence in M. tuberculosis through targeted genotypic assays. The results of these assays will inform infection control agencies and physicians of the capacity for isolates to heterogeneously modulate antibiotic resistance, drug tolerance levels, and persister cell formation propensity on a patient-specific basis. These phenomena often occur far below the sensitivity thresholds of extant diagnostic tools, challenging informed treatment and containment protocols by current methods. Finally, we envision a new line of TB treatment using phage therapy. This radically new approach uses the results of our MTase genotype analysis to design phages that can alter the infecting cells into a state that prevents the cells from diversifying (through methylation phasing or intercellular mosaic methylation).
[0160] Data described herein demonstrate that MTase genotype is consistently predictive of methylation activity level, and of constitutive intercellular mosaic methylation. Moreover, the set of M. tuberculosis clinical isolates we have studied demonstrates:
[0161] 1. Differential constitutive DNA methylation affects more of the genome than single-nucleotide polymorphisms, the most commonly compared information between strains for molecular diagnostics.
[0162] 2. DNA methylation is invariably present across all five major lineages of M. tuberculosis, and highly variable between strains in a manner that is not consistent with phylogeny alone.
[0163] 3. Constitutive intercellular mosaic methylation is more frequent in hypervirulent and extensively drug-resistant strains.
[0164] 4. By compiling the largest library of MTase allele to MTase activity mappings in the world, and through methodological advancements, we have corrected previously published mappings; see e.g.,
[0166] In the study described in Example 1, the methylomes of a global collection of 93 clinical isolates from all seven lineages of the M. tuberculosis complex (MTBC) were analyzed. The sequence of each isolate was de novo assembled into complete, circularized genomes and integrated with gene, promoter, and transcription factor binding site data, see
DNA Methyltransferase Inhibitory Molecules
[0167] In alternative embodiments, provided are products of manufacture and kits, and methods, that comprise or comprise use of DNA methylation inhibitory molecules for treating or ameliorating a Mycobacterium tuberculosis (TB) infection. In alternative embodiments, the DNA methylation inhibitory molecules can comprise small molecules, inhibitory nucleic acids and antibodies inhibitory to DNA methyltransferases (MTases) including MamA, MamB, and HsdM. In alternative embodiments, a DNA methylation inhibitory molecule is used as described in Yadav M K, et al (2015) The Small Molecule DAM Inhibitor, Pyrimidinedione, Disrupts Streptococcus pneumoniae Biofilm Growth In Vitro. PLoS ONE 10(10): e0139238; and as illustrated in
Products of Manufacture and Kits
[0168] Provided are products of manufacture and kits for practicing methods as provided herein, including DNA methylation inhibitory molecules, including for example, small molecules, inhibitory nucleic acids and antibodies inhibitory to DNA methyltransferases (MTases) including MamA, MamB, and HsdM.
[0169] Any of the above aspects and embodiments can be combined with any other aspect or embodiment as disclosed here in the Summary and/or Detailed Description sections.
[0170] As used in this specification and the claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise.
[0171] Unless specifically stated or obvious from context, as used herein, the term “or” is understood to be inclusive and covers both “or” and “and”.
[0172] Unless specifically stated or obvious from context, as used herein, the term “about” is understood as within a range of normal tolerance in the art, for example within 2 standard deviations of the mean. About can be understood as within 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value. Unless otherwise clear from the context, all numerical values provided herein are modified by the term “about.”
[0173] The entirety of each patent, patent application, publication and document referenced herein hereby is incorporated by reference. Citation of the above patents, patent applications, publications and documents is not an admission that any of the foregoing is pertinent prior art, nor does it constitute any admission as to the contents or date of these publications or documents. Incorporation by reference of these documents, standing alone, should not be construed as an assertion or admission that any portion of the contents of any document is considered to be essential material for satisfying any national or regional statutory disclosure requirement for patent applications. Notwithstanding, the right is reserved for relying upon any of such documents, where appropriate, for providing material deemed essential to the claimed subject matter by an examining authority or court.
[0174] Modifications may be made to the foregoing without departing from the basic aspects of the invention. Although the invention has been described in substantial detail with reference to one or more specific embodiments, those of ordinary skill in the art will recognize that changes may be made to the embodiments specifically disclosed in this application, and yet these modifications and improvements are within the scope and spirit of the invention. The invention illustratively described herein suitably may be practiced in the absence of any element(s) not specifically disclosed herein. Thus, for example, in each instance herein any of the terms “comprising”, “consisting essentially of”, and “consisting of” may be replaced with either of the other two terms. Thus, the terms and expressions which have been employed are used as terms of description and not of limitation; equivalents of the features shown and described, or portions thereof, are not excluded, and it is recognized that various modifications are possible within the scope of the invention. Embodiments of the invention are set forth in the following claims.
[0175] The invention will be further described with reference to the examples described herein; however, it is to be understood that the invention is not limited to such examples.
EXAMPLES
[0176] Unless stated otherwise in the Examples, all recombinant DNA techniques are carried out according to standard protocols, for example, as described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, NY and in Volumes 1 and 2 of Ausubel et al. (1994) Current Protocols in Molecular Biology, Current Protocols, USA. Other references for standard molecular biology techniques include Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, NY, and Volumes I and II of Brown (1998) Molecular Biology LabFax, Second Edition, Academic Press (UK). Standard materials and methods for polymerase chain reactions can be found in Dieffenbach and Dveksler (1995) PCR Primer: A Laboratory Manual, Cold Spring Harbor Laboratory Press, and in McPherson et al. (2000) PCR-Basics: From Background to Bench, First Edition, Springer Verlag, Germany.
Example 1: Epigenetic Mosaicism in Human Pathogen Mycobacterium tuberculosis Permits Rapid Adaptation without Genetic Mutation
[0177] This study analyzes the methylomes of 93 Mycobacterium tuberculosis complex clinical isolates, representing seven lineages, the largest such collection to date from a single species. By integrating DNA methylation data with fully annotated, de novo assembled finished genomes, we uncovered three key findings. First, gene promoters are frequently methylated, including the promoters of notable resistance and dormancy regulators. Second, isolates from different lineages often share methyltransferase activity profiles, demonstrating epigenetic similarity between genetically distant strains, yet few isolates match type strain H37Rv. Finally, intracellular stochastic DNA methylation generates a mosaic of methylomes within isogenic colonies, increasing phenotypic diversity. This “intercellular mosaic methylation” was driven by methyltransferase mutations in 40 isolates and could also be induced by methionine starvation. Mutation-driven intercellular mosaic methylation was most prevalent in the Beijing sublineage, potentially contributing to its global success. Intercellular mosaic methylation provides an epigenetic mechanism of phenotypic plasticity in M. tuberculosis, demonstrating an adaptive strategy previously undescribed in pathogens.
Results
[0178] Approach: We sought to compare methylomes across a global collection of mostly M/XDR clinical isolates (
Epigenomic Convergence Across Lineages
[0180] For every isolate, the average interpulse duration (IPD) ratio was calculated across the reads mapping to each base¹⁷. To characterize the noise in these measurements, we compared the IPD ratios at each base between technical replicates of reference strain H37Ra (see
[0182] We then identified all bases matching the known target motifs³ of established M. tuberculosis MTases and examined their IPD ratios (
Virulent M. tuberculosis Type Strain H37Rv Poorly Represents Methylomes of Recent Clinical Isolates.
[0184] With numerous distinct MTase activity profiles, we asked how well the commonly used virulent M. tuberculosis type strain H37Rv represents the methylomes of modern clinical isolates. In H37Rv both MamB and HsdM are inactive, while MamA is active, a rare activity profile shared with only 3% of clinical isolates (Table 3). Among the 42 isolates with a mamA knockout or knockdown mutation, a median of 3,424 MamA sites were differentially methylated from H37Rv. In contrast, the median SNP distance between H37Rv and clinical isolates was only 1,826 (
TABLE 3. Methyltransferase activity by isolate count. “Isolate Count” is the number of isolates in the dataset with the methyltransferase activity profile specified by the values of the “MamA”, “MamB”, and “HsdM” columns. “Active” denotes normal methyltransferase activity, while “Inactive” denotes reduced or absent activity.
Isolate Count   MamA       MamB       HsdM
32              Active     Active     Active
13              Active     Active     Inactive
2*              Active     Inactive   Active
35              Inactive   Active     Active
6**             Active     Inactive   Inactive
5***            Inactive   Active     Inactive
2               Inactive   Inactive   Active
*Includes knockdown variant K1033T.
**Includes reference strains H37Rv and H37Ra.
***Includes knockdown variant G152S.
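As a quick consistency check on Table 3, the rows can be tallied programmatically. The sketch below only re-adds the counts printed in the table; the data structure and function names are illustrative choices, not part of the source.

```python
# Rows of Table 3: (isolate count, MamA, MamB, HsdM); True means "Active".
TABLE_3 = [
    (32, True, True, True),
    (13, True, True, False),
    (2, True, False, True),
    (35, False, True, True),
    (6, True, False, False),
    (5, False, True, False),
    (2, False, False, True),
]
MTASES = ("MamA", "MamB", "HsdM")


def total_isolates(rows):
    """Sum of the "Isolate Count" column."""
    return sum(count for count, *_ in rows)


def inactive_counts(rows):
    """Number of isolates in which each MTase is listed as Inactive."""
    counts = dict.fromkeys(MTASES, 0)
    for count, *activity in rows:
        for name, active in zip(MTASES, activity):
            if not active:
                counts[name] += count
    return counts


total = total_isolates(TABLE_3)
inactive = inactive_counts(TABLE_3)
print(total)     # 95
print(inactive)  # {'MamA': 42, 'MamB': 10, 'HsdM': 24}
```

The MamA-inactive rows sum to 42, which agrees with the 42 isolates with a mamA knockout or knockdown mutation mentioned in the text. The grand total of 95 would be consistent with the 93 clinical isolates plus the two reference strains, assuming both reference strains are counted in the table as the second footnote suggests.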
Diverse Mutations Drive DNA Methyltransferase Activity Profiles.
[0185] Cumulatively, the 93 M. tuberculosis and M. africanum isolates harbored 40 distinct mutations within the known MTase genes mamA, mamB, and hsdM/hsdS, including 32 previously unreported, see
[0186] Recently, mamB D59G was reported as the sole variant in a MamB inactive isolate (SRA: ERP009820). However, we identified four isolates harboring D59G, all of which also harbored V616A. Two of the four isolates were active, and had no other mutations. The remaining two isolates were MamB inactive, and carried a 1356 bp insertion that we have identified as an IS6110 insertion sequence. One of these inactive isolates was the same isolate recently reported with mamB D59G alone. The prior study did not report the mamB insertion, likely due to their reference-mapping of short reads to call variants. However, it is unclear why their methods did not capture V616A. As MamB was active in isolates carrying mamB D59G without the insertion, we conclude the insertion was responsible for MamB knockout.
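The reasoning in this paragraph can be expressed as a small set computation: a variant is a plausible cause of MamB inactivity only if it is shared by every inactive isolate and absent from every active one. The sketch below encodes the four D59G isolates as described in the text. The isolate identifiers are placeholders, and the rule itself is a simplification that assumes a single causal variant with no interactions.

```python
# The four isolates harboring mamB D59G, as described in the text.
# Isolate identifiers are hypothetical placeholders.
ISOLATES = [
    {"id": "isolate_1", "variants": {"D59G", "V616A"}, "mamb_active": True},
    {"id": "isolate_2", "variants": {"D59G", "V616A"}, "mamb_active": True},
    {"id": "isolate_3",
     "variants": {"D59G", "V616A", "IS6110 insertion"},
     "mamb_active": False},
    {"id": "isolate_4",
     "variants": {"D59G", "V616A", "IS6110 insertion"},
     "mamb_active": False},
]


def discriminating_variants(isolates):
    """Variants found in every inactive isolate and in no active isolate."""
    inactive = [i["variants"] for i in isolates if not i["mamb_active"]]
    active = [i["variants"] for i in isolates if i["mamb_active"]]
    if not inactive:
        return set()
    shared_by_inactive = set.intersection(*inactive)
    seen_in_active = set().union(*active)
    return shared_by_inactive - seen_in_active


candidates = discriminating_variants(ISOLATES)
print(candidates)  # {'IS6110 insertion'}
```

Because D59G and V616A both occur in MamB-active isolates, they are excluded, leaving the insertion as the only variant that tracks with loss of activity.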
[0187] In addition to identifying knockout mutations, the comparison revealed several “knockdown” mutations, whose isolates had IPD ratio distributions consistent with neither full methylation nor unmethylation. Bases targeted by MTases with these mutations had faster kinetics than wild-type isolates yet slower than knockout isolates (
M. tuberculosis Clinical Isolates Exhibit Intercellular Mosaic Methylation
[0188] To confirm heterogeneous methylation, we ran SMALR on knockout, knockdown, and wild-type isolates. SMALR.sup.15 detects heterogeneity from SMRT sequencing kinetics by averaging the kinetics signals at multiple MTase motif sites within single sequencing reads, to calculate a “native score” for each read. The 54 mamA wild-type isolates had a normal distribution of native scores with a mean of 2.14 (
[0189]
[0190] The four isolates with knockout variant mamA:W136R each distributed normally with a mean of −0.107 (
[0191] All isolates (n=34) with knockdown variant mamA:E270A also displayed evidence of stochastic methylation, but with a lesser methylated fraction. Their reads had a mean native score of 0.0558. This mean was significantly above that of W136R (p<2.2e-16), indicating stochastic methylation. We also analyzed heterogeneous methylation at MamB motif sites. MamB native scores in mamB wild-type and knockout isolates were similar to MamA native scores in mamA wild-type and knockout isolates, respectively (2.19, −0.192,
[0192]
[0193] MamB possessed one knockdown genotype, mamB:K1033T, found in a single isolate. Like mamA:E270A, mamB:K1033T had a significantly greater native score than that of the knockout genotypes (p<2.2e-16). HsdM motif sites occurred too infrequently across the genome for same-read analysis of multiple sites with SMALR, preventing us from drawing conclusions about heterogeneity at HsdM motif sites.
[0194] Because the methylation of each MTase motif site is independent between cells, intracellular stochastic methylation results in diverse combinations of methylated motif sites within an isogenic colony. We call this epigenetic mosaicism “intercellular mosaic methylation”. Intercellular mosaic methylation is notably distinct from phase variant MTase knockout, in which a subpopulation of cells has an inactive MTase. Phase variant MTase knockout causes a portion of an isolate's reads to be entirely methylated, and another portion to be entirely unmethylated. Native scores of phase variant MTase knockouts would distribute bimodally (
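The distinction between the two models can be illustrated with a minimal sketch. This is not SMALR itself; it only enumerates the per-read methylated fraction under each model, assuming a hypothetical read that spans k motif sites and a methylated fraction f. All function names are illustrative.

```python
from math import comb

def mosaic_read_distribution(k, f):
    """Per-read methylated fraction when each of k motif sites on a read is
    methylated independently with probability f (stochastic methylation)."""
    return {m / k: comb(k, m) * f**m * (1 - f)**(k - m) for m in range(k + 1)}

def phase_variant_read_distribution(f):
    """Per-read methylated fraction when a fraction f of cells carries an
    active MTase, so each read is wholly methylated or wholly unmethylated."""
    return {0.0: 1 - f, 1.0: f}

def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(x * p for x, p in dist.items())
    return sum(p * (x - mean) ** 2 for x, p in dist.items())

# Same mean methylated fraction, very different spread across reads:
# the mosaic model is unimodal around f, the phase-variant model is bimodal.
mosaic_var = variance(mosaic_read_distribution(k=10, f=0.5))
phase_var = variance(phase_variant_read_distribution(f=0.5))
```

Under these assumptions the mosaic variance is f(1−f)/k while the phase-variant variance is f(1−f), which is why averaging over several sites per read separates the two cases.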
[0195] Next, we asked whether nutrient restriction could cause intercellular mosaic methylation in isolates with wild-type MTase function. We ran SMALR on published kinetic data from metA-knockout H37Rv methionine auxotrophs (ΔmetA) SMRT-sequenced following 5 days of methionine starvation.sup.21.
[0196] To test for intercellular mosaic methylation in ΔmetA we compared its native score distribution to a mixture of wholly methylated and wholly unmethylated reads (
Anomalous Methylation Patterns in Orphan MTase Motif Sites
[0197] Next, we surveyed all common motif sites (present in greater than or equal to 75 isolates, n=4,486;
[0198] Assessing the consistency and magnitude of IPD ratios within active wild-type isolates for each MTase across motif sites revealed three interesting features. First, while most motif sites are primarily methylated (light, yellow), a subset had significantly lower median IPD (
[0199] The converse was also true. Some motif sites with low median IPDs had higher IPDs in a subset of isolates. Third, knockdown mutations had distinct methylation profiles with IPDs higher than knockout mutants and lower than wild-type. All three features were more pronounced in motif sites of orphan MTases than of MamB (
[0200] We then sought to identify motif sites that were hypervariable across strains with active MTases. We reasoned that such hypervariability would indicate differential selection for methylation status, highlighting interesting differences between strains. To find hypervariable sites, we calculated the standard deviation in IPD ratio across isolates (variation) for each shared MTase site. Variation across isolates for each qualifying motif site was compared against the distribution of variation in MamB sites (
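A minimal sketch of this comparison follows. The cutoff used to call a site hypervariable is not stated in this passage, so the empirical quantile of the reference (MamB) distribution used below is an assumption, as are the function names and data layout.

```python
from statistics import stdev

def site_variation(ipd_by_site):
    """Map each motif site to the standard deviation of its IPD ratio across
    isolates. Sites observed in fewer than two isolates are skipped."""
    return {site: stdev(values)
            for site, values in ipd_by_site.items() if len(values) >= 2}

def hypervariable_sites(candidate_sd, reference_sd, quantile=0.99):
    """Flag candidate sites whose variation exceeds an empirical quantile of
    the reference distribution. The quantile value is an assumption."""
    ranked = sorted(reference_sd.values())
    cutoff = ranked[min(len(ranked) - 1, int(quantile * len(ranked)))]
    return sorted(site for site, sd in candidate_sd.items() if sd > cutoff)
```

For example, `hypervariable_sites(site_variation(orphan_sites), site_variation(mamB_sites))` would return the orphan-MTase sites that vary more across isolates than nearly all MamB sites, where `orphan_sites` and `mamB_sites` are hypothetical dictionaries of per-site IPD ratios.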
TABLE-US-00005 FIG. 20 legend:
RefLoc: References the distance to the nearest CDS boundary.
TSSid: The CDS downstream of the TSS ahead of which the locus falls, and the number of base pairs upstream of the TSS that the targeted adenine is positioned.
TSS: The CDS downstream of the TSS within which the locus falls.
median: The median log2(IPD Ratio) at the specified locus among all sites with the specified MTase active.
mean: The mean log2(IPD Ratio) at the specified locus among all sites with the specified MTase active.
sd: The standard deviation of log2(IPD Ratio) at the specified locus among all sites with the specified MTase active.
[0201]
[0202] We then contrasted variability of methylation at each motif site across isolates of the same activity level (knockdown, knockout, or wild-type,
[0203]
Hypomethylated MTase Motif Sites are Rare Yet Remarkably Consistent Across Isolates
[0204] Every isolate possessed a handful of unmethylated motif sites targeted by otherwise active MTases. Hypomethylated sites like these have previously been found in M. tuberculosis.sup.3 and many other bacteria.sup.9,22. Per active isolate there were on average 20.7 hypomethylated HsdM sites, 13.4 hypomethylated MamA sites, and 0.289 hypomethylated MamB sites (
TABLE-US-00006
Position    Hypomethylated    All    P-value        Nearby Motif
1719        51                51     2.18E−126      No
1716        49                51     2.3367E−118    No
7           39                51     1.22701E−85    No
472         38                50     2.73095E−83    Yes
475         35                50     1.23593E−74    Yes
1272        31                35     2.14038E−72    No
1275        26                35     5.92888E−57    No
447         20                33     2.88804E−41    No
1664        13                20     7.05078E−28    No
23          14                31     8.0244E−27     No
598         13                25     4.65494E−26    Yes
325         12                18     4.9305E−26     No
1069        13                43     3.09114E−22    Yes
2728        13                50     2.93266E−21    No
796         11                49     2.03632E−17    No
880         8                 13     2.46189E−17    No
416         10                50     2.07856E−15    Yes
1469        9                 44     4.24545E−14    No
4           9                 51     1.78287E−13    No
1661        7                 20     4.21414E−13    No
1662        7                 24     1.85898E−12    No
1066        8                 43     2.53081E−12    Yes
642         8                 48     6.486E−12      No
389         8                 50     9.17186E−12    Yes
787         8                 51     1.08451E−11    No
3454        4                 4      1.39369E−10    No
1309        7                 46     2.69087E−10    No
595         6                 25     2.75514E−10    Yes
877         5                 13     6.02318E−10    No
1439        4                 6      2.07906E−09    No
10          6                 48     1.78409E−08    No
712         6                 51     2.59529E−08    No
1409        3                 3      4.05624E−08    No
37          4                 14     1.35722E−07    No
[0205] Table Y summarizes a hypomethylation analysis of consistently hypomethylated MTase motif sites across 93 clinical Mycobacterium tuberculosis and Mycobacterium africanum isolates. MTase motif site loci were assigned by our methylome annotation pipeline, using proximal H37Rv gene references transferred by Rapid Annotation Transfer Tool (http://ratt.sourceforge.net/). Consistently hypomethylated loci were classified as unmodified by our Bayesian analysis in a significant number of isolates in which the relevant MTase was mostly active. Significance was calculated using the cumulative binomial test, setting the number of MTase-active isolates where a locus was present as the number of trials, and the number of said isolates where the locus was hypomethylated as the number of successes. At the 0.01 significance level, the p-value threshold for significance was 4.72E-07, after a Bonferroni correction for the number of loci tested. Sheet one contains the hypomethylated MamA motif site loci, sheet two contains the hypomethylated MamB loci, and sheet three contains the hypomethylated HsdM loci.
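The test described above can be sketched as follows. The null probability that a locus is called unmodified in a single isolate is not given in this passage, so `p_null` is a hypothetical parameter. The number of loci tested is also not stated; the value of approximately 21,186 used in the comment is only back-calculated from the reported threshold (0.01 / 4.72E-07) and should be read as an assumption.

```python
from math import comb

def binomial_upper_tail(successes, trials, p_null):
    """Cumulative binomial probability P(X >= successes) for
    X ~ Binomial(trials, p_null).

    trials    = MTase-active isolates in which the locus is present
    successes = those isolates in which the locus was hypomethylated
    p_null    = assumed per-isolate chance of a hypomethylated call (hypothetical)
    """
    return sum(comb(trials, k) * p_null**k * (1 - p_null)**(trials - k)
               for k in range(successes, trials + 1))

def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold after Bonferroni correction."""
    return alpha / n_tests

# Illustration only: 0.01 / 21186 is approximately 4.72E-07.
threshold = bonferroni_threshold(0.01, 21186)
```

A locus would then be reported as consistently hypomethylated when `binomial_upper_tail(...)` falls below `threshold`.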
[0206] This consistency would be unlikely if hypomethylation occurred randomly, suggesting there are conserved mechanisms blocking methylation at these sites.
Transcription Factor Occlusion Explains Most Hypomethylated Sites
[0207] In other bacteria, hypomethylation results from transcription factor occlusion blocking the MTase when their respective target motifs match the same site in the genome.sup.8-10. To determine if this was the case in M. tuberculosis, we scanned the context sequence of each consistently hypomethylated site for transcription factor binding site (TFBS) motifs previously characterized in M. tuberculosis.sup.19. All 58 consistently hypomethylated HsdM loci matched at least one significant TFBS motif (p-value <0.0001, converted log-likelihood ratio score), while only 14 of the 34 consistently hypomethylated MamA loci significantly matched a TFBS motif (p-value <0.0001, converted log-likelihood ratio score; Table 1, see
[0208]
[0209] The abundance of TFBS matches at HsdM motif loci may be due to the lower stringency of its motif (HsdM: GATNNNNRTAC (SEQ ID NO:21), MamA: (SEQ ID NO:6) CTGGAG). Notably, the transcription factor binding site (TFBS) motif of oxidation-sensing regulator mosR (Rv1049).sup.28 matched multiple hypomethylated MamA and HsdM loci, and the mosR gene itself had a hypomethylated MamA locus 7 bp upstream of its TSS (Table 1).
[0210] One particularly intriguing example of site-specific hypomethylation was cobK:304, the HsdM motif site 304 bp inside the gene cobK. This locus was hypomethylated in 50 HsdM active isolates, yet methylated in 18 isolates (Table 1, see also
[0211] On the other hand, in some bacteria hypomethylation has also been observed when MTase motif sites were in close proximity.sup.12. To find such instances, we scanned consistently hypomethylated loci for nearby MTase motifs (Table 1, see also
[0212]
Table 1, Illustrated as
[0213] The top 20 most significant hypomethylated loci from each MTase. For each methyltransferase (“MTase”) motif target locus (“Gene”, “Sense”, and “Position”), we counted the number of isolates in which the locus was hypomethylated and the total number of isolates that possessed the locus (“Hypomethylated”). This fraction was used to perform a cumulative binomial probability test (“P-value”). Loci with p-values below 4.72E-07 were considered significant at the 0.01 significance level, after Bonferroni correction for multiple hypothesis testing. Loci were assigned by our methylome annotation pipeline using H37Rv reference annotations transferred from RATT.sup.30. For each palindromic pair, the locus with the most significant hypomethylated fraction is reported. In case of a tie, the locus on the same strand as the gene is reported. The fraction of active isolates hypomethylated at the partner site is included (“Palindrome”). The surrounding 20 bases of each locus were scanned for transcription factor binding site motifs previously characterized in M. tuberculosis.sup.27. The most significant motif match was included (“Top TF”). Only transcription factor binding motifs with an E-value below 0.01 were scanned for, and only matches with a p-value (converted log-likelihood ratio score) below 0.0001 were reported. MTase motif loci less than 100 bp from another locus targeted by the same MTase were labeled (“Yes” in column “Nearby Motif”). Genes that were previously reported.sup.4 to contain frequently hypomethylated sites are marked with an asterisk.
Methylation is Widespread and Distinctly Patterned at Promoters
[0214] Next, we systematically probed promoters with MTase motif sites to identify common configurations between motif sites and characterized TSSs.sup.31,32. Within promoter regions (<50 bp upstream from the TSS), targeted adenines of MamA and HsdM motifs had distinct peaks at the edges of the −10 element (
[0215] Next, we scanned for SFBS motifs overlapping promoter MTase motif sites. Sigma factors SigA and SigB overlapped MTase motif sites most frequently (
[0216]
Hypervariable Promoter Methylation Across Isolates Suggests Epigenetic Selection In Vitro.
[0217] Next, we cross-checked the hypervariable motif sites with promoter motif sites to identify sites of potential differential epigenetic regulation in vitro. Promoters of fifteen genes harbored hypervariable motif sites (
[0218] Seven motif sites comprise a cluster of hypervariable sites in the spacer between the −10 and −35 elements (19-24 bp range,
[0219] HsdM promoter methylation is associated with transcription levels of downstream genes. Notably, Rv1813c is hypervariable and has a motif site 11 bp upstream of its TSS, overlapping a SigA SFBS. Rv1813c was recently reported to be significantly under-expressed following ΔhsdM, but the authors did not identify the SigA overlap with this motif site in the Rv1813c promoter.sup.14. This discovery prompted us to re-evaluate the ΔhsdM differential expression results, from which HsdM methylation was recently reported to have no direct influence on transcription at methylated promoters. In that work, the authors defined “differentially expressed” genes using thresholds on both significance (adjusted p-value ≤0.05) and magnitude (|log2 fold change| ≥1). Since we are interested in the mechanism (Does HsdM promoter methylation have the capacity to influence transcription?) rather than the magnitude of its effect, we defined differentially expressed genes according only to significance. With these criteria, 310 genes (
[0220]
[0221] Nine of these 11 ΔhsdM-DE genes with HsdM promoter motifs overlapped with the −10 promoter element (
Promoter Methylation Implicates Mediators of Clinically Important Phenotypes
[0222] Our discovery of intercellular mosaic methylation, consistent hypomethylation, and widespread promoter methylation reveal multiple mechanisms for epigenetic gene regulation. To determine what processes and phenotypes are potentially regulated by these mechanisms, we examined the functional annotations of genes with methylated promoters and hypomethylated sites. From this examination emerged genes involved in host-lipid metabolism, drug resistance, metal ion homeostasis, and key regulators (Table 2, see
[0223] Host-derived fatty acids and cholesterol are favored carbon sources for M. tuberculosis in macrophage.sup.30. From host lipids, M. tuberculosis can generate energy, fuel central carbon metabolism, and synthesize cell wall components. Promoter MTase motifs and hypomethylated motifs fell in genes required to acquire these host lipids, dictate their metabolic fate, and detoxify intermediates generated during their utilization (Table 2, see
[0224] Frequently hypomethylated genes accE5, bioB, and cobK (Table 1, see
[0225] Rather than generating energy through the TCA cycle, methylmalonyl-CoA intermediates can also be assembled into virulence lipids. This alternative pathway is mediated by pks genes.sup.42, which harbor hypomethylated motif sites (pks6 and pks9, Table 1, see
[0226] MTase motifs reside within promoters of genes mediating both intrinsic and acquired drug resistance (Table 2, see
[0227] Promoter methylation of these genes likely influences efflux pump activity and metabolic quiescence, two primary sources of phenotypic heterogeneity in persister cells.sup.51. Intercellular mosaic methylation may thus imbue some bacilli with methylation patterns that alter expression favorably for tolerating drug pressure. The epigenetically defined tolerant minority would then enable colony survival in fluctuating drug concentrations, buying time for genetic resistance mechanisms to emerge under prolonged pressure.
[0228] Curiously, promoter methylation patterns in RaaS (Rv1219c) converge between distant isolates (
Table 2, Illustrated as FIG. 9:
[0229] Systems implicated at putative epigenetically modulated promoters and consistently hypomethylated sites. Genes implicated as epigenetically regulated that are involved in clinically relevant processes. These genes mediate known intrinsic and acquired resistance mechanisms, metabolism of host-derived lipids and flux through subsequent metabolic pathways, and metal ion homeostasis.
[0230] The final implicated process is metal ion homeostasis. Cobalt (corA), magnesium (corA), copper (lpqS), and iron (mmpS4, higA, mbtJ, and hemN) homeostasis genes harbor methylated promoters (Table 2, see
Discussion
[0231] Here, we leverage third-generation sequencing technology to investigate DNA methylation at single-nucleotide resolution, an underexplored source of variation among the MTBC. We assembled, annotated, and compared DNA methylomes of 93 clinical isolates spanning all seven MTBC lineages. This comprehensive survey clarified the diversity and function of MTase variants across the MTBC and produced several novel findings. First, the methylome of virulent M. tuberculosis type strain H37Rv is dissimilar to methylomes of recent clinical isolates. Second, promoter methylation is abundant, occurs in a conserved configuration with sigma factor binding sites, and is upstream of key mediators of drug resistance and other clinically important phenotypes. Third, the methylomes of genetically distant isolates converged in several cases, exposing the limitations of using genetic distance alone as a proxy for phenotypic similarity in M. tuberculosis. Fourth, intracellular stochastic methylation in individual cells creates intercellular mosaic methylation within M. tuberculosis colonies. This intercellular mosaic methylation is genetically driven in some isolates and can be induced by methionine starvation in H37Rv. Finally, our re-analysis of RNAseq data in wild-type versus HsdM-knockout demonstrates direct transcriptional influence by HsdM promoter methylation (
[0232] Kinetics data analysis of all known MTase motif sites revealed knockdown MTase mutations that inspired subsequent heterogeneity analysis. The heterogeneity analysis identified four unique MTase variants that conferred intermediate IPD ratios at all target motif sites, suggesting epigenetic heterogeneity. SMALR.sup.15 confirmed this heterogeneity in mamA:E270A and mamA:G152S isolates, and characterized the phenomenon as intracellular stochastic methylation, rather than phase variant MTase knockout. In stochastic methylation, the methylation status of each MTase target site varies independently between cells. The resulting subpopulations carry diverse combinations of methylated and unmethylated sites, a phenomenon we have termed “intercellular mosaic methylation”. Further analysis demonstrated intercellular mosaic methylation occurs even with wild-type MTases, when under methionine starvation. This suggests nutritive stress may diversify phenotype through differential methylation patterns. Intercellular mosaic methylation appears to serve as an adaptive response and as a constitutive source of diversity in some isolates.
[0233] The most frequent variant associated with intercellular mosaic methylation, mamA:E270A, was ubiquitous among Beijing isolates, and may contribute to their global success. Intercellular mosaic methylation may confer an enhanced ability to colonize new hosts with diverse genetic background and immunities through varied modes of transmission. Indeed, methylated promoters are present in many genes linked to hallmarks of Beijing sublineage: facile dormancy induction.sup.75, increased host-lipid utilization, TAG accumulation in aerobic environments.sup.76, and increased synthesis of cell envelope components and virulence lipids.sup.77 (Table 2, see
[0234] M. tuberculosis has evolved diverse transcription factors that invoke transcriptional programs to promote survival in microenvironments throughout its lifecycle. Yet transcriptional responses to environmental changes are delayed.sup.39, begging the question: How does M. tuberculosis survive before these transcriptional responses take hold? Our findings support a model in which intercellular mosaic methylation imbues some bacilli with methylation patterns that influence transcription favorably for survival in a particular set of conditions. Then, upon appearance of this set of conditions, subpopulations with advantageous methylation patterns survive long enough for transcriptional reconfiguration to manifest through genetically encoded transcriptional programs. This model of intercellular mosaic methylation-driven heterogeneity is consistent with prior observations of M. tuberculosis “persister cells”.sup.40, minority groups that are pre-adapted to tolerate initial exposure to macrophage.sup.41 and drug pressure.sup.42, by entering dormancy.sup.43 or activating efflux pumps.sup.44,45. Reconciling observations of persister cells with our described model requires MTase motifs to affect transcription of the genes mediating persistence, and a plausible mechanism for DNA methylation to influence transcription. We find evidence for both these requirements.
[0235] Promoter methylation motifs implicate dormancy, antimicrobial resistance, and metal ion homeostasis as processes regulated in part by DNA methylation. Rv1813c and hrp1 are especially highly expressed members of the M. tuberculosis dormancy regulon, and hypervariable across isolates with active MTase (
[0236] Intercellular mosaic methylation-driven heterogeneity also implicates the metabolic side of dormancy. Transcriptional influence by the −10 promoter element motif site of ramB.sup.47 is an intriguing candidate for future investigation. RamB mediates the glyoxylate shunt.sup.48 through transcriptional regulation of isocitrate lyase (Icl1), a key player in central metabolism, handling oxidative stress, and tolerating antimicrobials.sup.49. Hypervariable, ΔhsdM-DE promoter methylation of glpX also implicates dormancy metabolism. Its product, GlpX, is the rate-limiting enzyme of gluconeogenesis, the pathway through which dormant M. tuberculosis furnishes energy.sup.50.
[0237] In vivo, the human immune system imposes its own dynamic selective pressure on M. tuberculosis, which remains incompletely understood. Several of the better characterized immune pressures destroy a majority of bacilli, while a minority subpopulation survives. For example, minor subpopulations of M. tuberculosis successfully rupture host phagosomes, allowing access to the host cytoplasm.sup.81. Intercellular mosaic methylation may play a role in establishing this heterogeneity, allowing the pathogen to employ multiple strategies simultaneously to combat the host immune system.
[0238] Multi-omic integration with annotated and assembled whole methylomes revealed widespread epigenetic gene regulation through promoter methylation. MTase target motifs frequently coincided with classical promoter elements (
[0239] Several genes with methylated promoters are deeply linked to drug resistance, host lipid metabolism, persister cell formation, and key metabolic shifts in vivo. This linkage has two key implications. First, these genes, and their associated metabolic processes, appear to be epigenetically regulated through DNA methylation status. Second, through intercellular mosaic methylation, colonies gain access to a broader range of phenotypes that are not immutably fixed through chromosomal mutation. Rather, the methylomes are passed vertically, through semi-heritable epigenetic inheritance. This may enhance colony robustness against changing conditions while preserving a majority subpopulation that is phenotypically adapted to current conditions. The notion that intercellular mosaic methylation confers an adaptive advantage is supported by its convergence across three lineages, and its emergence following methionine starvation in ΔmetA mutants.
[0240] We cannot extrapolate directly from the DNA methylation patterns we report here to what occurs during infection. Sequencing kinetics are measured from DNA extracted after extensive culturing, during which any methylomic adaptation to the host environment would have presumably been erased. Directly sequencing from sputum is ideal to assay DNA methylation patterns in vivo, but SMRT-sequencing requires large quantities of DNA, necessitating in vitro culturing, as DNA amplification erases epigenetic markings. In the absence of lower DNA input requirements, in vitro studies under host-like conditions (e.g. hypoxia, host-lipids as carbon source) can reveal context-dependent selection of methylation patterns, and time-course serial sequencing could inform us of the dynamics of methylomic adaptation and selection. Coupling these sequencing studies with transcriptomic, proteomic, and phenotypic assays could clarify how the effects of DNA methylation manifest in gene expression and phenotype.
[0241] Transcriptional responses in M. tuberculosis are mediated by numerous effectors of transcription, many of which are not constitutively expressed. Additional transcriptional effectors have been described to interact with DNA methylation in other bacterial species.sup.80 to modulate transcription. The cluster of hypervariable motif sites in the spacer between −10 and −35 promoter elements in MamA-methylated promoters (
[0242] The large set (n=351) of hypervariable loci (
[0243] Our findings raise important questions. First, are MTase genotypes with wild-type activity in rich media similarly active in vivo or do they exhibit intercellular mosaic methylation under constraints such as cofactor limitation or DNA accessibility? The intercellular mosaic methylation observed in methionine-deprived ΔmetA mutants (
[0244] Methylome rearrangement dynamics are another key question. Under the prevailing view, demethylation does not occur in bacteria (though base excision repair might offer a demethylation mechanism.sup.85). Under this view, demethylation can occur only between generations, through a lack of re-methylation on the nascent strand following replication. Accurate modeling of how the methylome changes within and across generations requires greater knowledge of methyltransferase activity throughout the cell-cycle. The nature of these dynamics has key implications. If M. tuberculosis DNA MTases are active throughout the cell-cycle, DNA methylation could mediate acute responses to environmental cues. Alternatively, if MTase expression is restricted to a particular part of the cycle as in E. coli dam.sup.86, DNA methylation status can only be selected upon. Comparative methylomics combining SMRT-sequencing kinetics analysis, cell-cycle coordination.sup.87, and MTase activity probes would answer these questions, especially if employed across multiple conditions.
[0245] Our promoter methylation analysis reports different results and opposing conclusions to those reached in a recent analysis.sup.14 on the role of HsdM in regulating promoter strength. While they conclude that “methylation seems to play a minimal role in shaping in-vitro gene expression”, integrating our identified promoter MTase motif sites with data from their ΔhsdM RNAseq experiment shows a clear association between HsdM promoter methylation and in vitro gene expression (
[0246] Our approach differed from prior analyses of MTBC methylomes in four ways that were key to our findings:
[0247] (i) Analyzing sequencing kinetics at all motif sites in every isolate.
[0248] (ii) Using finished assemblies for comparative genomics of MTase alleles and regulatory elements.
[0249] (iii) Heterogeneity analysis.
[0250] (iv) Transferring CDS and TSS annotations from an extensively studied reference strain.
[0251] While the relative clonality of the MTBC made annotation transfer straight-forward, species with more dynamic genomes may present additional challenges. We recommend similar approaches for future large-scale, intra-species comparative and functional methylomics studies in prokaryotes.
[0252] The data and isolate set described in this work can help answer these questions and others regarding the role of DNA methylation in M. tuberculosis. This isolate set comprises all seven lineages of the MTBC and has finished, annotated genomes and methylomes. Future experiments with these isolates can show the effects of methylomic differences on phenotype and adaptive capacity.
[0253] This work extends upon recent characterization of MTase motifs, their DNA methyltransferases.sup.3, and their capacity to modulate transcription.sup.7 in M. tuberculosis. We find epigenetic diversity in M. tuberculosis and evidence that it manifests as clinically important phenotypic diversity. Knockdown and knockout mutations emerge repeatedly in DNA methyltransferases, punctuating M. tuberculosis evolution with sudden change at several thousand sites. Stereotyped promoter methylation configurations indicate widespread epigenetic regulation in M. tuberculosis. These findings demonstrate DNA methylation as a fundamental source of diversity that potentially explains the discord between the limited genetic variation reported in M. tuberculosis and its observed capacity for phenotypic adaptation. More broadly, the discovery of intercellular mosaic methylation in M. tuberculosis reveals that the pathogen forms diverse methylation patterns, conferring a continuum of semi-heritable.sup.88 phenotypes to be selected into epigenetic lineages. This phenomenon provides a new mechanism of phenotypic plasticity in pathogens and opens the door to new therapeutic angles—and challenges—for tuberculosis control.
Methods
[0254] Code availability. All custom code used for this analysis is publicly available at: https://gitlab.com/LPCDRP.
Isolate acquisition and inclusion criteria. M. tuberculosis colonies were isolated from sputa of tuberculosis patients in the five sites (
Sample preparation and extraction. Samples processed in Sweden at the Supranational Reference Laboratory in Stockholm were prepared and extracted as previously described. All samples were streaked for isolation using standard microbiological methods, after which well separated colonies were selected, emulsified, and sub-cultured on Lowenstein-Jensen slants and Middlebrook 7H11 plates, where they were incubated until growth of a full bacterial lawn. DNA was extracted using Genomic-tips (Qiagen Inc., Germantown, Md.) following the manufacturer's sample preparation and lysis protocol for bacteria with the following modifications. Each culture was harvested directly into buffer B1/RNAse solution, homogenized by vigorous vortex mixing and inactivated at 80° C. for 1 hour. Lysozyme was added and incubated at 37° C. for 30 minutes followed by the addition of proteinase K and further incubation at 37° C. for an additional 60 minutes. Buffer B2 was added and the mixture was incubated overnight at 50° C. The remainder of the Genomic-tip protocol was carried out exactly as described by the manufacturer. DNA purity and concentration were analyzed on a Nanodrop 1000 (Thermo Scientific, Waltham, Mass., USA).
DNA sequencing. DNA sequencing was performed at the Institute for Genomic Medicine at the University of California, San Diego. DNA libraries for PacBio (Pacific Biosciences, Menlo Park, Calif.) were prepared using PacBio's DNA Template Prep Kit with no follow-up PCR amplification. Briefly, sheared DNA was end repaired, and hairpin adapters were ligated using T4 DNA ligase. Incompletely formed SMRTbell templates were degraded with a combination of Exonuclease III and Exonuclease VII. The resulting DNA templates were purified using SPRI magnetic beads (AMPure, Agencourt Bioscience, Beverly, Mass.) and annealed to a two-fold molar excess of a sequencing primer that specifically bound to the single-stranded loop region of the hairpin adapters. SMRTbell templates were subjected to standard SMRT sequencing using an engineered phi29 DNA polymerase on the PacBio RS system according to manufacturer's protocol.
Genome assembly. For isolates that were sequenced on multiple SMRT cells, all SMRT cell raw reads were combined and assembled with either HGAP2.sup.90 or canu.sup.91 with default parameters. Circularization was then performed to confirm a circular genome using minimus2 from amos or circlator.sup.92. Gene dnaA was set as the first gene in each genome. Iterative rounds of consensus polishing using BLASR.sup.93 and Quiver were executed three times. Default parameters were used except max coverage was set to 1000 for Quiver. Genomes failed assembly quality control if they could not be circularized, if their consensus polishing resulted in five or more variants after three iterations, or if PBHoney.sup.94 detected a structural variant in the assembly supported by at least 10% of the reads. PBHoney was run with default parameters.
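The three quality-control criteria above can be expressed as a single predicate. The following is a minimal sketch; the record type and field names are hypothetical and do not correspond to the output format of any of the tools named.

```python
from dataclasses import dataclass

@dataclass
class AssemblyQC:
    circularized: bool                # genome could be circularized
    variants_after_final_polish: int  # variants remaining after the third polishing round
    max_sv_read_support: float        # largest fraction of reads supporting any structural variant call

def fails_qc(qc: AssemblyQC) -> bool:
    """Return True if the assembly fails any of the three stated criteria."""
    return (not qc.circularized
            or qc.variants_after_final_polish >= 5
            or qc.max_sv_read_support >= 0.10)
```

For instance, `fails_qc(AssemblyQC(True, 0, 0.02))` evaluates to `False`, while an assembly with a structural variant supported by 10% of reads fails.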
Analysis of sequencing kinetics. To determine the inter pulse duration (IPD) ratio at each nucleotide in each isolate, we ran Single Molecule Real Time (SMRT) analysis with the Base Modification Detection with Motif Finding protocol with default parameters. A custom R script then scanned the FASTA sequence file of each isolate for matches to the MTase target motifs previously characterized in M. tuberculosis.sup.3, then extracted the IPD ratio of the targeted adenine in each matching site from the Base Modification output. These IPD ratios were then log transformed to produce a normal distribution, and standardized by subtracting the mean IPD ratio (also log transformed) of all adenines outside of MTase motifs in the isolate. Additional custom R scripts plotted the distribution of processed IPD ratios in each isolate to characterize their MTase activity (
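The motif scan and standardization steps can be sketched as below. The original analysis used custom R scripts; this Python version is an illustration only. It scans the forward strand alone, uses 0-based positions, and assumes a base-2 logarithm (the FIG. 20 legend refers to log2(IPD Ratio)); the function names are hypothetical.

```python
import re
from math import log2
from statistics import mean

def motif_adenine_positions(sequence, motif="CTGGAG"):
    """0-based positions of the targeted adenine in each forward-strand match
    of the motif. CTGGAG (MamA) contains a single adenine."""
    offset = motif.index("A")
    return [m.start() + offset
            for m in re.finditer(f"(?={motif})", sequence.upper())]

def standardized_log_ipd(motif_ipd_ratios, background_ipd_ratios):
    """Log-transform motif-site IPD ratios and subtract the mean log IPD ratio
    of adenines outside MTase motifs in the same isolate."""
    background = mean(log2(r) for r in background_ipd_ratios)
    return [log2(r) - background for r in motif_ipd_ratios]
```

With this standardization, unmethylated sites fall near zero and methylated sites are shifted upward, which makes distributions comparable across isolates.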
Lineage Determination. For isolates that were re-sequenced, lineage information was obtained by inserting the MIRU-VNTR and spoligotype patterns determined previously.sup.39 into TBInsight.sup.95. For all other genomes, a custom script, MiruHero (https://gitlab.com/LPCDRP/miru-hero), determined lineage.
Genome annotation. RATT transferred Transcriptional Start Sites (TSS) from our curated H37Rv annotation. These TSS were originally determined experimentally in the H37Rv strain by Cortes et al.sup.25 and Shell et al.sup.24, and were merged into our in-house H37Rv annotation with custom scripts.
Methylome annotation. Using the annotated genome of each isolate, we annotated their MTase motif sites with a custom python script, which recorded the relative position and gene name of any CDS or TSS features overlapping or neighboring each MTase motif site. To track MTase motif sites across isolates, each MTase motif site was assigned a locus tag based on the nearest CDS boundary.
[0255] Separately, we also annotated the MTase motif sites using RATT alone, with the curated H37Rv reference annotation. Using RATT without Prokka and the rest of the AnnoTUB pipeline left many genomic regions unannotated, but more consistently annotated MTase motif sites near hypervariable genes such as PE_PGRS54 and PE_PGRS57. In many isolates, MTase motif sites near these genes were assigned different locus tags when using AnnoTUB genome annotations, as AnnoTUB labeled PE_PGRS54 and PE_PGRS57 as new genes in these isolates because they lacked 95% sequence identity.
MTase genotyping. To determine the genotype of the MTase genes mamA (Rv3263), mamB (Rv2024c), and hsdM (Rv2756c)/hsdS (Rv2761c) in each isolate, eggNOG-mapper.sup.98 first identified these genes in each clinical isolate through homology to the corresponding genes in the annotated reference genome of M. tuberculosis type strain H37Rv. However, because MamB and HsdM are inactive in the H37Rv strain.sup.3, we did not use the H37Rv genes as the wild-type allele. Instead, sequencing kinetics and the previously characterized target motifs were used to determine which isolates had active copies of each MTase gene, and the most common sequence among active isolates was defined as the wild-type sequence. To call variants in these genes using these wild-type sequences, BLASTn then aligned the wild-type sequences against all genes predicted in each isolate by Prodigal.sup.99. Each matching nucleotide sequence was translated into an amino acid sequence using transeq (EMBOSS 6.6.0.0, available online at www.ebi.ac.uk/Tools/emboss/transeq/index.html) to obtain nonsynonymous variants and truncations. The amino acid sequences were then aligned using MAFFT.sup.100 v7.205 with the --clustalout option, and a custom script converted the alignment to a genotype.
Variant calling. For building phylogenies: dnadiff.sup.101 (v1.3) aligned each assembled genome to M. tuberculosis H37Rv (NC_000962.3) and called SNPs and small indels with default parameters. A custom Perl script converted the out.snps file from dnadiff into a VCF v4.0 file, and Variant Effect Predictor.sup.102 (v87) determined the consequence of each variant.
For MTase Genotyping:
[0256] Phylogeny construction and mapping of MTase genotypes. First an alignment of concatenated variants was created using each isolate's VCF file. Then this alignment was used to create a maximum likelihood phylogenetic tree using RAxML.sup.103 version 8.2, specifying a general time-reversible model of nucleotide evolution with 100 bootstrap replicates. The Interactive Tree of Life (iTOL) webtool.sup.104 was used to visualize and map data to the tree, such as lineage and MTase genotypes.
Heterogeneous methylation analysis. SMALR.sup.15 requires a de novo assembled genome FASTA file and a cmp.h5 file with aligned reads, to extract the IPD data from each MTase target motif site within each read. We created a cmp.h5 for each isolate by aligning its reads to its assembled FASTA file using BLASR.sup.93. We ran SMALR on each isolate with the SMp (single molecule, pooled distribution) argument. For MamA sites we set the motif to CTGGAG, the modified position within the motif to 5, and the minimum number of motif sites per read to 6. For MamB sites we set the motif to CACGCAG, the modified position to 6, and the motifs per read threshold to 5. From the SMALR output we used the native score of each read in place of SMp score. The SMp score can only be calculated if a PCR amplified control run of each isolate is provided. This substitution is susceptible to noise from local sequence contexts, but should still resolve differences between isolates and, per the authors of SMALR, it should still distinguish methylated and unmethylated components. We analyzed the distribution of native scores within each isolate for MamA and MamB sites using custom R scripts.
Identification of promoters. To identify MTase motif sites in gene promoters, a custom python script first scanned the surrounding sequence of each MTase motif site in each isolate for Sigma Factor Binding Site (SFBS) motifs previously characterized in M. tuberculosis.sup.26. If an SFBS match overlapped an MTase motif site, the script then checked whether that SFBS match was the appropriate number of bases upstream from a TSS annotated in that isolate. For example, if the SFBS match was the −10 component of an SFBS, the script checked whether there was a TSS on the same strand with a genome position 8 to 12 bp downstream of the matching sequence. If the SFBS match was a −35 component of an SFBS, the script instead checked for a TSS between 30 and 40 bp downstream. MTase sites that met these criteria were labeled with the sigma factor type of their overlapping SFBS, their distance upstream of the TSS, and the gene name of the closest CDS downstream from the TSS. Since these criteria are rather conservative, more relaxed boundary thresholds were implemented for some of the promoter methylation analyses.
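The distance check described above can be sketched as follows. This is a minimal illustration and not the custom python script used in the study. The function name, the data layout, and the choice of the downstream-most base of the SFBS match as the reference point for measuring distance are assumptions; only the 8 to 12 bp and 30 to 40 bp windows come from the text.

```python
# Allowed TSS distance windows (bp downstream of the SFBS match), as stated in the text.
TSS_WINDOWS = {"-10": (8, 12), "-35": (30, 40)}


def tss_in_window(sfbs_end, strand, component, tss_positions):
    """Return TSS positions lying the expected distance downstream of an SFBS match.

    sfbs_end: genome coordinate of the downstream-most base of the match
              (assumed reference point; hypothetical convention).
    strand: "+" or "-".
    component: "-10" or "-35".
    tss_positions: iterable of (position, strand) tuples.
    """
    low, high = TSS_WINDOWS[component]
    hits = []
    for pos, tss_strand in tss_positions:
        if tss_strand != strand:
            continue  # the TSS must be on the same strand as the SFBS match
        # Downstream means increasing coordinates on "+", decreasing on "-".
        distance = pos - sfbs_end if strand == "+" else sfbs_end - pos
        if low <= distance <= high:
            hits.append(pos)
    return hits


# Hypothetical coordinates for illustration only.
example_hits = tss_in_window(100, "+", "-10", [(110, "+"), (110, "-"), (150, "+")])
```

Relaxing the boundary thresholds, as mentioned above, would amount to widening the tuples in TSS_WINDOWS.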
Reference-based differential methylation: In each clinical isolate we extracted all MTase motif sites that shared their loci with an MTase motif site in reference strain H37Rv, then counted the number of these sites with opposing methylation calls. These counts were then compared to the median SNP distance between each isolate and H37Rv (
Bayesian classification of base specific methylation status: Even within isolates with active MTase genotypes, not every base with an MTase target motif was methylated. To identify MTase motif sites with no base modification (hypomethylated sites) we took a Bayesian approach. In each isolate our custom R script estimated the distribution of normalized IPD ratios among unmodified bases by calculating the standard deviation and mean normalized IPD ratios of bases not within MTase motifs. The script then estimated the distribution of methylated bases by calculating the standard deviation and mean of bases targeted by MTase motifs. This estimate assumed that most bases targeted by MTase motifs were methylated, which held true in isolates with active MTase genotypes (
[0257] The coverage of each MTase site in each isolate was used to adjust the standard deviation of the distributions used to calculate its conditional probability, as bases with lower coverage have more variable IPD ratios (
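A two-component Bayesian classification of this kind can be sketched as follows. This is an illustrative sketch under stated assumptions, not the custom R script: it assumes Gaussian distributions for both components, uses an equal prior (prior_unmod=0.5) that is not taken from the source, and omits the coverage-based adjustment of the standard deviation because its exact form is not given here. All names and values are hypothetical.

```python
import math
from statistics import mean, stdev


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def posterior_unmethylated(x, unmod_values, motif_values, prior_unmod=0.5):
    """Posterior probability that a site with normalized IPD ratio x is unmodified.

    unmod_values: normalized IPD ratios of bases outside MTase motifs.
    motif_values: normalized IPD ratios of bases targeted by MTase motifs
                  (assumed to be mostly methylated).
    prior_unmod: assumed prior probability of being unmodified (hypothetical).
    """
    mu_u, sd_u = mean(unmod_values), stdev(unmod_values)
    mu_m, sd_m = mean(motif_values), stdev(motif_values)
    like_u = normal_pdf(x, mu_u, sd_u) * prior_unmod
    like_m = normal_pdf(x, mu_m, sd_m) * (1.0 - prior_unmod)
    return like_u / (like_u + like_m)


# Hypothetical normalized IPD ratios for illustration only.
unmodified = [-0.2, -0.1, 0.0, 0.1, 0.2]
motif_targets = [1.3, 1.4, 1.5, 1.6, 1.7]
p_hypo = posterior_unmethylated(0.05, unmodified, motif_targets)
```

A site could then be called hypomethylated when this posterior exceeds a chosen cutoff; any such cutoff would also be an assumption of the sketch.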
Conserved hypomethylation patterns. Using the Bayesian classification of each MTase motif target and the loci labeled by our methylome annotation pipeline, we searched for hypomethylated loci that occurred in multiple isolates. For each locus, a custom R script counted the number of isolates with that locus, including only isolates with active genotypes of the MTase targeting the locus. Our script also counted the number of these isolates in which the locus was hypomethylated. To estimate the significance of these findings, we used a cumulative binomial test, with the first count as the sample size and the second count as the number of successes. To find the probability of hypomethylation for each Bernoulli trial if hypomethylation occurred randomly, we calculated the total frequency of hypomethylation among MTase motif sites in active isolates. A separate per trial probability was calculated for MamA, MamB, and HsdM. The Bonferroni correction adjusted for multiple hypothesis testing, by dividing the significance threshold by the total number of unique loci in this study. No reference strains were included in the analysis of hypomethylated loci.
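The significance test described above can be sketched as follows. This is an illustration only, not the custom R script: the function names, the significance level of 0.05, the reading of the cumulative binomial test as an upper-tail probability, and all counts are assumptions.

```python
from math import comb


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def conserved_hypomethylation(n_hypo, n_isolates, background_rate, n_loci, alpha=0.05):
    """Test whether a locus is hypomethylated more often than expected by chance.

    n_hypo: isolates (with an active MTase) in which the locus is hypomethylated.
    n_isolates: isolates (with an active MTase) that carry the locus.
    background_rate: overall hypomethylation frequency for that MTase.
    n_loci: number of unique loci tested, used for the Bonferroni correction.
    alpha: assumed family-wise significance level (hypothetical).
    """
    p_value = binomial_upper_tail(n_hypo, n_isolates, background_rate)
    return p_value, p_value < alpha / n_loci


# Hypothetical counts for illustration only.
p_value, significant = conserved_hypomethylation(20, 25, 0.02, 4000)
```

In this sketch, background_rate would be computed separately for MamA, MamB, and HsdM, matching the per-MTase trial probabilities described above.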
Transcription factor binding motif scanning. We searched for Transcription Factor (TF) binding motifs near hypomethylated bases using the command line motif scanner FIMO.sup.105 version 4.12.0. For each hypomethylated locus in an MTase target motif, we extracted the sequence of 41 bases surrounding the locus in a randomly selected representative isolate (only isolates hypomethylated at that locus were chosen). The context sequences were combined into a multisequence FASTA file. Probability weight matrices of each TF binding motif were kindly provided by Minch and colleagues.sup.19, who derived them from a ChIP-Seq experiment on virulent M. tuberculosis type strain H37Rv. We then ran FIMO using each TF motif on the context FASTA file with a threshold p-value of 0.01. For comparison we also scanned for TF motifs in the context sequences of consistently methylated loci (consistently methylated loci here defined as loci present in at least 30 isolates and methylated in at least 95% of those isolates). Custom scripts then parsed the FIMO output files for each TF binding motif and counted the number of methylated loci and the number of hypomethylated loci matching each TF with a q-value of at least 0.1.
Proximal MTase motif search. For each MTase motif site in each isolate, we found neighboring MTase motif sites through a custom R script. The script found the nearest MTase motif either upstream or downstream from each MTase motif, and recorded the distance in bp.
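The nearest-neighbor search can be sketched as follows. This is a minimal illustration rather than the custom R script; it assumes unique site positions on a linear coordinate system (wrap-around on the circular chromosome is ignored), and the function name and coordinates are hypothetical.

```python
def nearest_motif_distances(positions):
    """Map each motif site position to the distance (bp) to its nearest neighbouring site.

    Assumes unique positions on a linear coordinate system; a lone site maps to None.
    """
    ordered = sorted(positions)
    distances = {}
    for i, pos in enumerate(ordered):
        candidates = []
        if i > 0:
            candidates.append(pos - ordered[i - 1])  # nearest site upstream
        if i < len(ordered) - 1:
            candidates.append(ordered[i + 1] - pos)  # nearest site downstream
        distances[pos] = min(candidates) if candidates else None
    return distances


# Hypothetical genome coordinates for illustration only.
example_distances = nearest_motif_distances([500, 100, 130])
```

Sorting once and comparing only adjacent sites keeps the search close to linear in the number of sites, which matters when it is repeated for every isolate.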
Methylation anomalies. For each MTase, a custom R script found the set of MTase motif site loci present in at least 75 isolates. For each locus, summary statistics (mean and standard deviation) of mean log(IPD Ratio) were calculated exclusively from isolates with active MTases for each motif. The same was then performed to obtain median and standard deviation of mean log(IPD Ratio) for inactive isolates of each activity profile for each MTase. Hypervariable HsdM, MamA, and MamB motif sites were classified as those more than 3 S.D above the mean for MamB motif sites, since they had the fewest outliers (
RNA-Seq Re-Analysis and Integration. Supplementary Table 9 from Chiner-Oms et al., Nature Communications vol. 10, article no. 3994 (2019) (see also https://www.nature.com/articles/s41467-019-11948-6) was obtained and merged with our annotated promoters for HsdM. A Benjamini-Hochberg adjusted p-value threshold of 0.05 was set as the criterion for being considered “differentially expressed”, using the column labelled “padj (BH)” from Supplementary Table 9 of Chiner-Oms et al. A two-sided Fisher's Exact Test was implemented in R to test for independence of HsdM promoter presence and differential expression of genes following HsdM knockout. Genes were considered to have an HsdM promoter motif if the modified adenine was within 50 bp upstream of the TSS.
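The independence test can be illustrated with the following sketch. The analysis itself was implemented in R; this Python stand-in computes a two-sided Fisher's exact p-value directly from the hypergeometric distribution by summing the probabilities of all tables that are no more likely than the observed one. The table layout (rows: HsdM promoter motif present or absent; columns: differentially expressed or not) and all counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as extreme (no more probable) than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-7)
    )


# Hypothetical counts: [[motif & DE, motif & not DE], [no motif & DE, no motif & not DE]].
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

The small relative tolerance in the comparison guards against floating-point ties between equally probable tables.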
Example 2: Intercellular Mosaic Methylation (IMM) is Distinct from Other Forms of Mosaic-Like DNA Methylation
[0258] Sequencing kinetics of MTase target motif sites indicated heterogeneous methylation in isolates with MTase variants mamA.sub.E270A, mamA.sub.G152S, and mamB.sub.K1033T (see
[0259]
[0260] This is not the first report of mosaic-like patterning of DNA adenine methylation in prokaryotes. Mosaicism can result from independent ON/OFF switching of multiple phase-variable MTases (Atack et al., 2018) or from domain movement of the target recognition domain (TRD) (Furuta and Kobayashi, 2012), a phenomenon known as “DoMo” (Furuta et al., 2014). However, IMM departs from these two previously described types of mosaic-like methylation heterogeneity in two important respects. First, in the degree of methylomic diversity it generates (
[0261] Throughout this manuscript we have referred to the DNA adenine methyltransferase encoded by Rv2756c as HsdM (hsdM for the gene) and to its specificity subunit, encoded by Rv2761, as HsdS (hsdS for the gene), to be consistent with previous work (Shell et al., 2013). It appears that Rv2756c was originally referred to as HsdM based on homology to hsdM in R-M systems, before the existence of its restriction component had been investigated, and that this name has propagated through subsequent studies (Chiner-Oms et al., 2019; Gomez-Gonzalez et al., 2019; Phelan et al., 2018; Zhu et al., 2015). However, it has since been determined that Rv2756c lacks a functional HsdR component (Zhu et al., 2015). According to the prevailing nomenclature conventions, the symbol “hsd” is for Type 1 R-M systems (Loenen et al., 2014; Roberts et al., 2003), which Rv2756c is not part of, since it lacks a functional restriction component. Therefore, we propose that the orphan methyltransferase encoded by Rv2756c be renamed MamC (mamC for the gene), for Mycobacterial Adenine Methyltransferase C (since MamA and MamB are assigned to other mycobacterial DNA adenine methyltransferases). Likewise, we propose that the specificity subunit of MamC encoded by Rv2761 be renamed mamS/MamS (formerly hsdS/HsdS) and the specificity subunit fragment encoded by Rv2755 (formerly hsdS.1/HsdS.1) be renamed mamS.1/MamS.1. This proposed nomenclature retains the S and S.1 from hsdS and hsdS.1, is consistent with the extant naming convention of MamA and MamB, and removes the erroneous implication that HsdM/HsdS/HsdS.1 are part of a Type 1 R-M system.
[0262] In summary, HsdM is also named MamC, and subsequent literature may use MamC; and MamC (formerly HsdM) also requires its specificity subunit, a separately encoded protein, MamS (formerly HsdS); so in alternative embodiments, when HsdM/MamC is referred to, the whole functional complex of MamC and MamS is meant to be referred to.
[0263] Analysis of the relationship between methylation status of the conserved hypervariable sites (
[0264]
[0265]
REFERENCES EXAMPLE 1
[0266] 1. WHO. Global Tuberculosis Report 2017. (2017). doi:WHO/HTM/TB/2017.23 [0267] 2. Cohen, K. A. et al. Evolution of Extensively Drug-Resistant Tuberculosis over Four Decades: Whole Genome Sequencing and Dating Analysis of Mycobacterium tuberculosis Isolates from KwaZulu-Natal. PLoS Med. 12, 1-22 (2015). [0268] 3. Zhu, L. et al. Precision methylome characterization of Mycobacterium tuberculosis complex (MTBC) using PacBio single-molecule real-time (SMRT) technology. Nucleic Acids Res. 44, 730-743 (2016). [0269] 4. Phelan, J. et al. Methylation in Mycobacterium tuberculosis is lineage specific with associated mutations present globally. Sci. Rep. 8, 160 (2018). [0270] 5. Low, D. A. & Casadesús, J. Clocks and switches: bacterial gene regulation by DNA adenine methylation. Curr. Opin. Microbiol. 11, 106-112 (2008). [0271] 6. Ardissone, S. et al. Cell Cycle Constraints and Environmental Control of Local DNA Hypomethylation in α-Proteobacteria. PLOS Genet. 12, e1006499 (2016). [0272] 7. Shell, S. S. et al. DNA Methylation Impacts Gene Expression and Ensures Hypoxic Survival of Mycobacterium tuberculosis. PLOS Pathog 9, e1003419 (2013). [0273] 8. Hernday, A., Krabbe, M., Braaten, B. & Low, D. Self-perpetuating epigenetic pili switches in bacteria. Proc. Natl. Acad. Sci. 99, 16470-16476 (2002). [0274] 9. Stephenson, S. A.-M. & Brown, P. D. Epigenetic Influence of Dam Methylation on Gene Expression and Attachment in Uropathogenic Escherichia coli. Front. public Heal. 4, 131 (2016). [0275] 10. Beaulaurier, J., Schadt, E. E. & Fang, G. Deciphering bacterial epigenomes using modern sequencing technologies. Nat. Rev. Genet. 1 (2018). doi:10.1038/s41576-018-0081-3 [0276] 11. Gomez-Gonzalez, P. J. et al. An integrated whole genome analysis of Mycobacterium tuberculosis reveals insights into relationship between its genome, transcriptome and methylome. Sci. Rep. 9, 5204 (2019). [0277] 12. Casadesus, J. & Low, D. A. Programmed Heterogeneity: Epigenetic Mechanisms in Bacteria. 
J. Biol. Chem. 288, 13929-13935 (2013). [0278] 13. Wallecha, A., Munster, V., Correnti, J., Chan, T. & Woude, M. van der. Dam- and OxyR-Dependent Phase Variation of agn43: Essential Elements and Evidence for a New Role of DNA Methylation. J. Bacteriol. 184, 3338-3347 (2002). [0279] 14. Atack, J. M., Tan, A., Bakaletz, L. O., Jennings, M. P. & Seib, K. L. Phasevarions of Bacterial Pathogens: Methylomics Sheds New Light on Old Enemies. Trends Microbiol. 26, 715-726 (2018). [0281] 15. Beaulaurier, J. et al. Single molecule-level detection and long read-based phasing of epigenetic variations in bacterial methylomes. Nat Commun 6, 7438 (2015). [0282] 16. Zhu, L. et al. Precision methylome characterization of Mycobacterium tuberculosis complex (MTBC) using PacBio single-molecule real-time (SMRT) technology. Available at: http://nar.oxfordjournals.org. (Accessed: 18 Apr. 2016) [0283] 17. Pacific Biosciences. Kinetics Tools. [0284] 18. Otto, T. D., Dillon, G. P., Degrave, W. S. & Berriman, M. RATT: Rapid Annotation Transfer Tool. Nucleic Acids Res. 39, e57-e57 (2011). [0285] 19. Minch, K. J. et al. The DNA-binding network of Mycobacterium tuberculosis. Nat. Commun. 6, 5829 (2015). [0286] 20. Chiner-Oms, A., Gonzalez-Candelas, F. & Comas, I. Gene expression models based on a reference laboratory strain are poor predictors of Mycobacterium tuberculosis complex transcriptional diversity. Sci. Rep. 8, 3813 (2018). [0287] 21. Berney, M. et al. Essential roles of methionine and S-adenosylmethionine in the autarkic lifestyle of Mycobacterium tuberculosis. Proc. Natl. Acad. Sci. U.S.A. 112, 10008-10013 (2015). [0288] 22. Blow, M. J. et al. The Epigenomic Landscape of Prokaryotes. PLOS Genet 12, e1005854 (2016). [0289] 23. Otto, T. D., Dillon, G. P., Degrave, W. S. & Berriman, M. RATT: Rapid Annotation Transfer Tool. Nucleic Acids Res. 39, e57-e57 (2011). [0290] 24. Shell, S. S. et al. Leaderless Transcripts and Small Proteins Are Common Features of the Mycobacterial Translational Landscape. PLoS Genet. 11, (2015). [0291] 25. Cortes, T. et al. Genome-wide Mapping of Transcriptional Start Sites Defines an Extensive Leaderless Transcriptome in Mycobacterium tuberculosis. Cell Rep. 5, 1121-1131 (2013). [0292] 26. Chauhan, R. et al. Reconstruction and topological characterization of the sigma factor regulatory network of Mycobacterium tuberculosis. Nat. Commun. 7, 11062 (2016). [0293] 27. Staroń, A.
et al. The third pillar of bacterial signal transduction: classification of the extracytoplasmic function (ECF) σ factor protein family. Mol. Microbiol. 74, 557-581 (2009). [0294] 28. Cook, G. M. et al. Physiology of Mycobacteria. Adv. Microb. Physiol. 55, 81-319 (2009). [0295] 29. Feklistov, A. & Darst, S. A. Structural basis for promoter −10 element recognition by the bacterial RNA polymerase σ subunit. Cell 147, 1257-69 (2011). [0296] 30. Lee, W., VanderVen, B. C., Fahey, R. J. & Russell, D. G. Intracellular Mycobacterium tuberculosis exploits host-derived fatty acids to limit metabolic stress. J. Biol. Chem. 288, 6788-800 (2013). [0297] 31. Micklinghoff, J. C. et al. Role of the Transcriptional Regulator RamB (Rv0465c) in the Control of the Glyoxylate Cycle in Mycobacterium tuberculosis. J. Bacteriol. 191, (2009). [0298] 32. Murima, P. et al. A rheostat mechanism governs the bifurcation of carbon flux in mycobacteria. Nat. Commun. 7, 12527 (2016). [0299] 33. Nandakumar, M., Nathan, C. & Rhee, K. Y. Isocitrate lyase mediates broad antibiotic tolerance in Mycobacterium tuberculosis. Nat. Commun. 5, 4306 (2014). [0300] 34. Sirakova, T. D. et al. Identification of a diacylglycerol acyltransferase gene involved in accumulation of triacylglycerol in Mycobacterium tuberculosis under stress. Microbiology 152, 2717-2725 (2006). [0301] 35. Baek, S.-H., Li, A. H., Sassetti, C. M., Mitchell, M. & Milgram, E. Metabolic Regulation of Mycobacterial Growth and Antibiotic Sensitivity. PLoS Biol. 9, e1001065 (2011). [0302] 36. Baek, S.-H., Li, A. H., Sassetti, C. M., Mitchell, M. & Milgram, E. Metabolic Regulation of Mycobacterial Growth and Antibiotic Sensitivity. PLoS Biol. 9, e1001065 (2011). [0303] 37. Daniel, J., Maamar, H., Deb, C., Sirakova, T. D. & Kolattukudy, P. E. Mycobacterium tuberculosis Uses Host Triacylglycerol to Accumulate Lipid Droplets and Acquires a Dormancy-Like Phenotype in Lipid-Loaded Macrophages. PLoS Pathog. 7, e1002093 (2011). [0304] 38. Tong, J.
et al. The FBPase Encoding Gene glpX Is Required for Gluconeogenesis, Bacterial Proliferation and Division In Vivo of Mycobacterium marinum. PLoS One 11, e0156663 (2016). [0305] 39. Gago, G., Kurth, D., Diacovich, L., Tsai, S.-C. & Gramajo, H. Biochemical and Structural Characterization of an Essential Acyl Coenzyme A Carboxylase from Mycobacterium tuberculosis. J. Bacteriol. 188, (2006). [0306] 40. Cronan, J. E. & Lin, S. Synthesis of the α,ω-dicarboxylic acid precursor of biotin by the canonical fatty acid biosynthetic pathway. Curr. Opin. Chem. Biol. 15, 407-413 (2011). [0307] 41. Gopinath, K., Moosa, A., Mizrahi, V. & Warner, D. F. Vitamin B.sub.12 metabolism in Mycobacterium tuberculosis. Future Microbiol. 8, 1405-1418 (2013). [0308] 42. Minnikin, D. E., Kremer, L., Dover, L. G. & Besra, G. S. The Methyl-Branched Fortifications of Mycobacterium tuberculosis. Chem. Biol. 9, 545-553 (2002). [0309] 43. Constant, P. et al. Role of the pks15/1 gene in the biosynthesis of phenolglycolipids in the Mycobacterium tuberculosis complex. Evidence that all strains synthesize glycosylated p-hydroxybenzoic methyl esters and that strains devoid of phenolglycolipids harbor a frameshift mutation in the pks15/1 gene. J. Biol. Chem. 277, 38148-58 (2002). [0310] 44. Caws, M. et al. The Influence of Host and Bacterial Genotype on the Development of Disseminated Disease with Mycobacterium tuberculosis. PLoS Pathog. 4, e1000034 (2008). [0311] 45. Balabanova, Y. et al. Beijing clades of Mycobacterium tuberculosis are associated with differential survival in HIV-negative Russian patients. Infect. Genet. Evol. 36, 517-523 (2015). [0312] 46. Mishra, A. K. et al. Identification of an α(1→6) mannopyranosyltransferase (MptA), involved in Corynebacterium glutamicum lipomanann biosynthesis, and identification of its orthologue in Mycobacterium tuberculosis. Mol. Microbiol. 65, 1503-1517 (2007). [0313] 47. Scherman, H. et al.
Identification of a Polyprenylphosphomannosyl Synthase Involved in the Synthesis of Mycobacterial Mannosides. J. Bacteriol. 191, (2009). [0314] 48. De Smet, K. A. L., Brown, I. N., Weston, A., Young, D. B. & Robertson, B. D. Three pathways for trehalose biosynthesis in mycobacteria. Microbiology 146, 199-208 (2000). [0315] 49. Anthony Malinga, L., Stoltz, A. & Walt, M. van der. Efflux Pump Mediated Second-Line Tuberculosis Drug Resistance. Mycobact. Dis. 6, 1-9 (2016). [0316] 50. Morris, R. P. et al. Ancestral antibiotic resistance in Mycobacterium tuberculosis. Proc. Natl. Acad. Sci. 102, 12200-12205 (2005). [0317] 51. Fisher, R. A., Gollan, B. & Helaine, S. Persistent bacterial infections and persister cells. Nat. Rev. Microbiol. 15, 453-464 (2017). [0318] 52. Wagner, D. et al. Elemental Analysis of Mycobacterium avium-, Mycobacterium tuberculosis-, and Mycobacterium smegmatis-Containing Phagosomes Indicates Pathogen-Induced Microenvironments within the Host Cell's Endosomal System. J. Immunol. 174, 1491-1500 (2005). [0319] 53. Kurthkoti, K. et al. The Capacity of Mycobacterium tuberculosis To Survive Iron Starvation Might Enable It To Persist in Iron-Deprived Microenvironments of Human Granulomas. MBio 8, e01092-17 (2017). [0320] 54. Darwin, K. H. Mycobacterium tuberculosis and Copper: A Newly Appreciated Defense against an Old Foe? J. Biol. Chem. 290, 18962-6 (2015). [0321] 55. Brown, K. A. & Ratledge, C. THE EFFECT OF p-AMINOSALICYCLIC ACID ON IRON TRANSPORT AND ASSIMILATION IN MYCOBACTERIA. Biochimica et Biophysica Acta 385, (1975). [0322] 56. Raghu, B., Raghupati Sarma, G. & Venkatesan, P. Effect of Anti-tuberculosis Drugs on the Iron-Sequestration Mechanisms of Mycobacteria. [0323] 57. Peterson, E. J. R. R. et al. A high-resolution network model for global gene regulation in Mycobacterium tuberculosis. Nucleic Acids Res. 42, gku777 (2014). [0324] 58. Gago, G., Kurth, D., Diacovich, L., Tsai, S.-C. & Gramajo, H. 
Biochemical and Structural Characterization of an Essential Acyl Coenzyme A Carboxylase from Mycobacterium tuberculosis. J. Bacteriol. 188, (2006). [0325] 59. Cronan, J. E. & Lin, S. Synthesis of the α,ω-dicarboxylic acid precursor of biotin by the canonical fatty acid biosynthetic pathway. Curr. Opin. Chem. Biol. 15, 407-413 (2011). [0326] 60. Lee, J. J. et al. Glutamate mediated metabolic neutralization mitigates propionate toxicity in intracellular Mycobacterium tuberculosis. Sci. Rep. 8, 8506 (2018). [0327] 61. Wipperman, M. F., Yang, M., Thomas, S. T. & Sampson, N. S. Shrinking the FadE Proteome of Mycobacterium tuberculosis: Insights into Cholesterol Metabolism through Identification of an α2β2 Heterotetrameric Acyl Coenzyme A Dehydrogenase Family. J. Bacteriol. 195, (2013). [0328] 62. Domenech, P., Reed, M. B. & Barry, C. E., III. Contribution of the Mycobacterium tuberculosis MmpL protein family to virulence and drug resistance. Infect. Immun. 73, 3492-501 (2005). [0329] 63. Turapov, O. et al. Oleoyl Coenzyme A Regulates Interaction of Transcriptional Regulator RaaS (Rv1219c) with DNA in Mycobacteria. J. Biol. Chem. 289, 25241-25249 (2014). [0330] 64. Mustyala, K. K., Malkhed, V., Chittireddy, V. R. R. & Vuruputuri, U. Identification of Small Molecular Inhibitors for Efflux Protein: DrrA of Mycobacterium tuberculosis. Cell. Mol. Bioeng. 9, 190-202 (2016). [0331] 65. Colangeli, R. et al. The Mycobacterium tuberculosis iniA gene is essential for activity of an efflux pump that confers drug tolerance to both isoniazid and ethambutol. Mol. Microbiol. 55, 1829-1840 (2005). [0332] 66. Gupta, A. K. et al. Microarray Analysis of Efflux Pump Genes in Multidrug-Resistant Mycobacterium tuberculosis During Stress Induced by Common Anti-Tuberculous Drugs. Microb. Drug Resist. 16, 21-28 (2010). [0333] 67. Duan, W. et al. Mycobacterium tuberculosis Rv1473 is a novel macrolides ABC Efflux Pump regulated by WhiB7. Future Microbiol. 14, 47-59 (2019). [0334] 68. Parida, S. K.
et al. Totally drug-resistant tuberculosis and adjunct therapies. J. Intern. Med. 277, 388-405 (2015). [0335] 69. Nieto R, L. M. et al. Biochemical characterization of isoniazid resistant Mycobacterium tuberculosis: can the analysis of clonal strains reveal novel targetable pathways? Mol. Cell. Proteomics (2018). [0336] 70. Nosova, E. Y. et al. Analysis of mutations in the gyrA and gyrB genes and their association with the resistance of Mycobacterium tuberculosis to levofloxacin, moxifloxacin and gatifloxacin. J. Med. Microbiol. 62, 108-113 (2013). [0337] 71. Schuessler, D. L. et al. Induced ectopic expression of HigB toxin in Mycobacterium tuberculosis results in growth inhibition, reduced abundance of a subset of mRNAs and cleavage of tmRNA. Mol. Microbiol. 90, n/a-n/a (2013). [0338] 72. Chownk, M., Kaur, J., Singh, K. & Kaur, J. mbtJ: an iron stress-induced acetyl hydrolase/esterase of Mycobacterium tuberculosis helps bacteria to survive during iron stress. Future Microbiol. 13, 547-564 (2018). [0339] 73. Game of Somes: Protein Destruction for Mycobacterium tuberculosis Pathogenesis. Trends Microbiol. 24, 26-34 (2016). [0340] 74. Wang, K. et al. The Expression of ABC Efflux Pump, Rv1217c-Rv1218c, and Its Association with Multidrug Resistance of Mycobacterium tuberculosis in China. Curr. Microbiol. 66, 222-226 (2013). [0341] 75. De Keijzer, J. et al. Mechanisms of Phenotypic Rifampicin Tolerance in Mycobacterium tuberculosis Beijing Genotype Strain B0/W148 Revealed by Proteomics. J. Proteome Res. 15, 1194-1204 (2016). [0342] 76. Reed, M. B., Gagneux, S., DeRiemer, K., Small, P. M. & Barry, C. E. The W-Beijing lineage of Mycobacterium tuberculosis overproduces triglycerides and has the DosR dormancy regulon constitutively upregulated. J. Bacteriol. 189, 2583-2589 (2007). [0343] 77. Huet, G. et al. 
A lipid profile typifies the Beijing strains of Mycobacterium tuberculosis: identification of a mutation responsible for a modification of the structures of phthiocerol dimycocerosates and phenolic glycolipids. J. Biol. Chem. 284, 27101-13 (2009). [0344] 78. Cortes, T. et al. Delayed effects of transcriptional responses in Mycobacterium tuberculosis exposed to nitric oxide suggest other mechanisms involved in survival. Sci. Rep. 7, 8208 (2017). [0345] 79. Vilchèze, C. et al. Enhanced respiration prevents drug tolerance and drug resistance in Mycobacterium tuberculosis. Proc. Natl. Acad. Sci. 114, 4495-4500 (2017). [0346] 80. Keren, I., Minami, S., Rubin, E. & Lewis, K. Characterization and Transcriptome Analysis of Mycobacterium tuberculosis Persisters. MBio 2, (2011). [0347] 81. Bussi, C. & Gutierrez, M. G. Mycobacterium tuberculosis infection of host cells in space and time. FEMS Microbiol. Rev. (2019). doi:10.1093/femsre/fuz006 [0348] 82. Browning, D. F. & Busby, S. J. W. Local and global regulation of transcription initiation in bacteria. Nat. Rev. Microbiol. 14, 638-650 (2016). [0349] 83. Gries, T. J., Kontur, W. S., Capp, M. W., Saecker, R. M. & Record, M. T. One-step DNA melting in the RNA polymerase cleft opens the initiation bubble to form an unstable open complex. Proc. Natl. Acad. Sci. 107, 10418-10423 (2010). [0350] 84. Saecker, R. M. et al. Kinetic Studies and Structural Models of the Association of E. coli σ70 RNA Polymerase with the λP.sub.R Promoter: Large Scale Conformational Changes in Forming the Kinetically Significant Intermediates. J. Mol. Biol. 319, 649-671 (2002). [0351] 85. Krokan, H. E. & Bjørås, M. Base excision repair. Cold Spring Harb. Perspect. Biol. 5, a012583 (2013). [0352] 86. Campbell, J. L. & Kleckner, N. E. coli oriC and the dnaA gene promoter are sequestered from dam methyltransferase following the passage of the chromosomal replication fork. Cell 62, 967-979 (1990). [0353] 87. Ardissone, S. et al.
Cell Cycle Constraints and Environmental Control of Local DNA Hypomethylation in α-Proteobacteria. PLoS Genet. 12, e1006499 (2016). [0354] 88. Adhikari, S. & Curtis, P. D. DNA methyltransferases and epigenetic regulation in bacteria. FEMS Microbiol. Rev. fuw023 (2016). doi:10.1093/femsre/fuw023 [0355] 89. Elghraoui, A., Modlin, S. J. & Valafar, F. SMRT genome assembly corrects reference errors, resolving the genetic basis of virulence in Mycobacterium tuberculosis. BMC Genomics 18, 302 (2017). [0356] 90. Chin, C.-S. et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods 10, 563-569 (2013). [0357] 91. Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722-736 (2017). [0358] 92. Hunt, M. et al. Circlator: automated circularization of genome assemblies using long sequencing reads. Genome Biol. 16, 294 (2015). [0359] 93. Chaisson, M. J. & Tesler, G. Mapping single molecule sequencing reads using Basic Local Alignment with Successive Refinement (BLASR): Theory and Application. BMC Bioinformatics 13, 238 (2012). [0360] 94. English, A. C., Salerno, W. J. & Reid, J. G. PBHoney: identifying genomic variants via long-read discordance and interrupted mapping. BMC Bioinformatics 15, 180 (2014). [0361] 95. Shabbeer, A. et al. TB-Lineage: An online tool for classification and analysis of strains of Mycobacterium tuberculosis complex. Infect. Genet. Evol. 12, 789-797 (2012). [0362] 96. Lew, J. M., Kapopoulou, A., Jones, L. M. & Cole, S. T. TubercuList—10 years after. Tuberculosis (Edinb). 91, 1-7 (2011). [0363] 97. Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068-2069 (2014). [0364] 98. Powell, S. et al. eggNOG v3.0: Orthologous groups covering 1133 organisms at 41 different taxonomic ranges. Nucleic Acids Res. 40, 284-289 (2012). [0365] 99. Hyatt, D. et al. 
Prodigal: Prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, (2010). [0366] 100. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772-80 (2013). [0367] 101. Marcais, G. et al. MUMmer4: A fast and versatile genome alignment system. PLOS Comput. Biol. 14, e1005944 (2018). [0368] 102. McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).
[0369] 103. Stamatakis, A. RAxML Version 8: A tool for Phylogenetic Analysis and Post-Analysis of Large Phylogenies. Bioinformatics 30, btu033-btu033 (2014). 104. Letunic, I. & Bork, P. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res. 44, W242-5 (2016). [0370] 105. Grant, C. E., Bailey, T. L. & Noble, W. S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017-1018 (2011).
REFERENCES EXAMPLE 2
[0371] 1. Beaulaurier, J. et al. Single molecule-level detection and long read-based phasing of epigenetic variations in bacterial methylomes. Nat Commun 6, 7438 (2015).
[0372] 2. Atack, J. M., Tan, A., Bakaletz, L. O., Jennings, M. P. & Seib, K. L. Phasevarions of Bacterial Pathogens: Methylomics Sheds New Light on Old Enemies. Trends Microbiol. 26, 715-726 (2018).
[0373] 3. Furuta, Y. & Kobayashi, I. Mobility of DNA sequence recognition domains in DNA methyltransferases suggests epigenetics-driven adaptive evolution. Mob. Genet. Elements 2, 292-296 (2012).
[0374] 4. Furuta, Y. et al. Methylome Diversification through Changes in DNA Methyltransferase Sequence Specificity. PLOS Genet 10, e1004272 (2014).
[0375] 5. Casadesus, J. & Low, D. A. Programmed Heterogeneity: Epigenetic Mechanisms in Bacteria. J. Biol. Chem. 288, 13929-13935 (2013).
[0376] 6. Sanchez-Romero, M. A. & Casadesús, J. The bacterial epigenome. Nature Reviews Microbiology 18, 7-20 (2020).
[0377] 7. Shell, S. S. et al. DNA methylation impacts gene expression and ensures hypoxic survival of Mycobacterium tuberculosis. PLoS Pathog. 9, e1003419 (2013).
[0378] 8. Gomez-Gonzalez, P. J. et al. An integrated whole genome analysis of Mycobacterium tuberculosis reveals insights into relationship between its genome, transcriptome and methylome. Sci. Rep. 9, 5204 (2019).
[0379] 9. Chiner-Oms, A. et al. Genome-wide mutational biases fuel transcriptional diversity in the Mycobacterium tuberculosis complex. Nat. Commun. 10, 3994 (2019).
[0380] 10. Zhu, L. et al. Precision methylome characterization of Mycobacterium tuberculosis complex (MTBC) using PacBio single-molecule real-time (SMRT) technology. Nucleic Acids Res. 44, gkv1498 (2015).
[0381] 11. Phelan, J. et al. Methylation in Mycobacterium tuberculosis is lineage specific with associated mutations present globally. Sci. Rep. 8, 160 (2018).
[0382] 12. Loenen, W. A. M., Dryden, D. T. F., Raleigh, E. A. & Wilson, G. G. Type I restriction enzymes and their relatives. Nucleic Acids Res. 42, 20-44 (2014).
[0383] 13. Roberts, R. J. et al. A nomenclature for restriction enzymes, DNA methyltransferases, homing endonucleases and their genes. Nucleic Acids Research 31, 1805-1812 (2003).
[0384] A number of embodiments of the invention have been described. Nevertheless, it can be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.