METHOD FOR SCREENING FOR BIOACTIVE NATURAL PRODUCTS

20230295612 · 2023-09-21

    Abstract

    The present invention relates to methods for screening for the presence of a biosynthetic gene cluster (BGC) in a cell, via the identification of proximal positive regulatory genes, e.g. large ATP-binding regulators of the LuxR family (LAL) genes.

    Claims

    1. A method for screening for the presence of a chemical entity in a bacterial cell, the method comprising the steps of: (a) expressing a positive regulatory gene in a bacterial cell; and (b) determining the presence of one or more chemical entities, other than the polypeptide which is encoded by the positive regulatory gene, whose expression level is increased in the bacterial cell after the expression of the positive regulatory gene, and optionally (c) isolating and/or identifying the chemical entity.

    2. A method for screening for the presence of a biosynthetic gene cluster in a bacterial cell, the method comprising the steps of: (a) identifying the location of a nucleotide sequence coding for a positive regulatory gene within the nucleotide sequence of the genome of a bacterial cell; and (b) analysing the nucleotide sequence of the cell genome in the proximity of the location of the nucleotide sequence of the identified positive regulatory gene in order to determine the presence of a nucleotide sequence which codes for a biosynthetic gene cluster; optionally wherein the method is a computer-implemented method.

    3. The method as claimed in claim 2, wherein the location of the nucleotide sequence coding for the positive regulatory gene is identified using a Hidden Markov model.

    4. The method as claimed in claim 2, wherein the method additionally comprises the step of: (c) proposing a molecular structure for a product resulting from the expression of the biosynthetic gene cluster.

    5. The method as claimed in claim 2, wherein the method additionally comprises the steps of: (d) obtaining a nucleic acid molecule whose nucleotide sequence comprises the nucleotide sequence of the biosynthetic gene cluster; (e) expressing the nucleic acid molecule in a heterologous host cell; (f) expressing the positive regulatory gene, or a derivative thereof, in the heterologous host cell; and optionally (g) isolating and/or identifying a product resulting from the expression of the biosynthetic gene cluster.

    6. A method for screening for the presence of a chemical entity in a bacterial cell, the method comprising the steps of: (i) screening for the presence of a biosynthetic gene cluster in a bacterial cell, as claimed in claim 2; and (ii) screening for the presence of a chemical entity in the bacterial cell by a method comprising the steps of: (a) expressing a positive regulatory gene in a bacterial cell; and (b) determining the presence of one or more chemical entities, other than the polypeptide which is encoded by the positive regulatory gene, whose expression level is increased in the bacterial cell after the expression of the positive regulatory gene, and optionally (c) isolating and/or identifying the chemical entity; wherein the biosynthetic gene cluster in Step (i) and (ii) are the same cluster; and the positive regulatory gene in Step (i) and (ii) are the same gene.

    7. The method as claimed in claim 2, wherein the bacterial cell or heterologous host cell is a Gram-positive bacterial cell.

    8. The method as claimed in claim 7, wherein the bacterial cell or heterologous host cell is of the phylum Actinobacteria.

    9. The method as claimed in claim 8, wherein the bacterial cell or heterologous host cell is of the genus Streptomyces.

    10. The method as claimed in claim 2, wherein the positive regulatory gene is obtained from or derived from the same genus, species or strain as the bacterial cell.

    11. The method as claimed in claim 10, wherein the positive regulatory gene is selected from the group consisting of the LuxR family of genes, SARP (Streptomyces antibiotic regulatory protein) genes and AraC genes.

    12. The method as claimed in claim 11, wherein the positive regulatory gene is the LAL gene.

    13. The method as claimed in claim 2, wherein, when expressed, the positive regulatory gene is operably-associated with a heterologous promoter.

    14. The method as claimed in claim 2, wherein, when expressed, the nucleotide sequence coding for the positive regulatory gene is codon-altered compared to the wild-type nucleotide sequence of the positive regulatory gene.

    15. The method as claimed in claim 2, wherein, when expressed, the G+C content of the nucleotide sequence of the positive regulatory gene is reduced compared to the G+C content of the wild-type nucleotide sequence of the positive regulatory gene.

    16. The method as claimed in claim 15, wherein, when expressed, the G+C content of the nucleotide sequence of the positive regulatory gene is less than 70%.

    17. The method as claimed in claim 2, wherein the chemical entity is a product resulting from the expression of the biosynthetic gene cluster.

    18. The method as claimed in claim 2, wherein the chemical entity is a polyketide, non-ribosomal peptide, terpene or RiPP.

    19. A LAL gene having a G+C content of less than 70%.

    20. A process for producing a modified bacterial cell, the process comprising the step of deleting a LAL gene or a LAL-regulator binding site from the genome of a cell.

    Description

    BRIEF DESCRIPTION OF THE FIGURES

    [0181] FIG. 1: The conserved domain search output for SamR0484 (top) and the HMM created to search for LAL-encoding genes (bottom).

    [0182] FIG. 2: Phylogenetic comparison of LALs highlighting proteins that are similar to those reported to regulate the biosynthesis of known natural products.

    [0183] FIG. 3: Transformants of S. caelestis NRRL 2821 obtained using a plasmid containing native strvi_8009 (left) and a codon-altered derivative (right).

    [0184] FIG. 4: Chromatograms from LC-MS analyses of culture extracts of S. rochei overexpressing a LAL regulator gene. A fresh transformant (top) produces novel metabolites that are absent in cultures grown from spore stocks of a transformant that has been stored for 4 weeks (bottom).

    [0185] FIG. 5: Chromatograms from LC-MS analyses of culture extracts of S. caelestis NRRL 2821 wild type (top) and S. caelestis NRRL 2821 overexpressing codon-altered strvi_8009 (bottom). Peaks corresponding to the novel metabolites identified as the likely products of the BGC associated with strvi_8009 are highlighted (yellow band).

    EXAMPLES

    [0186] The present invention is further illustrated by the following Examples, in which parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, various modifications of the invention in addition to those shown and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.

    Example 1: Screening for the Presence of LAL Genes in Actinobacterial Genomes

    [0187] 44 Actinobacteria were used to test different approaches to activate silent biosynthetic gene clusters (BGCs). The genome sequences of these organisms contained over 1,500 BGCs and we aimed to prioritize and activate those that were silent. An approach based around the large ATP-binding regulators of the LuxR (LAL) family proved to be generalizable.

    Identifying Novel LAL Regulator Genes

    [0188] We used the gene encoding the LAL regulator (samR0484 in Streptomyces ambofaciens) that regulates expression of the stambomycin BGC as a BLAST query against our collected genomes and found 18 genes encoding LAL regulators. We found a larger number of LAL regulator genes in our collection of genomes once we began to manually annotate BGCs. LAL regulators contain an ATP-binding subdomain (at the N-terminus) and a DNA-binding subdomain (at the C-terminus), but the central 600 amino acids have no obvious subdomain structure (FIG. 1). This low sequence homology across the entire length of the protein made it difficult to discover LAL regulators through BLAST nucleotide searches alone.

    [0189] A sequence alignment of 20 LAL regulators reported in the literature and the 18 regulators that we found within our strain collection was used to create a Hidden Markov Model (HMM) using HMMER 3.1b2 (Finn, R. D.; Clements, J.; Eddy, S. R. Nucleic Acids Res. 2011, gkr367) (FIG. 1). This HMM was used to search our collection of genomes, where it found over 250 LAL regulator genes. We also used the HMM to search the NCBI non-redundant database, where over 17,000 examples of LAL regulator genes were found.
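
    As a hedged illustration of how hits from such a search might be tabulated, the sketch below parses the per-target table that HMMER's hmmsearch writes with its --tblout option and keeps targets at or below a full-sequence E-value cutoff. The cutoff (1e-10), the function name, and the sample lines are assumptions made for illustration; they are not values or data from this work.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    target: str    # target sequence name (column 1 of --tblout)
    evalue: float  # full-sequence E-value (column 5)
    score: float   # full-sequence bit score (column 6)


def parse_tblout(lines: Iterable[str], max_evalue: float = 1e-10) -> List[Hit]:
    """Parse hmmsearch --tblout lines and keep hits with E-value <= max_evalue.

    The default cutoff is an illustrative assumption, not a recommended value.
    """
    hits = []
    for line in lines:
        # Comment and blank lines carry no hit data.
        if not line.strip() or line.startswith("#"):
            continue
        # The first 18 columns are whitespace-delimited; the rest is free text.
        fields = line.split(maxsplit=18)
        hits.append(Hit(fields[0], float(fields[4]), float(fields[5])))
    kept = [h for h in hits if h.evalue <= max_evalue]
    return sorted(kept, key=lambda h: h.evalue)


# Hypothetical sample output (gene names and scores are invented).
SAMPLE = [
    "# target name  accession  query name  accession  E-value  score  bias ...",
    "geneA - LAL_model - 1.2e-150 500.1 0.3 1.5e-150 499.8 0.3 1.0 1 0 0 1 1 1 1 hypothetical protein",
    "geneB - LAL_model - 3.0e-05 25.0 0.1 4.0e-05 24.6 0.1 1.2 1 0 0 1 1 1 1 -",
]

if __name__ == "__main__":
    for hit in parse_tblout(SAMPLE):
        print(hit.target, hit.evalue, hit.score)
```

    Sorting by E-value puts the strongest candidates first, which is convenient when the hit list is later reviewed by hand.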

    [0190] Using an HMM to search genomes not only allowed us to find a greater number of LAL regulator genes, but also showed that these genes are associated with BGCs for a greater range of natural product classes (i.e. polyketides, non-ribosomal peptides, ribosomally-synthesized and post-translationally modified peptides, terpenes and non-canonical pathways) than had been previously observed.
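
    The association between a regulator gene and a BGC can be framed as a coordinate-proximity test, in the spirit of the proximity analysis recited in claim 2. The sketch below is a minimal illustration: given coordinates for a regulator hit and for predicted cluster regions, it reports the clusters that contain the gene or lie within a flanking distance on the same contig. The 20 kb flank, the names, and all coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    name: str
    contig: str
    start: int  # 1-based start coordinate
    end: int    # inclusive end coordinate


def gap(a: Region, b: Region) -> int:
    """Distance in bp between two regions; 0 if they overlap or touch."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def clusters_near(regulator: Region, clusters: List[Region],
                  flank: int = 20_000) -> List[Region]:
    """Return clusters on the regulator's contig within `flank` bp of it.

    The default flank is an illustrative assumption.
    """
    return [c for c in clusters
            if c.contig == regulator.contig and gap(regulator, c) <= flank]


# Hypothetical coordinates for one regulator hit and three predicted clusters.
REGULATOR = Region("lal_1", "contig1", 50_000, 52_700)
CLUSTERS = [
    Region("bgc_A", "contig1", 40_000, 90_000),    # contains the regulator
    Region("bgc_B", "contig1", 100_000, 150_000),  # 47.3 kb downstream
    Region("bgc_C", "contig2", 50_000, 60_000),    # different contig
]

if __name__ == "__main__":
    print([c.name for c in clusters_near(REGULATOR, CLUSTERS)])
```

    Widening the flank trades specificity for sensitivity: with flank=50_000 the same input also returns bgc_B.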

    [0191] To prioritize which BGCs to focus on, phylogenetic methods were used to identify proteins that were similar to known LAL regulators (FIG. 2). Further bioinformatics analyses were used to predict structural features of the metabolic products of these BGCs, enabling us to prioritize those likely to direct the assembly of novel compounds.

    [0192] More specifically, phylogenetic analyses of particular enzymatic domains (e.g. KS and TE) were used to help distinguish which biosynthetic gene clusters would produce novel natural products. This was combined with manual annotation, which enabled the core scaffold of the unknown natural product to be predicted. This core scaffold can be searched against databases to establish its similarity to reported compounds.

    Example 2: Cloning/Synthesis of Novel LAL Regulator Genes

    [0193] In S. caelestis NRRL 2821, we found a BGC that contained a LAL regulator gene that was predicted to direct the assembly of a novel non-ribosomal peptide. This metabolite was not detected when the wild-type strain was cultured. The process of manipulating the LAL regulator gene to activate expression of the BGC and discover its metabolic product is described below as an example of aspects of the invention.

    [0194] To constitutively express LAL regulator genes, we first attempted to amplify them by PCR from genomic DNA and ligate each PCR product into a plasmid that places the gene under the control of a constitutive promoter (e.g. ermE*). The gene (strvi_8009) encoding the LAL regulator that is proposed to control the BGC in S. caelestis NRRL 2821 contains 2703 base pairs and has a GC content of 70%. Although this GC content is lower than we typically observe for genes encoding LAL regulators in Actinobacteria (~75%), it proved challenging to amplify this gene using PCR. We eventually succeeded by using a high concentration of DMSO (10% v/v) in the reaction, but were only able to obtain the product in low quantities (<1 ng/μL). The PCR product was first cloned into a shuttle vector, then sub-cloned into a plasmid that places the gene under the control of the ermE* promoter.

    [0195] Although it is possible to clone LAL regulator genes using traditional approaches, separate optimization is required for each regulator gene, which prevents the process from being scaled. To overcome this problem, we turned to gene synthesis. However, it also proved difficult to synthesize large genes (>3 kb) with high GC content. Initial efforts to synthesize three LAL genes with high native GC content failed. To overcome this, we codon-altered the genes to lower the GC content to 55-65% across their length. The following parameters were applied:

    [0196] GC content between 30-65%

    [0197] No window of 100 bp with GC higher than 75%

    [0198] No window of 50 bp with GC higher than 80%

    [0199] No 16mer repeats

    [0200] No TTA codons
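
    The design parameters listed above lend themselves to an automated check. The following is a minimal sketch, not the tool used in this work: it tests a candidate coding sequence against each of the five parameters. Two interpretations are assumptions: "16mer repeats" is read as any 16-nucleotide subsequence occurring more than once on the same strand, and "TTA codons" as in-frame TTA triplets counted from the first base. All function names and test sequences are hypothetical.

```python
from typing import Dict


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def max_window_gc(seq: str, window: int) -> float:
    """Highest GC fraction over all sliding windows of the given size.

    Returns 0.0 when the sequence is shorter than the window, since no
    window of that size exists and the constraint cannot be violated.
    """
    seq = seq.upper()
    if len(seq) < window:
        return 0.0
    is_gc = [base in "GC" for base in seq]
    count = sum(is_gc[:window])
    best = count
    for i in range(window, len(seq)):
        count += is_gc[i] - is_gc[i - window]  # slide the window by one base
        best = max(best, count)
    return best / window


def has_repeat(seq: str, k: int = 16) -> bool:
    """True if any k-mer occurs more than once on the given strand."""
    seen = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


def has_tta_codon(cds: str) -> bool:
    """True if an in-frame TTA codon is present (frame starts at base 1)."""
    return any(cds[i:i + 3] == "TTA" for i in range(0, len(cds) - 2, 3))


def check_design(cds: str) -> Dict[str, bool]:
    """Evaluate each design parameter; True means the parameter is satisfied."""
    cds = cds.upper()
    return {
        "overall_gc_30_65": 0.30 <= gc_fraction(cds) <= 0.65,
        "no_100bp_window_over_75": max_window_gc(cds, 100) <= 0.75,
        "no_50bp_window_over_80": max_window_gc(cds, 50) <= 0.80,
        "no_16mer_repeats": not has_repeat(cds, 16),
        "no_tta_codons": not has_tta_codon(cds),
    }


if __name__ == "__main__":
    # Toy sequence, far shorter than a real LAL gene.
    print(check_design("ATGGCTGAACTGTAA"))
```

    Returning one flag per parameter, rather than a single pass/fail value, makes it clear which constraint a candidate sequence violates before it is sent for synthesis.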

    [0201] Using these parameters, we were able to synthesize 25 refactored LAL genes, which were cloned into plasmids that placed them under the control of a constitutive promoter.

    Example 3: Transformation of Actinobacteria

    [0202] Genetic-engineering protocols for Actinobacteria are well established. Using these protocols we were unable to transform S. caelestis NRRL 2821 with the native strvi_8009 gene. When the same experiment was attempted with the codon-altered strvi_8009 derivative, transformants were obtained (FIG. 3). The reason for the increased transformation efficiency is not currently understood.

    Methods

    Transformation of Actinobacteria with Codon-Altered LAL Regulator Genes

    [0203] E. coli ET12567/pU8008 was transformed separately via electroporation with an integrative plasmid containing either the codon-altered or the native strvi_8009 gene under the control of the ermE* promoter. The transformants were incubated for 1 hour at 37° C. and then spread on LB agar containing kanamycin (50 μg/mL of LB), chloramphenicol (50 μg/mL of LB) and ampicillin (100 μg/mL of LB). The plates were incubated overnight at 37° C. A single colony was picked and grown overnight in liquid LB medium containing kanamycin (50 μg/mL of LB), chloramphenicol (50 μg/mL of LB) and ampicillin (100 μg/mL of LB). 300 μL of the overnight culture was inoculated into 10 mL of LB liquid medium containing kanamycin (50 μg/mL of LB), chloramphenicol (50 μg/mL of LB) and ampicillin (100 μg/mL of LB) and was grown to an OD of ~0.6 (about 5 hours). Cells were pelleted (5 min, 4000 rpm), washed three times with ice-cold LB medium, and then resuspended in LB medium (500 μL).

    [0204] Spores of S. caelestis NRRL 2821 (100 μL) were heat shocked in TSB medium (500 μL) at 55° C. for 10 min and then incubated at 30° C. for 5 hours. The E. coli donor cells, prepared as described above, were gently combined with the S. caelestis culture and the mixture was pelleted (2 min, 6000 rpm). 500 μL of the supernatant was removed and the cells were resuspended in the remaining liquid. The resulting mixture was spread on two SFM agar plates (supplemented with MgCl2, 100 μM), which were incubated overnight at 30° C. and then overlaid with appropriate antibiotics. After 4-7 days of cultivation at 30° C., the number of transconjugants on each plate was assessed (FIG. 3).

    Growth, Extraction and Metabolite Analysis (Liquid Cultures)

    [0205] A single transformant was selected and grown in pre-culture medium (TSB, 50 mL) for 2 days at 30° C. 500 μL of the pre-culture was used to inoculate 50 mL of each growth medium (a minimal medium, a natural medium and a rich medium) and the resulting cultures were grown for 7 days at 30° C. The cultures were acidified (to pH 4 with 2M HCl) and extracted with ethyl acetate (3×50 mL). The combined organics were dried over MgSO4 and evaporated to dryness. The residue was redissolved in acetonitrile/water (50/50 v/v, 1 mL) and analyzed by UHPLC-ESI-Q-TOF-MS as outlined below.

    Growth, Extraction and Metabolite Analysis (Solid Cultures)

    [0206] A single transformant was selected and streaked on ISP4 agar medium. After 7 days of growth at 30° C., the spores were harvested and used to inoculate agar plates containing three different media (a minimal medium, a natural medium and a rich medium). After 7 days of incubation at 30° C., the cultures were acidified (to pH 4 with 2M HCl) and extracted with acetonitrile (10 mL). The combined organics were dried over MgSO4 and evaporated to dryness. The residue was redissolved in acetonitrile/water (50/50 v/v, 1 mL) and analyzed by UHPLC-ESI-Q-TOF-MS as outlined below.

    UHPLC-ESI-Q-TOF-MS Equipment and Analysis Conditions

    [0207] Analyses were carried out using a Bruker MaXis Impact ESI-TOF-MS connected to a Dionex 3000 RS UHPLC instrument fitted with an Agilent Zorbax Eclipse Plus C18 column (100×2.1 mm, 1.8 μm, 25° C.). The flow rate was 0.2 mL/min and a gradient of 5-100% acetonitrile (with 0.1% formic acid) over 20 min was used as the eluent.
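
    For readers reproducing the chromatography, the gradient above can be expressed as a simple interpolation. The sketch below assumes the gradient starts at injection (t = 0), is strictly linear, and has no hold or re-equilibration step; the text does not specify any of these, so they are assumptions, as are the function names.

```python
def percent_acetonitrile(t_min: float, start: float = 5.0, end: float = 100.0,
                         duration: float = 20.0) -> float:
    """Eluent composition (% acetonitrile) at time t for a linear gradient.

    Values are clamped to the start and end compositions outside the
    gradient window (an assumption; no hold steps are described).
    """
    if t_min <= 0:
        return start
    if t_min >= duration:
        return end
    return start + (end - start) * t_min / duration


def eluent_volume_ml(flow_ml_per_min: float = 0.2, duration: float = 20.0) -> float:
    """Total eluent volume delivered over the gradient."""
    return flow_ml_per_min * duration


if __name__ == "__main__":
    print(percent_acetonitrile(10.0))  # midpoint of the gradient
    print(eluent_volume_ml())
```

    At the 10 min midpoint the composition is 52.5% acetonitrile, and about 4 mL of eluent is delivered over the full 20 min run at 0.2 mL/min.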

    Example 4: Metabolite Detection

    [0208] We previously found that transformants overexpressing LAL regulator genes would lose the ability to produce novel metabolites when stored (FIG. 4), hindering isolation and structure elucidation. To overcome this, we grew large-scale cultures directly from transformants, without producing an initial spore stock. We screened metabolite production in three different solid and liquid media (six in total).

    [0209] Ex-conjugants overexpressing strvi_8009 were screened for production of new metabolites. In rich media (both solid and liquid), the metabolite profile of the strain overexpressing the codon-altered strvi_8009 gene differed from that of the wild-type strain (FIG. 5). Four novel compounds, hypothesized to be the metabolic products of the BGC associated with strvi_8009, were identified.
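
    One way to frame such a comparison of metabolite profiles is as a set difference over LC-MS features. The sketch below is illustrative only: it treats each feature as an (m/z, retention time) pair and reports features from the overexpression strain that have no match in the wild-type extract within given tolerances. The tolerances (5 ppm, 0.2 min), the function names, and every feature value are invented for illustration and are not data from this work.

```python
from typing import List, Tuple

Feature = Tuple[float, float]  # (m/z, retention time in minutes)


def matches(a: Feature, b: Feature, ppm: float = 5.0, rt_tol: float = 0.2) -> bool:
    """True if two features agree in m/z (within ppm) and retention time."""
    mz_a, rt_a = a
    mz_b, rt_b = b
    return abs(mz_a - mz_b) <= mz_a * ppm * 1e-6 and abs(rt_a - rt_b) <= rt_tol


def novel_features(sample: List[Feature], control: List[Feature],
                   ppm: float = 5.0, rt_tol: float = 0.2) -> List[Feature]:
    """Features present in `sample` with no counterpart in `control`."""
    return [f for f in sample
            if not any(matches(f, c, ppm, rt_tol) for c in control)]


# Hypothetical feature lists (not measured values).
WILD_TYPE = [(301.1410, 5.20), (455.2900, 9.80)]
OVEREXPRESSION = [(301.1412, 5.25), (455.2900, 9.80), (812.4500, 12.10)]

if __name__ == "__main__":
    print(novel_features(OVEREXPRESSION, WILD_TYPE))
```

    Features flagged this way are only candidates; linking them to a particular BGC still requires isolation and structure elucidation.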

    REFERENCES

    [0210] Blin, K., Shaw, S., Steinke, K., Villebro, R., Ziemert, N., Lee, S. Y., Medema, M. H. and Weber, T. (2019) antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res., 47, W81-W87.

    [0211] Blin, K., Pascal Andreu, V., de los Santos, E. L. C., Del Carratore, F., Lee, S. Y., Medema, M. H. and Weber, T. (2019) The antiSMASH database version 2: a comprehensive resource on secondary metabolite biosynthetic gene clusters. Nucleic Acids Res., 47, D625-D630.

    [0212] Cimermancic, P., Medema, M. H., Claesen, J., Kurita, K., Wieland Brown, L. C., Mavrommatis, K., Pati, A., Godfrey, P. A., Koehrsen, M., Clardy, J. et al. (2014) Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell, 158, 412-421.

    [0213] Hadjithomas, M., Chen, I.-M. A., Chu, K., Huang, J., Ratner, A., Palaniappan, K., Andersen, E., Markowitz, V., Kyrpides, N. C. and Ivanova, N. N. (2017) IMG-ABC: new features for bacterial secondary metabolism analysis and targeted biosynthetic gene cluster discovery in thousands of microbial genomes. Nucleic Acids Res., 45, D560-D565.

    [0214] Kautsar et al. (2020) Nucleic Acids Res., 48 (Database issue), D454-D458. Published online 15 October 2019. doi: 10.1093/nar/gkz882

    [0215] E. Kuscer, N. Coates, I. Challis, M. Gregory, B. Wilkinson, R. Sheridan, H. Petkovic, J. Bacteriol. 2007, 189, 4756-4763.

    [0216] L. Laureti, L. Song, S. Huang, C. Corre, P. Leblond, G. L. Challis, B. Aigle, Proc. Natl. Acad. Sci. USA 2011, 108, 6258-6263.

    [0217] Thanapipatsiri et al. (2016) ChemBioChem 17, 2189-2198.

    [0218] D. J. Wilson, Y. Xue, K. A. Reynolds, D. H. Sherman, J. Bacteriol. 2001, 183, 3468-3475.

    LIST OF SEQUENCES

    [0219] The Sequence Listing filed with this patent application is fully incorporated herein as part of the description.

    [0220] SEQ ID NO: 1

    [0221] samR0484 (stambomycin) nucleotide sequence.

    [0222] SEQ ID NO: 2

    [0223] SamR0484 (stambomycin) amino acid sequence.

    [0224] SEQ ID NO: 3

    [0225] N-terminal ATPase domain-encoding region from samR0484.

    [0226] SEQ ID NO: 4

    [0227] N-terminal ATPase domain from SamR0484.

    [0228] SEQ ID NO: 5

    [0229] C-terminal LuxR family DNA-binding domain-encoding region with a helix-turn-helix motif from samR0484.

    [0230] SEQ ID NO: 6

    [0231] C-terminal LuxR family DNA-binding domain with a helix-turn-helix motif from SamR0484.