Methods for generating pools of variants of a DNA template
11274295 · 2022-03-15
Cpc classification
C12N15/1031
CHEMISTRY; METALLURGY
C12N15/1079
CHEMISTRY; METALLURGY
Abstract
The invention provides methods for generating pools of variants of DNA templates, and methods of using pools of variants to identify sequences involved in conferring sensitivity or resistance to environmental factors.
Claims
1. A method of generating a mixture of variants of a template DNA molecule, said method comprising: a) providing a plurality of oligonucleotides synthesized on one or more solid supports comprising one or more sets of oligonucleotides, each set corresponding to one of multiple (“n”) 100 to 500 nucleotide regions of said template DNA molecule, wherein the oligonucleotides in each set of oligonucleotides comprise: i) 5′ and 3′ ends identical to the 5′ and 3′ ends of the region, and ii) a central variable region comprising at least one sequence variation as compared to the sequence of said region of said template DNA molecule and the other oligonucleotides in the set; b) amplifying a set of oligonucleotides on the one or more solid supports corresponding to a first region of said template DNA molecule from the plurality of oligonucleotides using polymerase chain reaction (PCR); c) providing a plurality of linearized plasmid vectors comprising the template DNA molecule, but lacking said first region of said template DNA molecule, wherein said plurality of linearized plasmid vectors is generated by polymerase chain reaction using primers having 5′ ends that are facing each other on the vector and 3′ ends oriented such that extension amplifies the entire sequence of the vector except for the first region, wherein the ends of said linearized plasmid vectors are configured for joining with the ends of the amplified set of oligonucleotides to complete the template DNA molecule; and d) joining said amplified set of oligonucleotides to said plurality of linearized plasmid vectors, thereby generating a mixture of circular plasmids comprising a mixture of variants of said first region of said template DNA molecule.
2. The method of claim 1, wherein said template DNA molecule encodes a polypeptide, a non-coding RNA, an untranslated regulatory sequence, a promoter sequence, and/or an enhancer sequence.
3. The method of claim 1, wherein said variation is a substitution or deletion of at least one nucleotide.
4. The method of claim 1, further comprising repeating steps b) to d) for a second region of said template DNA molecule; and e) mixing said mixture of circular plasmids comprising variants of said first region of said template DNA molecule with said mixture of circular plasmids comprising variants of said second region of said template DNA molecule to generate a mixture of variants of said first and second regions of said template DNA molecule.
5. The method of claim 1, wherein each of the one or more sets of oligonucleotides of step (a) is a set of oligonucleotides that encodes at least one variation of every amino acid encoded by the central variable region of the set of oligonucleotides.
6. The method of claim 1, wherein each of the one or more sets of oligonucleotides of step (a) is a set of oligonucleotides that comprises at least one variation of every nucleotide of the central variable region of the set of oligonucleotides, and/or is a set of oligonucleotides that each have variation in more than one nucleotide of the central variable region.
7. The method of claim 1, wherein said method further comprises repeating steps b) to d) for each of said “n” regions to generate a mixture of circular plasmids, wherein each mixture of circular plasmids has variants of one of said “n” regions of said template molecule; and e) mixing each mixture of circular plasmids to generate a mixture of circular plasmids having variants of said “n” regions of said template DNA molecule.
8. The method of claim 5, wherein said at least one variation of every amino acid is at least one naturally occurring variation of every amino acid.
9. The method of claim 1, further comprising: e) transforming said mixture of circular plasmids generated by said joining into a host cell; f) selecting for recombinant clones containing said plasmids; and g) isolating said plasmids containing said variants.
10. A method of generating a mixture of variants of a template DNA molecule, said method comprising: a) providing a plurality of oligonucleotides synthesized on one or more solid supports comprising two or more sets of oligonucleotides, each set corresponding to one of multiple (“n”) 100 to 500 nucleotide consecutive regions of said template DNA molecule, wherein the oligonucleotides in each set of oligonucleotides comprise: i) 5′ and 3′ ends identical to the 5′ and 3′ ends of the region, and ii) a central variable region comprising at least one sequence variation as compared to the sequence of said region of said template DNA molecule and the other oligonucleotides in the set; b) amplifying two or more sets of oligonucleotides on the one or more solid supports corresponding to consecutive multiple regions of said template DNA molecule from the plurality of oligonucleotides using polymerase chain reaction (PCR); c) providing a plurality of linearized plasmid vectors comprising the template DNA molecule, but lacking said consecutive multiple regions of said template DNA molecule, wherein said plurality of linearized plasmid vectors is generated by polymerase chain reaction using primers having 5′ ends that are facing each other on the vector and 3′ ends oriented such that extension amplifies the entire sequence of the vector except for the consecutive multiple regions, wherein the ends of said linearized vectors are configured for joining with the ends of the amplified sets of oligonucleotides corresponding to the two outermost of the consecutive multiple regions of said template DNA molecule to complete the template DNA molecule upon joining of all of said amplified plurality of oligonucleotides corresponding to consecutive multiple regions; and d) joining said amplified sets of oligonucleotides to each other and said plurality of linearized plasmid vectors, thereby generating circular plasmids comprising a mixture of variants of said multiple regions of said template DNA molecule.
11. A method of identifying a variant nucleic acid molecule which, when introduced into a cell, selectively increases or decreases the sensitivity of the cell to an environmental factor, said method comprising: a) introducing a mixture of variants of a template DNA molecule into both a first population of cells and a second population of cells, wherein said mixture of variants of a template DNA molecule are generated by the methods of claim 1, b) incubating said first population of cells in the presence of a first environmental factor; c) incubating said second population of cells in the absence of said first environmental factor; d) isolating cells exhibiting a phenotype associated with increased or decreased sensitivity to said first environmental factor; and e) determining which variants of said template DNA molecule are enriched or depleted in cells isolated from said first population of cells as compared to said second population of cells; thereby identifying a variant nucleic acid that selectively increases or decreases the sensitivity of said cells to said first environmental factor.
12. The method of claim 11, wherein said second population of cells are grown in the presence of a second environmental factor, thereby identifying a variant that selectively increases or decreases the sensitivity of said cells to said first environmental factor relative to said second environmental factor.
13. The method of claim 11, wherein said isolating of cells comprises collection of surviving cells, separation of cells having a morphological characteristic, optionally wherein said morphological characteristic is cell size, expression of a differentiation marker, cell adhesion, or cell membrane integrity, and/or fluorescent activation cell sorting (FACS).
14. The method of claim 11, further comprising introducing a mixture of variants of a template DNA molecule into a third population of cells, incubating said third population of cells in the presence of a third environmental factor, and determining which variants of said template DNA molecule are enriched or depleted in cells incubated with said third environmental factor compared to said first and/or second environmental factors, thereby identifying variants that selectively increase or decrease sensitivity of said cells to said third environmental factor.
15. The method of claim 11, wherein said mixture of variants of a template DNA molecule are encapsulated in a viral particle when introduced into said population of cells.
16. The method of claim 15, wherein said isolating cells exhibiting a phenotype associated with increased or decreased sensitivity to said first environmental factor further comprises isolating viral DNA or RNA from said population of cells.
17. The method of claim 1, wherein the plurality of oligonucleotides is synthesized on separate solid supports.
18. The method of claim 1, wherein the central variable regions of each of the one or more sets of oligonucleotides collectively span the entire template DNA molecule.
Description
DETAILED DESCRIPTION OF THE INVENTION
(24) The invention provides methods of generating mixtures of nucleic acid (e.g., DNA) molecules (typically in the form of vectors, such as plasmids, e.g., viral vectors), in which each DNA molecule contains, in part, a variant of a common template DNA molecule. The methods of the invention can be used in the construction of mixtures of variants (also referred to herein as mutagenesis libraries) that contain many or all variants of a template DNA molecule. For example, the methods can be used for the construction of mutagenesis libraries that encode many or all single amino acid substitutions in a polypeptide from a protein-coding DNA template sequence. The methods of the invention can involve the simultaneous synthesis of short oligonucleotides that contain all desired variations (e.g., substitutions) on microarrays or other micro-scale solid supports, and then use of these oligonucleotides to construct mutagenesis libraries using multiplexed, seamless cloning reactions. The methods of the invention provide significantly better control over the composition of the DNA library than error-prone PCR, and require significantly less labor and reagents than existing library generation methods (e.g., site-directed mutagenesis).
(25) The invention also provides methods of identifying amino acid residues or nucleic acid sequences that, when mutated, give rise to differences in sensitivity to certain environmental factors. For example, the methods of the invention can be used to determine which mutations in bacteria give rise to sensitivity or resistance to one antibiotic, but do not similarly affect the sensitivity of the bacteria to a second (or further), possibly related antibiotic. In other examples, the methods can be used to determine which mutations in mammalian cells give rise to sensitivity or resistance to a particular drug (e.g., a chemotherapeutic agent), but may not similarly affect the sensitivity of the cells to a second (or further), possibly related drug (e.g., chemotherapeutic agent). The methods of the invention can be adapted to measure differences in sensitivity to environmental factors by detection of a variety of phenotypes. These phenotypes include, for example, growth rate, utilization of certain nutrients, sensitivity to growth or differentiation factors, sensitivity to certain types of mutagenic conditions, expression of marker genes and proteins, and sensitivity to infection by, e.g., viruses.
(26) In brief, these methods of the invention can include providing a diverse library of nucleic acid molecules encoding mutants of a protein or nucleic acid molecule of interest (generated by, e.g., the methods described herein), introducing the library into a cellular population, exposing the cells to at least one environmental factor, isolating cells exhibiting a desired phenotype in response to the at least one environmental factor, and identifying the mutations enriched or depleted in the isolated population as compared to, for example, the starting population of cells or a population of cells incubated in the absence of the environmental factor. In some embodiments, the methods can further include exposing one or more additional populations of cells to a second (or third, etc.) environmental factor, isolating cells exhibiting a desired phenotype, and identifying the mutations enriched or depleted in the isolated populations as compared to the mutations present in the starting population and/or enriched in the populations of cells isolated after exposure to the first (or second, etc.) environmental factor. Such comparisons facilitate identification of mutations that confer increased or decreased sensitivity (e.g., selective antibiotic or other drug resistance or sensitivity) to one environmental factor over another.
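The enrichment comparison described above can be illustrated with a minimal sketch. It assumes that each variant has already been counted by sequencing in a selected population and in a control population; the function name `log2_enrichment`, the pseudocount, and the read counts are hypothetical and are not taken from the specification.

```python
import math

def log2_enrichment(selected_counts, control_counts, pseudocount=0.5):
    """Return log2(frequency in selected / frequency in control) per variant.

    A pseudocount (an assumption of this sketch) avoids division by zero
    for variants that are absent from one of the two populations.
    """
    variants = set(selected_counts) | set(control_counts)
    sel_total = sum(selected_counts.values()) + pseudocount * len(variants)
    ctl_total = sum(control_counts.values()) + pseudocount * len(variants)
    scores = {}
    for v in variants:
        sel_freq = (selected_counts.get(v, 0) + pseudocount) / sel_total
        ctl_freq = (control_counts.get(v, 0) + pseudocount) / ctl_total
        scores[v] = math.log2(sel_freq / ctl_freq)
    return scores

# Hypothetical read counts per variant (illustrative only).
control = {"WT": 1000, "A42G": 1000, "L10P": 1000}
selected = {"WT": 1000, "A42G": 4000, "L10P": 10}
scores = log2_enrichment(selected, control)
```

A positive score indicates a variant enriched in the selected population and a negative score a depleted one. Because the scores are ratios of frequencies, a strongly enriched variant lowers the apparent score of the others, so in practice scores are often normalized to the unmutated template.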
(27) The Template DNA Molecule
(28) The template DNA molecule is a starting material for the methods of the invention, from which variants of the template DNA molecule are generated. The template DNA molecule can optionally include a sequence that encodes a polypeptide (e.g., a protein), a DNA regulatory element (e.g., a promoter or enhancer), or RNA. The template DNA molecule is typically cloned into a vector, such as a plasmid vector (e.g., a mammalian expression vector, a bacterial expression vector, or a viral vector (e.g., a lentiviral vector)), which optionally includes a selectable marker. The template DNA molecule can be 100 nucleotides or more in length (e.g., greater than 100, 150, 200, 300, 500, 1000, 2000, 5000, or 10000 nucleotides in length). In certain embodiments, the template DNA molecule can encode an antibiotic, anti-viral, or chemotherapeutic resistance gene or a protein expressed on a cell surface, or can contribute to the regulation of, e.g., a differentiation factor.
(29) The Oligonucleotides
(30) Sets of oligonucleotides used in the invention are typically 100-500 (e.g., 100, 150, 200, 250, 300, 350, 400, 450, or 500) nucleotides in length. Each oligonucleotide may contain zero, one, or more variations (e.g., substitutions, deletions, or insertions) in sequence according to the desired composition of the final mixture of variants. Also, as described elsewhere herein, the oligonucleotides, in some embodiments, can include a central region where sequence variations occur, and invariant 5′ and 3′ ends, which can be used, for example, as PCR primer recognition sites. The oligonucleotides can thus include, for example, 15-100, 20-80, or 30-50 invariant nucleotides flanking the central region. In one example, the oligonucleotides include 200 nucleotides, with 30 on each end being invariant and flanking a 140 nucleotide central region including variant sequences as described herein.
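The oligonucleotide layout in the example above (200 nucleotides, 30 invariant nucleotides on each end, 140-nucleotide central region) can be sketched as follows. This is a minimal illustration that assumes the desired variations are all single-nucleotide substitutions; the function name `single_substitution_oligos` and the placeholder sequence are hypothetical.

```python
def single_substitution_oligos(region, flank=30):
    """Return oligos for one region, each carrying one substitution.

    Substitutions are confined to the central part of the region; the
    first and last `flank` nucleotides stay identical to the template so
    that they can serve as shared PCR primer recognition sites.
    """
    if len(region) <= 2 * flank:
        raise ValueError("region must be longer than the two invariant flanks")
    oligos = []
    for pos in range(flank, len(region) - flank):
        for base in "ACGT":
            if base != region[pos]:
                oligos.append(region[:pos] + base + region[pos + 1:])
    return oligos

# Placeholder 200-nt region (not a real template sequence).
region = "ACGT" * 50
oligos = single_substitution_oligos(region, flank=30)
```

With these assumed dimensions the set contains 140 × 3 = 420 oligonucleotides, all sharing the same 5′ and 3′ ends.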
(31) The oligonucleotides are synthesized using standard methods, e.g., on microarrays (e.g., a programmable microarray) or other micro-scale solid supports. As described further below, in various examples of the invention, a set of long DNA oligonucleotides that encode all desired mutations, but which are otherwise homologous to the template sequence, is designed. The oligonucleotides are organized into ‘tiles,’ for use in the ‘tiling mutagenesis’ methods described herein, where those within each tile differ in a central variable region but share identical 5′ and 3′ ends. The tiles can be staggered such that their variable regions collectively span the entire template. Individual tiles are PCR amplified using primers complementary to their shared ends. To avoid hybridization and extension of partially overlapping oligonucleotides, the tiles can be split into two (or more) non-overlapping pools that are synthesized and amplified separately. PCR products from each tile are inserted into linearized plasmids that carry the remaining template sequence using multiplexed sequence- and ligation-independent cloning.
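The staggering of tiles and their division into non-overlapping pools can be sketched as a coordinate calculation. The sketch assumes that the template sits inside a longer construct, so that the invariant ends of the outermost tiles can extend into vector sequence, and that alternating tiles are assigned to two pools. The function name `design_tiles`, the default lengths, and the coordinates are hypothetical.

```python
def design_tiles(template_start, template_end, var_len=140, flank=30):
    """Return tile coordinates (0-based, end-exclusive) on the construct.

    Variable regions are laid end to end so that together they cover
    [template_start, template_end). Each tile extends `flank` nucleotides
    beyond its variable region on both sides, so neighbouring tiles
    overlap; alternating tiles go to pools 0 and 1.
    """
    if var_len < 2 * flank:
        raise ValueError("var_len must be at least 2 * flank to keep pools non-overlapping")
    if template_start < flank:
        raise ValueError("need at least `flank` nucleotides of context before the template")
    tiles = []
    var_start = template_start
    index = 0
    while var_start < template_end:
        var_end = min(var_start + var_len, template_end)
        tiles.append({
            "index": index,
            "start": var_start - flank,   # includes invariant 5' end
            "end": var_end + flank,       # includes invariant 3' end
            "var_start": var_start,
            "var_end": var_end,
            "pool": index % 2,            # alternate tiles between two pools
        })
        var_start = var_end
        index += 1
    return tiles

# Hypothetical 600-nt template starting at position 100 of the construct.
tiles = design_tiles(template_start=100, template_end=700)
```

Adjacent tiles overlap by twice the flank length, whereas tiles two positions apart do not overlap as long as the variable region is at least twice as long as the flank, which is why alternating assignment is sufficient in this sketch.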
(32) Method for Construction of Mutagenesis Libraries
(33) The methods of the invention are exemplified in the following list of steps: 1) Generating a plasmid vector that contains the template DNA molecule using methods known in the art, including, but not limited to cloning and standard commercial gene synthesis. 2) Designing sets of oligonucleotides, typically 150-250 nucleotides long (e.g., 100, 150, 200, or 250 nucleotides long) as shown in
Variants
(34) Sequence variants in the methods of the invention can include substitutions, insertions, or deletions of nucleotides. For example, if a template DNA molecule is an open reading frame that encodes a polypeptide, the methods of the invention can be used to create a mixture of variants in which every amino acid of the polypeptide is changed to at least one (e.g., 2, 3, 4, 5, 6, 10, 15, or 20) of any one of the 20 naturally occurring amino acids or any unnatural amino acids.
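For a protein-coding template, enumeration of every single amino acid substitution can be sketched as below. The sketch assumes an uppercase open reading frame, the standard genetic code, and one arbitrarily chosen codon per amino acid; the function name `single_aa_variants`, the codon choice, and the short example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One codon per amino acid (first in table order; an arbitrary choice).
PREFERRED = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        PREFERRED.setdefault(aa, codon)

def single_aa_variants(orf):
    """Map names such as 'A2G' to ORF sequences with one codon replaced."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    variants = {}
    for i in range(0, len(orf), 3):
        wt_aa = CODON_TABLE[orf[i:i + 3]]
        if wt_aa == "*":
            continue  # leave stop codons unchanged
        for aa, codon in PREFERRED.items():
            if aa != wt_aa:
                name = f"{wt_aa}{i // 3 + 1}{aa}"
                variants[name] = orf[:i] + codon + orf[i + 3:]
    return variants

# Toy ORF encoding Met-Ala-Lys followed by a stop codon.
orf = "ATGGCTAAATGA"
variants = single_aa_variants(orf)
```

Each coding position yields 19 substitutions, so the toy sequence with three coding codons gives 57 variants. A real design would typically choose codons according to the expression host.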
(35) If a template DNA molecule encodes a DNA regulatory element (e.g., a transcription factor binding site, an enhancer sequence, or a chromatin remodeling sequence), then the methods of the invention can be used to create a mixture of variants in which every nucleotide of the DNA regulatory element has been changed at least once. Similarly, if the template DNA molecule encodes an RNA molecule (e.g., miRNA, siRNA, or mRNA), then the methods of the invention can be used to create a mixture of variants in which every nucleotide of the encoded RNA has been changed at least once (optionally to all other naturally occurring sequence options and/or to options including unnaturally occurring nucleotides).
(36) The mixture of variants can include plasmids that contain one or more variations in a single region (e.g., 1, 2, 3, 4, 5, 6, 10, 20, 30, 50, 100, or 200 variations) and the mixture may thus include many or all variants of the template DNA molecule. The mixture of variants can include vectors (e.g., plasmids) in which each vector contains one or more variations in different regions corresponding to the template DNA molecule. For example, such a mixture of variants may contain vectors (e.g., plasmids) that represent variations in 1, 2, 3, 10, 15, 20, 100, or 200 regions corresponding to the template DNA molecule.
(37) The mixture of variants can include vectors (e.g., plasmids) that each contain more than one variation in two or more different regions of the template DNA molecule. Such variants can be created by generating a variation in one region first and then using this variant containing vector as an input in the method of the invention for generating further variations in one or more regions.
(38) Alternatively, variation containing oligonucleotides corresponding to two or more overlapping adjacent regions can be joined in a linear manner and then recombined with the linearized plasmid vector to produce a single plasmid containing variations in multiple regions of the template DNA molecule in one step.
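The linear joining of overlapping adjacent regions can be modeled in silico as merging sequences at their shared ends. This is only a string-level sketch of the expected product, not a description of the cloning chemistry; the function name `join_by_overlap`, the minimum overlap, and the sequences are hypothetical.

```python
from functools import reduce

def join_by_overlap(left, right, min_overlap=15):
    """Merge two fragments at the longest suffix/prefix overlap.

    Raises ValueError if the fragments do not share at least
    `min_overlap` nucleotides at the junction.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("fragments do not share a sufficient overlap")

# Toy fragments with 8-nt overlaps (illustrative only).
fragments = ["AAAACCCCGGGGTTTT", "GGGGTTTTACGTACGT", "ACGTACGTCCCCAAAA"]
joined = reduce(lambda a, b: join_by_overlap(a, b, min_overlap=8), fragments)
```

Applying the same function to the outermost fragment ends and the ends of the linearized vector would model closure of the circular plasmid.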
(39) Identifying Mutations that Confer Sensitivity to Environmental Factors
(40) The methods of the invention are useful to identify mutations that confer selective sensitivity or resistance of cells to an environmental factor, which can be measured as compared to, for example, cells not exposed to the environmental factor, cells exposed to a second factor, and/or cells exposed to a different concentration or dosage of the first environmental factor. In these methods, the environmental factor can be, e.g., antibiotics, such as antibiotics of the same family. Accordingly, the methods of the invention can be used to determine which mutations provide sensitivity or resistance to one member of an antibiotic family as compared to, for example, a second member of the same family. Such information can be used to inform treatment decisions by identifying whether a particular pathogen infecting a subject contains a mutation that, while providing resistance to a first antibiotic, still provides susceptibility to treatment with a second antibiotic, or to prospectively identify combinations of antibiotics that are not vulnerable to the same resistance mutations, thereby reducing the risk of emergent resistance during the course of treatment.
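Selective resistance between two environmental factors, for example two antibiotics of the same family, can be sketched as a comparison of per-variant enrichment scores. The sketch assumes that a log2 enrichment score relative to an untreated population is already available for each factor; the function name `selective_variants`, the threshold, and the scores are hypothetical.

```python
def selective_variants(scores_a, scores_b, threshold=1.0):
    """Return variants enriched under factor A but not under factor B.

    Only variants measured under both factors are considered. A variant
    counts as enriched when its log2 score is at least `threshold`.
    """
    shared = scores_a.keys() & scores_b.keys()
    return sorted(
        v for v in shared
        if scores_a[v] >= threshold and scores_b[v] < threshold
    )

# Hypothetical log2 enrichment scores under two antibiotics.
scores_drug_a = {"A42G": 3.1, "L10P": -4.0, "E104K": 2.5, "M69L": 0.1}
scores_drug_b = {"A42G": 2.8, "L10P": -3.5, "E104K": -0.2, "M69L": 0.0}
selective = selective_variants(scores_drug_a, scores_drug_b)
```

In this toy data set, one variant is enriched under both factors and is therefore not selective, whereas another is enriched only under the first factor.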
(41) Antibiotics
(42) As non-limiting examples, the methods of the invention can be used in identifying mutations that confer selective sensitivity or resistance to or between antibiotics selected from among any or all of the following:
(43) Aminoglycosides (e.g., amikacin, gentamicin, kanamycin, neomycin, netilmicin, tobramycin, paromomycin, and spectinomycin).
(44) Ansamycins (e.g., geldanamycin, herbimycin, rifaximin, and streptomycin).
(45) Carbapenems (e.g., ertapenem, doripenem, cilastatin, and meropenem).
(46) First generation cephalosporins (e.g., cefadroxil, cefazolin, cefalotin, and cefalexin).
(47) Second generation cephalosporins (e.g., cefaclor, cefamandole, cefoxitin, cefprozil, and cefuroxime).
(48) Third generation cephalosporins (e.g., cefixime, cefdinir, cefditoren, cefoperazone, cefotaxime, cefpodoxime, ceftazidime, ceftibuten, ceftizoxime, and ceftriaxone).
(49) Fourth and fifth generation cephalosporins (e.g., cefepime, ceftaroline fosamil, and ceftobiprole).
(50) Glycopeptides (e.g., teicoplanin, vancomycin, and telavancin).
(51) Lincosamides (e.g., clindamycin and lincomycin).
(52) Daptomycin.
(53) Macrolides (e.g., azithromycin, clarithromycin, dirithromycin, erythromycin, roxithromycin, troleandomycin, telithromycin, and spiramycin).
(54) Aztreonam.
(55) Nitrofurans (e.g., furazolidone and nitrofurantoin).
(56) Oxazolidinones (e.g., linezolid, posizolid, radezolid, and torezolid).
(57) Penicillins (e.g., amoxicillin, ampicillin, azlocillin, carbenicillin, cloxacillin, dicloxacillin, flucloxacillin, mezlocillin, methicillin, nafcillin, oxacillin, penicillin G, penicillin V, piperacillin, temocillin, and ticarcillin).
(58) Penicillin combinations (e.g., amoxicillin/clavulanate, ampicillin/sulbactam, piperacillin/tazobactam, and ticarcillin/clavulanate).
(59) Polypeptides (e.g., bacitracin, colistin, and polymyxin B).
(60) Quinolones (e.g., ciprofloxacin, enoxacin, gatifloxacin, levofloxacin, lomefloxacin, moxifloxacin, nalidixic acid, norfloxacin, ofloxacin, trovafloxacin, grepafloxacin, sparfloxacin, and temafloxacin).
(61) Sulfonamides (e.g., mafenide, sulfacetamide, sulfadiazine, silver sulfadiazine, sulfadimethoxine, sulfamethizole, sulfamethoxazole, sulfanilamide, sulfasalazine, sulfisoxazole, trimethoprim-sulfamethoxazole, and sulfonamidochrysoidine).
(62) Tetracyclines (e.g., demeclocycline, doxycycline, minocycline, oxytetracycline, and tetracycline).
(63) Chemotherapeutic Agents
(64) As further non-limiting examples, the methods of the invention are useful in identifying mutations that confer selective sensitivity or resistance to or between chemotherapeutic agents selected from among any or all of the following:
(65) Alkylating agents (e.g., cyclophosphamide, mechlorethamine, chlorambucil, and melphalan).
(66) Anthracyclines (e.g., daunorubicin, doxorubicin, epirubicin, idarubicin, mitoxantrone, and valrubicin).
(67) Cytoskeletal disruptors (taxanes) (e.g., paclitaxel and docetaxel).
(68) Epothilones.
(69) Histone deacetylase inhibitors (e.g., vorinostat and romidepsin).
(70) Inhibitors of topoisomerase I (e.g., irinotecan and topotecan).
(71) Inhibitors of topoisomerase II (e.g., etoposide, teniposide, and tafluposide).
(72) Kinase inhibitors (e.g., bortezomib, erlotinib, gefitinib, imatinib, vemurafenib, and vismodegib).
(73) Monoclonal antibodies (e.g., bevacizumab, cetuximab, ipilimumab, ofatumumab, ocrelizumab, panitumumab, and rituximab).
(74) Nucleotide analogs and precursor analogs (e.g., azacitidine, azathioprine, capecitabine, cytarabine, doxifluridine, fluorouracil, gemcitabine, hydroxyurea, mercaptopurine, methotrexate, and tioguanine (formerly thioguanine)).
(75) Peptide antibiotics (e.g., bleomycin and actinomycin).
(76) Platinum-based agents (e.g., carboplatin, cisplatin, and oxaliplatin).
(77) Retinoids (e.g., tretinoin, alitretinoin, and bexarotene).
(78) Vinca alkaloids and derivatives (e.g., vinblastine, vincristine, vindesine, and vinorelbine).
(79) Anti-Viral Agents
(80) As further non-limiting examples, the methods of the invention are useful in identifying mutations that confer selective sensitivity or resistance to or between anti-viral agents selected, e.g., from among any or all of the following:
(81) Abacavir, aciclovir (acyclovir), adefovir, amantadine, amprenavir, ampligen, arbidol, atazanavir, atripla (fixed dose drug), balavir, boceprevir, cidofovir, combivir (fixed dose drug), darunavir, delavirdine, didanosine, docosanol, edoxudine, efavirenz, emtricitabine, enfuvirtide, entecavir, entry inhibitors, famciclovir, fixed dose combination (antiretroviral), fomivirsen, fosamprenavir, foscarnet, fosfonet, fusion inhibitor, ganciclovir, ibacitabine, imunovir, idoxuridine, imiquimod, indinavir, inosine, integrase inhibitor, interferon type III, interferon type II, interferon type I, interferon, lamivudine, lopinavir, loviride, maraviroc, moroxydine, methisazone, nelfinavir, nevirapine, nexavir, nucleoside analogues, oseltamivir, peginterferon alfa-2a, penciclovir, peramivir, pleconaril, podophyllotoxin, protease inhibitor (pharmacology), raltegravir, reverse transcriptase inhibitor, ribavirin, rimantadine, ritonavir, pyramidine, saquinavir, stavudine, synergistic enhancer (antiretroviral), tea tree oil, telaprevir, tenofovir, tenofovir disoproxil, tipranavir, trifluridine, trizivir, tromantadine, truvada, valaciclovir (valtrex), valganciclovir, vicriviroc, vidarabine, viramidine, zalcitabine, zanamivir, and zidovudine.
(82) Anti-Fungal Agents
(83) As further non-limiting examples, the methods of the invention are useful in identifying mutations that confer selective sensitivity or resistance to or between anti-fungal agents selected, e.g., from among any or all of the following:
(84) Polyene antifungals (e.g., amphotericin B, candicidin, filipin, hamycin, natamycin, nystatin, and rimocidin).
(85) Imidazoles (e.g., bifonazole, butoconazole, clotrimazole, econazole, fenticonazole, isoconazole, ketoconazole, miconazole, omoconazole, oxiconazole, sertaconazole, sulconazole, and tioconazole).
(86) Triazoles (e.g., albaconazole, fluconazole, isavuconazole, itraconazole, posaconazole, ravuconazole, terconazole, and voriconazole).
(87) Thiazoles (e.g., abafungin).
(88) Allylamines (e.g., amorolfin, butenafine, naftifine, and terbinafine).
(89) Echinocandins (e.g., anidulafungin, caspofungin, and micafungin).
(90) Others (e.g., benzoic acid, ciclopirox, flucytosine or 5-fluorocytosine, griseofulvin, haloprogin, polygodial, tolnaftate, undecylenic acid, and crystal violet).
(91) Alternatives (e.g., oregano, allicin, citronella oil, coconut oil, iodine (Lugol's iodine), lemon myrtle, neem seed oil, olive leaf, orange oil, palmarosa oil, patchouli, selenium, tea tree oil (ISO 4730, “oil of melaleuca, terpinen-4-ol type”), zinc, horopito (Pseudowintera colorata) leaf, which contains the antifungal compound polygodial, turnip, chives, and radish).
(92) Pathogenic Organisms
(93) As further non-limiting examples, the methods of the invention are useful in identifying mutations that confer selective sensitivity or resistance to pathogenic organisms (or toxins or components of such organisms or particles that mediate cell entry or other interactions with the host cell, etc). In other examples, the methods of the invention are useful in identifying mutations in pathogenic organisms that make them more or less sensitive to antibiotic agents or more or less toxic to cells. Pathogenic organisms are selected, e.g., from among any or all of the following:
(94) Bordetella (e.g., Bordetella pertussis), Borrelia (e.g., Borrelia burgdorferi), Brucella (e.g., Brucella abortus, Brucella canis, Brucella melitensis, Brucella suis), Campylobacter (e.g., Campylobacter jejuni), Chlamydia and Chlamydophila (e.g., Chlamydia pneumoniae, Chlamydia trachomatis, Chlamydophila psittaci), Clostridium (e.g., Clostridium botulinum, Clostridium difficile, Clostridium perfringens, Clostridium tetani), Corynebacterium (e.g., Corynebacterium diphtheriae), Enterococcus (e.g., Enterococcus faecalis, Enterococcus faecium), Escherichia (e.g., Escherichia coli), Francisella (e.g., Francisella tularensis), Haemophilus (e.g., Haemophilus influenzae), Helicobacter (e.g., Helicobacter pylori), Legionella (e.g., Legionella pneumophila), Leptospira (e.g., Leptospira interrogans), Listeria (e.g., Listeria monocytogenes), Mycobacterium (e.g., Mycobacterium leprae, Mycobacterium tuberculosis, Mycobacterium ulcerans), Mycoplasma (e.g., Mycoplasma pneumoniae), Neisseria (e.g., Neisseria gonorrhoeae, Neisseria meningitidis), Pseudomonas (e.g., Pseudomonas aeruginosa), Rickettsia (e.g., Rickettsia rickettsii), Salmonella (e.g., Salmonella typhi, Salmonella typhimurium), Shigella (e.g., Shigella sonnei), Staphylococcus (e.g., Staphylococcus aureus, Staphylococcus epidermidis, Staphylococcus saprophyticus), Streptococcus (e.g., Streptococcus agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes), Treponema (e.g., Treponema pallidum), Vibrio (e.g., Vibrio cholerae), and Yersinia (e.g., Yersinia pestis).
(95) Hormones
(96) As further non-limiting examples, the methods of the invention are useful in identifying mutations that change the sensitivity or resistance of cells to certain hormones or groups of hormones, e.g., from among any or all of the following hormones:
(97) Eicosanoid (e.g., prostaglandins, leukotrienes, prostacyclin, and thromboxane).
(98) Peptide (e.g., amylin (or islet amyloid polypeptide), anti-Müllerian hormone (or Müllerian inhibiting factor or hormone), adiponectin, adrenocorticotropic hormone (or corticotropin), angiotensinogen and angiotensin, antidiuretic hormone (or vasopressin, arginine vasopressin), atrial-natriuretic peptide (or atriopeptin), brain natriuretic peptide, calcitonin, cholecystokinin, corticotropin-releasing hormone, enkephalin, endothelin, erythropoietin, follicle-stimulating hormone, galanin, gastrin, ghrelin, glucagon, gonadotropin-releasing hormone, growth hormone-releasing hormone, human chorionic gonadotropin, human placental lactogen, growth hormone, inhibin, insulin, insulin-like growth factor (or somatomedin), leptin, lipotropin, luteinizing hormone, melanocyte stimulating hormone, motilin, orexin, oxytocin, pancreatic polypeptide, parathyroid hormone, prolactin, prolactin releasing hormone, relaxin, renin, secretin, somatostatin, thrombopoietin, thyroid-stimulating hormone (or thyrotropin), and thyrotropin-releasing hormone).
(99) Steroid (e.g., testosterone, dehydroepiandrosterone, androstenedione, dihydrotestosterone, aldosterone, estradiol, estrone, estriol, cortisol, progesterone, calcitriol (1,25-dihydroxyvitamin D3), and calcidiol (25-hydroxyvitamin D3)).
(100) Anti-Inflammatory Agents
(101) As further non-limiting examples, the methods of the invention are useful in identifying mutations that change the sensitivity of cells to certain anti-inflammatory agents, e.g., from among any or all of the following:
(102) Salicylates (e.g., aspirin (acetylsalicylic acid), diflunisal, and salsalate).
(103) Propionic acid derivatives (e.g., ibuprofen, dexibuprofen, naproxen, fenoprofen, ketoprofen, dexketoprofen, flurbiprofen, oxaprozin, and loxoprofen).
(104) Acetic acid derivatives (e.g., indomethacin, tolmetin, sulindac, etodolac, ketorolac, diclofenac, and nabumetone).
(105) Enolic acid (oxicam) derivatives (e.g., piroxicam, meloxicam, tenoxicam, droxicam, lornoxicam, and isoxicam).
(106) Fenamic acid derivatives (fenamates) (e.g., mefenamic acid, meclofenamic acid, flufenamic acid, and tolfenamic acid).
(107) Selective COX-2 inhibitors (coxibs) (e.g., celecoxib, rofecoxib, valdecoxib, parecoxib, lumiracoxib, etoricoxib, firocoxib, and paracetamol).
(108) Sulphonanilides (e.g., nimesulide).
(109) Others (e.g., licofelone and lysine clonixinate).
(110) Natural (e.g., hyperforin, figwort, and calcitriol (vitamin D)).
(111) Corticosteroids (e.g., corticosterone, deoxycorticosterone, cortisol, 11-deoxycortisol, cortisone, 18-hydroxycorticosterone, 1α-hydroxycorticosterone, and aldosterone).
(112) In other embodiments, the environmental factors used in the methods of the invention can be, e.g., growth factors, drugs, nutrients (e.g., carbon sources, vitamins, etc.), cellular toxins, anti-aging compounds, anti-protozoan compounds, differentiation factors, mutagens (e.g., ultraviolet light, X-rays, gamma rays, and chemical carcinogens), temperature, pressure, pH, salinity, and viscosity.
(113) In other embodiments, the environmental factors used in the methods of the invention can be pathogenic organisms or particles that attach to or enter cells, including viral particles (e.g., adenovirus, picornavirus, herpesvirus, flavivirus, polyomavirus, retrovirus, rhabdovirus and togavirus) and components of such organisms or particles that mediate cell entry or other interactions with the host cell (e.g., virion envelope glycoproteins such as HIV gp120 and Herpes simplex virus gC and gD).
(114) The methods of isolating cells exposed to a particular environmental factor will depend on the nature of the environmental factor. If, for example, the environmental factor affects the growth rate of the cell or results in cell death, cells can be isolated by selection methods. In this embodiment, for example, if the variant increases sensitivity to an environmental factor that decreases the growth rate of, or is toxic to, a cell population, or decreases sensitivity to a factor that increases the growth rate of a cell, that variant will be depleted upon exposure to that environmental factor. If the variant increases sensitivity to a factor that promotes growth of the cell, or if the variant decreases sensitivity to an environmental factor that decreases the growth rate of, or is toxic to, the cell, then that variant will be enriched in the resulting cell population. If, for example, the environmental factor affects morphological aspects of the cell (e.g., expression of differentiation factors or other markers, cell size, cell wall integrity, cellular adhesion, cell cycle arrest, or secretion of certain factors (e.g., inflammatory factors)), cells can be isolated on the basis of their morphology (e.g., by fluorescence-activated cell sorting) or expression patterns, and variants enriched in the isolated cell population can be subsequently identified.
(115) Once isolated, mutations giving rise to differing sensitivities or resistance to environmental agents can be identified, e.g., by sequence analysis or hybridization analysis. In performing sequence analysis, the entire variant nucleic acid can be sequenced (e.g., from a set of primers universal to the variant nucleic acid sequence), a portion of the variant can be sequenced (e.g., the portion corresponding to a nucleic acid tag), or the entire nucleic acid content of the cell can be sequenced (e.g., using next generation sequencing techniques).
EXPERIMENTAL RESULTS
(116) Sensitivity to Related Antibiotics
(117) To enable efficient generation of relatively unbiased, single amino acid substitution libraries, we developed a highly multiplexed approach to site-directed mutagenesis that we refer to as tiling mutagenesis (
(118) We applied the methods to perform mutational scanning of the Tn5 transposon-derived aminoglycoside-3′-phosphotransferase-II (APH(3′)II), a 264-residue kinase that confers resistance to a variety of aminoglycoside antibiotics (Nurizzo, D. et al. The Crystal Structure of Aminoglycoside-3′-Phosphotransferase-IIa, an Enzyme Responsible for Antibiotic Resistance. Journal of Molecular Biology 327, 491-506 (2003)), with the goal of elucidating which residues are essential for its activity and whether these residues are the same for different substrates. We first designed and synthesized six 200 nucleotide (nt) tiles that encoded all possible single amino acid substitutions across APH(3′)II. Each tile contained a 140 nt variable region flanked by 30 nt constant ends. To test whether the cost-efficiency of our method could be further improved by synthesizing multiple mutant libraries in parallel, we also included corresponding tiles for nine homologous proteins, resulting in two non-overlapping pools of 26,250 and 23,666 distinct oligonucleotide sequences. We found that all six APH(3′)II tiles could be selectively amplified from these pools with minimal optimization of PCR conditions. We then inserted these tiles into plasmids that carried the corresponding remainders of the APH(3′)II coding sequence. Each amplification and multiplexed cloning reaction was performed in duplicate to generate two mutant libraries.
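The tile layout described in this paragraph can be sketched as follows. This is a minimal illustration, not the design procedure actually used: it assumes that the variable windows are codon-aligned and contiguous, that each window holds as many whole codons as fit in 140 nt, and that every residue is replaced by each of the 19 alternative amino acids. The function name `plan_tiles` and its parameters are hypothetical.

```python
from math import ceil


def plan_tiles(n_codons, variable_len_nt=140, n_alt_residues=19):
    """Split a coding sequence into codon-aligned variable windows.

    Assumption (not stated in the source): windows are contiguous and each
    holds as many whole codons as fit into variable_len_nt nucleotides.
    """
    codons_per_tile = variable_len_nt // 3  # whole codons per variable window
    n_tiles = ceil(n_codons / codons_per_tile)
    tiles = []
    for i in range(n_tiles):
        first = i * codons_per_tile
        last = min(first + codons_per_tile, n_codons)  # half-open codon range
        tiles.append({
            "tile": i + 1,
            "codons": (first, last),
            # one oligonucleotide per single amino acid substitution
            "n_variants": (last - first) * n_alt_residues,
        })
    return tiles


# APH(3')II has 264 residues; each tile is 30 + 140 + 30 = 200 nt long.
tiles = plan_tiles(264)
total_variants = sum(t["n_variants"] for t in tiles)
```

Under these assumptions, a 264-codon sequence needs six tiles, which agrees with the six tiles reported for APH(3′)II.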
(119) To characterize the resulting libraries, we first shotgun-sequenced their APH(3′)II coding regions to a depth of ˜120,000× (
(120) To perform mutational scanning in vivo, we cultured E. coli transformed with APH(3′)II substitution libraries in liquid media supplemented with decreasing concentrations of one of six aminoglycoside antibiotics with diverse structures and a wide range of potencies: kanamycin, ribostamycin, G418, amikacin, neomycin or paromomycin (
(121) We began our analysis by examining changes in the relative abundance of mutant versus WT amino acids at each position after selection with kanamycin (
(122) To better understand the patterns of selection, we projected them onto a crystal structure of APH(3′)II in complex with kanamycin (Nurizzo, D. et al. The Crystal Structure of Aminoglycoside-3′-Phosphotransferase-IIa, an Enzyme Responsible for Antibiotic Resistance. Journal of Molecular Biology 327, 491-506 (2003)) (
(123) To compare the selection patterns induced by kanamycin in our experiments to those that have molded APH(3′)II over evolutionary timescales, we examined a conservation profile derived from alignment of 133 homologs (Goldenberg, O., Erez, E., Nimrod, G. & Ben-Tal, N. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures. Nucleic acids research 37, D323-7 (2009)). We found positive rank correlations between evolutionary conservation and depletion of mutant amino acids at most positions across the protein (from Spearman's ρ=0.51 at ˜1:1 WT MIC to ρ=0.74 at 1:8 WT MIC;
(124) We next examined the effects of selection using the five other aminoglycosides. These substrates generated concentration- and position-dependent selection patterns that were qualitatively similar to those generated by kanamycin (
(125) To identify such specificity-determining residues, we queried our data for individual substitutions that showed significant depletion after kanamycin selection at 1:2 WT MIC (at 5% FDR) but no trend towards depletion after selection with a second aminoglycoside at its highest concentration. These stringent criteria identified a handful of substitutions that appeared to be well-tolerated in the presence of ribostamycin, G418 or paromomycin but not kanamycin (11, 3 and 4 single substitutions, respectively;
(126) TABLE 1. Candidate specificity-modifying substitutions

Selection conditions | Substitutions specifically tolerated under selection with kanamycin | Substitutions specifically tolerated under selection with second aminoglycoside
G418 at 1:1 WT MIC, Kanamycin at 1:2 WT MIC | n/a | Glu35, Met157, Pro193
G418 at 1:2 WT MIC, Kanamycin at 1:1 WT MIC | None | n/a
G418 at 1:2 WT MIC, Kanamycin at 1:2 WT MIC | Glu159, Ile192, Ser227, Arg231, Lys231 | Glu28, Glu31, Phe55, Cys153, Gly154, Thr154, Gln157, Met157, Arg157, Pro160, Pro193, Thr198
Paromomycin at 1:1 WT MIC, Kanamycin at 1:2 WT MIC | n/a | Ser154, Met157, Ala157, Arg230
Paromomycin at 1:2 WT MIC, Kanamycin at 1:1 WT MIC | Lys31 | n/a
Paromomycin at 1:2 WT MIC, Kanamycin at 1:2 WT MIC | Lys31, Arg31, Glu159, Ile192, Ser227, Arg231, Phe231, Lys231, Trp231 | Glu31, Ser103, Ala103, Gln154, Gln157, Leu157, Met157, Gly157, Arg230, Asn230
Ribostamycin at 1:1 WT MIC, Kanamycin at 1:2 WT MIC | n/a | Arg29, Gly35, Arg46, Asn57, Lys96, Ser194, Gly194, Ala194, Arg230, Thr231, Gln231
Ribostamycin at 1:2 WT MIC, Kanamycin at 1:1 WT MIC | Ser35, His55 | n/a
Ribostamycin at 1:2 WT MIC, Kanamycin at 1:2 WT MIC | Asn27, Ser27, Tyr28, Leu28, Trp35, Gly39, Tyr55, Gln104, Ile192, Tyr255 | Leu27, Ser29, Arg29, Arg46, Asn57, Arg96, Ser101, Arg103, Gly194, Ala194, Arg230, Asn230, Ser231, Gln231, Thr231
Neomycin at 1:1 WT MIC, Kanamycin at 1:2 WT MIC | n/a | None
Neomycin at 1:2 WT MIC, Kanamycin at 1:1 WT MIC | His55 | n/a
Neomycin at 1:2 WT MIC, Kanamycin at 1:2 WT MIC | Tyr28, Arg31, Pro33, Trp35, Gln104, Glu159, Pro176, Pro178, Val179, Ile192, Arg231, Lys231, Phe231, Trp231 | Asp29, Ser29, Glu31, Trp37, Arg46, Asn56, Asn57, Lys96, Asp194, Gly194
Amikacin at 1:1 WT MIC, Kanamycin at 1:2 WT MIC | n/a | None
Amikacin at 1:2 WT MIC, Kanamycin at 1:1 WT MIC | Ile79, Ser127, Phe137, Arg137, Tyr137, Val141, Asn149, Ile149, Phe153, Phe158, Tyr158, Trp158, Tyr163, Phe163, Glu163, Met163, Leu167, Leu173, Asn177, Trp177, Ser177, Ala177, Asp177, Gln177, Leu177, Gly177, Thr177, Tyr177, Phe177, Tyr178, Phe178, Trp256 | n/a
Amikacin at 1:2 WT MIC, Kanamycin at 1:2 WT MIC | Phe137, Tyr137, Arg137, Val141, Leu149, Lys149, Asn149, Ile149, Asn151, His151, Phe153, Leu155, Phe158, Tyr158, Trp158, Tyr163, Glu163, Met163, His165, Glu165, Tyr165, Trp165, Phe165, Leu167, Leu168, Glu173, Gln173, Cys175, Pro176, Asn177, Gln177, Ser177, Ala177, Asp177, Tyr177, Leu177, Gly177, Trp177, Thr177, Phe177, Tyr178, Val178, Trp178, His178, Phe178, Gln178, Val179, Thr219, Cys225, Tyr230, Lys249, Tyr255, Trp256, Tyr256 | Glu35, Val145, Met145, Ile145, Leu145, Cys145, Tyr148, Lys156, Gln157, Glu157, Ile157, Tyr162, Leu162, Trp162, Met162, Ile162, Phe162, Asp168, Ile170, Glu171, Ile174, Arg255, Gln259, Phe259, Thr259, Ala259, Ser259, Trp259, Asp259, His259
(127) TABLE 2. Minimum inhibitory concentrations (MICs)

MICs were determined from the A.sub.600 of 2-3 independent cultures using 2-fold dilution series. The MICs estimated from all matched cultures were identical within the resolution of the assay, except where a range is shown. Note that these MICs are higher than those established for selection immediately following transformation (e.g., Supplementary FIG. 2) due to differences in recovery conditions.

Ribostamycin

APH(3′)II variant ID | Genotype | Expected phenotype | MIC of kanamycin (μg/mL) | MIC of ribostamycin (μg/mL) | Change in ribostamycin specificity
KKA2_KLEPN_opt_K1 | Wild-type | n/a | 2000.00 | 15000.00 | n/a
KKA2_KLEPN_opt_K13 | Arg230 | Favor ribo. | 250.00-500.00 | 15000.00 | +4- to +8-fold
KKA2_KLEPN_opt_K23 | Gly194 | Favor ribo. | 1000.00 | 15000.00 | +2-fold
KKA2_KLEPN_opt_K24 | Arg29, Gly35, Arg46, Asn57, Lys96, Gly194, Arg230, Thr231 | Favor ribo. | 15.63-31.25 | 3750.00 | +16- to +32-fold
KKA2_KLEPN_opt_K25 (Ribo+) | Leu27, Ser29, Arg46, Asn57, Arg96, Ser101, Arg103, Gly194, Arg230, Ser231 | Favor ribo. | 15.63 | 937.50 | +8-fold
KKA2_KLEPN_opt_K26 (Ribo−) | Asn27, Leu28, Trp35, Tyr55, Gln104, Ile192, Tyr255 | Favor kan. | 125.00 | 468.75 | −2-fold

G418

APH(3′)II variant ID | Genotype | Expected phenotype | MIC of kanamycin (μg/mL) | MIC of G418 (μg/mL) | Change in G418 specificity
KKA2_KLEPN_opt_K1 | Wild-type | n/a | 2000.00 | 50.00 | n/a
KKA2_KLEPN_opt_K2 | Glu35 | Favor G418 | 2000.00 | 50.00 | 0
KKA2_KLEPN_opt_K3 | Met157 | Favor G418 | 250.00-500.00 | 25.00-50.00 | +4-fold
KKA2_KLEPN_opt_K4 | Pro193 | Favor G418 | 500.00-1000 | 25.00-50 | +2- to +4-fold
KKA2_KLEPN_opt_K5 | Glu35, Met157 | Favor G418 | 1000.00 | 100.00 | +4-fold
KKA2_KLEPN_opt_K6 | Glu35, Pro193 | Favor G418 | 250.00-500.00 | 25.00 | +2- to +4-fold
KKA2_KLEPN_opt_K7 | Met157, Pro193 | Favor G418 | 500.00 | 100.00 | +8-fold
KKA2_KLEPN_opt_K8 | Glu35, Met157, Pro193 | Favor G418 | 62.5-125.00 | 25.00 | +8- to +16-fold
KKA2_KLEPN_opt_K9 (G418+) | Glu28, Glu31, Phe55, Cys153, Thr154, Arg156, Pro193, Thr198 | Favor G418 | 7.81 | 6.25 | +32-fold
KKA2_KLEPN_opt_K10 (G418−) | Glu159, Ile192, Ser227, Arg231 | Favor kan. | 1000.00-2000.00 | 3.13 | −8- to −16-fold

Amikacin

APH(3′)II variant ID | Genotype | Expected phenotype | MIC of kanamycin (μg/mL) | MIC of amikacin (μg/mL) | Change in amikacin specificity
KKA2_KLEPN_opt_K1 | Wild-type | n/a | 2000.00 | 25.00 | n/a
KKA2_KLEPN_opt_K32 | Ile145 | Favor ami. | 62.50 | 12.50 | +16-fold
KKA2_KLEPN_opt_K33 | Tyr148 | Favor ami. | 250.00 | 12.50-25.00 | +4- to +8-fold
KKA2_KLEPN_opt_K34 | Lys156 | Favor ami. | 1000.00 | 12.50-25.00 | 0 to +2-fold
KKA2_KLEPN_opt_K35 | Ile157 | Favor ami. | 250.00 | 12.50-25.00 | +2- to +4-fold
KKA2_KLEPN_opt_K36 | Phe162 | Favor ami. | 125.00 | 12.50-25.00 | +8- to +16-fold
KKA2_KLEPN_opt_K37 | Asp168 | Favor ami. | 1000.00 | 12.50-25.00 | 0 to +2-fold
KKA2_KLEPN_opt_K38 | Ile170 | Favor ami. | 500.00 | 12.50 | +2-fold
KKA2_KLEPN_opt_K39 | Glu171 | Favor ami. | 15.63-31.25 | 12.50-25.00 | +64-fold
KKA2_KLEPN_opt_K40 | Ile174 | Favor ami. | 500.00 | 25.00 | +2-fold
KKA2_KLEPN_opt_K41 | Arg255 | Favor ami. | 1000.00 | 12.50 | 0
KKA2_KLEPN_opt_K42 | Phe259 | Favor ami. | 62.50 | 25.00-50.00 | +32- to +64-fold
KKA2_KLEPN_opt_K43 (Ami+) | Glu35, Ile145, Tyr148, Lys156, Ile157, Phe162, Asp168, Ile170, Glu171, Ile174, Arg255, Phe259 | Favor ami. | 15.63-31.25 | 25.00-50.00 | +128-fold
KKA2_KLEPN_opt_K44 (Ami−) | Glu35, Phe137, Val141, Asn149, His151, Phe153, Leu155, Phe158, Phe163, Phe165, Leu167, Leu168, Gln173, Cys175, Pro176, Thr177, Phe177, Val178, Thr219, Cys225, Tyr230, Lys249, Tyr255, Trp256 | Favor kan. | <15.63 | 6.25-12.50 | (*)

Neomycin

APH(3′)II variant ID | Genotype | Expected phenotype | MIC of kanamycin (μg/mL) | MIC of neomycin (μg/mL) | Change in neomycin specificity
KKA2_KLEPN_opt_K1 | Wild-type | n/a | 2000.00 | 400.00 | n/a
KKA2_KLEPN_opt_K23 | Gly194 | Favor neo. | 1000.00 | 200.00 | 0
KKA2_KLEPN_opt_K27 | Asp194 | Favor neo. | 1000.00 | 200.00 | 0
KKA2_KLEPN_opt_K28 | Ser29, Glu31, Trp37, Arg46, Asn56, Asn57, Lys96, Asp194 | Favor neo. | 31.25 | 50.00-100.00 | +8- to +16-fold
KKA2_KLEPN_opt_K29 | Ser29, Glu31, Trp37, Arg46, Asn56, Lys96, Asp194 | Favor neo. | 31.25-62.50 | 100.00-200.00 | +16-fold
KKA2_KLEPN_opt_K30 (Neo+) | Ser29, Glu31, Trp37, Arg46, Asn57, Lys96, Asp194 | Favor neo. | 31.25 | 100.00 | +16-fold
KKA2_KLEPN_opt_K31 (Neo−) | Tyr28, Arg31, Pro33, Trp35, Gln104, Glu159, Pro176, Pro178, Val179, Ile192, Trp231 | Favor kan. | 62.50 | 6.25-12.50 | 0 to −2-fold

Paromomycin

APH(3′)II variant ID | Genotype | Expected phenotype | MIC of kanamycin (μg/mL) | MIC of paromomycin (μg/mL) | Change in paromomycin specificity
KKA2_KLEPN_opt_K1 | Wild-type | n/a | 2000.00 | 4000.00 | n/a
KKA2_KLEPN_opt_K3 | Met157 | Favor paro. | 500.00 | 2000.00-4000.00 | +2- to +4-fold
KKA2_KLEPN_opt_K11 | Ser154 | Favor paro. | 1000.00 | 2000.00-4000.00 | 0 to +2-fold
KKA2_KLEPN_opt_K12 | Ala157 | Favor paro. | 1000.00 | 2000.00-4000.00 | 0 to +2-fold
KKA2_KLEPN_opt_K13 | Arg230 | Favor paro. | 250.00-500.00 | 8000.00 | +8- to +16-fold
KKA2_KLEPN_opt_K15 | Ser154, Met157 | Favor paro. | 250.00-500.00 | 2000.00-4000.00 | +4-fold
KKA2_KLEPN_opt_K16 | Ser154, Arg230 | Favor paro. | 125.00 | 4000.00-8000.00 | +16- to +32-fold
KKA2_KLEPN_opt_K17 | Ala157, Arg230 | Favor paro. | 125.00 | 2000.00-4000.00 | +8- to +16-fold
KKA2_KLEPN_opt_K18 | Met156, Arg230 | Favor paro. | 125.00 | 2000.00-4000.00 | +8- to +16-fold
KKA2_KLEPN_opt_K19 | Ser154, Met157, Arg230 | Favor paro. | 31.25-62.50 | 2000.00-4000.00 | +16- to +32-fold
KKA2_KLEPN_opt_K20 | Ser154, Ala157, Arg230 | Favor paro. | 62.50 | 2000.00 | +16-fold
KKA2_KLEPN_opt_K21 (Paro+) | Glu31, Ala103, Gln154, Met157, Arg230 | Favor paro. | 31.25 | 2000.00-4000.00 | +16- to +32-fold
KKA2_KLEPN_opt_K22 (Paro−) | Lys31, Glu159, Ile192, Ser227, Lys231 | Favor kan. | 2000.00 | 62.50 | −64-fold

(*) 12.5 μg/mL was the lowest amikacin concentration that reliably inhibited growth of untransformed E. coli in this experiment. These estimated MICs therefore reflect total or near total loss of activity on both substrates.
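Table 2 does not state how the "change in specificity" column was computed. The sketch below shows one reading that reproduces several of the tabulated values: the variant's MIC for the second aminoglycoside relative to WT, divided by its kanamycin MIC relative to WT, reported as a signed fold change. The formula and the function name `specificity_fold_change` are assumptions inferred from the table, not a method described in the text. Note also that this sketch returns 1.0 where the table prints "0" for no change.

```python
def specificity_fold_change(wt_kan, wt_other, var_kan, var_other):
    """Signed fold change in preference for the second aminoglycoside.

    Assumed formula (inferred from Table 2, not stated in the source):
    (variant/WT MIC for the second drug) / (variant/WT MIC for kanamycin).
    Ratios below 1 are reported as negative fold changes.
    """
    ratio = (var_other / wt_other) / (var_kan / wt_kan)
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Values taken from Table 2 (MICs in ug/mL).
ribo_plus = specificity_fold_change(2000.0, 15000.0, 15.63, 937.50)   # K25
ribo_minus = specificity_fold_change(2000.0, 15000.0, 125.0, 468.75)  # K26
paro_minus = specificity_fold_change(2000.0, 4000.0, 2000.0, 62.50)   # K22
```

For these three rows the sketch gives approximately +8, −2 and −64, matching the table. Rows reported as ranges depend on how matched cultures were paired, which this sketch does not model.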
(128) To expand our search for specificity-determining residues, we next looked for substitutions that showed significant depletion after kanamycin selection at 1:2 WT MIC (at 5% FDR) but no trend towards depletion by the second aminoglycoside at its 1:2 WT MIC, or vice versa. These less stringent criteria identified additional candidates in every pairwise comparison (
(129) In summary, tiling mutagenesis coupled with deep sequencing allowed us to perform quantitative mutational scanning of the complete APH(3′)II protein coding sequence with sufficient accuracy to identify individual kinase activity- and substrate specificity-determining residues. We emphasize that structural information was not required or directly utilized to identify these residues. Our approach is therefore applicable even when accurate structural models are not available. Assuming that indirect selection strategies can be devised, it can also be extended to proteins and activities that do not confer a direct growth advantage. Notably, we have recently used a similar approach to identify activity- and specificity-determining nucleotides in mammalian gene regulatory elements (Melnikov, A. et al. Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nature biotechnology 30, 271-7 (2012)). In both cases, we found that combining mutational scanning data from two separate conditions is an effective method for identifying mutations that change the ratio of the activities in the two conditions in the desired direction.
(130) Methods
(131) Plasmid Construction and Cloning
(132) Primers and individual oligonucleotides were synthesized by Integrated DNA Technologies (Coralville, Iowa), the two 200 mer oligonucleotide pools containing mutagenesis tiles were synthesized by Agilent (Santa Clara, Calif.) as previously described (LeProust, E. M. et al. Synthesis of high-quality libraries of long (150mer) oligonucleotides by a novel depurination controlled process. Nucleic acids research 38, 2522-40 (2010)), the WT APH(3′)II coding region template for library construction was synthesized by GenScript USA (Piscataway, N.J.), and the single- and multi-substitution synthetic ORF variants for validation of specificity-determining amino acid residues were synthesized by Gen9 (Cambridge, Mass.).
(133) To generate a plasmid vector carrying a constitutive EM7 promoter (Life Technologies) and a synthetic gene encoding APH(3′)II (neo; UniProtKB entry name: KKA2_KLEPN) flanked by two SacI restriction sites, we first combined a DNA fragment containing EM7 (template: EM7_pBR322; primers: EM7_Amp_F/R) with a PCR-linearized pBR322 backbone (template: pBR322; primers: pBR322_EM7_F/R) using the In-Fusion PCR Cloning System (Clontech Laboratories, Mountain View, Calif.). The resulting plasmid (pBR322[EM7]) was then re-linearized by inverse PCR (primers: pBR322_tetRLin_F/R) and combined with a PCR amplified neo ORF fragment (template: KKA2_KLEPN_opt; primers: KKA2_pBR322_F/R) by In-Fusion to replace the tetR ORF of pBR322 with neo. The final product (pBR322[EM7-neo]) was isolated using QIAprep Spin Miniprep kits (Qiagen, Gaithersburg, Md.) and verified by Sanger sequencing (primers: pBR322_Seq_F/R).
(134) To generate the APH(3′)II single-substitution libraries, full-length oligonucleotides were first isolated from the synthesized pools using 10% TBE-Urea polyacrylamide gels (Life Technologies, Carlsbad, Calif.). Each pair of APH(3′)II tiles and corresponding linearized pBR322[EM7-neo] plasmids was PCR amplified by Herculase II DNA Polymerase (Agilent) with their respective primers (primer prefix: KKA2_TileAmp for tiles, KKA2_LinAmp for backbones), size selected on 1% E-Gel EX agarose gels (Life Technologies), purified with MinElute Gel Extraction kits (Qiagen) and combined using In-Fusion reactions. Each reaction was separately transformed into Stellar chemically competent cells (Clontech), the transformants were grown in LB media with carbenicillin (50 μg/mL) and plasmid DNA libraries were isolated using QIAprep Spin Miniprep kits (Qiagen). Finally, complete substitution libraries were generated by pooling equimolar amounts of the resulting six single-tile plasmid libraries.
(135) To generate plasmids encoding selected single- and multiple-substitution APH(3′)II variants, we amplified the Gen9 synthetic ORFs using Herculase II DNA Polymerase (primers: KKA2_pBR322_F/R) and inserted them into a PCR-linearized backbone (primers: pBR322_tetLin_F/R, template: pBR322[EM7]) using In-Fusion reactions. The resulting plasmids (pBR322[EM7-K1] through pBR322[EM7-K44]) were verified by Sanger sequencing and preserved in E. coli as glycerol stock.
(136) Cell Culture and Selection
(137) To determine the minimum inhibitory concentration (MIC) of each of the six aminoglycosides (kanamycin, ribostamycin, G418, amikacin, neomycin and paromomycin; Sigma-Aldrich, St. Louis, Mo.) in E. coli expressing WT APH(3′)II, Stellar cells were transformed with pBR322[EM7-neo], recovered in SOC medium (New England Biolabs (NEB), Ipswich, Mass.) at 37° C. for 1 hr, diluted 1:100 with LB and then divided into 96-well growth blocks containing LB with carbenicillin (50 μg/mL) and 2-fold serial dilutions of aminoglycosides. After growth at 37° C. with shaking for 24 hrs, the culture densities were assessed by absorbance at 600 nm (A.sub.600) using a NanoDrop 8000 (Thermo Scientific, Billerica, Mass.). The MIC for each aminoglycoside was estimated as the lowest concentration in the dilution series at which A.sub.600 was less than 0.025.
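The MIC read-out described in this paragraph can be illustrated with a short sketch. It assumes that the MIC is taken as the lowest aminoglycoside concentration in the 2-fold series whose culture had an A.sub.600 below 0.025. The function names and the absorbance readings are hypothetical and serve only as an example.

```python
def two_fold_series(top_concentration, n_steps):
    """Concentrations of a 2-fold serial dilution, highest first."""
    return [top_concentration / (2 ** i) for i in range(n_steps)]


def estimate_mic(a600_by_concentration, threshold=0.025):
    """Lowest concentration whose culture density is below the threshold.

    Returns None when no tested concentration inhibited growth.
    """
    inhibited = [conc for conc, a600 in a600_by_concentration.items()
                 if a600 < threshold]
    return min(inhibited) if inhibited else None


# Hypothetical A600 readings for a four-step series starting at 2000 ug/mL.
concentrations = two_fold_series(2000.0, 4)
readings = dict(zip(concentrations, [0.010, 0.020, 0.310, 0.900]))
mic = estimate_mic(readings)
```

With these illustrative readings, growth is inhibited at 2000 and 1000 μg/mL but not at 500 μg/mL, so the estimate is 1000 μg/mL.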
(138) To perform mutational scanning, Stellar cells were transformed with 10 ng (˜3 fmol) of mutant library plasmids, recovered in SOC medium at 37° C. for 1 hour, diluted into 15 mL LB with carbenicillin (50 μg/mL) and one of the aminoglycosides at 1:1, 1:2, 1:4 or 1:8 dilutions of the following concentrations (estimated 1:1 WT MICs): 225 μg/mL for kanamycin, 2500 μg/mL for ribostamycin, 5 μg/mL for G418, 10 μg/mL for amikacin, 40 μg/mL for neomycin and 320 μg/mL for paromomycin. The cultures were incubated in 50 mL tubes at 37° C. with shaking for 24 hours, pelleted by centrifugation and frozen at −20° C. Plasmids were isolated from the pellets using QIAprep Spin Miniprep kits (Qiagen). Each transformation and selection was performed in duplicate, using each of the two independently generated mutant libraries.
(139) To establish the substrate specificity of the synthetic APH(3′)II variants relative to WT APH(3′)II, glycerol stocks of the relevant clones (pBR322[EM7-K1] through pBR322[EM7-K44]) were streaked onto LB agar plates and cultured overnight at 37° C. Single colonies were then inoculated into 1 mL LB with 50 μg carbenicillin, incubated at 37° C. with shaking for 6 hours, diluted 1:100 into LB media, and then split into 96-well deep well plates containing LB with carbenicillin (50 μg/mL) and 2-fold serial dilutions of aminoglycosides. We note that the longer recovery in these follow-up experiments increased the absolute MICs compared to those we measured immediately following transformation.
(140) Mut-Seq
(141) To sequence and count mutations, the APH(3′)II coding regions were first isolated from the plasmid pools by SacI digest (NEB) followed by agarose gel purification. The SacI fragments were ligated into high molecular weight concatemers using T4 DNA ligase (NEB). The concatemers were fragmented and converted to sequencing libraries using Nextera DNA Sample Prep kits (Illumina, San Diego, Calif.). Library fragments from the 200-800 nt size range were selected using agarose gels and then sequenced on Illumina MiSeq instruments using 2×150 nt reads. The reads were subsequently aligned to a reference sequence consisting of concatenated copies of the WT APH(3′)II coding region using BWA version 0.5.9-r16 with default parameters (Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754-60 (2009)). The number of occurrences of each amino acid at each position along the coding region was then counted and tabulated from the aligned reads. Only codons for which all three sequenced and aligned nucleotides had phred quality scores ≥30 were included in these counts.
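The counting step at the end of this paragraph can be sketched as below. The sketch assumes that alignment has already produced, for each read, a list of (codon position, codon, three phred scores) observations. That input format and the function name `count_amino_acids` are hypothetical; the source describes only the quality rule and the per-position tabulation.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_amino_acids(codon_observations, min_phred=30):
    """Tabulate amino acids per codon position from aligned codon calls.

    A codon is counted only if all three bases have phred >= min_phred.
    Codons containing ambiguous bases (e.g. N) are skipped.
    """
    counts = defaultdict(Counter)
    for position, codon, quals in codon_observations:
        if len(codon) != 3 or len(quals) != 3 or min(quals) < min_phred:
            continue
        amino_acid = CODON_TABLE.get(codon.upper())
        if amino_acid is not None:
            counts[position][amino_acid] += 1
    return counts


# Hypothetical observations: (codon position, codon, per-base phred scores).
observations = [
    (0, "ATG", (35, 36, 40)),
    (0, "ATG", (35, 20, 40)),  # dropped: one base below Q30
    (1, "GAA", (30, 30, 30)),
    (1, "AAA", (31, 32, 33)),
    (1, "NNN", (40, 40, 40)),  # dropped: ambiguous codon
]
counts = count_amino_acids(observations)
```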
(142) Computational Analysis
(143) Data processing and statistical analysis, including computation of Pearson's and Spearman's correlation coefficients, χ.sup.2-statistics and associated p-values, were performed using the Enthought Python Distribution (www.enthought.com), with IPython (Pérez, F. & Granger, B. E. IPython: A System for Interactive Scientific Computing. Computing in Science & Engineering 9, 21-29 (2007)), NumPy version 1.6.1, SciPy version 0.10.1 and matplotlib version 1.1.0 (Hunter, J. D. Matplotlib: A 2D graphics environment. Computing in Science & Engineering 9, 90-95 (2007)). Rendering of the APH(3′)II crystal structure (PDB accession 1ND4) was performed using PyMOL version 1.5 (Schrödinger). The rendering of its secondary structure was derived from PDBsum at EMBL-EBI (www.ebi.ac.uk/pdbsum/). The evolutionary conservation profile for APH(3′)II was obtained from ConSurf-DB (Goldenberg, O., Erez, E., Nimrod, G. & Ben-Tal, N. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures. Nucleic acids research 37, D323-7 (2009)).
(144) The magnitude of changes in the abundance of mutant amino acids at each position after selection was estimated as
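$$\Delta\mathrm{Mut} = \frac{\mathrm{Mut}_{\mathrm{selected}} / \mathrm{WT}_{\mathrm{selected}}}{\mathrm{Mut}_{\mathrm{input}} / \mathrm{WT}_{\mathrm{input}}}$$
where $\mathrm{Mut}_{\mathrm{input}}$ and $\mathrm{WT}_{\mathrm{input}}$ are, respectively, the observed counts of mutant and wild-type amino acids at that position in the input library, and $\mathrm{Mut}_{\mathrm{selected}}$ and $\mathrm{WT}_{\mathrm{selected}}$ are the corresponding observed counts after selection.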
(145) The magnitude of changes in the abundance of each specific amino acid at each position after selection was estimated as
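$$\Delta\mathrm{AA} = \frac{\mathrm{AA}_{\mathrm{selected}} / \overline{\mathrm{AA}}_{\mathrm{selected}}}{\mathrm{AA}_{\mathrm{input}} / \overline{\mathrm{AA}}_{\mathrm{input}}}$$
where $\mathrm{AA}_{\mathrm{input}}$ and $\overline{\mathrm{AA}}_{\mathrm{input}}$ are, respectively, the observed counts of that amino acid and of all other amino acids at that position in the input library, and $\mathrm{AA}_{\mathrm{selected}}$ and $\overline{\mathrm{AA}}_{\mathrm{selected}}$ are the corresponding observed counts after selection.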
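Both ΔMut and ΔAA as defined above are odds ratios computed from per-position amino acid counts, and can be sketched as follows. The function names and the example counts are hypothetical, and the sketch assumes that no count used as a divisor is zero.

```python
def odds_ratio(target_input, rest_input, target_selected, rest_selected):
    """(target/rest after selection) divided by (target/rest in the input)."""
    return (target_selected / rest_selected) / (target_input / rest_input)


def delta_mut(input_counts, selected_counts, wt_aa):
    """Change in abundance of all mutant residues relative to wild type."""
    mut_in = sum(n for aa, n in input_counts.items() if aa != wt_aa)
    mut_sel = sum(n for aa, n in selected_counts.items() if aa != wt_aa)
    return odds_ratio(mut_in, input_counts[wt_aa], mut_sel, selected_counts[wt_aa])


def delta_aa(input_counts, selected_counts, aa):
    """Change in abundance of one residue relative to all other residues."""
    other_in = sum(n for a, n in input_counts.items() if a != aa)
    other_sel = sum(n for a, n in selected_counts.items() if a != aa)
    return odds_ratio(input_counts[aa], other_in, selected_counts[aa], other_sel)


# Hypothetical counts at one position whose wild-type residue is Lys (K).
input_counts = {"K": 9000, "E": 500, "R": 500}
selected_counts = {"K": 9800, "E": 50, "R": 150}
d_mut = delta_mut(input_counts, selected_counts, "K")
d_glu = delta_aa(input_counts, selected_counts, "E")
```

In this illustrative example both values fall well below 1.0, which corresponds to depletion of the mutant residues under selection.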
(146) The statistical significance of a deviation of any ΔMut or ΔAA from 1.0 was estimated using a χ.sup.2-test for independence on a 2×2 contingency table that contained the four corresponding counts with a pseudocount of 1 added to each. To correct for multiple hypothesis testing, the Benjamini-Hochberg procedure was applied to identify the 5% false discovery rate (FDR) threshold (Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B. 57, 289-300 (1995)). We note that ΔMut or ΔAA values greater than 1.0 do not necessarily indicate higher fitness relative to the WT APH(3′)II (i.e., positive selection), because the majority of ORFs that contributed to the count of WT residues at one position still carried substitutions at other positions.
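The test and correction described in this paragraph can be sketched using only the Python standard library. The sketch assumes a Pearson χ.sup.2-statistic without continuity correction, since the source does not say whether a correction was applied, and it uses the closed-form p-value for one degree of freedom. The function names are hypothetical.

```python
from math import erfc, sqrt


def chi2_test_2x2(a, b, c, d, pseudocount=1):
    """Pearson chi-square test of independence on the table [[a, b], [c, d]].

    A pseudocount is added to every cell. No continuity correction is
    applied (an assumption). For 1 degree of freedom the p-value equals
    erfc(sqrt(chi2 / 2)).
    """
    a, b, c, d = (x + pseudocount for x in (a, b, c, d))
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, erfc(sqrt(chi2 / 2.0))


def benjamini_hochberg(p_values, fdr=0.05):
    """Return one True/False flag per p-value under the BH step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= fdr * rank / m:
            cutoff_rank = rank  # largest rank that passes its threshold
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        significant[i] = rank <= cutoff_rank
    return significant


# Rows: input library, selected library. Columns: mutant count, WT count.
chi2, p_value = chi2_test_2x2(1000, 9000, 200, 9800)
flags = benjamini_hochberg([0.001, 0.04, 0.03, 0.5])
```

With SciPy, `scipy.stats.chi2_contingency` offers a comparable test, but it applies Yates' continuity correction to 2×2 tables by default, so its p-values can differ slightly from this sketch.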
(147) Substitutions that favored growth under selection with one aminoglycoside relative to another were identified by requiring
(148) ΔAA<1.0 at 5% FDR across two replicates under selection with the first and ΔAA≥1.0 across two replicates for the same substitution under selection with the other, as well as a minimum difference of 0.5 between the log.sub.10-transformed ΔAA values, at the concentrations indicated in the main text. These thresholds were established empirically to select a limited number of high confidence candidates.
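The selection criteria in this paragraph can be expressed as a simple filter. The source does not say how the log.sub.10 difference was combined across replicates; the sketch below compares the replicate means of the log.sub.10-transformed ΔAA values, which is an assumption. The function name is hypothetical, and all ΔAA values are assumed to be greater than zero.

```python
from math import log10


def favors_second_aminoglycoside(first_reps, second_reps, min_log10_gap=0.5):
    """Check whether a substitution is depleted under the first drug only.

    first_reps:  list of (delta_aa, significant_at_5pct_fdr) per replicate
                 under selection with the first aminoglycoside.
    second_reps: list of delta_aa values per replicate under selection
                 with the second aminoglycoside.
    """
    depleted_first = all(d < 1.0 and sig for d, sig in first_reps)
    tolerated_second = all(d >= 1.0 for d in second_reps)
    if not (depleted_first and tolerated_second):
        return False
    # Assumption: the gap is measured between replicate means of log10 values.
    mean_first = sum(log10(d) for d, _ in first_reps) / len(first_reps)
    mean_second = sum(log10(d) for d in second_reps) / len(second_reps)
    return mean_second - mean_first >= min_log10_gap


# Hypothetical ΔAA values for three substitutions, two replicates each.
strong = favors_second_aminoglycoside([(0.1, True), (0.2, True)], [1.2, 1.0])
weak = favors_second_aminoglycoside([(0.5, True), (0.6, True)], [1.0, 1.05])
unreplicated = favors_second_aminoglycoside([(0.1, True), (0.2, False)], [1.2, 1.0])
```

Only the first example passes: the second fails the 0.5 log.sub.10 gap, and the third is not significant in both replicates.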
Variants Implicated in Rosiglitazone-Dependent Adipocyte Differentiation
(149) We also designed an experiment to determine which amino acid residues in PPARγ are implicated in rosiglitazone-dependent adipocyte differentiation. Using a method similar to that described above, we synthesized a mutant library containing variants of the PPARγ gene. This library was cloned into inducible lentiviral vectors. We transduced Simpson-Golabi-Behmel Syndrome (SGBS) pre-adipocytes at a multiplicity of infection of approximately 0.3. Transduced cells were then selected and expanded in the presence of puromycin, followed by the induction of differentiation in the presence of doxycycline (to induce expression of PPARγ from the lentiviral vector) and the PPARγ agonist rosiglitazone. Differentiated cells were selected by sorting on CD36 expression. The DNA content of the selected cell populations was sequenced, and PPARγ variants enriched or depleted in CD36+ cells were identified (
Other Embodiments
(150) From the foregoing description, it will be apparent that variations and modifications may be made to the invention described herein to adapt it to various usages and conditions. Such embodiments are also within the scope of the following claims.
(151) All publications, patent applications, and patents mentioned in this specification are herein incorporated by reference to the same extent as if each independent publication, patent application, or patent was specifically and individually indicated to be incorporated by reference.
(152) From the foregoing description, one skilled in the art can easily ascertain the essential characteristics of this invention, and can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, other embodiments are also within the claims.