Combinatorial Assembly of Composite Arrays of Site-Specific Synthetic Transposons Inserted Into Sequences Comprising Novel Target Sites in Modular Prokaryotic and Eukaryotic Vectors
20220081692 · 2022-03-17
Inventors
Cpc classification
C12N2710/14041
CHEMISTRY; METALLURGY
C12N15/1082
CHEMISTRY; METALLURGY
C12N15/74
CHEMISTRY; METALLURGY
C12N15/11
CHEMISTRY; METALLURGY
C12Y203/01028
CHEMISTRY; METALLURGY
C12N2710/14143
CHEMISTRY; METALLURGY
C12N15/66
CHEMISTRY; METALLURGY
C12N15/1086
CHEMISTRY; METALLURGY
C12N15/70
CHEMISTRY; METALLURGY
C12N2710/14043
CHEMISTRY; METALLURGY
C12N9/1033
CHEMISTRY; METALLURGY
International classification
C12N15/70
CHEMISTRY; METALLURGY
C12N15/86
CHEMISTRY; METALLURGY
Abstract
The design, assembly, and use of novel sequences comprising targeting and insertion sites for site-specific bacterial transposons are disclosed. One aspect relates to a nucleotide sequence comprising an attachment site for a site-specific transposon operably-linked to a screenable or selectable marker sequence, wherein said marker sequence encodes one or more active or inactive polypeptides capable of conferring a screenable or selectable phenotype upon a cell comprising the marker sequence, wherein insertion of the site-specific transposon into the attachment site changes the phenotype of a cell comprising the screenable or selectable marker sequence. High and low copy number vectors comprising the sequences, designated synthemids, including plasmids capable of propagating in bacteria, and shuttle vectors, capable of propagating in bacteria and a eukaryotic host cell or two types of bacteria by means of distinct replicons, are also disclosed. Related aspects include the design and assembly of synthetic insect and mammalian virus shuttle vectors, including shuttle vectors comprising segments of a double-stranded DNA virus, such as a baculovirus, which propagates in insect cells, or a herpesvirus, an adenovirus, or a pox virus, which propagate in mammalian cells. Other aspects relate to use of modified vectors to express polypeptides for use as therapeutic drug products, as vaccines, or as components of cell or gene therapy vector systems, and in model and crop plant cells, tissues, and whole plants to facilitate the basic and applied studies leading to improved food products, and as tools advancing the interests of institutions involved in industrial and environmental biotechnology.
Claims
1. A nucleotide sequence comprising a target site for a site-specific transposon, wherein said target site comprises a target sequence comprising a transcriptionally or translationally fused marker sequence encoding a selectable marker sequence or a screenable marker sequence operably-linked to a sequence comprising a specific target sequence for recognition and insertion of a site-specific transposon, wherein said fused marker sequence encodes an inactive or an active polypeptide capable of conferring a selectable or screenable phenotype upon a cell comprising the fused marker sequence, wherein insertion of the site-specific transposon into the target sequence to create a composite target sequence changes the phenotype of a cell comprising the composite screenable or selectable marker sequence compared to a cell comprising just the selectable or screenable marker sequence.
2. The nucleotide sequence of claim 1, wherein said target site comprises a target sequence for a site-specific transposon comprising a translationally-fused selectable marker sequence or a screenable marker sequence operably-linked to a sequence comprising a specific target sequence for recognition and insertion of a site-specific transposon, wherein said fused marker sequence encodes an inactive or an active polypeptide capable of conferring a selectable or screenable phenotype upon a cell comprising the fused marker sequence, wherein insertion of the site-specific transposon into the target sequence to create a composite target sequence changes the phenotype of a cell comprising the composite screenable or selectable marker sequence compared to a cell comprising just the selectable or screenable marker sequence.
3. The nucleotide sequence of claim 2, wherein said sequence comprises a target site for a site-specific transposon comprising a translationally-fused selectable marker sequence operably-linked to a sequence comprising a specific target sequence for recognition and insertion of a site-specific transposon, wherein said fused marker sequence encodes an inactive polypeptide capable of conferring a selectable phenotype upon a cell comprising the fused marker sequence, wherein insertion of the site-specific transposon into the target sequence to create a composite target sequence changes the phenotype of a cell comprising the composite selectable marker sequence compared to a cell comprising just the selectable marker sequence.
4. The sequence of claim 3, wherein said fused marker sequence encodes a truncated or extended inactive polypeptide which is extended or truncated, respectively, after transposition to form a composite target sequence which encodes an active polypeptide conferring a selectable phenotype upon the cell.
5. The nucleotide sequence of claim 3, wherein said fused marker sequence encodes a truncated, inactive polypeptide which is extended after transposition to form a composite target sequence which encodes an active polypeptide conferring a selectable phenotype upon the cell.
6. The nucleotide sequence of claim 5, wherein the selectable marker sequence encodes an inactive bacterial chloramphenicol acetyl transferase (CAT) fusion protein.
7. The nucleotide sequence of claim 6, wherein the sequence encoding the inactive bacterial chloramphenicol acetyl transferase (CAT) fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive bacterial chloramphenicol acetyl transferase (CAT) polypeptide; (ii) a sequence comprising one or more stop codons; (iii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; and (iv) a sequence comprising one or more in frame stop codons.
8. The nucleotide sequence of claim 5, wherein the composite selectable marker sequence encodes an active bacterial chloramphenicol acetyl transferase (CAT) fusion protein.
9. The nucleotide sequence of claim 8, wherein the sequence encoding the active bacterial chloramphenicol acetyl transferase (CAT) fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive bacterial chloramphenicol acetyl transferase (CAT) polypeptide domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the addition of polypeptides encoded by (ii) and (iii) to the inactive CAT polypeptide domain restores CAT activity to the fusion protein.
10. The nucleotide sequence of claim 5, wherein said fused marker sequence encodes an extended, inactive polypeptide which is truncated after transposition to form a composite target sequence which encodes an active polypeptide conferring a selectable phenotype upon the cell.
11. The nucleotide sequence of claim 10, wherein the selectable marker sequence encodes an inactive NPT-II fusion protein.
12. The nucleotide sequence of claim 11, wherein the sequence encoding the inactive NPT-II fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive NPT-II polypeptide; (ii) a sequence comprising one or more stop codons; (iii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; and (iv) a sequence comprising one or more in frame stop codons.
13. The nucleotide sequence of claim 10, wherein the composite selectable marker sequence encodes an active NPT-II fusion protein.
14. The nucleotide sequence of claim 13, wherein the sequence encoding the active NPT-II fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive NPT-II polypeptide domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the removal of amino acids encoded by (ii) and (iii) from the inactive NPT-II polypeptide domain restores NPT-II activity to the fusion protein.
15. The nucleotide sequence of claim 13, wherein the sequence encoding the active NPT-II fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive NPT-II polypeptide domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the addition of amino acids encoded by (ii) and (iii) to the inactive NPT-II polypeptide domain restores NPT-II activity to the fusion protein.
16. A vector designated as a synthemid comprising the target sequence or composite target sequence of claim 1.
17. The vector of claim 16, wherein said vector propagates in bacteria.
18. The vector of claim 17, wherein said vector is a shuttle vector capable of propagating in bacteria and a non-bacterial host cell.
19. The vector of claim 18, wherein said vector is a baculovirus shuttle vector, capable of propagating in bacteria and in Lepidopteran insect cells susceptible to infection by the baculovirus.
20. The vector of claim 19, wherein said baculovirus shuttle vector is capable of propagating in Escherichia coli and insect cells selected from the group consisting of Spodoptera frugiperda, Trichoplusia ni cells, and Bombyx mori cells.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
Statement Concerning Drawings Executed in Color
[0088] This patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Patent Office upon request and payment of the necessary fee.
Statement Concerning Aspects of the Invention Understood by Reference to the Drawings
[0089] The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
ABBREVIATIONS, TERMS AND THEIR DEFINITIONS
[0106] The following is a list of abbreviations, plus terms and their definitions, used throughout the text of the specification, the figures, the sequence listing, supplementary data tables (if any), and the claims:
TABLE 4. List of Abbreviations
A = adenosine; A = absorbance (1 cm); aa or AA = amino acid; Ab = antibody(ies); AcNPV = Autographa californica Nuclear Polyhedrosis Virus, a member of the Baculoviridae family of insect viruses; Amp, Ap = ampicillin; ATP = adenosine triphosphate; attTn7 = attachment site for Tn7 (a preferential site for Tn7 insertion into bacterial chromosomes); βGal, β-Gal = β-galactosidase; b = E. coli-derived bacmid; bc = E. coli-derived composite bacmid; bch = mixture of E. coli-derived composite bacmid and helper plasmid; bla = beta lactamase gene conferring resistance to beta-lactam antibiotics, particularly ampicillin; Bluo-gal = halogenated indolyl-β-D-galactoside; BmNPV = Bombyx mori nuclear polyhedrosis virus; bp, Bp = base pair(s); BSA = bovine serum albumin; C = cytidine; Cam or CM = chloramphenicol; cAMP = cyclic adenosine 3′,5′-monophosphate; CAT = chloramphenicol acetyltransferase; cat = gene encoding CAT; CBB = Coomassie Brilliant Blue; ccc = covalently closed circular; cDNA = DNA complementary to RNA; CHO = Chinese hamster ovary; CIAP = calf intestinal alkaline phosphatase; Cm = chloramphenicol; CMP = cytidine monophosphate; cp = chloroplast; cpm = counts per minute; CTP = cytidine triphosphate; Δ = deletion; d = deoxyribo; dd = dideoxyribo; DMF = N,N-dimethylformamide; DMSO = dimethylsulfoxide; DNase = deoxyribonuclease; dNTP = deoxyribonucleoside triphosphate; ds = double strand(ed); DTT = dithiothreitol; EF = elongation factor; ELISA = enzyme-linked immunosorbent assay; Er = erythromycin; EST = expressed sequence tag; EtBr, EtdBr = ethidium bromide; FITC = fluorescein isothiocyanate; g = gram(s); G = guanosine; G418 = Geneticin; Gen or Gent = gentamicin; GLC-MS = gas-liquid chromatography-mass spectrometry; Gm = gentamicin; HPLC = high performance liquid chromatography; Hy = hygromycin; IF = initiation factor; Ig = immunoglobulin(s); IL = interleukin; IPTG = isopropyl β-D-thiogalactopyranoside; IS = insertion sequence(s); Kan = kanamycin; kb or kbp = kilobase(s) = 1000 bp(s); kDa = kilodalton(s); Km = kanamycin; lacZpo = lac promoter-operator; LB = Luria-Bertani (medium); LTR = long terminal repeat(s); MAb, mAb = monoclonal Ab; Mb = megabase(s); MCS = multiple cloning site(s); Me = methyl; mg = milligram(s); ml or mL = milliliter(s); mm = millimeter(s); mM = millimolar; moi, MOI = multiplicity of infection; Mr = relative molecular mass (dimensionless); N = any nucleoside; NAD/NADH = nicotinamide-adenine dinucleotide, and its reduced form; Nm = neomycin; nmol = nanomole(s); NMR = nuclear magnetic resonance; NPT-II = neomycin phosphotransferase gene or protein derived from Tn5 conferring resistance to kanamycin and neomycin and related antibiotics; NPV = nuclear polyhedrosis virus; nt = nucleotide(s); o, O = operator; oligo = oligodeoxyribonucleotide; ONPG = o-nitrophenyl β-D-galactopyranoside; ORF = open reading frame; ori = origin(s) of DNA replication; p = plasmid; p, P = promoter; PA = polyacrylamide; PAGE = PA-gel electrophoresis; PCR = polymerase chain reaction, a gene amplification procedure; PEG = poly(ethylene glycol); PEP = phosphoenolpyruvate; pfu = plaque-forming unit(s); Pi = inorganic phosphate; pmol = picomole(s); PMSF = phenylmethylsulfonyl fluoride; Pol k = Klenow (large) fragment of E. coli DNA polymerase I; PPi = inorganic pyrophosphate; ppm = parts per million; PPO = 2,5-diphenyloxazole; R = (superscript) resistance/resistant; R = purine (or restriction); r or R or superscripted r or R = resistant or resistance; RBS = ribosome-binding site(s); rDNA = DNA coding for rRNA; RFLP = restriction-fragment length polymorphism; Rif = rifampicin; RNase = ribonuclease; RP-HPLC = reverse phase high performance liquid chromatography; rRNA = ribosomal RNA; RT = reverse transcriptase; RT = room temperature; RT-PCR = reverse transcriptase polymerase chain reaction; s or S = (superscript) sensitivity/sensitive; S = sedimentation constant; SAM = S-adenosylmethionine; SD = Shine-Dalgarno (sequence); SDS = sodium dodecyl sulfate; SDS-PAGE = sodium dodecyl sulfate-polyacrylamide gel electrophoresis; Sf = Spodoptera frugiperda; Sf9 = Spodoptera frugiperda (Sf9) cells/cell line; Sf21 = Spodoptera frugiperda (IPLB Sf21) cells/cell line; SIDNO or SID# = SEQ ID NO; Sm = streptomycin; Spc/Str = spectinomycin/streptomycin; ss = single strand(ed); SSC = 0.15 M NaCl/0.015 M Na3·citrate pH 7.6; T = thymidine; t, T = terminator of transcription; Tc, TC = tetracycline; tet = gene conferring resistance to tetracycline and related antibiotics; TK = thymidine kinase; Tn = transposon or transposable element; Tni, T. ni = Trichoplusia ni cells/cell line; Tni368 = Trichoplusia ni (Tni368) cells/cell line; tns = transposition genes; ts = temperature-sensitive; tsp = transcription start point(s); U, u = unit(s); U = uridine; ug or μg = microgram(s); ul or μl = microliter(s); URF = unidentified open reading frame; UTR = untranslated region(s); UV = ultraviolet; v = insect cell-derived baculovirus; vc = insect cell-derived composite baculovirus; vch = mixture of insect cell-derived composite baculovirus and helper plasmid; wt = wild type; Xgal, X-gal = 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside; Xgluc, X-gluc = 5-bromo-3-chloro-indolyl-β-D-glucopyranoside; Y = pyrimidine; ( ) = denotes prophage (lysogenic) state; [ ] = denotes plasmid-carrier state; “::” = novel junction (fusion or insertion, transposon insertion); ′ (prime) = denotes a truncated gene at the indicated side.
Nucleotide symbol combinations: Pairs: K = G/T; M = A/C; R = A/G; S = C/G; W = A/T; Y = C/T; Triples: B = C/G/T; D = A/G/T; H = A/C/T; V = A/C/G; N = A/C/G/T.
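As a non-limiting illustration, the nucleotide symbol combinations listed at the end of the abbreviations can be applied in software to test whether a concrete sequence matches an ambiguity-coded pattern. The following Python sketch is an assumption of this illustration only: the function names `iupac_to_regex` and `matches` are hypothetical and do not appear elsewhere in the specification.

```python
import re

# Ambiguity codes as given in the "Nucleotide symbol combinations" entry.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT", "M": "AC", "R": "AG", "S": "CG", "W": "AT", "Y": "CT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def iupac_to_regex(pattern: str) -> str:
    """Translate an ambiguity-coded pattern into a regular expression."""
    parts = []
    for symbol in pattern.upper():
        bases = IUPAC[symbol]  # raises KeyError for unknown symbols
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def matches(pattern: str, sequence: str) -> bool:
    """True if the whole sequence is one of the sequences the pattern allows."""
    return re.fullmatch(iupac_to_regex(pattern), sequence.upper()) is not None


print(iupac_to_regex("GRN"))   # G[AG][ACGT]
print(matches("GRN", "GAT"))   # True
print(matches("GRN", "GCT"))   # False
```

In this sketch each symbol expands to a character class, so a pattern and a sequence must have the same length to match.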
[0107] Array: A series of genetic elements, in a linear order along the primary sequence of a DNA molecule, typically referring to a series of target sequences for a site-specific transposase or recombinase.
[0108] Bacmid: A baculovirus shuttle vector capable of replication in bacteria and in susceptible insect cells.
[0109] Bacteria: Any prokaryotic organism capable of supporting the function of the genetic elements described below. In one aspect, the bacteria should support the replication of a low copy number replicon operably linked to the baculovirus in the bacmid, most preferably mini-F. The bacteria should support the replication of the donor plasmids, preferably moderate or high copy number plasmids, or of the host genome, most preferably either the bacterial chromosome or plasmids based on pUC8 or pMAK705. The bacteria should support the replication of helper plasmids, preferably moderate copy plasmids, most preferably based on pBR322. The bacteria should support the site-specific transposition of a transposon, most preferably one derived from Tn7. The bacteria should also support the expression and detection or selection of differentiable or selectable markers. In the preferred mode, the selectable markers are antibiotic resistance markers, most preferably genes conferring resistance to the following drugs: chloramphenicol, gentamicin, kanamycin, tetracycline, and ampicillin. In the preferred mode, the differentiable markers should confer on cells possessing them the ability to metabolize chromogenic substrates. Most preferably, the differentiable marker encodes the α-complementing fragment of β-galactosidase.
[0110] BaculoBrick™: A synthetic adapter comprising one or more recognition sites for restriction enzymes. The recognition sites are typically 7 or more nucleotides in length, generally 8 nt, and typically palindromic, with double-stranded DNA cleavage sites entirely within the recognition site that leave 5′ or 3′ sticky overhangs, or blunt ends, suitable for ligation to DNA fragments having complementary sticky or blunt ends. In this context, the adapter comprises sequences for restriction enzymes that cleave wild-type baculovirus DNAs, such as AcNPV or BmNPV DNA, zero to 5 times, permitting the rapid cloning and assembly of modular genetic elements suitable for insertion as cassettes into modified baculovirus genomes. These adapters can also be used to facilitate assembly of other large plasmids and shuttle vectors, including those intended for use in mammalian, plant, fungal, and other eukaryotic systems, plus enteric and non-enteric bacterial systems.
[0111] Baculovirus: A member of the Baculoviridae family of viruses, which have a covalently closed double-stranded DNA genome and are pathogenic for invertebrates, primarily insects of the order Lepidoptera.
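As a non-limiting illustration of the criteria stated for BaculoBrick™ recognition sites (length of 7 or more nucleotides, palindromic, and cutting a given DNA zero to 5 times), the following Python sketch screens candidate sites against a sequence. The function names and the short test sequence are hypothetical and chosen only for this illustration; the 8 nt sites GCGGCCGC (NotI) and TTAATTAA (PacI) are used as familiar examples and are not asserted to be sites of the adapter described here. The scan treats the DNA as linear, so a site spanning the junction of a circular genome would be missed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """A site is palindromic if it equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


def count_sites(dna: str, site: str) -> int:
    """Count occurrences (overlaps included) of a site on a linear DNA strand.

    For a palindromic site, scanning one strand finds every occurrence.
    """
    dna, site = dna.upper(), site.upper()
    return sum(1 for i in range(len(dna) - len(site) + 1)
               if dna.startswith(site, i))


def is_candidate_site(dna: str, site: str,
                      min_len: int = 7, max_cuts: int = 5) -> bool:
    """Apply the three stated criteria: length, palindrome, cut count."""
    return (len(site) >= min_len
            and is_palindromic(site)
            and count_sites(dna, site) <= max_cuts)


# Hypothetical toy sequence, not a real viral genome.
toy_dna = "AAGCGGCCGCTTTTGCGGCCGCAA"
print(count_sites(toy_dna, "GCGGCCGC"))        # 2
print(is_candidate_site(toy_dna, "GCGGCCGC"))  # True
print(is_candidate_site(toy_dna, "GAATTC"))    # False (only 6 nt)
```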
[0112] Cis-Acting: cis-acting elements are genes or DNA segments which exert their functions on another DNA segment only when the cis-acting elements are linked to that DNA segment.
[0113] Combinatorial assembly of an ordered array: Assembly of a series of functionally- or structurally-similar sets of genetic elements in an array, where the sets may be assembled in any order, typically by traditional or modern cloning or gene assembly methods involving assembly of a large segment of DNA from two or more smaller segments of DNA.
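As a non-limiting illustration of the phrase "the sets may be assembled in any order", the following Python sketch enumerates the linear orders available for a small number of element sets. The set names and the function `assembly_orders` are hypothetical placeholders for this illustration and do not correspond to any particular genetic elements of the specification.

```python
from itertools import permutations


def assembly_orders(element_sets):
    """Return every linear order in which the given sets could be arranged."""
    return [tuple(order) for order in permutations(element_sets)]


# Three hypothetical sets give 3! = 6 possible orders.
orders = assembly_orders(["set_A", "set_B", "set_C"])
print(len(orders))   # 6
print(orders[0])     # ('set_A', 'set_B', 'set_C')
```

The number of orders grows factorially with the number of sets, which is one reason an ordered array is usually assembled stepwise rather than enumerated exhaustively.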
[0114] Composite array: A partially or completely filled array of genetic elements comprising one or more segments of DNA inserted at specific target sequences for site-specific transposons or site-specific recombinases.
[0115] Composite Bacmid: A bacmid containing a wild-type or altered transposon inserted into a nonessential locus, usually the preferential target site for the transposon.
[0116] Donor DNA Molecule: Any replicating double-stranded DNA element such as the bacterial chromosome or a bacterial plasmid which carries a transposon capable of site-specific transposition into a bacmid. Preferably, the transposon contains a heterologous DNA and a genetic marker.
[0117] Donor Plasmid: A plasmid containing a wild-type or altered transposon, preferably a mini-Tn7 or Tn7-like transposon, comprising the left and right arms of Tn7 or a Tn7-like element flanking a cassette typically containing a genetic marker, a promoter, and one or more operably-linked genes of interest. The mini-transposon is preferably on a pUC-based or pMAK705-based plasmid.
[0118] Fusion proteins or fusion polypeptides: A single continuous linear polymer of amino acids which generally comprises the complete or partial sequences of two or more domains from distinct proteins. They are generally encoded by a linear segment of DNA and transcribed as a unit under the control of an operably-linked promoter, where the two or more coding sequences are contiguous with each other, optionally separated by one or more polypeptide linker sequences. The polypeptide linker sequences may also be present at the amino terminus, the carboxy-terminus, or both ends, contributing to the activity or inactivity of the fusion polypeptide compared to an unaltered parental polypeptide, or may provide other types of functions, such as binding to another molecule to facilitate purification during extraction from lysed cells or from cell culture media containing a variety of secreted molecules. In some aspects, the fusion polypeptide may comprise two or more domains from a single parental molecule, in the same relative N-terminal to C-terminal orientation, or permuted, such that a domain from the C-terminal region of the parental polypeptide is located before a domain derived from the N-terminal region of the parental polypeptide. In other aspects, a fusion protein may comprise one or more segments derived from one or more natural proteins, and a synthetic segment that encodes a polypeptide not normally found in natural proteins.
[0119] Helper Plasmid or Helper Vector: A plasmid or vector which contains a bacterial replicon, a genetic marker and any genes which encode trans-acting factors which are required for the transposition of a given transposon.
[0120] Heterologous DNA: A sequence of DNA, from any source, which is introduced into an organism and which is not naturally contained within that organism.
[0121] Heterologous Protein: A protein which is synthesized in an organism, specifically from an introduced heterologous DNA, and which is not naturally synthesized within that organism.
[0122] Hyperactive transposase: A variant of a parental transposase gene encoded by a transposon that increases the frequency of transposition of a parental or variant transposon compared to the parental transposase gene.
[0123] Locus: A specific site or region of a DNA molecule which may or may not be a gene.
[0124] Mini-attTn7: The minimal DNA sequence required for recognition by Tn7 transposition factors and insertion of a Tn7 transposon or preferably mini-Tn7.
[0125] Mini-F: A derivative of the 100 kb Fertility (F) plasmid, which contains the RepF1A replicon, comprising seven genes including repE, and two DNA regions, oriS and incC, required for replication, maintenance, and regulation of mini-F replication.
[0126] Mini-Tn7: A transposon derived from Tn7 which contains the minimal amount of cis-acting DNA sequence required for transposition, a heterologous DNA and a genetic marker.
[0127] Nonessential: A locus is non-essential if it is not required for replication of a vector, virus, cell, or organism, as judged by the survival of that biological object following disruption or deletion of that locus.
[0128] NR1: A large (90 kb), stable, low copy number, IncFII drug resistance plasmid that confers resistance to chloramphenicol, fusidic acid, streptomycin, spectinomycin, sulfonamide, and tetracycline, which is compatible with the large (100 kb) stable, low copy number, IncFI Fertility (F) plasmid.
[0129] Passage: Infection of a host with a virus (or a mixture of viruses) and subsequent recovery of that virus from the host (usually after one infection cycle).
[0130] Plasmid Incompatibility: Plasmids are incompatible if they interact in such a way that they cannot be stably maintained in the same cell in the absence of selection for both plasmids.
[0131] P_polh: A very late baculovirus promoter which is capable of promoting high level mRNA synthesis from any gene, preferably a heterologous DNA, placed under its control.
[0132] Preferential Target Site: A defined sequence of DNA specifically recognized and preferentially utilized by a transposon, preferably the attTn7 site for Tn7.
[0133] Random transposon: A naturally-occurring, variant, or synthetic transposon that has low to no specificity with respect to the sequences where it is inserted after transposition from one site to another. Common examples of random eukaryotic transposons include the synthetic Sleeping Beauty transposon, derived from consensus sequences in salmon, and the piggyBac transposon, derived from Trichoplusia ni, a caterpillar. A common example of a random bacterial transposon is Tn5, derived from a plasmid conferring resistance to kanamycin and other antibiotics. Variant and synthetic versions are often used with vectors comprising genes encoding hyperactive transposases, to enhance the frequency of random transposition into a vector or the chromosome of a prokaryotic or eukaryotic cell.
[0134] Replicon: A replicating unit from which DNA synthesis initiates.
[0135] Screenable marker: A reporter gene introduced into a cell that confers a trait suitable for screening, typically allowing a researcher to distinguish between cells harboring a vector and cells harboring no vector, or between cells harboring a vector and cells harboring a variant form of that vector. For example, in the presence of a chromogenic substrate, bacteria may form white colonies in a background of blue colonies, as with E. coli cells that carry a lacZΔM15 gene on the chromosome and comprise vectors that do or do not have insertions disrupting expression of the alpha complementation polypeptide encoded by a lacZalpha gene.
[0136] Selectable marker: A reporter gene introduced into a cell that confers a trait suitable for artificial selection, commonly resistance to antibiotics, such as ampicillin, chloramphenicol, tetracycline, and kanamycin, among many others, for vectors propagated in E. coli, and a wide variety of other antibiotics that allow selection of vectors that propagate in eukaryotic cells.
[0137] Shuttle Vector: A vector (usually a plasmid) that can propagate in two different types of host cell species, generally where one replicon permits propagation in a prokaryotic cell, such as a bacterium. A eukaryotic shuttle vector comprises at least one replicon that permits propagation in a eukaryotic cell. A mammalian eukaryotic shuttle vector comprises at least one replicon which is derived from a mammalian cell, generally allowing the shuttle vector to propagate in a mammalian cell. A non-mammalian eukaryotic shuttle vector comprises at least one replicon which is derived from a non-mammalian cell, generally allowing the shuttle vector to propagate in a non-mammalian cell. A viral shuttle vector comprises at least one replicon which is derived from a virus, generally allowing the shuttle vector to propagate as a virus. A mammalian viral shuttle vector comprises at least one replicon which is derived from a mammalian virus, generally allowing the shuttle vector to propagate in mammalian cells as a virus. An insect viral shuttle vector comprises at least one replicon which is derived from an insect virus, generally allowing the shuttle vector to propagate in insect cells as a virus. A baculovirus shuttle vector comprises at least one replicon which is derived from an insect virus, generally allowing the shuttle vector to propagate in Lepidopteran insect cells as a virus.
[0138] Synthemid: A modular viral or non-viral vector comprising one or more target sites for a synthetic site-specific transposon, particularly those comprising gene fusions allowing for the direct selection of transposition events.
[0139] The term “amino acid(s)” means all naturally occurring L-amino acids, including norleucine, norvaline, homocysteine, and ornithine.
[0140] The term “degenerate” means that two nucleic acid molecules encode for the same amino acid sequences but comprise different nucleotide sequences.
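As a non-limiting illustration of the term "degenerate", the following Python sketch translates two nucleotide sequences with the standard genetic code and reports whether they differ at the nucleotide level while encoding the same amino acid sequence. The function names and the example sequences are hypothetical and serve only this illustration; use of the standard genetic code is an assumption, since other codes exist.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate complete codons from position 0; trailing bases are ignored."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def are_degenerate(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ but encode identical amino acid sequences."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


print(translate("ATGCTGAAA"))                    # MLK
print(are_degenerate("ATGCTGAAA", "ATGTTAAAG"))  # True
print(are_degenerate("ATGCTG", "ATGCCG"))        # False (Leu vs Pro)
```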
[0141] The term “fragment” means a nucleic acid molecule whose sequence is shorter than the target or identified nucleic acid molecule and having the identical, the substantial complement, or the substantial homologue of at least 10 contiguous nucleotides of the target or identified nucleic acid molecule.
[0142] The term “fusion protein” means a protein or fragment thereof that comprises one or more additional peptide regions not derived from that protein.
[0143] The term “isolated” when used with respect to a polynucleotide (e.g., single- or double-stranded RNA or DNA), an enzyme, or more generally a protein, means a polynucleotide, an enzyme, or a protein that is substantially free from the cellular components that are associated with the polynucleotide, enzyme, or protein as it is found in nature. In this context, “substantially free from cellular components” means that the polynucleotide, enzyme, or protein is purified to a level of greater than 80% (such as greater than 90%, greater than 95%, or greater than 99%).
[0144] The term “probe” means an agent that is utilized to determine an attribute or feature (e.g. presence or absence, location, correlation, etc.) of a molecule, cell, tissue, or organism.
[0145] The term “promoter” is used in an expansive sense to refer to the regulatory sequence(s) that control mRNA production. Such sequences include RNA polymerase binding sites, enhancers, etc.
[0146] The term “protein fragment” means a peptide or polypeptide molecule whose amino acid sequence comprises a subset of the amino acid sequence of that protein.
[0147] The term “recombinant” means any agent (e.g., DNA, peptide, etc.), that is, or results from, however indirectly, human manipulation of a nucleic acid molecule.
[0148] The term “selectable or screenable marker genes” means genes whose expression can be detected by a probe as a means of identifying or selecting for transformed cells.
[0149] The term “specifically bind” means that the binding of an antibody or peptide is not competitively inhibited by the presence of non-related molecules.
[0150] The term “specifically hybridizing” means that two nucleic acid molecules are capable of forming an anti-parallel, double-stranded nucleic acid structure.
[0151] The term “substantial complement” means that a nucleic acid sequence shares at least 80% sequence identity with the complement.
[0152] The term “substantial fragment” means a nucleic acid fragment which comprises at least 100 nucleotides.
[0153] The term “substantial homologue” means that a nucleic acid molecule shares at least 80% sequence identity with another.
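As a non-limiting illustration of the 80% identity and 100 nucleotide thresholds used in the definitions of "substantial complement", "substantial fragment", and "substantial homologue", the following Python sketch computes percent identity between two sequences. The specification does not state how identity is calculated; this sketch assumes, for simplicity, an ungapped position-by-position comparison of equal-length sequences, and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Ungapped identity of two equal-length sequences (an assumption here)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * same / len(seq_a)


def is_substantial_homologue(seq_a: str, seq_b: str) -> bool:
    """At least 80% identity with the other molecule."""
    return percent_identity(seq_a, seq_b) >= 80.0


def is_substantial_complement(seq_a: str, seq_b: str) -> bool:
    """At least 80% identity with the complement of the other molecule."""
    return percent_identity(seq_a, reverse_complement(seq_b)) >= 80.0


def is_substantial_fragment(seq: str) -> bool:
    """At least 100 nucleotides long."""
    return len(seq) >= 100


a = "ACGTACGTAC"
print(percent_identity(a, "ACGTACGTTT"))           # 80.0
print(is_substantial_homologue(a, "ACGTACTTTT"))   # False (70% identity)
print(is_substantial_complement(a, "GTACGTACGT"))  # True
```

A production tool would normally compute identity from a gapped alignment; the ungapped comparison above is only the simplest case.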
[0154] The term “substantially hybridizing” means that two nucleic acid molecules can form an anti-parallel, double-stranded nucleic acid structure under conditions (e.g., salt and temperature) that permit hybridization of sequences that exhibit 90% sequence identity or greater with each other and exhibit this identity for at least about a contiguous 50 nucleotides of the nucleic acid molecules.
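As a non-limiting illustration of the sequence criterion in the definition of "substantially hybridizing" (90% identity or greater over at least about 50 contiguous nucleotides), the following Python sketch slides a 50-position window along two sequences and reports whether any window meets the threshold. The sketch addresses only the identity criterion, not hybridization conditions such as salt and temperature. It assumes the two sequences are already aligned position by position without gaps, and the function name is hypothetical.

```python
def has_identity_window(seq_a: str, seq_b: str,
                        window: int = 50, threshold: float = 90.0) -> bool:
    """True if some run of `window` aligned positions has >= threshold % identity."""
    a, b = seq_a.upper(), seq_b.upper()
    n = min(len(a), len(b))
    if n < window:
        return False
    flags = [a[i] == b[i] for i in range(n)]   # True where positions agree
    same = sum(flags[:window])                 # matches in the first window
    if same * 100 >= threshold * window:
        return True
    for i in range(window, n):
        # Slide the window one position: add the new flag, drop the oldest.
        same += flags[i] - flags[i - window]
        if same * 100 >= threshold * window:
            return True
    return False


reference = "ACGT" * 15   # hypothetical 60 nt sequence
# Five mismatches within the first 50 positions: 45/50 = 90% identity.
variant = "".join("N" if i in (0, 10, 20, 30, 40) else base
                  for i, base in enumerate(reference))
print(has_identity_window(reference, variant))   # True
```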
[0155] The term “substantially-purified” means that one or more molecules that are or may be present in a naturally-occurring preparation containing the target molecule will have been removed or reduced in concentration.
[0156] The term “transposon” refers to mobile genetic elements capable of transposition between the genetic material in a cell (e.g., from one chromosomal location to one or more other locations in the chromosome, from a virus or a plasmid to the chromosome, from the chromosome to a virus or a plasmid, and from a plasmid or virus to a different plasmid or virus). The term also refers to a mobile DNA element, including those which recognize specific DNA target sequences, which can be made to move to a new site by recombination or insertion and does not require extensive DNA sequence homology between itself and the target sequence for recombination or insertion. A non-limiting list of transposons that may be used with the invention described herein includes piggyBac, Sleeping Beauty (SB), Tn3, Tn5, Tn7, Tn916, Tc1/mariner, Minos and S elements, Quetzal elements, Txr elements, maT, mos1, Himar1, Hermes, Tol1 element, Pokey, P-element, and Tc3. In preferred aspects, the transposon is the site-specific Tn7, which inserts preferentially into a specific target or attachment site called attTn7. In other aspects, site-specific transposons, such as those classified as Tn7-like transposons or Tn7-like mobile genetic elements that insert into comparable attachment sites within the chromosome or on a plasmid harbored within a cell, are considered to be within the scope of the invention.
[0157] The terms “cell” and “cells”, which are meant to be inclusive, refer to one or more cells which can be in an isolated or cultured state, as in a cell line comprising a homogeneous or heterogeneous population of cells, or in a tissue sample, or as part of an organism, such as an insect larva or a transgenic mammal.
[0158] Trans-Acting: Trans-acting elements are genes or DNA segments which exert their functions on another DNA segment independent of the trans-acting elements' genetic linkage to that DNA segment.
[0159] The phrase “Transpositional inactivation of a (selectable/screenable) marker/reporter gene” refers to inactivation of a marker or reporter gene by insertion of a site-specific or random transposon, disrupting or preventing expression of a functionally-active product encoded by the marker or reporter gene.
[0160] The phrase “Transpositional activation/reactivation of a (selectable/screenable) marker/reporter gene” refers to activation of a marker or reporter gene by insertion of a site-specific or random transposon, allowing expression of a functionally-active product encoded by the marker or reporter gene.
DETAILED DESCRIPTION OF THE INVENTION
[0161] A major aspect of the invention relates to a nucleotide sequence comprising a target site for a site-specific transposon, wherein said target site comprises a target sequence comprising a transcriptionally or translationally fused marker sequence encoding a selectable marker sequence or a screenable marker sequence operably-linked to a sequence comprising a specific target sequence for recognition and insertion of a site-specific transposon or a site-specific recombinase, wherein said fused marker sequence encodes an inactive or an active polypeptide capable of conferring a selectable or screenable phenotype upon a cell comprising the fused marker sequence, wherein insertion of the site-specific transposon into the target sequence to create a composite target sequence changes the phenotype of a cell comprising the composite screenable or selectable marker sequence compared to a cell comprising just the selectable or screenable marker sequence.
[0162] Another aspect relates to a nucleotide sequence, wherein said target site comprises a target sequence for a site-specific transposon comprising a translationally-fused selectable marker sequence or a screenable marker sequence operably-linked to a sequence comprising a specific target sequence for recognition and insertion of a site-specific transposon, wherein said fused marker sequence encodes an inactive or an active polypeptide capable of conferring a selectable or screenable phenotype upon a cell comprising the fused marker sequence, wherein insertion of the site-specific transposon into the target sequence to create a composite target sequence changes the phenotype of a cell comprising the composite screenable or selectable marker sequence compared to a cell comprising just the selectable or screenable marker sequence.
[0163] Another aspect relates to a nucleotide sequence wherein said sequence comprises a target site for a site-specific transposon comprising a translationally-fused selectable marker sequence operably-linked to a sequence comprising a specific target sequence for recognition and insertion of a site-specific transposon, wherein said fused marker sequence encodes an inactive polypeptide capable of conferring a selectable phenotype upon a cell comprising the fused marker sequence, wherein insertion of the site-specific transposon into the target sequence to create a composite target sequence changes the phenotype of a cell comprising the composite selectable marker sequence compared to a cell comprising just the selectable marker sequence.
[0164] Another aspect relates to a sequence wherein said fused marker sequence encodes a truncated or extended inactive polypeptide which is extended or truncated, respectively, after transposition to form a composite target sequence which encodes an active polypeptide conferring a selectable phenotype upon the cell.
[0165] Still another aspect relates to a sequence, wherein said fused marker sequence encodes a truncated, inactive polypeptide which is extended after transposition to form a composite target sequence which encodes an active polypeptide conferring a selectable phenotype upon the cell.
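The idea that a truncated open reading frame can be extended by an insertion may be pictured with a toy reading-frame model. The Python sketch below counts the sense codons read before the first in-frame stop codon, before and after a hypothetical insertion. The sequences, the insertion position, and the function names are invented for illustration; they do not reproduce any sequence of the Sequence Listing, nor the actual position of a transposon end relative to the stop codons in the disclosed constructs.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_before_stop(seq: str, start: int = 0) -> int:
    """Count sense codons read in frame from `start` up to the first
    in-frame stop codon (or the end of the sequence if none is found)."""
    count = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3].upper() in STOP_CODONS:
            return count
        count += 1
    return count


def insert_at(seq: str, pos: int, insert: str) -> str:
    """Return `seq` with `insert` placed before position `pos` (0-based)."""
    return seq[:pos] + insert + seq[pos:]


# Hypothetical target: a 4-codon "truncated" reading frame followed by a stop codon.
toy_target = "ATGGCTGCAGCT" + "TAA"

# Hypothetical insert end: two added codons followed by its own in-frame stop codon.
toy_insert_end = "GGTGCA" + "TGA"

# Toy composite: the insert lands immediately upstream of the original stop codon.
toy_composite = insert_at(toy_target, 12, toy_insert_end)

short_orf = codons_before_stop(toy_target)
long_orf = codons_before_stop(toy_composite)
```

In this toy case the reading frame grows from four to six sense codons, which mirrors, in a highly simplified way, a truncated polypeptide being extended by sequence contributed at the insertion site.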
[0166] Another aspect relates to a sequence wherein the selectable marker sequence encodes an inactive bacterial chloramphenicol acetyl transferase (CAT) fusion protein.
[0167] Another aspect relates to a sequence wherein the sequence encoding the inactive bacterial chloramphenicol acetyl transferase (CAT) fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive bacterial chloramphenicol acetyl transferase (CAT) polypeptide; (ii) a sequence comprising one or more stop codons; (iii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; and (iv) a sequence comprising one or more in frame stop codons.
[0168] Another aspect relates to a nucleotide sequence wherein the composite selectable marker sequence encodes an active bacterial chloramphenicol acetyl transferase (CAT) fusion protein.
[0169] Still another aspect relates to a nucleotide sequence wherein the sequence encoding the active bacterial chloramphenicol acetyl transferase (CAT) fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive bacterial chloramphenicol acetyl transferase (CAT) polypeptide domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the addition of polypeptides encoded by (ii) and (iii) to the inactive CAT polypeptide domain restores CAT activity to the fusion protein.
[0170] A major aspect relates to a nucleotide sequence wherein said fused marker sequence encodes an extended, inactive polypeptide which is truncated after transposition to form a composite target sequence which encodes an active, polypeptide conferring a selectable phenotype upon the cell.
[0171] Another aspect relates to a nucleotide sequence, wherein the selectable marker sequence encodes an inactive NPT-II fusion protein.
[0172] Still another aspect relates to a nucleotide sequence wherein the sequence encoding the inactive NPT-II fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive NPT-II polypeptide; (ii) a sequence comprising one or more stop codons; (iii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; and (iv) a sequence comprising one or more in frame stop codons.
[0173] Another aspect relates to a nucleotide sequence wherein the composite selectable marker sequence encodes an active NPT-II fusion protein.
[0174] Still another aspect relates to a nucleotide sequence, wherein the sequence encoding the active NPT-II fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive NPT-II polypeptide domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the removal of amino acids encoded by (ii) and (iii) from the inactive NPT-II polypeptide domain restores NPT-II activity to the fusion protein.
[0175] Still another aspect relates to a nucleotide sequence, wherein the sequence encoding the active NPT-II fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive NPT-II polypeptide domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the addition of amino acids encoded by (ii) and (iii) to the inactive NPT-II polypeptide domain restores NPT-II activity to the fusion protein.
[0176] Still another aspect relates to a nucleotide sequence, wherein said sequence comprises a target site for a site-specific transposon comprising a translationally-fused screenable marker sequence operably-linked to a sequence comprising a specific site for recognition and insertion of a site-specific transposon, wherein said fused marker sequence encodes an active polypeptide capable of conferring a screenable phenotype upon a cell comprising the fused marker sequence, wherein insertion of the site-specific transposon into the target sequence to create a composite target sequence changes the phenotype of a cell comprising the composite screenable marker sequence compared to a cell comprising just the selectable marker sequence.
[0177] Specific aspects of the invention relate to a nucleotide sequence, wherein the screenable marker sequence encodes an active lacZ alpha peptide fusion protein, including aspects wherein the sequence encoding the active lacZ alpha fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding the N-terminal sequence of a lacZalpha polypeptide; (ii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; (iii) the C-terminal sequence of a lacZalpha polypeptide; and (iv) a sequence comprising one or more stop codons.
[0178] Related aspects include a sequence wherein the composite screenable marker sequence encodes an inactive lacZ alpha peptide fusion protein.
[0179] Related aspects include a nucleotide sequence wherein the sequence encoding the active lacZ alpha fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding the sequence of a lacZalpha polypeptide; (ii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; and (iii) a sequence comprising one or more in frame stop codons.
[0180] A related aspect includes a nucleotide sequence wherein the composite screenable marker sequence encodes an inactive lacZ alpha peptide fusion protein.
[0181] A related aspect includes a nucleotide sequence wherein the sequence encoding the active lacZ alpha fusion protein comprises in a 5′ to 3′ direction (i) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; (ii) a sequence encoding the sequence of a lacZalpha polypeptide; and (iii) a sequence comprising one or more in frame stop codons.
[0182] A related aspect includes a nucleotide sequence wherein the composite screenable marker sequence encodes an inactive lacZ alpha peptide fusion protein.
[0183] Related aspects include a nucleotide sequence wherein the screenable marker sequence encodes an active CAT fusion protein.
[0184] A related aspect includes a nucleotide sequence wherein the sequence encoding the active CAT fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding the N-terminal sequence of a CAT polypeptide; (ii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; (iii) the C-terminal sequence of a CAT polypeptide; and (iv) a sequence comprising one or more stop codons.
[0185] A related aspect includes a nucleotide sequence, wherein the composite screenable marker sequence encodes an inactive CAT fusion protein.
[0186] Related aspects include a nucleotide sequence wherein the screenable marker sequence encodes an active NPT-II fusion protein.
[0187] A related aspect includes a nucleotide sequence, wherein the sequence encoding the active NPT-II fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding the N-terminal sequence of a NPT-II polypeptide; (ii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; (iii) the C-terminal sequence of a NPT-II polypeptide; and (iv) a sequence comprising one or more stop codons.
[0188] A related aspect includes a nucleotide sequence, wherein the composite screenable marker sequence encodes an inactive NPT-II fusion protein.
[0189] Related aspects include a nucleotide sequence, wherein the screenable marker sequence encodes an active β-lactamase fusion protein.
[0190] Specific aspects include a nucleotide sequence, wherein the sequence encoding the active β-lactamase fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding the N-terminal sequence of a β-lactamase polypeptide; (ii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; (iii) the C-terminal sequence of a β-lactamase polypeptide; and (iv) a sequence comprising one or more stop codons.
[0191] A related aspect includes a nucleotide sequence, wherein the composite screenable marker sequence encodes an inactive β-lactamase fusion protein.
[0192] Related aspects include a nucleotide sequence, wherein the screenable marker sequence encodes an active tetracycline resistance fusion protein.
[0193] Specific aspects include a nucleotide sequence, wherein the sequence encoding the active tetracycline resistance fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding the N-terminal sequence of a tetracycline resistance polypeptide; (ii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; (iii) the C-terminal sequence of a tetracycline resistance polypeptide; and (iv) a sequence comprising one or more stop codons.
[0194] Related aspects include a nucleotide sequence, wherein the composite screenable marker sequence encodes an inactive tetracycline resistance fusion protein.
[0195] Another aspect of the invention relates to a nucleotide sequence, wherein said sequence comprises a target site for a site-specific transposon comprising a translationally-fused selectable marker sequence operably-linked to a sequence comprising a specific target sequence for recognition and insertion of a site-specific transposon, wherein said fused marker sequence encodes an inactive polypeptide capable of conferring a selectable phenotype upon a cell comprising the fused marker sequence, wherein insertion of the site-specific transposon into the target sequence to create a composite target sequence changes the phenotype of a cell comprising the composite selectable marker sequence compared to a cell comprising just the selectable marker sequence.
[0196] Related aspects include a nucleotide sequence, wherein the selectable marker sequence encodes an inactive lacZ alpha fusion protein.
[0197] Specific aspects include a nucleotide sequence, wherein the sequence encoding the inactive lacZ alpha fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding the inactive lacZ alpha fusion protein; (ii) a sequence comprising one or more stop codons; (iii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; and (iv) a sequence comprising one or more in frame stop codons.
[0198] A related aspect includes a nucleotide sequence, wherein the composite selectable marker sequence encodes an active lacZ alpha fusion protein.
[0199] Specific aspects include a nucleotide sequence, wherein the sequence encoding the active lacZ alpha fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive lacZ alpha fusion protein domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the addition of polypeptides encoded by (ii) and (iii) to the inactive lacZ alpha fusion domain restores activity to the lacZ alpha fusion protein.
[0200] Another aspect relates to a nucleotide sequence, wherein the selectable marker sequence encodes an inactive bacterial chloramphenicol acetyl transferase (CAT) fusion protein.
[0201] Specific aspects relate to a nucleotide sequence, wherein the sequence encoding the inactive bacterial chloramphenicol acetyl transferase (CAT) fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive bacterial chloramphenicol acetyl transferase (CAT) polypeptide; (ii) a sequence comprising one or more stop codons; (iii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; and (iv) a sequence comprising one or more in frame stop codons.
[0202] Another aspect relates to a nucleotide sequence, wherein the composite selectable marker sequence encodes an active bacterial chloramphenicol acetyl transferase (CAT) fusion protein.
[0203] Specific aspects relate to a nucleotide sequence, wherein the sequence encoding the active bacterial chloramphenicol acetyl transferase (CAT) fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive bacterial chloramphenicol acetyl transferase (CAT) polypeptide domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the addition of polypeptides encoded by (ii) and (iii) to the inactive CAT polypeptide domain restores CAT activity to the fusion protein.
[0204] Another aspect includes a nucleotide sequence, wherein the selectable marker sequence encodes an inactive NPT-II fusion protein.
[0205] Specific aspects relate to a nucleotide sequence, wherein the sequence encoding the inactive NPT-II fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive NPT-II polypeptide; (ii) a sequence comprising one or more stop codons; (iii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; and (iv) a sequence comprising one or more in frame stop codons.
[0206] Another aspect relates to a nucleotide sequence, wherein the composite selectable marker sequence encodes an active NPT-II fusion protein.
[0207] Specific aspects relate to a nucleotide sequence, wherein the sequence encoding the active NPT-II fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive NPT-II polypeptide domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the addition of polypeptides encoded by (ii) and (iii) to the inactive NPT-II polypeptide domain restores NPT-II activity to the fusion protein.
[0208] Another aspect relates to a nucleotide sequence, wherein the selectable marker sequence encodes an inactive β-lactamase fusion protein.
[0209] Specific aspects relate to a nucleotide sequence, wherein the sequence encoding the inactive β-lactamase fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive β-lactamase polypeptide; (ii) a sequence comprising one or more stop codons; (iii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; and (iv) a sequence comprising one or more in frame stop codons.
[0210] Another aspect relates to a nucleotide sequence, wherein the composite selectable marker sequence encodes an active β-lactamase fusion protein.
[0211] Specific aspects relate to a nucleotide sequence, wherein the sequence encoding the active β-lactamase fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive β-lactamase polypeptide domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the addition of polypeptides encoded by (ii) and (iii) to the inactive β-lactamase polypeptide domain restores β-lactamase activity to the fusion protein.
[0212] Another aspect relates to a nucleotide sequence, wherein the selectable marker sequence encodes an inactive tetracycline resistance fusion protein.
[0213] Specific aspects relate to a nucleotide sequence, wherein the sequence encoding the inactive tetracycline resistance fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive tetracycline resistance polypeptide; (ii) a sequence comprising one or more stop codons; (iii) a sequence comprising the attachment site for the site-specific transposon and encoding a synthetic polypeptide; and (iv) a sequence comprising one or more in frame stop codons.
[0214] Another aspect relates to a nucleotide sequence, wherein the composite selectable marker sequence encodes an active tetracycline resistance fusion protein.
[0215] Specific aspects relate to a nucleotide sequence, wherein the sequence encoding the active tetracycline resistance fusion protein comprises in a 5′ to 3′ direction (i) a sequence encoding an inactive tetracycline resistance polypeptide domain; (ii) a sequence comprising one or more out of reading frame stop codons; and (iii) a sequence comprising one end of the transposon and one or more in frame stop codons; wherein the addition of polypeptides encoded by (ii) and (iii) to the inactive tetracycline resistance polypeptide domain restores activity to the tetracycline resistance fusion protein.
[0216] Major aspects of the invention relate to a vector, designated a synthemid, comprising any of the target sequences or composite target sequences noted above.
[0217] Other aspects relate to a vector, wherein said vector propagates in a gram-negative bacterium, a vector which propagates in a gram-negative enteric bacterium, and a vector which propagates in Escherichia coli.
[0218] Other aspects relate to a vector, wherein said vector propagates in a gram-positive bacterium.
[0219] Other aspects relate to a vector, wherein said vector is a shuttle vector capable of propagating in bacteria and a non-bacterial host cell.
[0220] Still another aspect relates to a vector wherein said shuttle vector is a eukaryotic viral shuttle vector capable of propagating in bacteria and in a cell line capable of propagating a eukaryotic virus.
[0221] Still another aspect relates to a vector wherein said eukaryotic viral shuttle vector is a baculovirus shuttle vector, capable of propagating in bacteria and in Lepidopteran insect cells susceptible to infection by the baculovirus.
[0222] Still another aspect relates to a vector, wherein said baculovirus shuttle vector is capable of propagating in Escherichia coli and insect cells selected from the group consisting of Spodoptera frugiperda cells, Trichoplusia ni cells, and Bombyx mori cells.
[0223] Still another aspect relates to a vector wherein said eukaryotic viral shuttle vector is a mammalian virus shuttle vector, capable of propagating in bacteria and in mammalian cells susceptible to infection by the mammalian virus.
[0224] Another aspect relates to a vector comprising the target sequence.
[0225] Another aspect relates to a vector comprising the composite target sequence.
[0226] Related aspects include a nucleotide sequence comprising an array of two or more target sequences, and a vector, designated a synthemid, comprising said array.
[0227] Related aspects include a nucleotide sequence comprising a composite array of two or more composite target sequences, and a composite vector, designated a composite synthemid, comprising said composite array.
[0228] Major aspects relate to a nucleotide sequence wherein said site-specific transposon is Tn7 or a Tn7-like transposon.
[0229] A specific aspect relates to a nucleotide sequence wherein said site-specific transposon is Tn7.
[0230] A specific aspect relates to a nucleotide sequence wherein said site-specific transposon is a Tn7-like transposon.
[0231] Another aspect relates to a nucleotide sequence, wherein said attachment site and site specific transposon are derived from a Tn7-like transposable element. In one aspect, said attachment site is attTn7 and the transposon is Tn7.
[0232] A major aspect of the invention also relates to a method of screening or selecting for transposition of a site-specific transposon into a nucleotide sequence comprising an attachment site for a site-specific transposon operably-linked to a screenable or selectable marker sequence, comprising the steps of (i) introducing into a bacterial cell a target vector comprising a marker sequence that encodes one or more active or inactive polypeptides capable of conferring a screenable or selectable phenotype upon a cell comprising the marker sequence, wherein insertion of the site-specific transposon into the attachment site to create a composite marker sequence changes the phenotype of a cell comprising the screenable or selectable marker sequence; (ii) introducing into said cell comprising said target vector, a donor vector comprising sequences capable of transposing the wild type or a variant form of the site-specific transposon, and optionally a helper vector comprising sequences encoding one or more transposase gene products; (iii) culturing and optionally plating bacteria comprising the target vector, and optionally donor and helper vectors; and (iv) screening or selecting for bacterial colonies where transposition of the site-specific transposon into the attachment site on the target vector to create a composite marker sequence changes the phenotype of the bacterial cell harboring the target vector.
[0233] Specific aspects relate to a method, wherein step (iv) is screening for bacterial colonies where transposition of the site-specific transposon into the attachment site on the target vector changes the phenotype of the bacterial cell harboring the target vector.
[0234] More specific aspects relate to a method, wherein the screening is by a change from a Lac positive (+) to a Lac minus (−) phenotype, a change from an NPT-II positive (+) to an NPT-II minus (−) phenotype, a change from a β-lactamase positive (+) to a β-lactamase minus (−) phenotype, or a change from a tetracycline resistant (+) to a tetracycline sensitive (−) phenotype.
[0235] Specific aspects relate to a method wherein step (iv) is selecting for bacterial colonies where transposition of the site-specific transposon into the attachment site on the target vector changes the phenotype of the bacterial cell harboring the target vector.
[0236] More specific aspects include a method, wherein the selectable method is by a change from a Cm sensitive (S) to a Cm resistant (R) phenotype, including a change from a Lac positive (+) to a Lac minus (−) phenotype, a change from a Lac minus (−) to a Lac positive (+) phenotype, a change from a NPT-II minus (−) to a NPT-II plus (+) phenotype, a change from a β-lactamase minus (−) to a β-lactamase plus (+) phenotype, and a change from a tetracycline sensitive (−) to a tetracycline resistant (+) phenotype.
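The phenotype changes recited for the screening and selecting methods above can be collected into a small lookup, sketched here in Python. The data structure, the function names, and the short marker labels (with "R" and "S" standing for resistant and sensitive) are illustrative assumptions; only the listed transitions themselves are taken from the text.

```python
from typing import NamedTuple, Optional


class PhenotypeChange(NamedTuple):
    marker: str   # marker or phenotype name as used in the text
    mode: str     # "screen" or "select"
    before: str   # phenotype before transposition
    after: str    # phenotype after transposition


CHANGES = [
    # Screening: an active marker is inactivated by insertion.
    PhenotypeChange("Lac", "screen", "+", "-"),
    PhenotypeChange("NPT-II", "screen", "+", "-"),
    PhenotypeChange("beta-lactamase", "screen", "+", "-"),
    PhenotypeChange("Tet", "screen", "R", "S"),
    # Selecting: changes recited for the selectable method.
    PhenotypeChange("Cm", "select", "S", "R"),
    PhenotypeChange("Lac", "select", "+", "-"),
    PhenotypeChange("Lac", "select", "-", "+"),
    PhenotypeChange("NPT-II", "select", "-", "+"),
    PhenotypeChange("beta-lactamase", "select", "-", "+"),
    PhenotypeChange("Tet", "select", "S", "R"),
]


def expected_after(marker: str, mode: str, before: str) -> Optional[str]:
    """Return the phenotype expected after transposition, or None if the
    combination is not among the recited changes."""
    for change in CHANGES:
        if (change.marker, change.mode, change.before) == (marker, mode, before):
            return change.after
    return None


def indicates_transposition(marker: str, mode: str, before: str, observed: str) -> bool:
    """True if the observed phenotype matches the recited post-transposition phenotype."""
    expected = expected_after(marker, mode, before)
    return expected is not None and observed == expected
```

For example, a colony carrying a target vector that starts as chloramphenicol sensitive and is recovered as chloramphenicol resistant would be scored by `indicates_transposition("Cm", "select", "S", "R")`.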
EXAMPLES
[0237] The foregoing discussion may be better understood in connection with the following representative examples which are presented for purposes of illustrating the principal methods and compositions of the invention, and not by way of limitation. Various other examples will be apparent to the person skilled in the art after reading the present disclosure without departing from the spirit and scope of the invention. It is intended that all such other examples be included within the scope of the appended claims.
General Materials and Methods
[0238] Simulated cloning and display of linear DNA segments and circular plasmid maps were facilitated through the use of the SnapGene program obtained from GSL Biotech. Analysis of sequences permitting silent mutations in coding sequences was facilitated by “WatCut: An on-line tool for restriction analysis, silent mutation scanning, and SNP-RFLP analysis”, maintained by Michael Palmer, University of Waterloo, Ontario, Canada (watcut.uwaterloo.ca). General features and annotated maps of a wide variety of DNA segments and cloning or expression vectors can be obtained from online databases and resources, such as GenBank (maintained by NCBI), Addgene, SnapGene, Thermo Fisher, and New England Biolabs.
[0239] Standard general methods of cloning, expressing, and characterizing proteins are found in T. Maniatis, et al, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory, 1982, and references cited therein, incorporated herein by reference; and in J. Sambrook, et al, Molecular Cloning, A Laboratory Manual, 2nd edition, Cold Spring Harbor Laboratory, 1989, and references cited therein, incorporated herein by reference. General methods for the cloning and expression of genes in mammalian cells are also found in Colosimo et al, Biotechniques 29:314-331, 2000. Baculovirus- and insect cell culture-related procedures are performed as described (O'Reilly et al, 1992).
[0240] Restriction enzymes were purchased from Thermo Fisher (Waltham, Mass.) and New England Biolabs (Beverly, Mass.), unless otherwise indicated. Synthetic vectors and oligonucleotides were purchased from Twist Biosciences or IDT, unless otherwise indicated. Structural analysis of vectors by DNA sequencing was performed by GeneWiz (South Plainfield, N.J.). All parts are by weight (e.g., % w/w), and temperatures are in degrees Centigrade (° C.), unless otherwise indicated.
[0241] Brief descriptions of key materials required for the studies described below are provided in the following tables, which appear in different sections of the Examples: Table 5, Key Features of Bacterial Strains; Table 6, Plasmids Used in These Studies; and Table 7, Summary Table of Sequences.
[0242] Bacterial strains and plasmid vectors are obtained from the sources listed in each table, or constructed for these studies. The nucleotide sequences of plasmid vectors, if known, are indicated by their GenBank Accession Numbers. The sequences of oligonucleotides that are annealed to complementary nucleotides, or used as primers for amplifying segments of dsDNA are also shown below, and assigned specific SEQ ID NOS, as recited in the Sequence Listing, and in one or more tables summarizing key features of nucleotide and amino acid sequences set forth in the Sequence Listing.
Bacterial Media
[0243] Rich media, such as 2XYT broth and LB broth and agar, are purchased or prepared as described (Miller, 1972). Supplements are incorporated into liquid and solid media typically at the following concentrations (μg/ml): Amp, 100; Gen, 7; Tet, 10; Kan, 50; X-gal or Bluo-gal, 100; IPTG, 40. Ampicillin, kanamycin, tetracycline, and IPTG (isopropyl-beta-D-thiogalactoside) are purchased from Teknova (Hollister, Calif.) and Millipore Sigma (St. Louis, Mo.). Gentamicin, X-gal (5-bromo-4-chloro-3-indolyl-beta-D-galactoside), and Bluo-gal (halogenated indolyl-beta-D-galactoside) are purchased from GIBCO/BRL. Pre-poured agar plates, antibiotic solutions, and liquid media are also purchased from Teknova (Hollister, Calif.), Thermo Fisher (Carlsbad, Calif.), and Millipore Sigma (St. Louis, Mo.).
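The final concentrations listed above translate into stock volumes by simple dilution arithmetic, sketched below in Python. The final concentrations come from the text; the function name and any stock concentration passed to it are assumptions supplied by the user, since stock concentrations are not stated here.

```python
# Final working concentrations from the text, in micrograms per ml of medium.
FINAL_UG_PER_ML = {
    "Amp": 100.0,
    "Gen": 7.0,
    "Tet": 10.0,
    "Kan": 50.0,
    "X-gal": 100.0,
    "Bluo-gal": 100.0,
    "IPTG": 40.0,
}


def stock_volume_ul(supplement: str, media_ml: float, stock_mg_per_ml: float) -> float:
    """Microliters of stock needed to reach the listed final concentration.

    Since 1 mg/ml equals 1 microgram per microliter:
        volume (ul) = final (ug/ml) * media volume (ml) / stock (ug/ul)
    The stock concentration is a user-supplied assumption.
    """
    if media_ml <= 0 or stock_mg_per_ml <= 0:
        raise ValueError("media volume and stock concentration must be positive")
    return FINAL_UG_PER_ML[supplement] * media_ml / stock_mg_per_ml


# Example with a hypothetical 100 mg/ml ampicillin stock and 500 ml of medium.
amp_ul = stock_volume_ul("Amp", media_ml=500, stock_mg_per_ml=100)
```

With these hypothetical inputs the calculation corresponds to a 1:1000 dilution of the stock.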
Bacterial Transformation
[0244] Plasmids were transformed into frozen competent E. coli DH10B (Grant et al, 1990), obtained from Thermo Fisher, using the procedures recommended by the manufacturer. Briefly, frozen cells were thawed on ice and 33-100 μl of cells were incubated with 0.01-1.0 μg of plasmid DNA for 30-60 minutes. The cells were shocked by heating at 42° C. for 30 seconds, diluted to 1.0 ml with antibiotic-free S.O.C. medium, and grown at 37° C. for 1-3 hours. A 20 to 100 μl sample of culture was spread on agar plates supplemented with the appropriate antibiotics. Colonies were purified by restreaking on the same selection plates prior to analysis of drug resistance phenotype and isolation of plasmid DNAs. Plasmids are also transformed into competent E. coli DH10B cells prepared by suspending early log phase cells in transformation buffer using a TransformAid kit obtained from Thermo Fisher. Plasmids may be transformed into competent cells prepared by the calcium chloride method described by Sambrook et al, (1989), or by transformation into electrocompetent cells suspended in buffered glycerol using protocols and equipment provided by BioRad.
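The quantities given in this procedure (DNA mass, dilution to 1.0 ml, and the volume plated) are the inputs to the customary transformation-efficiency calculation, sketched below in Python. The formula is the standard colony-forming units per microgram estimate; the function name and the example colony count are hypothetical, and no efficiency values are reported in the text.

```python
def transformation_efficiency(colonies: int, dna_ug: float, plated_ul: float,
                              recovery_ul: float = 1000.0) -> float:
    """Estimate colony-forming units per microgram of plasmid DNA.

    colonies    -- colonies counted on the plate
    dna_ug      -- micrograms of plasmid DNA added to the cells
    plated_ul   -- microliters of the recovery culture spread on the plate
    recovery_ul -- total recovery culture volume (1.0 ml in the procedure above)
    """
    if dna_ug <= 0 or plated_ul <= 0 or recovery_ul <= 0:
        raise ValueError("DNA mass and volumes must be positive")
    if plated_ul > recovery_ul:
        raise ValueError("plated volume cannot exceed the recovery volume")
    dna_plated_ug = dna_ug * plated_ul / recovery_ul  # DNA represented on the plate
    return colonies / dna_plated_ug


# Hypothetical example: 150 colonies from 100 ul plated after adding 0.01 ug of DNA.
cfu_per_ug = transformation_efficiency(colonies=150, dna_ug=0.01, plated_ul=100)
```

The estimate assumes a single undiluted plating from the 1.0 ml recovery culture; further dilutions before plating would need an additional factor.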
DNA Preparation and Plasmid Manipulation
[0245] DNA samples are prepared from 1-250 ml cultures grown in LB or 2XYT medium supplemented with appropriate antibiotics. Cultures are harvested and lysed by an alkaline lysis method and the plasmid DNA samples are purified over resin columns provided by Thermo Fisher.
TABLE-US-00008 TABLE 5 Key Features of Bacterial Strains

Designation: DH5aF′IQ
Genotype: F′ proAB.sup.+ lacI.sup.qZΔM15 zzf::Tn5 (Kan.sup.R), isolated from strain DH5alphaF′IQ
Description: Original source of the mini-F replicon and the kanamycin resistance gene inserted into the bacmid bMON14272.
Source: GIBCO/BRL

Designation: E. coli DH10B
Genotype: F.sup.− endA1 recA1 galE15 galK16 nupG rpsL ΔlacX74 Φ80lacZΔM15 araD139 Δ(ara, leu)7697 mcrA Δ(mrr-hsdRMS-mcrBC) λ.sup.−
Description: DH10B has been classically reported to be galU galK; the genomic sequence indicates that DH10B is actually galE galK galU.sup.+, and is also deoR.sup.+.
Reference: Grant et al, 1990; Blattner
Source: Thermo Fisher

Designation: E. coli DH10Bac ™
Genotype: F.sup.− mcrA Δ(mrr-hsdRMS-mcrBC) Φ80lacZΔM15 ΔlacX74 recA1 endA1 araD139 Δ(ara, leu)7697 galU galK λ.sup.− rpsL nupG/bMON14272/pMON7124
Description: DH10B harboring the baculovirus shuttle vector (bacmid) bMON14272 and the helper plasmid pMON7124.
Reference: Luckow et al (1993)
Source: Thermo Fisher
TABLE-US-00009 TABLE 6 Plasmids Used in These Studies

Designation: pACYC177
Markers: Amp.sup.R, Kan.sup.R
Size (bp): 3941
Description: pACYC177 is an E. coli plasmid cloning vector comprising an ampicillin resistance (Amp.sup.R) gene derived from Tn3, and a kanamycin resistance gene (Kan.sup.R) derived from Tn903. It contains a p15A origin of replication derived from pSC101, allowing it to coexist in cells with plasmids of the ColE1 compatibility group (e.g., pBR322, pUC19), and is considered to be a low-medium copy number vector, with about 15 copies per cell.
Reference: Chang, A. and Cohen, S. (1978) J. Bacteriol. 134: 1141-1156.
Source: NEB

Designation: pACYC184
Markers: Tet.sup.R, Cat.sup.R
Size (bp): 4245
Description: pACYC184 carries a gene conferring resistance to tetracycline (Tet.sup.R) and a gene encoding chloramphenicol acetyltransferase, conferring resistance to chloramphenicol (Cat.sup.R). It has the same replicon as pACYC177.
Reference: Chang, A. and Cohen, S. (1978) J. Bacteriol. 134: 1141-1156; sequence reported by Rose, R. E. (1988) Nucleic Acids Res. 16: 355.
Source: Boca Scientific

Designation: pTwist-Chlor-MC
Markers: Cat.sup.R
Size (bp): 1953
Description: Synthetic cloning vector conferring resistance to chloramphenicol and comprising a medium copy number (MC) p15A bacterial replicon used to facilitate cloning of synthetic sequences.
Source: Twist Biosciences

Designation: pTwist-Kan-MC
Markers: Kan.sup.R
Size (bp): 2105
Description: Synthetic cloning vector conferring resistance to kanamycin and comprising a medium copy number (MC) p15A bacterial replicon used to facilitate cloning of synthetic sequences.
Source: Twist Biosciences

Designation: pTwist-Amp-HC
Markers: Amp.sup.R
Size (bp): 2221
Description: Synthetic cloning vector conferring resistance to ampicillin and comprising a high copy number (HC) pMB1/ColE1/pUC bacterial replicon used to facilitate cloning of synthetic sequences.
Source: Twist Biosciences

Designation: pMAK705
Markers: Cat.sup.R, lacZ alpha
Size (bp): 5593
Description: Derived from pH01 and pMAK700, containing a pSC101.sup.ts replicon, a cat gene and partial amp gene from pBR325, and a lacZalpha segment from pUC19.
Reference: Hamilton et al, (1989)

Designation: pFastBac1
Markers: Amp.sup.R, Gent.sup.R
Size (bp): 4775
Description: Mini-Tn7 donor plasmid derived from pMON14327, containing the AcNPV polyhedrin promoter, a multiple cloning site (MCS), and an SV40 poly(A) transcriptional terminator segment between the left and right arms of Tn7.
Reference: Ciccarone et al (1997), based on Luckow et al (1993)
Source: Thermo Fisher

Designation: pMON7124
Markers: Tet.sup.R
Size (bp): 13,328
Description: pBR322 comprising Tn7 transposase genes tns A, B, C, D, and E, plus the right end of Tn7 (Tn7R).
Reference: Barry (1988); (Sequenced by D. Esposito, pers. com.)
Source: Thermo Fisher

Designation: bMON14272
Markers: Kan.sup.R
Size (bp): ~142,278
Description: Baculovirus shuttle vector comprising a contiguous segment encoding a kanamycin resistance gene (Kan.sup.R), a lacZalpha-mini-attTn7, and a mini-F replicon (stable, IncFI, very low copy number) inserted into the polyhedrin locus of the baculovirus Autographa californica Nuclear Polyhedrosis Virus (AcNPV) E2 variant.
Reference: Luckow et al (1993); (Sequenced by D. Esposito, pers. com.)
Source: Thermo Fisher
[0246] Table 7 summarizes features of sequences and vectors represented by SEQ ID NOS 1-198.
[0247] Tables 24 and 26 summarize features of Twist vectors 1-40 represented by SEQ ID NOS 199-240.
TABLE-US-00010 TABLE 7 Summary Table of Sequences SEQ lD Name Description Length Type NO Tn7 Nucleotide sequence 14067 DNA 01 of wild-type Tn7 (GenBank Acc. No. BM_NC_002525), found in a plasmid isolated from E. coli. attTn7 near 3′ Sequences extending from −2, −1, 61 DNA 02 end of E. coli glmS 0, +1 +2, and +3 to +58 of the gene attachment site for Tn7 near the E. coli glmS gene, where positions −2 to +2 are duplicated as 5 bp sequences at both ends of a Tn7 element after transposition into this sequence. 5-bp duplication Junction of 5-bp duplication 13 DNA 03 at Tn7L in nearTn7L inserted between attTn7 positions −2 to +2 of attTn7 near 3′ end of E. coli glmS gene 5-bp duplication Junction of 5-bp duplication 69 DNA 04 at Tn7R in near Tn7R inserted between attTn7 positions −2 to +2 of attTn7 near 3′ end of E. coli glmS gene. mini-attTn7 Synthetic lacZ-alpha-mini- 549 DNA 05 attTn7 sequence Truncated lacZalpha- Synthetic truncated lacZalpha- 366 DNA 06 mini-attTn7 mini-attTn7 3′ end of Type I cat Sequences From the TatI/ScaI 76 DNA 07 gene adding site to the BaeGI/Bme1508I SrfI/XmaI sites at the 3′ end of the Type I cat gene, adding SrfI and XmaI sites Polypeptide sequence encoded 10 PRT 08 at carboxy terminal region of Type I CAT protein, represented by QYCDEWQGGA* 3′ end of Type I Sequences From the Tat/ScaI 76 DNA 09 cat gene changing site to the BaeGI/Bme1508I GAT to TAA stop at the 3′ end of the Type I codon cat gene, adding SrfI and XmaI sites, changing the GAT to a TAA stop codon. 3′ end of Type I Sequences From the Tat/ScaI 76 DNA 10 cat gene site to the BaeGI/Bme1508I changing GAT codon at the 3′ end of the Type I to TGA stop cat gene, adding SrfI and codon XmaI sites, changing the GAT to a TGA, stop codon. 3′ end of Type I Sequences From the Tat/ScaI 76 DNA 11 cat gene site to the BaeGI/Bme1508I changing GAT at the 3′ end of the Type I codon to a TAG cat gene, adding SrfI and stop codon XmaI sites, changing the GAT to a TAG stop codon. 
3′ end of the Type 3′ end of the Type I cat 100 DNA 12 I cat gene, adding gene, adding SrfI and XmaI SrfI and XmaI sites, sites, before changing the Before changing the GAT to a TAA, TGA, or TAG GAT to a TAA, TGA, stop codon, and adding an or TAG stop codon, overlapping mini-attTn7 site and adding an overlapping mini- attTn7 site 3′ end of Type I Sequences From the Tat/ScaI 100 DNA 13 cat gene with site to the BaeGI/Bme1508I TAA stop codon at the 3′ end of the Type I and overlapping cat gene, adding SrfI and mini-attTn7 XmaI sites, changing the GAT to a TAA stop codon, and adding an overlapping mini-attTn7 site. 3′ end of Type I cat Sequences From the Tat/ScaI 100 DNA 14 gene with TGA stop site to the BaeGI/Bme1508I codon and overlapping at the 3′ end of the Type I mini-attTn7 cat gene, adding SrfI and XmaI sites, changing the GAT to a TGA, stop codon, and adding an overlapping mini-attTn7 site. 3′ end of Type I cat Sequences From the Tat/ScaI 100 DNA 15 gene with TAG site to the BaeGI/Bme1508I stop codon and at the 3′ end of the Type I overlapping cat gene, adding SrfI and mini-attTn7 XmaI sites, changing the GAT to a TAG stop codon, and adding an overlapping mini-attTn7 site 3′ end of Type I Sequences From the TatI/ScaI 93 DNA 16 cat gene adding site to the BaeGI/Bme1508I SrfI and XmaI sites, at the 3′ end of Type I cat before changing gene, adding SrfI and XmaI TGCGAT to double stop sites, changing the TGC to codons a TAA, TGA, or TAG stop codon, and the GAT to a TAA stop codon, adding mini-attTn7 overlapping with the first stop codon 3′ end of Type I Sequences From the TatI/ScaI 93 DNA 17 CAT gene with site to the BaeGI/Bme1508I TGCGAT changed at the 3′ end of Type I cat to TAATAA double gene, adding SrfI and XmaI stop codons and sites, changing the TGC to overlapping mini- a TAA stop codon, and the attTn7 GAT to a TAA stop codon, adding mini-attTn7 overlapping with the first stop codon 3′ end of Type I Sequences From the TatI/ScaI 93 DNA 18 cat gene with site 
to the BaeGI/Bme1508I TGCGAT changed to at the 3′ end of Type I cat TGATAA double stop gene, adding SrfI and XmaI codons and sites, changing the TGC to overlapping mini- a TAA stop codon, and the attTn7 GAT to a TAA stop codon, adding mini-attTn7 overlapping with the firs t stop codon 3′ end of Type I Sequences From the TatI/ScaI 93 DNA 19 cat gene with site to the BaeGI/Bme1508I TGCGAT changed to at the 3′ end of Type I cat TAGTAA double stop gene, adding SrfI and XmaI codons and sites, changing the TGC to overlapping mini- a TGA stop codon, and the attTn7 GAT to a TAA stop codon, adding mini-attTn7 overlapping with the first stop codon 3′ end of a Type I Sequences at the 3′ end 39 DNA 20 cat gene after of a Type I cat gene transposition into after transposition of a an overlapping mini-Tn7 into an over mini-atTn7 overlapping mini- attTn7 site. Polypeptide sequences 3′ 12 PRT 21 end of a Type I cat gene after transposition of a mini-Tn7 into an over overlapping mini- attTn7 site 3′ end of Tn7R 3′ end of Tn7R after 22 DNA 22 after transposition transposition an over an over overlapping overlapping mini- attTn7 mini-attTn7 site site 3′ end of Type I Sequences at the 3′ end 67 DNA 23 cat gene to of a Type I cat gene mimic insertion that mimic Tn7L at the of Tn7L replacing junction of mini-Tn7 stop codon for replacing a stop codon Cys codon for a Cys codon in an overlapping mini-attTn7 site Polypeptide sequence that 7 PRT 24 mimics insertion of the Tn7L replacing the stop codon for a Cys codon, restoring activity to the encoded CAT fusion protein lacZ nt 1-180 5′ end of E. coli lacZ 180 DNA 25 gene nucleotides 1-180 Polypeptide encoded by 5′ 60 PRT 26 end of E. coli lacZ gene nucleotides 1-180 lacZdeltaM15 nt 1-57 5′ end of lacZ delta M15 57 DNA 27 gene of E. coli encoding amino acids 1-11 and 42-49 Polypeptide 5′ end of lacZ 19 PRT 28 delta M15 gene of E. 
coli encoding amino acids 1-11 and 42-49 pUC19 lacZalpha gene LacZ alpha gene with MCS 360 DNA 29 region pUC19 from positions 1-360 Polypeptide encoded by LacZ 106 PRT 30 alpha gene with MCS region pUC19 from positions 1-360 lacZ 1 to 260 Sequences from 1−260 of the 260 DNA 31 lacZ gene, but polypeptide sequence diverges around nucleotide 186 compared to those in pUC19 Polypeptide encoded by 62 PRT 32 sequences from 1−260 of the lacZ gene, but polypeptide sequence diverges around nucleotide 186 compared to those in pUC19 PuvII to KasI PuvII to KasI sites of 120 DNA 33 sites of LacZ alpha LacZ alpha gene pUC18 or gene pUC18 or pUC19 pUC19 Polypeptide encoded by PuvII 40 PRT 34 to KasI sites of LacZ alpha gene pUC18 orpUC19 PuvII to KasI PuvII to KasI sites of LacZ 120 DNA 35 sites of LacZ alpha gene pUC18 or pUC19 alpha gene pUC18 with synthetic or pUC19 with oligonucleotides comprising synthetic two TAA stop codons near oligonucleotides codons encoding NS comprising two TAA stop codons replacing codons encoding NS Polypeptide encoded by PuvII 16 PRT 36 to KasI sites of LacZ alpha gene pUC18 or pUC19 with synthetic oligonucleotides comprising two TAA stop codons near codons encoding NS PuvII to KasI sites PuvII to KasI sites of LacZ 120 DNA 37 of LacZ alpha alpha gene pUC18 or pUC19 gene pUC18 or pUC19 with synthetic with synthetic oligonucleotides oligonucleotides comprising two TAA stop comprising two codons near codons encoding TAA stop codons SE near codons encoding SE Polypeptide encoded by PuvII 16 PRT 38 to KasI sites of LacZ alpha gene pUC18 or pUC19 with synthetic oligonucleotides comprising two TAA stop codons near codons encoding SE PuvII to KasI sites PuvII to KasI sites of LacZ 120 DNA 39 of LacZ alpha alpha gene pUC18 or pUC19 with gene pUC18 or pUC19 synthetic oligonucleotides with synthetic comprising two TAA stop oligonucleotides codons near codons encoding comprising two TAA EE stop codons near codons encoding EE Polypeptide encoded by PuvII 16 PRT 
40 to KasI sites of LacZ alpha gene pUC18 or pUC19 with synthetic oligonucleotides comprising two TAA stop codons near codons encoding EE PuvII to KasI sites PuvII to KasI sites of LacZ 120 DNA 41 of LacZ alpha alpha gene pUC18 or pUC19 gene pUC18 or pUC19 with synthetic with synthetic oligonucleotides comprising oligonucleotides two TAA stop codons nea comprising two r codons encoding EA TAA stop codons near codons encoding EA Polypeptide encoded by PuvII 16 PRT 42 to KasI sites of LacZ alpha gene pUC18 or pUC19 with synthetic oligonucleotides comprising two TAA stop codons near codons encoding EA PuvII to KasI sites PuvII to KasI sites of LacZ 120 DNA 43 of LacZ alpha gene alpha gene pUC18 or pUC19 pUC18 or pUC19 with with synthetic synthetic oligonucleotides comprising oligonucleotides two TAA stop codons near comprising two TAA codons encoding AR stop codons near codons encoding AR Polypeptide encoded by PuvII 16 PRT 44 to KasI sites of LacZ alpha gene pUC18 or pUC19 with synthetic oligonucleotides comprising two TAA stop codons near codons encoding AR PuvII to just beyond PuvII to KasI sites of LacZ 84 DNA 45 the KasI sites alpha gene pUC18 or pUC19 of LacZ alpha gene pUC18 or pUC19 Polypeptide encoded by PuvII 28 DNA 46 to KasI sites of LacZ alpha gene pUC18 or pUC19 PuvII to KasI sites PuvII to KasI sites of LacZ 84 DNA 47 of LacZ alpha gene alpha gene pUC18 or pUC19 pUC18 or pUC19 with stop codons replacing with stop codons SE codon replacing NS codons PuvII to KasI sites PuvII to KasI sites of LacZ 84 DNA 48 of LacZ alpha gene alpha gene pUC18 or pUC19 pUC18 or pUC19 with with stop codons replacing stop codons NS codons replacing NS codons PuvII to KasI sites PuvII to KasI sites of LacZ 84 DNA 49 alpha gene pUC18 or pUC19 of LacZ alpha gene with stop codons replacing pUC18 or pUC19 with EE codons stop codons replacing EE codons PuvII to KasI sites PuvII to KasI sites of LacZ 84 DNA 50 of LacZ alpha gene alpha gene pUC18 or pUC19 pUC18 or pUC19 with with 
stop codons replacing stop codons replacing EA codons EA codons PuvII to KasI sites PuvII to KasI sites of LacZ 84 DNA 51 of LacZ alpha gene alpha gene pUC18 or pUC19 pUC18 or pUC19 with with stop codons replacing stop codons replacing AR codons AR codons Overlapping mini-Tn7 Synthetic mini-attTn7 from −2 85 DNA 52 ending with KasI site to +2 with unknown nucleotides at the insertion site, followed by +3 to +58, then Synthetic SalI, KasI and other restriction sites Sequences near double Sequences near double stop 43 DNA 53 stop codons replacing codons replacing EA codons EA codons in lacZalpha in lacZalpha peptide after peptide after transposition of a mini-Tn7 transposition of a into an overlapping mini-Tn7 into an mini-attTn7 site overlapping mini-attTn7 site Junction near target Junction near target site 14 DNA 54 site reading after transposition into frame +1 TAA stop codon reading frame +1 Junction near target Junction near target site 15 DNA 55 site reading frame +2 after transposition into TAA stop codon reading frame +2 Junction near target Junction near target site 16 DNA 56 site reading frame +3 after transposition into TAA stop codon reading frame +3 pUC18 with EcoRI-SalI pUC18 lacZalpha region 381 DNA 57 mini- attTn7 containing an EcoRI-SalI fragment from bMON 14272 comprising a mini-attTn7 fragment Chimeric fusion protein 126 PRT 58 comprising lacZalpha fragment with insertion of EcoRI-SalI fragment comprising a synthetic mini- attTn7 fragment pACYC177 near PstI Sequences near the unique PstI 60 DNA 59 site site in the beta lactamase gene of pACYC177 Polypeptide encoded by sequences 20 PRT 60 near the unique PstI site in the beta lactamase gene of pACYC177 pACYC177 PstI to EagI Sequences near unique PstI 60 DNA 61 site in pACYC177 mutated to EagI site pACYC177 PstI to PuvII Sequences near unique PstI 60 DNA 62 site mutated to unique PuvII site pACYC177 near 3′ end pACYC177 with PstI site near 60 DNA 63 of NPT-II gene the 3′ end of the NPT-II gene that 
don′ t change the amino acids “LQ” encoded by the wild-type gene Polypeptide encoded in 15 PRT 64 pACYC177 with PstI site near the 3′ end of the NPT-II gene that don′ t change the amino acids “LQ” encoded by the wild-type gene ACYC177 with PstI site Sequences near 3′ end of 60 DNA 65 near 3′ end of NPT-II pACYC177 with a new PstI gene site that don′ t change amino acids “LQ” encoded at that position in the NPT-II gene Polypeptide encoded by 15 PRT 66 sequences near 3′ end of pACYC177 with a new PstI site that don′ t change amino acids “LQ” encoded at that position in the NPT-II gene pKM2 3′ end of pKM2 3′ end of NPT-II 51 DNA 67 NPTII gene gene Polypeptide encoded by pKM2 6 PRT 68 3′ end of NPT-II gene pKM243 3′ end of pKM243 3′ end of NPT-II 27 DNA 69 NPT-II gene gene Polypeptide encoded by 8 PRT 70 pKM243 3′ end of NPT-II gene pKM243/1 3′ end of pKM243/1 3′ end of NPT-II 18 DNA 71 NPT-II gene gene Polypeptide encoded by 6 PRT 72 pKM243/1 3′ end of NPT-II gene pKM243-1 3′ end of pKM143-1 3′ end of NPT-II 51 DNA 73 NPT-II gene gene Polypeptide encoded by 16 PRT 74 pKM143-l 3′ end of NPT-II gene pACYC177 3′ end of pACYC177 3′ end of 43 DNA 75 NPT-II gene NPT-II gene Polypeptide encoded by 6 PRT 76 pACYC177 3′ end of NPT-II gene pACYC177-QA 3′ end pACYC177-QA 3′ end of 43 DNA 77 of NPT-II gene NPT-II gene Polypeptide encoded by 6 PRT 78 pACYC177-QA 3′ end of NPT-II gene PACYC177-PS pACYC177-PS 3′ end of NPT-II 43 DNA 79 gene Polypeptide encoded by 8 PRT 80 pACYC177-PS 3′ end of NPT-II gene pACYC177-PSFNAVVYHS pACYC177-PSFNAWYHS 3′ end of 51 DNA 81 NPT-II gene Polypeptide encoded by 16 PRT 82 pACYC177-PSFNAWYHS 3′ end of NPT-II gene pACYC177-Q** pACYC177-Q** with two TAA stop 43 DNA 83 codons after Q codon Polypeptide encoded by 7 PRT 84 pACYC177-Q** with two TAA stop codons after Q codon pACYC177 P** pACYC177-P** with two TAA stop 43 DNA 85 codons after a P codon Polypeptide encoded by pACYC177-P** 7 PRT 86 with two TAA stop codons after a P codon pACYC177 3′ end of 
pACYC177 3′ end of 50 DNA 87 beta-lactamase gene beta-lactamase gene Polypeptide encoded by pACYC177 3′ 8 PRT 88 end of beta-lactamase gene pACYC177-K*** pACYC177-K*** with two TAA stop 50 DNA 89 codons before the normal TAA stop codon Polypeptide encoded by pACYC177- 6 PRT 90 K*** with two TAA stop codons before the normal TAA stop codon pACYC177~KH** pACYC177-KH** with two stop 50 DNA 91 codons after KH, one replacing “essential Tryptophan (W) codon Polypeptide encoded 7 PRT 92 by pACYC177-KH** with two stop codons after KH, one replacing “essential Tryptophan (W) codon pACYC177-KH** with pACYC177-KHW** with 50 DNA 93 two stop codons two stop codons after KH, one at site of normal replacing “essential TAA stop codon Tryptophan (W) codon Polypeptide encoded by 8 PRT 94 pACYC177-KHW** with two stop codons at site of normal TAA stop codon pAYC177-AAG pACYC177-AAG 11 DNA 95 pACYC177-AAGT pACYC177-AAGT 12 DNA 96 pACYC177-AAGTA pACYC177-AAGTA 13 DNA 97 pACYC177-AAGCAT pACYC177-AAGCAT 14 DNA 98 pACYC177-AAGCATT pACYC177-AAGCATTT 15 DNA 99 pACYC177-AAGCATTA pACYC177-AAGCATTA 16 DNA 100 PACYC177-AAGCATTGG pACYC177-AAGCATTGG 17 DNA 101 pACYC177-AAGCATTGGT pACYC177-AAGCATTGGT 18 DNA 102 pACYC177-AAGCATTGGTA pACYC177-AAGCATTGGTA 19 DNA 103 pACYC177-PstI-BglI pACUC177-PstI-BglI spanning 141 DNA 104 junction between alpha and omega fragments of beta- lactamase Polypeptide encoded by 47 PRT 105 pACUC177-PstI-BglI spanning junction between alpha and omega fragments of beta- lactamase pACYC177-PstI-Asel pACYC177-PstI-Asel with 105 DNA 106 with linker synthetic linker at junction of alpha and omega fragments of beta lactamase Polypeptide encoded by 35 PRT 107 pACYC177-PstI-Asel with synthetic linker at junction of alpha and omega fragments of beta lactamase pACYC177-bla- pACYC177-bla-alpha-omega-mini- 180 DNA 108 alpha-omega- attTn7 with mini-attTn7 at the mini-attTn7 junction of the alpha and omega peptides of beta-lactamase Polypeptide encoded by pACYC177- 60 PRT 109 
bla-alpha-omega-mini- attTn7 with mini-attTn7 at the junction of the alpha and omega peptides of beta-lactamase Tn10 Tetracycline lnterdomain loop in Tn10 401 PRT 110 resistance protein tetracycline resistance protein ETKNTRDNTDTEVGVETQSNSVYlTLF pACYC184 Tetracycline lnterdomain loop in pACYC184 396 DNA 111 resistance protein tetracycline gene indirectly derived from pSClOl isolated from Shigella flexneri ESHKGERRPMPLRAFNPVSSFRWARGM pACYC184 reverse Sequence from the reverse 210 DNA 112 complement complement of pACYC184 spanning Tet flanking the interdomain Interdomain loop of the tetracyclin Loop e resistance protein Polypeptide encoded by 70 PRT 113 sequence from the reverse complement of pACYC184 flanking the interdomain loop of the tetracycline resistance protein pACYC184 reverse pACYC184 reverse complement 297 DNA 114 complement Tet-mini-attTn7, with Tet-mini-attTn7 synthetic mini-attTn7 inserted near SalI site in the sequences encoding the interdomain linker of the tetracycline resistance protein Polypeptide encoded by pACYC184 99 PRT 115 reverse complement Tet- mini-attTn7, with synthetic mini-attTn7 inserted near SalI site in the sequences encoding the interdomain linker of the tetracycline resistance protein EcoRI-SalI fragment An EcoRI-SalI fragment 95 DNA 116 comprising comprising a synthetic a synthetic mini-attTn7 mini-attTn7 NotI-PspOMI linker Synthetic NotI-PspOMI 22 DNA 117 linker NotI-scar-PspOMI linker Synthetic Linker with 37 DNA 118 NotI-scar-PspOMI sites PspOMI-NotI linker PspOMI-NotI linker 22 DNA 119 PspOMI-scar-NotI linker Synthetic PspOMI-scar- 37 DNA 120 NotI linker AbsI-SgrDI linker Synthetic AbsI-SgrDI 24 DNA 121 linker AbsI-scar-SgrDI linker Synthetic AbsI-scar- 40 DNA 122 SgrDI linker SgrDI-AbsI linker Synthetic SgrDI-AbsI 24 DNA 123 linker SgrDI-scar-AbsI linker Synthetic SgrDI-scar- 40 DNA 124 AbsI linker MauBI-AscI linker Synthetic MauBI-AscI 24 DNA 125 linker MauBI-scar-AscI linker Synthetic MauBI-scar- 40 DNA 126 AscI linker 
AscI-MauBI linker Synthetic AscI-MauBI 24 DNA 127 linker AscI-scar-MauBI linker Synthetic AscI-scar- 40 DNA 128 MauBI linker MauBI-AbsI linker MauBI-AbsI 24 DNA 129 MauBI-SgrDI linker MauBI-SgrDI 24 DNA 130 AscI-Abs linker AscI-AbsI 24 DNA 131 AscI-SgrDI linker AscI-SgrDI 24 DNA 132 AbsI-MauBI linker AbsI-MauBI 24 DNA 133 Abs-AscI linker AbsI-Asd 24 DNA 134 SgrDI-MauBI linker SgrDI-MauBI 24 DNA 135 SgrDI-AscI linker SgrDI-AscI 24 DNA 136 MauBI-PacI-AbsI MauBI-PacI-AbsI 24 DNA 137 MauBI-PacI-SgrDI MauBI-PacI-SgrDI 24 DNA 138 AscI-PacI-AbsI linker AscI-PacI-AbsI 24 DNA 139 AscI-PacI-SgrDI linker AscI-PacI-SgrDI 24 DNA 140 AbsI-PacI-MauBI linker AbsI-PacI-MauBI 24 DNA 141 AbsI-PacI-AscI linker AbsI-PacI-AscI 24 DNA 142 SgrDI-PacI-MauBI linker SgrDI-PacI-MauBI 24 DNA 143 SgrDI-PacI-AscI linker SgrDI-PacI-AscI 24 DNA 144 SgrDI-PacI-AbsI-AvrII- MauBI-PacI-AbsI- 54 DNA 145 SgrDI-PacI-AscI linker AvrII-SgrDI-PacI- AscI MauBI-PacI-SgrDI-AvrII- MauBI-PacI-SgrDI- 54 DNA 146 AbsI-PacI- AscI linker AvrII-AbsI-PacI- AscI AscI-PacI- AbsI-AvrII- AscI-PacI-AbsI- 54 DNA 147 SgrDI-PacI- MauBI linker AvrII-SgrDI-PacI- MauBI AscI-PacI- SgrDI-AvrII- AscI-PacI-SgrDI- 54 DNA 148 AbsI-PacI- MauBI linker AvrII-AbsI-PacI- MauBI AbsI-PacI-MauBI- AvrII- AbsI-PacI-MauBI- 54 DNA 149 AscI-PacI- SgrDI linker AvrII-AscI-PacI- SgrDI AbsI-PacI-AscI-AvrII-MauBI- AbsI-PacI-AscI- 54 DNA 150 PacI- SgrDI linker AvrII-MauBI-PacI- SgrDI SgrDI-PacI-MauBI-AvrII- SgrDI-PacI-MauBI- 54 DNA 151 AscI-PacI- AbsI linker AvrII-AscI-PacI- AbsI SgrDI-PacI-AscI-AvrII- SgrDI-PacI-AscI- 54 DNA 152 MauBI-PacI- AbsI linker AvrII-MauBI-PacI- AbsI MauBI-PacI-AscI linker MauBI-PacI-AscI 24 DNA 153 AscI-PacI-MauBI linker AscI-PacI-MauBI 24 DNA 154 AscI-PacI-SgrDI linker AbsI-PacI-SgrDI 24 DNA 155 SgrDI-PacI-AbsI linker SgrDI-PacI-AbsI 24 DNA 156 pTwist+Kan+MC Twist Biosciences 2007 DNA 157 cloning vector for insertion of synthetic DNA sequences, comprising a medium copy p15A bacterial replicon and conferring resistance to 
kanamycin pTKM-MaAbAvSgAs pTwist-Kan-MC vector 2159 DNA 158 with MauBI-PacI-AbsI- AvrII-SgrDI-PacI- AscI polylinker pTKM-CATd8 cat gene from pACYC184 876 DNA 159 polypeptide 219 PRT 160 pTKM-CAT-TAA cat gene from pACYC184 876 DNA 161 with one TAA stop codon polypeptide 212 PRT 162 pTKM-CAT-TAATAA cat gene from pACYC184 876 DNA 163 with two TAA stop codons polypeptide 211 PRT 164 pTKM-CAT-TAATAA- cat gene from pACYC184 889 DNA 165 mini-attTn7 and two TAA stop codons followed by mini-attTn7 target site polypeptide 211 PRT 166 pTKMC-CAT-Tn7Lrf1 gene fusion comprising 896 DNA 167 cat gene from pACYC194 fused to reading frame 1 from end of Tn7L polypeptide 216 PRT 168 pTKMC-CAT-Tn7Lrf2 gene fusion comprising cat 897 DNA 169 gene from pACYC194 fused to reading frame 2 from end of Tn7L polypeptide 228 PRT 170 pTKMC-CAT-Tn7Lrf3 gene fusion comprising cat 898 DNA 171 gene from pACYC194 fused to reading frame 3 from end of Tn7L polypeptide 220 PRT 172 pTwist-Chlor-MC cloning pTwist-Chlor-MC cloning vector 1953 DNA 173 vector pTwist+Chlor+MC pTwist+Chlor+MC vector with 2007 DNA 174 vector with MauBI-PacI- MauBI-PacI-AbsI-AvrII-SgrDI- AbsI-AvrII-SgrDI- PacI-AscI polylinker PacI-AscI polylinker pTCM-Kan-CGRT gene fusion comprising kanamycin 1028 DNA 175 gene from pACYC177 extended to also encode CGRTK and one stop codon polypeptide 276 PRT 176 pTCM-Kan-PSFNAVVYHS gene fusion comprising kanamycin 1040 DNA 177 gene from pACYC177 extended to also encode PSFNAVVYHS and one stop codon polypeptide 281 PRT 178 pTCM-Kan-PS gene fusion comprising kanamycin 1016 DNA 179 gene from pACYC177 extended to also encode PS and one stop codon polypeptide 273 PRT 180 pTCM-Kan-Tn7Lrf1 gene fusion comprising kanamycin 1074 DNA 181 gene from pACYC177 extended to also encode CGRTK and one stop codon followed by partial Tn7L polypeptide 276 PRT 182 pTCM-Kan-Tn7Lrf2 gene fusion comprising kanamycin 1075 DNA 183 gene from pACYC177 extended to also encode LWADKlVGNWEGWKWSF and one stop codon followed by 
partial Tn7L in reading frame 2 polypeptide 288 PRT 184 pTCM-Kan-Tn7Lrf3 gene fusion comprising kanamycin 1076 DNA 185 gene from pACYC177 extended to also encode PVGSQNSWELGGVEMEFLRII and one stop codon in reading frame 3 polypeptide 290 PRT 186 pTCM-Kan-PS-mini-attTn7 gene fusion comprising kanamycin 1069 DNA 187 gene from pACYC177 extended to also encode PS and one stop codon and overlapping mini-attTn7 site polypeptide 273 PRT 188 pTCM-Kan-PS gene fusion comprising kanamycin 1016 DNA 189 gene from pACYC177 extended to also encode PS and one stop codon polypeptide 193 PRT 190 pTCM-Kan Unaltered kanamycin gene 1016 DNA 191 from pACYC177 and one TAA stop codon polypeptide 271 PRT 192 pTKM-lacZalpha- lacZalpha gene comprising 837 DNA 193 mini-attTn7 mini-attTn7 target site polypeptide 180 PRT 194 pTKM-lacZalpha- lacZalpha gene comprising 687 DNA 195 micro-attTn7 micro-attTn7 target site polypeptide 130 PRT 196 pTwist-Amp-HC pTwist-Amp-HC cloning vector 2221 DNA 197 pTAH-MaAbAvSgAs pTwist+Amp+HC with MauBI-AbsI- 2275 DNA 198 AvrII-SgrDI-AscI polylinker
[0248] Tables 24 and 26 also summarize features of Twist vectors 1-40 represented by SEQ ID NOS 199-240.
Example 1—Design of Modular Sequences Encoding an Active LacZalpha-Mini-attTn7 Fusion Polypeptide
[0249] The development of cloning vectors comprising a multiple cloning site (MCS) within or between gene segments, allowing rapid and easy screening for vectors comprising inserts, greatly facilitated the cloning and analysis of a wide variety of prokaryotic and eukaryotic genes. High copy number vectors, such as pUC8 and pUC9, typically have an MCS inserted into a short segment at the 5′ end of the lacZ gene encoding an inactive fragment of β-galactosidase called the alpha peptide. The alpha peptide (“α-donor”) can bind to and complement an inactive α-acceptor, lacking a segment at the N-terminal region of the full length β-galactosidase, to restore activity of the enzyme [Juers et al (2012) Protein Science 21:1792-1807].
[0250] Two variants of β-galactosidase observed in early studies, one lacking residues 23-31 and the other lacking residues 11-41, caused the tetrameric enzyme to dissociate into inactive dimers. Peptides that included some or all of the missing residues, such as 3-41 or 3-92, restored the activity of the enzyme. Crystallographic studies have since shown that the donor binds to the site previously occupied by the deleted N-terminal residues, stabilizing and helping to restore the tetrameric structure. Residues from about 13 to 20 in adjacent subunits contact each other, and residues 29-33 occupy a tunnel between Domain 1 and the remainder of the acceptor polypeptide. Because critical catalytic residues are located in several domains, dissociation of the tetramer into dimers disrupts all four active sites, abolishing the activity of the enzyme. The length of the complementing peptide is not important, as long as about 41 amino acid residues are present.
[0251] In many common E. coli strains used for cloning, the acceptor polypeptide is encoded by the lacZΔM15 gene, which lacks residues 11-41 of the full length enzyme, having 1,024 residues. (In many older papers, the polypeptide numbering schemes apparently omit the amino-terminal methionine residue, which is processed off in bacteria, so the second encoded amino acid is designated as +1.) Many of these cells also contain the lacI gene encoding a repressor protein that binds to the lac operator in the vector, repressing transcription of the lacZalpha gene in the cloning vector. When transformed host cells are spread on agar plates containing an appropriate antibiotic (typically ampicillin for many vectors), plus IPTG (isopropyl-β-D-thiogalactoside) and a chromogenic substrate, such as X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside), the IPTG induces transcription from the lac promoter and expression of the lacZalpha complementing peptide. Cells harboring vectors where the lacZalpha gene is intact form blue colonies due to hydrolysis of X-gal to galactose and 5-bromo-4-chloro-3-hydroxy-indole, which is converted in the presence of oxygen to the insoluble dimeric blue product, 5,5′-dibromo-4,4′-dichloro-indigo. Cells containing vectors where a segment of DNA is inserted into the multiple cloning site, disrupting the expression of the lacZalpha complementing peptide, are white. White colonies are typically purified by restreaking a second time on the same type of plate, to ensure that they are not derived from a mixture of cells with a large white colony covering a small blue colony on a crowded plate. Plasmid DNA samples purified from white colonies are then characterized by analysis with restriction enzymes, gene amplification, DNA sequencing, or many other techniques.
[0252] While blue/white or similar colony color screening methods based on complementation between fragments of beta-galactosidase were developed in the early 1980s [Vieira and Messing (1982) Gene 19(3): 259-268], the first apparent use of this system to screen for insertions into or near an attachment site for a transposon was reported by the developers of the baculovirus shuttle vector (bacmid) system [Luckow et al, (1993)]. In their studies, a synthetic mini-attTn7 segment comprising the 3′ end of the glmS gene and extending into the intergenic region towards the phoS gene was inserted into the multiple cloning site of a lacZalpha gene derived from a cloning vector, in the orientation opposite to its natural transcriptional direction and in-frame with sequences upstream and downstream from the MCS, to encode a functional tripartite fusion protein that could complement the acceptor polypeptide encoded by the lacZΔM15 gene on the chromosome. DH10B cells harboring plasmids comprising this segment formed blue colonies on agar plates in the presence of an antibiotic, the inducer IPTG, and the chromogenic substrate X-gal. DH10B cells harboring the bacmid bMON14272, conferring resistance to Kanamycin, and the compatible helper plasmid pMON7124, conferring resistance to Tetracycline, also form blue colonies on plates containing these antibiotics, plus IPTG and X-gal, or similar types of chromogenic substrates (e.g., Bluo-gal, which produces a darker blue product than X-gal, which is turquoise).
[0253] When a donor plasmid, such as pMON14327 comprising the β-glucuronidase gene under the control of the polyhedrin promoter, or vectors derived from the pFastBac series of vectors noted above, is introduced into E. coli DH10B harboring the bacmid and the helper plasmid, the mini-Tn7 cassette from the donor plasmid in many cases will transpose into the synthetic mini-attTn7 target site located on the low copy number bacmid, or into the attTn7 located near the 3′ end of the glmS gene on the chromosome. Insertion into the synthetic site on the bacmid produces colonies that are white, in the presence of Kanamycin, Tetracycline, IPTG, and X-gal, in a background of blue colonies, that have the mini-Tn7 inserted into the unique site on the chromosome. Sectored colonies, part blue and part white, were sometimes observed on plates spread with bacteria, and when the white portions were restreaked on similar plates, white colonies always gave rise to white colonies.
[0254] Despite the remarkable success of this system to facilitate the expression of a wide variety of proteins in cultured insect cells for use in basic and applied research, particularly therapeutic polypeptides, vaccines, and components of cell and gene therapy vector systems over the past 26 years, there is a continuing need to develop new and improved vectors that facilitate the cloning and insertion of gene expression cassettes into large plasmids and viral shuttle vectors. Improvements to shuttle vectors comprising the target site, the donor plasmid, and the helper plasmid, may permit the development of more rapid methods for the assembly and characterization of complex vectors comprising one or more genes of interest, suitable for use in a wide variety of applications, compared to vectors and methods that are currently available from academic and corporate institutions.
[0255] The synthetic lacZ-alpha-mini-attTn7 target site used in the bacmid system described above was derived from pMON7134, which contains a 523 bp HincII fragment of pEAL1 containing attTn7 inserted into the HincII site of pEMBL9 [Barry (1988)]. A 112 bp fragment was amplified by polymerase chain reaction (PCR) using two primers to generate a fragment containing an 87 bp functional attTn7 (corresponding to positions −23 to +61 with respect to the insertion site at position 0) with EcoRI and SalI 5′ sticky ends. The 112 bp amplified fragment was cloned into the lacZalpha region of the cloning vector pBCSKP to generate the vector pMON14192. E. coli DH10B harboring pMON14192 formed blue colonies on plates containing X-gal or Bluo-gal. This plasmid was linearized with ScaI and amplified with primers containing BbsI sites to generate a 708 bp product with EcoRI and SalI compatible sticky ends, and ligated to pMON14181 (containing a Kanamycin resistance gene linked to a mini-F replicon) to form pMON14231 (mini-F-Kan-lacZalpha-mini-attTn7), which formed light blue colonies on plates containing X-gal or Bluo-gal due to its much lower copy number. This plasmid was partially digested with BamHI to generate full-length linear molecules and ligated to the baculovirus transfer vector pMON14118 (˜8,538 bp) digested with BglII to produce two transfer vectors, pMON14271 and pMON14272 (each ˜18,053 bp), which were used to generate the baculovirus shuttle vectors bMON14271 and bMON14272. These shuttle vectors conferred resistance to Kanamycin, formed blue colonies on plates containing X-gal or Bluo-gal, and were infectious when introduced into Spodoptera frugiperda Sf9 cells.
[0256] Key features of a 2033 bp fragment extracted from the sequence of bMON14272, extending from an SbfI site located 124 bp upstream from the 5′ end of the CAP binding site near the lac promoter and operator to a sequence including a SexAI site in the 5′ end of the ytc gene in the cloned mini-F replicon, include the following genetic elements:
[0257] the lac promoter and operator upstream from the coding sequence for the first 5 amino acids of the lacZalpha polypeptide;
[0258] the left part of a multiple cloning site (MCS) derived from pBCSKP;
[0259] the synthetic sequence comprising the attTn7 target;
[0260] the right (second) part of the MCS derived from pBCSKP, and a sequence encoding amino acids 7-59 of the lacZalpha polypeptide; and
[0261] a 123 bp segment encoding 40 additional amino acids, extending beyond the BbsI site to the SexAI site near a TAA stop codon in the 5′ end of the ytc gene of the mini-F replicon sequences.
[0262] It seems remarkable, now more than 26 years after these genetic elements were first designed and assembled, that the system for screening insertions of a transposon into a synthetic attachment site worked as well as it did, and very few attempts, if any, were made by others to improve this aspect of the baculovirus shuttle vector system. It is desirable, though, to remove unnecessary sequences, particularly those within the residual parts of the multiple cloning site, and to systematically shorten and test sequences comprising the synthetic mini-attTn7 target site.
[0263] The sequences from the ATG start codon of the lacZalpha peptide through the end of the SexAI recognition site near the TAA stop codon are shown below. The underlined portions are derived from the multiple cloning sites or extend from the 3′ end of the original pBCSKP cloning vector into adjacent sites in the 5′ end of a non-essential gene found in the F plasmid.
##STR00003##
[0264] None of the underlined sequences is essential to the synthetic target site, and all could be deleted to produce a much shorter synthetic attTn7 target, while preserving key features of the screenable method of detecting transpositions of mini-Tn7 elements into this sequence. While the short sequences at the ends of the mini-attTn7 comprising recognition sites for EcoRI or SalI are not critical to targeting or insertion of mini-Tn7 elements, and are not underlined, they are still useful for extracting and moving this segment from one cloning vector to another, or as a source of material used in a variety of gene amplification techniques.
[0265] One of many possible truncated versions of this sequence is shown below.
##STR00004##
[0266] Sequences shown above and similar sequences are most easily prepared by direct DNA synthesis which are also flanked by sequences comprising one or more recognition sites for restriction enzymes, to facilitate insertion into vectors comprising compatible restriction sites under the control of inducible promoters, such as the lac promoter and operator, and variants thereof. This segment may also be directly linked to a suitable promoter in coupled gene amplification reactions where segments of an upstream promoter and/or a downstream transcriptional terminator are included in the reaction mixture, where there are suitable overlaps between the promoter sequence and the 5′ end of the synthetic lacZalpha-mini-attTn7 target sequence noted above, and the 3′ portion of this sequence overlapping with the 5′ portion of a segment comprising a transcriptional terminator sequence.
[0267] Variants of the synthetic target site are also prepared by systematically deleting nucleotide sequences between the ATG start codon of the lacZalpha polypeptide and sequences just upstream and downstream from the 5-bp Tn7 insertion site that is located 5′ to the TnsD protein binding sites in the 3′ end of the retained portion of the glmS gene. Systematic sets of deletions, designed to retain the reading frame of the chimeric fusion protein, will help define the boundaries and essential residues needed for targeting of mini-Tn7 elements and synthetic derivatives, where the left and right arms of Tn7 are altered by mutagenesis, or genes encoding any of the relevant transposition proteins are mutagenized, and characterized by their ability to transpose into mini-attTn7 target sites, or altered variants of the target site, in this system.
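A systematic, frame-preserving deletion series of this kind can be sketched as follows. This is a minimal illustration only: the sequence and coordinates are hypothetical placeholders rather than the disclosed target site, and the function name is introduced here for illustration. Removing whole codons (multiples of 3 nucleotides) leaves the downstream reading frame unchanged.

```python
def in_frame_deletions(seq, start, end):
    """Enumerate deletions that begin at `start` and remove 3, 6, 9, ... nt,
    never extending past `end`, so the downstream reading frame is retained."""
    variants = []
    for size in range(3, end - start + 1, 3):
        variants.append((size, seq[:start] + seq[start + size:]))
    return variants


# Hypothetical placeholder: ATG, four arbitrary codons, then a TAA stop codon.
placeholder = "ATG" + "GCTGCAGGTGGC" + "TAA"

# Delete progressively longer blocks between the start and stop codons.
series = in_frame_deletions(placeholder, start=3, end=15)
for size, variant in series:
    print(size, variant)
```

Each variant in the series would then be tested for its colony phenotype and for its ability to serve as a transposition target.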
[0268] Modular versions of the genetic cassette comprising the lacZ-attTn7 target site, operably linked to a suitable prokaryotic or eukaryotic promoter may be moved to other plasmids or shuttle vectors by traditional cloning methods, or by more modern methods assembling segments of genes into multifunctional vectors.
[0269] A wide variety of vectors comprising the synthetic lacZ-attTn7 target site and longer or shorter variants may also be used with this system to screen for insertions of mini-Tn7 sequences into a single target maintained on an autonomous replicon or the chromosome of a host cell. These include small and large plasmids that propagate in enteric and non-enteric bacteria; viral shuttle vectors, such as insect and mammalian dsDNA viruses, particularly baculovirus- and herpesvirus-derived shuttle vectors; Ti plasmid and chloroplast-derived vectors used to facilitate the insertion of genes into transformed plant cells and tissues, allowing the generation of transgenic plants; and vectors used in fungal systems to facilitate the expression of gene products for research and in industrial biotechnology applications.
[0270] The following table illustrates phenotypes of colonies of E. coli DH10B harboring different plasmids used in the transposition system, on agar media containing a chromogenic substrate specific for β-galactosidase, such as X-gal or Bluo-gal, in the presence of one or more kinds of antibiotics.
TABLE-US-00011 TABLE 8 Phenotypes of DH10B Harboring Plasmids in lacZalpha-mini-attTn7 Transposition Studies
Designation DH10B/plasmid(s) | Markers | Inc Group | Phenotype on X-gal plates | Stable | Description
bMON14272 (bacmid) | Kan.sup.R | IncFI | Lac plus (blue) | Yes | E. coli DH10B harboring just the bacmid bMON14272 comprising a contiguous segment encoding resistance to Kanamycin, the lacZ-mini-attTn7 target sequence, and the mini-F replicon
pMON7124 (helper) | Tet.sup.R | IncColE1 | Lac minus (white) | Yes | pMON7124 encodes tnsA, B, C, D, and E, near Tn7R on a pBR322-based replicon.
pFastBac1 (donor) | Amp.sup.R, Gent.sup.R | IncColE1 | Lac minus (white) | Yes | The donor plasmid encodes an Ampicillin resistance gene on the backbone and a Gentamycin resistance gene, plus baculovirus polyhedrin promoter, MCS and SV40 poly(A) between Tn7L and Tn7R.
bMON14272 + pMON7124 | Kan.sup.R, Tet.sup.R | IncFI + IncColE1 | Lac plus (blue) | Yes | Bacmid plus helper plasmids
bMON14272 + pMON7124 + pFastBac1 | [Kan.sup.R, Tet.sup.R, Amp.sup.R, Gent.sup.R] >> Kan.sup.R, Tet.sup.R, Amp.sup.S, Gent.sup.R | [IncFI + IncColE1 + IncColE1] >> IncFI + IncColE1 | Lac plus (blue) >> Lac minus (white) (by insertion into bacmid to create composite bacmid) or Lac plus (blue) (by insertion into chromosome) | No, until transposition from donor to bacmid or chromosome, losing vector backbone of donor plasmid | Bacmid plus compatible helper and incompatible donor plasmids
[0271]
Example 2—Design and Assembly of Vectors Allowing for Direct Selection of Site Specific Transposons Inserted into their Attachment Site and Methods Thereof Based on Cassettes Comprising CAT-attTn7 Gene Fusions
[0272] Indirect screenable methods for detecting insertions of site-specific transposons into synthetic target sequences such as those disclosed in the Background of the Invention and Example 1, noted above, work remarkably well. Variant sequences, which eliminate small segments upstream or downstream from the minimal set of attTn7 sequences may also improve the contrast between events that result in insertions and background levels of expression of the chimeric protein comprising segments that can complement a chromosomally-encoded acceptor protein on different types of agar plates or other types of media that result in color changes in the presence of a chromogenic substrate.
[0273] There is a need, however, for methods that allow for the direct selection of bacteria harboring vectors comprising synthetic attTn7 target sites. Direct selection will allow for directed evolution of mutagenized mini-Tn7 transposons, target sites, and sequences encoding transposition proteins, leading to the development of synthetic gene insertion systems, which may have altered efficiencies of transposition into a specific target site or altered abilities to transpose into variants of the wild-type target site compared to systems generally based on unaltered parental transposon and target sequences.
[0274] Chloramphenicol (Cam or CM, Formula: C.sub.11H.sub.12Cl.sub.2N.sub.2O.sub.5, IUPAC name: 2,2-dichloro-N-[(1R,2R)-1,3-dihydroxy-1-(4-nitrophenyl)propan-2-yl]acetamide) is an old antibiotic, now typically used to treat ocular infections caused by Staphylococcus aureus, Streptococcus pneumoniae, and Escherichia coli. Chloramphenicol is a bacteriostatic drug, binding to two residues in the 23S rRNA of the 50S subunit of the ribosome, preventing the elongation of protein chains. Chloramphenicol is also a potent inhibitor of cytochrome P450 isoforms CYP2C19 and CYP3A4 in the liver, which decreases the metabolism and increases the circulating levels of a wide variety of other drug products.
[0275] Resistance to chloramphenicol (CMR) can diminish its effectiveness in clinical settings. Reduced permeability of bacterial membranes is a common mechanism, that confers a low level of resistance to the drug. Mutations in the 50S subunit of the ribosome also confer resistance, but are rare. High level resistance is conferred by a gene encoding chloramphenicol acetyl transferase (CAT; EC 2.3.1.28), which inactivates the molecule by adding one or two acetyl groups derived from acetyl-S-coenzyme A to hydroxyl groups on the molecule, which prevents the drug from binding to the ribosome.
[0276] A wide variety of genes encoding chloramphenicol acetyl transferase have been isolated and compared. Commonly studied are the Type I and the Type III enzymes, which have been shown to be trimers of identical subunits (MW 25,000) with a histidine residue at position 195 identified as having a key role in the catalytic reactions involved in acetylation of chloramphenicol bound to a deep pocket in the trimer complex. The crystal structure of the Type III enzyme, isolated from E. coli, bound to chloramphenicol has been determined.
[0277] Gene cassettes encoding CAT are widely used in bacteriology and molecular genetics to facilitate the selection of plasmids carrying DNA segments with a promoter operably-linked to the cat gene. One common application is to clone an intact cat gene downstream from a promoter of interest, as a gene fusion in a reporter system, to measure the relative activity of different promoters, or the same promoter in different types of tissues. It is also commonly used to facilitate cloning of DNA segments into plasmid vectors, within the cat gene, destroying its activity, or within cloning sites located elsewhere on a plasmid that confers resistance to CM.
[0278] Genes encoding Type I CAT are located in a wide variety of cloning vectors. The plasmid pACYC184, for example, has a cat gene derived from Tn9, that encodes a Type I CAT protein, containing a p15A origin of replication [Chang, A. C. Y. and Cohen, S. N. (1978) J. Bacteriol. 134: 1141-1156.]. This plasmid, which is 4,245 bp, also confers resistance to tetracycline (TET). Plasmids containing DNA segments inserted into the unique EcoRI site of this plasmid are resistant to TET, but not CM. Plasmids containing DNA segments inserted into the unique EcoRV, BamHI, SalI, or many other sites of this plasmid are resistant to CM, but not TET.
[0279] NR1/R100, R1, and many other large plasmids that confer resistance to several types of antibiotics (drug resistance or R plasmids) also carry genes related to Tn9, which encode the Type I CAT polypeptide. R plasmids may also carry genes which confer tolerance to heavy metal ions, including mercury, silver, cadmium, and arsenic [Foster, T. J. (1983) "Plasmid-determined Resistance to Antimicrobial Drugs and Toxic Metal Ions in Bacteria." Microbiol Rev 47(3):361-409]. Plasmid-specified resistance to compounds comprising bismuth, lead, boron, chromium, cobalt, nickel, tellurium, and zinc has also been described [Summers and Silver (1979) Microbial transformation of metals. Ann Rev Microbiol. 32: 637-672].
[0280] What is not well known, however, is that the CAT protein tolerates small deletions or insertions (to produce larger fusions) at its amino and carboxy termini. A series of HIV-1 Vpr-CAT N- and C-terminal fusion proteins were constructed and evaluated, which had the activity of both Vpr and CAT domains [Yao et al (1999), Gene Therapy]. Small deletions at the carboxy terminus, are also possible, provided that they do not extend upstream from a conserved cysteine residue near the carboxy terminus of the CAT protein [Robben et al, (1995)] [Van der Schueren et al, 1998]. This residue is located at position 8 residues from the end of the 219 residue Type I CAT protein, and at 6 residues from the end of 213 aa Type III CAT protein. Note the following key observations: [0281] Insertion of a TAA stop codon immediately at or upstream from the Cysteine codon in the gene for the Type I CAT protein results in a polypeptide that is inactive. [0282] Insertion of the TAA stop codon after the Cysteine codon and before the normal stop codon should allow expression of a truncated polypeptide that is functional. [0283] Deletion of the conserved Cysteine residue is believed to prevent assembly of CAT into its active trimer complex.
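The key observations above amount to a simple positional rule, which can be sketched as follows. This is an illustrative sketch rather than a validated predictor: the function names are hypothetical, and it assumes that "8 residues from the end" counts the Cysteine itself as the eighth residue from the carboxy terminus (position 212 of 219 for Type I, and position 208 of 213 for Type III under the same convention).

```python
def cys_position(protein_length, residues_from_end):
    # Assumption: the count from the end includes the Cys residue itself.
    return protein_length - residues_from_end + 1


def predicted_active(protein_length, residues_from_end, stop_position):
    """Predict activity of a CAT truncation.

    stop_position is the 1-based position of the codon replaced by a stop
    codon. Per the observations above, a stop at or upstream from the
    conserved Cys codon is expected to inactivate the polypeptide, while a
    stop after it is expected to leave a functional truncated polypeptide.
    """
    return stop_position > cys_position(protein_length, residues_from_end)


# Type I CAT: 219 residues, conserved Cys 8 residues from the end.
print(cys_position(219, 8))           # position of the conserved Cys
print(predicted_active(219, 8, 213))  # stop replaces the codon after Cys
print(predicted_active(219, 8, 212))  # stop replaces the Cys codon
```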
[0284] DNA cassettes encoding the Type I or Type III CAT proteins, where a stop codon, such as TAA, TGA, or TAG, is located after a codon encoding Cysteine, and one or more codons for non-conserved amino acid residues upstream from the conserved Cysteine codon, are designed as noted below. If a site for a restriction enzyme located after the Cysteine codon is used as part of a cloning site that destroys the stop codon, then the reading frame of the mRNA encoding the upstream portion of the CAT protein may be altered, allowing readthrough into the mRNA segment transcribed from the downstream DNA segment. Sequences of novel gene fusions, where site-specific insertion of a segment from a transposon alters the reading frame at the stop codon, allowing expression of a fusion polypeptide that is active, are noted in more detail below.
[0285] One way to directly select for insertions of site specific transposons into their target site, is to design and assemble an array of genetic elements to include a promoter and optional operator, operably-linked to a sequence encoding a drug resistance marker, and a synthetic sequence encoding the target site for the transposon. The design and assembly of genetic cassettes encoding a fusion between the gene encoding Chloramphenicol Acetyl Transferase (CAT) and the mini-attTn7, or a variant that includes a portion of the coding sequence for the lacZ alpha protein, as a CAT-attTn7-lacZ fusion protein, are described below.
[0286] The junction of the fusion is after a codon for a conserved Cysteine residue near the 3′ end of the gene, followed by a TAA stop codon, and then most of the mini-attTn7 segment. The relative position of the TnsD binding site is carefully selected so that the duplicated target site (−2 to +2) is within the TAA stop codon after the Cys codon. When the Tn7 is inserted, it disrupts the stop codon, allowing readthrough into the 5′ end of the left arm of Tn7 (Tn7L, which begins TGT, followed by 5 more bases before the start of several conserved TnsB binding sites).
[0287] CAT fusions can be created at both ends of the gene, but those that extend upstream from the conserved Cys codon are inactive. By restoring a few amino acids beyond the Cys codon, the protein is active again. In one type of fusion, the target site is in a segment that normally does not confer resistance to CM, but if a transposition event occurs, CM resistance is restored. This arrangement allows one to directly select for CM resistance, and all of the expected structures should be gene fusions with the CAT reading frame extending into Tn7L. Direct selection should allow for the detection of rare transposition events (1×10.sup.−5).
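The practical consequence of a transposition frequency on the order of 1×10.sup.−5 can be estimated with simple arithmetic. The sketch below assumes, for illustration only, that transposition events are independent and rare, so that the number of resistant colonies on a plate follows a Poisson distribution; the function names and the 95% threshold are hypothetical choices, not values taken from the disclosure.

```python
import math


def expected_colonies(cells_plated, frequency):
    # Mean number of CM-resistant colonies expected on the plate.
    return cells_plated * frequency


def cells_for_detection(frequency, probability=0.95):
    # Poisson model (assumption): P(at least one event) = 1 - exp(-N * f).
    # Solve for the smallest whole number of cells N reaching `probability`.
    return math.ceil(-math.log(1.0 - probability) / frequency)


print(expected_colonies(1e7, 1e-5))
print(cells_for_detection(1e-5))
```

Under direct selection, only the rare resistant cells form colonies, so large numbers of cells can be plated on a single plate without a background of blue colonies to screen through.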
[0288] Different promoters can be used to drive expression of the CAT-attTn7 fusion polypeptide, such as its native promoter, or the inducible lac promoter. These strategies should apply equally well to gene fusions assembled from the Type I cat gene and to those derived from the Type III cat gene. The Type I cat gene is more widely available on a variety of medium copy number cloning vectors (such as pACYC184) and low copy number drug resistance plasmids (NR1/R100).
[0289] The plasmid pACYC184 (4,245 bp) has two genes encoding resistance to Tetracycline (TC) and to Chloramphenicol (CM). It also has a replicon derived from the plasmid p15A, allowing it to co-exist in cells comprising ColE1-derived replicons, such as pBR322 and the pUC series of plasmids. It is a medium copy number vector, maintained at about 15 copies per cell, which can be amplified by treatment with spectinomycin under specific growth conditions. The Type I cat gene in pACYC184 encodes a protein having 219 aa. Several unique restriction sites are located just within the 3′ end of the gene, and just downstream from its TAA stop codon.
[0290] Several plasmids are constructed to demonstrate feasibility of a new system designed to allow direct selection for insertions of mini-Tn7 segments into synthetic CAT-attTn7 target sites, as noted below. They can be derived directly from pACYC184 by traditional cloning methods using cleavage and ligation of restriction fragments into cloning vectors, or by synthesizing gene fusions of interest that are directly inserted into a common base vector (such as those provided by Twist Biosciences) and characterized by DNA sequencing, gene amplification, restriction fragment analysis, or similar methods to characterize the structure of a vector molecule. Twist Biosciences provides a variety of vectors comprising medium (p15A) or high (pUC) copy number replicons, and a selectable marker conferring resistance to chloramphenicol, kanamycin, or ampicillin, that comprise a common site where the DNA sequence of interest is inserted. Given the low cost and ease of ordering synthetic DNA molecules, ordering complete vectors from a vendor is now usually preferred, compared to the traditional methods of cloning gene fusions of interest that are described in the following examples.
[0291] Initially, pACYC184 DNA is digested with the enzyme TatI (A′GTAC,T), which produces 5′ sticky ends, or with ScaI (AGT′ACT), which produces blunt ends, and with the enzyme BaeGI or Bme1508I (both of which recognize G,KGCM′C). The start of the TatI site is located at position +410 in the vector, and the end of the BaeGI/Bme1508I site is at position +467. There are 30 bases from the beginning of the TatI site to the start of the TAA stop codon, encoding the C-terminal peptide sequence QYCDEWQGGA*.
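The relationship between the 30 coding bases and the C-terminal peptide can be checked with a short translation sketch. The nucleotide sequence below is an assumption made for illustration: it was chosen to encode the stated peptide QYCDEWQGGA*, to contain the AGTACT site shared by TatI and ScaI, and to use the GAT Asp codon named elsewhere in this example. The actual sequence should be taken from the pACYC184 sequence record. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate codon by codon, stopping after the first stop codon (*)."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        peptide.append(residue)
        if residue == "*":
            break
    return "".join(peptide)


# Assumed 3' end of the Type I cat gene: 30 coding bases plus the TAA stop.
CAT_3PRIME = "CAGTACTGCGATGAGTGGCAGGGCGGGGCGTAA"

peptide = translate(CAT_3PRIME)
# Count the conserved Cys from the carboxy terminus, including the Cys itself.
cys_from_end = len(peptide.rstrip("*")) - peptide.index("C")

print(peptide)
print("AGTACT" in CAT_3PRIME, cys_from_end)
```

With this assumed sequence, the Cysteine falls 8 residues from the end of the peptide, in agreement with the position given above for the 219 residue Type I CAT protein.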
[0292] Synthetic oligonucleotides are prepared and annealed to replace the segment of DNA extending from the TatI or ScaI site to the BaeGI/Bme1508I site. Additional unique restriction sites are located at longer distances downstream from the BaeGI/Bme1508I site, including Tth111I, DrdI, BtsaI, and Bsu36I, if the BaeGI/Bme1508I site is unsuitable for some reason. The synthetic oligonucleotides also contain a recognition site for a rare cutting restriction enzyme (such as those having an 8-bp recognition sequence), preferably a SrfI (GCCC|GGGC) site and an internal XmaI (C′CCGG,G) site, to facilitate extraction of the gene cassette comprising the synthetic CAT-attTn7 sequences when used in conjunction with other unique sequences located within the N-terminal sequence of the cat gene or within sequences 5′ from the start of the gene that also include a promoter sequence.
TABLE-US-00012
[0293] The wild-type TatI to BaeGI fragment can be replaced by several altered versions, one comprising a BamHI site in the untranslated region downstream from the natural TAA stop codon, and variants where one or two stop codons are inserted at the positions where the critical Cysteine (C) residue, and the Aspartic Acid (D) residue are located upstream from the natural TAA stop codon. Inserting one stop codon at the position of the Asp codon should truncate the protein, to encode a truncated variant that is active. Inserting two stop codons, replacing the adjacent Cys and Asp codons, should also truncate the protein, to encode a truncated variant that is inactive.
##STR00006##
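The single and double stop codon replacements described above can be illustrated with a short sketch. The wild-type sequence is an assumption chosen to encode the C-terminal peptide QYCDEWQGGA*, and the helper names are hypothetical. The activity labels simply restate the expectation given above (a truncation that retains the Cysteine should be active, one that removes it should be inactive) and are not experimental results.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate codon by codon, stopping after the first stop codon (*)."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        peptide.append(residue)
        if residue == "*":
            break
    return "".join(peptide)


def replace_codons(dna, codon_index, new_codons):
    """Overwrite one or more whole codons starting at a 0-based codon index."""
    start = codon_index * 3
    return dna[:start] + new_codons + dna[start + len(new_codons):]


# Assumed wild-type 3' end: Q Y C D E W Q G G A followed by the TAA stop.
wild_type = "CAGTACTGCGATGAGTGGCAGGGCGGGGCGTAA"

asp_to_stop = replace_codons(wild_type, 3, "TAA")          # Asp codon GAT -> TAA
cys_asp_to_stops = replace_codons(wild_type, 2, "TAATAA")  # Cys and Asp -> TAA TAA

for name, dna in [("wild type", wild_type),
                  ("Asp > stop", asp_to_stop),
                  ("CysAsp > stop stop", cys_asp_to_stops)]:
    peptide = translate(dna)
    expected = "active" if "C" in peptide else "inactive"
    print(name, peptide, "expected", expected)
```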
[0294] Transposing a mini-Tn7 element into the attTn7 site will alter the reading frame of the encoded polypeptide, adding extra amino acids to the CAT-attTn7 fusion protein and restoring its activity, allowing for the direct selection of bacteria harboring composite vectors comprising transposition events.
[0295] A sequence containing the mini-attTn7 site that has its insertion site positioned just before the first TAA should allow transposition to replace the stop codon with the TGT of the left arm of Tn7, restoring activity.
[0296] The segments shown below illustrate the junction between a Type I cat gene and a mini-Tn7 element inserted into a target site where the TAA stop codon overlaps with positions 0 to +2 of a 5-bp insertion site (from −2 to +2) of a mini-attTn7 target site, restoring expression of a longer, active CAT fusion protein. The relative position of the transposition site can be adjusted by a single base across the desired insertion site.
[0297] Note that the extended CAT fusion protein extends for varying lengths depending on the reading frame of the gene (+1, +2, or +3), where the TGT represents the first 3 nucleotides of the left arm of Tn7.
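The dependence of the extension length on the reading frame can be illustrated by translating the start of Tn7L in each of the three frames. The Tn7L sequence below is an assumption: it was reconstructed to begin with TGT and to be consistent with the extension peptides listed in Table 10 of this example (CGRTK, LWADKIVGNWEGWKWSF, and PVGGQNSWELGGVEMEFLRII), and it should be verified against the sequence listing before use. The one- and two-base prefixes that set the frame are hypothetical stand-ins for the upstream bases contributed by the target site.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate codon by codon, stopping after the first stop codon (*)."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        peptide.append(residue)
        if residue == "*":
            break
    return "".join(peptide)


# Assumed first 64 nt of the left arm of Tn7 (begins TGT); verify before use.
TN7L_START = ("TGTGGGCGGACAAAATAGTTGGGAACTGGGAGGGG"
              "TGGAAATGGAGTTTTTAAGGATTATTTAG")

# Hypothetical upstream bases that place Tn7L in each reading frame.
FRAME_PREFIX = {"rf1": "", "rf2": "C", "rf3": "CC"}

extensions = {frame: translate(prefix + TN7L_START)
              for frame, prefix in FRAME_PREFIX.items()}
for frame, peptide in extensions.items():
    print(frame, len(peptide) - 1, peptide)
```

Under these assumptions the three frames add 5, 17, and 21 residues before the first stop codon, which is why fusion proteins of different lengths are expected depending on the frame of the junction.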
[0298] The segment shown below illustrates the junction between a Type I cat gene and a Tn7 element inserted into an overlapping mini-attTn7 target site, restoring expression of a longer, active CAT fusion protein.
TABLE-US-00013 Sequence Alignment 9: Sequences at the 3' end of a Type I cat gene after transposition of a mini-Tn7 into an overlapping mini-attTn7 site (SEQ ID NO: 20) Omitted (SEQ ID NO: 22)
[0299] The relative position of the 5-bp insertion site can be moved slightly to the left or right of the sequences encompassing the critical Cysteine codon or sequences in adjacent codons to produce different types of truncated proteins, or longer fusion proteins that result by changing the reading frame of downstream intervening segments and sequences in the left arm of Tn7, where a variety of stop codons are located at different distances from the end of Tn7L.
TABLE-US-00014 Sequence Alignment 10: Sequences at the 3' end of a Type I cat gene that mimic Tn7L at the junction of mini-Tn7 replacing a stop codon for a Cys codon in an overlapping mini-attTn7 site The following sequence mimics insertion of the Tn7L replacing the stop codon for a Cys codon, restoring activity to the encoded CAT fusion protein. −2 +2 | | BamHI BaeGI/SrfI/XmaI
[0300] Bacteria harboring synthetic gene fusions comprising truncated, wild-type, or extended forms of the cat gene should have different phenotypes when plated on different concentrations of chloramphenicol, as shown below.
TABLE-US-00015 TABLE 9 Colony Phenotypes of pACYC184 derivatives encoding CAT-attTn7 fusion proteins
Designation | Markers | Cat.sup.R = +, Cat.sup.S = − | Description | Reference or SEQ ID NO of Inserted Sequence | Source
pACYC184 | Tet.sup.R, Cat.sup.R | + | pACYC184 carries genes conferring resistance to tetracycline and chloramphenicol (Type I cat gene encoding 219 aa residues). It has the same replicon as pACYC177. | Chang, A. and Cohen, S. (1978); Sequence reported by Rose, R. E. (1988). | Boca Scientific
pACYC184-SrfI | Tet.sup.R, Cat.sup.R | + | pACYC184 digested with TatI or ScaI and BaeGI or Bme1508I and ligated to or amplified to include an oligonucleotide encoding a SrfI/XmaI site. | (SEQ ID NO: 7) | This study
GAT > TAA | Tet.sup.R, Cat.sup.S | − | pACYC184 containing an oligonucleotide changing the codon following the Cysteine Codon from GAT to TAA. | (SEQ ID NO: 9) | This study
GAT > TGA | Tet.sup.R, Cat.sup.S | − | pACYC184 containing an oligonucleotide changing the codon following the Cysteine Codon from GAT to TGA. | (SEQ ID NO: 10) | This study
GAT > TAG | Tet.sup.R, Cat.sup.S | − | pACYC184 containing an oligonucleotide changing the codon following the Cysteine Codon from GAT to TAG. | (SEQ ID NO: 11) | This study
GAT > TAA overlapping mini-AttTn7 | Tet.sup.R, Cat.sup.S | − | pACYC184 containing an oligonucleotide changing the codon following the Cysteine Codon from GAT to TAA with an attTn7 sequence overlapping with the Cysteine Codon. | (SEQ ID NO: 12) | This study
GAT > TGA overlapping mini-AttTn7 | Tet.sup.R, Cat.sup.S | − | pACYC184 containing an oligonucleotide changing the codon following the Cysteine Codon from GAT to TGA with an attTn7 sequence overlapping with the Cysteine Codon. | (SEQ ID NO: 13) | This study
GAT > TAG overlapping mini-AttTn7 | Tet.sup.R, Cat.sup.S | − | pACYC184 containing an oligonucleotide changing the codon following the Cysteine Codon from GAT to TAG with an attTn7 sequence overlapping with the Cysteine Codon. | (SEQ ID NO: 14) | This study
TAA > TAT::Tn7 | Tet.sup.R, Cat.sup.R | + | Insertion of Tn7 at the TAA Stop codon restores CAT activity. | SEQ ID NO: 23 | This study
TGA > TGT::Tn7 | Tet.sup.R, Cat.sup.R | + | Insertion of Tn7 at the TGA Stop codon restores CAT activity. | | This study
TAG > TAT::Tn7 | Tet.sup.R, Cat.sup.R | + | Insertion of Tn7 at the TAG Stop codon restores CAT activity. | | This study
[0301] Variants of plasmids based on pACYC184 can also be created using any of a variety of other replicons. Vectors provided by Twist Biosciences, for example, can also be used. In the series noted below, key segments derived from the chloramphenicol resistance gene of pACYC184 are synthesized and inserted into pTwist-Kan-MC (also abbreviated as pTKM), which confers resistance to kanamycin and has a medium copy number replicon derived from the plasmid p15A. Polylinker sequences, containing two or more 8-bp recognition sites for rare cutting restriction enzymes, such as MauBI, AbsI, SgrDI, and AscI, flank the entire kanamycin resistance gene, including its promoter.
TABLE-US-00016 TABLE 10 Expected Phenotypes of DH10B Harboring pTwist-Kan-MC plasmids comprising CAT-mini-attTn7 fusion proteins with staggered sets of TAA stop codons
Short Name | Base Vector Markers | Insert Marker | Expected Phenotype | Insert Segments | SID NOS
pTwist + Kan + MC | KAN | None | KanR | None | 157
pTKM-MaAbAySgAs | KAN | None | KanR | MauBI-AbsI-AvrII-SgrDI-AscI polylinker | 158
pTKM-CATd8 | KAN | None | KanR, CamR | CAT gene from pACYC184 not extended or truncated and deleted 8 bases from the right polylinker | 159/160
pTKM-CAT | KAN | CAT | KanR, CamR | CAT gene from pACYC184 not extended or truncated |
pTKM-CAT-TAA | KAN | CAT | KanR, CamR | TAA replaced Asp Codon | 161/162
pTKM-CAT-TAATAA | KAN | CAT | KanR, CamS | TAATAA replaced CysAsp Codons | 163/
pTKM-CAT-TAATAA-mini-attTn7 | KAN | CAT | KanR, CamS | TAATAA replaced CysAsp Codons-overlapping mini-AttTn7 | 165/166
pTKMC-CAT-Tn7Lrf1 | KAN | CAT | KanR, CamR | CAT extended with CGRTK with partial Tn7L rf1 | 167/168
pTKMC-CAT-Tn7Lrf2 | KAN | CAT | KanR, Cam??? | CAT extended with LWADKIVGNWEGWKWSF with partial Tn7L rf2 | 169/170
pTKMC-CAT-Tn7Lrf3 | KAN | CAT | KanR, Cam??? | CAT extended with PVGGQNSWELGGVEMEFLRII with partial Tn7L rf3 | 171/172
[0302] If the phenotypes are as expected, then the plasmid containing the mini-attTn7 sequence can be used as the basis for additional experiments where a helper plasmid is introduced into the cells, and a donor plasmid transformed in, and plating out in the presence of tetracycline and chloramphenicol. (The marker on the helper plasmid may need to be changed so it is different from that used by the target plasmid). All target plasmids that confer resistance to Tc and CM should have a mini-Tn7 inserted at the 3′ end of the truncated/extended cat gene.
[0303] E. coli DH10B harboring the pACYC184 series of vectors and a variant of the helper plasmid, pMON7124, that encodes a drug resistance marker, such as Kanamycin instead of Tetracycline, can be transformed with a donor plasmid, such as pFastBac1 or a variant thereof (each conferring resistance to Ampicillin and Gentamycin), to test transposition of the mini-Tn7 element from the donor into the target site on different pACYC184 variants containing synthetic attTn7 sites. E. coli DH10B cells comprising the unmodified parent plasmid or each of the variant plasmids are then spread on agar plates comprising tetracycline, if pMON7124 is used as a helper vector, plus different concentrations of chloramphenicol to determine the relative sensitivity to chloramphenicol. The phenotypes should match what is predicted in the tables noted below.
[0304] Transposition events in cells containing the overlapping attTn7 sequence should restore CAT activity, compared to those having the longer attTn7 sequence linked downstream from the truncated cat genes. The Gentamycin resistance marker, which is located on the mini-Tn7 element on the donor plasmid, with the 3′ end of its gene oriented to terminate near Tn7R, should be irrelevant in transposition schemes where the direct selection of transposition events occur by insertion into a gene fusion comprising a truncated cat gene, and where CAT activity is restored after transposition of the mini-Tn7 element into the target site on the pACYC184 derived vector containing an overlapping mini-attTn7 sequence.
[0305] Screening for resistance or sensitivity to Gentamycin, from colonies that confer resistance to Chloramphenicol after transposition should facilitate confirmation of transposition events into the target site on a plasmid, compared to the chromosome. Eliminating the need for a drug resistance marker within the mini-Tn7 element, allows the donor plasmid to be much smaller, before and after transposition, greatly facilitating the design and cloning of cassettes to be inserted into one or more related attachment sites on a target vector, and avoiding the need to remove the gentamycin or other resistance markers after transposition for specific applications.
[0306] Segments from any of these plasmids may then be moved to other plasmids with different replicons by digesting them with restriction enzymes that cut outside the critical genetic elements, by amplifying the key sequences using PCR-like techniques, or by synthesizing and assembling one or more segments and ligating them into appropriate vectors.
[0307] The plasmid pACYC177, which has the same replicon as pACYC184 and encodes genes conferring resistance to Ampicillin and Kanamycin, can be used to clone segments derived from the pACYC184 derivatives noted above and below, that contain variable lengths of a sequence comprising a mini-attTn7 target site, to facilitate testing of transposition in cells where the target confers resistance to Kanamycin, the donor confers resistance to Amp and Gentamycin, and the helper confers resistance to Tetracycline.
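The marker scheme in the preceding paragraph can be checked with a small bookkeeping sketch. It assumes, as a simplification, that a cell grows on a plate only if the plasmids it carries jointly provide resistance to every antibiotic in that plate. The role names in `MARKERS` and the function name `grows` are illustrative and are not part of the disclosure.

```python
# Resistance markers per plasmid role, as stated in the text for the
# pACYC177-based scheme (target: Kan; donor: Amp + Gent; helper: Tet).
MARKERS = {
    "target": {"Kan"},
    "donor": {"Amp", "Gent"},
    "helper": {"Tet"},
}


def grows(plasmids, plate):
    """Return True if the listed plasmids cover every antibiotic in the plate.

    Simplifying assumption: resistance is all-or-nothing and depends only on
    which markers are present in the cell.
    """
    resistances = set()
    for role in plasmids:
        resistances |= MARKERS[role]
    return set(plate) <= resistances


if __name__ == "__main__":
    cell = ["target", "helper", "donor"]
    for plate in (["Kan"], ["Kan", "Tet"], ["Kan", "Tet", "Gent"]):
        print(plate, grows(cell, plate))
```

Because the three roles carry non-overlapping markers in this scheme, each plasmid can be selected for independently, which is the point of moving the target segment onto a kanamycin-resistance backbone.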
[0308] Vectors having much lower copy numbers, such as those based on the mini-F replicon used in the baculovirus shuttle vectors and in many Bacterial Artificial Chromosome (BAC) vectors, available from a variety of academic, non-profit, or commercial sources, can also be used to facilitate analysis of transposition events using selectable and screenable marker schemes.
[0309] The following table illustrates the phenotypes of colonies of E. coli DH10B harboring different plasmids used in the transposition system when grown on agar media in the presence of one or more kinds of antibiotics. Agar plates containing rosaniline dyes such as crystal violet can be used to score chloramphenicol resistance types by colony color, such as CM-sensitive sectors in CM-resistant colonies [Proctor and Rownd, 1982]. This procedure, typically used to facilitate screening during cloning by insertional inactivation of a cat gene encoding an active enzyme, may not work for cells harboring a nearly full length, but inactive enzyme, if the dye binds to one or more domains outside regions comprising key residues of its catalytic site.
TABLE-US-00017 TABLE 11 Colony Phenotypes of DH10B Harboring Plasmids in CAT-mini-attTn7 Transposition Studies
Designation DH10B/plasmid(s) | Markers | Inc Group | Phenotype on crystal violet plates | Stable | Description
pACYC177 (control) | Amp.sup.R, Kan.sup.R | p15A | CAT minus (−) (light) | Yes | pACYC177 carries genes conferring resistance to ampicillin and kanamycin.
pACYC184 (control) | Tet.sup.R, Cat.sup.R | p15A | CAT plus (+) (dark) | Yes | pACYC184 carries genes conferring resistance to tetracycline and chloramphenicol.
pMON7124 (helper) | Tet.sup.R | ColE1 | CAT minus (−) (light) | Yes | pMON7124 encodes tnsA, B, C, D, and E, near Tn7R on a pBR322-based replicon.
pFastBac1 (donor) | Amp.sup.R, Gent.sup.R | ColE1 | CAT minus (−) (light) | Yes | The donor plasmid encodes an Ampicillin resistance gene on the backbone and a Gentamycin resistance gene, plus the baculovirus polyhedrin promoter, MCS and SV40 poly(A) between Tn7L and Tn7R.
pACYC184 (control) + pMON7124 (helper) | Kan.sup.R, Tet.sup.R | Fl and ColE1 | CAT plus (+) (dark) | Yes | pACYC184 and pMON7124 are in different compatibility groups and should stably co-exist in the same cell, selecting for kanamycin or chloramphenicol resistance and tetracycline resistance, respectively.
[0310]
Example 3—Design of Modular Sequences Encoding an Inactive LacZalpha-Mini-attTn7 Fusion Polypeptide
[0311] Strategies similar to those described above for the design and construction of CAT-attTn7 gene fusions can also be applied to generate lacZalpha-mini-attTn7 fusions, where a stop codon is inserted at or near the codon for amino acid 41 (counting from the second codon, after the ATG codon encoding the N-terminal methionine residue, which is processed off in E. coli) of the lacZalpha polypeptide. LacZalpha polypeptides that are shorter than 41 amino acids cannot efficiently bind to and complement the LacZ acceptor polypeptide encoded by the lacZΔM15 gene [Juers et al (2012)].
##STR00009##
[0312] In this design, gene cassettes encoding a truncated lacZalpha protein and an overlapping mini-attTn7 are assembled and tested. Cassettes containing a lacZalpha gene that encodes a polypeptide 42 or more amino acids long should complement and be lac plus on selection plates, or on indicator plates comprising a chromogenic substrate. Those encoding a polypeptide 41 amino acids long or shorter should not complement efficiently and should be lac minus on selection or indicator plates.
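The length rule in this design can be written as a one-line classifier. This is only a sketch of the stated expectation: it assumes that peptide length is the sole determinant of complementation, which the text itself treats as a prediction to be tested. The constant and function names are illustrative.

```python
MIN_COMPLEMENTING_LENGTH = 42  # residues; threshold stated in the text


def expected_lac_phenotype(alpha_length):
    """Classify a lacZalpha peptide by length alone (a simplification)."""
    if alpha_length >= MIN_COMPLEMENTING_LENGTH:
        return "lac plus"
    return "lac minus"


if __name__ == "__main__":
    for length in (40, 41, 42, 59):
        print(length, expected_lac_phenotype(length))
```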
[0313] Transposition of a mini-Tn7 sequence into a truncated lacZ-alpha gene with an overlapping mini-attTn7 should restore the reading frame of the lacZalpha gene enabling expression of a longer alpha polypeptide that can complement, changing the phenotype from lac minus before transposition to lac plus after transposition.
[0314] In this design, blue colonies in a background of white colonies are picked and analyzed for the presence of the mini-Tn7 cassette inserted into the synthetic target sequence. Methods allowing outgrowth of lac plus cells in liquid minimal media comprising an appropriate carbon source before spreading on agar plates may facilitate the amplification and direct selection of colonies containing transposition events.
##STR00010##
[0315] Plasmid pUC18 or pUC19 DNA ([Yanisch-Perron (1985)], obtained from Thermo Fisher or New England Biolabs) is partially digested with PvuII, to create a linearized full length version of the plasmid, and treated with alkaline phosphatase, or a functionally similar phosphatase, to remove terminal phosphate residues. A synthetic linker containing one or more unique restriction sites that do not cut in the parent plasmid sequence is then ligated to the linearized plasmid DNA, and the product is transformed into competent E. coli cells. Two types of plasmids with linkers are recovered: one where the PvuII site in an intergenic region upstream from the lac promoter contains the linker with the one or more unique restriction sites and is no longer digestible by PvuII, and a second type where the linker is located in the lacZalpha gene.
##STR00011##
[0316] The nucleotide sequences are represented by even SEQ ID NOS and the encoded polypeptides by odd SEQ ID NOS.
[0317] The plasmid variant that retains the natural PvuII site within the lacZalpha gene is selected for additional studies. DNA from that plasmid variant is digested with PvuII and KasI and a series of synthetic oligonucleotides comprising a series of one or more stop codons in frame with the lacZalpha polypeptide reading frame that have a blunt end and a compatible sticky end are inserted into the vector backbone, ligated, and transformed into competent bacteria comprising the lacZΔM15 gene. A series of ampicillin resistant vectors are recovered and their phenotypes characterized on chromogenic indicator plates.
[0318] In one series of vectors, noted above, the synthetic oligonucleotides contain two sequential TAA stop codons. At least one variant plasmid carrying the double TAA stop codons is recovered in which the stop codons prevent expression of a functionally competent alpha peptide fragment that can complement the acceptor fragment encoded by the lacZΔM15 gene on the chromosome.
[0319] If the transition encompasses the codons for consecutive E and A residues, as noted below, then a synthetic oligonucleotide is prepared comprising downstream sequences comprising an overlapping mini-attTn7 target sequence and ligated into the vector between the PvuII and KasI sites.
TABLE-US-00018 Sequence Alignment 14: Staggered sets of synthetic nucleotides encoding double TAA stop codons from PvuII to KasI sites of LacZ alpha gene pUC18 or pUC19 lined up with a synthetic mini-attTn7 sequence (SEQ ID NOS: 45/46, 47-51) PvuII (CAG|CTG) +41 +42 PvuI KasI +59 | | | | | | A| S W E N S E E A R T| D R P S Q Q L R S L N G E W R L M
[0320] The plasmid variant comprising the stop codon upstream from the overlapping mini-attTn7 target sequence is then tested in a transposition system comprising a compatible helper plasmid and an incompatible mini-Tn7 donor plasmid. The sequences near the end of the insertion site showing the 5 bp duplication at the left and right arms of Tn7 are shown below. In this example, three sets of insertions are shown, shifted by one nucleotide, where the conserved TGT from the left end of Tn7 replaces 3, 2, or 1 nucleotides of the first of two TAA stop codons bordering the junction between the codons for amino acids 41 and 42 of the lacZ polypeptide. Sequences upstream from the insertion point encode amino acids S and E, before being joined to 3 types of polypeptides encoded by the transition sequences extending into the left arm of Tn7, where they terminate at varying distances at TAA, TGA, or TAG stop codons farther into Tn7L (not shown).
TABLE-US-00019 Sequence Alignment 15: Sequences near double stop codons replacing EA codons in lacZalpha peptide after transposition of a mini-Tn7 into an overlapping mini-attTn7 site −2 +2 +23 tnsD binding site | TAA TAA | --------AAGAG ttacgcagggcatccatttattactcaaccgtaaccga (SEQ ID NO: 53) Insertion site ------------------ tnsD binding site->
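The three staggered junctions described above can be illustrated by translating across the junction in each register. The sketch below rests on two assumptions that are not taken from this section: `UPSTREAM` is one possible pair of codons for S and E, and `TN7L_END` is an assumed sequence for the left end of Tn7, chosen because its three reading frames reproduce the CGRTK, LWADKIVGNWEGWKWSF, and VGGQNSWELGGVEMEFLRII extensions listed for the Tn7L rf1, rf2, and rf3 constructs in Tables 10 and 14. The function names are illustrative.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_to_stop(dna):
    """Translate codon by codon and stop at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3].upper()]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


UPSTREAM = "AGCGAA"  # assumed codons for S and E upstream of the junction
STOP = "TAA"         # first of the two TAA stop codons
# Assumed Tn7L end sequence (not from this section); starts with the TGT.
TN7L_END = ("TGTGGGCGGACAAAATAGTTGGGAACTGGGAGGGGTGGAAATGGAGTTTTT"
            "AAGGATTATTTAG")


def junction_peptide(replaced):
    """Peptide read from S-E into Tn7L when the terminal TGT of Tn7L
    overwrites `replaced` (3, 2, or 1) bases of the first TAA codon."""
    kept = STOP[:3 - replaced]
    return translate_to_stop(UPSTREAM + kept + TN7L_END)


if __name__ == "__main__":
    for replaced in (3, 2, 1):
        print(replaced, junction_peptide(replaced))
```

In each register the first TAA no longer reads as a stop codon, so translation continues into Tn7L until the next in-frame stop, which is the mechanism the text relies on to extend the truncated polypeptide.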
[0321] It is desirable to prepare a control plasmid derived from a plasmid encoding the lacZ alpha peptide, such as pUC18 or pUC19, in which the mini-attTn7 target site is inserted into the middle of the multiple cloning site such that the reading frame of the sequence encoding the target site is in frame with the sequences encoding the first few amino acids of the lacZalpha polypeptide, and sequences downstream from the multiple cloning site are also in frame through the stop codon 3′ to the sequences encoding amino acids 42 and beyond of the lacZ polypeptide.
[0322] In one of many possible examples, pUC18 can be used to clone the EcoRI-SalI mini-attTn7 fragment from the bacmid bMON14272, which has the EcoRI-SalI sites in the same reading frame as that in pUC18. The background may be high, since the parent and resulting plasmids are both Ampicillin resistant and Lac plus on selection or indicator plates.
[0323] Plasmid pUC18 DNA is also digested with an enzyme that cuts in the middle of the MCS, the ends are filled in with DNA polymerase or nibbled back, and the DNA is re-ligated and transformed into bacteria, and a Lac minus derivative is recovered and characterized. That plasmid is digested with EcoRI and SalI and ligated with the EcoRI-SalI fragment from bMON14272 DNA to create a pUC18 derivative with the mini-attTn7 target site that confers resistance to Ampicillin and is lac plus on indicator plates. The sequence of one derivative is shown below.
TABLE-US-00020 Sequence Alignment 16: Clone mini-attTn7 of bMON14272 into EcoRI-SalI sites of LacZ alpha gene of pUC18 restoring reading frame +1 +4EcoRI | lacZ || < Synthetic polypeptide encoded by mini-AttTn7 M T M I T| N S H N R K K N A P L T Q G I (SEQ ID NO: 58) ATGACCATGATTACGaattcacataacaggaagaaaaatgccccgcttacgcagggcatc (SEQ ID NO: 57) | | −2 +2 <-------------------- Insertion Site --------- SalI --------------------------------------------|--------------- H L L L N R N R F C Q V T R L| S T C R H
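As a consistency check on the alignment above, the nucleotide segment shown for SEQ ID NO: 57 can be translated and compared with the residues shown for SEQ ID NO: 58, and the EcoRI site can be located at the lacZ/mini-attTn7 junction. This is a minimal sketch that uses only the 60 nucleotides and 20 residues visible in the alignment; the function name `translate` is illustrative.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate every complete codon; stop codons are written as '*'."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Segments copied from Sequence Alignment 16.
SEQ_57_SEGMENT = "ATGACCATGATTACGaattcacataacaggaagaaaaatgccccgcttacgcagggcatc"
SEQ_58_SEGMENT = "MTMITNSHNRKKNAPLTQGI"
ECORI_SITE = "GAATTC"

if __name__ == "__main__":
    print(translate(SEQ_57_SEGMENT))
    print("EcoRI site starts at index", SEQ_57_SEGMENT.upper().find(ECORI_SITE))
```

The EcoRI site spans the upper-case/lower-case boundary in the alignment, consistent with the fragment being joined in frame at that site.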
[0324] Restriction fragments containing this segment can be moved to other modular plasmids or shuttle vectors by using enzymes that cut 5′ to and 3′ to this segment, or various derivatives, or by amplifying the DNA segment using PCR primers that have desirable sites for one or more restriction enzymes that are compatible with those used in the vector to clone the digested or amplified DNA segment. Transposition events using vectors comprising this segment are detected by screening on plates containing a chromogenic substrate, such as X-gal, where white colonies will contain insertions that disrupt the expression of the lacZalpha polypeptide, preventing complementation with the acceptor polypeptide encoded by the lacZΔM15 gene.
[0325] Similar strategies can also be used to obtain and clone or insert DNA fragments encoding active and truncated forms of the lacZalpha polypeptide fused to a synthetic mini-attTn7 sequence, allowing the direct selection of transposition events, in the presence of substrates for β-galactosidase, and by screening in the presence of a chromogenic substrate, where lac plus colonies, that are blue, will contain inserts, extending the sequence of the lacZalpha polypeptide, compared to a truncated version that cannot bind to and complement the acceptor polypeptide encoded by the lacZΔM15 gene.
[0326] MacConkey agar is a selective and differential medium that can be used to distinguish colonies that can ferment lactose (Lac plus) from those that cannot (Lac minus). MacConkey medium contains peptones and lactose as nutrients, plus bile salts and crystal violet to inhibit most Gram-positive bacteria, and the dye neutral red. Bacteria that metabolize lactose produce acid, lowering the pH of the agar below pH 6.8, turning the dye red, and creating pink (Lac plus) colonies in a background of pale yellow (Lac minus) colonies.
[0327] Some strains of enteric bacteria that carry a mutation in the galE gene, which encodes galactose epimerase, are highly sensitive to galactose, due to accumulation of a toxic intermediate, UDP-galactose, that promotes cell lysis [Fukasawa, T. and H. Nikaido. (1961)]. Mutant galE strains that are also Lac plus are sensitive to lactose or its analogue phenyl-β-D-galactoside, since β-galactosidase converts lactose to glucose and galactose, leading to the accumulation of the toxic metabolite UDP-galactose. A variety of common laboratory E. coli strains harboring different types of cloning vectors encoding the lacZalpha polypeptide, that also comprise the lacZΔM15 gene encoding the acceptor polypeptide, were evaluated on rich and minimal media supplemented with 0.1% D-galactose or 0.1% lactose [Reddy (2004)]. Some strains harboring plasmids that express the lacZalpha polypeptide and complement the acceptor polypeptide encoded by the chromosomal lacZΔM15 gene performed better than others on test plates, which may be related to the copy number of the plasmid, or activity of the reconstituted enzyme. The author noted that agar plates containing nutrient poor media generally worked better than rich media, and that outgrowth in minimal liquid media supplemented with lactose before plating may enrich the population of Lac minus cells comprising recombinant plasmids with insertions in their lacZalpha genes. Comparable results were obtained when an E. coli C strain that is lacZ minus and galE minus, harboring the plasmid pUR288, which encodes all of lacZ, was plated on rich (LB) and poor (LB/M9 in a 1/9 vol/vol ratio, containing 0.05% phenylgalactoside) media, suggesting that these methods, while promising, require careful evaluation of a variety of minimal media components [Gossen et al (1992)].
Example 4—Design of Modular Sequences Encoding Inactive and Active Forms of NPT-II (KAN)-Mini-attTn7 Fusion Proteins
[0328] Transposon Tn5 encodes a variety of genes, including one encoding neomycin phosphotransferase II (NPT-II), which confers resistance to neomycin and kanamycin in bacteria. NPT-II also confers resistance to G418 (Geneticin, G418 sulfate) in mammalian cells. These and other closely related antibiotics bind to components of the ribosome, inhibiting protein translation. NPT-II phosphorylates the antibiotics, interfering with their active transport into the cell. A wide variety of cloning vectors contain the gene encoding NPT-II to facilitate selection of bacteria in the presence of kanamycin on agar plates and in liquid cultures. This gene and variants encoding several types of fusion proteins are also widely used to facilitate selection of vectors commonly used in transformed plant cells and tissues.
[0329] Reiss et al (1984) generated a series of genes comprising alterations at the 3′ end of the NPT-II gene, encoding truncated proteins or extended fusion proteins, and observed that they vary in activity compared to the native enzyme. A plasmid designated pKM2, comprising the wild-type gene, conferred resistance to Kanamycin at levels exceeding 1000 ug/ml. The gene used in these studies encodes a polypeptide ending with the sequence “LLDEFF” before ending with a TGA stop codon.
[0330] Two plasmids encoding extended variant forms, ending with “LLDEFFQA” and “LLDEFFPSFNAVVYHS” before terminating with TAG stop codons also conferred resistance comparable to the wild-type enzyme of >1000 ug/ml kanamycin. One extended variant encoding an additional 263 aa segment derived from a tetracycline resistance gene was inactive, while a second extended variant encoding an additional 303 aa segment was partially active, conferring resistance on plates containing 200 ug/ml kanamycin, and a third variant encoding an additional 300 aa segment, much less active, conferring resistance on plates containing 20 ug/ml kanamycin.
[0331] The extensions in each of these variants differed though, the first two encoding Gln-Ala (QA) immediately after the Phe-Phe (FF) residues in the wild-type enzyme, and the third variant comprising Pro-Asp (PN) after the Phe-Phe (FF) residues and extending beyond that for another 298 residues.
[0332] Most remarkable, however, are the properties of a fourth variant, which encodes Pro-Ser and 8 other residues (PSFNAVVYHS) immediately after the Phe-Phe (FF) residues before terminating at a TAA stop codon. Bacteria harboring the plasmid encoding the fourth variant could not grow on agar plates containing any amount of kanamycin, providing strong evidence that the encoded fusion protein was completely inactive.
[0333] The authors concluded that length alone is insufficient to alter the activity of the NPT-II fusion protein and that biochemical characteristics of additional amino acids immediately near the carboxy terminal residues of the wild-type protein can also dramatically influence the activity of the fusion protein.
[0334] These and other observations concerning the identification of critical residues near the carboxy terminus of specific enzymes can be considered in the design of a variety of fusion proteins comprising synthetic mini-attTn7 target sites. In the CAT-attTn7 gene fusions noted earlier, the critical amino acid residue is a Cysteine, located several positions before the last amino acid of the CAT protein, and insertions by transposition into a stop codon at or near the Cys codon, will extend the protein, restoring its activity. In the experiments described below, alterations near the normal stop codon for NPT-II, including those encoding Gln (Q) and Pro (P) are made, and tested for their influence on the activity of slightly extended NPT-II fusion proteins. Bacteria harboring plasmids comprising genes encoding inactive variants are then used as targets in transposition experiments to determine if insertion of a mini-Tn7 element into a synthetic mini-attTn7 site restores activity, allowing direct selection for bacteria in the presence of kanamycin that should harbor plasmids comprising site specific insertions.
[0335] Plasmid pACYC177, which confers resistance to Ampicillin and Kanamycin, is digested with PflMI (CCAN,NNN′NTGG) and BsmFI (GGGAC(N).sub.9-10′NNNN,), and compatible sets of synthetic oligonucleotides are inserted between those sites to generate a series of plasmid variants encoding the sequences noted below.
[0336] The start of the recognition site for PflMI is 125 nucleotides upstream from (5′ to) the start of the TAA stop codon at the end of the NPT-II gene, and the end of the cleavage site for BsmFI is 70 nucleotides downstream from (3′ to) the end of the TAA stop codon, so it is desirable to prepare an altered form of pACYC177, where at least one new, unique restriction site is located near the end of the gene, which does not alter the sequence of any encoded polypeptide. This would facilitate insertion of sets of oligonucleotides that are much shorter than those required for insertion between the unique PflMI and BsmFI sites in pACYC177 (˜200 nt) needed for these studies.
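The approximate 200 nt figure follows from the two distances given in this paragraph plus the stop codon itself. The short calculation below makes that arithmetic explicit; the variable names are illustrative, and the only inputs are the numbers stated in the text.

```python
# Distances stated in the text, in nucleotides.
PFLMI_START_TO_STOP_START = 125  # PflMI site start to first base of TAA
STOP_CODON_LENGTH = 3            # TAA
STOP_END_TO_BSMFI_END = 70       # last base of TAA to end of BsmFI cleavage site

insert_span = (PFLMI_START_TO_STOP_START
               + STOP_CODON_LENGTH
               + STOP_END_TO_BSMFI_END)

if __name__ == "__main__":
    print("Span between the PflMI and BsmFI sites:", insert_span, "nt")
```

A span of roughly this length would have to be resynthesized for every variant, which is why a new unique site closer to the stop codon is attractive.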
[0337] There is a site comprising the sequence “TTGCAG” encoding “LQ” near the 3′ end of the NPT-II gene in pACYC177 that can be mutated to “C,TGCA′G” comprising a recognition site for PstI, while encoding “LQ” since TTG and CTG are both codons for Leucine (L).
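The claim that this single-base change is silent while creating a PstI site can be verified directly from the two hexamers given in the text. The sketch below assumes the hexamer is read in the frame TTG CAG, as implied by the stated “LQ” translation; the function names are illustrative.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate every complete codon; stop codons are written as '*'."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def is_silent(before, after):
    """True if two equal-length, in-frame sequences encode the same peptide."""
    return len(before) == len(after) and translate(before) == translate(after)


PSTI_SITE = "CTGCAG"
ORIGINAL = "TTGCAG"  # encodes L Q, no PstI site
MUTATED = "CTGCAG"   # encodes L Q, creates the PstI site

if __name__ == "__main__":
    print(translate(ORIGINAL), translate(MUTATED), is_silent(ORIGINAL, MUTATED))
```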
[0338] There is also an existing PstI (C,TGCA′G) site in the beta-lactamase gene of pACYC177 from position +299 to +304 overlapping 3 codons encoding “PAA”. The T and A residues can both be mutated since they are in wobble positions for these codons, allowing changes from PstI CTGCAG to EagI C′GGCC,G or PstI to PvuII (CAG|CTG), creating unique sites, since these enzymes do not cut in parental pACYC177. A unique SacII (CC,GC′GG) site is located near one end of the sequences comprising the p15A origin of replication.
##STR00015##
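The two proposed replacements for the PstI site in the beta-lactamase gene can be checked in the same way. The nine-nucleotide context `CCTGCAGCA` used below is an assumption: it is one set of codons that places CTGCAG across three codons reading “PAA”, as the text describes, and the actual flanking bases should be confirmed against the plasmid sequence. Variable and function names are illustrative.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate every complete codon; stop codons are written as '*'."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def changed_positions(before, after):
    """Zero-based indices at which two equal-length sequences differ."""
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]


# Assumed in-frame context: CCT GCA GCA = P A A, with PstI (CTGCAG) inside.
PARENT = "CCTGCAGCA"
VARIANTS = {
    "EagI": ("CGGCCG", "CCGGCCGCA"),
    "PvuII": ("CAGCTG", "CCAGCTGCA"),
}

if __name__ == "__main__":
    print("parent", translate(PARENT))
    for enzyme, (site, seq) in VARIANTS.items():
        print(enzyme, translate(seq), changed_positions(PARENT, seq), site in seq)
```

In both variants only the third base of the first two codons changes, so the encoded residues are unchanged while the PstI site is lost.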
[0339] Two derivatives of pACYC177 are made by site directed mutagenesis, pACYC177-PvuII and pACYC177-EagI, both of which remove the PstI site starting at position +299.
[0340] Both of these derivatives are then used as templates in a second experiment, changing the T at position +2703 to C, creating a unique PstI site at that position, in plasmids called pACYC177-PvuII-3′-PstI and pACYC177-EagI-3′-PstI. Another derivative can also be made, creating an EcoRI site near the 3′ end of the gene, that does not alter the two consecutive amino acids encoded at those positions.
[0341] Plasmid DNAs are purified and subjected to restriction enzyme analysis confirming the presence or absence of the expected restriction enzyme sites, and sequenced across the boundaries of the mutagenized sequences.
[0342] Bacteria comprising the parental pACYC177 plasmid and the variants are tested on a series of agar plates, and the variants are expected to confer resistance to Ampicillin and Kanamycin at the same level as the parental plasmid.
TABLE-US-00021 Sequence Alignment 19: Junction sequences at the 3' end of genes encoding C-terminal NPT-II (KAN)-mini-attTn7 fusion proteins pKM2 cttcttgacgagttcttc TGAgcgggactctggggttcgaaatgaccacca (SEQ ID NO: 67/68) L L D E F F * pKM243
[0343] Plasmid DNAs comprising the synthetic oligonucleotides noted above are recovered, and sequenced to confirm their expected structure, and bacteria harboring the unaltered pACYC177 and the variant plasmids are spread on a series of agar plates containing increasing concentrations of kanamycin to determine their phenotype.
TABLE-US-00022 TABLE 12 Expected Phenotypes of DH10B Harboring Plasmids Comprising KAN-mini-attTn7 Fusion Proteins
Designation DH10B/plasmid(s) | Markers | Inc Group | Expected Phenotype | Stable | SEQ ID NOS | Source
pKM2 | Cam.sup.R, Kan.sup.R | | Kan plus (+) | Yes | 67/68 | [Reiss et al (1984)]
pKM243 | Cam.sup.R, Kan.sup.R | | Kan plus (+) | Yes | 69/70 | [Reiss et al (1984)]
pKM243/1 | Cam.sup.R, Kan.sup.R | | Kan plus (+) | Yes | 71/72 | [Reiss et al (1984)]
pKM243-1 | Cam.sup.R, Kan.sup.S | | Kan minus (−) | Yes | 73/74 | [Reiss et al (1984)]
pACYC177 | Amp.sup.R, Kan.sup.R | P15A | Kan plus (+) | Yes | 75/76 | This study
pACYC177-QA | Amp.sup.R, Kan.sup.R | P15A | Kan plus (+) | Yes | 77/78 | This study
pACYC177-PS | Amp.sup.R, Kan.sup.S | P15A | Kan minus (−) | Yes | 79/80 | This study
pACYC177-PSFNAVVYHS | Amp.sup.R, Kan.sup.S | P15A | Kan minus (−) | Yes | 81/82 | This study
[0344] A series of additional plasmids are prepared, which contain a synthetic mini-attTn7 that overlaps with the normal TAA stop codon, or with codons just upstream from it that encode other amino acids, particularly codons, such as one for Proline (P), that may yield an inactive form of a slightly extended NPT-II fusion protein. Transposition into a sequence encoding an inactive NPT-II-overlapping mini-attTn7 fusion protein should restore activity, allowing direct selection and recovery of bacteria harboring plasmids with transposition events.
TABLE-US-00023 Sequence Alignment 20: Staggered sets of synthetic nucleotides encoding double TAA stop codons from near the 3' end of the NPT-II gene of pACYC177 lined up with a synthetic mini-attTn7 sequence
EcoRI GAATTC SpeI ACTAGT
^ ^ ^ ^
ATGCTCGATGAGTTTTTC TAA TCAGAATTGGTTAATTGGTTGT (SEQ ID NO: 75/76)
M L D E F F *
TABLE-US-00024 TABLE 13 Expected Phenotypes of DH10B Harboring pACYC177-based plasmids comprising KAN-mini-attTn7 fusion proteins with staggered sets of TAA stop codons
Designation DH10B/plasmid | Markers | Inc Group | Phenotype | Stable | Source
pACYC177-MLDEFF* | Amp.sup.R, Kan.sup.R | P15A | Kan plus (+) | Yes | This study
pACYC177-MLD** | Amp.sup.R, Kan.sup.? | P15A | Kan minus (−) | Yes | This study
pACYC177-MLDE** | Amp.sup.R, Kan.sup.? | P15A | Kan minus (−) | Yes | This study
pACYC177-MLDEF** | Amp.sup.R, Kan.sup.? | P15A | Kan minus (−) | Yes | This study
pACYC177-MLDEF*** | Amp.sup.R, Kan.sup.? | P15A | Kan minus (−) | Yes | This study
pACYC177-MLDEFQ** | Amp.sup.R, Kan.sup.R | P15A | Kan plus (+) | Yes | This study
pACYC177-MLDEFQA* | Amp.sup.R, Kan.sup.R | P15A | Kan plus (+) | Yes | This study
pACYC177-MLDEFP** | Amp.sup.R, Kan.sup.? | P15A | Kan minus (−) | Yes | This study
pACYC177-MLDEFPS* | Amp.sup.R, Kan.sup.? | P15A | Kan minus (−) | Yes | This study
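The staggered truncation series named in Table 13 can be generated from the 3′ end of the gene shown in Sequence Alignment 20 by keeping a chosen number of codons and appending TAA stop codons. This is a minimal sketch covering only the pure truncation variants (not the Q, QA, P, or PS extensions); the function names and the `kept_codons` and `stops` parameters are illustrative.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate every complete codon; stop codons are written as '*'."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Last six sense codons of the gene, from Sequence Alignment 20 (M L D E F F).
GENE_TAIL = "ATGCTCGATGAGTTTTTC"


def staggered_variant(kept_codons, stops=2):
    """Keep the first `kept_codons` codons of the tail, then add TAA stops."""
    return GENE_TAIL[:3 * kept_codons] + "TAA" * stops


if __name__ == "__main__":
    print(translate(GENE_TAIL + "TAA"))  # unmodified end of the gene
    for kept in (3, 4, 5):
        print(translate(staggered_variant(kept)))
    print(translate(staggered_variant(5, stops=3)))
```

The translated strings correspond to the suffixes used in the plasmid designations, which makes it easy to confirm that each synthetic oligonucleotide encodes the intended truncation.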
[0345] E. coli DH10B cells comprising the unmodified parent plasmid or each of the variant plasmids are then spread on agar plates comprising Ampicillin, plus different concentrations of Kanamycin to determine the relative sensitivity to Kanamycin. The phenotypes should match what is predicted in the tables noted above.
[0346] If the phenotypes are as expected, then the plasmid containing the mini-attTn7 sequence can be used as the basis for additional experiments in which a helper plasmid is introduced into the cells, a donor plasmid is transformed in, and the cells are plated in the presence of ampicillin and kanamycin. (The marker on the donor plasmid may need to be changed so that it differs from that used by the target plasmid.) All target plasmids that confer resistance to Amp and Kan should have a mini-Tn7 inserted at the 3′ end of the truncated/extended NPT-II (Kan) gene.
[0347] Variants of plasmids based on pACYC177 can also be created using any of a variety of other replicons. Vectors provided by Twist Bioscience, for example, can also be used. In the series noted below, key segments derived from the kanamycin resistance gene of pACYC177 are synthesized and inserted into pTwist-Chlor-MC (also abbreviated as pTCM), which confers resistance to chloramphenicol and has a medium copy number replicon derived from the plasmid p15A. Polylinker sequences flank the entire kanamycin resistance gene, including its promoter, and contain two or more 8-bp recognition sites for rare cutting restriction enzymes, such as MauBI, AbsI, SgrDI, and AscI.
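The reason 8-bp sites are described as rare cutting can be illustrated with a simple expectation: in random DNA with equal base usage, a specific site of length n occurs about once every 4^n bases. The sketch below is a rough estimate under that assumption, which real plasmid sequences do not strictly satisfy. The recognition sequences in `RARE_CUTTERS` are assumptions based on commonly published enzyme data and should be verified against supplier documentation; the toy polylinker is hypothetical.

```python
# Assumed recognition sequences (verify against supplier data before use).
RARE_CUTTERS = {
    "MauBI": "CGCGCGCG",
    "AbsI": "CCTCGAGG",
    "SgrDI": "CGTCGACG",
    "AscI": "GGCGCGCC",
}


def expected_spacing(site_length):
    """Average distance between sites in random DNA with equal base usage."""
    return 4 ** site_length


def count_sites(sequence, site):
    """Count (possibly overlapping) occurrences of a site on the given strand."""
    sequence = sequence.upper()
    width = len(site)
    return sum(1 for i in range(len(sequence) - width + 1)
               if sequence[i:i + width] == site)


if __name__ == "__main__":
    print("6-bp site: about 1 per", expected_spacing(6), "bp")
    print("8-bp site: about 1 per", expected_spacing(8), "bp")
    toy_polylinker = "AAGGCGCGCCTTCGTCGACGAA"  # hypothetical example sequence
    for enzyme, site in RARE_CUTTERS.items():
        print(enzyme, count_sites(toy_polylinker, site))
```

Since each listed site is its own reverse complement, scanning one strand is sufficient for counting.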
TABLE-US-00025 TABLE 14 Expected Phenotypes of DH10B Harboring pTwist-Chlor-MC plasmids comprising KAN-mini-attTn7 fusion proteins with staggered sets of TAA stop codons
Short Name | Base Vector Markers | Insert Markers | Expected Phenotype | Insert Segments | SEQ ID NOS
pTwist + Chlor + MC | CAT | None | CamR | None | 173
pTCM-MaAbAySgAs | CAT | None | CamR | MauBI-AbsI-AvrII-SgrDI-AscI polylinker | 174
pTCM-Kan-CGRT | CAT | Kan | CamR, KanR | Kan extended with CGRTK to mimic Tn7Lrf1 | 175/176
pTCM-Kan-PSFNAVVYHS | CAT | Kan | CamR, KanS | Kan extended with PSFNAVVYHS to mimic prior art reference | 177/178
pTCM-Kan-PS | CAT | Kan | CamR, KanS | Kan extended with PS to mimic prior art reference with silent EcoRI and SpeI sites | 179/180
pTCM-Kan-Tn7Lrf1 | CAT | Kan | CamR, KanR | Kan extended with CGRTK with partial Tn7L rf1 | 181/182
pTCM-Kan-Tn7Lrf2 | CAT | Kan | CamR, Kan??? | Kan extended with LWADKIVGNWEGWKWSF with partial Tn7L rf2 | 183/184
pTCM-Kan-Tn7Lrf3 | CAT | Kan | CamR, Kan??? | Kan extended with PVGGQNSWELGGVEMEFLRII with partial Tn7L rf3 | 185/186
pTCM-Kan-PS-mini-attTn7 | CAT | Kan | CamR, KanS | Kan extended with PS and overlapping mini-attTn7 | 187/188
pTCM-Kan-PS | CAT | Kan | CamR, KanS | Kan extended with PS to mimic prior art reference without silent EcoRI or SpeI sites | 189/190
pTCM-Kan | CAT | Kan | CamR, KanR | Kan gene from pACYC177 not extended or truncated without silent EcoRI or SpeI sites | 191/192
[0348]
Example 5—Design of Modular Sequences Encoding an Inactive β-Lactamase (BLA)-Mini-attTn7 Fusion Polypeptide
[0349] A large class of enzymes, called β-lactamases (BLAs), catalyze the hydrolysis of β-lactam antibiotics, such as penicillins and cephalosporins, allowing bacteria harboring genes encoding these enzymes to confer resistance to these compounds. Four general classes (A-D) of β-lactamases are recognized, based sequence similarity and functionality by their hydrolysis rates against a predefined panel of drug products. The physiological targets of β-lactam antibiotics are membrane DD-peptidases, which are responsible for the biosynthesis of peptidoglycan, a major component involved in the maintaining the shape and rigidity of the bacterial cell wall in Gram-positive and Gram-negative bacteria. β-lactam antibiotics acylate the active site serine residue of DD-peptidases, forming stable covalent non-catalytic acyl-enzymes, resulting in the formation of defective peptidoglycan and cell death. While the widespread emergence of drug resistant strains of pathogenic bacteria has tempered the development of new β-lactam antibiotics, analysis of substrate specificities of β-lactamases encoded by genes isolated from pathogenic strains, and from systematic mutagenesis by various combinations of substitution, insertion, or deletion, of amino acids across the entire length of related enzymes, has greatly facilitated 3-dimensional structure/function studies, and the roles of highly conserved amino acid residues involved in binding of a substrate, thermostability, or folding of the molecule [Matagne et al (1998)] [Axe (2000)] [Hecky and Muller (2005)]. These and many other studies have facilitated the development of other applications involving the use of genes encoding β-lactamases to facilitate the selection of vectors comprising cloned genes. Many of the commonly used cloning vectors comprise a bla.sub.TEM-1 gene encoding the broad spectrum TEM-1β-lactamase (class A) that is present on transposons Tn2 and Tn3 found in many Gram-negative bacteria.
[0350] An alignment of 20 Class A β-lactamases facilitated the numbering of specific amino acid residues within this complex family of related enzymes [Ambler et al (1991) A standard numbering scheme for Class A β-lactamases. Biochem J. 276: 269-272]. The plasmid encoded enzyme designated as R-TEM in this paper, starts with the amino acids “MSIQH” and terminating with “LIKHW” corresponds to positions +3 to +290 on the aligned consensus sequence. The alignment of TEM-1 against the consensus sequence, also shows postulated deletions “.”, at positions 239 and 253, for R-TEM, accounting for its size from the N-terminal methionine, to carboxy terminal tryptophan, of 286 amino acids. Class A β-lactamases from other bacteria in this alignment, range in size from 283 to 295 amino acids.
[0351] The bla gene in the cloning vector pBR322 encodes an enzyme that is 286 amino acids long, which includes a 23 amino acid signal peptide linked to a 263 amino acid secreted product. The same polypeptide is encoded by the bla gene on the popular cloning vectors pACYC177, pUC18, and pUC19.
[0352] One notable study randomized three contiguous codons at a time to create libraries of all possible amino acid substitutions for each randomized region within the gene encoding TEM-1 β-lactamase, finding that 43 of 263 amino acids do not tolerate substitutions, and are critical for the structure and activity of the enzyme [Huang et al (1996) J. Mol. Biol. 258: 688-703.]. A remarkable observation was that, of the four tryptophan residues in TEM-1 (at standard positions +165, +210, +229, and +290), Trp165 could tolerate substitutions. The carboxy-terminal tryptophan at standard position +290 was identified as being a member of Class 4, whose 30 residues are invariant in TEM-1 but not in other Class A enzymes, compared to Class 1, which has 210 residues that vary in Class A and TEM-1, Class 2, which has 23 residues that are invariant in Class A and TEM-1, and Class 3, whose 10 residues are invariant in Class A but not in TEM-1.
[0353] Analysis of a series of N-terminal and C-terminal deletion variants of TEM-1 β-lactamase demonstrated impaired resistance to ampicillin on agar plates, and impaired ability of the purified enzymes to hydrolyze the chromogenic β-lactam compound nitrocefin as a substrate [Hecky and Muller (2005)]. Four variants were studied: NΔ3 and NΔ5, deleting the first 3 and first 5 amino acids, respectively, from the amino terminus of the mature protein, and CΔ1 and CΔ3, deleting the last 1 and last 3 amino acids, respectively, from the carboxy terminus of the mature protein. No colonies were observed for the NΔ5 and the CΔ3 clones on agar plates containing up to 50 ug/ml of ampicillin, suggesting an important role for the terminal residues. Reduced numbers of colonies were also observed for the NΔ3 and the CΔ1 clones, compared to control clones comprising a non-truncated version of the gene. These and other experiments demonstrated that deletion of 5 amino acids from the N-terminus decreased the thermostability of the enzyme in vivo and in vitro, although the authors noted a difference of opinion regarding the “essential” nature of the single C-terminal tryptophan residue observed by Huang et al (1996). Many of the experiments by Hecky and Muller, though, focused on mutagenesis and directed evolution of ampicillin-resistant variants derived from the inactive NΔ5 clone, rather than on additional analysis of the CΔ1 and CΔ3 truncated variants.
[0354] The demonstrations by Huang et al (1996) and Hecky and Muller (2005) of critical residues near the carboxy terminal end of the TEM-1 β-lactamase provide the opportunity to design and assemble synthetic genes comprising most of the bla gene in common cloning vectors fused to sequences derived from the attachment site for Tn7 (attTn7), and comparable site-specific target sites from other Tn7-like and site-specific mobile genetic elements.
[0355] Strategies similar to those described above for the design and construction of CAT-attTn7 gene fusions can also be applied to generate bla.sub.TEM-1-mini-attTn7 fusions (which may also be referred to as BLA- or AMP-mini-attTn7 fusions), where a TAA, TGA, or TAG stop codon is inserted at or near the codons encoding the amino acids lysine (K), histidine (H), or tryptophan (W) that are located at the 3′ end of the gene just before the normal TAA stop codon. These studies can be performed using many common cloning vectors comprising a TEM-1 bla gene, including pBR322, pACYC177, and pUC-based plasmids, as noted below, or carried out using bla genes derived from other Class A, B, C, or D β-lactamases encoded on conjugative plasmids or the chromosomes of other bacteria.
TABLE-US-00026 Sequence Alignment 21: 3' end of β-lactamase gene from pACYC177 showing TGG codon for essential tryptophan residue before the TAA stop codon
 BanI (G'GYRC,C)
 |
AGGTGCCTCACTGATTAAGCATTGG TAACTGTCAGACCAAGTTTACTCAT (SEQ ID NO: 87/88)
  G  A  S  L  I  K  H  W   *
                       | “Essential” Trp
-------------------TAATAA ------------------------- (SEQ ID NO: 89/90)
---------------------TAA TAA----------------------- (SEQ ID NO: 91/92)
------------------------ TAATAA-------------------- (SEQ ID NO: 93/94)
[0356] The predicted amino acid sequences from these fusions are not shown, but would terminate at different points in the left arm of the mini-Tn7 sequences transposed into the insertion site of the mini-attTn7 used (not shown, but similar to those noted earlier), which overlaps with codons near the 3′ end of the beta-lactamase gene in pACYC177.
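The effect of placing a stop codon at the Lys, His, or Trp codon can be checked with a short translation sketch. This is an illustrative example only, assuming the 50-nucleotide sequence shown in Sequence Alignment 21 with the reading frame beginning at its second nucleotide (GGT, Gly); the helper names are hypothetical and the standard genetic code is assumed.

```python
# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna, offset=0):
    """Translate from `offset`; the first stop codon is reported as '*' and ends the result."""
    dna = dna.upper()
    protein = []
    for i in range(offset, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


def replace_codon(dna, offset, codon_index, new_codon):
    """Return `dna` with the codon at `codon_index` (0-based, in frame `offset`) replaced."""
    start = offset + 3 * codon_index
    return dna[:start] + new_codon + dna[start + 3:]


# 3' end of the bla gene as shown in Sequence Alignment 21 (spaces removed).
BLA_3PRIME = "AGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCAT"
FRAME = 1  # assumption: the frame starts at the second nucleotide (GGT = Gly)

if __name__ == "__main__":
    print("parent :", translate(BLA_3PRIME, FRAME))
    # Codon indices in this window: 5 = Lys (AAG), 6 = His (CAT), 7 = Trp (TGG)
    for name, index in (("Lys", 5), ("His", 6), ("Trp", 7)):
        variant = replace_codon(BLA_3PRIME, FRAME, index, "TAA")
        print(f"{name}->TAA:", translate(variant, FRAME))
```

Replacing a codon with TAG or TGA instead of TAA works the same way, since only the `new_codon` argument changes.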
[0357]
Example 6—Design of Modular Sequences Encoding an Active β-Lactamase (BLA)-Mini-attTn7 Fusion Polypeptide Conferring Resistance to Ampicillin (AMP)
[0358] Plasmids encoding inactive alpha and omega fragments of β-lactamase that can complement to form a functional enzyme in both bacteria and in mammalian cells were first reported by Wehrman et al (2002). In these studies, the junction between the alpha fragment (α197) and the omega fragment (ω198) lies between a glutamic acid (E) residue at position +197, using the standard numbering scheme, and a leucine (L) residue at position +198. In the TEM-1 β-lactamases encoded by pBR322, pACYC177, and the pUC series of plasmids, this junction is between the E and L amino acid residues at positions +195 and +196, respectively, where the methionine (M) residue at the start of the gene is considered +1. These two fragments complemented to produce detectable activity in bacteria when fused to flexible (Gly.sub.4Ser.sub.3).sub.3 linkers and two helices (the carboxy terminus of the Jun helix and the amino terminus of the Fos helix) that formed a leucine zipper. Extension of the carboxy terminus of the α197 peptide by 3 amino acids, Asn-Gly-Arg (NGR), before the flexible linker and the Jun helix, increased the ability of the extended alpha fragment to bind to the omega fragment by 4 orders of magnitude. Comparable experiments were also performed in mammalian cells, where a gene encoding an alpha fragment comprising FRB was co-expressed with an omega fragment comprising FKBP12, with both fusion proteins lacking the bacterial signal peptide. In the presence of rapamycin, a small cell-permeable molecule that can bind to both FRB and FKBP12, the α197FRB and FKBP12ω198 fragments could bind and complement, indicating reconstitution of β-lactamase activity. Use of this system as a biosensor was proposed, to probe novel protein-protein interactions, comparable to several other types of mammalian two hybrid assay systems.
[0359] The clear identification of the junction between two contiguous fragments of β-lactamase allows for the design of novel fusion proteins where a different type of synthetic polypeptide is inserted at the junction between the alpha and omega fragments. In these studies, the synthetic polypeptide is similar to the polypeptide encoded by the sequence inserted into the lacZalpha gene on the bacmid bMON142, noted above, where the attTn7 target site is inserted in frame between the start of the lacZalpha polypeptide (amino acids 1-5) and sequences encoding amino acids 7-41 and beyond, with additional amino acids encoded by different parts of the synthetic multiple cloning site in the vector used to assemble the chimeric gene.
TABLE-US-00027 Sequence Alignment 22: Sequences from the PstI site to BglI site in pACYC177 spanning a junction encoding the carboxy terminal end of an alpha fragment and the N-terminal end of an omega fragment of beta-lactamase +295 |PstI(C,TGCA'G) FspI(TGC'GCA) AseI(AT'TA,TT)
[0360] pACYC177 is digested with PstI and BglI and a synthetic oligonucleotide with compatible sticky ends is ligated to it that has an EcoRI site located after the junction of the sequences encoding the alpha fragment of β-lactamase and a SalI site located before the start of the sequences encoding the start of the omega fragment. The PstI and BglI sites are unique in pACYC177. The reading frame is adjusted so that the start of the EcoRI site and the SalI sites are both in the +3 relative reading frame (the wobble position for a codon). In the example noted above, additional nucleotides are added before and after the EcoRI and SalI sites to adjust the reading frame appropriately. In the illustrated example, a site for NotI is added to separate the EcoRI and SalI sites, though the exact sequences before, after, or in between these sites, are not critical to the design of this vector. Other sites, such as those encoding TAA, TAG, or TGA stop codons, or ATG start codons may also be used, depending on the nature of subsequent experiments.
TABLE-US-00028 Sequence Alignment 23: Sequences in a variant pACYC177 comprising a synthetic linker spanning a junction encoding the carboxy terminal end of an alpha fragment and the N-terminal end of an omega fragment of beta-lactamase +295 (SEQ ID NOS: 106/107) |PstI(C,TGCA'G) FspI(TGC'GCA) EcoRI NotI SalI AatII AseI(AT'TA,TT) | | | | | | |
[0361] The resulting plasmid is then digested with EcoRI and SalI to insert the synthetic mini-attTn7 derived from the bacmid bMON14272, producing a vector designated pACYC177-bla-mini-attTn7. In this case, the new plasmid should confer resistance to ampicillin and kanamycin, since the synthetic oligonucleotide encodes a flexible linker between the alpha and omega fragments of the bla gene. The new plasmid can then be used in a series of experiments demonstrating that transposition into the attTn7 target site disrupts expression of the fusion protein encoded by the synthetic bla gene. A plasmid comprising a Tn7 element inserted into the middle of the synthetic target site should confer resistance to kanamycin, but not ampicillin.
TABLE-US-00029 Sequence Alignment 24: Sequences in a pACYC177 variant comprising a synthetic mini-attTn7 at the junction of the alpha and omega fragments of beta-lactamase
+295
|PstI(C,TGCA'G) FspI(TGC'GCA)
| |
ATGCCTGCAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAA (SEQ ID NO: 108)
M  P  A  A  M  A  T  T  L  R  K  L  L  T  G  E (SEQ ID NO: 109)
|                                            |
+180                                         +195
EcoRI |< Synthetic polypeptide encoded by mini-AttTn7
acgaattcacataacaggaagaaaaatgccccgcttacgcagggcatc
T  N  S  H  N  R  K  K  N  A| P |L  T  Q  G  I
−2 +2
<-------------------- Insertion Site ---------
SalI ------------------------------------------ |-----
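The reading-frame requirement for the linker, namely that the EcoRI site begins at the third (wobble) position of a codon, can be checked directly on the junction sequence. The sketch below is illustrative only, assuming the two 48-nucleotide segments shown in Sequence Alignment 24 are contiguous and that the first segment begins in frame; the function name is a hypothetical helper. The SalI site is not checked because that part of the sequence is not shown.

```python
def codon_position(index, frame_start=0):
    """Return 1, 2, or 3: the position within its codon of a 0-based nucleotide index."""
    return (index - frame_start) % 3 + 1


# Segments copied from Sequence Alignment 24 (assumed contiguous and in frame).
ALPHA_END = "ATGCCTGCAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAA"   # residues +180 to +195
ATT_START = "acgaattcacataacaggaagaaaaatgccccgcttacgcagggcatc"   # start of mini-attTn7 linker

junction = (ALPHA_END + ATT_START).upper()
ecori_index = junction.find("GAATTC")  # EcoRI recognition sequence

if __name__ == "__main__":
    print("junction length is a multiple of 3:", len(junction) % 3 == 0)
    print("EcoRI site starts at index", ecori_index,
          "= codon position", codon_position(ecori_index))
```

The same check can be applied to any candidate linker before synthesis by changing the two input strings.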
[0362] Nitrocefin is a chromogenic substrate for β-lactamase. In the presence of nitrocefin, colonies of cells that are resistant to ampicillin or related β-lactam antibiotics are red, compared to pale yellow for colonies of cells that are not resistant to the antibiotic. Nitrocefin and its product are much more soluble than the indigo dye produced when beta-galactosidase reacts with a chromogenic substrate such as X-gal or Bluo-gal.
[0363] Strategies similar to those noted above for the CAT-mini-attTn7 and Kan-mini-attTn7 fusions can also be used to design comparable bla-alpha-mini-attTn7 fusions, where one or more stop codons are inserted before the codon at the carboxy terminus of the alpha peptide. In a system where both alpha and omega polypeptides are needed to complement and restore activity of the β-lactamase, transposition by a mini-Tn7 into a sequence encoding a truncated alpha fragment with an overlapping mini-attTn7 sequence will restore expression of the alpha polypeptide or an extended form of it, that can complement with an omega fragment expressed under the control of a different promoter. These strategies should work for both prokaryotic and eukaryotic systems, if the sequences encoding the alpha and omega polypeptide fragments are operably linked to promoters that are functional in the host cells, and if the two fragments can bind to each other by non-covalent bonds, optionally mediated by a third molecule. In prokaryotic systems, signal peptides may be needed to facilitate delivery of each fragment to an appropriate location in the cell, compared to eukaryotic cells, where they may be omitted, as noted above, in the experiments reported by Wehrman et al (2002).
[0364]
Example 7—Design of Modular Sequences Encoding Active and Inactive Tetracycline Resistance (Tet)-Mini-attTn7 Fusion Polypeptide
[0365] At least 30 major classes of genes (A-Z and beyond) have been identified that confer resistance to tetracycline in Gram-negative bacteria, all showing significant homology at the nucleotide and amino acid levels [Levy et al (1999)]. The encoded products are cytoplasmic membrane-bound antiporter proteins, which mediate energy-dependent export of tetracycline from the cell in exchange for a proton. Class A and C proteins, Tet(A) and Tet(C), respectively, are 78% identical to each other, but only 48% identical to the class B protein, Tet(B) [Rubin and Levy (1991)]. The Class B proteins have 12 transmembrane (TM1-TM12) regions comprising α-helices arranged in two bundles of 6 helices, 1-6 and 7-12, apparently the product of a gene duplication, which in turn followed a duplication of a 3 helix motif [Waters et al (1983)]. Genes encoding proteins from many of these classes have been studied extensively using random and systematic methods of mutagenesis, creating protein variants having one or more substitutions, insertions, or deletions at or spanning nearly every position of their primary sequence, contributing greatly to the identification of key residues involved in the transport of molecules across a bacterial membrane. The N- and C-terminal ends of the protein (˜8 and ˜15 aa long) are located in the cytoplasm. The interdomain loop, separating the α and β domains (N- and C-terminal halves, comprising helices 1-6 and 7-12, respectively) of the Class B and C proteins, is much larger (˜27 aa) than other loop segments exposed to the cytoplasmic (9-10 aa) or periplasmic (3-11 aa) sides of the membrane, less conserved across families of related proteins, and generally more tolerant of alterations than membrane-bound segments of the transporter protein [Saraceni-Richards and Levy (2000) 275(9): 6101-6106]. Other studies have suggested that the interdomain loop may be larger, encompassing as many as 40 amino acids, because the predicted sequence of the Class B protein diverges strongly (˜10% identity) from the Class A and C proteins throughout this region [Waters et al (1983)].
[0366] Analysis of a variety of deletion mutants in a Tn10-derived gene showed that deletions corresponding to Δ204-207, Δ195-199, Δ182-197, Δ195-200, Δ202-207, Δ193-199, Δ201-207, Δ180-187, Δ182-189, and Δ200-207 all conferred resistance to at least 50 uM tetracycline (minimal inhibitory concentration, MIC) on agar plates [Wright and Tate (2015)]. A larger deletion of 9 contiguous amino acids, Δ198-207, and the double deletion mutants Δ195-199; 204-207, Δ182-187; 204-207, Δ182-187; 195-199, Δ182-187; 200-208, and Δ182-187; 196-207 conferred resistance to 10-20 uM tetracycline, suggesting that larger deletions, or double deletions combining Δ182-187 with the central to carboxy-terminal portion of this region (195-199, 196-207, 200-208, or 204-207), impair the activity of the protein more than single contiguous deletions of 4-8 residues starting at positions 180, 182, 193, 195, 200, 202, and 204. None of the variants analyzed deleted the 4 contiguous amino acids “TDTE” from positions 189-192, which correspond to “PMPL” spanning positions 191-194 in the pACYC184-derived protein. These results suggest that while nucleotides and amino acids in this region are not highly conserved, deletions of 9-19 additional residues affect the activity of the protein.
[0367] A series of two-codon insertions into the SalI or AccI sites of pBR322, corresponding to sequences encoding RRP from 189-191, did not appear to impair activity of the protein (allowing growth on 100 ug/ml oxytetracycline), while two-codon insertions at HpaII and HhaI sites partially encoding “FR” from 203-204 and “AR” from 206-207, near the C-terminal part of the interdomain loop, allowed growth only on plates containing 15 or 30 ug/ml oxytetracycline or less, respectively [Barany, F (1985) PNAS 82: 4202-4206]. These results demonstrated a high tolerance for insertions of sequences encoding two amino acids at the SalI site, and perhaps other nearby sites, consistent with the experiments noted above showing that deletions of 8 or fewer contiguous amino acids are also tolerated in this segment encoding the interdomain loop.
[0368] A series of elegant experiments by Levy and coworkers also demonstrated that two inactive proteins, each containing a mutation in the opposite domain, are capable of complementation to produce an active enzyme [R. A. Rubin and S. B. Levy, (1990)]. Inactive interdomain hybrid proteins between class B and C Tet proteins [Tet(B)α/Tet(C)β and Tet(C)α/Tet(B)β] can together complement in trans to produce an active enzyme. Cells comprising genes encoding interdomain hybrids in which a frameshift mutation and a terminator were inserted at the fusion junction, resulting in expression of the four domains on separate polypeptides, showed trans complementation without production of full length proteins [Rubin and Levy (1991)]. The activity of the reconstituted enzyme was slightly lower, but still substantial (˜20% of the wild-type level), strongly suggesting that the Tet(B) α and β domains were expressed as separate functional proteins. These and other extensive mutagenesis experiments support the idea that the α and β domains can complement in trans at least as effectively as full length hybrid proteins, whose activity is typically 10-20% of that of the full length wild type enzyme.
[0369] Transposon Tn10 comprises a Class B gene, designated tetA(B), which encodes a tetracycline-inducible protein, which is sufficient to confer resistance to the antibiotic. The transposon also has a gene tetR(B), which encodes a repressor, and several other genes, including tetC(B) and tetD(B), jenA, jenB, and jenC, flanked by long (1209 nt) inverted IS10 insertion sequences encoding a transposase.
[0370] Tn10 was derived from a drug resistance plasmid found in the enteric bacterium Shigella flexneri, and referred to as NR1, R22, or R100 by several different laboratories. This plasmid, which has a very low copy number (1-2 copies/cell), and is classified in the IncFII incompatibility group, confers resistance to chloramphenicol, fusidic acid, streptomycin/spectinomycin, mercuric salts, and tetracycline. NR1 is compatible with the fertility plasmid, F, first characterized in E. coli.
[0371] Genes conferring resistance to tetracycline are found in many common cloning vectors. The plasmid pSC101 is a natural plasmid isolated from Salmonella panama that confers resistance only to tetracycline. Plasmid pACYC184, which confers resistance to chloramphenicol and tetracycline, was derived from pSC101. The synthetic vector pBR322 is derived from 3 plasmids: it carries the Class C tetracycline resistance gene of pSC101, the ampicillin resistance gene of RSF2124, and a replicon derived from pMB1, a close relative of the ColE1 plasmid. Plasmid pBR322, which has a variety of unique restriction sites located in the genes conferring resistance to ampicillin and tetracycline, was widely used for many years to facilitate cloning of genes, by inserting plasmid or amplified DNA fragments digested with appropriate enzymes, allowing ligation and recovery of plasmids that confer resistance to ampicillin but not tetracycline, or to tetracycline but not ampicillin. Cloning by insertional inactivation of the bla or tet genes is facilitated by unique EcoRV, NheI, BamHI, and SalI sites, among others, in the tet gene, and unique ScaI, PvuI, and PstI sites, among others, in the bla gene, along with a unique EcoRI site located between the two genes. The unique SalI site is located in a segment near the middle of the tet gene in pSC101, pBR322, and pACYC184 that encodes the interdomain loop region.
[0372] Several studies have reported methods for the direct selection of bacteria that are sensitive to tetracycline. One group reported development of a medium containing the lipophilic chelating agents fusaric acid or quinaldic acid, which was effective for the selection of tetracycline-sensitive revertants of Salmonella typhimurium strains that were resistant due to insertion of Tn10 into their chromosomes [Bochner, B. R. et al (1980)]. An improved medium comprising fusaric acid, chlortetracycline, and zinc chloride, with lower levels of nutrient supplements, like tryptone, and no glucose, improved differentiation between tetracycline-sensitive and tetracycline-resistant strains [Maloy S R, and Nunn W D. (1981)]. Two other studies noted that overexpression of the membrane-bound protein renders cells more sensitive to toxic metal salts, such as nickel chloride or cadmium [Podolsky T, Fong S T, Lee B T. (1996)] [Griffith J K, et al (1982)].
[0373] These and other studies provide the basis for the design and assembly of novel gene fusions comprising one or more segments of a gene encoding a protein conferring resistance to tetracycline, and a segment comprising an attachment site for a site-specific transposon. In the sections noted below, segments of the tetracycline resistance gene of pACYC184 are altered, allowing insertion of a segment comprising a mini-attTn7, particularly within the non-conserved interdomain loop region, which should tolerate insertions of DNA encoding a variety of amino acids. Transposition of Tn7 or a mini-Tn7 segment into the mini-attTn7 should disrupt expression of the fusion protein, which can be monitored by screening ampicillin-resistant colonies on plates containing or lacking tetracycline, or by selecting for ampicillin-resistant colonies that are tetracycline sensitive in the presence of fusaric acid, quinaldic acid, nickel salts, or cadmium salts, as noted above.
[0374] The alignment shown below, illustrates conserved residues in the tet proteins derived from Tn10 and pACYC184/pSC101/pBR322 and the location of the interdomain loop near the middle of both proteins. The interdomain loop in pACYC184 corresponds to residues +183 to +209, while this region in Tn10 corresponds to residues +181 to +207.
TABLE-US-00030 Sequence Alignment 25: Alignment of tetracycline resistance proteins from Tn10 and pACYC184 showing conserved residues within cytoplasmic, membrane-bound, and periplasmic polypeptide domains
CLUSTAL O(1.2.4) multiple sequence alignment (SEQ ID NOS: 110/111)
Tn10     MN--SSTKIALVITLLDAMGIGLIMPVLPTLLREFIASEDIANHFGVLLALYALMQVIFA 58
pACYC184 MKSNNALIVILGTVTLDAVGIGLVMPVLPGLLRDIVHSDSIASHYGVLLALYALMQFLCA 60
         *: .: : * . ***:****:***** ***::: *:.**.*:***********.: *
Tn10     PWLGKMSDRFGRRPVLLLSLIGASLDYLLLAFSSALWMLYLGRLLSGITGATGAVAASVI 118
pACYC184 PVLGALSDRFGRRPVLLASLLGATIDYAIMATTPVLWILYAGRIVAGITGATGAVAGAYI 120
         * ** :*********** **:**::** ::* : .**:** **:::**********.: *
Tn10     ADTTSASQRVKWFGWLGASFGLGLIAGPIIGGFAGEISPHSPFFIAALLNIVTFLVVMFW 178
pACYC184 ADITDGEDRARHFGLMSACFGVGMVAGPVAGGLLGAISLHAPFLAAAVLNGLNLLLGCFL 180
         ** *...:*.: ** :.*.**:*::***: **: * ** *:**: **:** :.:*: *
<---- Interdomain loop --->
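Percent identity between two aligned proteins, of the kind quoted above for the Tet protein classes, can be computed column by column. The following is a minimal sketch, assuming identity is counted only over columns where neither row has a gap (other conventions exist); it uses only the first 60-column block of Sequence Alignment 25, so the value it reports applies to that excerpt and not to the full-length proteins. The function name is a hypothetical helper.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Return (matches, compared, percent) over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either row
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches, compared, 100.0 * matches / compared


# First alignment block only (columns 1-60), copied from Sequence Alignment 25.
TN10_BLOCK1 = "MN--SSTKIALVITLLDAMGIGLIMPVLPTLLREFIASEDIANHFGVLLALYALMQVIFA"
PACYC184_BLOCK1 = "MKSNNALIVILGTVTLDAVGIGLVMPVLPGLLRDIVHSDSIASHYGVLLALYALMQFLCA"

if __name__ == "__main__":
    matches, compared, pct = percent_identity(TN10_BLOCK1, PACYC184_BLOCK1)
    print(f"{matches} identical of {compared} ungapped columns ({pct:.1f}%)")
```

Applying the same function to a window restricted to the interdomain loop would give a loop-specific identity, which is the comparison relevant to choosing an insertion site.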
TABLE-US-00031 Sequence Alignment 26: Sequence from the reverse complement of pACYC184 flanking the Interdomain Loop of the tetracycline resistance protein +2052 SphI(G,CATG′C) | | pACYC184 TCCTTGCATGCACCATTCCTTGCGGCGGCGGTGCTCAACGGCCTCAACCTACTACTGGGC SEQ ID NO: 112 reverse S L H A P F L A A A V L N G L N L L L G SEQ ID NO: 113 complement | +183
[0375] The SphI, EcoNI and SalI recognition and cleavage sites illustrated in the sequence noted above, are unique in pACYC184. The AccI, HincII, and PshAI, each have two sites, and BbsI has three sites in this plasmid. Variant plasmids comprising unique AccI, HincII, PshAI and/or BbsI sites are made by altering the corresponding sites outside the region shown above by site directed mutagenesis, substituting one or more nucleotides in their recognition sequences for other residues, or adding or deleting one or more nucleotide residues, destroying one or more of the unwanted recognition sites.
[0376] The easiest variant to make is one where the second PshAI site is removed by insertion of a linker containing a site for another restriction enzyme, since the second site is located in a large intergenic region between the 3′ end of the cat gene encoding resistance to chloramphenicol, and the 3′ end of the tet gene. Synthetic oligonucleotides are prepared replacing one or more segments between the EcoNI and SalI sites, the SalI and PshAI sites, or the EcoNI and PshAI sites, substituting, inserting, or deleting nucleotide residues, typically in units of 3, to replace, add, or delete codons encoding one or more amino acids in the interdomain loop region. Other strategies for performing site-directed mutagenesis may also be used, to generate variants of pACYC184 vectors, or derivatives thereof, comprising the altered sequences noted below.
[0377] One of the simplest variants to make replaces the EcoNI-SalI fragment in pACYC184 with a synthetic fragment comprising part of this segment and a synthetic mini-attTn7 target sequence similar to those used in the construction of synthetic lacZalpha-mini-attTn7 sequences noted above, with the relative location of the restriction enzyme recognition sites altered to maintain the reading frame of the interdomain loop and the synthetic polypeptide encoded by the mini-attTn7 target sequences. Many other locations for insertion of a segment encoding a mini-attTn7 target sequence are possible, taking into account the relative activities of the variant proteins compared to the full length unaltered Tet protein noted in earlier mutagenesis studies. The size of the synthetic mini-attTn7 can also be altered, primarily in the sequences 5′ to and after the Tn7 insertion site (−2 to +2), while maintaining key sequences extending into those corresponding to the binding site of the protein encoded by the tnsD gene (+23 to +58).
TABLE-US-00032 Sequence Alignment 27: Insertion of a synthetic mini-attTn7 into a SalI site near sequences encoding the Interdomain Loop of the tetracycline resistance protein
+2052 SphI(G,CATG'C)
| |
pACYC184 TCCTTGCATGCACCATTCCTTGCGGCGGCGGTGCTCAACGGCCTCAACCTACTACTGGGC SEQ ID NO: 114
reverse S L H A P F L A A A V L N G L N L L L G SEQ ID NO: 115
complement | +158
EcoNI(CCTN'N,NNAGG) EcoRI
| |<------------ Synthetic mini-AttTn7 ---------
TGCTTCCTAATGCAGGAGTCGCATAAGGGAGAgaattcacataacaggaagaaaaatgccccgcttacgcagggcatc
C F L M Q E S H K G E N S H N R K K N A| P |L T Q G I
| | −2 +2
+183 +188
<Interdomain loop><-------------------- Insertion site --------
SalI/AccI/HincII(GTCCAG)
----------------------------------------------> |
TABLE-US-00033 Sequence Alignment 28: An EcoRI-SalI fragment comprising a synthetic mini-attTn7
Small versions of the synthetic mini-attTn7 site can be placed in frame with other segments of the tetracycline resistance protein.
EcoRI |<------------ Synthetic mini-AttTn7 ---------
Gaattcacataacaggaagaaaaatgccccgcttacgcagggcatccat (SEQ ID NO: 116)
[0378] Insertion by transposition of Tn7 or a mini-Tn7 derivative into the synthetic target site in a gene encoding a tet-mini-attTn7 fusion protein should result in expression of an altered α-fragment, extended by amino acid residues encoded by the left arm of Tn7 (in different amounts depending on the reading frame), and should disrupt the expression of the β-fragment, preventing assembly of a functional tetracycline resistance protein.
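The predicted consequence of an insertion, an extended N-terminal fragment followed by premature termination, can be modeled in a few lines. This is a rough sketch under several stated assumptions: the target is the in-frame segment shown in Sequence Alignment 27; the inserted element is a short hypothetical placeholder (it is not Tn7 sequence, and only its TGT start and ACA end follow the description of the transposon ends); the 5 bp duplicated at the insertion site are assumed to be one nucleotide on either side of the proline codon marked in the alignment; and the standard genetic code is used. All names are illustrative.

```python
# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate_to_stop(dna):
    """Translate from the first base, ending before the first stop codon (or at the end)."""
    dna = dna.upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def insert_element(target, site, element, dup_len=5):
    """Insert `element` so that target[site:site+dup_len] is duplicated on both sides of it."""
    return target[:site + dup_len] + element + target[site:]


# In-frame segment copied from Sequence Alignment 27 (interdomain loop plus mini-attTn7).
TARGET = ("TGCTTCCTAATGCAGGAGTCGCATAAGGGAGA"
          "gaattcacataacaggaagaaaaatgccccgcttacgcagggcatc").upper()
SITE = 59  # assumption: duplicated 5 bp = one base either side of the Pro codon (indices 59-63)

# Hypothetical placeholder element with a stop codon in each frame; NOT real Tn7 sequence.
PLACEHOLDER_ELEMENT = "TGTGGCTAACTAACTAAACA"

if __name__ == "__main__":
    disrupted = insert_element(TARGET, SITE, PLACEHOLDER_ELEMENT)
    print("before:", translate_to_stop(TARGET))
    print("after :", translate_to_stop(disrupted))
```

With a real left-arm sequence in place of the placeholder, the number of extra residues added before the first stop would depend on the reading frame at the insertion point, as the paragraph above notes.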
[0379] In a test system where host bacterial cells, harboring a target vector comprising a synthetic tet-mini-attTn7 gene that encodes a functional protein and a compatible helper plasmid encoding essential transposition proteins, are transformed with a mini-Tn7 donor plasmid that is incompatible with the helper plasmid, transposition of the mini-Tn7 into the mini-attTn7 on the target vector will disrupt expression of the tet gene. The phenotypic change from tetracycline resistant to sensitive can be monitored by spreading bacteria on plates containing chloramphenicol to select for the pACYC184 vector, plus the antibiotic corresponding to the resistance marker on the helper plasmid, and purifying and testing colonies on similar plates with varying amounts of tetracycline. Plasmid DNAs isolated from colonies that are sensitive to tetracycline are purified and analyzed to determine their structures compared to the parental vectors used in the experiment.
[0380] Bacteria comprising the target vector, helper plasmid, and donor plasmid can also be spread on agar plates containing the appropriate antibiotics, plus different concentrations of nickel salts, fusaric acid, or quinaldic acid, to select for bacteria that are sensitive to tetracycline. In this scheme, cells harboring plasmids having transposition events should survive, and those harboring the parental target plasmid, or the pACYC184 control plasmid, should not.
[0381]
Example 8—Summary of Direct Selection for or Screening of Transposition Events into Synthetic Mini-attTn7 Target Sites
[0382]
[0383] The following table summarizes key features of the methods described in each of the Examples, for direct selection or screening of insertions by transposition of a Tn7-based sequence into a target site comprising a synthetic attachment site operably-linked to a regulatory and coding sequence for a selectable or screenable marker gene.
TABLE-US-00034 TABLE 15 Key Examples of Direct Selection for or Screening of Transposition Events Into Synthetic mini-attTn7 Target Site*
Ex | Scheme | Target before transposition | After transposition | Selection/Screening | Key Reagent
1a, 1b | lacZalpha-mini-attTn7 | lacZalpha gene with synthetic mini-attTn7 inserted between codons 6-7; extra sequences from legacy MCS regions flanking mini-attTn7 are removed, allowing reuse of restriction sites in the MCS regions in construction of modular genetic elements | Expression of trimeric lacZalpha polypeptide disrupted, preventing complementation with acceptor polypeptide | Screening | Blue/White colonies; Lac Plus (+) to Minus (−)
2 | ΔCAT-mini-attTn7 | 3′ end of cat gene near codon for Cys overlapping with mini-attTn7 | Frameshift after transposition, CAT protein extended, restoring function | Selection | Cm S to Cm R
3 | ΔlacZalpha-mini-attTn7 | ΔlacZalpha with stop codons overlapping with synthetic mini-attTn7 near codons 40-41-mini-attTn7 | Frameshift after transposition, LacZalpha extended, restoring ability to complement with acceptor polypeptide | Selection | Blue/White colonies; Lac Minus (−) to Plus (+)
4a | ΔNPT-II-mini-attTn7 | NPT-II gene with proline residue replacing TAA stop codon-mini-attTn7 | Frameshift after transposition, NPT-II protein extended, restoring function | Selection | Kan S to Kan R
4b | ΔNPT-II-mini-attTn7 | NPT-II gene with proline residue replacing TAA stop codon-mini-attTn7 | Frameshift after transposition, NPT-II protein truncated, restoring function | Selection | Kan S to Kan R
5 | Δβ-lactamase-mini-attTn7 | bla gene with essential Trp codon near normal TAA stop codon with synthetic mini-attTn7 | Frameshift after transposition, BLA protein extended, restoring function | Selection | Nitrocefin: Amp S to Amp R
6 | β-lactamase-mini-attTn7 | bla gene with mini-attTn7 inserted between junction for alpha and omega fragments | BLA protein disrupted, destroying function | Screening | Amp R to Amp S
7a | Tet-mini-attTn7 | Tet gene with mini-attTn7 inserted into “interdomain loop” between left and right half for domain fragments | TET protein disrupted, destroying function | Screening/Selection | Select Tc sensitive on special plates; Tc R to Tc S
7b | ΔTet-mini-attTn7 | Tet gene with TAA stop codon at end of left or right domain fragment with overlapping mini-attTn7 | Truncated left or right domain fragment extended, restoring function and allowing complementation | Selection | Tc S to Tc R
*The original synthetic mini-attTn7 in Example 1a was on an EcoRI-SalI fragment comprising sequences that are 5′ to the Tn7 insertion site at relative positions −2 to +2, and the binding site for the product of the tnsD gene at relative positions +23 to +58. The composition of sequences at the insertion site is irrelevant to the binding of the TnsD recombinase protein. The relative position of the insertion site can be adjusted to the left or the right of the nucleotide sequences in the overlapping target gene by single nucleotide residues, allowing insertion of the transposon in an orientation-specific manner beginning at the left arm of Tn7 at the insertion site. The sequences from −2 to +2 are duplicated to the left of Tn7L and the right of Tn7R. Inverted repeats are at the ends of Tn7 with TGT nucleotides at the 5′ end of Tn7L, and ACA nucleotides at the 3′ end of Tn7R.
[0384] These and similar approaches (CAT-mini-attTn7 and Kan-mini-attTn7), which allow the direct selection of transposition events, dramatically increase the power of systems designed to insert one or more large segments of DNA into one or more specific sites on a plasmid, a shuttle vector, or the chromosome.
[0385] Promoters driving expression of the fusion proteins encoded by synthetic target sites may be altered, changing them to tightly inducible promoters, allowing expression only in the presence of specific inducing agents.
[0386] These methods have the potential to dramatically alter strategies for gene insertion in a wide variety of fields, including the development of synthetic transposition systems, where the ends of the transposon, genes encoding transposases, and the target site can be altered by random or site specific mutagenesis, and rare variants recovered by methods involving direct selection of transposition events.
Example 9—Design of Modular Baculovirus Shuttle Vectors Comprising Different Synthetic Mini-Tn7 Target Sequences
[0387] The development of baculovirus vectors capable of expressing heterologous proteins in cultured insect cells and larvae has transformed many fields of biology, particularly applications in the field of healthcare research leading to the development of therapeutic drug products, vaccines, components of diagnostic kits, cell and gene therapy vector systems, and general research tools [Luckow and Summers (1988b)] [O'Reilly, D. R., Miller, L. K., and Luckow, V. A. (1992)]. Proteins expressed at high levels greatly facilitate research studies that reveal the structure and function of polypeptide domains capable of carrying out catalytic reactions, the binding of co-factors, and other residues involved in the binding of a protein to other molecules within or outside a cell.
[0388] A wide variety of strategies have been developed to generate recombinant viruses suitable for the rapid production of heterologous proteins in insect cells susceptible to infection by a virus, which generally rely on homologous recombination between a wild-type or engineered virus and a transfer vector, or by site-specific transposition of a DNA cassette comprising a promoter and a gene of interest into a desired location within an engineered virus. General features of these approaches have been reviewed and compared in several reports, particularly for viral vector backbones and transfer vectors or donor plasmids that are available from a variety of commercial sources [Roy and Noad (2012)] [Lun et al (2011)] [Possee et al (2019)].
[0389] There is a persistent need, however, to develop improved methods for the generation of recombinant baculoviruses that are easier and more rapid than existing methods, or that lead to higher levels of expression of one or more heterologous proteins in cultured cells or insect larvae. Many strategies have been developed to improve the structural organization of DNA segments comprising one or more baculovirus promoters operably-linked to one or more genes of interest (GOIs) that are present in transfer vectors or donor plasmids, or to express the products of these genes as fusion proteins comprising amino- or carboxy-terminal tags to facilitate targeting, secretion, or purification of the heterologous protein from samples comprising host cell proteins and other viral proteins.
[0390] Nearly every laboratory involved in this type of research is capable of generating modified transfer vectors or donor plasmids, because they are small and easy to manipulate by traditional cloning methods, and by strategies designed to mutate one or more nucleotide residues by substitution, insertion, or deletion, permitting the systematic functional analysis of one or more genes of interest. Strategies designed to manipulate the backbone of the viral vector are much less common, due in part to the large size of the virus. The sequences of the wild-type C6 and E2 variants of the Autographa californica Nuclear Polyhedrosis Virus (AcNPV) are known, and each is over 128 kb in length. Development of the baculovirus shuttle vector (bacmid) system permitted the systematic analysis of the >150 genes in these and other related viruses by allowing mutagenesis of a gene in the bacmid propagated in bacteria, before transfecting insect cells with the modified vector to determine if the gene is essential or non-essential for propagation of the budded or occluded forms of the virus. The budded form, which is required for transmission from cell to cell in the insect or in cultured insect cells, is formed about 24 hpi, whereas the stable occluded form, which can survive in the environment, is produced 48-72 hpi. The occluded form of the virus dissolves in the alkaline environment in the gut of caterpillars that feed on contaminated plant materials, leading to a new cycle of cell-cell infection and eventual release of occluded viral particles.
[0391] Excellent sources of information on various aspects of the molecular biology of baculoviruses are the online chapters in a book published by Rohrmann [2019], particularly sections annotating the functions of all known genes in AcNPV and Bombyx mori NPV (BmNPV), among others. The following table provides a list of those genes and indicates whether they are considered core genes, found in many other related viruses, and whether they are essential or non-essential based on functional studies in transfected insect cells or injected larvae, also noting whether they appear to be clustered in groups of two or more contiguous genes. Genes that are not essential, whether they appear alone or in clusters, may be good targets for mutagenesis, allowing the insertion of gene cassettes located on transfer vectors or donor plasmids, or insertion of bacterial replicons and drug resistance markers used in baculovirus shuttle vector systems.
TABLE-US-00035 TABLE 16 Characteristics of AcNPV genes Non- Clustered Clustered Non- Clustered Gene Gene (Protein) Core Essential Essential? Essential Essential Core Ac1 Ac001 (Protein tyrosine Non- E Clustered Non- E phosphatase (ptp)) Essential Essential Ac2 Ac002 (BRO (Baculovirus Non- E Clustered Non- E repeated orf)) Essential Essential Ac3 Ac003 (Conotoxin like (Ctl)) Non- E Clustered Non- E Essential Essential Ac4 Non- E Clustered Non- E Essential Essential Ac5 Non- E N E Essential *Ac6 Ac006* (Lef2) * Essential N E N Ac7 Non- E Clustered Non- E Essential Essential Ac8 Ac008 (Polyhedrin ) Non- E N E Essential Ac9 Ac009 (Pp78/83; orf1629) Essential Clustered E E Essential Ac10 Ac010 (PK1 Essential N E E (Protein kinase 1)) Ac11 Non- E Clustered Non- E Essential Essential Ac12 Non- E Clustered Non- E Essential Essential Ac13 Non- E N E Essential *Ac14 Ac014* (Lef1) * Essential N E N Ac15 Ac015 (EGT) Non- E Clustered Non- E Essential Essential Ac16 Ac016 (BV/ODV-E26) Non- E N E Essential Ac17 Ac016 (DA26) Essential N E E Ac18 Non- E Clustered Non- E Essential Essential Ac19 Non- E N E Essential Ac20 Ac020/021 (ARIF1 (Actin Essential N E E rearranging factor1)) *Ac22 Ac022* (Pif-2) * Non- E Clustered Non- Clustered Essential Essential Core Ac23 Ac023 (F (fusion protein Non- E N E homolog)) Essential Ac24 Ac024 (PKIP (Protein kinase Essential Clustered E E interacting factor)) Essential Ac25 Ac025 (DBP (DNA binding Essential N E E protein)) Ac26 Non- E Clustered Non- E Essential Essential Ac27 Ac027 (lap-1) Non- E N E Essential Ac28 Ac028 (Lef6) Essential N E E Ac29 Non- E Clustered Non- E Essential Essential Ac30 Non- E Clustered Non- E Essential Essential Ac31 Ac031 (SOD superoxide Non- E Clustered Non- E dismutase) Essential Essential Ac32 Ac032 (FGF (fibroblast Non- E Clustered Non- E growth factor)) Essential Essential Ac33 Ac033 (Histodinol Non- E N E phosphatase) Essential Ac34 Ac033 (PNK polynucleotide Essential N E E kinase) Ac35 Ac035 (Ubiquitin) Non- E 
N E Essential Ac36 Ac036 (39K, pp31) Essential Clustered E E Essential Ac 37 Ac036 (Pp31; 39K) Essential Clustered E E Essential Ac38 Ac037* (Lef11) Essential N E E Ac39 Ac038 (Nudix) Non- E N E Essential *Ac40 Ac039 (P43) * Essential Clustered E N Essential Ac41 Ac041* (Lef12) Essential N E E Ac42 Ac042 (Gta (global Non- E N E transactivator)) Essential Ac43 Essential N E E Ac44 Ac046 (Chondroitinase, odv- Non- E Clustered Non- E e66) Essential Essential Ac45 Ac046 (ODV-E66) Non- E Clustered Non- E Essential Essential Ac46 Ac047 (ETS) Non- E Clustered Non- E Essential Essential Ac47 Ac047 (TRAX-like) Non- E Clustered Non- E Essential Essential Ac48 Ac048 (ETM) Non- E Clustered Non- E Essential Essential Ac49 Ac049 (ETL (PCNA)) Non- E N E Essential *Ac50 Ac049 (PCNA) * Essential Clustered E Clustered Essential Core Ac51 Ac050* (Lef8) Essential Clustered E E Essential Ac52 Ac051 (DnaJ domain Essential Clustered E E protein) Essential *Ac53 Ac051 (J domain) * Essential Clustered E Clustered Essential Core Ac53a Essential Clustered E E Essential *Ac54 Ac054* (Vp1054 ) * Essential N E N Ac55 Non- E Clustered Non- E Essential Essential Ac56 Non- E Clustered Non- E Essential Essential Ac57 Non- E Clustered Non- E Essential Essential Ac58, Ac059 (ChaB homolog) Non- E Clustered Non- E Ac58/59 Essential Essential Ac60 Ac060 (ChaB homolog) Non- E Clustered Non- E Essential Essential Ac61 Ac061 (FP (few polyhedra), Non- E N E fp-25k) Essential *Ac62 Ac062* (Lef9) * Essential N E N Ac63 Ac064 (Fusolin (gp37)) Non- E Clustered Non- E Essential Essential Ac64 Ac064 (GP37) Non- E N E Essential *Ac65 Ac065* (DNA polymerase) * Essential Clustered E N Essential *Ac66 Ac066* (Desmoplakin-like) * Essential N E N Ac67 Ac067 (Lef3) Non- E Clustered Non- E Essential Essential *Ac68 Ac068* (Pif-6) * Non- E N N Essential Ac69 Ac069 (MTase (methyl Essential N E E transferase)) Ac70 Ac070 (Hcf-1 (host cell Non- E Clustered Non- E factor 1)) Essential Essential Ac71 Ac071 (lap-2) Non- E 
Clustered Non- E Essential Essential Ac72 Non- E Clustered Non- E Essential Essential Ac73 Non- E N E Essential Ac74 Essential Clustered E E Essential Ac75 Essential Clustered E E Essential Ac76 Essential Clustered E E Essential *Ac77 Ac077* (VLF-1 very late * Essential Clustered E Clustered factor 1) Essential Core *Ac78 * Essential Clustered E Clustered Essential Core Ac79 Essential Clustered E E Essential *Ac80 Ac080 (GP41) * Essential Clustered E N Essential *Ac81 Ac082 (TLP telokin-like) * Essential N E N Ac82 Ac083* (P95, p91) Non- E N E Essential *Ac83, VP91, Ac083* (Pif-8, vp91, vp94) * Essential N E N PIF-8 Ac84 Ac083* (Vp91, p95) Non- E Clustered Non- E Essential Essential Ac85 Ac086 (PNK/PNL Non- E Clustered Non- E PO lynucleotide Essential Essential kinase/ligase) Ac86 Ac087 (P15) Non- E Clustered Non- E Essential Essential Ac87 Ac088 (Cg30) Non- E N E Essential Ac88 Ac089* (Vp39, capsid) Essential Clustered E E Essential *Ac89 Ac090* (Lef4) * Essential Clustered E N Essential *Ac90 Ac092* (P33 sulfhydryl * Essential N E N oxidase) Ac91 Ac092* (Sulfhydryl oxidase, Non- E N E sox) Essential *Ac92 Ac093 (P18) * Essential Clustered E Clustered Essential Core *Ac93 Ac094* (ODV-E25, p25, 25k) Essential Clustered E Clustered Essential Core *Ac94 Ac095* (Helicase, p143) * Essential Clustered E N Essential *Ac95 Ac095* (P143 (helicase)) * Essential N E N *Ac96 Ac096* (19K (pif-4)) * Non- E Clustered Non- Clustered Essential Essential Core Ac97 Ac096* (Pif-4 (19K)) * Non- E N E Essential *Ac98 Ac098* (38K) * Essential Clustered E Clustered Essential Core *Ac99 Ac099* (Lef5) * Essential Clustered E Clustered Essential Core *Ac100 Ac100* (P6.9) * Essential Clustered E Clustered Essential Core *Ac101 Ac101* (BV/ODV-C42) * Essential Clustered E Clustered Essential Core Ac102 Ac102 (C42) Essential Clustered E E Essential *Ac103 Ac102 (P12) Essential Clustered E N Essential Ac104 Ac102* (P40) Essential N E E Ac105 Ac103* (P45, p48) Non- E N E Essential Ac106/107 Ac104 
(Vp80, vp87) Essential N E E Ac108 Ac105 (He65 ) Non- E N E Essential *Ac109 * Essential N E N *Ac110 Ac110* (Pif-7) * Non- E Clustered Non- Clustered Essential Essential Core Ac111 Non- E Clustered Non- E Essential Essential Ac112/113 Ac112/113 (Apsup) Non- E Clustered Non- E Essential Essential Ac114 Non- E Clustered Non- E Essential Essential *Ac115 Ac115* (Pif-3) * Non- E Clustered Non- Clustered Essential Essential Core Ac116 Non- E Clustered Non- E Essential Essential Ac117 Non- E Clustered Non- E Essential Essential Ac118 Non- E Clustered Non- E Essential Essential *Ac119 Ac119* (Pif-1) * Non- E Clustered Non- Clustered Essential Essential Core Ac120 Ac123 (PK2 Non- E Clustered Non- E (Protein kinase 2)) Essential Essential Ac121 Ac125 (Lef7) Non- E Clustered Non- E Essential Essential Ac122 Ac126 (Chitinase) Non- E Clustered Non- E Essential Essential Ac123 Ac127 (Cathepsin) Non- E Clustered Non- E Essential Essential Ac124 Ac128 (GP64) Non- E N E Essential Ac125 Ac129 (P24) Essential N E E Ac126 Ac130 (GP16) Non- E Clustered Non- E Essential Essential Ac127 Ac131 (Calyx, polyhedron Non- E N E envelope) Essential Ac128 Ac131 (PEP polyhedron Essential N E E envelope protein) Ac129 Ac131 (Pp34, polyhedron Non- E Clustered Non- E envelope) Essential Essential Ac130 Non- E N E Essential Ac132 Essential Clustered E E Essential *Ac133 Ac133* (Alkaline nuclease) * Essential N E N Ac134 Ac134 (P94 ) Non- E N E Essential Ac135 Ac135 (P35) Essential N E E Ac136 Ac136 (P26) Non- E Clustered Non- E Essential Essential Ac137 Ac137 (P10) Non- E Clustered Non- E Essential Essential *Ac138 Ac138 (P74, Pif-O) * Non- E N N Essential Ac 139 Ac138* (Pif-0, p74) Essential N E E Ac140 Ac139 (Me53) Non- E N E Essential Ac141 Ac141 (Exon-O) Essential Clustered E E Essential *Ac142 Ac142* (49K) * Essential Clustered E Clustered Essential Core *Ac143 Ac142* (P49) * Essential Clustered E N Essential *Ac144 Ac143* (ODV-E18) * Essential N E N Ac145 Ac144 (ODV-EC27) Non- E N E Essential 
Ac146 Ac145 (P11) Essential Clustered E E Essential Ac147 Ac147 (le1 ) Essential Non- N E E Ac147-0 Ac147-0 (le0) Essential E Clustered Non- E Essential *Ac148 Ac148* (ODV-E56, Pif-5) * Non- E Clustered Non- Clustered Essential Essential Core Ac149 Ac148* (Pif-5, ody-e56) Non- E Clustered Non- E Essential Essential Ac150 Non- E N E Essential Ac151 Ac151 (le2) Essential N E E Ac152 Ac153 (Pe38) Non- E N E Essential Ac153 Ac53a (Lef10) Essential N E E Ac154 Non- E Clustered Non- E Essential Essential
[0392] Over 347 nucleotide sequences have been deposited in GenBank providing the complete genomes of a wide variety of insect viruses, including baculoviruses and granulosis viruses, among others. Similar tables can be prepared for each virus by comparing the homology of each gene against annotated sets of genes for other related viruses. The viruses of most interest to researchers involved in the development of novel expression vector systems are AcNPV and BmNPV.
TABLE-US-00036 TABLE 17 Relevant AcNPV and BmNPV sequences
Name | Size | Acc. No. | GI No.
Autographa californica multiple nucleopolyhedrovirus isolate WP10, complete genome | 133,926 bp | KM609482.1 | GI: 851968049
Autographa californica nucleopolyhedrovirus clone C6, complete genome | 133,894 bp | L22858.1 | GI: 510708
Autographa californica nucleopolyhedrovirus strain E2, complete genome | 133,966 bp | KM667940.1 | GI: 700275637
Autographa californica nucleopolyhedrovirus, complete genome | 133,894 bp | NC_001623.1 | GI: 9627742
Bombyx mori NPV strain Cubic, complete genome | 127,465 bp | JQ991009.1 | GI: 393659939
Bombyx mori NPV strain Guangxi, complete genome | 126,843 bp | JQ991011.1 | GI: 393717332
Bombyx mori NPV strain India, complete genome | 126,879 bp | JQ991010.1 | GI: 393717193
Bombyx mori NPV strain Zhejiang, complete genome | 126,125 bp | JQ991008.1 | GI: 393717051
Bombyx mori NPV, complete genome | 128,413 bp | NC_001962.1 | GI: 9630816
Bombyx mori nuclear polyhedrosis virus isolate T3, complete genome | 128,413 bp | L33180.1 | GI: 3745835
Bombyx mori nucleopolyhedrovirus DNA, complete genome, isolate: H4 | 127,459 bp | LC150780.1 | GI: 1227954165
Bombyx mori nucleopolyhedrovirus isolate C1, complete genome | 127,901 bp | KF306215.1 | GI: 548577843
Bombyx mori nucleopolyhedrovirus isolate C2, complete genome | 126,406 bp | KF306216.1 | GI: 548578068
Bombyx mori nucleopolyhedrovirus isolate C6, complete genome | 125,437 bp | KF306217.1 | GI: 548578211
Bombyx mori nucleopolyhedrovirus strain Brazilian, complete genome | 126,861 bp | KJ186100.1 | GI: 695132325
Mutant Autographa californica nucleopolyhedrovirus isolate vAcRev-1, complete genome | 118,582 bp | KU697902.1 | GI: 1040495973
Mutant Autographa californica nucleopolyhedrovirus isolate vAcRev-2, complete genome | 138,991 bp | KU697903.1 | GI: 1040496108
[0393] Analysis of the nucleotide sequences of the C6 and E2 variants of AcNPV, and the bacmid bMON14272, derived from AcNPV-E2 revealed the frequency of cuts by restriction enzymes available from commercial sources. The following table summarizes these results.
TABLE-US-00037 TABLE 18 Frequency of cuts by non-redundant restriction enzymes in AcNPV-E2 and bMON14272
Cuts | AcNPV-E2 | bMON14272
0 | Bsu36I, SrfI, Sse8387I, I-CeuI, PI-SceI, I-PpoI, I-SceI, MauBI, PI-PspI | Bsu36I, I-CeuI, PI-SceI, I-PpoI, I-SceI, MauBI, PI-PspI
1 | AvrII, AbsI, FseI | AvrII, SrfI, FseI
2 | SfiI, AscI | AbsI, Sse8387I, SfiI, AscI
3 | SexAI, EcoNI, SgrDI, SgfI, KflI | SgrDI, KflI
4 | SmaI/XmaI, PasI, MreI, NotI | SexAI, MreI, SgfI
5 | AarI, AflII | AarI, PasI, EcoNI
13 | PacI | PacI
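A tally of the kind summarized in Table 18 can be approximated by scanning a genome sequence for each recognition site. The following is a minimal sketch, assuming the sequence is available as a plain string of A, C, G, and T residues; the function names and the toy sequences are hypothetical, and the sketch counts recognition sites only (it does not model methylation sensitivity or star activity).

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "V": "[ACG]", "B": "[CGT]",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "D": "[AGT]", "H": "[ACT]",
}
COMPLEMENT = str.maketrans("ACGTNVBRYSWKMDH", "TGCANBVYRSWMKHD")


def reverse_complement(site):
    return site.upper().translate(COMPLEMENT)[::-1]


def site_regex(site):
    # A lookahead is used so that overlapping occurrences are all found.
    return re.compile("(?=" + "".join(IUPAC[b] for b in site.upper()) + ")")


def count_sites(genome, site, circular=True):
    """Count occurrences of a recognition site on either strand."""
    genome = genome.upper()
    # For a circular molecule, extend the sequence so that sites spanning
    # the arbitrary start/end junction are also detected.
    search = genome + genome[: len(site) - 1] if circular else genome
    positions = {m.start() for m in site_regex(site).finditer(search)}
    rc = reverse_complement(site)
    if rc != site.upper():  # non-palindromic sites must be scanned twice
        positions |= {m.start() for m in site_regex(rc).finditer(search)}
    return len(positions)


# Toy sequences, not viral sequences.
toy = "GCGGCCGCAAAACCTAAGGTTTT"
not_i = count_sites(toy, "GCGGCCGC", circular=False)
eag_i = count_sites(toy, "CGGCCG", circular=False)
bsu36_i = count_sites(toy, "CCTNAGG", circular=False)
mau_b_i = count_sites(toy, "CGCGCGCG", circular=False)
wrapped = count_sites("GCCGCAAAAGCG", "GCGGCCGC", circular=True)
```

Treating the molecule as circular matters for viral genomes and bacmids, since a site can span the position at which a deposited sequence happens to begin.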
[0394] It is desirable to create variants of AcNPV-E2 and BmNPV, and shuttle vectors derived from them, in which one or more of the restriction sites that are cut 1-3 times, plus the NotI sites, which are cut 4 times in AcNPV, are removed by site-directed mutagenesis. These sites include AvrII, AbsI, FseI, SrfI, SdaI, SfiI, AscI, SgrDI, KflI, SexAI, SgfI, and NotI, with the AvrII, SrfI, FseI, AbsI, and AscI sites removed initially. Some of these enzymes produce compatible cohesive ends that can be used to assemble other DNA cassettes; when the ends of two fragments are ligated together, the resulting junction is not cleaved by either enzyme, similar to the BioBricks and related gene assembly schemes noted in the Background of the Invention.
[0395] Synthetic linkers can be prepared that comprise one or more recognition sequences for Bsu36I, SrfI, Sse8387I, and MauBI, which do not cut AcNPV, plus AvrII, AbsI, FseI, SrfI, SfiI, AscI, SgrDI, KflI, SexAI, SgfI, and NotI, which cut 1-4 times, or fewer times in a variant lacking one or more of these sites. Such linkers facilitate the design of modular genetic elements that can be assembled into functional baculovirus shuttle vectors. PacI, which has an AT-rich recognition sequence, cuts 13 times each in AcNPV and bMON14272, in the backbone of the virus, but not within the contiguous mini-F-Kan-mini-attTn7 sequences of the bMON14272 shuttle vector.
TABLE-US-00038 TABLE 19 Recognition sites of restriction enzymes useful in the design of modular vectors
Site | Name | Overhang | Compatible Enzymes
CC↓TNA↑GG | Bsu36I | 5′ TNA | Compatible with BlpI (GC′TNA,GC), which is symmetric, Bpu10I (CC′TNA,GC), which is asymmetric, and DdeI (C′TNA,G)
TAACTATAACGGTC↑CTAA↓GGTAGCGAA | I-CeuI | 3′ CTAA | Not compatible with anything else
TAGGG↑ATAA↓CAGGGTAAT | I-SceI | 3′ ATAA | Not compatible with anything else
TGGCAAACAGCTA↑TTA↓TGGGTATTATGGGT | PI-PspI | 3′ TTAT | Not compatible with anything else
CG↓CGCG↑CG | MauBI | 5′ CGCG | Compatible with AscI (GG′CGCG,CC), BssHII (G′CGCG,C), MluI (A′CGCG,T)
TAACTATGACTCTC↑TTAA↓GGTAGCCAAAT | I-PpoI | 3′ TTAA | Not compatible with anything else
ATCTATGTCGG↑GTGC↓GGAGAAAGAGGTAATGAAATGG | PI-SceI | 3′ GTGC | Not compatible with anything else
CC↑TGCA↓GG | SbfI | 3′ TGCA | Compatible with NsiI (A,TGCA′T), PstI (C,TGCA′G)
GCCC↑↓GGGC | SrfI | Blunt | BLUNT ENDS
CC↑TGCA↓GG | Sse8387I | 3′ TGCA |
C↓CTAG↑G | AvrII | 5′ CTAG | Compatible with NheI (G′CTAG,C), SpeI (A′CTAG,T), and XbaI (T′CTAG,A)
CC↓TCGA↑GG | AbsI | 5′ TCGA | Compatible with AbsI (CC′TCGA,GG), PaeR7I (C′TCGA,G), PspXI (VC′TCGA,GB), SalI (G′TCGA,C), SgrDI (CG′TCGA,CG), XhoI (C′TCGA,G)
GG↑CCGG↓CC | FseI | 3′ CCGG | Not compatible with anything else
GG↓CGCG↑CC | AscI | 5′ CGCG | Compatible with BssHII (G′CGCG,C), MauBI (CG′CGCG,CG), MluI (A′CGCG,T)
GGCCN↑NNN↓NGGCC | SfiI | 3′ NNN | Compatible with many enzymes, including BglI
CG↓TCGA↑CG | SgrDI | 5′ TCGA | Compatible with AbsI (CC′TCGA,GG), PaeR7I (C′TCGA,G), PspXI (VC′TCGA,GB), SalI (G′TCGA,C), SgrDI (CG′TCGA,CG), XhoI (C′TCGA,G)
GCG↑AT↓CGC | SgfI | 3′ AT | Compatible with AsiSI (GCG,AT′CGC), PacI (TTA,AT′TAA), PvuI (CG,AT′CG)
GC↓GGCC↑GC | NotI | 5′ GGCC | Compatible with EagI (C′GGCC,G)
TTA↑AT↓TAA | PacI | | Compatible with AsiSI (GCG,AT′CGC), PvuI (CG,AT′CG)
[0396] Pairs of linkers containing recognition sites for rare-cutting restriction enzymes, typically with sequences that are 8 or more nucleotides in length, can be used to flank genetic elements in cassettes, such that, after digestion and annealing, two sets of genetic elements flanked by similar pairs are assembled into one contiguous fragment, similar to the BioBrick system noted earlier. In this scheme, pairs such as NotI/EagI, AbsI/SgrDI, and MauBI/AscI can be used to assemble larger DNA cassettes, since they are unlikely to have recognition sequences in the middle of the genetic elements being assembled for insertion into cloning or expression vectors designed for particular applications.
[0397] Linkers comprising recognition sites suitable for assembly of modular baculovirus vectors are called “BaculoBricks”, as noted in the Terms and Definitions section of this application. These and similar linkers comprising recognition sites for rare-cutting restriction enzymes can also be used in creating modular mammalian shuttle vectors, plant shuttle vectors, fungal shuttle vectors, and many plasmids from other large enteric or non-enteric bacterial plasmid systems, which may have applications in many fields of synthetic biology.
[0398] Modular baculovirus shuttle vectors need to contain a bacterial replicon, preferably one that is stable and propagates at a low copy number, like the mini-F replicon used in bMON14272. They also need a drug resistance marker to facilitate selection of bacteria harboring the shuttle vector. In bMON14272, this was a gene conferring resistance to kanamycin, but other selectable markers may be used, such as those conferring resistance to ampicillin, tetracycline, chloramphenicol, or gentamycin, among many others, or metabolic markers, such as one carrying a gene that can complement, in trans, a gene that is mutated in the host cell. Shuttle vectors may optionally comprise one or more target sites for site-specific transposons, such as a mini-attTn7 element linked to a lacZalpha gene, or other selectable or screenable markers noted in other examples of the application.
[0399] The key genetic elements added to a shuttle vector are independent, and need not be contiguous to each other, as they are in bMON14272. The replicon, drug resistance marker, and the optional target site can be in distinct locations within the viral genome, and in opposite orientations with respect to each other, as long as the resulting virus is stably propagated in bacteria, and in cultured eukaryotic host cells.
[0400] It may be desirable to randomly mutagenize a viral backbone to identify locations that tolerate insertions of different DNA cassettes, such as a synthetic mini-attTn7, some of which may be as stable as, or more stable than, other locations. Tn5-based mutagenesis systems are now available from Lucigen that facilitate the random transposition of DNA segments flanked by synthetic left and right arms of Tn5 into target DNA samples in vitro, in the presence of purified transposition proteins, or in vivo in a cell harboring a vector comprising the target sequence and a helper plasmid providing transposition proteins in trans. A viral shuttle vector comprising a replicon and a drug resistance marker can be subjected to mutagenesis with a mini-Tn5 element comprising one or more mini-attTn7 target sites. This approach allows the identification of locations within the viral backbone that may be more suited for stable, long-term use than those traditionally used for construction of recombinant viruses, or those identified by methods directed to sites within one or several clustered non-essential genes, as noted above.
[0401] These general approaches can also be applied to a wide variety of shuttle vectors that propagate only in bacteria, or in bacteria and in other types of eukaryotic cells. Viral and non-viral mammalian vectors, plant cell-based vectors, and fungal vectors, for example, can all be redesigned and used as modular targets for the insertion of DNA cassettes carried on site-specific transposons that are similar to those described in this application. The ability to directly select for insertions into a target site, coupled with other novel screening methods, dramatically increases the utility of systems designed to study the structure and function of a wide variety of genes, and facilitates the development of vectors that are capable of expressing heterologous proteins at high levels suitable for use in a variety of commercial applications.
Example 10—Design of Synthetic Linkers Comprising Recognition Sequences for Restriction Enzymes that Cut Infrequently to Facilitate Cloning of One or More Segments of Genetic Elements into Large Plasmids and Shuttle Vectors for Use in Prokaryotic or Eukaryotic Cells
[0402] As noted above, pairs of synthetic linkers containing recognition sites for restriction enzymes that cut infrequently in large plasmids that generally propagate only in bacteria, or in shuttle vectors that can propagate in at least two types of host cells, typically with sequences that are 8 or more nucleotides in length, can be used to flank genetic elements in cassettes, such that, after digestion and annealing, two sets of genetic elements flanked by similar pairs are assembled into one contiguous fragment, similar to the BioBrick system noted earlier.
[0403] In many of the BioBrick standard assembly schemes, the linkers comprise recognition sites for restriction enzymes that are only 6 nucleotides in length, with one set using a prefix linker comprising sites for EcoRI and XbaI separated by a site for NotI, and a suffix linker comprising sites for SpeI and PstI, also separated by a NotI site. For example, a vector comprising a first sequence of interest is digested with EcoRI and SpeI, and a second vector comprising a second sequence of interest, a replicon, and a selectable marker is digested with EcoRI and XbaI. Samples from both digests are mixed and ligated together to form a larger vector comprising two sequences of interest with a “scar” site, formed by the ligation of the compatible XbaI and SpeI sticky ends, that is not recognized by either enzyme. The two contiguous sequences of interest in the larger product vector can be released by digestion with EcoRI and SpeI, or retained in a vector digested with EcoRI and XbaI that is used in subsequent reactions to assemble vectors comprising three or more contiguous sequences of interest, separated by scar sequences. Another standard uses linkers comprising recognition sites for EcoRI, BglII, BamHI, and XhoI, where BglII and BamHI generate compatible sticky ends, while another standard uses linkers that contain recognition sites for AgeI and NgoMIV.
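The formation of a scar that neither XbaI nor SpeI can recleave may be illustrated with a minimal top-strand string model. The prefix and suffix below are simplified, hypothetical linkers containing only the recognition sites named in the preceding paragraph (they are not the exact published BioBrick prefix and suffix), and the four-residue inserts are placeholders for sequences of interest.

```python
ECORI, NOTI, XBAI, SPEI, PSTI = "GAATTC", "GCGGCCGC", "TCTAGA", "ACTAGT", "CTGCAG"

# Simplified, hypothetical prefix and suffix linkers.
PREFIX = ECORI + NOTI + XBAI
SUFFIX = SPEI + NOTI + PSTI


def cut(seq, site, offset):
    """Split the top strand at the first occurrence of `site`.
    `offset` is the top-strand cut position within the site
    (1 for both T'CTAGA and A'CTAGT)."""
    i = seq.index(site) + offset
    return seq[:i], seq[i:]


def assemble(upstream_part, downstream_part):
    """Join an SpeI-cut upstream part to an XbaI-cut downstream part.
    Both enzymes leave a 5' CTAG overhang, so the ends anneal."""
    left, _ = cut(upstream_part, SPEI, 1)     # ...insert + 'A'
    _, right = cut(downstream_part, XBAI, 1)  # 'CTAGA' + insert...
    return left + right


part_a = PREFIX + "AAAA" + SUFFIX  # placeholder first sequence of interest
part_b = PREFIX + "CCCC" + SUFFIX  # placeholder second sequence of interest

composite = assemble(part_a, part_b)
scar = "ACTAGA"  # SpeI left half joined to XbaI right half
```

The composite retains exactly one XbaI site in its prefix and one SpeI site in its suffix, so it can serve as the input to a further round of assembly, which is the property that makes the scheme iterative.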
[0404] The biggest limitation of many of these assembly schemes is that the DNA segment to be flanked by these types of linkers must not contain a recognition site used in the prefix or suffix linkers. If it does, the site needs to be removed by mutagenesis, perhaps involving careful design to introduce mutations that do not affect the reading frame of a nucleotide sequence encoding a polypeptide, or by altering nucleotide residues in codons within the recognition site that do not alter the sequence of the encoded polypeptide, or by replacing codons with those encoding amino acids that are similar to those in the parental sequence, or are generally conserved, when a variety of related residues are compared in a multiple sequence alignment.
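One of the options described above, altering residues within the recognition site without changing the encoded polypeptide, can be sketched as a search over synonymous codons. This is an illustrative sketch only: it assumes an in-frame coding sequence, the standard genetic code, and a palindromic site (so only the top strand is scanned), and it ignores codon usage, which would normally also be considered. The function names and the test sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def remove_site_silently(cds, site):
    """Remove every occurrence of `site` from an in-frame coding sequence
    using single synonymous codon substitutions. Returns the edited
    sequence, or None if some occurrence cannot be removed this way."""
    cds = cds.upper()
    protein = translate(cds)
    while site in cds:
        start = cds.index(site)
        first, last = start // 3, (start + len(site) - 1) // 3
        fixed = None
        for ci in range(first, last + 1):        # codons overlapping the site
            codon = cds[3 * ci:3 * ci + 3]
            if len(codon) < 3:
                continue
            for alt, aa in CODON_TABLE.items():  # try each synonymous codon
                if aa != CODON_TABLE[codon] or alt == codon:
                    continue
                candidate = cds[:3 * ci] + alt + cds[3 * ci + 3:]
                if candidate.count(site) < cds.count(site):
                    fixed = candidate
                    break
            if fixed:
                break
        if fixed is None:
            return None
        cds = fixed
    assert translate(cds) == protein  # the polypeptide must be unchanged
    return cds


# Hypothetical coding sequence Met-Glu-Phe-Lys containing an EcoRI site.
edited = remove_site_silently("ATGGAATTCAAA", "GAATTC")
```

When no synonymous change removes the site (for example, a site spanning only Met and Trp codons), the function returns None, corresponding to the case in which a conservative amino acid replacement would have to be considered instead.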
[0405] For applications that require assembly of larger segments of DNA, such as those derived from large plasmids, or shuttle vectors comprising stable low copy number replicons, such as mini-F, or large operons comprising linked sets of genes operably-linked to one or more promoters, it is desirable to use synthetic linkers that comprise sequences for restriction enzymes that do not cut, or very rarely cut in the sequences of interest that will be flanked at their 5′ and 3′ ends by prefix and suffix linkers, respectively.
[0406] The frequency with which a Class II restriction enzyme cuts is a function of the length of the sequence it recognizes. With 4 possible bases at each position, an enzyme with a 4-bp recognition sequence will theoretically cut once in every 4.sup.4 (256) bp of random sequence. An enzyme with a 6-bp recognition sequence will theoretically cut once in every 4.sup.6 (4,096) bp, and an enzyme with an 8-bp recognition sequence will theoretically cut once in every 4.sup.8 (65,536) bp. GC content affects these frequencies, increasing the probability that enzymes that have GC-rich recognition sites will cut more often in large segments of DNA that are more GC-rich than average, compared to the probability that enzymes that have AT-rich recognition sequences will cut in the same large segment of DNA.
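The theoretical frequencies above, and the effect of GC content, can be reproduced with a short calculation. The sketch below assumes that bases occur independently, with G and C equally frequent and A and T equally frequent; the function name and the 40% GC value used in the comparison are illustrative assumptions, not measured properties of any vector described here.

```python
def expected_spacing(site, gc=0.5):
    """Expected average distance, in bp, between occurrences of an
    unambiguous recognition site in random DNA with GC fraction `gc`."""
    p = {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}
    probability = 1.0
    for base in site.upper():
        probability *= p[base]
    return 1.0 / probability


# Equal base frequencies reproduce the 4^n values.
four_cutter = expected_spacing("GATC")        # 4-bp site
six_cutter = expected_spacing("GAATTC")       # 6-bp site (EcoRI)
eight_cutter = expected_spacing("GCGGCCGC")   # 8-bp site (NotI)

# In AT-rich DNA (40% GC assumed for illustration), the GC-rich NotI site
# is expected to be much rarer than the AT-rich PacI site of equal length.
not_i_at_rich = expected_spacing("GCGGCCGC", gc=0.4)
pac_i_at_rich = expected_spacing("TTAATTAA", gc=0.4)
```

Observed counts in a real genome can differ substantially from these expectations, since natural sequences are not random; the counts reported in the tables of this application were obtained from the actual sequences.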
[0407] While a variety of Class II restriction enzymes have been characterized that have recognition sites that are 8 or more bp in length, they are much less commonly available from commercial sources than enzymes that have recognition sites that are 4, 5, 6, or 7 bp in length. Of these, many fewer can be assigned to sets where one or more enzymes generate sticky 5′ or 3′ ends suitable for use in ligation experiments where a scar is formed by the annealing and ligation of two compatible sticky ends.
[0408] To facilitate the modular assembly of large plasmids that propagate only in prokaryotes, or shuttle vectors that can propagate in two types of host cells, one typically in bacteria, such as laboratory strains of E. coli, an enteric bacterium, and the other in non-enteric bacteria or eukaryotic cells, such as insect, mammalian, and fungal cells, it is appropriate to determine the relative frequency of cleavage sites for a variety of Class II restriction enzymes. The relative frequency (from 0 to 5) of cuts by non-redundant restriction enzymes in the E2 strain of AcNPV and in the shuttle vector designated bMON14272 is provided in Table 18 above. The recognition sites of a variety of restriction enzymes that are potentially useful in the design of modular vectors are provided in Table 19 above. After eliminating enzymes that produce blunt ends, those that produce sticky ends that are not compatible with any other enzyme, and those that produce sticky ends with one or more ambiguous nucleotides (e.g., Bsu36I), very few enzymes remain that can be considered for use in prefix or suffix linkers: those whose recognition sites rarely occur within the plasmid or shuttle vector of interest, such as AvrII (C′CTAG,G), which cuts AcNPV and bMON14272 only once, or those that have recognition sites that are 8 or more bp in length.
[0409] Linkers comprising recognition sites for specific pairs of enzymes such as NotI/EagI, AbsI/SgrDI, and MauBI/AscI can be used to design and assemble larger DNA cassettes, since they are unlikely to have recognition sequences in the middle of the genetic elements being assembled for insertion into cloning or expression vectors designed for particular applications. While these may be the most appropriate pairs of enzymes for use in the assembly of modular baculovirus vectors, they are not necessarily limited to these types of vectors, but may also be used to facilitate the design and assembly of large modular mammalian, plant, and fungal shuttle vectors, as well as other large plasmids and shuttle vectors that propagate in one or more types of prokaryotic cells.
Sequence Alignment 29: Synthetic Pairs of Linkers Comprising Recognition Sites for NotI, EagI, and PspOMI
[0410] NotI (GC′GGCC,GC) has a 5′ overhang of GGCC, which is compatible with PspOMI (G′GGCC,C) and EagI (C′GGCC,G). The recognition site for EagI is an internal subset of NotI. NotI cuts AcNPV four (4) times, and bMON14272 six (6) times. PspOMI cuts AcNPV seven (7) times, and bMON14272 nine (9) times. EagI cuts AcNPV forty (40) times, and bMON14272 forty-two (42) times.
[0411] Synthetic DNA sequences comprising recognition sites for NotI and PspOMI are shown below, separated by a series of unspecified nucleotides, specified here as a series of 8 “n” residues, which may comprise recognition sites for other restriction enzymes. The number of unspecified or ambiguous residues can vary, to be larger or smaller than 8 residues, depending on the desired application. In the first example below, ligation of a linker digested to expose a PspOMI site at its 3′ end with a linker digested to expose a NotI site at its 5′ end produces a fragment with an internal scar that is not digestible by either enzyme. In the second example below, ligation of a linker digested to expose a NotI site at its 3′ end with a linker digested to expose a PspOMI site at its 5′ end produces a fragment with an internal scar that is not digestible by either enzyme.
##STR00033##
TABLE-US-00039 TABLE 20 Frequency of cuts by restriction enzymes used in synthetic linkers in AcNPV-E2 and bMON14272
Enzyme | Site | AcNPV-E2 | bMON14272 | Comments
NotI | GC′GGCC,GC | 4 | 6 | All NotI sites contain internal EagI sites
EagI | C′GGCC,G | 40 | 42 | EagI produces sticky ends that are compatible with NotI and PspOMI sites
PspOMI | G′GGCC,C | 7 | 9 | PspOMI produces sticky ends that are compatible with NotI and EagI sites
AbsI | CC′TCGA,GG | 1 | 2 | One AbsI/PaeR7I/XhoI site in AcNPV is near the 5′ end of the Ac-sod gene at position 25,926, and the AbsI site in the bacmid is right after the SalI site in the mini-attTn7 segment
SgrDI | CG′TCGA,CG | 3 | 3 | SgrDI/SalI sites are in the Ac-ORF1629 gene at position 6,698, the non-essential AcORF-18 gene at 14,944, and Ac-Orf54 gene at 45,700.
XhoI | C′TCGA,G | 14 | 17 | XhoI sites are compatible with AbsI, SgrDI, and SalI sites
PspXI | VC′TCGA,GB | 8 | 11 | Some PspXI sites are AbsI sites and both contain internal XhoI sites
SalI | G′TCGA,C | 54 | 55 | One SalI site is at the 3′ end of the mini-attTn7 segment in the middle of the lacZalpha gene in the bacmid
MauBI | CG′CGCG,CG | 0 | 0 | Does not cut AcNPV or the bacmid. MauBI sites contain internal BssHII sites
AscI | GG′CGCG,CC | 2 | 2 | Cuts twice in AcNPV, once in Ac-arif-1 gene at position 16,573, plus Ac-pkip-1 gene at 20,948
BssHII | G′CGCG,C | 34 | 38 | All AscI and MauBI sites contain internal BssHII sites.
MluI | A′CGCG,T | 80 | 80 | Does not cut in Kan-lacZalpha-mini-attTn7-mini-F replicon region in the bacmid, but cuts in the flanking Ac-ORF603 and Ac-ORF-12 genes in the AcNPV and the bacmid
FseI | GG,CCGG′CC | 1 | 1 | Cuts once near 5′ end of Ac-gta gene at position 34,285 in AcNPV
PacI | TTA↑AT↓TAA | 13 | 13 | PacI cuts 13 times each in the viral backbone of AcNPV and bMON14272, but not within the contiguous mini-F-Kan-mini-attTn7 sequences of bMON14272.
Sequence Alignment 30: Synthetic Pairs of Linkers Comprising Recognition Sites for AbsI and SgrDI
[0412] AbsI (CC′TCGA,GG) has a 5′ overhang of TCGA, which is compatible with SgrDI (CG′TCGA,CG), and the 6-base cutters, PaeR7I (C′TCGA,G), PspXI (VC′TCGA,GB [where V=A or C or G, and B=C or G or T]), SalI (G′TCGA,C), and XhoI (C′TCGA,G). AbsI cuts AcNPV one (1) time, and bMON14272 two (2) times. SgrDI cuts AcNPV three (3) times, and bMON14272 three (3) times.
[0413] Synthetic DNA sequences comprising recognition sites for AbsI and SgrDI are shown below, separated by a series of unspecified nucleotides, specified here as a series of 8 “n” residues, which may comprise recognition sites for other restriction enzymes. The number of unspecified or ambiguous residues can vary, to be larger or smaller than 8 residues, depending on the desired application. In the first example below, ligation of a linker digested to expose an AbsI site at its 3′ end with a linker digested to expose an SgrDI site at its 5′ end produces a fragment with an internal scar that is not digestible by either enzyme. In the second example below, ligation of a linker digested to expose an SgrDI site at its 3′ end with a linker digested to expose an AbsI site at its 5′ end produces a fragment with an internal scar that is not digestible by either enzyme.
[0414] The restriction enzyme XhoI (C′TCGA,G) recognizes the center 6 bp of the AbsI site (CC′TCGA,GG) and SalI (G′TCGA,C) recognizes the center 6 bp of the SgrDI (CG′TCGA,CG) site. The hybrid scar site is also not recognized or digestible by XhoI or SalI.
##STR00034##
[0415] MauBI (CG′CGCG,CG) has a 5′ overhang of CGCG, which is compatible with AscI (GG′CGCG,CC), and the 6-base cutters BssHII (G′CGCG,C) and MluI (A′CGCG,T). MauBI cuts AcNPV zero (0) times, and bMON14272 zero (0) times. AscI cuts AcNPV two (2) times, and bMON14272 two (2) times.
[0416] Synthetic DNA sequences comprising recognition sites for MauBI and AscI are shown below, separated by a series of unspecified nucleotides, specified here as a series of 8 “n” residues, which may comprise recognition sites for other restriction enzymes. The number of unspecified or ambiguous residues can vary, to be larger or smaller than 8 residues, depending on the desired application. In the first example below, ligation of a linker digested to expose an AscI site at its 3′ end with a linker digested to expose a MauBI site at its 5′ end produces a fragment with an internal scar that is not digestible by either enzyme. In the second example below, ligation of a linker digested to expose a MauBI site at its 3′ end with a linker digested to expose an AscI site at its 5′ end produces a fragment with an internal scar that is not digestible by either enzyme.
[0417] The restriction enzyme BssHII (G′CGCG,C), which recognizes the center 6 bp of both the MauBI and AscI sites, can cut at either site and also at the hybrid scar site, which is not recognized or digestible by MauBI or AscI.
##STR00035##
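The scar behavior described for the three enzyme pairs can be checked with a short sketch. It assumes the recognition sequences quoted in the text (with the standard MluI site ACGCGT), treats the cut offsets as symmetric 5′ overhangs, and inspects only the junction itself, not the unspecified flanking "n" residues. The function names are illustrative and not part of the disclosure.

```python
# Recognition sequence and top-strand cut offset for each enzyme.
# All sites listed here are palindromic, so scanning one strand is sufficient.
SITES = {
    "NotI": ("GCGGCCGC", 2), "PspOMI": ("GGGCCC", 1), "EagI": ("CGGCCG", 1),
    "AbsI": ("CCTCGAGG", 2), "SgrDI": ("CGTCGACG", 2),
    "XhoI": ("CTCGAG", 1), "SalI": ("GTCGAC", 1),
    "MauBI": ("CGCGCGCG", 2), "AscI": ("GGCGCGCC", 2),
    "BssHII": ("GCGCGC", 1), "MluI": ("ACGCGT", 1),
}


def overhang(enzyme):
    """Return the 5' overhang left by a symmetric cut."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def scar(left, right):
    """Junction formed by ligating an end cut by `left` to an end cut by `right`."""
    if overhang(left) != overhang(right):
        raise ValueError(f"{left} and {right} do not leave compatible ends")
    left_site, left_cut = SITES[left]
    right_site, right_cut = SITES[right]
    return left_site[:left_cut] + overhang(left) + right_site[len(right_site) - right_cut:]


def cutters(seq):
    """Names of all listed enzymes whose recognition site occurs in seq."""
    return sorted(name for name, (site, _) in SITES.items() if site in seq)


for pair in [("PspOMI", "NotI"), ("NotI", "PspOMI"), ("AbsI", "SgrDI"),
             ("SgrDI", "AbsI"), ("AscI", "MauBI"), ("MauBI", "AscI")]:
    junction = scar(*pair)
    print(pair, junction, cutters(junction))
```

Under these assumptions, the NotI/PspOMI and AbsI/SgrDI junctions are cut by none of the listed enzymes, while both AscI/MauBI junctions retain an internal BssHII site, in line with paragraphs [0411], [0414], and [0417].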
[0418] In view of the hybrid scar sites produced by ligating the sticky ends on DNA fragments digested with restriction enzymes that have recognition sites that are typically 8 bp in length, as illustrated in Sequence Alignments 28-30, a variety of prefix and suffix linkers can be considered for general use in the design and assembly of genetic elements for use in modular vector systems. The following table outlines 8 combinations of recognition sites for compatible restriction enzymes that can be used in pairs on synthetic prefix and suffix linkers that flank a DNA fragment of interest. In each pair, the recognition site for the second enzyme listed in the prefix is compatible with the first enzyme listed in the suffix.
[0419] The recognition site for each enzyme in a prefix or suffix illustrated below is separated by a series of unspecified nucleotides, specified here as a series of 8 “n” residues, which may comprise recognition sites for other restriction enzymes. The number of unspecified or ambiguous residues can vary, to be larger or smaller than 8 residues, depending on the desired application.
TABLE-US-00040 TABLE 21 Pairs of recognition sites for restriction enzymes useful in the design of synthetic linkers suitable for use in the assembly of modular vectors
Prefix | SEQ ID NO | Suffix | SEQ ID NO
MauBI-AbsI | 129 | SgrDI-AscI | 136
MauBI-SgrDI | 130 | AbsI-AscI | 134
AscI-AbsI | 131 | SgrDI-MauBI | 135
AscI-SgrDI | 132 | AbsI-MauBI | 133
AbsI-MauBI | 133 | AscI-SgrDI | 132
AbsI-AscI | 134 | MauBI-SgrDI | 130
SgrDI-MauBI | 135 | AscI-AbsI | 131
SgrDI-AscI | 136 | MauBI-AbsI | 129
##STR00036##
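The pairing rule behind Table 21 can be sketched as a small enumeration. The sketch assumes only the two compatibility groups named in the text (MauBI/AscI sharing the CGCG overhang, AbsI/SgrDI sharing the TCGA overhang) and that each suffix mirrors its prefix with the compatible partner of each enzyme. The names `PARTNER`, `OVERHANG`, and `linker_pairs` are illustrative.

```python
# Compatible partner and shared 5' overhang for each 8-bp cutter named in the text.
PARTNER = {"MauBI": "AscI", "AscI": "MauBI", "AbsI": "SgrDI", "SgrDI": "AbsI"}
OVERHANG = {"MauBI": "CGCG", "AscI": "CGCG", "AbsI": "TCGA", "SgrDI": "TCGA"}


def linker_pairs():
    """Enumerate (prefix, suffix) names with one enzyme from each overhang group.

    The suffix starts with the partner of the prefix's second enzyme, so the
    inner ends are compatible, and ends with the partner of the first enzyme.
    """
    pairs = []
    for outer in ["MauBI", "AscI", "AbsI", "SgrDI"]:
        for inner in ["AbsI", "SgrDI", "MauBI", "AscI"]:
            if OVERHANG[inner] != OVERHANG[outer]:
                prefix = f"{outer}-{inner}"
                suffix = f"{PARTNER[inner]}-{PARTNER[outer]}"
                pairs.append((prefix, suffix))
    return pairs


for prefix, suffix in linker_pairs():
    print(prefix, "/", suffix)
```

The enumeration yields eight prefix/suffix combinations, the same count and pairing as listed in Table 21.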
Sequence Alignment 34: Compatibility of different prefix or suffix linkers comprising recognition sites for two restriction enzymes that are 8-bp long separated by additional spacer sequences
[0420] In this example, the spacer sequences between the MauBI and AbsI sites in the prefix linker and between the SgrDI and AscI sites in the suffix linker are both replaced by the recognition site for PacI (TTA,AT′TAA). PacI cuts 13 times in AcNPV and 13 times in bMON14272 (but not within the mini-F-Kan-mini-attTn7 segment), and is compatible with AsiSI (GCG,AT′CGC) and PvuI (CG,AT′CG).
[0421] Digestion of the DNA fragment flanked by the prefix and suffix sequences noted below with PacI will allow release of the insert that also contains the 3′ portion of the prefix linker and the 5′ portion of the suffix linker, allowing ligation of the insert fragment into a vector comprising a PacI site in either orientation, or ligation of the vector that retains the 5′ portion of the prefix linker and the 3′ portion of the suffix linker to regenerate a single PacI site.
[0422] In one of many possible variations, the spacer sequences between the MauBI and AbsI sites in the prefix linker and between the SgrDI and AscI sites in the suffix linker are both replaced by the recognition site for FseI (GG,CCGG′CC). FseI cuts once in AcNPV and once in bMON14272, and is not compatible with any other restriction enzyme since the sticky end that is generated is a 4-bp 3′ CCGG overhang.
[0423] Digestion of the DNA fragment flanked by the prefix and suffix sequences noted below with FseI will allow release of the insert that also contains the 3′ portion of the prefix linker and the 5′ portion of the suffix linker, allowing ligation of the insert fragment into a vector comprising an FseI site in either orientation, or ligation of the vector that retains the 5′ portion of the prefix linker and the 3′ portion of the suffix linker to regenerate a single FseI site. An EagI site, which is compatible with NotI, overlaps the FseI and AscI sites (data not shown).
[0424] One advantage of using PacI instead of FseI as the spacer sequence is that the PacI recognition sequence is very AT-rich, compared to the recognition sequence for FseI, which is very GC-rich. A long stretch of GC-rich residues across the entire prefix-spacer-prefix and suffix-spacer-suffix sequences may prevent or impair the ability of DNA segments to be synthesized where the prefix and suffix sequences flank a desired set of genetic elements, compared to prefix and suffix sequences where the spacer sequence is more AT-rich. Note also that PacI cuts 13 times in AcNPV and in bMON14272, while FseI cuts once each in AcNPV and bMON14272, which may alter strategies for assembling modular baculovirus vectors using PacI in a spacer sequence, compared to FseI.
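The difference in base composition can be made concrete with a minimal sketch. It assumes the prefix linker is the direct concatenation of the MauBI site, the spacer site, and the AbsI site with no additional nucleotides; the actual synthesized sequences may differ. The helper name `gc_fraction` is illustrative.

```python
MAUBI, ABSI = "CGCGCGCG", "CCTCGAGG"
PACI, FSEI = "TTAATTAA", "GGCCGGCC"


def gc_fraction(seq):
    """Fraction of G and C residues in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical 24-nt prefix linkers: MauBI site + spacer site + AbsI site.
prefix_with_paci = MAUBI + PACI + ABSI
prefix_with_fsei = MAUBI + FSEI + ABSI

print(f"PacI spacer: {gc_fraction(prefix_with_paci):.2f}")
print(f"FseI spacer: {gc_fraction(prefix_with_fsei):.2f}")
```

Under this assumption, the PacI spacer brings the 24-nt prefix to 14 G/C residues out of 24, whereas the FseI spacer gives 22 out of 24.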
TABLE-US-00041 TABLE 22 Summary of pairs of synthetic prefix and suffix linkers comprising two 8-bp recognition sites separated by the recognition site for PacI, each pair separated by an intervening sequence (IV) comprising an AvrII site
Prefix | SEQ ID NO | Suffix | SEQ ID NO | Prefix-AvrII-Suffix Double Polylinker | SEQ ID NO | Digestion/Ligation Product | SEQ ID NO
MauBI-PacI-AbsI | 137 | SgrDI-PacI-AscI | 144 | MauBI-PacI-AbsI-AvrII-SgrDI-PacI-AscI | 145 | MauBI-PacI-AscI | 153
MauBI-PacI-SgrDI | 138 | AbsI-PacI-AscI | 142 | MauBI-PacI-SgrDI-AvrII-AbsI-PacI-AscI | 146 | MauBI-PacI-AscI | 153
AscI-PacI-AbsI | 139 | SgrDI-PacI-MauBI | 143 | AscI-PacI-AbsI-AvrII-SgrDI-PacI-MauBI | 147 | AscI-PacI-MauBI | 154
AscI-PacI-SgrDI | 140 | AbsI-PacI-MauBI | 141 | AscI-PacI-SgrDI-AvrII-AbsI-PacI-MauBI | 148 | AscI-PacI-MauBI | 154
AbsI-PacI-MauBI | 141 | AscI-PacI-SgrDI | 140 | AbsI-PacI-MauBI-AvrII-AscI-PacI-SgrDI | 149 | AbsI-PacI-SgrDI | 155
AbsI-PacI-AscI | 142 | MauBI-PacI-SgrDI | 138 | AbsI-PacI-AscI-AvrII-MauBI-PacI-SgrDI | 150 | AbsI-PacI-SgrDI | 155
SgrDI-PacI-MauBI | 143 | AscI-PacI-AbsI | 139 | SgrDI-PacI-MauBI-AvrII-AscI-PacI-AbsI | 151 | SgrDI-PacI-AbsI | 156
SgrDI-PacI-AscI | 144 | MauBI-PacI-AbsI | 137 | SgrDI-PacI-AscI-AvrII-MauBI-PacI-AbsI | 152 | SgrDI-PacI-AbsI | 156
TABLE-US-00042 TABLE 23 Pairs of synthetic prefix and suffix linkers comprising two 8-bp recognition sites separated by the recognition site for PacI, each pair separated by an intervening sequence (IV) comprising an AvrII site
Each entry lists the prefix and suffix (with SEQ ID NO), their recognition sites with the internal 6-bp cutters, the prefix-IV-suffix double polylinker, and the ligated digestion product (LP).

Prefix MauBI PacI AbsI (SEQ ID NO 137) // Suffix SgrDI PacI AscI (SEQ ID NO 144)
Prefix sites: CG′CGCG,CG tta,at′taa CC′TCGA,GG (internal BssHII, XhoI)
Suffix sites: CG′TCGA,CG tta,at′taa GG′CGCG,CC (internal SalI, BssHII)
Double polylinker (SEQ ID NO 145): CG′CGCG,CG tta,at′taa CC′TCGA,GG cctagg CG′TCGA,CG tta,at′taa GG′CGCG,CC
LP (SEQ ID NO 153): CG′CGCG,CG tta,at′taa GG′CGCG,CC

Prefix MauBI PacI SgrDI (SEQ ID NO 138) // Suffix AbsI PacI AscI (SEQ ID NO 142)
Prefix sites: CG′CGCG,CG tta,at′taa CG′TCGA,CG (internal BssHII, SalI)
Suffix sites: CC′TCGA,GG tta,at′taa GG′CGCG,CC (internal XhoI, BssHII)
Double polylinker (SEQ ID NO 146): CG′CGCG,CG tta,at′taa CG′TCGA,CG cctagg CC′TCGA,GG tta,at′taa GG′CGCG,CC
LP (SEQ ID NO 153): CG′CGCG,CG tta,at′taa GG′CGCG,CC

Prefix AscI PacI AbsI (SEQ ID NO 139) // Suffix SgrDI PacI MauBI (SEQ ID NO 143)
Prefix sites: GG′CGCG,CC tta,at′taa CC′TCGA,GG (internal BssHII, XhoI)
Suffix sites: CG′TCGA,CG tta,at′taa CG′CGCG,CG (internal SalI, BssHII)
Double polylinker (SEQ ID NO 147): GG′CGCG,CC tta,at′taa CC′TCGA,GG cctagg CG′TCGA,CG tta,at′taa CG′CGCG,CG
LP (SEQ ID NO 154): GG′CGCG,CC tta,at′taa CG′CGCG,CG

Prefix AscI PacI SgrDI (SEQ ID NO 140) // Suffix AbsI PacI MauBI (SEQ ID NO 141)
Prefix sites: GG′CGCG,CC tta,at′taa CG′TCGA,CG (internal BssHII, SalI)
Suffix sites: CC′TCGA,GG tta,at′taa CG′CGCG,CG (internal XhoI, BssHII)
Double polylinker (SEQ ID NO 148): GG′CGCG,CC tta,at′taa CG′TCGA,CG cctagg CC′TCGA,GG tta,at′taa CG′CGCG,CG
LP (SEQ ID NO 154): GG′CGCG,CC tta,at′taa CG′CGCG,CG

Prefix AbsI PacI MauBI (SEQ ID NO 141) // Suffix AscI PacI SgrDI (SEQ ID NO 140)
Prefix sites: CC′TCGA,GG tta,at′taa CG′CGCG,CG (internal XhoI, BssHII)
Suffix sites: GG′CGCG,CC tta,at′taa CG′TCGA,CG (internal BssHII, SalI)
Double polylinker (SEQ ID NO 149): CC′TCGA,GG tta,at′taa CG′CGCG,CG cctagg GG′CGCG,CC tta,at′taa CG′TCGA,CG
LP (SEQ ID NO 155): CC′TCGA,GG tta,at′taa CG′TCGA,CG

Prefix AbsI PacI AscI (SEQ ID NO 142) // Suffix MauBI PacI SgrDI (SEQ ID NO 138)
Prefix sites: CC′TCGA,GG tta,at′taa GG′CGCG,CC (internal XhoI, BssHII)
Suffix sites: CG′CGCG,CG tta,at′taa CG′TCGA,CG (internal BssHII, SalI)
Double polylinker (SEQ ID NO 150): CC′TCGA,GG tta,at′taa GG′CGCG,CC cctagg CG′CGCG,CG tta,at′taa CG′TCGA,CG
LP (SEQ ID NO 155): CC′TCGA,GG tta,at′taa CG′TCGA,CG

Prefix SgrDI PacI MauBI (SEQ ID NO 143) // Suffix AscI PacI AbsI (SEQ ID NO 139)
Prefix sites: CG′TCGA,CG tta,at′taa CG′CGCG,CG (internal SalI, BssHII)
Suffix sites: GG′CGCG,CC tta,at′taa CC′TCGA,GG (internal BssHII, XhoI)
Double polylinker (SEQ ID NO 151): CG′TCGA,CG tta,at′taa CG′CGCG,CG cctagg GG′CGCG,CC tta,at′taa CC′TCGA,GG
LP (SEQ ID NO 156): CG′TCGA,CG tta,at′taa CC′TCGA,GG

Prefix SgrDI PacI AscI (SEQ ID NO 144) // Suffix MauBI PacI AbsI (SEQ ID NO 137)
Prefix sites: CG′TCGA,CG tta,at′taa GG′CGCG,CC (internal SalI, BssHII)
Suffix sites: CG′CGCG,CG tta,at′taa CC′TCGA,GG (internal BssHII, XhoI)
Double polylinker (SEQ ID NO 152): CG′TCGA,CG tta,at′taa GG′CGCG,CC cctagg CG′CGCG,CG tta,at′taa CC′TCGA,GG
LP (SEQ ID NO 156): CG′TCGA,CG tta,at′taa CC′TCGA,GG
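The relationship between a double polylinker and its ligated digestion product in Tables 22 and 23 can be sketched as a top-strand simulation. The sketch assumes the double polylinker is the direct concatenation of the sites as written in Table 23 and that PacI cuts the top strand after TTAAT; the function name `digest_top_strand` is illustrative, and only the top strand is modeled.

```python
PACI_SITE = "TTAATTAA"
PACI_CUT = 5  # top-strand cut after TTAAT, leaving a 2-nt 3' overhang


def digest_top_strand(seq, site=PACI_SITE, cut=PACI_CUT):
    """Split the top strand at every occurrence of the recognition site."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut])
        start = pos + cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# MauBI-PacI-AbsI prefix, AvrII intervening sequence, SgrDI-PacI-AscI suffix.
double_polylinker = ("CGCGCGCG" + "TTAATTAA" + "CCTCGAGG" + "CCTAGG"
                     + "CGTCGACG" + "TTAATTAA" + "GGCGCGCC")

left_arm, released_insert, right_arm = digest_top_strand(double_polylinker)
ligation_product = left_arm + right_arm  # vector arms rejoined at one PacI site

print(released_insert)
print(ligation_product)
```

In this sketch the rejoined arms read MauBI-PacI-AscI, matching the digestion/ligation product named for this pair, and the released fragment carries the AbsI-AvrII-SgrDI portion between the two PacI half-sites.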
Proof of Concept Experiments
[0425] Twenty vectors, which included test, target, and donor vectors, were designed and synthesized by Twist Biosciences (T). Twist vectors with the prefix pTAH confer resistance to ampicillin and have a high copy number (H). Vectors with the prefix pTCM confer resistance to chloramphenicol and have a medium copy number (M). Vectors with the prefix pTKM confer resistance to kanamycin and have a medium copy number. Test vectors have the suffix -CX or -KX, target vectors have the suffix -CT or -KT, and donor vectors have the suffix -AD.
[0426] Test vectors comprise sequences that mimic transposition of Tn7 into a synthetic attachment site in different reading frames to express extended or truncated fusion proteins that may or may not confer resistance to an antibiotic such as chloramphenicol or kanamycin. Target vectors are similar, but also contain the synthetic attachment site positioned an appropriate distance away from where the insertion is desired. Donor vectors typically contain the left and right arms of Tn7 flanking a cargo DNA sequence that may contain one or more synthetic polylinkers that contain recognition sites for several restriction enzymes (also referred to as a multiple cloning site or MCS), and other genes, such as the lacZalpha gene derived from pUC18, pUC19, or similar cloning vectors, wild-type and variant forms of the aacC1 gene derived from pFastBac1 conferring resistance to gentamycin, the rpsL gene conferring sensitivity to streptomycin, and genes encoding products that confer a screenable phenotype upon a cell, such as chromogenic or fluorescent proteins, or the uidA gene encoding E. coli beta glucuronidase.
[0427] Dry DNA samples were resuspended in water or Tris-EDTA buffer and transformed into competent E. coli DH10B cells using a protocol provided by Thermo Fisher, and the transformants were purified by restreaking on agar plates containing the antibiotic corresponding to the drug resistance gene on the backbone of the vector. Liquid LB media supplemented with antibiotics were used to prepare overnight cultures. Glycerol stocks were prepared from overnight cultures and stored at −20 degrees Celsius. The phenotypes of DH10B cells harboring different vectors were determined by restreaking overnight cultures on LB agar plates containing different concentrations of antibiotics, typically Amp 100, IPTG 40, X-Gal 40, Cam 50, Kan 50, or a series of concentrations on solid agar or liquid LB medium that included Cam 0, 6.25, 12.5, and 25, or Kan 0, 12.5, 25, and 50.
TABLE-US-00043 TABLE 24 Summary of Twist Vectors 1-20
ID Code | Short Name | Description | Expected Phenotype | Observed Phenotype | Size of Insert | SEQ ID NO of Insert
01-AD | pTAH-new-mini-Tn7 | New-miniTn7 with smaller flanking sequences and internal MauBI-PacI-AbsI-AvrII-SbfI(PstI)-SacII-SgrDI-PacI-AscI polylinker | AmpR, lac minus | AmpR, lac minus | 546 | 199
02-AD | pTAH-new-mini-Tn7-lacZalphapUC18 | New mini-Tn7 with internal lacZalpha region derived from pUC18 | AmpR, lac | AmpR, lac plus | 986/79 | 200/201
03-CX | pTCM-Kan-CGRT | Kan extended with CGRTK to mimic Tn7Lrf1 | CamR, KanR | CamR, KanS | 1028 | 202
04-CX | pTCM-Kan-PS | Kan extended with PS to mimic prior art reference with silent EcoRI and SpeI sites | CamR, KanS | CamR, KanS | 1028 | 203
05-CX | pTCM-Kan-PSFNAVVYHS | Kan extended with PSFNAVVYHS to mimic prior art reference | CamR, KanS | CamR, KanS | 1040 | 204
06-CT | pTCM-Kan-PS-mini-attTn7 | Kan extended with PS and overlapping mini-attTn7 | CamR, KanS | CamR, KanS | 1069 | 205
07-CX | pTCM-Kan-Tn7Lrf1 | Kan extended with CGRTK with partial Tn7L rf1 | CamR, KanR | CamR, KanS | 1074 | 206
08-CX | pTCM-Kan-Tn7Lrf2 | Kan extended with LWADKIVGNWEGWKWSF with partial Tn7L rf2 | CamR, KanR | CamR, KanS | 1075 | 207
09-CX | pTCM-Kan-Tn7Lrf3 | Kan extended with PVGGQNSWELGGVEMEFLRII with partial Tn7L rf3 | CamR, KanR | CamR, KanS | 1076 | 208
10-CX | pTCM-Mau-Abs-Kan177-PS-Sgr-Asc | Kan extended with PS to mimic prior art reference without silent EcoRI or SpeI sites | CamR, KanS | CamR, KanS | 1016 | 209
11-CX | pTCM-Mau-Abs-Kan177-Sgr-Asc | Kan gene from pACYC177 not extended or truncated without silent EcoRI or SpeI sites | CamR, KanR | CamR, KanR | 1016 | 210
12-KX | pTKM-CATd8 | CAT gene from pACYC184 not extended or truncated and deleted 8 bases from the right polylinker | KanR, CamR | KanR, CamR | 876 | 211
13-KX | pTKM-CAT-TAA | TAA replaced Asp Codon | KanR, CamR | KanR, CamR | 876 | 212
14-KX | pTKM-CAT-TAATAA | TAATAA replaced CysAsp Codons | KanR, CamS | KanR, Cam(S) with micro colonies on Kan 50/Cam 50 | 876 | 213
15-KT | pTKM-CAT-TAATAA-mini-attTn7 | TAATAA replaced CysAsp Codons-overlapping mini-AttTn7 | KanR, CamS | KanR, Cam(S) with micro colonies Kan 50/Cam 12.5 and Kan 50/Cam 50 | 889 | 214
16-KX | pTKMC-CAT-Tn7Lrf1 | CAT extended with CGRTK with partial Tn7L rf1 | KanR, CamR | KanR, CamR | 896 | 215
17-KX | pTKMC-CAT-Tn7Lrf2 | CAT extended with LWADKIVGNWEGWKWSF with partial Tn7L rf2 | KanR, CamR | KanR, CamR | 897 | 216
18-KX | pTKMC-CAT-Tn7Lrf3 | CAT extended with PVGGQNSWELGGVEMEFLRII with partial Tn7L rf3 | KanR, CamR | KanR, CamR | 898 | 217
19-KT | pTKM-lacZalpha-micro-attTn7 | lacZalpha-micro-attTn7 which is 150 nt smaller than pTKM-19-KT | KanR, lac plus | KanR, lac plus | 687 | 218
20-KT | pTKM-lacZalpha-mini-attTn7 | lacZalpha-mini-attTn7 similar to the sequence in the bacmid bMON14272 | KanR, lac plus | KanR, lac plus | 837 | 219
[0428] A first series of gene fusions has the cat gene altered, so that insertions take place near an essential cysteine codon, upstream from the normal stop codon as disclosed in Example 2. Extensions after transposition were expected to restore resistance to chloramphenicol.
[0429] Colonies harboring the test vectors, where the extension included sequences derived from the left end of Tn7 in three different reading frames, all grew on agar plates containing kanamycin and chloramphenicol, strongly suggesting that transposition into the gene fusion sequence in the target vector should restore activity to the encoded gene fusion.
[0430] Cells harboring the pTKM-14-KX and pTKM-15-KT vectors grew very slowly, forming microcolonies after 1 day on agar plates containing kanamycin and chloramphenicol, as noted above.
[0431] A second series of gene fusions has the NPT-II gene, which confers resistance to kanamycin, altered so that insertions take place near the normal stop codon, just upstream from an extension that encodes proline and serine and that was expected to produce an inactive fusion protein, as disclosed in Example 4. Cells harboring the test vectors in which the extension included sequences derived from the left end of Tn7 in three different reading frames were not resistant to chloramphenicol plus kanamycin, which was unexpected compared to the results observed for the cat-attTn7 gene fusions.
[0432] A third series of gene fusions has the lacZalpha gene with the mini-attTn7 site inserted into it, to mimic the target site in the bacmid bMON14272, and a smaller version that deletes 150 bp flanking the MCS region in the mini-attTn7 sequence in this gene. Both of these target vectors conferred resistance to kanamycin and were lac plus on agar plates containing IPTG and X-gal.
[0433] The donor vector pTAH-01-AD conferred resistance to ampicillin and the donor vector pTAH-02-AD conferred resistance to ampicillin and was lac plus on agar plates containing IPTG and X-gal.
[0434] Transposition experiments were carried out by first transforming the helper vector pMON7124 into DH10B cells harboring the target vectors pTKM-CAT-TAATAA-mini-attTn7, pTKM-lacZalpha-micro-attTn7, or pTKM-lacZalpha-mini-attTn7, and isolating pure colonies on agar plates containing chloramphenicol and tetracycline, or kanamycin and tetracycline, depending on the drug resistance marker on the backbone of the target vector. Overnight cultures containing the target and helper vectors were prepared and transformed with a donor vector pTAH-new-mini-Tn7-lacZalphapUC18 or pFastBac1.
[0435] Two independent cultures of cells harboring pTKM-CAT-TAATAA-mini-attTn7 and pMON7124 were transformed with pTAH-new-mini-Tn7-lacZalphapUC18 and spread on LB agar plates containing Kan 50, Cam 25, Tet 20, IPTG and X-gal, which yielded a mixture of blue and white colonies. Blue colonies from the two independent cultures were restreaked on the same agar plates, and pure overnight cultures were prepared and stored as glycerol stocks.
[0436] Samples of each glycerol stock were provided to GeneWiz, which prepared DNA samples comprising a mixture of both the composite and the helper vectors that were used as templates for sequencing across the junction of the left end of Tn7 and the expected insertion site in the gene fusion of the target vector. Structural analysis of both composite vectors confirmed that the mini-Tn7-lacZalpha gene from the donor vector was inserted into the pTKM-CAT-TAATAA-mini-attTn7 vector to produce a composite vector, where the gene fusion was extended into the left end of Tn7 to restore resistance to chloramphenicol. This is apparently the first demonstration of transposition into a gene fusion based on selection for restoration of activity of the encoded enzyme.
TABLE-US-00044 Sequence Alignment 35: Sequence of 240 bp segment across the insertion site in a 15KCT-2A7-Blue-1 composite target vector derived from pTKM-CAT-TAATAA-mini-attTn7 and a mini-Tn7-lacZalpha donor segment
SEQ ID NO 240
CAATCCCTGGGTGAGTTTCACCAGTTTTGATTTAAACGTGGCCAATATGGACAACTTCTTCGCCCCCGTTTTCACCATGG
<-- Partial coding sequence of 3′ end of the cat gene -------------------------->
GCAAATATTATACGCAAGGCGACAAGGTGCTGATGCCGCTGGCGATTCAGGTTCATCATGCCGTCTGTGATGGCTTCCAT
<------------------------------------------------------------------------------>
GTCGGCAGAATGCTTAATGAATTACAACAGTNC NGTNGNNNGNCAAAATAGTTGGGAACTGGGAGGGGTGGAAATGGAGT
<-------------------------------> <-- Tn7L * Stop Codon -----------------
With unsure nucleotides at positions 192, 194, 197, 199-201, and 203.
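As a consistency check on the listed positions, the following sketch locates the ambiguous "N" calls in the 240-nt read. The sequence is transcribed from Sequence Alignment 35 with the alignment gap removed; the function name `ambiguous_runs` is illustrative, and positions are 1-based.

```python
# 240-nt segment transcribed from Sequence Alignment 35 (three 80-nt rows).
SEGMENT = (
    "CAATCCCTGGGTGAGTTTCACCAGTTTTGATTTAAACGTGGCCAATATGGACAACTTCTTCGCCCCCGTTTTCACCATGG"
    "GCAAATATTATACGCAAGGCGACAAGGTGCTGATGCCGCTGGCGATTCAGGTTCATCATGCCGTCTGTGATGGCTTCCAT"
    "GTCGGCAGAATGCTTAATGAATTACAACAGTNC"
    "NGTNGNNNGNCAAAATAGTTGGGAACTGGGAGGGGTGGAAATGGAGT"
)


def ambiguous_runs(seq, symbol="N"):
    """Return 1-based (start, end) ranges of consecutive ambiguous base calls."""
    runs = []
    for pos, base in enumerate(seq, start=1):
        if base != symbol:
            continue
        if runs and runs[-1][1] == pos - 1:
            runs[-1] = (runs[-1][0], pos)  # extend the current run
        else:
            runs.append((pos, pos))        # start a new run
    return runs


print(len(SEGMENT), ambiguous_runs(SEGMENT))
```

All ambiguous calls fall in a 12-nt window at the junction between the end of the cat coding sequence and the left end of Tn7.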
[0437] Independent cultures of cells harboring pTKM-lacZalpha-mini-attTn7 or pTKM-lacZalpha-micro-attTn7 plus the helper vector pMON7124 were also transformed with pFastBac1, and spread on LB agar plates containing Kan 50, Tet 20, Gent 7, IPTG, and Bluo-gal, which contained a mixture of blue and white colonies after one day. White colonies from the two independent cultures were restreaked on the same agar plates, and pure overnight cultures prepared and stored as glycerol stocks.
[0438] Samples of each glycerol stock were provided to GeneWiz, which prepared DNA samples comprising a mixture of both the composite and the helper vectors that were used as templates for sequencing across the junction of the left end of Tn7 and the expected insertion site in the gene fusion of the target vector. Structural analysis of both types of composite target vectors confirmed that the mini-Tn7-SV40-MCS-PpolH-Gent segment from the pFastBac1 donor vector was inserted into both types of target vectors comprising a lacZalpha-mini-attTn7 gene to produce composite target vectors, where the gene fusion is disrupted by the insertion of the mini-transposon, preventing complementation between the alpha peptide and the acceptor polypeptide, resulting in a lac minus phenotype on agar plates containing IPTG and the chromogenic substrate X-gal or Bluo-gal (nucleotide sequence data across the junctions in the composite vectors are not shown).
[0439] Taken together, all three sets of transposition experiments demonstrated that DH10B cells harboring novel medium copy target vectors and compatible helper vectors could be used to test transposition from a variety of new modular donor vectors, reconstituting in a sense, the donor/helper/target vector system used in the original baculovirus shuttle vector system, but substituting much smaller target vectors that could be used in a systematic analysis of gene fusions that could be used to directly select or screen for transposition events in bacteria.
[0440] A second series of vectors was designed and ordered from Twist Biosciences (Vectors 21-41) to test the significance or optimize the effectiveness of different DNA segments in the target or donor vectors.
[0441] Cells harboring the first series of cat-attTn7 fusions grew very slowly. Replacing the cat promoter with an inducible lac promoter, and encoding a protein ending with ELQQY instead of ELQQYC, may allow them to grow better under uninduced and induced conditions. The sulfhydryl group in the extra cysteine residue at the end of the protein may react with other molecules within the cell if it is expressed at high levels.
[0442] Two alterations to the kan gene (adding a silent EcoRI site, without altering the codons upstream from the stop codon, or a SpeI site, downstream from the stop codon) just upstream and downstream from the natural stop codon could have affected the outcome. Extensions added by reading into Tn7L in different reading frames could also prevent restoration of activity to the fusion protein.
[0443] New vectors were designed to separate these issues, to remove the altered EcoRI site, and to redesign the kan fusions so that transposition into a vector that has a Pro-Ser extension will truncate it back to the normal stop codon. To do this, though, the TGT (encoding Cys) at the left end of Tn7L has to be in the right reading frame to encode a normal-sized enzyme. The last amino acid is Phe (F), and the second to last is also Phe, but the second to last is not always conserved in lineups of related kanamycin phosphotransferases. The second to last codon was altered to encode Leucine (L), which should allow expression of a product that has the same size after transposition from the gene encoding the extended, inactive PS fusion protein.
[0444] Several new donor vectors were designed to work with the kan gene comprising the F270L mutation and to contain stop codons in several different reading frames. While many are possible, three were designed and synthesized, two containing PacI sites (TTAATTAA) in slightly different positions just beyond the TGT, and one containing an XbaI site that has a TAG stop codon within it. Transposition of any of the three new donors should restore kanamycin resistance in the target vectors comprising the redesigned kan-attTn7 sequence. Altered sequences near the 5′ end of Tn7L don't need to be palindromic. Other sequences can be used as long as the truncation or extension restores activity to the encoded protein. If TGT is an essential requirement at the 5′ end of Tn7 in a donor vector, it can be inserted into 3 different reading frames as noted below.
TABLE-US-00045 TABLE 25 Encoding amino acids by Tn7L after transposition into a target site
Reading frame | Codons with TGT in frame | Encoded polypeptide segment | Codon 1 | Codon 2 | Codon 3 | Codon 4
rf1 | nnn TGT nnn nnn | X-C-X-X | $ | C (excludes 19 aa plus *) | $ | $
rf2 | nnn nTG Tnn nnn | X-(L/M/V)-(F/L/S/Y/*/C/W)-X | $ | LMV (excludes 17 aa plus *) | FLSY*CW (excludes PHQRIMTNKVADE) | $
rf3 | nnn nnT GTn nnn | X-(FSYCILTVPNAHRDG)-(V)-X | $ | FSYCILTVPNAHRDG (excludes WQ*MKE) | V (excludes 19 aa plus *) | $
*The symbol “$” represents any amino acid and any of the three stop codons is represented by “*”. “QKE” are common to the list of excluded amino acids, preceded by “#”, for reading frames 2 and 3. The net effect is that polypeptides containing adjacent Q, K, or E residues will be difficult to encode for restoration or disruption of activity by a Tn7-like transposon.
[0445] Other site-specific transposons may have sequences at their ends that are different than TGT, which may be longer or shorter, complicating the algorithm noted above, but fusions created after transposition should be predictable based on genetic code tables for different organisms.
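The reading-frame analysis summarized in Table 25 can be reproduced with a short enumeration. The sketch assumes the standard genetic code and treats "n" as any of the four bases; the names `CODE`, `residues`, and `excluded` are illustrative. Other end sequences or other code tables could be substituted in the same way.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def residues(pattern):
    """Set of residues (or '*') encoded by a codon pattern; 'n' means any base."""
    pools = [BASES if ch in "nN" else ch.upper() for ch in pattern]
    return {CODE["".join(codon)] for codon in product(*pools)}


def excluded(pattern):
    """Residues (or '*') that the codon pattern can never encode."""
    return set(AMINO) - residues(pattern)


# Codons overlapping the TGT at the transposon end in each reading frame.
FRAMES = {"rf1": ["TGT"], "rf2": ["nTG", "Tnn"], "rf3": ["nnT", "GTn"]}

for frame, patterns in FRAMES.items():
    for pattern in patterns:
        print(frame, pattern, "".join(sorted(residues(pattern))))
```

Run this way, the allowed sets agree with Table 25 (C; L/M/V and F/L/S/Y/*/C/W; the 15 residues for nnT and V). Under the standard code, the residues excluded at the Tnn codon come out as P, H, Q, R, I, M, T, N, K, V, A, D, E, and G.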
[0446] Target and donor vectors comprising the rpsL gene (conferring sensitivity to streptomycin) and a chromogenic staghorn coral protein were also designed. The target vector containing the rpsL-attTn7 gene should allow direct selection of transposition events in the presence of streptomycin. The coral-attTn7 gene should allow detection of white colonies in a background of cyan blue colonies (without the need to use IPTG and expensive X-gal or Bluo-Gal chromogenic substrates).
[0447] Several donor vectors were synthesized to contain two genes, lacZalpha, rpsL, or CyanFP, plus the gentamycin resistance gene derived from pFastBac1, which can be used to test and monitor transposition events with or without selection of drug resistance conferred by a marker within the cargo segment of the donor vector.
[0448] The new “double donors” can easily be reduced in size, removing the first or second gene by digesting with a single restriction enzyme that has a site that flanks either gene, and ligating to circularize the molecule.
[0449] Two codons near the 5′ end of the gentamycin resistance gene were altered to have silent changes to encode Serine, since the Twist Sequence Analysis flagged part of the unaltered sequences to be part of a direct repeat just upstream from the ATG start codon. Vectors without these changes could not be synthesized due to the direct repeats flagged by their system.
TABLE-US-00046 TABLE 26 Summary of New Vectors 21-40
ID Code | Short_Name | Expected phenotype | Observed Phenotype | Size of Insert | SEQ ID NO of Insert | Description
21-CX | pTCM-21C-Kan-EcoRI | CamR, KanR | CamR, KanR | 1016 | 220 | Kan MLDEFF not extended or truncated with silent EcoRI site
22-CX | pTCM-22C-Kan-MLDEFFCGRTK | CamR, KanS if CGRTK extension doesn't restore activity | CamR, KanS | 1025 | 221 | Kan MLDEFFCGRTK extended to mimic Tn7Lrf1 without silent EcoRI and SpeI sites
23-CX | pTCM-23C-Kan-F270L | CamR, KanR, if F270L is conservative | CamR, KanR | 1016 | 222 | Kan MLDELF-F270L (TTT-Phe to CTG-Leu)
24-CX | pTCM-24C-Kan-MLDELFPS-F270L | CamR, KanS, if F270L and PS fusion is inactive | CamR, KanS | 1016 | 223 | Kan MLDELFPS-F270L (TTT-Phe to CTG-Leu) extended PS
25-CX | pTCM-25C-Kan-MLDELFPSN-F270L | CamR, Kan? | CamR, KanS | 1021 | 224 | Kan MLDELFN-TG-TTT-AAT-TAA-PacI-1 extended N
26-CX | pTCM-26C-Kan-MLDELF-F270L | CamR, KanR | CamR, KanR | 1022 | 225 | Kan MLDELF-TG-TTT-TAA-TTT-A-PacI-2, Phe to Leu, plus Phe before TAA stop should be resistant
27-CX | pTCM-27C-Kan-MLDELF-F270L | CamR, KanR | CamR, KanR | 1022 | 226 | Kan MLDELF-TG-TTC-TAG-A-XbaI, Phe to Leu, plus Phe before TAG stop should be resistant
28-CT | pTCM-28C-Kan-MLDELFPS-F270L-attT | CamR, KanS | CamR, KanS | 1064 | 227 | Kan MLDELFPS-F270L (TTT-Phe to CTG-Leu)-FPS-Stop-mini-attTn7 version 1, should be sensitive
29-CT | pTCM-29CLacPKanMLDELFQA-F270Latt | CamR, KanR | CamR, KanS | 1188 | 228 | LacP-Kan MLDELFQA-F270L (TTT-Phe to CTG-Leu)-FQA-Stop-mini-attTn7 should be resistant if QA doesn't affect activity
30-CT | pTCM-30CLacPKanMLDELFPS-F270Latt | CamR, KanS | CamR, KanS | 1188 | 229 | LacP-Kan MLDELFPS-F270L (TTT-Phe to CTG-Leu)-FPS-Stop-mini-attTn7 version 1, replacing the kan promoter, with lacPO inducible promoter driving kan-mini-attTn7
31-KT | pTKM-31KTLacPCatTAATAACysAspatt | KanR, CamS | KanR, CamR when spotted, not streaked | 965 | 230 | Lac promoter-cat gene-TAATAA replaced CysAsp Codons-overlapping mini-AttTn7 ending ELQQY, replacing the cat promoter with lacPO driving CAT-mini-attTn7 encoding truncated cat protein
32-KT | pTKM-32KT-LacPCat-TAArepAspatt | KanR, CamS | KanR, CamR, when spotted, not streaked | 965 | 231 | Lac promoter-cat gene-TAA replaced Asp Codon-overlapping mini-AttTn7 ending ELQQYC, replacing the cat promoter with lacPO driving CAT-mini-attTn7 encoding truncated cat protein
33-KT | pTKM-33KT-rpsL-mini-attTn7 | KanR, StrepS | KanR, StrepS, but very slow or no growth | 965 | 232 | rpsL-mini-attTn7 with insertion in codon 122 of 125 encoding GVKRPKA before insertion, and replacing PKA after insertion so target with dominant StrepS gene linked to mini-attTn7 is disrupted by transposition and confers StrepR
34-KT | pTKM-34KT-LacP-CyanFP-attTn7 | KanR, cyan | KanR, white | 1016 | 233 | Lac promoter-Cyan chromogenic protein-mini-attTn7 encoding NPLKVQ before insertion near codon 228 of 231 replacing KVQ so transposition disrupts protein (colored to white).
35-AD | pTAH-35AD-miniTn7-lacZalpha-Gent | AmpR, GentR, lac plus | AmpR, GentS, lac plus | 1822 | 234 | Mini-Tn7-MauBI-AbsI-LacZalpha-SgrDI-AbsI-Gent-SgrDI-AscI, with wild-type Tn7 ends
36-AD | pTAH-36AD-Tn7LPac1-2a-lacZ-Gent | AmpR, GentR, lac plus | AmpR, GentS, lac plus | 1822 | 235 | Mini-Tn7L-PacI-2a-lacZalpha-Gent where Tn7L in rf2 would encode Kan-MLDELF*, with altered Tn7L and PacI site
37-AD | pTAH-37AD-Tn7L-PacI-1a-lacZaGent | AmpR, GentR, lac plus | AmpR, GentS, lac plus | 1822 | 236 | Mini-Tn7L-PacI-1a-lacZalpha-Gent where Tn7L in rf2 would encode Kan-MLDELFN* with altered Tn7L and PacI site
38-AD | pTAH-38AD-Tn7LXbaI-1a-lacZa-Gent | AmpR, GentR, lac plus | AmpR, GentS, lac plus | 1822 | 237 | Mini-Tn7L-XbaI-lacZalpha-Gent where Tn7L in rf2 would encode Kan-MLDELF* with altered Tn7L and XbaI site
39-AD | pTAH-39AD-mini-Tn7-rpsL-Gent | AmpR, GentR | AmpR, GentS | 1868 | 238 | Mini-Tn7-MauBI-AbsI-rpsL-SgrDI-AbsI-Gent-SgrDI-AscI, with rpsL dominant StrepS gene, plus Gentamycin gene
40-AD | pTAH-40AD-mini-Tn7-CyanFP--Gent | AmpR, GentR | AmpR, GentS | 2278 | 239 | Mini-Tn7-MauBI-AbsI-lacP-AmilCyanFP-SgrDI-AbsI-Gent-SgrDI-AscI with Cyan chromogenic coral
fluorescent
[0450] Analysis of the phenotypes of colonies harboring different test vectors confirmed that introducing a silent EcoRI site at the 3′ end of the kan gene did not affect activity of the encoded protein, but adding extensions that mimicked reading frames extending into a wild-type Tn7L resulted in fusion proteins that did not confer resistance to kanamycin. A conservative F270L substitution near the 3′ end of the kan gene did not affect activity of the encoded enzyme, while extensions adding PS or QA did affect activity of the enzyme. These results strongly suggest that gene fusions comprising an altered form of the kan gene fused to mini-attTn7 can be used to detect transposition events in which the insertion truncates an extended, inactive fusion protein back to a sequence that has the same length as the wild-type enzyme and contains the conservative F270L substitution near its C-terminal end.
[0451] Colonies harboring target vectors comprising altered cat-mini-attTn7 sequences gave different results when cultures were streaked rather than spotted onto agar plates containing kanamycin plus chloramphenicol. Colonies comprising these vectors grew well on agar plates containing kanamycin, but poorly or not at all on agar plates containing kanamycin and chloramphenicol. When 20 ul of cells from an overnight culture were spotted onto agar plates containing kan, cam, or kan and cam, both cultures grew well on plates containing kanamycin after 1 day, and grew well on all test plates after 2 days. Chloramphenicol is bacteriostatic, so inactivation of the antibiotic by any mechanism should allow growth once the concentration falls below a minimal inhibitory concentration, in contrast to kanamycin, which is bactericidal and kills cells that cannot inactivate the antibiotic.
[0452] Both strategies for restoring activity to cells harboring vectors comprising gene fusions encoding a catalytically-inactive enzyme, one by extension and one by truncation, can be used with other types of genes encoding enzymes conferring resistance to antibiotics, including ampicillin, tetracycline, gentamycin, and hygromycin, among many others, and with pairs of toxin/anti-toxin genes, to facilitate the direct selection of transposition events in E. coli and related bacteria.
[0453] Analysis of the phenotypes of colonies harboring new dual donor vectors revealed that the gentamycin gene that was inserted into these vectors was defective, and could not confer resistance to the antibiotic at 7 ug/ml, although the vectors all conferred resistance to ampicillin at 100 ug/ml, and were lac plus on agar plates if they also contained the lacZalpha gene. The gene encoding a chromogenic protein derived from staghorn coral did not produce colonies that were noticeably different in color from lac minus colonies on agar plates containing IPTG and X-gal.
[0454] Colonies harboring target and donor vectors comprising the rpsL gene did not grow, or grew very slowly as microcolonies, on different kinds of selection plates, suggesting that the product of this gene is toxic when it is carried on a high copy number vector, even in the absence of induction with IPTG.
[0455] Cells harboring each of the new target vectors and the helper vector were prepared by transforming target vector DNA samples into DH10B cells harboring pMON7124, and their colony phenotypes were compared on agar plates containing tetracycline plus different concentrations of kanamycin and/or chloramphenicol.
[0456] Cells harboring the pTCM-28C-Kan-MLDELFPS-F270L-attTn7, pTCM-29CLacPKanMLDELFQA-F270LattTn7, and pTCM-30CLacPKanMLDELFPS-F270LattTn7 target vectors plus pMON7124 all grew when 20 ul of overnight cultures were spotted onto agar plates containing chloramphenicol, but not on plates containing kanamycin, confirming that the PS and QA extensions did not encode an active enzyme.
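The expected growth pattern on selection plates can be sketched as a simple set comparison. This is a deliberately naive model, assuming that a cell grows only if it resists every antibiotic in the plate; the function name grows_on and the resistance labels are illustrative. It does not capture the density-dependent growth seen with bacteriostatic agents on spotted plates.

```python
def grows_on(plate_antibiotics, cell_resistances):
    """Return True if the cell resists every antibiotic present in the plate."""
    return set(plate_antibiotics) <= set(cell_resistances)


# Resistances assumed from the text: the pTCM target backbone confers CamR,
# the helper pMON7124 confers TetR, and the extended kan fusion is inactive.
cell = {"Cam", "Tet"}

grows_on_cam = grows_on({"Cam"}, cell)          # chloramphenicol plate
grows_on_kan = grows_on({"Kan"}, cell)          # kanamycin plate
grows_on_tet_cam = grows_on({"Tet", "Cam"}, cell)
```

Under this model a transposition event that restores kanamycin resistance would add "Kan" to the resistance set, which is what a direct selection on kanamycin plates would detect.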
[0457] Cells harboring the pTKM-31KTLacPCatTAATAACysAspattTn7 and pTKM-32KT-LacPCat-TAArepAspattTn7 target vectors plus pMON7124, all grew when 20 ul of overnight cultures were spotted onto agar plates containing chloramphenicol, kanamycin, or both chloramphenicol and kanamycin, which was unexpected, but consistent with observations noted above, where growth of cells on plates containing chloramphenicol, a bacteriostatic agent, might be observed on densely spotted plates, compared to plates where cultures are streaked out to form separate colonies.
[0458] Similar results were obtained in transposition experiments in which two independent cultures of DH10B harboring the target vector pTKM-31KTLacPCatTAATAACysAspattTn7 or pTKM-32KT-LacPCat-TAArepAspattTn7 and the pMON7124 helper vector were transformed with four different donor vectors, pTAH-new-mini-Tn7-lacZalphapUC18, pTAH-37AD-Tn7L-PacI-1a-lacZaGent, pTAH-38AD-Tn7LXbaI-1a-lacZa-Gent, and pTAH-40AD-mini-Tn7-CyanFP-Gent, and colonies were selected on agar plates containing Cam 25 Kan 50 Tet 10 IPTG Xgal Gent 7, Cam Kan Tet IPTG Xgal, Cam Kan Tet Gent, or Cam Kan Tet. Microcolonies were observed for all four combinations of donor vectors transformed into cells harboring pTKM-32KT-LacPCat-TAArepAspattTn7 and pMON7124 on plates containing Cam Kan Tet IPTG Xgal, but not for cells harboring the pTKM-31KTLacPCatTAATAACysAspattTn7 vector. This strongly suggests that the gene fusion in the pTKM-32KT vector is suitable for selecting transposition events that restore activity by extension of a truncated cat gene that ends with the sequence ELQQYC, compared to the sequence encoded by pTKM-31KT, which ends with the sequence ELQQY; cells harboring the latter grew on plates containing kanamycin, but not on plates containing chloramphenicol. DNA sequence analysis across the target sites in parental and composite target vectors will be performed to confirm these observations.
[0459] Analysis of the sequence of the defective gentamycin resistance genes suggested that the “silent changes” made to two adjacent serine codons at the 5′ end of the coding sequence altered nucleotides at the 3′ end of the second of three 15-bp direct repeats, one in the promoter region, and two that are identical within the coding sequence. The function of these direct repeats is not known, but they are reported in the annotated version of the GenBank sequence of the transposon comprising the aacC1 gene.
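Direct repeats of a fixed length can be located by indexing every k-mer in a sequence. The sketch below assumes exact repeats only and uses a made-up toy sequence; the function name find_direct_repeats and the 15-nt motif are hypothetical and are not the aacC1 repeats.

```python
from collections import defaultdict


def find_direct_repeats(seq, k=15):
    """Map each k-mer that occurs more than once to its 0-based start positions."""
    positions = defaultdict(list)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


# Toy sequence (hypothetical): three copies of a 15-nt motif with short spacers.
MOTIF = "ACGTTGCAAGCTTGA"
toy_sequence = "GG" + MOTIF + "CCCTT" + MOTIF + "AAAC" + MOTIF + "TT"
repeats = find_direct_repeats(toy_sequence, k=15)
```

Running such a scan on a designed sequence before and after codon changes would show whether an edit falls inside a repeated element, although it cannot say whether the repeat is functional.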
[0460] The defective gentamycin resistance genes in five dual donor vectors, pTAH-35AD-miniTn7-lacZalpha-Gent, pTAH-36AD-Tn7LPacI-2a-lacZ-Gent, pTAH-37AD-Tn7L-PacI-1a-lacZaGent, pTAH-38AD-Tn7LXbaI-1a-lacZa-Gent, and pTAH-40AD-mini-Tn7-CyanFP-Gent, were repaired by mixing pFastBac1 with each of the new donor vectors and digesting with the restriction enzyme BtgI, which cuts twice in each of the new donors, just upstream from the promoter and downstream from the 3′ end of the gentamycin resistance gene, and three times in pFastBac1, then heat inactivating the restriction enzyme and ligating with T4 DNA ligase before transforming the mixture into competent DH10B cells. Two colonies from each ligation mixture that grew on agar plates containing ampicillin, gentamycin, IPTG and X-gal were purified by restreaking, and DNA samples were prepared for sequencing. Colonies harboring the repaired pTAH-35AD-miniTn7-lacZalpha-Gent, pTAH-36AD-Tn7LPacI-2a-lacZ-Gent, pTAH-37AD-Tn7L-PacI-1a-lacZaGent, and pTAH-38AD-Tn7LXbaI-1a-lacZa-Gent dual donor vectors were blue on plates containing X-gal, while those harboring the pTAH-40AD-mini-Tn7-CyanFP-Gent vector were white. Miniprep DNA samples were prepared for sequence analysis to confirm that the defective gene was repaired in each of the dual donor vectors.
[0461] The new dual donor vectors will greatly facilitate the analysis of transposition events using target vectors comprising modified cat-mini-attTn7 or kan-mini-attTn7 fusions, among others, by allowing for the selection of composite vectors based on the restoration of activity in the gene fusion, and monitoring the expression of the lacZalpha gene, with and without selection for gentamycin resistance carried within the cargo sequence of the mini-transposon, and comparing their efficiencies of transposition under different selection or screening schemes.
Example 11—Design of Modular Donor Vectors
[0462] Many types of donor vectors comprising mini-Tn7 elements have been constructed, where the left and right arms of Tn7 (Tn7L and Tn7R) flank a central cargo DNA segment comprising one or more genes of interest that can all be transposed to a specific attachment site on a target vector or the chromosome by the products of the tnsA-D genes carried on a helper vector, or randomly transposed to a segment on a conjugal plasmid by the products of the tnsA-C and E genes. Random transposition has also been observed in several cases when products of the tnsA and tnsB genes are used with a gain-of-function mutant product encoded by a variant tnsC gene.
[0463] The pFastBac series of vectors commonly used to facilitate expression of heterologous proteins by recombinant baculoviruses in cultured insect cells are derived from pMON14327, which contains the left and right arms of Tn7 (Tn7L and Tn7R) flanking an internal region comprising a gene encoding resistance to gentamycin, along with the strong polyhedrin promoter (Ppolh) driving expression of a gene encoding β-glucuronidase, and a sequence comprising an SV40 poly(A) transcriptional terminator [Luckow et al, (1993)]. The order of genetic elements is Tn7L, SV40 poly(A), β-gluc, Ppolh, GentR, and Tn7R, with the promoter and coding sequences for the gentamycin resistance gene oriented towards Tn7R, and the SV40 poly(A)-β-gluc-Ppolh segment on the opposite strand, oriented towards Tn7L. This plasmid also contains a gene encoding resistance to ampicillin (AmpR) and an origin of replication from the cloning vector pUC8, which is incompatible with the replicon in the helper plasmid pMON7124, since both were derived from replicons commonly used in the ColE1/pMB1/pBR322/pUC series of related cloning vectors.
[0464] The pFastBac1 vector (now available from ThermoFisher), which has a size of 4776 bp, contains a variety of genetic elements that are not typically required for many transposition experiments. The mini-Tn7 transposon is 2084 bp long, where Tn7L is 166 bp long, Tn7R is 225 bp long, and the central cargo DNA segment is 1693 bp long, comprising the SV40 poly(A) transcriptional terminator, a multiple cloning site, the polyhedrin promoter, and the gene conferring resistance to gentamycin. A 159 bp sequence that flanks Tn7L is apparently derived from sequences in the intergenic region between the E. coli phoS gene (also called pstS) and the 5-bp duplication (corresponding to −2 to +2) site beyond the 3′ end of the glmS gene. A 62 bp sequence that flanks Tn7R is apparently derived from the 3′ end of the glmS gene, extending from positions −2 to +2 (the 5-bp duplication), +3 to +22 (including the second but not the first TAA stop codon), +23 to +58 (which is the TnsD binding site, and encodes the last 11 aa of the glmS gene product (*EVTVSKALNRP) and the first stop codon), followed by 6 bp to half of a natural HincII site within the glmS gene. The vector backbone also comprises a 456 bp sequence comprising a bacteriophage f1 origin of replication that is not involved in transposition.
[0465] Smaller versions of the pMON14327 and related pFastBac series vectors can be constructed by using a smaller backbone without the bacteriophage f1 origin of replication, shorter sequences that flank Tn7L and Tn7R, shorter arms in some cases, and a shorter internal cargo segment comprising a multiple cloning site permitting the modular assembly, by cloning or direct insertion of synthetic DNA segments, of synthetic mini-Tn7 transposons capable of being transposed to a wide variety of random or specific locations on target vectors or the chromosome of a host cell.
[0466] In one new version of a donor vector, designated pTAH-new-mini-Tn7, the mini-Tn7 is 495 bp long, with left and right arms that are 166 and 225 bp in length, respectively, flanking a 104 bp central cargo DNA segment comprising a polylinker comprising several 8-bp recognition sites for several rare cutting restriction enzymes (including MauBI, AbsI, AvrII, SgrDI, and AscI) as noted above in Example 9.
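The stated element lengths can be cross-checked with simple arithmetic. The sketch below only restates figures given in the text for pFastBac1 and pTAH-new-mini-Tn7; the function name mini_tn7_length is illustrative.

```python
def mini_tn7_length(left_arm_bp, right_arm_bp, cargo_bp):
    """Total length of a mini-Tn7: left arm + cargo + right arm, in bp."""
    return left_arm_bp + cargo_bp + right_arm_bp


# pFastBac1: Tn7L 166 bp, Tn7R 225 bp, cargo 1693 bp (stated total 2084 bp).
pfastbac1_mini_tn7 = mini_tn7_length(166, 225, 1693)

# pTAH-new-mini-Tn7: same arms with a 104 bp polylinker cargo (stated total 495 bp).
new_mini_tn7 = mini_tn7_length(166, 225, 104)

# Reduction in transposon length achieved by the smaller cargo segment.
saved_bp = pfastbac1_mini_tn7 - new_mini_tn7
```

Both totals agree with the lengths given above, and the difference (1589 bp) is entirely due to the shorter cargo segment, since the arms are unchanged.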
[0467] A variant form of this vector, designated pTAH-new-mini-Tn7-lacZalphapUC18, was also constructed, that has a 460 bp lacZalpha segment including the lac promoter of the cloning vector pUC18 inserted between the AbsI and SgrDI sites of the polylinker.
[0468] Other variant forms, comprising longer or shorter left and right arms of the Tn7 or Tn7-like element, or with altered sequences, adding or removing recognition sites for different restriction enzymes, or adding or removing stop codons within the arms of transposon, and forms comprising one or more marker genes or cargo genes of interest between the arms of the transposon, wherein each marker or cargo gene of interest is operably-linked to at least one promoter that is functional in bacteria or another type of host cell, may also be constructed and used with comparable donor/helper/target vector systems.
[0469] Transposition of the mini-Tn7-lacZalpha segment to the chromosome of E. coli DH10B cells should change the phenotype of the host cell from Lac minus (−) to Lac plus (+). Transposition to a target vector comprising the truncated cat or NPT-II genes should restore resistance to chloramphenicol or kanamycin, respectively, and screening can confirm that the phenotype also changed from Lac minus (−) to Lac plus (+), without the need to select for resistance to gentamycin, as was commonly done with the pMON14327 and pFastBac series of vectors.
Example 12—Design of Modular Helper Vectors Encoding Wild-Type and Variant Transposition Genes
[0470] A helper vector, designated pMON7124, comprising the right half of Tn7 cloned onto a derivative of pBR322, contains the Tn7R and the tnsABCDE genes encoding all five proteins needed for site-specific or random transposition of Tn7 into the chromosome or other plasmids within the cell [Barry (1988)]. When E. coli strain DH10B harbors both the bacmid bMON14272, which confers resistance to kanamycin, and the helper plasmid pMON7124, which confers resistance to tetracycline, both plasmids co-exist because their replicons are in different incompatibility groups [Luckow et al (1993)]. When a pUC-based donor plasmid is introduced into a cell harboring the bacmid and pMON7124 (which has a replicon that is incompatible with the donor plasmid), the mini-Tn7 segment on the donor plasmid is transposed by a cut/paste mechanism into its attachment site on the bacmid or into the chromosome, if the chromosomal site is not blocked by an existing Tn7 element.
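The compatibility relationships among the three plasmids can be summarized in a small lookup. This is a sketch only; the group labels "mini-F" and "ColE1-type" are shorthand chosen here, and the function name can_coexist is hypothetical.

```python
# Replicon family assumed for each plasmid, based on the descriptions in the text.
REPLICON_GROUP = {
    "bMON14272": "mini-F",            # bacmid, low copy number F replicon
    "pMON7124": "ColE1-type",         # helper, pBR322 derivative
    "pUC-based donor": "ColE1-type",  # donor, pUC origin
}


def can_coexist(plasmid_a, plasmid_b, groups=REPLICON_GROUP):
    """Two plasmids are treated as compatible if their replicon families differ."""
    return groups[plasmid_a] != groups[plasmid_b]


bacmid_with_helper = can_coexist("bMON14272", "pMON7124")
donor_with_helper = can_coexist("pUC-based donor", "pMON7124")
bacmid_with_donor = can_coexist("bMON14272", "pUC-based donor")
```

In this simplified view the bacmid is stably maintained alongside either ColE1-type plasmid, while the donor and helper share a replicon family and are not expected to be stably co-maintained.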
[0471] This vector is fairly large, having a predicted length of 13,274 bp (D. Esposito, personal communication), comprising a 3,613 bp EcoRI-PstI fragment derived from pBR322 encompassing all of the tetracycline resistance gene, several elements involved in replication, including the rop gene, the bom sequence, the incompatibility RNA, and the origin of replication (oriV), plus the 3′ end of the bla gene. The product of the rop gene is involved in copy number control, and the bom (basis of mobility) sequence is described as the origin of transfer for conjugative mobilization using a conjugative broad host range plasmid, such as RP4. The remaining sequences from the PstI site to the EcoRI site apparently comprise a Tn7 element derived from Proteus mirabilis, including a 177 bp segment from the PstI site to an end of Insertion Sequence 1 (IS1), a 344 bp segment identical to the P. mirabilis glmS gene, Tn7R, the tnsA, B, C, D, and E genes, and two other complete genes (ybgA and rbfB) and one partial gene (ybfA) derived from Tn7.
[0472] While pMON7124 is adequate for many transposition experiments involving screening of transposition events involving bMON14272 and donor plasmids derived from pMON14327 or any of the pFastBac series of vectors, it is unnecessarily large, and several segments can be deleted without affecting the ability of the plasmid to provide transposition proteins in trans in a cell harboring a bacmid and a donor plasmid. One smaller variant deletes the 3′ two-thirds of the tnsE gene, both the ybgA and rbfB genes, and the partial ybfA gene, extending from a PacI site to the EcoRI site, to produce a plasmid designated R982-X01 that is 10,822 bp long and retains the tetracycline resistance and replication genes from pBR322, and all of the tnsA, B, C, and D genes [Mehalko, J. L., Esposito, D. (2016) J. Biotechnol. 238: 1-8].
[0473] Smaller functional variants of pMON7124 and R982-X01 can also be made by deleting all of the tnsE gene (saving ˜393 bp), and sequences extending from one end of the origin of replication near two closely-spaced PpiI sites, across the 3′ end of a disrupted bla gene, a partial IS1 sequence, and most of the glmS-related sequences derived from Proteus mirabilis (saving ˜988 bp), as noted above. Other sequences between the 3′ end of the tetracycline resistance gene and one end of the origin of replication, which include the rop gene and the bom sequence, might also be deleted.
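The approximate effect of these deletions on plasmid size can be worked through numerically. The sketch below assumes that the ˜393 bp and ˜988 bp savings apply to R982-X01 and do not overlap; that assumption, and the function name remaining_size, are not stated in the text.

```python
PMON7124_BP = 13_274  # predicted length of pMON7124
R982_X01_BP = 10_822  # length of the R982-X01 variant


def remaining_size(start_bp, deletions_bp):
    """Plasmid size after removing a list of non-overlapping deletions."""
    return start_bp - sum(deletions_bp)


# Sequence removed in going from pMON7124 to R982-X01.
removed_in_r982 = PMON7124_BP - R982_X01_BP

# Rough estimate after also removing the rest of tnsE (~393 bp) and the
# ori-proximal bla/IS1/glmS-related region (~988 bp) from R982-X01.
estimated_smaller_variant = remaining_size(R982_X01_BP, [393, 988])
```

On these assumptions the first deletion removes 2,452 bp and the further deletions would give a helper of roughly 9.4 kb, before any removal of the rop and bom region.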
[0474] A very small tetracycline resistant helper plasmid can be constructed in several steps from small high copy number cloning vectors provided by Twist Biosciences, including those that confer resistance to chloramphenicol, ampicillin, or kanamycin, by inserting a gene encoding a product conferring resistance to tetracycline, deleting sequences conferring resistance to other antibiotics, and then inserting sequences comprising a promoter operably linked to the tnsA, B, C, and D genes.
[0475] Smaller variants can also be prepared, comprising sequences encoding fewer transposition genes, such as the tnsA, B, and C genes, with the tnsD gene located on a target vector to facilitate studies designed to identify variants of the tnsD gene product that have an altered ability to bind to specific glmS-like sequences, such as those derived from glmS homologues found in human, yeast or other prokaryotic or eukaryotic chromosomes. A vector comprising a novel gene fusion comprising a sequence for a selectable marker fused to an attTn7-like target, and a tnsD gene comprising one or more mutagenized segments, can be used in directed evolution experiments in the presence of a helper vector encoding the tnsA, B, and C genes, and a donor plasmid comprising a mini-Tn7 element and one or more genes of interest. If the tnsD gene on the target vector is altered by mutagenesis, then composite variant target vectors that result from transposition into the target site, restoring the ability of the target vector to confer resistance to chloramphenicol or kanamycin as noted above, can be recovered by isolating plasmid DNA samples, retransforming the composite vector into a plasmid-free strain while selecting for the target but not the helper or donor vectors, and analyzing its sequence to determine the nature of the mutation(s) in the tnsD gene. Several rounds of mutagenesis and direct selection may be needed to alter the specificity of the tnsD gene product so that it efficiently binds to specific target sequences that are similar but not identical to the E. coli glmS gene.
[0476] Modified target vectors comprising variant tnsC genes can also be constructed, to identify mutants that are similar to the “Gain of Function” mutations identified in earlier studies [Stellwagen, A. E and Craig, N. L. (1997) Genetics 145(3): 573-85]. The tnsD and tnsE genes were not required, and wild-type tnsA and B genes in the presence of an altered tnsC gene (tnsC*) facilitated random transposition of a mini-Tn7 element into other vectors or the chromosome of the host cell. Methods to identify variants of tnsC will differ from those used to identify variants of tnsD, by screening for phenotypic changes that occur as a result of the random transposition into a gene carried on the target vector, perhaps a large gene allowing counterselection or screening of transposition events if an insertion disrupts expression of its gene product. Examples include disruption of the lacZ, cat, NPT-II, bla, or tet genes, as noted in earlier sections of this application.
[0477] Variant synthetic forms of Tn7 that can randomly transpose at very high levels may be preferred for particular applications involved in modifying prokaryotic or eukaryotic cells that result in insertions without a plasmid or viral vector backbone, such as cell and gene therapy applications requiring insertion of one or more cargo DNA segments comprising one or several genes of interest.
Example 13—General Principles Concerning Design of Modular Vectors Comprising One or More Transposon Traps
[0478] When key components of a bacterial plasmid or a viral or non-viral shuttle vector will be reused in other variant vectors, it is often useful to design the vectors so that DNA segments comprising functionally-distinct genetic elements are modular, allowing easy methods for their extraction and insertion into other vectors, or easy methods for the insertion of other DNA segments into one or more sites on a vector that are adjacent to the 5′ end or the 3′ end of a segment of interest, in a preferred orientation, or in either orientation.
[0479] Traditionally, simpler methods rely on use of one or more restriction enzymes to digest vectors comprising a DNA segment of interest, to create a mixture of DNA fragments, which may be separated on agarose or acrylamide gels and purified, and which are then ligated into a vector digested with one or more enzymes that produce compatible 5′, 3′, or blunt ends, followed by recovery of the new variant vector comprising the desired insert.
[0480] Other methods can also be used, including amplification of the desired segment using primers that flank the desired segment in the presence of a thermostable DNA polymerase (e.g., polymerase chain reaction, PCR) and comparable methods, to produce linear DNA segments that may be ligated directly into cloning vectors, or treated with other enzymes to add additional nucleotides at either end to facilitate ligation to a compatible vector, or digested with restriction enzymes that have recognition sites in the primer sequences flanking the original ends of the insert.
[0481] It may be desirable to build larger modular vectors from a series of smaller modular vectors in a sequential fashion, using functional genetic elements flanked by synthetic linkers comprising recognition sites for restriction enzymes that cut infrequently or not at all within an unmodified parental vector, or a virus that will be engineered to include a replicon, such as a shuttle vector, that allows it to be propagated in two types of host cells. Compatible sets of synthetic linkers, such as those described above in Example 9, may be used to flank DNA segments comprising functionally distinct genetic elements in smaller cloning vectors, which may be used as the source of an insert or a vector in a series of steps to assemble a final product vector.
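Choosing linker enzymes that do not cut a parental vector amounts to scanning its sequence for each recognition site. The following sketch assumes exact, palindromic 8-bp sites (so one strand suffices) and uses a made-up parental sequence; the function names and the toy sequence are hypothetical.

```python
# Recognition sequences for several rare-cutting enzymes (all palindromic).
SITES = {
    "MauBI": "CGCGCGCG",
    "AbsI": "CCTCGAGG",
    "SgrDI": "CGTCGACG",
    "AscI": "GGCGCGCC",
    "PacI": "TTAATTAA",
}


def count_sites(seq, site, circular=True):
    """Count occurrences of a site, including overlapping ones.

    For circular molecules the start of the sequence is appended to the end
    so that sites spanning the origin are also counted.
    """
    seq = seq.upper()
    if circular:
        seq = seq + seq[:len(site) - 1]
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def non_cutters(seq, sites=SITES, circular=True):
    """Names of enzymes with no recognition site in the sequence."""
    return sorted(n for n, s in sites.items() if count_sites(seq, s, circular) == 0)


# Toy parental sequence (hypothetical) containing one AscI and one PacI site.
toy_parent = "ATGGCGCGCCTTAACCGGTTAATTAAGC"
usable_in_linkers = non_cutters(toy_parent)
```

Enzymes returned by non_cutters would be candidates for flanking linkers in that parental vector; a real design would also need to consider methylation sensitivity and sites introduced by the inserted elements.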
[0482] The baculovirus shuttle vector (bacmid) bMON14272 comprises a large ˜8 kb DNA segment containing several smaller functionally-distinct genetic elements, including a segment encoding a gene which confers resistance to kanamycin in E. coli, a lacZalpha gene comprising a synthetic mini-attTn7 sequence, and mini-F, a stable low copy number replicon derived from the prototype fertility plasmid, F. This large segment is inserted into the non-essential polyhedrin gene in the baculovirus Autographa californica Nuclear Polyhedrosis Virus (AcNPV). Another bacmid, bMON14271, has this large segment inserted in the opposite orientation at the same location in AcNPV. Functionally-equivalent bacmids could have the DNA segment with the kanamycin resistance marker, the mini-attTn7 target sequence, or the bacterial replicon located elsewhere in the viral genome, in the same or opposite orientation, or all together as one large segment, but in a different order or in the same or opposite orientations to each other compared to the order and orientations in bMON14272 and bMON14271.
[0483] If these functionally distinct genetic elements are abbreviated as K, L, and F, they could be assembled as contiguous segments in six possible orders: KLF, KFL, LFK, LKF, FKL, and FLK. The relative orientation of each segment may also be flipped, such that the K element could be in one orientation in the order K(+)LF or the opposite orientation as K(−)LF, and so on. In other cases, the K element could be on a segment that is inserted into the AcNPV genome away from a site where the L and F elements are located, or L separated from K and F, or F separated from K and L, or K, L, and F located at 3 distinct locations in the shuttle vector.
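The contiguous arrangements of the K, L, and F elements can be enumerated directly. This sketch covers only the case where all three elements sit together in one segment, with each element independently in the (+) or (-) orientation; the function name arrangements and the K(+)L(-)F(+) notation are illustrative.

```python
from itertools import permutations, product


def arrangements(elements=("K", "L", "F")):
    """List every order of the elements with every combination of orientations."""
    result = []
    for order in permutations(elements):
        for signs in product("+-", repeat=len(order)):
            result.append("".join(f"{e}({s})" for e, s in zip(order, signs)))
    return result


orders = ["".join(p) for p in permutations("KLF")]  # the six element orders
all_arrangements = arrangements()                    # orders x orientations
```

Three elements give 6 orders and 2 x 2 x 2 orientation combinations, for 48 contiguous arrangements. Inverting a whole segment, as in bMON14271 relative to bMON14272, corresponds to reversing the order and flipping every sign, so the 48 entries pair up under whole-segment inversion.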
[0484] The locations for insertion of functionally distinct genetic elements should be stable, and not prone to loss when the bacterial plasmid, or shuttle vector, are propagated in host cells over time. Inserted segments may be unstable, and prone to deletion by recombining with homologous segments in flanking regions, or somehow toxic to host cells comprising the engineered vector compared to a parental vector.
[0485] Rational designs for inserting drug resistance markers, synthetic target sites, and replicons in shuttle vectors rely heavily on existing knowledge concerning whether other genes in the vector are essential or non-essential for growth under specific growth conditions. For AcNPV, a wide variety of genes have been identified as non-essential, by creating shuttle vectors that propagated in bacteria, that were subjected to mutagenesis and then transformed into cultured insect cells for testing. If testing needs to be carried out in an infected caterpillar, then structural proteins needed to produce the occluded form would also be considered essential, even though they are not essential for production of the budded virus that infects cells within a caterpillar, and in cultured cells. A non-essential gene, or clusters of several contiguous non-essential genes may be good locations for inserting a drug resistance marker, synthetic target site, or a replicon in a redesigned shuttle vector.
[0486] Semi-rational or random methods for inserting drug resistance markers, synthetic target sites, and other replicons can also be used to introduce genetic elements into prokaryotic or eukaryotic viral or non-viral shuttle vectors. Simpler methods may rely on linearization of a circular vector, ligation of a DNA segment comprising the genetic element of interest, and transformation of the ligated product into bacteria or eukaryotic host cells for propagation and analysis. It may be desirable in some cases, though, to use a transposon that can randomly insert its cargo in another vector or a bacterial chromosome, such as variant forms of Tn5, in vitro using purified proteins, or in cells harboring vectors that encode a modified transposase [Reznikoff, W. S. (2008) Ann. Rev. Genetics 42(1): 269-286].
Example 14—Design and Assembly of Synthetic Tn7-Like Donor/Helper/Target Vector Systems Based on Transposable Elements Observed in Genomic Islands
[0487] A wide variety of site-specific bacterial transposons have been observed in epidemiological studies and bioinformatics studies, where Tn7-like elements that confer resistance to many antibiotics, or carry genes involved in reduction of heavy metals (including gold, silver, mercury, cobalt, and bismuth) are clustered in specific locations, called genomic islands, within a host cell [Peters (2017)]. Many of these elements often comprise genes that are highly similar to the Tn7 tnsABC genes, and a homologue of tnsD called tniQ, that facilitates targeting into specific target sites, that are not similar to the sequence at the 3′ end of the essential and highly conserved E. coli glmS gene. Some of the targets for Tn7-like elements are within non-essential genes. TnAbaR1, for example, inserts in the middle of the comM-like genes in many kinds of bacteria. Representative examples from several other kinds of Tn7-like elements and their target sites are summarized in the Table below.
TABLE-US-00047 TABLE 27 Targets for Tn7 and Tn7-like Genetic Elements Associated with Specific Sites or Genomic Islands
Transposon | Host Cell | Target Gene | Essential? | Gene Function | Donor/Helper/Target Vector System? | Reference
Tn7 | Escherichia coli | glmS | Yes | Glutamine-fructose-6-phosphate aminotransferase (isomerizing), with identical or highly similar homologues in a wide variety of prokaryotic and eukaryotic cells | Yes | Craig (1996); Peters (2014)
TnAbaR1 | Acinetobacter baumannii | comM | No | Hexameric helicase capable of binding ssDNA and dsDNA in the presence of ATP, which appears to be a Mg chelatase-like protein comprising an ATPase domain | No | Nero (2017)
Tn6022 | Escherichia coli | yifB | No? | Mg chelatase subunit D/I family having ATP-dependent peptidase activity and a member of the comM subfamily | No | Peters (2017)
Tn6230 | | yhiN | No | Putative FAD/NAD(P) binding oxidoreductase | No | Peters (2017)
#2 | | yciA | ? | Acyl-CoA thioester hydrolase | No | Peters (2017)
#141 | | IMPDH | ? | Inosine-5′-monophosphate dehydrogenase | No | Peters (2017)
#298 | | SRP-RNA | ? | Signal recognition particle RNA | No | Peters (2017)
[0488] Several genes that are commonly associated with genomic islands targeted by Tn7-like elements have not been extensively characterized (comM, yifB, yhiN, yciA, IMPDH, and SRP-RNA). Sequences flanking and including sites for insertion in these genes, the left and right arms of these elements, and their transposase genes, can be characterized and developed into comparable donor/helper/target vector systems comprising synthetic transposons for use in a wide variety of applications requiring efficient and reproducible methods for site-specific or random insertions of one or more DNA segments into genetic material within a host cell.
[0489] A mini-TnAbaR1 donor vector is constructed by analyzing the sequences of the entire element, and inserting synthetic DNA sequences into a cloning vector such as pTwist-Amp-HC, that comprise the left and right arms of the Tn7-like element plus short sequences flanking it, with a central core cargo region comprising a DNA segment containing one or more genes of interest and/or optionally one or more multiple cloning sites (MCSs) to facilitate insertion of genetic elements derived from other vectors.
[0490] A helper vector for the mini-TnAbaR1 donor vector is constructed by cloning the transposase genes into a vector that has a replicon similar to that of the donor vector and that encodes a gene conferring resistance to a different antibiotic, such as tetracycline, comparable to the pBR322-based pMON7124 vector used in the baculovirus shuttle vector system.
[0491] A target vector comprising an attachment site for TnAbaR1 is constructed by synthesizing and cloning segments of the comM gene into a vector such as pTwist-Chlor-MC or pTwist-Kan-MC comprising a gene fusion allowing screening or selection of transposition events, such as those noted above, in Examples 1-7 of the application. One commonly observed insertion site for TnAbaR1 is near the center of the comM gene, such that the ends of the transposon are duplicated as 5-bp sequences after transposition. A 150 bp sequence spanning the insertion site is synthesized and cloned in frame with sequences near the 5′ end of the lacZalpha gene, in a fashion that is similar to the sequences used in the bMON14272 vector disclosed in Example 1, or in smaller versions disclosed in Example 3 of this application.
[0492] Transposition experiments can be carried out using donor/helper/target vectors comprising sequences derived from TnAbaR1, and analyzed by comparing the phenotype of bacteria harboring the vectors before and after transposition on agar plates containing antibiotics or chromogenic substrates, and analyzing the structure of target vectors before transposition and a composite vector after transposition.
[0493] The length of the sequence spanning the insertion site can be minimized in smaller variant forms of the target vector, and this segment can also be moved into gene fusions derived from truncated cat or NPT-II genes, to generate vectors that can be used in experiments where direct selection of transposition events by synthetic TnAbaR1 elements is allowed.
[0494] Comparable donor/helper/target vectors can be designed and assembled from other Tn7-like elements, including those noted in the table above, such as Tn6022, Tn6230, #2, #141, and #298 that target the yifB, yhiN, yciA, IMPDH, and SRP-RNA genes, respectively.
Example 15—Design and Combinatorial Assembly of Ordered Arrays of Two or More Synthetic Attachment Sites for Site-Specific Transposons Allowing Creation of Ordered Composite Arrays Comprising Transposons Inserted into Stable Locations on Modular Prokaryotic and Eukaryotic Vectors
[0495] A target vector comprising a nucleotide sequence comprising an attachment site for a site-specific transposon can be combined with sequences derived from a second target vector to facilitate the construction of a target vector comprising an array of two or more attachment sites by any of a variety of gene assembly methods, including those characterized as being encompassed by traditional sequential methods of cloning, BioBrick assembly, Three Antibiotic (3A) Assembly, Gibson Assembly, In-Fusion™ PCR Cloning, Golden Gate Assembly, Iterative Capped Assembly, TOPO-TA Cloning, and Overlap Extension PCR methods, which are all described above, in the section entitled “Background of the Invention”.
[0496] A bacterial cell harboring a target vector comprising two distinct attachment sites may be used in transposition experiments facilitated by a helper vector and a donor vector to allow for the selection or screening of transposition events, depending on the nature of the nucleotide sequences comprising gene fusions where one portion encodes a polypeptide that confers a selectable or screenable phenotype to a cell and another portion comprises a sequence derived from the attachment site for the transposon and optionally encodes polypeptide sequences fused within or to one or two portions of the polypeptide that confers the selectable or screenable phenotype to the cell.
[0497] For example, a target vector may comprise a nucleotide sequence encoding a lacZalpha polypeptide that also comprises sequences derived from the E. coli glmS gene fused in frame in the same or opposite orientation as the 3′ end of the natural glmS gene, provided that there are no stop codons in the same reading frame as the lacZalpha polypeptide, such as one of the sequences disclosed in Example 1 of the application, noted above, where a synthetic EcoRI-SalI sequence comprising the attachment site is inserted in frame between codons 5 and 7 of the lacZalpha polypeptide. A second target sequence may be derived from a gene fusion encoding an inactive cat gene fused to a mini-attTn7 sequence, such as one of the sequences disclosed in Example 2, that can be included in a contiguous array of two or more target sites, or in a separate, distinct location on the target vector between or among other key genetic elements, such as a drug resistance marker and a replicon sequence.
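The requirement that an attachment-site segment carry no stop codons in the reading frame of the lacZalpha polypeptide, in whichever orientation it is fused, can be checked computationally before synthesis. The following is a minimal sketch in Python under stated assumptions: the function names and the short test sequence are hypothetical and are not sequences from this application, and the segment is assumed to be supplied so that its first base is the first base of a codon in the fusion (frame 0).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_frame_stops(seq, frame=0):
    """Return 0-based positions of stop codons read in the given frame (0, 1, or 2)."""
    seq = seq.upper()
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


def usable_orientations(insert, frame=0):
    """Report, for each orientation, whether the insert is free of in-frame stop codons."""
    return {
        "forward": not in_frame_stops(insert, frame),
        "reverse": not in_frame_stops(reverse_complement(insert), frame),
    }


# Hypothetical 12-nt segment: open in the forward orientation, but its
# reverse complement (ACC TGA GGC CAT) contains an in-frame TGA.
print(usable_orientations("ATGGCCTCAGGT"))
```

A segment that fails in one orientation may still be usable in the other, or in a different frame selected by adding one or two nucleotides at the junction.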
[0498] Transposition experiments can then be carried out to select or screen for a first insertion into the first target site, or into the second target site, followed by a second experiment to select or screen for a second insertion into the remaining open target site, and confirming by phenotype and by structural analysis that the “composite” array comprises two transposons inserted into two sites in an orientation-specific manner, and that the entire array is stable, at least in a recombination-deficient host cell strain, such as a recA-minus E. coli strain. Direct repeats of sequences derived from the transposon, or from the target sequences, may contribute to instability of the array in host cell strains that promote or allow homologous recombination to occur, particularly if the growth rate of cells harboring deletion variants of the composite target vector is greater than the growth rate of cells harboring a full-length version of the composite target vector.
[0499] Tn7 and several but not all Tn7-like genetic elements have a property called “transpositional target immunity” where only one Tn7 element is inserted at a target site, and subsequent insertions by the same element at the target site do not occur [Stellwagen, A. E. and Craig, N. L. (1997) Genetics 145(3): 573-85]. Two proteins, TnsB and TnsC, bind to the ends of Tn7 on a donor segment and to target sequences comprising the ends of Tn7, preventing Tn7 elements from inserting adjacent to themselves in the chromosome or in vectors comprising the Tn7 attachment site.
Example 16—Directed Evolution of Site-Specific Transposons to Create Synthetic Transposons Having Enhanced Transposition Frequency or Altered Site Specificity
[0504] Methods for the directed evolution of a gene typically rely on three steps: (1) subjecting a gene to iterative rounds of mutagenesis creating a library of variants; (2) selection and isolation of cells harboring vectors comprising genes expressing variant products having the desired function or phenotype, and (3) amplifying vectors comprising sequences encoding the best variants for use in subsequent rounds of mutagenesis and selection. These steps can be performed in vivo, or in vitro, to recover variants that may be structurally and functionally different than those obtained by rationally designing and testing the phenotypes of cells harboring one or more modified genes.
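The three steps described above (mutagenesis, selection, amplification) can be illustrated with a toy in silico loop. This is only a sketch under loudly stated assumptions: the `fitness` function is a hypothetical stand-in for a selectable phenotype (it simply counts positions matching an arbitrary target string), the parameter values are arbitrary, and nothing here models a real transposase gene or a method disclosed in this application.

```python
import random

ALPHABET = "ACGT"


def mutagenize(seq, rate, rng):
    """Step 1: substitute each base with probability `rate`."""
    return "".join(
        rng.choice([b for b in ALPHABET if b != base]) if rng.random() < rate else base
        for base in seq
    )


def fitness(seq, target):
    """Toy phenotype score: number of positions matching a hypothetical optimum."""
    return sum(a == b for a, b in zip(seq, target))


def evolve(parent, target, rounds=10, library_size=200, keep=5, rate=0.05, seed=0):
    """Run iterative rounds of mutagenesis, selection, and amplification."""
    rng = random.Random(seed)
    pool = [parent]
    history = []
    for _ in range(rounds):
        # Step 1: mutagenesis creates a library of variants (parents are retained).
        library = pool + [mutagenize(rng.choice(pool), rate, rng) for _ in range(library_size)]
        # Step 2: selection ranks variants by the phenotype score.
        library.sort(key=lambda s: fitness(s, target), reverse=True)
        # Step 3: the best variants are carried forward as templates for the next round.
        pool = library[:keep]
        history.append(fitness(pool[0], target))
    return pool[0], history


best, history = evolve("A" * 30, "ACGT" * 7 + "AC")
print(best, history)
```

Because the parental sequences are retained in each library, the best score in this sketch cannot decrease from one round to the next; an experimental system offers no such guarantee.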
[0505] The ability to directly select for transposition events, regardless of the nature or size of the cargo sequences carried on a mini-transposon, allows the use of methods for the directed evolution of components of a donor/helper/target vector-based transposition system, to alter the efficiency of transposition (increasing the observed level of transposition in the presence of one or more variant products of the transposase genes, compared to results obtained with gene products encoded by unaltered, wild-type or parental genes), or to alter the specificity of transposition (allowing the donor segment to insert at one or more specific or even random sites, compared to an assay system where all of the key components are identical or functionally similar to their wild-type counterparts).
[0506] A variety of components in a Tn7-based transposition system are suitable as targets for mutagenesis that can be carried out in the course of a series of directed evolution experiments to alter the efficiency or specificity of transposition events. These components are noted in the following table.
TABLE-US-00048 Table 28 Strategies to Alter the Site-Specificity or Efficiency of Transposition of Synthetic Tn7-Like Elements*

| | TnsA | TnsB | TnsC | TnsD | TnsE | Tn7L and Tn7R |
| --- | --- | --- | --- | --- | --- | --- |
| Size (aa or bp) | 273 aa | 702 aa | 555 aa | 508 aa | 538 aa | ~150 and ~90 bp |
| Functions | Binds to and cuts 5-bp from the 5′ ends of Tn7L and Tn7R, and binds to the product of the tnsB gene. | Binds to and cuts at the 3′ ends of Tn7L and Tn7R, allowing them to be paired in a process mediated by the product of the tnsA gene. | Interacts with the product of the tnsD gene bound to structural features of target DNA sequences, and the DNA-bound complex of tnsA and tnsB gene products, with a central domain involved with binding and hydrolysis of ATP and target immunity, preventing transposition into segments of DNA comprising Tn7. | Binds to attTn7 at the 3′ end of the E. coli glmS gene, and insertion occurs 24 bp beyond the 3′ end, producing a structure with 5-bp duplications at Tn7L and Tn7R. | Binding to 3′ recessed ends of a replicating DNA structure and a sliding clamp processivity factor (β-clamp protein), encoded by the host dnaN gene. | Tn7L has an 8-bp DR with a 5′ TGT, and Tn7R has an 8-bp DR with a 3′ ACA; Tn7L typically ~150 bp and 3 TnsB binding sites, and Tn7R typically 90 bp with 4 overlapping tnsB binding sites; Both ends are bound or cleaved by the products of the tnsA and B genes; Promoter driving expression of all of the tnsABCDE genes is near the 3′ end of Tn7R. |
| Key Role in Targeting | | | Random | 3′ end of the E. coli glmS gene and highly conserved homologues in other bacteria and many eukaryotic cells | Random sequences near the replication fork in conjugal plasmids | |
| Key Variants | | | “Gain of Function” TnsC* mutants identified by Stellwagen and Craig (1997) transpose randomly in the presence of TnsA, TnsB, and TnsC*. | | | Lengths of Tn7L and Tn7R can be minimized, and some nt residues can be altered without affecting ability of the donor segment to transpose. |
| Opportunities to exploit through directed evolution to produce synthetic transposons | | | New TnsC “Gain of Function” variants may have higher efficiencies of random transposition of Tn7 variants in prokaryotic and eukaryotic cells. | Variants of TnsD selected through directed evolution methods should allow transposition to altered target sites, including wild-type and variant homologues of the E. coli glmS gene in other prokaryotic and eukaryotic cells. | | These and other types of alterations may allow transposition of Tn7-like elements with altered sequences within or adjacent to their 5′ and 3′ ends for specific applications. |

*[Portions adapted from general reviews on Tn7 by Craig (1997), Peters (2014), and this work (2020)].
[0507] The ability to directly select for transposition events based on the use of novel gene fusions, such as the cat-attTn7 or NPT-II-attTn7 sequences disclosed in Examples 2 and 4, plus others noted above, allows for the selection and recovery of vectors comprising sequences encoding variants of tnsD that should have an altered specificity compared to the wild-type attTn7 target sequence near the 3′ end of the E. coli glmS gene.
[0508] In a traditional Tn7-based donor/helper/target vector system, all of the genes encoding transposases, tnsABCD, are located on a helper vector, such as pMON7124, that is on a high copy number bacterial replicon, confers resistance to tetracycline, and is incompatible with the donor vector, such as pFastBac1. The donor vector is on a high copy number replicon and confers resistance to ampicillin from a gene located on the backbone of the vector, and resistance to gentamicin from a gene located within the mini-Tn7 element, along with other sequences allowing insertion of a gene of interest downstream from an operably-linked polyhedrin promoter that is functional in baculovirus-infected host cells. Transposition occurs when the donor plasmid is introduced into an E. coli cell harboring the target vector, bMON14272, and the helper vector, and transposition events are identified by screening for white colonies in a background of blue colonies on indicator plates comprising the chromogenic substrate X-gal.
[0509] In Examples 2 and 4, the target vector comprises a gene fusion, where the 5′ portion of the chimeric gene encodes an inactivated drug resistance gene, linked to a mini-attTn7 sequence that partially overlaps with codons near the 3′ end of the gene, such as those encoding a cysteine residue for the cat gene, or a proline residue for the NPT-II gene. Transposition of a mini-Tn7 element from the donor vector, in the presence of a helper vector, should occur, and all of the vectors that are recovered when chloramphenicol or kanamycin is used in the selection plates, in addition to the antibiotic to which the gene on the backbone of the vector confers resistance, should be composite vectors, each having an insertion of the mini-Tn7 element into the target site in the novel gene fusion sequence.
[0510] In one of many possible schemes for performing directed evolution of transposase genes, the tnsD gene is moved from the helper vector to the target vector, and placed under the control of an inducible promoter. The target vector comprising a selectable gene fusion (such as those disclosed in Examples 2 and 4) is altered to comprise a desired sequence, such as a human or yeast homologue of the E. coli glmS attachment site, and the tnsD gene is then mutagenized by a random or a site-specific method, so that all or parts of its coding sequences are altered, primarily by single or multiple nucleotide base substitutions, and then transformed into a host cell comprising the helper vector comprising the tnsABC genes and a donor vector. Cells harboring the modified target vector can also be co-transformed with a helper vector comprising the tnsABC genes and a donor vector. The transformed cells are plated on medium containing the antibiotic to which resistance is restored after transposition of the mini-transposon into the gene fusion, and cells comprising composite vectors are characterized by their cellular phenotype, and the vectors characterized by structural analysis, such as DNA sequencing across the ends of the transposon, the sizes of amplified fragments, or the sizes of fragments cleaved by one or more restriction enzymes.
[0511] Since the target vector also contains the mutagenized tnsD gene, selecting for restoration of drug resistance should recover bacteria harboring vectors that encode transposase variant gene products that bind to the altered binding site associated with its corresponding insertion site. If the target sequence in the gene fusion is different from that of the wild-type E. coli glmS gene, it should be possible to recover target vectors with one or more altered tnsD genes. The variants can be used in subsequent rounds of directed evolution experiments, to recover variants that allow the mini-Tn7 element to be inserted into human, yeast, or other target sites that are substantially different from the wild-type E. coli glmS gene.
[0512] It should also be possible to recover variants where the altered target sequence does not naturally occur in any prokaryotic or eukaryotic host cell system, which would permit its transfer and use in a wide variety of vector and host cell systems, dramatically transforming many fields of synthetic biology, including those directed to the discovery and development of novel food and drug products, and components of cell and gene therapy vector systems.
[0513] Similar approaches can also be used to mutagenize and recover vectors comprising other altered transposase genes, which transpose more frequently or efficiently into their natural specific target sites (hyper-transposase mutants), much different perhaps than tnsC* variants that have 100× the activity of the wild-type gene, efficiently promoting random transposition of a mini-Tn7 donor element into a vector or into the chromosome of E. coli [Stellwagen, A. E. and Craig, N. L. (1997) Genetics 145(3): 573-85].
[0514] Both approaches can also be combined to build a set of donor/helper/target vectors that increase the level of site-specific transposition events, where the helper vector comprises one or more variant tnsA, B, C, and D genes, that encode products that act on the ends of Tn7 in the donor vector, to facilitate its efficient insertion into a specific sequence on a target vector or target sequence integrated into the chromosome of a host cell.
Example 17—Design and Assembly of Synthetic Site-Specific Bacterial Transposons that Work Efficiently in Eukaryotic Cells
[0517] Major features of the design and assembly of novel vectors and methods for the selection or screening of transposition events carried out with vectors propagated in prokaryotic cells, can be carried over into the development of site-specific transposition systems that work well in eukaryotic cells, where the target sequence is propagated in a shuttle vector, or is integrated into a host cell chromosome that would provide great flexibility for use in many types of cell engineering applications.
[0518] Compatible sets of vectors are designed and assembled to take into account factors relating to expression of heterologous genes of interest in different types of host cell systems, including (a) construction of new helper vectors comprising 3-4 codon-optimized genes encoding transposases operably-linked to eukaryotic promoters and termination signals that function in the desired host cell; (b) isolation and characterization of mutant transposase genes that increase overall levels of transposition or alter the specificity towards particular target sites; and (c) demonstration that donor, helper, and target vectors lead to the introduction of a single donor transposon at a specific target site at a stable location on a vector or the host chromosome, or in other circumstances, multiple random insertions into the chromosome, without the potential for or evidence of remobilization.
[0519] Helper vectors that encode transposase genes optimized for expression in mammalian cells are constructed by cloning codon-optimized variants of the tnsABCD genes, including any tnsD variants that target the E. coli glmS sequence or the human homologue of this sequence, and placing them under the control of a strong, perhaps inducible, promoter that functions in mammalian cells. Human CMV and HSV thymidine kinase promoters are commonly used now for a wide variety of applications. A mammalian cell comprising the target vector, or an engineered cell comprising the target sequences integrated into its genome, is transformed with the variant helper vector and a donor vector, selecting for the resistance conferred by the gene that is reactivated by transposition in the synthetic attTn7 gene fusion.
[0520] Synthetic site-specific transposons that work well in plant cells can be based on many of the vectors derived from the Ti plasmid, and shuttle vectors comprising major parts of the chloroplast genome. Helper vectors comprising transposase genes operably-linked to bacterial or plant host cell promoters are designed and assembled, using the approaches noted above, and used with donor and target shuttle vectors modified appropriately to reflect codon preferences and regulatory signals that are known to function in the host cell. Transposition experiments are carried out with appropriately modified donor and helper vectors, followed by analysis of the phenotype of bacteria harboring the composite vectors and the structures of the composite vectors. The composite vectors are then transferred to plant cells or tissues, and expression of the products encoded in the donor cassette is evaluated. Comparable systems that work well for vectors propagated in Agrobacterium, Xanthomonas, or other phytobacteria can also be developed.
[0521] Similar approaches can be used to develop site-specific transposons based on Tn7-like elements that work well in non-enteric bacteria or fungi (unicellular yeast or filamentous fungi). Target sequences that work well in other host cell systems can be moved into shuttle vectors propagated in these types of host cells, or directly into the chromosome of a host cell. Helper vectors comprising codon-optimized transposase genes that facilitate insertion of a mini-Tn7-like transposon into the target site are used, including those that encode variants that may target a wild-type or variant form of an attachment sequence within the host cell. A variant form of a helper vector developed through directed evolution techniques can be used to target the yeast homologue of the E. coli glmS gene, allowing, perhaps, targeted insertions of DNA segments into a single, safe location within a yeast cell.
[0522] Eukaryotic gene delivery systems based on synthetic site-specific prokaryotic transposons can be a powerful tool to transform many fields of synthetic biology, leading to the discovery and development of many novel food and drug products, and efficient, cost-effective methods for the production of many other products in cultured cells and transgenic organisms.
Example 18—Design of Modular Target Sites to Assay the Efficiency and Fidelity of Gene Editing Events, Including One or More Combinations of Nucleotide Substitution, Insertion, and Deletion Events
[0523] There are two types of DNA substitutions. Transitions involve substitutions of one purine, comprising two aromatic rings, for the other (A↔G), or substitutions of one pyrimidine, comprising one aromatic ring, for the other (C↔T). Transversions involve substitutions of a structure comprising one ring with one comprising two rings, or substitutions of a structure comprising two rings with one comprising one ring (C↔A, C↔G, T↔A, T↔G). There are four types of transition events: A to G, G to A, C to T, and T to C. There are eight types of transversion events: C to A, A to C, C to G, G to C, T to A, A to T, T to G, and G to T.
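The classification of the twelve possible single-nucleotide substitutions into four transitions and eight transversions follows directly from the purine/pyrimidine rule, and can be confirmed with a short enumeration. The sketch below is illustrative only; the function and variable names are hypothetical.

```python
from itertools import permutations

PURINES = {"A", "G"}       # two-ring bases
PYRIMIDINES = {"C", "T"}   # one-ring bases


def substitution_type(ref, alt):
    """Classify a single-base change as a transition or a transversion."""
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    same_ring_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_ring_class else "transversion"


# Enumerate all 12 ordered (ref, alt) pairs and group them by type.
events = {"transition": [], "transversion": []}
for ref, alt in permutations("ACGT", 2):
    events[substitution_type(ref, alt)].append(f"{ref}>{alt}")

print(events["transition"])          # the four transition events
print(len(events["transversion"]))   # the number of transversion events
```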
[0524] Small or large insertions or deletions can alter the reading frame of a sequence encoding a protein or alter the structure of a sequence in a critical domain of an encoded polypeptide or complementary RNA molecule, generally leading to the expression of functionally impaired or inactive molecules.
[0525] Novel methods to assay the efficiency and selectivity of gene editing systems can be designed that are based on methods that alter the level or functional activity of a product encoded by a gene. Bacterial plasmids and shuttle vectors comprising at least one of the novel gene fusions noted in earlier examples of this application can be used to facilitate the design of assays to test not only the insertion of transposons at a specific target site, but also the efficiency and specificity of endonuclease-based complexes (e.g., CRISPR-Cas, homing enzymes, and chimeric molecules comprising recognition and editing functions) designed to edit nucleotide sequences carried on replicons or integrated into a host chromosome.
[0526] In Example 2, novel gene fusions are disclosed, where one or more TAA, TGA, or TAG stop codons are inserted upstream from the 3′ end of the cat gene encoding chloramphenicol acetyltransferase (CAT protein). Transposition of a mini-Tn7 element from a donor plasmid into a synthetic mini-attTn7 sequence that is designed to have its insertion site (−2 to +2) overlap with the stop codon will alter the reading frame of the truncated gene after transposition to generate a sequence encoding a CAT fusion protein that is extended, and active, compared to the inactive truncated CAT protein. The same vector can be used as a target for CRISPR- and other nuclease-based complexes to test their effectiveness in making alterations at the one or more stop codons, allowing expression of a functional CAT protein and restoring resistance to chloramphenicol in a cell harboring the vector.
[0527] A variety of nucleotide substitutions and insertions or deletions can be detected with this system, where one or more TAA, TGA, and TAG stop codons are introduced in the middle of or near the 3′ end of a gene encoding a selectable marker or a reporter molecule.
TABLE-US-00049

| Stop codon | Sense codons reachable by a single nucleotide substitution | Substitution types |
| --- | --- | --- |
| TAA | (A/C/G, not T)AA; T(C/T, not A/G)A; TA(C/T, not A/G) | 1 Transition, 6 Transversions |
| TGA | (A/C/G, not T)GA; T(C/T, not A/G)A; TG(C/T/G, not A) | 2 Transitions, 6 Transversions |
| TAG | (A/C/G, not T)AG; T(C/G/T, not A)G; TA(C/T, not A/G) | 2 Transitions, 6 Transversions |
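The counts of transitions and transversions that convert each stop codon into a sense codon can be derived by enumerating every single-nucleotide substitution and discarding those that yield another stop codon. The following is a minimal sketch assuming the standard genetic code (TAA, TAG, and TGA as the only stop codons); the function names are hypothetical.

```python
PURINES = {"A", "G"}
BASES = "ACGT"
STOP_CODONS = ("TAA", "TGA", "TAG")


def is_transition(ref, alt):
    """True when both bases are purines or both are pyrimidines."""
    return (ref in PURINES) == (alt in PURINES)


def stop_codon_escapes(stop):
    """List (sense_codon, type) for each single-base change that removes the stop."""
    escapes = []
    for pos, ref in enumerate(stop):
        for alt in BASES:
            if alt == ref:
                continue
            codon = stop[:pos] + alt + stop[pos + 1:]
            if codon not in STOP_CODONS:  # substitutions to another stop are excluded
                kind = "transition" if is_transition(ref, alt) else "transversion"
                escapes.append((codon, kind))
    return escapes


def summarize(stop):
    """Return (number of transitions, number of transversions) that yield sense codons."""
    kinds = [kind for _, kind in stop_codon_escapes(stop)]
    return kinds.count("transition"), kinds.count("transversion")


for stop in STOP_CODONS:
    print(stop, summarize(stop), [codon for codon, _ in stop_codon_escapes(stop)])
```

Note that the second position of TAG can be changed to G, since TGG encodes tryptophan, whereas the third position of TAG cannot be changed to A, since TAA is itself a stop codon.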
[0528] These methods apply not only to truncated, disrupted, or extended versions of cat genes, but also to many other types of genes, including NPT-II (conferring resistance to kanamycin), bla (conferring resistance to ampicillin), tet (conferring resistance to tetracycline), and the lacZalpha gene encoding an alpha polypeptide that can bind to and complement an acceptor polypeptide to generate a functional β-galactosidase molecule, which are all disclosed in Examples 1 and 3-7 of this application.
[0529] The effectiveness of gene editing systems can be assayed by detecting the efficiency of converting stop codons in synthetic gene fusions comprising truncated versions of genes encoding a protein conferring resistance to an antibiotic or a reporter molecule. Vectors comprising the gene fusions noted above can be used in assays designed to monitor the efficiency of converting a stop codon in a gene encoding a truncated, inactive enzyme to a codon that allows translation of a normal or extended version of an active enzyme. Vectors based on pACYC184, for example, that comprise a TAA, TGA, or TAG stop codon near the 3′ end of the cat gene encoding an inactive truncated chloramphenicol acetyltransferase (CAT protein), can be used as targets for editing by complexes comprising a nuclease and a targeting protein or guide RNA, such as a CRISPR/Cas9/guide RNA-based complex in vitro, or expressed in vivo, to generate an edited gene encoding a functional CAT protein. The edited products can be transformed into a host cell, selecting for resistance to tetracycline, and the ratio of cells that are resistant to chloramphenicol to those that are resistant to tetracycline is then used to determine the efficiency of the editing process.
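The ratio described above can be computed from colony counts on the two kinds of selection plates. The sketch below is a hedged illustration, not a protocol from this application: the function name is hypothetical, and the optional correction for the fraction of the transformation mixture spread on each plate is an added assumption for cases where the two platings use different dilutions.

```python
def editing_efficiency(cm_colonies, tet_colonies, cm_fraction_plated=1.0, tet_fraction_plated=1.0):
    """Estimate editing efficiency as (chloramphenicol-resistant cells) / (tetracycline-resistant cells).

    Colony counts are scaled by the fraction of the transformation mixture that was
    plated, so that plates spread at different dilutions can be compared.
    """
    if tet_colonies <= 0:
        raise ValueError("at least one tetracycline-resistant colony is required")
    edited = cm_colonies / cm_fraction_plated
    total = tet_colonies / tet_fraction_plated
    return edited / total


# Hypothetical counts: 42 colonies on chloramphenicol + tetracycline plates and
# 2100 colonies on tetracycline-only plates, both plated at the same dilution.
print(editing_efficiency(42, 2100))
```

The same function can be used to compare mutagenized and unaltered editing components, by computing the efficiency for each under identical plating conditions.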
[0530] Mutagenized versions of segments of DNA encoding components of the gene editing complex can be prepared and their effectiveness compared to complexes comprising unaltered components. Genes encoding nucleases, targeting proteins, and guide RNAs can be mutagenized and rapidly identified as being beneficial or not, depending on whether they increase the efficiency of conversion of an inactive truncated enzyme to a normal or extended version of an active enzyme, such as the CAT protein.
[0531] Similar types of assays can also be developed, based on genes encoding truncated or disrupted versions of NPT-II (conferring kanamycin resistance), beta-lactamase (conferring ampicillin resistance), the tetracycline antiporter (conferring resistance to tetracycline), and the lacZalpha polypeptide (which can complement an acceptor polypeptide in a host cell containing the lacZΔM15 gene to generate a functional β-galactosidase protein).
[0532] Assays designed to determine the efficiency of small gene deletions can also be developed, where deletion of the stop codon and one or more additional codons in a truncated or disrupted gene can be performed, allowing expression of an active enzyme.
[0533] Assays can also be designed to detect 1-bp or 2-bp insertions or deletions, by using a target sequence that contains extra nucleotides or is missing nucleotides near a stop codon in a truncated gene, creating a frameshift leading to early termination of translation, and requiring one or more compensating insertions or deletions of nucleotides upstream or downstream from that site to allow expression of an active enzyme.
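The logic of a compensating insertion or deletion can be expressed as simple arithmetic on the reading frame: the frame is restored when the net change in length is a multiple of three. The following sketch uses hypothetical function names and considers only the reading frame; it does not address whether the amino acids encoded between the two sites are tolerated by the enzyme.

```python
def frame_offset(indels):
    """Net reading-frame shift (0, 1, or 2) for signed indel lengths (+ insertion, - deletion)."""
    return sum(indels) % 3


def compensating_edits(existing_shift, max_len=2):
    """Signed indel lengths (up to max_len) that restore the frame after an existing shift."""
    return [
        edit
        for edit in range(-max_len, max_len + 1)
        if edit != 0 and (existing_shift + edit) % 3 == 0
    ]


# A target built with a 1-nt deletion (-1) is restored by a 1-bp insertion or a 2-bp deletion.
print(compensating_edits(-1))
# A target built with a 1-nt insertion (+1) is restored by a 1-bp deletion or a 2-bp insertion.
print(compensating_edits(1))
```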
[0534] It may be desirable in some cases to include the gene of interest being mutagenized on the same vector comprising the truncated, disrupted, or extended target gene. For example, a pACYC184-based vector comprising a cat gene with a stop codon near its 3′ end can also contain the Tn7 tnsD gene, along with a bacterial replicon and a gene conferring resistance to tetracycline. Parts of the segment of DNA encoding the tnsD gene can be altered by mutagenesis, such as by inserting a synthetic oligonucleotide containing one or more substitutions compared to the wild-type sequence, and the altered plasmid transformed into a cell comprising a helper plasmid (providing the products of the tnsA, B, and C genes) and a plasmid comprising a mini-Tn7 donor element. The cells can be grown on a series of plates containing tetracycline and different concentrations of chloramphenicol. Cells that are resistant to chloramphenicol should contain a transposon inserted into the mini-attTn7 target site downstream from the altered cat gene, if the product of the tnsD gene is functional. Direct selection for colonies that are resistant to chloramphenicol under these conditions should allow the analysis of genes encoding products involved in transposition, including the left and right arms of the transposon and the ability of the product of the tnsD gene to bind to the target site and bind to one or more of the products of the tnsA, B, and C genes that direct insertion of the mini-transposon into its specific target site. Similar approaches can be used to mutagenize and test the effectiveness of one or more altered tnsA, B, and C genes carried on the altered target plasmid.
[0535] Vectors designed to test the efficiency and specificity of other types of gene editing complexes do not need to include mini-attTn7 based sequences located within or flanking the target genes, simplifying the design of the test vectors to some extent. CRISPR-Cas-based complexes, for example, can be tested using vectors encoding disrupted or truncated cat, NPT-II, bla, tet or lacZalpha genes, or almost any other type of gene encoding a selectable marker or reporter molecule. Vectors comprising a gene encoding an altered Cas protein, and the truncated or altered target site, can be used in a program of directed evolution to select for genes encoding products that have one or more improved activities, such as ability to recognize the target site, with lower levels of off-target nucleotide substitution, insertion, or deletion activities.
Statement Regarding Specific Aspects, Various Modifications, and Alternatives, are Meant to be Illustrative and not Limiting as to the Scope of the Invention
[0536] While specific aspects of the invention have been described in detail, it will be appreciated by those skilled in the art that various modifications and alternatives to those details could be developed in light of the overall teachings of the disclosure. Accordingly, the particular arrangements disclosed are meant to be illustrative only, and not limiting as to the scope of the invention, which is to be given the full breadth of the appended claims, and any equivalent, thereof.
[0537] It is recognized that a number of variations can be made to this invention as it is currently described but which do not depart from the scope and spirit of the invention without compromising any of its advantages. These include substitution of different genetic elements (e.g., drug resistance markers, transposable elements, promoters, heterologous genes, and/or replicons, etc.) on the donor plasmid, the helper plasmid, or the shuttle vector, particularly for improving the efficiency of transposition in E. coli or for optimizing the expression of the heterologous gene in the host cell. The helper functions or the donor cassette might also be moved to the attTn7 on the chromosome to improve the efficiency of transposition, by reducing the number of open attTn7 sites in a cell which compete as target sites for transposition in a cell harboring a shuttle vector containing an attTn7 site.
[0538] This invention is also directed to any substitution of analogous components. This includes, but is not restricted to: construction of bacterial-eukaryotic cell shuttle vectors using different eukaryotic viruses; use of bacteria other than E. coli as a host; use of replicons other than those specified to direct replication of the shuttle vector, the helper vector encoding one or more transposition genes, or the donor vector comprising the left and right arms of a transposon, which together flank a cargo DNA segment comprising one or more sequences of interest; use of selectable or differentiable genetic markers other than those specified; use of site-specific recombination elements other than those specified; and use of genetic elements for expression in eukaryotic cells other than those specified. It is intended that the scope of the present invention be determined by reference to the appended claims.
BIBLIOGRAPHY
Statement Regarding Incorporation by Reference of Journal Articles and Patent Documents
[0539] All references, patents, or applications cited herein are incorporated by reference in their entirety, as if written herein.
PATENT DOCUMENTS
[0540] 1. U.S. Pat. No. 5,348,886, issued Sep. 20, 1994, expired Sep. 20, 2012, assigned to Monsanto Company.
JOURNAL ARTICLES
[0541] 1. Adrian W. Briggs, Xavier Rios, Raj Chari, Luhan Yang, Feng Zhang, Prashant Mali and George M. Church (2012) Iterative capped assembly: rapid and scalable synthesis of repeat-module DNA such as TAL effectors from individual monomers. Nucleic Acids Research 40(15): e117. doi:10.1093/nar/gks624.
[0542] 2. Anderson, D., Harris, R., Polayes, D., Ciccarone, V., Donahue, R., Gerard, G., and Jessee, J. (1996) Rapid Generation of Recombinant Baculoviruses and Expression of Foreign Genes Using the Bac-To-Bac® Baculovirus Expression System. Focus 17: 53-58.
[0543] 3. Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A., and Struhl, K. (1994) Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Interscience, New York.
[0544] 4. Ausubel, F. M., R. Brent, R. E. Kingston, D. D. Moore, J. G. Seidman, J. A. Smith, K. Struhl, P. Wang-Iverson, and S. G. Bonitz (ed.) (1989) Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology, p. 1-387. Greene Publishing Associates and Wiley-Interscience, New York.
[0545] 5. Axe, D. D. (2000) Extreme functional sensitivity to conservative amino acid changes on enzyme exteriors. J. Mol. Biol. 301: 585-595.
[0546] 6. Barany, F. (1985) Two-codon insertion mutagenesis of plasmid genes by using single stranded hexameric oligonucleotides. Proc. Natl. Acad. Sci. USA 82: 4202-4206.
[0547] 7. Barry, G. F. (1988) A Broad Host-Range Shuttle System for Gene Insertion into the Chromosomes of Gram-negative Bacteria. Gene 71: 75-84.
[0548] 8. Barry, G. F. (1986) Permanent insertion of foreign genes into the chromosomes of soil bacteria. Bio/Technology 4: 446-449.
[0549] 9. Barth P T, Datta N, Hedges R W, Grinter N J. (1976) Transposition of a deoxyribonucleic acid sequence encoding trimethoprim and streptomycin resistances from R483 to other replicons. J. Bacteriol. 125: 800-10. [PubMed: 767328]
[0550] 10. Bird, L. E., Rada, H., Flanagan, J., Diprose, J. M., Gilbert, R. J. C. and Owens, R. J. (2014) Application of In-Fusion™ cloning for the parallel construction of E. coli expression vectors. Methods Mol. Biol. Clifton N. J. 1116: 209-234.
[0551] 11. Bochner, B. R., H. Huang, G. L. Schieven, and B. N. Ames. (1980) Positive selection for loss of tetracycline resistance. J. Bacteriol. 143: 926-933.
[0552] 12. Bryksin A. M. I., "Overlap extension PCR cloning: a simple and reliable way to create recombinant plasmids." Biotechniques, 29(6): 997-1003, 2012.
[0553] 13. C. Engler, R. Kandzia, and S. Marillonnet, "A one pot, one step, precision cloning method with high throughput capability," PLoS One, 3(11): e3647, 2008.
[0554] 14. Carrington, J. C., and Dougherty, W. G. (1988) A Viral Cleavage Site Cassette: Identification of Amino Acid Sequences Required for Tobacco Etch Virus Polyprotein Processing. Proc. Natl. Acad. Sci. USA 85: 3391-3395.
[0555] 15. Choi, K.-H. and Kim, K.-J. (2009) Applications of Transposon-Based Gene Delivery System in Bacteria. J. Microbiol. Biotechnol. 19(3): 217-228; doi: 10.4014/jmb.0811.669; first published online 23 Jan. 2009.
[0556] 16. Ciccarone, V. C., Polayes, D., and Luckow, V. A. (1997) Generation of Recombinant Baculovirus DNA in E. coli Using Baculovirus Shuttle Vector. Methods in Molecular Medicine (Reischt, U., Ed.), 13, Humana Press Inc., Totowa, N.J.
[0557] 17. Cole, C. N., and Stacy, T. P. (1985) Identification of Sequences in the Herpes Simplex Virus Thymidine Kinase Gene Required for Efficient Processing and Polyadenylation. Mol. Cell. Biol. 5: 2104-2113.
[0558] 18. Craig, N. L. (1996) Transposition. In: Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology II (eds. Neidhardt, F. et al) American Society for Microbiology, Washington, D.C., pp. 2339-2362.
[0559] 19. DeBoy, Robert T., Craig, Nancy L. (2000) Target Site Selection by Tn7: attTn7 Transcription and Target Activity. J. Bacteriol. 182(11): 3310-3313.
[0560] 20. Deutscher, M. P. (ed) (1990) Guide to Protein Purification Vol. 182. Methods in Enzymology. Edited by Abelson, J. N., and Simon, M. I., Academic Press, San Diego, Calif.
[0561] 21. Dougherty, W. G., Carrington, J. C., Cary, S. M., and Parks, T. D. (1988) Biochemical and Mutational Analysis of a Plant Virus Polyprotein Cleavage Site. EMBO J. 7: 1281-1287.
[0562] 22. Durfee T, Nelson R, Baldwin S, Plunkett G 3rd, Burland V, Mau B, Petrosino J F, Qin X, Muzny D M, Ayele M, Gibbs R A, Csörgo B, Pósfai G, Weinstock G M, Blattner F R. (2008) The complete genome sequence of Escherichia coli DH10B: insights into the biology of a laboratory workhorse. J. Bacteriol. 190(7): 2597-606. doi: 10.1128/JB.01695-07. Epub 2008 Feb. 1.
[0563] 23. Fukasawa, T. and H. Nikaido. (1961) Galactose sensitive mutants of Salmonella. II. Bacteriolysis induced by galactose. Biochim. Biophys. Acta 48: 470-483.
[0564] 24. Gibson et al (2008) "Complete chemical synthesis, assembly, and cloning of a Mycoplasma genitalium genome." Science 319: 1215-1220.
[0565] 25. Gibson et al (2009) "Enzymatic assembly of DNA molecules up to several hundred kilobases." Nat. Meth. 6: 343-5.
[0566] 26. Gossen et al (1992) Application of galactose sensitive E. coli strains as selective hosts for LacZ-plasmids. Nucleic Acids Research 20(12): 3254.
[0567] 27. Grant, S. G. N., J. Jessee, F. R. Bloom, and D. Hanahan. (1990) Differential plasmid rescue from transgenic mouse DNAs into Escherichia coli methylation restriction mutants. Proc. Natl. Acad. Sci. USA 87: 4645-4649.
[0568] 28. Griffith J K, Buckingham J M, Hanners J L, Hildebrand C E, Walters R A. (1982) Plasmid-conferred tetracycline resistance confers collateral cadmium sensitivity of E. coli cells. Plasmid 8: 86-88.
[0569] 29. Gringauz, E., Orle, K. A., Waddell, C. S., Craig, N. L. (1988) Recognition of Escherichia coli attTn7 by transposon Tn7: lack of specific sequence requirements at the point of Tn7 insertion. J. Bacteriol. 170(6): 2832-2840.
[0570] 30. Luckow, V. A. (1991) in Recombinant DNA Technology and Applications (Prokop, A., Bajpai, R. K., and Ho, C., eds), McGraw-Hill, New York.
[0571] 31. Hamilton, C. M., M. Aldea, B. Washburn, P. Babitzke, and S. R. Kushner. (1989) New method for generating deletions and gene replacements in Escherichia coli. J. Bacteriol. 171: 4617-4622.
[0572] 32. Hanahan, D. (1983) Studies on Transformation of Escherichia coli with Plasmids. J. Mol. Biol. 166: 557-580.
[0573] 33. Harris, R., and Polayes, D. (1997) A New Baculovirus Expression Vector for the Simultaneous Expression of Two Heterologous Proteins in the Same Insect Cell. Focus 19: 6-8.
[0574] 34. Hecky, J., Muller, K. M. (2005) Structural perturbation and compensation by directed evolution at physiological temperature leads to thermostabilization of β-lactamase. Biochemistry 44: 12640-12654.
[0575] 35. Hedges R W, Datta N, Fleming M P. (1972) R factors conferring resistance to trimethoprim but not sulphonamides. J. Gen. Microbiol. 73: 573-5. [PubMed: 4571517]
[0576] 36. Holton, T. A., Graham, M. W. (1991) A simple and efficient method for direct cloning of PCR products using ddT-tailed vectors. Nucleic Acids Research 19(5): 1156.
[0577] 37. In-Fusion® HD Cloning Kit User Manual, available from Takara Bio.
[0578] 38. Janson, J. C., and Ryden, L. (1989) in Protein Purification: Principles, High Resolution Methods, and Applications, VCH Publishers, New York.
[0579] 39. Juers et al (2012) LacZ β-galactosidase: Structure and function of an enzyme of historical and molecular biological importance. Protein Science 21: 1792-1807.
[0580] 40. Kertbundit, S., Greve, H. d., Deboeck, F., Montagu, M. V., and Hernalsteens, J. P. (1991) In vivo Random beta glucuronidase Gene Fusions in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 88: 5212-5216.
[0581] 41. King, L. A., and Possee, R. D. (1992) The Baculovirus Expression System: A Laboratory Guide, Chapman & Hall, New York, N.Y.
[0582] 42. Knight, T. (2005) Idempotent Vector Design for Standard Assembly of BioBricks. MIT Synthetic Biology Working Group.
[0583] 43. Levy et al (1999) Nomenclature for new tetracycline resistance determinants. Antimicrob. Agents Chemother. 43(6): 1523-1524.
[0584] 44. Li, H., Yang, Y., Hong, W., Huang, M., Wu, M., and Zhao, X. (2020) Applications of genome editing technology in the targeted therapy of human diseases: mechanisms, advances and prospects. Signal Transduction and Targeted Therapy 5: 1.
[0585] 45. Luckow, V. A. (1991) Cloning and expression of heterologous genes in insect cells with baculovirus vectors, p. 97-152. In A. Prokop, R. K. Bajpai, and C. Ho (ed.), Recombinant DNA Technology and Applications. McGraw-Hill, New York.
[0586] 46. Luckow, V. A., and M. D. Summers (1988a) Signals important for high-level expression of foreign genes in Autographa californica nuclear polyhedrosis virus expression vectors. Virology 167: 56-71.
[0587] 47. Luckow, V. A., and M. D. Summers (1988b) Trends in the development of baculovirus expression vectors. Bio/Technology 6: 47-55.
[0588] 48. Luckow, V. A., and M. D. Summers. (1989) High level expression of nonfused foreign genes with Autographa californica nuclear polyhedrosis virus expression vector. Virology 170: 31-39.
[0589] 49. Luckow, V. A., and Summers, M. D. (1988) Signals Important for High-Level Expression of Foreign Genes in Autographa californica Nuclear Polyhedrosis Virus Expression Vectors. Virology 167: 56-71.
[0590] 50. Luckow, V. A., Lee, C. S., Barry, G. F., and Olins, P. O. (1993) Efficient Generation of Infectious Recombinant Baculoviruses by Site-Specific Transposon-Mediated Insertion of Foreign Genes into a Baculovirus Genome Propagated in Escherichia coli. J. Virol. 67: 4566-4579.
[0591] 51. Lun et al (2011) Recent patents on the baculovirus systems. Recent Patents on Biotechnology 5: 1-11.
[0592] 52. Magota, K., Otsuji, N., Miki, T., Horiuchi, T., Tsunasawa, S., Kondo, J., Sakiyama, F., Amemura, M., Morita, T., Shinagawa, H. (1984) Nucleotide sequence of the phoS gene, the structural gene for the phosphate-binding protein of Escherichia coli. J. Bacteriol. 157(3): 909-917.
[0593] 53. Maloy S R, Nunn W D. (1981) Selection for loss of tetracycline resistance by Escherichia coli. J. Bacteriol. 145: 1110-1111.
[0594] 54. Maniatis, T., E. F. Fritsch, and J. Sambrook (ed.) (1982) Molecular Cloning. Cold Spring Harbor, Cold Spring Harbor.
[0595] 55. Matagne, A., Lamotte-Brasser, J., Frere, J.-M. (1998) Catalytic properties of Class A β-lactamases: efficiency and diversity. Biochem. J. 330: 581-598.
[0596] 56. Mehalko, J. L., Esposito, D. (2016) Engineering the transposition-based baculovirus expression vector system for higher efficiency protein production from insect cells. J. Biotechnol. 238: 1-8.
[0597] 57. Miller, J. H. (1972) Experiments in Molecular Genetics, p. 1-446. Cold Spring Harbor, Cold Spring Harbor, N.Y.
[0598] 58. O'Reilly, D. R., Miller, L. K., and Luckow, V. A. (1992) Baculovirus Expression Vectors: A Laboratory Manual, W. H. Freeman and Company, New York, N.Y.
[0599] 59. Parks, A. R., and Peters, J. E. (2007) Transposon Tn7 is widespread in diverse bacteria and forms genomic islands. J. Bacteriol. 189: 2170-2173.
[0600] 60. Parks, A. R., and Peters, J. E. (2009) Tn7 elements: engendering diversity from chromosomes to episomes. Plasmid 61: 1-14.
[0601] 61. Peters, J. (2014) Tn7. Microbiol. Spectrum 2(5): MDNA3-0010-2014. doi:10.1128/microbiolspec.MDNA3-0010-2014.
[0602] 62. Peters, J. E. (2014) Tn7. In Mobile DNA, 3rd Edition. Craig, Nancy L., Rice, P., Lambowitz, A., Gellert, M., and Sandmeyer, S. B. (eds). Washington, D.C.: ASM Press.
[0603] 63. Podolsky T, Fong S T, Lee B T. (1996) Direct selection of tetracycline-sensitive Escherichia coli cells using nickel salts. Plasmid 36: 112-115.
[0604] 64. Polayes, D., Harris, R., Anderson, D., and Ciccarone, V. (1996) New Baculovirus Expression Vectors for the Purification of Recombinant Proteins from Insect Cells. Focus 18: 10-13.
[0605] 65. Possee et al (2019) Recent developments in the use of baculovirus expression vectors. Curr. Issues Mol. Biol. 34: 215-230.
[0606] 66. Reddy (2004) Positive selection system for identification of recombinants using α-complementation plasmids. Biotechniques 37: 948-952.
[0607] 67. Reiss, B., Sprengel, R. and Schaller, H. (1984) Protein fusions with the kanamycin resistance gene from transposon Tn5. EMBO J. 3(13): 3317-3322.
[0608] 68. Reznikoff, W. S. (2008) Transposon Tn5. Ann. Rev. Genetics 42(1): 269-286.
[0609] 69. Robben, J., Van der Schueren, J., and Volckaert, G. (1993) Carboxyl terminus is essential for intracellular folding of chloramphenicol acetyltransferase. J. Biol. Chem. 268(33): 24555-24558.
[0610] 70. Rohrmann, G. F. (2019) Baculovirus Molecular Biology [Internet]. 4th edition. Bethesda (Md.): National Center for Biotechnology Information (US); NBK543458.
[0611] 71. Rose, R. E. (1988) The nucleotide sequence of pACYC184. Nucleic Acids Res. 16: 355.
[0612] 72. Roy, P. and Noad, R. (2012) Use of bacterial artificial chromosomes in baculovirus research and recombinant protein expression: Current trends and future perspectives. ISRN Microbiology Article ID 628797, 11 pages.
[0613] 73. Rubin and Levy (1991) J. Bacteriol. 173(14): 4503-4509.
[0614] 74. Rubin, R. A. and Levy, S. B. (1990) J. Bacteriol. 172: 2303-2312.
[0615] 75. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, Second Ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.
[0616] 76. Saraceni-Richards and Levy (2000) Evidence for interactions between helices 5 and 8 and a role for interdomain loop in tetracycline resistance mediated by hybrid Tet proteins. J. Biol. Chem. 275(9): 6101-6106.
[0617] 77. Sigma Aldrich (2015) Topoisomerase I from Vaccinia Virus. Datasheet.
[0618] 78. Skipper, K. A., Andersen, P. R., Sharma, N., and Mikkelsen, J. G. (2013) DNA transposition-based gene vehicles-scenes from an evolutionary drive. J. Biomedical Sci. 20(1): 92.
[0619] 79. Stellwagen, A. E. and Craig, N. L. (1997) Gain-of-function mutations in TnsC, an ATP-dependent transposition protein that activates the bacterial transposon Tn7. Genetics 145(3): 573-85.
[0620] 80. Thermo Fisher (2015) TOPO Cloning Technology Brochure.
[0621] 81. Urban, A. (1997) A rapid and efficient method for site-directed mutagenesis using one-step overlap extension PCR. Nucleic Acids Res. 25(11): 2227-2228.
[0622] 82. Van der Schueren, J., Robben, J. and Volckaert, G. (1998) Misfolding of chloramphenicol acetyl transferase due to carboxy-terminal truncation can be corrected by second site mutations. Protein Engineering 11(12): 1211-1217.
[0623] 83. Walker, J. E., N. J. Gay, M. Saraste, and A. N. Eberle. (1984) DNA sequence around the Escherichia coli unc operon. Completion of the sequence of a 17 kilobase segment containing asnA, oriC, unc, glmS and phoS. Biochem. J. 224: 799-815.
[0624] 84. Waters et al (1983) The tetracycline resistance determinants of RP1 and Tn1721: nucleotide sequence analysis. Nucleic Acids Res. 11: 6089-6105.
[0625] 85. Westwood, J. A., Jones, I. M., and Bishop, D. H. L. (1993) Analyses of Alternative Poly(A) Signals for Use in Baculovirus Expression Vectors. Virology 195: 90-93.
[0626] 86. Wright and Tate (2015) Isolation and characterization of transport-defective substrate-binding mutants of the tetracycline antiporter TetA(B). Biochimica et Biophysica Acta 1848: 2261-2270.
[0627] 87. Yao X-J, G P Kobinger, S Dandache, N Rougeau, E A Cohen (1999) HIV-1 Vpr-chloramphenicol acetyltransferase fusion proteins: sequence requirement for virion incorporation and analysis of antiviral effect. Gene Therapy 6: 1590-1599.
[0628] 88. Zhu, B., Cai, G., Hall, E. O. and Freeman, G. J. (2007) In-fusion assembly: seamless engineering of multidomain fusion proteins, modular vectors, and mutations. BioTechniques 43: 354-359.