UNIVERSAL RIBOSWITCH FOR INDUCIBLE GENE EXPRESSION
20240026383 · 2024-01-25
Inventors
- Constantinos PATINIOS (Renkum, NL)
- Sjoerd Constantijn Arnoud CREUTZBURG (Bennekom, NL)
- John Van Der Oost (Renkum, NL)
- Raymond Hubert Josephe STAALS (Ede, NL)
Cpc classification
C12N2310/20
CHEMISTRY; METALLURGY
C12N9/22
CHEMISTRY; METALLURGY
C12N15/70
CHEMISTRY; METALLURGY
C12N2800/80
CHEMISTRY; METALLURGY
C12N15/67
CHEMISTRY; METALLURGY
C12N15/11
CHEMISTRY; METALLURGY
International classification
C12N15/90
CHEMISTRY; METALLURGY
C12N15/11
CHEMISTRY; METALLURGY
C12N9/22
CHEMISTRY; METALLURGY
C12N15/67
CHEMISTRY; METALLURGY
C12N15/70
CHEMISTRY; METALLURGY
Abstract
Aspects described herein relate to methods for controlling expression of RNA and polypeptides of interest using a tuneable self-splicing intron. Specifically, there is provided a T4 td intron with modified 5′ and 3′ exons which functions as a tuneable self-splicing intron that can be introduced into any gene of interest at multiple positions in the open reading frame, thereby allowing the intron to be inserted without changing the amino acid sequence of the protein of interest. Methods and a system for inducer-controlled modification of a target genomic locus in a cell are also provided herein. The invention further provides kits for expressing an RNA of interest or a polypeptide of interest, wherein the expression is in transformed host cells under the control of an inducer molecule.
Claims
1. A method for controlling expression of a polypeptide of interest (POI) in a cell, comprising: A. providing a cell comprising a polynucleotide construct, the polynucleotide construct comprising: i. a promoter functional in the cell; ii. a polynucleotide portion encoding said POI; and iii. a polynucleotide portion encoding at least one self-splicing intron which includes 5′ and 3′ exon nucleotide sequences, wherein the self-splicing activity of the intron is controlled by an inducer molecule; wherein the inducer-controlled self-splicing intron is located (a) at or 5′ of the start of the polynucleotide portion encoding the POI, or (b) within the polynucleotide portion encoding the POI; B. subjecting the cell to conditions which express polypeptides in the cell and thereby the transcription of the polynucleotide construct into RNA transcripts in the cell; and C. subjecting the cell to conditions which cause a concentration of inducer molecule to promote the self-splicing activity of the intron in the transcripts; thereby resulting in expression of the POI.
2. A method for controlling expression of an RNA of interest (ROI) in a cell, comprising: A. providing a cell comprising a polynucleotide construct, the polynucleotide construct comprising: i. a promoter functional in the cell; ii. a polynucleotide portion encoding the ROI; and iii. a polynucleotide portion encoding at least one self-splicing intron which includes 5′ and 3′ exon sequences, wherein the self-splicing activity of the intron is controlled by an inducer molecule; wherein the inducer-controlled self-splicing intron is located (a) at or 5′ of the start of the polynucleotide portion encoding the ROI, or (b) within the polynucleotide portion encoding the ROI; B. subjecting the cell to conditions which express the polynucleotide construct into RNA transcripts in the cell; and C. subjecting the cell to conditions which produce a concentration of inducer molecule which promotes the self-splicing activity of the intron in the RNA transcript to produce the ROI; thereby resulting in the expression of the ROI.
3. A method as claimed in claim 1, wherein the self-splicing intron is 3′ of and in-frame with the start codon and the expressed POI comprises an amino acid tag sequence encoded by a polynucleotide sequence which includes the 5′ and 3′ exon nucleotide sequences of the self-splicing intron rendered contiguous by self-splicing of the intron; preferably wherein the self-splicing intron is directly adjacent to the start codon and the amino acid tag sequence is an N-terminal amino acid tag in the expressed POI.
4. A method as claimed in claim 1 or claim 2, wherein the self-splicing intron is 5′ of the polynucleotide portion from which the ROI or POI is expressed and the said polynucleotide is not disrupted by the self-splicing activity of the intron; preferably wherein the self-splicing intron is 5′ of the start codon of the polynucleotide encoding the POI.
5. A method as claimed in claim 1 or claim 2, wherein the self-splicing intron is located within the polynucleotide portion encoding the ROI and preferably does not result in a tag sequence in the ROI or POI.
6. A method as claimed in any of claim 1, 3, 4 or 5, wherein the polynucleotide construct further comprises a polynucleotide sequence encoding an additional amino acid sequence.
7. A method as claimed in claim 6, wherein the additional amino acid sequence is a functional moiety, e.g. a protein purification or detection tag, a cellular localization sequence, a fluorescent moiety.
8. A method as claimed in any preceding claim, wherein there are two or more self-splicing introns located 3′ of and in frame with the start codon.
9. A method as claimed in any preceding claim, wherein there is a single self-splicing intron located 5′ of the start of the polynucleotide portion encoding the ROI or POI.
10. A method as claimed in any preceding claim, wherein the inducer molecule is provided to the cell.
11. A method as claimed in any of claims 1 to 9, wherein (a) the inducer molecule is generated as a result of expression of a separate gene in the cell, wherein the separate gene is under the control of different expression regulatory elements; optionally wherein the different expression regulatory elements are responsive to a different inducer molecule and/or physical condition, e.g. temperature; or (b) wherein the inducer molecule is naturally synthesized by the cell in response to chemical and/or physical condition to which the cell is subjected to.
12. A method as claimed in any preceding claim, wherein the self-splicing intron comprises an aptamer which has binding affinity for the inducer molecule.
13. A method as claimed in any preceding claim, wherein the inducer is selected from flavin mononucleotide, thiamine pyrophosphate, s-adenosylmethionine, s-adenosylhomocysteine, adenosylcobalamin, cyclic di-GMP, adenine, guanine, glycine, lysine, theophylline, 3-methylxanthine, caffeine, 1-methylxanthine, 7-methylxanthine, 1,3-dimethyl uric acid, hypoxanthine, xanthine, theobromine, tetracycline, neomycin or malachite green; preferably wherein the inducer is theophylline.
14. A method as claimed in any preceding claim, wherein the 5′ exon nucleotide sequence and/or 3′ exon nucleotide sequence of the self-splicing intron are modified compared to the respective wild type exon nucleotide sequence(s) of the intron.
15. A method as claimed in any preceding claim, wherein the self-splicing intron is a group I intron.
16. A method as claimed in any of claims 1 to 14, wherein the self-splicing intron is a group II or a group III intron.
17. A method as claimed in any preceding claim, wherein the 5′ exon sequence of the self-splicing intron is NNNNNNGGT (SEQ ID NO: 3) and the 3′ exon sequence of the self-splicing intron is CTN (SEQ ID NO: 4), wherein N is A, T, C or G; optionally wherein the 5′ exon sequence is TTBYBDGGT (SEQ ID NO: 5) and the 3′ exon sequence is CTH (SEQ ID NO: 6), wherein B=G/T/C, Y=C/T, D=G/A/T and H=A/T/C; optionally wherein the 5′ exon sequence is selected from TCCTCAGGT (SEQ ID NO: 7), TCCTCGGGT (SEQ ID NO: 8), TCCTTGGGT (SEQ ID NO: 9), TCCTCTGGT (SEQ ID NO: 10) or TTCTTGGGT (SEQ ID NO: 11) and the 3′ exon sequence is CTA (SEQ ID NO: 12).
18. A method as claimed in any of claims 1 or 3 to 17, wherein the POI is selected from any of: i. a sequence specific DNA/RNA binding protein; preferably a meganuclease (MGN), zinc finger nuclease (ZFN), a TALEN, an RNA-guided nuclease or a DNA-guided nuclease; ii. an RNA-guided nuclease; preferably a Crispr-Cas protein; iii. a sequence-specific DNA binding protein lacking nuclease activity or a nickase; optionally fused to a heterologous functional moiety; preferably wherein the POI is a base editor or a prime editor.
19. A method as claimed in claim 18, wherein the POI is ii) or iii) and the polynucleotide further comprises a portion encoding a targeting RNA molecule, e.g. guide RNA (gRNA) which directs ii) or iii) to a target locus in a DNA sequence.
20. An isolated polynucleotide comprising: i. a promoter functional in a cell; ii. a polynucleotide portion encoding an RNA of interest (ROI) or a polypeptide of interest (POI); and iii. a polynucleotide portion encoding at least one self-splicing intron which includes 5′ and 3′ exon nucleotide sequences, wherein the self-splicing activity of the intron is controlled by an inducer molecule; wherein the inducer-controlled self-splicing intron is located (a) at or 5′ of the start of the polynucleotide portion encoding the ROI or POI, or (b) within the polynucleotide portion encoding the POI or ROI.
21. A polynucleotide as claimed in claim 20, wherein the ROI is translatable into a POI.
22. A polynucleotide as claimed in claim 20 or claim 21, wherein the self-splicing intron is 3′ of and in-frame with the start codon and a POI when expressed from the polynucleotide comprises an amino acid tag sequence encoded by a polynucleotide sequence which includes the 5′ and 3′ exon nucleotide sequences of the self-splicing intron rendered contiguous by self-splicing of the intron; preferably wherein the amino acid tag sequence is an N-terminal amino acid tag in the expressed POI.
23. A polynucleotide as claimed in claim 20 or claim 21, wherein the self-splicing intron is 5′ of the polynucleotide portion from which the ROI or POI is expressed and the said polynucleotide is not disrupted by the self-splicing activity of the intron; preferably wherein the self-splicing intron is 5′ of the start codon of the polynucleotide encoding the POI.
24. A polynucleotide as claimed in any one of claims 20 to 23, wherein the self-splicing intron is located within the polynucleotide portion encoding the ROI or POI and preferably does not result in a tag sequence in the ROI or POI.
25. A polynucleotide as claimed in any of claims 20 to 24, wherein the polynucleotide construct further comprises a polynucleotide sequence encoding an additional amino acid sequence; optionally wherein the additional amino acid sequence is a functional moiety, e.g. a protein purification or detection tag, a cellular localization sequence, a fluorescent moiety.
26. A polynucleotide as claimed in any of claims 20 to 25, wherein there is a single self-splicing intron located 5′ of the start of the polynucleotide portion encoding the ROI or POI.
27. A polynucleotide as claimed in any of claims 20 to 26, wherein the self-splicing intron comprises an aptamer which has binding affinity for the inducer molecule; optionally wherein the inducer is selected from flavin mononucleotide, thiamine pyrophosphate, s-adenosylmethionine, s-adenosylhomocysteine, adenosylcobalamin, cyclic di-GMP, adenine, guanine, glycine, lysine, theophylline, 3-methylxanthine, caffeine, 1-methylxanthine, 7-methylxanthine, 1,3-dimethyl uric acid, hypoxanthine, xanthine, theobromine, tetracycline, neomycin or malachite green; preferably wherein the inducer is theophylline.
28. A polynucleotide as claimed in any of claims 20 to 27, wherein the 5′ exon nucleotide sequence and/or 3′ exon nucleotide sequence of the self-splicing intron are modified compared to the respective wild type exon nucleotide sequence(s) of the intron.
29. A polynucleotide as claimed in any of claims 20 to 28, wherein the self-splicing intron is a group I intron.
30. A polynucleotide as claimed in any of claims 20 to 29, wherein the 5′ exon sequence of the self-splicing intron is NNNNNNGGT (SEQ ID NO: 3) and/or the 3′ exon sequence is CTN (SEQ ID NO: 4), wherein N is A, T, C or G; optionally wherein the 5′ exon sequence is TTBYBDGGT (SEQ ID NO: 5) and the 3′ exon sequence is CTH (SEQ ID NO: 6), wherein B=G/T/C, Y=C/T, D=G/A/T and H=A/T/C; preferably wherein the 5′ exon sequence is selected from TCCTCAGGT (SEQ ID NO: 7), TCCTCGGGT (SEQ ID NO: 8), TCCTTGGGT (SEQ ID NO: 9), TCCTCTGGT (SEQ ID NO: 10) or TTCTTGGGT (SEQ ID NO: 11) and the 3′ exon sequence is CTA (SEQ ID NO: 12).
31. A polynucleotide as claimed in any of claims 20 to 30, wherein the POI is selected from i. a sequence specific DNA/RNA binding protein; preferably a meganuclease (MGN), zinc finger nuclease (ZFN), a TALEN, an RNA-guided nuclease or a DNA-guided nuclease; ii. an RNA-guided nuclease; preferably a Crispr-Cas protein; iii. a sequence-specific DNA binding protein lacking nuclease activity or a nickase; optionally fused to an heterologous functional moiety; preferably wherein the POI is a base editor or a prime editor.
32. A polynucleotide as claimed in claim 31, wherein the POI is ii) or iii) and the polynucleotide further comprises a portion encoding a targeting RNA molecule, e.g. a guide RNA (gRNA) which directs the ii) or iii) to a target locus in a DNA sequence; optionally wherein the gRNA is under the control of a self-splicing intron.
33. An expression vector comprising a polynucleotide of any of claims 20 to 32.
34. A transformed cell for inducer molecule-controlled expression of an RNA of interest (ROI) or polypeptide of interest (POI) thereby, wherein the cell comprises a polynucleotide of any of claims 20 to 32, or an expression vector of claim 33.
35. A kit for expressing an RNA of interest (ROI) or a polypeptide of interest (POI) and wherein the expression is under the control of an inducer molecule comprising: i. a composition comprising a polynucleotide of any of claims 20 to 32, or an expression vector of claim 33, or a transformed cell of claim 34; and ii. a composition comprising an inducer molecule which activates self-splicing activity of a self-splicing intron when expressed in a cell.
36. A system for generating an RNA of interest (ROI) or a polypeptide of interest (POI), comprising a transformed cell of claim 34.
37. A method of inducer controlled modification of a target genomic locus in a cell, comprising introducing or generating in the cell a ribonuclease complex comprising a Crispr-Cas nuclease and a gRNA molecule for the target genetic locus; wherein the Crispr-Cas nuclease and/or the gRNA is comprised as the POI and/or ROI in a polynucleotide of any of claims 20 to 32 or an expression vector of claim 33; and subjecting the cell to a condition which causes a concentration of inducer molecule to promote the self-splicing activity of the intron, thereby resulting in expression of the Crispr-Cas nuclease and/or gRNA in the cell; optionally wherein an homologous repair (HR) template encoded by the same or different polynucleotide or expression vector, and the HR template is expressed in the cell.
38. A method of inducer-controlled base editing of a target genomic locus in a cell, comprising: A. introducing or generating in the cell a ribonuclease complex comprising a base editor and a gRNA molecule for the target genetic locus, wherein the base editor and/or gRNA is comprised as the respective ROI or POI in a polynucleotide or polynucleotides of any of claims 20 to 32 or an expression vector of claim 33; and B. (a) providing inducer molecule to the cell, or (b) subjecting the cell to a condition which causes a concentration of inducer molecule to promote the self-splicing activity of the intron, thereby resulting in expression of the base editor and/or gRNA in the cell.
39. A method of inducer-controlled prime editing of a target genomic locus in a cell, comprising: A. introducing or generating in the cell a ribonuclease complex comprising a prime editor and a prime editing guide RNA (pegRNA) molecule for the target genetic locus, wherein the prime editor and/or pegRNA is comprised as the respective ROI or POI in a polynucleotide or polynucleotides of any of claims 20 to 32 or an expression vector of claim 33; and B. (a) providing inducer molecule to the cell, or (b) subjecting the cell to a condition which causes a concentration of inducer molecule to promote the self-splicing activity of the intron, thereby resulting in expression of the prime editor and/or pegRNA in the cell.
40. A method as claimed in any of claims 37 to 39, wherein the inducer molecule is provided to the cell.
41. A method as claimed in any of claims 37 to 39, wherein (a) the inducer molecule is generated as a result of expression of a separate gene in the cell, wherein the separate gene is under the control of different expression regulatory elements; optionally wherein the different expression regulatory elements are responsive to a different inducer molecule and/or physical condition, e.g. temperature; or (b) the inducer molecule is naturally synthesized by the cell in response to chemical and/or physical condition to which the cell is subjected to.
42. A method as claimed in any of claims 37 to 40, wherein a first polynucleotide comprises a self-splicing intron under the control of a first inducer molecule, and a second polynucleotide comprises a self-splicing intron which is under the control of a second different inducer molecule.
43. A system for inducer controlled genetic modification of a cell, comprising at least a first expression vector, the first expression vector comprising a polynucleotide of any of claims 20 to 32, wherein the respective POI or ROI is selected from: A. a Crispr-Cas nuclease, and/or B. a gRNA, and/or C. an HR template.
44. A system for inducer controlled genetic modification of a cell, comprising at least a first expression vector, the first expression vector comprising a polynucleotide of any of claims 20 to 32, wherein the respective POI or ROI is selected from: A. a base editor, and/or B. a gRNA.
45. A system for inducer controlled genetic modification of a cell, comprising at least a first expression vector, the first expression vector comprising a polynucleotide of any of claims 20 to 32, wherein the respective POI or ROI is selected from: A. a prime editor, and/or B. a pegRNA.
46. A system as claimed in any of claims 43 to 45, wherein each individual POI and/or ROI is under the control of a respective self-splicing intron.
47. A system as claimed in any of claims 43 to 46, wherein a first polynucleotide comprises a self-splicing intron under the control of a first inducer molecule, and a second polynucleotide comprises a self-splicing intron which is under the control of a second different inducer molecule.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0140] Embodiments of the invention are further described hereinafter with reference to the accompanying drawings (paragraphs [0141] to [0174]; the individual figure descriptions are not reproduced in this text).
DETAILED DESCRIPTION
[0175] Ribozymes and riboswitches are gene regulation systems found in a wide range of bacterial species. The catalytic and/or regulatory functionality of these RNA molecules relies on their primary, secondary and tertiary structures, making them strong candidates for developing universal tools for regulating gene expression without the use of proteins (Breaker, R. R. Riboswitches and the RNA world. Cold Spring Harbor perspectives in biology 4, a003566 (2012); Park, S. V. et al. Catalytic RNA, ribozyme, and its applications in synthetic biology. Biotechnology advances 37, 107452 (2019); Serganov, A. & Nudler, E. A decade of riboswitches. Cell 152, 17-24 (2013); Serganov, A. & Patel, D. J. Ribozymes, riboswitches and beyond: regulation of gene expression without proteins. Nature Reviews Genetics 8, 776-790 (2007); Weinberg, C. E., Weinberg, Z. & Hammann, C. Novel ribozymes: discovery, catalytic mechanisms, and the quest to understand biological function. Nucleic acids research 47, 9480-9494 (2019)). To this end, several studies have used ribozymes and riboswitches to control the expression of a gene of interest (GOI), and also to regulate the activity and function of CRISPR-Cas (Zhao, J., et al. Development of aptamer-based inhibitors for CRISPR/Cas system. Nucleic Acids Research (2020); Cañadas, I. C., et al. RiboCas: a universal CRISPR-based editing tool for Clostridium. ACS synthetic biology 8, 1379-1390 (2019); Tang, W., Hu, J. H. & Liu, D. R. Aptazyme-embedded guide RNAs enable ligand-responsive genome editing and transcriptional activation. Nature communications 8, 1-8 (2017); Siu, K.-H. & Chen, W. Riboregulated toehold-gated gRNA for programmable CRISPR-Cas9 function. Nature chemical biology 15, 217-220 (2019); Kundert, K. et al. Controlling CRISPR-Cas9 with ligand-activated and ligand-deactivated sgRNAs. Nature communications 10, 1-11 (2019); Park, S. V. et al. Catalytic RNA, ribozyme, and its applications in synthetic biology. Biotechnology advances 37, 107452 (2019)). Although quite successful, these approaches leave room for improvement. For example, the technology developed by Tang et al. (2017) requires base pairing of the CRISPR spacer sequence with the 5′ end of the hammerhead ribozyme, so the ribozyme must be modified whenever the CRISPR spacer is changed. Moreover, the studies by Kundert et al. (2019), Siu et al. (2019) and Zhao et al. (2020) rely on the secondary structure of the Cas9 single guide RNA (sgRNA), which rules out the use of other CRISPR-Cas systems. Lastly, the RiboCas technology developed by Cañadas et al. (2019) regulates the expression of Cas9 by masking the RBS with a theophylline-dependent riboswitch. Whereas this technology is a smart alternative to previous approaches, it can be cumbersome to use either in organisms that do not use the canonical RBS sequence, or in cases where the secondary structure of the 5′ UTR sequence interferes with the theophylline aptamer (Chen, S., Bagdasarian, M., Kaufman, M. & Walker, E. Characterization of strong promoters from an environmental Flavobacterium hibernum strain by using a green fluorescent protein-based reporter system. Appl. Environ. Microbiol. 73, 1089-1100 (2007); Gómez, E., Álvarez, B., Duchaud, E. & Guijarro, J. A. Development of a markerless deletion system for the fish-pathogenic bacterium Flavobacterium psychrophilum. PLoS One 10, e0117969 (2015); Accetto, T. & Avguštin, G. Inability of Prevotella bryantii to form a functional Shine-Dalgarno interaction reflects unique evolution of ribosome binding sites in Bacteroidetes. PLoS One 6 (2011)).
[0176] The inventors substituted the wild-type (WT) P6a loop of the T4 td intron with a theophylline responsive aptamer (see
[0177] To create a universal T4 td intron riboswitch, the inventors introduced modifications to the intron allowing it to be transferred to any gene of interest without compromising its splicing activity. The modifications are located in the 5′ and 3′ exon sequences of the T4 td intron (
[0178] When converting the T4 td intron into a universal riboswitch, certain modifications were introduced to the intron, allowing it to be transferred into any gene of interest without compromising its activity. Using the inducer-controlled self-splicing intron to control CRISPR-Cas proteins was found to enable the engineering of some prokaryotes that had previously proved intractable to modification with a CRISPR-Cas approach (e.g. Flavobacterium IR1).
[0179] In more detail, the inventors explored the role of the 5′ exon and 3′ exon sequences of the td intron and determined its splicing activity by substituting the relevant bases in the 5′ exon and 3′ exon (see
[0180] Initially, the inventors substituted the −7 and +296 positions of the 5′ exon and 3′ exon, respectively, inserted the different variants into the LacZa gene and performed assays in E. coli (see Examples 1 to 3). Then, positions −6, −5 and −4 of the 5′ exon of the td intron were tested. This defined several base substitutions which allowed either more self-splicing, and therefore more LacZa activity, or less self-splicing, and therefore less LacZa activity.
[0181] The inventors then further modified the 5′ exon and 3′ exon sequences of the intron in order to control/titrate its self-splicing activity, or to introduce it at multiple sites in the Open Reading Frame (ORF) of the Gene Of Interest (GOI). The inventors were successful in transferring the self-splicing intron to any GOI at different positions in the ORF.
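By way of illustration only, the 5′ and 3′ exon patterns recited in the claims (NNNNNNGGT followed by CTN, or the narrower TTBYBDGGT followed by CTH) can be used to look for positions in an ORF whose existing sequence already matches those flanks. The following Python sketch makes several assumptions that are not part of the original disclosure: the function names and the example ORF are hypothetical, and the scan considers sequence identity only, not reading frame, RNA folding or splicing efficiency.

```python
import re

# IUPAC degenerate nucleotide codes (DNA alphabet) needed for the claimed patterns.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "D": "AGT", "H": "ACT", "Y": "CT", "N": "ACGT",
}


def iupac_to_regex(pattern: str) -> str:
    """Translate an IUPAC pattern such as 'TTBYBDGGT' into a regular expression."""
    return "".join(
        IUPAC[ch] if len(IUPAC[ch]) == 1 else f"[{IUPAC[ch]}]" for ch in pattern
    )


def find_insertion_sites(orf: str, exon5: str = "NNNNNNGGT", exon3: str = "CTN"):
    """Return 0-based indices at which an intron could be placed (hypothetical helper).

    A site is reported where the bases immediately upstream match `exon5`
    and the bases immediately downstream match `exon3`.
    """
    orf = orf.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    regex = re.compile(f"(?=({iupac_to_regex(exon5)})({iupac_to_regex(exon3)}))")
    return [m.start() + len(exon5) for m in regex.finditer(orf)]


# Hypothetical toy ORF: ATG + TTCTTGGGT (SEQ ID NO: 11) + CTA (SEQ ID NO: 12) + GCA
toy_orf = "ATGTTCTTGGGTCTAGCA"
print(find_insertion_sites(toy_orf))                      # general pattern
print(find_insertion_sites(toy_orf, "TTBYBDGGT", "CTH"))  # narrower pattern
```

In this sketch a returned index marks the boundary between the last base of the 5′ exon match and the first base of the 3′ exon match. A practical design tool would additionally have to consider synonymous codon changes, which this example does not attempt.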
[0182] Altered splicing efficiency caused by changing the base pair interactions at the P1 stem of the T4 td intron was previously observed by Pichler A. & Schroeder R. (2002) Folding Problems of the 5′ Splice Site Containing the P1 Stem of the Group I Thymidylate Synthase Intron. J. Biol. Chem. 277 (20) 17987-17993, who created two mutant variants to either stabilize (−4A, −5C, −6T) or destabilize (−4C, −5A, −6C) the base pair interactions at the P1 stem and noticed increased splicing efficiency for both the stabilized and the destabilized variants compared to the WT intron. However, those results contradict the present results, as stabilization (−4A, −5C, −6T) of the P1 stem decreased the splicing efficiency by approximately 80% (compared to the WT intron) in our setup (
[0183] The inventors further successfully provide a universal tag sequence whereby the intron is introduced just after the ATG start codon and is therefore gene/protein independent. The tag sequence leaves a 4 amino acid tag at the N-terminus of the protein of interest (POI), just after the methionine (M) encoded by the start codon. This tag does not usually hinder the activity of the expressed protein as it consists of only 4 amino acids. A protease cleavage sequence, such as a TEV protease cleavage site, can be added directly after the tag sequence and cleaved with the corresponding protease after expression. The cleavage leaves a single amino acid attached to the protein of interest. Other cleavage sequences and proteases well known in the art may be used, e.g. https://web.expasy.org/peptide_cutter/ and https://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html.
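As an illustrative calculation only, the 4 amino acid tag can be derived by translating a 9-nucleotide 5′ exon joined to a 3-nucleotide 3′ exon (12 nucleotides, i.e. four codons). The Python sketch below assumes that the tag is encoded exactly by the 5′ exon sequences of SEQ ID NOs: 7 to 11 followed by the 3′ exon CTA (SEQ ID NO: 12), read in frame directly after the start codon, and that the standard genetic code applies. The names `translate` and `tag_peptide` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string whose length is a multiple of three."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# 5' exon variants and 3' exon as listed in the claims.
EXON5_VARIANTS = {
    "SEQ ID NO: 7": "TCCTCAGGT",
    "SEQ ID NO: 8": "TCCTCGGGT",
    "SEQ ID NO: 9": "TCCTTGGGT",
    "SEQ ID NO: 10": "TCCTCTGGT",
    "SEQ ID NO: 11": "TTCTTGGGT",
}
EXON3 = "CTA"  # SEQ ID NO: 12


def tag_peptide(exon5: str, exon3: str = EXON3) -> str:
    """Peptide encoded by the joined exons after splicing (assumed to be in frame)."""
    return translate(exon5 + exon3)


for name, exon5 in EXON5_VARIANTS.items():
    print(name, exon5 + EXON3, tag_peptide(exon5))
```

Under these assumptions each variant yields a four-residue peptide, consistent with the 4 amino acid tag described above. Whether these peptides correspond to the tags shown in the figures cannot be confirmed from the text alone.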
[0184] Using different versions of tag-introns, the inventors are able to control expression of a GOI at the protein level which gives the advantage of titration. Tag sequences are chosen from those shown in
[0185] The addition of Tags has been successfully tested in E. coli, P. putida and Flavobacterium IR1 by inserting Tagged introns after the start codon of Cas12a. This approach allowed efficient editing of the bacterium of interest. More specifically, for P. putida editing efficiencies of up to 75% were reached with Tag4 (
[0186] The invention is applicable to any self-splicing intron and these are found in many species of bacteriophage, bacteria, protozoa and fungi, for example. The self-splicing introns are usually found embedded in specific genes of a species or strain. For example, the T4 td self-splicing intron is located in the td gene of the T4 bacteriophage.
[0187] Other self-splicing introns from bacteriophages are: T6: td, RB3: td, LZ2: td, TulA: td, 1: DNA polymerase, W31: DNA polymerase, Pf-WMP3: DNA polymerase, 822: td, SPO1: DNA polymerase, SP82: DNA polymerase, cpe: DNA polymerase, SPb prophage (Ribonucleotide reductase (bnrdE and bnrdF)), Sb3: lysin, rlt: ORF40, LLH: Terminase, Twort (introns nrdE-11 & nrdE-12): ORF142.
[0188] Examples of self-splicing introns from bacteria are: Agrobacterium tumefaciens A136: tRNA.sup.ArgCCU, Azoarcus sp. strain BH72: tRNA.sup.IleCAU, Coxiella burnetii (Cbu.L1917): 23S rRNA, Coxiella burnetii (Cbu.L1951): 23S rRNA, Thermotoga neapolitana NS-E Tna.bL1931: 23S rRNA, Thermotoga subterranea SL1 Tsu.bL1926: 23S rRNA, Clostridium botulinum: tmma pos. 338, Geobacillus stearothermophilus (NBRC 12550): flagellin, Bacillus sp. Kps3: flagellin, Clostridium difficile strain 630: CD3246, Anabaena PCC7120: tRNA.sup.LeuUAA, Scytonema hofmanii: tRNA.sup.fMet, Synechocystis PCC 6803: tRNA.sup.fMet, Neochloris aquatica: rnl pos. 1931, Calothrix sp. strain PCC7601: Cal.x1, Calothrix sp. strain PCC7101: Cal.x2, L. lactis ML3: Ll.LtrB, L. lactis 712: IntL, S. meliloti GR4: RmInt1.
[0189] Examples of self-splicing introns from Protozoa are: Tetrahymena thermophila (Tth.L1925): 26S rRNA, Didymium iridis (Dir.S956-1): SSU rDNA, Didymium iridis (Dir.S956-2): SSU rDNA, Physarum polycephalum (Ppo.L1925): LSU rDNA, Amoebidium parasiticum: rnl, pos. 2500 and rnl, pos. 1403, Naegleria (NaGIR1 and NaGIR2): SSU rRNA.
[0190] Examples of self-splicing introns from Fungi are: Neurospora crassa: rnl, pos. 2449, Saccharomyces cerevisiae (Sc.OX1,3): SSU rDNA, Candida albicans: 25S rRNA, Scytalidium dimidiatum (rns, pos. 1199).
[0191] Examples of self-splicing introns from other miscellaneous organisms are: Simkania negevensis Z.sup.T: 23S rRNA, Chlamydomonas nivalis: rnl, pos 2593, Dunaliella parva: rnl, pos. 1931, Aureoumbra lagunensis: SSU rRNA, Bangia atropurpurea: SSU rRNA.
[0192] Calothrix sp. strain PCC7601: Cal.x1, Calothrix sp. strain PCC7101: Cal.x2, L. lactis ML3: Ll.LtrB, L. lactis 712: IntL, S. meliloti GR4: RmInt1 are Group II introns, while all others are Group I introns.
[0193] Examples of Group III introns include the Euglena gracilis introns found in the psbC, rps18, ycf8, ycf13, rpoC1, rpl16, psbF, rps3, rpl23, rps18, rps19, rpl14, rps8, rps14, rpl16, psbK genes.
[0194] The self-splicing Group I introns are a unique type of ribozyme. Group I introns have been described to control gene expression and RNA processing in bacteria and phages but also in some eukaryotes (protozoa and plants) (Hausner, G., Hafez, M. & Edgell, D. R. Bacterial group I introns: mobile RNA catalysts. Mobile DNA 5, 1-12 (2014); Edgell, D. R., Belfort, M. & Shub, D. A. Barriers to intron promiscuity in bacteria. Journal of Bacteriology 182, 5281-5289 (2000); Nielsen, H. & Johansen, S. D. Group I introns: moving in new directions. RNA biology 6, 375-383 (2009)). Due to their prevalence and simple nature, Group I introns have the potential to be used as universal, synthetic ribozymes to control gene expression. In particular, when a ribozyme is associated with a specific ligand-binding sequence (RNA aptamer), the presence or absence of the ligand allows the splicing activity to be switched ON or OFF (riboswitch), potentially controlling the expression of an associated gene. An example of a natural Group I intron-based riboswitch has been discovered in the bacterium Clostridium difficile, where its sequence resides between the RBS and the ATG start codon of an adjacent gene. After transcription, this results in a secondary structure in the 5′-UTR that prevents recruitment of the ribosome, hence hampering translation initiation. After induction by intracellular GTP or c-di-GMP, this ribozyme splices itself out of the precursor transcript, resulting in appropriate re-positioning of the RBS upstream of the start codon, thereby allowing the ribosome to start the translation process (Lee, E. R., Baker, J. L., Weinberg, Z., Sudarsan, N. & Breaker, R. R. An allosteric self-splicing ribozyme triggered by a bacterial second messenger. Science 329, 845-848 (2010); Chen, A. G., Sudarsan, N. & Breaker, R. R. Mechanism for gene control by a natural allosteric group I ribozyme. RNA 17, 1967-1972 (2011)).
Although this natural mechanism is a beautiful case of gene expression control, its requirement for specific endogenous inducers (GTP and c-di-GMP) as well as its dependency on specific secondary structures (including both the ribozyme and the coding sequence) complicates its general applicability. A synthetic alternative was provided by Thompson et al. (2002), when they combined the self-splicing Group I intron of the T4 bacteriophage with a theophylline aptamer towards a functional inducible gene expression system (Thompson, K. M., Syrett, H. A., Knudsen, S. M. & Ellington, A. D. Group I aptazymes as genetic regulatory switches. BMC biotechnology 2, 21 (2002)). Although this system was restricted to controlling the original thymidylate synthase (td) gene, we here describe its repurposing as a generic system to tune gene expression.
[0195] The inventors have also created a novel system termed Self-splicing Intron Based Riboswitch Cas (SIBR-Cas), which uses the Group I-based aptazyme to enhance recombination in prokaryotes. The inducer-controlled T4 td intron (containing an in-frame stop codon) is inserted into a CRISPR-Cas nuclease gene (Cas12a, for example), resulting in incomplete translation and preventing formation of a functional CRISPR-Cas nuclease. Exposure to theophylline then induces a conformational change in the synthetic riboswitch, which activates the self-splicing activity of the td intron, resulting in excision of the intron and joining of the 5′ exon to the 3′ exon. This restores the complete mRNA of the CRISPR-Cas gene, which consequently leads to functional expression/translation of the CRISPR-Cas nuclease. In the particular example of the Cas12a protein, controlling the expression allows a time series to be made to find the appropriate induction time for counter-selection by Cas12a, thereby increasing the chances of generating correct HDR-based mutants.
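The switching logic described above can be illustrated with a minimal Python sketch. It is not part of the disclosed system: the exon and intron sequences are short hypothetical placeholders (not the T4 td intron or a Cas gene), and the function names are assumptions chosen for illustration. The sketch only shows why an intron carrying an in-frame stop codon yields a truncated peptide until it is spliced out.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from the first codon until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def transcript(exon5, intron, exon3, induced):
    """Return the coding sequence seen by the ribosome.

    induced=True models inducer-triggered self-splicing (intron removed,
    exons joined); induced=False leaves the intron in place.
    """
    return exon5 + exon3 if induced else exon5 + intron + exon3


# Hypothetical toy sequences, chosen only so that the intron has an in-frame stop (TAA).
exon5 = "ATGGCT"       # M A
intron = "GGTTAAGCC"   # G, stop, A
exon3 = "AAAGGGTTT"    # K G F

uninduced_peptide = translate(transcript(exon5, intron, exon3, induced=False))
induced_peptide = translate(transcript(exon5, intron, exon3, induced=True))
print(uninduced_peptide, induced_peptide)
```

In this toy case the unspliced transcript terminates at the intron's stop codon, whereas the spliced transcript encodes the full-length product.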
[0196] So long as the relevant inducer, e.g. theophylline, can reach the self-splicing intron, the SIBR-Cas system can be used in any organism. The advantages of such a technology are:
[0197] Tight control of the GOI (in this case the Cas protein) at the mRNA level. Complete, functional protein will be formed after induction with theophylline.
[0198] Universality: the intron can be introduced to virtually any GOI, in any archaeon, bacterium or eukaryote, as long as the inducer can enter the cell of interest, at least at moderate temperatures.
[0199] No complex design is required, as a tag sequence can be used for the insertion of the intron at the beginning of the GOI.
[0200] The only option for engineering non-model organisms with high AT %, low HDR efficiencies and no inducible (or characterised) promoters (see example of Flavobacterium IR1 below).
[0201] The SIBR-Cas tool can be applied for editing virtually any GOI in any cell of interest. The inventors have applied SIBR-Cas to Flavobacterium IR1.
[0202] Suitable nucleases to be used in the methods described herein are selectable at the option of the skilled person. A choice may depend upon the optimal growth temperature of the particular microbe being used. The CRISPR-Cas nucleases may be selected from any Cas Type I, Type II or Type III. More particularly, the Cas may be selected from Cas9, Cas12a (previously known as Cpf1) or Cas13 (previously known as C2c2); also any of Caw, Cas12b, Cas12c, Cas13a,b,c,d, Cas4, Csn2, Csf1, Csx10, Csx11, Cmr5, Csm2, Cas10, Csy1,2,3, Cse1,2, Cas10d, Cas8a,b,c, Cas5 or Cas3. The CRISPR-Cas nucleases may be any variant from any species, whether well-known, e.g. from Streptococcus pyogenes (SpyCas9), or less commonly used, such as from Geobacillus thermodenitrificans T12 (ThermoCas9) or Geobacillus stearothermophilus (GeoCas9). Methods described herein may preferably use Cas9, preferably Streptococcus pyogenes Cas9; or C2c1. Alternatively, methods described herein may preferably use Cas12a (Cpf1). Further alternative nucleases suitable for the methods described herein are C2C3 or Argonaute. It is also contemplated that the methods described herein may use other nucleases such as zinc finger nucleases (ZFNs), meganucleases or transcription activator effector like nucleases (TALENs).
[0203] In order that expression of any of the polynucleotide constructs or expression vectors of the invention described herein can be carried out in a chosen host cell, these incorporate regulatory elements which allow expression in the host cell of interest and which preferably facilitate high levels of expression. Such regulatory sequences may be capable of influencing transcription or translation of a gene or gene product, for example in terms of initiation, accuracy, rate, stability, downstream processing and mobility.
[0204] Such elements may include, for example, strong and/or constitutive promoters, 5′ and 3′ UTRs, transcriptional and/or translational enhancers, transcription factor or protein binding sequences, start sites and termination sequences, ribosome binding sites, recombination sites, polyadenylation sequences, sense or antisense sequences, sequences ensuring correct initiation of transcription and optionally poly-A signals ensuring termination of transcription and transcript stabilisation in the host cell. The regulatory sequences may be plant-, animal-, bacteria-, fungal- or virus-derived, and preferably may be derived from the same organism as the host cell. Clearly, appropriate regulatory elements will vary according to the host cell of interest. For example, regulatory elements which facilitate high-level expression in prokaryotic host cells such as E. coli may include the pLac, T7, P(Bla), P(Cat), P(Kat), trp or tac promoters. Regulatory elements which facilitate high-level expression in eukaryotic host cells might include the AOX1 or GAL1 promoter in yeast or the CMV or SV40 promoters, CMV enhancer, SV40 enhancer, Herpes simplex virus VP16 transcriptional activator or inclusion of a globin intron in animal cells. In plants, constitutive high-level expression may be obtained using, for example, the Zea mays ubiquitin 1 promoter or the 35S and 19S promoters of cauliflower mosaic virus.
[0205] Suitable regulatory elements may be constitutive, whereby they direct expression under most environmental conditions or developmental stages, developmental stage specific or inducible. Suitably, promoters may be chosen which permit expression of the protein of interest at particular developmental stages or in response to extra- or intra-cellular conditions, signals or externally applied stimuli. For example, a range of promoters exist for use in E. coli which give high-level expression at particular stages of growth (e.g. osmY stationary phase promoter) or in response to particular stimuli (e.g. HtpG Heat Shock Promoter).
[0206] Suitable expression vectors may comprise additional sequences encoding selectable markers which allow for the selection of said vector in a suitable host cell and/or under particular conditions.
[0207] Regarding transformation of a host cell with a heterologous gene sequence, expression constructs comprising the polynucleotide sequences of the invention may be located in plasmids (expression vectors) which are used to transform the host cell. Methods of transformation may include, but are not limited to: heat shock, electroporation, particle bombardment, chemical induction, microinjection, viral transformation, Agrobacterium-mediated transformation, PEG-mediated transformation and lipofection.
[0208] As well as a ROI or POI, the polynucleotides of the invention as described herein may include a selectable marker protein. This may be used to screen cell populations positively or negatively. For example, the expression of a particular POI in a host cell may be coupled to relief of an auxotrophic deficit. It will be appreciated that such selectable markers may also include polynucleotide sequences encoding proteins to which the cell is fatally sensitive. In these embodiments of the invention, the presence of the desired product may be coupled to the restoration of translation of the reporter protein. In this way host cells expressing the protein of interest may be selected from those which do not express the protein of interest.
[0209] Where the expression of a particular POI in a host cell is coupled to promotion of cell growth and/or division, it will be appreciated that such selectable markers may include polynucleotide sequences encoding proteins which promote cell growth and/or division. In these embodiments of the invention, the presence of the desired product may be coupled to the restoration of translation of the reporter protein. In this way host cells expressing the protein of interest may be selected from those which do not express the protein of interest.
[0210] The polynucleotides may include a reporter protein which may be assayed for or monitored. Such reporter proteins include, for example, Green Fluorescent Protein (GFP), Yellow Fluorescent Protein (YFP), Red Fluorescent Protein (RFP), Cyan Fluorescent Protein (CFP), or Luciferase fusion tags. The reporter protein may be an enzyme which can be used to generate an optical signal. Alternatively, the expression vector may incorporate a polynucleotide reporter encoding a luminescent protein, such as a luciferase (e.g. firefly luciferase). Alternatively, the reporter gene may encode a chromogenic enzyme which can be used to generate an optical signal, such as beta-galactosidase (LacZ) or beta-glucuronidase (Gus).
[0211] Tags used for detection of reporter protein expression may also be antigen peptide tags. A cleavable tag may also be provided for affinity purification, e.g. a polyhistidine tag. It is envisaged that other types of label may also be used to indicate expression of the reporter protein including, for example, organic dye molecules or radiolabels. In particular, preferred expression vectors will include sequences encoding a fluorescent protein, for example GFP which will enable the screening and optionally separation (selection) of a cell which expresses the protein of interest for example by Fluorescence Activated Cell Sorting (FACS).
EXAMPLES
Example 1: Effect of Position −7 on the Self-Splicing of the T4 td Intron
[0212] The flanking regions (5′ and 3′ exons) of the group I introns are part of the coding sequence as well as of the ribozyme (see
[0213] When inserting the intron into another gene it is almost impossible to retain both the intron flanking regions and the CDS. Applying minor changes to the CDS with synonymous codons may create a site that resembles the wild type intron flanking regions. However, it is not clear to which extent the flanking regions determine the splicing efficiency.
[0214] To investigate the effect of the flanking regions of the T4 td intron on its splicing efficiency and on the expression of the target gene, a series of constructs were made containing the lacZ gene from E. coli with the intron in between amino acids D6 and S7 (see
TABLE 1. Primers used to introduce point mutations at the −7 position of the 5′ exon of the T4 td intron. Bold bases show the −7 to −4 positions. Underlined bases show the −7 point mutations. P: Pair; W: Wobble; M: Mismatch.

| Plasmid name | −7 | −6 | −5 | −4 | +296 | Forward | Reverse |
|---|---|---|---|---|---|---|---|
| pEA001[WT] | P | P | W | W | M | GATCTTAAGGATGTTCTcttgGGTTAATTGAGGCCTGAGTATAAGGTG (SEQ ID NO: 24) | TGActgcagAATATTAAACGGTAGCATTATGTTCAGATAAGGTCG (SEQ ID NO: 25) |
| pEA001[7W] | W | P | W | W | M | GATCTTAAGGATGTTCT GGTTAATTGAGGCCTGAGTATAAGGTG (SEQ ID NO: 26) | TGActgcagAATATTAAACGGTAGCATTATGTTCAGATAAGGTCG (SEQ ID NO: 25) |
| pEA001[7M] | M | P | W | W | M | GATCTTAAGGATGTTTT GGTTAATTGAGGCCTGAGTATAAGGTG (SEQ ID NO: 27) | TGActgcagAATATTAAACGGTAGCATTATGTTCAGATAAGGTCG (SEQ ID NO: 25) |
[0215] The wild type interactions are shown in
[0216] In more detail in
[0217]
Example 2: Effect of Position +296 on the Self-Splicing of the T4 td Intron
[0218] Thompson et al. (2002), supra, do not show any interaction between position +296 (the +3 position in the 3′ exon) and the P1 loop of the T4 td intron. This is similar to the situation with the −7 position. Therefore, point mutations at the +296 position of the T4 td intron were made to see whether they affect the splicing activity of the intron. The WT +296 position (mismatch) was mutated by PCR (Table 2) to form either a pair or a wobble pair with the P1 loop. All mutants were assayed for β-galactosidase activity after overnight growth.
TABLE 2. Primers used to introduce point mutations at the +296 position of the 3′ exon of the T4 td intron. Bold underlined bases show the +296 point mutations. Sequences are shown from 5′ to 3′. P: Pair; W: Wobble; M: Mismatch.

| Plasmid name | −7 | −6 | −5 | −4 | +296 | Forward | Reverse |
|---|---|---|---|---|---|---|---|
| pEA001[WT] | P | P | W | W | M | GATCTTAAGGATGTTCTcttgGGTTAATTGAGGCCTGAGTATAAGGTG (SEQ ID NO: 24) | TGActgcagAATATTAAACGGTAGCATTATGTTCAGATAAGGTCG (SEQ ID NO: 25) |
| pEA001[296P] | P | P | W | W | P | GATCTTAAGGATGTTTTcttgGGTTAATTGAGGCCTGAGTATAAGGTG (SEQ ID NO: 28) | TGActgcagAATATTAAACGG AGCATTATGTTCAGATAAGGTCG (SEQ ID NO: 29) |
| pEA001[296W] | P | P | W | W | W | GATCTTAAGGATGTTTTcttgGGTTAATTGAGGCCTGAGTATAAGGTG (SEQ ID NO: 28) | TGActgcagAATATTAAACGG AGCATTATGTTCAGATAAGGTCG (SEQ ID NO: 30) |
[0219]
Example 3: Effect of Positions −4 to −6 on the Self-Splicing of the T4 td Intron
[0220] This investigated the effect of altering positions −4 to −6 in all possible combinations of pair (P), mismatch (M) and wobble pair (W) (if applicable) whilst preserving all the other bases as WT. With reference to
[0221] In
TABLE-US-00003 TABLE3 Primersusedtointroducemutationsatthe6to4positionsofthe5exon oftheT4tdintron.Boldbasesshowthe7to4positions.Underlinedbasesshowthe6 to4mutations.Sequencesareshownfrom5to3. Plasmidname 7 6 5 4 296 Forward Reverse pEA001[WT] P1 P W W M GATCTTAAGGA TGActgcagAATATTAA TGTTCT GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:24) pEA001[PPP] P P P P M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:72) pEA001[PPW] P P P W M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:73) pEA001[PPM] P P P M M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:74) pEA001[PWP] P P W P M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:75) pEA001[PWW] P P W W M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:28) pEA001[PWM] P P W M M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GGT ACGGTAGCATTATGT TAATTGAGGCC TCAGATAAGGTCG TGAGTATAAGG (SEQIDNO:25) TG(SEQIDNO: 31) pEA001[PMP] P P M P M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:32) pEA001[PMW] P P M W M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:33) pEA001[PMM] P P M M M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:34) pEA001[MPP] P M P P M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:35) pEA001[MPW] P M P W M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:36) pEA001[MPM] ? M P M M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:37) pEA001[MPW] P M W P M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:38) pEA001[MWW] P M W W M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:39) pEA001[MWM] P M W M M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:40) pEA001[MMP] P M M P M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:41) pEA001[MMW] P M M W M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:43) pEA001[MMM] P M M M M GATCTTAAGGA TGActgcagAATATTAA TGTTTT
GG ACGGTAGCATTATGT TTAATTGAGGC TCAGATAAGGTCG CTGAGTATAAG (SEQIDNO:25) GTG(SEQID NO:42) P: Pair; W: Wobble; M: Mismatch.
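The set of interaction patterns tested at positions −6 to −4 can be enumerated with a short Python sketch. This is an illustration only: it assumes, as the rows of Table 3 suggest, that a wobble pair is not available at position −6 (only P or M appear there), while P, W and M are all possible at −5 and −4. The variable names are hypothetical.

```python
from itertools import product

# Interaction types allowed at each position, in the order -6, -5, -4.
# Assumption inferred from Table 3: no wobble option at position -6.
ALLOWED = {
    "-6": "PM",
    "-5": "PWM",
    "-4": "PWM",
}

# Every combination of one interaction type per position.
patterns = ["".join(combo) for combo in product(ALLOWED["-6"], ALLOWED["-5"], ALLOWED["-4"])]

WILD_TYPE = "PWW"  # WT pattern at -6, -5, -4 according to Table 3
mutant_patterns = [p for p in patterns if p != WILD_TYPE]

print(len(patterns), patterns)
```

Under these assumptions the enumeration gives 18 patterns, one of which (PWW) has the same interaction pattern as the wild type, which matches the eighteen bracketed pattern names listed in Table 3.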
Example 4: Script Development for the Introduction of the T4 td Intron at any Gene of Interest
[0222] Transferring the T4 td intron into the open reading frame (ORF) of genes other than the WT thymidylate synthase gene can be achieved by following the script provided in Example 11 and by introducing silent mutations to the 5′ and 3′ flanking regions of the intron. The script retains the WT interactions of positions −1 to −3, +294 and +295, but changes positions −4 to −6 and +296 in order to find an insertion site in the gene of interest (GOI). The script ensures that the insertion site preserves the amino acids of the encoded protein from the GOI by introducing only silent mutations, and it also ensures sufficient splicing activity of the intron according to our previous results.
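The role of silent mutations in this search can be sketched as follows. The snippet is a simplified illustration and not the script of Example 11: it only enumerates every DNA sequence that encodes a given short peptide using the standard genetic code, which is the pool of candidates from which a flanking sequence with suitable pairing could then be chosen. The function name and the example dipeptide (D-S, the residues flanking the lacZ insertion site mentioned in Example 1) are used here purely for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (or '*' for stop) to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def synonymous_sequences(peptide):
    """Return all DNA sequences that encode `peptide` (one-letter amino acid codes)."""
    choices = [CODONS_FOR[aa] for aa in peptide]
    return ["".join(combo) for combo in product(*choices)]


variants = synonymous_sequences("DS")
print(len(variants), variants)
```

Because the number of candidates grows multiplicatively with peptide length, a search of this kind is only practical over short windows, such as the few codons flanking a potential insertion site.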
[0223]
[0224] An example of the insertion site for the FnCas12a gene is shown in
[0225] To control the splicing activity of the T4 td intron, Thompson et al. (2002), supra, attached a theophylline aptamer at the P6 stem loop of the T4 td intron. In a similar fashion, the theophylline aptamer was also added to the modified (changes at positions −4 to −6) T4 td introns developed in this example. In this way, tight, titratable and inducible control of the GOI was obtained. A schematic representation of the T4 td intron with the theophylline aptamer at the P6 stem loop is shown in
Example 5: Generation of Tagged-T4 td Intron Variants and Use Thereof to Control Expression of FnCas12a in Escherichia coli MG1655
[0226] To further control the splicing activity of the modified introns, a theophylline aptamer was added at the P6 stem loop of the T4 td intron as previously described (see Thompson et al. (2002) Supra) and shown in
[0227] As shown in
[0228] Splicing of the intron results in a short (four amino acid long) tag sequence attached to the N-terminus of the POI (when not counting the M encoded by the start codon) whereas unspliced mRNA results in a small, non-functional peptide sequence (due to stop codons present in the T4 td intron).
[0229]
TABLE 4. Plasmids used for targeting in E. coli MG1655.

| Plasmid name | Description and relevant characteristics |
|---|---|
| pSIBR EcoPpu NT tag 1 | KanR, FnCas12a with Tag1-intron, NT crRNA |
| pSIBR EcoPpu NT tag 2 | KanR, FnCas12a with Tag2-intron, NT crRNA |
| pSIBR EcoPpu NT tag 3 | KanR, FnCas12a with Tag3-intron, NT crRNA |
| pSIBR EcoPpu NT tag 4 | KanR, FnCas12a with Tag4-intron, NT crRNA |
| pSIBR EcoPpu NT no intron | KanR, WT FnCas12a, NT crRNA |
| pSIBR EcoPpu T lacZ tag 1 | KanR, FnCas12a with Tag1-intron, LacZ T crRNA |
| pSIBR EcoPpu T lacZ tag 2 | KanR, FnCas12a with Tag2-intron, LacZ T crRNA |
| pSIBR EcoPpu T lacZ tag 3 | KanR, FnCas12a with Tag3-intron, LacZ T crRNA |
| pSIBR EcoPpu T lacZ tag 4 | KanR, FnCas12a with Tag4-intron, LacZ T crRNA |
| pSIBR EcoPpu T lacZ no intron | KanR, WT FnCas12a, LacZ T crRNA |

[0230] Electrocompetent E. coli MG1655 were transformed (2.5 kV, 200 Ω, 25 µF) with 10 ng µL⁻¹ of the respective plasmid and recovered for 1 hour in 500 µL LB medium [10 g L⁻¹ tryptone (Oxoid), 5 g L⁻¹ yeast extract (BD), 10 g L⁻¹ NaCl (Acros)] at 37 °C. Then, the recovered culture was serially diluted and drop plated on selective (50 µg mL⁻¹ kanamycin) LB agar plates in the presence or absence of 2 mM theophylline. The agar plates were incubated at 30 °C for 24 hours and the CFUs were counted.
[0231]
[0232]
[0233]
Example 6: Efficient Homologous Recombination in E. coli MG1655 Using T4 td Intron Variants
[0234] For efficient genome editing in bacteria, HR should precede CRISPR-Cas counterselection. To assess whether tight control over CRISPR-Cas targeting could bolster the efficiency of CRISPR-Cas mediated genome editing by allowing more time for HR to occur, we used SIBR-Cas and targeted the LacZ gene of E. coli MG1655 for knock-out through HR and CRISPR-Cas counterselection using a blue/white screening colony assay. To facilitate HR, we added 500 bp up- and down-stream homology arms to the plasmids expressing the four SIBR-Cas (Int1-4) and WT-FnCas12a variants that target the LacZ gene. After 1 hour recovery, we induced the expression of the SIBR-Cas variants to counterselect the WT from the mutant colonies.
[0235] The WT-FnCas12a variant targeting the LacZ gene produced no colonies, demonstrating the targeting efficiency of WT-FnCas12a but also the inefficient HR system of the WT E. coli MG1655 strain (
[0236] Since disruption of LacZ can also be achieved through non-HR mediated approaches (spontaneous mutations or occasional error-prone DNA repair following DNA cleavage by Cas12a), not all gene deletions can be screened phenotypically. Therefore, we repeated our experiment, but X-gal was omitted from the medium to eliminate the possibility of false-positives. Randomly selected colonies that were obtained were screened by PCR for LacZ deletion showing a 0%, 0%, 29% and 38% editing efficiency for Int1, Int2, Int3 and Int4 SIBR-Cas variants, respectively (
Example 7: Tight and Inducible Expression of FnCas12a in Pseudomonas putida Using Tagged-T4 td Intron Variants
[0237] Following successful demonstration of inducible expression of FnCas12a in E. coli, the system was transferred to Pseudomonas putida, an organism with very low HR efficiencies. Plasmids bearing the four T4 td intron-FnCas12a variants or the intron-less FnCas12a and an EndA T or an NT crRNA were transformed into P. putida and the targeting efficiency was assessed by comparing the CFUs µg⁻¹ in the presence or absence of the theophylline inducer. All the plasmids used for this experiment are listed in Table 5.

TABLE 5. Plasmids used for targeting in P. putida.

| Plasmid name | Description and relevant characteristics |
|---|---|
| pSIBR EcoPpu NT tag 1 | KanR, FnCas12a with Tag1-intron, NT crRNA |
| pSIBR EcoPpu NT tag 2 | KanR, FnCas12a with Tag2-intron, NT crRNA |
| pSIBR EcoPpu NT tag 3 | KanR, FnCas12a with Tag3-intron, NT crRNA |
| pSIBR EcoPpu NT tag 4 | KanR, FnCas12a with Tag4-intron, NT crRNA |
| pSIBR EcoPpu NT no intron | KanR, WT FnCas12a, NT crRNA |
| pSIBR EcoPpu T EndA tag 1 | KanR, FnCas12a with Tag1-intron, EndA T crRNA |
| pSIBR EcoPpu T EndA tag 2 | KanR, FnCas12a with Tag2-intron, EndA T crRNA |
| pSIBR EcoPpu T EndA tag 3 | KanR, FnCas12a with Tag3-intron, EndA T crRNA |
| pSIBR EcoPpu T EndA tag 4 | KanR, FnCas12a with Tag4-intron, EndA T crRNA |
| pSIBR EcoPpu T EndA no intron | KanR, WT FnCas12a, EndA T crRNA |

[0238] In more detail, electrocompetent P. putida cells were transformed (2.5 kV, 200 Ω, 25 µF) with 200 ng plasmid and recovered in 1 mL LB for 2 hours at 30 °C. Then, the culture was serially diluted and drop plated on selective (50 µg mL⁻¹ kanamycin) LB agar plates in the presence or absence of 2 mM theophylline. The agar plates were incubated at 30 °C for 24 hours and the CFUs were counted.
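As a worked illustration of how a drop-plate count translates into CFUs µg⁻¹, the following Python sketch may be helpful. Only the 200 ng of plasmid and the 1 mL recovery volume come from the protocol above; the colony count, the dilution and the drop volume are hypothetical values, and the function name is an assumption made for this example.

```python
def cfu_per_ug(colonies, dilution_factor, drop_volume_ul, recovery_volume_ul, dna_ng):
    """Estimate transformants per microgram of plasmid DNA from one counted drop.

    colonies           -- colonies counted in the drop
    dilution_factor    -- fold dilution of the plated sample (e.g. 1000 for 10^-3)
    drop_volume_ul     -- volume of the plated drop, in microlitres
    recovery_volume_ul -- total volume of the recovered culture, in microlitres
    dna_ng             -- plasmid DNA used in the transformation, in nanograms
    """
    cfu_per_ul = colonies * dilution_factor / drop_volume_ul
    total_cfu = cfu_per_ul * recovery_volume_ul
    return total_cfu / (dna_ng / 1000.0)


# Hypothetical count: 12 colonies in a 5 uL drop of the 10^-3 dilution,
# from a 1 mL recovery of cells transformed with 200 ng plasmid.
example = cfu_per_ug(colonies=12, dilution_factor=1000, drop_volume_ul=5,
                     recovery_volume_ul=1000, dna_ng=200)
print(f"{example:.2e} CFU per ug")
```

Comparing this value for the same plasmid plated with and without theophylline gives the targeting readout described in the text.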
[0239]
Example 8: Efficient Homologous Recombination in P. putida Using Tagged-T4 td Intron Variants
[0240] Further genome editing experiments were conducted to knock out the FlgM gene of P. putida. A repair template (1125 bp) bearing approximately 500 bp homology arms upstream and downstream of the FlgM gene was included on the plasmids. The repair template was introduced into each of the four tagged-intron-FnCas12a variants along with the T crRNA for counterselection or the NT crRNA as a control. A list of the plasmids is given in Table 6. Plasmids were transformed into P. putida through electroporation and the transformed cells were recovered in LB medium for 2 hours before plating on LB agar plates containing 50 µg mL⁻¹ kanamycin and 2 mM theophylline. Plates were incubated at 30 °C overnight and the colonies formed were screened through colony PCR for the knock-out of the FlgM gene.

TABLE 6. Plasmids used for homologous recombination in P. putida.

| Plasmid name | Description and relevant characteristics |
|---|---|
| pSIBR EcoPpu NT FlgM HA tag 1 | KanR, FnCas12a with Tag1-intron, NT crRNA, Homologous arms for FlgM |
| pSIBR EcoPpu NT FlgM HA tag 2 | KanR, FnCas12a with Tag2-intron, NT crRNA, Homologous arms for FlgM |
| pSIBR EcoPpu NT FlgM HA tag 3 | KanR, FnCas12a with Tag3-intron, NT crRNA, Homologous arms for FlgM |
| pSIBR EcoPpu NT FlgM HA tag 4 | KanR, FnCas12a with Tag4-intron, NT crRNA, Homologous arms for FlgM |
| pSIBR EcoPpu T FlgM HA tag 1 | KanR, FnCas12a with Tag1-intron, EndA T crRNA, Homologous arms for FlgM |
| pSIBR EcoPpu T FlgM HA tag 2 | KanR, FnCas12a with Tag2-intron, EndA T crRNA, Homologous arms for FlgM |
| pSIBR EcoPpu T FlgM HA tag 3 | KanR, FnCas12a with Tag3-intron, EndA T crRNA, Homologous arms for FlgM |
| pSIBR EcoPpu T FlgM HA tag 4 | KanR, FnCas12a with Tag4-intron, EndA T crRNA, Homologous arms for FlgM |
Example 9: Using a Theophylline Induced Self-Splicing Intron to Control the Expression of Cas12a and Knock Out the SprF Essential Gene in the Non-Model Organism Flavobacterium IR1
[0241] Flavobacterium IR1 is a non-model organism known for its iridescent colour (see Johansen, V., et al., (2018) Genetic manipulation of structural colour in bacterial colonies Proceedings of the National Academy of Sciences 115 (11): 2652-2657; and Schertel, L., G. T. et al., (2020) Complex photonic response reveals three-dimensional self-organization of structural coloured bacterial colonies Journal of the Royal Society Interface 17 (166): 20200196). The lack of genomic tools and the low HR efficiency of IR1 are currently the main bottlenecks limiting the fundamental characterization and commercial exploitation of this phenomenon (i.e. development of new paints). As IR1 is a recently discovered non-model organism, inducible promoters are not characterized. Therefore, the control of CRISPR-Cas cannot succeed without a promoter-independent regulatory system such as is disclosed herein.
[0242] To establish controllable genetic engineering tools for IR1, plasmids were constructed by inserting the 300 bp self-splicing aptazyme intron of Thompson et al., (2002) Supra into the fncas12a gene to provide a module, and subsequently inserting this module into two editing plasmids yielding pSIBRFnCas12a_sprF_HR_NT (no-target spacer) and pSIBRFnCas12a_sprF_HR_S3 (spacer targeting sprF gene). For this, the theophylline T4 td intron was introduced in the ORF of FnCas12a. The insertion position was generated by using the algorithm of Example 11. The insertion position is illustrated in
[0243] The constructed plasmids were then transformed into IR1 and cultured following the experimental design shown in
[0244] Prior to theophylline induction, the liquid cultures of IR1 transformed with pSIBRFnCas12a_sprF_HR_S3 and incubated for 0, 24, and 48 h showed no obvious growth (data not shown). Correspondingly, no colonies were obtained after plating these cultures following theophylline induction in the liquid culture. In contrast, IR1 transformed with the non-targeting plasmid pSIBRFnCas12aFb_sprF_HR_NT showed growth following 24 and 48 h incubation, both prior to and after induction with theophylline.
[0245]
[0246] Interestingly, after 72 h and 96 h incubation, cultures transformed with pSIBRFnCas12a_sprF_HR_S3 started to show some growth (data not shown). Likewise, colonies were also obtained when plating these cultures after theophylline induction.
Example 10: Highly Efficient Homologous Recombination in the Non-Model Organism Flavobacterium IR1
[0247] To demonstrate the inefficient HR mechanism of IR1, the organism was transformed using electroporation with a plasmid expressing an intron-less (WT) FnCas12a under a constitutive promoter (OmpA-P), and a T crRNA targeting the SprF gene under the constitutive promoter HU-P. A repair template (2963 bp) for knocking out the SprF gene through HR was also included on the plasmid resulting in the final plasmid pFnCas12aFb_sprF_HR_T. As control, the crRNA was replaced with an NT crRNA resulting in pFnCas12aFb_sprF_HR_NT. Also, the pCP11 empty vector was used as an indicator for transformation efficiency.
[0248] IR1 electrocompetent cells were prepared as follows: IR1 was grown overnight in 10 mL of ASW at 25 °C, 200 rpm. The overnight culture was used to inoculate 2 × 100 mL ASW broth in 500 mL baffled flasks and grown until it reached an OD600 of 0.3. Thereafter, the cells were harvested by centrifugation at 4000 rpm for 10 minutes at 4 °C. The cells were washed two times with 1 volume of washing buffer (10 mM MgCl₂ and 5 mM CaCl₂) at 4 °C and washed once with 10% (v/v) glycerol (Gilchrist and Smit, (1991) Transformation of freshwater and marine caulobacters by electroporation Journal of Bacteriology 173.2: 921-925). The pellet was suspended using 10% glycerol to 1/100 of the initial volume. Cells were divided into aliquots of 100 µL in 1.5 mL Eppendorf tubes and stored at −80 °C until use.
[0249] IR1 electrocompetent cells were transformed with 1 µg µL⁻¹ plasmid in a 1-mm cuvette using the following settings: 1.5 kV, 200 Ω, 25 µF. 900 µL of ASW medium [5 g L⁻¹ peptone (Sigma #70173), 1 g L⁻¹ yeast extract (BD), 10 g L⁻¹ sea salt (Sel marin)] was added immediately and the cells were incubated at 25 °C for 4 hours for recovery. The cells were plated on ASWBC agar [ASW medium, 15 g L⁻¹ agar (Oxoid), 100 mg L⁻¹ nigrosine (Aldrich #198285), and 5 g L⁻¹ Kappa Carrageenan (Special Ingredients)] supplemented with 200 µg mL⁻¹ erythromycin and incubated at 25 °C for 2 to 3 days.
[0250]
[0251] Clearly, the constitutive expression of the WT FnCas12a and the T crRNA along with the inefficient HR machinery of IR1 resulted in the targeting of the genome of IR1 causing cell death. To overcome this limitation, it is suggested to use the four T4 td intron-FnCas12a variants (as developed for E. coli and P. putida) and this would result in the tight and controlled expression of FnCas12a in order to allow HR to precede counterselection.
[0252] Because theophylline uptake by IR1 appears to be a prerequisite, a toxicity assay was carried out on the growth of IR1 with varying theophylline concentrations (0, 0.1, 2, 5 and mM) grown for 24 hours at 25 °C.
[0253] To achieve efficient HR in IR1, the WT FnCas12a in pFnCas12aFb_sprF_HR_S3 was replaced with the four T4 td intron-FnCas12a variants developed previously for E. coli and P. putida, resulting in the plasmids listed in Table 7. As a control, the WT FnCas12a of pFnCas12aFb_sprF_HR_NT was likewise replaced with the four T4 td intron-FnCas12a variants (Table 7). In addition, an improved method was developed to increase the number of colonies obtained, as this increases the chances of obtaining knock-outs.
[0254]
TABLE 7. Plasmids used for homologous recombination in Flavobacterium IR1.

| Plasmid name | Description and relevant characteristics |
|---|---|
| pSIBR Flavo NT SprF HA tag 1 | SpecR for E. coli, ErmR for Flavo, FnCas12a with Tag1-intron, NT crRNA, Homologous arms for FlgM |
| pSIBR Flavo NT SprF HA tag 2 | SpecR for E. coli, ErmR for Flavo, FnCas12a with Tag2-intron, NT crRNA, Homologous arms for FlgM |
| pSIBR Flavo NT SprF HA tag 3 | SpecR for E. coli, ErmR for Flavo, FnCas12a with Tag3-intron, NT crRNA, Homologous arms for FlgM |
| pSIBR Flavo NT SprF HA tag 4 | SpecR for E. coli, ErmR for Flavo, FnCas12a with Tag4-intron, NT crRNA, Homologous arms for FlgM |
| pSIBR Flavo T SprF HA tag 1 | SpecR for E. coli, ErmR for Flavo, FnCas12a with Tag1-intron, T crRNA, Homologous arms for FlgM |
| pSIBR Flavo T SprF HA tag 2 | SpecR for E. coli, ErmR for Flavo, FnCas12a with Tag2-intron, T crRNA, Homologous arms for FlgM |
| pSIBR Flavo T SprF HA tag 3 | SpecR for E. coli, ErmR for Flavo, FnCas12a with Tag3-intron, T crRNA, Homologous arms for FlgM |
| pSIBR Flavo T SprF HA tag 4 | SpecR for E. coli, ErmR for Flavo, FnCas12a with Tag4-intron, T crRNA, Homologous arms for FlgM |
Example 11: Python Script Used to Find Insertion Sites
[0255] By using the following script, the user can upload the sequence of the gene of interest and the script will return possible insertion sites for the T4 td intron. The insertion sites will require point mutations to be introduced when inserting the intron at the target site. Multiple sites are possible options but one at the beginning of the gene is recommended to eliminate potential function of partially produced proteins.
TABLE-US-00008
# Python 2 listing (uses raw_input and the print statement).
import csv

def findbindingtype(Q, S):
    # (P)air, (W)obble, (M)ismatch, X = undefined (N in query or subject)
    a = 'M'
    if Q == 'T' and S == 'A': a = 'P'
    if Q == 'T' and S == 'G': a = 'W'
    if Q == 'C' and S == 'G': a = 'P'
    if Q == 'A' and S == 'T': a = 'P'
    if Q == 'G' and S == 'C': a = 'P'
    if Q == 'G' and S == 'T': a = 'W'
    if Q == 'N' or S == 'N': a = 'X'
    return a

def permutatelist(X):
    # X is a list of lists of strings; returns every concatenation
    # that takes one string from each inner list, in order.
    Y = ['']
    for i in X:
        Z = Y
        Y = []
        for j in i:
            for I in range(len(Z)):
                Y.append(Z[I]+j)
    return Y

def matchsense(q, s):
    # Returns 1.0 if subject base s satisfies IUPAC query code q, else 100.0
    a = 100.000
    if q == s: a = 1.000
    if q == 'R' and (s == 'A' or s == 'G'): a = 1.000
    if q == 'Y' and (s == 'C' or s == 'T'): a = 1.000
    if q == 'S' and (s == 'G' or s == 'C'): a = 1.000
    if q == 'W' and (s == 'A' or s == 'T'): a = 1.000
    if q == 'K' and (s == 'G' or s == 'T'): a = 1.000
    if q == 'M' and (s == 'A' or s == 'C'): a = 1.000
    if q == 'B' and s != 'A': a = 1.000
    if q == 'D' and s != 'C': a = 1.000
    if q == 'H' and s != 'G': a = 1.000
    if q == 'V' and s != 'T': a = 1.000
    if q == 'N': a = 1.000
    return a

def RevComp(sequence):
    RC = ''
    for i in sequence:
        j = ''
        if i == 'T': j = 'A'
        if i == 'C': j = 'G'
        if i == 'A': j = 'T'
        if i == 'G': j = 'C'
        if i == 'Y': j = 'R'
        if i == 'W': j = 'W'
        if i == 'K': j = 'M'
        if i == 'M': j = 'K'
        if i == 'S': j = 'S'
        if i == 'R': j = 'Y'
        # IUPAC complements of the three-base codes
        if i == 'H': j = 'D'
        if i == 'B': j = 'V'
        if i == 'D': j = 'H'
        if i == 'V': j = 'B'
        if i == 'N': j = 'N'
        RC = j + RC
    return RC

#-------------- Input Subject -------------------------
SubDNA = []
FN = raw_input('DNA File name = ')
with open(FN, 'r+') as f:
    for r in f:
        for c in r:
            if c.upper() == 'G' or c.upper() == 'A' or c.upper() == 'T' or c.upper() == 'C':
                SubDNA.append(c.upper())

#-------------- Input Query ---------------------------
#8 7 6 5 4 3 2 1 1 2 3 4
QueDNA = ['N','T','G','A','G','T','C','C','G','G','A','G','T'] #Native
QueDNA = ['N','T','G','A','G','T','C','C','G','G','A','G'] #Simplified
#Query Functions:
#(P)air, (W)obbly, (N)onbinding
#(F)ixed (P)air, (F)ixed (W)obbly, (F)ixed (N)onbinding

#-------------- Import Codon Table --------------------
AA = []
Codon = []
with open('Codon.csv') as csvfile:
    f = csv.reader(csvfile, delimiter=',', quotechar='|')
    for r in f:
        Codon.append(r[0])
        AA.append(r[1])

#-------------- Translate Subject ---------------------
SubPro = []
for i in range(int(len(SubDNA)/3)):
    qcodon = SubDNA[3*i]+SubDNA[3*i+1]+SubDNA[3*i+2]
    for j in range(64):
        if Codon[j] == qcodon:
            SubPro.append(AA[j])

#--------------- Define Peptide -----------------------
TBS_DNA = []
TBS_Pos = []
for s in range(len(SubPro)-int((len(QueDNA)+4)/3)-1): #Subject Protein Start
    Peptide = SubPro[s:s+int((len(QueDNA)+4)/3)]
    #--------------- Reverse Translate --------------------
    mRNA = []
    for i in Peptide:
        Cdn = []
        for j in range(64):
            if AA[j] == i:
                Cdn.append(Codon[j])
        mRNA.append(Cdn)
    RevTrans = permutatelist(mRNA)
    TBS_DNA = TBS_DNA + RevTrans
    for k in RevTrans:
        TBS_Pos.append(s)

#--------------- Test binding type --------------------
print 'Binding Type analysis'
ScoreTable_BP = []
ScoreTable_Score = []
with open('TdScoreTable.csv') as csvfile:
    f = csv.reader(csvfile, delimiter=',', quotechar='|')
    for r in f:
        ScoreTable_BP.append(r[0])
        ScoreTable_Score.append(r[1])
DNA_List = []
Pos_List = []
Score_List = []
Pep_List = []
Frame_List = []
Query = ['N','N','N'] + QueDNA + ['N','N','N']
for i in range(len(TBS_DNA)): #Subject list
    Subject = TBS_DNA[i]
    for f in range(3): #Frame select
        BP = ''
        for j in range(len(QueDNA)):
            BP = BP + findbindingtype(QueDNA[j],Subject[j+f])
        Score = 0
        for k in range(len(ScoreTable_BP)):
            if BP == ScoreTable_BP[k]:
                Score = ScoreTable_Score[k]
        DNA_List.append(TBS_DNA[i])
        Pos_List.append(TBS_Pos[i]+1)
        Score_List.append(Score)
        Frame_List.append(f+1)
        Peptide = ''
        for I in SubPro[TBS_Pos[i]:TBS_Pos[i]+int((len(QueDNA)+4)/3)]:
            Peptide = Peptide + I
        Pep_List.append(Peptide)

#--------------- select 5 best ------------------------
Result = []
for row in range(len(Pos_List)):
    Result.append([Pos_List[row], DNA_List[row], Pep_List[row], Score_List[row], Frame_List[row]])
Result2 = sorted(sorted(Result, key=lambda A: A[3], reverse=True), key=lambda A: A[0])
A = Result2[0][0]
count = 0
Result3 = []
for r in range(len(Result2)):
    if Result2[r][0] == A and count <= 5:
        Result3.append(Result2[r])
        count = count + 1
    if Result2[r][0] != A:
        A = Result2[r][0]
        count = 1
print Result3

#--------------- Export to CSV ------------------------
print 'Writing intron sites to file'
with open(FN+'.csv', 'wb') as F:
    Writer = csv.writer(F, delimiter=',')
    Writer.writerow(['Position', 'DNA', 'Protein', 'Score', 'Frame'])
    for row in range(len(Result3)):
        Writer.writerow(Result3[row])

#------------------Find the Restriction Enzymes-------------------------
#------------------Import RE Sequences------------------
PreREName = []
PreRESeq = []
with open('NEB RE.csv') as csvfile:
    f = csv.reader(csvfile, delimiter=',', quotechar='|')
    for r in f:
        PreREName.append(r[0])
        PreRESeq.append(r[1])

#------------------Find RE in DNA Sequence--------------------------
print('Analysing current Restriction endonuclease sites')
REName = []
RESeq = []
REPresent = []
PreREPresent = []
for e in range(len(PreRESeq)):
    QueDNA = []
    s = 0
    for c in PreRESeq[e]:
        QueDNA.append(c)
    for i in range(len(SubDNA)-len(QueDNA)+1):
        score = 0
        for j in range(len(QueDNA)):
            score = score + matchsense(QueDNA[j],SubDNA[i+j])
        if score == len(QueDNA):
            if s == 0:
                Result = str(int((i+3)/3))
            if s > 0 and s <= 3:
                Result = Result + ',' + str(int((i+3)/3))
            s = s + 1
    if s > 0:
        PreREPresent.append(Result)
    if s == 0:
        PreREPresent.append('N/A')
    if s <= 3:
        REPresent.append(PreREPresent[e])
        REName.append(PreREName[e])
        RESeq.append(PreRESeq[e])

#------------------Find RE in protein Sequence---------------------------
ResultName = []
ResultPos = []
ResultDNA = []
ResultProtein = []
ResultFrame = []
ResultOtherRE = []
with open(FN+'-RE.csv', 'wb') as F:
    Writer = csv.writer(F, delimiter=',')
    Writer.writerow(['Enzyme', 'Position', 'DNA Seq', 'AA Seq', 'Frame', 'RE sites present'])
print('Analysing potential Restriction endonuclease sites')
for e in range(len(RESeq)):
    QueDNA = ['N','N','N']
    for c in RESeq[e]:
        QueDNA.append(c)
    QueDNA = QueDNA+['N']+['N']+['N']+['N']
    for f in range(3):
        for i in range(len(SubPro)-int((len(QueDNA)-3)/3)+1): #Subject amino acid start
            currentDNA = ''
            currentProtein = ''
            currentScore = 0
            m = 0 #Maxscore counter
            for j in range(int((len(QueDNA)-3)/3)): #Subject amino acid
                SubAA = SubPro[i+j]
                maxscore = 0
                for k in range(64): #64 codons/AA
                    if SubAA == AA[k]:
                        score = 0
                        QueCodon = Codon[k]
                        for I in range(3):
                            score = score + matchsense(QueDNA[3*j+3+I-f],QueCodon[I])
                        if score < 100 and score > maxscore:
                            maxscore = score
                            maxcodon = k
                if maxscore > 0:
                    m = m + 1
                    currentDNA = currentDNA + Codon[maxcodon]
                    currentProtein = currentProtein + AA[maxcodon]
                    currentScore = currentScore + maxscore
            if m == int((len(QueDNA)-3)/3):
                ResultName.append(REName[e])
                ResultPos.append(i+1)
                ResultDNA.append(currentDNA)
                ResultProtein.append(currentProtein)
                ResultFrame.append(f+1)
                ResultOtherRE.append(REPresent[e])
print('Writing RE sites to file')
with open(FN+'-RE.csv', 'ab') as F:
    Writer = csv.writer(F, delimiter=',')
    for i in range(len(ResultName)):
        Writer.writerow([ResultName[i], ResultPos[i], ResultDNA[i], ResultProtein[i], ResultFrame[i], ResultOtherRE[i]])
print 'Done'
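The script reads a two-column codon table from Codon.csv, with the codon in the first column and the one-letter amino acid in the second. That file is not reproduced in the listing. A minimal sketch for generating such a table from the standard genetic code is given here. The function names, the use of `*` for stop codons and the Python 3 syntax are assumptions made for illustration and are not part of the original listing.

```python
import csv
import itertools

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# ('*' marks a stop codon; this symbol is an assumption, not from the listing).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def standard_codon_table():
    """Return a list of (codon, amino_acid) pairs for all 64 codons."""
    codons = ["".join(c) for c in itertools.product(BASES, repeat=3)]
    return list(zip(codons, AMINO_ACIDS))


def write_codon_csv(path="Codon.csv"):
    """Write the table in the two-column layout the script reads."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=",", quotechar="|")
        writer.writerows(standard_codon_table())
```

Calling `write_codon_csv()` once in the working directory would produce a 64-row file. Whether the authors' own Codon.csv used the same stop symbol or row order is not stated.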
Example 12: Using the SIBR T4 td Intron for Inducible Gene Expression in the Eukaryotic Model Organism Baker's Yeast (Saccharomyces cerevisiae)
[0256] To show the functionality and applicability of SIBR in eukaryotic systems, we transferred SIBR into the eukaryotic model organism Baker's yeast (Saccharomyces cerevisiae) and controlled the expression of the FnCas12a protein.
[0257] To control the activity of FnCas12a, we sought to disrupt the encoded protein through the placement of SIBR. To this end, using the knowledge acquired from Examples 1, 2 and 3 and the script generated in Example 4 and/or 11, we introduced SIBR before the RuvC I domain at amino acid position 859 (
[0258] PL-319, PL-320 or pUDE731 were co-transformed into the yeast S. cerevisiae with either a plasmid containing a non-targeting spacer (PL-207) or a plasmid containing a targeting spacer (PL-074). The targeting spacer targeted the ADE2 gene. To transform S. cerevisiae, the LiAc/SS carrier DNA/PEG method of Gietz and Schiestl (Gietz, R. D. and Schiestl, R. H., 2007. High-efficiency yeast transformation using the LiAc/SS carrier DNA/PEG method. Nature protocols, 2(1), pp. 31-34) was used. 500 ng of each plasmid was used per transformation. After transformation, the transformed yeast cells were recovered in YPD medium for 3 hours at 30 °C, then serially diluted in PBS and plated on drop-out (omitting uracil; for the selection of PL-319, PL-320 or pUDE731) minimal agar medium (1.7 g/L bacto-yeast nitrogen base without amino acids and without ammonium sulfate; 1 g/L monosodium glutamate; 20 g/L glucose; 20 g/L agar) containing 200 µg/mL Geneticin (G418 sulfate) antibiotic (for the selection of PL-074 or PL-207; targeting and non-targeting plasmids) and different concentrations of theophylline (0, 5, 10, 20 mM).
[0259] The results of this experiment are depicted in
Example 13: Turning any Group I Intron into a SIBR System
[0260] As noted herein above, Group I introns, like the T4 td intron, form core secondary structures consisting of multiple paired regions. In principle, to turn any Group I intron into a Self-splicing Intron Based Riboswitch (SIBR) according to the invention, a stepwise approach similar to the one described in this patent can be followed, for example as set out below.
[0261] As a first step, a library of mutant 5′ and 3′ exonic sequences is developed, since the 5′ and 3′ exonic sequences of Group I introns interact with the intron sequence and affect the secondary and tertiary structure of the intron. This mutant library will serve as the basis to define the effect of the 5′ and 3′ exonic sequences on the splicing efficiency of the intron. Moreover, this library will contain introns with a range (low to high) of splicing efficiencies. It is likely that the mutant intron library will contain introns with better splicing efficiency than the wild type intron (similar to the results observed in Examples 1-3). Also, this library will allow the transfer of the intron of interest to the open reading frame of any gene of interest without disturbing the amino acid sequence of the target gene/protein, for example when applying the script of Example 4 and/or 11.
[0262] Next, to achieve inducible control over the splicing of the intron, an aptamer moiety which responds to specific small molecules (e.g. theophylline) is introduced at one or multiple pairing (P) domains of the intron. For example, as described by Thompson et al., 2002, and also shown in Examples 5 and 7, the theophylline aptamer is introduced at the P6 domain of the T4 td intron, turning it into an inducible self-splicing gene regulator. Another example is described by Kertsburg and Soukup, 2002 (Nucleic Acids Research, Volume 30, Issue 21, 1 Nov. 2002, pages 4599-4606), who turned the Tetrahymena group I intron into an inducible self-splicing intron by replacing the P6 domain, the P8 domain, or both with a theophylline aptamer. Similar approaches (to that of Thompson et al., 2002 and that of Kertsburg and Soukup, 2002) can be taken for any other Group I intron, in which one of its P domains is altered to contain an aptamer moiety that responds to specific small molecules and can consequently control the splicing of the intron.
[0263] After generating the mutant intron library (mutations at the 5′ and 3′ exonic sequences) and achieving inducible control over the splicing of the intron (through the introduction of an aptamer in one of the P domains of the intron), the generated intron variants can be moved to the ATG start codon, or 5′ to the start codon, of the polynucleotide portion encoding the POI, or within the polynucleotide portion encoding the POI. When transferring the intron to a location of choice, care should be taken to avoid a codon frameshift after splicing, as this will result in a nonsense protein.
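The reading-frame requirement in the last sentence can be checked computationally before cloning. The sketch below, written in Python 3, removes the inserted intron from a designed construct and confirms that the spliced sequence has the same length and the same translation as the reference coding sequence, so that synonymous exon mutations are tolerated but frameshifts are flagged. The function names and the toy coding sequence are hypothetical. The intron stand-in is only the first 15 nucleotides of SEQ ID NO: 44, and the check assumes the intron boundaries are known exactly.

```python
import itertools

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip(("".join(c) for c in itertools.product(BASES, repeat=3)), AMINO_ACIDS)
)


def translate(dna):
    """Translate complete codons of an upper-case DNA string."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def splice_out(construct, intron):
    """Remove the intron, which must occur exactly once in the construct."""
    if construct.count(intron) != 1:
        raise ValueError("intron must occur exactly once in the construct")
    return construct.replace(intron, "", 1)


def preserves_protein(construct, intron, reference_cds):
    """True if splicing restores the reference length and translation."""
    spliced = splice_out(construct, intron)
    same_length = len(spliced) == len(reference_cds)
    return same_length and translate(spliced) == translate(reference_cds)


# Toy example (hypothetical sequences).
reference = "ATGTCTTGGGCTAAA"      # encodes M S W A K
intron = "TAATTGAGGCCTGAG"         # first 15 nt of SEQ ID NO: 44, as a stand-in
good = "ATGTCCT" + intron + "GGGCTAAA"    # TCT->TCC is synonymous (Ser)
bad = "ATGTCCT" + intron + "GGGGCTAAA"    # one extra G shifts the frame

print(preserves_protein(good, intron, reference))
print(preserves_protein(bad, intron, reference))
```

Comparing translations rather than raw nucleotides is a deliberate choice here, because the exon-flanking point mutations discussed above change the DNA while leaving the protein unchanged.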
Example 14: Turning any Group II Intron into a SIBR System
[0264] Group II introns are found in higher (plants) and lower (fungi and yeasts) eukaryotes, and also in bacteria. Like Group I introns, Group II introns reside within genes (separating them into 5′ and 3′ exons), and their excision (which forms a lariat product rather than the linear product observed for Group I introns) allows a functional protein to be formed. Group II introns can self-splice, although some intron-encoded proteins (IEPs) may facilitate splicing by stabilizing the intron RNA structure. The 5′ and 3′ exonic sequences of Group II introns (called intron-binding sites or IBS) interact with conserved domains of the intron (called exon-binding sites or EBS) to form long-range tertiary interactions. These intron-exon interactions are necessary for splicing, as they bring the exons and the active site of the intron together to facilitate the transesterification reactions that mediate excision of the intron. The necessity of intron-exon interactions for splicing translates into a limitation on transferring any Group II intron into any gene of interest (GOI), as the exon sequences need to be conserved. To overcome this, an approach similar to the one developed in this patent for Group I introns can be used.
[0265] First, a mutant library of Group II introns can be generated in which the exon sequences (IBS1 and IBS2 for the 5′ exon and IBS3 for the 3′ exon) are mutated. In some cases, and especially when the IBS is heavily mutated, the EBS might need to be modified as well to maintain the IBS-EBS base-pairing necessary for the formation of long-range tertiary interactions. The generated mutant library is then assessed for the efficiency of the self-splicing activity of the intron, following an approach similar to that employed for LacZ as described in Examples 1-3. It is important to note that self-splicing efficiency can be assessed by any other in vitro or in vivo method (other than LacZ), as long as it can distinguish the formation of spliced products from un-spliced products, or the formation and quantity of active protein from inactive protein.
[0266] Where the Group II intron mutant library is assayed through a protein (similar to LacZ; Examples 1-3), the Group II intron can, for convenience, be placed directly after the ATG start codon in order to maintain the coding sequence of the protein. This approach was described in Examples 1-3 (LacZ) and Example 5 (FnCas12a). The Group II intron mutant library assay is expected to yield a range of introns, from well-splicing to poorly splicing, which can then be used to modulate/tune the expression of the gene/RNA/protein of interest.
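Whether a mutated IBS still pairs with its EBS, and which EBS would restore pairing, can be screened with a few lines of code before building the library. The following Python 3 sketch pairs an IBS against an EBS in antiparallel orientation and reports each position as a Watson-Crick pair (P), a G-U wobble (W) or a mismatch (M), using the same letter codes as the findbindingtype function of Example 11. The function names and the six-nucleotide toy sequences are hypothetical. Real IBS and EBS lengths and positions depend on the intron, and base pairing alone does not predict splicing efficiency.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def pairing_profile(ibs, ebs):
    """Pair IBS and EBS (both RNA, written 5'->3') antiparallel.

    Returns one letter per IBS position: P (pair), W (wobble), M (mismatch).
    """
    if len(ibs) != len(ebs):
        raise ValueError("IBS and EBS must have the same length")
    profile = []
    for x, y in zip(ibs, reversed(ebs)):
        if (x, y) in WATSON_CRICK:
            profile.append("P")
        elif (x, y) in WOBBLE:
            profile.append("W")
        else:
            profile.append("M")
    return "".join(profile)


def compensatory_ebs(ibs):
    """Return the EBS (5'->3') that forms Watson-Crick pairs with the IBS."""
    return "".join(COMPLEMENT[base] for base in reversed(ibs))


# Toy example (hypothetical sequences, not taken from a real intron).
wild_type_ibs = "GACUUA"
ebs = compensatory_ebs(wild_type_ibs)
print(ebs, pairing_profile(wild_type_ibs, ebs))
print(pairing_profile("GGCUUA", ebs))   # A->G at position 2 gives a wobble
print(pairing_profile("GCCUUA", ebs))   # A->C at position 2 gives a mismatch
```

In this sketch a mismatch in the profile is the signal that the EBS may need a compensatory change, which `compensatory_ebs` proposes for the mutated IBS.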
[0267] After establishing the requirements for splicing as defined by the IBS-EBS interactions, a script similar to Example 4 and/or 11 can be developed that allows for transferring the mutant Group II intron to virtually any gene/RNA/protein of interest.
[0268] In case inducible self-splicing is required, an aptamer moiety which responds to specific small molecules (e.g. theophylline) is introduced at one or multiple pairing (P) domains of the intron. To achieve this, the approaches developed and applied by Thompson et al. (2002) ibid., and Kertsburg and Soukup (2002) ibid., can be used.
[0269] After generating the mutant Group II intron library (mutations at the 5′ and 3′ exonic sequences) and achieving inducible control over the splicing of the Group II intron (through the introduction of an aptamer in one of the P domains of the intron), the generated Group II intron variants can be moved to the ATG start codon, or 5′ of the ATG start codon, of the polynucleotide portion encoding the POI, or within the polynucleotide portion encoding the POI. When transferring the intron to the location of choice, care should be taken to avoid a codon frameshift after splicing, as this will result in a nonsense protein.
Example 15: Turning any Group III Intron into a SIBR System
[0270] In general, Group III introns are short (approx. 100 nt) U-rich introns which are predominantly found in Euglena gracilis. Group III introns are considered streamlined versions of Group II introns, as they retain the 5′ splice site of Group II introns but lack the catalytic domain V and the domains II-IV. To splice, a mechanism similar to that of Group II introns is used, in which IBS1 pairs with EBS1 to form long-range tertiary interactions and facilitate splicing (Hong, L. and Hallick, R. B., 1994. A group III intron is formed from domains of two individual group II introns. Genes & development, 8(13), pp. 1589-1599). In principle, Group III introns can be turned into SIBR by changing/mutating the IBS-EBS interactions as described in Example 14 and by introducing a ligand-dependent aptamer into one of their domains (e.g. at the VI domain). The defined mutant libraries can then be used to modulate the splicing efficiency of the introns.
[0271] Throughout the description and claims of this specification, the words "comprise" and "contain" and variations of them mean "including but not limited to", and they are not intended to (and do not) exclude other moieties, additives, components, integers or steps. Throughout the description and claims of this specification, the singular encompasses the plural unless the context otherwise requires. In particular, where the indefinite article is used, the specification is to be understood as contemplating plurality as well as singularity, unless the context requires otherwise.
[0272] Features, integers, characteristics, compounds, chemical moieties or groups described in conjunction with a particular aspect, embodiment or example of the invention are to be understood to be applicable to any other aspect, embodiment or example described herein unless incompatible therewith. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive. The invention is not restricted to the details of any foregoing embodiments. The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed.
[0273] The reader's attention is directed to all papers and documents which are filed concurrently with or previous to this specification in connection with this application and which are open to public inspection with this specification, and the contents of all such papers and documents are incorporated herein by reference.
[0274] Nucleotide Sequences
TABLE-US-00009
Sequence of WT T4 td intron (SEQ ID NO: 44):
TAATTGAGGCCTGAGTATAAGGTGACTTATACTTGTAATCTATCTAAACGGGGAACCT
CTCTAGTAGACAATCCCGTGCTAAATTGTAGGACTTGCCCTTTAATAAATACTTCTATA
TTTAAAGAGGTATTTATGAAAAGCGGAATTTATCAGATTAAAAATACTTTAAACAATAAA
GTATATGTAGGAAGTGCTAAAGATTTTGAAAAGAGATGGAAGAGGCATTTTAAAGATT
TAGAAAAAGGATGCCATTCTTCTATAAAACTTCAGAGGTCTTTTAACAAACATGGTAAT
GTGTTTGAATGTTCTATTTTGGAAGAAATTCCATATGAGAAAGATTTGATTATTGAACG
AGAAAATTTTTGGATTAAAGAGCTTAATTCTAAAATTAATGGATACAATATTGCTGATG
CAACGTTTGGTGATACATGTTCTACGCATCCATTAAAAGAAGAAATTATTAAGAAACGT
TCTGAAACTGTTAAAGCTAAGATGCTTAAACTTGGACCTGATGGTCGGAAAGCTCTTT
ACAGTAAACCCGGAAGTAAAAACGGGCGTTGGAATCCAGAAACCCATAAGTTTTGTAA
GTGCGGTGTTCGCATACAAACTTCTGCTTATACTTGTAGTAAATGCAGAAATCGTTCA
GGTGAAAATAATTCATTCTTTAATCATAAGCATTCAGACATAACTAAATCTAAAATATCA
GAAAAGATGAAAGGTAAAAAGCCTAGTAATATTAAAAAGATTTCATGTGATGGGGTTAT
TTTTGATTGTGCAGCAGATGCAGCTAGACATTTTAAAATTTCGTCTGGATTAGTTACTT
ATCGTGTAAAATCTGATAAATGGAATTGGTTCTACATAAATGCCTAACGACTATCCCTT
TGGGGAGTAGGGTCAAGTGACTCGAAACGATAGACAACTTGCTTTAACAAGTTGGAG
ATATAGTCTGCTCTGCATGGTGACATGCAGCTGGATATAATTCCGGGGTAAGATTAAC
GACCTTATCTGAACATAATG

Sequence of the WT T4 td intron with the theophylline aptamer (SEQ ID NO: 45):
TTCTTGGGTTAATTGAGGCCTGAGTATAAGGTGACTTATACTTGTAATCTATCTAAACG
GGGAACCTCTCTAGTAGACAATCCCGTGCTAAATTGATACCAGCATCGTCTTGATGCC
CTTGGCAGCATAAATGCCTAACGACTATCCCTTTGGGGAGTAGGGTCAAGTGACTCG
AAACGATAGACAACTTGCTTTAACAAGTTGGAGATATAGTCTGCTCTGCATGGTGACA
TGCAGCTGGATATAATTCCGGGGTAAGATTAACGACCTTATCTGAACATAATGCTA

Sequence of the Tag1 T4 td intron with the theophylline aptamer (SEQ ID NO: 49):
TCCTCAGGTTAATTGAGGCCTGAGTATAAGGTGACTTATACTTGTAATCTATCTAAACG
GGGAACCTCTCTAGTAGACAATCCCGTGCTAAATTGATACCAGCATCGTCTTGATGCC
CTTGGCAGCATAAATGCCTAACGACTATCCCTTTGGGGAGTAGGGTCAAGTGACTCG
AAACGATAGACAACTTGCTTTAACAAGTTGGAGATATAGTCTGCTCTGCATGGTGACA
TGCAGCTGGATATAATTCCGGGGTAAGATTAACGACCTTATCTGAACATAATGCTA

Sequence of the Tag2 T4 td intron with the theophylline aptamer (SEQ ID NO: 46):
TCCTCGGGTTAATTGAGGCCTGAGTATAAGGTGACTTATACTTGTAATCTATCTAAAC
GGGGAACCTCTCTAGTAGACAATCCCGTGCTAAATTGATACCAGCATCGTCTTGATGC
CCTTGGCAGCATAAATGCCTAACGACTATCCCTTTGGGGAGTAGGGTCAAGTGACTC
GAAACGATAGACAACTTGCTTTAACAAGTTGGAGATATAGTCTGCTCTGCATGGTGAC
ATGCAGCTGGATATAATTCCGGGGTAAGATTAACGACCTTATCTGAACATAATGCTA

Sequence of the Tag3 T4 td intron with the theophylline aptamer (SEQ ID NO: 47):
TCCTTGGGTTAATTGAGGCCTGAGTATAAGGTGACTTATACTTGTAATCTATCTAAACG
GGGAACCTCTCTAGTAGACAATCCCGTGCTAAATTGATACCAGCATCGTCTTGATGCC
CTTGGCAGCATAAATGCCTAACGACTATCCCTTTGGGGAGTAGGGTCAAGTGACTCG
AAACGATAGACAACTTGCTTTAACAAGTTGGAGATATAGTCTGCTCTGCATGGTGACA
TGCAGCTGGATATAATTCCGGGGTAAGATTAACGACCTTATCTGAACATAATGCTA

Sequence of the Tag4 T4 td intron with the theophylline aptamer (SEQ ID NO: 48):
TCCTCTGGTTAATTGAGGCCTGAGTATAAGGTGACTTATACTTGTAATCTATCTAAACG
GGGAACCTCTCTAGTAGACAATCCCGTGCTAAATTGATACCAGCATCGTCTTGATGCC
CTTGGCAGCATAAATGCCTAACGACTATCCCTTTGGGGAGTAGGGTCAAGTGACTCG
AAACGATAGACAACTTGCTTTAACAAGTTGGAGATATAGTCTGCTCTGCATGGTGACA
TGCAGCTGGATATAATTCCGGGGTAAGATTAACGACCTTATCTGAACATAATGCTA
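The aptamer-containing sequences listed above differ from one another only in the nine nucleotides that precede the shared TAATTGAGG stretch with which SEQ ID NO: 44 begins. A minimal Python 3 sketch for listing those differences relative to SEQ ID NO: 45 is given here. The nine-nucleotide strings are copied from the listing, while the dictionary keys and the function name are assumptions made for illustration, and interpreting this leading stretch as the mutated 5′ exonic sequence is likewise an assumption.

```python
# Leading nine nucleotides of each listed sequence (copied from the listing).
LEADING = {
    "WT (SEQ ID NO: 45)": "TTCTTGGGT",
    "Tag1 (SEQ ID NO: 49)": "TCCTCAGGT",
    "Tag2 (SEQ ID NO: 46)": "TCCTCGGGT",
    "Tag3 (SEQ ID NO: 47)": "TCCTTGGGT",
    "Tag4 (SEQ ID NO: 48)": "TCCTCTGGT",
}


def differences(reference, variant):
    """Return (1-based position, reference base, variant base) for each change."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, r, v)
        for i, (r, v) in enumerate(zip(reference, variant))
        if r != v
    ]


wild_type = LEADING["WT (SEQ ID NO: 45)"]
for name, sequence in LEADING.items():
    print(name, differences(wild_type, sequence))
```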