Thermostable Cas9 nucleases

Abstract

Thermostable Cas9 nucleases. The present invention relates to the field of genetic engineering and more particularly to nucleic acid editing and genome modification. The present invention provides an isolated Cas protein or polypeptide fragment thereof having an amino acid sequence of SEQ ID NO: 1 or a sequence of at least 77% identity therewith. The Cas protein or polypeptide is capable of binding, cleaving, marking or modifying a double stranded target polynucleotide at a temperature in the range 20° C. to 100° C. inclusive. The invention further provides isolated nucleic acid molecules encoding the Cas9 nucleases, expression vectors and host cells. The invention also provides PAM sequences recognized by the Cas protein or polypeptide. The Cas9 nucleases disclosed herein provide novel tools for genetic engineering in general, in particular at elevated temperatures, and are of particular value in the genetic manipulation of thermophilic organisms, particularly microorganisms.

Claims

1. A method of binding, cleaving, marking or modifying a double stranded target polynucleotide, wherein the double stranded target polynucleotide comprises a target nucleic acid strand comprising a target nucleic acid sequence, and a non-target nucleic acid strand comprising a protospacer nucleic acid sequence complementary to the target nucleic acid sequence, said method comprising: a) designing at least one targeting RNA molecule, wherein the targeting RNA molecule recognizes the target sequence in the target strand, and the non-target strand further comprises a protospacer adjacent motif (PAM) sequence directly adjacent the 3′ end of the protospacer sequence, wherein the PAM sequence comprises 5′-NNNNCNN-3′; b) forming a ribonucleoprotein complex comprising the targeting RNA molecule and a Cas protein, wherein the isolated Cas protein has the amino acid sequence of SEQ ID NO: 1 or a sequence of at least 89% identity therewith; and c) the ribonucleoprotein complex binding, cleaving, marking or modifying the target polynucleotide, and wherein said method is not used in human cells.

2. The method as claimed in claim 1, wherein the binding, cleaving, marking or modifying occurs at a temperature between 20° C. and 100° C.

3. The method as claimed in claim 1, wherein the double stranded target polynucleotide comprising the target nucleic acid sequence is cleaved by the Cas protein, wherein said cleavage is DNA cleavage, which results in a double stranded break in the polynucleotide.

4. The method as claimed in claim 1, wherein the target polynucleotide comprising the target nucleic acid sequence is double stranded DNA, the Cas protein lacks the ability to cut the double stranded DNA and said method results in gene silencing of the target polynucleotide.

5. The method as claimed in claim 1, wherein the PAM sequence comprises at least one sequence selected from the group consisting of 5′-NNNNCNNA-3′ (SEQ ID NO: 47), 5′-NNNNCSAA-3′ (SEQ ID NO: 48) and 5′-NNNNCCAA-3′ (SEQ ID NO: 50).

6. The method as claimed in claim 5, wherein the binding, cleaving, marking or modifying occurs at a temperature between 20° C. and 70° C.

7. The method as claimed in claim 1, wherein the Cas protein is obtainable from a species selected from the group consisting of a bacterium, an archaeon, a virus, a thermophilic bacterium, a Geobacillus sp. and Geobacillus thermodenitrificans.

8. The method as claimed in claim 1, wherein the targeting RNA molecule comprises a crRNA and a tracrRNA.

9. The method as claimed in claim 1, wherein at least one of (a) the length of the at least one targeting RNA molecule is in the range 35-200 nucleotide residues; and (b) the target nucleic acid sequence is from 15 to 32 nucleotide residues in length.

10. The method as claimed in claim 1, wherein the Cas protein further comprises at least one functional moiety selected from the group consisting of a helicase, a nuclease, a helicase-nuclease, a DNA methylase, a histone methylase, an acetylase, a phosphatase, a kinase, a transcription activator, a transcription coactivator, a transcription repressor, a DNA binding protein, a DNA structuring protein, a marker protein, a reporter protein, a fluorescent protein, a ligand binding protein, a signal peptide, a subcellular localization sequence, an antibody epitope and an affinity purification tag.

11. The method as claimed in claim 10, wherein the native activity of the Cas9 nuclease is inactivated and the Cas protein is linked to the at least one functional moiety.

12. The method as claimed in claim 10, wherein the double stranded target polynucleotide is dsDNA, the at least one functional moiety is selected from the group consisting of a nuclease and a helicase-nuclease, and the modification is selected from the group consisting of a single-stranded and a double-stranded break at a desired locus.

13. The method as claimed in claim 10, wherein the double stranded target polynucleotide is dsDNA and the functional moiety is selected from the group consisting of a DNA modifying enzyme, a methylase, an acetylase, a transcription activator and a transcription repressor and the binding, cleaving, marking or modifying results in modification of gene expression.

14. The method as claimed in claim 1, wherein said binding, cleaving, marking or modifying occurs in vivo; and optionally further wherein said binding, cleaving, marking or modifying occurs in an organism selected from the group consisting of a thermophilic organism and a mesophilic organism.

15. The method as claimed in claim 1, wherein the binding, cleaving, marking or modifying results in at least one selected from the group consisting of modifying a desired nucleotide sequence at a desired location, deleting a desired nucleotide sequence at a desired location, inserting a desired nucleotide sequence at a desired location, and silencing gene expression at a desired locus.

16. A transformed non-human cell, having a double stranded target polynucleotide comprising a target nucleic acid sequence, wherein the double stranded target polynucleotide comprises a target nucleic acid strand, comprising said target nucleic acid sequence, and a non-target nucleic acid strand, comprising a protospacer nucleic acid sequence complementary to the target nucleic acid sequence, said cell comprising: a clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) protein having the amino acid sequence of SEQ ID NO: 1 or a sequence of at least 89% identity therewith; at least one targeting RNA molecule which recognizes the target nucleic acid sequence in the target nucleic acid strand, wherein the non-target strand further comprises a protospacer adjacent motif (PAM) sequence directly adjacent the 3′ end of the protospacer sequence, wherein the PAM sequence comprises 5′-NNNNCNN-3′; and an expression vector comprising a nucleic acid encoding at least one of said Cas protein and said targeting RNA molecule.

17. The transformed cell as claimed in claim 16, wherein the cell is a prokaryotic cell.

18. The transformed cell as claimed in claim 16, wherein the Cas protein is expressed from an expression vector.

19. A nucleoprotein complex comprising a Cas protein, at least one targeting RNA molecule which recognizes a target nucleic acid sequence in a double stranded target polynucleotide, and the target polynucleotide, wherein the Cas protein has the amino acid sequence of SEQ ID NO: 1 or a sequence of at least 89% identity therewith; the double stranded target polynucleotide comprises a target nucleic acid strand, comprising said target nucleic acid sequence, and a non-target nucleic acid strand, comprising a protospacer nucleic acid sequence complementary to the target nucleic acid sequence and a protospacer adjacent motif (PAM) sequence directly adjacent the 3′ end of the protospacer sequence, wherein the PAM sequence comprises 5′-NNNNCNN-3′; wherein the nucleoprotein complex is not in a human cell.

20. The nucleoprotein complex as claimed in claim 19, wherein the nucleoprotein complex is in a prokaryotic cell.

Description

BRIEF DESCRIPTION OF THE FIGURES

(1) The invention will now be described in detail with reference to a specific embodiment and with reference to the accompanying drawings, in which:

(2) FIG. 1 shows a Neighbour-Joining tree of Cas9 protein sequences. All sequences having a sequence similarity above 40% with strain T12 based on pBLAST or PSI-BLAST were included, as well as currently well-characterized sequences (S. pyogenes, S. thermophilus and A. naeslundii) and all currently identified thermophilic sequences, even when these were below 40% identity. For all thermophilic sequences, the percentage identity to T12 is indicated after the strain name. Gene identifier (gi) numbers are indicated before the species name. Legend: Closed circles: thermophilic (optimum above 60° C.) Cas9 sequences, closed squares: thermotolerant (optimum <50° C.) Cas9 sequences, open triangle: Cas9 sequence currently most used for genome editing purposes from mesophilic origin; no sign: mesophilic Cas9. Values at the nodes represent 1000-replicate bootstrap values; scale bar represents estimated amino acid substitutions per site.

(3) FIG. 2 shows a Neighbour-Joining tree of Cas9 gene sequences. Identity at the gene level was extremely poor; sequences from the same organisms as those used for the protein alignment were used for the gene alignment. Gene identifier (gi) numbers are indicated before the species name. Legend: Closed circles: thermophilic (optimum above 60° C.) Cas9 sequences, closed squares: thermotolerant (optimum <50° C.) Cas9 sequences, open triangle: Cas9 sequence currently most used for genome editing purposes from mesophilic origin; no sign: mesophilic Cas9. Values at the nodes represent 1000-replicate bootstrap values.

(4) FIG. 3 shows a protein sequence alignment for gtCas9 (SEQ ID NO: 1) (Type II-C) with well-characterized Type II-C (A. naeslundii/‘ana’; SEQ ID NO: 8) and Type II-A (S. pyogenes/‘pyo’; SEQ ID NO: 9 and S. thermophilus) Cas9 sequences. Important active site residues are well conserved and indicated with black arrows. Protein domains as described for Ana-Cas9 and Pyo-Cas9 (Jinek, et al., 2014, Science 343: 1247997) are indicated with shaded boxes and similarly coloured letters. The PAM recognition domain has been determined for the S. pyogenes Type II-A system but not for any Type II-C system and is therefore only indicated in the S. pyogenes sequence.

(5) FIG. 4 shows protein architecture of A. naeslundii Cas9 (Cas9-Ana) (Jinek et al., 2014). gtCas9 belongs to the same Type II-C CRISPR system and active site residues could be identified.

(6) FIG. 5 shows a comparison of crRNA-guided targeting of complementary dsDNA. Base pairing is indicated with dashed lines. RNA is depicted in black, DNA in grey. Base pairing between crRNA spacer and target protospacer is indicated with a thick black dashed line; base pairing between DNA strands and between RNA strands is indicated with thick grey dashed lines. The 5′ end of the crRNA is indicated. Note that the PAM (small white box) in Type I resides downstream of the target strand (protospacer), whereas in Type II it resides at the other end on the displaced strand. Likewise, the seed (the predicted sequence of the guide where base pairing with the target DNA strand starts, and where no mismatches are allowed) is located close to the PAM, and as such differs in Types I and II (Van der Oost, 2014 ibid.). Panel A shows a schematic of a Type I Cascade system of E. coli. The crRNA has an internal spacer (grey box, 31-32 nt, that allows for target recognition), flanked by an 8 nt 5′ handle and a 29 nt 3′ handle that consists of a stem-loop structure (hairpin) (Jore 2011 ibid.). Panel B shows a schematic of a Type II Cas9 system of S. pyogenes. The crRNA basepairs with the tracrRNA, which allows for processing by RNaseIII (opposite black triangles). Additionally, the 5′ end of the crRNA is trimmed by an RNase (black triangle), typically resulting in a 20 nt spacer. Note that a synthetic loop may be introduced to link the crRNA and tracrRNA, resulting in a single guide RNA (sgRNA) (Jinek et al., 2012 ibid.).

(7) FIG. 6 shows an alignment of sequences of the G. thermodenitrificans T12 type IIc CRISPR system.

(8) FIG. 7 shows six single hits obtained to provide an in silico PAM prediction for gtCas9.

(9) FIG. 8 shows a weblogo combining the results of the alignments illustrated in FIG. 7. The weblogo was generated using weblogo.berkeley.edu.

(10) FIG. 9 shows the results of an in vitro cleavage assay at 60° C. targeting plasmids with purified gtCas9. The plasmids included specific 8 nucleotide-long sequence variants of the PAM sequences.

(11) FIG. 10 shows the results of in vitro assays to investigate the effect of gtCas9 concentration, using a targeted plasmid with the CCCCCCAA [SEQ ID NO: 11] PAM sequence.

(12) FIG. 11 shows the results of in vitro assays using a targeted plasmid with the CCCCCCAA [SEQ ID NO: 11] PAM sequence over a range of temperatures.

(13) FIG. 12 shows the results of in vivo genome editing of Bacillus smithii ET138 cells using gtCas9 and 8 nt PAM sequences, by the growth or absence of colonies of the Bacillus smithii ET138 cells on selection plates, as explained in Example 9. Colonies are indicated with arrows in FIG. 12.

(14) FIG. 13 shows the results of a PCR screen for colonies in which the pyrF gene was deleted. The colonies were generated following transformation of Bacillus smithii ET138 cells with construct 3 (negative control). 15 colonies were screened but none showed the deletion genotype ˜2.1 kb band size and instead all showed the wild type ˜2.9 kb band size, as explained in Example 9.

(15) FIG. 14 shows the results of a PCR screen for colonies in which the pyrF gene was deleted. The colonies were generated following transformation of Bacillus smithii ET138 cells with construct 1 (PAM sequence ATCCCCAA [SEQ ID NO: 21]). 20 colonies were screened and one showed the deletion genotype ˜2.1 kb band size whilst the rest showed both the wild type ˜2.9 kb band size and the deletion genotype ˜2.1 kb band size, as explained in Example 9. No wild type only genotypes were observed.

(16) FIG. 15 shows the Geobacillus thermodenitrificans T12 type-IIC CRISPR-Cas locus encodes a thermostable Cas9 homolog, ThermoCas9. (A) Schematic representation of the genomic locus encoding ThermoCas9. The domain architecture of ThermoCas9 based on sequence comparison, with predicted active site residues highlighted in red. A homology model of ThermoCas9 generated using Phyre 2 (Kelley et al. Nat. Protoc. 10, 845-858 (2015)) is shown, with different colours for the domains. (B) Phylogenetic tree of Cas9 orthologues that are highly identical to ThermoCas9. Evolutionary analysis was conducted in MEGA7 (Kumar et al. Mol. Biol. Evol. 33, 1870-1874 (2016)). (C) SDS-PAGE of ThermoCas9 after purification by metal-affinity chromatography and gel filtration. The migration of the obtained single band is consistent with the theoretical molecular weight of 126 kD of the apo-ThermoCas9.

(17) FIG. 16 shows ThermoCas9 PAM analysis. (A) Schematic illustrating the in vitro cleavage assay for discovering the position and identity (5′-NNNNNNN-3′) of the protospacer adjacent motif (PAM). Black triangles indicate the cleavage position. (B) Sequence logo of the consensus 7 nt long PAM of ThermoCas9, obtained by comparative analysis of the ThermoCas9-based cleavage of target libraries. Letter height at each position is measured by information content. (C) Extension of the PAM identity to the 8th position by in vitro cleavage assay. Four linearized plasmid targets, each containing a distinct 5′-CCCCCCAN-3′ PAM, were incubated with ThermoCas9 and sgRNA at 55° C. for 1 hour, then analysed by agarose gel electrophoresis. (D) In vitro cleavage assays for DNA targets with different PAMs at 30° C. and 55° C. Sixteen linearized plasmid targets, each containing one distinct 5′-CCCCCNNA-3′ [SEQ ID NO: 13] PAM, were incubated with ThermoCas9 and sgRNA, then analysed for cleavage efficiency by agarose gel electrophoresis. See also FIG. 21.

(18) FIG. 17 shows ThermoCas9 is active at a wide temperature range and its thermostability increases when bound to sgRNA. (A) Schematic representation of the sgRNA and a matching target DNA. Target DNA is shown as a rectangle with black outline, and the PAM is shown as a dark grey, horizontal ellipse with black outline. The crRNA is shown as a dark grey rectangle with black outline and the site where the 3′-end of the crRNA is linked with the 5′-end of the tracrRNA is shown as a black, vertical ellipse. The black box with the white letters and the light grey box with the black letters indicate the predicted three and two loops at the 3′-side of the tracrRNA, respectively. The 41-nt truncation of the repeat/anti-repeat region (formed by the complementary 3′-end of the crRNA and the 5′-end of the tracrRNA) is indicated with a long, light grey, vertical, dotted line. The predicted 3′ position of the first tracrRNA loop is marked with a black triangle and a black dotted line. The predicted 3′ position of the second tracrRNA loop is marked with a white triangle and a black dotted line. The predicted 3′ position of the third tracrRNA loop is marked with a white triangle and a white dotted line. (B) The importance of the predicted three stem-loops of the tracrRNA scaffold was tested by transcribing truncated variants of the sgRNA and evaluating their ability to guide ThermoCas9 to cleave target DNA at various temperatures. Average values of at least two biological replicates are shown, with error bars representing S.D. (C) To identify the maximum temperature, endonuclease activity of the ThermoCas9:sgRNA RNP complex was assayed after incubation at 60° C., 65° C. and 70° C. for 5 or 10 min. The pre-heated DNA substrate was added and the reaction was incubated for 1 hour at the corresponding temperature. (D) Comparison of the active temperature range of ThermoCas9 and SpCas9 by activity assays conducted after 5 min of incubation at the indicated temperature. The pre-heated DNA substrate was added and the reaction was incubated for 1 hour at the same temperature.

(19) FIG. 18 shows ThermoCas9-based genome engineering in thermophiles. (A) Schematic overview of the basic pThermoCas9_Δgene-of-interest (goi) construct. The thermocas9 gene was introduced either to the pNW33n (B. smithii) or to the pEMG (P. putida) vector. Homologous recombination flanks were introduced upstream of thermocas9 and encompassed the 1 kb (B. smithii) or 0.5 kb (P. putida) upstream and 1 kb or 0.5 kb downstream region of the gene of interest (goi) in the targeted genome. An sgRNA-expressing module was introduced downstream of the thermocas9 gene. As the origin of replication (ori), replication protein (rep), antibiotic resistance marker (AB) and possible accessory elements (AE) are backbone specific, they are represented with dotted outline. (B) Agarose gel electrophoresis showing the resulting products from genome-specific PCR on ten colonies from the ThermoCas9-based pyrF deletion process from the genome of B. smithii ET138. All ten colonies contained the ΔpyrF genotype and one colony was a clean ΔpyrF mutant, lacking the wild type product. (C) Schematic overview of the basic pThermoCas9i_goi construct. Aiming for the expression of a catalytically inactive ThermoCas9 (Thermo-dCas9: D8A, H582A mutant), the corresponding mutations were introduced to create the thermo-dcas9 gene. The thermo-dcas9 gene was introduced to the pNW33n vector. An sgRNA-expressing module was introduced downstream of the thermo-dcas9 gene. (D) Graphical representation of the production, growth and RT-qPCR results from the ldhL silencing experiment using Thermo-dCas9. The graphs represent the lactate production, optical density at 600 nm and percentage of ldhL transcription in the repressed cultures compared to the control cultures. Average values from at least two biological replicates are shown, with error bars representing S.D.

(20) FIG. 19 shows a multiple sequence alignment of Type II-A, B and C Cas9 orthologues. Cas9 protein sequences of Streptococcus pyogenes (Sp), Streptococcus thermophilus (St), Wolinella succinogenes (Ws), Neisseria meningitidis (Nm), Actinomyces naeslundii (An), and Geobacillus thermodenitrificans (Thermo) were aligned using ClustalW in MEGA7 with default settings; ESPript3 was used to generate the visualization. Strictly conserved residues are shown in white text on grey background; similar residues are shown in black text in white vertical rectangles with black outline. Pyramids indicate the two conserved nuclease domains in all sequences. Horizontal black arrows and curls indicate β-strands and α-helices, respectively, in the SpCas9 secondary structure (protein database nr 4CMP). Structural domains are indicated for SpCas9 and ThermoCas9 using the same colour scheme as in FIG. 15A.

(21) FIG. 20 shows in silico PAM determination results. Panel (A) shows the two hits obtained with phage genomes using CRISPRtarget. Panel (B) shows the sequence logo of the consensus 7 nt long PAM of ThermoCas9, obtained by in silico PAM analysis. Letter height at each position is measured by information content.

(22) FIG. 21 shows ThermoCas9 PAM discovery. In vitro cleavage assays for DNA targets with different PAMs at 20° C., 37° C., 45° C. and 60° C. Seven (20° C.) or sixteen (37° C., 45° C., 60° C.) linearized plasmid targets, each containing a distinct 5′-CCCCCNNA-3′ [SEQ ID NO: 13] PAM, were incubated with ThermoCas9 and sgRNA, then analysed by agarose gel electrophoresis.

(23) FIG. 22 shows activity of ThermoCas9 at a wide temperature range using sgRNA containing one loop. The importance of the predicted three stem loops of the tracrRNA scaffold was tested by transcribing truncated variations of the sgRNA and evaluating their ability to guide ThermoCas9 to cleave target DNA at various temperatures. Shown above is the effect of one loop on the activity of ThermoCas9 at various temperatures. Average values from at least two biological replicates are shown, with error bars representing S.D.

(24) FIG. 23 shows ThermoCas9 mediates dsDNA targeting using divalent cations as catalysts and does not cleave ssDNA. Panel (A) shows in vitro plasmid DNA cleavage by ThermoCas9 with EDTA and various metal ions. M=1 kb DNA ladder. Panel (B) shows activity of ThermoCas9 on ssDNA substrates. M=10 bp DNA ladder.

(25) FIG. 24 shows spacer selection for the ldhL silencing experiment. Schematic representation of the spacer (sgRNA)-protospacer annealing during the ldhL silencing process; the selected protospacer resides on the non-template strand, 39 nt downstream of the start codon of the ldhL gene.

(26) FIG. 25 shows a map of plasmid pThermoCas9_ppΔpyrF consisting of the pEMG backbone, the Pseudomonas putida pyrF flanking region and the thermocas9 gene and a Pseudomonas putida pyrF targeting sgRNA.

(27) FIG. 26 shows the results of capillary gel electrophoresis showing the resulting products from genome-specific PCR on the obtained colonies from the ThermoCas9-based pyrF deletion process from the genome of Pseudomonas putida. The 1854 bp band and the 1112 bp band correspond to the pyrF and ΔpyrF genotype, respectively.

(28) Below are polynucleotide and amino acid sequences of Cas proteins used in accordance with the invention.

(29) TABLE-US-00001 [SEQ ID NO: 1] Geobacillus thermodenitrificans T12 Cas9 protein AA sequence MKYKIGLDIGITSIGWAVINLDIPRIEDLGVRIFDRAENPKTGESLALPRRLARSARRR LRRRKHRLERIRRLFVREGILTKEELNKLFEKKHEIDVWQLRVEALDRKLNNDELARI LLHLAKRRGFRSNRKSERTNKENSTMLKHIEENQSILSSYRTVAEMVVKDPKFSLH KRNKEDNYTNTVARDDLEREIKLIFAKQREYGNIVCTEAFEHEYISIWASQRPFASK DDIEKKVGFCTFEPKEKRAPKATYTFQSFTVWEHINKLRLVSPGGIRALTDDERRLIY KQAFHKNKITFHDVRTLLNLPDDTRFKGLLYDRNTTLKENEKVRFLELGAYHKIRKAI DSVYGKGAAKSFRPIDFDTFGYALTMFKDDTDIRSYLRNEYEQNGKRMENLADKVY DEELIEELLNLSFSKFGHLSLKALRNILPYMEQGEVYSTACERAGYTFTGPKKKQKT VLLPNIPPIANPVVMRALTQARKVVNAIIKKYGSPVSIHIELARELSQSFDERRKMQK EQEGNRKKNETAIRQLVEYGLTLNPTGLDIVKFKLWSEQNGKCAYSLQPIEIERLLE PGYTEVDHVIPYSRSLDDSYTNKVLVLTKENREKGNRTPAEYLGLGSERWQQFETF VLTNKQFSKKKRDRLLRLHYDENEENEFKNRNLNDTRYISRFLANFIREHLKFADSD DKQKVYTVNGRITAHLRSRWNFNKNREESNLHHAVDAAIVACTTPSDIARVTAFYQ RREQNKELSKKTDPQFPQPWPHFADELQARLSKNPKESIKALNLGNYDNEKLESL QPVFVSRMPKRSITGAAHQETLRRYIGIDERSGKIQTVVKKKLSEIQLDKTGHFPMY GKESDPRTYEAIRQRLLEHNNDPKKAFQEPLYKPKKNGELGPIIRTIKIIDTTNQVIPL NDGKTVAYNSNIVRVDVFEKDGKYYCVPIYTIDMMKGILPNKAIEPNKPYSEWKEMT EDYTFRFSLYPNDLIRIEFPREKTIKTAVGEEIKIKDLFAYYQTIDSSNGGLSLVSHDN NFSLRSIGSRTLKRFEKYQVDVLGNIYKVRGEKRVGVASSSHSKAGETIRPL*

(30) TABLE-US-00002 [SEQ ID NO: 7] Geobacillus thermodenitrificans T12 Cas9 DNA Sequence ATGAAGTATAAAATCGGTCTTGATATCGGCATTACGTCTATCGGTTGGGCTGTC ATTAATTTGGACATTCCTCGCATCGAAGATTTAGGTGTCCGCATTTTTGACAGAG CGGAAAACCCGAAAACCGGGGAGTCACTAGCTCTTCCACGTCGCCTCGCCCGC TCCGCCCGACGTCGTCTGCGGCGTCGCAAACATCGACTGGAGCGCATTCGCC GCCTGTTCGTCCGCGAAGGAATTTTAACGAAGGAAGAGCTGAACAAGCTGTTT GAAAAAAAGCACGAAATCGACGTCTGGCAGCTTCGTGTTGAAGCACTGGATCG AAAACTAAATAACGATGAATTAGCCCGCATCCTTCTTCATCTGGCTAAACGGCG TGGATTTAGATCCAACCGCAAGAGTGAGCGCACCAACAAAGAAAACAGTACGAT GCTCAAACATATTGAAGAAAACCAATCCATTCTTTCAAGTTACCGAACGGTTGCA GAAATGGTTGTCAAGGATCCGAAATTTTCCCTGCACAAGCGTAATAAAGAGGAT AATTACACCAACACTGTTGCCCGCGACGATCTTGAACGGGAAATCAAACTGATT TTCGCCAAACAGCGCGAATATGGGAACATCGTTTGCACAGAAGCATTTGAACAC GAGTATATTTCCATTTGGGCATCGCAACGCCCTTTTGCTTCTAAGGATGATATC GAGAAAAAAGTCGGTTTCTGTACGTTTGAGCCTAAAGAAAAACGCGCGCCAAAA GCAACATACACATTCCAGTCCTTCACCGTCTGGGAACATATTAACAAACTTCGT CTTGTCTCCCCGGGAGGCATCCGGGCACTAACCGATGATGAACGTCGTCTTAT ATACAAGCAAGCATTTCATAAAAATAAAATCACCTTCCATGATGTTCGAACATTG CTTAACTTGCCTGACGACACCCGTTTTAAAGGTCTTTTATATGACCGAAACACCA CGCTGAAGGAAAATGAGAAAGTTCGCTTCCTTGAACTCGGCGCCTATCATAAAA TACGGAAAGCGATCGACAGCGTCTATGGCAAAGGAGCAGCAAAATCATTTCGT CCGATTGATTTTGATACATTTGGCTACGCATTAACGATGTTTAAAGACGACACCG ACATTCGCAGTTACTTGCGAAACGAATACGAACAAAATGGAAAACGAATGGAAA ATCTAGCGGATAAAGTCTATGATGAAGAATTGATTGAAGAACTTTTAAACTTATC GTTTTCTAAGTTTGGTCATCTATCCCTTAAAGCGCTTCGCAACATCCTTCCATAT ATGGAACAAGGCGAAGTCTACTCAACCGCTTGTGAACGAGCAGGATATACATTT ACAGGGCCAAAGAAAAAACAGAAAACGGTATTGCTGCCGAACATTCCGCCGAT CGCCAATCCGGTCGTCATGCGCGCACTGACACAGGCACGCAAAGTGGTCAATG CCATTATCAAAAAGTACGGCTCACCGGTCTCCATCCATATCGAACTGGCCCGG GAACTATCACAATCCTTTGATGAACGACGTAAAATGCAGAAAGAACAGGAAGGA AACCGAAAGAAAAACGAAACTGCCATTCGCCAACTTGTTGAATATGGGCTGACG CTCAATCCAACTGGGCTTGACATTGTGAAATTCAAACTATGGAGCGAACAAAAC GGAAAATGTGCCTATTCACTCCAACCGATCGAAATCGAGCGGTTGCTCGAACCA GGCTATACAGAAGTCGACCATGTGATTCCATACAGCCGAAGCTTGGACGATAG CTATACCAATAAAGTTCTTGTGTTGACAAAGGAGAACCGTGAAAAAGGAAACCG 
CACCCCAGCTGAATATTTAGGATTAGGCTCAGAACGTTGGCAACAGTTCGAGAC GTTTGTCTTGACAAATAAGCAGTTTTCGAAAAAGAAGCGGGATCGACTCCTTCG GCTTCATTACGATGAAAACGAAGAAAATGAGTTTAAAAATCGTAATCTAAATGAT ACCCGTTATATCTCACGCTTCTTGGCTAACTTTATTCGCGAACATCTCAAATTCG CCGACAGCGATGACAAACAAAAAGTATACACGGTCAACGGCCGTATTACCGCC CATTTACGCAGCCGTTGGAATTTTAACAAAAACCGGGAAGAATCGAATTTGCAT CATGCCGTCGATGCTGCCATCGTCGCCTGCACAACGCCGAGCGATATCGCCCG AGTCACCGCCTTCTATCAACGGCGCGAACAAAACAAAGAACTGTCCAAAAAGAC GGATCCGCAGTTTCCGCAGCCTTGGCCGCACTTTGCTGATGAACTGCAGGCGC GTTTATCAAAAAATCCAAAGGAGAGTATAAAAGCTCTCAATCTTGGAAATTATGA TAACGAGAAACTCGAATCGTTGCAGCCGGTTTTTGTCTCCCGAATGCCGAAGC GGAGCATAACAGGAGCGGCTCATCAAGAAACATTGCGGCGTTATATCGGCATC GACGAACGGAGCGGAAAAATACAGACGGTCGTCAAAAAGAAACTATCCGAGAT CCAACTGGATAAAACAGGTCATTTCCCAATGTACGGGAAAGAAAGCGATCCAAG GACATATGAAGCCATTCGCCAACGGTTGCTTGAACATAACAATGACCCAAAAAA GGCGTTTCAAGAGCCTCTGTATAAACCGAAGAAGAACGGAGAACTAGGTCCTAT CATCCGAACAATCAAAATCATCGATACGACAAATCAAGTTATTCCGCTCAACGAT GGCAAAACAGTCGCCTACAACAGCAACATCGTGCGGGTCGACGTCTTTGAGAA AGATGGCAAATATTATTGTGTCCCTATCTATACAATAGATATGATGAAAGGGATC TTGCCAAACAAGGCGATCGAGCCGAACAAACCGTACTCTGAGTGGAAGGAAAT GACGGAGGACTATACATTCCGATTCAGTCTATACCCAAATGATCTTATCCGTATC GAATTTCCCCGAGAAAAAACAATAAAGACTGCTGTGGGGGAAGAAATCAAAATT AAGGATCTGTTCGCCTATTATCAAACCATCGACTCCTCCAATGGAGGGTTAAGT TTGGTTAGCCATGATAACAACTTTTCGCTCCGCAGCATCGGTTCAAGAACCCTC AAACGATTCGAGAAATACCAAGTAGATGTGCTAGGCAACATCTACAAAGTGAGA GGGGAAAAGAGAGTTGGGGTGGCGTCATCTTCTCATTCGAAAGCCGGGGAAAC TATCCGTCCGTTATAA

DETAILED DESCRIPTION

Example 1: Isolation of Geobacillus thermodenitrificans

(31) G. thermodenitrificans was surprisingly discovered during a search of a library of ±500 isolates for a thermophile capable of degrading lignocellulosic substrates under anaerobic conditions. At first a library of ±500 isolates was established which, after several selection rounds by isolation on cellulose and xylan, was trimmed down to 110 isolates. This library of 110 isolates consisted solely of Geobacillus isolates with G. thermodenitrificans representing 79% of the library.

(32) The isolated G. thermodenitrificans strain has been named “T12”. The Cas9 protein from G. thermodenitrificans T12 has been named “gtCas9”.

Example 2: Defining the Essential Consensus Sequences for Cas9 in Geobacillus thermodenitrificans

(33) The following database searches and alignments were performed:

(34) pBLAST and nBLAST were performed on the in-house BLAST server, in which either the protein or gene sequence of G. thermodenitrificans T12 was used as query sequence. This database was last updated May 2014 and therefore does not contain the most recently added Geobacillus genomes, but normal online BLAST was not used to prevent publication of the T12 sequence. Sequence identities found to be greater than 40% in the BLAST search are included in FIG. 1.

(35) To include more recent sequence data, the sequence of Geobacillus MAS1 (most closely related to gtCas9) was used to perform a PSI-BLAST on the NCBI website (Johnson et al., 2008 Nucleic Acids Res. 36(Web Server issue): W5-9). Two consecutive rounds of PSI-BLAST were performed, in which only sequences that met the following criteria were used for the next round: minimum sequence coverage of 96% in the first round and 97% in the second and third round, minimum identity 40%, only one strain per species.
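The selection criteria applied between PSI-BLAST rounds can be sketched in code. The following is a minimal illustration only, not the procedure actually used: the `Hit` record, the `filter_hits` function and the example hits are hypothetical, and the rule of keeping the most identical strain when several strains of one species pass is an assumption, since the text states only that one strain per species was retained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one PSI-BLAST hit."""
    species: str
    strain: str
    coverage: float  # query coverage, in percent
    identity: float  # sequence identity, in percent


def filter_hits(hits, min_coverage, min_identity=40.0):
    """Keep hits that meet both thresholds, one strain per species."""
    best = {}
    for hit in hits:
        if hit.coverage < min_coverage or hit.identity < min_identity:
            continue
        # Assumption: when several strains of a species pass,
        # the strain with the highest identity is kept.
        current = best.get(hit.species)
        if current is None or hit.identity > current.identity:
            best[hit.species] = hit
    return sorted(best.values(), key=lambda h: h.identity, reverse=True)


# Invented example values, for illustration only.
hits = [
    Hit("species A", "strain 1", 99.0, 88.0),
    Hit("species A", "strain 2", 98.0, 91.0),
    Hit("species B", "strain 1", 95.0, 70.0),  # fails the coverage threshold
    Hit("species C", "strain 1", 97.5, 38.0),  # fails the identity threshold
    Hit("species D", "strain 1", 96.0, 45.0),
]

first_round = filter_hits(hits, min_coverage=96.0)
later_round = filter_hits(hits, min_coverage=97.0)
```

Here `first_round` applies the 96% coverage threshold and `later_round` the 97% threshold, each combined with the 40% minimum identity.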

(36) The sequences resulting from the PSI-BLAST, as well as the sequences with more than 40% identity to T12 from the internal server pBLAST that did not appear in the PSI-BLAST, were aligned together with currently well-characterized mesophilic sequences and all currently identified thermophilic sequences, even if these were more distantly related, and from this alignment a Neighbour-Joining tree was constructed (see FIG. 1). Alignment was performed in Mega6 using ClustalW, after which a tree was constructed using the Neighbour-Joining method and bootstrap analysis was performed using 1000 replicates.

(37) When BLASTn was performed using Geobacillus sp. MAS1 as the query sequence, only Geobacillus sp. JF8 Cas9 was identified with 88% identity, indicating very little homology at the gene level. FIG. 2 is a Neighbour-Joining tree of Clustal-aligned Cas9 gene sequences.

(38) Protein sequences of G. thermodenitrificans T12, A. naeslundii and S. pyogenes were further analyzed for protein domain homology (see FIG. 3) by aligning them in CloneManager using BLOSUM62 with default settings.

Example 3: Identifying Core Amino Acid Motifs which are Essential for the Function of CAS9 and Those which Confer Thermostability in Thermophilic Cas9 Nucleases

(39) Percentage identities of the above-described aligned protein sequences are provided in FIG. 1. gtCas9 belongs to Type II-C. The best-studied Type II-C system, whose structure was recently crystallized, is from Actinomyces naeslundii (Jinek et al., 2014, Science 343: 1247997). This protein sequence shows only 20% identity to gtCas9 but can be used to estimate highly conserved residues. Two well-characterized Type II-A systems (S. pyogenes and S. thermophilus) were also included in the analyses (Jinek et al., 2014, Science 343: 1247997; Nishimasu et al., 2014, Cell 156: 935-949). Alignments of these four protein sequences are shown in FIG. 3; FIG. 4 shows the protein architecture as determined for A. naeslundii (‘Ana-Cas9’) (Jinek et al., 2014, Science 343: 1247997). The lengths of Cas9 from T12 (gtCas9) and Actinomyces naeslundii are highly similar (A. naeslundii 1101 aa, gtCas9 1082 aa) and gtCas9 is expected to have a similar protein architecture, but this remains to be determined, as the overall sequence identity to Cas9-Ana is only 20%. All active site residues described by Jinek et al. (Jinek et al., 2014, Science 343: 1247997) in Cas9 from A. naeslundii and S. pyogenes could be identified in gtCas9 (see FIG. 3). The PAM-binding domain has been determined for the S. pyogenes Type II-A system but not for any Type II-C system and is therefore only indicated in the S. pyogenes sequence. Moreover, the PAM-recognition site varies strongly, not only between CRISPR systems but also between species containing the same system.
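A percentage identity of the kind quoted above can be computed from two rows of an alignment. The sketch below is illustrative only: the function name `percent_identity` and the short aligned fragments are hypothetical, and the choice to count only columns in which neither sequence has a gap is an assumption, since alignment tools differ in the denominator they use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Invented aligned fragments, for illustration only.
identity = percent_identity("MKYK-IGLD", "MKFKAIG-D")
```

In this example seven columns are compared and six of them match.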

Example 4: Determination of the PAM Sequence of G. thermodenitrificans gtCas9

(40) It has been established that the prokaryotic CRISPR systems serve their hosts as adaptive immune systems (Jinek et al., 2012, Science 337: 816-821) and can be used for quick and effective genetic engineering (Mali et al., 2013, Nat Methods 10: 957-963.).

(41) Cas9 proteins function as sequence-specific nucleases for the type II CRISPR systems (Makarova et al., 2011, Nat Rev Micro 9: 467-477). Small crRNA molecules, which consist of a “spacer” (target) linked to a repeat region, are the transcription and processing products of a CRISPR locus. “Spacers” naturally originate from the genome of bacteriophages and mobile genetic elements, but they can also be designed to target a specific nucleotide sequence during a genetic engineering process (Bikard et al., 2013, Nucleic Acids Research 41: 7429-7437). The crRNA molecules are employed by the Cas9 as guides for the identification of their DNA targets. The spacer region is identical to the DNA region targeted for cleavage, the “protospacer” (Brouns et al., 2012, Science 337: 808-809). A PAM (Protospacer Adjacent Motif), next to the protospacer, is required for the recognition of the target by the Cas9 (Jinek et al., 2012, Science 337: 816-821).

(42) In order to perform in vitro or in vivo PAM-determination studies for Type II systems, it is necessary to predict in silico the CRISPR array of the system and the tracrRNA-expressing module. The CRISPR array is used for the identification of the crRNA module. The tracrRNA-expressing sequence is located either within a 500 bp window flanking Cas9 or between the Cas genes and the CRISPR locus (Chylinski, K., et al. (2014) Classification and evolution of type II CRISPR-Cas systems. Nucleic Acids Res. 42, 6091-6105). The tracrRNA should consist of a 5′-sequence with a high level of complementarity to the direct repeats of the CRISPR array, followed by a predicted structure of at least two stem-loops and a Rho-independent transcriptional termination signal (Ran, F. A., et al. (2015) In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, 186-191). The crRNA and tracrRNA molecules can then be used to design a chimeric sgRNA module. The 5′-end of the sgRNA consists of a truncated 20 nt long spacer followed by the 16-20 nt long truncated repeat of the CRISPR array. The repeat is followed by the corresponding truncated anti-repeat and the stem loop of the tracrRNA module. The repeat and anti-repeat parts of the sgRNA are generally connected by a GAAA linker (Karvelis, T., et al. (2015) Rapid characterization of CRISPR-Cas9 protospacer adjacent motif sequence elements. Genome Biol. 16, 253).
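The sgRNA layout described above (spacer, truncated repeat, GAAA linker, truncated anti-repeat, tracrRNA stem loops) can be sketched as a simple concatenation. This is a minimal illustration only: the function names and every sequence below are hypothetical placeholders and not the gtCas9 sequences, and the anti-repeat is assumed to pair perfectly with the repeat.

```python
def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence (A<->U, C<->G)."""
    return seq.upper().translate(str.maketrans("ACGU", "UGCA"))[::-1]


def build_sgrna(spacer: str, repeat: str, anti_repeat: str,
                tracr_tail: str, linker: str = "GAAA") -> str:
    """Join the sgRNA parts 5'->3': spacer, repeat, linker, anti-repeat, tail."""
    parts = [spacer, repeat, linker, anti_repeat, tracr_tail]
    for part in parts:
        if set(part.upper()) - set("ACGU"):
            raise ValueError(f"non-RNA character in {part!r}")
    return "".join(part.upper() for part in parts)


# Hypothetical placeholder sequences (assumptions for illustration only).
spacer = "ACGUACGUACGUACGUACGU"                # 20 nt spacer
repeat = "GUCAUAGUUCCCCUAA"                    # 16 nt truncated repeat
anti_repeat = reverse_complement_rna(repeat)   # assumes perfect pairing
tracr_tail = "GGCUUCGGCCUUUU"                  # stand-in for stem loops + terminator

sgrna = build_sgrna(spacer, repeat, anti_repeat, tracr_tail)
print(len(sgrna), sgrna)
```

In a real design the anti-repeat and the 3′ tail would be taken from the predicted tracrRNA rather than derived from the repeat.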

(43) The cas genes (the cas9 followed by the cas1 and the cas2 genes) of the G. thermodenitrificans T12 type IIc CRISPR system are transcribed using the antisense strand of the T12 chromosome. The cas2 gene is followed by a 100 bp long DNA fragment which upon transcription forms an RNA structure with multiple loops. This structure obviously acts as a transcriptional terminator.

(44) A CRISPR array with 11 repeats and 10 spacer sequences is located upstream of the transcriptional termination sequence, and the leader of the array is located at the 5′ end of the array. The DNA locus which is transcribed into the tracrRNA is expected to be downstream of the cas9 gene. The alignment of the 325 bp long sequence immediately downstream of the cas9 gene with the 36 bp long repeat from the CRISPR array revealed that there is a 36 bp long sequence in the tracrRNA locus almost identical to the repeat (shown in FIG. 6). This result led us to the conclusion that the direction of transcription of the tracrRNA locus should be opposite to the direction of transcription of the CRISPR array. Consequently, the 5′-end of the tracrRNA will be complementary to the 3′-end of the crRNA, leading to the formation of the dual-RNA molecule required by the Cas9.
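The search for a repeat-like stretch in the region downstream of cas9 can be illustrated with an ungapped scan on both strands. This is a sketch under stated assumptions, not the alignment method used in this Example: the function names are hypothetical, only mismatches (no gaps) are scored, and the sequences in the demonstration are made up.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def best_repeat_match(window: str, repeat: str):
    """Return (strand, start, mismatches) of the best ungapped match.

    Strand '+' means the window contains the repeat as written;
    strand '-' means it contains the reverse complement of the repeat.
    """
    window = window.upper()
    best = None
    for strand, query in (("+", repeat.upper()), ("-", revcomp(repeat))):
        for i in range(len(window) - len(query) + 1):
            mismatches = sum(a != b for a, b in zip(window[i:i + len(query)], query))
            if best is None or mismatches < best[2]:
                best = (strand, i, mismatches)
    return best


# Hypothetical sequences: a made-up repeat and a short window that carries
# its reverse complement with a single substitution.
demo_repeat = "GTCATAGTTCCCCTAAGA"
demo_window = "ACACAC" + "TCTTAAGGGAACTATGAC" + "TGTGTG"
print(best_repeat_match(demo_window, demo_repeat))
```

The reported strand indicates the orientation of the match relative to the repeat, which is the kind of information used above to infer the direction of tracrRNA transcription.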

Example 5: Target Generation with Randomized PAM

(45) Two different spacers from the CRISPR II loci of the G. thermodenitrificans T12 strain were amplified by PCR using the G. thermodenitrificans T12 genomic DNA as template. Two pairs of degenerate primers were used for the amplification of each spacer:

(46) Firstly, a pair that causes the introduction of six random nucleotides upstream of the “protospacer” fragment was used, leading to the production of a pool of protospacers with randomized PAM sequences.

(47) Secondly, a pair that causes the introduction of six random nucleotides downstream of the “protospacer” fragment was used, leading to the production of a pool of protospacers with randomized PAM sequences.

(48) The produced fragments were ligated to the pNW33n vector, producing 4 pools of “protospacer” constructs, with all the possible 4096 different combinations of 6-nucleotide long PAMs each. The assembled DNA was used for the transformation of G. thermodenitrificans T12 cells. The cells were plated on chloramphenicol selection and more than 2×10.sup.6 cells from each protospacer pool will be pooled. The plasmid DNA was extracted from the pools, the target region will be PCR amplified and the products sent for deep sequencing. The PAMs with the fewest reads will be considered active and the process will be repeated only with pNW33n constructs that contain spacers with these PAMs. Reduced transformation efficiency of the G. thermodenitrificans T12 will confirm the activity of the PAMs.
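The figure of 4096 combinations follows from four possible bases at each of six positions (4^6), and the depletion logic (active PAMs are those with the fewest reads among surviving transformants) can be sketched as a simple ranking. The function names and the read counts in the demonstration are hypothetical; no sequencing data from this Example are reproduced.

```python
from itertools import product


def all_pams(length: int = 6) -> list:
    """Enumerate every possible PAM of the given length over A, C, G, T."""
    return ["".join(p) for p in product("ACGT", repeat=length)]


def most_depleted(read_counts: dict, length: int = 6, top: int = 5) -> list:
    """Return the PAMs with the fewest reads; unseen PAMs count as zero reads."""
    return sorted(all_pams(length), key=lambda p: (read_counts.get(p, 0), p))[:top]


print(len(all_pams()))  # 4**6 possible 6-nt PAMs

# Hypothetical read counts: every PAM seen 100 times except two depleted ones.
demo_counts = {pam: 100 for pam in all_pams()}
demo_counts["CCCCCA"] = 0
demo_counts["AAAACA"] = 3
print(most_depleted(demo_counts, top=2))
```

In practice the counts would be normalized against the untransformed library before ranking, since the starting pool is rarely uniform.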

Example 6: In Vitro Determination of PAM Sequences for gtCas9

(49) Construction of the pRham:Cas9.sub.gt Vector

(50) The cas9.sub.gt gene was PCR amplified from the G. thermodenitrificans T12 genome, using the BG6927 and BG6928 primers, and combined with the pRham C-His Kan Vector (Lucigen) in one mixture. The mixture was used for transforming E. cloni thermo-competent cells according to the provided protocol. 100 μl from the transformation mixture were plated on LB+50kanamycin plates for overnight growth at 37° C. Three of the resulting E. cloni::pRham:cas9.sub.gt single colonies were randomly selected and inoculated in 10 ml LB medium containing 50 μg/ml kanamycin. Glycerol stocks were prepared from the cultures by adding sterile glycerol to 1 ml from each culture up to a final concentration of 20% (v/v). The glycerol stocks were stored at −80° C. The remaining 9 ml from each culture were used for plasmid isolation according to the “GeneJET Plasmid Miniprep Kit” (Thermoscientific) protocol. The plasmids were sent for sequence verification of the cas9.sub.gt and one of the plasmids was verified to contain the gene with the right sequence. The corresponding culture was further used for heterologous expression and purification of the gtCas9.

(51) Heterologous Expression of gtCas9 in E. Cloni::pRham: Cas9.sub.gt Vector

(52) An E. cloni::pRham:cas9.sub.gt preculture was prepared after inoculating 10 ml LB+50kanamycin with the corresponding glycerol stocks. After overnight growth at 37° C. and 180 rpm, 2 ml from the preculture were used for inoculating 200 ml of LB+50kanamycin medium. The E. cloni::pRham:cas9.sub.gt culture was incubated at 37° C., 180 rpm until an OD.sub.600 of 0.7. The gtCas9 expression was then induced by adding L-rhamnose to a final concentration of 0.2% w/v. The expression was allowed to proceed for 8 h, after which the cultures were centrifuged for 10 minutes at 4700 rpm, 4° C. to harvest the cells. The medium was discarded and the pelleted cells were either stored at −20° C. or used for the preparation of the cell free extract (CFE) according to the following protocol:
1. Resuspend the pellet in 20 ml Sonication Buffer (20 mM Sodium Phosphate buffer (pH=7.5), 100 mM NaCl, 5 mM MgCl2, 5% (v/v) Glycerol, 1 mM DTT)
2. Disrupt 1 ml of cells by sonication (8 pulses of 30 seconds, cool for 20 seconds on ice in between)
3. Centrifuge for 15 minutes at 35000 g, 4° C. in order to precipitate insoluble parts
4. Remove the supernatant and store it at 4° C. or on ice

(53) Designing and Construction of the PAM Library Targeting sgRNA Module for gtCas9

(54) After in silico determination of the tracrRNA expressing DNA module in the genome of G. thermodenitrificans T12 strain (see Example 4 above), a single guide (sg)RNA expressing DNA module that combines the crRNA and tracrRNA modules of the CRISPR/Cas9 system in a single molecule was designed. The spacer at the 5′-end of the sgRNA was designed to be complementary to the protospacer of the plasmid library and the module was set under the transcriptional control of a T7 promoter. The pT7_sgRNA DNA module was synthesized by Baseclear and received in a pUC57 vector, forming the pUC57:pT7_sgRNA vector. DH5a competent E. coli cells (NEB) were transformed with the vector and the transformation mixture was plated on LB-agar plates containing 100 μg/ml ampicillin. The plates were incubated overnight at 37° C. Three of the formed single colonies were inoculated in 10 ml LB medium containing 100 μg/ml ampicillin. Glycerol stocks were prepared from the cultures by adding sterile glycerol to 1 ml from each culture up to a final concentration of 20% (v/v). The glycerol stocks were stored at −80° C. The remaining 9 ml from each culture were used for plasmid isolation according to the “GeneJET Plasmid Miniprep Kit” (Thermoscientific) protocol. The isolated plasmid was used as a PCR template for amplification of the pT7_sgRNA module. The 218 bp long pT7_sgRNA DNA module (of which the first 18 bp correspond to the pT7) was obtained using the primers BG6574 and BG6575. The complete PCR mixture was run on a 1.5% agarose gel. The band with the desired size was excised and purified according to the “Zymoclean™ Gel DNA Recovery Kit” protocol.

(55) In vitro transcription (IVT) was performed using the “HiScribe™ T7 High Yield RNA Synthesis Kit” (NEB). The purified pT7_sgRNA DNA module was used as template. The IVT mixture was mixed with an equal volume of RNA loading dye (NEB) and heated at 70° C. for 15 minutes in order to disrupt the secondary structure. The heat-treated IVT mixture was run on a denaturing Urea-PAGE gel and the resulting polyacrylamide gel was immersed for 10 minutes in 100 ml 0.5×TBE buffer containing 10 μl of SYBR Gold (Invitrogen) for staining purposes. The band at the desired size (200 nt) was excised and the sgRNA was purified according to the following RNA purification protocol:
1. Cut RNA gel fragments with a scalpel and add 1 ml of RNA elution buffer, leave overnight at room temperature.
2. Divide the eluate into 330 μl aliquots in new 1.5 ml tubes.
3. Add 3 volumes (990 μl) of pre-chilled (−20° C.) 100% EtOH.
4. Incubate for 60 minutes at −20° C.
5. Centrifuge for 20 minutes at 13000 rpm in a microfuge at room temperature.
6. Remove EtOH, wash pellet with 1 ml 70% EtOH.
7. Centrifuge for 5 minutes at 13000 rpm in a microfuge at room temperature.
8. Remove 990 μl of the supernatant.
9. Evaporate the remaining EtOH in a thermomixer at 55° C. for 15 to 20 minutes.
10. Resuspend pellet in 20 μl MQ, store at −20° C.

(56) Designing and Construction of a 7 nt Long PAM Library, and Linearization of the Library

(57) The design and construction of the PAM library was based on the pNW33n vector. A 20 bp long protospacer was introduced into the vector, flanked at its 3′ side by a sequence of 7 degenerate nucleotides; the degenerate sequence serves as the PAM, and when the protospacer is flanked by a suitable PAM it can be recognized as a target by an sgRNA-loaded Cas9 and cleaved. The PAM library was prepared according to the following protocol:
1. Prepare the SpPAM double stranded DNA insert by annealing the single stranded DNA oligos 1 (BG6494) and 2 (BG6495)
I. 10 μl 10×NEBuffer 2.1
II. 1 μl 50 μM oligo 1 (˜1.125 μg)
III. 1 μl 50 μM oligo 2 (˜1.125 μg)
IV. 85 μl MQ
V. Incubate the mixture at 94° C. for 5 min and cool down to 37° C. at a rate of 0.03° C./sec
2. Add 1 μl Klenow 3′.fwdarw.5′ exo-polymerase (NEB) to each annealed oligos mixture and then add 2.5 μl of 10 μM dNTPs. Incubate at 37° C. for 1 h and then at 75° C. for 20 min.
3. Add 2 μl of the HF-BamHI and 2 μl of the BspHI restriction enzymes to 46 μl of the annealing mixture. Incubate at 37° C. for 1 h. This process will lead to the SpPAMbb insert with sticky ends. Use the Zymo DNA cleaning and concentrator kit (Zymo Research) to clean the created insert.
4. Digest pNW33n with the HF-BamHI and BspHI (NEB) and purify the 3,400 bp long linear pNW33nbb fragment with sticky ends, using the Zymo DNA cleaning and concentrator kit (Zymo Research).
5. Ligate 50 ng of pNW33nbb with 11 ng of the SpPAMbb insert using the NEB T4 ligase according to the provided protocol. Purify the ligation mixture using the Zymo DNA cleaning and concentrator kit (Zymo Research).
6. Transform DH10b electro-competent cells (200 μl of cells with 500 ng of DNA). Recover the cells in SOC medium (200 μl cells in 800 μl SOC) for an hour and then inoculate 50 ml of LB+12.5 μg/ml chloramphenicol with the recovered cells. Incubate the culture overnight at 37° C. and 180 rpm.
7. Isolate plasmid DNA from the culture using the JetStar 2.0 maxiprep kit (GENOMED).
8. Use the SapI (NEB) restriction enzyme according to the provided protocol for linearizing the isolated plasmids.

(58) Designing and Execution of the PAM Determination Reactions

(59) The following cleavage reaction was set up for gtCas9-induced introduction of dsDNA breaks to the PAM library members that contain the right PAM downstream of the 3′ end of the targeted protospacer:
1. 2.5 μg of E. cloni::pRham:cas9.sub.gt CFE per reaction
2. sgRNA to 30 nM final concentration
3. 200 ng of linearized PAM library per reaction
4. 2 μl of cleavage buffer (100 mM Sodium Phosphate buffer (pH=7.5), 500 mM NaCl, 25 mM MgCl2, 25% (v/v) Glycerol, 5 mM DTT)
5. MQ water up to 20 μl final volume
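Adding 2 μl of cleavage buffer to a 20 μl reaction corresponds to a 10-fold dilution of the stock. The short calculation below works out the resulting final concentrations and the amount of sgRNA implied by 30 nM in 20 μl. It is a sketch only: the variable names are hypothetical, and any buffer components carried over with the CFE are ignored.

```python
def final_conc(stock: float, added_ul: float, total_ul: float) -> float:
    """Final concentration after diluting `added_ul` of stock to `total_ul`."""
    return stock * added_ul / total_ul


# Stock composition of the cleavage buffer as listed in the reaction set-up.
CLEAVAGE_BUFFER_STOCK = {
    "sodium phosphate (mM)": 100,
    "NaCl (mM)": 500,
    "MgCl2 (mM)": 25,
    "glycerol (% v/v)": 25,
    "DTT (mM)": 5,
}

# 2 ul of stock in a 20 ul reaction.
final_buffer = {name: final_conc(c, 2, 20) for name, c in CLEAVAGE_BUFFER_STOCK.items()}

# 30 nM sgRNA in 20 ul: mol/L * L = mol, converted to pmol.
sgrna_pmol = 30e-9 * 20e-6 * 1e12

for name, conc in final_buffer.items():
    print(f"{name}: {conc}")
print(f"sgRNA per reaction: {sgrna_pmol:.2f} pmol")
```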

(60) The reaction was incubated for 1 h at 60° C. and stopped by adding 4 μl of 6× gel loading dye (NEB). The reaction mixture was then loaded onto a 1% agarose gel. The gel was subjected to electrophoresis for 1 h 15 min at 100 V and then incubated for 30 min in 100 ml 0.5×TAE buffer containing 10 μl of SYBR Gold dye (ThermoFisher). After visualizing the DNA bands with blue light, the band that corresponded to the successfully cleaved, PAM-containing DNA fragments was excised from the gel and purified using the “Zymoclean™ Gel DNA Recovery Kit” according to the provided protocol.

(61) Tagging of the PAM-Containing gtCas9 Cleaved DNA Fragments for Sequencing

(62) The Cas9-induced DNA breaks are usually introduced between the 3.sup.rd and the 4.sup.th nucleotide of a protospacer, proximally to the PAM sequence. As a result, it is not possible to design a pair of primers that can PCR amplify the PAM-containing part of the cleaved DNA fragments, in order to further on sequence and determine the PAM sequence. For this purpose a 5-step process was employed:
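To illustrate why a conventional primer pair cannot be used, the sketch below splits a target site at the cut position described above (between the 3rd and 4th protospacer nucleotides counted from the PAM). Only three protospacer nucleotides remain attached to the PAM, which is too short for a primer binding site. The function name and the sequences are hypothetical placeholders.

```python
def split_at_cut(protospacer: str, pam: str, offset: int = 3):
    """Split protospacer+PAM at the cut site.

    The cut is placed `offset` nucleotides upstream of the PAM, i.e. between
    the 3rd and 4th PAM-proximal protospacer nucleotides when offset is 3.
    Returns (PAM-distal fragment end, PAM-proximal fragment start).
    """
    site = protospacer + pam
    cut = len(protospacer) - offset
    return site[:cut], site[cut:]


# Hypothetical 20 nt protospacer and 7 nt PAM (illustration only).
distal, proximal = split_at_cut("ACGTACGTACGTACGTACGT", "CCCCCCA")
print(distal)    # protospacer minus its last three nucleotides
print(proximal)  # three protospacer nucleotides followed by the PAM
```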

(63) Step 1: A-Tailing with Taq Polymerase

(64) A-Tailing is a process to add a non-templated adenine to the 3′ end of a blunt, double-stranded DNA molecule using Taq polymerase.

(65) Reaction Components:
gtCas9-cleaved and PAM-containing DNA fragments: 200 ng
10×ThermoPol® Buffer (NEB): 5 μl
1 mM dATP: 10 μl
Taq DNA Polymerase (NEB): 0.2 μl
H.sub.2O: up to 50 μl final reaction volume
Incubation time: 20 min
Incubation temperature: 72° C.

(66) Step 2: Construction of the Sequencing Adaptors

(67) Two complementary short ssDNA oligonucleotides were phosphorylated and annealed to form the sequencing adaptor for the PAM-proximal site of the DNA fragments from step 1. One of the oligonucleotides had an additional thymine at its 3′ end, in order to facilitate the ligation of the adaptor to the A-tailed fragments.

(68) Adaptor Oligonucleotides Phosphorylation (Separate Phosphorylation Reactions for Each Oligo)
100 μM oligonucleotide stock: 2 μL
10×T4 DNA ligase buffer (NEB): 2 μL
Sterile MQ water: 15 μL
T4 Polynucleotide Kinase (NEB): 1 μL
Incubation time: 60 min
Incubation temperature: 37° C.
T4 PNK inactivation: 65° C. for 20 min

(69) Annealing of the Phosphorylated Oligonucleotides
Oligonucleotide 1: 5 μL from the corresponding phosphorylation mixture
Oligonucleotide 2: 5 μL from the corresponding phosphorylation mixture
Sterile MQ water: 90 μL
Incubate the phosphorylated oligos at 95° C. for 3 minutes. Cool the reaction slowly at room temperature for ˜30 min to 1 hr.

(70) Step 3: Ligation of the gtCas9-Cleaved, A-Tailed Fragments with the Sequencing Adaptors

(71) The products of step 1 and 2 were ligated according to the following protocol:
10×T4 DNA Ligase Buffer: 2 μl
Product step 1: 50 ng
Product step 2: 4 ng
T4 DNA Ligase: 1 μl
Sterile MQ water: to 20 μl
Incubation time: 10 min
Incubation temperature: 20-25° C.
Heat inactivation at 65° C. for 10 min

(72) Step 4: PCR Amplification of a 150-Nucleotides Long PAM-Containing Fragment

(73) 5 μl from the ligation mixture of step 3 were used as template for PCR amplification using Q5 DNA polymerase (NEB). The oligonucleotide with the thymine extension from step 2 was employed as the forward primer and the reverse primer was designed to anneal 150 nucleotides downstream of the PAM sequence.

(74) The same sequence was amplified using non-gtCas9 treated PAM-library DNA as template. Both PCR products were gel purified and sent for Illumina HiSeq 2500 paired-end sequencing (Baseclear).

(75) Analysis of the Sequencing Results and Determination of the Candidate PAM Sequences

(76) After analysing the sequencing results the following frequency matrices were constructed. The matrices depict the relative abundance of each nucleotide at every PAM position of the gtCas9 digested and non-digested libraries:

(77) TABLE-US-00003
               pos1   pos2   pos3   pos4   pos5   pos6   pos7
Non-digested
  A           19.22  20.83  19.12  24.43  24.59  21.75  18.22
  C           34.75  30     31.9   30.54  25.96  27.9   27.17
  T           19.16  22.19  25.34  21.28  26.09  26     21.56
  G           26.87  26.98  23.64  23.75  23.36  24.35  33.05
Digested
  A           10.63  18.65  14.6   14.49   3.36   8.66  27.54
  C           66.22  49.59  56.82  60.35  92.4   62.26  34.94
  T            8.09  11.21  19.12  12.15   2.35  14.66   5.58
  G           15.05  20.54   9.45  13.01   1.89  14.43  31.94

(78) These results indicate a clear preference for targets with cytosine at the 5.sup.th PAM position and preference for targets with cytosines at the first 4 PAM positions.
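One way to read the two frequency matrices together is to divide each digested frequency by the corresponding non-digested frequency, which corrects for the uneven base composition of the starting library. The sketch below does this with the values from the table above; the ratio is an illustrative normalization chosen here, not necessarily the analysis performed for this Example, and the function names are hypothetical.

```python
# Relative nucleotide abundance (%) at PAM positions 1-7, from the table above.
NON_DIGESTED = {
    "A": [19.22, 20.83, 19.12, 24.43, 24.59, 21.75, 18.22],
    "C": [34.75, 30.0, 31.9, 30.54, 25.96, 27.9, 27.17],
    "T": [19.16, 22.19, 25.34, 21.28, 26.09, 26.0, 21.56],
    "G": [26.87, 26.98, 23.64, 23.75, 23.36, 24.35, 33.05],
}
DIGESTED = {
    "A": [10.63, 18.65, 14.6, 14.49, 3.36, 8.66, 27.54],
    "C": [66.22, 49.59, 56.82, 60.35, 92.4, 62.26, 34.94],
    "T": [8.09, 11.21, 19.12, 12.15, 2.35, 14.66, 5.58],
    "G": [15.05, 20.54, 9.45, 13.01, 1.89, 14.43, 31.94],
}


def enrichment(digested: dict, control: dict) -> dict:
    """Ratio of digested to control frequency for each base and position."""
    return {nt: [d / c for d, c in zip(digested[nt], control[nt])] for nt in digested}


def top_base_per_position(ratios: dict) -> list:
    """Most enriched base at each PAM position."""
    n_positions = len(next(iter(ratios.values())))
    return [max(ratios, key=lambda nt: ratios[nt][i]) for i in range(n_positions)]


ratios = enrichment(DIGESTED, NON_DIGESTED)
for nt, row in ratios.items():
    print(nt, " ".join(f"{r:5.2f}" for r in row))
print(top_base_per_position(ratios))
```

With this normalization the cytosine enrichment is largest at the 5th position, in line with the preference stated above.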

Example 7: In Silico PAM Prediction for gtCas9

(79) In silico predictions of PAMs are possible if enough protospacer sequences are available in genome databases. The in silico prediction of the gtCas9 PAM started with identification of hits of spacers from the CRISPR array in the genome of G. thermodenitrificans T12 strain by comparison to sequences in genome databases such as GenBank. The “CRISPR finder” (http://crispr.u-psud.fr/Server/) tool was used to identify candidate CRISPR loci in T12. The identified CRISPR loci output was then loaded into the “CRISPR target” (http://bioanalysis.otago.ac.nz/CRISPRTarget/crispr_analysis.html) tool, which searches selected databases and provides an output with matching protospacers. These protospacer sequences were then screened for unique hits and for complementarity to spacers; for example, hits with mismatches in the seed sequence were considered likely false positives and were excluded from further analysis. Hits with identity to prophage sequences and (integrated) plasmids demonstrated that the obtained hits were true positives. Overall, this process yielded 6 single hits (FIG. 7). Subsequently, the flanking regions (3′ for Type II gtCas nuclease) of the remaining, unique protospacer hits were aligned and compared for consensus sequences using the WebLogo tool (weblogo.berkeley.edu/logo.cgi) (Crooks G E, Hon G, Chandonia J M, Brenner S E, WebLogo: A sequence logo generator, Genome Research, 14:1188-1190, (2004)) (FIG. 8).

(80) The in silico results were comparable to the in vitro PAM identification experimental results (see Example 6) in which there was a bias for the identity of the 5.sup.th residue of the PAM sequence to be a cytosine.

Example 8: Determination of 8 Nucleotide Long PAM Sequences for gtCas9

(81) The in silico data from Example 7 suggested that gtCas9 had some preference for adenosine at the 8.sup.th position; therefore, further PAM determination experiments were carried out in which the 8.sup.th position of the PAM sequence was also tested. This is consistent with the characterisation of the Cas9 PAM sequence of mesophilic Brevibacillus laterosporus SSP360D4 (Karvelis et al., 2015), which was found to extend between the 5.sup.th and the 8.sup.th positions at the 3′ end of a protospacer.

(82) Specific 8 nucleotide-long sequence variants of the PAMs were trialed with gtCas9:

(83) 1) CNCCCCAC [SEQ ID NO: 17],

(84) 2) CCCCCCAG [SEQ ID NO: 18],

(85) 3) CCCCCCAA [SEQ ID NO: 11],

(86) 4) CCCCCCAT [SEQ ID NO: 19],

(87) 5) CCCCCCAC [SEQ ID NO: 20],

(88) 6) NNNNTNNC (negative control PAM)
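The PAM variants above are written with IUPAC nucleotide codes, in which N stands for any base. A small helper that turns such a pattern into a regular expression makes it easy to check whether a concrete sequence falls within a given pattern. This is a generic sketch; the function names are hypothetical and the code plays no part in the experiments described here.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "V": "[ACG]", "H": "[ACT]",
    "D": "[AGT]", "B": "[CGT]", "N": "[ACGT]",
}


def pam_regex(pattern: str):
    """Compile an IUPAC PAM pattern such as 'NNNNTNNC' into a regex."""
    return re.compile("".join(IUPAC_REGEX[c] for c in pattern.upper()))


def pam_matches(pattern: str, sequence: str) -> bool:
    """True if the whole sequence is covered by the IUPAC pattern."""
    return pam_regex(pattern).fullmatch(sequence.upper()) is not None


for pattern in ("CNCCCCAC", "CCCCCCAA", "NNNNTNNC"):
    print(pattern, pam_matches(pattern, "CCCCCCAC"))
```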

(89) After performing an in vitro cleavage assay at 60° C. targeting these (non-linearized) plasmids with purified gtCas9 and the same sgRNA as before (see Example 6), increased gtCas9 cleavage activity was observed when the CCCCCCAA [SEQ ID NO: 11] sequence was employed as PAM (FIG. 9). However, cleavage activity was clearly detectable for all the tested PAM sequences; even for the negative control PAM sequence, a faint cleavage band was observed. Without wishing to be bound to a particular theory, it is possible that use of a high gtCas9 concentration contributed to the cleavage observed with the negative control. It has been generally observed that high Cas9 concentrations in in vitro assays lead to Cas9-induced DNA cleavage without a stringent PAM requirement.

(90) Cas9 concentration in general is known to influence the efficiency of the Cas9 induced DNA cleavage (higher Cas9 concentration results in higher Cas9 activity). This was also observed when performing in vitro assays using the targeted plasmid with the CCCCCCAA [SEQ ID NO: 11] PAM sequence and different gtCas9 concentrations (FIG. 10).

(91) In vitro assays as described above, using the targeted plasmid with the CCCCCCAA [SEQ ID NO: 11] PAM sequence, were conducted over a wide temperature range between 38 and 78° C. (FIG. 11). Surprisingly, gtCas9 was active at all the temperatures, showing the highest activity between 40.1 and 64.9° C.

(92) Thus the optimal temperature range of Cas9 from Geobacillus species is much higher than that of Cas9 proteins which have been characterised to date. Similarly the upper extent of the range in which it retains nuclease activity is much higher than that of known Cas9 proteins. A higher optimal temperature and functional range provides a significant advantage in genetic engineering at high temperatures and therefore in editing the genomes of thermophilic organisms, which have utility in a range of industrial, agricultural and pharmaceutical processes conducted at elevated temperatures.

Example 9: In Vivo Genome Editing of Bacillus smithii ET138 with gtCas9 and 8 Nucleotide Length PAM Sequences

(93) To confirm that the 8 nucleotide PAMs were also recognised by gtCas9 in vivo, an experiment was designed to delete the pyrF gene in the genome of Bacillus smithii ET138 at 55° C.

(94) This method relies upon providing a homologous recombination template construct in which regions complementary to the upstream and downstream of the target (pyrF) gene are provided to B. smithii ET138 cells. Introduction of the template allows for the process of homologous recombination to be used to introduce the homologous recombination template (with no pyrF gene) into the genome such that it also replaces the WT pyrF gene in the genome of a cell.

(95) Inclusion of a gtCas9 and a sgRNA in the homologous recombination construct can be used to introduce double stranded DNA breaks (DSDBs) into bacterial genomes that contain WT pyrF. DSDBs in a bacterial genome typically result in cell death. Therefore, a sgRNA that recognises a sequence in the WT pyrF could result in DSDB and death of cells containing the WT pyrF only. Introduction of DSDB is also dependent on a suitable PAM sequence being located downstream at the 3′ end of the protospacer that is recognised by gtCas9.

(96) The pNW33n plasmid was used as a backbone to clone: i) the cas9.sub.gt gene under the control of an in-house developed glucose repressible promoter; and ii) the 1 kb upstream and 1 kb downstream regions of the pyrF gene in the genome of B. smithii ET138 as a template for homologous recombination that would result in deletion of the pyrF gene from the genome of B. smithii ET138; and iii) single guide RNA (sgRNA) expressing module under the transcriptional control of a constitutive promoter.

(97) Three separate constructs were generated in which the sequence of the single guide RNAs differed at the first 20 nucleotides, which correspond to the sequence that guides the gtCas9 to its specific DNA target in the genome (also known as the spacer). The three different spacer sequences were designed to target three different candidate protospacers all in the pyrF gene of B. smithii ET138. The constructs are herein referred to as constructs 1, 2 and 3 respectively.

(98) The three different targeted protospacers had at their 3′-end the following candidate PAM sequences:
1. TCCATTCC (negative control according to the results of the in vitro assays; 3′-end of the protospacer targeted by the sgRNA encoded on construct number 3)
2. ATCCCCAA (3′-end of the protospacer targeted by the sgRNA encoded on construct number 1; [SEQ ID NO: 21])
3. ACGGCCAA (3′-end of the protospacer targeted by the sgRNA encoded on construct number 2; [SEQ ID NO: 22])

(99) After transforming B. smithii ET138 cells with one of the three constructs and plating on selection plates, the following results were obtained:
1. When the cells were transformed with the construct targeting the protospacer that had the negative control TCCATTCC PAM sequence at the 3′ end (construct number 3), the transformation efficiency was not affected (FIG. 12 A). The number of colonies was in the same range as the number of colonies after transformation with the pNW33n positive control construct (FIG. 12 B). Of the 15 colonies that were subjected to colony PCR to screen for colonies in which the pyrF gene was deleted, none showed the deletion genotype (2.1 kb expected band size); all were wild-type (2.9 kb expected band size) (FIG. 13). This indicates that the tested PAM was indeed not recognised by the gtCas9 in vivo.
2. When the cells were transformed with construct number 1, only a few colonies were obtained (FIG. 12 C) when compared to the positive control (cells transformed with pNW33n). 20 colonies were subjected to colony PCR to screen for colonies in which the pyrF gene was deleted. The majority (19) of the colonies contained both the wild type and pyrF deletion genotype, whilst one colony had a pyrF deletion genotype (FIG. 14). This result indicated that the PAM sequence ATCCCCAA [SEQ ID NO: 21] is recognised in vivo by gtCas9, because no WT-only genotypes were observed. The reduced transformation efficiency is also indicative that a proportion of the cell population has been lost, which could be attributable to death of WT-only genotype cells caused by DSDB due to successful targeting by gtCas9.
3. When the cells were transformed with construct number 2, no colonies were obtained (FIG. 12 D). The lack of colonies is indicative that all of the cell population had been successfully targeted by the gtCas9, which led to cell death caused by DSDB. This suggests that the ACGGCCAA [SEQ ID NO: 22] PAM sequence is recognised by gtCas9.

(100) These results indicate that gtCas9 is active at 55° C. in vivo with the above-mentioned PAM sequences, a result that is in agreement with the in vitro PAM determination results. Moreover, it can be used as a genome editing tool at the same temperature in combination with a plasmid-borne homologous recombination template.

Example 10: ThermoCas9 Identification and Purification

(101) We recently isolated and sequenced Geobacillus thermodenitrificans strain T12, a Gram positive, moderately thermophilic bacterium with an optimal growth temperature at 65° C. (Daas et al. Biotechnol. Biofuels 9, 210 (2016)). Contrary to previous claims that type II CRISPR-Cas systems are not present in thermophilic bacteria (Li et al. Nucleic Acids Res. 44, e34-e34 (2016)), the sequencing results revealed the existence of a type-IIC CRISPR-Cas system in the genome of G. thermodenitrificans T12 (FIG. 15A). The Cas9 endonuclease of this system (ThermoCas9) was predicted to be relatively small (1082 amino acids) compared to other Cas9 orthologues, such as SpCas9 (1368 amino acids). The size difference is mostly due to a truncated REC lobe, as has been demonstrated for other small Cas9 orthologues (FIG. 19)(Ran et al. Nature 520, 186-191 (2015)). Furthermore, ThermoCas9 was expected to be active at least around the temperature optimum of G. thermodenitrificans T12 (Daas et al. Biotechnol. Biofuels 9, 210 (2016)). Using the ThermoCas9 sequence as query, we performed BLAST-P searches in the NCBI/non-redundant protein sequences dataset, and found a number of highly identical Cas9 orthologues (87-99% identity at protein level, Table 1), mostly within the Geobacillus genus, supporting the idea that ThermoCas9 is part of a highly conserved defense system of thermophilic bacteria (FIG. 15B). These characteristics suggested it may be a potential candidate for exploitation as a genome editing and silencing tool for thermophilic microorganisms, and for conditions at which enhanced protein robustness is required.

(102) We initially performed in silico prediction of the crRNA and tracrRNA modules of the G. thermodenitrificans T12 CRISPR-Cas system using a previously described approach (Mougiakos et al. Trends Biotechnol. 34, 575-587 (2016); Ran et al. Nature 520, 186-191 (2015)). Based on this prediction, a 190 nt sgRNA chimera was designed by linking the predicted full size crRNA (30 nt long spacer followed by 36 nt long repeat) and tracrRNA (36 nt long anti-repeat followed by a 88 nt sequence with three predicted hairpin structures). ThermoCas9 was heterologously expressed in E. coli and purified to homogeneity. Hypothesizing that the loading of the sgRNA to the ThermoCas9 would stabilize the protein, we incubated purified apo-ThermoCas9 and ThermoCas9 loaded with in vitro transcribed sgRNA at 60° C. and 65° C., for 15 and 30 min. SDS-PAGE analysis showed that the purified ThermoCas9 denatures at 65° C. but not at 60° C., while the denaturation temperature of ThermoCas9-sgRNA complex is above 65° C. (FIG. 15C). The demonstrated thermostability of ThermoCas9 implied its potential as a thermo-tolerant CRISPR-Cas9 genome editing tool, and encouraged us to analyze some relevant molecular features in more detail.
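As a quick consistency check, the four stated parts of the sgRNA chimera add up to the quoted length of 190 nt. The dictionary keys below are descriptive labels chosen for this sketch.

```python
# Lengths (nt) of the sgRNA parts as stated in the paragraph above.
SGRNA_PARTS_NT = {
    "spacer": 30,
    "repeat": 36,
    "anti-repeat": 36,
    "tracrRNA 3' region with three predicted hairpins": 88,
}

total_nt = sum(SGRNA_PARTS_NT.values())
print(total_nt)
```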

(103) TABLE-US-00004 TABLE 1 pBLAST results of Cas9 protein sequences from FIG. 1B compared to ThermoCas9.
Species                                      % identity.sup.a
Geobacillus 47C-IIb                          99
Geobacillus 46C-IIa                          89
Geobacillus LC300                            89
Geobacillus jurassicus                       89
Geobacillus MAS1                             88
Geobacillus stearothermophilus               88
Geobacillus stearothermophilus ATCC 12980    88
Geobacillus Sah69                            88
Geobacillus stearothermophilus               88
Geobacillus kaustophilus                     88
Geobacillus stearothermophilus               88
Geobacillus genomosp. 3                      87
Geobacillus genomosp. 3                      87
Geobacillus subterraneus                     87
Effusibacillus pohliae                       86

Example 11: ThermoCas9 PAM Determination

(104) The first step towards the characterization of ThermoCas9 was the in silico prediction of its PAM preferences for successful cleavage of a DNA target. We used the 10 spacers of the G. thermodenitrificans T12 CRISPR locus to search for potential protospacers in viral and plasmid sequences using CRISPRtarget (Biswas et al. RNA Biol. 10, 817-827 (2013)). As only two hits were obtained with phage genomes (FIG. 20A), it was decided to proceed with an in vitro PAM determination approach. We in vitro transcribed the predicted sgRNA sequence, which contained a spacer for ThermoCas9-based targeting of linear dsDNA substrates with a matching protospacer. The protospacer was flanked at its 3′-end by randomized 7-base pair (bp) sequences. After performing ThermoCas9-based cleavage assays at 55° C., the cleaved members of the library (together with a non-targeted library sample as control) were deep-sequenced and compared in order to identify the ThermoCas9 PAM preference (FIG. 16A). The sequencing results revealed that ThermoCas9 introduces double stranded DNA breaks that, in analogy to mesophilic Cas9 variants, are located mostly between the 3.sup.rd and the 4.sup.th PAM proximal nucleotides. Moreover, the cleaved sequences revealed that ThermoCas9 recognizes a 5′-NNNNCNR-3′ PAM, with subtle preference for cytosine at the 1.sup.st, 3.sup.rd, 4.sup.th and 6.sup.th PAM positions (FIG. 16B). Recent studies have revealed the importance of the 8.sup.th PAM position for target recognition of certain Type IIC Cas9 orthologues (Karvelis et al. Genome Biol. 16, 253 (2015); Kim et al. Genome Res. 24, 1012-9 (2014)). For this reason, and taking into account the results from the in silico ThermoCas9 PAM prediction, we performed additional PAM determination assays. These revealed optimal targeting efficiency in the presence of an adenine at the 8.sup.th PAM position (FIG. 16C). Interestingly, despite the limited number of hits, the aforementioned in silico PAM prediction (FIG. 20B) also suggested the significance of a cytosine at the 5.sup.th and an adenine at the 8.sup.th PAM positions.
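
By way of illustration only, the PAM rule reported above can be expressed as a short Python check. This is a minimal sketch that assumes the standard IUPAC degenerate nucleotide codes (N = any base, R = A or G). The names matches_pam and find_targets, the default 20-nt spacer length and the test sequences are hypothetical and form no part of the experimental procedure described in this Example.

```python
# Standard IUPAC degenerate nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "V": "ACG", "H": "ACT",
    "D": "AGT", "B": "CGT", "N": "ACGT",
}

PAM_7NT = "NNNNCNR"   # PAM reported for ThermoCas9 (FIG. 16B)
PAM_8NT = "NNNNCNRA"  # same PAM with the favoured adenine at position 8 (FIG. 16C)


def matches_pam(candidate, pattern=PAM_7NT):
    """Return True if a concrete DNA sequence satisfies a degenerate PAM pattern."""
    candidate = candidate.upper()
    if len(candidate) != len(pattern):
        return False
    return all(base in IUPAC[code] for base, code in zip(candidate, pattern))


def find_targets(non_target_strand, spacer_len=20, pattern=PAM_7NT):
    """List (protospacer, PAM) pairs where the PAM lies directly 3' of the protospacer.

    The 20-nt default spacer length is an illustrative assumption.
    """
    seq = non_target_strand.upper()
    hits = []
    for start in range(len(seq) - spacer_len - len(pattern) + 1):
        pam_start = start + spacer_len
        pam = seq[pam_start:pam_start + len(pattern)]
        if matches_pam(pam, pattern):
            hits.append((seq[start:pam_start], pam))
    return hits


if __name__ == "__main__":
    print(matches_pam("CCCCCAA"))            # cytosine at position 5, purine at position 7
    print(matches_pam("CCCCTAA"))            # thymine at position 5 does not fit the pattern
    print(matches_pam("CCCCCAAA", PAM_8NT))  # 8-nt form with adenine at position 8
```

Scanning only the non-target strand keeps the sketch short; a practical search would also scan the reverse complement.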

(105) To further clarify the ambiguity of the PAM at the 6.sup.th and 7.sup.th PAM positions, we generated a set of 16 different target DNA fragments in which the matching protospacer was flanked by 5′-CCCCCNNA-3′ [SEQ ID NO: 13] PAMs. Cleavage assays of these fragments (each with a unique combination of the 6.sup.th and 7.sup.th nucleotide) were performed in which the different components (ThermoCas9, sgRNA guide, dsDNA target) were pre-heated separately at different temperatures (20, 30, 37, 45, 55 and 60° C.) for 10 min before combining and incubating them for 1 hour at the corresponding assay temperature. When the assays were performed at temperatures between 37° C. and 60° C., all the different DNA substrates were cleaved (FIG. 16D, FIG. 21). However, the most digested target fragments contained the PAM sequences (5.sup.th to 8.sup.th PAM positions) 5′-CNAA-3′ and 5′-CMCA-3′, whereas the least digested targets contained a 5′-CAKA-3′ PAM. At 30° C., only cleavage of the DNA substrates with the optimal PAM sequences (5.sup.th to 8.sup.th PAM positions) 5′-CNAA-3′ and 5′-CMCA-3′ was observed (FIG. 2D). Lastly, at 20° C. only the DNA substrates with (5.sup.th to 8.sup.th PAM positions) 5′-CVAA-3′ and 5′-CCCA-3′ PAM sequences were targeted (FIG. 21), making these sequences the most preferred PAMs. These findings demonstrate that at its lower temperature limit, ThermoCas9 only cleaves fragments with a preferred PAM. This characteristic could be exploited during in vivo editing processes, for example to avoid off-target effects.
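
The degenerate PAM notation used in this paragraph can be unpacked with a few lines of Python. The following is a sketch only, assuming the standard IUPAC codes (M = A or C, K = G or T, V = A, C or G, N = any base). The helper names expand and positions_5_to_8 and the labels printed for each PAM are hypothetical conveniences, not terms used in the assays.

```python
from itertools import product

# Subset of the IUPAC degenerate codes needed for the PAMs named in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "K": "GT", "V": "ACG", "N": "ACGT",
}


def expand(pattern):
    """Expand a degenerate pattern into the set of concrete sequences it covers."""
    return {"".join(bases) for bases in product(*(IUPAC[code] for code in pattern))}


def positions_5_to_8(pam):
    """Return PAM positions 5 to 8 of an 8-nt PAM (1-based numbering as in the text)."""
    return pam[4:8]


# The 16 tested PAMs: 5'-CCCCCNNA-3'.
library = sorted(expand("CCCCCNNA"))

# Groups reported in the text, written for PAM positions 5 to 8.
most_digested = expand("CNAA") | expand("CMCA")
least_digested = expand("CAKA")
cleaved_at_20c = expand("CVAA") | {"CCCA"}

if __name__ == "__main__":
    for pam in library:
        tail = positions_5_to_8(pam)
        if tail in most_digested:
            label = "most digested"
        elif tail in least_digested:
            label = "least digested"
        else:
            label = "not singled out in the text"
        print(pam, tail, label)
```

Under these assumptions every PAM cleaved at 20° C. also falls within the group reported as most digested at higher temperatures, which is consistent with the statement that these are the most preferred PAMs.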

Example 12: Thermostability and Truncations

(106) The predicted tracrRNA consists of the anti-repeat region followed by three hairpin structures (FIG. 17A). Using the tracrRNA along with the crRNA to form a sgRNA chimera resulted in successful guided cleavage of the DNA substrate. It was observed that a 41-nt long deletion of the spacer distal end of the full-length repeat-anti-repeat hairpin (FIG. 17A), most likely better resembling the dual guide's native state, had little to no effect on the DNA cleavage efficiency. The effect of further truncation of the predicted hairpins (FIG. 17A) on the cleavage efficiency of ThermoCas9 was evaluated by performing a cleavage time-series in which all the components (sgRNA, ThermoCas9, substrate DNA) were pre-heated separately at different temperatures (37-65° C.) for 1, 2 and 5 min before combining and incubating them for 1 hour at various assay temperatures (37-65° C.). The number of predicted stem-loops of the tracrRNA scaffold seemed to play a crucial role in DNA cleavage; when all three loops were present, the cleavage efficiency was the highest at all tested temperatures, whereas the efficiency decreased upon removal of the 3′ hairpin (FIG. 17B). Moreover, the cleavage efficiency drastically dropped upon removal of both the middle and the 3′ hairpins (FIG. 22). Whereas pre-heating ThermoCas9 at 65° C. for 1 or 2 min resulted in detectable cleavage, the cleavage activity was abolished after 5 min incubation. The thermostability assay showed that sgRNA variants without the 3′ stem-loop result in decreased stability of the ThermoCas9 protein at 65° C., indicating that a full length tracrRNA is required for optimal ThermoCas9-based DNA cleavage at elevated temperatures. Additionally, we varied the length of the spacer sequence (from 25 to 18 nt) and found that guides with spacer lengths of 23, 21, 20 and 19 nt cleaved the targets with the highest efficiency. The cleavage efficiency dropped significantly when a spacer of 18 nt was used.

(107) In vivo, the ThermoCas9:sgRNA RNP complex is probably formed within minutes. Together with the above findings, this motivated us to evaluate the activity and thermostability of the RNP. Pre-assembled RNP complex was heated at 60, 65 and 70° C. for 5 and 10 min before adding pre-heated DNA and subsequent incubation for 1 hour at 60, 65 and 70° C. Strikingly, we observed that the ThermoCas9 RNP was active up to 70° C., in spite of its pre-heating for 5 min at 70° C. (FIG. 17C). This finding confirmed our assumption that the ThermoCas9 stability strongly correlates with the association of an appropriate sgRNA guide (Ma et al., Mol. Cell 60, 398-407 (2015)).

(108) It would be advantageous in some applications for the ThermoCas9 to have a broad temperature activity range, that is, to be functional at both low and high temperatures. Also, in some circumstances, it would be advantageous if the activity of the ThermoCas9 could be restricted to narrower temperature ranges, for example, active at only low or only high temperatures. Consequently, the ability to manipulate the range of temperatures at which ThermoCas9 is capable of targeted cleavage or binding or at which targeted cleavage or binding takes place efficiently, by modifying structural features of ThermoCas9 or associated elements (such as the sgRNA), would enable a greater level of control to be exerted over nucleic acid sequence manipulation. Hence, we set out to compare the ThermoCas9 temperature range to that of the Streptococcus pyogenes Cas9 (SpCas9). Both Cas9 homologues were subjected to in vitro activity assays between 20 and 65° C. Both proteins were incubated for 5 min at the corresponding assay temperature prior to the addition of the sgRNA and the target DNA molecules. In agreement with previous analysis, the mesophilic SpCas9 was active only between 25 and 44° C. (FIG. 17D); above this temperature, SpCas9 activity rapidly decreased to undetectable levels. In contrast, ThermoCas9 cleavage activity could be detected between 25 and 65° C. (FIG. 17D). This indicates the potential to use ThermoCas9 as a genome editing tool for both thermophilic and mesophilic organisms.

(109) Previously characterized, mesophilic Cas9 endonucleases employ divalent cations to catalyze the generation of DSBs in target DNA (Jinek et al. Science 337, 816-821 (2012); Chen et al. J. Biol. Chem. 289, 13284-13294 (2014)). To evaluate which cations contribute to DNA cleavage by ThermoCas9, plasmid cleavage assays were performed in the presence of one of the following divalent cations: Mg.sup.2+, Ca.sup.2+, Mn.sup.2+, Co.sup.2+, Ni.sup.2+, and Cu.sup.2+; an assay with the cation-chelating agent EDTA was included as negative control. As expected, target dsDNA was cleaved in the presence of divalent cations and remained intact in the presence of EDTA (FIG. 23A). Based on reports that certain type-IIC systems were efficient single stranded DNA cutters (Ma et al. Mol. Cell 60, 398-407 (2015); Zhang et al. Mol. Cell 60, 242-255 (2015)), we tested the activity of ThermoCas9 on ssDNA substrates. However, no cleavage was observed, indicating that ThermoCas9 is a dsDNA nuclease (FIG. 23B).

Example 13: ThermoCas9-Based Gene Deletion in the Thermophile B. smithii

(110) We set out to develop a ThermoCas9-based genome editing tool for thermophilic bacteria. Here, we show a proof of principle using Bacillus smithii ET 138 cultured at 55° C. In order to use a minimum of genetic parts, we followed a single plasmid approach. We constructed a set of pNW33n-based pThermoCas9 plasmids containing the thermocas9 gene under the control of the native xylL promoter (P.sub.xylL), a homologous recombination template for repairing Cas9-induced double stranded DNA breaks within a gene of interest, and a sgRNA-expressing module under control of the constitutive pta promoter (P.sub.pta) from Bacillus coagulans (FIG. 4A).

(111) The first goal was the deletion of the full length pyrF gene from the genome of B. smithii ET 138. The pNW33n-derived plasmids pThermoCas9_bsΔpyrF1 and pThermoCas9_bsΔpyrF2 were used for expression of different ThermoCas9 guides with spacers targeting different sites of the pyrF gene, while a third plasmid (pThermoCas9_ctrl) contained a random non-targeting spacer in the sgRNA-expressing module. Transformation of B. smithii ET 138 competent cells at 55° C. with the control plasmids pNW33n (no guide) and pThermoCas9_ctrl resulted in the formation of ~200 colonies each. Out of 10 screened pThermoCas9_ctrl colonies, none contained the ΔpyrF genotype, confirming findings from previous studies that homologous recombination in B. smithii ET 138 is not sufficient to obtain clean mutants (Mougiakos et al. ACS Synth. Biol. 6, 849-861 (2017); Bosma et al. Microb. Cell Fact. 14, 99 (2015)). In contrast, transformation with the pThermoCas9_bsΔpyrF1 and pThermoCas9_bsΔpyrF2 plasmids resulted in 20 and 0 colonies respectively, confirming the in vivo activity of ThermoCas9 at 55° C. and verifying the above described broad in vitro temperature range of the protein. Out of ten pThermoCas9_bsΔpyrF1 colonies screened, one was a clean ΔpyrF mutant whereas the rest had a mixed wild type/ΔpyrF genotype (FIG. 4B), proving the applicability of the system, as the designed homology directed repair of the targeted pyrF gene was successful. Nonetheless, in the tightly controlled SpCas9-based counter-selection system we previously developed, the pyrF deletion efficiency was higher (Olson et al., Curr. Opin. Biotechnol. 33, 130-141 (2015)). The low number of obtained transformants and clean mutants in the ThermoCas9-based tool can be explained by the low homologous recombination efficiency in B. smithii (Olson et al., Curr. Opin. Biotechnol. 33, 130-141 (2015)) combined with the constitutive expression of highly active ThermoCas9. It is anticipated that the use of a tightly controllable promoter will increase efficiencies.

Example 14: ThermoCas9-Based Gene Deletion in the Mesophile Pseudomonas putida

(112) To broaden the applicability of the ThermoCas9-based genome editing tool, and to evaluate whether in vitro results could be confirmed in vivo, its activity in the mesophilic Gram-negative bacterium P. putida KT2440 was evaluated by combining homologous recombination and ThermoCas9-based counter-selection. For this organism, a Cas9-based tool has not been reported to date. Once more, we followed a single plasmid approach. We constructed the pEMG-based pThermoCas9_ppΔpyrF plasmid containing the thermocas9 gene under the control of the 3-methylbenzoate-inducible Pm-promoter, a homologous recombination template for deletion of the pyrF gene and a sgRNA expressing module under the control of the constitutive P3 promoter. After transformation of P. putida KT2440 cells and PCR confirmation of plasmid integration, a colony was inoculated in selective liquid medium for overnight culturing at 37° C. The overnight culture was used for inoculation of selective medium and ThermoCas9 expression was induced with 3-methylbenzoate. Subsequently, dilutions were plated on non-selective medium, supplemented with 3-methylbenzoate. For comparison, a parallel experiment without inducing ThermoCas9 expression with 3-methylbenzoate was performed. The process resulted in 76 colonies for the induced culture and 52 colonies for the non-induced control culture. For the induced culture, 38 colonies (50%) had a clean deletion genotype and 6 colonies had mixed wild-type/deletion genotype. On the contrary, only 1 colony (2%) of the non-induced culture had the deletion genotype and there were no colonies with mixed wild-type/deletion genotype retrieved (FIG. 24). These results show that ThermoCas9 can be used as an efficient counter-selection tool in the mesophile P. putida KT2440 when grown at 37° C.
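
The percentages quoted in this paragraph follow directly from the colony counts. As a worked check only, the short Python sketch below recomputes them; the dictionary layout and the function name genotype_percentages are hypothetical, and the fold difference is a derived figure that is not stated in the text.

```python
# Colony counts as reported for the P. putida KT2440 experiment.
induced = {"total": 76, "clean": 38, "mixed": 6}
non_induced = {"total": 52, "clean": 1, "mixed": 0}


def genotype_percentages(counts):
    """Return the percentage of clean-deletion and mixed-genotype colonies."""
    total = counts["total"]
    return {
        "clean": round(100 * counts["clean"] / total, 1),
        "mixed": round(100 * counts["mixed"] / total, 1),
    }


def fold_difference(treated, control):
    """Ratio of clean-deletion frequencies between two cultures."""
    treated_freq = treated["clean"] / treated["total"]
    control_freq = control["clean"] / control["total"]
    return treated_freq / control_freq


if __name__ == "__main__":
    print(genotype_percentages(induced))      # clean 50.0 %, mixed 7.9 %
    print(genotype_percentages(non_induced))  # clean 1.9 %, mixed 0.0 %
    print(fold_difference(induced, non_induced))
```

The 2% given in the text for the non-induced culture corresponds to 1 of 52 colonies (about 1.9%).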

Example 15: ThermoCas9-Based Gene Silencing

(113) An efficient thermoactive transcriptional silencing CRISPRi tool is currently not available. Such a system could be useful in a number of applications. For example, such a system would greatly facilitate metabolic studies of thermophiles. A catalytically dead variant of ThermoCas9 could serve this purpose by steadily binding to DNA elements without introducing dsDNA breaks. To this end, we identified the RuvC and HNH catalytic domains of ThermoCas9 and introduced the corresponding D8A and H582A mutations to create a dead ThermoCas9 (Thermo-dCas9). After confirmation of the designed sequence, Thermo-dCas9 was heterologously produced, purified and used for an in vitro cleavage assay with the same DNA target as used in the aforementioned ThermoCas9 assays; no cleavage was observed, confirming the catalytic inactivation of the nuclease.
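
For readers who wish to reproduce the D8A and H582A substitutions in silico, the following Python sketch applies point mutations written in the usual "wild-type residue, position, new residue" notation and refuses to proceed if the stated wild-type residue is not found. It is an illustration under stated assumptions: the function name apply_substitutions is hypothetical, and the 1082-residue sequence used here is a placeholder built only so that positions 8 and 582 carry D and H; it is not SEQ ID NO: 1.

```python
import re

SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_substitutions(protein, substitutions):
    """Apply substitutions such as 'D8A' (1-based positions) to a protein sequence."""
    residues = list(protein)
    for sub in substitutions:
        match = SUBSTITUTION.fullmatch(sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Placeholder sequence (NOT SEQ ID NO: 1): glycines with D at position 8 and H at position 582.
placeholder = ["G"] * 1082
placeholder[7] = "D"
placeholder[581] = "H"
wild_type_placeholder = "".join(placeholder)

dead_variant = apply_substitutions(wild_type_placeholder, ["D8A", "H582A"])

if __name__ == "__main__":
    print(dead_variant[7], dead_variant[581])  # both positions now carry alanine
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, for example when a tagged construct shifts residue positions.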

(114) Towards the development of a Thermo-dCas9-based CRISPRi tool, we aimed for the transcriptional silencing of the highly expressed ldhL gene from the genome of B. smithii ET 138. We constructed the pNW33n-based vectors pThermoCas9i_ldhL and pThermoCas9i_ctrl. Both vectors contained the thermo-dCas9 gene under the control of the P.sub.xylL promoter and a sgRNA-expressing module under the control of the constitutive P.sub.pta promoter (FIG. 4C). The pThermoCas9i_ldhL plasmid contained a spacer for targeting the non-template DNA strand at the 5′ end of the ldhL gene in B. smithii ET 138 (Figure S7). The position and targeted strand selection were based on previous studies (Bikard et al. Nucleic Acids Res. 41, 7429-7437 (2013); Larson et al. Nat. Protoc. 8, 2180-2196 (2013)), aiming for the efficient down-regulation of the ldhL gene. The pThermoCas9i_ctrl plasmid contained a random non-targeting spacer in the sgRNA-expressing module. The constructs were used to transform B. smithii ET 138 competent cells at 55° C. followed by plating on LB2 agar plates, resulting in equal numbers of colonies. Two out of the approximately 700 colonies per construct were selected for culturing under microaerobic lactate-producing conditions for 24 hours, as described previously (Bosma et al. Appl. Environ. Microbiol. 81, 1874-1883 (2015)). The growth of the pThermoCas9i_ldhL cultures was 50% less than the growth of the pThermoCas9i_ctrl cultures (FIG. 4E). We have previously shown that deletion of the ldhL gene leads to severe growth retardation in B. smithii ET 138 due to a lack of Ldh-based NAD.sup.+-regenerating capacity under micro-aerobic conditions (Bosma et al. Microb. Cell Fact. 14, 99 (2015)). Thus, the observed decrease in growth is likely caused by the transcriptional inhibition of the ldhL gene and subsequent redox imbalance due to loss of NAD.sup.+-regenerating capacity. Indeed, HPLC analysis revealed a 40% reduction in lactate production of the ldhL silenced cultures, and RT-qPCR analysis showed that the transcription levels of the ldhL gene were significantly reduced in the pThermoCas9i_ldhL cultures compared to the pThermoCas9i_ctrl cultures (FIG. 4E).

Example 16: Summary

(115) Most CRISPR-Cas applications are based on RNA-guided DNA interference by Class 2 CRISPR-Cas proteins, such as Cas9 and Cas12a (Komor et al., Cell 168, 20-36 (2017); Puchta, Curr. Opin. Plant Biol. 36, 1-8 (2017); Xu et al. J. Genet. Genomics 42, 141-149 (2015); Tang et al. Nat. Plants 3, 17018 (2017); Zetsche et al. Nat. Biotechnol. 35, 31-34 (2016); Mougiakos et al., Trends Biotechnol. 34, 575-587 (2016)). Prior to this work, no Class 2 CRISPR-Cas immune systems were identified and characterized in thermophilic microorganisms, in contrast to the highly abundant Class 1 CRISPR-Cas systems present in thermophilic bacteria and archaea (Makarova et al., Nat. Rev. Microbiol. 13, 722-736 (2015); Weinberger et al., MBio 3, e00456-12 (2012)), a few of which have been used for genome editing of thermophiles (Li et al. Nucleic Acids Res. 44, e34-e34 (2016)). As a result, the application of CRISPR-Cas technologies was mainly restricted to temperatures below 42° C., due to the mesophilic nature of the employed Cas-endonucleases. Hence, this has excluded application of these technologies in obligate thermophiles and in experimental approaches that require elevated temperatures and/or improved protein stability.

(116) The inventors have characterized ThermoCas9, a Cas9 orthologue from the thermophilic bacterium G. thermodenitrificans T12, a strain that we previously isolated from compost (Daas et al., Biotechnol. Biofuels 9, 210 (2016)). Data mining revealed additional Cas9 orthologues in the genomes of other thermophiles, which were nearly identical to ThermoCas9, for the first time showing that CRISPR-Cas type II systems do exist in thermophiles, at least in some branches of the Bacillus and Geobacillus genera. The inventors have shown that ThermoCas9 is active in vitro in a wide temperature range of 20-70° C., which is much broader than the 25-44° C. range of its mesophilic orthologue SpCas9. The extended activity and stability of ThermoCas9 allow for its application in molecular biology techniques that require DNA manipulation at temperatures of 20-70° C., as well as its exploitation in harsh environments that require robust enzymatic activity. Furthermore, the inventors have identified several factors that are important for conferring the thermostability of ThermoCas9. Firstly, the inventors have demonstrated that the PAM preferences of ThermoCas9 are very strict for activity in the lower part of the temperature range (≤30° C.), whereas more variety in the PAM is allowed for activity at the moderate to optimal temperatures (37-60° C.). Secondly, the inventors have demonstrated that ThermoCas9 activity and thermostability strongly depend on the association with an appropriate sgRNA guide. Without wishing to be bound by any particular theory, the inventors hypothesise that this stabilization of the multi-domain Cas9 protein is most likely the result of a major conformational change from an open/flexible state to a rather compact state, as described for SpCas9 upon guide binding (Jinek et al. Science 343, 1247997-1247997 (2014)).

(117) Based on the here described characterization of the novel ThermoCas9, the inventors have successfully developed genome engineering tools for strictly thermophilic prokaryotes. We showed that ThermoCas9 is active in vivo at 55° C. and 37° C. and we adapted the current Cas9-based engineering technologies for the thermophile B. smithii ET 138 and the mesophile P. putida KT2440. Due to the wide temperature range of ThermoCas9, it is anticipated that the simple, effective and single plasmid-based ThermoCas9 approach will be suitable for a wide range of thermophilic and mesophilic microorganisms that can grow at temperatures from 37° C. up to 70° C. This complements the existing mesophilic technologies, allowing their use for a large group of organisms for which these efficient tools were thus far unavailable.

(118) Screening natural resources for novel enzymes with desired traits is unquestionably valuable. Previous studies have suggested that the adaptation of a mesophilic Cas9 orthologue to higher temperatures, with directed evolution and protein engineering, would be the best approach towards the construction of a thermophilic Cas9 protein. Instead, we identified a clade of Cas9 in some thermophilic bacteria, and transformed one of these thermostable ThermoCas9 variants into a powerful genome engineering tool for both thermophilic and mesophilic organisms. With this study, we have further stretched the potential of the Cas9-based genome editing technologies and opened new possibilities for using Cas9 technologies in novel applications under harsh conditions or requiring activity over a wide temperature range.

Example 17: Materials and Methods

(119) a. Bacterial Strains and Growth Conditions

(120) The moderate thermophile B. smithii ET 138 ΔsigF ΔhsdR (Mougiakos, et al., (2017) ACS Synth. Biol. 6, 849-861) was used for the gene editing and silencing experiments using ThermoCas9. It was grown in LB2 medium (Bosma, et al. Microb. Cell Fact. 14, 99 (2015)) at 55° C. For plates, 30 g of agar (Difco) per liter of medium was used in all experiments. If needed, chloramphenicol was added at a concentration of 7 μg/mL. For protein expression, E. coli Rosetta (DE3) was grown in LB medium in flasks at 37° C. in a shaker incubator at 120 rpm until an OD.sub.600 nm of 0.5 was reached, after which the temperature was switched to 16° C. After 30 min, expression was induced by addition of isopropyl-1-thio-β-D-galactopyranoside (IPTG) to a final concentration of 0.5 mM, after which incubation was continued at 16° C. For cloning PAM constructs for the 6.sup.th, 7.sup.th and 8.sup.th positions, DH5-alpha competent E. coli (NEB) was transformed according to the manual provided by the manufacturer and grown overnight on LB agar plates at 37° C. For cloning the degenerate 7-nt long PAM library, electro-competent DH10B E. coli cells were transformed according to standard procedures (Sambrook, Fritsch & Maniatis, Molecular cloning: a laboratory manual. (Cold Spring Harbor Laboratory, 1989)) and grown on LB agar plates at 37° C. overnight. E. coli DH5α λpir (Invitrogen) was used for P. putida plasmid construction using the transformation procedure described by Ausubel et al. (Current Protocols in Molecular Biology. (John Wiley & Sons, Inc., 2001). doi:10.1002/0471142727). For all E. coli strains, if required, chloramphenicol was used at a concentration of 25 mg/L and kanamycin at 50 mg/L. Pseudomonas putida KT2440 (DSM 6125) strains were cultured at 37° C. in LB medium unless stated otherwise. If required, kanamycin was added at a concentration of 50 mg/L and 3-methylbenzoate at a concentration of 3 mM.

(121) b. ThermoCas9 Expression and Purification

(122) ThermoCas9 was PCR-amplified from the genome of G. thermodenitrificans T12, then cloned and heterologously expressed in E. coli Rosetta (DE3) and purified using FPLC by a combination of Ni.sup.2+-affinity, anion exchange and gel filtration chromatographic steps. The gene sequence was inserted into plasmid pML-1B (obtained from the UC Berkeley MacroLab, Addgene #29653) by ligation-independent cloning using oligonucleotides (Table 2) to generate a protein expression construct encoding the ThermoCas9 polypeptide sequence (residues 1-1082) fused with an N-terminal tag comprising a hexahistidine sequence and a Tobacco Etch Virus (TEV) protease cleavage site. To express the catalytically inactive ThermoCas9 protein (Thermo-dCas9), the D8A and H582A point mutations were inserted using PCR and verified by DNA sequencing.

(123) TABLE 2 Oligonucleotides used in this study.

Oligo | Sequence | Description | SEQ ID NO

PAM library construction
BG6494 | TATGCC GATTATCAAAAAGGATCTTCACNNNNNNNCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with 7-nt long random PAM sequence | 59
BG6495 | TATGCC TCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAG | RV for construction of in vitro target DNA sequences | 60
BG7356 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-T- | Adaptor when annealed with BG7357, ligates to A-tailed ThermoCas9 cleaved fragments | 61
BG7357 | CTGTCTCTTATACACATCTGACGCTGCCGACGA | Adaptor when annealed with BG7356, ligates to A-tailed ThermoCas9 cleaved fragments | 62
BG7358 | TCGTCGGCAGCGTCAG | FW sequencing adaptor for PCR amplification of the ThermoCas9 cleaved fragments | 63
BG7359 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACCATGATTACGCCAAGC | RV sequencing adaptor for PCR amplification of the ThermoCas9 cleaved fragments | 64
BG7616 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGGTCATGAGATTATCAAAAAGGATCTTC | RV sequencing adaptor for PCR amplification of the control fragments | 65
BG8157 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCCAGCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with “CCCCCCAG” PAM | 66
BG8158 | TATGCCTCATGAGATTATCAAAAAGGATCTTCACCCCCCCAACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with “CCCCCCAA” PAM | 67
BG8159 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCCATCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with “CCCCCCAT” PAM | 68
BG8160 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with “CCCCCCAC” PAM | 69
BG8161 | TATGCC GATTATCAAAAAGGATCTTCACNNNNTNNCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with “NNNNTNN” PAM | 70
BG8363 | ACGGTTATCCACAGAATCAG | FW for PCR linearization of PAM identification libraries | 71
BG8364 | CGGGATTGACTTTTAAAAAAGG | RV for PCR linearization of PAM identification libraries | 72
BG8763 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCAAACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “AA” | 73
BG8764 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCTACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “AT” | 74
BG8765 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCAGACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “AG” | 75
BG8766 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCACACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “AC” | 76
BG8767 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCTAACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “TA” | 77
BG8768 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCTTACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “TT” | 78
BG8769 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCTGACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “TG” | 79
BG8770 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCTCACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “TC” | 80
BG8771 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCGAACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “GA” | 81
BG8772 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCGTACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “GT” | 82
BG8773 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCGGACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “GG” | 83
BG8774 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCGCACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “GC” | 84
BG8775 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCCAACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “CA” | 85
BG8776 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCCTACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “CT” | 86
BG8777 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCCGACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “CG” | 87
BG8778 | TATGCC GATTATCAAAAAGGATCTTCACCCCCCCCACTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATC | FW for construction of in vitro target DNA with PAM position 6&7 “CC” | 88

sgRNA module for in vitro transcription
BG6574 | AAGCTTGAAATAATACGACTCACTATAGG | FW for PCR amplification of the sgRNA template for the first PAM identification process (30nt long spacer) | 89
BG6576 | AAAAAAGACCTTGACGTTTTCC | FW for PCR amplification of the sgRNA template for the first PAM identification process | 90
BG9307 | AAGCTTGAAATAATACGACTCACTATAGGTGAGATTATCAAAAAGGATCTTCACGTC | RV for PCR amplification of the sgRNA template for all the PAM identification processes except the first one (25nt long spacer) | 91
BG9309 | AAAACGCCTAAGAGTGGGGAATG | RV for PCR amplification of the 3-hairpins long sgRNA template for all the PAM identification processes except the first one | 92
BG9310 | AAAAGGCGATAGGCGATCC | RV for PCR amplification of the 2-hairpins long sgRNA template for all the PAM identification processes except the first one | 93
BG9311 | AAAACGGGTCAGTCTGCCTATAG | RV for PCR amplification of the 1-hairpin long sgRNA template for all the PAM identification processes except the first one | 94
BG9308 | AAGCTTGAAATAATACGACTCACTATAGGTGAGATTATCAAAAAGGATCTTCACGTC | pT7 and 25nt spacer sgRNA Fw | 95
BG10118 | AAGCTTGAAATAATACGACTCACTATAGGAGATTATCAAAAAGGATCTTCACGTCA | pT7 and 24nt spacer sgRNA Fw | 96
BG10119 | AAGCTTGAAATAATACGACTCACTATAGGAAGATTATCAAAAAGGATCTTCACGTCATAG | pT7 and 23nt spacer sgRNA Fw | 97
BG10120 | AAGCTTGAAATAATACGACTCACTATAGGATTATCAAAAAGGATCTTCACGTCATAGT | pT7 and 22nt spacer sgRNA Fw | 98
BG10121 | AAGCTTGAAATAATACGACTCACTATAGGAATTATCAAAAAGGATCTTCACGTCATAGTT | pT7 and 21nt spacer sgRNA Fw | 99
BG10122 | AAGCTTGAAATAATACGACTCACTATAGGTTATCAAAATAGTTAAGGATCTTCACGTC | pT7 and 20nt spacer sgRNA Fw | 100
BG10123 | AAGCTTGAAATAATACGACTCACTATAGGTATCAAAAAGGAGTTCATCTTCACGTCAT | pT7 and 19nt spacer sgRNA Fw | 101
BG10124 | AAGCTTGAAATAATACGACTCACTATAGGATCAAAAAGGATCTTCACGTCATAGTTC | pT7 and 18nt spacer sgRNA Fw | 102

Editing and silencing constructs
BG9312 | AAAACGCCTAAGAGTGGGGAATGCCCGAAGAAAGCGGGCGATAGGCGATCC | 3 loops sgRNA OH Rv | 103
BG8191 | AAGCTTGGCGTAATCATGGTC | For the construction of the pThermoCas9_ctrl plasmid & pThermoCas9_bsΔpyrF1/2 | 104
BG8192 | TCATGAGTTCCCATGTTGTG | For the construction of the pThermoCas9_ctrl plasmid & pThermoCas9_bsΔpyrF1/2 | 105
BG8194 | tatggcgaatcacaacatgggaactcatgaGAACATCCTCTTTCTTAG | For the construction of the pThermoCas9_ctrl plasmid & pThermoCas9_bsΔpyrF1/2 | 106
BG8195 | gccgatatcaagaccgattttatacttcatTTAAGTTACCTCCTCGATTG | For the construction of the pThermoCas9_ctrl plasmid & pThermoCas9_bsΔpyrF1/2 | 107
BG8196 | ATGAAGTATAAAATCGGTCTTG | For the construction of the pThermoCas9_ctrl plasmid & pThermoCas9_bsΔpyrF1/2 | 108
BG8197 | TAACGGACGGATAGTTTC | For the construction of the pThermoCas9_ctrl plasmid & pThermoCas9_bsΔpyrF1/2 | 109
BG8198 | gaaagccggggaaactatccgtccgttataAATCAGACAAAATGGCCTGCTTATG | For the construction of the pThermoCas9_ctrl plasmid & pThermoCas9_bsΔpyrF1/2 | 110
BG8263 | gaactatgacactttattttcagaatggacGTATAACGGTATCCATTTTAAGAATAATCC | For the construction of the pThermoCas9_ctrl plasmid | 111
BG8268 | accgttatacgtccattctgaaaataaagtGTCATAGTTCCCCTGAGAT | For the construction of the pThermoCas9_ctrl plasmid | 112
BG8210 | aacagctatgaccatgattacgccaagcttCCCTCCCATGCACAATAG | For the construction of the pThermoCas9_ctrl plasmid & pThermoCas9_bsΔpyrF1/2 | 113
BG8261 | gaactatgacatcatggagttttaaatccaGTATAACGGTATCCATTTTAAGAATAATCC | For the construction of the pThermoCas9_bsΔpyrF1 | 114
BG8266 | accgttatactggatttaaaactccatgatGTCATAGTTCCCCTGAGAT | For the construction of the pThermoCas9_bsΔpyrF2 | 115
BG8317 | gaactatgaccacccagcttacatcaacaaGTATAACGGTATCCATTTTAAGAATAATCC | For the construction of the pThermoCas9_bsΔpyrF2 | 116
BG8320 | accgttatacttgttgatgtaagctgggtgGTCATAGTTCCCCTGAGAT | For the construction of the pThermoCas9_bsΔpyrF2 | 117
BG9075 | CTATCGGCATTACGTCTATC | For the construction of the pThermoCas9i_ctrl | 118
BG9076 | GCGTCGACTTCTGTATAGC | For the construction of the pThermoCas9i_ctrl | 119
BG9091 | TGAAGTATAAAATCGGTCTTGCTATCGGCATTACGTCTATC | For the construction of the pThermoCas9i_ctrl | 120
BG9092 | CAAGCTTCGGCTGTATGGAATCACAGCGTCGACTTCTGTATAGC | For the construction of the pThermoCas9i_ctrl | 121
BG9077 | GCTGTGATTCCATACAG | For the construction of the pThermoCas9i_ctrl | 122
BG9267 | GGTGCAGTAGGTTGCAGCTATGCTTGTATAACGGTATCCAT | For the construction of the pThermoCas9i_ctrl | 123
BG9263 | AAGCATAGCTGCAACCTACTGCACCGTCATAGTTCCCCTGAGATTATCG | For the construction of the pThermoCas9i_ctrl | 124
BG9088 | TCATGACCAAAATCCCTTAACG | For the construction of the pThermoCas9i_ctrl | 125
BG9089 | TTAAGGGATTTTGGTCATGAGAACATCCTCTTTCTTAG | For the construction of the pThermoCas9i_ctrl | 126
BG9090 | GCAAGACCGATTTTATACTTCATTTAAG | For the construction of the pThermoCas9i_ctrl | 127
BG9548 | GGATCCCATGACGCTAGTATCCAGCTGGGTCATAGTTCCCCTGAGATTATCG | For the construction of the pThermoCas9i_ldhL | 128
BG9601 | TTCAATATTTTTTTTGAATAAAAAATACGATACAATAAAAATGTCTAGAAAAAGATAAAAATG | For the construction of the pThermoCas9i_ldhL | 129
BG9600 | TTTTTTATTCAAAAAAAATATTGAATTTTAAAAATGATGGTGCTAGTATGAAG | For the construction of the pThermoCas9i_ldhL | 130
BG9549 | CCAGCTGGATACTAGCGTCATGGGATCCGTATAACGGTATCCATTTTAAGAATAATCC | For the construction of the pThermoCas9i_ldhL | 131
BG8552 | TCGGGGGTTCGTTTCCCTTG | FW to check genomic pyrF deletion KO check | 132
BG8553 | CTTACACAGCCAGTGACGGAAC | RV to check genomic pyrF deletion KO check | 133
BG2365 | GCCGGCGTCCCGGAAAACGA | For the construction of the pThermoCas9_ppΔpyrF | 134
BG2366 | GCAGGTCGGGTTCCTCGCATCCATGCCCCCGAACT | For the construction of the pThermoCas9_ppΔpyrF | 135
BG2367 | ggcttcggaatcgttttccgggacgccggcACGGCATTGGCAAGGCCAAG | For the construction of the pThermoCas9_ppΔpyrF | 136
BG2368 | gacacaggcatcggtGCAGGGTCTCTTGGCAAGTC | For the construction of the pThermoCas9_ppΔpyrF | 137
BG2369 | gccaagagaccctgCACCGATGCCTGTGTCGAACC | For the construction of the pThermoCas9_ppΔpyrF | 138
BG2370 | cttggcggaaaacgtcaaggtcttttttacACGCGCATCAACTTCAAGGC | For the construction of the pThermoCas9_ppΔpyrF | 139
BG2371 | atgacgagctgttcaccagcagcgcTATTATTGAAGCATTTATCAGGG | For the construction of the pThermoCas9_ppΔpyrF | 140
BG2372 | GTAAAAAAGACCTTGACGTTTTC | For the construction of the pThermoCas9_ppΔpyrF | 141
BG2373 | tatgaagcgggccatTTGAAGACGAAAGGGCCTC | For the construction of the pThermoCas9_ppΔpyrF | 142
BG2374 | taatagcgctgctggtgaacagctcGTCATAGTTCCCCTGAGATTATCG | For the construction of the pThermoCas9_ppΔpyrF | 143
BG2375 | tggagtcatgaacatATGAAGTATAAAATCGGTCTTG | For the construction of the pThermoCas9_ppΔpyrF | 144
BG2376 | ccctttcgtcttcAAATGGCCCGCTTCATAAGCAG | For the construction of the pThermoCas9_ppΔpyrF | 145
BG2377 | gattttatacTTCATATGTTCATGACTCCATTATTATTG | For the construction of the pThermoCas9_ppΔpyrF | 146
BG2378 | gggggcatggatgCGAGGAACCCGACCTGCATTCG | For the construction of the pThermoCas9_ppΔpyrF | 147
BG2381 | ACACGGCGGATGCACTTACC | FW for confirmation of plasmid integration and pyrF deletion in P. putida | 148
BG2382 | TGGACGTGTACTTCGACAAC | RV for confirmation of pyrF deletion in P. putida | 149
BG2135 | ACACGGCGGATGCACTTACC | RV for confirmation of plasmid integration in P. putida | 150

Sequencing primers
BG8196 | TGGACGTGTACTTCGACAAC | thermocas9 seq. 1 | 151
BG8197 | TAACGGACGGATAGTTTC | thermocas9 seq. 2 | 152
BG6850 | GCCTCATGAATGCAGCGATGGTCCGGTGTTC | pyrF US | 153
BG6849 | GCCTCATGAGTTCCCATGTTGTGATTC | pyrF DS | 154
BG6769 | CAATCCAACTGGGCTTGAC | thermocas9 seq. 3 | 155
BG6841 | CAAGAACTTTATTGGTATAG | thermocas9 seq. 4 | 156
BG6840 | TTGCAGAAATGGTTGTCAAG | thermocas9 seq. 5 | 157
BG9215 | GAGATAATGCCGACTGTAC | pNW33n backbone seq. 1 | 158
BG9216 | AGGGCTCGCCTTTGGGAAG | pNW33n backbone seq. 2 | 159
BG9505 | GTTGCCAACGTTCTGAG | thermocas9 seq. 6 | 160
BG9506 | AATCCACGCCGTTTAG | thermocas9 seq. 7 | 161

Cleavage assays
BG8363 | ACGGTTATCCACAGAATCAG | FW for PCR linearization of DNA target | 162
BG8364 | CGGGATTGACTTTTAAAAAAGG | RV for PCR linearization of DNA target | 163
BG9302 | AAACTTCATTTTTAATTTAAAAGGATCTAGAACCCCCCGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAA | Non-template strand oligonucleotide for ssDNA cleavage assays | 164
BG9303 | TTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCCCCCCAACTAGATCCTTTTAAATTAAAAATGAAGTTT | Template strand oligonucleotide for ssDNA cleavage assays | 165
BG9304 | TTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACGGGGGGTTCTAGATCCTTTTAAATTAAAAATGAAGTTT | Template strand oligonucleotide for ssDNA cleavage assays | 166

ThermoCas9 expression and RT-qPCR
BG7886 | TACTTCCAATCCAATGCAAAGTATAAAATCGGTCTTGATATCG | FW LIC_thermocas9 | 167
BG7887 | TTATCCACTTCCAATGTTATTATAACGGACGGATAGTTTCCCCGGCTTTC | RV LIC_thermocas9 | 168
BG9665 | ATGACGAAAGGAGTTTCTTATTATG | RV qPCR check ldhL | 169
BG9666 | AACGGTATTCCGTGATTAAG | FW qPCR check ldhL | 170

Restriction sites are shown in italics. Spacer regions are shown in bold. Nucleotides in lowercase letters correspond to primer overhangs for HiFi DNA Assembly. LIC: Ligase Independent cloning; FW: Forward primer; RV: Reverse primer.

(124) The proteins were expressed in the E. coli Rosetta 2 (DE3) strain. Cultures were grown to an OD.sub.600 nm of 0.5-0.6. Expression was induced by the addition of IPTG to a final concentration of 0.5 mM and incubation was continued at 16° C. overnight. Cells were harvested by centrifugation and the cell pellet was resuspended in 20 mL of Lysis Buffer (50 mM sodium phosphate pH 8, 500 mM NaCl, 1 mM DTT, 10 mM imidazole) supplemented with protease inhibitors (Roche cOmplete, EDTA-free) and lysozyme. Once homogenized, cells were lysed by sonication (Sonoplus, Bandelin) using an ultrasonic MS72 microtip probe (Bandelin) for 5-8 minutes, with cycles of 2 s pulse and 2.5 s pause at 30% amplitude, and then centrifuged at 16,000×g for 1 hour at 4° C. to remove insoluble material. The clarified lysate was filtered through 0.22 micron filters (Mdi membrane technologies) and applied to a nickel column (Histrap HP, GE Lifesciences), washed and then eluted with 250 mM imidazole. Fractions containing ThermoCas9 were pooled and dialyzed overnight into the dialysis buffer (250 mM KCl, 20 mM HEPES/KOH, and 1 mM DTT, pH 7.5). After dialysis, the sample was diluted 1:1 in 10 mM HEPES/KOH pH 8, and loaded on a heparin FF column pre-equilibrated in IEX-A buffer (150 mM KCl, 20 mM HEPES/KOH pH 8). The column was washed with IEX-A and then eluted with a gradient of IEX-C (2 M KCl, 20 mM HEPES/KOH pH 8). The sample was concentrated to 700 μL prior to loading on a gel filtration column (HiLoad 16/600 Superdex 200) via FPLC (AKTA Pure). Fractions from gel filtration were analysed by SDS-PAGE; fractions containing ThermoCas9 were pooled and concentrated to 200 μL (50 mM sodium phosphate pH 8, 2 mM DTT, 5% glycerol, 500 mM NaCl) and either used directly for biochemical assays or frozen at −80° C. for storage.
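The dilution step before the heparin column can be checked with a short calculation. The sketch below assumes that "diluted 1:1" means equal volumes of dialyzed sample and diluent; the function name `mix` and the dictionary layout are illustrative only. Under that assumption the KCl concentration of the loaded sample falls to 125 mM, below the 150 mM of the IEX-A buffer.

```python
def mix(solution_a, solution_b, vol_a=1.0, vol_b=1.0):
    """Return final concentrations (mM) after mixing two solutions.

    Each solution is a dict mapping component name to concentration in mM.
    Volumes are relative; equal volumes are assumed for a 1:1 dilution.
    """
    total = vol_a + vol_b
    names = set(solution_a) | set(solution_b)
    return {
        name: (solution_a.get(name, 0.0) * vol_a
               + solution_b.get(name, 0.0) * vol_b) / total
        for name in names
    }


# Dialysis buffer and diluent as stated in the purification paragraph.
dialysis_buffer = {"KCl": 250.0, "HEPES/KOH": 20.0, "DTT": 1.0}
diluent = {"HEPES/KOH": 10.0}

loaded_sample = mix(dialysis_buffer, diluent)
for name in sorted(loaded_sample):
    print(f"{name}: {loaded_sample[name]:.1f} mM")
```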

(125) c. In Vitro Synthesis of sgRNA

(126) The sgRNA module was designed by fusing the predicted crRNA and tracrRNA sequences with a 5′-GAAA-3′ linker. The sgRNA-expressing DNA sequence was put under the transcriptional control of the T7 promoter. It was synthesized (Baseclear, Leiden, The Netherlands) and provided in the pUC57 backbone. All sgRNAs used in the biochemical reactions were synthesized using the HiScribe™ T7 High Yield RNA Synthesis Kit (NEB). PCR fragments coding for sgRNAs, with the T7 sequence on the 5′ end, were used as templates for the in vitro transcription reaction. T7 transcription was performed for 4 hours. The sgRNAs were run on and excised from urea-PAGE gels and purified using ethanol precipitation.
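The layout of such a transcription template can be sketched as a simple concatenation: T7 promoter, spacer, crRNA repeat, the GAAA linker, then the tracrRNA part. This is a minimal sketch and not the construct actually ordered. The repeat and tracrRNA strings below are hypothetical placeholders, the spacer is a toy sequence, and the T7 promoter string is the commonly used minimal promoter, which is an assumption here because the text does not give the sequence.

```python
T7_PROMOTER = "TAATACGACTCACTATAG"  # assumption: common minimal T7 promoter
LINKER = "GAAA"                     # linker stated in the text


def _check_dna(name, seq):
    """Upper-case a sequence and reject anything outside A, C, G, T."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError(f"{name} must be a non-empty A/C/G/T string")
    return seq


def build_sgrna_template(spacer, repeat, tracr):
    """Return a DNA template: T7 promoter + spacer + repeat + GAAA + tracr."""
    spacer = _check_dna("spacer", spacer)
    repeat = _check_dna("repeat", repeat)
    tracr = _check_dna("tracr", tracr)
    return T7_PROMOTER + spacer + repeat + LINKER + tracr


# Hypothetical placeholder sequences, for illustration only.
template = build_sgrna_template(
    spacer="ACGTACGTACGTACGTACGT",
    repeat="GGGGCCCC",
    tracr="TTTTAAAA",
)
print(template)
```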

(127) d. In Vitro Cleavage Assay

(128) In vitro cleavage assays were performed with purified recombinant ThermoCas9. ThermoCas9 protein, the in vitro transcribed sgRNA and the DNA substrates (generated by PCR amplification using primers described in Table 2) were incubated separately (unless otherwise indicated) at the stated temperature for 10 min, followed by combining the components together and incubating them at the various assay temperatures in a cleavage buffer (100 mM sodium phosphate buffer (pH=7), 500 mM NaCl, 25 mM MgCl2, 25 (V/V %) glycerol, 5 mM dithiothreitol (DTT)) for 1 hour. Each cleavage reaction contained 160 nM of ThermoCas9 protein, 4 nM of substrate DNA, and 150 nM of synthesized sgRNA. Reactions were stopped by adding 6× loading dye (NEB) and run on 1.5% agarose gels. Gels were stained with SYBR safe DNA stain (Life Technologies) and imaged with a Gel Doc™ EZ gel imaging system (Bio-rad).
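The stated concentrations imply a large molar excess of protein and sgRNA over substrate DNA, which the following sketch makes explicit. The 20 μl volume used for the absolute amounts is an assumption borrowed from the PAM screening reactions; the cleavage assay paragraph itself does not state a volume.

```python
REACTION_NM = {"ThermoCas9": 160.0, "sgRNA": 150.0, "substrate_DNA": 4.0}


def molar_excess(conc_nm, reference="substrate_DNA"):
    """Return each component's concentration relative to the reference."""
    ref = conc_nm[reference]
    return {name: value / ref for name, value in conc_nm.items()}


def femtomoles(conc_nm, volume_ul):
    """Convert nM to fmol: 1 nM x 1 uL = 1e-9 mol/L x 1e-6 L = 1e-15 mol."""
    return {name: value * volume_ul for name, value in conc_nm.items()}


excess = molar_excess(REACTION_NM)
amounts = femtomoles(REACTION_NM, volume_ul=20.0)  # assumed reaction volume

for name in REACTION_NM:
    print(f"{name}: {excess[name]:.1f}x substrate, {amounts[name]:.0f} fmol")
```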

(129) e. Library Construction for In Vitro PAM Screen

(130) For the construction of the PAM library, a 122-bp long DNA fragment, containing the protospacer and a 7-bp long degenerate sequence at its 3′-end, was constructed by primer annealing and Klenow fragment (exo-) (NEB) based extension. The PAM-library fragment and the pNW33n vector were digested by BspHI and BamHI (NEB) and then ligated (T4 ligase, NEB). The ligation mixture was transformed into electro-competent E. coli DH10B cells and plasmids were isolated from liquid cultures. For the 7 nt-long PAM determination process, the plasmid library was linearized by SapI (NEB) and used as the target. For the rest of the assays the DNA substrates were linearized by PCR amplification.
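A 7-bp fully degenerate sequence gives 4^7 = 16,384 possible PAM variants. The sketch below enumerates them and counts how many fit a pattern containing N wildcards, using the 5′-NNNNCNN-3′ motif from the claims as the example. The function names are illustrative, and only the wildcard N is supported, not the full IUPAC code.

```python
from itertools import product


def degenerate_pams(length=7, alphabet="ACGT"):
    """Enumerate every sequence of the given length over the alphabet."""
    return ["".join(bases) for bases in product(alphabet, repeat=length)]


def matches(pam, pattern):
    """Return True if pam fits pattern, where 'N' matches any base."""
    return len(pam) == len(pattern) and all(
        p == "N" or p == b for b, p in zip(pam, pattern)
    )


library = degenerate_pams()
c_at_position_5 = [pam for pam in library if matches(pam, "NNNNCNN")]

print(len(library))          # total number of variants in the library
print(len(c_at_position_5))  # variants with C at the 5th position
```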

(131) f. PAM Screening Assay

(132) The PAM screening of ThermoCas9 was performed using in vitro cleavage assays, which consisted of (per reaction): 160 nM of ThermoCas9, 150 nM in vitro transcribed sgRNA, 4 nM of DNA target, 4 μl of cleavage buffer (100 mM sodium phosphate buffer pH 7.5, 500 mM NaCl, 5 mM DTT, 25% glycerol) and MQ water up to 20 μl final reaction volume. The PAM-containing cleavage fragments from the 55° C. reactions were gel purified, ligated with Illumina sequencing adaptors and sent for Illumina HiSeq 2500 sequencing (Baseclear). An equimolar amount of the PAM library not treated with ThermoCas9 was subjected to the same process and sent for Illumina HiSeq 2500 sequencing as a reference. HiSeq reads with a perfect sequence match to the reference sequence were selected for further analysis. From the selected reads, those present more than 1000 times in the ThermoCas9 treated library and at least 10 times more often in the ThermoCas9 treated library than in the control library were employed for WebLogo analysis (Crooks et al., Genome Res. 14, 1188-1190 (2004)).
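The two read-selection criteria can be expressed as a small filter. This is a sketch under stated assumptions rather than the pipeline actually used: raw counts are compared without normalizing for sequencing depth (the text does not say whether normalization was applied), a PAM absent from the control counts as zero, and the example counts are invented for illustration.

```python
def select_pams(treated, control, min_count=1000, min_fold=10.0):
    """Return PAMs passing both criteria from the screening paragraph.

    treated, control: dicts mapping PAM sequence to read count.
    A PAM is kept if it occurs more than min_count times in the treated
    library and at least min_fold times as often as in the control.
    """
    selected = []
    for pam, n_treated in treated.items():
        if n_treated <= min_count:
            continue
        n_control = control.get(pam, 0)
        if n_treated >= min_fold * n_control:
            selected.append(pam)
    return sorted(selected)


# Invented toy counts, for illustration only.
treated_counts = {"CCCCCAA": 5000, "AAAACAA": 1500, "GGGGGGG": 900, "TTTTCTT": 2000}
control_counts = {"CCCCCAA": 100, "AAAACAA": 400, "GGGGGGG": 10, "TTTTCTT": 200}

selected = select_pams(treated_counts, control_counts)
print(selected)
```

The selected sequences would then be the input to a sequence logo tool.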

(133) g. Editing and Silencing Constructs for B. smithii and P. putida

(134) All the primers and plasmids used for plasmid construction were designed with appropriate overhangs for performing NEBuilder HiFi DNA assembly (NEB), and they are listed in Tables 2 and 3, respectively. The fragments for assembling the plasmids were obtained through PCR with Q5 Polymerase (NEB) or Phusion Flash High-Fidelity PCR Master Mix (ThermoFisher Scientific); the PCR products were subjected to 1% agarose gel electrophoresis and purified using the Zymogen gel DNA recovery kit (Zymo Research). The assembled plasmids were transformed into chemically competent E. coli DH5α cells (NEB), or into E. coli DH5α λpir (Invitrogen) in the case of P. putida constructs, the latter to facilitate direct vector integration. Single colonies were inoculated in LB medium, plasmid material was isolated using the GeneJet plasmid miniprep kit (ThermoFisher Scientific) and sequence verified (GATC-biotech), and 1 μg of each construct was transformed into B. smithii ET 138 electro-competent cells, which were prepared according to a previously described protocol (Bosma, et al. Microb. Cell Fact. 14, 99 (2015)). The MasterPure™ Gram Positive DNA Purification Kit (Epicentre) was used for genomic DNA isolation from B. smithii and P. putida liquid cultures.

(135) For the construction of the pThermoCas9_ctrl, pThermoCas9_bsΔpyrF1 and pThermoCas9_bsΔpyrF2 vectors, the pNW33n backbone together with the ΔpyrF homologous recombination flanks were PCR amplified from the pWUR_Cas9sp1_hr vector (Mougiakos, et al. ACS Synth. Biol. 6, 849-861 (2017)) (BG8191 and BG8192). The native P.sub.xylA promoter was PCR amplified from the genome of B. smithii ET 138 (BG8194 and BG8195). The thermocas9 gene was PCR amplified from the genome of G. thermodenitrificans T12 (BG8196 and BG8197). The P.sub.pta promoter was PCR amplified from the pWUR_Cas9sp1_hr vector (Mougiakos, et al. ACS Synth. Biol. 6, 849-861 (2017)) (BG8198 and BG8261_2/BG8263_nc2/BG8317_3). The spacers followed by the sgRNA scaffold were PCR amplified from the pUC57_T7t12sgRNA vector (BG8266_2/BG8268_nc2/8320_3 and BG8210).

(136) A four-fragment assembly was designed and executed for the construction of the pThermoCas9i_ldhL vectors. Initially, targeted point mutations were introduced to the codons of the thermocas9 catalytic residues (mutations D8A and H582A), through a two-step PCR approach using pThermoCas9_ctrl as template. During the first PCR step (BG9075, BG9076), the desired mutations were introduced at the ends of the produced PCR fragment, and during the second step (BG9091, BG9092) the produced fragment was employed as PCR template for the introduction of appropriate assembly overhangs. The part of thermocas9 downstream of the second mutation, along with the ldhL silencing spacer, was PCR amplified using pThermoCas9_ctrl as template (BG9077 and BG9267). The sgRNA scaffold together with the pNW33n backbone was PCR amplified using pThermoCas9_ctrl as template (BG9263 and BG9088). The promoter together with the part of thermocas9 upstream of the first mutation was PCR amplified using pThermoCas9_ctrl as template (BG9089, BG9090).

(137) A two-fragment assembly was designed and executed for the construction of the pThermoCas9i_ctrl vector. The spacer sequence in the pThermoCas9i_ldhL vector was replaced with a random sequence containing BaeI restriction sites at both ends. The sgRNA scaffold together with the pNW33n backbone was PCR amplified using pThermoCas9_ctrl as template (BG9548, BG9601). The other half of the construct, consisting of Thermo-dCas9 and the promoter, was amplified using pThermoCas9i_ldhL as template (BG9600, BG9549).

(138) A five-fragment assembly was designed and executed for the construction of the P. putida KT2440 vector pThermoCas9_ppΔpyrF. The replicon from the suicide vector pEMG was PCR amplified (BG2365, BG2366). The flanking regions of pyrF were amplified from KT2440 genomic DNA (BG2367, BG2368 for the 576-bp upstream flank, and BG2369, BG2370 for the 540-bp downstream flank). The flanks were fused in an overlap extension PCR using primers BG2367 and BG2370, making use of the overlaps of primers BG2368 and BG2369. The sgRNA was amplified from the pThermoCas9_ctrl plasmid (BG2371, BG2372). The constitutive P3 promoter was amplified from pSW_I-SceI (BG2373, BG2374). This promoter fragment was fused to the sgRNA fragment in an overlap extension PCR using primers BG2372 and BG2373, making use of the overlaps of primers BG2371 and BG2374. ThermoCas9 was amplified from the pThermoCas9_ctrl plasmid (BG2375, BG2376). The inducible Pm-XylS system, to be used for 3-methylbenzoate induction of ThermoCas9, was amplified from pSW_I-SceI (BG2377, BG2378).

(139) TABLE-US-00006 TABLE 3 Plasmids used in this study
Plasmid | Description | Restriction sites used | Primers | Source
pNW33n | E. coli-Bacillus shuttle vector, cloning vector, Cam.sup.R | - | - | BGSC
pUC57_T7sgRNAfull | pUC57 vector containing DNA encoding the sgRNA under the control of T7 promoter; serves as a template for in vitro transcription of full length Repeat/Antirepeat sgRNAs | - | - | Baseclear
pMA2_T7sgRNAtruncated R/AR | Vector containing DNA encoding the truncated Repeat/Antirepeat part of the sgRNA under the control of T7 promoter; serves as a template for in vitro transcription of truncated Repeat/Antirepeat sgRNAs | - | - | Gen9
pRARE | T7 RNA polymerase based expression vector, Kan.sup.R | - | - | EMD Millipore
pML-1B | E. coli Rosetta™ (DE3) plasmid, encodes rare tRNAs, Cam.sup.R | - | - | Macrolab, Addgene
pEMG | P. putida suicide vector, used as template for replicon and Kan.sup.R | - | See Table 2 | .sup.1
pSW_I-SceI | P. putida vector containing I-SceI, used as template for xylS and P.sub.Pm | - | See Table 2 | .sup.1
pWUR_Cas9sp1_hr | pNW33n with spCas9-module containing spacer targeting the pyrF gene. This plasmid was used as a template for constructing the ThermoCas9 based constructs | - | - | .sup.2
pThermo_Cas9 | thermocas9 with N-term. His-tag and TEV cleavage site in pML-1B. Expression vector for ThermoCas9 | SspI and Ligase Independent Cloning | BG7886 and BG7887 | This study
pThermo_dCas9 | dthermocas9 with N-term. His-tag and TEV cleavage site in pML-1B. Expression vector for catalytically inactive (dead) dThermoCas9 | SspI and Ligase Independent Cloning | BG7886 and BG7888 | This study
pNW-PAM7nt | Target sequence in pNW33n vector containing a 7-nt degenerate PAM for in vitro PAM determination assay | BamHI and BspHI | See Table 2 | This study
pNW63-pNW78 | Target sequence in pNW33n vector containing distinct nucleotides at the 6th and 7th positions of the PAM (CCCCCNNA) | BamHI and BspHI | See Table 2 | This study
pThermoCas9_ctrl | pNW33n with ThermoCas9-module.sup.1 containing a non-targeting spacer. Used as a negative control | - | See Table 2 | This study
pThermoCas9_bsΔpyrF1 | pNW33n with ThermoCas9-module.sup.1 containing spacer 1 targeting the pyrF gene and the fused us + ds pyrF-flanks | - | See Table 2 | This study
pThermoCas9_bsΔpyrF2 | pNW33n with ThermoCas9-module.sup.1 containing spacer 2 targeting the pyrF gene and the fused us + ds pyrF-flanks | - | See Table 2 | This study
pThermoCas9i_ctrl | pNW33n with Thermo-dCas9-module.sup.2 containing a non-targeting spacer. Used as a wild-type control | - | See Table 2 | This study
pThermoCas9i_ldhL | pNW33n with Thermo-dCas9-module.sup.2 containing spacer 2 targeting the ldhL gene | - | See Table 2 | This study
pThermoCas9_ppΔpyrF | pEMG with ThermoCas9-module.sup.3 for Pseudomonas putida containing a spacer targeting the pyrF gene and the fused us + ds pyrF-flanks | - | See Table 2 | This study
.sup.1The ThermoCas9 module contains thermocas9 under the native P.sub.xylL promoter followed by the sgRNA under the B. coagulans P.sub.pta promoter (FIG. 4).
.sup.2Like the ThermoCas9 module, but with the thermo-dCas9 instead of thermocas9 (FIG. 4).
.sup.3The ThermoCas9 module for Pseudomonas putida contains thermocas9 under the transcriptional control of the inducible Pm-XylS system followed by the sgRNA under the constitutive P3 promoter.

(140) h. Editing Protocol for P. putida

(141) Transformation of the plasmid to P. putida was performed according to Choi et al. (Choi et al., J. Microbiol. Methods 64, 391-397 (2006)). After transformation and selection of integrants, overnight cultures were inoculated. 10 μl of overnight culture was used for inoculation of 3 ml fresh selective medium and after 2 hours of growth at 37° C. ThermoCas9 was induced with 3-methylbenzoate. After an additional 6h, dilutions of the culture were plated on non-selective medium supplemented with 3-methylbenzoate. For the control culture the addition of 3-methylbenzoate was omitted in all the steps. Confirmation of plasmid integration in the P. putida chromosome was done by colony PCR with primers BG2381 and BG2135. Confirmation of pyrF deletion was done by colony PCR with primers BG2381 and BG2382.

(142) i. RNA Isolation

(143) RNA isolation was performed by phenol extraction, based on a previously described protocol (van Hijum et al. BMC Genomics 6, 77 (2005)). Overnight 10 mL cultures were centrifuged at 4° C. and 4816×g for 15 min and immediately used for RNA isolation. After removal of the medium, cells were suspended in 0.5 mL of ice-cold TE buffer (pH 8.0) and kept on ice. All samples were divided into two 2 mL screw-capped tubes containing 0.5 g of zirconium beads, 30 μL of 10% SDS, 30 μL of 3 M sodium acetate (pH 5.2), and 500 μL of Roti-Phenol (pH 4.5-5.0, Carl Roth GmbH). Cells were disrupted using a FastPrep-24 apparatus (MP Biomedicals) at 5500 rpm for 45 s and centrifuged at 4° C. and 10 000 rpm for 5 min. 400 μL of the water phase from each tube was transferred to a new tube, to which 400 μL of chloroform-isoamyl alcohol (Carl Roth GmbH) was added, after which samples were centrifuged at 4° C. and 18 400×g for 3 min. 300 μL of the aqueous phase was transferred to a new tube and mixed with 300 μL of the lysis buffer from the high pure RNA isolation kit (Roche). Subsequently, the rest of the procedure from this kit was performed according to the manufacturer's protocol, except for the DNase incubation step, which was performed for 45 min. Integrity and concentration of the isolated RNA were checked on a NanoDrop 1000.

(144) j. Quantification of mRNA by RT-qPCR

(145) First-strand cDNA synthesis was performed for the isolated RNA using SuperScript™ III Reverse Transcriptase (Invitrogen) according to the manufacturer's protocol. qPCR was performed using the PerfeCTa SYBR Green Supermix for iQ from Quanta Biosciences. 40 ng of each cDNA library was used as the template for qPCR. Two sets of primers were used: BG9665:BG9666, amplifying a 150-nt long region of the ldhL gene, and BG9889:BG9890, amplifying a 150-nt long sequence of the rpoD (RNA polymerase sigma factor) gene, which was used as the control for the qPCR. The qPCR was run on a Bio-Rad C1000 Thermal Cycler.
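The text does not state how the qPCR data were analysed. One widely used option for a target gene measured against a reference gene is the comparative Ct (2^-ΔΔCt) calculation, sketched below purely as an illustration. It assumes close to 100% amplification efficiency for both primer pairs, and the Ct values are invented.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Comparative Ct estimate of target expression, sample vs control.

    Assumes each PCR cycle doubles the product for both primer pairs.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Invented Ct values: target (e.g. ldhL) and reference (e.g. rpoD)
# in a silenced culture versus a control culture.
fold = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,
    ct_target_control=22.0, ct_ref_control=18.0,
)
print(fold)
```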

(146) k. HPLC

(147) A high-pressure liquid chromatography (HPLC) system ICS-5000 was used for lactate quantification. The system was operated with an Aminex HPX 87H column from Bio-Rad Laboratories and equipped with a UV1000 detector operating at 210 nm and an RI-150 refractive index detector operating at 40° C. The mobile phase consisted of 0.16 N H.sub.2SO.sub.4 and the column was operated at 0.8 mL/min. All samples were diluted 4:1 with 10 mM DMSO in 0.01 N H.sub.2SO.sub.4.

(148) The following section of the description consists of numbered paragraphs simply providing statements of the invention already described herein. The numbered paragraphs in this section are not claims. The claims are set forth below in the later section headed “claims”.

(149) 1. An isolated clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) protein or polypeptide comprising;

(150) a. the amino acid motif EKDGKYYC [SEQ ID NO: 2]; and/or

(151) b. the amino acid motif X.sub.1X.sub.2CTX.sub.3X.sub.4 [SEQ ID NO: 3] wherein X.sub.1 is independently selected from Isoleucine, Methionine or Proline, X.sub.2 is independently selected from Valine, Serine, Asparagine or Isoleucine, X.sub.3 is independently selected from Glutamate or Lysine and X.sub.4 is one of Alanine, Glutamate or Arginine; and/or

(152) c. the amino acid motif X.sub.5LKX.sub.6IE [SEQ ID NO: 4] wherein X.sub.5 is independently selected from Methionine or Phenylalanine and X.sub.6 is independently selected from Histidine or Asparagine; and/or

(153) d. the amino acid motif X.sub.7VYSX.sub.8K [SEQ ID NO: 5] wherein X.sub.7 is Glutamate or Isoleucine and X.sub.8 is one of Tryptophan, Serine or Lysine; and/or

(154) e. the amino acid motif X.sub.9FYX.sub.10X.sub.11REQX.sub.12KEX.sub.13 [SEQ ID NO: 6] wherein X.sub.9 is Alanine or Glutamate, X.sub.10 is Glutamine or Lysine, X.sub.11 is Arginine or Alanine, X.sub.12 is Asparagine or Alanine and X.sub.13 is Lysine or Serine;

(155) wherein the Cas protein is capable of nucleic acid cleavage between 50° C. and 100° C. when associated with at least one targeting RNA molecule, and a polynucleotide comprising a target nucleic acid sequence recognised by the targeting RNA molecule.
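The five motifs in numbered paragraph 1 translate directly into patterns over one-letter amino acid codes, which gives a quick way to screen a candidate sequence. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, and a motif hit says nothing about the thermostability or activity requirement in the "wherein" clause.

```python
import re

# Variable positions written as character classes of one-letter codes.
MOTIFS = {
    "SEQ ID NO: 2": "EKDGKYYC",
    "SEQ ID NO: 3": "[IMP][VSNI]CT[EK][AER]",       # X1 X2 C T X3 X4
    "SEQ ID NO: 4": "[MF]LK[HN]IE",                 # X5 L K X6 I E
    "SEQ ID NO: 5": "[EI]VYS[WSK]K",                # X7 V Y S X8 K
    "SEQ ID NO: 6": "[AE]FY[QK][RA]REQ[NA]KE[KS]",  # X9 F Y X10 X11 R E Q X12 K E X13
}


def find_motifs(protein):
    """Return 0-based start positions of each motif in the sequence."""
    protein = protein.upper()
    return {
        name: [m.start() for m in re.finditer(pattern, protein)]
        for name, pattern in MOTIFS.items()
    }


def has_any_motif(protein):
    """The motifs are joined by 'and/or', so a single hit is sufficient."""
    return any(find_motifs(protein).values())


# Hypothetical toy sequence, not taken from SEQ ID NO: 1.
toy = "MKEKDGKYYCAAIVCTEAGGMLKHIE"
hits = find_motifs(toy)
print(hits)
```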

(156) 2. An isolated Cas protein or polypeptide fragment having an amino acid sequence of SEQ ID NO: 1 or a sequence of at least 77% identity therewith, wherein the Cas protein is capable of binding, cleaving, modifying or marking a polynucleotide comprising a target nucleic acid sequence at a temperature between 50° C. and 100° C. when associated with at least one RNA molecule which recognizes the target sequence.

(157) 3. A Cas protein or polypeptide fragment as in numbered paragraph 1 or 2, wherein the Cas protein or fragment is capable of nucleic acid binding, cleavage, marking or modification at a temperature between 50° C. and 75° C., preferably at a temperature above 60° C.; more preferably at a temperature between 60° C. and 80° C.; more preferably at a temperature between 60° C. and 65° C.

(158) 4. A Cas protein or polypeptide fragment as in any of numbered paragraphs 1 to 3, wherein the nucleic acid binding, cleavage, marking or modification is DNA cleavage.

(159) 5. A Cas protein or polypeptide fragment as in any preceding numbered paragraph, wherein the amino acid sequence comprises an amino acid sequence of SEQ ID NO: 1 or a sequence of at least 77% identity therewith.

(160) 6. A Cas protein or polypeptide fragment as in any preceding numbered paragraph, wherein the Cas protein is obtainable from a bacterium, archaeon or virus.

(161) 7. A Cas protein or polypeptide fragment as in any preceding numbered paragraph, wherein the Cas protein is obtainable from Geobacillus sp., preferably from Geobacillus thermodenitrificans.

(162) 8. A ribonucleoprotein complex comprising a Cas protein as in any preceding numbered paragraph, and comprising at least one targeting RNA molecule which recognises a sequence in a target polynucleotide.

(163) 9. A ribonucleoprotein complex as in numbered paragraph 8, wherein the targeting RNA molecule comprises a crRNA and optionally a tracrRNA.

(164) 10. A ribonucleoprotein complex as in any of numbered paragraphs 7 to 9, wherein the length of the at least one RNA molecule is in the range 35-135 nucleotide residues.

(165) 11. A ribonucleoprotein complex as in numbered paragraph 8 or 9, wherein the target sequence is 31 or 32 nucleotide residues in length.

(166) 12. A Cas protein or polypeptide as in any of numbered paragraphs 1 to 7 or a ribonucleoprotein complex as in any of 8 to 11, wherein the protein or polypeptide is provided as part of a protein complex comprising at least one further functional or non-functional protein.

(167) 13. A Cas protein, polypeptide, or ribonucleoprotein complex as in numbered paragraph 12, wherein the Cas protein or polypeptide, and/or the at least one further protein further comprise at least one functional moiety.

(168) 14. A Cas protein or polypeptide, or ribonucleoprotein complex as in numbered paragraph 13, wherein the at least one functional moiety is fused or linked to the N-terminus and/or the C-terminus of the Cas protein, polypeptide or ribonucleoprotein complex; preferably the N-terminus.

(169) 15. A Cas protein or polypeptide, or a ribonucleoprotein complex as in numbered paragraph 13 or 14, wherein the at least one functional moiety is a protein; optionally selected from a helicase, a nuclease, a helicase-nuclease, a DNA methylase, a histone methylase, an acetylase, a phosphatase, a kinase, a transcription (co-) activator, a transcription repressor, a DNA binding protein, a DNA structuring protein, a marker protein, a reporter protein, a fluorescent protein, a ligand binding protein, a signal peptide, a subcellular localisation sequence, an antibody epitope or an affinity purification tag.

(170) 16. A Cas protein or polypeptide, or a ribonucleoprotein complex as in numbered paragraph 15, wherein the native Cas9 nuclease activity is inactivated and the Cas protein is linked to at least one functional moiety.

(171) 17. A Cas protein or polypeptide, or a ribonucleoprotein complex as in numbered paragraph 15 or 16, wherein the at least one functional moiety is a nuclease domain; preferably a FokI nuclease domain.

(172) 18. A Cas protein or polypeptide, or a ribonucleoprotein complex as in any of numbered paragraphs 15 to 17, wherein the at least one functional moiety is a marker protein, for example GFP.

(173) 19. An isolated nucleic acid molecule encoding a Cas protein or polypeptide, comprising;

(174) a. the amino acid motif EKDGKYYC [SEQ ID NO: 2]; and/or b. the amino acid motif X.sub.1X.sub.2CTX.sub.3X.sub.4 [SEQ ID NO: 3] wherein X.sub.1 is independently selected from Isoleucine, Methionine or Proline, X.sub.2 is independently selected from Valine, Serine, Asparagine or Isoleucine, X.sub.3 is independently selected from Glutamate or Lysine and X.sub.4 is one of Alanine, Glutamate or Arginine; and/or

(175) c. the amino acid motif X.sub.5LKX.sub.6IE [SEQ ID NO: 4] wherein X.sub.5 is independently selected from Methionine or Phenylalanine and X.sub.6 is independently selected from Histidine or Asparagine; and/or

(176) d. the amino acid motif X.sub.7VYSX.sub.8K [SEQ ID NO: 5] wherein X.sub.7 is Glutamate or Isoleucine and X.sub.8 is one of Tryptophan, Serine or Lysine; and/or

(177) e. the amino acid motif X.sub.9FYX.sub.10X.sub.11REQX.sub.12KEX.sub.13 [SEQ ID NO: 6] wherein X.sub.9 is Alanine or Glutamate, X.sub.10 is Glutamine or Lysine, X.sub.11 is Arginine or Alanine, X.sub.12 is Asparagine or Alanine and X.sub.13 is Lysine or Serine;

(178) wherein the Cas protein or polypeptide is capable of DNA binding, cleavage, marking or modification between 50° C. and 100° C. when associated with at least one targeting RNA molecule, and a polynucleotide comprising a target nucleic acid sequence recognised by the targeting RNA molecule.

(179) 20. An isolated nucleic acid molecule encoding a clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) protein having an amino acid sequence of SEQ ID NO: 1 or a sequence of at least 77% identity therewith; or a polypeptide fragment thereof.

(180) 21. An isolated nucleic acid molecule as in numbered paragraph 19 or 20, further comprising at least one nucleic acid sequence encoding an amino acid sequence which upon translation is fused with the Cas protein or polypeptide.

(181) 22. An isolated nucleic acid molecule as in numbered paragraph 21, wherein the at least one nucleic acid sequence fused to the nucleic acid molecule encoding the Cas protein or polypeptide encodes a protein selected from a helicase, a nuclease, a helicase-nuclease, a DNA methylase, a histone methylase, an acetylase, a phosphatase, a kinase, a transcription (co-)activator, a transcription repressor, a DNA binding protein, a DNA structuring protein, a marker protein, a reporter protein, a fluorescent protein, a ligand binding protein, a signal peptide, a subcellular localisation sequence, an antibody epitope or an affinity purification tag.

(182) 23. An expression vector comprising a nucleic acid molecule as in any of numbered paragraphs 19 to 22.

(183) 24. An expression vector as in numbered paragraph 23, further comprising a nucleotide sequence encoding at least one targeting RNA molecule.

(184) 25. A method of modifying a target nucleic acid comprising contacting the nucleic acid with:

(185) a. a ribonucleoprotein complex of any of numbered paragraphs 6 to 11; or

(186) b. a protein or protein complex of any of numbered paragraphs 12 to 18 and at least one targeting RNA molecule as defined in any of numbered paragraphs 6 to 11; and wherein said method is not used in human cells.

(187) 26. A method of modifying a target nucleic acid in a non-human cell, comprising transforming, transfecting or transducing the cell with an expression vector of numbered paragraph 24; or alternatively transforming, transfecting or transducing the cell with an expression vector of numbered paragraph 23 and a further expression vector comprising a nucleotide sequence encoding a targeting RNA molecule as defined in any of numbered paragraphs 6 to 11.

(188) 27. A method of modifying a target nucleic acid in a non-human cell comprising transforming, transfecting or transducing the cell with an expression vector of numbered paragraph 23, and then delivering a targeting RNA molecule as defined in any of numbered paragraphs 6 to 11 to or into the cell.

(189) 28. A method of modifying a target nucleic acid as in any of numbered paragraphs 25 to 28, wherein the at least one functional moiety is a marker protein or reporter protein and the marker protein or reporter protein associates with the target nucleic acid; preferably wherein the marker is a fluorescent protein, for example a green fluorescent protein (GFP).

(190) 29. A method as in any of numbered paragraphs 25 to 28, wherein the target nucleic acid is DNA; preferably dsDNA.

(191) 30. A method as in any of numbered paragraphs 25 to 28, wherein the target nucleic acid is RNA.

(192) 31. A method of modifying a target nucleic acid as in numbered paragraph 29, wherein the nucleic acid is dsDNA, the at least one functional moiety is a nuclease or a helicase-nuclease, and the modification is a single-stranded or a double-stranded break at a desired locus.

(193) 32. A method of silencing gene expression at a desired locus according to any of the methods in any of numbered paragraphs 26, 27, 29 or 31.

(194) 33. A method of modifying or deleting and/or inserting a desired nucleotide sequence at a desired location according to any of the methods as in any of numbered paragraphs 26, 27, 29 or 31.

(195) 34. A method of modifying gene expression in a non-human cell comprising modifying a target nucleic acid sequence as in a method of any of numbered paragraphs 25 to 29; wherein the nucleic acid is dsDNA and the functional moiety is selected from a DNA modifying enzyme (e.g. a methylase or acetylase), a transcription activator or a transcription repressor.

(196) 35. A method of modifying gene expression in a non-human cell comprising modifying a target nucleic acid sequence as in a method of numbered paragraph 30, wherein the nucleic acid is an mRNA and the functional moiety is a ribonuclease; optionally selected from an endonuclease, a 3′ exonuclease or a 5′ exonuclease.

(197) 36. A method of modifying a target nucleic acid as in any of numbered paragraphs 25 to 35, wherein the method is carried out at a temperature between 50° C. and 100° C.

(198) 37. A method of modifying a target nucleic acid as in numbered paragraph 36, wherein the method is carried out at a temperature at or above 60° C., preferably between 60° C. and 80° C., more preferably between 60° C. and 65° C.

(199) 38. A method as in any of numbered paragraphs 25 to 37 wherein the cell is a prokaryotic cell.

(200) 39. A method as in any of numbered paragraphs 25 to 38 wherein the cell is a eukaryotic cell.

(201) 40. A host cell transformed by a method as in any of numbered paragraphs 22 to 36; wherein the cell is not a human cell.

CPC classification

C12N2310/20; C12N15/907; C12N9/22; C12N15/102; C12N15/11; C12N15/113; C12N15/902; C12N2330/51

International classification

C12N15/113; C12N9/22; C12N15/10; C12N15/90
