CAS9-CAS9 FUSION PROTEINS
20210395710 · 2021-12-23
Inventors
- Scot A Wolfe (Winchester, MA, US)
- Mehmet Fatih Bolukbasi (Worcester, MA, US)
- Ankit Gupta (Worcester, MA, US)
- Erik J. Sontheimer (Auburndale, MA, US)
- Nadia Amrani (Shrewsbury, MA, US)
Cpc classification
C12N9/22
CHEMISTRY; METALLURGY
C07K2319/80
CHEMISTRY; METALLURGY
International classification
Abstract
The present invention provides a Cas9 platform to facilitate single-site nuclease gene editing precision within a human genome. For example, a Cas9 nuclease/DNA-targeting unit (Cas9-DTU) fusion protein precisely delivers a Cas9/sgRNA complex to a specific target site within the genome for subsequent sgRNA-dependent cleavage of an adjacent target sequence. Alternatively, attenuating Cas9 binding using mutations to the protospacer adjacent motif (PAM) recognition domain makes Cas9 target site recognition dependent on the associated DTU, all while retaining Cas9's sgRNA-mediated DNA cleavage fidelity. Cas9-DTU fusion proteins have improved target site binding precision, greater nuclease activity, and a broader sequence targeting range than standard Cas9 systems. Existing Cas9 or sgRNA variants (e.g., truncated sgRNAs (tru-gRNAs), nickases and FokI fusions) are compatible with these improvements to further reduce off-target cleavage. A robust, broadly applicable strategy is disclosed to impart Cas9 genome-editing systems with the single-genomic-site accuracy needed for safe, effective clinical application.
Claims
1-42. (canceled)
43. A method for genome editing of DNA within a cell, comprising i) delivering a fusion protein comprising a first Cas9 nuclease, said first nuclease comprising a protospacer adjacent motif recognition domain having a lysine-substituted, alanine-substituted or serine-substituted arginine residue and a second Cas9 nuclease; ii) delivering a first guide RNA for the first Cas9 comprising a guide sequence element that is complementary to a first target site, iii) delivering a second guide RNA for the second Cas9 comprising a guide sequence element that is complementary to a second target site; and iv) cleaving the first target site with the first Cas9 nuclease.
44. The method of claim 43, wherein the second Cas9 nuclease is an orthogonal Cas9 isoform.
45. The method of claim 44, wherein the first target site comprises a sequence complementary to the guide sequence element of the first guide RNA and a protospacer adjacent motif sequence for the first Cas9 nuclease, and the second target site comprises a sequence complementary to the guide sequence element of the second guide RNA and a protospacer adjacent motif sequence for the second Cas9 nuclease.
46. The method of claim 43, further comprising cleaving the second target site with the second Cas9 nuclease.
47. The method of claim 46, wherein a genomic deletion is generated.
48. The method of claim 43, wherein said first and second Cas9 nucleases are selected from the group consisting of Streptococcus pyogenes Cas9 (SpCas9), Staphylococcus aureus Cas9 (SaCas9), Neisseria meningitidis Cas9 (NmCas9) and Actinomyces naeslundii Cas9 (AnCas9).
49. The method of claim 43, wherein said substituted arginine protospacer adjacent motif (PAM) recognition residue participates in a base-specific binding or contacts a phosphodiester backbone.
50. The method of claim 43, wherein said substituted arginine PAM recognition residue participates in a base-specific binding.
51. The method of claim 43, wherein said substituted arginine PAM recognition residue contacts a phosphodiester backbone residue.
52. The method of claim 43, wherein the first Cas9 nuclease has a single point mutation.
53. The method of claim 43, wherein the first Cas9 nuclease has a double mutation.
54. The method of claim 43, wherein said first Cas9 nuclease is selected from the group consisting of SpCas9.sup.R1333K, SpCas9.sup.R1333S, SpCas9.sup.R1335K, NmCas9.sup.R1025A and NmCas9.sup.K1013A/R1025A.
55. The method of claim 43, wherein said second Cas9 nuclease is selected from the group consisting of a Cas9 nickase and a nuclease-dead Cas9 (dCas9).
56. The method of claim 43, wherein said second Cas9 nuclease is selected from the group consisting of nuclease-dead NmCas9 (NmdCas9), NmCas9 nuclease, NmCas9 HNH nickase, and NmCas9 RuvC nickase.
57. The method of claim 43, wherein said first and second Cas9 nucleases are bound to a guide RNA comprising a guide sequence element.
58. The method of claim 56, wherein said guide sequence element is truncated.
59. The method of claim 43, wherein the first Cas9 nuclease is selected from the group consisting of SpCas9.sup.R1333K, SpCas9.sup.R1333S, and SpCas9.sup.R1335K, and wherein the second Cas9 nuclease is NmCas9 nuclease.
60. The method of claim 43, wherein the fusion protein comprises a linker.
61. The method of claim 60, wherein the fusion protein comprises a linker that ranges between twenty and sixty amino acids.
62. The method of claim 60, wherein the first guide RNA is a single guide RNA or a combination of crRNA and tracrRNA, and wherein the second guide RNA is a single guide RNA or a combination of crRNA and tracrRNA.
Description
BRIEF DESCRIPTION OF THE FIGURES
[0084] The file of this patent contains at least one drawing executed in color. Copies of this patent with color drawings will be provided by the Patent and Trademark Office upon request and payment of the necessary fee.
DETAILED DESCRIPTION OF THE INVENTION
[0235] The present invention relates to the field of genetic engineering. In particular, specific genes can be cleaved, edited or deleted using Cas9 nucleases with improved precision when coupled to DNA targeting units, which can be either programmable DNA-binding domains or an alternate isoform of Cas9 that is programmed to recognize a site neighboring the sequence targeted by the Cas9 nuclease.
[0236] The CRISPR/Cas9 system is commonly employed in biomedical research; however, the precision of Cas9 is sub-optimal for gene therapy applications that involve editing a large population of cells. Variations on a standard Cas9 system have yielded improvements in the precision of targeted DNA cleavage, but often restrict the range of targetable sequences. It remains unclear whether these variations can limit lesions to a single site within the human genome over a large cohort of treated cells. In some embodiments, the present invention contemplates that fusing a programmable DNA-binding domain (pDBD) to Cas9 combined with an attenuation of Cas9's inherent DNA binding affinity produces a Cas9-pDBD chimera with dramatically improved precision and increased targeting range. Because the specificity and affinity of this framework are easily tuned, Cas9-pDBDs provide a flexible system that can be tailored to achieve extremely precise genome editing at nearly any genomic locus, characteristics that are ideal for gene therapy applications.
[0237] Conventional CRISPR technology has been used to effect genome editing with Cas9 nuclease activity in combination with specific guide RNAs (sgRNAs) to place the enzyme on a specific genomic DNA sequence where a double-stranded break is generated. Target location by Cas9 nuclease is typically a two-step process. First, the PAM specificity of Cas9 acts as a first sieve by defining a subset of sequences that are bound for a sufficient length of time to be interrogated by the incorporated guide RNA. This step may be a kinetic selection for functional target sequences. Second, sequences with sufficient homology to the PAM specificity of the Cas9 nuclease are interrogated by the incorporated guide RNA through R-loop formation that allows Watson-Crick pairing between the guide RNA and the bound DNA target site. If there is sufficient complementarity in this interaction, the nuclease domains within Cas9 (the RuvC and HNH domains) will generate a double-stranded break in the DNA. Szczelkun et al., Direct observation of R-loop formation by single RNA-guided Cas9 and Cascade effector complexes. Proc Natl Acad Sci USA. 2014 Jul. 8; 111(27): 9798-803.
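The two-step target location process described above can be illustrated with a short sketch. This is a toy model only, not the disclosed method: the function names, the single-strand scan, the representation of the guide as the DNA-equivalent protospacer sequence, and the mismatch threshold are all assumptions introduced for illustration. It treats the PAM check as a simple pattern match and the guide interrogation as a mismatch count, and it ignores binding kinetics and R-loop formation entirely.

```python
# Illustrative sketch only: a toy model of two-step target licensing.
# All names and thresholds here are hypothetical, not taken from the disclosure.

def pam_matches(pam: str, pattern: str = "NGG") -> bool:
    """Return True if a PAM matches the pattern, where N matches any base."""
    if len(pam) != len(pattern):
        return False
    return all(p == "N" or p == b for p, b in zip(pattern, pam.upper()))


def find_candidate_sites(dna: str, guide: str, pattern: str = "NGG",
                         max_mismatches: int = 0):
    """Scan one strand of dna for protospacers followed by a compatible PAM.

    Step 1 (PAM sieve): keep only positions whose 3' flanking bases match pattern.
    Step 2 (guide interrogation): count mismatches between guide and protospacer.
    Returns a list of (start_index, protospacer, pam, mismatches) tuples.
    """
    dna = dna.upper()
    guide = guide.upper().replace("U", "T")  # accept RNA or DNA spelling
    n, k = len(guide), len(pattern)
    hits = []
    for i in range(len(dna) - n - k + 1):
        protospacer = dna[i:i + n]
        pam = dna[i + n:i + n + k]
        if not pam_matches(pam, pattern):
            continue  # step 1 failed: the site is never interrogated
        mismatches = sum(1 for g, d in zip(guide, protospacer) if g != d)
        if mismatches <= max_mismatches:
            hits.append((i, protospacer, pam, mismatches))
    return hits
```

In this sketch a perfectly complementary protospacer is still rejected when its flanking bases do not match the PAM pattern, which mirrors the role of the PAM as a first sieve.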
[0238] The precision of a Cas9 nuclease-DNA targeting unit chimera may be improved by attenuating the independent recognition of target sites by the Cas9 nuclease, which can be achieved by altering its PAM recognition sequence and/or its affinity for the phosphodiester backbone by mutating residues that are involved in contacting the RNA or DNA. Further attenuation can be achieved by using a truncated single guide RNA to program the Cas9 nuclease. By attenuating the affinity of Cas9 for the DNA, the ability of the Cas9 nuclease to achieve a kinetic selection of a target sequence may be abrogated. Consequently, the Cas9 nuclease may become completely dependent on the coupled DNA targeting unit to achieve sufficient residence time on the DNA to allow R-loop formation with the incorporated guide RNA. Complementarity between the PAM specificity of the Cas9 nuclease and the target site is still required for R-loop formation, but it is no longer sufficient for initiating this event. This creates a system where the cleavage of a target site depends on at least three features of the Cas9 nuclease-DNA targeting unit chimera: 1) recognition of the sequence by the DNA-targeting unit, 2) complementarity between the Cas9 nuclease PAM specificity and the sequence, and 3) complementarity between the guide RNA and the target site. An added advantage of the Cas9 nuclease-DNA targeting unit fusion is that it expands the targeting range of the Cas9 nuclease, because PAM sequences that are normally bound with low affinity can be utilized.
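The three-feature dependence described above amounts to a logical AND gate, which the following sketch makes explicit. It is a deliberately simplified, hypothetical model: the class and function names are invented for illustration, and each feature is reduced to a yes/no flag, whereas the actual behavior depends on graded affinities and residence times.

```python
# Illustrative sketch only: cleavage licensing reduced to boolean features.
# SiteEvidence and both functions are hypothetical names, not from the disclosure.
from dataclasses import dataclass


@dataclass
class SiteEvidence:
    dtu_bound: bool            # 1) the DNA-targeting unit recognizes its neighboring sequence
    pam_compatible: bool       # 2) the PAM is compatible with the Cas9 PAM specificity
    guide_complementary: bool  # 3) the guide RNA is complementary to the target site


def standard_cas9_cleaves(site: SiteEvidence) -> bool:
    """Standard Cas9: PAM recognition plus guide complementarity suffice."""
    return site.pam_compatible and site.guide_complementary


def attenuated_fusion_cleaves(site: SiteEvidence) -> bool:
    """Attenuated Cas9-DTU chimera: all three features are required."""
    return (site.dtu_bound
            and site.pam_compatible
            and site.guide_complementary)
```

Under these assumptions, an off-target site that carries a compatible PAM and a complementary protospacer but lacks a neighboring DTU binding site is cleaved by the standard model and spared by the attenuated fusion model.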
[0239] One potential advantage of a SpCas9-pDBD system over other Cas9 platforms is an ability to rapidly tune affinity and specificity of an attached pDBD to further improve its precision. Consequently, improved precision of SpCas9.sup.MT3-ZFP.sup.TS2 was obtained by truncating a zinc finger protein (commonly abbreviated as ZFP, ZnF or ZF) to reduce its affinity for target site OT2-2. Constructs with a truncation of any of the terminal zinc fingers may display high activity at a target site. However, these truncations also reduced or eliminated off-target activity at OT2-2, reflecting a profound improvement in the precision of SpCas9.sup.MT3-ZFP.sup.TS2.
[0240] Similarly, utilization of a ZFP (e.g., TS2*) that recognizes an alternate sequence neighboring a TS2 guide target site also abolishes off-target activity at OT2-2, confirming that cleavage by SpCas9.sup.MT3-ZFP.sup.TS2 at this off-target site is dependent on the ZFP.
[0241] GUIDE-seq.sup.17 was employed to provide an unbiased assessment of the propensity for SpCas9.sup.MT3-ZFP chimeras to cleave at alternate off-target sites within a genome. Using a modified protocol with a customized bioinformatics analysis of peaks within a genome, genome-wide DSB induction by SpCas9 and SpCas9.sup.MT3-ZFP.sup.TS2/TS3/TS4 was assessed. This analysis reveals a dramatic enhancement of the precision of the SpCas9.sup.MT3-ZFPs at all three of the target sites.
[0242] In some embodiments, the present invention contemplates compositions and methods that improve Cas9 effector systems. In some embodiments, Cas9 fusion proteins are contemplated comprising a DNA targeting unit that may be a DNA binding domain (DBD). In some embodiments, Cas9 fusion proteins are contemplated comprising a DNA targeting unit that may be another Cas9 isoform (e.g. SpCas9-NmCas9) programmed with an orthogonal sgRNA. In some embodiments a Cas9 nuclease would be directly fused to the DNA-targeting unit. In some embodiments, a Cas9 nuclease would be associated with the DNA-targeting unit via a dimerization domain. In some embodiments, the dimerization domain would be a heterotypic dimerization domain, which would allow control over component association. In some embodiments, the dimerization domain would be drug-dependent, which would provide temporal control over the activity of the nuclease based on the presence of the small molecule dimerizer within the cell.
[0243] Improvements in targeting precision have been achieved through the use of truncated sgRNAs (e.g., less than 20 complementary bases). Fu et al., Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature Biotechnology (2014). Previous studies on truncated sgRNAs have suggested that sgRNAs for SpCas9 with less than 17 base pairs of complementarity to the target sequence have not been shown to be active in a genomic context. Improvements in precision have also been achieved by using pairs of Cas9 nickases to generate a double strand break. Mali et al., CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nature Biotechnology 31, 833-838 (2013); and Cho et al., Analysis of off-target effects of CRISPR/Cas-derived RNA-guided endonucleases and nickases. Genome Research 24, 132-141 (2014). In addition, nuclease dead Cas9 (dCas9) variants have been fused to the FokI nuclease domain to generate programmable nucleases where dCas9 serves as the DNA-targeting unit and FokI is the cleavage domain. Tsai et al., Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing. Nature Biotechnology. 2014 June; 32(6): 569-76; and Guilinger et al., Fusion of catalytically inactive Cas9 to FokI nuclease improves the specificity of genome modification. Nature Biotechnology. 2014 June; 32(6): 577-82.
[0244] The proposed strategies described herein provide improved and more efficient Cas9-pDBD platforms that facilitate the construction of compact Cas9 orthologs. These compact orthologs permit alternate delivery methods (e.g. adeno-associated virus or AAV) broadening the clinical therapeutic modalities available for diseases including, but not limited to CGD. These strategies are also applicable to the treatment of a wide range of other monogenic disorders.
I. Conventional Cas9 Protein Modifications
[0245] Recently, an RNA-guided adaptive immune system that may be widespread in bacteria and archaea has been adapted for achieving targeted DNA cleavage or gene regulation in prokaryotic and eukaryotic genomes. Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) RNA sequences and CRISPR-associated (Cas) genes form catalytic protein-RNA complexes that utilize the incorporated RNA to generate sequence-specific double strand breaks at a complementary DNA sequence. This nuclease platform has displayed remarkable robustness for targeted gene inactivation or tailor-made genome editing. Sander et al., CRISPR-Cas systems for editing, regulating and targeting genomes. Nature Biotechnology 32, 347-355 (2014); Mali et al. RNA-guided human genome engineering via Cas9. Science 339, 823-826 (2013); Ran et al. Double Nicking by RNA-Guided CRISPR Cas9 for Enhanced Genome Editing Specificity. Cell 154, 1380-1389 (2013); Fu et al., Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature Biotechnology 32, 279-284 (2014); and Wang et al. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell 153, 910-918 (2013).
[0246] The CRISPR/Cas9 genome engineering system is revolutionizing biological sciences due to its simplicity and efficacy.sup.1-3. The most commonly studied Cas9 nuclease originates from Streptococcus pyogenes (SpCas9).sup.4. SpCas9 and its associated guide RNA license a DNA sequence for cleavage based on at least two stages of sequence interrogation.sup.4-8: i) compatibility of a PAM element with the specificity of the PAM-interacting domain, and ii) complementarity of a guide RNA sequence with the target site. Because it is straightforward to program Cas9 to cleave a desired target site through incorporation of a complementary single guide RNA (sgRNA).sup.4, a primary constraint on Cas9 targeting is the presence of a compatible PAM element.sup.4,9,10. For example, a PAM-interacting domain of wild-type SpCas9 (SpCas9.sup.WT) preferentially recognizes an nGG element.sup.4, although it can inefficiently utilize other PAM sequences (e.g. nAG, nGA).sup.9,11. The simplicity of a SpCas9/sgRNA system allows facile editing of genomes in a variety of organisms and cell lines.sup.1-3. Target specificity is a function of recognition by both the guide RNA (through Watson-Crick base pairing) and an inherent specificity of Cas9 through recognition of a neighboring motif (e.g., a protospacer adjacent motif (PAM)).
[0247] SpCas9 targeting precision is sub-optimal for most gene therapy applications involving editing of a large population of cells.sup.12,13. Numerous studies have demonstrated that SpCas9 can cleave a genome at unintended sites.sup.9,14-20, with some guides acting at more than 100 off-target sites.sup.17. Recent genome-wide analyses of SpCas9 precision indicate that a majority of genomic loci that differ from a guide RNA sequence at 2 nucleotides, and a subset of genomic loci that differ at 3 nucleotides, are cleaved with moderate activity.sup.17-20. For some guides, off-target sites that differ by up to 6 nucleotides can be inefficiently cleaved.sup.17-20. In addition, at some off-target sites bulges can be accommodated within the sgRNA:DNA heteroduplex to allow cleavage.sup.15. In this light, a global analysis of potential SpCas9 target sites was performed using CRISPRseek.sup.21,22 to assess the general frequency of potential off-target sites with three or fewer mismatches for guide RNAs falling in two categories of sequence elements: exon regions or promoter regions. A vast majority of guides (˜98% in exons and ˜99% in promoters) were found to have one or more off-target sites with 3 or fewer mismatches and thus are likely to have some level of off-target activity.
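The notion of an off-target site with three or fewer mismatches can be made concrete with a small sketch. The code below is an assumption-laden illustration and does not reproduce the CRISPRseek analysis or its interface: the function names are hypothetical, candidate sites are supplied as a plain list of equal-length sequences, and PAM requirements, bulges and position-dependent mismatch tolerance are all ignored.

```python
# Illustrative sketch only: tally near-matches to a guide by mismatch count.
# Function names are hypothetical; this is not the CRISPRseek algorithm or API.
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def off_target_profile(guide: str, candidate_sites, max_mismatches: int = 3):
    """Count candidate sites that differ from the guide by 1..max_mismatches bases.

    Exact matches (0 mismatches) are treated as the intended target and skipped.
    Returns a dict mapping mismatch count to number of sites.
    """
    counts = Counter()
    for site in candidate_sites:
        d = hamming(guide.upper(), site.upper())
        if 0 < d <= max_mismatches:
            counts[d] += 1
    return dict(sorted(counts.items()))
```

A guide would be flagged as having potential off-target activity in this simplified sense whenever the returned dictionary is non-empty.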
[0248] Reduced off-target cleavage rates have been reported with several modifications to the structure or delivery of a CRISPR/Cas9 system. Examples include, but are not limited to: changing guide sequence length and composition.sup.25,26; employing pairs of Cas9 nickases.sup.26-28; dimeric FokI-dCas9 nucleases.sup.10,29; inducible assembly of split Cas9.sup.30-33; Cas9 PAM variants with enhanced specificity.sup.34; and delivery of Cas9/sgRNA ribonucleoprotein complexes.sup.35-37. However, it remains uncertain whether these variations can restrict cleavage to a single site within the human genome over a large cohort of treated cells.sup.12,38. In addition, some of the most promising approaches (e.g., paired nickases or dimeric FokI-dCas9) restrict a targetable sequence space by requiring the proximity of two sequences compatible with Cas9 recognition.
[0249] Cas9 isoforms derived from different species can display different PAM specificities. Esvelt et al., Orthogonal Cas9 proteins for RNA-guided gene regulation and editing. Nature Methods 10, 1116-1121 (2013); Zhang et al., Processing-independent CRISPR RNAs limit natural transformation in Neisseria meningitidis. Molecular Cell 50, 488-503 (2013); Hou et al., Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proceedings of the National Academy of Sciences (2013); and Fonfara et al., Phylogeny of Cas9 determines functional exchangeability of dual-RNA and Cas9 among orthologous type II CRISPR-Cas systems. Nucleic Acids Research 42, 2577-2590 (2014). The Cas9 nuclease from Streptococcus pyogenes (hereafter, Cas9, or SpCas9 or catalytically active Cas9) can be guided to specific sites in a genome through base-pair complementation between a 20 nucleotide guide region of an engineered RNA (sgRNA) and a genomic target sequence. Cho et al., Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nature Biotechnology 31, 230-232 (2013); Cong et al., Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819-823 (2013); Jinek et al., RNA-programmed genome editing in human cells. eLife 2, e00471 (2013); and Sternberg et al., DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 7490 (2014).
[0250] Structural information is also available on Cas9 and Cas9-sgRNA-DNA complexes. Jinek et al., Structures of Cas9 Endonucleases Reveal RNA-Mediated Conformational Activation. Science (2014); Nishimasu et al., Crystal Structure of Cas9 in Complex with Guide RNA and Target DNA. Cell 1-23 (2014); and Anders et al., Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature, 2014 Sep. 25; 513(7519): 569-73. Various other studies have reported on Cas9 precision (e.g., activity at its target site relative to off-target sequences) within a genome. Studies on Cas9 nuclease have demonstrated that off-target cleavage can occur at both NGG and NAG PAMs, where there can be up to 5 mismatches within the guide recognition sequence. Fu et al., High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nature Biotechnology 31, 822-826 (2013); Pattanayak et al., High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nature Biotechnology 31, 839-843 (2013); and Hsu et al., DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnology 31, 827-832 (2013).
[0251] Other Cas9 variants for improving specificity have also been investigated. For example, double-strand breaks may be generated through the nicks generated in each strand by RuvC and HNH nuclease domains of Cas9. Jinek et al., A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821 (2012); and
[0252] However, it has recently been shown that single nickases can be mutagenic with lesion rates >1% depending on the target site. Tsai et al., Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing. Nature Biotechnology (2014); and Guilinger et al., Fusion of catalytically inactive Cas9 to FokI nuclease improves the specificity of genome modification. Nature Biotechnology 32, 577-582 (2014). Alternately, a catalytically-inactive, programmable, RNA-dependent DNA-binding protein (dCas9) can be generated by mutating both endonuclease domains within Cas9. Larson et al., CRISPR interference (CRISPRi) for sequence-specific control of gene expression. Nat Protoc 8, 2180-2196 (2013); and Qi et al., Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell. 2013 Feb. 28; 152(5): 1173-83. When fused to a FokI endonuclease domain this construct can be used like zinc fingers or TALE domains to create the above dimeric nucleases, which display improved precision over a standard Cas9.
[0253] Type II CRISPR/Cas9 systems have been used for targeted genome editing in complex genomes. Barrangou et al., CRISPR-Cas systems: Prokaryotes upgrade to adaptive immunity. Molecular Cell. 2014 Apr. 24; 54(2): 234-44; Hsu et al., Development and Applications of CRISPR-Cas9 for Genome Engineering. Cell. 2014 Jun. 5; 157(6): 1262-78; and Sander et al., CRISPR-Cas systems for editing, regulating and targeting genomes. Nature Biotechnology. 2014 April; 32(4): 347-55. Editing sites can be selected based primarily on two features: complementarity to a single-guide RNA (sgRNA), and proximity to a short (2-5 base pair) sequence called a protospacer adjacent motif (PAM). Subsequent DNA cleavage and repair enables gene inactivation by non-homologous end joining (NHEJ), or sequence correction and/or insertion by homology-directed repair (HDR). This technology has relevance to the construction of animal and cell models and gene therapy. Hu et al., RNA-directed gene editing specifically eradicates latent and prevents new HIV-1 infection. Proceedings of the National Academy of Sciences. 2014 Aug. 5; 111(31): 11461-6; and Yin et al., Genome editing with Cas9 in adult mice corrects a disease mutation and phenotype. Nature Biotechnology. 2014 June; 32(6): 551-3.
[0254] Despite these advantages, clinical genome editing may require even greater precision. Numerous reports have described promiscuity of standard Cas9, which leads to collateral damage at unintended sites. Fu et al., High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nature Biotechnology. 2013 September; 31(9): 822-6; Pattanayak et al., High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nature Biotechnology. 2013 September; 31(9): 839-43; Hsu et al., DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnology. 2013 September; 31(9): 827-32; and Lin et al., CRISPR/Cas9 systems have off-target activity with insertions or deletions between target DNA and guide RNA sequences. Nucleic Acids Research. 2014; 42(11): 7473-85.
[0255] Cas9/sgRNA variations that can improve precision but do not eliminate off-target activity include, but are not limited to: i) dual nickases (Mali et al., CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nature Biotechnology. 2013 September; 31(9): 833-8; and Ran et al., Double Nicking by RNA-Guided CRISPR Cas9 for Enhanced Genome Editing Specificity. Cell. 2013 Sep. 12; 154(6): 1380-9); ii) truncated sgRNAs (tru-sgRNAs;
[0256] The PAM interaction residues for SpCas9 have been described (Anders et al., (2014) Structural Basis of PAM-Dependent Target DNA Recognition by the Cas9 Endonuclease, Nature 513(7519), 569-573), but this study does not provide information on how to generate an improved Cas9 fusion protein with a DNA targeting unit or truncated sgRNA sequences.
[0257] It has been reported that PAM recognition sequences may play a role in efficiently engaging Cas9 nucleolytic activity, thereby providing an explanation for low off-target editing rates. While describing Cas9 modification of DNA, this reference does not describe fusion proteins combining the elements, nor does it discuss modification of the Cas9 PAM site or other modifications beyond the targeting RNA. Cencic et al., (2014) Protospacer Adjacent Motif (PAM)-Distal Sequences Engage CRISPR Cas9 DNA Target Cleavage, PLoS ONE 9(10), e109213.
[0258] An X-ray crystal structural analysis of Cas9 in a complex with guide RNA and target DNA has been reported. Nishimasu et al., (2014) Crystal Structure of Cas9 in Complex with Guide RNA and Target DNA, Cell 156(5), 935-949. E-published Feb. 13, 2014. This structural analysis provides insight into the identity of a Cas9 protospacer adjacent motif recognition domain and other sequence recognition features. While describing the orientation and features of Cas9 in complex with an sgRNA and DNA, this reference does not describe the type of Cas9 modifications, fusion proteins, or mutations needed to make an attenuated Cas9.
[0259] A fusion protein using catalytically inactive Cas9 and FokI nuclease (FokI-dCas9) has been reported. Guilinger et al. (2014) Fusion of Catalytically Inactive Cas9 to FokI Nuclease Improves the Specificity of Genome Modification, Nat Biotech 32(6), 577-582. Cleavage of the sequence requires the combination of two of these FokI-dCas9 monomers, and the reported targeting specificity was more than 140-fold higher than that of wild-type Cas9 at the same efficiency. While describing a Cas9 fusion protein complex that increases targeting, this reference does not describe a fusion protein with specific DNA binding proteins, modification of the PAM site, or truncated targeting sequences.
[0260] The fusion of both zinc fingers and TAL effectors as programmable DNA binding proteins with non-Cas9 proteins has been reported to produce various effects upon targeted DNA sequences. Strauß et al., (2013) Zinc Fingers, Tal Effectors, or Cas9-Based DNA Binding Proteins: What's Best for Targeting Desired Genome Loci?, Mol. Plant 6(5), 1384-1387. While describing zinc fingers, TAL effectors, and Cas9, this reference does not describe fusion proteins combining these elements, nor does it discuss any modification of the Cas9 protein (e.g., specific mutations) beyond the targeting RNA.
[0261] dCas9 or TALE proteins have been fused with effector constructs (e.g., activation or repression domains) to modulate the expression of the Oct4 genes. Hu et al., (2014) Direct Activation of Human and Mouse Oct4 Genes Using Engineered TALE and Cas9 Transcription Factors, Nucleic Acids Res. 42(7), 4375-4390. While describing zinc fingers, TAL effectors, and Cas9, this reference does not describe fusion proteins combining these elements, nor does it discuss modification of Cas9 (e.g., specific mutations) beyond the targeting RNA.
[0262] CRISPR/Cas systems have been reported to be generally useful for genomic modification and gene modulation. Wu, F. “CRISPR/Cas Systems for Genomic Modification and Gene Modulation,” U.S. Patent Application Publication Number U.S. 2014-0273226 (herein incorporated by reference). While describing Cas9 modification of DNA, this reference does not describe fusion proteins combining these elements, nor does it discuss modification of Cas9 (e.g., specific mutations) beyond the targeting RNA.
[0263] A single Cas enzyme has been programmed by a short RNA molecule to recognize a specific DNA target; in other words, the reported Cas enzyme can be recruited to a specific DNA target using said short RNA molecule. Cong et al., “CRISPR-Cas Component Systems, Methods and Compositions for Sequence Manipulation,” U.S. Patent Application Publication Number U.S. 2014/0273231 (herein incorporated by reference). The reference describes a vector system that delivers the elements of the Cas system to effect changes to the DNA target. The reference also describes the importance of the PAM sequences in target DNA. While describing Cas9 modification of DNA, this reference does not describe fusion proteins combining these elements, nor does it discuss modification of the Cas9 PAM recognition domain (e.g., specific mutations) or other modifications beyond the targeting RNA.
[0264] Non-Cas9/TALE fusion proteins have been reported where the TALEs are engineered, programmable DNA-binding domains which bind specifically to a preselected target sequence. Joung et al., “Transcription Activator-Like Effector (TALE)-Lysine-Specific Demethylase 1 (Lsd1) Fusion Proteins,” WO/2014/059255. This reference does not describe a fusion protein with Cas9 systems, nor does it discuss modification of the Cas9 PAM recognition domain (e.g., specific mutations) or other modifications.
[0265] It has been reported that a mutation within an active site of an enzyme results in a change in DNA binding affinity. Shroyer et al., (1999) Mutation of an Active Site Residue in Escherichia coli Uracil-DNA Glycosylase: Effect on DNA Binding, Uracil Inhibition and Catalysis, Biochemistry 38(15), 4834-4845. This reference does not describe Cas9 fusion proteins, nor does it discuss modification of the Cas9 PAM recognition domain (e.g., specific mutations) or other modifications beyond the targeting RNA.
II. Cas9 Nuclease-DNA Targeting Unit Fusion Proteins
[0266] In some embodiments, the present invention contemplates a Cas9 nuclease-DNA Targeting Unit (Cas9-DTU) fusion protein that cleaves a single site within a genome. In one embodiment, the Cas9-DTU fusion protein may be compatible with previously reported specificity-enhancing variations of Cas9. In some embodiments, the present invention contemplates Cas9-DTU fusion proteins using a wide variety of Cas9 orthologs including, but not limited to, SpCas9 (e.g., Type II-A) and NmCas9 (e.g., Type II-C), both of which are validated as genome-editing platforms. Jinek et al., A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 2012 Aug. 7; 337(6096): 816-21; Hou et al., Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proceedings of the National Academy of Sciences. 2013 Sep. 24; 110(39): 15644-9; Jinek et al., RNA-programmed genome editing in human cells. eLife. 2013; 2: e00471; Mali et al., RNA-guided human genome engineering via Cas9. Science. 2013 Feb. 15; 339(6121): 823-6; and Cong et al., Multiplex genome engineering using CRISPR/Cas systems. Science. 2013 Feb. 15; 339(6121): 819-23. Because >90% of known Cas9 orthologs are either Type II-A or Type II-C (Fonfara I, Le Rhun A, Chylinski K, Makarova K S, Lécrivain A L, Bzdrenga J, Koonin E V, Charpentier E. Phylogeny of Cas9 determines functional exchangeability of dual-RNA and Cas9 among orthologous type II CRISPR-Cas systems. Nucleic Acids Research. 2014 Feb. 1; 42(4): 2577-90. PMCID: PMC3936727), the present invention facilitates embodiments to nearly any desired Type II Cas9 system.
[0267] In one embodiment, the present invention contemplates an improved Cas9 platform, where target recognition precision is improved by incorporation of a programmable DNA-binding domain (pDBD), such as Cys2-His2 zinc finger protein (ZFPs).sup.39 or transcription-activator like effectors (TALEs).sup.40.
[0268] One favorable characteristic of the presently disclosed pDBDs is their inherent modularity whereby specificity and affinity can be rationally tuned by adjusting the number and composition of incorporated modules and the linkage between modules.sup.44,45. In one embodiment, the present invention contemplates that a fusion of a pDBD to a mutant SpCas9 with an attenuated DNA-binding affinity generates a chimeric nuclease fusion protein comprising a broad sequence targeting range and dramatically improved precision (as compared to conventional Cas9 platforms). Although it is not necessary to understand the mechanism of an invention, it is believed that the present disclosed SpCas9-pDBD platforms have favorable properties for genome engineering applications. In addition, it is shown herein that these SpCas9-pDBD chimeras provide new insights into the barriers involved in licensing target site cleavage by a SpCas9/sgRNA complex.
[0269] Innovations to achieve an ultimate goal of precisely editing a single site within a genome comprise two general strategies that have applicability to all Cas9 systems. First, a DTU could be a programmable DBD fusion protein comprising either a ZFP (Urnov et al., Genome editing with engineered zinc finger nucleases. Nat Rev Genet. 2010 Sep. 1; 11(9): 636-46) or a TALE protein (Joung J K, Sander J D. TALENs: a widely applicable technology for targeted genome editing. Nat. Rev. Mol. Cell Biol. 2013 January; 14(1): 49-55). These DTU fusion proteins can precisely deliver a Cas9/sgRNA complex to a specific site within a genome and thereby facilitate sgRNA-dependent cleavage of an adjacent target sequence. Alternately, a DTU could be an orthogonal Cas9 isoform (e.g. nmCas9) that through the use of an orthogonal sgRNA targets the Cas9 nuclease to a specific site in the genome. In some embodiments, an orthogonal Cas9 DTU would be a nuclease-dead Cas9, so that it merely functions as a DNA recognition domain. In some embodiments, an orthogonal Cas9 DTU would be an active nuclease (either a nickase or nuclease), so that it can also break the DNA. In some embodiments, an orthogonal Cas9 DTU could also have attenuated DNA-binding affinity (NmCas9.sup.DM,
[0270] In one embodiment, the present invention contemplates a coupled DNA cleavage system including at least three levels of licensing: 1) recognition of a neighboring site by an attached DTU, 2) PAM recognition, and 3) sgRNA complementarity. The data presented herein indicate that PAM specificity of a Cas9 can be tuned, which provides an opportunity to alter and/or refine the sequence preference of Cas9 to a high level of precision, and may also allow allele-specific targeting using SNPs as discriminators—e.g., for inactivation of dominant disease alleles. In some embodiments, a combined DTU fusion protein and altered PAM recognition strategy may also be compatible with all prior variants of Cas9 (e.g., dual nickases, tru-sgRNAs, or FokI fusions), further extending the precision of these constructs. In some embodiments, a Cas nuclease-DTU will extend the number of target sites that are functional sequences, allowing the efficient discrimination of alleles based on SNPs that distinguish these alleles, where these SNPs, if present in the PAM recognition sequence, would be the discriminators between active and inactive target sites. Although it may be not necessary to understand the mechanism of an invention, it is believed that the presently disclosed Cas9-DTU fusion proteins yield constructs that provide a single site precision sufficient for targeted genome editing, thereby facilitating gene therapy applications.
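By way of illustration only, the three levels of licensing described above can be sketched as three independent sequence checks. The following Python sketch is not part of the disclosed method: the function names, the `max_gap` window, the exact-match treatment of the guide, and the restriction to a single strand with the DTU site downstream of the PAM are all simplifying assumptions made for the example.

```python
def matches_pam(candidate: str, pam_pattern: str = "NGG") -> bool:
    """Return True if candidate matches the PAM pattern, where N is any base."""
    return len(candidate) == len(pam_pattern) and all(
        p == "N" or p == base for p, base in zip(pam_pattern, candidate)
    )


def licensing_checks(seq, start, guide, dtu_site, pam_pattern="NGG", max_gap=15):
    """Evaluate the three licensing levels for one candidate site.

    Assumptions (hypothetical, for illustration): the guide is written as the
    DNA sequence of the protospacer, the PAM lies immediately 3' of the
    protospacer, and the DTU site must start within max_gap bp after the PAM.
    """
    seq, guide, dtu_site = seq.upper(), guide.upper(), dtu_site.upper()
    protospacer = seq[start:start + len(guide)]
    pam_start = start + len(guide)
    pam_end = pam_start + len(pam_pattern)
    pam = seq[pam_start:pam_end]
    # Window in which a DTU site may begin 0..max_gap bp after the PAM.
    window = seq[pam_end:pam_end + max_gap + len(dtu_site)]
    return {
        "DTU site recognition": dtu_site in window,
        "PAM recognition": matches_pam(pam, pam_pattern),
        "sgRNA complementarity": protospacer == guide,
    }
```

In this sketch a site would be treated as licensed only when all three values are true; real cleavage behavior depends on binding affinities and mismatch tolerance that a boolean check does not capture.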
[0271] In one embodiment, the present invention contemplates a flexible, highly precise Cas9-based nuclease platform that cleaves only a single site within a multigigabase genome. This level of precision facilitates Cas9-based in vivo gene corrections, which may require precise genome editing of billions to trillions of cells. Currently achievable levels of genome editing specificity with conventional platforms must be increased to circumvent the hazards of unintended, difficult-to-predict off-target mutations. Although it may be not necessary to understand the mechanism of an invention, it is believed that the specificity and activity of Cas9 gene editing can be dramatically improved through an incorporation of an appended, programmable DNA-binding domain (pDBD). It is also believed that such improvements in editing specificity may result from a Cas9 platform that comprises: i) PAM recognition by Cas9; ii) DNA recognition by an sgRNA; and iii) flanking sequence recognition by a DBD. The data herein demonstrate the improvement in precision with SpCas9 systems (Type II-A) and functionality with NmCas9 systems (Type II-C), but one of skill in the art would appreciate that the disclosed strategy is applicable to all Cas9 based systems, such as Staphylococcus aureus (SaCas9) systems (Type II-A). Ran, F. A. et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, 186-191 (2015).
[0272] The development of Cas9-DBDs in the context of these two most prevalent subtypes (with their distinct domain arrangements) facilitates application of the present invention to nearly any Cas9-based genome editing system. Jinek et al., Structures of Cas9 endonucleases reveal RNA mediated conformational activation. Science. 2014 Mar. 14; 343(6176): 1247997. In addition, the presently disclosed Cas9-DBD framework should also be compatible with existing variants (e.g. dual nickases, tru-sgRNAs and/or FokI fusions) that have been reported to increase nuclease precision thereby enhancing precision. Fu et al., Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature Biotechnology. 2014 March; 32(3): 279-84; Tsai et al., Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing. Nature Biotechnology. 2014 June; 32(6): 569-76; and Guilinger et al., Fusion of catalytically inactive Cas9 to FokI nuclease improves the specificity of genome modification. Nature Biotechnology. 2014 June; 32(6): 577-82.
[0273] In some embodiments, the present invention contemplates a method for improving precision in genome editing using a Cas9-DBD fusion protein by engineering two representative Cas9 orthologs: S. pyogenes Cas9 (SpCas9; Type II-A) and N. meningitidis Cas9 (NmCas9; Type II-C, almost 300 aa smaller than SpCas9). These orthologs are validated genome-editing platforms, and the Type II-A and II-C families together encompass >90% of all Cas9 sequences. Modifications are presented that permit fused DBDs to increase precision and activity of both of these Cas9 orthologs as well as refine their inherent targeting range. One of skill in the art recognizes that the embodiments presented herein may be extended to other Cas9 systems or related CRISPR nuclease effectors (e.g. Cpf1; Zetsche, B. et al. Cpf1 Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System. Cell (2015). doi: 10.1016/j.cell.2015.09.038), since it may be possible that alternative Cas9 variants within these classes or other CRISPR nuclease effectors may have equivalent or superior properties for clinical applications.
[0274] Based on reported structures of Cas9, some embodiments of the present invention contemplate fusions between any Cas9 protein and programmable DNA-binding domains (e.g., for example, Cys2His2 zinc fingers (ZFP), homeodomains or TALE domains). ZFPs, homeodomains and TALEs can all be easily programmed to recognize a variety of DNA sequences, and have been employed with FokI nuclease to generate dimeric nucleases. Urnov et al., Genome editing with engineered zinc finger nucleases. Nat Rev Genet 11, 636-646 (2010); and Joung et al., TALENs: a widely applicable technology for targeted genome editing. Nat. Rev. Mol. Cell Biol. 14, 49-55 (2013); PMID 22539651. Although it may not be necessary to understand the mechanism of an invention, it is believed that by fusing a Cas9 to a DNA-binding domain (DBD), a hybrid nuclease may be created where the activity of the Cas9 component may be defined, in part, by an associated DNA-binding domain.
[0275] The genome editing precision of available nuclease platforms may be improved to circumvent the hazards of unintended, difficult-to-predict off-target mutations.sup.1, which can alter gene function through direct mutagenesis or translocations. Although it is not necessary to understand the mechanism of an invention, it is believed that the present method improves the specificity of Cas9 through an attachment of a pDBD to Cas9 with attenuated DNA-binding affinity, thereby establishing a system where Cas9 target site cleavage is dependent on sequence recognition by a pDBD. In addition, the present invention contemplates regulatable Cas9-pDBD prototypes where, for example, drug-dependent dimerization domains control the association of Cas9 and a pDBD.
[0276] In some embodiments of the present invention, an association of a Cas9 nuclease and the DTU may be mediated by dimerization domains. These dimerization domains could be, but are not limited to, homotypic dimerization domains, heterotypic dimerization domains, light-mediated dimerization domains and/or drug-dependent dimerization domains. These dimerization domains could be composed of, but are not limited to, protein or RNA.
[0277] In one embodiment, the present invention contemplates a Cas9-pDBD chimeric protein for target recognition and cleavage purposes by using a variety of Cas9 orthologs. In one embodiment, the method optimizes a SpCas9-pDBD system. In one embodiment, the method extends an approach to NmCas9 (Type II-C) and SaCas9.sup.16, which are more amenable to viral delivery. Although it is not necessary to understand the mechanism of an invention, it is believed that the development of Cas9-pDBDs in the context of the two most prevalent subtypes facilitates application of some of the present embodiments to future Cas9-based genome editing systems. In one embodiment, the present invention provides a Cas9 editing platform that establishes efficient and precise gene correction. For example, by applying this approach in HSPCs, an avenue for the ex vivo generation of a cell-based therapy can be established. Once established, this approach should be applicable to other HSPC-based monogenic disorders.
[0278] Preliminary data were collected using a Cas9-ZFP fusion protein (e.g., Zif268), where a ZFP was fused to either a Cas9 N-terminus (Zif268-Cas9) or a Cas9 C-terminus (Cas9-Zif268) via a long linker to provide flexibility in binding. The Zif268 sequence recognizes a nucleic acid target sequence of 5′-GCGTGGGCG-3′ (SEQ ID NO: 3). C-terminal Cas9-ZFP/sgRNA complex activity was demonstrated using a GFP reporter assay, where the reporter construct is inactive until a double-strand break is created within a target sequence (e.g., demonstrating a gain of function readout). The data demonstrated that both N-terminal Zif268-Cas9 and C-terminal Cas9-Zif268 were active, but that C-terminal Cas9-Zif268 showed the greatest activity.
[0279] A. Development And Validation
[0280] Based on SpCas9 structures, a fusion protein was designed between SpCas9 and a programmable DBD, wherein a DBD comprised either ZFP or TALE domains (e.g.
[0281] Both ZFPs and TALEs can be programmed to recognize nearly any sequence within a genome, where their affinities and specificities can be tuned based on the number of modules incorporated. Rebar et al., Heritable targeted gene disruption in zebrafish using designed zinc-finger nucleases. Nature Biotechnology. 2008 Jun. 25; 26(6): 702-8; Bhakta et al., Highly active zinc-finger nucleases by extended modular assembly. Genome Research. 2013 March; 23(3): 530-8; Zhu et al., Using defined finger-finger interfaces as units of assembly for constructing zinc-finger nucleases. Nucleic Acids Research. 2013 Feb. 1; 41(4): 2455-65; Kim et al., Preassembled zinc-finger arrays for rapid construction of ZFNs. Nature Methods. 2011; 8(1): 7; Meckler et al., Quantitative analysis of TALE-DNA interactions suggests polarity effects. Nucleic Acids Research. 2013 April; 41(7): 4118-28; and Reyon et al., FLASH assembly of TALENs for high-throughput genome editing. Nature Biotechnology. 2012 May; 30(5): 460-5. Preliminary experiments discussed herein resulted in the fusion of Cas9 with Zif268 or a TALE domain programmed to recognize the same sequence (TAL268), to the N-terminus (e.g. Zif268-SpCas9) or C-terminus (e.g. SpCas9-Zif268) of SpCas9 via a long linker. Cermak et al., Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Research. 2011 July; 39(12): e82-2; and Meng et al., Counter-selectable marker for bacterial-based interaction trap systems. Biotechniques. 2006 February; 40(2): 179-84. Although it may be not necessary to understand the mechanism of an invention, it is believed that a DBD, by recruiting Cas9 to a target site, would allow suboptimal PAM sequences to be cleaved efficiently, since there may be a kinetic barrier to R-loop formation by Cas9 at suboptimal PAM sequences. Szczelkun et al., Direct observation of R-loop formation by single RNA-guided Cas9 and Cascade effector complexes. 
Proc Natl Acad Sci USA. 2014 May 27.
[0282] SpCas9 is believed to have a strong sequence preference for NGG over NAG and NGA PAMs and to be essentially inactive at other NXX PAM trinucleotides. Hsu et al., DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnology. 2013 September; 31(9): 827-32; Jiang et al., RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nature Biotechnology. 2013 March; 31(3): 233-9; and Zhang et al., Comparison of non-canonical PAMs for CRISPR/Cas9-mediated DNA cleavage in human cells. Sci Rep. 2014; 4: 5405. It has been reported that an SpCas9 target site with an NAG PAM shows increased activity mediated by an appended DBD. A co-transfected plasmid GFP reporter system assay in Human Embryonic Kidney (HEK 293T) cells may be used to measure targeted DSB activity. Wilson et al., Expanding the Repertoire of Target Sites for Zinc Finger Nuclease-mediated Genome Modification. Mol Ther Nucleic Acids. 2013 April; 2(4): e88.
[0283] It was observed that C-terminal DBD fusions (e.g., SpCas9-Zif268) display superior activity to N-terminal fusions (
[0284] Both SpCas9-ZFP and SpCas9-TALE proteins can dramatically enhance nuclease activity on a nAG PAM target to a level comparable to wild-type SpCas9 (SpCas9.sup.WT) activity on a nGG PAM while being expressed at similar levels.
[0285] In one embodiment, a linker between a Cas9 nuclease and a DBD contains a plurality of amino acids (e.g., for example, approximately fifty-eight (58) amino acids) thereby providing good flexibility between the nuclease and the DBD. The data show that a standard SpCas9/sgRNA may be only functional with an NGG PAM, but not on an NAG PAM (blue bars). SpCas9-Zif268 (red bars) may be active on all spacings and orientations of the tested binding sites. SpCas9-TAL268 (green bars) has a much more restricted spacing and orientation, but strong activity can nonetheless be observed. Shorter linkers (e.g., for example, approximately twenty-five (25) amino acids) between a Cas9 nuclease and Zif268 have also been evaluated which provide a more restricted spacing between the nuclease and the DBD (
[0286] SpCas9 and SpCas9-Zif268 were tested on all sixteen (16) possible NXX PAM combinations to define the breadth of sequences that can be targeted. It was found that NGG, NGA, NAG, and NGC PAMs have very similar activity for SpCas9-Zif268 in the presence of a neighboring Zif268 target site, whereas SpCas9 only cleaved NGG PAM efficiently.
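For reference, the sixteen NXX PAM combinations mentioned above follow from fixing the first position as N (any base) and varying the two remaining positions over the four bases. A minimal Python sketch of that enumeration is given below; the function name `nxx_pams` is hypothetical and is used only for this example.

```python
from itertools import product

BASES = "ACGT"


def nxx_pams():
    """List every PAM of the form NXX, where each X is one of A, C, G or T."""
    return ["N" + first + second for first, second in product(BASES, repeat=2)]
```

The list contains 4 × 4 = 16 entries and includes the NGG, NGA, NAG and NGC PAMs discussed in this paragraph.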
[0287] B. Attenuated Cas9 Platforms
[0288] In one embodiment, the present invention contemplates an attenuated SpCas9 comprising a mutated PAM recognition sequence, wherein an SpCas9 has a reduced affinity for a specific target sequence (Cas9.sup.MT protein). Based on the structure of a SpCas9/sgRNA/target complex and conservation in phylogenetically neighboring Cas9 orthologs, two arginines involved in PAM recognition (R.sup.1333 and R.sup.1335) were identified as mutation targets (
[0289] The fusion of a pDBD to SpCas9 should increase nuclease precision if target cleavage is dependent on DNA recognition by the pDBD. To achieve this, DNA-binding affinity of SpCas9 was attenuated by independently mutating the key PAM recognition residues (Arg1333 and Arg1335).sup.7 to either Lysine or Serine.
[0290] R1333K (SpCas9.sup.MT1) retained independent activity on a subset of target sequences, whereas R1333S (SpCas9.sup.MT2) and R1335K (SpCas9.sup.MT3) display only background activity, which could be restored to wild type levels in the presence of a ZFP fusion. To confirm that the ZFP-dependent restoration of activity is general, the nuclease activity of three additional SpCas9.sup.MT3-ZFP fusions was assessed, two of which restore nuclease function.
TABLE 1: Summary of SpCas9.sup.MT3-pDBD nuclease activities (T7EI)
SEQ ID NO: | pDBD Name | Type | Target Sequence | sgRNA | Activity (% Lesion)
4 | ZFP.sup.TS2 | 4 Finger ZFP | GCGGGCAGGGGC | TS2 | 36.64
5 | ZFP.sup.TS2 | 4 Finger ZFP | GCAGGGGCCGGA | TS2 | 23.04
6 | ZFP.sup.TS3 | 4 Finger ZFP | GGCGTTGGAGCG | TS3 | 26.75
7 | ZFP.sup.TS4 | 4 Finger ZFP | CCGGTTGATGTG | TS4 | 12.86
8 | Zif268 | 3 Finger ZFP | GCGTGGGCG | PLXNB2 | 25.81
9 | ZFP | 4 Finger ZFP | GAAACGGGATCG | DNAJC6 | 9.32
10 | ZFP.sup.FactorIX | 5 Finger ZFP | ACACAGTACCTGGCA | PLXDC2 | 9.90
11 | ZFP.sup.HEBP2 | 4 Finger ZFP | GAAAAGTATCAA | GPRC5B | N.D
12 | TAL268 | 8.5 Module TALE | TGCGTGGGCG | PLXNB2 | N.D
13 | TALE.sup.TS3-S | 9.5 Module TALE | TTGGAGCGGGG | TS3 | 8.00
14 | TALE.sup.TS3-L | 15.5 Module TALE | TTGGAGCGGGGAGAAGG | TS3 | 16.26
15 | TALE.sup.TS4-S | 9.5 Module TALE | TCAACCGGTGG | TS4 | 2.01**
16 | TALE.sup.TS4-L | 15.5 Module TALE | TCAACCGGTGGCGCATT | TS4 | 1.82**
N.D: Not Detected
**: Not above background independent activity for SpCas9.sup.MT3
Thus, altering an affinity of Cas9 PAM recognition domains through mutation generates SpCas9 variants that are dependent on an attached pDBD for efficient function. This dependence on an attached pDBD establishes a third stage of target site licensing for the presently disclosed SpCas9.sup.MT3-pDBDs, which are observed to increase their precision.
[0291] To evaluate precision of an SpCas9.sup.MT-DBD fusion protein, validated SpCas9 target sites were tested (e.g., TS2, TS3 & TS4; all with NGG PAMs). Fu et al., Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature Biotechnology. 2014 March; 32(3): 279-84; Fu, Y., Foden, J. A., Khayter, C., Maeder, M. L., Reyon, D., Joung, J. K., & Sander, J. D. (2013). High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nature Biotechnology, 31(9), 822-826. doi: 10.1038/nbt.2623. SgRNAs that recognize these sites have well-defined on- and off-target activities, and thus provide a benchmark to rapidly assess improvements in precision by evaluating activity at high-efficiency off-target sites.
[0292] A ZFP DBD (i.e., for example, ZFP.sup.TS3) was designed to recognize a sequence near a TS3 target site (
[0293] Sequences of a number of the Cas9-DTU fusions used in these preliminary studies are presented in
[0294] C. NmCas9 Gene Editing Platform
[0295] Cas9 is believed to be a Type II CRISPR/Cas system and may be further subdivided into three subtypes: i) II-A (including the 1368-aa SpCas9); ii) II-B; and iii) II-C. Barrangou et al., CRISPR-Cas systems: Prokaryotes upgrade to adaptive immunity. Molecular Cell. 2014 Apr. 24; 54(2): 234-44. Type II-C Cas9s are believed to be compact and more prevalent than the other two subtypes; (e.g., for example, ˜55% II-C; ˜38% (II-A); ˜7% (II-B)). Further, Type II-C Cas9s may serve to extend the potential targeting specificity via their range of PAM recognition requirements. Fonfara et al., Phylogeny of Cas9 determines functional exchangeability of dual-RNA and Cas9 among orthologous type II CRISPR-Cas systems. Nucleic Acids Research. 2014 Feb. 1; 42(4): 2577-90. The shorter length of some Type II-C Cas9s (as small as ˜970-1100 aa) may facilitate delivery, as viral payload limitations make the larger SpCas9 suboptimal for some clinical applications (e.g., adeno-associated viruses). Daya et al., Gene therapy using adeno-associated virus vectors. Clin. Microbiol. Rev. 2008 October; 21(4): 583-93.
[0296] An in-depth analysis of a Neisseria meningitidis Type II-C system (NmCas9), including a definition of its apparent PAM (5′-NNNNGATT-3′) (SEQ ID NO: 1), has been reported. Zhang et al., Processing-independent CRISPR RNAs limit natural transformation in Neisseria meningitidis. Molecular Cell. 2013 May 23; 50(4): 488-503. Further, a 1082-aa NmCas9 has been validated as an efficient genome-editing platform in human cells. Hou et al., Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proceedings of the National Academy of Sciences. 2013 Sep. 24; 110(39): 15644-9; and Esvelt et al., Orthogonal Cas9 proteins for RNA guided gene regulation and editing. Nature Methods. 2013 November; 10(11): 1116-21. The structure of a different Type II-C Cas9 from Actinomyces naeslundii (AnCas9) may be known, revealing a distinct arrangement of peripheral domains (in comparison with SpCas9) around a similarly structured nuclease core, though AnCas9's PAM specificity and genome editing efficacy have not been reported. Jinek et al., Structures of Cas9 endonucleases reveal RNA-mediated conformational activation. Science. 2014 Mar. 14; 343(6176): 1247997.
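As an illustration of how candidate NmCas9 PAMs of the form 5′-NNNNGATT-3′ might be located in a sequence, the following Python sketch converts a PAM pattern into a regular expression and reports every match, including overlapping ones. It is only a sketch: the function names are hypothetical, only the strand supplied is scanned, and no protospacer is extracted because the guide length is not specified in this passage. The IUPAC code H (A, C or T) is included so that a relaxed pattern such as NNNNGHTT can also be searched.

```python
import re

# Subset of IUPAC nucleotide codes needed for the PAM patterns in this example.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "H": "[ACT]"}


def pam_regex(pam: str) -> "re.Pattern[str]":
    """Build a regex for a PAM pattern; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[symbol] for symbol in pam.upper())
    return re.compile("(?=(" + body + "))")


def find_pams(seq: str, pam: str = "NNNNGATT"):
    """Return (start index, matched PAM) for every PAM occurrence in seq."""
    return [(m.start(), m.group(1)) for m in pam_regex(pam).finditer(seq.upper())]
```

For example, `find_pams("ACGTACGTGATTCC")` returns `[(4, "ACGTGATT")]`, the single position at which four arbitrary bases are followed by GATT.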
[0297] In mammalian cells, efficient editing by NmCas9 has been observed with NNNNG(A/C/T)TT PAMs (SEQ ID NO: 17). Hou et al., Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proceedings of the National Academy of Sciences. 2013 Sep. 24; 110(39): 15644-9; and Esvelt et al., Orthogonal Cas9 proteins for RNA guided gene regulation and editing. Nature Methods. 2013 November; 10(11): 1116-21. The ability of a pDBD fusion to extend the range of targetable PAMs has been examined for NmCas9, as previously shown with SpCas9. On genomic target sites, with a ZFP (Zif268) fused to the N-terminus or the C-terminus and a Zif268 binding site downstream of the PAM, an extension of the range of targetable sequences is observed. These data demonstrate that while wild-type NmCas9 is inactive at these genomic loci, the Zif268 fusion permits robust cleavage (
[0298] Although the molecular structure of NmCas9 is not known, we have utilized sequence homology with other Type II-C Cas9s from related species to identify residues that are likely involved in PAM recognition or DNA phosphodiester backbone contacts (e.g. K1013 and R1025;
[0299] Although it may be not necessary to understand the mechanism of an invention, it is believed that the above improvements in activity and precision realized by a fusion of a DBD to SpCas9 and NmCas9 and the corresponding attenuating mutations are broadly applicable to other Cas9s. Common design principles between Type II-A and Type II-C Cas9-DBD fusions that achieve excellent precision and improvements in activity demonstrate the applicability of the present invention to all Cas9 platforms and all specific genomic targets. These design principles may be applicable to other CRISPR-based single protein nuclease effector systems (e.g. Type V Cpf1).
TABLE 2: Wild-type, attenuated and ZFP-fused NmCas9 editing efficiency at various genomic target sites
EDITING EFFICIENCY
SITE NAME | Cas9 | Cas9-R1025A | Cas9-K1013A/R1025A | Zif268-Cas9 | Zif268-R1025A | Zif268-K1013A/R1025A
N-TS3(GATT-5bp-W) | 25, 39, 33, 20 | 0 | 0 | 33, 34 | 30, 40 | 23, 16
N-TS5(GATT-5bp-C) | 11, 9, 20 | 0 | 0 | 42 | 0 | 0
N-TS7(GATT-9bp-C) | 14, 24, 33 | 0 | 0 | 21 | 20 | 17
N-TS8(GATT-9bp-W) | 10, 19, 34 | 9 | 1 | 16 | 21 | 19
N-TS9(GATT-11bp-C) | 20, 27
N-TS10(GATT-12bp-W) | 0 | 0 | 0 | 31 | 0 | 0
N-TS11(GATT-14bp-W) | 24, 13, 13 | 0 | 0 | 32 | 21 | 0
N-TS20(GATT-5bp-W) | 0 | 0 | 0 | 19, 18, 12, 15, 13 | 0 | 0
N-TS21(GTCT-5bp-W) | 8, 13, 3, 4, 8 | 0 | 0 | 23, 23, 11, 14, 23 | 0 | 0
N-TS22(GCTT-5bp-W) | 0 | 0 | 0 | 18, 16, 12, 10, 17 | 0 | 0
N-TS24(GACA-5bp-W) | 0 | 0 | 0 | 18, 14, 43, 19, 16 | 0 | 0
N-TS25(GACA-5bp-W) | 22, 32, 28, 26 | 0 | 0 | 26, 30, 33, 31, 26 | 25, 25, 35, 33, 24 | 6, 8, 23, 20, 7
[0300] D. Broadened Range of Cas9 Specific Target Sequences
[0301] In one embodiment, the present invention contemplates a method comprising differentially controlling functional recognition of a target site and subsequent cleavage by sequence elements within a Cas9 protein. One of the current limitations of Cas9 may be that, although target site recognition sequence can be programmed with a sgRNA, the ability to bind and cleave the target site sequence may be also dictated by a Cas9 PAM recognition sequence. In some Cas9 isoforms, a PAM sequence of NGG may be highly preferred both for binding and for cleavage. Hsu et al., DNA targeting specificity of RNA-guided Cas9 nucleases. Nature Biotechnology 31, 827-832 (2013); Wu et al., Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Nature Biotechnology (2014); and Kuscu et al., Genome-wide analysis reveals characteristics of off-target sites bound by the Cas9 endonuclease. Nature biotechnology (2014). Lower cleavage activity was observed for NAG PAMs, whereas other PAMs have no activity.
[0302] The data presented herein show the activity of SpCas9 or SpCas9-Zif268 with a common sgRNA on target sites that have each of the 16 different PAM sequences with a flanking Zif268 site 5 base pairs away. Remarkably, a SpCas9-Zif268 construct is highly active at multiple PAMs (i.e., for example, NGG, NAG, NGC and NGA) with a common sgRNA recognition sequence; equivalent activity at non-NGG PAMs has not been previously described.
[0303] Conventional SpCas9 sgRNAs (e.g., for example, TS2, TS3 & TS4; all NGG PAMs) are known to have well-defined off-target sites. Fu et al., Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature Biotechnology. 2014 March; 32(3): 279-84; Fu, Y., Foden, J. A., Khayter, C., Maeder, M. L., Reyon, D., Joung, J. K., & Sander, J. D. (2013). High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nature Biotechnology, 31(9), 822-826. doi: 10.1038/nbt.2623. On- and off-target cleavage efficiencies at these sites may be evaluated for SpCas9-DBD constructs, where an attached DBD recognizes a sequence near each target site. Further, improved linkers may be combined with improved SpCas9 PAM recognition domain mutants to construct a Cas9 fusion protein most likely to eliminate off-target activity at the previously identified sequences.
[0304] Initial assessment of SpCas9-DBD precision may be done via T7EI assays on PCR amplicons from target and predicted off-target sites. For promising constructs, deep-sequencing of these amplicons will be used to quantify lesion rates at each site. Gupta et al., Zinc finger protein-dependent and -independent contributions to the in vivo off-target activity of zinc finger nucleases. Nucleic Acids Research. 2011 Jan. 1; 39(1): 381-92. To assess nuclease activity at sites throughout a genome, GUIDE-seq analysis can be performed (Tsai, S. Q. et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nature biotechnology 33, 187-197 (2015)). Regions exhibiting significant GUIDE-seq oligonucleotide incorporation may be characterized for off-target cleavage rates in the nuclease-treated cells using the same PCR-based deep sequencing approach described above. Given preliminary results, it is anticipated that Cas9.sup.mut-DBD will show vastly improved precision and superior activity as compared to Cas9.
[0305] TALE or ZFP binding site length may also be varied to provide optimal binding precision. For example, binding site size and affinity of TALEs or ZFPs can be tuned by changing the number of recognition modules that are incorporated into the Cas9 fusion protein (
[0306] E. ZFP Or TALE Cas9 Fusion Proteins
[0307] In one embodiment, the present invention contemplates a method comprising binding a Cas9 fusion protein comprising a ZFP or TALE to a non-standard PAM target site. In one embodiment, a non-standard PAM target site comprises a NAG PAM sequence. Although it may be not necessary to understand the mechanism of an invention, it is believed that a NAG PAM sequence may be weakly cleaved by the standard SpCas9 (e.g., a sub-optimal PAM sequence).
[0308] The data presented herein examine spacing and orientation requirements between a DBD target site and a neighboring PAM sequence. For this analysis, a TALE protein was generated that recognized a Zif268 binding site (TAL268). This provided the advantage that the same reporter system could be used to examine the activity of SpCas9, SpCas9-Zif268 and SpCas9-TAL268. The data show that a standard SpCas9/sgRNA may be only functional with an NGG PAM (yellow bar), but not on an NAG PAM (blue bars). However, SpCas9-Zif268 (red bars) may be active at an NAG PAM on all spacings and orientations of its binding site. A similar broadening of targeting range is observed with ZFP fusions to NmCas9 (Table 2, above). SpCas9-TAL268 (green bars) has a much more restricted spacing and orientation for favorable activity.
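To make the notions of spacing and orientation concrete, the following Python sketch reports, for a given PAM end position, every occurrence of a DBD binding site within a fixed window, together with the gap in base pairs and the strand on which the site reads 5′ to 3′. This is an illustrative sketch only: the function names, the `max_gap` default, the "top"/"bottom" strand labels and the restriction to sites located after the PAM on the supplied strand are assumptions of the example, and the default site is the Zif268 sequence 5′-GCGTGGGCG-3′ given earlier in this description.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def dbd_sites_near_pam(seq, pam_end, site="GCGTGGGCG", max_gap=20):
    """Find DBD sites that begin 0..max_gap bp after the PAM.

    pam_end is the index of the first base after the PAM. Each hit is
    returned as (gap in bp, strand), where strand is "top" if the site reads
    5' to 3' on the supplied strand and "bottom" if it reads on the
    complementary strand.
    """
    seq = seq.upper()
    hits = []
    for strand, motif in (("top", site.upper()), ("bottom", revcomp(site))):
        for gap in range(max_gap + 1):
            start = pam_end + gap
            if seq[start:start + len(motif)] == motif:
                hits.append((gap, strand))
    return sorted(hits)
```

A reporter with the Zif268 site placed 5 bp after the PAM on the supplied strand would be reported as `(5, "top")`; the same site in the opposite orientation would be reported as `(5, "bottom")`.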
[0309] F. Cas9-Cas9 Fusion Proteins
[0310] In one embodiment, the present invention contemplates a method comprising binding of a Cas9-Cas9 fusion protein (dual Cas9 system) to a composite binding site. This could involve one Cas9 component serving as the nuclease and the other nuclease-dead Cas9 (dCas9) component serving as the targeting domain (analogous to the ZFP or TALE component of the Cas9-ZFP/TALE fusions;
[0311] A split-GFP reporter assay was employed to demonstrate that SpCas9.sup.MT3-NmdCas9 and NmdCas9-SpCas9.sup.MT3 can generate target cleavage with certain arrangements of target sites for NmdCas9 (nuclease-dead) and SpCas9.sup.MT3 (attenuated).
[0312] One of the advantages of the dual Cas9 system over the Cas9-pDBD system is the ability to utilize both nuclease domains to achieve coordinated cleavage at two neighboring positions within the genome. For example, attenuated SpCas9 can be coupled to NmCas9 that is either a nickase or a double-strand nuclease to allow the formation of a single-strand nick neighboring a break or two double-strand breaks together. If a NmCas9 nickase is utilized, the strand that is cleaved can be controlled by the nuclease domain (either HNH or RuvC) that is inactivated. This can in principle be utilized to create extended 5′ or 3′ overhangs neighboring the blunt double-strand break that is generated by attenuated SpCas9, which are likely to have improved properties for certain types of DNA repair (alternate non-homologous end joining or homology directed repair from an exogenous template). These combinations of dual nuclease-nickase or dual nucleases are functional, and in the case of the dual nucleases provide clear deletions of the intervening sequence (
[0313] G. Drug-Dependent Cas9 pDBD Systems
[0314] In one embodiment, the present invention contemplates a method comprising binding of a drug-dependent nuclease system comprising the attenuated Cas9 and the pDBD (or an alternate DTU such as a different Cas9 isoform), where the temporal activity of the nuclease can be controlled by the presence of a small molecule. Small molecule- (Yoshimi K, et al. Nature Communications. 2014; 5:4240; Spencer D M, et al. Science. 1993 Nov. 12; 262(5136): 1019-24; Hathaway N A, et al. Cell. Elsevier Inc.; 2012 Jun. 22; 149(7): 1447-60; Liang F S, et al. Science Signaling. 2011; 4(164): rs2-rs2) or light-dependent (Konermann S, et al. Nature. 2013 Aug. 22; 500(7463): 472-6) dimerization systems have been developed that permit the control of activity of a two-component system. Since SpCas9/sgRNA off-target activity is dose dependent, these systems have been adapted to regulate the association of two fragments of Cas9 (Split-Cas9; Nihongaki Y, et al. Nature biotechnology. 2015 July; 33(7): 755-60; Wright A V, et al. Proceedings of the National Academy of Sciences. 2015 Mar. 10; 112(10): 2984-9; Zetsche B, et al. Nature biotechnology. 2015 February; 33(2): 139-42; Davis K M, et al. Nat Chem Biol. 2015 May; 11(5): 316-8). However, this framework may not be ideal, as drug-dependent Split-SpCas9 displays reduced target activity and retains modest off-target activity (Zetsche B, et al. Nature biotechnology. 2015 February; 33(2): 139-42). SpCas9-pDBD systems are amenable to the incorporation of a drug- or light-dependent dimerization system that regulates the association of SpCas9 and the pDBD by replacing the covalent linker with a conditional dimerization system (drug or light dependent) (
[0315] Activity and drug-responsiveness of this system have been improved through a number of additional modifications. To increase the turnover of the pDBD in the absence of drug, which can potentially compete with SpCas9-FKBP/FRB-ZFP complexes if in excess, a destabilized FRB domain has been incorporated (i.e., for example, a PLF triple mutant-FRB*; Stankunas K, et al. Chembiochem. 2007 Jul. 9; 8(10): 1162-9) on the pDBD component. The cellular localization sequences on Cas9 and the pDBD have also been improved. An absence of a nuclear import (NLS) or export (NES) sequence on Cas9 was found to provide the lowest background levels of cleavage while providing the largest drug-dependent activity. For the pDBD, the presence of a combination of 2×NLS and 2×NES, which is believed to cause constant cycling between the nucleus and cytoplasm, resulted in improved activity (
[0316] Regulated nuclease activity can be obtained by breaking the Cas9 protein into two independent components (e.g., termed herein “split-Cas9”), where assembly can be controlled. Switching into an active state can be driven through the delivery of a small molecule (Zetsche B, et al. Nature biotechnology. 2015 February; 33(2): 139-42; Davis K M, et al. Nat Chem Biol. 2015 May; 11(5): 316-8) or light of a suitable wavelength (Nihongaki Y, et al. Nature biotechnology. 2015 July; 33(7): 755-60). Most of these platforms display lower activity at the target site and off-target sites when compared with standard Cas9. Fusion of a pDBD to one of the Split-SpCas9 components has been demonstrated to dramatically increase its activity at alternate PAM sequences (e.g. NAG,
[0317] Activity and drug-responsiveness of the Split-Cas9-ZFP system have also been improved through a number of additional modifications, for example, to the cellular localization sequences on the N-terminal and C-terminal components of Split-Cas9. Inclusion or absence of a nuclear import (NLS) or export (NES) sequence on these segments was found to influence the background and drug-dependent cleavage rates of these constructs (
[0318] To generate a more precise system, MT3 attenuating mutations were introduced into the split-SpCas9 system. Using this system tethered to a ZFP that recognizes a neighboring sequence within the TS2 genomic region (split-SpCas9.sup.MT3-ZFP.sup.TS2 and a TS3 sgRNA), drug-dependent cleavage of the TS2 target site was achieved. To demonstrate the improvements in precision achieved through drug-dependent systems, GUIDE-seq was employed (Tsai, S. Q. et al. Nature biotechnology 33, 187-197 (2015)). For this analysis, the precision of wild-type Cas9 was compared to a drug-dependent Split-Cas9 system (Zetsche B, et al. Nature biotechnology. 2015 February; 33(2): 139-42); a drug-dependent SpCas9-FKBP/ZFP-FRB* and a drug-dependent split-SpCas9.sup.MT3-ZFP.sup.TS2 through Illumina sequencing of genomic regions that have incorporated GUIDE-seq oligonucleotides. The number of reads that are associated with a locus is indicative of the nuclease cleavage activity. When the nuclease activity of these constructs is assayed with a sgRNA (and ZFP) programmed to recognize the TS2 locus, all of these constructs have high activity at the TS2 target site (
[0319] H. Increased sgRNA Activity
[0320] Truncated sgRNAs (i.e. less than 20 bases of complementarity) have been utilized to increase precision of Cas9/sgRNA complexes by reducing the degree of potential complementarity with off-target sequences. Fu et al., Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature Biotechnology (2014). Cleavage activity of truncated sgRNA was compared between SpCas9 and SpCas9-Zif268. The data demonstrate that SpCas9-Zif268 displays a higher cleavage activity than SpCas9 where both comprise an identical sgRNA, whether the sgRNA may be a full length sequence or a truncated sequence.
[0321] I. Cas9 PAM Recognition Sequence Mutations
[0322] The PAM interaction domain (PI) has been defined based on structural information on the Cas9/sgRNA/target complex and domain substitution studies. Jinek et al., Structures of Cas9 Endonucleases Reveal RNA-Mediated Conformational Activation. Science (2014); Nishimasu et al., Crystal Structure of Cas9 in Complex with Guide RNA and Target DNA. Cell 1-23 (2014); PMID 25079318. Based on the reported crystal structures, there is evidence for conservation of residues within the PI domain between Cas9 isoforms from different species that share a common PAM recognition sequence. Fonfara et al., Phylogeny of Cas9 determines functional exchangeability of dual-RNA and Cas9 among orthologous type II CRISPR-Cas systems. Nucleic Acids Research 42, 2577-2590 (2014).
[0323] In some embodiments, the present invention contemplates a SpCas9 protein comprising two arginine residues at positions 1333 and 1335 (i.e., an RKR motif) that may be a NGG PAM recognition domain. In one embodiment, the present invention contemplates a mutated Cas9 protein (Cas9.sup.MT #) comprising an .sup.1333R.fwdarw..sup.1333K mutation or an .sup.1335R.fwdarw..sup.1335S mutation. The activity of a Cas9.sup.MT # or a Cas9.sup.MT #-Zif268 was tested using a target site that contains NGG, NAG or NCG PAMs with a neighboring Zif268 site. The data show that, whereas Cas9.sup.MT # may be inactivated by a single mutation, these mutations only modestly affect Cas9.sup.MT #-Zif268 activity, with the exception of the .sup.1335R.fwdarw..sup.1335S mutation (#4) where activity may be abrogated. The .sup.1333R.fwdarw..sup.1333K mutant (#1) displays similar activity to the wild type (WT) Cas9-Zif268 fusion.
[0324] As disclosed herein, a SpCas9-DBD fusion protein displays an improved activity and precision, especially when combined with a mutated PAM recognition sequence that attenuates intrinsic DNA binding affinity. While the presently disclosed mutations weaken native cleavage activity, it may be likely that further attenuation of the DNA-binding affinity of SpCas9 may increase absolute DBD dependence. For example, mutagenizing at least two regions of Cas9 may be expected to reduce its intrinsic activity: 1) the PAM recognition residues, and 2) apparent phosphate-contacting residues near the PAM binding site.
[0325] In one embodiment, the present invention contemplates mutations to the PAM recognition residues comprising arginines (e.g., R.sup.1333 & R.sup.1335) that participate in base-specific binding. In one embodiment, the mutation may be a substitution. In one embodiment, the mutation may be a combination mutation (e.g., a R1333K and a R1335K). In one embodiment, the mutations that abrogate the independent binding of the Cas9 nuclease to its target site are in phosphodiester backbone contacting residues that reduce the affinity of Cas9 for the DNA. A GFP reporter assay may be used with the array of 16 PAM target sites to monitor nuclease activity of each mutant with and without the DBD.
[0326] In one embodiment, the present invention contemplates mutations to Cas9 comprising arginine or lysine residues that participate in DNA phosphate binding. Neutralization of phosphate contacts within DBDs may be a demonstrated method to modulate their binding affinities. Khalil et al., A synthetic biology framework for programming eukaryotic transcription functions. Cell. 2012 Aug. 3; 150(3): 647-58. Lysines are present that are well-positioned to make non-specific contacts with the DNA downstream of the PAM contacting residues. Anders et al., Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature. 2014 Sep. 25; 513(7519): 569-73; and
[0327] Mutations may be identified in a PAM interaction domain and non-specific phosphate contacts that completely inactivate Cas9 activity independent of an attached DBD. Further characterization of promising constructs may be performed using PCR amplification of a genomic target and deep sequencing to quantify SpCas9.sup.MT activity with and without the DBD. PAM recognition domains serve not only as an initial DNA-binding toehold for Cas9, but the binding energy may be also used to provide local DNA unwinding in preparation for (or coupled to) R-loop nucleation, and perhaps allosteric nuclease activation. Thus, accumulated mutations could compromise DNA unwinding and activation so much that a defect cannot be overcome by an appended DBD.
[0328] In one embodiment, the present invention contemplates an SpCas9.sup.MT #-DBD fusion protein comprising a truncated sgRNA (tru-gRNA). Although it is not necessary to understand the mechanism of an invention, it is believed that tru-gRNAs (i.e., for example, TS1, TS2, TS3 & TS4) reduce, but do not eliminate, off-target activity. A tru-gRNA (TS1) tested in a GFP reporter assay was found to display similar, or even slightly improved, on-target activity when used with a Cas9-DBD fusion protein relative to Cas9 alone.
[0329] In one embodiment, the present invention contemplates an SpCas9 comprising refined PAM specificity wherein genome editing may be improved. In one embodiment, the present invention contemplates a plurality of SpCas9.sup.MT variants that can target essentially any sequence within the genome with maximal precision, and that may be capable of allele-specific targeting. Selection strategies that generate SpCas9 variants having altered PAM specificity (SpCas9-PAM.sup.MT) have been discussed herein in the context of an altered SpCas9-DBD fusion protein. The precision of these SpCas9.sup.MT-DBD variants may be characterized within a genome and tested for allele-specific targeting, using PAM SNPs as discriminators.
[0330] For example, SpCas9-PAM specificity may be refined through mutagenesis of PAM recognition residues. A GFP reporter assay testing PAM recognition mutants demonstrated attenuation of intrinsic nuclease activity (e.g., R.sup.1333 & R.sup.1335).
[0331] Using a B2H system, libraries of sufficient complexity (˜10.sup.8) may be searched to cover all possible amino acid combinations for possible PAM recognition mutants (
[0332] A negative selection protocol may be used to identify functional nucleases at alternate PAMs. For example, a bacterial 5-FOA/URA3 counter-selection system was reported that may be suitable for the identification of Cas9-DBD variants with mutated PAM sequences. Meng et al., Counter-selectable marker for bacterial-based interaction trap systems. Biotechniques. 2006 February; 40(2): 179-84. For example, a low-copy, IPTG-inducible URA3 plasmid (pSC101 origin, kanR-marked) containing a Cas9-DBD target site may be introduced with a mutated PAM sequence into a uracil auxotroph strain (ΔpyrF). Meng et al., A bacterial one-hybrid system for determining the DNA-binding specificity of transcription factors. Nature Biotechnology. 2005 August; 23(8): 988-94; and Lutz et al., Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Research. 1997 Mar. 15; 25(6): 1203-10. After transformant selection (KanR), these cells may be electroporated with a Cas9-DBD/sgRNA plasmid library (marked with ampR), and plated on YM media with ampicillin, IPTG (to induce URA3) and 5-FOA. Functional Cas9-DBD variants can cleave and eliminate the URA3 plasmid, permitting survival; cells with nonfunctional Cas9-DBDs retain the plasmid and die via 5-FOA counter-selection. Surviving clones may be pooled and deep-sequenced to identify a consensus at the randomized positions. Chu et al., Exploring the DNA-recognition potential of homeodomains. Genome Research. 2012 October; 22(10): 1889-98. The specificity of individual SpCas9.sup.MT clones similar to the consensus sequence for each PAM selection can then be evaluated as described above using, for example, a B2H selection approach.
[0333] Alternatively, a library depletion strategy may be employed that may be analogous to RNAi-based strategies in mammalian cells to identify essential genes in a particular pathway. Murugaesu et al., High-throughput RNA interference screening using pooled shRNA libraries and next generation sequencing. Genome Biol. 2011; 12(10): R104; Moffat et al., A lentiviral RNAi library for human and mouse genes applied to an arrayed viral high-content screen. Cell. 2006 Mar. 24; 124(6): 1283-98; and Root et al., Genome-scale loss-of-function screening with a lentiviral RNAi library. Nature Methods. 2006 September; 3(9): 715-9. In these screens, shRNA clones that target essential genes in a pathway of interest are depleted from the initial library because they are lost from the population.
[0334] Deep sequencing may be used to compare the distribution of clones in the initial library and in the survivors to identify shRNAs that are lost, which are then retested individually to assess their activity. Similarly, a depletion strategy may be used to identify barcoded clones of the above library that are active in bacteria at a desired PAM site within a kanR-marked plasmid. Based on a protocol for RNAi-based screens, approximately 1000-fold oversampling of a library may be needed to observe reliable depletion of active Cas9-DBD clones. Thus, a smaller library (˜10.sup.5 clones) may be used to retain sufficient depth in a lane of HiSeq2000 sequencing (˜2×10.sup.8 reads/lane) to effectively employ this approach. Clones may be recovered that define a primary consensus sequence useful for bootstrapping through a second library construction (with fixed residues at positions of consensus from clones recovered from the first selection) and a deeper search of neighboring sequence space to identify the most active sequences. The specificity of each of these selected SpCas9-PAM.sup.MT clones may then be evaluated using a B2H selection technique as described above.
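The sequencing-depth reasoning in the preceding paragraph can be checked with a short calculation. The following is a minimal sketch only; the function names are hypothetical, and the inputs are the approximate figures stated in the text (˜10.sup.5 clones, ˜2×10.sup.8 reads per lane, ˜1000-fold oversampling).

```python
def fold_coverage(reads_per_lane: float, library_size: float) -> float:
    """Average number of sequencing reads available per library clone."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return reads_per_lane / library_size


def max_library_size(reads_per_lane: float, required_fold: float) -> float:
    """Largest library that still reaches the required oversampling in one lane."""
    if required_fold <= 0:
        raise ValueError("required_fold must be positive")
    return reads_per_lane / required_fold


# Approximate values taken from the text; real yields vary from run to run.
READS_PER_LANE = 2e8
LIBRARY_SIZE = 1e5
REQUIRED_FOLD = 1000

coverage = fold_coverage(READS_PER_LANE, LIBRARY_SIZE)
largest = max_library_size(READS_PER_LANE, REQUIRED_FOLD)

print(f"Coverage per clone: {coverage:.0f}-fold")
print(f"Largest library at {REQUIRED_FOLD}-fold oversampling: {largest:.0f} clones")
```

Under these assumptions a library of about 10.sup.5 clones is sampled roughly 2000-fold in a single lane, which exceeds the approximately 1000-fold oversampling described above, whereas a library of about 10.sup.8 clones would not.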
[0335] In one embodiment, the present invention contemplates a method for determining precision of SpCas9.sup.MT clones using a genome-wide survey. For example, precision of an SpCas9.sup.MT clone at a specific genomic target site and predicted off-target genomic sites can be determined by comparing new target sites for each SpCas9.sup.MT clone that have an appropriate PAM sequence (i.e., for example, a specific non-NGG PAM). An appropriate DBD can be constructed to target each sequence to create an SpCas9.sup.MT-DBD fusion protein. The most favorable off-target sites can then be predicted for these sgRNAs using, for example, a CRISPRseek algorithm. Zhu et al., CRISPRseek: A Bioconductor Package to Identify Target-Specific Guide RNAs for CRISPR-Cas9 Genome-Editing Systems. PLoS ONE. 2014; 9(9): e108424. In addition, GUIDE-seq analysis can be performed (Tsai, S. Q. et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nature biotechnology 33, 187-197 (2015)). Regions exhibiting significant GUIDE-seq oligonucleotide incorporation may be characterized for off-target cleavage rates in the nuclease-treated cells using PCR-based deep sequencing. Gupta et al., Zinc finger protein-dependent and -independent contributions to the in vivo off-target activity of zinc finger nucleases. Nucleic Acids Research. 2011 Jan. 1; 39(1): 381-92.
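As a rough illustration of how candidate off-target sites might be enumerated before experimental validation, the sketch below scans one strand of a sequence for sites that resemble a guide and are followed by an allowed PAM. It is not the CRISPRseek algorithm and is far simpler than any genome-wide tool: it ignores the reverse strand, bulges and position-dependent mismatch weighting. The function and variable names are hypothetical, the guide is the TS3 sequence from Table 3, and the scanned sequence is a made-up toy example.

```python
from typing import List, NamedTuple, Sequence

# Minimal IUPAC table covering the PAM patterns used here (N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}


class Hit(NamedTuple):
    start: int        # 0-based index of the protospacer in the scanned sequence
    protospacer: str
    pam: str
    mismatches: int


def pam_matches(pam: str, pattern: str) -> bool:
    """True if the observed PAM fits an IUPAC pattern such as 'NGG'."""
    return len(pam) == len(pattern) and all(
        base in IUPAC[code] for base, code in zip(pam, pattern)
    )


def scan_forward_strand(sequence: str, guide: str,
                        pam_patterns: Sequence[str] = ("NGG",),
                        max_mismatches: int = 3) -> List[Hit]:
    """List forward-strand sites with an allowed 3' PAM and few mismatches.

    All PAM patterns are assumed to have the same length.
    """
    sequence, guide = sequence.upper(), guide.upper()
    pam_len = len(pam_patterns[0])
    hits = []
    for start in range(len(sequence) - len(guide) - pam_len + 1):
        protospacer = sequence[start:start + len(guide)]
        pam = sequence[start + len(guide):start + len(guide) + pam_len]
        if not any(pam_matches(pam, p) for p in pam_patterns):
            continue
        mismatches = sum(a != b for a, b in zip(guide, protospacer))
        if mismatches <= max_mismatches:
            hits.append(Hit(start, protospacer, pam, mismatches))
    return hits


TS3_GUIDE = "GGTGAGTGAGTGTGTGCGTG"
# Toy sequence (invented): a perfect site with a TGG PAM, then a site with two
# mismatches followed by a CAG PAM.
TOY_SEQUENCE = ("ACAC" + TS3_GUIDE + "TGG" + "ACAC"
                + "GGTGAGTGAGTGTGTGTATG" + "CAG" + "AC")

for hit in scan_forward_strand(TOY_SEQUENCE, TS3_GUIDE, ("NGG", "NAG"), 2):
    print(hit)
```

Passing a different tuple of PAM patterns is one way to reflect the altered PAM preference of a given SpCas9.sup.MT clone; sites returned by such a scan would still need to be confirmed experimentally, for example by GUIDE-seq or targeted deep sequencing as described above.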
[0336] In one embodiment, the present invention contemplates a SpCas9.sup.MT-DBD fusion protein comprising mutated PAM sequences comprising unexpected and superior specific genomic target binding precision. Although it is not necessary to understand the mechanism of an invention, it is believed that a Cas9.sup.MT-DBD fusion protein allows a precise cleavage of nearly any sequence within the genome and can provide allele-specific targeting through the use of SNPs that distinguish between alleles. For example, the inactivation of specific dominant-negative alleles could have great utility for gene therapy. In one embodiment, the method contemplates an SNP for siRNA-mediated silencing of Huntington's disease alleles that contain CAG repeat expansions. Pfister et al., Five siRNAs targeting three SNPs may provide therapy for three-quarters of Huntington's disease patients. Curr Biol. 2009 May 12; 19(9): 774-8. In principle, Cas9s with allele-specific activity could provide an alternate therapeutic strategy to disable specific harmful alleles in patients.
[0337] In one embodiment, the present invention contemplates a method of stringently discriminating between single alleles by targeting a particular heterozygous SNP within a PAM. The data presented herein demonstrates that Cas9 and various PAM recognition mutants already generated could utilize a Cas9-DBD fusion protein to edit single alleles that are distinguished by functional vs. non-functional PAMs.
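The idea of discriminating alleles by a heterozygous SNP within a PAM can be illustrated with a small check of whether exactly one of two alleles carries a PAM that matches a given pattern. This is a simplified sketch: it treats PAM recognition as all-or-nothing, whereas the activity of actual PAM recognition mutants is graded, and the function names and example triplets are hypothetical.

```python
# Minimal IUPAC table covering the PAM patterns used here (N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}


def matches_pam(site: str, pattern: str = "NGG") -> bool:
    """True if a candidate PAM sequence fits an IUPAC pattern such as 'NGG'."""
    site, pattern = site.upper(), pattern.upper()
    if len(site) != len(pattern):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, pattern))


def pam_snp_discriminates(allele_1: str, allele_2: str, pattern: str = "NGG") -> bool:
    """True if the SNP leaves a functional PAM on exactly one of the two alleles."""
    return matches_pam(allele_1, pattern) != matches_pam(allele_2, pattern)


# Hypothetical PAM triplets for two alleles at the same locus.
print(pam_snp_discriminates("TGG", "TGA"))  # SNP at a recognized G position
print(pam_snp_discriminates("TGG", "AGG"))  # SNP at the unconstrained N position
```

A SNP at a position the nuclease actually reads (here, either G of NGG) separates the alleles, while a SNP at the unconstrained N position does not; the same check could be run with the PAM pattern preferred by a particular PAM recognition mutant.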
[0338] A database may be used to define cell lines with SNPs that could be used to test the allele-specific discrimination of a Cas9-DBD fusion protein. Forbes et al., COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Research. 2011 January; 39(Database issue): D945-50. The Forbes et al. database contains sequences from >100 cell lines, each with a searchable table of validated SNPs (e.g., 26 heterozygous SNPs in U2OS cells). Potentially distinguishable SNPs and sequence candidate loci in cell lines can be identified from this database to confirm heterozygosity. For validated SNPs, SpCas9.sup.MT-DBD/sgRNA combinations may be designed to target a single allele. The allelic targeting ratios (relative to negative controls lacking the cognate sgRNA or the appended DBD) can be determined by deep-sequencing PCR amplicons from treated cells.
[0339] PAM mutations can also be defined that attenuate NmCas9 activity to achieve dependence on an attached DBD for nuclease activity (
[0340] A preferred PAM of NmCas9 (i.e., for example, NNNNGATT (SEQ ID NO: 1)), wherein a T may be well-tolerated in place of the A, may be suited for protein-DNA photo-crosslinking using a commercially available, photoactivatable crosslinker 5-iododeoxyuridine (5 IdU), which may be isosteric with T.sup.90. Each of the three individual T-to-5 IdU substitutions within the NNNNGTTT PAM (SEQ ID NO: 21) of an oligonucleotide duplex may be bound to a purified, nuclease-dead NmCas9 (i.e., for example, a D16A/H588A double mutant, already expressed and purified) in the presence of a complementary sgRNA.
[0341] A single T can also be substituted on an opposite strand of the same duplex that carries a NNNNGATT PAM (SEQ ID NO: 1). Photo-crosslinking efficiency for each radiolabeled, 5 IdU-substituted target duplex (following irradiation at 308 nm) can also be determined by SDS-PAGE. Wolfe et al., Unusual Rel-like architecture in the DNA-binding domain of the transcription factor NFATc. Nature. 1997 Jan. 9; 385(6612): 172-6; and Liu et al., Evidence for a non-alpha-helical DNA-binding motif in the Rel homology region. Proc Natl Acad Sci USA. 1994 Feb. 1; 91(3): 908-12. Mutant PAMs with inactivating mutations on the non-5 IdU-substituted strand can serve as specificity controls. Photo-crosslinking reactions for 5 IdU positions displaying efficient, specific crosslinking may be scaled up for mass-spectrometric analysis of trypsin- and S1 nuclease/phosphatase-digested peptide fragments.
[0342] DNA contact residues identified by photo-crosslinking, as well as nearby arginine, lysine and glutamine residues, may be mutated and activity of each NmCas9.sup.MT # relative to wild-type NmCas9 evaluated in a GFP reporter assay. Luscombe et al., Amino acid-base interactions: a three-dimensional analysis of protein-DNA interactions at an atomic level. Nucleic Acids Research. Oxford University Press; 2001 Jul. 1; 29(13): 2860-74. NmCas9.sup.MT # clones with attenuated activity may then be fused to DBDs to test for recovery of nuclease activity. PAM specificities can be evaluated in a GFP reporter assay, where initially all PAM variants can be evaluated that have three of the four bases in the NNNNGATT (SEQ ID NO: 1) consensus sequence preserved (e.g., 12 combinations).
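The count of 12 PAM variants follows from changing one of the four specified bases of the GATT consensus to each of the three other bases (4 positions × 3 alternatives). A minimal sketch that enumerates them is given below; the function name is hypothetical.

```python
def single_substitution_variants(consensus: str) -> list:
    """All sequences that differ from the consensus at exactly one position."""
    variants = []
    for index, reference_base in enumerate(consensus):
        for base in "ACGT":
            if base != reference_base:
                variants.append(consensus[:index] + base + consensus[index + 1:])
    return variants


# Specified portion of the NNNNGATT consensus PAM of NmCas9.
variants = single_substitution_variants("GATT")
pam_variants = ["NNNN" + variant for variant in variants]

print(len(pam_variants))
for pam in pam_variants:
    print(pam)
```

The resulting set includes NNNNGTTT and NNNNGAAT, the two alternate PAMs referred to elsewhere in this description.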
[0343] The above discussed identification of SpCas9 PAM recognition residues, R.sup.1333 & R.sup.1335, was made before any reported structure of these interactions. This discovery was facilitated both by available SpCas9 structural models and sequence alignments of closely related Cas9 orthologs, with the expectation that Cas9-DNA contacts are likely to be conserved. In protein-DNA complexes, guanine contacts (GATT PAM) and DNA phosphate contacts are likely to be mediated by either arginine or lysine residues. Luscombe et al., Amino acid-base interactions: a three-dimensional analysis of protein-DNA interactions at an atomic level. Nucleic Acids Research. Oxford University Press; 2001 Jul. 1; 29(13): 2860-74. Consequently, mutations of conserved NmCas9 arginine or lysine residues to an alanine are most likely to affect cleavage activity.
[0344] Based on the above data demonstrating attenuation of SpCas9.sup.MT # nuclease activity, it can be expected that, as a result of Cas9 PAM amino acid conservation, NmCas9.sup.MT would also demonstrate attenuated nuclease activity. The analysis of relevant residues may be aided by photo-crosslinking data, which should help to clarify DNA-proximal regions. Alterations in PAM specificity for these mutants can be evaluated in the GFP reporter assay. Genome editing activity of favorable NmCas9.sup.MT clones fused to DBDs programmed to bind neighboring sites can be evaluated on genomic targets in HEK293T cells. Differences in activity between each NmCas9.sup.MT versus NmCas9.sup.MT-DBD can be examined by T7EI assay. As with SpCas9, further characterization may be performed using PCR amplification of the genomic targets and deep sequencing to quantify editing frequencies at each target site with and without the DBD. Improvements in precision can also be further validated using the above described genome-wide analysis.
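T7EI gel results are commonly converted to an estimated modification frequency with the relation % modified = 100 × (1 − (1 − f).sup.1/2), where f is the fraction of PCR product that is cleaved. The present description does not state which quantification was used, so the sketch below is an assumption offered for illustration; the function name and band intensities are hypothetical.

```python
import math


def t7ei_percent_modified(uncut: float, cut_1: float, cut_2: float) -> float:
    """Estimate percent modified alleles from T7EI band intensities.

    uncut is the intensity of the full-length band; cut_1 and cut_2 are the
    intensities of the two cleavage products.
    """
    if min(uncut, cut_1, cut_2) < 0:
        raise ValueError("band intensities must be non-negative")
    total = uncut + cut_1 + cut_2
    if total <= 0:
        raise ValueError("at least one band must have non-zero intensity")
    fraction_cleaved = (cut_1 + cut_2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical band intensities for a nuclease alone and for its DBD fusion.
without_dbd = t7ei_percent_modified(uncut=96.0, cut_1=2.0, cut_2=2.0)
with_dbd = t7ei_percent_modified(uncut=64.0, cut_1=18.0, cut_2=18.0)
print(f"NmCas9.MT alone: {without_dbd:.1f}%  NmCas9.MT-DBD: {with_dbd:.1f}%")
```

Comparing the two estimates for the same target site gives a simple readout of how strongly activity depends on the attached DBD; deep sequencing remains the more quantitative follow-up described above.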
[0345] For example, a genome-wide assay may be used to define optimal NmCas9.sup.MT #-DBD fusion proteins for precise target cleavage in human cell lines. Precision of the most promising NmCas9.sup.MT #-DBD clones can be evaluated at target sites and predicted off-target sites within the genome. Appropriate DBDs can be created to facilitate targeting of each genomic sequence with an NmCas9.sup.MT #-DBD fusion protein. A set of the most favorable off-target sites can be predicted for these sgRNAs considering both the similarity of the sgRNA to genomic sequences and possible alternate PAMs that could be functional for each NmCas9.sup.MT # clone based on an evaluation in a GFP reporter assay and predictions developed using the CRISPRseek algorithm. Zhu et al., CRISPRseek: A Bioconductor Package to Identify Target-Specific Guide RNAs for CRISPR-Cas9 Genome-Editing Systems. PLoS ONE. 2014; 9(9): e108424. In addition, GUIDE-seq analysis can be performed (Tsai, S. Q. et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nature biotechnology 33, 187-197 (2015)). Regions exhibiting significant GUIDE-seq oligonucleotide incorporation may be characterized for off-target cleavage rates in the nuclease-treated cells using PCR-based deep sequencing.
[0346] J. Improved Cas9 Linkers
[0347] In one embodiment, the present invention contemplates a Cas9-DBD construct comprising a linker. In one embodiment, the linker comprises approximately sixty (60) amino acids. Although it is not necessary to understand the mechanism of an invention, it is believed that such a linker improves the precision of specific genomic target binding. It has been observed that if a DBD binding site is merely repositioned or reoriented relative to the specific genomic target, little improvement in precision results. These data indicate that linker flexibility reduces precision via off-target binding due to a large number of sgRNA/DBD binding site permutations that can potentially be cleaved. GFP reporters may be constructed containing alternate spacing and orientation of a DBD binding site relative to a Cas9 target site with a suboptimal NAG PAM. This configuration may also include finer intervals around the most active positions, as well as positions further removed from the Cas9 target site, to better define the distance dependence.
[0348] Fusion proteins such as SpCas9-Zif268 or SpCas9-TAL268 may contain a series of shorter linkers to define a minimal length that retains maximum activity at one (or more) binding site positions, but may place further restrictions on activity at other binding site positions. In one embodiment, the present invention contemplates a Cas9-TALE fusion protein or a Cas9-ZFP fusion protein comprising an optimized linker that can recognize virtually any target site. Cermak et al., Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Research. 2011 July; 39(12): e82-2; Lamb et al., Directed evolution of the TALE N-terminal domain for recognition of all 5′ bases. Nucleic Acids Research. 2013 November; 41(21): 9779-85; Kim et al., A library of TAL effector nucleases spanning the human genome. Nature Biotechnology. 2013 March; 31(3): 251-8; and Briggs et al., Iterative capped assembly: rapid and scalable synthesis of repeat-module DNA such as TAL effectors from individual monomers. Nucleic Acids Research. 2012 Jun. 26.
[0349] Using a GFP system, SpCas9-DBD fusion proteins may be constructed with short linkers (e.g., less than sixty amino acids) that display both high activity and more selectivity in the particular arrangement of the Cas9 and DBD binding sites. Although it is not necessary to understand the mechanism of an invention, it is believed that a maximum improvement in linker length and/or binding site position/orientation for a DBD relative to a Cas9 nuclease will differ between ZFPs and TALEs due to their respective structural folds and docking with the DNA. Mak et al., The crystal structure of TAL effector PthXo1 bound to its DNA target. Science. 2012 Feb. 10; 335(6069): 716-9; Deng et al., Structural basis for sequence-specific recognition of DNA by TAL effectors. Science. 2012 Feb. 10; 335(6069): 720-3; and Pavletich et al., Zinc finger-DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 Å. Science. 1991 May 10; 252(5007): 809-17. This linkage will also need to be optimized for any Cas9-nuclease fusion to an orthogonal Cas9 used as the DTU (
[0350] A functional B2H selection system was established that may be sensitive to the binding of nuclease-dead SpCas9 (dSpCas9) to a target site upstream of a pair of selectable reporter genes.
[0351] Linkages for NmCas9-DBD fusion proteins may also improve precision and activity using a similar procedure to that described above for SpCas9. In particular, an improvement protocol finds a fusion point (N- or C-terminal) and approximate linker length capable of creating a functional fusion between NmCas9 and a DBD (e.g., TALE or ZFP). PAM specificities have been interrogated for NmCas9 and may be believed to involve a consensus NNNNGATT (SEQ ID NO: 1) sequence. To evaluate NmCas9-DBD fusion activities a suboptimal PAM (i.e., for example, NNNNGAAT (SEQ ID NO: 22)) may be used to assess improvements in activity that are imparted by a fused DBD.
[0352] As discussed above in the context of SpCas9, experiments can be carried out in two steps to validate a functional NmCas9 fusion: 1) using a GFP reporter assay to define an optimal linker length; and 2) a bacterial (e.g., E. coli) two hybrid selection of the linker sequence. Esvelt et al., Orthogonal Cas9 proteins for RNA guided gene regulation and editing. Nature Methods. 2013 November; 10(11): 1116-21. The ability to utilize a NmCas9 system with a bacterial selection system has been widely reported. Hou et al., Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proceedings of the National Academy of Sciences. 2013 Sep. 24; 110(39): 15644-9; Zhang et al., Processing-independent CRISPR RNAs limit natural transformation in Neisseria meningitidis. Molecular Cell. 2013 May 23; 50(4): 488-503; Zhu et al., Using defined finger-finger interfaces as units of assembly for constructing zinc-finger nucleases. Nucleic Acids Research. 2013 Feb. 1; 41(4): 2455-65; Gupta et al., An optimized two-finger archive for ZFN-mediated gene targeting. Nature Methods. 2012 Apr. 29; 9(6): 588-90; Meng et al., A bacterial one-hybrid system for determining the DNA-binding specificity of transcription factors. Nature Biotechnology. 2005 August; 23(8): 988-94; Noyes et al., A systematic characterization of factors that regulate Drosophila segmentation via a bacterial one-hybrid system. Nucleic Acids Research. 2008 May; 36(8): 2547-60; Noyes et al., Analysis of homeodomain specificities allows the family-wide prediction of preferred recognition sites. Cell. 2008 Jun. 27; 133(7): 1277-89; and Enuameh et al., Global analysis of Drosophila Cys.sub.2-His.sub.2 zinc finger proteins reveals a multitude of novel recognition motifs and binding determinants. Genome Research. 2013 June; 23(6): 928-40.
[0353] Functionality of the NmCas9-DBDs may be verified through assays on genomic target sites with DBDs that are programmed to recognize neighboring sequences, where activity can be assessed by T7EI assay. In these genomic assessments, activity on properly spaced/oriented binding sites and the absence of activity on improperly spaced/oriented sites can be determined.
III. Improved Precision Using Mutant pDBDs
[0354] In some embodiments, the present invention contemplates a chimeric Cas9 system that dramatically improves the precision and targeting range of the Cas9 nuclease. In one embodiment, precision and targeting range is improved by augmentation of its specificity with an attached pDBD. In one embodiment, the Cas9-pDBD precision is tunable. In one embodiment, the tunable precision includes, but is not limited to, specificity and/or affinity of the associated pDBD. Although it is not necessary to understand the mechanism of an invention, it is believed that therapeutic genome editing, where cleavage precision is of paramount importance, utilizing customized Cas9-pDBDs will play a role in the clinical development process.
[0355] A. Improved Precision With Mutant Cas9 pDBD Fusions
[0356] The data presented herein evaluate the improved precision of a SpCas9.sup.MT #-pDBD framework at SpCas9 target sites (e.g., TS2, TS3 & TS4; all with NGG PAMs). SgRNAs that recognize these sites have defined on- and off-target activities, which provide a known benchmark to assess improvements in precision. A ZFP was constructed to recognize a sequence near each target site, and the editing activities of sgRNA-programmed SpCas9, SpCas9.sup.MT3 and SpCas9.sup.MT3-ZFP.sup.TS # were compared.
[0357] To discover new off-target sites of SpCas9.sup.MT3-ZFPs, a GUIDE-seq analysis was performed on SpCas9 and SpCas9.sup.MT3-ZFP.sup.TS #. These data are consistent with the focused deep sequencing data of known off-target sites: there is a dramatic improvement in precision for the SpCas9.sup.MT3-ZFP.sup.TS #. In addition, ESAT peak picking analysis (garberlab.umassmed.edu/software/esat) of the SpCas9.sup.MT3-ZFP.sup.TS # GUIDE-seq data reveals a dramatic genome-wide reduction in SpCas9.sup.MT3-ZFP.sup.TS # off-target activity.
[0358] The precision of SpCas9-ZFPs was compared to that of SpCas9 using sgRNAs with previously defined off-target sites.sup.14,25. Three different four-finger ZFPs were constructed to recognize 12 base pair sequences neighboring the TS2, TS3 or TS4 sgRNA target sites for use as SpCas9.sup.MT3-ZFP fusions.
[0359] Two TALE arrays were also programmed to target SpCas9.sup.MT3 to TS3 and TS4 (TALE-TS3 and TALE-TS4). Nuclease activity at the TS3 site but not TS4 can be restored by the related SpCas9.sup.MT3-TALE fusion.
TABLE-US-00003 TABLE 3 Average nuclease activity (% Lesion) values of TS3 sgRNA mismatches
SEQ ID NO:  sgRNA Name  sgRNA sequence          Cas9.sup.WT  Cas9.sup.WT-ZFP.sup.TS3  Cas9.sup.MT3-ZFP.sup.TS3
23          TS3         GGTGAGTGAGTGTGTGCGTG    22.44        24.39                    19.34
24          TS3-M1      gCCTGAGTGAGTGTGTGCGTG   17.17        21.9                     1.28
25          TS3-M2      GGCAAGTGAGTGTGTGCGTG    0.41         3.24                     N.D.
26          TS3-M3      GGTGCTTGAGTGTGTGCGTG    N.D.         N.D.                     N.D.
27          TS3-M4      GGTGAGGAAGTGTGTGCGTG    N.D.         N.D.                     N.D.
28          TS3-M5      GGTGAGTGCTTGTGTGCGTG    N.D.         N.D.                     N.D.
29          TS3-M6      GGTGAGTGAGCATGTGCGTG    N.D.         N.D.                     N.D.
30          TS3-M7      GGTGAGTGAGTGCATGCGTG    N.D.         N.D.                     N.D.
31          TS3-M8      GGTGAGTGAGTGTGGTCGTG    N.D.         N.D.                     N.D.
32          TS3-M9      GGTGAGTGAGTGTGTGTATG    N.D.         1.57                     N.D.
33          TS3-M10     GGTGAGTGAGTGTGTGCGCA    N.D.         N.D.                     N.D.
N.D.: Not Detected
Consistent with an increased sensitivity to disruptions in sgRNA-target interactions, SpCas9.sup.MT3-ZFPs exhibit reduced activity with truncated sgRNAs.sup.25, confirming that a higher degree of guide-target site complementarity is required for efficient cleavage with our chimeras.
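The mismatch sensitivity reported in Table 3 can be explored with a short script. The sketch below locates the mismatched positions of each guide relative to the TS3 guide and computes the fraction of activity retained relative to the perfectly matched guide. The sequences and % lesion values are copied from Table 3; the function names, the treatment of N.D. as `None`, and the right-alignment of the 21-nt TS3-M1 entry (its extra leading lowercase g is assumed to be a 5' addition) are assumptions for illustration.

```python
TS3 = "GGTGAGTGAGTGTGTGCGTG"

# (name, sequence, Cas9WT, Cas9WT-ZFP(TS3), Cas9MT3-ZFP(TS3)); None = N.D.
TABLE3 = [
    ("TS3", "GGTGAGTGAGTGTGTGCGTG", 22.44, 24.39, 19.34),
    ("TS3-M1", "gCCTGAGTGAGTGTGTGCGTG", 17.17, 21.9, 1.28),
    ("TS3-M2", "GGCAAGTGAGTGTGTGCGTG", 0.41, 3.24, None),
    ("TS3-M3", "GGTGCTTGAGTGTGTGCGTG", None, None, None),
    ("TS3-M4", "GGTGAGGAAGTGTGTGCGTG", None, None, None),
    ("TS3-M5", "GGTGAGTGCTTGTGTGCGTG", None, None, None),
    ("TS3-M6", "GGTGAGTGAGCATGTGCGTG", None, None, None),
    ("TS3-M7", "GGTGAGTGAGTGCATGCGTG", None, None, None),
    ("TS3-M8", "GGTGAGTGAGTGTGGTCGTG", None, None, None),
    ("TS3-M9", "GGTGAGTGAGTGTGTGTATG", None, 1.57, None),
    ("TS3-M10", "GGTGAGTGAGTGTGTGCGCA", None, None, None),
]


def mismatch_positions(reference, guide):
    """Return 1-based mismatch positions, counted from the 5' end of the reference.

    Assumption: guides longer than the reference carry extra 5' bases, so the
    comparison is right-aligned on the last len(reference) bases.
    """
    ref = reference.upper()
    aligned = guide.upper()[-len(ref):]
    return [i + 1 for i, (r, g) in enumerate(zip(ref, aligned)) if r != g]


def retained_fraction(mismatch_activity, matched_activity):
    """Activity with a mismatched guide relative to the matched guide (None if N.D.)."""
    if mismatch_activity is None:
        return None
    return mismatch_activity / matched_activity


matched = TABLE3[0]
for name, seq, wt, wt_zfp, mt3_zfp in TABLE3[1:]:
    positions = mismatch_positions(TS3, seq)
    fractions = [
        retained_fraction(value, reference)
        for value, reference in zip((wt, wt_zfp, mt3_zfp), matched[2:])
    ]
    print(name, positions, fractions)
```

Under these assumptions, each TS3-M guide differs from TS3 at two adjacent positions, and the retained fraction for TS3-M1 is far lower for the Cas9.sup.MT3-ZFP.sup.TS3 column than for either wild-type column, in line with the increased sensitivity described above.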
[0360] B. Cas9-pDBD System Tunability
[0361] One advantage of a SpCas9-pDBD system over other Cas9 platforms is the ability to rapidly tune the affinity and specificity of the attached pDBD to improve its precision. In one embodiment, improved precision of SpCas9.sup.MT3-ZFP.sup.TS2 was achieved by truncating the four zinc finger array to reduce its affinity for off-target site OT2-2. High activity at the TS2 target site was maintained despite removal of either of the terminal zinc fingers from SpCas9.sup.MT3-ZFP.sup.TS2. However, these truncations reduced or eliminated activity at OT2-2, reflecting a profound improvement in the precision of SpCas9.sup.MT3-ZFP.sup.TS2. Similarly, utilization of a ZFP.sup.TS2* that recognizes an alternate sequence neighboring the TS2 guide target site also abolishes off-target activity at OT2-2.
[0362] Given the improvements in precision realized by these selective alterations in the composition of a ZFP, it should be possible to achieve even greater enhancements in precision via more focused modification of a ZFP composition and a linker connecting it to SpCas9. These data demonstrate the functionality of SpCas9-pDBD chimeras, their broader targeting range and improved precision when compared to standard SpCas9.
[0363] C. Increased SpCas9 Precision Through Direct and Drug-Dependent pDBD Fusions
[0364] In one embodiment, the present invention establishes a framework to facilitate use of the SpCas9-pDBD system to efficiently design, assay and permute this platform to achieve single-site precision for editing the human genome. There are a number of parameters that remain to be optimized in the SpCas9-pDBD system. For example, the initial four-finger SpCas9-ZFPs still retain a low level of off-target activity.
[0365] 1. Improved Precision Using SpCas9-pDBD Frameworks
[0366] In some embodiments, the present invention contemplates a method utilizing different parameters regulating precision and activity of a SpCas9-pDBD framework to define a framework for highly active and extremely precise nucleases.
[0367] a. The Cas9-pDBD Linker
[0368] In one embodiment, a SpCas9.sup.MT3-pDBD construct is connected by a 60-aa linker and displays improvements in precision.
[0369] For example, a GFP reporter assay is used herein to identify improved linker lengths joining SpCas9 to either ZFPs or TALEs that increase their fidelity of target site cleavage. In one embodiment, the GFP reporter assay defines a minimal linker length for SpCas9-Zif268 and SpCas9-TAL268 constructs that retains maximum activity at one (or more) binding site positions, but places further restrictions on the activity at other positions. Improved linkers may be tested for both activity and precision in the context of SpCas9.sup.MT3-pDBDs designed for TS2/TS3/TS4 genomic sites.
[0370] The most promising linkers can be further evaluated by GUIDE-seq to assess genome-wide off-target activity. GUIDE-seq results may be verified by targeted deep sequencing of PCR products spanning these loci.
[0371] b. Improved Precision Using DNA Recognition Modules (ZFP or TALE)
[0372] The data herein have shown that the precision of SpCas9-ZFPs depends on the number of ZFP recognition modules, where excessive affinity reduces precision.
[0373] c. SpCas9 Modifications for pDBD Functional Dependence
[0374] As shown above, PAM-attenuated SpCas9.sup.MT3 displays residual nuclease activity at TS2/TS3/TS4 in the absence of the pDBD. Further attenuation of SpCas9 DNA-binding affinity is expected to increase its absolute pDBD dependence and thus its precision. In one embodiment, the present invention contemplates at least one mutation in at least two regions of SpCas9 to reduce its intrinsic activity, including, but not limited to: i) PAM recognition residues, and ii) phosphate-contacting residues near the PAM binding site.
[0375] In one embodiment, the present invention contemplates a Cas9 complex comprising PAM recognition residue mutations. In one embodiment, the mutations are located at arginine residues (e.g., R1333 & R1335) that make base-specific PAM contacts. In one embodiment, the mutations are combination mutations (e.g., combining R1333K & R1335K). Such combination mutations are believed to further attenuate independent SpCas9 activity but retain activity in the presence of a fused pDBD. The double-strand break (DSB) formation rate in the absence and presence of the pDBD may be estimated by qPCR-based quantification of the rate of capture of GUIDE-seq oligos at each target site (TS2/TS3/TS4) as a proxy for deep sequencing.
[0376] In one embodiment, the present invention contemplates arginine or lysine residue mutations that contact DNA phosphates. Although it is not necessary to understand the mechanism of an invention, it is believed that neutralization of phosphate contacts within pDBDs can modulate their binding affinities. In one embodiment, SpCas9 is mutated at lysine or arginine residues that are positioned to make non-specific contacts with the DNA downstream of the PAM-contacting residues, and so should not affect the efficiency of R-loop formation or the precision of DNA cleavage. The activity of these mutants may be assayed as described for the PAM recognition mutants.
[0377] Mutations can be identified that render SpCas9 completely dependent on an attached pDBD. Since the capture of GUIDE-seq oligos is not a perfect surrogate for the rate of DSB formation, lesion rates may be assessed for the most promising mutants by deep sequencing. Alternatively, lysine or arginine mutants can be combined with PAM mutants for further attenuation of SpCas9 DNA-binding affinity. Although it is not necessary to understand the mechanism of an invention, it is believed that the improved precision of the presently disclosed SpCas9.sup.MT-pDBDs for TS2/TS3/TS4 is vastly superior to that previously reported. To confirm that superiority, the precision should be shown to be cell line-independent via deep-sequencing and GUIDE-seq analysis.
[0378] 2. Allele-Specific Targeting Using Single Nucleotide Polymorphisms
[0379] The ability to selectively inactivate specific dominant-negative alleles could have great utility. For example, single nucleotide polymorphisms (SNPs) have been proposed as discriminators for siRNA-mediated silencing of Huntingtin alleles that contain CAG repeat expansions. Cas9s with allele-specific activity could provide a therapeutic strategy to disable specific harmful alleles. SpCas9 has been used to achieve incomplete discrimination using a SNP within the guide recognition sequence. Analysis of the presently disclosed Cas9.sup.MT3-ZFP framework has revealed dramatically improved discrimination for single-base changes within a target sequence.
[0380] For example, a COSMIC database comprising a list of validated cell line SNPs may be used to test the feasibility of this approach (e.g., identifying twenty-six heterozygous SNPs in U2OS cells). Candidate loci may then be sequenced to confirm the reported SNP heterozygosity, and SpCas9.sup.MT-pDBD/sgRNA combinations may then be designed to target a single allele. Allelic targeting ratios (relative to negative controls lacking the cognate sgRNA or the appended pDBD) may be determined from the frequency with which each allele captures GUIDE-seq oligos (via deep sequencing). If DSBs are restricted to a single allele, then only the targeted SNP should be found neighboring the GUIDE-seq oligo sequence. As SpCas9 mutants are identified that have improved attenuation, single base change discrimination can then be examined. Although it is not necessary to understand the mechanism of an invention, it is believed that SpCas9.sup.MT-pDBDs have great potential for allele-specific targeting but should be subjected to empirical verification. If necessary, discrimination can be tested using paralogous sequences that differ by a single base within the genome (e.g., CCR2 and CCR5, which contain many >30 bp regions that differ by a single nucleotide). Relative editing efficiencies on one paralog or the other can be assessed by the PCR/deep sequencing approach described above.
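One way the allelic targeting ratio described above might be computed from deep-sequencing read counts is sketched below. The disclosure does not give a formula, so the function names, the pseudocount used to avoid division by zero, and the read counts are all hypothetical and serve only to illustrate the comparison against a negative control.

```python
def allelic_targeting_ratio(target_reads, other_reads, pseudocount=0.5):
    """Ratio of GUIDE-seq oligo capture reads on the targeted vs. the other allele.

    Assumption: a small pseudocount is added to both counts so that an allele
    with zero captured reads does not cause a division by zero.
    """
    if target_reads < 0 or other_reads < 0:
        raise ValueError("read counts must be non-negative")
    return (target_reads + pseudocount) / (other_reads + pseudocount)


def control_normalized_ratio(sample, control, pseudocount=0.5):
    """Allelic ratio in the sample divided by the ratio in a negative control.

    `sample` and `control` are (target_reads, other_reads) tuples; the control
    lacks the cognate sgRNA or the appended pDBD.
    """
    sample_ratio = allelic_targeting_ratio(sample[0], sample[1], pseudocount)
    control_ratio = allelic_targeting_ratio(control[0], control[1], pseudocount)
    return sample_ratio / control_ratio


# Hypothetical read counts, not measured data.
print(control_normalized_ratio(sample=(200, 2), control=(1, 1)))
```

A ratio far above 1 would indicate that oligo capture, and by inference DSB formation, is concentrated on the targeted allele.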
[0381] 3. Drug- or Photo-Dependent SpCas9-pDBD Nuclease Regulation
[0382] Small molecule- or photo-dependent dimerization systems have been developed that permit the control of activity of a two-component system. Since SpCas9/sgRNA off-target activity is dose dependent, these systems have been adapted to regulate the association of two fragments of Cas9 (e.g., Split-Cas9).
[0383] In one embodiment, the presently disclosed SpCas9-pDBD system comprises a drug- or photo-dependent dimerization system that regulates the association of SpCas9 and the pDBD. In one embodiment, the present invention contemplates a rapamycin-dependent Cas9 complex comprising a SpCas9-FRB/FKBP-ZFP and/or a SpCas9-FRB/FKBP-TALE and/or Split-SpCas9.sup.MT-pDBD.
[0384] 4. SpCas9-FKBP/pDBD-FRB System Improvements
[0385] a. SpCas9-FKBP/pDBD-FRB Linkers
[0386] In one embodiment, the present invention contemplates a GFP reporter system comprising genomic targets to identify an optimal linker length joining Cas9 to a dimerization domain and the pDBD to a dimerization domain that maximizes activity and restricts the relative spacing and orientation of the active binding sites. In one embodiment, the linker joins an SpCas9-FKBP domain and a pDBD-FRB domain.
[0387] b. ZFP or TALE DNA Recognition Modules
[0388] In one embodiment, the present invention contemplates DNA recognition modules that improve SpCas9-FKBP/pDBD-FRB precision at sites including, but not limited to, TS2, TS3 and TS4 sites. Although it is not necessary to understand the mechanism of an invention, it is believed that the optimal number and composition of recognition modules in the pDBD may differ when compared to a Cas9-pDBD covalent system, since greater cooperativity in binding is likely to occur in the covalent system.
[0389] c. Nm-dCas9 as pDBDs
[0390] In one embodiment, the present invention contemplates a nuclease-dead NmCas9 as a pDBD for an association through dimerization (
[0391] d. Nuclear Export Sequences
[0392] Photo-dependent TALE regulators or drug-dependent Split-SpCas9 fusions have been reported to decrease off-target activity by fusing a nuclear export sequence (NES) instead of a nuclear localization sequence (NLS) to one component. It is believed that a Cas9-NES fusion protein is restricted to the cytoplasm until the inducer (light/drug) is present, at which point an NLS-tagged partner can drive nuclear import. In one embodiment, an NES-SpCas9.sup.MT-FRB fusion protein may be excluded from the nucleus in the absence of rapamycin. In one embodiment, a combination of an NLS with an NES-SpCas9.sup.MT-FRB fusion protein facilitates transit between the nucleus and cytoplasm in the presence of rapamycin, allowing more efficient import of the partner that is located in the cytoplasm (e.g.
[0393] Assessments of activity and precision for constructs of particular interest may occur at the TS2/TS3/TS4 loci, initially by T7EI assays, such that the effects of dose and duration of rapamycin exposure on activity and precision can be examined. The precision of the most promising constructs may be evaluated by GUIDE-seq followed by targeted deep sequencing (e.g.
[0394] e. The Abscisic Acid Regulatory System
[0395] A drug-based dimerization system has been previously described based on a plant hormone (i.e., abscisic acid) and its protein partners (ABI & PYL; Liang, F. S., Ho, W. Q. & Crabtree, G. R. Engineering the ABA plant stress pathway for regulation of induced proximity. Science Signaling 4, rs2-rs2 (2011)). Abscisic acid is believed to be bioavailable, and the plant-derived components should have minimal crosstalk with endogenous factors (unlike a rapamycin system). Consequently, a SpCas9.sup.MT-ABI/PYL-pDBD system may be useful for drug-dependent regulation.
[0396] Photo-dependent (e.g., visible light or non-visible light) regulation of TALE-effector and Split-SpCas9 nuclease function has been described. In one embodiment, the present invention contemplates a light-inducible dimerization domain comprising nMag/pMag or CRY2/CIB1 (Nihongaki, Y., Kawano, F., Nakajima, T. & Sato, M. Photoactivatable CRISPR-Cas9 for optogenetic genome editing. Nature biotechnology 33, 755-760 (2015)).
[0397] D. Improved Precision With NmCas9-pDBD and SaCas9-pDBD Frameworks
[0398] Development of a SpCas9-pDBD system (supra) has benefited from extensive data on the 1368-aa SpCas9 protein. However, full realization of genome editing goals involves the development of additional Cas9 orthologs to provide additional PAM specificities and simultaneous deployment of Cas9s with orthogonal guides. In addition, for clinical deployment, the physical size of SpCas9 limits its in vivo deliverability via platforms such as AAV vectors and synthetic mRNAs. In contrast, most Type II-C Cas9s (e.g., N. meningitidis; 1082 residues) and a few Type II-A Cas9s (e.g., S. aureus; 1053 residues) are considerably smaller than SpCas9 and may have clinical delivery advantages over SpCas9 platforms.
[0399] For example, a compact Cas9 (i.e., NmCas9) was recently validated for genome editing. An SaCas9 platform was also characterized, and its utility for editing in an all-in-one (Cas9+sgRNA) AAV format was documented. Because Cas9s are believed to have some propensity for promiscuous cleavage, compact orthologs should be modified to provide enhanced precision in order to tap their clinical potential. In one embodiment, the present invention contemplates NmCas9- and SaCas9-based editing platforms with single-genomic-site accuracy.
[0400] Preliminary data using NmCas9 demonstrate that a PAM consensus is 5′-N4GATT-3′, with considerable variation allowed during bacterial interference (data not shown). However, PAM requirements are more stringent in mammalian cells, and efficient editing has only been documented at N4G(A/C/T)TT, N4GAC(A/T), N4GATA, and N4GTCT PAMs.
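The PAM requirements listed above can be expressed as a simple pattern match. The following sketch classifies a candidate 8-nt NmCas9 PAM as the consensus N4GATT, as one of the variants documented above for mammalian cells, or as neither. The function name, the category labels and the assumption that the PAM is supplied as an 8-nt string read 5' to 3' are illustrative choices, not part of the disclosure.

```python
import re

# N4 (any four bases) followed by the consensus core GATT.
CONSENSUS_PAM = re.compile(r"^[ACGT]{4}GATT$")

# N4 followed by any core documented in the text for mammalian cells:
# G(A/C/T)TT, GAC(A/T), GATA or GTCT.
DOCUMENTED_PAM = re.compile(r"^[ACGT]{4}(?:G[ACT]TT|GAC[AT]|GATA|GTCT)$")


def classify_nm_pam(pam):
    """Classify an 8-nt candidate NmCas9 PAM (5' to 3', DNA alphabet)."""
    pam = pam.upper()
    if CONSENSUS_PAM.match(pam):
        return "consensus"
    if DOCUMENTED_PAM.match(pam):
        return "documented variant"
    return "not documented"


for candidate in ("ACGTGATT", "ttttGCTT", "ACGTGACA", "ACGTGGTT"):
    print(candidate, classify_nm_pam(candidate))
```

A "not documented" result only means that the sequence is absent from the list above; it does not show that the PAM is nonfunctional.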
[0401] The structure of NmCas9 is not known, nor are associated PAM-recognition residues defined. Nonetheless, some information can be discerned from an A. naeslundii Type II-C Cas9 structure (AnCas9). For example, two positively charged NmCas9 residues (e.g., Lys1013 and Arg1025) are particularly well-conserved in Type II-C Cas9 alignments, and the corresponding AnCas9 residues map to a candidate PAM interaction region. The activity of the NmCas9 K1013A/R1025A double mutant (hereafter NmCas9.sup.DM1) is severely attenuated in the GFP assay in HEK293 cells, but can be rescued by an appended Zif268 pDBD (with a Zif268 binding site downstream of the PAM).
[0402] 1. PAM Attenuation/pDBD Fusion Parameters for Enhanced-Precision NmCas9 and SaCas9
[0403] a. NmCas9.sup.MT-pDBD And SaCas9.sup.MT-pDBD Frameworks
[0404] The data presented herein demonstrates that a fused pDBD (either N- or C-terminal, with a 60-aa linker) allows editing of targets with nonfunctional PAMs having a pDBD binding site 5 bp from the PAM.
[0405] b. NmCas9 and SaCas9 Accuracy
[0406] In one embodiment, the present invention contemplates a GUIDE-seq assay to compare the editing precision of Cas9.sup.WT orthologs and the Cas9.sup.MT-pDBD variants. In one embodiment, the GUIDE-seq assay identifies Indel frequencies at off-target sites. In one embodiment, the Indel frequencies are quantified by deep-sequencing PCR-amplified loci. In one embodiment, mismatch tolerance at chromosomal editing sites measures the effects of PAM attenuation and pDBD fusion. In one embodiment, off-target propensities, expressed as on- vs. off-target lesion rate ratios, identify successful pDBD tuning achieved by varying the number of ZFP or TALE modules.
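The on- vs. off-target lesion rate ratio mentioned above can be sketched as follows. Because off-target lesions may fall below the detection limit of the assay, the function reports a lower bound in that case instead of dividing by zero. The function name, the default detection limit and the example percentages are hypothetical and are not taken from the disclosure.

```python
def on_off_ratio(on_target_pct, off_target_pct, detection_limit_pct=0.1):
    """Return (ratio, is_lower_bound) for on- vs. off-target lesion rates.

    Assumption: an off-target rate that is None (not detected) or below the
    detection limit is replaced by the detection limit, so the returned
    ratio is only a lower bound on the true ratio.
    """
    if on_target_pct <= 0:
        raise ValueError("on-target lesion rate must be positive")
    if off_target_pct is None or off_target_pct < detection_limit_pct:
        return on_target_pct / detection_limit_pct, True
    return on_target_pct / off_target_pct, False


# Hypothetical lesion rates (%), not measured data.
wild_type = on_off_ratio(20.0, 2.0)
variant = on_off_ratio(20.0, None, detection_limit_pct=0.5)
print("wild type:", wild_type)
print("variant:", variant)
```

Comparing these ratios across variants with different numbers of ZFP or TALE modules would indicate which composition gives the best balance of activity and precision.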
[0407] 2. NmCas9 and SaCas9 Drug-Inducible Dimerization Systems
[0408] One disadvantage with AAV delivery of active Cas9/guideRNA combinations is that Cas9 activity (both on- and off-target) may persist indefinitely. Accordingly, successfully implementing drug-inducible Cas9.sup.MT-pDBD association in the context of one or both compact Cas9s would further improve the system's accuracy by preventing ongoing off-target lesions once editing is complete and the drug is withdrawn. In one embodiment, the present invention contemplates a NmCas9 and/or a SaCas9 drug-inducible dimerization system.
[0409] For example, DNA-binding modules (e.g., ZFP and TALE) attached to NmCas9 or SaCas9 could both be RNA-guided. NmCas9 and its guideRNAs are orthogonal to all Type II-A Cas9s and sgRNAs tested to date, and the expected orthogonality of SaCas9 and its sgRNAs can be easily confirmed as well. Drug-inducible dimerization modules (e.g., FRB/FKBP or ABI/PYL and all pair-wise combinations) can be fused to a PAM-attenuated but catalytically active version of one compact Cas9, and to the nuclease-dead version of the other. Whether dCas9 can fulfill the same precision-enhancing function provided by the pDBD may then be tested. Initially, a GFP reporter system is used to improve PAM/target orientation and spacing, and the framework is then tested using actual chromosomal loci. If this framework can edit its chromosomal target sites efficiently, as judged by T7EI assay, an unbiased assay can define the precision of this system relative to the drug-induced pDBD dimerization system.
[0410] 3. Functional AAV NmCas9.sup.MT-pDBD and SaCas9.sup.MT-pDBD Constructs
[0411] It is believed that native NmCas9 and SaCas9 ORFs are ~3.25 and ~3.16 kb, respectively, so even with added NLSs and minimal expression/processing signals, they are well under the ~4.5 kb packaging limit of current AAV vectors. For example, a four-finger ZFP with a 60-aa linker would increase the ORF size by an additional 0.6 kb, still well within the AAV vector size limit. As explained herein, some embodiments of the present invention minimize linker length to further reduce an AAV Cas9-pDBD packaging size. In some embodiments, the present invention contemplates the delivery of NmCas9.sup.MT-ZFPs and SaCas9.sup.MT-ZFPs via AAV into cultured cells. In one embodiment, the AAV comprises a liver-specific promoter. In one embodiment, the AAV is an AAV8 serotype. In one embodiment, the AAV8 serotype is hepatocyte-tropic. In one embodiment, the cultured cells comprise HepG2 cells. In one embodiment, the genome of the HepG2 cells comprises Pcsk9 as an editing (NHEJ) target. In one embodiment, the AAV expression construct is a transfection plasmid.
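The size arithmetic above can be checked with a short calculation. The sketch below converts the residue counts given earlier for NmCas9 (1082) and SaCas9 (1053) into ORF sizes, adds the 0.6 kb stated for a four-finger ZFP with a 60-aa linker, and compares the total with the ~4.5 kb limit cited in the text. The function names are hypothetical, and promoters, NLSs, polyadenylation signals and sgRNA cassettes are deliberately left out, so this is a rough lower estimate of the packaged size and not a vector design.

```python
AAV_LIMIT_KB = 4.5          # packaging limit cited in the text
ZFP_PLUS_LINKER_KB = 0.6    # four-finger ZFP with a 60-aa linker, per the text

# Residue counts given in the text for the compact Cas9 orthologs.
CAS9_LENGTH_AA = {"NmCas9": 1082, "SaCas9": 1053}


def orf_kb(length_aa):
    """Coding sequence size in kb for a protein of the given length (3 nt per codon)."""
    return length_aa * 3 / 1000.0


def fusion_fits(name, extra_kb=ZFP_PLUS_LINKER_KB, limit_kb=AAV_LIMIT_KB):
    """Return (total_kb, fits) for a Cas9-pDBD ORF; regulatory elements not counted."""
    total_kb = orf_kb(CAS9_LENGTH_AA[name]) + extra_kb
    return total_kb, total_kb <= limit_kb


for ortholog in CAS9_LENGTH_AA:
    total, fits = fusion_fits(ortholog)
    print(f"{ortholog}: ORF {orf_kb(CAS9_LENGTH_AA[ortholog]):.2f} kb, "
          f"fusion {total:.2f} kb, within limit: {fits}")
```

The remaining headroom under the limit is what must accommodate the expression and processing signals, which is why shorter linkers are of interest.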
[0412] E. Cas9-pDBD Mediated Gene Correction of Defective CYBB in CGD
[0413] Chronic granulomatous disease (CGD), a disorder of phagocytic function, generally presents early in life with severe recurrent infections. The estimated incidence per live birth is 1/200,000 in the US. Conventional clinical management allows many patients to reach adulthood, but CGD patients have only 50% cumulative survival at age 50, and the only curative therapy is hematopoietic stem and progenitor cell (HSPC) transplantation. The molecular defects causing CGD affect the phagocyte NADPH oxidase responsible for the generation of microbicidal reactive oxygen species. About 60% of cases are X-linked (X-CGD) due to mutations in CYBB, an Xp21.1 gene that encodes gp91phox, the glycoprotein subunit of the oxidase.
[0414] CGD has long been considered a prime target for gene therapy. Clinical improvement should occur with replacement of a low level of oxidase activity, as CGD patients with as little as 3% normal activity show a much milder phenotype. A normal phenotype could be achieved with high-level correction of only 5-10% of phagocytes, as occurs in asymptomatic XCGD carriers with a skewed Lyon distribution of X-inactivation. As all phagocytes are bone marrow-derived, gene therapy approaches have aimed to replace the defective gene ex vivo in blood or bone marrow HSPCs, and then engraft the autologous cells in the patient. For example, one such trial, using an SFFV-based retroviral vector, showed initial correction of the CGD phenotype in 2 of 3 subjects, but gene expression was eventually diminished or silenced. Further, peripheral blood myeloid cells showed expansion of clones containing insertions at loci associated with immortalization or leukemia. All patients eventually died or underwent HSPC transplantation.
[0415] Current CGD gene therapy approaches are focused on gene replacement in CD34+ HSPCs through insertion via self-inactivating lentivirus or knock-in at a safe-harbor locus (AAVS1) via ZFNs. A current trial employs a self-inactivating lentiviral vector encoding a chimeric myeloid promoter to drive CYBB expression. However, achieving near wild-type gp91phox expression requires 8 or more integrations per cell. Because lentivirus generates insertions throughout the genome there is also danger of viral integration causing disruption or dysregulation of nearby genes.
[0416] Targeted insertion in the AAVS1 locus limits random integration, but suffers from the challenge of finding a myeloid-specific promoter that can drive high level gp91phox expression with only one integration site. Ideally, gene repair at the defective locus would harness endogenous regulatory elements to drive appropriate gene expression. As inactivating mutations in CYBB are broadly distributed throughout the coding sequence, tailoring a gene correction cassette to each patient's specific mutation is impractical.
[0417] In one embodiment, the present invention contemplates a minigene cassette flanked by a splice acceptor and polyadenylation site, for insertion into an early intron to capture transcription from the locus and correct any downstream mutations.
[0418] To define neutral sites for repair cassette insertion, the CYBB regulatory landscape in three myeloid cell lines was analyzed using ENCODE H3K4Me1 ChIP-seq data and 3C analysis. These data revealed a complex regulatory landscape that extends to CYBB introns 1-3. In one embodiment, the present invention contemplates a gene correction strategy comprising high efficiency and precision, as well as a minimal impact of minigene insertion on gene expression levels, as some insertion sites may disrupt regulatory elements. For example, a Cas9.sup.MT-pDBD nuclease may be used for correction of CYBB defects in CD34+ HSPCs from XCGD patients through a systematic optimization including, but not limited to: i) pilot experiments in XCGD-PLB-985 cells, a human myeloid cell line with a disruption in exon 3 of CYBB9; ii) optimization of gene correction in normal CD34+ HSPCs; and iii) assessment of efficacy in HSPCs from XCGD carriers. Although it is not necessary to understand the mechanism of an invention, it is believed that these preliminary data identify improved nuclease precision and efficiency to provide a clinically effective platform for CGD gene therapy.
[0419] 1. SpCas9.sup.MT-pDBD Nuclease And Donor Constructs
[0420] In one embodiment, the present invention contemplates assessing nuclease activity and precision in HEK293T cells. Preliminary data show that CYBB introns 1 & 2 are compatible with sgRNAs having NGG PAMs and are predicted to be highly active based on the latest genome-wide sgRNA analyses. These sgRNAs have few predicted off-target matches by CRISPRseek analysis and avoid potential regulatory regions identified in ENCODE data. SpCas9 nuclease activity mediated by sgRNAs of interest may be used to determine and identify active guides.
[0421] In one embodiment, the present invention contemplates a construct comprising Cas9.sup.MT-pDBDs for active sgRNAs. Nuclease activity may be confirmed by T7EI, and then GUIDE-seq followed by focused deep sequencing to determine off-target profiles. In one embodiment, active nuclease pDBDs can be tuned and precision improved to eliminate residual off-target activity. One advantage of the presently disclosed embodiments in contrast to conventional methods is the achievement of precise editing with off-target events that are undetectable by Illumina short-read sequencing. In one embodiment, the construct comprises single-stranded oligonucleotide (ssODN) donors with homology arms to the target site that encode a unique restriction enzyme (RE) site within the region. HDR efficiency may be assayed by PCR amplification and RE digestion.
[0422] 2. Gene Correction Efficiency
[0423] XCGD-PLB-985 cells provide a model for gene correction of CYBB due to the presence of a single defective allele. In one embodiment, nucleofection conditions are improved for XCGD-PLB-985s to maximize the rate of nuclease-based HDR insertion of the validated ssODN compared to that of indel formation (e.g., using a T7EI assay). HDR efficiency and precision level obtained for each nuclease may then be confirmed using GUIDE-seq.
[0424] XCGD-PLB-985 cells were nucleofected with SpCas9-sgRNA, a Cybb-minigene cassette, and GFP (as a marker for nucleofection) and then flow-sorted for GFP expression. GFP(+) and (−) cells were assessed for SpCas9-induced lesions by T7EI assay, and for NHEJ-mediated minigene insertion by PCR amplification of a newly-formed junction.
[0425] Although it is not necessary to understand the mechanism of an invention, it is believed that the present methods result in dramatic improvements in rescue frequency in comparison to conventionally available assays. Alternatively, the present invention contemplates a knock-in of a human codon-optimized minigene rescue construct comprising sequence features distinct from an endogenous locus.
[0426] Donor DNA insertion efficiency can be evaluated by qPCR, and the integrity of donor integration assessed by PacBio SMRT sequencing to define the donor cassette insertion rate and fidelity. The rate of spurious donor integration can be determined by LAM-PCR sequencing. To increase rates of HDR, alternate DNA repair pathways can be inhibited. Differentiation of XCGD-PLB-985 cells containing targeted minigene insertions into neutrophils can be used to assess the functional effects of gene correction. The rate of splice donor capture by an integrated minigene can be determined by qRT-PCR. XCGD-PLB-985-derived neutrophils can be characterized by flow cytometric assays of mAb7D5 binding for gp91phox protein expression, dihydrorhodamine (DHR) fluorescence for NADPH oxidase activity, and/or loss of microbial propidium iodide staining for microbicidal activity. Although it is not necessary to understand the mechanism of an invention, it is believed that the above embodiments are able to define minigene insertion sites that permit an efficient correction of CYBB defects in XCGD-PLB-985 cells by optimizing a splice acceptor sequence of a repair cassette for efficient gene capture. Functional assays should allow correlation with correction of the CGD phenotype.
[0427] F. Gene Correction Efficiency And Precision in CD34+ HSPCs
[0428] It is generally believed that achieving high levels of donor DNA integration via nuclease-mediated HDR is more challenging in primary HSPCs than in transformed cell lines. To overcome this disadvantage of conventional methods, and because XCGD patient-derived CD34+ cells are of limited availability, the presently disclosed nuclease-based knock-in strategy may be fine-tuned using CD34+ HSPCs from healthy male donors. SpCas9/sgRNA gene inactivation has been reported using delivery of plasmid-encoded components, but in another study efficient rates of donor DNA integration and cell viability required delivery of nucleases as mRNAs.
[0429] In one embodiment, the present invention contemplates a method comparing the efficiency of gene editing and cell viability for SpCas9.sup.MT-DBDs/sgRNA delivered by plasmid vs. mRNA/sgRNA nucleofection. For example, target site lesion rates can be assessed by T7EI assay.sup.19, and cell viability by Annexin V and 7-Aminoactinomycin D FACS analyses. Further, the efficiency of HDR can be examined using different donor DNAs encoding the required repair cassette. Due to potential plasmid toxicity in CD34+ cells, assays may be performed with plasmid-based, minicircle-based and/or viral DNA donors (e.g., IDLV, adenoviral and AAV), particularly AAV6, which efficiently transduces CD34+ HSPCs and has proven to be an efficient non-integrating donor for nuclease-mediated HDR. In some embodiments, the timing of the donor and nuclease delivery can be varied to maximize the efficiency of HDR. In other embodiments, small molecules that support progenitor maintenance during expansion may be used. The precision of the nucleases and the integrity and specificity of donor integration can be assessed as described above.
[0430] XCGD-like CD34+ HSPCs have recently been created by transducing normal CD34+ cells with a Cerulean-marked lentivirus encoding shRNAs targeting CYBB transcripts. This system can be utilized to assess the efficiency of CYBB gene correction mediated by the optimal nuclease and donor DNA, with a recoded minigene that is not targeted by the shRNAs, to determine the fraction of macrophages and neutrophils differentiated from marked CD34+ cells with restored NADPH oxidase activity and function. Although it is not necessary to understand the mechanism of an invention, it is believed that the presently disclosed improved nucleases, alternate donor DNA platforms and supporting culture conditions are able to achieve high levels of targeted gene correction in CD34+ HSPCs that equal or exceed the 5-10% level needed for a functional CGD cure.
[0431] G. Efficient Gene Correction in XCGD CD34+ HSPCs
[0432] In one embodiment, the present invention contemplates a nuclease-mediated CYBB correction in XCGD patient CD34+ HSPCs. In one embodiment, the nuclease comprises a minigene repair cassette having mutations. In one embodiment, the method uses improved targeted gene correction conditions (e.g., nucleases, donors, cultures) that are shown to improve efficiency. In one embodiment, the method determines the fraction of functionally corrected macrophages and neutrophils differentiated from these cells.
[0433] RNA levels can also be assessed for a minigene donor cassette, along with the fraction of correctly spliced RNAs between the endogenous exon and the minigene cassette. In other embodiments, the in vivo engraftment potential and function of nuclease-manipulated HSPCs can be evaluated, preferably in NSG-3GS mice, which unlike NSG mice, produce functional human phagocytes. Although it is not necessary to understand the mechanism of an invention, it is believed that the presently disclosed method achieves a frequency of appropriate RNA splicing with a repair cassette sufficient to generate gp91phox in patient-derived XCGD cells comprising endogenous locus regulatory elements.
[0434] H. Excision or Inactivation of HIV Proviral DNA in Reservoir Cells.
[0435] Highly active antiretroviral therapy (HAART) has dramatically changed the prognosis for individuals infected with HIV-1. Yet, even when HIV-1 viremia has been well controlled by these drugs for years, termination of HAART results in viral rebound, most likely coming from latent provirus in long-lived memory CD4.sup.+ T cells. So long as latent HIV-1 provirus persists—probably for the life of the infected individual—HAART will be required. Most efforts to eradicate latent HIV-1 proviruses have focused on reactivation of proviral transcription to potentiate the elimination of cells bearing HIV-1 provirus. To date, though, such reactivation efforts have largely been unsuccessful. Alternative approaches for the effective elimination of latent HIV-1 provirus are therefore needed.
[0436] Recent advances in the development of targeted gene editing tools provide a potential method for direct inactivation or excision of latent HIV-1 provirus. Specifically, the Cas9/CRISPR programmable nuclease system, a versatile platform for the generation of targeted double-strand breaks within the genome, has been shown to excise HIV-1 provirus in cell lines. However, the activity and precision of the Cas9/CRISPR system is suboptimal for clinical application. SpCas9.sup.MT3-ZFPs have been developed that specifically target the HIV LTR with higher precision than wild-type SpCas9.
[0437] Three different SpCas9.sup.MT3-ZFPs were generated that target different regions of the HIV LTR (T5, T6 and Z1;
IV. Deep Sequencing Analysis of Off-Target Activity
[0438] To more broadly assess improvements in Cas9-pDBD precision, PCR products spanning previously defined off-target sites for sgRNA.sup.TS2/TS3/TS4.sup.14,25 were deep sequenced, as were several additional genomic loci that have favorable ZFP.sup.TS2/TS3/TS4 recognition and were predicted using CRISPRseek.sup.21,22 to have some complementarity to the TS2/TS3/TS4 guide sequences. Nuclease activity was compared between SpCas9, SpCas9.sup.MT3, SpCas9.sup.WT-ZFP.sup.TS2/TS3/TS4 and SpCas9.sup.MT3-ZFP.sup.TS2/TS3/TS4 at these target and off-target sites, and SpCas9.sup.MT3-ZFP.sup.TS2/TS3/TS4 was found to dramatically increase the precision of target site cleavage.
V. Clinical Applications and Insights
[0439] Some embodiments of the present invention encompass analysis of the activity of SpCas9-pDBD chimeras that provides new insights into a mechanism of target site licensing by SpCas9 and the methods by which this mechanism can be exploited to improve precision.
[0440] Mutations to the SpCas9 PAM interacting domain may introduce a third stage of licensing (pDBD site recognition) for efficient target site cleavage within the SpCas9.sup.MT-pDBD system. The weakened interaction between mutant Cas9 and the PAM sequence now necessitates an increased effective concentration for nuclease function, which is achieved by the high-affinity interaction of the tethered pDBD with its target site. This dramatically improves precision as assessed using targeted deep sequencing and GUIDE-seq analysis. Compared with previous GUIDE-seq analysis of TS2, TS3 and TS4 targets for SpCas9, five, three and three of the top 5 off-target sites, respectively, were found that were previously described.sup.17. The discrepancy between these studies could be due to our lower sequencing depth, the use of an alternate cell line, or different delivery methods. Nonetheless, the present analysis excludes the presence of a new class of highly active off-target sites that are generated by the fusion of the ZFP to Cas9. This system has advantages over other previously described Cas9 variant systems that improve precision.sup.10,25-30. The presently disclosed SpCas9.sup.MT-pDBD system increases the targeting range of the nuclease by expanding the repertoire of highly active PAM sequences. This is in contrast to dimeric systems (e.g., dual nickases or FokI-dCas9 nucleases) that have a more restricted targeting range due to the requirement for a pair of compatible target sequences. Moreover, the presently disclosed chimeric system may be compatible with either of these dimeric nuclease variants, providing a further potential increase in precision while also expanding the number of compatible target sites for these platforms. In addition, the affinity and the specificity of the pDBD component can also be easily tuned to achieve the desired level of nuclease activity and precision for demanding gene therapy applications.
[0441] SpCas9-ZFPs targeting TS2/TS3/TS4 were programmed with four-finger ZFPs, as it was believed that these would have an optimal balance of specificity and affinity, for example, SpCas9.sup.MT3-ZFP.sup.TS3. However, SpCas9.sup.MT3-ZFP.sup.TS2 achieved improved precision with a three-finger ZFP, demonstrating pDBD flexibility. In addition to tuning the pDBD, adjusting the linker length and composition should yield further improvements in precision (and potentially activity) by further restricting the relative orientation and spacing of the SpCas9 and pDBD. Finally, it should be possible to generate Cas9-pDBD fusions for Cas9 orthologs from other species that have superior characteristics for gene therapy applications (e.g. more compact Type IIC Cas9 nucleases.sup.49-50 for viral delivery). Ultimately, for gene therapy applications where precision, activity and target site location are of paramount importance, the expanded targeting range and precision achieved by the Cas9-pDBD framework provide a potent platform for the optimization of nuclease-based reagents that cleave a single target site in the human genome.
VI. Kits
[0442] In another embodiment, the present invention contemplates kits for the practice of the methods of this invention. The kits preferably include one or more containers containing a Cas9 nuclease-DNA targeting unit fusion protein to practice a method of this invention. The kit can optionally contain a Cas9 nuclease fused to a dimerization domain and a DNA-targeting unit fused to a complementary dimerization domain. The kit can optionally include a zinc finger protein. The kit can optionally include a transcription activator-like effector protein. The kit can optionally include a homeodomain protein. The kit can optionally include an orthogonal Cas9 protein serving as the DNA targeting unit. The kit can optionally include a Cas9 fusion protein comprising a mutated PAM recognition domain. The kit can optionally include a single guide RNA molecule or gene, complementary to a specific genomic target. The kit can optionally include a second single guide RNA molecule or gene, complementary to a specific genomic target for the orthogonal Cas9 protein serving as the DNA-targeting unit. The kit can optionally include a truncated single guide RNA molecule or gene, completely complementary to a desired specific genomic target. The kit can optionally include enzymes capable of performing PCR (e.g., DNA polymerase, Taq polymerase and/or restriction enzymes). The kit can optionally include a pharmaceutically acceptable excipient and/or a delivery vehicle (e.g., a liposome). The reagents may be provided suspended in the excipient and/or delivery vehicle or may be provided as a separate component which can be later combined with the excipient and/or delivery vehicle. The kit may optionally contain additional therapeutics to be co-administered with the nuclease to drive the desired type of DNA repair (e.g. non-homologous end joining or homology directed repair). 
The kit may include a small molecule to drive drug-dependent dimerization of the Cas9-nuclease and the DNA targeting unit. The kit may include an exogenous donor DNA (either single stranded or duplex) that can be used as a donor for introducing tailor-made changes to the DNA sequence. The kit may include a small molecule to drive a change in subcellular localization for the Cas9 nuclease or the DNA-targeting unit to control the kinetics of its activity. The kit may include a small molecule to stabilize the Cas9 nuclease-DTU by attenuating degradation due to an attached destabilization domain.
[0443] The kits may also optionally include appropriate systems (e.g. opaque containers) or stabilizers (e.g. antioxidants) to prevent degradation of the reagents by light or other adverse conditions.
[0444] The kits may optionally include instructional materials containing directions (i.e., protocols) providing for the use of the reagents in the editing and/or deletion of a specific genomic target. While the instructional materials typically comprise written or printed materials they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user may be contemplated by this invention. Such media include, but are not limited to electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such media may include addresses to internet sites that provide such instructional materials or assistance in the design and implementation of the Cas9 nuclease-DTU for specific genomic targets.
EXPERIMENTAL
Example I
Plasmid Constructs
[0445] For Cas9-DBD experiments an sgRNA expression plasmid pLKO1-puro was used as described previously. Stewart et al., Lentivirus-delivered stable gene silencing by RNAi in primary cells. RNA. 2003 April; 9(4): 493-501. SpCas9 and SpCas9-DBD fusions are expressed from pCS2-Dest gateway plasmid under chicken beta globin promoter. Villefranc et al., Gateway compatible vectors for analysis of gene function in the zebrafish. Dev Dyn. 2007 November; 236(11): 3077-87. For SSA directed nuclease activity assay, an M427 plasmid was used as previously reported. Wilson et al., Design and Development of Artificial Zinc Finger Transcription Factors and Zinc Finger Nucleases to the hTERT Locus. Mol Ther Nucleic Acids. 2013 Apr. 23; 2: e87.
[0446] Cas9-DBD target sites are cloned into the Sbf1-digested backbone in a ligation-independent manner. The Sbf1-digested M427 vector backbone may be treated with T4 DNA polymerase to recess the ends. Small double-stranded oligonucleotides with flanking ends compatible with the recessed ends of the vector are hybridized with the vector backbone in a thermocycler and directly transformed into bacteria.
[0447] ZFPs were assembled as gBlocks (Integrated DNA Technologies) from finger modules based on previously described recognition preferences. ZFPs were cloned into a pCS2-Dest-SpCas9 plasmid backbone through BspEI and XhoI sites.
[0448] TALEs were assembled via golden gate assembly.sup.55 into JDS TALE plasmids.sup.56. Assembled TALEs were cloned into BbsI digested pCS2-Dest-SpCas9-TALEntry backbone through Acc65I and BamHI sites.
[0449] Sequences of the SpCas9-pDBDs are presented herein and these plasmids are deposited at Addgene for distribution to the community. Plasmid reporter assays of nuclease activity utilized the restoration of GFP activity through SSA-mediated repair of an inactive GFP construct using the M427 plasmid.sup.46. SpCas9 target sites were cloned into plasmid M427 via ligation-independent methods following Sbf1 digestion. Mutations in the PAM interacting domain of SpCas9 were generated by cassette mutagenesis.
Example II
Cell Culture and Transfection
[0450] Human Embryonic Kidney (HEK293T) cells were cultured in high glucose DMEM with 10% FBS and 1% Penicillin/Streptomycin (Gibco) in a 37° C. incubator with 5% CO.sub.2. For transient transfection, early to mid-passage cells (passage number 5-25) were used. Approximately 1.6×10.sup.5 cells were transfected with 50 ng SpCas9/DBD expressing plasmid, 50 ng sgRNA expressing plasmid, and 100 ng mCherry plasmid via Polyfect transfection reagent (Qiagen) in 24-well format according to the manufacturer's suggested protocol. For the SSA-reporter assay, 150 ng M427 SSA-reporter plasmid was also added to the co-transfection mix.
Example III
Western Blot Analysis
[0451] HEK293T cells are transfected with 500 ng Cas9 and 500 ng sgRNA expressing plasmid in a 6-well plate by Lipofectamine 3000 transfection reagent (Invitrogen) according to the manufacturer's suggested protocol. 48 hours after transfection, cells are harvested and lysed with 100 μl RIPA buffer. 8 μl of cell lysate is used for electrophoresis and blotting. The blots are probed with anti-HA (Sigma #H9658) and anti-alpha-tubulin (Sigma #T6074) primary antibodies, then with HRP-conjugated anti-mouse IgG (Abcam #ab6808) and anti-rabbit IgG secondary antibodies, respectively. Visualization employed Immobilon Western Chemiluminescent HRP substrate (EMD Millipore #WBKLS0100).
Example IV
Flow Cytometry Reporter Assay
[0452] 48 hours post transfection, cells were trypsinized and harvested into a microcentrifuge tube. Cells were centrifuged at 500×g for 2 minutes, washed once with 1×PBS and resuspended in 1×PBS for flow cytometry (Becton Dickinson FACScan). For FACS analysis, 10000 events were counted from each sample. To minimize the effect of transfection variation among samples, cells were first gated for mCherry expression, and the percentage of EGFP-expressing cells was quantified within the mCherry-positive population. All experiments were performed in triplicate on different days, and mean values and standard error of the mean were calculated.
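The gating and summary arithmetic in this paragraph can be sketched as follows. This is a minimal Python illustration, not the analysis software used with the cytometer: the function names and the boolean event representation are hypothetical, and real data would first require fluorescence thresholds to call each event mCherry-positive or EGFP-positive.

```python
from math import sqrt
from statistics import mean, stdev


def percent_egfp_in_mcherry(events):
    """Percentage of EGFP-positive events within the mCherry-positive gate.

    events: iterable of (mcherry_positive, egfp_positive) booleans
    (a hypothetical representation of already-thresholded events).
    """
    gated = [egfp for mcherry, egfp in events if mcherry]
    if not gated:
        raise ValueError("no mCherry-positive events to gate on")
    return 100.0 * sum(gated) / len(gated)


def mean_and_sem(values):
    """Mean and standard error of the mean across replicate experiments."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    return mean(values), stdev(values) / sqrt(len(values))


# Toy events: three fall inside the mCherry gate, two of those are EGFP+.
toy_events = [(True, True), (True, False), (False, True), (True, True)]
toy_percent = percent_egfp_in_mcherry(toy_events)
```

Gating on mCherry first means that untransfected cells do not dilute the EGFP-positive fraction, which is the normalization the paragraph describes.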
Example V
Genomic Target Analysis (T7E1)
[0453] 72 hours post transfection, cells were harvested and genomic DNA was extracted via DNeasy Blood and Tissue kit (Qiagen) according to the manufacturer's suggested protocol. 50 ng input DNA was PCR amplified with Phusion High Fidelity DNA Polymerase (New England Biolabs): (98° C., 15s; 67° C., 25s; 72° C., 18s)×30 cycles. 10 μl of PCR product was hybridized and treated with 0.5 μl T7 Endonuclease I in 1×NEB Buffer2 for 45 minutes.sup.57. The samples were run on a 2.5% agarose gel and quantified with ImageJ software (PMID 22930834). Indel percentages were calculated as previously described (PMID 23478401). All experiments were performed in triplicate on different days, and mean values and standard error of the mean were calculated.
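For orientation, a widely used way to convert T7 Endonuclease I band intensities into an indel estimate is sketched below in Python. The text only cites prior work for the calculation, so the formula here (indel % = 100 × (1 − sqrt(1 − fraction cleaved))) is an assumption about what was applied, and the function name and inputs are hypothetical.

```python
from math import sqrt


def t7e1_indel_percent(uncut_intensity, cut_intensities):
    """Estimate indel percentage from gel band intensities.

    Assumes the common heteroduplex correction:
        indel % = 100 * (1 - sqrt(1 - fraction_cleaved))
    where fraction_cleaved = cleaved / (cleaved + uncleaved).
    """
    cleaved = sum(cut_intensities)
    total = uncut_intensity + cleaved
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = cleaved / total
    return 100.0 * (1.0 - sqrt(1.0 - fraction_cleaved))


# Toy intensities (arbitrary units): parental band 64, two cleavage bands 18 each.
toy_indel = t7e1_indel_percent(64.0, [18.0, 18.0])
```

The square-root term accounts for the fact that only heteroduplexes are cleaved, so the cleaved fraction underestimates the mutant allele fraction at high editing rates.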
Example VI
Targeted Deep-Sequencing
[0454] To generate each amplicon, a two-step PCR amplification approach was used, first to amplify the genomic segments and then to install barcodes and indexes.
[0455] In a first step, “locus-specific primers” were used bearing common overhangs with complementary tails to the TruSeq adaptor sequences. 50 ng input DNA was PCR amplified with Phusion High Fidelity DNA Polymerase (New England Biolabs): (98° C., 15s; 67° C. 25s; 72° C. 18s)×30 cycles. 5 μl of each PCR reaction was gel-quantified by ImageJ against a reference ladder and equal amounts from each genomic locus PCR were pooled for each treatment group (15 different treatment groups). The pooled PCR products from each group were run on a 2% agarose gel and the DNA from the expected product size (between 100 and 200 bp) was extracted and purified via QIAquick Gel Extraction Kit (Qiagen).
[0456] In a second step, the purified pool from each treatment group was amplified with a “universal forward primer” and an “indexed reverse primer” to reconstitute the TruSeq adaptors. 2 ng of input DNA was PCR amplified with Phusion High Fidelity DNA Polymerase (New England Biolabs): (98° C., 15s; 61° C., 25s; 72° C., 18s)×9 cycles. 5 μl of each PCR reaction was gel-quantified by ImageJ, and then equal amounts of the products from each treatment group were mixed and run on a 2% agarose gel. Full-size products (˜250 bp in length) were gel-extracted and purified via QIAquick Gel Extraction Kit (Qiagen). The purified library was deep sequenced using a paired-end 150 bp MiSeq run. Sequences from each genomic locus within a specific index were identified based on a perfect match to the final 11 bp of the proximal genomic primer used for locus amplification.
[0457] Insertions or deletions in a SpCas9 target region were defined based on the distance between a “prefix” sequence at the 5′ end of each off-target site (typically 10 bp) and a “suffix” sequence at the 3′ end of each off-target site (typically 10 bp).sup.59, where there were typically 33 bp between these elements in the unmodified locus.
[0458] Distances that were greater than expected were binned as “insertions (I)”, and distances that were shorter were binned as “deletions (D)”. Reads that did not contain the suffix sequence were marked as undefined (U). For some loci the background sequencing error rate was high. For example, for OT2-1, a homopolymer sequence in the guide region leads to a high error rate. All statistical analyses were performed using R, a system for statistical computation and graphics.sup.60.
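The prefix/suffix distance rule described in the two preceding paragraphs can be illustrated with a short Python sketch. The function names, the "WT" label for unmodified reads, and the choice to mark reads lacking the prefix as undefined are assumptions made for illustration; the text itself only specifies the I, D and U bins.

```python
from collections import Counter


def classify_read(read, prefix, suffix, expected_gap):
    """Classify a read as 'I', 'D', 'WT' or 'U' from the prefix-suffix distance.

    expected_gap is the number of bases between prefix and suffix in the
    unmodified locus (typically 33 bp according to the text).
    """
    p = read.find(prefix)
    if p == -1:
        return "U"  # assumption: a missing prefix is treated like a missing suffix
    gap_start = p + len(prefix)
    s = read.find(suffix, gap_start)
    if s == -1:
        return "U"
    gap = s - gap_start
    if gap > expected_gap:
        return "I"
    if gap < expected_gap:
        return "D"
    return "WT"


def tally_reads(reads, prefix, suffix, expected_gap):
    """Count reads per class for one locus."""
    return Counter(classify_read(r, prefix, suffix, expected_gap) for r in reads)


# Toy locus with a 4 bp spacer between a 10 bp prefix and a 10 bp suffix.
PREFIX, SUFFIX = "AAAAACCCCC", "GGGGGTTTTT"
toy_reads = [
    PREFIX + "ACGT" + SUFFIX,   # unmodified
    PREFIX + "ACGTA" + SUFFIX,  # 1 bp insertion
    PREFIX + "ACG" + SUFFIX,    # 1 bp deletion
    PREFIX + "ACGT",            # suffix missing
]
toy_counts = tally_reads(toy_reads, PREFIX, SUFFIX, expected_gap=4)
```

Because this rule looks only at length changes, substitutions inside the window are not counted as lesions, which is one reason homopolymer-driven length errors can inflate the background at some loci.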
[0459] Log odds ratios of lesion were calculated for the on-target and off-target sites of each individual Cas9 treatment group vs. the untreated control for each of the three independent experiments. A t-test was applied to assess whether the log odds ratio was significantly different from 0, i.e., whether there was a significant difference in lesion odds between each individual Cas9 treatment group and the untreated control for the on-target and off-target sites. Odds ratios and their 99% confidence intervals were obtained by taking the exponent of the estimated log odds ratios and their 99% confidence intervals. These analyses were also applied to the sum of the lesion rates across all three replicates (combined).
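The statistics in this paragraph, together with the Benjamini-Hochberg adjustment named in the following paragraph, can be sketched in Python as below. The analyses in the text were run in R; this is a stand-alone illustration, not the authors' script. The 0.5 pseudocount is an added assumption to avoid division by zero, and the t-test helper is restricted to three replicates because Student's t with 2 degrees of freedom has a simple closed-form p-value.

```python
from math import exp, log, sqrt
from statistics import mean, stdev

T_CRIT_99_DF2 = 9.925  # two-sided 99% critical value of Student's t, df = 2


def log_odds_ratio(lesion_t, total_t, lesion_c, total_c, pseudo=0.5):
    """Log odds ratio of lesion, treated vs. control (pseudocount is an assumption)."""
    odds_t = (lesion_t + pseudo) / (total_t - lesion_t + pseudo)
    odds_c = (lesion_c + pseudo) / (total_c - lesion_c + pseudo)
    return log(odds_t / odds_c)


def t_test_three_replicates(log_ors):
    """One-sample t-test against 0 for exactly three log odds ratios.

    Returns (t, two-sided p, odds ratio, 99% confidence interval of the odds ratio).
    """
    if len(log_ors) != 3:
        raise ValueError("this sketch handles exactly three replicates")
    m = mean(log_ors)
    se = stdev(log_ors) / sqrt(3)
    if se == 0:
        raise ValueError("replicates are identical; t is undefined")
    t = m / se
    p = 1.0 - abs(t) / sqrt(2.0 + t * t)  # closed form for df = 2
    ci = (exp(m - T_CRIT_99_DF2 * se), exp(m + T_CRIT_99_DF2 * se))
    return t, p, exp(m), ci


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, `t_test_three_replicates([log_odds_ratio(...) for each experiment])` gives one p-value per locus, and the list of per-locus p-values is then passed to `benjamini_hochberg`.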
[0460] To adjust for multiple comparisons, p-values were adjusted using the Benjamini-Hochberg (BH) method.sup.61. Only loci that have significant BH-adjusted p-values in the combined data for the treatment group relative to the control were considered significant.
GUIDE-Seq Off-Target Analysis for SpCas9-pDBDs
GUIDE-Seq was performed with some modifications to the original protocol.sup.17. The following primer sets were used for the positive (+) and negative (−) strands to get successful library amplification:
TABLE-US-00004
Nuclease_off_+_GSP1 (SEQ ID NO: 36) GGATCTCGACGCTCTCCCTGTTTAATTGAGTTGTCATATGTTAATAAC
Nuclease_off_-_GSP1 (SEQ ID NO: 37) GGATCTCGACGCTCTCCCTATACCGTTATTAACATATGACA
Nuclease_off_+_GSP2 (SEQ ID NO: 38) CTCTCTATGGGCAGTCGGTGATTTGAGTTGTCATATGTTAATAACGGTA
Nuclease_off_-_GSP2 (SEQ ID NO: 39) CCTCTCTATGGGCAGTCGGTGATACATATGACAACTCAATTAAAC
In addition, this protocol differed from a previously published protocol.sup.17 in the following manner: In a 24-well format, HEK293T cells were transfected with 250 ng Cas9, 150 ng sgRNA, 50 ng GFP, and 10 pmol of annealed GUIDE-Seq oligonucleotide using Lipofectamine 3000 transfection reagent (Invitrogen) according to manufacturer's suggested protocol. 48 hours post-transfection, genomic DNA was extracted via DNeasy Blood and Tissue kit (Qiagen) according to the manufacturer's suggested protocol. Library preparations were done with original adaptors according to protocols described by the Joung laboratory.sup.17, where each library was barcoded for pooled sequencing. The barcoded, purified libraries were deep sequenced as a pool using two paired-end 150 bp MiSeq runs.
[0461] Reads containing the identical molecular index and identical starting 8 bp elements on Read1 were pooled into one unique read. The initial 30 bp and the final 50 bp of the unique Read2 sequences were clipped to remove the adapter sequence and low-quality sequence, and the reads were then mapped to the human genome (hg19) using Bowtie2. Peaks containing mapped unique reads were identified using the pile-up program ESAT (garberlab.umassmed.edu/software/esat/) with a window of 25 bp and a 15 bp overlap. Neighboring windows that were on different strands of the genome and less than 50 bp apart were merged using the Bioconductor package ChIPpeakAnno.sup.62,63. Peaks that were present with multiple different guides (hotspots.sup.17) or that did not contain unique reads for both sense and anti-sense libraries.sup.17 were discarded. The remaining peaks were searched for sequence elements that were complementary to the nuclease target site using CRISPRseek.sup.21. Only peaks that harbored a sequence with fewer than 7 mismatches to the target site were considered potential off-target sites. The numbers of reads from these regions of the sense and the anti-sense libraries were combined into the final read number.
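Three of the read-handling steps in this paragraph (collapsing duplicates, clipping Read2, and the mismatch cut-off) can be sketched in Python. This is an illustration under stated assumptions, not the pipeline that was run: the function names are hypothetical, keeping the first read of each duplicate group is an arbitrary choice, and the plain position-by-position mismatch count stands in for the CRISPRseek search named in the text.

```python
def collapse_reads(reads):
    """Pool reads sharing a molecular index and the first 8 bp of Read1.

    reads: iterable of (molecular_index, read1, read2) tuples.
    Keeps the first read seen for each key (an arbitrary choice).
    """
    unique = {}
    for umi, read1, read2 in reads:
        unique.setdefault((umi, read1[:8]), (umi, read1, read2))
    return list(unique.values())


def clip_read2(read2, head=30, tail=50):
    """Remove the initial 30 bp and final 50 bp of a Read2 sequence."""
    end = max(head, len(read2) - tail)  # guards against reads shorter than head + tail
    return read2[head:end]


def count_mismatches(site, target):
    """Number of mismatched positions between two equal-length sequences."""
    if len(site) != len(target):
        raise ValueError("sequences must have equal length")
    return sum(1 for a, b in zip(site, target) if a != b)


def is_potential_off_target(site, target, max_mismatches=6):
    """True when the site has fewer than 7 mismatches to the target site."""
    return count_mismatches(site, target) <= max_mismatches


toy_reads = [
    ("UMI1", "ACGTACGTAAAA", "R2a"),
    ("UMI1", "ACGTACGTCCCC", "R2b"),  # same index and same first 8 bp: duplicate
    ("UMI2", "ACGTACGTAAAA", "R2c"),
]
toy_unique = collapse_reads(toy_reads)
```

Collapsing on the molecular index before peak calling keeps PCR duplicates from inflating the read count assigned to a cleavage site.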
Example VII
CRISPRseek Analysis
[0462] Human hg19 exon and promoter sequences were fetched using Bioconductor packages ChIPpeakAnno.sup.62,63 and TxDb.Hsapiens.UCSC.hg19.knownGene. A subset of 16500 exons and 192 promoter sequences of 2 kb each was selected for sgRNA searching, and genome-wide off-target analysis was performed using the Bioconductor package CRISPRseek.sup.21,22 with the default settings (both nGG and nAG PAMs were allowed) except BSgenomeName=BSgenome.Hsapiens.UCSC.hg19, annotateExon=FALSE, outputUniqueREs=FALSE, exportAllgRNAs=“fasta” and fetchSequence=FALSE.
[0463] After excluding sgRNAs with on-target and/or off-target sites in the haplotype blocks, there were 124793 unique sgRNAs from exon sequences and 55687 unique sgRNAs from promoter sequences included in the analysis. Each guide was binned based on either the off-target site with the fewest number of mismatches to the guide sequence or the sum of the off-target scores for the top 10 off-target sites. The fraction of guides in each bin for exons or promoters was displayed as a pie chart.
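The first binning rule (closest off-target site per guide) reduces to a small tally, sketched here in Python. The input layout, the function name and the "none found" bin label are hypothetical; in practice the mismatch counts would come from the CRISPRseek output described above.

```python
from collections import Counter


def bin_guides_by_closest_off_target(off_target_mismatches):
    """Fraction of guides per bin, binned by the fewest mismatches among off-targets.

    off_target_mismatches: dict mapping guide name -> list of mismatch counts,
    one per predicted off-target site (hypothetical input layout).
    """
    if not off_target_mismatches:
        raise ValueError("no guides supplied")
    bins = Counter()
    for mismatches in off_target_mismatches.values():
        key = min(mismatches) if mismatches else "none found"
        bins[key] += 1
    total = sum(bins.values())
    return {key: count / total for key, count in bins.items()}


toy_guides = {"g1": [1, 3], "g2": [2], "g3": [1, 4], "g4": []}
toy_fractions = bin_guides_by_closest_off_target(toy_guides)
```

The resulting fractions are what a pie chart of guide quality would display, one slice per bin.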
Example VIII
Cas9-ZFP Fusions
[0464] In principle, Zinc Finger Proteins (ZFPs) containing from three to six fingers can be designed for the construction of Cas9-ZFPs, which bind 9 bp to 18 bp target sites, respectively (i.e., approximately 3 bp per finger). Based on the data presented herein with the Cas9-ZFP.sup.TS2/TS3/TS4 system, construction of a four-finger ZFP is preferable for initial testing of Cas9-ZFPs at a particular target site.
[0465] For Cas9-ZFPs containing a 58 aa linker, the target site can be 5 to 14 bp downstream of the last base pair of the PAM triplet and can be on either the Watson or the Crick strand. If longer ZFPs are desired (5 or 6 fingers), one or more TGSQKP linkers are preferable to break an array into 2 or 3 finger module sets.sup.1. Other modified linkers can be utilized to skip a base between pairs of zinc finger modules to achieve more favorable recognition by neighboring arrays if desired. For the commercial design of zinc fingers, Sangamo Biosciences' proprietary zinc finger module archive has a design density likely less than every 10 bp.sup.4. Combined with the flexibility of the spacing and orientation, this means that multiple ZFPs can be designed and tested around almost any Cas9 target site. These ZFPs can be purchased from Sigma Aldrich.
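The spacing rule above, combined with the 3 bp per finger rule of the preceding paragraph, can be turned into a small coordinate calculation. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that "5 to 14 bp downstream of the last base pair of the PAM" means the ZFP site begins at an offset of 5 to 14 from the index of the last PAM base. Strand choice (Watson or Crick) is not modeled; the windows are coordinates that could be read on either strand.

```python
BP_PER_FINGER = 3  # approximate footprint of one zinc finger


def zfp_site_length(fingers):
    """Target site length in bp for a ZFP with three to six fingers."""
    if not 3 <= fingers <= 6:
        raise ValueError("the text describes ZFPs with three to six fingers")
    return BP_PER_FINGER * fingers


def candidate_zfp_windows(pam_last_index, fingers, seq_length, min_offset=5, max_offset=14):
    """Half-open (start, end) windows for a ZFP site downstream of a PAM.

    pam_last_index: 0-based index of the last base of the PAM triplet.
    Offsets are measured from that index (an interpretive assumption).
    """
    size = zfp_site_length(fingers)
    windows = []
    for offset in range(min_offset, max_offset + 1):
        start = pam_last_index + offset
        end = start + size
        if end <= seq_length:  # keep only windows that fit in the sequence
            windows.append((start, end))
    return windows


# Four-finger ZFP next to a PAM whose last base is at index 22 of a 60 bp region.
toy_windows = candidate_zfp_windows(pam_last_index=22, fingers=4, seq_length=60)
```

Each window is a candidate 12 bp site to check against a finger archive; allowing ten offsets and both strands is what makes it likely that at least one designable site exists near a given guide.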
[0466] In addition, a number of open-source systems have been described for selecting or assembling ZFPs. Highly specific ZFPs can be selected from randomized finger libraries using phage or bacterial selections, but this process is labor intensive and may be accessible to only few laboratories. By contrast, modular assembly.sup.6,7,16-20 wherein pre-characterized single zinc finger modules that recognize 3-base-pair (bp) subsites are joined into arrays, rapidly yields ZFPs that bind desired target sites, and has proven to be an effective method for the creation of active Cas9-ZFPs. For modular assembly, a number of zinc finger archives have been described focusing on single-finger (1F).sup.5,17,19,21 and two-finger (2F) modules.sup.6,7,16,18,22.
[0467] Using phage-based selections, the Barbas lab identified 1F-modules that target 49 of the 64 triplets.sup.11-14,17. The Kim lab has reported 1F-modules recognizing 38 of the 64 triplets.sup.19. A curated archive of 1F-modules that bind 27 of the 64 triplets has been published.sup.21.
[0468] Recently, using bacterial one-hybrid based selections, the Noyes lab defined zinc finger modules that can recognize each of the 64 DNA triplets, allowing the targeting of virtually any DNA sequence.sup.5. In addition, two-finger archives have been published that take into account the finger-finger interface and therefore can yield ZFPs with higher specificity, but the targeting range of these 2F archives is more limited.sup.6,7,16,18. The 1F and 2F archives described herein can be used to design a ZFP roughly every 10 bp, whereas some of the other finger archives can achieve even higher design densities. With the number of finger archives now available, it is possible to design a ZFPA targeting almost every DNA sequence.
[0469] Moreover, there are a number of tools available to help users identify the best target site and design a ZFP. A web-based tool has been designed for the identification of Cas9-ZFP target sites for which ZFPs can be designed from our zinc finger archive (mccb.umassmed.edu/Cas9-pDBD_search). This site provides a simple scoring function for the evaluation of ZFPs with higher activity based on the number of arginine-guanine contacts that are present. Tools from other laboratories are available for the construction of ZFPAs. The “Zinc Finger Tools” published by the Barbas lab can identify target sites for single ZFPs and design ZFPs using their archive of 49 1F-modules.sup.23 (scripps.edu/barbas/zfdesign/zfdesignhome.php). The Joung laboratory has developed a suite of tools, “ZiFiT”, that allows the design of ZFPAs for a particular target sequence.sup.24 (zifit.partners.org/ZiFiT/). In addition, a zinc finger tool developed by the Noyes laboratory can be used to design zinc finger arrays one finger at a time for a desired target site.sup.5 (zf.princeton.edu/b1h/dna.html). This tool provides multiple zinc fingers for every DNA triplet but does not identify the best zinc finger site in a given target sequence.
Example IX
Cas9-TALE Fusions
[0470] When designing TALE arrays for a Cas9-TALE fusion, a minimum of a 10 bp target site is preferred (excluding the 5′ T) located approximately 10-14 bp downstream and on the Watson strand relative to the NGG PAM site. Alternatively, a target site may comprise a 5′ T.sup.25. Multiple programs are available that allow design of single TAL-arrays, including TALE-NT.sup.26 (tale-nt.cac.cornell.edu/) and SAPTA TAL Targeter Tool.sup.27 (bao.rice.edu/Research/BioinformaticTools/TAL_targeter.html).
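As a rough illustration of the placement rule in this paragraph, the Python sketch below scans the Watson strand for NGG PAMs and reports 10 bp sites preceded by a 5′ T at the stated spacing. It is not a substitute for the design programs named above: the function name is hypothetical, and it assumes the 10-14 bp spacing counts the bases between the end of the PAM and the 5′ T.

```python
import re


def find_tale_sites(seq, site_len=10, min_gap=10, max_gap=14):
    """Return (pam_start, gap, site) tuples for candidate TALE sites.

    A candidate is a T followed by site_len bases, with the T located
    min_gap..max_gap bases past the end of an NGG PAM on the given strand.
    The meaning of the gap is an interpretive assumption.
    """
    seq = seq.upper()
    hits = []
    # Lookahead so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = match.start()
        pam_end = pam_start + 3  # index just past the PAM
        for gap in range(min_gap, max_gap + 1):
            t_pos = pam_end + gap
            site_start = t_pos + 1
            site_end = site_start + site_len
            if site_end <= len(seq) and seq[t_pos] == "T":
                hits.append((pam_start, gap, seq[site_start:site_end]))
    return hits


# Toy sequence: PAM, a 10 bp spacer, the 5' T, then a 10 bp site.
toy_seq = "AGG" + "C" * 10 + "T" + "ACACACACAC"
toy_hits = find_tale_sites(toy_seq)
```

A real design would additionally check the candidate site against the repeat-variable diresidue code and the specificity considerations handled by the tools cited in the text.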
REFERENCES
[0471] 1. Doudna, J. A. & Charpentier, E. Genome editing. The new frontier of genome engineering with CRISPR-Cas9. Science 346, 1258096-1258096 (2014).
[0472] 2. Sander, J. D. & Joung, J. K. CRISPR-Cas systems for editing, regulating and targeting genomes. Nature biotechnology 32, 347-355 (2014).
[0473] 3. Hsu, P. D., Lander, E. S. & Zhang, F. Development and Applications of CRISPR-Cas9 for Genome Engineering. Cell 157, 1262-1278 (2014).
[0474] 4. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821 (2012).
[0475] 5. Sternberg, S. H., Redding, S., Jinek, M., Greene, E. C. & Doudna, J. A. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 62-67 (2014).
[0476] 6. Szczelkun, M. D. et al. Direct observation of R-loop formation by single RNA-guided Cas9 and Cascade effector complexes. Proceedings of the National Academy of Sciences 111, 9798-9803 (2014).
[0477] 7. Anders, C., Niewoehner, O., Duerst, A. & Jinek, M. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513, 569-573 (2014).
[0478] 8. Jiang, F., Zhou, K., Ma, L., Gressel, S. & Doudna, J. A. STRUCTURAL BIOLOGY. A Cas9-guide RNA complex preorganized for target DNA recognition. Science 348, 1477-1481 (2015).
[0479] 9. Hsu, P. D. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nature biotechnology 31, 827-832 (2013).
[0480] 10. Tsai, S. Q. et al. Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing. Nature biotechnology 32, 569-576 (2014).
[0481] 11. Zhang, Y. et al. Comparison of non-canonical PAMs for CRISPR/Cas9-mediated DNA cleavage in human cells. Sci Rep 4, 5405 (2014).
[0482] 12. Gabriel, R., Kalle, von, C. & Schmidt, M. Mapping the precision of genome editing. Nature biotechnology 33, 150-152 (2015).
[0483] 13. Ledford, H. CRISPR, the disruptor. Nature 522, 20-24 (2015).
[0484] 14. Fu, Y. et al. High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nature biotechnology 31, 822-826 (2013).
[0485] 15. Lin, Y. et al. CRISPR/Cas9 systems have off-target activity with insertions or deletions between target DNA and guide RNA sequences. Nucleic Acids Research 42, 7473-7485 (2014).
[0486] 16. Pattanayak, V. et al. High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nature biotechnology 31, 839-843 (2013).
[0487] 17. Tsai, S. Q. et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nature biotechnology 33, 187-197 (2015).
[0488] 18. Frock, R. L. et al. Genome-wide detection of DNA double-stranded breaks induced by engineered nucleases. Nature biotechnology 33, 179-186 (2015).
[0489] 19. Kim, D. et al. Digenome-seq: genome-wide profiling of CRISPR-Cas9 off-target effects in human cells. Nature Methods 12, 237-243 (2015).
[0490] 20. Wang, X. et al. Unbiased detection of off-target cleavage by CRISPR-Cas9 and TALENs using integrase-defective lentiviral vectors. Nature biotechnology (2015).
[0491] 21. Zhu, L. J., Holmes, B. R., Aronin, N. & Brodsky, M. H. CRISPRseek: A Bioconductor Package to Identify Target-Specific Guide RNAs for CRISPR-Cas9 Genome-Editing Systems. PLoS ONE 9, e108424 (2014).
[0492] 22. Zhu, L. J. Overview of guide RNA design tools for CRISPR-Cas9 genome editing technology. Frontiers in Biology (2015).
[0493] 23. Brunet, E. et al. Chromosomal translocations induced at specified loci in human stem cells. Proceedings of the National Academy of Sciences 106, 10620-10625 (2009).
[0495] 24. Lee, H. J., Kim, E. & Kim, J. S. Targeted chromosomal deletions in human cells using zinc finger nucleases. Genome Research 20, 81-89 (2010).
[0496] 25. Fu, Y., Sander, J. D., Reyon, D., Cascio, V. M. & Joung, J. K. Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature biotechnology 32, 279-284 (2014).
[0497] 26. Cho, S. W. et al. Analysis of off-target effects of CRISPR/Cas-derived RNA-guided endonucleases and nickases. Genome Research 24, 132-141 (2014).
[0498] 27. Ran, F. A. et al. Double Nicking by RNA-Guided CRISPR Cas9 for Enhanced Genome Editing Specificity. Cell 154, 1380-1389 (2013).
[0499] 28. Mali, P. et al. CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nature biotechnology 31, 833-838 (2013).
[0500] 29. Guilinger, J. P., Thompson, D. B. & Liu, D. R. Fusion of catalytically inactive Cas9 to FokI nuclease improves the specificity of genome modification. Nature biotechnology 32, 577-582 (2014).
[0501] 30. Zetsche, B., Volz, S. E. & Zhang, F. A split-Cas9 architecture for inducible genome editing and transcription modulation. Nature biotechnology 33, 139-142 (2015).
[0502] 31. Nihongaki, Y., Kawano, F., Nakajima, T. & Sato, M. Photoactivatable CRISPR-Cas9 for optogenetic genome editing. Nature biotechnology (2015).
[0503] 32. Wright, A. V. et al. Rational design of a split-Cas9 enzyme complex. Proceedings of the National Academy of Sciences 112, 2984-2989 (2015).
[0504] 33. Davis, K. M., Pattanayak, V., Thompson, D. B., Zuris, J. A. & Liu, D. R. Small molecule-triggered Cas9 protein with improved genome-editing specificity. Nat Chem Biol (2015). doi: 10.1038/nchembio.1793
[0505] 34. Kleinstiver, B. P. et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature (2015).
[0506] 35. Kim, S., Kim, D., Cho, S. W., Kim, J. & Kim, J. S. Highly efficient RNA-guided genome editing in human cells via delivery of purified Cas9 ribonucleoproteins. Genome Research 24, 1012-1019 (2014).
[0507] 36. Ramakrishna, S. et al. Gene disruption by cell-penetrating peptide-mediated delivery of Cas9 protein and guide RNA. Genome Research 24, 1020-1027 (2014).
[0508] 37. Zuris, J. A. et al. Cationic lipid-mediated delivery of proteins enables efficient protein-based genome editing in vitro and in vivo. Nature biotechnology 33, 73-80 (2015).
[0509] 38. Tsai, S. Q. & Joung, J. K. What's changed with genome editing? Cell Stem Cell 15, 3-4 (2014).
[0510] 39. Urnov, F. D., Rebar, E. J., Holmes, M. C., Zhang, H. S. & Gregory, P. D. Genome editing with engineered zinc finger nucleases. Nat Rev Genet 11, 636-646 (2010).
[0511] 40. Joung, J. K. & Sander, J. D. TALENs: a widely applicable technology for targeted genome editing. Nat. Rev. Mol. Cell Biol. 14, 49-55 (2013).
[0512] 41. Persikov, A. V. et al. A systematic survey of the Cys2His2 zinc finger DNA-binding landscape. Nucleic Acids Research 43, 1965-1984 (2015).
[0513] 42. Lamb, B. M., Mercer, A. C. & Barbas, C. F. Directed evolution of the TALE N-terminal domain for recognition of all 5′ bases. Nucleic Acids Research 41, 9779-9785 (2013).
[0514] 43. Boissel, S. et al. megaTALs: a rare-cleaving nuclease architecture for therapeutic genome engineering. Nucleic Acids Research 42, 2591-2601 (2014).
[0515] 44. Khalil, A. S. et al. A synthetic biology framework for programming eukaryotic transcription functions. Cell 150, 647-658 (2012).
[0516] 45. Meckler, J. F. et al. Quantitative analysis of TALE-DNA interactions suggests polarity effects. Nucleic Acids Research 41, 4118-4128 (2013).
[0517] 46. Wilson, K. A., Chateau, M. L. & Porteus, M. H. Design and Development of Artificial Zinc Finger Transcription Factors and Zinc Finger Nucleases to the hTERT Locus. Mol Ther Nucleic Acids 2, e87 (2013).
[0518] 47. Atkinson, H. & Chalmers, R. Delivering the goods: viral and non-viral gene therapy systems and the inherent limits on cargo DNA and internal sequences. Genetica 138, 485-498 (2010).
[0519] 48. Klemm, J. D. & Pabo, C. O. Oct-1 POU domain-DNA interactions: cooperative binding of isolated subdomains and effects of covalent linkage. Genes & Development 10, 27-36 (1996).
[0520] 49. Chylinski, K., Makarova, K. S., Charpentier, E. & Koonin, E. V. Classification and evolution of type II CRISPR-Cas systems. Nucleic Acids Research 42, 6091-6105 (2014).
[0521] 50. Hou, Z. et al. Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proceedings of the National Academy of Sciences 110, 15644-15649 (2013).
[0522] 51. Kearns, N. A. et al. Cas9 effector-mediated regulation of transcription and differentiation in human pluripotent stem cells. Development 141, 219-223 (2014).
[0523] 52. Villefranc, J. A., Amigo, J. & Lawson, N. D. Gateway compatible vectors for analysis of gene function in the zebrafish. Dev Dyn 236, 3077-3087 (2007).
[0524] 53. Gupta, A. et al. An optimized two-finger archive for ZFN-mediated gene targeting. Nature Methods 9, 588-590 (2012).
[0525] 54. Zhu, C. et al. Using defined finger-finger interfaces as units of assembly for constructing zinc-finger nucleases. Nucleic Acids Research 41, 2455-2465 (2013).
[0526] 55. Cermak, T. et al. Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Research 39, e82-e82 (2011).
[0527] 56. Kok, F. O., Gupta, A., Lawson, N. D. & Wolfe, S. A. Construction and application of site-specific artificial nucleases for targeted gene editing. Methods Mol Biol 1101, 267-303 (2014).
[0528] 57. Gupta, A. et al. Targeted chromosomal deletions and inversions in zebrafish. Genome Research 23, 1008-1017 (2013).
[0529] 58. Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years of image analysis. Nature Methods 9, 671-675 (2012).
[0530] 59. Gupta, A., Meng, X., Zhu, L. J., Lawson, N. D. & Wolfe, S. A. Zinc finger proteindependent and -independent contributions to the in vivo off-target activity of zinc finger nucleases. Nucleic Acids Research 39, 381-392 (2011). [0531] 60. Ihaka, R. & Gentleman, R. R: A Language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics 5, 299-314 (1996). [0532] 61. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B 57, 289-300 (1995). [0533] 62. Zhu, L. J. et al. ChIPpeakAnno: a Bioconductor package to annotate ChIP-seq and ChIP-chip data. BMC Bioinformatics 11, 237 (2010). [0534] 63. Zhu, L. J. in Methods in Molecular Biology (eds. Lee, T. L. & Shui Luk, A. C.) 1067, 105-124 (Humana Press, 2013).
SUPPLEMENTARY REFERENCES
[0535] Li, H. et al. In vivo genome editing restores haemostasis in a mouse model of haemophilia. Nature 475, 217-221 (2011). [0536] Yusa, K. et al. Targeted gene correction of al-antitrypsin deficiency in induced pluripotent stem cells. Nature 478, 391-394 (2011). [0537] Mahiny, A. J. et al. In vivo genome editing using nuclease-encoding mRNA corrects SP-B deficiency. Nature biotechnology (2015). [0538] Gupta, R. M. & Musunuru, K. Expanding the genetic editing tool kit: ZFNs, TALENs, and CRISPR-Cas9. J Clin Invest 124, 4154-4161 (2014). [0539] Persikov, A. V. et al. A systematic survey of the Cys2His2 zinc finger DNAbinding landscape. Nucleic Acids Research 43, 1965-1984 (2015). [0540] Zhu, C. et al. Using defined finger-finger interfaces as units of assembly for constructing zinc-finger nucleases. Nucleic Acids Research 41, 2455-2465 (2013). [0541] Gupta, A. et al. An optimized two-finger archive for ZFN-mediated gene targeting. Nature Methods 9, 588-590 (2012). [0542] Maeder, M. L., Thibodeau-Beganny, S., Sander, J. D., Voytas, D. F. & Joung, J. K. Oligomerized pool engineering (OPEN): an ‘open-source’ protocol for making customized zinc-finger arrays. Nat Protoc 4, 1471-1501 (2009). [0543] Maeder, M. et al. Rapid “‘Open-Source’” Engineering of Customized Zinc-Finger Nucleases for Highly Efficient Gene Modification. Molecular Cell 31, 294-301 (2008). [0544] Meng, X., Noyes, M. B., Zhu, L. J., Lawson, N. D. & Wolfe, S. A. Targeted gene inactivation in zebrafish using engineered zinc-finger nucleases. Nature biotechnology 26, 695-701 (2008). [0545] Dreier, B. et al. Development of zinc finger domains for recognition of the 5′-CNN-3′ family DNA sequences and their use in the construction of artificial transcription factors. J Biol Chem 280, 35588-35597 (2005). [0546] Dreier, B., Beerli, R., Segal, D., Flippin, J. & Barbas, C. 
Development of zinc finger domains for recognition of the 5′-ANN-3′ family of DNA sequences and their use in the construction of artificial transcription factors. Journal of Biological Chemistry 276, 29466 (2001). [0547] Dreier, B., Segal, D. J. & Barbas, C. F. Insights into the molecular recognition of the 5′-GNN-3′ family of DNA sequences by zinc finger domains. J Mol Biol 303, 489-502 (2000). [0548] Segal, D. J., Dreier, B., Beerli, R. R. & Barbas, C. F. Toward controlling gene expression at will: selection and design of zinc finger domains recognizing each of the 5′-GNN-3′ DNA target sequences. Proc Natl Acad Sci USA 96, 2758-2763 (1999). [0549] Greisman, H. A. & Pabo, C. O. A general strategy for selecting high-affinity zinc finger proteins for diverse DNA target sites. Science 275, 657-661 (1997). [0550] Sander, J. D. et al. Selection-free zinc-finger-nuclease engineering by context-dependent assembly (CoDA). Nature Methods 8, 67-69 (2011). [0551] Carroll, D., Morton, J. J., Beumer, K. J. & Segal, D. J. Design, construction and in vitro testing of zinc finger nucleases. Nat Protoc 1, 1329-1341 (2006). [0552] Kim, S., Lee, M. J., Kim, H., Kang, M. & Kim, J. S. Preassembled zincfinger arrays for rapid construction of ZFNs. Nature Methods 8, 7 (2011). [0553] Kim, H. J., Lee, H. J., Kim, H., Cho, S. W. & Kim, J. S. Targeted genome editing in human cells with zinc finger nucleases constructed via modular assembly. Genome Research 19, 1279-1288 (2009). [0554] Bhakta, M. S. et al. Highly active zinc-finger nucleases by extended modular assembly. Genome Research 23, 530-538 (2013). [0555] Zhu, C. et al. Evaluation and application of modularly assembled zincfinger nucleases in zebrafish. Development 138, 4555-4564 (2011). [0556] Doyon, Y. et al. Heritable targeted gene disruption in zebrafish using designed zinc-finger nucleases. Nature biotechnology 26, 702-708 (2008). [0557] Mandell, J. G. & Barbas, C. F. 
Zinc Finger Tools: custom DNA-binding domains for transcription factors and nucleases. Nucleic Acids Research 34, W516-W523 (2006). [0558] Sander, J. D. et al. ZiFiT (Zinc Finger Targeter): an updated zinc finger engineering tool. Nucleic Acids Research 38, W462-W468 (2010). [0559] Miller, J. C. et al. Improved specificity of TALE-based genome editing using an expanded RVD repertoire. Nature Methods (2015). [0560] Doyle, E. L. et al. TAL Effector-Nucleotide Targeter (TALE-NT) 2.0: tools for TAL effector design and target prediction. Nucleic Acids Research 40, W117-22 (2012). [0561] Lin, Y. et al. SAPTA: a new design tool for improving TALE nuclease activity. Nucleic Acids Research gkt1363 (2014). [0562] Zhu, L. J., Holmes, B. R., Aronin, N. & Brodsky, M. H. CRISPRseek: A Bioconductor Package to Identify Target-Specific Guide RNAs for CRISPRCas9 Genome-Editing Systems. PLoS ONE 9, e108424 (2014). [0563] Lin, Y. et al. CRISPR/Cas9 systems have off-target activity with insertions or deletions between target DNA and guide RNA sequences. Nucleic Acids Research 42, 7473-7485 (2014). [0564] Anders, C., Niewoehner, O., Duerst, A. & Jinek, M. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513, 569-573 (2014). [0565] Elrod-Erickson, M., Rould, M. A., Nekludova, L. & Pabo, C. O. Zif268 protein-DNA complex refined at 1.6 A: a model system for understanding zinc finger-DNA interactions. Structure 4, 1171-1180 (1996). [0566] Lu, X. J. & Olson, W. K. 3DNA: a versatile, integrated software system for the analysis, rebuilding and visualization of three-dimensional nucleic-acid structures. Nat Protoc 3, 1213-1227 (2008). [0567] Wilson, K. A., Chateau, M. L. & Porteus, M. H. Design and Development of Artificial Zinc Finger Transcription Factors and Zinc Finger Nucleases to the hTERT Locus. Mol Ther Nucleic Acids 2, e87 (2013). [0568] Fu, Y., Sander, J. D., Reyon, D., Cascio, V. M. & Joung, J. K. 
Improving CRISPR-Cas nuclease specificity using truncated guide RNAs. Nature biotechnology 32, 279-284 (2014). [0569] Gupta, A. et al. An improved predictive recognition model for Cys (2)-His (2)zinc finger proteins. Nucleic Acids Research 42, 4800-4812 (2014).