CODON-OPTIMIZED CAS9 ENDONUCLEASE ENCODING POLYNUCLEOTIDE
20230075913 · 2023-03-09
Cpc classification
C12N2310/20
CHEMISTRY; METALLURGY
C12N9/22
CHEMISTRY; METALLURGY
C12N15/8241
CHEMISTRY; METALLURGY
International classification
C12N9/22
CHEMISTRY; METALLURGY
Abstract
It has now been found that the expression of a nucleotide sequence as described in the method of the invention in a plant cell results in much higher rates of indels compared to those seen in cells transformed with a control nucleic acid molecule. Thus, the invention is directed to a codon-optimized Cas9 endonuclease-encoding polynucleotide. Accordingly, the present invention provides a method for modifying a target site in the genome of a plant cell, the method comprising providing one or more guide RNA and a Cas endonuclease to said plant cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site, and wherein the Cas9 endonuclease is expressed in the plant cell from a polynucleotide comprising a codon-optimized Cas9 endonuclease encoding nucleic acid molecule with a nucleotide sequence selected from the disclosed nucleotide sequences.
Claims
1. A method for modifying a target site in the genome of a plant cell, the method comprising providing one or more guide RNA and a Cas endonuclease to said plant cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site, and wherein the Cas9 endonuclease is expressed in the plant cell from a polynucleotide comprising a codon-optimized Cas9 endonuclease encoding nucleic acid molecule with a nucleotide sequence selected from the following nucleotide sequences: a. a nucleotide sequence that in an alignment with the nucleotide sequence depicted in SEQ ID NO. 1 has at the following positions counting from the first nucleotide of the start codon one or more of the following nucleotide combinations: i. 3018 A and 201 G, ii. 3018 A and 639 G, iii. 3018 A and 1248 T, iv. 4014 A and 201 G, v. 4014 A and 1329 A, vi. 4014 A and 1248 T, vii. 4014 A and 438 G, viii. 4014 A and 2805 T, ix. 201 G and 2805 T, x. 201 G and 1248 T, xi. 201 G and 2460 T, and/or xii. 201 G and 3648 A, b. a nucleotide sequence at least 90% identical to SEQ ID NO.: 1; and/or c. a nucleotide sequence being at least 80% identical to SEQ ID NO. 1 and having in an alignment to the sequence depicted in SEQ ID NO. 1 at the following positions counting from the first nucleotide of the start codon one or more of the following nucleotides: i. 303 A, ii. 1029 A, iii. 1329 A, and/or iv. 2418 A.
2. The method of claim 1, wherein the nucleotide sequence that in an alignment with the nucleotide sequence depicted in SEQ ID NO. 1 has at the following positions counting from the first nucleotide of the start codon one or more of the following nucleotide combinations: i. 3018 A, 201 G, 639 G and 1248 T, ii. 4014 A and 201 G, 1329 A, 1248 T, 438 G, and 2805 T, iii. 201 G and 2805 T, 1248 T, 2460 T, and 3648 A, and/or iv. 3018 A, 201 G, 639 G, 1248 T, 4014 A, 1329 A, 438 G, and 2805 T, 2460 T, and 3648 A.
3. The method of claim 1, wherein the plant is a wheat plant.
4. The method of claim 1, wherein the Cas9 endonuclease comprises the polypeptide sequence as shown in SEQ ID NO: 2, or a polypeptide sequence 90% or more identical to SEQ ID NO. 2, or a polypeptide encoded by a nucleotide sequence 90% or more identical to SEQ ID NO. 1.
5. A polynucleotide molecule encoding a Cas9 endonuclease, wherein the nucleotide sequence of the polynucleotide molecule comprises a nucleotide sequence selected from the following nucleotide sequences: a. a nucleotide sequence that in an alignment with the nucleotide sequence depicted in SEQ ID NO. 1 has at the following positions counting from the first nucleotide of the start codon one or more of the following nucleotide combinations: i. 3018 A and 201 G, ii. 3018 A and 639 G, iii. 3018 A and 1248 T, iv. 4014 A and 201 G, v. 4014 A and 1329 A, vi. 4014 A and 1248 T, vii. 4014 A and 438 G, viii. 4014 A and 2805 T, ix. 201 G and 2805 T, x. 201 G and 1248 T, xi. 201 G and 2460 T, and/or xii. 201 G and 3648 A; b. a nucleotide sequence at least 90% identical to SEQ ID NO.: 1; and/or c. a nucleotide sequence being at least 80% identical to SEQ ID NO. 1 and having in an alignment to the sequence depicted in SEQ ID NO. 1 at the following positions counting from the first nucleotide of the start codon one or more of the following nucleotides: i. 301 A, ii. 1029 A, iii. 1329 A, and/or iv. 2418 A.
6. The polynucleotide of claim 5, wherein the nucleotide sequence that in an alignment with the nucleotide sequence depicted in SEQ ID NO. 1 has at the following positions counting from the first nucleotide of the start codon one or more of the following nucleotide combinations: i. 3018 A, 201 G, 639 G and 1248 T, ii. 4014 A and 201 G, 1329 A, 1248 T, 438 G, and 2805 T, iii. 201 G and 2805 T, 1248 T, 2460 T, and 3648 A, and/or iv. 3018 A, 201 G, 639 G, 1248 T, 4014 A, 1329 A, 438 G, and 2805 T, 2460 T, and 3648 A.
7. A method for modifying a target site in the genome of a plant cell, the method comprising providing one or more guide RNAs and a donor DNA to a plant cell having a Cas9 endonuclease, wherein said guide RNA and Cas9 endonuclease are capable of forming a complex that enables the Cas9 endonuclease to introduce a double strand break at said target site, wherein the Cas9 endonuclease is expressed in the plant cell from a polynucleotide comprising the polynucleotide of claim 5, and wherein said donor DNA comprises a polynucleotide of interest.
8. A method for modifying a target site in the genome of a wheat cell, the method comprising: a) providing to a wheat cell one or more guide RNA and a Cas9 endonuclease encoding sequence, wherein said guide RNA and a Cas9 endonuclease expressed from said Cas9 endonuclease encoding sequence are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at said target site; and b) identifying at least one wheat cell that has a modification at said target, wherein the modification includes at least one deletion or substitution of one or more nucleotides in said target site; and wherein the Cas9 endonuclease encoding sequence comprises the polynucleotide sequence of the polynucleotide of claim 5.
9. A plant, host cell, a plant cell, a plant organ, or a plant cell compartment comprising a recombinant DNA construct, said recombinant DNA construct comprising a promoter operably linked to a codon-optimized nucleotide sequence encoding a Cas9 endonuclease, wherein said Cas9 endonuclease is capable of binding to and creating a double strand break in a genomic target sequence of said plant genome, and wherein the Cas9 endonuclease encoding sequence comprises the polynucleotide sequence of the polynucleotide of claim 5.
10. A plant, a host cell, a plant cell, a plant organ, or a plant cell compartment comprising a recombinant DNA construct and one or more guide RNA, wherein said recombinant DNA construct comprises a promoter operably linked to a codon-optimized nucleotide sequence encoding a Cas9 endonuclease, wherein said Cas9 endonuclease and guide RNA are capable of forming a complex and creating a double strand break in a genomic target sequence in said plant genome, and wherein the Cas9 endonuclease encoding sequence comprises the polynucleotide sequence of the polynucleotide of claim 5.
11. A recombinant DNA construct comprising a promoter operably linked to a codon-optimized nucleotide sequence encoding a Cas9 endonuclease, wherein said Cas9 endonuclease is capable of binding to and creating a double strand break in a genomic target sequence of said plant genome, wherein the Cas9 endonuclease encoding sequence comprises the polynucleotide sequence of the polynucleotide of claim 5.
12. A recombinant DNA construct comprising a promoter operably linked to a nucleotide sequence expressing a guide RNA, wherein said guide RNA is capable of forming a complex with a Cas9 endonuclease, and wherein said complex is capable of binding to and creating a double strand break in a genomic target sequence of said plant genome, and wherein the Cas9 endonuclease is expressed from a polynucleotide that comprises the polynucleotide sequence of the polynucleotide of claim 5.
13. A method for editing a nucleotide sequence in the genome of a cell, the method comprising providing one or more guide RNA, a Cas endonuclease, and optionally a polynucleotide modification template, to a cell, wherein said guide RNA and Cas endonuclease are capable of forming a complex that enables the Cas endonuclease to introduce a double strand break at a target site in the genome of said cell, wherein said polynucleotide modification template comprises at least one nucleotide modification of said nucleotide sequence, and wherein the Cas9 endonuclease is expressed from a polynucleotide that comprises the polynucleotide sequence of the polynucleotide of claim 5.
14. The method of claim 13, wherein the nucleotide sequence in the genome of a cell is selected from the group consisting of a promoter sequence, a terminator sequence, a regulatory element sequence, a splice site, a coding sequence, a polyubiquitination site, an intron site and an intron enhancing motif.
15. (canceled)
16. The polynucleotide molecule of claim 5, wherein the nucleotide sequence molecule encoding the Cas9 endonuclease further comprises one or more NLS sequence.
17. The polynucleotide molecule of claim 5, wherein the Cas9 nuclease is a nickase having if aligned with the Cas9 polypeptide sequence depicted in SEQ ID NO. 2, and/or a D to A mutation at amino acid position 10 and/or a H to A amino acid mutation at position 840, or is a dead nuclease having a R to A mutation at amino acid position 70, and/or a D to A mutation at amino acid position 10 and/or a H to A mutation at amino acid position 840, or having one or more of the mutations as shown in
18. The polynucleotide molecule of claim 5, wherein the Cas9 nuclease is active or inactive, and is fused to another polypeptide.
19. The polynucleotide molecule of claim 5, wherein the Cas9 nuclease is inactive, and is fused to transcription activation or repression effectors, epigenetic factors, such as histone-modifying/DNA methylation enzymes, fluorescent proteins for imaging of specific genomic loci, cytosine or adenine deaminases for precisely altering DNA bases.
20. The polynucleotide molecule of claim 5, wherein the Cas9 nuclease is a nickase and is fused to a reverse transcriptase.
21. The polynucleotide molecule of claim 16, wherein the one or more NLS sequence is fused to the 5′ terminus and/or is fused to the 3′ terminus of the sequence encoding the Cas9 endonuclease.
Description
EXAMPLES
Chemicals and Common Methods
[0285] Unless indicated otherwise, cloning procedures carried out for the purpose of the current invention including restriction digest, agarose gel electrophoresis, purification and ligation of nucleic acids, transformation, selection and cultivation of bacterial cells are performed as described (Sambrook J, Fritsch E F and Maniatis T (1989)). Sequence analysis of recombinant DNA is performed by LGC Genomics (Berlin, Germany) using the Sanger technology (Sanger et al., 1977). Unless described otherwise, chemicals and reagents are obtained from Sigma Aldrich (Sigma Aldrich, St. Louis, USA), from Promega (Madison, Wis., USA) or Bio-Rad Laboratories (Hercules, Calif., USA). Restriction endonucleases and Gibson assembly reagents are from New England Biolabs (Ipswich, Mass., USA). Oligonucleotides are synthesized by Integrated DNA Technologies (Coralville, Iowa, USA). Codon-optimized genes are from Genewiz (South Plainfield, N.J., USA).
Example 1: Cas9 Coding Sequence Optimization
[0286] The original Cas9 gene was a codon-optimized version of the Streptococcus pyogenes Cas9 (SpCas9), constructed for expression in rice cells (Shan et al. (2013); Nature Biotech 31(8)). A second rice codon-optimized version, obtained using in-house GenEvolution Leto-1.7.23 software, was included in the experiments as well. To optimize the expression of Cas9 in wheat (Triticum aestivum) cells, we used GeneOptimizer, a BASF proprietary software tool. Different settings were tested with parameters set for codon usage for wheat high-expressing genes and optional removal of major cryptic splice sites. Alternatively, more stringent parameters were used for codon usage with only the most abundant wheat amino acid codons selected during optimization, followed by manual removal of major cryptic splice sites.
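The more stringent strategy mentioned above, in which only the most abundant codon is selected for each amino acid, can be illustrated with a minimal sketch. The codon-usage table below is hypothetical and covers only a few amino acids; it is not the wheat table used by GeneOptimizer, and the sketch does not attempt the removal of cryptic splice sites.

```python
# Hypothetical codon-usage frequencies (illustrative numbers only, NOT real wheat data).
CODON_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAG": 0.80, "AAA": 0.20},
    "L": {"CTC": 0.30, "CTG": 0.28, "CTT": 0.14, "TTG": 0.14, "CTA": 0.07, "TTA": 0.07},
    "*": {"TGA": 0.50, "TAG": 0.28, "TAA": 0.22},
}


def most_abundant_codon(amino_acid, usage=CODON_USAGE):
    """Return the codon with the highest frequency for one amino acid."""
    codons = usage[amino_acid]
    return max(codons, key=codons.get)


def back_translate(protein, usage=CODON_USAGE):
    """Back-translate a protein using only the most abundant codon per residue."""
    return "".join(most_abundant_codon(aa, usage) for aa in protein)


if __name__ == "__main__":
    print(back_translate("MKL*"))
```

Because synonymous codons encode the same residue, this kind of substitution changes the nucleotide sequence while leaving the encoded polypeptide unchanged.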
Example 2: Plasmid Construction
[0287] The original Cas9 gene as well as the codon-optimized versions described above were tagged with an SV40 nuclear localization signal at the N-terminus and a Xenopus-derived Nucleoplasmin C nuclear localization signal at the C-terminus and synthesized. The synthesized genes were digested with NcoI and NheI and cloned into a proprietary expression plasmid between the NcoI and NheI sites. The resulting expression vectors include the maize polyubiquitin (Ubi) promoter for constitutive expression located upstream of the Cas9 gene and a fragment of the 3′ untranslated region of either the nopaline synthase gene of Agrobacterium tumefaciens or the 35S gene of Cauliflower mosaic virus at the 3′ end.
[0288] Also, a gRNA expression cassette containing a chimeric guide RNA composed of a 20-bp protospacer site (Gil-Humanes et al. (2017); Plant J 89(6)) targeting the wheat mlo gene (Q94F71_WHEAT), a 76 bp guide RNA scaffold and the wheat polymerase III terminator sequence was synthesized. The recognition site of the gRNA is located on the antisense strand within exon 4 of the wheat mlo gene. The guide is specific for the 5A and 4D alleles of the mlo gene and shows one mismatch with the 4B allele at position 6 from the PAM sequence. Expression of the gRNA is driven by the polymerase III-type promoter of the wheat U6 snRNA gene. The synthesized cassette was cloned into a standard E. coli vector (pUC derivative) via EcoRV blunt end ligation.
[0289] All plasmids were transformed into E. coli for propagation and isolated using a ZymoPure II Plasmid Gigaprep kit for DNA purification (Zymo Research, Irvine, Calif., USA).
Example 3: Design of a Droplet Digital PCR Assay to Simultaneously Identify Indel Mutations and Precise Edits
[0290] Introduction to Experimental Procedures
[0291] Current methods to detect genome editing events include gel-based systems, artificial reporter assays, high resolution melting curve analysis and next-generation sequencing. Droplet digital PCR is a rapid alternative to these methods that enables systematic quantification of genome editing outcomes at endogenous loci. In a droplet digital PCR system, each PCR sample is partitioned into many droplets. PCR amplification occurs simultaneously in each droplet. At the end of the run, each droplet is individually assessed for the presence (positive) or absence (negative) of a fluorescent signal. Using a Poisson statistical analysis, the ratio of positive to negative droplets yields absolute quantification of the initial number of copies of the target sequence.
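The Poisson analysis referred to above can be sketched as follows. If targets are distributed randomly among droplets, the fraction of negative droplets equals exp(−λ), where λ is the mean number of target copies per droplet, so λ = −ln(negative droplets / total droplets). The function names and the default droplet volume of 0.85 nl are assumptions made for illustration; the source does not state a droplet volume.

```python
import math


def copies_per_droplet(n_positive, n_total):
    """Mean target copies per droplet from the Poisson zero class."""
    if n_total <= 0 or not 0 <= n_positive < n_total:
        raise ValueError("need 0 <= n_positive < n_total (at least one negative droplet)")
    fraction_negative = (n_total - n_positive) / n_total
    return -math.log(fraction_negative)


def copies_per_microliter(n_positive, n_total, droplet_volume_nl=0.85):
    """Target concentration in the reaction; droplet volume is an assumed value."""
    lam = copies_per_droplet(n_positive, n_total)
    return lam / (droplet_volume_nl * 1e-3)  # convert nl to µl


if __name__ == "__main__":
    # Hypothetical droplet counts, for illustration only.
    print(copies_per_droplet(5000, 10000))
    print(copies_per_microliter(5000, 10000))
```

The correction matters because a positive droplet may contain more than one copy; counting positive droplets directly would underestimate the copy number at higher template loads.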
[0292] To set up a ddPCR assay capable of simultaneously measuring NHEJ and HDR at endogenous loci, we designed three kinds of probes, all located within one amplicon. The first, a reference probe, is labeled with FAM and located away from the mutagenesis site. This probe counts all genomic copies of the target. The second, a so-called drop-off probe, is labeled with HEX and is located where the Cas9 nuclease cuts the mlo target. If Cas9 induces NHEJ, the drop-off probe loses its binding site, resulting in loss of HEX and leaving only the FAM signal of the reference probe. The third probe, also FAM-labeled, binds to the desired DNA edit, causing a gain of additional FAM signal when precise edits are introduced. With this assay, indel mutations, WT alleles and precise edits can be detected as distinct, clearly separated droplets with high sensitivity and low background signal.
[0294] Probes, Primers and gBlocks Design
[0295] ddPCR assays were designed using Primer3Plus software with modified settings compatible with the master mix: that is, 50 mM monovalent cations, 3.0 mM divalent cations, and 0 mM dNTPs with SantaLucia 1998 thermodynamic and salt correction parameters. The predicted nuclease cut site (3 bp upstream of PAM) was positioned mid-amplicon, with 70-100 bp flanking sequence either side up to the primer binding sites. To avoid loss of binding sites, primers and reference probe were designed away from the cut site. In addition, a dark, 3′-phosphorylated non-extendible oligonucleotide was designed to prevent the edit probe from binding to the WT sequence.
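Locating the predicted cut site 3 bp upstream of the PAM can be sketched as below. This is an illustrative helper under stated assumptions, not the design procedure actually used: it assumes an NGG PAM, searches both strands of a top-strand sequence, and reports only the first match. The function names and the example sequences are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def cut_site(sequence, protospacer):
    """Return the 0-based top-strand index of the first base 3' of the
    predicted blunt cut (3 bp upstream of an NGG PAM), or None if the
    protospacer with an adjacent PAM is not found."""
    seq = sequence.upper()
    spacer = protospacer.upper()
    n = len(spacer)

    # Guide matches the top strand: protospacer is followed by N-G-G.
    i = seq.find(spacer)
    if i != -1 and seq[i + n + 1:i + n + 3] == "GG":
        return i + n - 3

    # Guide matches the bottom strand: the top strand reads C-C-N followed
    # by the reverse complement of the protospacer.
    j = seq.find(reverse_complement(spacer))
    if j >= 3 and seq[j - 3:j - 1] == "CC":
        return j + 3

    return None


if __name__ == "__main__":
    spacer = "GATTACAGATTACAGATTAC"  # hypothetical 20-nt protospacer
    top = "A" * 10 + spacer + "TGG" + "T" * 10
    print(cut_site(top, spacer))
```

With the cut index known, the amount of flanking sequence between the cut and each primer binding site can be checked against the 70-100 bp range given above.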
[0296] PCR primers were designed according to the following guidelines: primer length of 17-24 bases, primer melting temperature of 55 to 60° C. with an ideal temperature of 58° C., melting temperatures of the two primers differ by no more than 2° C., primer GC content of 35-65%, amplicon size of 100-250 bases.
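The primer guidelines above lend themselves to a simple rule check, sketched below. The function names are hypothetical, and melting temperatures are taken as inputs because they are assumed to come from the design software (Primer3Plus in the text); no thermodynamic model is implemented here.

```python
def gc_percent(seq):
    """GC content of a sequence as a percentage."""
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def check_primer_pair(fwd, rev, tm_fwd, tm_rev, amplicon_len):
    """Return a list of guideline violations (empty if the pair passes)."""
    problems = []
    for name, seq, tm in (("forward", fwd, tm_fwd), ("reverse", rev, tm_rev)):
        if not 17 <= len(seq) <= 24:
            problems.append(f"{name}: length {len(seq)} outside 17-24 bases")
        if not 55.0 <= tm <= 60.0:
            problems.append(f"{name}: Tm {tm} outside 55-60 C")
        if not 35.0 <= gc_percent(seq) <= 65.0:
            problems.append(f"{name}: GC {gc_percent(seq):.1f}% outside 35-65%")
    if abs(tm_fwd - tm_rev) > 2.0:
        problems.append("primer Tm values differ by more than 2 C")
    if not 100 <= amplicon_len <= 250:
        problems.append(f"amplicon length {amplicon_len} outside 100-250 bases")
    return problems


if __name__ == "__main__":
    # Hypothetical primers and Tm values, for illustration only.
    print(check_primer_pair("ATGCATGCATGCATGCATGC", "GCATGCATGCATGCATGCAT", 58.0, 59.0, 180))
```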
[0297] Considerations for probe design were as follows: probes can bind to either strand of the target, probe GC content of 35-65%, no G at the 5′ end to prevent quenching of the 5′ fluorophore, melting temperature of the drop-off probe ranges from 61° C. to 64° C. with an ideal temperature of 62° C., length of the drop-off probe is less than 20 bases, melting temperatures of the reference and edit probe range from 63° C. to 67° C. with an ideal temperature of 65° C., length of the reference and edit probe of 20-24 bases. Preferably, probes should have a Tm 4-8° C. higher than the primers. Primer and probe designs were also screened for complementarity and secondary structure with the maximum ΔG value of any self-dimers, hairpins, and heterodimers set to −9.0 kcal/mole. All primers and probes were designed against the 5A allele of the wheat mlo gene.
[0298] The optimal annealing temperature was empirically determined using a temperature gradient PCR.
[0299] Synthetic dsDNA fragments (gBlocks, Integrated DNA Technologies) were used as positive controls for assay validation. HDR-positive controls contain the R158Q substitution at the desired edit site, whereas NHEJ-specific controls have a 1-bp insert at the predicted nuclease cut site. Lyophilized gBlocks were resuspended in 300 μl of TE and stored at −20° C. Three additional dilutions in TE resulted in a master stock of approximately 600 copies/μl that was confirmed by ddPCR quantification. High-copy gBlock stocks were kept in a post-PCR environment to avoid contamination.
[0300] ddPCR Experiments and Quantification of Data
[0301] 20× ddPCR mixes were composed of 18 μM forward and 18 μM reverse primers, 5 μM reference probe, 5 μM edit probe, 5 μM drop-off probe, and 10 μM dark probe. The following reagents were mixed in a 96-well plate to make a 25-μl reaction: 11 μl of ddPCR Supermix for Probes (no dUTP), 1.1 μl of 10× assay mix (BioRad Laboratories, Hercules, Calif., USA), 10 U of HindIII-HF, 100-250 ng of genomic DNA in water, and water up to 22 μl.
[0302] Droplets were generated using a QX100 Droplet Generator according to the manufacturer's instructions (Bio-Rad Laboratories) and transferred to a 96-well plate for standard PCR on a C1000 Thermal cycler with a deep well block (BioRad Laboratories, Hercules, Calif., USA).
[0303] Thermal cycling consisted of a 10 min activation period at 95° C. followed by 40 cycles of a two-step thermal profile of 30 s at 95° C. denaturation and 3 min at 60° C. for combined annealing-extension and 1 cycle of 98° C. for 10 min.
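As a quick arithmetic check, the programmed hold time of this profile can be computed as below. The function name is hypothetical and ramping time between temperatures is ignored, so the actual run will take longer.

```python
def total_hold_minutes(activation_min=10.0, cycles=40, denature_s=30.0,
                       anneal_extend_min=3.0, final_min=10.0):
    """Sum of programmed hold times in minutes (ramp times not included)."""
    per_cycle_min = denature_s / 60.0 + anneal_extend_min
    return activation_min + cycles * per_cycle_min + final_min


if __name__ == "__main__":
    print(total_hold_minutes())  # 10 + 40 * 3.5 + 10 = 160.0 minutes
```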
[0304] After PCR, the droplets were analyzed using a QX100 Droplet Reader (BioRad Laboratories, Hercules, Calif., USA) in ‘absolute quantification’ mode. To enable proper gating for precise edits and indel events, experiments were performed using both negative and positive controls (non-modified genomic DNA and gBlocks containing the R158Q mutation, respectively). In two-dimensional plots, droplets without templates were gated as negative population. Droplets containing only NHEJ (FAM+, HEX−), only HDR alleles (FAM++, HEX−) or only WT alleles (FAM+, HEX+) were manually gated as separate populations. Allelic frequencies were quantified using the QuantaSoft v.1.2.10.0 software (BioRad Laboratories, Hercules, Calif., USA).
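The gating logic described above can be sketched in code. In the experiments the populations were gated manually in QuantaSoft; the fixed amplitude thresholds, function names and droplet values below are hypothetical and serve only to illustrate the scheme. The fractions are computed from raw droplet counts, which is a simplification that ignores Poisson correction for droplets containing more than one allele type.

```python
from collections import Counter


def classify_droplet(fam, hex_amp, fam_pos, fam_high, hex_pos):
    """Assign one droplet to a population from its FAM and HEX amplitudes."""
    fam_level = 2 if fam >= fam_high else 1 if fam >= fam_pos else 0
    hex_on = hex_amp >= hex_pos
    if fam_level == 0 and not hex_on:
        return "negative"   # no template
    if fam_level == 1 and hex_on:
        return "WT"         # FAM+, HEX+
    if fam_level == 1 and not hex_on:
        return "NHEJ"       # FAM+, HEX-
    if fam_level == 2 and not hex_on:
        return "HDR"        # FAM++, HEX-
    return "ungated"        # mixed or unexpected droplets


def allele_fractions(droplets, fam_pos, fam_high, hex_pos):
    """Fraction of WT, NHEJ and HDR droplets among single-population droplets."""
    counts = Counter(
        classify_droplet(f, h, fam_pos, fam_high, hex_pos) for f, h in droplets
    )
    total = counts["WT"] + counts["NHEJ"] + counts["HDR"]
    if total == 0:
        return {"WT": 0.0, "NHEJ": 0.0, "HDR": 0.0}
    return {key: counts[key] / total for key in ("WT", "NHEJ", "HDR")}


if __name__ == "__main__":
    # Hypothetical (FAM, HEX) amplitudes, for illustration only.
    droplets = [(2000, 3000)] * 8 + [(2000, 100)] + [(6000, 100)] + [(100, 100)] * 5
    print(allele_fractions(droplets, fam_pos=1000, fam_high=5000, hex_pos=1000))
```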
[0305] Validation of ddPCR by Amplicon Deep Sequencing and Determination of Detection Limit
[0306] The designed ddPCR assay was verified by next-generation sequencing (NGS) of the target region using a pair of primers specific for the A subgenome copy of the wheat mlo gene (Seq ID NO: 17/Seq ID NO: 18). The obtained amplicons were purified and subjected to deep-sequencing (2×250 bp paired ends) by Genewiz Inc using an Illumina MiSeq System. A very good correlation (R²=0.96) was observed between the indel allele frequencies detected by ddPCR and NGS across different samples, demonstrating the sensitivity and reliability of the ddPCR assay. In
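A coefficient of determination of the kind reported above can be computed from paired measurements as sketched below. The source does not state how R² was calculated; this sketch assumes a simple linear relationship, for which R² equals the squared Pearson correlation. The function name and the example values are hypothetical and are not the measured data.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain some variation")
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical indel frequencies (%) per sample, for illustration only.
    ddpcr = [1.0, 5.0, 10.0, 20.0]
    ngs = [1.2, 4.6, 10.5, 19.0]
    print(round(r_squared(ddpcr, ngs), 3))
```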
[0307] To calculate the ddPCR assay's limit of detection, we spiked wild-type genomic wheat DNA with different amounts of the HDR- and NHEJ-specific gBlocks (Seq ID NO: 19/Seq ID NO: 20) and found that the assay was reproducible and linear over a wide range of input DNA. The limit of detection was approximately 0.1% for NHEJ and well below 0.04% for HDR alleles. This indicates that at least one indel or precise edit event from 1,000 copies of the genome can be captured by the assay. The ddPCR assay sensitivity established by serial dilution of HDR and NHEJ synthetic templates in a constant background (200 ng) of WT genomic DNA is shown in
Example 4: Wheat Protoplast Transfection
[0308] Transformation of wheat protoplast cells was performed as described by Wang et al. (2014) Nature Biotech 32(9) with minor modifications. Protoplasts were isolated from the youngest fully developed leaf of 10-day-old aseptically grown wheat seedlings. Healthy leaves are bundled in stacks of five and cut into fine strips with a sharp razor blade. The strips are then infiltrated with cell wall-dissolving enzyme solution (1.5% cellulase R10 and 0.75% macerozyme R10 in 10 mM KCl and 0.6 M mannitol, pH 7.5) and incubated overnight in the dark with gentle shaking (40 rpm) at 24° C. After enzymatic digestion, the released protoplasts are collected by filtering the mixture through 40-μm nylon meshes and resuspended in W5 solution (Wang et al. (2014) Nature Biotech 32(9)). The resuspended protoplasts are kept on ice and allowed to settle by gravity, after which the cell pellet is suspended in MMG solution (Wang et al. (2014) Nature Biotech 32(9)). For transformation, 200 μl of cells (2.5×10⁵) are mixed with 20 μg plasmid DNA and 220 μl of freshly prepared polyethylene glycol (PEG) solution. The mixture is incubated for 15-20 min in the dark. After removing the PEG solution, the transformed protoplasts are transferred into six-well plates and incubated at 24° C. for at least 48 h. Finally, the protoplasts are collected by centrifuging at 12,000 rpm for 1 min at room temperature.
Example 5: Wheat Codon-Optimized Cas9 Shows Enhanced Indel Efficiency
[0309] To test the effectiveness of the codon-optimized Cas9 version, wheat protoplast cells were co-transfected with the CRISPR editing tools (Cas9 expression vectors, gRNA expression cassette) as described above. Transformed protoplasts were harvested 60 hours after transfection and indel formation was analyzed by ddPCR. As expected, very low levels of indels were found in negative controls, that is, cells transfected with a GFP reporter plasmid (Ctrl. 0). Interestingly, transfecting cells with Cas9 codon-optimized for expression in wheat (‘Optimized’) resulted in much higher rates of indels compared to those seen in cells transformed with the rice-optimized Cas9 versions (Ctrl. 1 and Ctrl. 2). Pooled over two independent experiments, the wheat-optimized Cas9 version showed a 2.75-fold increase in indel efficiency relative to the original gene (Ctrl. 1) used for generation of stably edited wheat in Wang et al. (2014; Nature Biotech 32(9)). The impact of codon optimization on Cas9 activity in wheat protoplast cells is shown in