A METHOD FOR BASE EDITING IN PLANTS
20190292553 · 2019-09-26
Assignee
Inventors
Cpc classification
C12N2310/20
CHEMISTRY; METALLURGY
C12N9/22
CHEMISTRY; METALLURGY
C12N2800/80
CHEMISTRY; METALLURGY
Y02A40/146
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
C12N15/90
CHEMISTRY; METALLURGY
A01H1/06
HUMAN NECESSITIES
C12N15/8261
CHEMISTRY; METALLURGY
C12N9/78
CHEMISTRY; METALLURGY
C12N15/8243
CHEMISTRY; METALLURGY
International classification
C12N15/82
CHEMISTRY; METALLURGY
C12N9/22
CHEMISTRY; METALLURGY
Abstract
The present invention belongs to the field of plant genetic engineering. Specifically, the invention relates to a method for base editing in plants. More particularly, the invention relates to a method for performing efficient base editing to a target sequence in the genome of a plant (such as a crop plant) by a Cas9-cytidine deaminase fusion protein, as well as plants produced through said method and progenies thereof.
Claims
1. A system for performing base editing to a target sequence in a plant genome, comprising at least one of the following (i) to (v): i) a base editing fusion protein, and a guide RNA; ii) an expression construct comprising a nucleotide sequence encoding a base editing fusion protein, and a guide RNA; iii) a base editing fusion protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA; iv) an expression construct comprising a nucleotide sequence encoding a base editing fusion protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA; v) an expression construct comprising a nucleotide sequence encoding base editing fusion protein and a nucleotide sequence encoding guide RNA; wherein said base editing fusion protein contains a nuclease-inactivated Cas9 domain and a deaminase domain, said guide RNA can target said base editing fusion protein to the target sequence in the plant genome.
2. The system according to claim 1, wherein said deaminase is a cytidine deaminase, for example an apolipoprotein B mRNA editing complex (APOBEC) family deaminase.
3. The system according to claim 2, wherein said cytidine deaminase is an APOBEC1 deaminase or an activation-induced cytidine deaminase (AID), for example, said cytidine deaminase comprises an amino acid sequence set forth in SEQ ID NO: 1.
4. The system according to claim 1, wherein said nuclease-inactivated Cas9 comprises amino acid substitutions of D10A and/or H840A relative to wild-type Cas9, for example, said nuclease-inactivated Cas9 comprises an amino acid sequence set forth in SEQ ID NO:13 or 14.
5. The system according to claim 1, wherein said deaminase domain is fused to the N-terminal of said nuclease-inactivated Cas9 domain, or wherein said deaminase domain is fused to the C-terminal of said nuclease-inactivated Cas9 domain.
6. The system according to claim 1, wherein said deaminase domain and said nuclease-inactivated Cas9 domain are fused through a linker, for example, said linker is an XTEN linker set forth in SEQ ID NO:12.
7. The system according to claim 1, wherein said base editing fusion protein further comprises a uracil DNA glycosylase inhibitor (UGI), for example said uracil DNA glycosylase inhibitor comprises an amino acid sequence set forth in SEQ ID NO: 15.
8. The system according to claim 1, wherein said base editing fusion protein further comprises a nuclear localization sequence (NLS), for example, said NLS comprises an amino acid sequence set forth in SEQ ID NO: 30 or 31.
9. The system according to claim 1, wherein said base editing fusion protein comprises an amino acid sequence set forth in SEQ ID NO: 22 or 23.
10. The system according to claim 1, wherein the nucleotide sequence encoding said base editing fusion protein is codon optimized for plants to be base edited, for example, said nucleotide sequence encoding said base editing fusion protein is set forth in SEQ ID NO: 19 or 20.
11. The system according to claim 1, wherein said guide RNA is a single guide RNA (sgRNA).
12. The system according to claim 1, wherein a nucleotide sequence encoding said base editing fusion protein and/or a nucleotide sequence encoding said guide RNA are operably linked to an expression regulatory element for the plant.
13. The system according to claim 12, wherein said expression regulatory element is a promoter, for example a 35S promoter, a maize Ubi-1 promoter, a wheat U6 promoter, a rice U3 promoter, or a maize U3 promoter.
14. A method for producing a genetically modified plant, comprising introducing the system according to claim 1 into the plant, whereby said guide RNA targets said base editing fusion protein to a target sequence in the genome of said plant, resulting in one or more C to T substitutions in said target sequence, preferably, said introduction is performed in the absence of any selective pressure.
15. The method according to claim 14, wherein one or more C in the positions 3-9 in said target sequence is substituted with T.
16. The method according to claim 14, further comprising screening for plants with desired nucleotide substitution(s).
17. The method according to claim 14, wherein said plant is selected from monocotyledon and dicotyledon.
18. The method according to claim 17, wherein said plant is a crop plant, such as wheat, rice, maize, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugar cane, tomato, tobacco, cassava, or potato.
19. The method according to claim 14, wherein said target sequence is associated with a plant trait such as an agronomy trait, and thereby said base editing results in a change of the trait in the plant relative to a wild-type plant.
20. The method according to claim 14, wherein said system is introduced into said plant through a method selected from particle bombardment, PEG-mediated protoplast transformation, Agrobacterium-mediated transformation, plant virus-mediated transformation, a pollen tube approach, and ovary injection approach.
21. The method according to claim 14, further comprising obtaining a progeny of the genetically modified plant.
22. A genetically modified plant or a progeny thereof or a part thereof, wherein said plant is obtained through the method according to claim 14.
23. A plant breeding method, comprising crossing a first plant genetically modified through the method according to claim 14 with a second plant that does not contain said genetic modification, and thereby introducing said genetic modification into said second plant.
Description
DESCRIPTION OF DRAWINGS
[0013]
[0014]
[0015]
[0016]
[0017]
DETAILED DESCRIPTION OF THE INVENTION
1. Definition
[0018] In the present invention, unless indicated otherwise, the scientific and technological terminologies used herein refer to meanings commonly understood by a person skilled in the art. Also, the terminologies and experimental procedures used herein relating to protein and nucleotide chemistry, molecular biology, cell and tissue cultivation, microbiology, and immunology all belong to terminologies and conventional methods generally used in the art. For example, the standard DNA recombination and molecular cloning technologies used herein are well known to a person skilled in the art, and are described in detail in the following reference: Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter referred to as Sambrook et al). In addition, in order to better understand the present invention, definitions and explanations for the relevant terminologies are provided below.
[0019] Cas9 nuclease and Cas9 can be used interchangeably herein, and refer to an RNA-directed nuclease, including the Cas9 protein or fragments thereof (such as a protein comprising an active DNA cleavage domain of Cas9 and/or a gRNA binding domain of Cas9). Cas9 is a component of the CRISPR/Cas (clustered regularly interspaced short palindromic repeats and its associated system) genome editing system, which targets and cleaves a DNA target sequence to form a DNA double-strand break (DSB) under the guidance of a guide RNA.
[0020] guide RNA and gRNA can be used interchangeably herein, and typically are composed of crRNA and tracrRNA molecules forming complexes through partial complementarity, wherein the crRNA comprises a sequence that is sufficiently complementary to a target sequence for hybridization and directs the CRISPR complex (Cas9+crRNA+tracrRNA) to specifically bind to the target sequence. However, it is known in the art that a single guide RNA (sgRNA) can be designed, which comprises the characteristics of both crRNA and tracrRNA.
[0021] Deaminase refers to an enzyme that catalyzes the deamination reaction. In some embodiments of the present invention, the deaminase refers to a cytidine deaminase, which catalyzes the deamination of a cytidine or a deoxycytidine to a uracil or a deoxyuridine, respectively.
[0022] Genome as it applies to plant cells encompasses not only chromosomal DNA found within the nucleus, but organelle DNA found within subcellular components (e.g., mitochondrial, plastid) of the cell.
[0023] As used herein, the term plant includes a whole plant and any descendant, cell, tissue, or part of a plant. The term plant parts include any part(s) of a plant, including, for example and without limitation: seed (including mature seed and immature seed); a plant cutting; a plant cell; a plant cell culture; a plant organ (e.g., pollen, embryos, flowers, fruits, shoots, leaves, roots, stems, and explants). A plant tissue or plant organ may be a seed, protoplast, callus, or any other group of plant cells that is organized into a structural or functional unit. A plant cell or tissue culture may be capable of regenerating a plant having the physiological and morphological characteristics of the plant from which the cell or tissue was obtained, and of regenerating a plant having substantially the same genotype as the plant. In contrast, some plant cells are not capable of being regenerated to produce plants. Regenerable cells in a plant cell or tissue culture may be embryos, protoplasts, meristematic cells, callus, pollen, leaves, anthers, roots, root tips, silk, flowers, kernels, ears, cobs, husks, or stalks.
[0024] Plant parts include harvestable parts and parts useful for propagation of progeny plants. Plant parts useful for propagation include, for example and without limitation: seed; fruit; a cutting; a seedling; a tuber; and a rootstock. A harvestable part of a plant may be any useful part of a plant, including, for example and without limitation: flower; pollen; seedling; tuber; leaf; stem; fruit; seed; and root.
[0025] A plant cell is the structural and physiological unit of the plant, and includes protoplast cells without a cell wall and plant cells with a cell wall. A plant cell may be in the form of an isolated single cell, or an aggregate of cells (e.g., a friable callus and a cultured cell), and may be part of a higher organized unit (e.g., a plant tissue, plant organ, and plant). Thus, a plant cell may be a protoplast, a gamete producing cell, or a cell or collection of cells that can regenerate into a whole plant. As such, a seed, which comprises multiple plant cells and is capable of regenerating into a whole plant, is considered a plant cell in embodiments herein.
[0026] The term protoplast, as used herein, refers to a plant cell that had its cell wall completely or partially removed, with the lipid bilayer membrane thereof naked, and thus includes protoplasts, which have their cell wall entirely removed, and spheroplasts, which have their cell wall only partially removed, but is not limited thereto. Typically, a protoplast is an isolated plant cell without cell walls which has the potency for regeneration into cell culture or a whole plant.
[0027] Progeny of a plant comprises any subsequent generation of the plant.
[0028] A genetically modified plant includes a plant which comprises within its genome an exogenous polynucleotide. For example, the exogenous polynucleotide is stably integrated within the genome such that the polynucleotide is passed on to successive generations. The exogenous polynucleotide may be integrated into the genome alone or as part of a recombinant DNA construct. The modified gene or expression regulatory sequence means that, in the plant genome, said sequence comprises one or more nucleotide substitution, deletion, or addition. For example, a genetically modified plant obtained by the present invention may contain one or more C to T substitutions relative to the wild type plant (corresponding plant that is not genetically modified).
[0029] The term exogenous with respect to sequence means a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
[0030] Polynucleotide, nucleic acid sequence, nucleotide sequence, or nucleic acid fragment are used interchangeably to refer to a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. Nucleotides (usually found in their 5′-monophosphate form) are referred to by their single letter designation as follows: A for adenylate or deoxyadenylate (for RNA or DNA, respectively), C for cytidylate or deoxycytidylate, G for guanylate or deoxyguanylate, U for uridylate, T for deoxythymidylate, R for purines (A or G), Y for pyrimidines (C or T), K for G or T, H for A or C or T, I for inosine, and N for any nucleotide.
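As an illustration of the single-letter designations above, the following Python sketch expands the degenerate codes into regular-expression character classes. The names `IUPAC_CLASSES` and `iupac_to_regex` are illustrative assumptions and are not part of the invention. Only the DNA codes listed in this paragraph are included, and inosine (I) is left out because it does not stand for a set of the standard bases.

```python
import re

# Single-letter codes from the paragraph above, restricted to DNA.
IUPAC_CLASSES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purines
    "Y": "[CT]",    # pyrimidines
    "K": "[GT]",
    "H": "[ACT]",
    "N": "[ACGT]",  # any nucleotide
}

def iupac_to_regex(pattern):
    """Translate a pattern written in single-letter codes into a regex string."""
    return "".join(IUPAC_CLASSES[code] for code in pattern.upper())

# Example: the degenerate motif NGG matches TGG but not TGA.
motif = re.compile(iupac_to_regex("NGG"))
print(bool(motif.fullmatch("TGG")))
print(bool(motif.fullmatch("TGA")))
```

A code that is not in the table raises a `KeyError`, which keeps unsupported symbols from being silently ignored.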
[0031] Polypeptide, peptide, amino acid sequence and protein are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The terms polypeptide, peptide, amino acid sequence, and protein are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
[0032] As used herein, an expression construct refers to a vector suitable for expression of a nucleotide sequence of interest in a plant, such as a recombinant vector. Expression refers to the production of a functional product. For example, the expression of a nucleotide sequence may refer to transcription of the nucleotide sequence (such as transcribe to produce an mRNA or a functional RNA) and/or translation of RNA into a protein precursor or a mature protein.
[0033] Expression construct of the invention may be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, an RNA that can be translated (such as an mRNA).
[0034] Expression construct of the invention may comprise regulatory sequences and nucleotide sequences of interest that are derived from different sources, or regulatory sequences and nucleotide sequences of interest derived from the same source, but arranged in a manner different than that normally found in nature.
[0035] Regulatory sequence or regulatory element are used interchangeably and refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. A plant expression regulatory element refers to a nucleotide sequence capable of controlling the transcription, RNA processing or stability or translation of a nucleotide sequence of interest in a plant.
[0036] Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
[0037] Promoter refers to a nucleic acid fragment capable of controlling transcription of another nucleic acid fragment. In some embodiments of the invention, the promoter is a promoter capable of controlling gene transcription in a plant cell whether or not its origin is from a plant cell. The promoter may be a constitutive promoter or a tissue-specific promoter or a developmentally regulated promoter or an inducible promoter.
[0038] Constitutive promoter refers to a promoter that generally causes gene expression in most cell types in most circumstances. Tissue-specific promoter and tissue-preferred promoter are used interchangeably, and refer to a promoter that is expressed predominantly but not necessarily exclusively in one tissue or organ, but that may also be expressed in one specific cell or cell type. Developmentally regulated promoter refers to a promoter whose activity is determined by developmental events. Inducible promoter selectively expresses a DNA sequence operably linked to it in response to an endogenous or exogenous stimulus (environment, hormones, or chemical signals, and so on).
[0039] As used herein, the term operably linked means that a regulatory element (for example but not limited to, a promoter sequence, a transcription termination sequence, and so on) is associated with a nucleic acid sequence (such as a coding sequence or an open reading frame), such that the transcription of the nucleotide sequence is controlled and regulated by the transcriptional regulatory element. Techniques for operably linking a regulatory element region to a nucleic acid molecule are known in the art.
[0040] Introduction of a nucleic acid molecule (such as a plasmid, a linear nucleic acid fragment, RNA, and so on) or protein into a plant means transforming the plant cell with the nucleic acid or protein so that the nucleic acid or protein can function in the plant cell. Transformation as used herein includes stable transformation and transient transformation.
[0041] Stable transformation refers to introducing an exogenous nucleotide sequence into a plant genome, resulting in genetically stable inheritance. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the plant and any successive generations thereof.
[0042] Transient transformation refers to introducing a nucleic acid molecule or protein into a plant cell so that it performs its function without stable inheritance. In transient transformation, the exogenous nucleic acid sequence is not integrated into the plant genome.
[0043] Trait refers to the physiological, morphological, biochemical, or physical characteristics of a plant or a particular plant material or cell. In some embodiments, the characteristic is visible to the human eye, such as seed or plant size, or can be measured by biochemical techniques, such as detecting the protein, starch, or oil content of seed or leaves, or by observation of a metabolic or physiological process, e.g. by measuring tolerance to water deprivation or particular salt or sugar concentrations, or by the observation of the expression level of a gene or genes, or by agricultural observations such as osmotic stress tolerance or yield. In some embodiments, trait also includes ploidy of a plant, such as haploidy which is important for plant breeding. In some embodiments, trait also includes resistance of a plant to herbicides.
[0044] Agronomic trait is a measurable parameter including but not limited to, leaf greenness, yield, growth rate, biomass, fresh weight at maturation, dry weight at maturation, fruit yield, seed yield, total plant nitrogen content, fruit nitrogen content, seed nitrogen content, nitrogen content in a vegetative tissue, total plant free amino acid content, fruit free amino acid content, seed free amino acid content, free amino acid content in a vegetative tissue, total plant protein content, fruit protein content, seed protein content, protein content in a vegetative tissue, drought tolerance, nitrogen uptake, root lodging, harvest index, stalk lodging, plant height, ear height, ear length, disease resistance, cold resistance, salt tolerance, and tiller number and so on.
2. Base Editing System for Plants
[0045] The present invention provides a system for performing base editing to a target sequence in the genome of a plant, comprising at least one of the following (i) to (v):
[0046] i) a base editing fusion protein, and a guide RNA;
[0047] ii) an expression construct comprising a nucleotide sequence encoding a base editing fusion protein, and a guide RNA;
[0048] iii) a base editing fusion protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
[0049] iv) an expression construct comprising a nucleotide sequence encoding a base editing fusion protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
[0050] v) an expression construct comprising a nucleotide sequence encoding base editing fusion protein and a nucleotide sequence encoding guide RNA;
[0051] wherein said base editing fusion protein contains a nuclease-inactivated Cas9 domain and a deaminase domain, said guide RNA can target said base editing fusion protein to the target sequence in the plant genome.
[0052] The DNA cleavage domain of Cas9 nuclease is known to contain two subdomains: the HNH nuclease subdomain and the RuvC subdomain. The HNH subdomain cleaves the strand that is complementary to the gRNA, whereas the RuvC subdomain cleaves the non-complementary strand. Mutations in these subdomains can inactivate Cas9 nuclease to form nuclease-inactivated Cas9. The nuclease-inactivated Cas9 retains DNA binding capacity directed by gRNA. Thus, in principle, when fused with an additional protein, the nuclease-inactivated Cas9 can target said additional protein to almost any DNA sequence through co-expression with an appropriate guide RNA.
[0053] Cytidine deaminase can catalyze the deamination of cytidine (C) in DNA to form uracil (U). If nuclease-inactivated Cas9 is fused with a cytidine deaminase, the fusion protein can be directed to a target sequence in the genome of a plant by a guide RNA. The DNA double strand is not cleaved due to the loss of Cas9 nuclease activity, whereas the deaminase domain in the fusion protein is capable of converting a cytidine in the single-strand DNA produced during the formation of the Cas9-guide RNA-DNA complex into a U, and a C to T substitution may then be achieved by base mismatch repair.
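The outcome described above can be sketched in Python as follows. This is a simplified illustration, not the experimental method: it assumes that every C inside a chosen window of the target sequence is converted, whereas in practice only some may be. The function name `simulate_c_to_t`, the example sequence, and the convention of counting positions from the 5′ end of the target sequence (position 1) are assumptions made for illustration. The default window of positions 3 to 9 follows claim 15.

```python
def simulate_c_to_t(target, window=(3, 9)):
    """Return the target sequence with every C inside the window replaced by T.

    Positions are 1-based and inclusive, counted from the 5' end of the
    target sequence (an assumption made for this sketch).
    """
    start, end = window
    edited = []
    for position, base in enumerate(target.upper(), start=1):
        if base == "C" and start <= position <= end:
            # C is deaminated to U, which is read as T after repair/replication.
            edited.append("T")
        else:
            edited.append(base)
    return "".join(edited)

# Hypothetical 20-nt target sequence (not taken from the specification).
target = "GACCTCAGGACTTCCAGTCA"
print(target)
print(simulate_c_to_t(target))
```

Cytidines outside the window (here, those at positions 11, 14, 15 and 19) are left unchanged by the sketch.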
[0054] Therefore, in some embodiments of the invention, the deaminase is a cytidine deaminase, such as an apolipoprotein B mRNA editing complex (APOBEC) family deaminase. Particularly, the deaminase described herein is a deaminase that can accept single-strand DNA as the substrate.
[0055] Examples of cytidine deaminases that can be used in the present invention include but are not limited to APOBEC1 deaminase, activation-induced cytidine deaminase (AID), APOBEC3G, or CDA1.
[0056] In some embodiments of the present invention, the cytidine deaminase comprises the amino acid sequence set forth in SEQ ID NO: 11.
[0057] The nuclease-inactivated Cas9 of the present invention can be derived from Cas9 of different species, for example, from S. pyogenes Cas9 (SpCas9, the nucleotide sequence of which is shown in SEQ ID NO: 18; the amino acid sequence is shown in SEQ ID NO: 21). Mutations in both the HNH nuclease subdomain and the RuvC subdomain of SpCas9 (including, for example, the D10A and H840A mutations) inactivate S. pyogenes Cas9 nuclease, resulting in a nuclease-dead Cas9 (dCas9). Inactivation of only one of the subdomains by mutation leaves Cas9 with nickase activity, i.e., results in a Cas9 nickase (nCas9), for example, nCas9 with a D10A mutation only.
[0058] Therefore, in some embodiments of the invention, the nuclease-inactivated Cas9 of the invention comprises amino acid substitutions D10A and/or H840A relative to wild-type Cas9.
[0059] In some preferred embodiments of the invention, the nuclease-inactivated Cas9 of the invention has nickase activity. Without being bound by any theory, it is believed that eukaryotic mismatch repair uses nicks on a DNA strand for the removal and repair of the mismatched base in the DNA strand. The U:G mismatch formed by cytidine deaminase may be repaired into C:G. By introducing a nick on the strand containing the unedited G, it becomes possible to preferentially repair the U:G mismatch to the desired U:A or T:A. Therefore, preferably, the nuclease-inactivated Cas9 is a Cas9 nickase that retains the cleavage activity of the HNH subdomain of Cas9, whereas the cleavage activity of the RuvC subdomain is inactivated. For example, the nuclease-inactivated Cas9 contains an amino acid substitution D10A relative to wild-type Cas9.
[0060] In some embodiments of the present invention, the nuclease-inactivated Cas9 comprises the amino acid sequence of SEQ ID NO:14. In some preferred embodiments, the nuclease-inactivated Cas9 comprises the amino acid sequence of SEQ ID NO: 13.
[0061] In some embodiments of the invention, the deaminase domain is fused to the N-terminus of the nuclease-inactivated Cas9 domain. In some embodiments, the deaminase domain is fused to the C-terminus of the nuclease-inactivated Cas9 domain.
[0062] In some embodiments of the invention, the deaminase domain and the nuclease inactivated Cas9 domain are fused through a linker. The linker can be a non-functional amino acid sequence having no secondary or higher structure, which is 1 to 50 (for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 20-25, 25-50) or more amino acids in length. For example, the linker may be a flexible linker, such as GGGGS, GS, GAP, (GGGGS)3, GGS, (GGS)7, and the like. In some preferred embodiments, the linker is an XTEN linker as shown in SEQ ID NO: 12.
[0063] In cells, uracil DNA glycosylase catalyzes the removal of U from DNA and initiates base excision repair (BER), which results in the repair of U:G to C:G. Therefore, without any theoretical limitation, including a uracil DNA glycosylase inhibitor in the base editing fusion protein of the invention or the system of the present invention will be able to increase the efficiency of base editing.
[0064] Accordingly, in some embodiments of the invention, the base editing fusion protein further comprises a uracil DNA glycosylase inhibitor (UGI). In some embodiments, the uracil DNA glycosylase inhibitor comprises the amino acid sequence set forth in SEQ ID NO: 15.
[0065] In some embodiments of the invention, the base editing fusion protein of the invention further comprises a nuclear localization sequence (NLS). In general, one or more NLSs in the base editing fusion protein should have sufficient strength to drive the accumulation of the base editing fusion protein in the nucleus of a plant cell in an amount sufficient for the base editing function. In general, the strength of the nuclear localization activity is determined by the number and position of NLSs, and one or more specific NLSs used in the base editing fusion protein, or a combination thereof.
[0066] In some embodiments of the present invention, the NLSs of the base editing fusion protein of the invention may be located at the N-terminus and/or the C-terminus. In some embodiments, the base editing fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs. In some embodiments, the base editing fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the N-terminus. In some embodiments, the base editing fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the C-terminus. In some embodiments, the base editing fusion protein comprises a combination of these, such as one or more NLSs at the N-terminus and one or more NLSs at the C-terminus. Where there is more than one NLS, each NLS may be selected independently of the other NLSs. In some preferred embodiments of the invention, the base editing fusion protein comprises two NLSs, for example, the two NLSs are located at the N-terminus and the C-terminus, respectively.
[0067] In general, an NLS consists of one or more short sequences of positively charged lysine or arginine residues exposed on the surface of a protein, but other types of NLS are also known in the art. Non-limiting examples of NLSs include KKRKV (nucleotide sequence 5′-AAGAAGAGAAAGGTC-3′), PKKKRKV (nucleotide sequence 5′-CCCAAGAAGAAGAGGAAGGTG-3′ or 5′-CCAAAGAAGAAGAGGAAGGTT-3′), or SGGSPKKKRKV (nucleotide sequence 5′-TCGGGGGGGAGCCCAAAGAAGAAGCGGAAGGTG-3′).
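The correspondence between the example NLS peptides and their nucleotide sequences can be checked with a short Python sketch that applies the standard genetic code. The helper name `translate` and the list `NLS_EXAMPLES` are illustrative assumptions; the sequences themselves are those listed above.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(nucleotides):
    """Translate a DNA coding sequence (length divisible by 3) into one-letter amino acids."""
    nucleotides = nucleotides.upper()
    if len(nucleotides) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(
        CODON_TABLE[nucleotides[i:i + 3]] for i in range(0, len(nucleotides), 3)
    )

# (peptide, nucleotide sequence) pairs as given in the text.
NLS_EXAMPLES = [
    ("KKRKV", "AAGAAGAGAAAGGTC"),
    ("PKKKRKV", "CCCAAGAAGAAGAGGAAGGTG"),
    ("PKKKRKV", "CCAAAGAAGAAGAGGAAGGTT"),
    ("SGGSPKKKRKV", "TCGGGGGGGAGCCCAAAGAAGAAGCGGAAGGTG"),
]

for peptide, nucleotides in NLS_EXAMPLES:
    print(peptide, translate(nucleotides) == peptide)
```

The two nucleotide sequences given for PKKKRKV differ only in synonymous codons, so both translate to the same peptide.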
[0068] In some embodiments of the invention, the N-terminus of the base editing fusion protein comprises an NLS with an amino acid sequence shown by PKKKRKV. In some embodiments of the invention, the C-terminus of the base-editing fusion protein comprises an NLS with an amino acid sequence shown by SGGSPKKKRKV.
[0069] In addition, the base editing fusion protein of the present invention may also include other localization sequences, such as cytoplasmic localization sequences, chloroplast localization sequences, mitochondrial localization sequences, and the like, depending on the location of the DNA to be edited.
[0070] In some embodiments of the present invention, the base editing fusion protein comprises the amino acid sequence set forth in SEQ ID NO: 22 or 23.
[0071] In order to obtain efficient expression in plants, in some embodiments of the invention, the nucleotide sequence encoding the base editing fusion protein is codon optimized for the plant to be base edited.
[0072] Codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g. about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the Codon Usage Database available at www.kazusa.or.jp/codon/, and these tables can be adapted in a number of ways. See Nakamura, Y., et al. Codon usage tabulated from the international DNA sequence databases: status for the year 2000. Nucl. Acids Res. 28:292 (2000).
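The replacement of codons described above can be illustrated with a minimal Python sketch. It is one simple strategy (always substituting a single preferred synonymous codon) and is not stated to be the procedure used to obtain the sequences of the invention. The table `PREFERRED_CODON` contains placeholder choices for a hypothetical host, not measured codon usage data, and the names `CODON_TO_AA`, `translate` and `optimize` are likewise assumptions made for illustration.

```python
# Placeholder preferred codons for a hypothetical host plant. Real preferences
# should be taken from a codon usage table for the plant to be edited.
PREFERRED_CODON = {"K": "AAG", "R": "AGG", "P": "CCG", "V": "GTG"}

# Subset of the standard genetic code covering the codons used in this example.
CODON_TO_AA = {
    "AAA": "K", "AAG": "K",
    "AGA": "R", "AGG": "R", "CGG": "R",
    "CCA": "P", "CCC": "P", "CCG": "P",
    "GTC": "V", "GTG": "V", "GTT": "V",
}

def split_codons(cds):
    """Split a coding sequence into consecutive three-nucleotide codons."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]

def translate(cds):
    """Translate a coding sequence using the example codon subset."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(cds))

def optimize(cds):
    """Replace each codon with the preferred synonymous codon for the host."""
    return "".join(PREFERRED_CODON[CODON_TO_AA[codon]] for codon in split_codons(cds))

# Hypothetical native coding sequence for the peptide PKKKRKV.
native = "".join(["CCA", "AAG", "AAA", "AAA", "AGA", "AAA", "GTT"])
optimized = optimize(native)

print(native, translate(native))
print(optimized, translate(optimized))
```

The encoded amino acid sequence is the same before and after the substitution, which is the defining constraint of codon optimization as described above.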
[0073] In some embodiments of the invention, the codon-optimized nucleotide sequence encoding the base editing fusion protein is set forth in SEQ ID NO: 19 or 20.
[0074] In some embodiments of the invention, the guide RNA is a single guide RNA (sgRNA). Methods of constructing suitable sgRNAs according to a given target sequence are known in the art. See e.g., Wang, Y. et al. Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew. Nat. Biotechnol. 32, 947-951 (2014); Shan, Q. et al. Targeted genome modification of crop plants using a CRISPR-Cas system. Nat. Biotechnol. 31, 686-688 (2013); Liang, Z. et al. Targeted mutagenesis in Zea mays using TALENs and the CRISPR/Cas system. J Genet Genomics. 41, 63-68 (2014).
[0075] In some embodiments of the invention, the nucleotide sequence encoding the base editing fusion protein and/or the nucleotide sequence encoding the guide RNA is operably linked to a plant expression regulatory element, such as a promoter.
[0076] Examples of promoters that can be used in the present invention include but are not limited to the cauliflower mosaic virus 35S promoter (Odell et al. (1985) Nature 313: 810-812), a maize Ubi-1 promoter, a wheat U6 promoter, a rice U3 promoter, a maize U3 promoter, a rice actin promoter, a TrpPro5 promoter (U.S. patent application Ser. No. 10/377,318; filed on Mar. 16, 2005), a pEMU promoter (Last et al. Theor. Appl. Genet. 81: 581-588), a MAS promoter (Velten et al. (1984) EMBO J. 3: 2723-2730), a maize H3 histone promoter (Lepetit et al. Mol. Gen. Genet. 231: 276-285 and Atanassova et al. (1992) Plant J. 2 (3): 291-300), and a Brassica napus ALS3 promoter (PCT Application WO 97/41228). Promoters that can be used in the present invention also include the commonly used tissue-specific promoters as reviewed in Moore et al. (2006) Plant J. 45 (4): 651-683.
3. The Method of Producing a Genetically Modified Plant
[0077] In another aspect, the present invention provides a method of producing a genetically modified plant, comprising introducing a system for performing base editing to a target sequence in a plant genome of the invention into a plant, and thereby said base editing fusion protein is targeted to the target sequence in said plant genome by the guide RNA, and results in one or more C to T substitutions in said target sequence.
[0078] The design of the target sequence that can be recognized and targeted by a Cas9 and guide RNA complex is within the technical skills of one of ordinary skill in the art. In general, the target sequence is a sequence that is complementary to a leader sequence of about 20 nucleotides comprised in the guide RNA, and the 3′-end of which is immediately adjacent to the protospacer adjacent motif (PAM) NGG.
[0079] For example, in some embodiments of the invention, the target sequence has the structure: 5′-Nx-NGG-3′, wherein N is selected independently from A, G, C, and T; X is an integer of 14≤X≤30; Nx represents X contiguous nucleotides, and NGG is a PAM sequence. In some specific embodiments of the invention, X is 20.
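The 5′-Nx-NGG-3′ structure described above can be sketched as a simple sequence scan. The following is only an illustrative sketch, not part of the disclosed method: the function name `find_targets` is hypothetical, only the strand given as input is scanned, and no filtering for specificity is attempted.

```python
import re

def find_targets(seq, x=20):
    """Return (start, protospacer, pam) for each 5'-Nx-NGG-3' site on the given strand.

    start is the 0-based index of the first protospacer nucleotide.
    A lookahead is used so that overlapping sites are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{x}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Example with the ZmCENH3 target sequence given in this document.
sites = find_targets("AGCCCTCCTTGCGCTGCAAGAGG")
for start, protospacer, pam in sites:
    print(start, protospacer, pam)
```

A practical design tool would also scan the reverse complement and screen candidate sites against the genome; both steps are omitted here.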
[0080] The base editing system of the present invention has a broad deamination window in plants, for example, a deamination window with a length of 7 nucleotides. In some embodiments of the methods of the invention, one or more C bases within positions 3 to 9 of the target sequence are substituted with Ts. For example, if present, any one, two, three, four, five, six, or seven Cs within positions 3 to 9 in the target sequence can be replaced with Ts. For example, if there are four Cs within positions 3 to 9 of the target sequence, any one, two, three, or four Cs can be replaced by Ts. The C bases may be contiguous or separated by other nucleotides. Therefore, if there are multiple Cs in the target sequence, a variety of mutation combinations can be obtained by the method of the present invention. In some embodiments, the method of the invention further comprises screening for plants having the desired nucleotide substitutions. Nucleotide substitutions in plants can be detected by T7EI, PCR/RE or sequencing methods, see e.g., Shan, Q., Wang, Y., Li, J. & Gao, C. Genome editing in rice and wheat using the CRISPR/Cas system. Nat. Protoc. 9, 2395-2410 (2014).
[0081] In the methods of the invention, the base editing system can be introduced into plants by various methods well known to people skilled in the art. Methods that can be used to introduce the base editing system of the present invention into plants include, but are not limited to, particle bombardment, PEG-mediated protoplast transformation, Agrobacterium-mediated transformation, plant virus-mediated transformation, the pollen tube approach, and the ovary injection approach.
[0082] In the methods of the present invention, modification of the target sequence can be accomplished simply by introducing or producing the base editing fusion protein and guide RNA in plant cells, and the modification can be stably inherited without the need for stable transformation of plants with the base editing system. This avoids potential off-target effects of a stable base editing system, and also avoids the integration of exogenous nucleotide sequences into the plant genome, thereby resulting in higher biosafety.
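The combinatorics of the deamination window can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, positions are counted 1-based from the 5′ end of the protospacer, and every C in positions 3 to 9 is treated as equally editable, which real editing frequencies need not reflect.

```python
from itertools import combinations

def window_cytosines(protospacer, start=3, end=9):
    """1-based positions of C bases inside the deamination window."""
    return [i for i in range(start, end + 1) if protospacer[i - 1] == "C"]

def edited_variants(protospacer, start=3, end=9):
    """All sequences obtainable by converting one or more window Cs to T."""
    cs = window_cytosines(protospacer, start, end)
    variants = []
    for k in range(1, len(cs) + 1):
        for combo in combinations(cs, k):
            bases = list(protospacer)
            for pos in combo:
                bases[pos - 1] = "T"
            variants.append("".join(bases))
    return variants

# OsCDC48 protospacer from Table 1 (PAM removed): four Cs fall in the window.
protospacer = "GACCAGCCAGCGTCTGGCGC"
print(window_cytosines(protospacer))
print(len(edited_variants(protospacer)))
```

With n cytosines in the window there are 2^n - 1 non-empty combinations, which is why a target with four window Cs can in principle yield fifteen distinct edited sequences.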
[0083] In some preferred embodiments, the introduction is performed in the absence of a selective pressure, thereby avoiding the integration of exogenous nucleotide sequences in the plant genome.
[0084] In some embodiments, the introduction comprises transforming the base editing system of the invention into isolated plant cells or tissues, and then regenerating the transformed plant cells or tissues into an intact plant. Preferably, the regeneration is performed in the absence of a selective pressure, i.e., no selective agent against the selective gene carried on the expression vector is used during the tissue culture. Without the use of a selective agent, the regeneration efficiency of the plant can be increased to obtain a modified plant that does not contain exogenous nucleotide sequences.
[0085] In other embodiments, the base editing system of the present invention can be transformed to a particular site on an intact plant, such as leaf, shoot tip, pollen tube, young ear, or hypocotyl. This is particularly suitable for the transformation of plants that are difficult to regenerate by tissue culture.
[0086] In some embodiments of the invention, proteins expressed in vitro and/or RNA molecules transcribed in vitro are directly transformed into the plant. The proteins and/or RNA molecules are capable of achieving base-editing in plant cells, and are subsequently degraded by the cells to avoid the integration of exogenous nucleotide sequences into the plant genome.
[0087] Plants that can be base-edited by the methods of the invention include monocotyledons and dicotyledons. For example, the plant may be a crop plant such as wheat, rice, maize, soybean, sunflower, sorghum, rape, alfalfa, cotton, barley, millet, sugar cane, tomato, tobacco, cassava, or potato.
[0088] In some embodiments of the invention, the target sequence is associated with plant traits such as agronomic traits, and thereby the base editing results in the plant having altered traits relative to a wild type plant.
[0089] In the present invention, the target sequence to be modified may be located anywhere in the genome, for example, within a functional gene such as a protein-coding gene or, for example, may be located in a gene expression regulatory region such as a promoter region or an enhancer region, and thereby accomplish the functional modification of said gene or accomplish the modification of a gene expression. Accordingly, in some embodiments of the invention, C to T substitution(s) results in amino acid substitutions in a target protein or the truncation of a target protein (resulting in a stop codon). In other embodiments of the invention, C to T substitution(s) results in a change in the expression of a target gene.
[0090] In some embodiments, the gene modified by the methods of the invention can be wheat LOX2, rice CDC48, NRT1.1B and SPL14, maize CENH3, and ALS gene.
[0091] In some embodiments of the invention, the method further comprises obtaining progeny of the genetically modified plant.
[0092] In a further aspect, the invention also provides a genetically modified plant or progeny thereof or parts thereof, wherein the plant is obtained by the method of the invention described above.
[0093] In another aspect, the present invention also provides a plant breeding method comprising crossing a first genetically modified plant obtained by the above-mentioned method of the present invention with a second plant not containing said genetic modification, thereby introducing said genetic modification into said second plant.
4. Production Method of Maize Haploid Plants and the Use Thereof
[0094] CENH3 encodes centromeric histones, which are essential for the normal functioning of centromeres in animals and plants. TILLING studies have shown that amino acid substitutions of several residues in the C-terminal region of Arabidopsis CENH3 (AtCENH3) can result in haploid induction, which is advantageous for accelerating crop breeding. Substitution of a highly conserved leucine residue in the CENP-A targeting domain (CATD,
[0095] The present invention provides a method of producing a maize haploid inducer line comprising modifying the ZmCENH3 gene in a maize plant by the base editing method of the invention, resulting in one or more amino acid substitutions in the CATD domain in ZmCENH3, the one or more amino acid substitutions confer a haploid inducer activity to the maize plant.
[0096] In a specific embodiment, the modification results in one or more amino acid substitutions in the conserved motif alanine-leucine-leucine (ALL) at positions 109-111 in ZmCENH3 of SEQ ID NO: 25. For example, the ALL motif is modified as single substitution: AFL, VLL, or ALF; double substitution: AFF, VFL, or VLF; or triple substitution: VFF.
[0097] In some embodiments, the target sequence for modification of ZmCENH3 by the base editing method of the invention is AGCCCTCCTTGCGCTGCAAGAGG, wherein the underlined sequence is a PAM sequence.
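The relationship between the target sequence and the listed ALL-motif substitutions can be checked with a short sketch. The reading frame is an assumption made here for illustration: the codons GCC-CTC-CTT starting at position 2 of the target are taken as the ALL motif because they translate to alanine-leucine-leucine, and the target is assumed to lie on the coding strand. The names `MOTIF_START` and `motif_variants` are hypothetical.

```python
from itertools import combinations, product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

TARGET = "AGCCCTCCTTGCGCTGCAAG"  # ZmCENH3 protospacer, PAM (AGG) removed
MOTIF_START = 2                  # assumed 1-based position where GCC (Ala) begins

def translate(dna):
    """Translate complete codons of a DNA string, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def motif_variants(target=TARGET, window=(3, 9)):
    """Amino acid motifs reachable by C-to-T edits within the deamination window."""
    cs = [p for p in range(window[0], window[1] + 1) if target[p - 1] == "C"]
    outcomes = set()
    for k in range(1, len(cs) + 1):
        for combo in combinations(cs, k):
            bases = list(target)
            for p in combo:
                bases[p - 1] = "T"
            motif_dna = "".join(bases)[MOTIF_START - 1:MOTIF_START + 8]
            outcomes.add(translate(motif_dna))
    return outcomes

print(sorted(motif_variants() - {"ALL"}))
```

Under these assumptions the non-silent outcomes coincide with the single, double, and triple substitutions listed for the ALL motif; edits at the third codon positions are silent and leave the motif unchanged.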
[0098] In some embodiments, the maize plant is a Zong31 variety. In some embodiments, the maize plant is a HiII variety.
[0099] The present invention provides a method of producing maize haploid comprising crossing a maize haploid inducer line obtained by the method of the present invention with a wild-type maize plant, and harvesting the hybrid progeny to obtain a maize haploid plant. In some embodiments, the maize haploid inducer line is used as a male parent and the wild-type maize plant is used as a maternal parent for cross.
[0100] The present invention also encompasses the maize haploid inducer line and the maize haploid plant obtained by the methods of the invention and their use in maize breeding.
5. Methods of Producing Herbicide-resistant Maize Plants
[0101] The present invention provides a method of producing a herbicide-resistant maize plant comprising modifying the ZmALS gene (encoding acetolactate synthase) in a maize plant by the base editing method of the invention, resulting in one or more amino acid substitutions in ZmALS, said one or more amino acid substitutions confer herbicide resistance to the maize plant.
[0102] In a specific embodiment, the modification simultaneously results in one or more amino acid substitutions of both ZmALS1 of SEQ ID NO: 27 and ZmALS2 of SEQ ID NO: 29. For example, the 165th residues of ZmALS1 and ZmALS2 are substituted.
[0103] In some embodiments, the target sequence for the modification of ZmALS1 and ZmALS2 by the base editing methods of the present invention is CAGGTGCCGCGACGCATGATTGG, wherein the underlined sequence is a PAM sequence.
[0104] In some embodiments, the maize plant is a Zong31 variety. In some embodiments, the maize plant is a HiII variety.
[0105] The present invention also provides a method of breeding a herbicide-resistant maize plant, comprising crossing a first herbicide-resistant maize plant obtained by the above-described method of the present invention with a second plant so as to introduce the herbicide resistance into the second plant.
[0106] The present invention also encompasses herbicide-resistant maize plants or progeny thereof obtained by the methods of the invention.
[0107] A method of controlling undesired plants in a maize plant growing area, comprising applying an ALS inhibitor herbicide to the plants in the area, wherein the maize plant is a herbicide-resistant maize plant obtained by the method of the invention.
Examples
Materials and Methods
[0108] Construction of pn/dCas9-PBE Expression Vector
[0109] The APOBEC1, XTEN, nCas9(D10A), dCas9, and UGI sequences were codon-optimized for wheat (SEQ ID NO:1-5) and ordered from GenScript (Nanjing). The full-length n/dCas9 fragment was amplified using the primer set AflII-F (with an AflII restriction site) and MluI-R (with an MluI restriction site). The PCR products were digested with AflII and MluI, and then inserted into the pUC57-APOBEC1-XTEN-UGI vector digested with the same two enzymes (the sequence of the vector is set forth in SEQ ID NO: 10) to generate the fusion cloning vector pUC57-APOBEC1-XTEN-n/dCas9-UGI. Then the primer set BamHI-F and Bsp1047I-R was used to amplify the APOBEC1-XTEN-n/dCas9-UGI fragment. The products were digested with BamHI and Bsp1047I, and further inserted into BamHI- and Bsp1047I-digested pUbi-GFP (the sequence of the vector is set forth in SEQ ID NO: 8) to generate the fusion expression vector pn/dCas9-PBE.
Construction of sgRNA Expression Vectors
[0110] sgRNA target sequences in the experiment are shown in the Table 1 below:
TABLE 1 Target genes and gRNA target sequences
sgRNA | Target sequence | Oligo-F | Oligo-R
sgRNA-OsBFPm | ACCCACGGCGTGCAGTGCTTCGG | GGCAACCCACGGCGTGCAGTGCTT | AAACAAGCACTGCACGCCGTGGGT
sgRNA-TaBFPm | ACCCACGGCGTGCAGTGCTTCGG | CTTGACCCACGGCGTGCAGTGCTT | AAACAAGCACTGCACGCCGTGGGT
sgRNA-ZmBFPm | ACCCACGGCGTGCAGTGCTTCGG | AGCAACCCACGGCGTGCAGTGCTT | AAACAAGCACTGCACGCCGTGGGT
sgRNA-OsCDC48 | GACCAGCCAGCGTCTGGCGCCGG | GGCAGACCAGCCAGCGTCTGGCGC | AAACGCGCCAGACGCTGGCTGGTC
sgRNA-OsNRT1.1B | CGGCGACGGCGAGCAAGTGGAGG | GGCACGGCGACGGCGAGCAAGTGG | AAACCCACTTGCTCGCCGTCGCCG
sgRNA-OsSPL14 | CTCTTCTGTCAACCCAGCCATGG | GGCACTCTTCTGTCAACCCAGCCA | AAACTGGCTGGGTTGACAGAAGAG
sgRNA-TaLOX2-S1 | GTCGACATCAACAACCTCGACGG | CTTGGTCGACATCAACAACCTCGA | AAACTCGAGGTTGTTGATGTCGAC
sgRNA-TaLOX2-S2 | CTTCCTGGGCTACACGCTCAAGG | CTTGCTTCCTGGGCTACACGCTCA | AAACTGAGCGTGTAGCCCAGGAAG
sgRNA-TaLOX2-S3 | AAGGACCTCATCCCCATGGGCGG | CTTGAAGGACCTCATCCCCATGGG | AAACCCCATGGGGATGAGGTCCTT
sgRNA-ZmCENH3 | AGCCCTCCTTGCGCTGCAAGAGG | AGCAAGCCCTCCTTGCGCTGCAAG | AAACCTTGCAGCGCAAGGAGGGCT
In the original table, italic Cs represent Cs to be mutated in the deamination window from positions 3-9, and bold letters represent the PAM sequence (the last three nucleotides of each target sequence).
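The pattern relating each target sequence in Table 1 to its oligo pair can be sketched as below. This is a reading of the table rather than a stated protocol: the forward oligo appears to be a 4-nt overhang followed by the 20-nt protospacer, and the reverse oligo appears to be AAAC followed by the reverse complement of the protospacer. The overhangs in `PREFIX_F` are read off the table rows, the promoter keys are inferred from the sgRNA names, and the function name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Forward-oligo overhangs as they appear in Table 1 (assumed to depend on the
# sgRNA backbone: rice U3, wheat U6, maize U3).
PREFIX_F = {"OsU3": "GGCA", "TaU6": "CTTG", "ZmU3": "AGCA"}

def design_oligos(target_with_pam, promoter):
    """Return (oligo_f, oligo_r) for a 23-nt target ending in the NGG PAM."""
    protospacer = target_with_pam[:-3]  # drop the 3-nt PAM
    oligo_f = PREFIX_F[promoter] + protospacer
    oligo_r = "AAAC" + protospacer.translate(COMPLEMENT)[::-1]
    return oligo_f, oligo_r

# Reproduces the sgRNA-OsCDC48 row of Table 1.
print(design_oligos("GACCAGCCAGCGTCTGGCGCCGG", "OsU3"))
```

Annealing the two oligos leaves the 4-nt overhangs single-stranded, which is consistent with cloning into a BsaI-digested vector as described for the OsCDC48 construct.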
[0111] According to the previous description (Wang, Y. et al. Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew. Nat. Biotechnol. 32, 947-951, 2014; Shan, Q. et al. Targeted genome modification of crop plants using a CRISPR-Cas system. Nat. Biotechnol. 31, 686-688, 2013; and Liang, Z. et al. Targeted mutagenesis in Zea mays using TALENs and the CRISPR/Cas system. J Genet Genomics. 41, 63-68, 2014), sgRNA expression vectors were constructed based on pTaU6-sgRNA (Addgene ID53062), pOsU3-sgRNA (Addgene ID53063), or pZmU3-sgRNA (Addgene ID53061):
[0112] pTaU6-BFP-sgRNA, pOsU3-BFP-sgRNA, pZmU3-BFP-sgRNA, pTaU6-LOX2-S1-sgRNA, pTaU6-LOX2-S2-sgRNA, pTaU6-LOX2-S3-sgRNA, pOsU3-CDC48-sgRNA, pOsU3-NRT1.1-sgRNA, pOsU3-SPL14-sgRNA, and pZmU3-CENH3-sgRNA.
BFP and GFP Expression Vectors
[0113] The sequence of pUbi-BFPm is set forth in SEQ ID NO: 9. The amino acid sequence of BFP is set forth in SEQ ID NO: 17.
[0114] The sequence of pUbi-GFP is set forth in SEQ ID NO: 8. The amino acid sequence of GFP is set forth in SEQ ID NO: 16.
Construction of pAG-n/dCas9-PBE-CDC48-sgRNA Expression Vector
[0115] The APOBEC1-XTEN-d/nCas9-UGI fragment was fused to the StuI- and SacI-digested pHUE411 (Addgene #62203) with the primer set Gibson-F and Gibson-R by the Gibson cloning method, generating the vector pHUE411-APOBEC1-XTEN-d/nCas9-UGI without sgRNA target sites. Paired oligonucleotides (oligos) comprising OsCDC48 targeting sequences were synthesized, annealed and cloned into BsaI-digested pHUN411 vector, obtaining pHUE411-sgRNA-CDC48. Then the fragment resulting from the digestion of pHUE411-sgRNA-CDC48 with PmeI and AvrII was inserted into pHUE411-APOBEC1-XTEN-d/nCas9-UGI, which was also digested with PmeI and AvrII, to finally obtain the Agrobacterium-mediated transformation vector pAG-n/dCas9-PBE-CDC48-sgRNA.
Protoplast Assays
[0116] The wheat variety Bobwhite, the rice variety Nipponbare, and the maize inbred line Zong31 were used in this study. Protoplast transformation was performed as described below. The average transformation efficiency was 55-70%. Transformation was carried out with 10 μg of each plasmid by PEG-mediated transfection. Protoplasts were collected after 48 h and DNA was extracted for T7EI and PCR/RE assays.
Preparation and Transformation of Wheat (Maize) Protoplasts
[0117] 1) The middle parts of tender wheat (maize) leaves were cut into strips 0.5-1 mm wide. The strips were placed in 0.6 M mannitol solution for 10 minutes, filtered, and then placed in 50 ml enzyme solution at 20-25 °C in darkness with gentle shaking (10 rpm) for 5 hours.
[0118] 2) 10 ml W5 was added to dilute the enzymolysis products, and the products were filtered through a 75 μm nylon filter into a round-bottom centrifuge tube (50 ml).
[0119] 3) The filtrate was centrifuged at 100 × g and 23 °C for 3 min, and the supernatant was discarded.
[0120] 4) The products were gently resuspended in 10 ml W5 and placed on ice for 30 min to allow the protoplasts to settle gradually, and the supernatant was discarded.
[0121] 5) Protoplasts were resuspended in an appropriate amount of MMG and kept on ice until transformation.
[0122] 6) 10-20 μg plasmid, 200 μl protoplasts (about 4×10^5 cells), and 220 μl fresh PEG solution were added to a 2 ml centrifuge tube, mixed, and kept at room temperature in darkness for 10-20 minutes to induce transformation.
[0123] 7) After the induction of transformation, 880 μl W5 solution was slowly added, the tubes were gently inverted to mix and centrifuged horizontally at 100 × g for 3 min, and the supernatant was discarded.
[0124] 8) The products were resuspended in 2 ml W5 solution, transferred to a six-well plate, and cultivated at room temperature (or 25 °C) in darkness. For protoplast genomic DNA extraction, the products need to be cultivated for 48 h.
Preparation and Transformation of Rice Protoplast
[0125] 1) Leaf sheaths of the seedlings were used for protoplast isolation and were cut into strips about 0.5 mm wide with a sharp blade.
[0126] 2) Immediately after cutting, the strips were transferred into 0.6 M mannitol solution and placed in the dark for 10 min.
[0127] 3) The mannitol solution was removed by filtration, and the products were transferred into enzymolysis solution and placed under vacuum for 30 min.
[0128] 4) Enzymolysis was performed for 5-6 h in darkness with gentle shaking (decolorization shaker, speed 10).
[0129] 5) After completion of enzymolysis, an equal volume of W5 was added, followed by horizontal shaking for 10 s to release the protoplasts.
[0130] 6) Protoplasts were filtered through a 40 μm nylon membrane into a 50 ml round-bottom centrifuge tube and washed with W5 solution.
[0131] 7) The protoplasts were pelleted by horizontal centrifugation at 250 × g for 3 min, and the supernatant was discarded.
[0132] 8) Protoplasts were resuspended in 10 ml W5 and then centrifuged at 250 × g for 3 min, and the supernatant was discarded.
[0133] 9) An appropriate amount of MMG solution was added to resuspend the protoplasts to a concentration of 2×10^6/ml.
[0134] Note: All the above steps were carried out at room temperature.
[0135] 10) 10-20 μg plasmid, 200 μl protoplasts (about 4×10^5 cells), and 220 μl fresh PEG solution were added to a 2 ml centrifuge tube, mixed, and kept at room temperature in darkness for 10-20 minutes to induce transformation.
[0136] 11) After completion of the transformation, 880 μl W5 solution was slowly added, the tubes were gently inverted to mix and centrifuged horizontally at 250 × g for 3 min, and the supernatant was discarded.
[0137] 12) The products were resuspended in 2 ml WI solution, transferred to a six-well plate, and cultivated at room temperature (or 25 °C) in darkness. For protoplast genomic DNA extraction, the products need to be cultivated for 48 h.
Transformation of DNA Constructs into Wheat Calli by Particle Bombardment
[0138] Plasmid DNA (pnCas9-PBE and pTaU6-LOX2-S1-sgRNA) was used to bombard Bobwhite immature embryos. Particle bombardment transformation was performed as described previously (Zhang, K., Liu, J., Zhang, Y., Yang, Z. & Gao, C. Biolistic genetic transformation of a wide range of Chinese elite wheat (Triticum aestivum L.) varieties. J. Genet Genomics 42, 39-42 (2015)). After bombardment, the embryos were treated according to the literature, but no selection agents were used during tissue culture.
Transformation of pAG-n/dCas9-PBE-CDC48-sgRNA into Rice Calli by Agrobacterium
[0139] The pAG-n/dCas9-PBE-CDC48-sgRNA binary vector was transformed into Agrobacterium strain AGL1 by electroporation. Agrobacterium-mediated transformation, tissue culture, and regeneration of rice Nipponbare were performed according to Shan et al. (Shan, Q. et al. Targeted genome modification of crop plants using a CRISPR-Cas system. Nat. Biotechnol. 31, 686-688 (2013)). Hygromycin (50 μg/ml) was used for selection throughout the subsequent tissue culture process.
Identification of Mutations by T7EI and PCR/RE Assay
[0140] DNA from individual rice plant was extracted to detect the mutations by T7EI assay, and then mutations were confirmed by Sanger sequencing. In wheat, in order to save costs and labor, 3-4 plants were randomly selected as a group to detect mutations by T7EI and PCR/RE. Once a group showed a positive result, all the plants in the group were further tested by T7EI and PCR/RE, and then the results were confirmed by Sanger sequencing.
[0141] T7EI Detection Method:
[0142] 1) Genomic DNA was extracted from a plant, amplified by PCR and detected by electrophoresis.
[0143] 2) PCR products were added into T7EI buffer as follows:
10× T7EI buffer | 1.1 μl
PCR product | 5 μl
ddH2O | 4.4 μl
[0144] 3) The mixture was heated in the PCR device to 95 °C for 5 min, and then cooled to room temperature such that the PCR products re-anneal to form heteroduplex DNA.
[0145] 4) T7EI endonuclease was added as follows:
Annealed product from 3) | 10.5 μl
T7EI, 5 units/μl | 0.5 μl
Total volume | 11 μl
[0146] 5) The reaction was incubated at 37 °C for 1 h, and all 11 μl of the products were subjected to gel electrophoresis. Cleavage of the PCR products indicates that the products contain indel mutations.
[0147] PCR/RE:
[0148] 1) Plant genomic DNA was extracted.
[0149] 2) Fragments containing the target sites, 350-1000 bp in length, were amplified with synthetic gene-specific primers:
10× EasyTaq Buffer | 5 μl
dNTP (2.5 mM) | 4 μl
Forward primer (10 μM) | 2 μl
Reverse primer (10 μM) | 2 μl
EasyTaq | 0.5 μl
DNA | 2 μl
ddH2O | to 50 μl
[0150] 3) The general reaction conditions are: initial denaturation at 94 °C for 5 min; 30 to 35 cycles of denaturation at 94 °C for 30 s, annealing at 58 °C for 30 s, and extension at 72 °C for 30 s; final extension at 72 °C for 5 min; hold at 12 °C. 5 μl of the PCR products were subjected to electrophoresis.
[0151] 4) PCR products were digested with restriction endonuclease as follows:
10× FastDigest Buffer | 2 μl
Restriction enzyme | 1 μl
PCR product | 3-5 μl
ddH2O | to 20 μl
[0152] 5) Digestion was carried out at 37 °C for 2-3 h. Products were analyzed by 1.2% agarose gel electrophoresis.
[0153] 6) The uncut mutant bands in the PCR products were recovered and purified, and subjected to TA cloning as follows:
pEasy-T Vector | 1 μl
Recovered uncleaved PCR product | 4 μl
[0154] 7) The ligation was performed at 22 °C for 12 min. The products were then transformed into E. coli competent cells, which were plated on LB plates (Amp100, IPTG, and X-gal) and incubated at 37 °C for 12-16 h. White colonies were picked for identifying positive clones and sequencing.
In-Depth Sequencing
[0155] Different sgRNA expression vectors, in combination with pnCas9-PBE, pdCas9-PBE, and pwCas9 respectively, were transformed into wheat, rice, or maize protoplasts for 48 hours. After that, protoplasts were collected and DNA was extracted for in-depth sequencing. In the first round of PCRs, the target regions were amplified with site-specific primers (Table 5). In the second round of PCRs, forward and reverse tags were added to the ends of the PCR products for library construction (Table 5). Equal amounts of the different PCR products were pooled. The samples were then sequenced with an Illumina HiSeq 4000 at Beijing Genomics Institute.
Example 1. Base Editing of BFP in Plant Protoplasts by nCas9-PBE and dCas9-PBE
[0156] In the nCas9-PBE fusion protein, from N-terminal to C-terminal are NLS (SEQ ID NO: 30), APOBEC1 (SEQ ID NO: 11), XTEN linker (SEQ ID NO: 12), Cas9 nickase (nCas9, SEQ ID NO: 13), uracil DNA glycosylase inhibitor (UGI, SEQ ID NO: 15) and NLS (SEQ ID NO: 31), respectively; whereas in the dCas9-PBE fusion protein, from N-terminal to C-terminal are NLS, APOBEC1, XTEN linker, catalytically deactivated Cas9 (dCas9, SEQ ID NO: 14), UGI, and NLS, respectively. Codon-optimized fusion protein coding sequences for efficient expression in cereal crops were placed downstream of the Ubiquitin-1 gene promoter Ubi-1 in the plasmid constructs pnCas9-PBE and pdCas9-PBE (
[0157] The ability to convert blue fluorescent protein (BFP) into green fluorescent protein (GFP) of the two constructs in wheat and rice protoplasts was compared. This conversion involves mutating the first nucleotide of the 66th codon of the BFP encoding gene from C to T, resulting in alteration of CAC (histidine) to TAC (tyrosine).
[0158] The target region of BFP-specific sgRNA is designed to cover codons ranging from codon 65 to 71 and the first two nucleotides of the codon 72, and the last three bases (CAG) constitute the protospacer adjacent motif (PAM) (
[0159] Because CAG is not the optimal PAM for CRISPR/Cas9, CAG was artificially mutated into CGG, and the resulting BFP sequence (BFPm) was cloned downstream of the Ubi-1 promoter to form the expression construct pUbi-BFPm (
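The BFP-to-GFP conversion described above can be traced on the BFPm target sequence from Table 1. This is a small sketch under one stated assumption, taken from the description of the target region: codon 65 begins at protospacer position 1. All names in the block are illustrative.

```python
WINDOW = range(3, 10)  # protospacer positions 3-9 (1-based)

protospacer = "ACCCACGGCGTGCAGTGCTT"  # BFPm target from Table 1, PAM (CGG) removed
FIRST_CODON = 65                      # assumed codon starting at protospacer position 1

def codon_at(seq, codon_number):
    """Return the codon with the given number, counting from FIRST_CODON."""
    offset = (codon_number - FIRST_CODON) * 3
    return seq[offset:offset + 3]

def c_to_t(seq, position):
    """Return seq with the C at the given 1-based position replaced by T."""
    if seq[position - 1] != "C":
        raise ValueError(f"position {position} is not a C")
    return seq[:position - 1] + "T" + seq[position:]

# First nucleotide of codon 66, expressed as a protospacer position.
edit_position = (66 - FIRST_CODON) * 3 + 1
edited = c_to_t(protospacer, edit_position)

print(edit_position, edit_position in WINDOW)
print(codon_at(protospacer, 66), "->", codon_at(edited, 66))
```

Under this assumption the edited cytosine sits at protospacer position 4, inside the 3-9 window, and the codon changes from CAC (histidine) to TAC (tyrosine) as stated in the text.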
[0160] The combination of pnCas9-PBE, pUbi-BFPm, and pOsU3-BFP-sgRNA resulted in the expression of GFP in 5.8% of cells when introduced into rice protoplasts, whereas the replacement of pnCas9-PBE with pdCas9-PBE resulted in only 0.5% GFP-expressing cells; no GFP-expressing cell was detected when pnCas9-PBE or pdCas9-PBE were not included. 58.4% of cells expressed GFP (
[0161] In wheat protoplasts, more GFP-expressing cells were produced by using pnCas9-PBE (6.8%) than pdCas9-PBE (0.3%).
[0162] In-depth sequencing of rice (wheat or maize) protoplasts transformed with pnCas9-PBE, pUbi-BFPm, and pOsU3-BFP-sgRNA (pTaU6-BFP-sgRNA or pZmU3-BFP-sgRNA) showed that about 4.00% of the total DNA reads carrying C to T mutation (
[0163] Therefore, the results of the fluorescent protein reporter assay show that both nCas9-PBE and dCas9-PBE are able to convert C to T in the target region in wheat, rice, and maize protoplasts, and that the deamination window encompasses positions 3-9 of the protospacer sequence (target sequence). Moreover, the activity of nCas9-PBE is stronger than that of dCas9-PBE.
Example 2. Base Editing of Endogenous Genes in Plant Protoplasts by nCas9-PBE and dCas9-PBE
[0164] In this example, the activity of pnCas9-PBE or pdCas9-PBE on wheat, rice, and maize endogenous genes was further studied. As a control for the induction of indel, a construct pwCas9 (Addgene ID53064) expressing wild-type Cas9 was also used in this experiment.
[0165] Three different sgRNA target sites (S1, S2, and S3) were designed in the TaLOX2 gene of wheat. For each of the three rice genes OsCDC48, OsNRT1.1B, and OsSPL14, one sgRNA target site was designed. One sgRNA target site was designed in the maize ZmCENH3 gene (see Table 1).
[0166] Each sgRNA expression construct was combined with pnCas9-PBE, pdCas9-PBE, and pwCas9, respectively, and co-expressed in wheat, rice, and maize protoplasts, respectively. Protoplast DNA was extracted. PCR amplicons for seven different targets were prepared and sequenced. 100000 to 270000 sequencing reads were used for detailed analysis of the mutagenicity.
[0167] For pnCas9-PBE, C to T transitions were observed in all 7 targets, and the deamination window encompasses positions 3-9 of the protospacer sequence (target sequence) (
[0168] On the other hand, the expression of pdCas9-PBE produced only a low frequency of single C-editing (<0.96%) or multiple Cs-editing (<1.29%) at 7 target sites (
[0169] Using ZmCENH3 as an example, the amino acid mutation caused by nCas9-PBE in the target genome region was analyzed. As shown in
TABLE 2 Efficiency of multiple C substitutions induced by dCas9-PBE treatment
Gene name | Total reads | 2 Cs: mutant reads | 2 Cs: mutagenesis frequency (a) | 3 Cs: mutant reads | 3 Cs: mutagenesis frequency (a) | 4 Cs: mutant reads | 4 Cs: mutagenesis frequency (a) | 5 Cs: mutant reads | 5 Cs: mutagenesis frequency (a)
sgRNA-TaLOX2-s1 | 197064 | 72 | 0.04% | 4 | 0.00% | n/a | n/a | n/a | n/a
sgRNA-TaLOX2-s2 | 149200 | 420 | 0.28% | n/a | n/a | n/a | n/a | n/a | n/a
sgRNA-TaLOX2-s3 | 96396 | 1244 | 1.29% | n/a | n/a | n/a | n/a | n/a | n/a
sgRNA-OsCDC48 | 154760 | 40 | 0.03% | 0 | 0.00% | 0 | 0.00% | n/a | n/a
sgRNA-OsNRT1.1B | 142130 | 0 | 0.00% | n/a | n/a | n/a | n/a | n/a | n/a
sgRNA-OsSPL14 | 96659 | 36 | 0.04% | n/a | n/a | n/a | n/a | n/a | n/a
sgRNA-ZmCENH3 | 275466 | 36 | 0.01% | 0 | 0.00% | 0 | 0.00% | 0 | 0.00%
n/a represents not applicable; (a) the number of mutant reads relative to the number of total reads.
TABLE 3 Efficiency of Indel Induction
Target | nCas9-PBE | dCas9-PBE | wCas9 | Untreated
TaLOX2-S1 | 0.01 | 0.01 | 11.68 | 0.01
TaLOX2-S2 | 0.03 | 0.02 | 6.92 | 0.02
TaLOX2-S3 | 0.04 | 0.03 | 7.10 | 0.06
OsCDC48 | 0.22 | 0.13 | 9.04 | 0.02
OsNRT1.1B | 0.06 | 0.05 | 7.12 | 0.03
OsSPL14 | 0.08 | 0.14 | 7.34 | 0.04
ZmCENH3 | 0.02 | 0.01 | 6.27 | 0.01
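The mutagenesis frequencies in Table 2 follow from the footnote definition (mutant reads relative to total reads). A minimal sketch that recomputes a few of them from the read counts in the table is given below; the function name is illustrative.

```python
def mutagenesis_frequency(mutant_reads, total_reads):
    """Mutant reads as a percentage of total reads."""
    return 100.0 * mutant_reads / total_reads

# (target, mutant reads, total reads) taken from the first value pair of each Table 2 row.
rows = [
    ("sgRNA-TaLOX2-s1", 72, 197064),
    ("sgRNA-TaLOX2-s2", 420, 149200),
    ("sgRNA-TaLOX2-s3", 1244, 96396),
    ("sgRNA-OsCDC48", 40, 154760),
    ("sgRNA-ZmCENH3", 36, 275466),
]

for name, mutant, total in rows:
    print(f"{name}: {mutagenesis_frequency(mutant, total):.2f}%")
```

Rounded to two decimals, the recomputed values agree with the frequencies listed in the table.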
Example 3. Analysis for Mutant Plants
[0170] This example further investigated the functional differences between nCas9-PBE and dCas9-PBE and tested the functional properties of nCas9-PBE by analyzing mutant plants.
[0171] For wheat transformation, pnCas9-PBE and pTaU6-sgRNA-LOX2-s1 (
[0172] Rice variety Nipponbare (Japonica) was used for rice transformation, because this variety is highly responsive to genetic manipulation and its whole genome sequence is available. The OsCDC48 gene, which has been found to regulate rice senescence and cell death, is adopted as a target (
[0173] 92 and 87 independent transgenic T0 plants were obtained for nCas9-PBE and dCas9-PBE, respectively. After T7EI analysis and Sanger sequencing, it is found that among the 92 nCas9-PBE plants, 40 of them carry at least one C to T substitution in the OsCDC48 target region, and the mutant production efficiency is 43.48% (40/92) (
[0174] The potential off-target effect of the base editing method of the present invention was investigated based on the 40 nCas9-PBE mutant lines obtained. Five possible off-target sites were found in the Nipponbare genomic sequence, each of which has three nucleotide mismatches with the sgRNA target region of OsCDC48 (Table 4). Amplicons of these 5 sites were carefully analyzed using the T7EI assay and Sanger sequencing. No point mutation or indel was detected, indicating that base editing by nCas9-PBE is highly specific.
TABLE 4 Potential off-target effect analysis for OsCDC48
Site name | Sequence | Number of mismatches | Gene | Mutagenesis frequency | Sequencing method
On-target | GACCAGCCAGCGTCTGGCGCCGG | | LOC_Os03g05730 | |
OT-1 | GACCAGCCgGCGTgTGGtGCAGG | 3 | LOC_Os12g09720 | 0 | T7EI/Sanger sequencing
OT-2 | GACCAGCCgGCGTgTGGtGCAGG | 3 | LOC_Os12g09700 | 0 | T7EI/Sanger sequencing
OT-3 | GACCAGCCgGCGTgTGGtGCAGG | 3 | LOC_Os12g14440 | 0 | T7EI/Sanger sequencing
OT-4 | GACCAGCCAGCGTCTGaaGgCGG | 3 | LOC_Os03g19390 | 0 | T7EI/Sanger sequencing
OT-5 | GACCAagCAGCGgCTGGCGCCGG | 3 | LOC_Os04g34030 | 0 | T7EI/Sanger sequencing
Lowercase bases are bases mismatched with OsCDC48. In the original table, bold letters show the PAM sequence (the last three nucleotides of each sequence).
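The mismatch counts in Table 4 can be reproduced with a position-by-position comparison of each candidate protospacer against the OsCDC48 target. The sketch below is illustrative only: it compares the 20 nucleotides upstream of the PAM, ignores the PAM itself, and uses a hypothetical helper name.

```python
def mismatch_positions(on_target, candidate):
    """1-based protospacer positions at which two 23-nt sites differ (PAM excluded)."""
    a = on_target.upper()[:-3]
    b = candidate.upper()[:-3]
    if len(a) != len(b):
        raise ValueError("sites must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]

ON_TARGET = "GACCAGCCAGCGTCTGGCGCCGG"
OFF_TARGETS = {
    "OT-1": "GACCAGCCgGCGTgTGGtGCAGG",  # OT-2 and OT-3 carry the same sequence
    "OT-4": "GACCAGCCAGCGTCTGaaGgCGG",
    "OT-5": "GACCAagCAGCGgCTGGCGCCGG",
}

for name, seq in OFF_TARGETS.items():
    print(name, mismatch_positions(ON_TARGET, seq))
```

Each listed site differs from the on-target protospacer at exactly three positions, in agreement with the table.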
Example 4. Base Editing of Maize ALS Gene
[0175] Using AtALS (AT3G48560.1) as a seed sequence, two ZmALS homologous genes are obtained by BLASTN alignment analysis in a maize database (https://phytozome.jgi.doe.gov), wherein the two ZmALS homologous genes are named as ZmALS1 (Locus Name: GRMZM2G143357) and ZmALS2 (Locus Name: GRMZM2G143008). ZmALS1 and ZmALS2 have a sequence identify of 93.84%. A common sgRNA target sequence was designed in the conserved region of ZmALS1 (SEQ ID NO: 26) and ZmALS2 (SEQ ID NO: 28) (Svitashev et al., Plant Physiol., 2015), which is CAGGTGCCGCGACGCATGATTGG. A base editing system was constructed as described above, and transformed into maize variety Zong31 by particle bombardment method. After PCR/RE detection, plants with C7 to T7 substitution on ZmALS1 and ZmALS2 were obtained (
[0176] Based on the data shown above, the modified and optimized nCas9-PBE induces highly efficient and specific C to T base editing in wheat, rice, and maize cells, and the point mutations produced by nCas9-PBE differ from those produced by TILLING. In combination with a properly engineered sgRNA, nCas9-PBE makes it possible to introduce point mutations at a desired residue and its adjacent residues, which facilitates analysis of the effect of single or combined amino acid mutations in a key domain of a protein. By contrast, point mutations identified by TILLING usually affect only a single amino acid residue, so it is difficult to obtain mutations of the target residue and its adjacent residues simultaneously by TILLING. Therefore, nCas9-PBE is clearly advantageous for the rapid generation of multiple mutations and can be used for detailed analysis of the function of one or more amino acids in a key protein domain. Important functional properties of nCas9-PBE include a relatively large deamination window (covering 7 bases of the protospacer sequence/target sequence), base editing that is independent of the sequence context of the target region, and few indel mutations. The larger deamination window of nCas9-PBE in cereal plants is advantageous for generating more diverse mutations.
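The link between a 7-base deamination window and mutation diversity can be illustrated by enumerating the possible C to T combinations inside such a window. This is a sketch under stated assumptions: the window is taken to span protospacer positions 3 to 9 (1-based, counted from the PAM-distal end), which is an illustrative choice and is not specified in this passage, and the OsCDC48 protospacer from Table 4 is used as the example. The function `edit_outcomes` is hypothetical, and the enumeration lists sequence-level possibilities only, not observed products.

```python
from itertools import combinations

# OsCDC48 protospacer (Table 4 on-target sequence without the 3-nt PAM).
OSCDC48 = "GACCAGCCAGCGTCTGGCGC"


def edit_outcomes(protospacer, window_start=3, window_end=9):
    """List every sequence obtainable by converting one or more Cs to T.

    Only Cs inside the window [window_start, window_end] (1-based, inclusive)
    are considered; the window bounds are assumptions for illustration.
    """
    c_indices = [
        i for i in range(window_start - 1, window_end) if protospacer[i] == "C"
    ]
    outcomes = []
    for k in range(1, len(c_indices) + 1):
        for subset in combinations(c_indices, k):
            bases = list(protospacer)
            for i in subset:
                bases[i] = "T"
            outcomes.append("".join(bases))
    return outcomes


outcomes = edit_outcomes(OSCDC48)
print(len(outcomes), "possible edited sequences")
for seq in outcomes:
    print(seq)
```

With four cytosines in the assumed window, the count of non-empty combinations is 2^4 - 1 = 15, which shows how a wider window increases the number of distinct single and combined substitutions available at one target site.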
[0177] The present invention demonstrates that C to T base-editing, which is mediated by the Cas9 variant-cytidine deaminase fusion protein, is a highly efficient tool for producing site-directed point mutations in the plant genome, and thereby increases the efficiency of improving crops through genomic engineering.
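TABLE-US-00010 TABLE 5 All of the Primers Used in the Examples
Primer Name | Primer Sequence (5'-3') | Application
BFPm-F | ACGGCGTGCAGTGCTTCGGCCGCTACCCCGACCA | Construct pJIT163-Ubi-BFPm
BFPm-R | TGGTCGGGGTAGCGGCCGAAGCACTGCACGCCGT | Construct pJIT163-Ubi-BFPm
AfIII-n/dCas9-F | GGCTTAAGGACAAGAAGTACTCGATCGGCCT | Amplify n/dCas9 segment
MluI-n/dCas9-R | GCGACGCGTCTTCTTCTTCTTTGCTTGCCCTGC | Amplify n/dCas9 segment
BamHI-n/dCas9-F | CGGGATCCATGCCAAAGAAGAAGAGGAAGGTTTCATC | Amplify APOBEC1-XTEN-n/dCas9 segment
Bsp1047I-n/dCas9-R | CCGTGTACACTACACCTTCCGCTTCTTCTTTGGGCTC | Amplify APOBEC1-XTEN-n/dCas9 segment
Gibson-F | AATACTTGTATGGCCGCGGCCATGCCAAAGAAGAAGAGG | Amplify APOBEC1-XTEN-n/dCas9 segment
Gibson-R | ACTTGTATGGAGGCCTGAGCTCTACACCTTCCGCTTCTT | Amplify APOBEC1-XTEN-n/dCas9 segment
BFP-F | ATGGTGAGCAAGGGCGAGGAG | Amplify BFP gene target site and for first-round PCR for in-depth sequencing
BFP-R | CCTCGATGTTGTGGCGGATCT | Amplify BFP gene target site and for first-round PCR for in-depth sequencing
OsBFP-nCas9-F | CGATGTCGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in nCas9-PBE treated rice protoplasts
OsBFP-nCas9-R | TGGTCAAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in nCas9-PBE treated rice protoplasts
OsBFP-dCas9-F | ATCACGCGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in dCas9-PBE treated rice protoplasts
OsBFP-dCas9-R | GCCTAAAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in dCas9-PBE treated rice protoplasts
OsBFP-Cas9-F | AGTTCCCGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in wild-type Cas9 treated rice protoplasts
OsBFP-Cas9-R | CTCTACAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in wild-type Cas9 treated rice protoplasts
OsBFP-CK-F | CACTCACGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in control rice protoplasts
OsBFP-CK-R | TGTTGGAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in control rice protoplasts
TaBFP-nCas9-F | GTGGCCCGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in nCas9-PBE treated wheat protoplasts
TaBFP-nCas9-R | CGAAACAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in nCas9-PBE treated wheat protoplasts
TaBFP-dCas9-F | CGTACGCGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in dCas9-PBE treated wheat protoplasts
TaBFP-dCas9-R | CCACTCAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in dCas9-PBE treated wheat protoplasts
TaBFP-Cas9-F | GGTAGCCCGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in wild-type Cas9 treated wheat protoplasts
TaBFP-Cas9-R | ATCAGTAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in wild-type Cas9 treated wheat protoplasts
TaBFP-CK-F | CACCGGCCGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in control wheat protoplasts
TaBFP-CK-R | ATCGTGAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in control wheat protoplasts
ZmBFP-nCas9-F | ATGAGCCGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in nCas9-PBE treated maize protoplasts
ZmBFP-nCas9-R | AGGAATAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in nCas9-PBE treated maize protoplasts
ZmBFP-dCas9-F | CAAAAGGCGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in dCas9-PBE treated maize protoplasts
ZmBFP-dCas9-R | TAGTTGAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in dCas9-PBE treated maize protoplasts
ZmBFP-Cas9-F | TCGGCACGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in wild-type Cas9 treated maize protoplasts
ZmBFP-Cas9-R | GAATGAAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in wild-type Cas9 treated maize protoplasts
ZmBFP-CK-F | TCCCGACGAGGGCGATGCCACCTAC | Second-round PCR for BFP in-depth sequencing in control maize protoplasts
ZmBFP-CK-R | CTTCGAAAGTCGTGCTGCTTCATGTGG | Second-round PCR for BFP in-depth sequencing in control maize protoplasts
OsCDC48-F | TTCAGGACATCGAGATGGAGAAG | Amplify OsCDC48 target site and for first-round PCR for in-depth sequencing
OsCDC48-R | ACAACGCAAATCTATCCATGCTC | Amplify OsCDC48 target site and for first-round PCR for in-depth sequencing
OsCDC48-nCas9-F | CGATGTGCCGACATCCGCAAGTACCAG | Second-round PCR for OsCDC48 in-depth sequencing in nCas9-PBE treated rice protoplasts
OsCDC48-nCas9-R | TGGTCATCATCATCGTCAGCTGCGGC | Second-round PCR for OsCDC48 in-depth sequencing in nCas9-PBE treated rice protoplasts
OsCDC48-dCas9-F | ATCACGGCCGACATCCGCAAGTACCAG | Second-round PCR for OsCDC48 in-depth sequencing in dCas9-PBE treated rice protoplasts
OsCDC48-dCas9-R | GCCTAATCATCATCGTCAGCTGCGGC | Second-round PCR for OsCDC48 in-depth sequencing in dCas9-PBE treated rice protoplasts
OsCDC48-Cas9-F | AGTTCCGCCGACATCCGCAAGTACCAG | Second-round PCR for OsCDC48 in-depth sequencing in wild-type Cas9 treated rice protoplasts
OsCDC48-Cas9-R | CTCTACTCATCATCGTCAGCTGCGGC | Second-round PCR for OsCDC48 in-depth sequencing in wild-type Cas9 treated rice protoplasts
OsCDC48-CK-F | CACTCAGCCGACATCCGCAAGTACCAG | Second-round PCR for OsCDC48 deep sequencing in control rice protoplasts
OsCDC48-CK-R | TGTTGGTCATCATCGTCAGCTGCGGC | Second-round PCR for OsCDC48 deep sequencing in control rice protoplasts
OsNRT1.1B-F | GATGTCACCTGATGATCTGAAGTAGC | Amplify OsNRT1.1B target site and for first-round PCR for in-depth sequencing
OsNRT1.1B-R | ATGATGGTGGTCGCCCAGAT | Amplify OsNRT1.1B target site and for first-round PCR for in-depth sequencing
OsNRT1.1B-nCas9-F | CGATGTGGTGCAGGTTCCTGGACCAT | Second-round PCR for OsNRT1.1B in-depth sequencing in nCas9-PBE treated rice protoplasts
OsNRT1.1B-nCas9-R | TGGTCAATGATGGTGGTCGCCCAGAT | Second-round PCR for OsNRT1.1B in-depth sequencing in nCas9-PBE treated rice protoplasts
OsNRT1.1B-dCas9-F | ATCACGGGTGCAGGTTCCTGGACCAT | Second-round PCR for OsNRT1.1B in-depth sequencing in dCas9-PBE treated rice protoplasts
OsNRT1.1B-dCas9-R | GCCTAAATGATGGTGGTCGCCCAGAT | Second-round PCR for OsNRT1.1B in-depth sequencing in dCas9-PBE treated rice protoplasts
OsNRT1.1B-Cas9-F | AGTTCCGGTGCAGGTTCCTGGACCAT | Second-round PCR for OsNRT1.1B in-depth sequencing in wild-type Cas9 treated rice protoplasts
OsNRT1.1B-Cas9-R | CTCTACATGATGGTGGTCGCCCAGAT | Second-round PCR for OsNRT1.1B in-depth sequencing in wild-type Cas9 treated rice protoplasts
OsNRT1.1B-CK-F | CACTCAGGTGCAGGTTCCTGGACCAT | Second-round PCR for OsNRT1.1B in-depth sequencing in control rice protoplasts
OsNRT1.1B-CK-R | TGTTGGATGATGGTGGTCGCCCAGAT | Second-round PCR for OsNRT1.1B in-depth sequencing in control rice protoplasts
OsSPL14-F | CGCTGATGTGTTGTTTGTTGCGA | Amplify OsSPL14 target site and for first-round PCR for in-depth sequencing
OsSPL14-R | CCTGCAGAGCAAGCTCAAGCTCA | Amplify OsSPL14 target site and for first-round PCR for in-depth sequencing
OsSPL14-nCas9-F | CGATGTTCGCTGGCCCAAATCTCCCT | Second-round PCR for OsSPL14 in-depth sequencing in nCas9-PBE treated rice protoplasts
OsSPL14-nCas9-R | TGGTCAGACATGGCTGCAGCCTGGTT | Second-round PCR for OsSPL14 in-depth sequencing in nCas9-PBE treated rice protoplasts
OsSPL14-dCas9-F | ATCACGTCGCTGGCCCAAATCTCCCT | Second-round PCR for OsSPL14 in-depth sequencing in dCas9-PBE treated rice protoplasts
OsSPL14-dCas9-R | GCCTAAGACATGGCTGCAGCCTGGTT | Second-round PCR for OsSPL14 in-depth sequencing in dCas9-PBE treated rice protoplasts
OsSPL14-Cas9-F | AGTTCCTCGCTGGCCCAAATCTCCCT | Second-round PCR for OsSPL14 in-depth sequencing in wild-type Cas9 treated rice protoplasts
OsSPL14-Cas9-R | CTCTACGACATGGCTGCAGCCTGGTT | Second-round PCR for OsSPL14 in-depth sequencing in wild-type Cas9 treated rice protoplasts
OsSPL14-CK-F | CACTCATCGCTGGCCCAAATCTCCCT | First-round PCR for OsSPL14 in-depth sequencing in control rice protoplasts
OsSPL14-CK-R | TGTTGGGACATGGCTGCAGCCTGGTT | First-round PCR for OsSPL14 in-depth sequencing in control rice protoplasts
TaLOX2-S1-F | ACTCCGTCTACCGACCATTGAG | Amplify TaLOX2 S1 target site and for first-round PCR for in-depth sequencing
TaLOX2-S1-R | TAGACCATGGAGGACATGGGCAT | Amplify TaLOX2 S1 target site and for first-round PCR for in-depth sequencing
TaLOX2-S1-nCas9-F | GTGGCCAGGGCCTCACCGTGGAGCAGA | Second-round PCR for TaLOX2-S1 in-depth sequencing in nCas9-PBE treated wheat protoplasts
TaLOX2-S1-nCas9-R | CGAAACTCCCCTCGCAGGAAGAGCAG | Second-round PCR for TaLOX2-S1 in-depth sequencing in nCas9-PBE treated wheat protoplasts
TaLOX2-S1-dCas9-F | CGTACGAGGGCCTCACCGTGGAGCAGA | Second-round PCR for TaLOX2-S1 in-depth sequencing in dCas9-PBE treated wheat protoplasts
TaLOX2-S1-dCas9-R | CCACTCTCCCCTCGCAGGAAGAGCAG | Second-round PCR for TaLOX2-S1 in-depth sequencing in dCas9-PBE treated wheat protoplasts
TaLOX2-S1-Cas9-F | GGTAGCAGGGCCTCACCGTGGAGCAGA | Second-round PCR for TaLOX2-S1 in-depth sequencing in Cas9 treated wild-type wheat protoplasts
TaLOX2-S1-Cas9-R | ATCAGTTCCCCTCGCAGGAAGAGCAG | Second-round PCR for TaLOX2-S1 in-depth sequencing in Cas9 treated wild-type wheat protoplasts
TaLOX2-S1-CK-F | CGGAATAGGGCCTCACCGTGGAGCAGA | Second-round PCR for TaLOX2-S1 in-depth sequencing in control wheat protoplasts
TaLOX2-S1-CK-R | TCTGAGTCCCCTCGCAGGAAGAGCAG | Second-round PCR for TaLOX2-S1 in-depth sequencing in control wheat protoplasts
TaLOX2-S2-F | CAATCATCGATGTACTAGTGTGGTCCAG | Amplify TaLOX2 S2 target site and for first-round PCR for in-depth sequencing
TaLOX2-S2-R | GGATGTCGGCGAAGGAGTCGAACT | Amplify TaLOX2 S2 target site and for first-round PCR for in-depth sequencing
TaLOX2-S2-nCas9-F | ATGAGCTATGTATGGCTGGCGCAGAGC | Second-round PCR for TaLOX2-S2 in-depth sequencing in nCas9-PBE treated wheat protoplasts
TaLOX2-S2-nCas9-R | AGGAATGTATGATCCCGTCCACCAGC | Second-round PCR for TaLOX2-S2 in-depth sequencing in nCas9-PBE treated wheat protoplasts
TaLOX2-S2-dCas9-F | CAAAAGTATGTATGGCTGGCGCAGAGC | Second-round PCR for TaLOX2-S2 in-depth sequencing in dCas9-PBE treated wheat protoplasts
TaLOX2-S2-dCas9-R | TAGTTGGTATGATCCCGTCCACCAGC | Second-round PCR for TaLOX2-S2 in-depth sequencing in dCas9-PBE treated wheat protoplasts
TaLOX2-S2-Cas9-F | CACCGGTATGTATGGCTGGCGCAGAGC | Second-round PCR for TaLOX2-S2 in-depth sequencing in Cas9 treated wild-type wheat protoplasts
TaLOX2-S2-Cas9-R | ATCGTGGTATGATCCCGTCCACCAGC | Second-round PCR for TaLOX2-S2 in-depth sequencing in Cas9 treated wild-type wheat protoplasts
TaLOX2-S2-CK-F | CTAGCTTATGTATGGCTGGCGCAGAGC | Second-round PCR for TaLOX2-S2 in-depth sequencing in control wheat protoplasts
TaLOX2-S2-CK-R | TCTGAGGTATGATCCCGTCCACCAGC | Second-round PCR for TaLOX2-S2 in-depth sequencing in control wheat protoplasts
TaLOX2-S3-F | GTCCCCTTCCTTCCGATCTAATCTC | Amplify TaLOX2 S3 target site and for first-round PCR for in-depth sequencing
TaLOX2-S3-R | TGCACGCAGTCAAATAATGGTACGA | Amplify TaLOX2 S3 target site and for first-round PCR for in-depth sequencing
TaLOX2-S3-nCas9-F | CGATGTCATCAAGCTGCCCAACATCCC | Second-round PCR for TaLOX2-S3 in-depth sequencing in nCas9-PBE treated wheat protoplasts
TaLOX2-S3-nCas9-R | TGGTCATCGGTCATCCATGCCTTCTCGT | Second-round PCR for TaLOX2-S3 in-depth sequencing in nCas9-PBE treated wheat protoplasts
TaLOX2-S3-dCas9-F | ATCACGCATCAAGCTGCCCAACATCCC | Second-round PCR for TaLOX2-S3 in-depth sequencing in dCas9-PBE treated wheat protoplasts
TaLOX2-S3-dCas9-R | GCCTAATCGGTCATCCATGCCTTCTCGT | Second-round PCR for TaLOX2-S3 in-depth sequencing in dCas9-PBE treated wheat protoplasts
TaLOX2-S3-Cas9-F | AGTTCCCATCAAGCTGCCCAACATCCC | Second-round PCR for TaLOX2-S3 in-depth sequencing in Cas9 treated wild-type wheat protoplasts
TaLOX2-S3-Cas9-R | CTCTACTCGGTCATCCATGCCTTCTCGT | Second-round PCR for TaLOX2-S3 in-depth sequencing in Cas9 treated wild-type wheat protoplasts
TaLOX2-S3-CK-F | CTATACCATCAAGCTGCCCAACATCCC | Second-round PCR for TaLOX2-S3 in-depth sequencing in control wheat protoplasts
TaLOX2-S3-CK-R | TCTGAGTCGGTCATCCATGCCTTCTCGT | Second-round PCR for TaLOX2-S3 in-depth sequencing in control wheat protoplasts
ZmCENH3-F | AATGTGCCAGTTCCATGTGGGTGT | Amplify ZmCENH3 target site and for first-round PCR for in-depth sequencing
ZmCENH3-R | GCAGGCCATAATGCTGTCGGGTAT | Amplify ZmCENH3 target site and for first-round PCR for in-depth sequencing
ZmCENH3-nCas9-F | ATGAGCATGGAAAGTTATTCTTCTGAGAA | Second-round PCR for ZmCENH3 in-depth sequencing in nCas9-PBE treated maize protoplasts
ZmCENH3-nCas9-R | AGGAATTATGAAGAGGATCTTAACAGAGAG | Second-round PCR for ZmCENH3 in-depth sequencing in nCas9-PBE treated maize protoplasts
ZmCENH3-dCas9-F | CAAAAGATGGAAAGTTATTCTTCTGAGAA | Second-round PCR for ZmCENH3 in-depth sequencing in dCas9-PBE treated maize protoplasts
ZmCENH3-dCas9-R | TAGTTGTATGAAGAGGATCTTAACAGAGAG | Second-round PCR for ZmCENH3 in-depth sequencing in dCas9-PBE treated maize protoplasts
ZmCENH3-Cas9-F | TCGGCAATGGAAAGTTATTCTTCTGAGAA | Second-round PCR for ZmCENH3 in-depth sequencing in Cas9 treated wild-type maize protoplasts
ZmCENH3-Cas9-R | GAATGATATGAAGAGGATCTTAACAGAGAG | Second-round PCR for ZmCENH3 in-depth sequencing in Cas9 treated wild-type maize protoplasts
ZmCENH3-CK-F | CTATACATGGAAAGTTATTCTTCTGAGAA | Second-round PCR for ZmCENH3 in-depth sequencing in control maize protoplasts
ZmCENH3-CK-R | TCTGAGTATGAAGAGGATCTTAACAGAGAG | Second-round PCR for ZmCENH3 in-depth sequencing in control maize protoplasts
Off-target-1F | ATGTCGGCCAGCAACAACAA | Potential off-target effect for detection site 1
Off-target-1R | AGTAGTGTATCCATCCTCGTGCAT | Potential off-target effect for detection site 1
Off-target-2F | AATAGCCCATTCACCTTGTTCAACA | Potential off-target effect for detection site 2
Off-target-2R | CAGCCATAGACCATAGTACTACACCAC | Potential off-target effect for detection site 2
Off-target-3F | TCATCCTCGAACACTAGGCTGAAG | Potential off-target effect for detection site 3
Off-target-3R | TACTACTCGCAGCGCATCACTCA | Potential off-target effect for detection site 3
Off-target-4F | ATTGAACGGTGTCACTTCAGACCA | Potential off-target effect for detection site 4
Off-target-4R | TATTGAGCTGATCAGCTGAACAGAAC | Potential off-target effect for detection site 4
Off-target-5F | ACTCGCTGGAACTATCCATCTTGGC | Potential off-target effect for detection site 5
Off-target-5R | AAGCGCTCGACGGCGTGGA | Potential off-target effect for detection site 5
ZmALS1-F | CTCCGACATCCTCGTCGAGGCT | For maize ZmALS1 PCR/RE test
ZmALS1-R | GATTCACCAACAAGACGCAGCA | For maize ZmALS1 PCR/RE test
ZmALS2-F | AACCACCTCTTCCGCCACGAG | For maize ZmALS2 PCR/RE test
ZmALS2-R | ACGCAGCACCTGCTCAAGCAAC | For maize ZmALS2 PCR/RE test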
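Within each second-round primer set in Table 5, the primers for different treatments appear to share a common 3' portion and differ only in a short 5' prefix. The sketch below shows one way to separate the two parts for the four OsBFP forward primers. Reading the 5' prefix as a sample-specific tag for pooled sequencing is an assumption made here and is not stated in the table, and `split_shared_suffix` is a hypothetical helper.

```python
import os

# Forward second-round primers for BFP in rice, copied from Table 5.
FORWARD_PRIMERS = {
    "OsBFP-nCas9-F": "CGATGTCGAGGGCGATGCCACCTAC",
    "OsBFP-dCas9-F": "ATCACGCGAGGGCGATGCCACCTAC",
    "OsBFP-Cas9-F": "AGTTCCCGAGGGCGATGCCACCTAC",
    "OsBFP-CK-F": "CACTCACGAGGGCGATGCCACCTAC",
}


def split_shared_suffix(primers):
    """Split primers into a shared 3' part and primer-specific 5' prefixes.

    The longest common suffix is found by reversing each sequence and
    taking the character-wise common prefix of the reversed strings.
    """
    reversed_seqs = [seq[::-1] for seq in primers.values()]
    shared = os.path.commonprefix(reversed_seqs)[::-1]
    prefixes = {
        name: seq[: len(seq) - len(shared)] for name, seq in primers.items()
    }
    return shared, prefixes


shared, prefixes = split_shared_suffix(FORWARD_PRIMERS)
print("shared 3' part:", shared)
for name, prefix in prefixes.items():
    print(name, prefix)
```

For these four primers the shared part is 19 nt long and each prefix is 6 nt, so the same check can be used to confirm that a transcribed primer set is internally consistent.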