BASE EDITING SYSTEMS FOR ACHIEVING C TO A AND C TO G BASE MUTATION AND APPLICATION THEREOF

Abstract

The present invention discloses base editing systems for mutating a base C to A and a base C to G and applications thereof. The base editing system for mutating C to A disclosed in the present invention includes cytosine deaminase AID and nCas9 nuclease or includes cytosine deaminase AID, nCas9 nuclease and uracil DNA glycosidase; the base editing system for mutating C to G of the present invention includes cytosine deaminase APOBEC, nCas9 nuclease and uracil DNA glycosidase. The experiments show that a combination of the three base editing systems for mutating C to A, C to T and A to G can realize a mutation of A, T, C or G to any base in both prokaryotes and eukaryotes.

Claims

1-39. (canceled)

40. A method for mutating a target base C to A in a genome sequence, wherein the method is D1), D2), D3) or D4) as follows: D1) the method includes the following steps: using a CRISPR/Cas9 system, cytosine deaminase and uracil DNA glycosidase for single-base editing to mutate a target base C to A; D2) the method includes the following steps: using a CRISPR/Cas9 system and cytosine deaminase for single-base editing to mutate a target base C to A; D3) the method includes the following steps: using a CRISPR/Cas9 system, cytosine deaminase AID and uracil DNA glycosidase for single-base editing to mutate a target base C to A; D4) the method includes the following steps: using a CRISPR/Cas9 system and cytosine deaminase AID for single-base editing to mutate a target base C to A.

41. The method according to claim 40, wherein the method is d1) or d2) or d3) or d4) as follows: d1) the method includes the following steps: introducing a coding gene of cytosine deaminase, a coding gene of CRISPR nuclease, a coding gene of uracil DNA glycosidase and a coding sequence of sgRNA into a receptor organism or the cells of a receptor organism, so that the coding gene of cytosine deaminase, the coding gene of CRISPR nuclease, the coding gene of uracil DNA glycosidase and the coding sequence of sgRNA are all expressed to mutate a target base C to A; d2) the method includes the following steps: introducing a coding gene of cytosine deaminase, a coding gene of CRISPR nuclease and a coding sequence of sgRNA into a receptor organism or the cells of a receptor organism, so that the coding gene of cytosine deaminase, the coding gene of CRISPR nuclease and the coding sequence of sgRNA are all expressed to mutate a target base C to A; d3) the method includes the following steps: introducing a coding gene of cytosine deaminase AID, a coding gene of nCas9 nuclease, a coding gene of uracil DNA glycosidase and a coding sequence of sgRNA into a receptor organism or the cells of a receptor organism, so that the coding gene of cytosine deaminase AID, the coding gene of nCas9 nuclease, the coding gene of uracil DNA glycosidase and the coding sequence of sgRNA are all expressed to mutate a target base C to A; d4) the method includes the following steps: introducing a coding gene of cytosine deaminase AID, a coding gene of nCas9 nuclease and a coding sequence of sgRNA into a receptor organism or the cells of a receptor organism, so that the coding gene of cytosine deaminase AID, the coding gene of nCas9 nuclease and the coding sequence of sgRNA are all expressed to mutate a target base C to A; wherein the sgRNA targets a target sequence, and the target base C is located in the target sequence.

42. The method according to claim 40, wherein the cytosine deaminase or the cytosine deaminase AID is cytosine deaminase pmCDA; or wherein the uracil DNA glycosidase is a uracil DNA glycosidase ung derived from Escherichia coli; or wherein the CRISPR nuclease or the nCas9 nuclease is a mutant nCas9-D10A of Cas9.

43. The method according to claim 40, wherein in the d3), the coding gene of the cytosine deaminase AID, the coding gene of the nCas9 nuclease and the coding gene of the uracil DNA glycosidase are introduced into the receptor organism or the cell of the receptor organism through a recombinant plasmid A; the recombinant plasmid A expresses a fusion protein composed of cytosine deaminase AID, nCas9 nuclease and uracil DNA glycosidase; or wherein in the d4), the coding gene of the cytosine deaminase AID and the coding gene of the nCas9 nuclease are introduced into the receptor organism or the cell of the receptor organism through a recombinant plasmid B; the recombinant plasmid B expresses a fusion protein composed of cytosine deaminase AID and nCas9 nuclease.

44. The method according to claim 43, wherein the nucleotide sequence of the recombinant plasmid B is as shown in SEQ ID NO: 1; or wherein the nucleotide sequence of the recombinant plasmid A is as shown in SEQ ID NO: 3.

45. The method according to claim 40, wherein the receptor organism is a prokaryote; preferably, the prokaryote is Escherichia coli; further preferably, the Escherichia coli is Escherichia coli MG1655 or Escherichia coli ATCC 8739.

46. A method for improving the base editing efficiency of mutating a target base C to G in a genome sequence, wherein the method is E1) or E2) as follows: E1) the method includes the following steps: using a CRISPR/Cas9 system, cytosine deaminase and uracil DNA glycosidase for single-base editing to improve the base editing efficiency of mutating a target base C to G; E2) the method includes the following steps: using a CRISPR/Cas9 system, cytosine deaminase APOBEC and uracil DNA glycosidase for single-base editing to improve the base editing efficiency of mutating a target base C to G.

47. The method according to claim 46, wherein the method is e1) or e2) as follows: e1) the method includes the following steps: introducing a coding gene of cytosine deaminase, a coding gene of CRISPR nuclease, a coding gene of uracil DNA glycosidase and a coding sequence of sgRNA into a receptor organism or the cells of a receptor organism, so that the coding gene of cytosine deaminase, the coding gene of CRISPR nuclease, the coding gene of uracil DNA glycosidase and the coding sequence of sgRNA are all expressed to improve a base editing efficiency of mutating a target base C to G in a genome sequence; e2) the method includes the following steps: introducing a coding gene of cytosine deaminase APOBEC, a coding gene of nCas9 nuclease, a coding gene of uracil DNA glycosidase and a coding sequence of sgRNA into a receptor organism or the cells of a receptor organism, so that the coding gene of cytosine deaminase APOBEC, the coding gene of nCas9 nuclease, the coding gene of uracil DNA glycosidase and the coding sequence of sgRNA are all expressed to improve a base editing efficiency of mutating a target base C to G in a genome sequence; wherein the sgRNA targets a target sequence, and the target base is located in the target sequence.

48. The method according to claim 46, wherein the cytosine deaminase or the cytosine deaminase APOBEC is cytosine deaminase APOBEC1; or wherein the uracil DNA glycosidase is a protein represented by an amino acid sequence obtained by deleting the amino acid sequence shown at sites 1 to 84 of the human-derived uracil DNA glycosidase UNG amino acid sequence from the N-terminal; or wherein the CRISPR nuclease or the nCas9 nuclease is a Cas9 mutant nCas9-D10A.

49. The method according to claim 46, wherein in e2), the coding gene of the cytosine deaminase APOBEC, the coding gene of the nCas9 nuclease and the coding gene of the uracil DNA glycosidase are introduced into the receptor organism or the cells of the receptor organism through a recombinant plasmid C; the recombinant plasmid C expresses a fusion protein composed of cytosine deaminase APOBEC, nCas9 nuclease and uracil DNA glycosidase.

50. The method according to claim 49, wherein the nucleotide sequence of the recombinant plasmid C is as shown in SEQ ID NO: 5.

51. The method according to claim 46, wherein the cells of the receptor organism are eukaryotic cells; preferably, the eukaryotic cells are mammalian cells.

52. A method for realizing a site-directed mutation from any base to any base in a genome sequence in prokaryotes, wherein the method is M1), M2), M3) or M4) as follows: M1) includes m1) or m2) or m3): m1) when a target base in a genome sequence is a base C, starting from the base C, the target base can be mutated from the base C to a base T using a base editing system for mutating C to T, so as to realize the editing from the base C to the base T; m2) when a target base in a genome sequence is a base C, starting from the base C, the target base can be mutated from the base C to a base A using a base editing system for mutating C to A, so as to realize the editing from the base C to the base A; m3) when a target base in a genome sequence is a base C, a mutant taking a base A as the target base is obtained according to the method described in m2); starting from the base A, the target base can be mutated from the base A to a base G using a base editing system for mutating A to G, so as to realize the editing from the base C to the base G; any site-directed mutation from the base C to the base T, the base A and the base G is therefore realized; M2) when a target base in a genome sequence is a base G, since the base G is a complementary base of a base C, any site-directed mutation from the base G to the base A, the base T and the base C is also realized according to the method described in M1); M3) includes m4) or m5) or m6): m4) when a target base in a genome sequence is a base T, a base A is a complementary base of the target base; starting from the base A, the complementary base of the target base can be mutated from the base A to a base G using a base editing system for mutating A to G, so as to realize the editing from the base T to the base C; m5) when a target base in a genome sequence is a base T, a mutant taking a base C as the target base is obtained according to the method described in m4); starting from the base C, the target base can be mutated from the base C to a base A using a base editing system for mutating C to A, so as to realize the editing from the base T to the base A; m6) when a target base in a genome sequence is a base T, a mutant taking a base A as the target base is obtained according to the method described in m5); starting from the base A, the target base can be mutated from the base A to a base G using a base editing system for mutating A to G, so as to realize the editing from the base T to the base G; any site-directed mutation from the base T to the base C, the base A and the base G is therefore realized; M4) when a target base in a genome sequence is a base A, since the base A is a complementary base of a base T, any site-directed mutation from the base A to the base G, the base T and the base C is also realized according to the method described in M3); wherein the base editing system for mutating C to A is a base editing system I for mutating C to A, or a base editing system II for mutating C to A, or a base editing system III for mutating C to A, or a base editing system IV for mutating C to A; the base editing system I for mutating C to A comprises cytosine deaminase or a biomaterial related to the cytosine deaminase, CRISPR nuclease or a biomaterial related to the CRISPR nuclease, and uracil DNA glycosidase or a biomaterial related to the uracil DNA glycosidase; the base editing system II for mutating C to A comprises cytosine deaminase or a biomaterial related to the cytosine deaminase, and CRISPR nuclease or a biomaterial related to the CRISPR nuclease; the base editing system III for mutating C to A comprises cytosine deaminase AID or a biomaterial related to the cytosine deaminase AID, nCas9 nuclease or a biomaterial related to the nCas9 nuclease, and uracil DNA glycosidase or a biomaterial related to the uracil DNA glycosidase; the base editing system IV for mutating C to A comprises cytosine deaminase AID or a biomaterial related to the cytosine deaminase AID, and nCas9 nuclease or a biomaterial related to the nCas9 nuclease.

53. The method according to claim 52, wherein the cytosine deaminase or the cytosine deaminase AID is cytosine deaminase pmCDA; or the uracil DNA glycosidase is the uracil DNA glycosidase ung derived from Escherichia coli; or wherein the CRISPR nuclease or the nCas9 nuclease is a mutant nCas9-D10A of Cas9.

54. The method according to claim 52, wherein the prokaryote is Escherichia coli; preferably, wherein the Escherichia coli is Escherichia coli MG1655 or Escherichia coli ATCC 8739.

55. A method for realizing a site-directed mutation from any base to any base in a genome sequence in eukaryotes, wherein the method is N1), N2), N3) or N4) as follows: N1) includes n1) or n2) or n3): n1) when a target base in a genome sequence is a base C, starting from the base C, the target base can be mutated from the base C to a base T using a base editing system for mutating C to T, so as to realize the editing from the base C to the base T; n2) when a target base in a genome sequence is a base C, starting from the base C, the target base can be mutated from the base C to a base G using a base editing system for mutating C to G, so as to realize the editing from the base C to the base G; n3) when a target base in a genome sequence is a base C, a mutant taking a base G as the target base is obtained according to the method described in n2), and a base C is a complementary base of the base G; starting from the base C, the complementary base of the target base can be mutated from the base C to a base T using a base editing system for mutating C to T, and a base A is a complementary base of the base T, so as to realize the editing from the base C to the base A; any site-directed mutation from the base C to the base T, the base A and the base G is therefore realized; N2) when a target base in a genome sequence is a base G, since the base G is a complementary base of a base C, any site-directed mutation from the base G to the base A, the base T and the base C is also realized according to the method described in N1); N3) includes n4) or n5) or n6): n4) when a target base in a genome sequence is a base T, a base A is a complementary base of the base T; starting from the base A, the complementary base of the target base can be mutated from the base A to a base G using a base editing system for mutating A to G, so as to realize the editing from the base T to the base C; n5) when a target base in a genome sequence is a base T, a mutant taking a base C as the target base is obtained according to the method described in n4); starting from the base C, the target base can be mutated from the base C to a base G using a base editing system for mutating C to G, so as to realize the editing from the base T to the base G; n6) when a target base in a genome sequence is a base T, a mutant taking a base G as the target base is obtained according to the method described in n5), and a base C is a complementary base of the base G; starting from the base C, the complementary base of the target base can be mutated from the base C to a base T using a base editing system for mutating C to T, so as to realize the editing from the base T to the base A; any site-directed mutation from the base T to the base C, the base A and the base G is therefore realized; N4) when a target base in a genome sequence is a base A, since the base A is a complementary base of a base T, any site-directed mutation from the base A to the base G, the base T and the base C is also realized according to the method described in N3); wherein the base editing system for mutating C to G is a base editing system I for mutating C to G, or a base editing system II for mutating C to G, or a base editing system III for mutating C to G, or a base editing system IV for mutating C to G; the base editing system I for mutating C to G comprises cytosine deaminase or a biomaterial related to the cytosine deaminase, CRISPR nuclease or a biomaterial related to the CRISPR nuclease, and uracil DNA glycosidase or a biomaterial related to the uracil DNA glycosidase; the base editing system II for mutating C to G comprises cytosine deaminase or a biomaterial related to the cytosine deaminase, and CRISPR nuclease or a biomaterial related to the CRISPR nuclease; the base editing system III for mutating C to G comprises cytosine deaminase APOBEC or a biomaterial related to the cytosine deaminase APOBEC, nCas9 nuclease or a biomaterial related to the nCas9 nuclease, and uracil DNA glycosidase or a biomaterial related to the uracil DNA glycosidase; the base editing system IV for mutating C to G comprises cytosine deaminase APOBEC or a biomaterial related to the cytosine deaminase APOBEC, and nCas9 nuclease or a biomaterial related to the nCas9 nuclease.

56. The method according to claim 55, wherein the cytosine deaminase or the cytosine deaminase APOBEC is the cytosine deaminase APOBEC1; or wherein the uracil DNA glycosidase is a protein represented by an amino acid sequence obtained by deleting the amino acid sequence represented by positions 1 to 84 of the human uracil DNA glycosidase UNG amino acid sequence from the N-terminal; or wherein the CRISPR nuclease or the nCas9 nuclease is a Cas9 mutant nCas9-D10A.

57. The method according to claim 56, wherein the eukaryote is eukaryotic cells; preferably, the eukaryotic cells are mammalian cells.

58. A product, being any of the products described in c1)-c5) as follows: c1) a product for mutating a target base C to A in a genome sequence, including the base editing system I for mutating C to A described in claim 52, or the base editing system II for mutating C to A, or the base editing system III for mutating C to A, or the base editing system IV for mutating C to A; c2) a product for improving a base editing efficiency of mutating a target base C to A in a genome sequence, including the base editing system I for mutating C to A, or the base editing system III for mutating C to A; c3) a product for improving a base editing efficiency of mutating a target base C to G in a genome sequence, including the base editing system I for mutating C to G, or the base editing system III for mutating C to G; c4) a product for realizing a site-directed mutation from any base to any base in a genome sequence in prokaryotes, including a base editing system for mutating C to A, a base editing system for mutating C to T, and a base editing system for mutating A to G; wherein the base editing system for mutating C to A is the base editing system I for mutating C to A, or the base editing system II for mutating C to A, or the base editing system III for mutating C to A, or the base editing system IV for mutating C to A; c5) a product for realizing a site-directed mutation from any base to any base in a genome sequence in eukaryotes, including a base editing system for mutating C to G, a base editing system for mutating C to T, and a base editing system for mutating A to G; wherein the base editing system for mutating C to G is the base editing system I for mutating C to G, or the base editing system II for mutating C to G, or the base editing system III for mutating C to G, or the base editing system IV for mutating C to G.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0161] FIG. 1 shows schematic diagrams of the base editing for mutating A, T, C or G to any base by combining a base editing system for mutating C to A, a base editing system for mutating C to T, and a base editing system for mutating A to G. The upper figure shows a schematic diagram of the base editing for mutating any base starting from C or G; and the lower figure shows a schematic diagram of the base editing for mutating any base starting from A or T.

[0162] FIG. 2 shows schematic diagrams of the base editing for mutating A, T, C or G to any base by combining a base editing system for mutating C to G, a base editing system for mutating C to T and a base editing system for mutating A to G. The upper figure shows a schematic diagram of the base editing for mutating any base starting from C or G; and the lower figure shows a schematic diagram of the base editing for mutating any base starting from A or T.

[0163] FIG. 3 is a map of the ptrc_nCas9_AID plasmid (pnCas9_AID plasmid).

[0164] FIG. 4 is a map of the Escherichia coli gRNA plasmid.

[0165] FIG. 5 is a map of the ptrc_ung_nCas9_AID plasmid (pUNG_nCas9_AID plasmid).

[0166] FIG. 6 is a map of the pAPOBEC_nCas9 plasmid (pAPOBEC_nCas9_UGI plasmid).

[0167] FIG. 7 is a map of the pAPOBEC_nCas9_UNG plasmid.

[0168] FIG. 8 is a map of the mammalian cells gRNA plasmid.

[0169] FIGS. 9A and 9B show target genes, target sequences and editing results of example 3. FIG. 9A shows target genes, target sequences and editing results of base-directed replacement in HEK293T cells; FIG. 9B shows target genes, target sequences and editing results of base-directed replacement in HeLa cells.

[0170] FIG. 10 is a map of the pTadA_nCas9 plasmid.

[0171] FIGS. 11A and 11B show target genes, target sequences and editing results of example 4. FIG. 11A shows an editing efficiency of mutating a base C to any base; FIG. 11B shows an editing efficiency of mutating a base T to any base.

[0172] FIG. 12 is a map of the xcas9 (3.7)-ABE (7.10) plasmid.

[0173] FIG. 13 shows target genes, target sequences and editing results of example 5.
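The combinations outlined for FIG. 1 and FIG. 2 can be sketched as a small route search. This is a minimal illustration, not part of the disclosed method: it assumes that each editing system can act either on the target strand or on the complementary strand, so that an X-to-Y editor also yields the complementary change on the target base. The names `PROKARYOTE_EDITORS`, `EUKARYOTE_EDITORS`, `single_step_edits` and `shortest_route` are hypothetical.

```python
from collections import deque

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Editor sets as described for FIG. 1 (prokaryotes) and FIG. 2 (eukaryotes).
PROKARYOTE_EDITORS = [("C", "A"), ("C", "T"), ("A", "G")]
EUKARYOTE_EDITORS = [("C", "G"), ("C", "T"), ("A", "G")]


def single_step_edits(editors):
    """Map each base to the bases reachable in one editing round."""
    steps = {base: set() for base in "ACGT"}
    for src, dst in editors:
        # Editor applied directly to the target base.
        steps[src].add(dst)
        # Editor applied to the complementary base on the other strand.
        steps[COMPLEMENT[src]].add(COMPLEMENT[dst])
    return steps


def shortest_route(editors, start, goal):
    """Breadth-first search for the shortest series of edits from start to goal."""
    steps = single_step_edits(editors)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(steps[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal not reachable with the given editors


if __name__ == "__main__":
    for label, editors in (("prokaryote", PROKARYOTE_EDITORS),
                           ("eukaryote", EUKARYOTE_EDITORS)):
        for start in "CT":
            for goal in "ACGT":
                if goal != start:
                    route = shortest_route(editors, start, goal)
                    print(label, start, "->", goal, ":", " > ".join(route))
```

Under these assumptions, a route such as T to G in prokaryotes takes three rounds (T, C, A, G), while C to T takes one; targets G and A follow by complementarity.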

DETAILED DESCRIPTION OF THE EMBODIMENTS

[0174] The following embodiments are intended for a better understanding of the present invention, but not limiting the present invention. Unless otherwise noted, all the experimental methods in the following embodiments are conventional methods. Unless otherwise noted, all the experimental materials in the following embodiments can be purchased from conventional biochemical reagent shops. Three repeated experiments are set for the quantitative tests in the following embodiments, and a mean value is calculated as a result.

[0175] In the following embodiments, HEK293T cells, HeLa cells, wild-type Escherichia coli MG1655, and wild-type Escherichia coli ATCC 8739 are all products of the American Type Culture Collection (ATCC).

[0176] In the following embodiments, the cytosine deaminase APOBEC used in mammalian cells is cytosine deaminase APOBEC1 (GenBank: AAH03792.1), and its encoding gene sequence is as shown at sites 1,038-1,721 in SEQ ID NO: 4.

[0177] In the following embodiments, the cytosine deaminase AID used in Escherichia coli is cytosine deaminase pmCDA (GenBank: ABO15149.1), and its encoding gene sequence is as shown at sites 4,405-5,028 in SEQ ID NO: 1.

[0178] In the following embodiments, the uracil DNA glycosidase used in Escherichia coli is Escherichia coli-derived uracil DNA glycosidase ung (GenBank: EGT65982.1), and its encoding gene sequence is as shown at sites 1-687 in SEQ ID NO: 3.

[0179] In the following embodiments, the uracil DNA glycosidase used in mammalian cells is modified human-derived uracil DNA glycosidase UNG, and its amino acid sequence is obtained by deleting an amino acid sequence shown at sites 1-84 from an amino acid sequence of human-derived uracil DNA glycosidase UNG (GenBank: CAG46474.1); its encoding gene sequence is as shown at sites 1-663 in SEQ ID NO: 5.

[0180] In the following embodiments, the nCas9 nuclease used in mammalian cells and Escherichia coli is a mutant nCas9-D10A of Cas9, and its amino acid sequence is obtained by mutating aspartic acid (D) to alanine (A) at position 10 from the N-terminal of the amino acid sequence of the Cas9 nuclease (Accession: Q99ZW2.1); its encoding gene sequence is as shown at sites 1-4,104 in SEQ ID NO: 1.
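As a quick sanity check on the coordinates given in paragraphs [0176] to [0180], the sketch below computes the length of each stated coding region and confirms that it is a whole number of codons. It assumes the site ranges are 1-based and inclusive; whether a stop codon is included in each range is not stated in the text, so the codon counts are only an upper bound on the number of encoded residues. The names `CODING_REGIONS`, `region_length` and `codon_count` are hypothetical.

```python
# Site ranges copied from paragraphs [0176]-[0180]; assumed 1-based, inclusive.
CODING_REGIONS = {
    "APOBEC1 (SEQ ID NO: 4)": (1038, 1721),
    "pmCDA (SEQ ID NO: 1)": (4405, 5028),
    "E. coli ung (SEQ ID NO: 3)": (1, 687),
    "truncated human UNG (SEQ ID NO: 5)": (1, 663),
    "nCas9-D10A (SEQ ID NO: 1)": (1, 4104),
}


def region_length(start, end):
    """Length in nucleotides of an inclusive 1-based range."""
    if end < start:
        raise ValueError(f"end {end} precedes start {start}")
    return end - start + 1


def codon_count(start, end):
    """Number of codons in the range; raises if it is not a multiple of 3."""
    length = region_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"range {start}-{end} is {length} nt, not a multiple of 3")
    return length // 3


if __name__ == "__main__":
    for name, (start, end) in CODING_REGIONS.items():
        print(f"{name}: {region_length(start, end)} nt, {codon_count(start, end)} codons")
```

All five stated ranges are multiples of three nucleotides, which is consistent with their being open reading frames.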

[0181] In the following embodiments, the primer sequences used for PCR detection after editing a target sequence of a gene are as shown in Table 1.

TABLE 1 PCR Detection Primer Sequences After Editing a Target Sequence of Genes

Primer Name        Primer Sequence
dcuA_genome_F      TGCTGGCGATCTTCTTGGG (SEQ ID NO: 8)
dcuA_genome_R      CCCGTGTCATCCATCTGTACC (SEQ ID NO: 9)
dcuB_genome_F      AACGGATCGCTGGTTATCTG (SEQ ID NO: 10)
dcuB_genome_R      CCGGTACGGAGATGAATTTCTG (SEQ ID NO: 11)
dcuC_genome_F      ATCGGCGCGAATGATATG (SEQ ID NO: 12)
dcuC_genome_R      ATCACTAGCCCAACAAGC (SEQ ID NO: 13)
dcuD_genome_F      CGGTTATGCCCGCTACATGG (SEQ ID NO: 14)
dcuD_genome_R      GGGATCGCTGTTCGCTTCAC (SEQ ID NO: 15)
relA_genome_F      TCGCGTACTGGATCTGTTCTGC (SEQ ID NO: 16)
relA_genome_R      GTTGCCAACACCTTCGACTACC (SEQ ID NO: 17)
rpoS_genome_F      AACCAGTACGCCTATCTC (SEQ ID NO: 18)
rpoS_genome_R      ACTCAGGGTTCTGGATTG (SEQ ID NO: 19)
spoT_genome_F      CCTGGCCTTTGAGATGAG (SEQ ID NO: 20)
spoT_genome_R      GTTCAGGACGCTGTAGAG (SEQ ID NO: 21)
lacZ1_genome_F     AGTTGCGTGACTACCTAC (SEQ ID NO: 22)
lacZ1_genome_R     AGACCAGACCGTTCATAC (SEQ ID NO: 23)
lacZ2_genome_F     CGTCTGAATTTGACCTGAG (SEQ ID NO: 24)
lacZ2_genome_R     CCGTCGATATTCAGCCATGTG (SEQ ID NO: 25)
ung_genome_F       CCCTCTTCCGCTTAGTAACTTG (SEQ ID NO: 26)
ung_genome_R       GAAGTGTTGCGTCGTCAG (SEQ ID NO: 27)
RNF2_genome_F      CCTGATCACCTCCCAAAGTC (SEQ ID NO: 28)
RNF2_genome_R      CCTGATCACCTCCCAAAGTC (SEQ ID NO: 29)
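A primer table of this kind can be screened for transcription problems before use. The sketch below is a hypothetical check, not part of the described methods: it pairs each `_F` primer with its `_R` partner and flags pairs whose sequences are identical. Only three pairs from Table 1 are copied in for brevity. As printed in Table 1, the RNF2 forward and reverse primers carry the same sequence, so that pair is flagged and may be worth checking against the sequence listing.

```python
# A subset of Table 1, copied as printed.
PRIMERS = {
    "dcuA_genome_F": "TGCTGGCGATCTTCTTGGG",
    "dcuA_genome_R": "CCCGTGTCATCCATCTGTACC",
    "ung_genome_F": "CCCTCTTCCGCTTAGTAACTTG",
    "ung_genome_R": "GAAGTGTTGCGTCGTCAG",
    "RNF2_genome_F": "CCTGATCACCTCCCAAAGTC",
    "RNF2_genome_R": "CCTGATCACCTCCCAAAGTC",
}


def identical_pairs(primers):
    """Return the base names of primer pairs whose F and R sequences match."""
    flagged = []
    for name, seq in primers.items():
        if name.endswith("_F"):
            base = name[:-2]  # strip the "_F" suffix
            if primers.get(base + "_R") == seq:
                flagged.append(base)
    return flagged


if __name__ == "__main__":
    print(identical_pairs(PRIMERS))
```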

EXAMPLE 1

A Base Editing Method of Mutating C to A in Escherichia coli

[0182] The fusion expression of cytosine deaminase (AID) and nCas9 in Escherichia coli can achieve a site-directed mutation of cytosine (C) to thymine (T) as well as cytosine (C) to adenine (A) at a specific site of Escherichia coli under the guidance of gRNA; wherein C-to-T mutations accounted for 40.7% of total mutations and C-to-A mutations accounted for 59.3% of the total mutations.

I. Test Methods

[0183] The pnCas9_AID plasmid containing a cytosine deaminase (AID) and nCas9 fusion expression system and an Escherichia coli gRNA plasmid containing different target sites were introduced into wild-type Escherichia coli MG1655 or wild-type Escherichia coli ATCC 8739. Plating was performed after culturing for 24 h, and a subset of colonies was randomly selected for PCR detection and sequencing of the edited site.

[0184] A map of the pnCas9_AID plasmid is shown in FIG. 3, and its nucleotide sequence is as shown in SEQ ID NO: 1; wherein sites 1-4,104 represent the encoding gene sequence of the nCas9, sites 4,405-5,028 represent the encoding gene sequence of the cytosine deaminase (AID), sites 6,609-7,268 are the chloramphenicol resistance gene, and sites 8,335-6,245 are the origin of replication. The pnCas9_AID plasmid expresses a fusion protein composed of cytosine deaminase (AID) and nCas9.

[0185] The Escherichia coli gRNA plasmids containing different target sites were gRNA plasmids targeting different genes such as dcuA, dcuB, dcuC, dcuD, relA, rpoS and lacZ, or targeting different sites of the same gene. The specific target sites of the gRNA plasmids are as shown in Table 2.

[0186] Taking the gRNA plasmid targeting sites 1,444-1,463 of the lacZ gene as an example, its map is shown in FIG. 4, and its nucleotide sequence is as shown in SEQ ID NO: 2; wherein sites 336-1,148 are the apramycin resistance gene, sites 1,421-1,440 represent the target sequence, sites 1,441-1,518 represent the gRNA sequence, and sites 2,001-2,620 are the origin of replication. In this example and hereinafter, the Escherichia coli gRNA plasmids targeting other sites can be obtained simply by replacing the target sequence in the gRNA plasmid shown in SEQ ID NO: 2 with the target sequence of another gene or another target sequence of the same gene.
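The replacement of the target sequence described in paragraph [0186] can be sketched in silico as a fixed-window substitution. This is a minimal sketch, assuming that sites 1,421-1,440 are 1-based and inclusive and that the new target is exactly 20 nt; the function name `replace_target` and the placeholder plasmid string are hypothetical, and SEQ ID NO: 2 itself is not reproduced here.

```python
def replace_target(plasmid, new_target, start=1421, end=1440):
    """Return a plasmid sequence with sites start..end replaced by new_target.

    Coordinates are 1-based and inclusive, as in the plasmid description.
    """
    window = end - start + 1
    if len(new_target) != window:
        raise ValueError(f"target must be {window} nt, got {len(new_target)}")
    if len(plasmid) < end:
        raise ValueError("plasmid is shorter than the target window")
    if set(new_target.upper()) - set("ACGT"):
        raise ValueError("target contains characters other than A, C, G, T")
    return plasmid[:start - 1] + new_target.upper() + plasmid[end:]


if __name__ == "__main__":
    # Placeholder backbone standing in for SEQ ID NO: 2 (not the real sequence).
    backbone = "A" * 1420 + "N" * 20 + "G" * 1180
    # The dcuA 643-662 wild-type row of Table 2, used only as a 20-nt example.
    edited = replace_target(backbone, "AAGCAGATTGAAATCAAATC")
    print(edited[1420:1440])
```

Keeping the window fixed means the downstream gRNA sequence at sites 1,441-1,518 stays in register, which is the reason the length check is strict.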

TABLE-US-00002 TABLE 2 Specific Target Sites of the gRNA Plasmid N N N N N N N N N N N N N N N N N N N N E.coli N N N N N N N N N N N N E.coli ATCC8739 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 MG1655 20 19 18 17 16 15 14 13 12 11 10 N9 N8 N7 N6 N5 N4 N3 2 N1 dcuA643-662 wt A A G C A G A T T G A A A T C A A A T C lacZ 1444- T T A C G C C C G G T G C A G T A T G A A A G A A G A T T G A A A T C A A A T C 1463 wt T A A T G C C C G G T G C A G T A T G A A A G A A G A T T G A A A T C A A A T C T A A C G C C C G G T G C A G T A T G A A A G T A G A T T G A A A T C A A A T C T T C C G C C C G G T G C A G T A T G A A A G A A G A T T G A A A T C A A A T C T T A C G C C C G G T G C A G T A T G A T T A C G C C C G G T G C A G T A T G A dcuB123-142 wt C C T T C A G C C A G G T A A A C C A C T T A C G C C C G G T G C A G T A T G A C A T T C A G A C A G G T A A A C C A C T T C A G C C C G G T G C A G T A T G A C A T T A A G C C A G G T A A A C C A C T A C T G C C C G G T G C A G T A T G A C T T T C A G C C A G G T A A A C C A C T T T C G C C C G G T G C A G T A T G A T A T T C A G C C A G G T A A A C C A C T T T C G C C C G G T G C A G T A T G A C T T T C A G C C A G G T A A A C C A C T A C C G C C C G G T G C A G T A T G A C A T T C A G C C A G G T A A A C C A C T A C C G C C C G G T G C A G T A T G A C T T T C A G C C A G G T A A A C C A C T A C C G C C C G G T G C A G T A T G A C T T T C A G C C A G G T A A A C C A C T A C C G C C C G G T G C A G T A T G A T T T T G C C C G G T G C A G T A T G A dcuC970-989 wt G C T C A G G G G C T T A G C A C C A T T T T C G C C C G G T G C A G T A T G A G A T C A G G G G C T T A G C A C C A T T T A C G C C C G G T G C A G T A T G A G A T C A G G G G C T T A G C A C C A T T T T C G C C C G G T G C A G T A T G A G C T A A G G G G C T T A G C A C C A T T T A C G C C C G G T G C A G T A T G A G A T C A G G G G C T T A G C A C C A T T T A C G C C C G G T G C A G T A T G A G T T C A G G G G C T T A G C A C C A T T A A C G C C C G G T G C 
A G T A T G A T T A C G C C C G G T G C A G T A T G A dcuD801-820 wt G C A G T C A G A A C T G C A T C T G G T T A C G C C C G G T G C A G T A T G A G A A G T C A G A A C T G C A T C T G G T A T C G C C C G G T G C A G T A T G A G A A G T T A G A A C T G C A T C T G G T T A C G C C C G G T G C A G T A T G A G A A G T C A G A A C T G C A T C T G G T T A C G C C C G G T G C A G T A T G A T T T C G C C C G G T G C A G T A T G A relA 129-148 wt G C A A C A G A C G C A G G G G C A T C T T T C G C C C G G T G C A G T A T G A G A A A C A G A C G C A G G G G C A T C T T C A G C C C G G T G C A G T A T G A G A A A C A G A C G C A G G G G C A T C T C T C G C C C G G T G C A G T A T G A G T A A C A G A C G C A G G G G C A T C T A A C G C C C G G T G C A G T A T G A G T A A C A G A C G C A G G G G C A T C T T T C G C C C G G T G C A G T A T G A T A C T G C C C G G T G C A G T A T G A rpoS175-194 wt C A G C T T T A C C T T G G T G A G A T T A C A G C C C G G T G C A G T A T G A C A G A T T T A C C T T G G T G A G A T T C A C G C C C G G T G C A G T A T G A T T A C G C C C G G T G C A G T A T G A E.coli MG1655 T T T C G C C C G G T G C A G T A T G A dcuA643-662 wt A A G C A G A T T G A A A T C A A A T C T T A C G C C C G G T G C A G T A T G A A A G A A G A T T G A A A T C A A A T C T A A C G C C C G G T G C A G T A T G A A A G A A G A T T G A A A T C A A A T C T A C C G C C C G G T G C A G T A T G A A A G A A G A T T G A A A T C A A A T C T A A C G C C C G G T G C A G T A T G A T A C A G C C C G G T G C A G T A T G A dcuC970-989 wt G C T C A G G G G C T T A G C A C C A T T A C C G C C C G G T G C A G T A T G A G A T C A G G G G C T T A G C A C C A T G A T C A G G G G C T T A G C A C C A T lacZ 2293- T T T C T T T C A C A G A T G T G G A T G A T C A G G G G C T T A G C A C C A T 2312 wt T T T A T T T C A C A G A T G T G G A T G A T C A G G G G C T T A G C A C C A T T T T A T T T C A C A G A T G T G G A T G T T C A G G G G C T T A G C A C C A T T T T T T T T C A C A G A T G T G G 
A T G A T C A G G G G C T T A G C A C C A T T T T A T T T C A C A G A T G T G G A T G A T C A G G G G C T T A G C A C C A T T T T T T T T C A C A G A T G T G G A T T T T A T T T C A C A G A T G T G G A T lacZ 1431-1450 wt A T C T G T C G A T C C T T C C C G C C T T T A T T T C A C A G A T G T G G A T A T T T G T C G A T C C T T C C C G C C T T T T T T T C A C A G A T G T G G A T A T A T G T A G A T C C T T C C C G C C T T T A T T T C A C A G A T G T G G A T A T A T G T A G A T C C T T C C C G C C A T A T G T C G A T C C T T C C C G C C lacZ 1640- T T G G C G G T T T C G C T A A A T A C A T A T G T C G A T C C T T C C C G C C 1659 wt T T G G A G G T T T C G C T A A A T A C A T T T G T C G A T C C T T C C C G C C T T G G T G G T T T C G C T A A A T A C A T A T G T T G A T C C T T C C C G C C lacZ 1608- T T G C G A A T A C G C C C A C G C G A 1627 wt T T G A G A A T A C G C C C A C G C G A T T G A G A A T A C G C C C A C G C G A T T G A G A A T A C G C C C A C G C G A T T G A G A A T A C G C C C A C G C G A T T G T G A A T A C G C C C A C G C G A T T G A G A A T A A G C C C A C G C G A T T G T G A A T A C G C C C A C G C G A

[0187] The left part of Table 2 shows SEQ ID NOs: 30-80, respectively; the right part shows SEQ ID NOs: 81-147, respectively.

II. Test Results

[0188] In wild-type Escherichia coli MG1655 and ATCC 8739, the cytosine deaminase (AID) and nCas9 fusion expression system was used for site-directed base editing at different genes (dcuA, dcuB, dcuC, dcuD, relA, rpoS and lacZ) and at different sites of the same gene.

[0189] The editing results are as shown in Table 2. Among the 7 target sites in Escherichia coli MG1655, a total of 51 bases C were mutated to T and 70 bases C were mutated to A. Among the 6 target sites in Escherichia coli ATCC 8739, a total of 10 bases C were mutated to T and 19 bases C were mutated to A. Overall, the C-to-T mutations accounted for 40.7% (61/150) of the total mutations, and the C-to-A mutations accounted for 59.3% (89/150) of the total mutations. The above results indicate that the base editing system composed of cytosine deaminase (AID) and nCas9 can realize not only the base substitution of cytosine (C) to thymine (T), but also the base substitution of cytosine (C) to adenine (A).
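The pooled percentages in paragraph [0189] can be checked with a short calculation. The following is a minimal sketch in Python; the dictionary layout and the function name mutation_shares are illustrative assumptions, and only the counts are taken from the text.

```python
# Counts reported in paragraph [0189]; the data layout is an assumption.
counts = {
    "MG1655": {"C>T": 51, "C>A": 70},
    "ATCC 8739": {"C>T": 10, "C>A": 19},
}


def mutation_shares(counts_by_strain):
    """Pool counts over strains and return (count, total, percent) per mutation type."""
    totals = {}
    for strain_counts in counts_by_strain.values():
        for mutation, n in strain_counts.items():
            totals[mutation] = totals.get(mutation, 0) + n
    grand_total = sum(totals.values())
    return {m: (n, grand_total, 100 * n / grand_total) for m, n in totals.items()}


shares = mutation_shares(counts)
for mutation, (n, total, pct) in shares.items():
    print(f"{mutation}: {n}/{total} = {pct:.1f}%")
```

With the counts above, the pooled values are 61/150 for C-to-T and 89/150 for C-to-A, which match the percentages stated in the paragraph.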

EXAMPLE 2

A Method for Improving a Base Editing Efficiency of Mutating a Target Base C to A in Escherichia coli

[0190] In order to improve a base editing efficiency of mutating cytosine (C) to adenine (A) at a specific site in Escherichia coli, the fusion expression of cytosine deaminase (AID), nCas9 and uracil DNA glycosidase was performed in Escherichia coli, so that the base editing efficiency of mutating cytosine (C) to adenine (A) can reach 94.5%.

I. Test Methods

[0191] The pUNG_nCas9_AID plasmid containing a cytosine deaminase (AID), nCas9 and uracil DNA glycosidase fusion expression system and the Escherichia coli gRNA plasmid containing different target sites were introduced into wild-type Escherichia coli MG1655, plating was performed after culturing for 24 h, and a subset of colonies was randomly selected for PCR detection and sequencing.

[0192] A map of the pUNG_nCas9_AID plasmid was shown in FIG. 5, and its nucleotide sequence was as shown in SEQ ID NO: 3; wherein sites 1-687 represented an encoding gene sequence of the uracil DNA glycosidase, sites 736-4,839 represented an encoding gene sequence of nCas9, sites 5,140-5,781 represented an encoding gene sequence of the cytosine deaminase (AID), sites 7,344-8,003 were chloramphenicol genes, and sites 6,070-6,980 were origins of replication. The pUNG_nCas9_AID plasmid expressed a fusion protein composed of cytosine deaminase (AID), nCas9 and uracil DNA glycosidase.

[0193] The Escherichia coli gRNA plasmid containing different target sites was the gRNA plasmid targeting at different sites of the lacZ gene. The specific target sites of the gRNA plasmid were as shown in Table 3.

TABLE-US-00003 TABLE 3 Specific Target Sites of the gRNA Plasmid N N N N N N N N N N N N N N N N N N N N E.coli N N N N N N N N N N N N E.coli MG1655 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 MG1655 20 19 18 17 16 15 14 13 12 11 10 N9 N8 N7 N6 N5 N4 N3 2 N1 lacZ 1444- T C C C G C C C G G T G C A G T A T G A lacZ 1431- A T C T G T C G A T C C T T C C C G C C 1463 wt T C C C G C C C G G T G C A G T A T G A 1450 wt A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A A T C T G T C G A T C C T T C C C G C C T C C C G C C C G G T G C A G T A T G A T C C C G C C C G G T G C A G T A T G A lacZ 1431- T T G C G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A 1450 wt T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T 
G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T T G A G A A T A C G C C C A C G C G A T C C C G C C C G G T G C A G T A T G A T C C C G C C C G G T G C A G T A T G A lacz 984- C T G C G A T G T C G G T T T C C G C G T C C C G C C C G G T G C A G T A T G A 1003 wt C T G A G A T G T C G G T T T C C G C G T C C C G C C C G G T G C A G T A T G A C T G A G A T G T C G G T T T C C G C G T C C C G C C C G G T G C A G T A T G A C T G A G A T G T C G G T T T C C G C G T C C C G C C C G G T G C A G T A T G A C T G A G A T G T C G G T T T C C G C G T C C C G C C C G G T G C A G T A T G A C T G A G A T G T C G G T T T C C G C G C T G A G A T G T C G G T T T C C G C G C T G A G A T G T C G G T T T C C G C G C T G A G A T G T C G G T T T C C G C G C T G A G A T G T C G G T T T C C G C G C T G A G A T G T C G G T T T C C G C G C T G A G A T G T C G G T T T C C G C G C T G A G A T G T C G G T T T C C G C G C T G A G A T G T C G G T T T C C G C G C T G A G A T G T C G G T T T C C G C G

[0194] The left part of Table 3 shows SEQ ID NOs: 148-187, respectively; the right part shows SEQ ID NOs: 188-234, respectively.

II. Test Results

[0195] In wild-type Escherichia coli MG1655, 4 sites of the lacZ gene were selected for site-directed base editing, and the base editing efficiency was calculated [base editing efficiency=(number of positive strains with target base substitutions/total number of positive strains analyzed)×100%].

[0196] The editing results are as shown in Table 7. The results show that among the 4 target sites of the Escherichia coli MG1655, a total of 121 bases C were mutated to bases A, 5 bases C were mutated to bases T, and 2 bases C were mutated to bases G. The C-to-A mutations accounted for 94.5% (121/128) of total mutations. The above results indicate that the base editing system composed of cytosine deaminase (AID), nCas9 and uracil DNA glycosidase can significantly improve the base editing efficiency of mutating C to A.
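The efficiency formula of paragraph [0195] and the 94.5% figure of paragraph [0196] can be reproduced as follows. This is a minimal sketch: the function name editing_efficiency and the dictionary are illustrative assumptions, and the calculation simply applies the bracketed formula to the mutation counts quoted in the text.

```python
def editing_efficiency(n_target, n_total):
    """Percentage of analyzed events that carry the target substitution."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_target / n_total


# Mutation counts quoted in paragraph [0196].
observed = {"C>A": 121, "C>T": 5, "C>G": 2}
total = sum(observed.values())

c_to_a = editing_efficiency(observed["C>A"], total)
print(f"C-to-A: {observed['C>A']}/{total} = {c_to_a:.1f}%")
```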

EXAMPLE 3

A Method for Improving a Base Editing Efficiency of Mutating a Target Base C to G in Mammalian Cells

[0197] In a literature (Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420-424 (2016)), it has been found that the fusion expression of cytosine deaminase (APOBEC) and nCas9 in mammalian cells can achieve site-directed substitutions of cytosine (C) to thymine (T) as well as cytosine (C) to guanine (G) at a specific site of the mammalian cells; wherein the C-to-T mutations accounted for 89.6% of total mutations and C-to-G mutations accounted for 10.4% of the total mutations.

[0198] In order to improve a base editing efficiency of mutating cytosine (C) to guanine (G) at a specific site in mammalian cells, the fusion expression of cytosine deaminase (APOBEC), nCas9 and uracil DNA glycosidase was performed in mammalian cells, so that the base editing efficiency of mutating cytosine (C) to guanine (G) can reach 95.2%.

I. Test Methods

[0199] The pAPOBEC_nCas9_UGI plasmid, containing a cytosine deaminase (APOBEC), nCas9 and uracil DNA glycosylase inhibitory protein (UGI) fusion expression system, and the pAPOBEC_nCas9_UNG plasmid, containing a cytosine deaminase (APOBEC), nCas9 and uracil DNA glycosidase fusion expression system, were each co-transfected with the mammalian cell gRNA plasmid containing the targeting sites into HEK293T or HeLa cells using Lipofectamine 2000 reagent (Life, Invitrogen, 11668019). Genomic DNA was extracted from the cells 96 h after transfection for PCR detection and sequencing of the edited site. Two parallel tests (test 1 and test 2) were performed for each cell type with each plasmid combination.

[0200] A map of the pAPOBEC_nCas9_UGI plasmid was as shown in FIG. 6, and its nucleotide sequence was as shown in SEQ ID NO: 4; wherein sites 1,038-1,721 represented an encoding gene sequence of the cytosine deaminase (APOBEC1), sites 1,773-5,873 represented an encoding gene sequence of the nCas9, sites 5,943-6,191 represented an encoding gene sequence of the uracil DNA glycosylase inhibitory protein (UGI), sites 7,430-8,018 were replicons for the amplification of Escherichia coli, and sites 8,189-9,049 were ampicillin resistance genes for the amplification of Escherichia coli. The pAPOBEC_nCas9_UGI plasmid expressed a fusion protein composed of cytosine deaminase (APOBEC), nCas9 and uracil DNA glycosylase inhibitory protein (UGI).

[0201] A map of the pAPOBEC_nCas9_UNG plasmid was as shown in FIG. 7, and its nucleotide sequence was as shown in SEQ ID NO: 5; wherein sites 1-663 represented an encoding gene sequence of the uracil DNA glycosidase, sites 1,902-2,490 were replicons for the amplification of Escherichia coli, sites 2,661-3,521 were ampicillin resistance genes for the amplification of Escherichia coli, sites 4,695-5,375 represented an encoding gene sequence of the cytosine deaminase (APOBEC), and sites 5,430-9,530 represented an encoding gene sequence of nCas9. The pAPOBEC_nCas9_UNG plasmid expressed a fusion protein composed of cytosine deaminase (APOBEC), nCas9 and uracil DNA glycosidase.

[0202] A map of the mammalian cell gRNA plasmid containing the targeting sites (targeting sites 42,220-42,239 of the RNF2 gene) was as shown in FIG. 8, and its nucleotide sequence was as shown in SEQ ID NO: 6; wherein sites 322-341 represented the target sequence, sites 342-417 represented a gRNA sequence, sites 1,167-1,766 were puromycin resistance genes for the mammalian cells, sites 2,453-3,041 were replicons for amplification in Escherichia coli, and sites 3,212-4,072 were ampicillin resistance genes for amplification in Escherichia coli. In this example and hereinafter, mammalian cell gRNA plasmids targeting other sites can be obtained simply by substituting the target sequence in the gRNA plasmid shown in SEQ ID NO: 6 with the target sequence of another gene or with another target sequence of the same gene.
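The substitution described in paragraph [0202] amounts to replacing sites 322-341 of the plasmid sequence with a new 20-nt target. The sketch below illustrates this under stated assumptions: the function name substitute_target is hypothetical, the backbone is a placeholder string rather than SEQ ID NO: 6, and the example target is arbitrary.

```python
def substitute_target(plasmid, new_target, start=322, end=341):
    """Replace the 1-based, inclusive sites start..end of plasmid with new_target."""
    new_target = new_target.upper()
    if len(new_target) != end - start + 1:
        raise ValueError("new target must have the same length as the replaced region")
    if set(new_target) - set("ACGT"):
        raise ValueError("new target must contain only A, C, G or T")
    if len(plasmid) < end:
        raise ValueError("plasmid is shorter than the target region")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return plasmid[:start - 1] + new_target + plasmid[end:]


# Placeholder backbone (NOT the real SEQ ID NO: 6): N's with a dummy 20-nt target.
backbone = "N" * 321 + "A" * 20 + "N" * 100
# Arbitrary illustrative target, not a sequence from this document.
edited = substitute_target(backbone, "ACGTACGTACGTACGTACGT")
print(edited[321:341])
```

Only the 20-nt target region changes; the gRNA scaffold (sites 342-417) and the rest of the backbone keep their coordinates because the replacement has the same length.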

II. Test Results

[0203] In HEK293T or HeLa mammalian cells, the RNF2 gene sites were selected for site-directed base editing, PCR was performed on the target sites, and deep sequencing analysis was performed on the PCR products, with more than 100,000 reads for each PCR product; the base editing efficiency was calculated according to the following formula: base editing efficiency=(number of reads with target base substitutions/total number of reads analyzed)×100%. The sequencing primer sequences were as follows:

RNF2-deep-F1 (SEQ ID NO: 235): CGTGTATCACCACGCC
RNF2-deep-R1 (SEQ ID NO: 236): CAATACAAAGATTTTCCTAC
RNF2-deep-F2 (SEQ ID NO: 237): TGAGATGGAGTCTTGCTGTG
RNF2-deep-R2 (SEQ ID NO: 238): CAGGCAGATCACAAGGTCAG

[0204] The editing results are as shown in Table 9. The results show that the base editing efficiency of mutating C to G at the C6 site increased from 10.4% to 95.2% in the HEK293T cells and from 14.8% to 87.9% in the HeLa cells.
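The read-based efficiency formula of paragraph [0203] can be illustrated by tallying the base observed at one position across sequencing reads. This is a minimal sketch: the reads below are invented toy data, the function name base_frequencies is hypothetical, and the reads are assumed to be already aligned so that the same position refers to the same site in every read.

```python
from collections import Counter


def base_frequencies(reads, position):
    """Percentage of reads carrying each base at a 1-based position."""
    counts = Counter(read[position - 1] for read in reads if len(read) >= position)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no read covers the requested position")
    return {base: 100 * n / total for base, n in counts.items()}


# Toy reads (invented for illustration): position 6 is C when unedited.
reads = ["AAAAACAA"] + ["AAAAAGAA"] * 3 + ["AAAAATAA"]
freqs = base_frequencies(reads, position=6)
print(freqs)
```

In this toy data set, 3 of 5 reads carry G at position 6, so the C-to-G efficiency by the formula above is 60%.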

EXAMPLE 4

A Base Editing Method of Mutating Any Base to Any Base in Escherichia coli

[0205] A combination of a base editing system for mutating C to A, a base editing system for mutating C to T and a base editing system for mutating A to G realizes the mutation of a base A, T, C or G to any base in Escherichia coli, as shown in FIG. 1.

I. Test Methods

1. Mutation from a Base C to Any Base

[0206] The pUNG_nCas9_AID plasmid containing a cytosine deaminase (AID), nCas9 and uracil DNA glycosidase fusion expression system and an Escherichia coli gRNA plasmid (the target sequence in the gRNA plasmid was TTT custom-character TTTCACAGATGTGGAT (SEQ ID NO:239), in which underlined bases were specific sites to be edited) were introduced into wild-type Escherichia coli MG1655. Plating was performed after culturing for 24 h, a subset of colonies was randomly selected for PCR detection and sequencing of the edited sites, and the strains with C mutated to A at the specific sites were screened out so as to realize the editing from a base C to a base A.

[0207] The pnCas9_AID plasmid containing a cytosine deaminase (AID) and nCas9 fusion expression system and the Escherichia coli gRNA plasmid (the target sequence in the gRNA plasmid was TTT custom-character TTTCACAGATGTGGAT (SEQ ID NO:239), in which underlined bases were specific sites to be edited) were introduced into wild-type Escherichia coli MG1655. Plating was performed after culturing for 24 h, a subset of colonies was randomly selected for PCR detection and sequencing, and the strains with C mutated to T at the specific sites were screened out so as to realize the editing from a base C to a base T.

[0208] The screened strains with C mutated to A were cultured, and the plasmid was discarded; then the pTadA_nCas9 plasmid containing an adenine deaminase (TadA) and nCas9 fusion expression system and an Escherichia coli gRNA plasmid (the target sequence in the gRNA plasmid was TTT custom-character TTTCACAGATGTGGAT (SEQ ID NO:240), in which underlined bases were specific sites to be edited) were introduced into the strains with C mutated to A. Plating was performed after culturing for 24 h, a subset of colonies was randomly selected for PCR detection and sequencing of the edited sites, and the strains with C mutated to G at the specific sites were screened out so as to realize the editing from a base C to a base G.

2. Mutation from a Base T to Any Base

[0209] The pTadA_nCas9 plasmid containing an adenine deaminase (TadA) and nCas9 fusion expression system and the Escherichia coli gRNA plasmid (the target sequence in the gRNA plasmid was AGGCC custom-character ATCCGCGCCGGATG (SEQ ID NO:241), in which underlined bases were specific sites to be edited) were introduced into wild-type Escherichia coli MG1655. Plating was performed after culturing for 24 h, a subset of colonies was randomly selected for PCR detection and sequencing, and the strains with A mutated to G at the specific sites were screened out so as to realize the editing from a base T to a base C.

[0210] The screened strains with A mutated to G were cultured without antibiotic, and the plasmid was discarded; then the pUNG_nCas9_AID plasmid containing a cytosine deaminase (AID), nCas9 and UNG fusion expression system and the Escherichia coli gRNA plasmid (the target sequence in the gRNA plasmid was GAT custom-character GGCCTGAACTGCCAGC (SEQ ID NO:242), in which underlined bases were specific sites to be edited) were introduced into the strains with A mutated to G. Plating was performed after culturing for 24 h, a subset of colonies was randomly selected for PCR detection and sequencing of the edited sites, and the strains with C mutated to A at the specific sites were screened out so as to realize the editing from a base T to a base A.

[0211] The screened strains with C mutated to A were cultured without antibiotic, and the plasmid was discarded; then the pTadA_nCas9 plasmid containing an adenine deaminase (TadA) and nCas9 fusion expression system and the Escherichia coli gRNA plasmid (the target sequence in the gRNA plasmid was GAT custom-character GGCCTGAACTGCCAGC (SEQ ID NO:243), in which underlined bases were specific sites to be edited) were introduced into the strains with C mutated to A. Plating was performed after culturing for 24 h, a subset of colonies was randomly selected for PCR detection and sequencing of the edited sites, and the strains with A mutated to G at the specific sites were screened out so as to realize the editing from a base T to a base G.

[0212] A map of the pTadA_nCas9 plasmid was as shown in FIG. 10, and its nucleotide sequence was as shown in SEQ ID NO: 7; wherein sites 3,982-4,530 represented an encoding gene sequence of the adenine deaminase (TadA), sites 4,531-8,637 represented an encoding gene sequence of the nCas9, sites 1,563-2,222 were chloramphenicol genes, and sites 289-1,199 were origins of replication. The pTadA_nCas9 plasmid expressed a fusion protein composed of adenine deaminase (TadA) and nCas9.

II. Test Results

[0213] In wild-type Escherichia coli MG1655, 2 sites of the lacZ gene were selected for editing to any base, and the base editing efficiency was calculated [base editing efficiency=(number of positive strains with target base substitutions/total number of positive strains analyzed)×100%].

[0214] The editing results are as shown in Table 11. The results show that starting from the base C, the editing efficiency of mutating the base C to the base T is 66.7%, the editing efficiency of mutating the base C to the base A is 96%, and the editing efficiency of mutating the base C to the base G is 96%×41.2%=39.6%. Starting from the base T (its complementary base is the base A), the editing efficiency of mutating the base T to the base C is 45.8%, the editing efficiency of mutating the base T to the base A is 45.8%×95.4%=43.7%, and the editing efficiency of mutating the base T to the base G is 45.8%×95.4%×50.2%=21.9%.
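The multi-step efficiencies in paragraph [0214] are products of the per-step efficiencies. The following minimal sketch reproduces that arithmetic; the function name cumulative_efficiency and the routes dictionary are illustrative assumptions, and multiplying the steps follows the calculation the paragraph itself performs.

```python
from math import prod


def cumulative_efficiency(step_percentages):
    """Overall efficiency (%) of sequential editing steps, as the product of the steps."""
    return 100 * prod(p / 100 for p in step_percentages)


# Per-step efficiencies (%) quoted in paragraph [0214].
routes = {
    "C>T": [66.7],
    "C>A": [96.0],
    "C>G": [96.0, 41.2],
    "T>C": [45.8],
    "T>A": [45.8, 95.4],
    "T>G": [45.8, 95.4, 50.2],
}

overall = {route: cumulative_efficiency(steps) for route, steps in routes.items()}
for route, value in overall.items():
    print(f"{route}: {value:.1f}%")
```

Each additional editing round multiplies in another factor below 100%, which is why the three-step T-to-G route has the lowest overall efficiency.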

EXAMPLE 5

A Base Editing Method of Mutating Any Base to Any Base in Mammalian Cells

[0215] A combination of a base editing system for mutating C to G, a base editing system for mutating C to T and a base editing system for mutating A to G realizes the mutation of a base A, T, C or G to any base in mammalian cells, as shown in FIG. 2.

I. Test Methods

1. Mutation from a Base C to Any Base

[0216] The pAPOBEC_nCas9_UNG plasmid containing a cytosine deaminase (APOBEC), nCas9 and uracil DNA glycosidase fusion expression system and the mammalian cell gRNA plasmid (the target sequence in the gRNA plasmid was TCC custom-character AAAGTACTGAGATTAC (SEQ ID NO:244), in which underlined bases were specific sites to be edited) were transfected into HEK293T cells. Puromycin was added to a final concentration of 5 μg/ml 24 h after transfection, and single cells were separated by flow cytometry after 72 h and then cultured in a 96-well plate. Cellular genomes were extracted after 24 h for PCR detection and sequencing, and the cells with C mutated to G at the specific sites were screened out so as to realize the editing from a base C to a base G.

[0217] The pAPOBEC_nCas9 plasmid containing a cytosine deaminase (APOBEC) and nCas9 fusion expression system and the mammalian cell gRNA plasmid (the target sequence in the gRNA plasmid was TCC custom-character AAAGTACTGAGATTAC (SEQ ID NO:244), in which underlined bases were specific sites to be edited) were transfected into HEK293T cells. Puromycin was added to a final concentration of 5 μg/ml 24 h after transfection, and single cells were separated by flow cytometry after 72 h and then cultured in a 96-well plate. Cellular genomes were extracted after 24 h for PCR detection and sequencing, and the cells with C mutated to T at the specific sites were screened out so as to realize the editing from a base C to a base T.

[0218] The pAPOBEC_nCas9_UGI plasmid containing a cytosine deaminase (APOBEC), nCas9 and uracil DNA glycosylase inhibitory protein (UGI) fusion expression system and the mammalian cell gRNA plasmid (the target sequence in the gRNA plasmid was GTACTTT custom-character GGAGGCCGAGGC (SEQ ID NO:245), in which underlined bases were specific sites to be edited) were transfected into the cells with C mutated to G. Puromycin was added to a final concentration of 5 μg/ml 24 h after transfection, and single cells were separated by flow cytometry after 72 h and then cultured in a 96-well plate. Cellular genomes were extracted after 24 h for PCR detection and sequencing, and the cells with C mutated to T at the specific sites were screened out so as to realize the editing from a base C to a base A.

2. Mutation from a Base T to Any Base

[0219] The xcas9 (3.7)-ABE (7.10) plasmid containing an adenine deaminase (TadA) and xCas9 (3.7) fusion expression system and the mammalian cell gRNA plasmid (the target sequence in the gRNA plasmid was GCTTT custom-character GCGTCTTGAGTAGC (SEQ ID NO:246), in which underlined bases were specific sites to be edited) were transfected into HEK293T cells. Puromycin was added to a final concentration of 5 μg/ml 24 h after transfection, and single cells were separated by flow cytometry after 72 h and then cultured in a 96-well plate. Cellular genomes were extracted after 24 h for PCR detection and sequencing, and the cells with A mutated to G at the specific sites were screened out so as to realize the editing from a base T to a base C.

[0220] The pAPOBEC_nCas9_UNG plasmid containing a cytosine deaminase (APOBEC), nCas9 and uracil DNA glycosidase fusion expression system and the mammalian cell gRNA plasmid (the target sequence in the gRNA plasmid was CGC custom-character AAAGCAGGAGAATCGC (SEQ ID NO:247), in which underlined bases were specific sites to be edited) were transfected into the cells with A mutated to G. Puromycin was added to a final concentration of 5 μg/ml 24 h after transfection, and single cells were separated by flow cytometry after 72 h and then cultured in a 96-well plate. Cellular genomes were extracted after 24 h for PCR detection and sequencing, and the cells with C mutated to G at the specific sites were screened out so as to realize the editing from a base T to a base G.

[0221] The pAPOBEC_nCas9 plasmid containing a cytosine deaminase (APOBEC) and nCas9 fusion expression system and the mammalian cell gRNA plasmid (the target sequence in the gRNA plasmid was GCTTT custom-character GCGTCTTGAGTAGC (SEQ ID NO:248), in which underlined bases were specific sites to be edited) were transfected into the cells with C mutated to G. Puromycin was added to a final concentration of 5 μg/ml 24 h after transfection, and single cells were separated by flow cytometry after 72 h and then cultured in a 96-well plate. Cellular genomes were extracted after 24 h for PCR detection and sequencing, and the cells with C mutated to T at the specific sites were screened out so as to realize the editing from a base T to a base A.

[0222] A map of the xcas9 (3.7)-ABE (7.10) plasmid was as shown in FIG. 12, and its nucleotide sequence was as shown in SEQ ID NO: 7; wherein sites 676-1,176 represented an encoding gene sequence of the adenine deaminase (TadA), sites 1,867-5,967 represented an encoding gene sequence of the xCas9 (3.7), sites 7,544-8,404 were chloramphenicol genes, and sites 6,785-7,373 were origins of replication.

II. Test Results

[0223] In the HEK293T cells, two RNF2 gene sites were selected for editing to any base, PCR was performed on the target sites, and deep sequencing analysis was performed on the PCR products, with more than 100,000 reads for each PCR product; the base editing efficiency was calculated according to the following formula: base editing efficiency=(number of reads with target base substitutions/total number of reads analyzed)×100%. The sequencing primer sequences were as follows:

RNF2-deep-F1 (SEQ ID NO: 249): CGTGTATCACCACGCC
RNF2-deep-R1 (SEQ ID NO: 250): CAATACAAAGATTTTCCTAC
RNF2-deep-F2 (SEQ ID NO: 251): TGAGATGGAGTCTTGCTGTG
RNF2-deep-R2 (SEQ ID NO: 252): CAGGCAGATCACAAGGTCAG

[0224] The editing results are as shown in Table 13. The results show that starting from the base C, the editing efficiency of mutating the base C to the base T is 52.5%, the editing efficiency of mutating the base C to the base G is 46.3%, and the editing efficiency of mutating the base C to the base A is 46.3%×43.5%=20.1%. Starting from the base T (its complementary base is the base A), the editing efficiency of mutating the base T to the base C is 48.6%, the editing efficiency of mutating the base T to the base G is 48.6%×38.2%=18.6%, and the editing efficiency of mutating the base T to the base A is 48.6%×38.2%×50.7%=9.4%.
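The remark that the complementary base of T is A reflects how a deaminase acting on one strand changes the base read on the other strand. The sketch below traces the mammalian routes of this example under stated assumptions: the function names are hypothetical, and the assignment of each step to the "same" or "opposite" strand is an interpretation of paragraphs [0216] to [0221], not something the text states explicitly.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def opposite_strand_edit(edit):
    """Map an edit such as 'A>G' on one strand to the change seen on the other strand."""
    src, dst = edit.split(">")
    return f"{COMPLEMENT[src]}>{COMPLEMENT[dst]}"


def apply_steps(base, steps):
    """Follow a reference-strand base through sequential edits.

    Each step is (edit, strand); strand is 'same' or 'opposite' relative to the
    reference strand on which the starting base is read.
    """
    for edit, strand in steps:
        if strand == "opposite":
            edit = opposite_strand_edit(edit)
        src, dst = edit.split(">")
        if src != base:
            raise ValueError(f"step {edit} does not apply to base {base}")
        base = dst
    return base


# Assumed strand assignments for the mammalian routes (interpretation, see lead-in).
t_route = [("A>G", "opposite"), ("C>G", "same"), ("C>T", "opposite")]
c_route = [("C>G", "same"), ("C>T", "opposite")]

print(apply_steps("T", t_route[:1]), apply_steps("T", t_route[:2]), apply_steps("T", t_route))
print(apply_steps("C", c_route))
```

Under these assumptions the T route passes through C and G before ending at A, and the C route passes through G before ending at A, in line with the order of the efficiencies listed in paragraph [0224].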

[0225] The foregoing descriptions are merely some of the preferred embodiments of the present invention. It should be noted that for those of ordinary skill in the art, a number of improvements and modifications can be made without departing from the technical principle of the present invention. These improvements and modifications should also fall within the protection scope of the present invention.

INDUSTRIAL APPLICATION

[0226] The present invention provides a base editing system for mutating C to A in prokaryotes, a base editing system for mutating C to G in eukaryotes and applications thereof. The base editing system for mutating C to A of the present invention includes cytosine deaminase AID and nCas9 nuclease or includes cytosine deaminase AID, nCas9 nuclease and uracil DNA glycosidase; the base editing system for mutating C to G of the present invention includes cytosine deaminase AID and nCas9 nuclease or includes cytosine deaminase APOBEC, nCas9 nuclease and uracil DNA glycosidase. The experiments prove that a combination of the three base editing systems for mutating C to A, C to T and A to G can realize a mutation of A, T, C or G to any base in prokaryotes (such as Escherichia coli); a combination of the three base editing systems for mutating C to G, C to T and A to G can realize a mutation of A, T, C or G to any base in eukaryotes (such as mammalian cells).
