Off-Target Single Nucleotide Variants Caused by Single-Base Editing and High-Specificity Off-Target-Free Single-Base Gene Editing Tool
20220136041 · 2022-05-05
Inventors
Cpc classification
C12N2310/20
CHEMISTRY; METALLURGY
C12N9/78
CHEMISTRY; METALLURGY
G01N2333/978
PHYSICS
A61K48/0066
HUMAN NECESSITIES
A61K49/0004
HUMAN NECESSITIES
G01N2500/02
PHYSICS
International classification
A61K48/00
HUMAN NECESSITIES
C12N15/90
CHEMISTRY; METALLURGY
Abstract
Provided are a method for reducing the off-target effect of a single-base editor, and a method (GOTI) for analyzing the targeting effect of a gene editing tool or a gene editing operation.
Claims
1.-34. (canceled)
35. A method for reducing the off-target effect of a single-base editor, comprising: modifying the cytosine deaminase in a single-base editor system to weaken its binding to DNA.
36. The method according to claim 35, wherein, modifying the cytosine deaminase is to modify the DNA binding region of the cytosine deaminase; the DNA binding region is a domain thereof that binds to DNA.
37. The method according to claim 36, wherein, the modification comprises: gene mutation, targeted blocking, interference.
38. The method according to claim 35, wherein, the single-base editor system is a BE3 gene editor system, or the DNA is single-stranded DNA or double-stranded DNA.
39. The method according to claim 35, wherein, the cytosine deaminase comprises an enzyme selected from the group consisting of: AID, APOBEC3G, APOBEC1, APOBECA3A, CDA1.
40. The method according to claim 39, wherein, the cytosine deaminase is APOBEC1, said modifying the cytosine deaminase is to modify the amino acid at position 126 of the enzyme.
41. The method according to claim 40, wherein, said modifying the cytosine deaminase is to modify R126 of the enzyme to E.
42. The method according to claim 40, wherein, modifying APOBEC1 comprises: modifying the amino acid at position 90 of the APOBEC1 enzyme.
43. The method according to claim 42, wherein, said modifying is to modify the amino acid at position 90 to Y.
44. The method according to claim 39, wherein, the cytosine deaminase is APOBECA3A, and modifying the cytosine deaminase is to modify the amino acid at position 130 of the enzyme.
45. The method according to claim 44, wherein, the enzyme is modified to alter Y at position 130 to F.
46. A mutant of cytosine deaminase, wherein the DNA binding region of the cytosine deaminase is modified to weaken its binding to DNA, such as single-stranded DNA.
47. The cytosine deaminase according to claim 46, wherein, the cytosine deaminase comprises an enzyme selected from the group consisting of: AID, APOBEC3G, APOBEC1, APOBECA3A, CDA1.
48. The mutant according to claim 46, wherein, the enzyme is APOBEC1, the domain is modified to alter R at position 126 to E.
49. The mutant according to claim 46, wherein, APOBEC1 is further modified at the 90th amino acid of the enzyme; the enzyme is modified to alter the amino acid at position 90 to Y.
50. The mutant according to claim 46, wherein, the enzyme is APOBECA3A, and the enzyme is modified to alter Y at position 130 to F.
51. An isolated polynucleotide, wherein the polynucleotide encodes the mutant according to claim 46.
52. A single-base editor, comprising a mutant of the cytosine deaminase according to claim 43, the editor is a BE3 single-base editor.
53. A method for screening a substance for reducing the off-target effect of a single-base editor, comprising: (1) treating a system with a candidate substance, the system containing interaction between a cytosine deaminase or its DNA binding domain and DNA; and (2) detecting the interaction between the cytosine deaminase or its DNA binding domain and DNA in the system; wherein, if the candidate substance inhibits, blocks or down-regulates the interaction between the cytosine deaminase or its DNA binding domain and DNA, the candidate substance is useful for reducing the off-target effect of the gene editor.
54. A method for analyzing the on-target effect of gene editing or the on-target effect of a single-base gene editing tool, the method comprising: (1) obtaining a n-cell stage embryo, subjecting one to n−1 cells thereof to gene editing, wherein n is a positive integer from 2 to 10; (2) observing or detecting the occurrence of gene editing in the downstream development stages of the embryo.
55. The method according to claim 54, wherein, in step (1), n is a positive integer of 2 to 8, 2 to 6 or 2 to 4; or, n is 2.
56. The method according to claim 54, wherein, the method is an in vitro cultivation method or an in vivo cultivation method.
57. The method according to claim 54, wherein, in step (2), the downstream development stage of the embryo is from gastrulation stage of the embryo to prenatal stage, or from embryo implantation into a uterus to prenatal stage in vivo.
58. The method according to claim 54, wherein, the embryo is a mouse embryo, and the downstream development stage of the embryo is the 8th to 20th day of embryonic development, or is the 9.5th to 18.5th day of embryonic development, or is the 12th to 16th day of embryonic development.
59. The method according to claim 54, wherein, during the cleavage stage of the embryo, the gene-edited blastomere and the unedited blastomere of the embryo are separated and transplanted into recipients to develop into separate adults.
60. The method according to claim 59, wherein, the gene-edited blastomere and the unedited blastomere form separate embryos, which are transplanted to different recipients or the same recipient, or used to establish embryonic stem cell lines in vitro.
61. The method according to claim 54, wherein, the gene editing comprises: CRISPR-mediated gene editing, Base Editor-mediated gene editing, Cre/loxP-mediated gene editing, Prime editor.
62. The method according to claim 61, wherein, the CRISPR-mediated gene editing comprises: CRISPR/Cas9-mediated gene editing, CRISPR/Cas9n-mediated gene editing, CRISPR/Cas13-mediated gene editing, CRISPR/CasRx-mediated gene editing.
63. The method according to claim 61, wherein, the Base Editor comprises: BE1, BE2, BE3, BE4, or BE4-Max.
64. The method according to claim 61, wherein, the adenine base editor comprises: ABE7.10, ABE6.3, ABE7.8, ABE7.9, Prime Editing.
65. The method according to claim 54, wherein, step (1) comprises: introducing an enzyme for cutting a nucleic acid target site together with a corresponding guide sequence into one of the cells, and performing gene editing.
66. The method according to claim 65, wherein, the enzyme for cutting a nucleic acid target site is selected from the group consisting of: Cas9, Cas9n, Cas13a, CasRx, BE1, BE2, BE3, BE4, ABE7.10, ABE 6.3, ABE 7.8, ABE 7.9.
67. The method according to claim 54, wherein, in step (1), a detectable marker is used to label the gene editing, and the gene editing is performed on 1 to n−1 of the cells and labeled by the detectable marker.
68. The method according to claim 67, wherein, the detectable marker includes: a dye marker, a fluorescent signal molecule, a reporter gene; or, the detectable marker is tdTomato, EGFP, mCherry, GFP, dsred.
69. The method according to claim 54, wherein, in step (2), observing the occurrence of gene editing comprises: sorting cells that have undergone gene editing and cells that have not undergone gene editing; analyzing by sequencing; analyzing through a single nucleotide variation analysis tool and/or an indel analysis tool; comparing edited cells with unedited cells to identify on-target effects or off-target effects, including detection of SNVs and indels.
70. The method according to claim 69, wherein, the single nucleotide variation analysis tool comprises: Mutect2, Lofreq and Strelka or a combination thereof, or the indel analysis tool comprises: Mutect2, Scalpel, Strelka or a combination thereof.
71. The method according to claim 54, wherein, the embryo is derived from a mammal, including a non-human mammal.
Description
BRIEF DESCRIPTION OF DRAWINGS
DETAILED DESCRIPTION
[0079] Genome editing is expected to correct disease-causing mutations. However, due to single nucleotide polymorphisms between different individuals, it is difficult to determine the off-target effects of gene editing. In order to study such off-target effects, the inventors developed a method for whole-genome off-target analysis by two- or multi-cell (preferably two-cell) embryo injection, named GOTI. The method of the present invention is suitable for tracking, analyzing and detecting the on-target effect/efficiency of CRISPR-mediated gene editing, Base Editor-mediated gene editing, Cre/loxP-mediated gene editing, and adenine base editor-mediated gene editing.
[0080] The present invention provides a method (GOTI) for analyzing the targeting effect of a single-base gene editing tool. The method includes the steps of: (1) obtaining an n-cell stage embryo and subjecting 1 to n−1 cells thereof to gene editing, where n is a positive integer from 2 to 10; (2) observing the occurrence and development of gene editing in the downstream development stages of the embryo. In some preferred embodiments, n is a positive integer of 2-8, 2-6 or 2-4. In a preferred embodiment, n is 2.
[0081] The method of the present invention is suitable for embryo culture in vitro, for example, embryo culture in a test tube or other embryo culture container. The method of the present invention is also suitable for embryo cultivation in vivo, for example by performing the method of the present invention in vitro and then transplanting the developed cells into the body (for example, transplanting into the fallopian tube of an animal, from which the embryo can migrate on its own into the uterus; or transplanting into the uterus of an animal).
[0083] The method of the present invention is suitable for embryo culture in vitro, in an embryo culture container, to establish an embryonic stem cell line.
[0084] The method of the present invention is suitable for embryo culture in vitro, in an embryo culture container, to establish embryonic stem cell lines from the edited blastomere and the unedited blastomere, respectively.
[0085] The method of the present invention is suitable for separating the edited blastomere and the unedited blastomere of the same embryo to form two embryos, which are respectively transplanted into different recipients (different mice) or used to establish embryonic stem cell lines in vitro.
[0086] The method of the present invention is suitable for separating the edited blastomere and the unedited blastomere of the same embryo to form two embryos, which are transplanted into the same recipient (one mouse) or used to establish embryonic stem cell lines in vitro.
[0088] In a preferred embodiment, the downstream development stages of the embryo are from the gastrulation stage of the embryo to the prenatal stage, or from embryo implantation into a uterus to the prenatal stage in vivo. The inventors found that it is ideal to sort cells and determine the effect of gene editing at the “appropriate time” of embryonic development. Generally, the “appropriate time” is the stage at which the embryo has grown enough to be broken down into single cells by enzymes. For example, when the n-cell stage embryo is a mouse embryo, the downstream development stage of the embryo is the 8th to 20th day of embryonic development (E8-E20 stage), preferably the 9.5th to 18.5th day of embryonic development (E9.5-E18.5 stage), more preferably the 12th to 16th day of embryonic development (E12-E16 stage, such as E14.5).
[0089] The method of the present invention is applicable to a variety of single-base gene editing methods. The method of the present invention can be adopted in gene editing involving various enzymes that cut DNA target sites. The enzyme that cuts the DNA target site can be any of the enzymes involved in this process that are familiar to those skilled in the art, such as but not limited to those selected from the group consisting of Cas9, Cas9n, Cas13a, CasRx, BE1, BE2, BE3, BE4, ABE7.10, ABE 6.3, ABE 7.8, ABE 7.9 and Prime Editing.
[0090] In the GOTI method, detectable markers can be used to label the gene editing. The detectable markers include, but are not limited to: dye markers, fluorescent signal molecules, and reporter genes.
[0091] In the embodiment of the present invention, tdTomato is used, which is a preferred solution. Other markers can also be applied to the present invention.
[0092] As a preferred embodiment, observing the occurrence and development of gene editing includes: sorting cells that have undergone gene editing (such as tdTomato positive cells) and cells that have not undergone gene editing (such as tdTomato negative cells); analyzing by sequencing (such as WGS analysis); analyzing through SNV analysis tools and/or indel analysis tools; comparing edited cells with unedited cells to identify off-target SNVs and indels. It should be understood that the sequencing tools and analysis tools are not limited to those listed above and in the embodiments of the present invention. Other sequencing tools and analysis tools may also be applied to the present invention. Various methods known in the art can be used for cell sorting, such as but not limited to magnetic bead method, flow cytometry and the like.
[0093] In the present invention, the term “animal” refers to a mammal, including a human, a non-human primate (a monkey, an orangutan), a domestic animal and an agricultural animal (for example, a pig, a sheep, cattle), and a rodent (e.g., a mouse, a rat, a rabbit), etc. In general, the animal does not include a human; in limited or special circumstances, the animal can also be a human, but only for applications that do not involve “commercial applications of human embryos”.
[0094] In a specific embodiment of the present invention, comparison of the whole-genome sequences of the progeny cells of edited and unedited blastomeres at E14.5 showed that in CRISPR-Cas9 or adenine single-base edited embryos, off-target single-nucleotide variations (SNVs) are rare, with a frequency close to the spontaneous mutation rate. In contrast, cytosine single-base editing induces a more than 20-fold higher number of off-target single-nucleotide variations.
[0095] Before clinical application, gene editing in mammalian cells must be shown to be free of genome-wide off-target effects. However, due to the nucleotide polymorphisms among individuals, it is difficult to determine the extent of off-target effects. The GOTI (genome-wide off-target analysis by two-cell embryo injection) method developed in the present invention changes this situation: it detects off-target mutations without interference from SNPs, and can accurately and effectively analyze on-target effects genome-wide.
[0096] The present inventors further studied the causes of off-target effects (such as single-nucleotide off-target mutations) in single-base editing. Upon observing that the single-base editing tool BE3 causes a large number of off-target single nucleotide variants (SNVs), the inventors conducted extensive research and finally determined that these off-target mutations were caused by the overexpression of APOBEC1 and its binding with DNA (such as ssDNA). In a specific embodiment, the present invention discloses a solution to the off-target effect induced by BE3 by introducing mutation(s) into APOBEC1, such as the R126E, R132E, W90F, W90Y, W90F/R126E and W90Y/R126E mutation(s).
[0097] As mentioned above, the present invention has determined a useful method for reducing the off-target effect of single-base editors, including: modifying the cytosine deaminase in the single base editor system to weaken its binding to DNA (such as ssDNA). Preferably, the modification is the modification of the DNA binding region of cytosine deaminase; more preferably, the DNA binding region is a domain that binds to DNA. The single-base editor is, for example, the BE3 gene editor.
[0098] A variety of modification methods for cytosine deaminase can be used herein, as long as the weakening effect can be realized. As an alternative, the modification may include: gene mutation, targeted blocking (such as blocking by binding proteins or antibodies, or blocking by competitive binding molecules), interference, etc.
[0099] A variety of cytosine deaminases that can be applied to the single-base editor system, or enzymes having the same function, can be modified by the method of the present invention to reduce the off-target effect of the single-base editor system. For example, the cytosine deaminase includes but is not limited to an enzyme selected from the group consisting of: AID (e.g., human AID), APOBEC3G (e.g., human APOBEC3G), APOBEC1, CDA1 (e.g., lamprey CDA1).
[0100] In the present invention, the term “weaken” or “weakening” means that the interaction (binding) ability of a cytosine deaminase with DNA is down-regulated or eliminated. For example, the weakening reduces the binding ability of cytosine deaminase to DNA by 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more, or 100%.
[0101] As a preferred embodiment of the present invention, a specific cytosine deaminase APOBEC1 (see SEQ ID NO: 1 for the wild-type sequence, and SEQ ID NO: 4 for a mutant thereof) is provided. After modification of the enzyme's DNA binding region, the editing results of the single-base editor system involving the enzyme have changed substantially, with the off-target effect significantly reduced. Preferably, such modification is to modify the amino acid at position 126 of the enzyme; more preferably, the modification is to mutate the R at position 126 to E.
[0102] In a more preferred embodiment, the modification of APOBEC1 further occurs at amino acid at position 90 of the APOBEC1 enzyme; preferably, the modification is to alter the amino acid at position 90 to Y.
[0103] In a more preferred embodiment, the modification of APOBEC1 further occurs at the 90th amino acid of the APOBEC1 enzyme; preferably, the modification is to alter the amino acid at position 90 to Y.
[0104] As another preferred embodiment of the present invention, a specific cytosine deaminase APOBECA3A (SEQ ID NO: 37) is provided. The modification of APOBECA3A occurs at or near the 130th amino acid of the enzyme. Preferably, the modification is to alter its (SEQ ID NO: 37) Y at position 130 to F.
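The substitution notation used above (for example R126E, W90Y/R126E or Y130F) can be illustrated with a minimal sketch that applies such substitutions to a protein sequence and checks that the wild-type residue is actually present at the stated position. The function names and the short toy sequence in the usage note are hypothetical and chosen only for illustration; the sequences of SEQ ID NO: 1 and SEQ ID NO: 37 are not reproduced here.

```python
import re

def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply one substitution written as <wild-type><position><new>, e.g. 'R126E'.

    Positions are 1-based, as in the text. Raises ValueError if the notation
    is malformed or the wild-type residue does not match the sequence.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation notation: {mutation!r}")
    wild_type, position, new_residue = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + new_residue + sequence[position:]

def apply_substitutions(sequence: str, mutations: str) -> str:
    """Apply several substitutions separated by '/', e.g. 'W90Y/R126E'."""
    for mutation in mutations.split("/"):
        sequence = apply_substitution(sequence, mutation)
    return sequence
```

With a hypothetical toy sequence, `apply_substitutions("MKTAYIAKQR", "K2E/Y5F")` would replace K at position 2 with E and Y at position 5 with F. The wild-type check is a design choice that guards against off-by-one numbering errors when positions are counted on a different reference sequence.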
[0105] Based on the inventors' discovery, further provided is a method for screening substances useful for reducing the off-target effect of the BE3 gene editor, including: (1) treating a system with candidate substance(s), the system containing interaction (binding) between a cytosine deaminase or its DNA binding domain and DNA; and (2) detecting the interaction between the cytosine deaminase or its DNA binding domain and DNA in the system; wherein, if the candidate substance inhibits, blocks or down-regulates the interaction between the cytosine deaminase or its DNA binding domain and DNA, the candidate substance is useful for reducing the off-target effect of the BE3 gene editor.
[0106] In a preferred embodiment of the present invention, in order to observe changes in interaction (binding) between cytosine deaminase or its DNA binding domain and DNA during the screening, a control group can also be set. A control may be a system containing interaction (binding) between a cytosine deaminase or its DNA binding domain and DNA without adding the candidate substance.
[0107] In preferable embodiments, the method further includes: performing a cell experiment and/or animal experiment on the obtained potential substances to further select and determine a substance that is really useful for regulating the interaction (binding) between the cytosine deaminase or its DNA binding domain and DNA.
[0108] The disclosure is further illustrated by the specific examples described below. It should be understood that these examples are merely illustrative, and do not limit the scope of the present disclosure. The experimental methods without specifying the specific conditions in the following examples generally used the conventional conditions, such as those described in J. Sambrook, Molecular Cloning: A Laboratory Manual (3rd ed. Science Press, 2002) or followed the manufacturer's recommendation.
[0109] Materials and Methods
[0110] 1. Experimental Design Including GOTI Method
[0111] A mixture of Cre, Cas9/BE3/ABE7.10 mRNA and sgRNA was injected into one blastomere of two-cell embryos derived from wild-type female mice × Ai9 male mice. The addition of Cre produces chimeric embryos in which the progeny of the injected cell are marked with tdTomato (red). tdTomato-positive cells are edited cells, and tdTomato-negative cells are unedited cells. tdTomato-positive cells and tdTomato-negative cells were separated from chimeric embryos by FACS at E14.5 and used separately for WGS analysis. Off-target SNVs and indels were identified by comparing tdTomato+ cells and tdTomato− cells using three algorithms (Mutect2, Lofreq and Strelka for SNV analysis, and Mutect2, Scalpel and Strelka for indel analysis). SNVs and indels are represented as colored dots and crosses in
[0112] 2. Animals and Care
[0113] Female C57BL/6 mice (4 weeks old) and heterozygous Ai9 (B6.Cg-Gt(ROSA)26Sortm9(CAG-td-Tomato)Hze/J; JAX strain 007909) male mice were used for embryo collection. ICR female mice were used as recipients. The treatment and care of animals conformed to the guidelines of the Biomedical Research Ethics Committee of the Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences.
[0114] 3. Cas9 mRNA, BE3 mRNA, ABE7.10 mRNA, Cre mRNA and sgRNA
[0115] The Cas9 protein coding region was amplified from the px260 plasmid using primers Cas9 F and R. The T7-Cas9 PCR product was purified and transcribed into mRNA using the mMESSAGE mMACHINE T7 ULTRA kit (Life Technologies). The T7-sgRNA PCR product was amplified from the px330 plasmid and transcribed into RNA in vitro using the MEGAshortscript T7 kit (Life Technologies). The T7 promoter was added to the Cre template by PCR amplification, and the T7-Cre PCR product was purified and transcribed into mRNA in vitro using the mMESSAGE mMACHINE T7 ULTRA kit (Life Technologies). Cas9 mRNA, Cre mRNA and sgRNA were purified using the MEGAclear kit (Life Technologies) and eluted in RNase-free water.
TABLE-US-00001
sgRNA sequence (from top to bottom: SEQ ID NO: 5-11)
Locus          Sequence (5′-3′)
Tyr-A (22)     GCGAAGGCACCGCCCTCTTTTGG
Tyr-B (22)     CCAGAAGCCAATGCACCTATCGG
LacZ (23)      TGCGAATACGCCCACGCGATGGG
Pde6b (24)     CCAACCTAAGTAGCAGAAAGTGG
Tyr-C (11)     GACCTCAGTTCCCCTTCAAAGGG
Tyr-D          CTGTGCCAAGGCAGAAACCCTGG
Tyr-E          CCATAACAGAGACTCTTACATGG

Primer sequence (from top to bottom: SEQ ID NO: 12-26)
Name             Sequence (5′-3′)
Cre IVT F        TAATACGACTCACTATAGGGAGACAGATCACCTTTCCTATCAACC
Cre IVT R        TCGGTATTTCCAGCACACTGGA
BE3 IVT F        TCCGCGGCCGCTAATACGACT
BE3 IVT R        TGGTTCTTTCCGCCTCAGAAGCC
Cas9 IVT F       TAATACGACTCACTATAGGGATTCAGGTTGGACCGGTG
Cas9 IVT R       GACGTCAGCGTTCGAATTGC
ABE7.10 IVT F    GAGGTCTATATAAGCAGAGCTC
ABE7.10 IVT R    ATTAATAACTAGCGGCCGCTCCC
Tyr-A IVT F      TAATACGACTCACTATAGGGGCGAAGGCACCGCCCTCTTTGTTTTAGAGCTAGAAATAG
Tyr-B IVT F      TAATACGACTCACTATAGGGCCAGAAGCCAATGCACCTATGTTTTAGAGCTAGAAATAG
Tyr-C IVT F      TAATACGACTCACTATAGGGGACCTCAGTTCCCCTTCAAAGTTTTAGAGCTAGAAATAG
Tyr-D IVT F      TAATACGACTCACTATAGGGCTGTGCCAAGGCAGAAACCCGTTTTAGAGCTAGAAATAG
Tyr-E IVT F      TAATACGACTCACTATAGGGCCATAACAGAGACTCTTACACTTTTAGAGCTAGAAATAG
LacZ-IVT F       TAATACGACTCACTATAGGGTGCGAATACGCCCACGCGATGTTTTAGAGCTAGAAATAG
sgRNA IVT R      AAAAGCACCGACTCGGTGCC
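The sgRNA IVT forward primers listed above appear to follow a common pattern: a T7 promoter prefix, the 20-nt protospacer (the target sequence without its 3-nt PAM), and the start of the sgRNA scaffold. The following minimal sketch reproduces that pattern. The pattern itself is inferred from the Tyr-A to Tyr-D and LacZ entries of the table rather than stated in the text, and the constant and function names are hypothetical.

```python
# Prefix and suffix as they appear in the sgRNA IVT F primers of the table.
T7_PREFIX = "TAATACGACTCACTATAGGG"      # T7 promoter with trailing G's
SCAFFOLD_START = "GTTTTAGAGCTAGAAATAG"  # 5' end of the sgRNA scaffold

def sgrna_ivt_forward_primer(target_with_pam: str, pam_length: int = 3) -> str:
    """Build an IVT forward primer from a 23-nt target (20-nt protospacer + PAM).

    Assumption: the PAM occupies the last `pam_length` bases and is not
    included in the primer, as in the table entries this sketch is based on.
    """
    target = target_with_pam.upper()
    protospacer = target[:-pam_length]
    if len(protospacer) != 20:
        raise ValueError("expected a 20-nt protospacer followed by the PAM")
    if set(protospacer) - set("ACGT"):
        raise ValueError("protospacer contains non-DNA characters")
    return T7_PREFIX + protospacer + SCAFFOLD_START
```

For the Tyr-A target GCGAAGGCACCGCCCTCTTTTGG this yields the Tyr-A IVT F sequence listed in the table. The Tyr-E IVT F entry as printed carries CTTTTAG rather than GTTTTAG after the protospacer, so it is not reproduced by this sketch.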
[0116] 4. 2-Cell Injection, Embryo Culture and Embryo Transfer
[0117] Superovulated C57BL/6 females (4 weeks old) were mated with heterozygous Ai9 (B6.Cg-Gt(ROSA)26Sortm9(CAG-td-Tomato)Hze/J; JAX strain 007909) males. 23 hours after hCG injection, fertilized eggs were collected from the fallopian tube. For 2-cell editing, a mixture of Cas9 mRNA (50 ng/μl), BE3 mRNA (50 ng/μl) or ABE7.10 mRNA (50 ng/μl), sgRNA (50 ng/μl) and Cre mRNA (2 ng/μl) was injected, in a drop of HEPES-CZB medium containing 5 μg/ml cytochalasin B (CB), into the cytoplasm of one blastomere of a 2-cell embryo by FemtoJet micro-syringe (Eppendorf) at a constant flow, 48 hours after hCG injection. The injected embryos were cultured in KSOM medium containing amino acids at 37° C. and 5% CO₂ for 2 hours, and then transplanted into the fallopian tubes of pseudopregnant ICR females.
[0118] 5. Single Cell PCR Analysis
[0119] Under a dissecting microscope, 8-cell mouse embryos were digested with acid Tyrode solution to remove the zona pellucida using homemade glass capillaries; the embryos were then transferred to 0.25% trypsin and gently pipetted to separate individual blastomeres. Finally, each blastomere was washed 7 to 10 times in KSOM and transferred to a PCR tube. Then 1.5 μl of lysis buffer containing 0.1% Tween 20, 0.1% Triton X-100 and 4 μg/ml proteinase K was pipetted into the tube. Each tube was centrifuged to promote mixing. The lysate was incubated at 56° C. for 30 minutes, and then at 95° C. for 5 minutes. The product of the lysis procedure was used as a template in nested PCR analysis. Contamination of samples was avoided in all operations.
TABLE-US-00002
Nested PCR primer sequence (from top to bottom: SEQ ID NO: 27-30)
Tyr   Outer F   GTTATCCTCACACTACTTCTG
      Outer R   GTAATCCTACCAAGAGTCTCA
      Inner F   TCCTCACACTACTTCTGATG
      Inner R   GTCTCAAGATGGAAGATCAC
[0120] 6. T Vector Cloning and Genotype Testing
[0121] The PCR product was purified, ligated to the pMD18-T vector and transformed into competent E. coli strain DH5α. After culturing overnight at 37° C., randomly selected clones were sequenced by the Sanger method. The genotype of mutant E14.5 embryos was determined by PCR of genomic DNA extracted from cells. ExTaq was activated at 95° C. for 3 minutes; PCR was carried out for 34 cycles: 95° C. for 30 seconds, 62° C. for 30 seconds, 72° C. for 1 minute; and finally at 72° C. for 5 minutes. For embryos, after washing 6 times with KSOM, a single embryo was transferred directly to a PCR tube containing 1.5 μl embryo lysis buffer (0.1% Tween 20, 0.1% Triton X-100 and 4 μg/ml proteinase K) and incubated at 56° C. for 30 minutes, followed by inactivation at 95° C. for 10 minutes. Nested primers were used for PCR amplification. ExTaq was activated at 95° C. for 3 minutes; PCR was carried out for 34 cycles: 95° C. for 30 seconds, 62° C. for 30 seconds, 72° C. for 1 minute; and finally at 72° C. for 5 minutes. The second PCR was performed using 0.5 μg product of the first round PCR and the inner primers, in the same reaction mixture. The PCR product was gel purified and cloned using the pMD-19t cloning kit (Takara) according to the manufacturer's instructions. Colonies were selected from each transformation and then subjected to Sanger sequencing to detect mutations.
TABLE-US-00003
Primer sequence (from top to bottom: SEQ ID NO: 31-36)
Name      Sequence (5′-3′)
Tyr F     GTTATCCTCACACTACTTCTG
Tyr R     GTAATCCTACCAAGAGTCTCA
Tyr-OF    GTCTGTGACACTCATTAACC
Tyr-OR    CATAGGAGGTGCTAACAATAC
Tyr-IF    GTATTGCCTTCTGTGGAGTT
Tyr-IR    TGAACCAATCAGTCCTTGTT
[0122] 7. Fluorescence Activated Cell Sorting (FACS)
[0123] In order to separate the cells, the minced tissue was digested in 5 ml of trypsin-EDTA (0.05%) solution at 37° C. for 30 minutes. The digestion was stopped by adding 5 ml of DMEM medium containing 10% fetal bovine serum (FBS). The fetal tissue was then homogenized by pipetting 30-40 times with a 1 ml pipette tip. The cell suspension was centrifuged for 6 minutes (800 rpm), and the pellet was re-suspended in DMEM medium containing 10% FBS. Finally, the cell suspension was filtered through a 40-μm cell strainer, and tdTomato+/tdTomato− cells were separated by FACS. The sorted samples were evaluated by a second round of flow cytometry and by fluorescence microscopy, and samples with a purity >95% were considered qualified.
[0124] 8. Whole Genome Sequencing and Data Analysis
[0125] Genomic DNA was extracted from the cells using the DNeasy Blood and Tissue Kit (Cat. No. 69504, Qiagen) according to the manufacturer's instructions. WGS was performed on an Illumina HiSeq X Ten at an average coverage of 50×. BWA (v0.7.12) was used to map qualified sequencing reads to the reference genome (mm10). Picard tools (v2.3.0) were then used to sort the mapped BAM files and mark duplicates. In order to identify de novo genome-wide variants with high confidence, three algorithms, Mutect2 (v3.5), Lofreq (v2.1.2) and Strelka (v2.7.1), were used for SNV analysis (25-27). Likewise, Mutect2 (v3.5), Scalpel (v0.5.3) and Strelka (v2.7.1) were used for genome-wide indel detection. Variants called by all three SNV algorithms, or by all three indel algorithms, were considered true variants. Variants were called in the mapped BAM file of the tdTomato+ sample, with the tdTomato− sample from the same embryo as the control, so that only variants present in the tdTomato+ sample but absent from the control could be identified. For example, if the WT allele is G at a certain position, tdTomato+ cells show A, and tdTomato− cells show G at that position, then the A is called a de novo mutation. However, if tdTomato− cells also show A at that position, the variant cannot be identified. In order to further verify that off-target SNVs are identified only in tdTomato+ samples, the inventors also called variants in the tdTomato− sample using the tdTomato+ sample from the same embryo as the control, in which case only variants that are mutated in tdTomato− cells but WT in tdTomato+ cells can be identified.
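The consensus rule and the de novo example described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the inventors' pipeline: each caller's output is assumed to have already been parsed into (chromosome, position, reference allele, alternate allele) records, and the names `Variant`, `consensus_calls` and `is_de_novo` are hypothetical.

```python
from typing import Iterable, NamedTuple, Set

class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str

def consensus_calls(*call_sets: Iterable[Variant]) -> Set[Variant]:
    """Keep only the variants reported by every caller (set intersection)."""
    sets = [set(calls) for calls in call_sets]
    return set.intersection(*sets) if sets else set()

def is_de_novo(wt_allele: str, edited_allele: str, control_allele: str) -> bool:
    """Apply the rule from the example in the text.

    A variant counts as de novo when the edited (tdTomato+) cells differ from
    the WT allele while the control (tdTomato-) cells still show the WT allele.
    """
    return edited_allele != wt_allele and control_allele == wt_allele
```

For the example in the text, `is_de_novo("G", "A", "G")` is true, whereas `is_de_novo("G", "A", "A")` is false because the control cells carry the same allele. Swapping the roles of the two samples gives the reverse comparison described at the end of the paragraph.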
[0126] WGS analysis showed low-level on-target editing (0-6.3%) in tdTomato− cells of the Cas9-Tyr-A and Cas9-Tyr-B groups, which may be caused by false-negative FACS sorting (known to occur at a low level). Therefore, the inventors considered only variants with an allele frequency higher than 10% to be reliable in the subsequent analysis. Variants that overlap with UCSC repeat regions and microsatellite sequences, or that exist in the dbSNP (v138) and MGP (v3) databases, were also marked. All sequencing data are deposited in the NCBI Sequence Read Archive (SRA).
[0127] In order to verify the on-target efficiency, the BAM file was compared with the on-target site using an e-value of 0.0001. Two algorithms, Cas-OFFinder (http://www.rgenome.net/cas-offinder/) and CRISPOR (http://crispor.tefor.net/), were used to predict potential off-target sites.
[0128] SNVs and indels were annotated with annovar (version 2016 Feb. 1) using the RefSeq database. Proto-oncogenes and tumor suppressor genes were retrieved from the UniprotKB/Swiss-Prot database (2018 September). The inventors downloaded 5 ATAC-seq files from the CistromeDB database, all of which were derived from embryos and passed all quality control. The five data sets retrieved include CistromeDB IDs “79877” (GSM2551659), “79976” (GSM2551677), “80493” (GSM2535470), “81049” (GSM2551664) and “81052” (GSM2551667). Based on chromosomal position, the off-target sites were mapped to the peak areas in each file, and the peak areas with and without off-target sites were then compared with each other through the two-sided Wilcoxon rank sum test.
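The allele-frequency cutoff and the marking of known or repeat-associated variants described above can be sketched as a simple filter. The record layout (`Call` with read counts and two boolean flags) is an assumption made for illustration only; how the flags are derived from the UCSC, dbSNP and MGP resources is not shown.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass(frozen=True)
class Call:
    chrom: str
    pos: int
    alt_reads: int             # reads supporting the variant allele
    depth: int                 # total reads covering the position
    in_known_db: bool = False  # hypothetical flag: present in dbSNP/MGP
    in_repeat: bool = False    # hypothetical flag: overlaps repeat/microsatellite

def allele_frequency(call: Call) -> float:
    """Fraction of reads supporting the variant allele (0.0 if uncovered)."""
    return call.alt_reads / call.depth if call.depth else 0.0

def reliable_calls(calls: Iterable[Call], min_af: float = 0.10) -> List[Call]:
    """Keep calls whose allele frequency is strictly higher than min_af."""
    return [call for call in calls if allele_frequency(call) > min_af]

def is_marked(call: Call) -> bool:
    """True for calls that overlap repeats or exist in known-variant databases."""
    return call.in_known_db or call.in_repeat
```

The strict comparison follows the wording "higher than 10%" in the text, so a call supported by exactly 5 of 50 reads would not pass.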
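The step of locating off-target sites within ATAC-seq peak areas can be sketched as an interval lookup that splits peaks into those containing at least one off-target site and those containing none. This is a minimal sketch: the tuple layouts, the half-open coordinate convention and the function name are assumptions, and the per-peak signal values are whatever quantity is being compared.

```python
from bisect import bisect_left
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Peak = Tuple[str, int, int, float]   # (chrom, start, end, signal), half-open [start, end)
Site = Tuple[str, int]               # (chrom, position)

def split_peaks_by_off_target(
    peaks: Iterable[Peak], sites: Iterable[Site]
) -> Tuple[List[float], List[float]]:
    """Return (signals of peaks with an off-target site, signals of peaks without)."""
    positions_by_chrom: Dict[str, List[int]] = defaultdict(list)
    for chrom, pos in sites:
        positions_by_chrom[chrom].append(pos)
    for positions in positions_by_chrom.values():
        positions.sort()

    with_site: List[float] = []
    without_site: List[float] = []
    for chrom, start, end, signal in peaks:
        positions = positions_by_chrom.get(chrom, [])
        # First site at or after the peak start; it is a hit if it lies before the end.
        index = bisect_left(positions, start)
        hit = index < len(positions) and positions[index] < end
        (with_site if hit else without_site).append(signal)
    return with_site, without_site
```

The two returned lists could then be passed to a two-sided rank sum test, for example `scipy.stats.mannwhitneyu(with_site, without_site, alternative="two-sided")` if SciPy is available; the text itself reports the statistics as having been done in R.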
[0129] 9. Simulation of Spontaneous Mutations During Embryonic Development
[0130] In order to estimate the number of spontaneous mutations arising from the 2-cell stage to the E14.5 stage, single nucleotide mutations were simulated in silico, assuming an average sequencing coverage of 40× and an allele frequency threshold of 10%. For each round of simulation, given a mutation rate of 1.8×10⁻¹⁰ and the size of the mouse nuclear genome (2,785,490,220 bp), the replication process from the 2-cell stage to the 16-cell stage was considered. Mutations occurring after the 16-cell stage would not be detected given the allele frequency threshold. During each replication, each cell may or may not acquire a mutation. Once a mutation occurs, the daughter cells inherit it. The accumulated mutant alleles and their wild-type alleles were then randomly sampled for sequencing at a depth of 40. The sampled mutations were summed as the number of spontaneous mutations in each round, and the same process was repeated 10,000 times.
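The simulation described above can be sketched in a few lines. This is a simplified illustration, not the inventors' code, and it rests on several assumptions that the text does not spell out: the mutation rate is taken per base pair per cell division, new mutations per cell are drawn from a Poisson distribution, every mutation is heterozygous, all 16 cells contribute equally to the sequenced sample, and a mutation counts as detected when its sampled allele frequency is at least 10%. All names are hypothetical.

```python
import math
import random
from typing import List

GENOME_SIZE = 2_785_490_220   # mouse nuclear genome size (bp), from the text
MUTATION_RATE = 1.8e-10       # assumed to be per bp per cell division
DEPTH = 40                    # sequencing depth used for sampling
AF_THRESHOLD = 0.10           # allele frequency threshold
DIVISIONS = 3                 # 2 -> 4 -> 8 -> 16 cells

def poisson(rng: random.Random, lam: float) -> int:
    """Draw a Poisson-distributed count (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_round(rng: random.Random) -> int:
    """Count spontaneous mutations that would pass the frequency threshold."""
    expected_per_cell = MUTATION_RATE * GENOME_SIZE
    detected = 0
    cells = 2
    for _ in range(DIVISIONS):
        cells *= 2
        # A heterozygous mutation arising in one of `cells` cells is inherited
        # by all its descendants, so its allele fraction stays at 1 / (2 * cells).
        allele_fraction = 1.0 / (2 * cells)
        new_mutations = sum(poisson(rng, expected_per_cell) for _ in range(cells))
        for _ in range(new_mutations):
            alt_reads = sum(rng.random() < allele_fraction for _ in range(DEPTH))
            if alt_reads / DEPTH >= AF_THRESHOLD:
                detected += 1
    return detected

def simulate(rounds: int = 10_000, seed: int = 0) -> List[int]:
    """Repeat the simulation and return the detected count for each round."""
    rng = random.Random(seed)
    return [simulate_round(rng) for _ in range(rounds)]
```

Calling `simulate()` returns one count per round, from which a mean or distribution of expected spontaneous mutations can be summarized. The resulting numbers depend on the assumptions listed above and should not be read as reproducing the values reported in this disclosure.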
[0131] 10. Digenome-Seq Analysis
[0132] As previously described (32), multiplex Digenome-seq was performed for Cas9-LacZ, Cas9-Pde6b, Cas9-Tyr-A and Cas9-Tyr-B. Specifically, genomic DNA was purified from mouse tails using the TIANamp Genomic DNA Kit (Tiangen) according to the manufacturer's instructions. The sgRNA target site of each gene, including the flanking genomic region, was amplified by PCR. PCR products were purified with the Universal DNA Purification Kit (Tiangen) according to the manufacturer's instructions. The Cas9 protein (1 μg) and sgRNA (1 μg) were pre-incubated for 10 minutes at room temperature to form the RNP complex. The DNA (4 μg) and the RNP complex were then incubated in reaction buffer at 37° C. for 3 hours. After RNase A (100 μg/ml) was added to remove the sgRNA, the digested DNA was purified again with the Universal DNA Purification Kit (Tiangen).
[0133] The library was subjected to whole-genome sequencing (WGS) on an Illumina HiSeq X Ten sequencer at a depth of 30× to 40×. Digenome-seq2 (https://github.com/chizksh/digenome-toolkit2) was used to compute and identify DNA cleavage sites. The in vitro cleavage sites were then classified by edit distance using the R package “Biostrings” and listed.
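As an illustration of classifying cleavage sites by edit distance, the following sketch computes the Levenshtein distance between a target sequence and candidate site sequences. The disclosure used the R package “Biostrings”; this plain-Python function is a stand-in chosen for illustration, and the function name `edit_distance` is hypothetical. The two example sequences are the Tyr-A sequences as listed in Table 5.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (case-insensitive)."""
    a, b = a.upper(), b.upper()
    prev = list(range(len(b) + 1))        # distances from the empty prefix of a
    for i, base_a in enumerate(a, start=1):
        curr = [i]
        for j, base_b in enumerate(b, start=1):
            cost = 0 if base_a == base_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


target = "GCGAAGGCACCGCCCTCTTTTGG"          # Tyr-A on-target sequence (Table 5)
candidates = ["GCGAAGGCACCGCCCTCTTTTGG", "TGGTTCATGCACCCCCCCTTAGG"]
for seq in sorted(candidates, key=lambda s: edit_distance(target, s)):
    print(seq, edit_distance(target, seq))
```

Sorting candidate sites by this distance gives one simple way to separate the on-target site (distance 0) from sites that merely resemble it.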
[0134] 11. Statistical Analysis
[0135] R version 3.5.1 (http://www.R-project.org/) was used for all statistical analyses in this disclosure. All tests were two-sided, and P<0.05 was considered statistically significant.
Example 1. Evaluation of Three Gene Editing Tools
[0136] Three commonly used gene editing tools, CRISPR-Cas9, cytosine base editor 3 (BE3, rAPOBEC1-nCas9-UGI) and adenine base editor 7.10 (ABE7.10, TadA-TadA*-nCas9), were evaluated by GOTI for off-target effects (references 6-8).
[0137] CRISPR-Cas9, BE3 or ABE7.10 together with Cre mRNA and the corresponding sgRNA were injected into one blastomere of 2-cell embryos from Ai9 (CAG-LoxP-Stop-LoxP-tdTomato) mice (References 9-10) (
[0138] At E14.5, the embryos were dissociated and the cells were sorted by FACS according to tdTomato expression. At this stage, the whole embryo can be easily digested to obtain enough single cells (
[0139] The inventors further demonstrated that edited cells treated with the Cre and Cas9/BE3 systems can be effectively separated from unedited cells. After Cre-mediated recombination, about 50% of embryonic cells express tdTomato. This was verified by observing 4-cell or 8-cell stage embryos under a fluorescence microscope and by flow cytometry analysis of E14.5 cells, as shown in
[0140] Whole genome sequencing (WGS) was performed on the separated tdTomato+ and tdTomato− cells. SNVs and indels in the tdTomato+ samples were called with three algorithms, using the tdTomato− samples from the same embryo as the reference.
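The idea of retaining only variants reported by all three algorithms can be sketched with set intersection. This is a simplified illustration, not the inventors' pipeline: it assumes each caller's output has already been reduced to (chromosome, position, reference allele, alternate allele) tuples, and the function name `consensus_variants` is hypothetical. Apart from chr7:87493149 G>A, which appears in Table 3, the example calls are invented.

```python
def consensus_variants(*call_sets):
    """Return the variants reported by every caller."""
    sets = [set(calls) for calls in call_sets]
    return set.intersection(*sets) if sets else set()


# Each variant is (chromosome, position, ref allele, alt allele).
mutect2_calls = {("chr7", 87493149, "G", "A"), ("chr2", 1000, "C", "T")}
lofreq_calls = {("chr7", 87493149, "G", "A"), ("chr3", 2000, "G", "A")}
strelka_calls = {("chr7", 87493149, "G", "A"), ("chr2", 1000, "C", "T")}

high_confidence = consensus_variants(mutect2_calls, lofreq_calls, strelka_calls)
print(sorted(high_confidence))
```

Requiring agreement between callers trades sensitivity for specificity, which corresponds to the "Overlap of 3 methods" columns in Tables 4 and 6.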
[0141] The inventors also verified the editing efficiency of this method when targeting the Tyr gene. To assess the effect of the embryo injection method on whole-genome sequencing results, four sgRNAs were designed for CRISPR/Cas9 editing: Cas9-Tyr-A and Cas9-Tyr-B, which target Tyr; a control sgRNA targeting LacZ, which has no cleavage site in the genome of C57 mice; and an sgRNA targeting Pde6b, which has one mismatch relative to the C57 mouse genome and has been reported to produce a large number of SNVs. The cleavage efficiency of these sgRNAs was verified in vitro by DNA cleavage experiments. The results are shown in
[0142] The inventors also tested two sgRNAs targeting the Tyr gene for BE3-mediated editing. Three groups of embryos, injected with Cre only, Cre and Cas9, or Cre and BE3, were included as control groups. A mixture of CRISPR/Cas9 or BE3, Cre mRNA and sgRNA was injected into one blastomere, and embryo development was found to be unimpaired, as shown by the normal blastocyst rate (
[0143] To further explore the editing efficiency and potential genome-wide off-target effects, whole-genome sequencing was performed at an average depth of 47× on 36 samples from 18 E14.5 embryos covering 9 treatments: Cre only, Cre and Cas9, Cre and Cas9-LacZ, Cre and Cas9-Pde6b, Cre and Cas9-Tyr-A, Cre and Cas9-Tyr-B, Cre and BE3, Cre and BE3-Tyr-C, and Cre and BE3-Tyr-D. Of these, only Cas9-Tyr-A, Cas9-Tyr-B, BE3-Tyr-C and BE3-Tyr-D have target sites in the C57 genome. On-target analysis of Cas9-Tyr-A and Cas9-Tyr-B showed 56% and 72% Tyr allele mutations in tdTomato+ cells, respectively, indicating high on-target efficiency at the Tyr gene. Similarly, BE3-Tyr-C and BE3-Tyr-D both showed highly efficient editing in tdTomato+ cells (with an average of 75% and 92% Tyr allele mutations, respectively), as shown in
[0144] To evaluate off-target effects, three different variant-calling algorithms were used to compare tdTomato+ cells with tdTomato− cells in each embryo, and mutations were analyzed across the whole genome. Only variants called by all three algorithms were regarded as true variants. Only 0-4 indels were found in all 9 groups (
[0145] In addition, when variants were called in the reverse direction, comparing the tdTomato− sample against the tdTomato+ sample of each embryo, the number of SNVs was similar, indicating that CRISPR/Cas9 editing did not produce off-target effects. The SNVs observed by the inventors arose from spontaneous mutations (
[0146] The inventors further designed 12 groups for detection: one Cre group (Cre only), six Cas9 groups with or without sgRNA (Cas9, Cas9-LacZ, Cas9-Pde6b, Cas9-Tyr-A, Cas9-Tyr-B and Cas9-Tyr-C), three BE3 groups with or without sgRNA (BE3, BE3-Tyr-C, BE3-Tyr-D) (Reference 11) and two ABE groups with or without sgRNA (ABE7.10, ABE7.10-Tyr-E).
[0147] The targeting efficiency of embryos at 8-cell and E14.5 stage was verified by Sanger sequencing. In order to further explore the editing efficiency of the target site and potential genome-wide off-target effects, 46 samples from 23 E14.5 embryos were subjected to WGS with an average depth of 47× (Table 1).
TABLE 1 Summary of HiSeq X Ten Sequencing
Sample | Group | Accession | Mapped bases (Gbp) | Coverage
Cre-#1 | tdTomato+ | SRS2549042 | 127.72 | 45.85
Cre-#1 | tdTomato− | SRS2549043 | 130.14 | 46.72
Cre-#2 | tdTomato+ | SRS2549040 | 131.92 | 47.36
Cre-#2 | tdTomato− | SRS2549031 | 131.91 | 47.36
Cas9-#1 | tdTomato+ | SRS2549032 | 124.06 | 44.54
Cas9-#1 | tdTomato− | SRS2549035 | 135.23 | 48.55
Cas9-#2 | tdTomato+ | SRS2604284 | 132.48 | 47.56
Cas9-#2 | tdTomato− | SRS2604285 | 132.56 | 47.59
Cas9-LacZ-#1 | tdTomato+ | SRS2549038 | 128.44 | 46.11
Cas9-LacZ-#1 | tdTomato− | SRS2549039 | 119.22 | 42.80
Cas9-LacZ-#2 | tdTomato+ | SRS2604286 | 127.49 | 45.77
Cas9-LacZ-#2 | tdTomato− | SRS2604287 | 140.58 | 50.47
Cas9-Pde6b-#1 | tdTomato+ | SRS3024198 | 127.20 | 45.67
Cas9-Pde6b-#1 | tdTomato− | SRS3024199 | 122.52 | 43.98
Cas9-Pde6b-#2 | tdTomato+ | SRS3024196 | 131.14 | 47.08
Cas9-Pde6b-#2 | tdTomato− | SRS3024197 | 135.07 | 48.49
Cas9-Tyr-A-#1 | tdTomato+ | SRS2549029 | 133.97 | 48.10
Cas9-Tyr-A-#1 | tdTomato− | SRS2549030 | 130.04 | 46.69
Cas9-Tyr-A-#2 | tdTomato+ | SRS2549033 | 135.19 | 48.53
Cas9-Tyr-A-#2 | tdTomato− | SRS2549034 | 116.56 | 41.85
Cas9-Tyr-B-#1 | tdTomato+ | SRS2549037 | 129.74 | 46.58
Cas9-Tyr-B-#1 | tdTomato− | SRS2549036 | 132.69 | 47.64
Cas9-Tyr-B-#2 | tdTomato+ | SRS2549041 | 139.00 | 49.90
Cas9-Tyr-B-#2 | tdTomato− | SRS2549028 | 134.09 | 48.14
Cas9-Tyr-C | tdTomato+ | | 147.33 | 52.89
Cas9-Tyr-C | tdTomato− | | 147.45 | 52.94
BE3-#1 | tdTomato+ | SRR8169137 | 123.84 | 44.69
BE3-#1 | tdTomato− | SRR8169136 | 128.06 | 46.21
BE3-#2 | tdTomato+ | SRR8169139 | 117.97 | 42.58
BE3-#2 | tdTomato− | SRR8169138 | 128.93 | 46.53
BE3-Tyr-C-#1 | tdTomato+ | SRR8169133 | 150.20 | 54.21
BE3-Tyr-C-#1 | tdTomato− | SRR8169132 | 149.12 | 53.81
BE3-Tyr-C-#2 | tdTomato+ | SRR8169135 | 150.08 | 54.16
BE3-Tyr-C-#2 | tdTomato− | SRR8169134 | 148.89 | 53.73
BE3-Tyr-D-#1 | tdTomato+ | SRR8169131 | 151.27 | 54.59
BE3-Tyr-D-#1 | tdTomato− | SRR8169130 | 151.62 | 54.72
BE3-Tyr-D-#2 | tdTomato+ | SRR8169141 | 143.01 | 51.61
BE3-Tyr-D-#2 | tdTomato− | SRR8169140 | 143.54 | 51.80
ABE7.10-#1 | tdTomato+ | | 133.22 | 47.83
ABE7.10-#1 | tdTomato− | | 115.20 | 41.36
ABE7.10-#2 | tdTomato+ | | 144.31 | 51.81
ABE7.10-#2 | tdTomato− | | 143.07 | 51.36
ABE7.10-Tyr-E-#1 | tdTomato+ | | 130.09 | 46.70
ABE7.10-Tyr-E-#1 | tdTomato− | | 148.97 | 53.48
ABE7.10-Tyr-E-#2 | tdTomato+ | | 148.18 | 53.20
ABE7.10-Tyr-E-#2 | tdTomato− | | 133.12 | 47.79
[0148] The activities of Cas9, BE3 and ABE7.10 in tdTomato+ cells were confirmed by the high indel and SNV rates at the targeted sites (
TABLE 2 WGS identification of SNVs and indels in each embryo
Sample | On-target mutations | Off-target SNVs | Off-target indels | Exon off-target SNVs | Exon off-target indels | Nonsynonymous off-target SNVs | Frameshift off-target indels
Cre-#1 | 0 | 2 | 0 | 0 | 0 | 0 | 0
Cre-#2 | 0 | 26 | 3 | 0 | 0 | 0 | 0
Cas9-#1 | 0 | 22 | 0 | 0 | 0 | 0 | 0
Cas9-#2 | 0 | 3 | 1 | 0 | 0 | 0 | 0
Cas9-LacZ-#1 | 0 | 8 | 0 | 0 | 0 | 0 | 0
Cas9-LacZ-#2 | 0 | 2 | 1 | 0 | 0 | 0 | 0
Cas9-Pde6b-#1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Cas9-Pde6b-#2 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Cas9-Tyr-A-#1 | 1 | 22 | 0 | 1 | 0 | 0 | 0
Cas9-Tyr-A-#2 | 1 | 10 | 0 | 0 | 0 | 0 | 0
Cas9-Tyr-B-#1 | 1 | 5 | 2 | 0 | 0 | 0 | 0
Cas9-Tyr-B-#2 | 1 | 33 | 0 | 4 | 0 | 2 | 0
Cas9-Tyr-C | 2 | 31 | 4 | 1 | 1 | 1 | 0
BE3-#1 | 0 | 277 | 1 | 3 | 0 | 2 | 0
BE3-#2 | 0 | 137 | 0 | 4 | 0 | 2 | 0
BE3-Tyr-C-#1 | 1 | 320 | 1 | 3 | 0 | 0 | 0
BE3-Tyr-C-#2 | 1 | 356 | 4 | 6 | 0 | 4 | 0
BE3-Tyr-D-#1 | 1 | 332 | 1 | 6 | 0 | 4 | 0
BE3-Tyr-D-#2 | 1 | 277 | 0 | 4 | 0 | 2 | 0
ABE7.10-#1 | 0 | 1 | 1 | 0 | 0 | 0 | 0
ABE7.10-#2 | 0 | 1 | 1 | 0 | 0 | 0 | 0
ABE7.10-Tyr-E-#1 | 1 | 17 | 3 | 0 | 0 | 0 | 0
ABE7.10-Tyr-E-#2 | 3 | 21 | 2 | 0 | 0 | 0 | 0
*The sgRNA for Pde6b has one mismatch with the C57 genome (3), so there were no on-target sites.
#Two types of on-target variants, shown in FIG. S4.
TABLE-US-00006 TABLE 3 Mutect2 Scalpel Strelka Mutant (Mut/Total (Mut/Total (Mut/Total Manual Indels positions Mutant reads) reads) reads) realignment Cas9-Tyr-A-#1 chr7:87438074 CCAAAAGAGGG 16/36 11/29 15/34 15/33 (deletion) Cas9-Tyr-A-#2 chr7:87498083 TCAT 13/37 13/32 15/35 13/31 (insertion) Cas9-Tyr-B-#1 chr7:87498085 GATAG 14/43 12/32 11/29 12/34 (deletion) Cas9-Tyr-B-#2 chr7:87498054 TGC (deletion) 13/23 10/23 11/22 11/22 Cas9-Tyr-C chr7:87493149 CTTTGAAGGGGAA 44/45 32/32 45/48 44/45 (deletion) Mutect2 Scalpel Strelka Mutant (Mut/Total (Mut/Total (Mut/Total Manual Indels positions Mutant reads) reads) reads) realignment BE3-Tyr-C-#1 chr7:87493149 G>A 13/15 33/35 30/31 32/34 BE3-Tyr-C-#2 chr7:87493149 G>A 17/36 17/30 19/40 22/40 BE3-Tyr-D-#1 chr7:87492721, C>T; C>T 13/28; 15/28 12/28; 15/34; chr7:87492722 29/29 28/28 34/34 BE3-Tyr-D-#2 chr7:87492722 C>T 10/12 16/17 11/14 10/12 ABE7.10-Tyr-E-#1 chr7:87438041, G>A; G>A 7/34; 7/34; 7/34 7/34; 11/39; chr7:87438042 7/34 7/34 11/39 ABE7.10-Tyr-E-#2 chr7:87438041, G>A; G>A 20/31; 19/29 14/29; 24/37; chr7:87438042, G>A; G>A 20/31; 17/29; 21/37; chr7:87438044, 10/32; 10/32; 15/32; chr7:87438039 9/31 7/29 9/37
[0149] As for off-target effects, the inventors found that there were only 0-4 indels in embryos from all 12 groups (Tables 2 and 4), and none of them overlapped with predicted off-target sites (Table 5).
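The check that detected variants do not fall within predicted off-target sites can be sketched as a simple coordinate comparison. This is an assumed, simplified procedure: each predicted site is represented by a chromosome and a start coordinate, the site length is taken as 23 nt (a 20-nt protospacer plus a 3-nt PAM, matching the 23-nt sequences in Table 5), and the function names are hypothetical. The site coordinates are taken from Table 5 on the assumption that they mark site starts; the variant positions are invented.

```python
SITE_LENGTH = 23  # 20-nt protospacer plus 3-nt PAM (assumed site length)


def hits_predicted_site(chrom, pos, predicted_sites, site_length=SITE_LENGTH):
    """True if (chrom, pos) lies inside any predicted off-target site."""
    return any(
        chrom == site_chrom and start <= pos < start + site_length
        for site_chrom, start in predicted_sites
    )


def split_by_prediction(variants, predicted_sites):
    """Partition variants into those inside and outside predicted sites."""
    inside = [v for v in variants if hits_predicted_site(v[0], v[1], predicted_sites)]
    outside = [v for v in variants if not hits_predicted_site(v[0], v[1], predicted_sites)]
    return inside, outside


predicted = [("chr8", 70679420), ("chr2", 11906262)]   # coordinates as listed in Table 5
detected = [("chr8", 70679425), ("chr5", 123456)]      # hypothetical variant positions
inside, outside = split_by_prediction(detected, predicted)
print("inside predicted sites:", inside)
print("outside predicted sites:", outside)
```

With many predicted sites, an interval index or a sorted list searched by bisection would replace the linear scan used here.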
TABLE 4
Sample | Mutect2 | Scalpel | Strelka | Mutect2 vs Scalpel | Mutect2 vs Strelka | Scalpel vs Strelka | Overlap of 3 methods
Cre-#1 | 107 | 11400 | 4930 | 4 | 6 | 462 | 0
Cre-#2 | 118 | 8929 | 4665 | 6 | 4 | 379 | 3
Cas9-#1 | 98 | 10854 | 4378 | 6 | 0 | 357 | 0
Cas9-#2 | 64 | 10253 | 5703 | 6 | 2 | 434 | 1
Cas9-LacZ-#1 | 131 | 11941 | 4746 | 5 | 3 | 401 | 0
Cas9-LacZ-#2 | 57 | 9394 | 5338 | 2 | 3 | 398 | 1
Cas9-Pde6b-#1 | 137 | 12285 | 4687 | 3 | 4 | 443 | 0
Cas9-Pde6b-#2 | 125 | 12313 | 5397 | 7 | 4 | 505 | 0
Cas9-Tyr-A-#1 | 75 | 12348 | 5180 | 3 | 5 | 464 | 0
Cas9-Tyr-A-#2 | 81 | 11993 | 5480 | 3 | 4 | 471 | 0
Cas9-Tyr-B-#1 | 117 | 10659 | 4734 | 4 | 5 | 427 | 2
Cas9-Tyr-B-#2 | 70 | 9015 | 4791 | 2 | 0 | 447 | 0
Cas9-Tyr-C | 287 | 21965 | 4539 | 13 | 14 | 828 | 4
BE3-#1 | 280 | 10654 | 3826 | 3 | 13 | 397 | 1
BE3-#2 | 269 | 10729 | 4176 | 1 | 10 | 432 | 0
BE3 + Tyr-C-#1 | 289 | 14614 | 5502 | 9 | 9 | 607 | 1
BE3 + Tyr-C-#2 | 259 | 14418 | 5111 | 7 | 9 | 606 | 4
BE3 + Tyr-D-#1 | 273 | 14585 | 5510 | 4 | 13 | 590 | 1
BE3 + Tyr-D-#2 | 268 | 12240 | 5240 | 4 | 7 | 518 | 0
ABE7.10-#1 | 284 | 53199 | 3662 | 25 | 5 | 1501 | 1
ABE7.10-#2 | 250 | 16468 | 3343 | 5 | 4 | 525 | 1
ABE7.10-Tyr-E-#1 | 283 | 90132 | 4684 | 30 | 7 | 2531 | 3
ABE7.10-Tyr-E-#2 | 238 | 32903 | 4378 | 16 | 5 | 1029 | 2
TABLE-US-00008 TABLE 5 Digenome Chr Position score DNA sequence Sample-#1 Tyr-A_1 chr7 87438083 205.585443 GCGAAGGCACCGCCCTCTTTTGG (On-target site) Tyr-A_2 chr8 70679420 87.5901275 TGGTTCATGCACCCCCCCTTAGG Tyr-A_3 chr2 11906262 12.9829391 catgtatagcagtgtgccagaag Tyr-A_4 chr6 94012018 5.738594479 CTATGGGAGGAGGTAACTAAGCG Tyr-B_1 chr5 1.22E+08 39.98904854 AAGAGGGCGGTGCTAAGATGGGG Tyr-B_2 chrX 1.12E+08 21.24457577 AGGTACATAGGCTTCATATCAGG Tyr-B_3 chr11 1.14E+08 8.274157156 CCCATGGGGAACACTCCTGGGGG Tyr-B_4 chr11 31846521 8.198383346 ACAAGCAAGTGTTGGTCCATAGG Tyr-B_5 chr11 1.14E+08 8.861354096 CCCATGGGGAACACTCCTGGGGG Tyr-B_6 chrX 87640980 5.514465963 CAAAAGGAGCAATTTCCAATAGG Tyr-B_7 chr7 87438053 4.826363636 CCGATAGGTGCATTGGCTTCTGG (On-target site) Tyr-B_8 chr1 23481074 4.209876693 ATATAAGTTAACATCCCAAAAGG Tyr-B_9 chr11 95292492 3.644424083 TATTGGGTGTCATCTCTTTCTCC Tyr-B_10 chr1 1.28E+08 3.544329556 CCCAAGACATGCACACCGATAGG Tyr-B_11 chr6 68111031 2.614949838 caagaCATAAAACATACCTAAAg LacZ_1 chr2 32395622 43.46541216 TTCGGCTTCGGGGCGGGGTCAAG LacZ_2 chr13 54153138 37.98678846 TAATGGTGCTGACTGCTATGAGG Pde6b_1 chr10 16088519 65.24995196 ATTACAATTAtttatgcctatag Pde6b_2 chr1 88276189 5.989287063 CTACTGCATGTTAGGAAAGGCCG Sample-#2 Tyr-A_1 chr8 70679420 100.166815 TGGTTCATGCACCCCCCCTTAGG (On-target site) Tyr-A_2 chr7 87438083 80.04553734 GCGAAGGCACCGCCCTCTTTTGG Tyr-A_3 chr2 32395622 52.38775481 AGAGGGCGGGGCCTTATAGTGGG Tyr-A_4 chr10 16088519 48.26614325 catgaagccaaaacacctatagg Tyr-A_5 chr2 11906264 20.65930936 catgtatagcagtgtgccagaag Tyr-A_6 chr9 73142622 5.706386646 tcttctggtgtgtctaaagacag Tyr-A_7 chr6 94012018 4.735788874 CTATGGGAGGAGGTAACTAAGCG Tyr-B_1 chr5 1.22E+08 53.80789887 AAGAGGGCGGTGCTAAGATGGGG (On-target site) Tyr-B_2 chr7 87438053 48.19727891 AAGAGGGCGGTGCTAAGATGGGG Tyr-B_3 chr11 1.14E+08 12.7891659 CCGATAGGTGCATTGGCTTCTGG Tyr-B_4 chrX 1.12E+08 8.13690641 CCCATGGGGAACACTCCTGGGGG Tyr-B_5 chr11 3184621 7.883665333 ACAAGCAAGTGTTGGTCCATAGG Tyr-B_6 chr16 24592641 7.196863075 
CTATAGGCTTTGAACTGTCAGGG Tyr-B_7 chr1 23481074 4.891318316 ATATAAGTTAACATCCCAAAAGG Tyr-B_8 chr15 88729863 2.849386317 ATTCGGGCACAGCACGCAATCCG LacZ_1 chr13 54153138 18.86615566 TAATGGTGCTGACTGCTATGAGG LacZ_2 chr17 57065755 4.891340168 AGAGGGTGTTGCCTTCCCACGGG Pde6b_1 chr4 69960267 7.637319157 ACCTTTGGGTCCTGGGAAGGATG
[0150] For all Cas9-edited embryos, there were no significant differences in SNVs between the different Cas9 groups (an average of 12 SNVs per embryo), and there was no significant difference compared with the “Cre” group (an average of 14 SNVs per embryo) (Table 2).
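The group averages quoted in the text can be cross-checked against the off-target SNV row of Table 2. The sketch below is only an arithmetic check; the assignment of the flattened Table 2 values to samples is this reading of the table and should be treated as an assumption. The BE3 and ABE7.10 groups are included for comparison.

```python
from statistics import mean

# Off-target SNV counts per embryo, as read from Table 2 (column order assumed).
off_target_snvs = {
    "Cre": [2, 26],
    "Cas9": [22, 3, 8, 2, 0, 0, 22, 10, 5, 33, 31],   # all Cas9 groups, incl. Cas9-Tyr-C
    "BE3": [277, 137, 320, 356, 332, 277],
    "ABE7.10": [1, 1, 17, 21],
}

group_means = {group: mean(counts) for group, counts in off_target_snvs.items()}
fold_be3_vs_cre = group_means["BE3"] / group_means["Cre"]

for group, value in group_means.items():
    print(f"{group}: {value:.1f} SNVs per embryo")
print(f"BE3 vs Cre fold change: {fold_be3_vs_cre:.1f}")
```

Under this reading, the Cre and Cas9 means round to the 14 and 12 SNVs per embryo stated in the text, and the BE3 mean rounds to 283.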
[0151] The SNVs detected in the samples treated with Cre or Cas9 are likely caused by spontaneous mutations arising during genome replication in development, because the number of SNVs detected here is within the range of simulated spontaneous mutations and the sequences adjacent to the SNVs showed no similarity to the target sites (Ref 12).
[0152] Surprisingly, the inventors found an average of 283 SNVs per embryo in BE3-edited embryos, at least 20 times the level observed in embryos treated with Cre or Cas9 (
TABLE 6
SNVs | Mutect2 | Lofreq | Strelka | Mutect2 vs Lofreq | Mutect2 vs Strelka | Lofreq vs Strelka | Overlap of 3 methods
Cre-#1 | 527 | 66 | 865 | 4 | 21 | 8 | 3
Cre-#2 | 379 | 109 | 1494 | 14 | 42 | 48 | 12
Cas9-#1 | 420 | 146 | 1161 | 7 | 29 | 48 | 7
Cas9-#2 | 416 | 107 | 1276 | 13 | 38 | 56 | 8
Cas9-LacZ-#1 | 634 | 80 | 1111 | 3 | 30 | 17 | 1
Cas9-LacZ-#2 | 604 | 68 | 1349 | 8 | 25 | 49 | 6
Cas9-Pde6b-#1 | 549 | 51 | 633 | 5 | 21 | 3 | 0
Cas9-Pde6b-#2 | 273 | 65 | 751 | 3 | 38 | 2 | 0
Cas9-Tyr-A-#1 | 3781 | 160 | 2057 | 47 | 374 | 104 | 36
Cas9-Tyr-A-#2 | 230 | 68 | 778 | 9 | 16 | 25 | 8
Cas9-Tyr-B-#1 | 549 | 91 | 1009 | 14 | 35 | 38 | 13
Cas9-Tyr-B-#2 | 1421 | 100 | 1391 | 16 | 106 | 51 | 14
BE3-#1 | 953 | 66 | 722 | 17 | 34 | 20 | 15
BE3-#2 | 968 | 75 | 807 | 23 | 43 | 24 | 19
BE3-Tyr-C-#1 | 602 | 106 | 1059 | 18 | 43 | 32 | 12
BE3-Tyr-C-#2 | 671 | 102 | 1019 | 24 | 42 | 35 | 19
BE3-Tyr-D-#1 | 667 | 136 | 1128 | 33 | 58 | 55 | 30
BE3-Tyr-D-#2 | 1261 | 64 | 1526 | 13 | 67 | 20 | 7
Indels | Mutect2 | Scalpel | Strelka | Mutect2 vs Scalpel | Mutect2 vs Strelka | Scalpel vs Strelka | Overlap of 3 methods
Cre-#1 | 134 | 12372 | 4380 | 1 | 383 | 428 | 3
Cre-#2 | 125 | 9368 | 5162 | 2 | 7 | 6 | 0
Cas9-#1 | 177 | 10771 | 4342 | 14 | 4 | 393 | 1
Cas9-#2 | 83 | 9532 | 3975 | 11 | 6 | 394 | 2
Cas9-LacZ-#1 | 108 | 10849 | 4097 | 0 | 3 | 342 | 0
Cas9-LacZ-#2 | 68 | 10438 | 3886 | 3 | 5 | 317 | 1
Cas9-Pde6b-#1 | 255 | 4145 | 3335 | 8 | 7 | 256 | 0
Cas9-Pde6b-#2 | 215 | 3124 | 3079 | 7 | 6 | 255 | 0
Cas9-Tyr-A-#1 | 85 | 10913 | 4795 | 5 | 8 | 371 | 4
Cas9-Tyr-A-#2 | 78 | 8477 | 3953 | 4 | 2 | 459 | 1
Cas9-Tyr-B-#1 | 128 | 12457 | 4965 | 5 | 5 | 405 | 2
Cas9-Tyr-B-#2 | 79 | 10925 | 4751 | 4 | 5 | 387 | 1
BE3-#1 | 279 | 11847 | 4127 | 7 | 4 | 400 | 1
BE3-#2 | 280 | 12215 | 4434 | 4 | 2 | 440 | 1
BE3-Tyr-C-#1 | 240 | 14395 | 5223 | 4 | 10 | 545 | 1
BE3-Tyr-C-#2 | 264 | 15901 | 5518 | 5 | 7 | 617 | 0
BE3-Tyr-D-#1 | 291 | 14952 | 5487 | 2 | 8 | 606 | 1
BE3-Tyr-D-#2 | 263 | 12703 | 5431 | 4 | 6 | 517 | 1
[0153] The off-target SNVs detected in the BE3 samples were not shared among groups and were randomly distributed throughout the genome. The inventors then compared these off-target mutations with all potential off-target sites predicted by the Cas-OFFinder and CRISPOR software. Not surprisingly, the two prediction tools predicted a large number of off-target sites, but none of them appeared among the SNVs detected by the inventors. In addition, there was no sequence similarity between the sequences adjacent to the identified SNVs and the BE3 sgRNA target sites, whereas the top predicted off-target sites were similar in sequence to the BE3 target sites. It is worth noting that although the SNVs produced by BE3 editing were unique, their mutation type was consistent with that of APOBEC1.
[0154] It is noted that more than 90% of the SNVs identified in the BE3 edited cells were mutations from G to A or from C to T, and no mutation preference was observed in Cre-, Cas9- or ABE7.10-treated cells (
[0155] It has been reported that DNA accessibility is related to the efficiency of gene editing. Therefore, the inventors evaluated ATAC-seq data sets from mouse embryonic cells in the Cistrome database to determine whether off-target sites are enriched in open chromatin regions. Indeed, in E8.5 embryos with a mixed C57BL6/DBA2 background and in the four high-quality data sets from the Cistrome database, off-target sites were significantly enriched in regions with higher accessibility (
[0156] In addition, no sequence similarity was observed between the off-target sites and the target sites, whereas the computationally predicted off-target sites showed high sequence similarity to the BE3 target sites. Therefore, BE3 off-target SNVs are sgRNA-independent and may be caused by overexpression of APOBEC1.
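The mutation-type bias attributed to APOBEC1 can be illustrated by tallying substitution types. The sketch below is not the inventors' analysis; it uses only the 26 exonic BE3 SNVs listed in Table 7 as input, so its result concerns that subset rather than the full genome-wide set discussed in the text. The function names are hypothetical.

```python
from collections import Counter

# Substitution types (ref, alt) of the 26 exonic BE3 SNVs, read from Table 7.
table7_snvs = (
    [("G", "A"), ("C", "T"), ("G", "A")]                                        # BE3-#1
    + [("C", "T"), ("G", "C"), ("G", "A"), ("C", "T")]                          # BE3-#2
    + [("G", "A"), ("C", "T"), ("C", "T")]                                      # BE3-Tyr-C-#1
    + [("G", "A"), ("C", "T"), ("C", "T"), ("G", "A"), ("G", "A"), ("C", "T")]  # BE3-Tyr-C-#2
    + [("C", "T"), ("G", "A"), ("G", "A"), ("A", "C"), ("C", "T"), ("C", "T")]  # BE3-Tyr-D-#1
    + [("C", "T"), ("C", "T"), ("G", "A"), ("G", "A")]                          # BE3-Tyr-D-#2
)


def mutation_spectrum(snvs):
    """Fraction of SNVs per substitution type, keyed as 'REF>ALT'."""
    counts = Counter(f"{ref}>{alt}" for ref, alt in snvs)
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()} if total else {}


def deaminase_like_fraction(snvs):
    """Fraction of SNVs that are C>T or G>A, the cytosine deamination pattern."""
    if not snvs:
        return 0.0
    return sum((ref, alt) in {("C", "T"), ("G", "A")} for ref, alt in snvs) / len(snvs)


print(mutation_spectrum(table7_snvs))
print(f"C>T or G>A fraction: {deaminase_like_fraction(table7_snvs):.1%}")
```

G>A is counted together with C>T because a C>T change on one strand is read as G>A on the other.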
[0157] Among the 1698 SNVs induced by BE3, 26 were located on exons, and 14 of them caused non-synonymous changes. The inventors successfully amplified 20 SNVs by PCR, and confirmed their existence by Sanger sequencing (Table 7).
TABLE-US-00010 TABLE 7 Alt Ref Alt Ref Allele Sanger Mutant Type Gene reads reads reads reads frequency dbSNP Repeats PCR sequeuce BE3-#1 chr2 p.V2987M/c.119964795G > A exonic Mga 11 20 0 39 35.48% Y Y chr2 p.D419N/c.140158610C > T exonic Esf1 21 8 0 28 72.41% Y Y chr4 p.L376L/c.128589747G > A exonic Zscan20 18 20 0 44 47.37% Y Y BE3-#2 chr15 p.P15F/c.80091438C > T exonic Syngr1 13 19 0 36 40.63% N N chr19 p.P184P/c.60756817G > C exonic Nanos1 6 30 0 40 16.67% N N chr1 p.E488K/c.140507758G > A exonic Kcnt2 8 31 0 28 20.51% Y Y chr3 p.E59K/c.96708345C > T exonic Nudt17 14 23 0 41 37.84% N N BE3-Tyr-C-#1 chr11 p.F1507F/c.110030023G > A exonic Abca8a 12 23 0 35 34.29% Y Y chr3 p.F314F/c.93826961C > T exonic Tdpoz3 24 22 0 48 52.17% Y Y Y chr7 p.Q21Q/c.127920229C > T exonic Pnt2 18 24 0 35 42.86% Y Y BE3-Tyr-C-#2 chr10 p.D627N/c.45158272G > A exonic Prep 22 21 0 49 51.16% Y Y chr11 p.L29L/c.35833265C > T exonic Rars 17 23 0 49 42.50% Y Y chr13 p.G230G/c.63545050C > T exonic Ptch1 27 17 0 33 61.36% Y Y chr13 p.Q282X/c.104189738G > A exonic Trim23 21 23 0 41 47.73% Y Y chr16 p.E3404K/c.15809689G > A exonic Prkdc 15 18 0 41 45.45% Y Y chr1 p.Q202X/c.173462096C > T exonic Aim2 18 33 0 33 35.29% Y Y BE3-Tyr-D-#1 chr11 p.F311F/c.73354687C > T exonic Olfr20 25 15 0 27 62.50% Y Y chr19 p.E33K/c.38396211G > A exonic Slc35g1 21 22 0 35 48.84% N N chr1 p.F22F/c.60094502G > A exonic Wdr12 13 26 0 34 33.33% Y Y Y chr1 p.H346P/c.173683317A > C exonic Ifi208 7 58 1 48 10.77% Y Y N N chr6 p.D401N/c.145862884C > T exonic Bhlhe41 17 9 0 38 65.38% N N chr7 p.E421K/c.104265600C > T exonic Trim5 19 1 0 21 95.00% Y Y Y BE3-Tyr-D-#2 chr14 p.L115L/c.73568707C > T exonic Sucla2 14 11 0 36 56.00% Y Y chr2 p.E2105E/c.26460812C > T exonic Notch1 11 41 0 40 21.15% Y Y chr2 p.E872K/c.28685723G > A exonic Tsc1 9 29 0 33 23.68% Y Y chr8 p.E196K/c.11785830G > A exonic Arhgef7 14 22 0 37 38.89% Y Y
[0158] Among the 26 SNVs, 14 caused non-synonymous changes in the encoded protein, and 2 caused premature termination, in the Trim23 and Aim2 genes. Trim23 encodes an E3 ubiquitin ligase whose dysfunction can lead to muscular dystrophy. Previous studies reported that the Aim2 gene plays an important role in innate immunity and underlies the defense against viral infections. The inventors also found one SNV in a proto-oncogene and 13 SNVs in tumor suppressor genes, which raises serious concern about the carcinogenic risk of BE3 editing (
TABLE 8
ID | Mutation | WT | Total | Frequency (%) | Dose | sgRNA
A1 | 8 | 7 | 15 | 53.33 | 50 | Tyr-C
A4 | 8 | 4 | 12 | 66.67 | 50 | Tyr-C
A6 | 11 | 4 | 15 | 73.33 | 50 | Tyr-C
A8 | 7 | 8 | 15 | 46.67 | 50 | Tyr-C
A9 | 10 | 1 | 11 | 90.91 | 50 | Tyr-C
A12 | 11 | 0 | 11 | 100 | 50 | Tyr-C
A13 | 12 | 3 | 15 | 80 | 50 | Tyr-C
A14 | 11 | 3 | 14 | 78.57 | 50 | Tyr-C
A16 | 6 | 8 | 14 | 42.86 | 50 | Tyr-C
A18 | 10 | 3 | 13 | 76.92 | 50 | Tyr-C
A19 | 9 | 5 | 14 | 64.29 | 50 | Tyr-C
G1 | 6 | 9 | 15 | 40 | 20 | Tyr-C
G2 | 1 | 13 | 14 | 7.14 | 20 | Tyr-C
G3 | 0 | 13 | 13 | 0 | 20 | Tyr-C
G4 | 2 | 12 | 14 | 14.29 | 20 | Tyr-C
G5 | 0 | 15 | 15 | 0 | 20 | Tyr-C
G6 | 5 | 10 | 15 | 33.33 | 20 | Tyr-C
G7 | 3 | 11 | 14 | 21.43 | 20 | Tyr-C
G8 | 4 | 9 | 13 | 30.77 | 20 | Tyr-C
G9 | 5 | 8 | 13 | 38.46 | 20 | Tyr-C
G10 | 4 | 9 | 13 | 30.77 | 20 | Tyr-C
G11 | 2 | 12 | 14 | 14.29 | 20 | Tyr-C
G12 | 3 | 9 | 12 | 25 | 20 | Tyr-C
B2 | 0 | 12 | 12 | 0 | 10 | Tyr-C
B3 | 4 | 9 | 13 | 30.77 | 10 | Tyr-C
B4 | 5 | 7 | 12 | 41.67 | 10 | Tyr-C
B7 | 0 | 13 | 13 | 0 | 10 | Tyr-C
B9 | 1 | 14 | 15 | 6.67 | 10 | Tyr-C
B10 | 0 | 12 | 12 | 0 | 10 | Tyr-C
B11 | 4 | 9 | 13 | 30.77 | 10 | Tyr-C
B12 | 1 | 12 | 13 | 7.69 | 10 | Tyr-C
B13 | 3 | 8 | 11 | 27.27 | 10 | Tyr-C
B14 | 0 | 12 | 12 | 0 | 10 | Tyr-C
C2 | 0 | 12 | 12 | 0 | 2 | Tyr-C
C3 | 1 | 8 | 9 | 11.11 | 2 | Tyr-C
C5 | 0 | 12 | 12 | 0 | 2 | Tyr-C
C7 | 1 | 13 | 14 | 7.14 | 2 | Tyr-C
C8 | 2 | 12 | 14 | 14.29 | 2 | Tyr-C
C10 | 0 | 13 | 13 | 0 | 2 | Tyr-C
C13 | 0 | 14 | 14 | 0 | 2 | Tyr-C
C14 | 0 | 15 | 15 | 0 | 2 | Tyr-C
C15 | 0 | 8 | 8 | 0 | 2 | Tyr-C
C17 | 0 | 9 | 9 | 0 | 2 | Tyr-C
C18 | 0 | 11 | 11 | 0 | 2 | Tyr-C
D2-1 | 11 | 2 | 13 | 84.62 | 50 | Tyr-D
D2-3 | 12 | 0 | 12 | 100 | 50 | Tyr-D
D2-6 | 10 | 4 | 14 | 71.43 | 50 | Tyr-D
D2-8 | 10 | 2 | 12 | 83.33 | 50 | Tyr-D
D2-9 | 15 | 0 | 15 | 100 | 50 | Tyr-D
D2-10 | 9 | 2 | 11 | 81.82 | 50 | Tyr-D
D2-11 | 7 | 5 | 12 | 58.33 | 50 | Tyr-D
D2-13 | 7 | 2 | 9 | 77.78 | 50 | Tyr-D
D10 | 8 | 2 | 10 | 80 | 50 | Tyr-D
H1 | 7 | 6 | 13 | 53.35 | 20 | Tyr-D
H2 | 9 | 5 | 14 | 64.29 | 20 | Tyr-D
H3 | 1 | 14 | 15 | 6.67 | 20 | Tyr-D
H4 | 3 | 12 | 15 | 20 | 20 | Tyr-D
H5 | 5 | 9 | 14 | 35.71 | 20 | Tyr-D
H6 | 4 | 10 | 14 | 28.57 | 20 | Tyr-D
H7 | 5 | 10 | 15 | 33.33 | 20 | Tyr-D
H8 | 4 | 10 | 14 | 28.57 | 20 | Tyr-D
H9 | 6 | 5 | 11 | 54.55 | 20 | Tyr-D
H10 | 11 | 4 | 15 | 73.33 | 20 | Tyr-D
E2-3 | 0 | 12 | 12 | 0 | 10 | Tyr-D
E2-5 | 2 | 10 | 12 | 16.67 | 10 | Tyr-D
E2-6 | 1 | 9 | 10 | 10 | 10 | Tyr-D
E2-7 | 8 | 2 | 10 | 80 | 10 | Tyr-D
E2-8 | 9 | 3 | 12 | 75 | 10 | Tyr-D
E2-9 | 6 | 6 | 12 | 50 | 10 | Tyr-D
E2-10 | 4 | 6 | 10 | 40 | 10 | Tyr-D
E2-11 | 1 | 10 | 11 | 9.09 | 10 | Tyr-D
E2-12 | 11 | 2 | 13 | 84.62 | 10 | Tyr-D
E2-13 | 1 | 11 | 12 | 8.33 | 10 | Tyr-D
E2-14 | 6 | 6 | 12 | 50 | 10 | Tyr-D
F2-9 | 2 | 9 | 11 | 18.18 | 2 | Tyr-D
F2-11 | 7 | 7 | 14 | 50 | 2 | Tyr-D
F3 | 2 | 11 | 13 | 15.38 | 2 | Tyr-D
F2-4 | 0 | 14 | 14 | 0 | 2 | Tyr-D
F2-5 | 4 | 8 | 12 | 33.33 | 2 | Tyr-D
F6 | 0 | 13 | 13 | 0 | 2 | Tyr-D
F8 | 0 | 13 | 13 | 0 | 2 | Tyr-D
F14 | 3 | 8 | 11 | 27.27 | 2 | Tyr-D
F15 | 1 | 10 | 11 | 9.09 | 2 | Tyr-D
F19 | 0 | 12 | 12 | 0 | 2 | Tyr-D
F22 | 3 | 12 | 15 | 20 | 2 | Tyr-D
F28 | 0 | 12 | 12 | 0 | 2 | Tyr-D
[0159] A major advantage of the method of the present disclosure is that edited and unedited cells can be compared within one animal, eliminating differences in genetic background. Comparisons between edited and unedited animals in previous studies were unreliable because of such differences. In fact, the inventors also applied this method to a published data set and found an average of about 1000 SNVs and about 100 indels between CRISPR/Cas9-edited and unedited mice. Based on this finding, the inventors believe that the differences between siblings are due to genetic variation rather than to CRISPR/Cas9 editing. In addition, when the sequences of any two different embryos were compared, more SNVs (3706±5232) and indels (583±762) (n=18 pairs) were found because the embryos used were not from the same parents. These results indicate that, even if the mice have the same parents, it is difficult to find a truly blank control for comparing edited mice with unedited mice in an off-target analysis, owing to the large amount of genetic variation among mice.
[0160] In sum, the present disclosure demonstrates the advantage of GOTI in studying off-target effects caused by gene editing, namely that whole-genome sequencing is performed on daughter cells of the same embryo. The inventors also found that undesirable off-target mutations caused by CRISPR/Cas9-mediated gene editing are rare in mouse embryos. This is consistent with previous studies showing that in vivo editing based on CRISPR/Cas9 does not cause significant numbers of SNVs and indels. However, the large deletions or chromosomal translocations reported in other studies cannot be ruled out. In contrast, the present disclosure discovers many new SNVs caused by BE3 editing, which raises concerns about the safety of base editing in therapeutic applications.
[0161] The inventors found that BE3 induced many new SNVs, which had not been reported in previous studies. A possible explanation is that GOTI examines a cell population derived from a single gene-edited blastomere, whereas previous studies used large pools of cells in which editing outcomes differ, so that random off-target signals are lost through population averaging. Unlike BE3, ABE7.10 induced no increase in SNVs, which may be due to the lack of DNA binding ability of TadA (Ref. 17). The off-target effect of BE3 might be resolved by reducing the DNA binding capacity of APOBEC1 or by using a different cytosine deaminase. In short, GOTI avoids interference from SNPs among different individuals and can be used to examine the off-target effects of various gene editing tools.
Example 2. The Effect of APOBEC1 Enzyme on Off-Target Effects
[0162] As disclosed above, the single-base editing tool BE3 causes a large number of off-target single-nucleotide variants (SNVs). The inventors hypothesized that these off-target variants are caused by the overexpression of APOBEC1 and its binding to single-stranded DNA (ssDNA). Nevertheless, single-base gene editing tools (BEs) have been widely used in single-base mutation research and have the potential to correct pathogenic mutations. In this example, the inventors tested whether the off-target problem of BE3 can be solved so that disease-related target Cs can be corrected specifically. The wild-type APOBEC1 protein sequence is shown in SEQ ID NO: 1.
[0163] The BE2 system constructed for off-target evaluation of BE3 is shown in
[0164] The inventors first reduced the amount of BE3 mRNA injected into the embryo, and applied GOTI to detect off-target variants. As the injected amount of BE3 decreased, the efficiency of gene editing at the targeted site was correspondingly reduced (
[0165] As an alternative approach, the ssDNA binding domain of the APOBEC1 protein was mutated to test whether this can reduce the off-target activity of APOBEC1. The inventors mutated the corresponding amino acid positions in BE3 based on previous research, and used the GOTI method to evaluate the effects on targeting efficiency and off-target effects (
[0166] The inventors evaluated the efficiency of gene editing at the Tyr-C and Tyr-D target sites for different mutations. First, the editing activity of the mutant BE3 variants was evaluated using sgRNA-C and sgRNA-D: BE3-W90A (at position 90 in the amino acid sequence of the APOBEC1 protein), BE3-W90F, BE3-R132E (at position 132 in the amino acid sequence of the APOBEC1 protein), BE3-R126E (at position 126 in the amino acid sequence of the APOBEC1 protein) and BE3-E63A (at position 63 in the amino acid sequence of the APOBEC1 protein). The results showed that the editing efficiency of the BE3-R126E mutant at the two target sites was not much different from that of BE3. The activity of the mutant BE3-R126E was also confirmed by the high targeting efficiency shown by WGS (
[0167] Therefore, the present inventors revealed for the first time a way to resolve the off-target effect induced by BE3 by mutating APOBEC1, for example with R126E.
[0168] The framework established in the present disclosure indicates that GOTI can likewise be applied to evaluate other mutant versions of APOBEC1 or newly engineered cytidine deaminases.
Example 3. Research on Mutation Optimization
[0169] First, the present inventors injected different amounts of BE3 mRNA (50 ng/μl and 10 ng/μl) together with sgRNA-Tyr-C or sgRNA-Tyr-D into embryos, and evaluated the targeting efficiency by single-cell Sanger sequencing.
[0170] It was found that using a smaller amount of BE3 significantly reduced the targeting efficiency (72.6±5.3% at 50 ng/μl; 12.6±2.9% at 10 ng/μl).
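A per-embryo targeting efficiency of this kind can be summarized as a mean with a dispersion estimate. The sketch below is an assumption-laden illustration: the text does not state whether the ± values are SEM or SD, so SEM is assumed here, and the input is the subset of Table 8 rows for Tyr-C at dose 50, used purely as example data and not assumed to be the data set behind the figures quoted above. Function names are hypothetical.

```python
import math
from statistics import mean, stdev


def mutation_frequency(mutant, wild_type):
    """Percentage of mutant reads (or clones) among all scored ones."""
    total = mutant + wild_type
    return 100.0 * mutant / total if total else 0.0


def mean_and_sem(values):
    """Mean and standard error of the mean (SEM is an assumed choice)."""
    sem = stdev(values) / math.sqrt(len(values)) if len(values) > 1 else 0.0
    return mean(values), sem


# (mutant, wild-type) counts for Tyr-C at dose 50, taken from Table 8 (IDs A1 to A19).
tyr_c_dose_50 = [(8, 7), (8, 4), (11, 4), (7, 8), (10, 1), (11, 0),
                 (12, 3), (11, 3), (6, 8), (10, 3), (9, 5)]

frequencies = [mutation_frequency(m, wt) for m, wt in tyr_c_dose_50]
avg, sem = mean_and_sem(frequencies)
print(f"targeting efficiency: {avg:.1f} ± {sem:.1f}% (n={len(frequencies)})")
```

The same two helpers can be applied to each dose group in turn to compare targeting efficiency across injection amounts.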
[0171] Whole-genome off-target assessment was then performed by the GOTI method. Genome-wide off-target analysis by two-cell embryo injection (GOTI) detected off-target variants in BE3-Tyr-D-treated embryos, and the number of off-target SNVs did not differ between the two BE3 mRNA levels (50 ng/μl and 10 ng/μl).
[0172] The inventors then tested whether a point mutation in the DNA binding domain of APOBEC1 would reduce the off-target rate of BE3. Based on the DNA binding domain identified in previous studies, the inventors introduced various point mutations into the putative DNA binding domain of APOBEC1 in the BE3 system, and evaluated their effects on on-target efficiency and off-target rate (
[0173] Then GOTI was used to evaluate on-target efficiency and off-target frequency of BE3-R126E in the three groups with or without sgRNA (BE3-R126E, BE3-R126E-Tyr-C and BE3-R126E-Tyr-D), BE3-W90Y+R126E(YE1)-Tyr-C and BE3-W90F+R126E(FE1)-Tyr-C. The on-target efficiency was confirmed by whole genome sequencing (
[0174] The inventors further examined the off-target effects of BE3-W90Y+R126E (YE1) and BE3-R126E in 293T cells. It was found that BE3-R126E can significantly reduce RNA off-target editing, and that BE3-W90Y+R126E (YE1) can completely eliminate RNA off-target editing (Figure Se).
[0175] In
[0176] In conclusion, applying the GOTI method to assess the number of off-target SNVs demonstrates that mutating the putative ssDNA binding domain of the deaminase of the base editor can eliminate the off-target effect of the cytosine base editor at the DNA and RNA levels.
[0177] The results indicate that a base editor can be designed as an effective and safe tool for gene editing and therapeutic applications.
[0178] Each reference provided herein is incorporated by reference to the same extent as if each reference was individually incorporated by reference. In addition, it should be understood that based on the above teaching content of the disclosure, those skilled in the art can practice various changes or modifications to the disclosure, and these equivalent forms also fall within the scope of the appended claims.
REFERENCES
[0179] 1. G. J. Knott, J. A. Doudna, CRISPR-Cas guides the future of genetic engineering. Science 361, 866-869 (2018).
[0180] 2. S. Q. Tsai, J. K. Joung, Defining and improving the genome-wide specificities of CRISPR-Cas9 nucleases. Nat Rev Genet 17, 300-312 (2016).
[0181] 3. C. P. Lazzarotto et al., Defining CRISPR-Cas9 genome-wide nuclease activities with CIRCLE-seq. Nat Protoc 13, 2615-2642 (2018).
[0182] 4. K. R. Anderson et al., CRISPR off-target analysis in genetically engineered rats and mice. Nat Methods 15, 512-514 (2018).
[0183] 5. D. Kim et al., Genome-wide target specificities of CRISPR RNA-guided programmable deaminases. Nat Biotechnol 35, 475-480 (2017).
[0184] 6. T. I. Cornu, C. Mussolino, T. Cathomen, Refining strategies to translate genome editing to the clinic. Nature Medicine 23, 415-423 (2017).
[0185] 7. H. A. Rees, D. R. Liu, Base editing: precision chemistry on the genome and transcriptome of living cells. Nat Rev Genet, (2018).
[0186] 8. N. M. Gaudelli et al., Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature 551, 464-471 (2017).
[0187] 9. L. Madisen et al., A robust and high-throughput Cre reporting and characterization system for the whole mouse brain. Nat Neurosci 13, 133-140 (2010).
[0188] 10. L. Wang et al., CRISPR-Cas9-mediated genome editing in one blastomere of two-cell embryos reveals a novel Tet3 function in regulating neocortical development. Cell Res 27, 815-829 (2017).
[0189] 11. K. Kim et al., Highly efficient RNA-guided base editing in mouse embryos. Nat Biotechnol 35, 435-437 (2017).
[0190] 12. J. W. Drake, B. Charlesworth, D. Charlesworth, J. F. Crow, Rates of spontaneous mutation. Genetics 148, 1667-1686 (1998).
[0191] 13. A. C. Komor, Y. B. Kim, M. S. Packer, J. A. Zuris, D. R. Liu, Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420-424 (2016).
[0192] 14. R. S. Harris, S. K. Petersen-Mahrt, M. S. Neuberger, RNA editing enzyme APOBEC1 and some of its homologs can act as DNA mutators. Mol Cell 10, 1247-1253 (2002).
[0193] 15. S. Rebhandl, M. Huemer, R. Greil, R. Geisberger, AID/APOBEC deaminases and cancer. Oncoscience 2, 320-333 (2015).
[0194] 16. L. B. Alexandrov et al., Signatures of mutational processes in human cancer. Nature 500, 415-421 (2013).
[0195] 17. H. C. Losey, A. J. Ruthenburg, G. L. Verdine, Crystal structure of Staphylococcus aureus tRNA adenosine deaminase TadA in complex with RNA. Nat Struct Mol Biol 13, 153-159 (2006).
[0196] 18. S. Jin et al., Cytosine, but not adenine, base editors induce genome-wide off-target mutations in rice. Science, in press (2019).
[0197] 19. Y. B. Kim et al., Increasing the genome-targeting scope and precision of base editing with engineered Cas9-cytidine deaminase fusions. Nat Biotechnol 35, 371-376 (2017).
[0198] 20. K. Wang et al., Efficient base editing in methylated regions with a human APOBEC3A-Cas9 fusion. Nat Biotechnol 36, 946-949 (2018).
[0199] 21. J. M. Gehrke et al., An APOBEC3A-Cas9 base editor with minimized bystander and off-target activities. Nat Biotechnol 36, 977-982 (2018).