NOVEL Cas ENZYME AND SYSTEM, AND USE THEREOF

20220186206 · 2022-06-16

Abstract

A CRISPR-associated (Cas) protein, a fusion protein including the Cas protein, and a nucleic acid encoding either of the proteins are provided. The Cas protein is any one from the group consisting of a Cas protein having an amino acid sequence with at least 95% sequence identity with SEQ ID NO: 1 and basically retaining a biological function of SEQ ID NO: 1; a Cas protein having an amino acid sequence obtained through a substitution, a deletion, or an addition of one or more amino acids based on SEQ ID NO: 1 and basically retaining the biological function of SEQ ID NO: 1; and a Cas protein comprising an amino acid sequence shown in SEQ ID NO: 1.

Claims

1. A clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) protein, wherein the Cas protein is any one from the group consisting of: a first Cas protein having an amino acid sequence with at least 95% sequence identity with SEQ ID NO: 1 and basically retaining a biological function of SEQ ID NO: 1; a second Cas protein having an amino acid sequence obtained through a substitution, a deletion, or an addition of one or more amino acids based on SEQ ID NO: 1 and basically retaining the biological function of SEQ ID NO: 1, and the one or more amino acids comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids; and a third Cas protein comprising an amino acid sequence shown in SEQ ID NO: 1.

2. A fusion protein, comprising the Cas protein according to claim 1 and a modification part.

3. An isolated polynucleotide, wherein the isolated polynucleotide is a polynucleotide sequence encoding the Cas protein according to claim 1, or a polynucleotide sequence encoding a fusion protein comprising the Cas protein and a modification part.

4. A guide RNA (gRNA), comprising a framework region binding to the Cas protein according to claim 1 and a guide sequence targeting a target sequence.

5. A vector, comprising the isolated polynucleotide according to claim 3 and a regulatory element operably linked to the isolated polynucleotide.

6. A CRISPR-Cas system, comprising the Cas protein according to claim 1 and at least one gRNA, wherein the at least one gRNA comprises a framework region binding to the Cas protein and a guide sequence targeting a target sequence.

7. A vector system, wherein the vector system comprises one or more vectors, and the one or more vectors comprise: a) a first regulatory element operably linked to a gRNA, wherein the gRNA comprises a framework region binding to the Cas protein according to claim 1 and a guide sequence targeting a target sequence, and b) a second regulatory element operably linked to the Cas protein; wherein the first regulatory element and the second regulatory element are located on a same vector or different vectors of the vector system.

8. A composition, comprising: a protein component selected from the group consisting of the Cas protein according to claim 1 and a fusion protein comprising the Cas protein and a modification part; and a nucleic acid component selected from the group consisting of a gRNA comprising a framework region binding to the Cas protein and a guide sequence targeting a target sequence, a nucleic acid encoding the gRNA, a precursor RNA of the gRNA, and a nucleic acid encoding the precursor RNA of the gRNA; wherein the protein component and the nucleic acid component combine with each other to form the composition.

9. An activated CRISPR complex, comprising: a protein component selected from the group consisting of the Cas protein according to claim 1 and a fusion protein comprising the Cas protein and a modification part; a nucleic acid component selected from the group consisting of a gRNA comprising a framework region binding to the Cas protein and a guide sequence targeting a target sequence, a nucleic acid encoding the gRNA, a precursor RNA of the gRNA, and a nucleic acid encoding the precursor RNA of the gRNA; and the target sequence binding to the gRNA.

10. An engineered host cell, comprising: the Cas protein according to claim 1, or a fusion protein comprising the Cas protein and a modification part, or a polynucleotide, wherein the polynucleotide is a polynucleotide sequence encoding the Cas protein or a polynucleotide sequence encoding the fusion protein, or a vector, wherein the vector comprises the polynucleotide and a first regulatory element operably linked to the polynucleotide, or a CRISPR-Cas system comprising the Cas protein and at least one gRNA, wherein the at least one gRNA comprises a framework region binding to the Cas protein and a guide sequence targeting a target sequence, or a vector system, wherein the vector system comprises one or more vectors, and the one or more vectors comprise a second regulatory element operably linked to the gRNA, and a third regulatory element operably linked to the Cas protein, wherein the second regulatory element and the third regulatory element are located on a same vector or different vectors of the vector system, or a composition, wherein the composition comprises a protein component selected from the group consisting of the Cas protein and the fusion protein; and a nucleic acid component selected from the group consisting of the gRNA, a nucleic acid encoding the gRNA, a precursor RNA of the gRNA, and a nucleic acid encoding the precursor RNA of the gRNA, wherein the protein component and the nucleic acid component combine with each other to form the composition, or an activated CRISPR complex, wherein the activated CRISPR complex comprises the protein component; the nucleic acid component; and the target sequence binding to the gRNA.

11. The Cas protein according to claim 1, wherein the Cas protein is used in a gene editing, a gene targeting, or a gene cleaving.

12. The Cas protein according to claim 1, wherein the Cas protein is used in one or more selected from the group consisting of: targeting and/or editing a target nucleic acid; cleaving a double-stranded DNA, a single-stranded DNA, or a single-stranded RNA; non-specifically cleaving and/or degrading a collateral nucleic acid; non-specifically cleaving a single-stranded nucleic acid; a nucleic acid detection; specifically editing a double-stranded nucleic acid; base-editing the double-stranded nucleic acid; and base-editing the single-stranded nucleic acid.

13. A method for editing a target nucleic acid, targeting the target nucleic acid, or cleaving the target nucleic acid, comprising: contacting the target nucleic acid with the Cas protein according to claim 1, or a fusion protein comprising the Cas protein and a modification part, or a polynucleotide, wherein the polynucleotide is a polynucleotide sequence encoding the Cas protein or a polynucleotide sequence encoding the fusion protein, or a vector, wherein the vector comprises the polynucleotide and a first regulatory element operably linked to the polynucleotide, or a CRISPR-Cas system comprising the Cas protein and at least one gRNA, wherein the at least one gRNA comprises a framework region binding to the Cas protein and a guide sequence targeting a target sequence, or a vector system, wherein the vector system comprises one or more vectors, and the one or more vectors comprise a second regulatory element operably linked to the gRNA, and a third regulatory element operably linked to the Cas protein, wherein the second regulatory element and the third regulatory element are located on a same vector or different vectors of the vector system, or a composition wherein the composition comprises a protein component selected from the group consisting of the Cas protein and the fusion protein; and a nucleic acid component selected from the group consisting of the gRNA, a nucleic acid encoding the gRNA, a precursor RNA of the gRNA, and a nucleic acid encoding the precursor RNA of the gRNA, wherein the protein component and the nucleic acid component combine with each other to form the composition, or an activated CRISPR complex, wherein the activated CRISPR complex comprises the protein component; the nucleic acid component; and the target sequence binding to the gRNA, or a host cell, wherein the host cell comprises the Cas protein, the fusion protein, the polynucleotide, the vector, the CRISPR-Cas system, the vector system, the composition, or the activated CRISPR complex.

14. A method for cleaving a single-stranded nucleic acid, comprising: contacting a nucleic acid group with the Cas protein according to claim 1 and a gRNA comprising a framework region binding to the Cas protein and a guide sequence targeting a target sequence, wherein the nucleic acid group comprises a target nucleic acid and at least one non-target single-stranded nucleic acid; the gRNA targets the target nucleic acid; and the Cas protein cleaves the non-target single-stranded nucleic acid.

15. A kit for gene editing, gene targeting, or gene cleaving, comprising: the Cas protein according to claim 1, or a fusion protein comprising the Cas protein and a modification part, or a polynucleotide, wherein the polynucleotide is a polynucleotide sequence encoding the Cas protein or a polynucleotide sequence encoding the fusion protein, or a vector, wherein the vector comprises the polynucleotide and a first regulatory element operably linked to the polynucleotide, or a CRISPR-Cas system comprising the Cas protein and at least one gRNA, wherein the at least one gRNA comprises a framework region binding to the Cas protein and a guide sequence targeting a target sequence, or a vector system, wherein the vector system comprises one or more vectors, and the one or more vectors comprise a second regulatory element operably linked to the gRNA, and a third regulatory element operably linked to the Cas protein, wherein the second regulatory element and the third regulatory element are located on a same vector or different vectors of the vector system, or a composition, wherein the composition comprises a protein component selected from the group consisting of the Cas protein and the fusion protein; and a nucleic acid component selected from the group consisting of the gRNA, a nucleic acid encoding the gRNA, a precursor RNA of the gRNA, and a nucleic acid encoding the precursor RNA of the gRNA, wherein the protein component and the nucleic acid component combine with each other to form the composition, or an activated CRISPR complex, wherein the activated CRISPR complex comprises the protein component; the nucleic acid component; and the target sequence binding to the gRNA, or a host cell, wherein the host cell comprises the Cas protein, the fusion protein, the polynucleotide, the vector, the CRISPR-Cas system, the vector system, the composition, or the activated CRISPR complex.

16. A kit for detecting a target nucleic acid in a sample, comprising: (a) the Cas protein according to claim 1 or a nucleic acid encoding the Cas protein; (b) a gRNA comprising a framework region binding to the Cas protein and a guide sequence targeting a target sequence, or a nucleic acid encoding the gRNA, or a precursor RNA comprising the gRNA, or a nucleic acid encoding the precursor RNA; and (c) a single-stranded nucleic acid detector not hybridizing with the gRNA.

17. The Cas protein according to claim 1, wherein the Cas protein is used in a preparation of a formulation or a kit, wherein the formulation or the kit is used for: (i) gene or genome editing; (ii) target nucleic acid detection and/or diagnosis; (iii) editing a target sequence in a target gene locus to modify an organism or a non-human organism; (iv) disease treatment; and (v) targeting the target gene.

18. A method for detecting a target nucleic acid in a sample, comprising: contacting the sample with the Cas protein according to claim 1, a gRNA, and a single-stranded nucleic acid detector; and detecting a detectable signal generated due to a cleavage of the Cas protein on the single-stranded nucleic acid detector to detect the target nucleic acid; wherein the gRNA comprises a region to bind to the Cas protein and a guide sequence to hybridize with the target nucleic acid, and the single-stranded nucleic acid detector does not hybridize with the gRNA.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0245] FIG. 1 shows a PAM preference result of UkCpf1.

[0246] FIG. 2 shows a sterilization consumption experiment to verify the PAM preference result of UkCpf1.

[0247] FIG. 3 shows a functional domain prediction result of UkCpf1.

[0248] FIG. 4 shows in vitro RNA and DNA cleavage activity results of UkCpf1 and a mutant thereof.

[0249] FIG. 5 is a schematic diagram illustrating the construction of a UkCpf1 expression construct for Arabidopsis thaliana (A. thaliana).

[0250] FIG. 6 is a schematic diagram illustrating the principle of using a YFFP gene (SEQ ID NO: 17) to detect UkCpf1 cleavage activity.

[0251] FIG. 7 shows the gene editing activity of UkCpf1 in A. thaliana cells.

[0252] FIG. 8 is a schematic diagram illustrating the construction of a UkCpf1 expression construct for rice.

[0253] FIG. 9 is a schematic diagram of the pDR-UkCpf1-At vector.

[0254] FIG. 10 shows a fluorescence result of nucleic acid detection of UkCpf1.

SEQUENCE INFORMATION

[0255]

SEQ ID NO:   Description
1            Amino acid sequence of UkCpf1
2            Nucleic acid sequence of UkCpf1
3            DR region of gRNA of UkCpf1
4            gTGW6-1
5            gTGW6-2
6            gTGW6-3
7            gTGW6-4
8            gTGW6-5
9            N-B-i3g1-ssDNA0
10           gRNA-trans

DETAILED DESCRIPTION OF THE EMBODIMENTS

[0256] The following examples serve only to illustrate, not to limit, the present disclosure. Unless otherwise specified, the experiments and methods described in the examples are basically conducted in accordance with conventional methods well known in the art and described in various references. For example, conventional techniques such as immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics, and recombinant DNA used in the present disclosure can be found in “MOLECULAR CLONING: A LABORATORY MANUAL”, Sambrook, Fritsch, and Maniatis, edition 2 (1989); “CURRENT PROTOCOLS IN MOLECULAR BIOLOGY” (edited by F. M. Ausubel et al. (1987)); the “METHODS IN ENZYMOLOGY” series (Academic Press Corporation); “PCR 2: A PRACTICAL APPROACH” (edited by M. J. MacPherson, B. D. Hames, and G. R. Taylor (1995)); “ANTIBODIES, A LABORATORY MANUAL” (edited by Harlow and Lane (1988)); and “ANIMAL CELL CULTURE” (edited by R. I. Freshney (1987)).

[0257] In addition, if no specific conditions are specified in the examples, the examples will be conducted according to conventional conditions or the conditions recommended by the manufacturer. All of the used reagents or instruments which are not specified with manufacturers are conventional commercially-available products. Those skilled in the art know that the present disclosure is described by way of examples in the embodiments, and the examples are not intended to limit the protection scope of the present disclosure. All publications and other references mentioned herein are incorporated into this article by reference in their entirety.

Example 1. Acquisition of Cas Protein

[0258] The inventors analyzed the metagenome of an uncultivated microorganism and identified a novel Cas enzyme through de-redundancy and protein cluster analysis, and the novel Cas enzyme had an amino acid sequence shown in SEQ ID NO: 1 and a nucleic acid sequence shown in SEQ ID NO: 2. Blast results showed that the Cas protein had low sequence identity with reported Cas proteins; and the Cas protein was named UkCpf1 in the present disclosure.

[0259] Analysis results showed that a direct repeat of gRNA corresponding to the UkCpf1 protein was AUUUCUACUAUUGUAGAU (SEQ ID NO: 3), and corresponding PAM had a sequence shown in 5′-YYV-3′, where Y=C/T and V=C/G/A.
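The degenerate PAM notation above can be made concrete with a short sketch. The helper names `matches_pam` and `expand_pam` are hypothetical and are not part of the disclosure; only the definitions Y=C/T and V=C/G/A are taken from the text.

```python
from itertools import product

# Degenerate base codes; Y and V follow the definitions given in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT",    # Y = C/T
    "V": "ACG",   # V = C/G/A
    "N": "ACGT",
}

def matches_pam(site, pattern="YYV"):
    """Return True if `site` (written 5'->3') fits the degenerate pattern."""
    site = site.upper()
    if len(site) != len(pattern):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, pattern))

def expand_pam(pattern="YYV"):
    """List every concrete sequence covered by a degenerate pattern."""
    return ["".join(p) for p in product(*(IUPAC[code] for code in pattern))]
```

Under these definitions, `expand_pam("YYV")` enumerates the 2 × 2 × 3 = 12 concrete trinucleotides covered by 5′-YYV-3′, and a site ending in T (for example TTT) is rejected because T is not part of V.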

[0260] 1.1 The PAM Preference of UkCpf1 was Tested Through a Bacterium Elimination Experiment.

[0261] In order to test the PAM site preference of UkCpf1, a UkCpf1 coding gene driven by a T7 promoter and a crRNA precursor driven by a J23119 promoter (namely, repeat-spacer-repeat DR-Sp-DR: TTGACAGCTAGCTCAGTCCTAGGTATAATACTAGTGTCTAAAGGTATTATAAAATTTCTACTATTGTAGATAGAGCGCAATTAATTATTGCGGATATTCGTCTAAAGGTATTATAAAATTTCTACTATTGTAGATTTTTTT, SEQ ID NO: 18) were ligated into the kanamycin-resistant prokaryotic expression plasmid pET28a, and the resulting plasmid was transformed into E. coli BL21, from which competent cells were then prepared. The processed mature crRNA, namely the gRNA, could recognize a targeting site on the chloramphenicol-resistant plasmid pACYCDuet; the targeting site consisted of a PAM composed of 8 random bases at the 5′-terminus and a recognition sequence with a length of 28 nt at the 3′-terminus. The PAM plasmid library was transformed into the above-mentioned competent E. coli, and the cells were cultivated overnight at 37° C. Viable bacteria were collected the next day and their plasmids were extracted. The PAM region of the recovered plasmid library was amplified by PCR and sequenced, and the untransformed PAM library was used as a control group.
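As an illustrative check of the repeat-spacer-repeat layout, the following sketch locates the direct repeat of SEQ ID NO: 3 (written as DNA) within the cassette of SEQ ID NO: 18. The helper `find_all` is hypothetical, and reading the 28 nt that follow the first repeat as the spacer is an assumption based on the 28-nt recognition sequence mentioned in the text.

```python
# SEQ ID NO: 18 as printed in the text, with line-wrap spaces removed.
CASSETTE = (
    "TTGACAGCTAGCTCAGTCCTAGGTATAATACTAGTGTCTAAAGGTATTATAAAATTTCT"
    "ACTATTGTAGATAGAGCGCAATTAATTATTGCGGATATTCGTCTAAAGGTATTATAAAAT"
    "TTCTACTATTGTAGATTTTTTT"
)

# SEQ ID NO: 3 is given as RNA; convert it to DNA for the search.
DR_DNA = "AUUUCUACUAUUGUAGAU".replace("U", "T")

def find_all(seq, motif):
    """Return the 0-based start positions of every occurrence of motif."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits

dr_starts = find_all(CASSETTE, DR_DNA)

# Assumption: the 28 nt immediately 3' of the first repeat are the spacer.
first_dr_end = dr_starts[0] + len(DR_DNA)
downstream_28 = CASSETTE[first_dr_end:first_dr_end + 28]
```

In this sketch the repeat is found twice, which is consistent with the DR-Sp-DR description, and the cassette ends in a run of T residues.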

[0262] The abundance of each of the 65,536 PAM sequences was counted in the experimental group and the control group, and the counts were normalized to sequencing depth. A PAM sequence was considered significantly consumed when its log2 (control group/sample group) value was greater than 4.0. A total of 825 significantly consumed PAM sequences were obtained, accounting for 5.1% of all sequencing types. A WebLogo analysis of the 825 PAM sequences showed that UkCpf1 preferentially cleaved target sites with a YYV (Y=C/T and V=C/G/A) sequence at the 5′-terminus; the results are shown in FIG. 1. This preference is more relaxed and flexible than that of other known Cas12a (Cpf1) family members.
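The depletion criterion can be sketched as follows. This is a minimal illustration, not the inventors' analysis pipeline: the function name `depleted_pams`, the pseudocount, and the toy read counts are all assumptions introduced here; only the depth normalization and the log2 threshold of 4.0 come from the text.

```python
import math

N_PAMS = 4 ** 8  # 8 random bases give 65,536 possible PAM sequences

def depleted_pams(control_counts, sample_counts, threshold=4.0, pseudocount=1.0):
    """Return {pam: score} for PAMs with log2(control/sample) > threshold.

    Counts are first scaled to fractions of total reads (sequencing depth).
    The pseudocount is an assumption added to avoid division by zero.
    """
    control_total = sum(control_counts.values())
    sample_total = sum(sample_counts.values())
    hits = {}
    for pam, control in control_counts.items():
        sample = sample_counts.get(pam, 0)
        control_freq = (control + pseudocount) / control_total
        sample_freq = (sample + pseudocount) / sample_total
        score = math.log2(control_freq / sample_freq)
        if score > threshold:
            hits[pam] = score
    return hits

# Hypothetical read counts for two PAMs, for illustration only.
control = {"TTTTTTTA": 999, "GGGGGGGG": 999}
sample = {"TTTTTTTA": 9, "GGGGGGGG": 1989}
hits = depleted_pams(control, sample)
```

With these toy counts only the first PAM passes the threshold, because its normalized frequency drops 100-fold in the sample group.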

[0263] 1.2 The PAM Preference of UkCpf1 was Verified Through a Sterilization Consumption Experiment.

[0264] In order to verify the PAM preference of UkCpf1 through a sterilization consumption experiment, a total of 32 PAM sequences of the YYN type were selected for an in vivo bacterial test. Targeting sites comprising each of the 32 PAMs and a recognition sequence with a length of 28 nt were each ligated into a chloramphenicol-resistant pACYCDuet plasmid, and each plasmid was then transformed into a competent E. coli strain expressing UkCpf1/gRNA. After a brief recovery at 37° C., the transformed samples were adjusted to the same concentration according to the OD600 values of the bacterial solutions, and the bacterial solutions were then diluted to obtain three gradients: 10⁰, 10⁻¹, and 10⁻². 5 μl of each dilution was spotted on isopropyl-β-D-thiogalactoside (IPTG)-containing and IPTG-free plates containing chloramphenicol and kanamycin, and cultivated overnight. The next day, colonies appearing on the plates were photographed and recorded.

[0265] Results showed that UkCpf1 exhibited significant plasmid DNA cleavage activity only for the “TTTV” type PAM on the IPTG-free plate. On the IPTG-containing plate, both the “AYTV” and the “TYYV” type PAMs exhibited prominent cleavage activity. This indicated that UkCpf1 preferentially recognized the “TYYV” type PAM site; the results are shown in FIG. 2.

[0266] 1.3 Functional Domain and Catalytically-Active Site of UkCpf1

[0267] The amino acid sequences of UkCpf1 and four known Cpf1 proteins were subjected to multiple sequence alignment with MUSCLE, and the conserved domains of UkCpf1 were predicted in combination with HHpred and HMM3_domain finder. According to the prediction results (shown in FIG. 3), three conserved catalytically active sites of the RuvC domain were identified: D873, E964, and D1232.

[0268] Coding sequences of FnCpf1 and LbCpf1 were synthesized and inserted into the pET28a plasmid for prokaryotic expression. D873, E964, and D1232 of UkCpf1 were mutated into D873A, E964A, and D1232A by overlap PCR respectively, then inserted into pET28a, and transformed into the E. coli strain BL21 together with a control plasmid of the wild-type UkCpf1, and positive clones were identified. Obtained positive clones were transferred to a test tube with 3 ml of a 100 mg/L kanamycin-containing LB medium, and cultivated overnight at 37° C. The next day, a resulting bacterial solution was inoculated at an inoculation ratio of 1:100 into a new Erlenmeyer flask with 20 ml of a 100 mg/L kanamycin-containing LB medium, and cultivated at 37° C. for about 8 h. In the afternoon of the next day, a resulting bacterial solution was inoculated at the inoculation ratio of 1:100 into a new Erlenmeyer flask with 1 L of a 100 mg/L kanamycin-containing LB medium, and cultivated at 37° C. until OD600 was 0.6 to 0.8. Then IPTG was added to a final concentration of 0.4 mM, and the bacteria were further cultivated for 18 h at 16° C. and 220 rpm. The bacteria were collected by centrifugation, and then passed through a nickel column, a heparin column, and a molecular sieve for purification to obtain the target protein.

[0269] In order to determine whether UkCpf1 can process and cleave a precursor RNA, a precursor crRNA that had a length of 157 nt and included a DR-Sp-DR sequence was transcribed in vitro. A reaction system was prepared by mixing 3 μl of 10× NEBuffer 2.1, 2 μl of 10 μM UkCpf1, 4 μl of 5 μM pre-crRNA, and 18 μl of DEPC-treated H₂O, and was then incubated at 25° C. for 30 min. Before RNA electrophoresis, the sample was digested with proteinase K at 25° C. for 15 min to remove UkCpf1. The resulting reaction solution was loaded onto a 15% urea-PAGE gel and electrophoresed for 2 h in tris-borate-EDTA (TBE) buffer, and ethidium bromide (EB) staining and photographing were then conducted. Results showed that UkCpf1, like LbCpf1 and FnCpf1, had precursor RNA cleavage activity, and that the D873A, E964A, and D1232A mutations did not affect its RNA cleavage activity (see the left panel of FIG. 4).
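The final concentrations in this reaction follow from C1·V1 = C2·V2. The sketch below works through the arithmetic for the stated volumes; the function and variable names are hypothetical.

```python
def final_concentration(stock_conc, stock_vol_ul, total_vol_ul):
    """Dilution by C1*V1 = C2*V2; result is in the units of stock_conc."""
    return stock_conc * stock_vol_ul / total_vol_ul

# Volumes in microlitres, as stated for the pre-crRNA processing reaction.
volumes = {
    "buffer_10x": 3,
    "UkCpf1_10uM": 2,
    "pre_crRNA_5uM": 4,
    "DEPC_water": 18,
}
total = sum(volumes.values())  # total reaction volume in microlitres

protein_uM = final_concentration(10, volumes["UkCpf1_10uM"], total)
rna_uM = final_concentration(5, volumes["pre_crRNA_5uM"], total)
buffer_x = final_concentration(10, volumes["buffer_10x"], total)
```

Under these volumes the protein and the pre-crRNA are present at equal molar amounts (about 0.74 μM each in 27 μl), and the buffer is slightly above 1×. If 3 μl of plasmid is added afterwards, as in the DNA cleavage experiment, the total becomes 30 μl and the buffer reaches exactly 1×.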

[0270] In order to determine whether UkCpf1 has cleavage activity against a target DNA, a pACYCDuet plasmid carrying a targeting site with the “TTTA” type PAM was constructed as a substrate for an in vitro DNA cleavage experiment. A reaction system was prepared by the same method as above and incubated at 25° C. for 30 min. 3 μl of a 100 ng/μl target plasmid was then added to the reaction system, and the reaction was conducted at 37° C. for 30 min. Digestion was conducted with proteinase K at 25° C. for 15 min, the resulting reaction solution was loaded on a 0.8% agarose gel for TAE electrophoresis, and EB staining and photographing were conducted. Results showed that UkCpf1, like LbCpf1 and FnCpf1, could cleave a supercoiled substrate DNA into a linear form, and that each of the predicted catalytically active site mutations D873A, E964A, and D1232A of the RuvC domain caused UkCpf1 to lose its DNA cleavage activity, indicating that these three sites are the catalytically active sites of the RuvC domain (see the right panel of FIG. 4).

Example 2. Editing Efficiency of UkCpf1 Protein in an A. thaliana Protoplast

[0271] The engineered YFFP gene was used as a reporter to visualize the site-specific nuclease activity of UkCpf1 in A. thaliana protoplasts. Two UkCpf1 expression constructs were built to target the EBE1 and EBE2 sites in the YFFP gene, respectively. A schematic diagram of the constructs is shown in FIG. 5. Once the YFFP gene is cleaved by UkCpf1, the partially duplicated “F” fragment promotes repair of the double-strand break (DSB) through the homology-dependent DNA repair (HDR) pathway, which restores a functional YFP gene (a schematic diagram is shown in FIG. 6). Therefore, the cleavage activity of UkCpf1 can be evaluated by counting the YFP-positive cells.

[0272] A. thaliana protoplast cells were isolated and prepared according to the tape sandwich method reported in the literature. A reporter gene plasmid and a nuclease plasmid were mixed in a ratio of 1:1 and then transformed into the protoplast cells by the PEG method. The transformed protoplast cells were cultivated in the dark at room temperature for 12 h to 24 h, the YFP and RFP fluorescence channels were then observed and photographed with a fluorescence stereo microscope (Olympus, IX71), and the number of YFP-positive cells was counted with ImageJ.

[0273] Results are shown in FIG. 7. Compared with the control, the experimental groups for both the EBE1 site and the EBE2 site showed clearly fluorescent cells. That is, the UkCpf1 protein showed clear cleavage activity in the A. thaliana protoplast and could be used for gene editing in cells.

Example 3. Editing Efficiency of Cas Protein in a Rice Protoplast

[0274] With UkCpf1 in Example 1, the following 5 gRNAs were designed for a TGW6 gene of rice: gTGW6-1, gTGW6-2, gTGW6-3, gTGW6-4, and gTGW6-5. Targeting sequences of the above five gRNAs were: ACTACAAAACCGGCAACCTGTAC (SEQ ID NO: 4), TTTCACCGACAGCAGCATGAACT (SEQ ID NO: 5), TTGACCTGCCAGGCTATCCTGAT (SEQ ID NO: 6), GGTCCGGATAGTCACTTGGTTGC (SEQ ID NO: 7), and CGTGTAGCTGGGGCTGTACGTGT (SEQ ID NO: 8), respectively.

[0275] These 5 gRNAs were used to construct knockout vectors (as shown in FIG. 8). Plasmids of the knockout vectors were extracted and transformed into rice protoplast cells, and the protoplast cells were cultivated in the dark at 37° C. for 24 h. After the cultivation was completed, the protoplasts were collected by centrifugation, protoplast DNA was extracted, and a DNA fragment of about 800 bp upstream and downstream of the target site was amplified. The DNA fragment containing the target site was subjected to next-generation sequencing (NGS), and the corresponding editing efficiency was calculated and compared with that of other Cas proteins; the results are shown in Table 2. The UkCpf1 protein of the present disclosure showed more efficient cleavage activity than the other proteins in the rice protoplast.

TABLE 2 Editing efficiency of different Cas proteins in the rice protoplast

SampleID   AmpliconID   Mapped Reads   InDel Reads   InDel Reads Ratio
Cas160     TGW6-1             878894             0               0.00%
           TGW6-2            2279912          2747               0.12%
           TGW6-3            1361224             0               0.00%
           TGW6-4                 97             0               0.00%
           TGW6-5                  1             0               0.00%
Cas230     TGW6-1            1708137             0               0.00%
           TGW6-2             957129           867               0.09%
           TGW6-3             571055             0               0.00%
           TGW6-4             640298            98               0.02%
UkCpf1     TGW6-1            1179912           177               0.02%
           TGW6-2            1975217          7672               0.39%
           TGW6-3             131813           748               0.57%
           TGW6-4             168485           528               0.31%
           TGW6-5              13431            98               0.73%
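The ratio column of Table 2 appears to be the InDel read count divided by the mapped read count, expressed as a percentage. A minimal sketch that recomputes three of the UkCpf1 rows is given below; the function name is hypothetical and the rounding to two decimals is an assumption matching the table's precision.

```python
def indel_ratio_percent(mapped_reads, indel_reads):
    """InDel reads as a percentage of mapped reads, rounded to 2 decimals."""
    if mapped_reads == 0:
        return 0.0
    return round(100.0 * indel_reads / mapped_reads, 2)

# (mapped reads, InDel reads) copied from the UkCpf1 rows of Table 2.
ukcpf1_rows = {
    "TGW6-2": (1975217, 7672),
    "TGW6-3": (131813, 748),
    "TGW6-5": (13431, 98),
}
ratios = {
    amplicon: indel_ratio_percent(mapped, indel)
    for amplicon, (mapped, indel) in ukcpf1_rows.items()
}
```

The recomputed values agree with the tabulated 0.39%, 0.57%, and 0.73%. Note that the two Cas160 rows with 97 and 1 mapped reads carry too little coverage for their 0.00% ratios to be informative.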

Example 4. Editing Efficiency of Cas Protein in A. thaliana

[0276] The A. thaliana material was of the Columbia wild-type background. Plant genetic transformation was conducted by the Agrobacterium GV3101-mediated floral dip method. Harvested T1-generation seeds were disinfected with 5% sodium hypochlorite for 10 min, rinsed 4 times with sterile water, and sown on a plate containing 30 μM hygromycin for screening. The plate was kept at 4° C. for 2 d and then incubated in an incubator with a 12 h light period for 10 d, after which resistant plants were transplanted into flower pots and further cultivated in a greenhouse with a 16 h light period.

[0277] The synthesized UkCpf1 sequence of Example 1 was amplified with primers pAtUBQ-F-UnCpf1/UnCpf1-R-tUBQ and recombined into the NcoI and BamHI sites of the psgR-Cas9-At vector to obtain an intermediate vector psgR-UkCpf1-At. Then, the synthesized DR-tRNA site was ligated into the HindIII and XmaI sites of the psgR-UkCpf1-At vector through enzyme digestion to obtain a pDR-UkCpf1-At vector. A schematic diagram of the pDR-UkCpf1-At vector is shown in FIG. 9. A target-specific sequence could be inserted into the vector after BsaI digestion.

[0278] Sense and antisense primers targeting TT4-269 were synthesized according to Table 3. The 10 μM primers were denatured, annealed, and diluted (1/20), and then ligated into the 2×BsaI site of pDR-UkCpf1-At. The resulting vector could be transformed into Agrobacterium for genetic transformation of A. thaliana.

TABLE 3 Primers for pDR-UkCpf1-At vector construction

Primer            Sequence (5′-3′)                                      SEQ ID NO:
pAtUBQ-F-UnCpf1   GAGAGAGACGAAACACAAACCATGGACTACAAGGACCACGACGG          19
UnCpf1-R-tUBQ     TTCTTGATAAGAGTCTCTTAGGATCCTCACTCCACCTTGCGCTTCTTCTTG   20
AsDR-EBE-S1       AGATTCTCTTAGGGATAACAGGGTAAT                           21
AsDR-EBE-A1T      AAAAATTACCCTGTTATCCCTAAGAGA                           22
AsDR-EBE-S2       AGATTCTCTATTACCCTGTTATCCCTA                           23
ASDR-EBE-A2T      AAAATAGGGATAACAGGGTAATAGAGA                           24
AsDR-TT4-S269     AGATCTATTCACAGGCGACAAGTCGAC                           25
AsDR-TT4-A269T    AAAAGTCGACTTGTCGCCTGTGAATAG                           26
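Whether a sense/antisense oligo pair from Table 3 anneals as expected can be checked with a short sketch. It assumes that the first four bases of each oligo form the single-stranded overhang used for ligation into the BsaI-cut vector; that reading, and the helper names, are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def annealed_duplex(sense, antisense, overhang=4):
    """Split each oligo into a 5' overhang and a core; test core pairing."""
    sense_core = sense[overhang:]
    antisense_core = antisense[overhang:]
    return {
        "sense_overhang": sense[:overhang],
        "antisense_overhang": antisense[:overhang],
        "cores_pair": revcomp(sense_core) == antisense_core,
    }

# AsDR-TT4-S269 / AsDR-TT4-A269T (SEQ ID NO: 25 and 26)
tt4 = annealed_duplex("AGATCTATTCACAGGCGACAAGTCGAC",
                      "AAAAGTCGACTTGTCGCCTGTGAATAG")
# AsDR-EBE-S1 / AsDR-EBE-A1T (SEQ ID NO: 21 and 22)
ebe1 = annealed_duplex("AGATTCTCTTAGGGATAACAGGGTAAT",
                       "AAAAATTACCCTGTTATCCCTAAGAGA")
```

For both pairs the 23-nt cores are exact reverse complements, leaving AGAT and AAAA as the protruding 5′ ends.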

[0279] From the A. thaliana transgenic T1-generation population of TT4-269, 52 lines were randomly selected, and one leaf was taken from each line after 2 weeks of growth to extract genomic DNA by the cetyltrimethylammonium bromide (CTAB) method. The target gene fragment was amplified by PCR, and the amplification products were used to build a library by the Hi-TOM method and sequenced on the HiSeq 2500 platform. The linker sequences were trimmed from the resulting reads, and the remaining sequences were aligned to the reference gene sequence with Bowtie. The alignment results were sorted with SAMtools, and R was used for statistics and plotting.

[0280] Final results showed that UkCpf1 exhibited significant editing effects in A. thaliana. For the TT4-269 target, the editing efficiency across the 52 lines was as high as 65.4%, and the edits were mainly single-base insertions and deletions. Another Cas protein, SmCsm1, was used for editing at the same site in A. thaliana, and its editing efficiency was only about 10%.

Example 5. Use of Cas Protein in Nucleic Acid Detection

[0281] In this example, the trans cleavage activity of UkCpf1 was verified through an in vitro test. A gRNA that could pair with a target nucleic acid was used to guide the UkCpf1 protein to recognize and bind to the target nucleic acid. This binding stimulated the trans cleavage activity of the UkCpf1 protein against single-stranded nucleic acids, so that the single-stranded nucleic acid detector in the system was cleaved. The two termini of the single-stranded nucleic acid detector were provided with a fluorophore and a quencher, respectively, so that cleavage of the detector released a fluorescence signal. In other embodiments, the two termini of the single-stranded nucleic acid detector could also be provided with labeling molecules that can be detected by colloidal gold.

[0282] In this example, a selected target nucleic acid was a single-stranded DNA, N-B-i3g1-ssDNA0, with a sequence:

CGACATTCCGAAGAACGCTGAAGCGCTGGGGGCAAATTGTGCAATTTGCGGC (SEQ ID NO: 9);

[0283] a gRNA sequence was

AGAGAAUGUGUGCAUAGUCACACCCCCCAGCGCUUCAGCGUUC (SEQ ID NO: 10); and

[0284] a sequence of a single-stranded nucleic acid detector was FAM-TTGTT-BHQ1.
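The relationship between the gRNA and the single-stranded DNA target can be examined with the sketch below, which searches for the longest 3′ portion of the gRNA whose complement occurs in the target. The helper names and the minimum length of 12 nt are assumptions made for illustration.

```python
TARGET_SSDNA = "CGACATTCCGAAGAACGCTGAAGCGCTGGGGGCAAATTGTGCAATTTGCGGC"  # SEQ ID NO: 9
GRNA = "AGAGAAUGUGUGCAUAGUCACACCCCCCAGCGCUUCAGCGUUC"  # SEQ ID NO: 10

def rna_revcomp_as_dna(rna):
    """DNA strand (5'->3') that base-pairs with the given RNA."""
    pairs = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna.upper()))

def longest_paired_guide(grna, target, min_len=12):
    """Longest 3' stretch of the gRNA whose complement occurs in target.

    Returns (length, 0-based position in target, matched DNA) or None.
    """
    for n in range(len(grna), min_len - 1, -1):
        site = rna_revcomp_as_dna(grna[-n:])
        pos = target.find(site)
        if pos != -1:
            return n, pos, site
    return None

match = longest_paired_guide(GRNA, TARGET_SSDNA)
```

For the sequences above, the last 20 nt of the gRNA are fully complementary to the target, which suggests that these 20 nt act as the guide sequence and the preceding 23 nt as the framework region.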

[0285] The following reaction system was adopted: UkCpf1 with a final concentration of 50 nM, gRNA with a final concentration of 50 nM, target nucleic acid with a final concentration of 500 nM, and single-stranded nucleic acid detector with a final concentration of 200 nM. The reaction system was incubated at 37° C. and then the FAM fluorescence was read every 1 min. No target nucleic acid was added in the control group.

[0286] As shown in FIG. 10, compared with the target nucleic acid-free control, the single-stranded nucleic acid detector in the UkCpf1 cleavage system quickly reported fluorescence in the presence of the target nucleic acid. The above experiment showed that, in combination with the single-stranded nucleic acid detector, UkCpf1 can be used for target nucleic acid detection. In FIG. 10, ① shows the experimental result of the group with the target nucleic acid, and ② shows the experimental result of the control group without the target nucleic acid.

Example 6. UkCpf1-Mediated PDS Gene Mutations in A. thaliana and Rice

[0287] In order to determine whether UkCpf1 can edit the genome of a plant cell, stable plant expression vectors suitable for rice and A. thaliana were constructed. The UBI promoter (pZmUBI) and the RPS5a promoter (pRPS5a) were used to drive the stable expression of the UkCpf1 gene in rice and A. thaliana respectively, and the rice U6 promoter (pU6) and the A. thaliana U6 promoter (pU6) were used to drive the expression of the crRNA element (DR-guide) of UkCpf1 in rice and A. thaliana respectively. In order to improve the accuracy and stability of the 3′ terminus of the crRNA element expressed in A. thaliana, an HDV ribozyme sequence was fused to the 3′ terminus of the crRNA. The PDS genes of rice and A. thaliana were each used as the recognition target of the crRNA, so that the gene editing efficiency could be calculated from the leaf bleaching phenotype.

[0288] The above-mentioned two vectors were introduced into the genomes of rice and A. thaliana respectively through Agrobacterium-mediated plant genetic transformation, and screening was conducted with hygromycin to obtain stably-transformed transgenic materials. Primers (AtPDS-F: 5′-GGTCCTTTGCAGGTATCT-3′, as shown in SEQ ID NO: 27, and AtPDS-R: 5′-TTCAAAGGCTTAGCAGGACGA-3′, as shown in SEQ ID NO: 28) were used to sequence and identify targets, and the leaf bleaching phenotypes of genetically-modified materials were counted. Results showed that the UkCpf1 had editing efficiency of 7% and 44% on PDS genes in rice and A. thaliana, respectively.

Example 7. UkCpf1-Mediated DNMT1 Gene Editing in Human Cell Line 293T

[0289] In order to determine whether UkCpf1 can be used for gene editing in human cells, a UkCpf1 expression vector suitable for human cells was constructed. The CAG promoter (pCAG) was used to drive the expression of UkCpf1, and the human U6 promoter (pHuU6) was used to drive the expression of the chimeric sequence of crRNA and HDV ribozyme. In the human DNMT1 gene coding sequence, targeting sites with TTV, TCV, CTV, and CCV PAMs were selected. The resulting plasmid vector was introduced into human 293T cells by lipofectin transfection. After the cells were cultivated for 2 d, the gDNA was extracted from the cells, and the DNA sequence of each target site was subjected to PCR amplification and sequencing with primers (DNMT1-F: 5′-CGGGAACCAAGCAAGAAGTG-3′, as shown in SEQ ID NO: 29, and DNMT1-R: 5′-GGGCAACACAGTGAGACTCC-3′, as shown in SEQ ID NO: 30). According to the statistical results of Sanger and high-throughput sequencing, UkCpf1 showed editing activity on these four targets, with the highest editing efficiency of 14.5%.
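Selecting candidate targeting sites by PAM can be sketched as a simple scan of one strand of a coding sequence. This is an illustration only: the function name, the toy input, and the default protospacer length of 23 nt (borrowed from the gRNA targeting sequences in Example 3) are assumptions, and the disclosure does not state how the DNMT1 sites were actually chosen.

```python
import re

# Y = C/T and V = A/C/G, so this class covers TTV, TCV, CTV, and CCV.
PAM_CLASS = "[CT][CT][ACG]"

def find_target_sites(seq, protospacer_len=23):
    """List (pam_start, pam, protospacer) for each PAM on the given strand.

    Only sites with a full-length protospacer directly 3' of the PAM are
    reported. The lookahead allows overlapping candidates to be found.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"(?=(%s)([ACGT]{%d}))" % (PAM_CLASS, protospacer_len)
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Toy sequence with a short protospacer length, for illustration only.
sites = find_target_sites("AATTAGGGGG", protospacer_len=5)
```

A practical design step would also scan the reverse complement strand and screen each protospacer for off-target matches, neither of which is covered by this sketch.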

[0290] Although the specific implementations of the present disclosure have been described in detail, those skilled in the art will understand that various modifications and changes can be made to the details according to all teachings published, and such modifications and changes are all within the protection scope of the present disclosure. The full content of the present disclosure is defined by the appended claims and any equivalents thereof.