METHOD FOR SCREENING SPLIT SITES AND APPLICATION THEREOF
20230317209 · 2023-10-05
Inventors
CPC classification
G16B40/00
PHYSICS
G16B15/00
PHYSICS
G16B20/00
PHYSICS
International classification
G16B15/00
PHYSICS
G16B20/00
PHYSICS
G16B40/00
PHYSICS
Abstract
A method for screening a split site and an application thereof are provided. The method includes: S1, writing a program in a computer language to predict, for each pair of adjacent amino acid residues in an initial amino acid sequence, the amino acid sequence formed by joining the adjacent peptide fragments after an intein embedded between the two residues is excised through a self-splicing reaction, thereby constructing a protein database; and S2, inserting an intein sequence into a gene segment by molecular cloning and translating to obtain a peptide fragment, detecting by mass spectrometry whether the peptide fragment contains a labeled amino acid sequence, and comparing the peptide fragment with the protein database to confirm the split site. The final detection is performed by mass spectrometry instead of high-throughput screening, and the method can be extended to searching for the split sites of any active protein.
Claims
1. A method for screening a split site, comprising: step S1, establishing a protein database, which comprises: writing a program by using a computer language, and predicting an amino acid sequence formed by connecting adjacent peptide fragments after an intein is embedded into each two adjacent amino acid residues in an initial amino acid sequence and then excised through a self-splicing reaction to construct the protein database; and step S2, performing an experiment, which comprises: inserting an intein sequence into a gene segment through a molecular cloning experimental method and then translating to obtain a peptide fragment, detecting whether the peptide fragment contains a labeled amino acid sequence by mass spectrometry, and comparing the peptide fragment with the protein database when the peptide fragment is detected as containing the labeled amino acid sequence to confirm the split site.
2. The method according to claim 1, wherein in the step S1, the establishing a protein database specifically comprises: step S11, fusing a first gene segment, an inserted intein sequence segment, and a second gene segment in a sequential order to obtain a new deoxyribonucleic acid (DNA) sequence; step S12, translating the new DNA sequence into a new amino acid sequence; step S13, searching a target intein amino acid sequence in the new amino acid sequence, and deleting the target intein amino acid sequence in the new amino acid sequence to thereby obtain an output amino acid sequence; and step S14, predicting each possible site of the first gene segment and the second gene segment into which the inserted intein sequence segment is inserted, and repeating the steps S11 to S13 to obtain all the output amino acid sequences to construct the protein database.
3. The method according to claim 2, wherein in the step S11, at least one base is inserted into the inserted intein sequence segment.
4. The method according to claim 3, wherein the at least one base is one base.
5. A use of the method according to claim 1 in screening split sites of at least one of Escherichia coli (E. coli) antigen protein Im7-6 and Cas9 protein, wherein Im7-6 refers to immunity protein 7-6, and Cas9 refers to clustered regularly interspaced short palindromic repeats associated protein 9.
6. The use of claim 5, wherein in the step S1, the establishing a protein database specifically comprises: step S11, fusing a first gene segment, an inserted intein sequence segment, and a second gene segment in a sequential order to obtain a new deoxyribonucleic acid (DNA) sequence; step S12, translating the new DNA sequence into a new amino acid sequence; step S13, searching a target intein amino acid sequence in the new amino acid sequence, and deleting the target intein amino acid sequence in the new amino acid sequence to thereby obtain an output amino acid sequence; and step S14, predicting each possible site of the first gene segment and the second gene segment into which the inserted intein sequence segment is inserted, and repeating the steps S11 to S13 to obtain all the output amino acid sequences to construct the protein database.
7. The use of claim 6, wherein in the step S11, at least one base is inserted into the inserted intein sequence segment.
8. The use of claim 7, wherein the at least one base is one base.
Description
BRIEF DESCRIPTION OF DRAWINGS
[0028]
[0029]
[0030]
[0031]
[0032]
[0033]
[0034]
[0035]
[0036]
DETAILED DESCRIPTION OF EMBODIMENTS
[0037] In order that the disclosure may be easily understood, the disclosure will be described in detail below in combination with the accompanying drawings. However, before describing the disclosure in detail, it should be understood that the disclosure is not limited to specific embodiments described. It should also be understood that terms used herein are for the purpose of describing specific embodiments only and are not intended to be limiting.
[0038] When a value range is provided, it should be understood that upper and lower limits of the range and each intermediate value between any other specified or intermediate values in the range are included in the disclosure. The upper and lower limits of these smaller ranges may be independently included in the respective smaller ranges, and are also included in the disclosure, subject to any explicitly excluded limits in the specified ranges. Where the specified range includes one or two limits, a range excluding any or both of those included limits is also included in the disclosure.
[0039] Unless otherwise defined, all terms used herein have the same meaning as commonly understood by those skilled in the art to which the disclosure belongs. While any methods and materials similar or equivalent to those described herein can also be used in implementation or testing of the disclosure, preferred methods and materials are described herein.
[0040] For the experimental principle of the method adopted by the disclosure, in which gene segments are randomly inserted by a mini-Mu transposon through a mechanism derived from phage infection, reference may be made to the scientific literature (Ho, T. Y. H., Shao, A., Lu, Z. et al., “A systematic approach to inserting split inteins for Boolean logic gate engineering and basal activity reduction”, Nat Commun 12, 2200 (2021)), the full text of which may be cited for reference.
First Embodiment
[0041] Referring to
[0042] As shown in
[0043] 1. Transposase MuA is used to perform a transposition experiment on a target gene segment, in which a transposon is randomly inserted into the target gene segment; the principle of the reaction ensures that only one transposon segment is inserted into each target gene segment. The transposon segment is a complete expression cassette with a promoter, a terminator and other elements, and expresses a chloramphenicol resistance protein.
[0044] 2. The transposed gene segment is ligated into the pET28a expression vector by a seamless cloning method and transformed into the E. coli Top10 strain for amplification. The colonies are screened on a chloramphenicol-resistance plate.
[0045] 3. All the colonies on the above plate are collected, mixed and cultured, and their plasmids are extracted. Since the transposon carries a NotI restriction enzyme cutting site at each of its upstream and downstream termini, the transposon segment is replaced by the expressed intein Ssp DnaBM86 (corresponding to a cis-intein version of 3. substitution in
[0046]
[0047] As shown in
[0048] 4. A script is first written in Python to predict, for each pair of adjacent amino acid residues in the amino acid sequence, the amino acid sequence that is formed when an intein embedded between the two residues excises itself through a splicing reaction and the adjacent peptide fragments are joined. The predicted amino acid sequences form a protein database.
[0049] 5. The samples obtained in the step 3 are analyzed by mass spectrometry.
[0050] A result of mass spectrometry is shown in
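The database construction of the step 4 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the script used in the disclosure: the function names, the toy gene and the toy "intein" are hypothetical, and the sketch assumes that the inserted segment carries short flanking codons (`left_flank`, `right_flank`, placeholders here) that remain in the product after the intein is excised, so that each insertion site yields a distinguishable sequence. Only in-frame insertions between adjacent codons are enumerated.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate DNA in reading frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def build_split_site_database(gene_dna, intein_dna, left_flank="", right_flank=""):
    """Map each in-frame insertion site k (between residues k and k+1)
    to the amino acid sequence predicted after intein excision."""
    intein_aa = translate(intein_dna)           # target intein amino acid sequence
    insert_dna = left_flank + intein_dna + right_flank
    database = {}
    for k in range(1, len(gene_dna) // 3):
        pos = 3 * k
        # Fuse first gene segment, inserted segment, second gene segment.
        fused_dna = gene_dna[:pos] + insert_dna + gene_dna[pos:]
        # Translate the fused DNA sequence.
        fused_aa = translate(fused_dna)
        # Search for the intein and delete it from the translation.
        start = fused_aa.find(intein_aa)
        if start == -1:
            continue  # intein not recovered in this construct
        database[k] = fused_aa[:start] + fused_aa[start + len(intein_aa):]
    return database


# Toy inputs (placeholders, not sequences from the disclosure).
TOY_GENE = "ATGGCTAAAGGT"      # encodes M-A-K-G
TOY_INTEIN = "TGCATCAGCAAC"    # encodes C-I-S-N
toy_database = build_split_site_database(TOY_GENE, TOY_INTEIN, "GGT", "TCT")
for site, sequence in toy_database.items():
    print(site, sequence)
```

With empty flanks, every entry of the database would equal the original protein, because a cleanly excised intein leaves no trace; the retained flanking residues are what make the entries site-specific in this sketch.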
Second Embodiment
[0051] In this embodiment, the above screening method is used to verify two polypeptide split sites of clustered regularly interspaced short palindromic repeats (CRISPR) associated protein 9 (Cas9).
[0052] Steps 1-5 are substantially the same as those in the first embodiment, except that, in the step 3, “the transposon segment is replaced by the expressed intein Ssp DnaBM86 through enzyme digestion (also referred to as enzyme-cut)” is replaced by “the transposon segment is replaced by a gene segment that comprises an N-terminal of the split intein Ssp DnaBM86, transcriptional regulator elements (TREs) including a terminator and a promoter, and a C-terminal of the split intein Ssp DnaBM86 (corresponding to a split intein version of 3. substitution in
[0053]
[0054] Results of mass spectrometry are shown in
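The comparison of detected peptide fragments with the protein database, used in both embodiments to confirm the split sites, can be sketched as follows. This is a minimal sketch under assumptions, not the analysis pipeline of the disclosure: it assumes that peptide sequences have already been identified from the mass spectra, that the labeled amino acid sequence is known as a short string (`label`, a placeholder here), and that the database maps each candidate site to its predicted sequence. The function name and the toy data are hypothetical.

```python
def confirm_split_sites(peptides, database, label):
    """Return {site: [matching peptides]} for peptides that contain the
    labeled amino acid sequence and occur in a database entry."""
    hits = {}
    for peptide in peptides:
        # Only peptides carrying the label are compared with the database.
        if label not in peptide:
            continue
        for site, sequence in database.items():
            if peptide in sequence:
                hits.setdefault(site, []).append(peptide)
    return hits


# Toy database and toy peptides (placeholders, not data from the disclosure).
toy_db = {1: "MGSAKG", 2: "MAGSKG", 3: "MAKGSG"}
toy_peptides = ["AGSK", "MAK", "KGSG"]
confirmed = confirm_split_sites(toy_peptides, toy_db, label="GS")
print(confirmed)
```

A peptide that matches several entries is reported under every matching site, so an ambiguous peptide is not silently assigned to a single site.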
[0055] It can be seen from the above embodiments that the screening method provided in the disclosure combines computer programming, which is used to construct the protein database, with a final detection realized by mass spectrometry. This expands the existing screening schemes for split sites and can be extended to searching for the split sites of any active protein. After the split sites are confirmed, experiments can be designed for protein assembly, which provides a new approach for subsequent glycosylation experiments.
[0056] It should be noted that the above-described embodiments are only for the purpose of illustrating the disclosure and do not constitute any limitation of the disclosure. The disclosure has been described with reference to exemplary embodiments, but it should be understood that the words used therein are words of description and explanation, not of limitation. The disclosure may be modified within the scope of the claims of the disclosure, and the disclosure may be modified without departing from the scope and spirit of the disclosure. Although the disclosure described herein relates to specific methods, materials, and embodiments, it does not mean that the disclosure is limited to the specific embodiments disclosed therein. On the contrary, the disclosure may be extended to all other methods and applications with the same function.