METHOD FOR BIOSYNTHESIS OF PROTEIN HETEROCATENANE
20230348546 · 2023-11-02
Inventors
Cpc classification
C07K2319/92
CHEMISTRY; METALLURGY
C12N15/70
CHEMISTRY; METALLURGY
C12P21/02
CHEMISTRY; METALLURGY
International classification
C12P21/02
CHEMISTRY; METALLURGY
Abstract
Provided is a method for biosynthesis of a protein heterocatenane. The basic structure of a protein precursor sequence of the protein heterocatenane comprises, from the N terminus to the C terminus: L.sub.1-1-X-L.sub.1-2-(in situ protease digestion site)-L.sub.2-1-X-L.sub.2-2, wherein each X represents an entwining motif for forming dimers, the two Xs can be the same or different, L.sub.1-1/L.sub.1-2 and L.sub.2-1/L.sub.2-2 represent two pairs of cyclization motifs that undergo orthogonal coupling reactions in cellulo, and the two pairs of cyclization motifs can be two orthogonal peptide-protein reactive pairs, a combination of a peptide-protein reactive pair and a split intein, or two orthogonal split inteins. When a peptide-protein reactive pair and a split intein are used in combination, biosynthesis of branched protein heterocatenanes can be achieved; when two orthogonal split inteins are used in combination, a protein heterocatenane having a completely cyclized main chain can be obtained.
Claims
1. A method for biosynthesis of a protein heterocatenane, comprising the following steps: 1) designing a protein precursor sequence of the protein heterocatenane with a basic structure including, from N terminus to C terminus: L.sub.1-1-X-L.sub.1-2-(in situ protease digestion site)-L.sub.2-1-X-L.sub.2-2, wherein X represents a dimer-forming entwining motif, which may be homodimeric or heterodimeric, that is, two Xs may or may not be the same; L.sub.1-1/L.sub.1-2 and L.sub.2-1/L.sub.2-2 represent two pairs of cyclization motifs that undergo an orthogonal coupling reaction in cellulo, and the two pairs of cyclization motifs may be two orthogonal peptide-protein reactive pairs, or combinations of a peptide-protein reactive pair and a split intein, or two orthogonal split inteins; when L.sub.1-1/L.sub.1-2 is the peptide-protein reactive pair, the in situ protease digestion site inserted between L.sub.1-2 and L.sub.2-1 is an essential element, which can be digested in situ by co-expressing a protease intracellularly; otherwise the in situ protease digestion site is a non-essential element; the sequence of a protein of interest is inserted in the basic structure with the insertion sites selected from: before and/or after an X domain, at the N terminus and/or at the C terminus of the peptide-protein reactive pair; 2) constructing a gene sequence encoding the corresponding protein precursor sequence according to step 1) and introducing the gene sequence into an expression vector; 3) transforming the expression vector constructed in step 2) into a cell for expression, and co-expressing, if necessary, the protease that in situ cleaves the digestion site in cellulo; and 4) purifying a fusion protein obtained in step 3) to obtain a corresponding protein heterocatenane.
2. The method according to claim 1, wherein the entwining motif in step 1) is a p53dim domain or a p53dim mutant capable of forming a dimeric structure, where the amino acid sequence of the p53dim domain is as shown in SEQ ID NO:3 in the sequence listing.
3. The method according to claim 1, wherein the peptide-protein reactive pair in step 1) is selected from a SpyTag-SpyCatcher reactive pair and a SnoopTag-SnoopCatcher reactive pair.
4. The method according to claim 3, wherein the amino acid sequences of SpyTag and SpyCatcher in the SpyTag-SpyCatcher reactive pair are as shown in SEQ ID NO:1 and SEQ ID NO:2 in the sequence listing, respectively.
5. The method according to claim 1, wherein the split intein in step 1) is an NpuDnaE split intein, which consists of IntC1 and IntN1 or IntC2 and IntN2 as a cyclization motif, and the amino acid sequences of IntC1, IntN1, IntC2, and IntN2 are as shown in SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, and SEQ ID NO:7 in the sequence listing, respectively.
6. The method according to claim 1, wherein the in situ protease digestion site designed in step 1) is a recognition sequence ETVRFQG of a TVMV protease or a recognition sequence ENLYFQG of a TEV protease; and accordingly the TVMV protease or the TEV protease is co-expressed in step 3).
7. The method according to claim 1, wherein a histidine tag sequence is introduced before a second entwining motif X in step 1), and protein purification is performed by affinity chromatography on a nickel column in step 4).
8. The method according to claim 1, wherein the basic structure of the protein precursor sequence designed in step 1) is SpyCatcher-p53dim-SpyTag-IntC1-p53dim-IntN1, which is, from N terminus to C terminus in order, a cyclization reaction motif SpyCatcher, an entwining motif p53dim domain, a cyclization reaction motif SpyTag, a C-terminal part IntC1 of the split intein, an entwining motif p53dim domain, and an N-terminal part IntN1 of the split intein; a recognition sequence of a TVMV protease is inserted between SpyTag and IntC1, and a histidine tag sequence is introduced before a second p53dim domain; a fusion site for one or more identical or different proteins of interest is selected from: before and/or after the p53dim domain, at the N terminus of SpyCatcher, and at the C terminus of SpyTag.
9. The method according to claim 1, wherein the basic structure of the protein precursor sequence designed in step 1) is IntC1-p53dim-IntN1-IntC2-p53dim-IntN2, which is, from the N terminus to the C terminus in order, a C-terminal part IntC1 of the split intein, an entwining motif p53dim domain, an N-terminal part IntN1 of the split intein, a C-terminal part IntC2 of the split intein, an entwining motif p53dim domain, and an N-terminal part IntN2 of the split intein; a histidine tag sequence is introduced before a second p53dim domain; and one or more identical or different proteins of interest are inserted before and/or after two p53dim domains.
10. The method according to claim 1, wherein for the protein heterocatenane in which a histidine tag sequence is introduced in step 4), an expressed protein is purified by affinity chromatography on a nickel column, and the purity of the protein heterocatenane is further improved by gradient elution or size exclusion chromatography.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0054]
[0055]
[0056]
[0057]
[0058]
[0059]
[0060]
DETAILED DESCRIPTION
[0061] The present disclosure is further described in detail below by way of examples, which are not intended to limit the scope of the present disclosure in any way.
[0062] Protein precursors involved in biosynthesis of protein heterocatenanes and their corresponding expression systems are constructed by the following specific steps:
[0063] 1) For the system in which the synthesis of protein heterocatenanes is mediated jointly by the SpyTag-SpyCatcher reactive pair and the split intein IntC1/IntN1, a gene sequence containing a 6×His tag (for protein purification), the SpyTag-SpyCatcher reactive pair, p53dim domains, and the split intein IntC1/IntN1, i.e., SpyCatcher(B)-p53dim(X)-SpyTag(A)-IntC1-p53dim(X)-IntN1 (BXA-IntC1-X-IntN1), is constructed by recombinant genetic engineering. On the basis of this gene sequence, a folded protein AffiHER2 is further introduced at the N terminus of SpyCatcher and at the C terminus of SpyTag to construct a second gene sequence, i.e., AffiHER2-SpyCatcher(B)-p53dim(X)-SpyTag(A)-AffiHER2-IntC1-p53dim(X)-IntN1 (AffiHER2-BXA-AffiHER2-IntC1-X-IntN1). The two gene sequences are each inserted into the expression vector pMCSG19 and transformed into BL21(DE3) competent cells containing the pRK1037 plasmid, which encodes the TVMV protease. During expression, the biosynthesis of protein heterocatenanes cat-BXA-X and cat-(AffiHER2-BXA-AffiHER2)-X is achieved by in situ assembly, protease digestion and site-specific cyclization.
[0064] 2) For the system in which the synthesis of protein heterocatenanes is mediated by orthogonal split inteins, a gene sequence containing a 6×His tag (for protein purification), p53dim domains, the split inteins IntC1/IntN1 and IntC2/IntN2, and a protein of interest SUMO/GFP, i.e., IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-IntN2) or IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-GFP-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-GFP-IntN2), is constructed by recombinant genetic engineering. The two gene sequences are each inserted into the expression vector pMCSG19 and transformed into BL21(DE3) competent cells for expression. During expression, the biosynthesis of protein heterocatenanes cat-XSUMO-X and cat-XSUMO-XGFP is achieved by in situ assembly and the cyclization reactions mediated by the orthogonal split inteins.
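The domain order described above can be sketched as a small helper. This is only an illustrative sketch: the function names and the string labels are hypothetical, the placement of the histidine tag before the second entwining motif follows claims 7 and 8, and the rule that the protease site is needed only when L.sub.1-1/L.sub.1-2 is a peptide-protein reactive pair follows claim 1. The recognition sequences are those quoted in claim 6.

```python
# Recognition sequences quoted in claim 6 of this document.
PROTEASE_SITES = {"TVMV": "ETVRFQG", "TEV": "ENLYFQG"}

# Motif names used in the text; grouping them into a set is an illustrative choice.
PEPTIDE_PROTEIN = {"SpyCatcher", "SpyTag", "SnoopCatcher", "SnoopTag"}


def needs_protease_site(l1_1: str, l1_2: str) -> bool:
    """Claim 1: the in situ site is essential when L1-1/L1-2 is a
    peptide-protein reactive pair, and non-essential otherwise."""
    return l1_1 in PEPTIDE_PROTEIN and l1_2 in PEPTIDE_PROTEIN


def precursor_layout(l1, l2, x=("p53dim", "p53dim"), protease="TVMV"):
    """Return the N-to-C domain order L1-1, X, L1-2, (site), L2-1, X, L2-2.

    The His tag is placed before the second entwining motif (claims 7 and 8).
    """
    l1_1, l1_2 = l1
    l2_1, l2_2 = l2
    layout = [l1_1, x[0], l1_2]
    if needs_protease_site(l1_1, l1_2):
        layout.append(f"{protease} site ({PROTEASE_SITES[protease]})")
    layout += [l2_1, "His tag", x[1], l2_2]
    return layout


if __name__ == "__main__":
    # Mixed system: reactive pair plus split intein, so a protease site is included.
    print("-".join(precursor_layout(("SpyCatcher", "SpyTag"), ("IntC1", "IntN1"))))
    # Orthogonal split inteins only, so no protease site is included.
    print("-".join(precursor_layout(("IntC1", "IntN1"), ("IntC2", "IntN2"))))
```

Proteins of interest such as AffiHER2, SUMO or GFP are left out of this sketch; they would be added at the insertion sites listed in claim 1.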
[0065] The prepared protein heterocatenanes are subjected to basic characterization and their topologies are proven through sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), ultra-performance liquid chromatography-mass spectrometry (LC-MS), and TEV protease digestion reaction.
Example 1: Biosynthesis of Protein Heterocatenanes Cat-BXA-X and Cat-(AffiHER2-BXA-AffiHER2)-X Using the Co-Expression System of pMCSG19/pRK1037
[0066] Gene fragments of BXA-IntC1-X-IntN1 and AffiHER2-BXA-AffiHER2-IntC1-X-IntN1 were each inserted into the expression vector pMCSG19, with the sequences shown in SEQ ID NO:8 and SEQ ID NO:9 in the sequence listing. The resulting constructs were confirmed by sequencing, then transformed into BL21(DE3) competent cells containing the pRK1037 plasmid, and incubated overnight at 37° C. on Amp-Kan plates containing 100 μg/mL of sodium ampicillin (Amp) and 50 μg/mL of kanamycin (Kan). Thereafter, single colonies were picked, inoculated into 5 mL of 2×YT medium with the same antibiotics, and shaken at 37° C. for 10 to 12 hours to prepare a seed broth. The seed broth was inoculated at a ratio of 1:100 into 250 mL of 2×YT medium with the same antibiotics, and the cultures were shaken at 37° C. until OD.sub.600 was between 0.5 and 0.7. Isopropyl-β-D-thiogalactopyranoside (IPTG) was added to a final concentration of 0.25 mM, and the cultures were then shaken at 16° C. for 20 hours for expression.
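The inoculation and induction volumes implied by this protocol can be worked out with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the 1 M IPTG stock concentration is an assumption (the text gives only the 0.25 mM final concentration), and the small volume added by the stock itself is ignored.

```python
def seed_volume_ml(culture_ml: float, dilution: float = 100.0) -> float:
    """Seed broth volume for a 1:dilution inoculation into culture_ml of medium."""
    return culture_ml / dilution


def iptg_stock_volume_ul(culture_ml: float,
                         final_mM: float = 0.25,
                         stock_mM: float = 1000.0) -> float:
    """Volume of IPTG stock (in microlitres) for the target final concentration.

    stock_mM defaults to 1000 mM (1 M), which is an assumed stock concentration.
    Uses C1*V1 = C2*V2 and neglects the volume of the added stock.
    """
    return culture_ml * final_mM / stock_mM * 1000.0


if __name__ == "__main__":
    print(seed_volume_ml(250.0))         # mL of seed broth for a 250 mL culture
    print(iptg_stock_volume_ul(250.0))   # uL of the assumed 1 M IPTG stock
```

With a different stock concentration, only `stock_mM` needs to change.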
Example 2: Biosynthesis of Protein Heterocatenanes Cat-XSUMO-X and Cat-XSUMO-XGFP
[0067] Gene fragments of IntC1-X-SUMO-IntN1-IntC2-X-IntN2 and IntC1-X-SUMO-IntN1-IntC2-X-GFP-IntN2 were each inserted into the expression vector pMCSG19, with the sequences shown in SEQ ID NO:10 and SEQ ID NO:11 in the sequence listing. The resulting constructs were confirmed by sequencing, then transformed into BL21(DE3) competent cells, and incubated overnight at 37° C. on plates containing 100 μg/mL of sodium ampicillin. Thereafter, single colonies were picked, inoculated into 5 mL of 2×YT medium with the same antibiotic, and shaken at 37° C. for 10 to 12 hours to prepare a seed broth. The seed broth was inoculated at a ratio of 1:100 into 250 mL of 2×YT medium with the same antibiotic, and the cultures were shaken at 37° C. until OD.sub.600 was between 0.5 and 0.7. Isopropyl-β-D-thiogalactopyranoside (IPTG) was added to a final concentration of 0.25 mM, and the cultures were then shaken at 16° C. for 20 hours for expression.
Example 3: Purification of Protein Heterocatenanes
[0068] Upon completion of protein expression, the bacterial cells were collected by centrifugation (5500×g, 15 min) in a high-speed refrigerated centrifuge and the supernatant was discarded. The cells were re-suspended in lysis buffer A (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 10 mM imidazole, pH 8.0). The suspension was sonicated with an ultrasonic homogenizer in an ice-water bath (5 seconds on, 5 seconds off, 30% intensity) and then centrifuged (12000×g, 30 min) to collect the supernatant. The supernatant was mixed well with Ni-NTA resin and incubated at 4° C. for 1 hour. The mixture was poured into an empty PD-10 column for purification, and after the lysate had drained, the resin was washed with 5 to 10 resin volumes of wash buffer B (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 20 mM imidazole, pH 8.0) to reduce non-specific adsorption. The protein heterocatenanes cat-BXA-X, cat-(AffiHER2-BXA-AffiHER2)-X and cat-XSUMO-X could be eluted directly with elution buffer C (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 250 mM imidazole, pH 8.0). To increase purity, the protein heterocatenane cat-XSUMO-XGFP was subjected to gradient elution: the resin was first eluted with about 10 resin volumes of elution buffer D (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 50 mM imidazole, pH 8.0), and this eluate, which mainly contained the heterocatenane, was collected; the cyclic or catenated GFP by-product was then eluted with elution buffer C.
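The four buffers differ only in imidazole concentration, so their recipes can be tabulated with one helper. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the phosphate is taken to be anhydrous NaH2PO4 (the text does not state the hydrate form), the batch volume is a free parameter, and adjusting the pH to 8.0 is not modeled.

```python
# Molar masses in g/mol; NaH2PO4 is assumed to be the anhydrous salt.
MOLAR_MASS = {"NaH2PO4": 119.98, "NaCl": 58.44, "imidazole": 68.08}

# Imidazole concentration (mM) of each buffer as given in the text.
IMIDAZOLE_MM = {"A": 10.0, "B": 20.0, "D": 50.0, "C": 250.0}


def buffer_masses_g(imidazole_mM: float, volume_l: float = 1.0) -> dict:
    """Grams of each solute for a buffer with 50 mM NaH2PO4, 300 mM NaCl
    and the given imidazole concentration, in volume_l litres."""
    conc_mM = {"NaH2PO4": 50.0, "NaCl": 300.0, "imidazole": imidazole_mM}
    return {
        name: mM / 1000.0 * MOLAR_MASS[name] * volume_l
        for name, mM in conc_mM.items()
    }


if __name__ == "__main__":
    for label, imidazole in IMIDAZOLE_MM.items():
        masses = buffer_masses_g(imidazole, volume_l=1.0)
        row = ", ".join(f"{k}: {v:.2f} g" for k, v in masses.items())
        print(f"buffer {label}: {row}")
```

If a hydrated phosphate salt is used instead, the corresponding molar mass in `MOLAR_MASS` would need to be replaced.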
[0069] The protein eluate was further purified using a fast protein liquid chromatography system (ÄKTA pure, GE Healthcare) with a size exclusion chromatography (SEC) column (Superdex 200 Increase 10/300 GL, GE Healthcare). The mobile phase was phosphate buffered saline (PBS, pH 7.4) filtered through a 0.22 μm filter, at a flow rate of 0.5 mL/min. The elution peak of the protein was monitored by UV absorption at 280 nm, and the sample was collected for characterization.
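As a rough planning aid, the run time at the stated flow rate can be estimated from the column volume. This sketch assumes a bed volume of about 24 mL for a 10/300 GL column, a figure that is not given in the text and should be checked against the column documentation; the function name is hypothetical.

```python
def minutes_for_volume(volume_ml: float, flow_ml_per_min: float = 0.5) -> float:
    """Time in minutes needed to pump volume_ml at the given flow rate."""
    return volume_ml / flow_ml_per_min


if __name__ == "__main__":
    ASSUMED_COLUMN_VOLUME_ML = 24.0  # assumption, not stated in the text
    print(minutes_for_volume(ASSUMED_COLUMN_VOLUME_ML))  # one column volume
```

The same helper converts any elution volume read from the chromatogram into an elution time.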
Example 4: Characterization of Protein Heterocatenanes
[0070] The protein heterocatenanes purified in Example 3 were first mixed with 5×SDS loading buffer, heated at 98° C. for 10 min, and then characterized by SDS-PAGE. After the protein samples purified by SEC were buffer-exchanged into ddH.sub.2O with an ultrafiltration tube, LC-MS was used to determine their molecular weights. Protein concentrations were determined with an ultra-micro spectrophotometer (NanoPhotometer P330, Implen, Inc.). To prove the heterocatenane topology, the protein solution (10 μM) and TEV protease solution (10 μM) were mixed at a molar ratio of 20:1 and proteolysis was carried out at 37° C. for 1, 3, or 6 hours; the protease digestion could be substantially complete within 3 hours. After the protease digestion, 10 μL of the proteolytic products were mixed with 5×SDS loading buffer and heated at 98° C. for 10 min to quench the reaction. The product composition after digestion was characterized by SDS-PAGE. After the remaining digestion mixture was buffer-exchanged into ddH.sub.2O with an ultrafiltration tube, LC-MS was used to confirm the molecular weights of the proteolytic products. The results of the SEC characterization after affinity purification on a nickel column, the SDS-PAGE characterization before and after the enzyme digestion, and the LC-MS characterization of cat-BXA-X, cat-(AffiHER2-BXA-AffiHER2)-X, cat-XSUMO-X, and cat-XSUMO-XGFP were as shown in