Cleavable fusion tag for protein overexpression and purification

Abstract

Provided are compositions and methods for enhancing recombinant protein production. The compositions and methods involve use of Ribose Binding Protein (RBP) as a segment of a fusion polypeptide, whereby the RBP segment enhances production of the fusion protein. The fusion proteins contain the RBP sequentially in a single fusion protein with a polypeptide for which enhanced expression is desired. Recombinant expression vectors encoding the fusion proteins that contain an RBP segment are included, as are cells that contain the expression vectors. Methods for separating fusion proteins and for liberating a polypeptide segment that is part of the fusion protein are also provided.

Claims

1. A fusion protein comprising: a target protein segment; and a Ribose Binding Protein (RBP) segment comprising a contiguous portion of a RBP amino acid sequence, wherein the contiguous portion is at least 80% similar to amino acids 34-211 of SEQ ID NO:2, the fusion protein does not include a signal peptide, at least 80% identical to SEQ ID NO: 17, that targets the fusion protein to the periplasm, and wherein the target protein is operatively linked to an N-terminal or C-terminal end of RBP for protein overexpression.

2. The fusion protein of claim 1, wherein the target protein is operatively linked to the N-terminal end of the RBP segment.

3. The fusion protein of claim 1, further comprising a linker peptide segment positioned between the target protein segment and the RBP segment and configured to liberate the target protein segment from the RBP segment.

4. The fusion protein of claim 3, wherein the linker peptide segment comprises a cleavage site for separating the target protein segment from the RBP segment.

5. The fusion protein of claim 1, wherein the target protein is operatively linked to the C-terminal end of the RBP segment.

6. The fusion protein of claim 1, wherein the target protein has a native structure and function.

7. The fusion protein of claim 1, wherein the fusion protein does not oligomerize in solution with proteins that have the same amino acid sequence of the fusion protein.

8. A method of using a recombinant expression vector to increase production of a target protein, the method comprising: obtaining a recombinant expression vector encoding a fusion protein, the recombinant expression vector comprising an uninterrupted nucleotide sequence encoding the target protein, and a nucleotide sequence coding for a RBP segment that is at least 80% similar to amino acids 34 to 211 of SEQ ID NO:2, wherein the uninterrupted nucleotide sequence and the nucleotide sequence are operatively linked, and the fusion protein does not include a signal peptide, at least 80% identical to SEQ ID NO: 17, that targets the fusion protein to the periplasm; introducing the recombinant expression vector into a host cell; culturing the host cell transformed with the recombinant expression vector; lysing the host cell to recover the fusion protein; and isolating the fusion protein.

9. The method of claim 8, wherein the recombinant expression vector further comprises a nucleic acid sequence coding for a peptidic linker comprising a proteolytic cleavage site, wherein after isolating the fusion protein, the proteolytic cleavage site is cleaved to free the target protein from the RBP segment.

10. The method of claim 9, wherein the target protein has a native structure.

11. The method of claim 8, wherein the fusion protein comprises an affinity peptide to facilitate isolating the fusion protein.

12. The method of claim 11, wherein the affinity peptide is a Histidine tail.

13. The method of claim 8, wherein the host cell is a prokaryote.

14. The method of claim 8, wherein the host cell is a eukaryote.

15. A recombinant expression vector encoding a fusion protein, the recombinant expression vector comprising an uninterrupted first nucleotide sequence coding for a target protein; and a second nucleotide sequence coding for a RBP amino acid sequence that is at least 80% similar to amino acids 34-211 of SEQ ID NO:2, wherein the uninterrupted first nucleotide sequence and the second nucleotide sequence are operatively linked, and the expression vector does not code for a signal peptide, at least 80% identical to SEQ ID NO: 17, that targets the fusion protein to the periplasm.

16. The recombinant expression vector of claim 15, wherein the target protein is operatively linked to the C-terminal end of the RBP segment.

17. The recombinant expression vector of claim 15, wherein the target protein is operably linked to the C-terminal end of the RBP segment.

Description

BRIEF DESCRIPTION OF FIGURES

(1) FIG. 1. Schematic representation of resultant fusion protein. The tteRBP tag is shown in red, protein of interest in blue, optional purification tags in green, and optional linker sequences in white. The tteRBP is presented at the N-terminal end of the protein of interest, but it may also be placed at the C-terminal end.

(2) FIG. 2. Schematic of E. coli expression plasmid pRExpress. This is a schematic of representative bacterial expression plasmid that comprises the DNA sequence for the tteRBP tag, functional homologue, fragment, or derivative. It includes a representative promoter (e.g. the T7 promoter shown in cyan), and additional expression-control elements required for the particular expression system being used (e.g. the lacI gene (orange) and lac operator (blue)), the tteRBP, functional homologue, fragment, or derivative expression tag (red), a multiple cloning site containing the sequences for restriction endonucleases (green), a selection element such as antibiotic resistance (e.g. the ampR gene shown in magenta), and an origin of replication to allow the cells to synthesize more plasmids as they grow (black). Each of these elements may be tailored to the different expression systems they are being used in (e.g. using the Aox1 or Aox2 promoters instead of T7 promoter for methanol induction in the model organism Pichia pastoris).

(3) FIG. 3. Full length human p53 gels. Shown is an SDS-PAGE gel of full-length p53 expressed as fusion proteins to either an N-terminal 6His tag (left) or 6His-tteRBP tag (right) through a cleavable linker containing the HRV 3C recognition site. Additionally, the gel on the right shows samples taken before and after overnight cleavage with GST-tagged HRV 3C protease. These gels demonstrate a marked increase in soluble expression of the fusion protein.

(4) FIG. 4. Shown is an SDS-PAGE gel of whole cell lysates of uninduced BL21 (DE3) with WDR5 expressed alone, or as a tteRBP fusion protein (both proteins contained N-terminal 6His tags). Molecular weight standard is the CLEARLY protein ladder (Unstained) (Clontech Laboratories Inc., Mountain View, Calif.). This figure clearly demonstrates that the fusion protein expresses at substantially higher levels than the unfused WDR5.

(5) FIG. 5. Shown is an SDS-PAGE gel of HRV3C protease purified using our system and of HRV3C protease purified using the GST-tag obtained from a commercial source. Molecular weight standard is the CLEARLY protein ladder (unstained) (Clontech Laboratories Inc., Mountain View, Calif.). This figure demonstrates that the fusion protein of tteRBP and HRV3C protease can be expressed and purified from E. coli.

DETAILED DESCRIPTION

(6) Unless defined otherwise herein, all technical and scientific terms used in this disclosure have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains.

(7) Every numerical range given throughout this specification includes its upper and lower values, as well as every narrower numerical range that falls within it, as if such narrower numerical ranges were all expressly written herein.

(8) Every DNA sequence disclosed herein includes its complementary DNA sequence, and also includes the RNA equivalents thereof. Every DNA and RNA sequence encoding the polypeptides disclosed herein is encompassed by this disclosure, including but not limited to all fusion proteins, and all of the Ribose Binding Protein (RBP) segment of fusion proteins, including but not limited to those comprising N-terminal and/or C-terminal truncations of the RBP segment.
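
By way of non-limiting illustration only, the relationship between a disclosed DNA sequence, its complementary DNA sequence, and its RNA equivalent can be sketched as follows. The function names and the short example sequence are hypothetical and are not taken from the sequence listing; only unambiguous bases (A, C, G, T) are assumed.

```python
# Illustrative sketch only; names are hypothetical and not part of the disclosure.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence (T replaced by U)."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    example = "ATGAAA"  # hypothetical six-base fragment
    print(reverse_complement(example))
    print(rna_equivalent(example))
```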

(9) The present disclosure encompasses compositions and methods for improving production of recombinantly produced protein. In embodiments the disclosure comprises recombinant expression vectors and methods of using them to produce proteins. In general the expression vectors encode at least one fusion protein comprising a segment that includes a polypeptide of interest (also referred to herein as a target protein) and a segment that comprises a Ribose Binding Protein (RBP) or at least a contiguous portion of an RBP.

(10) In embodiments, the RBP encoded by the expression vector comprises an RBP from a prokaryote, such as an archaea, which may be a thermophilic and/or anaerobic microorganism. In an embodiment, the RBP is from Thermoanaerobacter tengcongensis (T. tengcongensis), which is referred to herein as tteRBP. In embodiments, the RBP comprises a functional homologue, fragment, or derivative of tteRBP or a segment thereof which retains the capability to enhance production of a fusion protein into which it is inserted. Enhanced protein production means in one embodiment that more of the fusion protein is produced than a value for a suitable reference. In embodiments, the reference can be a value obtained by production of the protein into which the RBP or segment thereof has not been inserted. In embodiments, the disclosure includes increasing production of a recombinant protein by at least 10% relative to a reference, and can comprise increasing production of a recombinant protein by from 10%-80%, inclusive, relative to a reference, or more than 80% relative to a reference.
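
As a hedged illustration of the comparison to a reference described above, the percent increase in production can be computed from two yield measurements. The function names, the units, and the example numbers are assumptions made for illustration; the disclosure does not prescribe a particular measurement or calculation.

```python
# Illustrative sketch only; names and example values are hypothetical.
def percent_increase(fusion_yield: float, reference_yield: float) -> float:
    """Percent increase of the fusion protein yield over the reference yield."""
    if reference_yield <= 0:
        raise ValueError("reference yield must be positive")
    return 100.0 * (fusion_yield - reference_yield) / reference_yield


def is_enhanced(fusion_yield: float, reference_yield: float,
                threshold_pct: float = 10.0) -> bool:
    """True when production is increased by at least threshold_pct percent."""
    return percent_increase(fusion_yield, reference_yield) >= threshold_pct


if __name__ == "__main__":
    # Hypothetical yields in arbitrary but matching units (e.g. mg per liter).
    print(percent_increase(15.0, 10.0))
    print(is_enhanced(15.0, 10.0))
```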

(11) In embodiments, the RBP comprises an amino acid sequence that is at least 80% similar to SEQ ID NO:2, or to a contiguous segment of SEQ ID NO:2. In embodiments, the RBP comprises an amino acid sequence that is 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% identical to SEQ ID NO:2 or a segment of it, for example, a segment that comprises at least 178 amino acids. Thus, in certain embodiments, an RBP segment of this disclosure comprises variations in sequence relative to SEQ ID NO:2. Such variations can comprise conservative or non-conservative amino acid substitutions, insertions, and deletions. In embodiments, the RBP component of the fusion protein comprises a mutation relative to its naturally occurring sequence. In one embodiment the mutation is a Cys102Ser alteration. In certain implementations, the RBP component of a fusion protein lacks a signal peptide, and thus the disclosure also includes entire fusion proteins which lack a signal peptide. The term "lacks a signal peptide" means that either the construct in fact lacks the signal peptide sequence, or the signal peptide is modified so that it lacks signal peptide function. In an embodiment, the fusion protein lacks a signal peptide that functions to transport the protein to the periplasm (N-terminal amino acid sequence RKSRILLLLTIFVTSAALILSGCKTNTPNTASTST (SEQ ID NO: 17)).
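
For illustration only, one common way to express percent identity between two sequences that have already been aligned is sketched below. The disclosure does not specify an alignment method or a formula, so the convention used here (identical residues divided by alignment columns, with "-" as the gap character) is an assumption, as are the function name and the short example strings. The sketch computes identity only; similarity scoring with a substitution matrix is not shown.

```python
# Illustrative sketch only; the identity convention and names are assumptions.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical five-residue alignment with one substitution.
    print(percent_identity("MKEGT", "MKEGS"))
```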

(12) In embodiments the RBP component of the fusion protein is a segment of a full-length RBP (but lacking a signal sequence). We have also determined that if 34 or more amino acids are removed from the N-terminus, or 68 or more amino acids are removed from the C-terminus, the protein loses much of its stability and native structure/function as measured by melting temperature, far UV circular dichroism spectrum, 2D NMR spectrum, and ribose binding ability. Thus, it is considered that a truncation of the first 34 or more N-terminal amino acids, or the last 68 or more C-terminal amino acids of SEQ ID NO:2, exceeds the limits of how much the ends of the tteRBP component can be shortened, yet still function to increase expression and solubility. However, an RBP component of the fusion protein that has shorter truncations of amino acids at its N-terminus, its C-terminus, or at both the N- and C-termini, may still have utility as a solubility and expression tag. Therefore, the disclosure includes fusion proteins which comprise truncations at the N-terminus of the RBP component of from 1-33 amino acids, inclusive and including all integers and all ranges of integers there between, and at the C-terminus of the RBP component of from 1-67 amino acids, inclusive and including all integers and all ranges of integers there between.
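
The truncation limits stated above (at most 33 residues removed from the N-terminus and at most 67 removed from the C-terminus) can be expressed, purely as a non-limiting sketch, as a simple range check. The constant and function names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical. Limits are those stated
# in the text: removing 34 or more N-terminal residues, or 68 or more
# C-terminal residues, is considered to exceed the tolerated truncation.
MAX_N_TRUNCATION = 33
MAX_C_TRUNCATION = 67


def truncation_within_limits(n_removed: int, c_removed: int) -> bool:
    """True when both terminal truncations fall within the stated limits."""
    if n_removed < 0 or c_removed < 0:
        raise ValueError("number of removed residues cannot be negative")
    return n_removed <= MAX_N_TRUNCATION and c_removed <= MAX_C_TRUNCATION


if __name__ == "__main__":
    print(truncation_within_limits(33, 67))  # both at the upper limit
    print(truncation_within_limits(34, 0))   # N-terminal truncation too long
```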

(13) In embodiments, the RBP component of the fusion protein comprises a contiguous segment of SEQ ID NO:2 that includes amino acid number 34 of SEQ ID NO:2 at its N-terminus. In embodiments, the RBP component of the fusion protein comprises a contiguous segment of SEQ ID NO:2 that includes the amino acid at position 211 of SEQ ID NO:2 at its C-terminus. In embodiments, the RBP component of the fusion protein comprises a contiguous segment of SEQ ID NO:2 that comprises or consists of a segment of SEQ ID NO:2 having the amino acid at position 34 and the amino acid at position 211 of SEQ ID NO:2 at its N- and C-terminus, respectively. In embodiments, the RBP component of the fusion protein is from 278 to 211 amino acids in length. In one embodiment, the RBP component is at least 244 amino acids in length. In embodiments, the fusion protein comprises a tteRBP component lacking the signaling peptide and comprising amino acids 1-211, 1-259, or 34-278, of SEQ ID NO:2. SEQ ID NO:2 is: MKEGXTIGLVISTLNPFFVTKGAWEKLGYKIIVEDSQNDSSKELSNVEDLIQQKVDVLLINPVDSDAVVTAIKEANSKNIPVITIDRSANGGDVVSHIASDNVKGGEMIAAEFIAKALKGKGNVVELEGIPGASAARDRGKGFDEAIAKYPDIKIVAKQAADFDRSKGLSVMENILQAQPKIDAVFAQNDEMALGAIKAIEAANRQGIZVIGDGTEDALKAIKEGMAATIAQQPALMGSLGVEMADKYLKGEKIPNFIPAELKLITKENVQ. The bold italicized amino acids indicate those that have been determined in accordance with this invention to be dispensable for use in the enhanced protein production approaches of this disclosure.

(14) In embodiments, the fusion proteins do not comprise ubiquitin. In embodiments the fusion proteins do not comprise any segment of ubiquitin that can enhance production of the fusion protein in which the ubiquitin segment is contained, relative to production of an otherwise same fusion protein but in which the ubiquitin segment is not present. In embodiments the fusion proteins do not comprise a ubiquitin-like protein, Apoptosis Stimulating Protein of p53 2 (ASPP2), an isoform of ASPP2, or General Control Protein 4 (GCN4). In embodiments the fusion proteins of the present disclosure do not bind to one another in solution, and/or do not oligomerize, and/or do not undergo domain swapping with one another and thus do not bind to other of the same or similar fusion proteins in trans, and/or do not bind to one another in cis. In an embodiment, the fusion proteins do not form a network, such as a branched network, or a gel comprising the fusion proteins. In embodiments, fusion proteins of this disclosure retain their native-like structure, which can be determined, for example, using near-UV circular dichroism spectroscopy (CD), electrophoretic mobility shift assay (EMSA), gel-filtration chromatography, or any other suitable approach for determining protein structure. In embodiments, isolated fusion proteins of this disclosure retain their native-like structure. In embodiments, a fusion protein of this disclosure may comprise only a single RBP, even if the RBP is interrupted by a distinct polypeptide sequence. In embodiments a fusion protein of this disclosure can include only one protein of interest, which may be N-terminal to the RBP segment, C-terminal to the RBP segment, or flanked by RBP segments.

(15) A representative polynucleotide sequence encoding tteRBP is provided in SEQ ID NO: 1. Those skilled in the art will recognize that, due to the redundancy of the genetic code, there are a multitude of polynucleotide sequences that can encode tteRBP, and each of these sequences is included within the scope of this disclosure. This also pertains to the other DNA sequences that encode representative and non-limiting examples of fusion proteins provided by this disclosure as further described in the Examples.
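
To illustrate the redundancy of the genetic code referred to above, the following sketch counts how many distinct coding sequences can encode a given peptide. It assumes the standard genetic code, excludes the stop codon, and rejects ambiguous residue symbols; the function name and the example peptide are hypothetical.

```python
# Illustrative sketch only; assumes the standard genetic code.
# Number of codons per amino acid (61 sense codons in total).
CODON_COUNTS = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences (without stop codon) for a peptide."""
    total = 1
    for residue in peptide.upper():
        if residue not in CODON_COUNTS:
            raise ValueError(f"unsupported residue symbol: {residue}")
        total *= CODON_COUNTS[residue]
    return total


if __name__ == "__main__":
    # Hypothetical four-residue peptide: 1 * 2 * 2 * 4 possible encodings.
    print(count_encodings("MKEG"))
```

Because the count is a product over residues, it grows very quickly with peptide length, which is why the text refers to a multitude of encoding sequences.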

(16) The polypeptide encoded by the expression vector along with the RBP segment may be any polypeptide of interest. A target polypeptide according to the present disclosure may be any polypeptide that is required or desired in larger amounts and that may therefore be difficult to isolate or purify from other sources. Non-limiting examples of target proteins that can be produced by the present methods include mammalian gene products, such as enzymes, cytokines, growth factors, hormones, vaccines, antibodies and the like. In embodiments, overexpressed gene products of the present disclosure include gene products such as erythropoietin, insulin, somatotropin, growth hormone releasing factor, platelet derived growth factor, epidermal growth factor, transforming growth factor α, transforming growth factor β, epidermal growth factor, fibroblast growth factor, nerve growth factor, insulin-like growth factor I, insulin-like growth factor II, clotting Factor VIII, superoxide dismutase, -interferon, -interferon, interleukin-1, interleukin-2, interleukin-3, interleukin-4, interleukin-5, interleukin-6, granulocyte colony stimulating factor, multi-lineage colony stimulating activity, granulocyte-macrophage stimulating factor, macrophage colony stimulating factor, T cell growth factor, lymphotoxin and the like. In embodiments overexpressed gene products are human gene products. The present methods can readily be adapted to enhance secretion of any overexpressed gene product which can be used as a vaccine. Overexpressed gene products which can be used as vaccines include any structural, membrane-associated, membrane-bound or secreted gene product of a mammalian pathogen. Mammalian pathogens include viruses, bacteria, single-celled or multi-celled parasites which can infect or attack a mammal. For example, viral vaccines can include vaccines against viruses such as human immunodeficiency virus (HIV), vaccinia, poliovirus, adenovirus, influenza, hepatitis A, hepatitis B, dengue virus, Japanese B encephalitis, Varicella zoster, cytomegalovirus, hepatitis A, rotavirus, as well as vaccines against viral diseases like measles, yellow fever, mumps, rabies, herpes, influenza, parainfluenza and the like. Bacterial vaccines can include vaccines against bacteria such as Vibrio cholerae, Salmonella typhi, Bordetella pertussis, Streptococcus pneumoniae, Haemophilus influenzae, Clostridium tetani, Corynebacterium diphtheriae, Mycobacterium leprae, R. rickettsii, Shigella, Neisseria gonorrhoeae, Neisseria meningitidis, Coccidioides immitis, Borrelia burgdorferi, and the like. A target polypeptide may also comprise sequences, e.g., diagnostically relevant epitopes, from several different proteins constructed to be expressed as a single recombinant polypeptide.

(17) Variants of the RBP or target protein bearing one or several amino acid substitutions or deletions are also included in this disclosure. The skilled artisan can easily assess whether such variants, e.g., fragments or mutants, are appropriate for a method of this disclosure by, for example, using the procedures as described in the Examples.

(18) As described above, in embodiments the present disclosure provides polypeptides comprising at least one polypeptide domain corresponding to the tteRBP used as an expression tool and at least one polypeptide domain corresponding to the target protein. In embodiments, the tteRBP component is referred to as a solubility and expression tag.

(19) A representative and non-limiting configuration of a fusion protein of this disclosure is provided in FIG. 1 wherein the location of an optional linker polypeptide of 10-100 amino acid residues is depicted. As the skilled artisan will appreciate, such a linker polypeptide is designed as most appropriate for the intended application, especially in terms of length, flexibility, charge, and hydrophilicity. For example, in the case of a hydrophobic target protein the linker polypeptide may contain an appropriate number of hydrophilic amino acids. In embodiments the present disclosure also relates to fusion proteins which comprise the target polypeptide and one or two tteRBP solubility and expression tags, or domains thereof, and appropriate peptidic linker sequences between domains. For such applications where the target protein is desired in free form a linker peptide or linker peptides can be used. Such linkers contain an appropriate proteolytic cleavage site. Peptide sequences appropriate for proteolytic cleavage are well-known to the skilled artisan and comprise amongst others, e.g., Ile-Glu-Gly-Arg, cleaved at the carboxy side of the arginine residue by coagulation factor Xa, or Gly-Leu-Pro-Arg-Gly-Ser, a thrombin cleavage site, etc.
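
As a non-limiting sketch of how a linker cleavage site liberates the target protein, the following example locates the factor Xa recognition sequence Ile-Glu-Gly-Arg (one-letter code IEGR) and splits the fusion at the carboxy side of the arginine, as described above. The function name and the example fusion sequence are hypothetical, and only the first occurrence of the site is considered; the sketch does not model protease specificity or secondary cleavage.

```python
# Illustrative sketch only; names and the example sequence are hypothetical.
from typing import Tuple

FACTOR_XA_SITE = "IEGR"  # Ile-Glu-Gly-Arg, cut after the arginine


def cleave_after_site(fusion: str, site: str = FACTOR_XA_SITE) -> Tuple[str, str]:
    """Split a fusion sequence at the carboxy side of the first site found.

    Returns (tag portion including the site, liberated downstream portion).
    """
    index = fusion.find(site)
    if index == -1:
        raise ValueError("cleavage site not present in fusion sequence")
    cut = index + len(site)
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    # Hypothetical layout: tag residues, linker with site, target residues.
    tag_part, target_part = cleave_after_site("AAAAIEGRMKT")
    print(tag_part, target_part)
```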

(20) In embodiments the DNA construct of the present disclosure encodes a fusion protein comprising a polypeptide linker in between the polypeptide sequence corresponding to the tteRBP-solubility and expression tag and the polypeptide sequence corresponding to the target protein. Such a DNA sequence coding for a linker, in addition to e.g., providing for a proteolytic cleavage site, may also serve as a polylinker, i.e., it may provide multiple DNA restriction sites to facilitate fusion of the DNA fragments coding for a target protein and a solubility and expression tag domain.

(21) In a further embodiment, the disclosure includes a recombinant DNA molecule, such as an expression vector, encoding a fusion protein, comprising operatively-linked at least one nucleotide sequence coding for a target polypeptide and upstream thereto at least one nucleotide sequence coding for a tteRBP.

(22) Polynucleotide sequences are operatively-linked when they are placed into a functional relationship with another polynucleotide sequence. For instance, a promoter is operatively-linked to a coding sequence if the promoter affects transcription or expression of the coding sequence. Generally, operatively-linked means that the linked sequences are contiguous and, where necessary to join two protein coding regions, both contiguous and in reading frame. However, it is well known that certain genetic elements, such as enhancers, may be operatively-linked even at a distance, i.e., even if not contiguous. Promoters of the present disclosure may be endogenous or heterologous to the host, and may be constitutive or inducible.
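
The requirement that two joined protein coding regions be contiguous and in reading frame can be illustrated, without limitation, by the following sketch. It assumes that the upstream sequence is supplied without a stop codon and that both sequences begin at a codon boundary; the function name and example sequences are hypothetical.

```python
# Illustrative sketch only; names and example sequences are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds: str, downstream_cds: str) -> str:
    """Join two coding sequences so that the downstream one stays in frame."""
    upstream = upstream_cds.upper()
    downstream = downstream_cds.upper()
    if len(upstream) % 3 != 0 or len(downstream) % 3 != 0:
        raise ValueError("coding sequences must be whole numbers of codons")
    upstream_codons = [upstream[i:i + 3] for i in range(0, len(upstream), 3)]
    if any(codon in STOP_CODONS for codon in upstream_codons):
        # A stop codon upstream would terminate translation before the
        # downstream segment is reached.
        raise ValueError("upstream sequence contains an in-frame stop codon")
    return upstream + downstream


if __name__ == "__main__":
    print(fuse_in_frame("ATGAAA", "GGTTAA"))
```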

(23) DNA constructs prepared for introduction into a host typically comprise a replication system recognized by the host, including the intended DNA fragment encoding the desired target fusion peptide, and can also include transcription and translational initiation regulatory sequences operatively-linked to the polypeptide encoding segment. Expression systems (expression vectors) may include, for example, an origin of replication or autonomously replicating sequence (ARS) and expression control sequences, a promoter, an enhancer and necessary processing information sites, such as ribosome-binding sites, RNA splice sites, polyadenylation sites, transcriptional terminator sequences, and mRNA stabilizing sequences.

(24) The appropriate promoter and other necessary vector sequences are selected so as to be functional in the host. Examples of workable combinations of cell lines and expression vectors include but are not limited to those described in Sambrook, J., et al., Molecular Cloning: A Laboratory Manual (1989; 4th edition, 2012), Eds. J. Sambrook, E. F. Fritsch and T. Maniatis, Cold Spring Harbor Laboratory Press, Cold Spring Harbor; Ausubel, F., et al., Current Protocols in Molecular Biology (1987 and periodic updates), Eds. F. Ausubel, R. Brent and K. R. E., Wiley & Sons Verlag, New York; and Metzger, D., et al., Nature 334 (1988) 31-6. Many useful vectors for expression in bacteria, yeast, mammalian, insect, plant or other cells are known in the art and may be obtained from vendors including, but not limited to, Stratagene, New England Biolabs, Promega Biotech, and others. In addition, the construct may be joined to an amplifiable gene (e.g., DHFR) so that multiple copies of the gene may be obtained.

(25) Expression and cloning vectors can contain a selectable marker, a gene encoding a protein necessary for the survival or growth of a host cell transformed with the vector, although such a marker gene may be carried on another polynucleotide sequence co-introduced into the host cell. Only those host cells expressing the marker gene will survive and/or grow under selective conditions. Typical selection genes include but are not limited to those encoding proteins that (a) confer resistance to antibiotics or other toxic substances, e.g., ampicillin, tetracycline, etc.; (b) complement auxotrophic deficiencies; or (c) supply critical nutrients not available from complex media. The choice of the proper selectable marker will depend on the host cell, and appropriate markers for different hosts are known in the art.

(26) The expression vectors containing the polynucleotides of interest can be introduced into the host cell by any method known in the art. These methods vary depending upon the type of cellular host, including but not limited to transfection employing calcium chloride, rubidium chloride, calcium phosphate, DEAE-dextran, other substances, and infection by viruses. Large quantities of the polynucleotides and polypeptides may be prepared by expressing the polynucleotides in compatible host cells. The most commonly used prokaryotic hosts are strains of Escherichia coli, although other prokaryotes, such as Bacillus subtilis may also be used.

(27) Construction of a vector according to the present disclosure employs conventional ligation techniques. Isolated plasmids or DNA fragments are cleaved, tailored, and religated in the form desired to generate the plasmids required. If desired, analysis to confirm correct sequences in the constructed plasmids is performed in a known fashion. Suitable methods for constructing expression vectors, preparing in vitro transcripts, introducing DNA into host cells, and performing analyses for assessing expression and function are known to those skilled in the art.

(28) The DNA construct comprising two solubility and expression tag domains as well as a target polypeptide domain may also contain two linker peptides in between these domains. In order to allow for systematic cloning, the nucleotide sequences coding for these two linker peptide sequences may be different from one another. This difference in nucleotide sequence can result in a difference in the amino-acid sequence of the linker peptides, but the amino acid sequences of the two linker peptides may also be identical. Such identical linker peptide sequences for example are advantageous if the fusion protein comprising two tteRBP-solubility and expression tag domains as well as their target protein domain is to be used in an immunoassay.

(29) In cases where it is desired to release one or all of the solubility and expression tags out of a fusion protein, the linker peptide can be constructed to comprise a proteolytic cleavage site. Thus, a recombinant DNA molecule, such as an expression vector, encoding a fusion protein comprising at least one polynucleotide sequence coding for a target polypeptide, upstream thereto at least one polynucleotide sequence coding for a tteRBP-solubility and expression tag with the signaling peptide removed, and additionally comprising a nucleic acid sequence coding for a peptidic linker comprising a proteolytic cleavage site, represents a non-limiting embodiment of this invention. In certain embodiments, the expression vector comprises codons optimized for expression in the host cell.

(30) The recombinant proteins of the invention can be recovered by conventional methods. Thus, where the host cell is bacterial, such as E. coli, it may be lysed physically, chemically or enzymatically and the protein product isolated from the resulting lysate. It is then purified using conventional techniques, including but not necessarily limited to conventional protein isolation techniques such as selective precipitation, adsorption chromatography, and affinity chromatography, including but not limited to a monoclonal antibody affinity column.

(31) In embodiments the fusion proteins comprise a tag for facilitating separation, isolation and/or purification. For example, when the proteins of the present invention are expressed with a histidine tail (HIS tag), they can easily be purified by affinity chromatography using an immobilized metal affinity chromatography (IMAC) column.

(32) In one embodiment, the proteins comprise an affinity peptide, such as a Histidine tail, fused at the carboxy-terminus of the proteins of the invention. In embodiments the His tag comprises between 5 and 8 histidine residues, or at least 4 His residues, or 6 His residues. In embodiments the affinity peptide has adjacent histidine residues, such as at least two, three or four. In an embodiment the protein comprises 6 directly neighboring histidine residues. In another embodiment, the proteins comprise a C-LYTA tag at their carboxy-terminus. LYTA is derived from Streptococcus pneumoniae, which synthesizes an N-acetyl-L-alanine amidase, amidase LYTA (coded by the lytA gene {Gene, 43 (1986) page 265-272}), an autolysin that specifically degrades certain bonds in the peptidoglycan backbone. The C-terminal domain of the LYTA protein is responsible for the affinity to the choline or to some choline analogues such as DEAE. This property has been exploited for the development of E. coli C-LYTA expressing plasmids useful for expression of fusion proteins. Purification of hybrid proteins containing the C-LYTA fragment at their amino terminus has been described {Biotechnology: 10, (1992) page 795-798}.
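
Purely as an illustration of the adjacent-histidine embodiments described above, the following sketch finds the longest run of directly neighboring histidine residues in a protein sequence and compares it with a chosen minimum (6 by default). The function names and example sequences are hypothetical.

```python
# Illustrative sketch only; names and example sequences are hypothetical.
import re


def longest_his_run(protein: str) -> int:
    """Length of the longest run of directly neighboring histidine residues."""
    runs = re.findall(r"H+", protein.upper())
    return max((len(run) for run in runs), default=0)


def has_his_tag(protein: str, min_len: int = 6) -> bool:
    """True when the sequence contains at least min_len adjacent histidines."""
    return longest_his_run(protein) >= min_len


if __name__ == "__main__":
    print(longest_his_run("MKTHHHHHH"))  # hypothetical C-terminal 6His tail
    print(has_his_tag("MKHHAHHH"))       # histidines interrupted by alanine
```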

(33) When used as part of an expression construct designed for the expression of the coded protein in an appropriate host (e.g. a bacterial expression plasmid in E. coli, pCDFDuet-1 and pET-23 used with BL21(DE3) in Example 1), the disclosure produces a novel fusion protein, from which the protein of interest can be readily purified, in certain embodiments at substantially higher levels than can be achieved using only the sequence for the protein of interest alone.

(34) Fusion polypeptides can be purified to high levels (greater than 80%, or greater than 90% pure, as visualized by SDS-PAGE) by undergoing further purification steps. An additional purification step is a Q-Sepharose step that may be operated either before or after the IMAC column to yield highly purified protein. They present a major single band when analyzed by SDS-PAGE under reducing conditions, and western blot analysis shows less than 5% host cell protein contamination.

(35) The fusion proteins of the invention may be expressed in unicellular hosts such as prokaryotic and lower eukaryotic organisms, such as bacteria and yeast. In an embodiment the fusion proteins are expressed in E. coli.

(36) In one aspect, the present disclosure relates to a method of producing a fusion protein. The method comprises the steps of culturing a host cell transformed with an expression vector as described above, expression of that fusion protein in the respective host cell and separating the protein from the cell culture. The expression system is demonstrated to function with biochemically distinct target proteins, e.g., p53, cellulase 6B and 5A from Thermobifida fusca and cellulase from Pyrococcus horikoshii, WD-repeat containing protein 5 (WDR5) from Drosophila melanogaster, and actin. As can be readily seen from the Examples of this disclosure, specifically relating to these proteins, the efficient expression systems function and result in high levels of fusion protein produced. Similar findings have been made with a variety of other target proteins expressed as fusion proteins.

(37) Further, we demonstrate that the target protein comprised in a fusion protein produced according to the present disclosure can be obtained in a native-like structure. Such native-like structure and function, e.g., for p53 and cellulases, has been confirmed by near-UV circular dichroism spectroscopy (CD), electrophoretic mobility shift assay (EMSA), and gel-filtration chromatography. For p53, near-UV CD spectroscopy reveals a folded protein with mixed alpha helix and beta strand character, EMSA reveals high-affinity site-specific binding to DNA including the p53 consensus recognition sequence, and gel-filtration reveals the correct tetrameric oligomeric state, which is well known in the art. Cellulases were confirmed native and functional by cellulose filter paper digestion, Avicel digestion, and soluble carboxymethyl cellulose digestion assays, which are well known cellulase activity assays in the art.

(38) Compositions comprising fusion proteins, or proteins liberated from the tteRBP, are also provided. Such compositions include but are not necessarily limited to compositions that comprise a pharmaceutically acceptable excipient and thus are suitable for human and veterinary prophylactic and/or therapeutic approaches.

(39) In another embodiment, kits for producing fusion proteins according to this disclosure are provided. The kits can provide one or more expression vectors described herein, as well as printed instructions for using the vectors, and/or for recovering the overexpressed protein.

(40) The following specific examples are provided to illustrate the invention, but are not intended to be limiting in any way.

Example 1

(41) This Example demonstrates a fusion protein of the present invention that comprises full length p53 expressed in E. coli BL21(DE3).

(42) Expression Plasmids.

(43) The full-length human p53 gene (coding sequence for amino acids 1-393) was fused to the 3′ end of either (i) an oligonucleotide coding for an N-terminal 6His tag followed by the human rhinovirus 3C (HRV 3C) protease recognition site (LEVLFQ/GP), and placed under the control of a T7 promoter in the pET23 expression vector (EMD Millipore, Billerica, Mass.), or (ii) an oligonucleotide coding for an N-terminal 6His tag followed by tteRBP, a linker and an HRV 3C protease recognition site, and placed under the control of a T7 promoter in the pCDF-Duet1 expression vector (EMD Millipore, Billerica, Mass.). The nucleotide and resultant fusion protein sequences can be seen in Table 1.
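As an illustration of how the HRV 3C recognition site in the linker determines the cleavage products, the following is a minimal sketch in Python. The excerpt is copied from the tteRBP-p53 fusion sequence in Table 1 (SEQ ID NO:4) and spans the C-terminal end of tteRBP, the linker, and the first residues of p53. The function name `cleave_hrv3c` and the variable names are hypothetical and are introduced only for this illustration; the sketch assumes cleavage between the Q and G residues of LEVLFQ/GP.

```python
# Excerpt of SEQ ID NO:4 (Table 1): end of tteRBP, linker, HRV 3C site, start of p53.
FUSION_EXCERPT = "KLITKENVQGGAASGGAAGGSSAARLQVDKLAAALEVLFQGPGMEEPQSDPSV"

# HRV 3C recognition site; the scissile bond is assumed to lie between Q and G.
HRV3C_SITE = "LEVLFQGP"


def cleave_hrv3c(seq, site=HRV3C_SITE):
    """Split seq at the first HRV 3C site and return (tag_side, target_side)."""
    start = seq.find(site)
    if start < 0:
        raise ValueError("no HRV 3C recognition site found")
    # Index of the first residue after the Q/G scissile bond.
    cut = start + site.index("QG") + 1
    return seq[:cut], seq[cut:]


tag_side, target_side = cleave_hrv3c(FUSION_EXCERPT)
print("tag side ends with:     ", tag_side[-6:])
print("target side starts with:", target_side[:8])
```

In this excerpt the residues left on the liberated target after cleavage are those between the scissile bond and the first methionine of p53, which can be read directly from `target_side`.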

(44) Protein Expression and Partial Purification.

(45) BL21(DE3) cells made competent by CaCl.sub.2 permeabilization were transformed with the expression plasmids, plated on LB agar plates containing 50 μg/mL ampicillin (for pET23) or streptomycin (for pCDFDuet-1), and grown at 37° C. for 18 hrs. Isolated colonies were then picked and grown in batch culture (1 L baffled flasks) at 37° C. in LB containing 50 μg/mL of the appropriate antibiotic with 200 RPM continuous shaking until OD.sub.600=0.6. The temperature was then dropped to 20° C. and the cultures induced with 20 mg/L IPTG and grown for 18 hrs. Cells were harvested by centrifugation, resuspended in resuspension/wash buffer (20 mM Tris pH 7.2, 300 mM NaCl, 10 mM imidazole, and 10 mM β-mercaptoethanol), and lysed by incubation with egg white lysozyme and DNase I+5 mM MgCl.sub.2 on ice for 60 min. Insoluble material was pelleted by centrifugation, and the clarified supernatant loaded onto a Ni.sup.2+-NTA column pre-equilibrated with resuspension/wash buffer. After washing, the sample was eluted with 20 mM Tris pH 7.2, 300 mM NaCl, 250 mM imidazole, and 10 mM β-mercaptoethanol. Protein-containing fractions were then pooled, dialyzed against 20 mM Tris, 150 mM NaCl, 10 mM β-mercaptoethanol, and the tags removed by incubation with GST-tagged HRV 3C protease (0.05-0.1 mg protease/mg p53) for 18 hrs at 4° C. Samples were subjected to denaturing, reducing SDS-PAGE (samples prepared by boiling in 1× Laemmli buffer+10% β-mercaptoethanol for 5 min) and visualized by staining with Coomassie Brilliant Blue.
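The quantities in the protocol above can be converted into working amounts with simple arithmetic. The following Python sketch is illustrative only: it assumes a molecular weight of 238.30 g/mol for IPTG (a value not stated in this disclosure) to convert the 20 mg/L induction dose into a micromolar concentration, and it applies the stated 0.05-0.1 mg protease/mg p53 ratio to a hypothetical 3 mg batch of fusion-derived p53. The function names are hypothetical.

```python
# Assumption: molecular weight of IPTG, not given in the disclosure.
IPTG_MW_G_PER_MOL = 238.30


def mg_per_l_to_micromolar(mg_per_l, mw_g_per_mol):
    """Convert a mass concentration (mg/L) into micromolar."""
    # mg/L divided by g/mol gives mmol/L; multiply by 1000 for micromolar.
    return mg_per_l / mw_g_per_mol * 1000.0


def protease_range_mg(target_mg, low_ratio=0.05, high_ratio=0.1):
    """Return the (low, high) protease mass in mg for a given mass of target."""
    return target_mg * low_ratio, target_mg * high_ratio


iptg_um = mg_per_l_to_micromolar(20.0, IPTG_MW_G_PER_MOL)
low_mg, high_mg = protease_range_mg(3.0)  # hypothetical 3 mg batch
print(f"IPTG at 20 mg/L is about {iptg_um:.0f} micromolar")
print(f"HRV 3C protease for 3 mg p53: {low_mg:.2f} to {high_mg:.2f} mg")
```

Under the stated molecular-weight assumption, 20 mg/L IPTG corresponds to a concentration well below 1 mM.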

(46) Results

(47) After Ni.sup.2+-NTA chromatography, a band corresponding to the correct molecular weight of the 6His-HRV3Csite-p53 or the 6His-tteRBP-HRV3Csite-p53 fusion protein can be seen. However, the band in the 6His-HRV3Csite-p53 lane is faint, and is not significantly more intense than many of the impurities (FIG. 3). In the case of 6His-tteRBP-HRV3Csite-p53, by far the most intense band is the fusion protein (FIG. 3). After cleavage by HRV 3C protease, bands corresponding to the correct molecular weights of liberated p53 and the tteRBP tag appear, and the band corresponding to the fusion protein disappears (FIG. 3). A gel for the cleaved product of 6His-HRV3Csite-p53 is not shown because the tag is too small (1.4 kDa) for cleaved and uncleaved protein to be resolved by SDS-PAGE. After further purification, the 6His-tteRBP-HRV3Csite-p53 system gave a final yield of 3 mg/L culture of >90% pure p53 by SDS-PAGE and gel filtration (not shown). 6His-HRV3Csite-p53 gave an estimated yield of <0.1 mg/L culture of 50% pure p53 by SDS-PAGE and gel filtration (not shown). Together, these data demonstrate a >30-fold increase in yield and an 80% increase in purity for recombinant human p53 by employing the modified tteRBP tag in E. coli.
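The fold increase in yield and the relative increase in purity stated above follow directly from the reported values. The Python sketch below reproduces that arithmetic; the function names are hypothetical. Because the yield without tteRBP is reported as an upper bound (<0.1 mg/L) and the purity with tteRBP as a lower bound (>90%), the computed figures are lower bounds on the improvement.

```python
def fold_increase(new_value, old_value):
    """Ratio of the new value to the old value."""
    return new_value / old_value


def relative_increase_percent(new_value, old_value):
    """Increase of the new value over the old value, as a percentage of the old."""
    return (new_value - old_value) / old_value * 100.0


# Reported values from Example 1.
yield_with_tag_mg_per_l = 3.0      # 6His-tteRBP-HRV3Csite-p53
yield_without_tag_mg_per_l = 0.1   # upper bound for 6His-HRV3Csite-p53
purity_with_tag_pct = 90.0         # lower bound
purity_without_tag_pct = 50.0

print(fold_increase(yield_with_tag_mg_per_l, yield_without_tag_mg_per_l))
print(relative_increase_percent(purity_with_tag_pct, purity_without_tag_pct))
```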

Example 2

(48) This example demonstrates tteRBP as an expression tag for WD-Repeat Protein 5 (WDR5) from Drosophila melanogaster in E. coli.

(49) Expression Plasmids

(50) The coding sequence for WDR5 from Drosophila melanogaster was fused to the 3′ end of either (i) an oligonucleotide coding for an N-terminal 6His tag, and placed under the control of a T7 promoter in the pHis-parallel1 expression vector (NCBI GenBank AF097413.1), or (ii) an oligonucleotide coding for an N-terminal 6His tag followed by tteRBP, a linker and an HRV3C protease recognition site, and placed under the control of a T7 promoter in the pCDF-Duet1 expression vector (EMD Millipore, Billerica, Mass.). The nucleotide and resultant fusion protein sequences can be seen in Table 1.

(51) Protein Expression and Purification

(52) BL21(DE3) cells made competent by CaCl.sub.2 permeabilization were transformed with the expression plasmids, plated on LB agar plates containing 50 μg/mL streptomycin for pCDF-Duet1 or ampicillin for pHis-parallel1, and grown at 37° C. for 18 hrs. Isolated colonies were then picked and grown in batch culture (5 mL tubes) at 37° C. in LB containing 50 μg/mL streptomycin for pCDF-Duet1 or ampicillin for pHis-parallel1 with 200 RPM continuous shaking until OD.sub.600=0.6. Cultures were induced with 20 mg/L IPTG and grown for 18 hrs. Samples taken before induction and after 18 hrs of induction were lysed by boiling in cracking buffer (1× Laemmli buffer+4 M urea+10% β-mercaptoethanol) for 5 min, and subjected to SDS-PAGE. Whole cell lysates were then visualized by staining with Coomassie Brilliant Blue.

(53) Results

(54) Bands corresponding to the predicted molecular weight of both the untagged and tteRBP tagged proteins can be seen in their respective lanes that are not present in the uninduced sample (FIG. 4). However, the band corresponding to the fusion protein is much more intense, indicating that it expressed at a much higher level than the untagged protein. In combination with other data in this work, this indicates that tteRBP can enhance the expression of many diverse proteins.

Example 3

(55) This Example demonstrates use of tteRBP as an expression tag in E. coli BL21(DE3) for the expression of Actin.

(56) Expression Plasmids

(57) The full-length human actin gene was fused to the 3′ end of an oligonucleotide coding for an N-terminal 6His tag followed by tteRBP, a linker and an HRV3C protease recognition site, and placed under the control of a T7 promoter in the pCDF-Duet1 expression vector (EMD Millipore, Billerica, Mass.). The nucleotide and resultant fusion protein sequences can be seen in Table 1.

(58) Protein Expression

(59) BL21(DE3) cells made competent by CaCl.sub.2 permeabilization were transformed with the expression plasmids, plated on LB agar plates containing 50 μg/mL streptomycin, and grown at 37° C. for 18 hrs. Isolated colonies were then picked and grown in batch culture (50 mL unbaffled flasks) at 37° C. in LB containing 50 μg/mL streptomycin with 225 RPM continuous shaking until OD.sub.600=0.6. Cultures were then cooled to 20° C., induced with 20 mg/L IPTG, and grown for 18 hrs. Cells were harvested by centrifugation, resuspended in resuspension/wash buffer (20 mM Tris pH 7.2, 300 mM NaCl, 10 mM imidazole, and 10 mM β-mercaptoethanol), and lysed by incubation with egg white lysozyme and DNase I+5 mM MgCl.sub.2 on ice for 60 min. The fusion protein expressed as inclusion bodies, which were pelleted by centrifugation and washed 3 times in buffer and 1 M NaCl. The pellet was then dissolved in 20 mM Tris pH 7.2, 300 mM NaCl, 10 mM imidazole, 10 mM β-mercaptoethanol+6 M guanidine-hydrochloride and loaded onto a Ni.sup.2+-NTA column pre-equilibrated with the same buffer. After washing, the sample was eluted with 20 mM Tris pH 7.2, 300 mM NaCl, 250 mM imidazole, and 10 mM β-mercaptoethanol. Protein-containing fractions were then pooled and refolded by 20-fold rapid dilution into 20 mM Tris, 150 mM NaCl, 10 mM β-mercaptoethanol. Samples were subjected to denaturing, reducing SDS-PAGE (samples prepared by boiling in 1× Laemmli buffer+10% β-mercaptoethanol for 5 min) and visualized by staining with Coomassie Brilliant Blue.

(60) Results

(61) The resultant protein was soluble and gave a single homogeneous band by SDS-PAGE. This is a substantial improvement over previous attempts at IPTG-inducible recombinant expression of human actin in E. coli, which have been demonstrated to yield little to no soluble protein at these temperatures [Production of human beta actin and a mutant using bacterial expression system with a cold shock vector, Tamura M et al, Protein Expression and Purification (2010)].

Example 4

(62) This example demonstrates a fusion protein of the present invention that comprises an RBP fusion with HRV3C protease.

(63) Expression Plasmid

(64) The sequence for HRV3C protease was fused to the 3′ end of an oligonucleotide coding for an N-terminal 6His tag followed by tteRBP and placed under the control of a T7 promoter in the pCDF-Duet1 expression vector (EMD Millipore, Billerica, Mass.).

(65) Protein Expression and Purification

(66) BL21(DE3) cells made competent by CaCl.sub.2 permeabilization were transformed with the expression plasmid, plated on LB agar plates containing 50 μg/mL streptomycin, and grown at 37° C. for 18 hrs. Isolated colonies were then picked and grown in batch culture (1 L baffled flasks) at 37° C. in LB containing 50 μg/mL streptomycin with 200 RPM continuous shaking until OD.sub.600=0.6. The temperature was then dropped to 18° C. and the cultures induced with 20 mg/L IPTG and grown for 18 hrs. Cells were harvested by centrifugation, resuspended in resuspension/wash buffer (20 mM Tris pH 8.0, 300 mM NaCl, 10 mM imidazole, and 10 mM β-mercaptoethanol), and lysed by incubation with egg white lysozyme and DNase I+5 mM MgCl.sub.2 on ice for 60 min. Insoluble material was pelleted by centrifugation, and the clarified supernatant loaded onto a Ni.sup.2+-NTA column pre-equilibrated with resuspension/wash buffer. After washing, the sample was eluted with 20 mM Tris pH 8.0, 300 mM NaCl, 250 mM imidazole, and 10 mM β-mercaptoethanol. Protein-containing fractions were then pooled and dialyzed against 20 mM Tris, 10 mM β-mercaptoethanol. Samples were then further purified by Q-Sepharose chromatography in the same buffer with a 0-1 M NaCl gradient. Samples were subjected to denaturing, reducing SDS-PAGE (samples prepared by boiling in 1× Laemmli buffer+10% β-mercaptoethanol for 5 min) and visualized by staining with Coomassie Brilliant Blue. PreScission protease (GST-fused HRV3C protease) was obtained from GE Healthcare Life Sciences for comparison.

(67) Results

(68) We were able to purify a protein with a molecular weight consistent with the fusion protein, with the only major impurity being a band with a molecular weight consistent with the free RBP tag (FIG. 5). This protein was purified in yields of >10 mg/L. This demonstrates the use of an embodiment of this disclosure to express and purify proteases in high yield. The results are shown in FIG. 5, which depicts an SDS-PAGE gel of HRV3C protease purified using our pRexpress system. The fusion protein included a GST-tag obtained from a commercial source. The molecular weight standard is the CLEARLY protein ladder (unstained) (Clontech Laboratories Inc., Mountain View, Calif.). Thus, the example demonstrates that a fusion protein of tteRBP and HRV3C protease can be expressed and purified from E. coli using a non-limiting embodiment of this disclosure.

Example 5

(69) This example demonstrates an RBP fusion protein that comprises full length MDM2, a ubiquitin E3 ligase.

(70) Expression Plasmid.

(71) The full-length human MDM2 gene was fused to the 3′ end of an oligonucleotide coding for an N-terminal 6His tag followed by tteRBP, a linker and an HRV3C protease recognition site, and placed under the control of a T7 promoter in the pCDF-Duet1 expression vector (EMD Millipore, Billerica, Mass.).

(72) Protein Expression and Purification

(73) BL21(DE3) cells made competent by CaCl.sub.2 permeabilization were transformed with the expression plasmid, plated on LB agar plates containing 50 μg/mL streptomycin, and grown at 37° C. for 18 hrs. Isolated colonies were then picked and grown in batch culture (1 L baffled flasks) at 37° C. in LB containing 50 μg/mL streptomycin with 200 RPM continuous shaking until OD.sub.600=0.6. The temperature was then dropped to 18° C. and the cultures induced with 20 mg/L IPTG and grown for 18 hrs. Cells were harvested by centrifugation, resuspended in resuspension/wash buffer (20 mM Tris pH 8.0, 300 mM NaCl, 10 mM imidazole, and 10 mM β-mercaptoethanol), and lysed by incubation with egg white lysozyme and DNase I+5 mM MgCl.sub.2 on ice for 60 min. Insoluble material was pelleted by centrifugation, and the clarified supernatant loaded onto a Ni.sup.2+-NTA column pre-equilibrated with resuspension/wash buffer. After washing, the sample was subjected to on-column tag cleavage with GST-tagged HRV 3C protease (0.05-0.1 mg protease/mg p53) for 18 hrs at 4° C. The protein was then collected, and samples were subjected to denaturing, reducing SDS-PAGE (samples prepared by boiling in 1× Laemmli buffer+10% β-mercaptoethanol for 5 min) and visualized by staining with Coomassie Brilliant Blue.

(74) Results

(75) We were able to purify a protein that migrated at a molecular weight consistent with MDM2. We confirmed its identity by western blot, and also found that this protein bound to full-length p53. Thus, this example demonstrates yet another embodiment of this disclosure in the form of an RBP/MDM2 fusion protein.

(76) Table 1. Representative nucleotide and protein sequences used in this disclosure. In the amino acid sequences below, the tteRBP amino acid sequences are italicized, amino acid sequences of proteins of interest are underlined, amino acid sequences of additional purification tags (e.g., 6His) are shown in bold, and amino acid sequences of linkers are shown in plain text (without italics, without underlining, and not in bold). Additionally, protease recognition sequences within linkers are shown with a double underline.

(77) TABLE-US-00001 DNAcodingsequenceofmodifiedtteRBPforuseasanexpressionand solubilitytag (SEQIDNO:1) 1 ATGAAAGAGGGCAAAACGATTGGCCTGGTGATCTCTACCCTGAACAATCCGTTCTTTGTG 61 ACCCTGAAAAATGGTGCGGAAGAAAAAGCGAAAGAACTGGGTTACAAAATTATCGTTGAA 121 GATTCGCAAAATGATTCCTCTAAAGAGCTGTCTAATGTCGAAGATTTGATTCAACAGAAA 181 GTTGATGTTCTGCTGATCAATCCGGTGGATAGCGATGCGGTTGTTACGGCGATTAAAGAA 241 GCGAATAGCAAAAATATCCCGGTTATTACCATCGATCGCAGCGCGAATGGTGGTGATGTT 301 GTTTCCCATATCGCCAGCGATAATGTTAAGGGTGGCGAAATGGCCGCGGAATTTATCGCG 361 AAAGCCCTGAAAGGCAAGGGGAATGTTGTGGAACTGGAAGGGATCCCGGGGGCGTCTGCG 421 GCACGTGATCGCGGCAAAGGGTTTGATGAAGCCATTGCTAAGTATCCGGATATTAAAATC 481 GTTGCAAAGCAGGCGGCGGATTTTGATCGTTCCAAAGGTCTGTCAGTGATGGAAAACATC 541 TTGCAAGCCCAGCCGAAAATTGATGCAGTGTTTGCGCAAAATGATGAAATGGCTCTGGGC 601 GCTATCAAAGCCATTGAGGCCGCGAATCGTCAAGGTATTATTGTTGTGGGCTTTGATGGG 661 ACCGAAGATGCTCTGAAAGCGATTAAAGAAGGGAAAATGGCTGCGACCATTGCGCAGCAG 721 CCGGCCCTGATGGGCTCACTGGGTGTGGAGATGGCTGATAAATACCTGAAAGGTGAAAAA 781 ATTCCGAACTTTATTCCGGCAGAACTGAAACTCATCACGAAAGAAAATGTGCAG AminoacidsequenceofmodifiedtteRBPforuseasanexpressionand solubilitytag (SEQIDNO:2) MKEGKTIGLVISTLNNPFFVTLKNGAEEKAKELGYKIIVEDSQNDSSKELSNVEDLIQQK VDVLLINPVDSDAVVTAIKEANSKNIPVITIDRSANGGDVV custom character HIASDNVKGGEMAAEFIA KALKGKGNVVELEGIPGASAARDRGKGFDEAIAKYPDIKIVAKQAADFDRSKGLSVMENI LQAQPKIDAVFAQNDEMALGAIKAIEAANRQGIIVVGFDGTEDALKAIKEGKMAATIAQQ PALMGSLGVEMADKYLKGEKIPNFIPAELKLITKENVQ DNAcodingsequenceoftteRBP-p53fusionproteinusedin Example1(SEQIDNO:3): 1 ATGGGCAGCAGCCATCACCATCATCACCACAGCCAGGATCCGAATTCGAGCTCGATGAAA 61 GAGGGCAAAACGATTGGCCTGGTGATCTGTACCCTGAACAATGCGTTCTTTGTGACCCTG 121 AAAAATGGTGCGGAAGAAAAAGCGAAAGAACTGGGTTACAAAATTATCGTTGAAGATTCG 181 CAAAATGATTCCTCTAAAGAGCTGTCTAATGTCGAAGATTTGATTCAACAGAAAGTTGAT 241 GTTCTGCTGATCAATCCGGTGGATAGCGATGCGGTTGTTACGGCGATTAAAGAAGCGAAT 301 AGCAAAAATATCCCGGTTATTACCATCGATCGCAGCGCGAATGGTGGTGATGTTGTTTCC 361 CATATCGCCAGCGATAATGTTAAGGGTGGCGAAATGGCCGCGGAATTTATCGCGAAAGCC 421 CTGAAAGGCAAGGGGAATGTTGTGGAACTGGAAGGTATCCCGGGGGCGTCTGCGGCACGT 481 
GATCGCGGCAAAGGGTTTGATGAAGCCATTGCTAAGTATCCGGATATTAAAATCGTTGCA 541 AAGCAGGCGGCGGATTTTGATCGTTCCAAAGGTCTGTCAGTGATGGAAAACATCTTGCAA 601 GCCCAGCCGAAAATTGATGCAGTGTTTGCGCAAAATGATGAAATGGCTCTGGGCGCTATC 661 AAAGCCATTGAGGCCGCGAATCGTCAAGGTATTATTGTTGTGGGCTTTGATGGGACCGAA 721 GATGCTCTGAAAGCGATTAAAGAAGGGAAAATGGCTGCGACCATTGCGCAGCAGCCGGCC 781 CTGATGGGCTCACTGGGTGTGGAGATGGCTGATAAATACCTGAAAGGTGAAAAAATTCCG 841 AACTTTATTCCGGCAGAACTGAAACTCATCACGAAAGAAAATGTGCAGGGTGGAGCGGCA 901 AGCGGGGGTGCCGCGGGTGGCAGCTCTGCGGCGCGCCTGCAGGTCGACAAGCTTGCGGCC 961 GCATTAGAAGTGCTGTTTCAAGGTCCAGGCATGGAGGAGCCGCAGTCAGATCCTAGCGTC 1021 GAGCCCCCTCTGAGTCAGGAAACATTTTGAGACCTATGGAAACTACTTCCTGAAAACAAC 1081 GTTCTGTCCCCCTTGCCGTCCCAAGCAATGGATGATTTGATGCTGTCCCCGGACGATATT 1141 GAACAATGGTTCACTGAAGACCCAGGTCCAGATGAAGCTCCCAGAATGCCAGAGGCTGCT 1201 CCCCCCGTGGCCCCTGCACCAGCAGCTCCTACACCGGCGGCCCCTGCACCAGCCCCCTCC 1261 TGGCCCCTGTCATCTTCTGTCCCTTCCCAGAAAACCTACCAGGGCAGCTACGGTTTCCGT 1321 CTGGGCTTCTTGCATTCTGGGACAGCCAAGTCTGTGACTTGCACGTACTCCCCTGCCCTC 1381 AACAAGATGTTTTGCCAACTGGCCAAGACCTGCCCTGTGCAGCTGTGGGTTGATTCCACA 1441 CCCCCGCCCGGCACCCGCGTCCGCGCCATGGCCATCTACAAGCAGTCACAGGACATGACG 1501 GAGGTTGTGAGGCGCTGCCCCCACCATGAGCGCTGCTCAGATAGGGATGGTCTGGCCCCT 1561 CCTCAGCATCTTATCCGAGTGGAAGGAAATTTGCGTGTGGAGTATTTGGATGACAGAAAC 1621 ACTTTTGGACATAGTGTGGTGGTGCCCTATGAGCCGCCTGAGGTTGGCTCTGACTGTACC 1581 ACCATCCACTACAACTACATGTGTAACAGTTCCTGCATGGGCGGCATGAACCGGAGGCCC 1741 ATCCTCACCATCATCACACTGGAAGACTCCAGTGGTAATCTACTGGGACGGAACAGCTTT 1801 GAGGTGCGTGTTTGTGCCTGTCCTGGGAGAGACCGGCGCACAGAGGAAGAGAATCTCCGC 1861 AAGAAAGGGGAGCCTCACCACGAGCTGCGCCCAGGGAGCACTAAGCGAGCACTGCCCAAC 1921 AACACCAGCTCCTGTCCCCAGCCAAAGAAGAAACCACTGGATGGAGAATATTTCACCCTT 1981 CAGATCCGTGGGCGTGAGCGCTTCGAGATGTTCCGAGAGCTGAATGAGGCCTTGGAACTC 2041 AAGGATGCCCAGGCTGGGAAGGAGCCAGGGGGGAGCAGGGCTCACTCCAGCCACCTGAAG 2101 TCCAAAAAGGGTCAGTCTACCTCCCGCCATAAAAAACTCATGTTCAAGACAGAAGGGCCT 2161 GACTCAGACTGAC AminoacidsequenceofthetteRBP-p53fusionproteinusedin Example1(SEQIDNO:4): MGSSHHHHHHSQDPNSSSMKEGKTIGLVISTLNNPFFVTLKNGAEEKAKELGYKIIVEDS 
QNDSSKELSNVEDLIQQKVDVLLINPVDSDAVVTAIKEANSKNIPVITIDRSANGGDVVS HIASDNVKGGEMAAEFIAKALKGKGNVVELEGIPGASAARDRGKGFDEAIAKYPDIKIVA KQAADFDRSKGLSVMENILQAQPKIDAVFAQNDEMALGAIKAIEAANRQGIIVVGFDGTE DALKAIKEGKMAATIAQQPALMGSLGVEMADKYLKGEKIPNFIPAELKLITKENVQGGAA SGGAAGGSSAARLQVDKLAAALEVLFQGPGMEEPQSDPSVEPPLSQETFSDLWKLLPENN VLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPS WPLSSSVPSQKTYQGSYGFRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPVQLWVDST PPPGTRVRAMAIYKQSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRVEYLDDRN TFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSF EVRVCACPGRDRRTEEENLRKKGEPHHELPPGSTKRALPNNTSSSPQPKKKPLDGEYFTL QIRGRERFEMFRELNEALELKDAQAGKEPGGSRAHSSHLKSKKGQSTSRHKKLMFKTEGP DSD DNAcodingsequenceof6xHis-p53proteinwithouttteRBPtag usedinExample1(SEQIDNO:5): 1 ATGCACCATCACCACCATCACCTGGAAGTTCTGTTCCAGGGGCCCATGGAGGAGCCGCAG 61 TCAGATCCTAGCGTCGAGCCCCCTCTGAGTCAGGAAACATTTTCAGACCTATGGAAACTA 121 CTTCCTGAAAACAACGTTCTGTCCCCCTTGCCGTCCCAAGCAATGGATGATTTGATGCTG 181 TCCCCGGACGATATTGAACAATGGTTCACTGAAGACCCAGGTCCAGATGAAGCTCCCAGA 241 ATGCCAGAGGCTGCTCCCCCCGTGGCCCCTGCACCAGCAGCTCCTACACCGGCGGCCCCT 301 GCACCAGCCCCCTCCTGGCCCCTGTCATCTTCTGTCCCTTCCCAGAAAACCTACCAGGGC 361 AGCTACGGTTTCCGTCTGGGCTTCTTGCATTCTGGGACAGCCAAGTCTGTGACTTGCACG 421 TACTCCCCTGCCCTCAACAAGATGTTTTGCCAACTGGCCAAGACCTGCCCTGTGCAGCTG 481 TGGGTTGATTCCACACCCCCGCCCGGCACCCGCGTCCGCGCCATGGCCATCTACAAGCAG 541 TCACAGCACATGACGGAGGTTGTGAGGCGCTGCCCCCACCATGAGCGCTGCTCAGATAGC 601 GATGGTCTGGCCCCTCCTCAGCATCTTATCCGAGTGGAAGGAAATTTGCGTGTGGAGTAT 661 TTGGATGACAGAAACACTTTTCGACATAGTGTGGTGGTGCCCTATGAGCCGCCTGAGGTT 721 GGCTCTGACTGTACCACCATCCACTACAACTACATGTGTAACAGTTCCTGCATGGGCGGC 781 ATGAACCGGAGGCCCATCCTCACCATCATCACACTGGAAGACTCCAGTGGTAATCTACTG 841 GGACGGAACAGCTTTGAGGTGCGTGTTTGTGCCTGTCCTGGGAGAGACCGGCGCACAGAG 901 GAAGAGAATCTCCGCAAGAAAGGGGAGCCTCACCACGAGCTGCCCCCAGGGAGCACTAAG 961 CGAGCACTGCCCAACAACACCAGCTCCTCTCCCAAGCCAAAGAAGAAACCACTGGATGGA 1021 GAATATTTCACCCTTCAGATCCGTGGGCGTGAGCGCTTCGAGATGTTCCGAGAGCTGAAT 1081 GAGGCCTTGGAACTCAAGGATGCCCAGGCTGGGAAGGAGCCAGGGGGGAGCAGGGCTCAC 1141 
TCCAGCCACCTGAAGTCCAAAAAGGGTCAGTCTACCTCCCGCCATATAAAACTCATGTTC 1201 AAGACAGAAGGGCCTGACTCAGACTGA Aminoacidsequenceofthe6xHis-p53fusionproteinwithouttteRBP tagusedinExample1: (SEQIDNO:6) MHHHHHHLEVLFQGPMEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLML SPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQG SYGFRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQ SQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEV GSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTE EENLRKKGEPHHELPPGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELN EALELKDAQAGKEPGGSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD DNAcodingsequenceof6xHis-WDR5fusionproteinusedinExample2 (SEQIDNO:7) 1 ATGTCGTACTACCATCACCATCACCATCACGATTACGATATCCCAACGACCGAAAACCTG 61 TATTTTCAGGGCGCCATGGATATGGTGCCCATCGGAGCCGTGCACGGCGGCCATCCCGGC 121 GTAGTGCATCCGCCACAGCAACCACTGCCCACGGCGCCCAGCGGCCCAAACTCGCTGCAG 181 CCGAACTCGGTGGGCCAGCCGGGGGCCACCACCTCCTCGAACAGCAGCGCCTCCAACAAG 241 AGCTCGCTATCCGTCAAGCCCAACTACACGCTCAAGTTCACGCTGGCCGGGCACACCAAG 301 GCGGTGTCGGCGGTCAAGTTCAGTCCGAATGGCGAGTGGCTGGCCAGCTCCTCCGCTGAT 361 AAACTAATCAAAATCTGGGGAGCATACGATGGCAAGTTCGAGAAGACCATTTCGGGCCAC 421 AAGCTGGGCATCAGCGATGTGGCCTGGAGCTCAGACTCGCGACTCCTCGTGAGCGGCAGT 481 GATGACAAGACGCTCAAGGTCTGGGAGCTGAGCACCGGGAAGAGCTTGAAAACTCTGAAG 541 GGCCACAGCAACTATGTGTTCTGCTGCAACTTTAATCCGCAGTCCAATCTGATCGTCTCC 601 GGCAGCTTCGACGAGAGCGTTCGCATATGGGATGTGCGCACCGGCAAGTGTCTGAAGACT 661 CTACCCGCCCATTCCGATCCCGTTTCGGCGGTACATTTCAATCGCGACGGATCGCTGATC 721 GTGAGCAGCAGCTACGACGGCCTCTGTCGCATATGGGACACGGCCAGTGGACAGTGCTTG 781 AAAACCCTGATCGACGACGACAATCCGCCCGTCAGCTTTGTAAAGTTCTCGCCCAATGGC 841 AAGTACATTTTGGCCGCCACGCTGGATAATACGCTCAAGTTGTGGGACTACTCGAAGGGC 901 AAGTGCCTGAAGACGTATACGGGTCACAAGAATGAGAAGTACTGCATATTCGCCAACTTC 961 TCGGTGACGGGAGGAAAGTGGATCGTGAGTGGCAGCGAGGACAACATGGTCTACATTTGG 1021 AATCTGCAGAGCAAGGAGGTGGTGCAAAAGCTGCAGGGACACACCGATACCGTTCTGTGC 1081 ACCGCCTGCCATCCCACGGAGAACATCATTGCTTCCGCGGCGCTCGAGAACGACAAGACC 1141 ATCAAGCTGTGGAAGTCGGATACATAG Aminoacidsequenceof6xHis-WDR5fusionproteinusedinExample2 
(SEQIDNO:8) MSYYHHHHHHDYDIPTTENLYFQGAMDMVPIGAVHGGHPGVVHPPQQPLPTAPSGPNSLQ PNSVGQPGATTSSNSSASNKSSLSVKPNYTLKFTLAGHTKAVSAVKFSPNGEWLASSSAD KLIKIWGAYDGKFEKTISGHKLGISDVAWSSDSRLLVSGSDDKTLKVWELSTGKSLKTLK GHSNYVFCCNFNPQSNLIVSGSFDESVRIWDVRTGKCLKTLPAHSDPVSAVHFNRDGSLI VSSSYDGLCRIWDTASGQCLKTLIDDDNPPVSFVKFSPNGKYILAATLDNTLKLWDYSKG KCLKTYTGHKNEKYCIFANFSVTGGKWIVSGSEDNMVYIWNLQSKEVVQKLQGHTDTVLC TACHPTENIIASAALENDKTIKLWKSDT DNAcodingsequenceof6xHis-tteRBP-WDR5fusionproteinusedinExample3 (SEQIDNO:9) 1 ATGGGCAGCAGCCATCACCATCATCACCACAGCCAGGATCCGAATTCGAGCTCGATGAAA 61 GAGGGCAAAACGATTGGCCTGGTGATCTCTACCCTGAACAATCCGTTCTTTGTGACCCTG 121 AAAAATGGTGCGGAAGAAAAAGCGAAAGAACTGGGTTACAAAATTATCGTTGAAGATTCG 181 CAAAATGATTCCTCTAAAGAGCTGTCTAATGTCGAAGATTTGATTCAACAGAAAGTTGAT 241 GTTCTGCTGATCAATCCGGTGGATAGCGATGCGGTTGTTACGGCGATTAAAGAAGCGAAT 301 AGCAAAAATATCCCGGTTATTACCATCGATCGCAGCGCGAATGGTGGTGATGTTGTTTCC 361 CATATCGCCAGCGATAATGTTAAGGGTGGCGAAATGGCCGCGGAATTTATCGCGAAAGCC 421 CTGAAAGGCAAGGGGAATGTTGTGGAACTGGAAGGTATCCCGGGGGCGTCTGCGGCACGT 481 GATCGCGGCAAAGGGTTTGATGAAGCCATTGCTAAGTATCCGGATATTAAAATCGTTGCA 541 AAGCAGGCGGCGGATTTTGATCGTTCCAAAGGTCTGTCAGTGATGGAAAACATCTTGCAA 601 GCCCAGCCGAAAATTGATGCAGTGTTTGCGCAAAATGATGAAATGGCTCTGGGCGCTATC 661 AAAGCCATTGAGGCCGCGAATCGTCAAGGTATTATTGTTGTGGGCTTTGATGGGACCGAA 721 GATGCTCTGAAAGCGATTAAAGAAGGGAAAATGGCTGCGACCATTGCGCAGCAGCCGGCC 781 CTGATGGGCTCACTGGGTGTGGAGATGGCTGATAAATACCTGAAAGGTGAAAAAATTCCG 841 AACTTTATTCCGGCAGAACTGAAACTCATCACGAAAGAAAATGTGCAGGGTGGAGCGGCA 901 AGCGGGGGTGCCGCGGGTGGCAGCTCTGCGGCCGCATTAGAAGTGCTGTTTCAAGGTCCA 961 GGCATGGTGCCCATCGGAGCCGTGCACGGCGGCCATCCCGGCGTAGTGCATCCGCCACAG 1021 CAACCACTGCCCACGGCGCCCAGCGGCCCAAACTCGCTGCAGCCGAACTCGGTGGGCCAG 1081 CCGGGGGCCACCACCTCCTCGAACAGCAGCGCCTCCAACAAGAGCTCGCTATCCGTCAAG 1141 CCCAACTACACGCTCAAGTTCACGCTGGCCGGGCACACCAAGGCGGTGTCGGCGGTCAAG 1201 TTCAGTCCGAATGGCGAGTGGCTGGCCAGCTCCTCCGCTGATAAACTAATCAAAATCTGG 1261 GGAGCATACGATGGCAAGTTCGAGAAGACCATTTCGGGCCACAAGCTGGGCATGAGCGAT 1321 GTGGCCTGGAGCTCAGACTCGCGACTCCTCGTGAGCGGCAGTGATGAGAAGACGCTCAAG 1381 
GTCTGGGAGCTGAGCACCGGGAAGAGCTTGAAAACTCTGAAGGGCCACAGCAACTATGTG 1441 TTCTGCTGCAACTTTAATCCGCAGTCCAATCTGATCGTCTCCGGCAGCTTCGACGAGAGC 1501 GTTCGCATATGGGATGTGCGCACCGGCAAGTGTCTGAAGACTCTACCCGCCCATTCCGAT 1561 CCCGTTTCGGCGGTACATTTCAATCGCGACGGATCGCTGATCGTGAGCAGCAGCTACGAC 1621 GGCCTCTGTCGCATATGGGACACGGCCAGTGGACAGTGCTTGAAAACCCTGATCGACGAC 1681 GACAATCCGCCCGTCAGCTTTGTAAAGTTCTCGCCCAATGGCAAGTACATTTTGGCCGCC 1741 ACGCTGGATAATACGCTCAAGTTGTGGGACTACTCGAAGGGCAAGTGCCTGAAGACGTAT 1801 ACGGGTCACAAGAATGAGAAGTACTGCATATTCGCCAACTTCTCGGTGACGGGAGGAAAG 1861 TGGATCGTGAGTGGCAGCGAGGACAACATGGTCTACATTTGGAATCTGCAGAGCAAGGAG 1921 GTGGTGCAAAAGCTGCAGGGACACACCGATACCGTTCTGTGCACCGCCTGCCATCCCACG 1981 GAGAACATCATTGCTTCCGCGGCGCTCGAGAACCACAAGACCATCAAGCTGTGGAAGTCG 2041 GATACATAG Aminoacidsequencesequenceof6xHis-tteRBP-WDR5fusionproteinusedin Example2 (SEQIDNO:10) MGSSHHHHHHSQDPNSSSMKEGKTIGLVISTLNNPFFVTLKNGAEEKAKELGYKIIVEDS QNDSSKELSNVEDLIQQKVDVLLINPVDSDAVVTAIKEANSKNIPVITIDRSANGGDVVS HIASDNVKGGEMAAEFIAKALKGKGNVVELEGIPGASAARDRGKGFDEAIAKYPDIKIVA KQAADFDRSKGLSVMENILQAQPKIDAVFAQNDEMALGAIKAIEAANRQGIIVVGFDGTE DALKAIKEGKMAATIAQQPALMGSLGVEMADKYLKGEKIPNFIPAELKLITKENVQGGAA SGGAAGGSSAAALEVLFQGPGMVPIGAVHGGHPGVVHPPQQPLPTAPSGPNSLQPNSVGQ PGATTSSNSSASNKSSLSVKPNYTLKFTLAGHTKAVSAVKFSPNGEWLASSSADKLIKIW GAYDGKFEKTISGHKLGISDVAWSSDSRLLVSGSDDKTLKVWELSTGKSLKTLKGHSNYV FCCNFNPQSNLIVSGSFDESVRIWDVRTGKCLKTLPAHSDPVSAVHFNRDGSLIVSSSYD GLCRIWDTASGQCLKTLIDDDNPPVSFVKFSPNGKYILAATLDNTLKLWDYSKGKCLKTY AALENDKTIKLWKSDT CodingnucleotidesequencefortteRBP-actinfusionusedin Example3(SEQIDNO:11): 1 ATGGGCAGCAGCCATCACCATCATCACCACAGCCAGGATCCGAATTCGAGCTCGATGAAA 61 GAGGGCAAAACGATTGGCCTGGTGATCTCTACCCTGAACAATCCGTTCTTTGTGACCCTG 121 AAAAATGGTGCGGAAGAAAAAGCGAAAGAACTGGGTTACAAAATTATCGTTGAAGATTCG 181 CAAAATGATTCCTCTAAAGAGCTGTCTAATGTCGAAGATTTGATTCAACAGAAAGTTGAT 241 GTTCTGCTGATCAATCCGGTGGATAGCGATGCGGTTGTTACGGCGATTAAAGAAGCGAAT 301 AGCAAAAATATCCCGGTTATTACCATCGATCGCAGCGCGAATGGTGGTGATGTTGTTTCC 361 CATATCGCCAGCGATAATGTTAAGGGTGGCGAAATGGCCGCGGAATTTATCGCGAAAGCC 421 
CTGAAAGGCAAGGGGAATGTTGTGGAACTGGAAGGTATCCCGGGGGCGTCTGCGGCACGT 481 GATCGCGGCAAAGGGTTTGATGAAGCCATTGCTAAGTATCCGGATATTAAAATCGTTGCA 541 AAGCAGGCGGCGGATTTTGATCGTTCCAAAGGTCTGTCAGTGATGGAAAACATCTTGCAA 601 GCCCAGCCGAAAATTGATGCAGTGTTTGCGCAAAATGATGAAATGGCTCTGGGCGCTATC 661 AAAGCCATTGAGGCCGCGAATCGTCAAGGTATTATTGTTGTGGGCTTTGATGGGACCGAA 721 GATGCTCTGAAAGCGATTAAAGAAGGGAAAATGGCTGCGACCATTGCGCAGCAGCCGGCC 781 CTGATGGGCTCACTGGGTGTGGAGATGGCTGATATATACCTGAAAGGTGAAAAAATTCCG 841 AACTTTATTCCGGCAGAACTGAAACTCATCACGAAAGAAAATGTGCAGGGTGGAGCGGCA 901 AGCGGGGGTGCCGCGGGTGGCAGCTCTGCGGCCGCATTAGAAGTGCTGTTTCAAGGTCCA 961 GGCATGGATTCTGAGGTTGCTGCTTTGGTTATTGATAACGGTTCTGGTATGTGTAAAGCC 1021 GGTTTTGCCGGTGACGACGCTCCTCGTGCTGTCTTCCCATCTATCGTCGGTAGACCAAGA 1081 CACCAAGGTATCATGGTCGGTATGGGTCAAAAAGACTCCTACGTTGGTGATGAAGCTCAA 1141 TCCAAGAGAGGTATCTTGACTTTACGTTACCCAATTGAACACGGTATTGTCACCAACTGG 1201 GACGATATGGAAAAGATCTGGCATCATACCTTCTACAACGAATTGAGAGTTGCCCCAGAA 1261 GAACACCCTGTTCTTTTGACTGAAGCTCCAATGAACCCTAAATCAAACAGAGAAAAGATG 1321 ACTCAAATTATGTTTGAAACTTTCAACGTTCCAGCCTTCTACGTTTCCATCCAAGCCGTT 1381 TTGTCCTTGTACTCTTCCGGTAGAACTACTGGTATTGTTTTGGATTCCGGTGATGGTGTT 1441 ACTCACGTCGTTCCAATTTACGCTGGTTTCTCTCTACCTCACGCCATTTTGAGAATCGAT 1501 TTGGCCGGTAGAGATTTGACTGACTACTTGATGAAGATCTTGAGTGAACGTGGTTACTCT 1561 TTCTCCACCACTGCTGAAAGAGAAATTGTCCGTGACATCAAGGAAAAACTATGTTACGTC 1621 GCCTTGGACTTCGAACAAGAAATGCAAACCGCTGCTCAATCTTCTTCAATTGAAAAATCC 1681 TACGAACTTCCAGATGGTCAAGTCATCACTATTGGTAACGAAAGATTCAGAGCCCCAGAA 1741 GCTTTGTTCCATCCTTCTGTTTTGGGTTTGGAATCTGCCGGTATTGACCAAACTACTTAC 1801 AACTCCATCATGAAGTGTGATGTCGATGTCCGTAAGGAATTATACGGTAACATCGTTATG 1861 TCCGGTGGTACCACCATGTTCCCAGGTATTGCCGAAAGAATGCAAAAGGAAATCACCGCT 1921 TTGGCTCCATCTTCCATGAAGGTCAAGATCATTGCTCCTCCAGAAAGAAAGTACTCCGTC 1981 TGGATTGGTGGTTCTATCTTGGCTTCTTTGACTACCTTCCAACAAATGTGGATCTCAAAA 2041 CAAGAATACGACGAAAGTGGTCCATCTATCGTTCACCACAAGTGTTTCTAA AminoacidsequencefortteRBP-actinfusionusedinExample3 (SEQIDNO:12) MGSSHHHHHHSQDPNSSSMKEGKTIGLVISTLNNPFFVTLKNGAEEKAKELGYKIIVEDS 
QNDSSKELSNVEDLIQQKVDVLLINPVDSDAVVTAIKEANSKNIPVITIDRSANGGDVVS
HIASDNVKGGEMAAEFIAKALKGKGNVVELEGIPGASAARDRGKGFDEAIAKYPDIKIVA
KQAADFDRSKGLSVMENILQAQPKIDAVFAQNDEMALGAIKAIEAANRQGIIVVGFDGTE
DALKAIKEGKMAATIAQQPALMGSLGVEMADKYLKGEKIPNFIPAELKLITKENVQGGAA
SGGAAGGSSAAALEVLFQGPGMDSEVAALVIDNGSGMCKAGFAGDDAPRAVFPSIVGRPR
HQGIMVGMGQKDSYVGDEAQSKRGILTLRYPIEHGIVTNWDDMEKIWHHTFYNELRVAPE
EHPVLLTEAPMNPKSNREKMTQIMFETFNVPAFYVSIQAVLSLYSSGRTTGIVLDSGDGV
THVVPIYAGFSLPHAILRIDLAGRDLTDYLMKILSERGYSFSTTAEREIVRDIKEKLCYV
ALDFEQEMQTAAQSSSIEKSYELPDGQVITIGNERFRAPEALFHPSVLGLESAGIDQTTY
NSIMKCDVDVRKELYGNIVMSGGTTMFPGIAERMQKEITALAPSSMKVKIIAPPERKYSV
WIGGSILASLTTFQQMWISKQEYDESGPSIVHHKCF

RBP/HRV3C Fusion Protein Amino Acid Sequence (SEQ ID NO: 13):
MGSSHHHHHHSQDPNSSSMKEGKTIGLVISTLNNPFFVTLKNGAEEKAKELGYKIIVEDS
QNDSSKELSNVEDLIQQKVDVLLINPVDSDAVVTAIKEANSKNIPVITIDRSANGGDVVS
HIASDNVKGGEMAAEFIAKALKGKGNVVELEGIPGASAARDRGKGFDEAIAKYPDIKIVA
KQAADFDRSKGLSVMENILQAQPKIDAVFAQNDEMALGAIKAIEAANRQGIIVVGFDGTE
DALKAIKEGKMAATIAQQPALMGSLGVEMADKYLKGEKIPNFIPAELKLITKENVQGGAA
SGGAAGGSSAAAGGPNTEFALSLLRKNIMTITTSKGEFTGLGIHDRVCVIPTHAQPGDDV
LVNGQKIRVKDKYKLVDPENINLELTVLTLDRNEKFRDIRGFISEDLEGVDATLVVHSNN
FTNTILEVGPVTMAGLINLSSTPTNRMIRYDYATKTGQCGGVLCATGKIFGIHVGGNGRQ
GFSAQLKKQYFVEKQ

RBP/HRV3C Fusion Protein DNA coding sequence (SEQ ID NO: 14):
1 ATGGGCAGCAGCCATCACCATCATCACCACAGCCAGGATCCGAATTCG
51 AGCTCGATGAAAGAGGGCAAAACGATTGGCCTGGTGATCTCTACCCTGAA
101 CAATCCGTTCTTTGTGACCCTGAAAAATGGTGCGGAAGAAAAAGCGAAAG
151 AACTGGGTTACAAAATTATCGTTGAAGATTCGCATAATGATTCCTCTAAA
201 GAGCTGTCTAATGTCGAAGATTTGATTCAACAGAAAGTTGATGTTCTGCT
251 GATCAATCCGGTGGATAGCGATGCGGTTGTTACGGCGATTAAAGAAGCGA
301 ATAGCAAAAATATCCCGGTTATTACCATCGATCGCAGCGCGAATGGTGGT
351 GATGTTGTTTCCCATATCGCCAGCGATAATGTTAAGGGTGGCGAAATGGC
401 CGCGGAATTTATCGCGAAAGCCCTGAAAGGCAAGGGGAATGTTGTGGAAC
451 TGGAAGGTATCCCGGGGGCGTCTGCGGCACGTGATCGCGGCAAAGGGTTT
501 GATGAAGCCATTGCTAAGTATCCGGATATTAAAATCGTTGCAAAGCAGGC
551 GGCGGATTTTGATCGTTCCAAAGGTCTGTCAGTGATGGAAAACATCTTGC
601 AAGCCCAGCCGAAAATTGATGCAGTGTTTGCGCATAATGATGAAATGGCT
651 CTGGGCGCTATCAAAGCCATTGAGGCCGCGAATCGTCAAGGTATTATTGT
701 TGTGGGCTTTGATGGGACCGAAGATGCTCTGAAAGCGATTAAAGAAGGGA
751 AAATGGCTGCGACCATTGCGCAGCAGCCGGCCCTGATGGGCTCACTGGGT
801 GTGGAGATGGCTGATAAATACCTGAAAGGTGAAAAAATTCCGAACTTTAT
851 TCCGGCAGAACTGAAACTCATCACGAAAGAAAATGTGCAGGGTGGAGCGG
901 CAAGCGGGGGTGCCGCGGGTGGCAGCTCTGCGGCCGCAGGCGGACCAAAC
951 ACAGAATTTGCACTATCCCTGTTAAGGAAAAACATAATGACTATAACAAC
1001 CTCAAAGGGAGAGTTCACAGGGTTAGGCATACATGATCGTGTCTGTGTGA
1051 TACCCACACACGCACAGCCTGGTGATGATGTACTAGTGAATGGTCAGAAA
1101 ATTAGAGTTAAGGATAAGTACAAATTAGTAGATCCAGAGAACATTAATCT
1151 AGAGCTTACAGTGTTGACTTTAGATAGAAATGAAAAATTCAGAGATATCA
1201 GGGGATTTATATCAGAAGATCTAGAAGGTGTGGATGCCACTTTGGTAGTA
1251 CATTCAAATAACTTTACCAACACTATCTTAGAAGTTGGCCCTGTAACAAT
1301 GGCAGGACTTATTAATTTGAGTAGCACCCCCACTAACAGAATGATTCGTT
1351 ATGATTATGCAACAAAAACTGGGCAGTGTGGAGGTGTGCTGTGTGCTACT
1401 GGTAAGATCTTTGGTATTCATGTTGGCGGTAATGGAAGACAAGGATTTTC
1451 AGCTCAACTTAAAAAACAATATTTTGTAGAGAAACAATAA

RBP/MDM2 fusion protein amino acid sequence (SEQ ID NO: 15):
MGSSHHHHHHSQDPNSSSMKEGKTIGLVISTLNNPFFVTLKNGAEEKAKE
LGYKIIVEDSQNDSSKELSNVEDLIQQKVDVLLINPVDSDAVVTAIKEAN
SKNIPVITIDRSANGGDVVSHIASDNVKGGEMAAEFIAKALKGKGNVVEL
EGIPGASAARDRGKGFDEAIAKYPDIKIVAKQAADFDRSKGLSVMENILQ
AQPKIDAVFAQNDEMALGAIKAIEAANRQGIIVVGFDGTEDALKAIKEGK
MAATIAQQPALMGSLGVEMADKYLKGEKIPNFIPAELKLITKENVQGGAA
SGGAAGGSSAARLQVDKLAAALEVLFQGPGMCNTNMSVPTDGAVTTSQIP
ASEQETLVRPKPLLLKLLKSVGAQKDTYTMKEVLFYLGQYIMTKRLYDEK
QQHIVYCSNDLLGDLFGVPSFSVKEHRKIYTMIYRNLVVVNQQESSDSGT
SVSENRCHLEGGSDQKDLVQELQEEKPSSSKLVSRPSTSSRRRAISETEE
NSDELSGERQRKRHKSDSISLSFDESLALCVIREICCERSSSSESTGTPS
NPDLDAGVSEHSGDWLDQDSVSDQFSVEFEVESLDSEDYSLSEEGQELSD
EDDEVYQVTVYQAGESDTDSFEEDPEISLADYWKCTSCNEMNPPLPSHCN
RCWALRENWLPEDKGKDKGEISEKAKLENSTQAEEGFDVPDCKKTIVNDS
RESCVEENDDKITQASQSQESEDYSQPSTSSSIIYSSQEDVKEFEREETQ
DKEESVESSLPLNAIEPCVICQGRPKNGCIVHGKTGHLMACFTCAKKLKK
RNKPCPVCRQPIQMIVLTYFP

RBP/MDM2 fusion protein coding nucleotide sequence (SEQ ID NO: 16):
1 CCATGGGCAGCAGCCATCACCATCATCACCACAGCCAGGATCCGAATTCG
51 AGCTCGATGAAAGAGGGCAAAACGATTGGCCTGGTGATCTCTACCCTGAA
101 CAATCCGTTCTTTGTGACCCTGAAAAATGGTGCGGAAGAAAAAGCGAAAG
151 AACTGGGTTACAAAATTATGGTTGAAGATTCGCAAAATGATTCCTCTAAA
201 GAGCTGTCTAATGTCGAAGATTTGATTCAACAGAAAGTTGATGTTCTGCT
251 GATCAATCCGGTGGATAGCGATGCGGTTGTTACGGCGATTAAAGAAGCGA
301 ATAGCAAAAATATCCCGGTTATTACCATCGATCGCAGCGCGAATGGTGGT
351 GATGTTGTTTCCCATATCGCCAGCGATAATGTTAAGGGTGGCGAAATGGC
401 CGCGGAATTTATCGCGAAAGCCCTGAAAGGCAAGGGGAATGTTGTGGAAC
451 TGGAAGGTATCCCGGGGGCGTCTGCGGCACGTGATCGCGGCAAAGGGTTT
501 GATGAAGCCATTGCAAAGTATCCGGATATTAAAATCGTTGCAAAGCAGGC
551 GGCGGATTTTGATCGTTCCAAAGGTCTGTCAGTGATGGAAAACATCTTGC
601 AAGCCCAGCCGAAAATTGATGCAGTGTTTGCGCAAAATGATGATATGGCT
651 CTGGGCGCTATCAAAGCCATTGAGGCCGCGAATCGTCAAGGTATTATTGT
701 TGTGGGCTTTGATGGGACCGAAGATGGTCTGAAAGCGATTAAAGAAGGGA
751 AAATGGCTGCGACCATTGCGCAGCAGCCGGCCCTGATGGGCTCACTGGGT
801 GTGGAGATGGCTGATAAATACCTGAAAGGTGAAAAAATTCCGAACTTTAT
851 TCCGGCAGAACTGAAACTCATCACGAAAGAAAATGTGCAGGGTGGAGCGG
901 CAAGCGGGGGTGCCGCGGGTGGCAGCTCTGCGGCGCGCCTGCAGGTCGAC
951 AAGCTTGCGGCCGCATTAGAAGTGCTGTTTCAAGGTCCAGGCATGTGCAA
1001 TACCAACATGTCTGTACCTACTGATGGTGCTGTAACCACCTCACAGATTC
1051 CAGCTTCGGAACAAGAGACCCTGGTTAGACCAAAGCCATTGCTTTTGAAG
1101 TTATTAAAGTCTGTTGGTGCACAAAAAGACACTTATACTATGAAAGAGGT
1151 TCTTTTTTATCTTGGCCAGTATATTATGACTAAACGATTATATGATGAGA
1201 AGCAACAACATATTGTATATTGTTCAAATGATCTTCTAGGAGATTTGTTT
1251 GGCGTGCCAAGCTTCTCTGTGAAAGAGCACAGGAAAATATATACCATGAT
1301 CTACAGGAACTTGGTAGTAGTCAATCAGCAGGAATCATCGGACTCAGGTA
1351 CATCTGTGAGTGAGAACAGGTGTCACCTTGAAGGTGGGAGTGATCAAAAG
1401 GACCTTGTACAAGAGCTTCAGGAAGAGAAACCTTCATCTTCACATTTGGT
1451 TTCTAGACCATCTACCTCATCTAGAAGGAGAGCAATTAGTGAGACAGAAG
1501 AAAATTCAGATGAATTATCTGGTGAACGACAAAGAAAACGCCACAAATCT
1551 GATAGTATTTCCCTTTCCTTTGATGAAAGCCTGGCTCTGTGTGTAATAAG
1601 GGAGATATGTTGTGAAAGAAGCAGTAGCAGTGAATCTACAGGGACGCCAT
1651 CGAATCCGGATCTTGATGCTGGTGTAAGTGAACATTCAGGTGATTGGTTG
1701 GATCAGGATTCAGTTTCAGATCAGTTTAGTGTAGAATTTGAAGTTGAATC
1751 TCTCGACTCAGAAGATTATAGCCTTAGTGAAGAAGGACAAGAACTCTCAG
1801 ATGAAGATGATGAGGTATATCAAGTTACTGTGTATCAGGCAGGGGAGAGT
1851 GATACAGATTCATTTGAAGAAGATCCTGAAATTTCCTTAGCTGACTATTG
1901 GAAATGCACTTCATGCAATGAAATGAATCCCCCCCTTCCATCACATTGCA
1951 ACAGATGTTGGGCCCTTCGTGAGAATTGGCTTCCTGAAGATAAAGGGAAA
2001 GATAAAGGGGAAATCTCTGAGAAAGCCAAACTGGAAAACTCAACACAAGC
2051 TGAAGAGGGCTTTGATGTTCCTGATTGTAAAAAAACTATAGTGAATGATT
2101 CCAGAGAGTCATGTGTTGAGCAAAATGATGATAAAATTACACAAGCTTCA
2151 CAATCACAAGAAAGTGAAGACTATTCTCAGCCATCAACTTCTAGTAGCAT
2201 TATTTATAGCAGCCAAGAAGATGTGAAAGAGTTTGAAAGGGAAGAAACCC
2251 AAGAGAAAGAAGAGAGTGTGGAATCTAGTTTGCCCCTTAATGCCATTGAA
2301 CCTTGTGTGATTTGTCAAGGTCGACCTAAAAATGGTTGCATTGTCCATGG
2351 CAAAACAGGACATCTTATGGCCTGCTTTACATGTGCAAAGAAGCTAAAGA
2401 AAAGGAATAAGCCCTGCCCAGTATGTAGACAACCAATTCAAATGATTGTG
2451 CTAACTTATTTCCCCTAGCTCGAGTCTGGTAAAGAAACCGCTGCTGCGAA
2501 ATTTGAACGCCAGCACATGGACTCGTCTACTAGCGCAGC

(78) While the invention has been described through specific embodiments, routine modifications will be apparent to those skilled in the art and such modifications are intended to be within the scope of the present invention.

CPC classification (all under CHEMISTRY; METALLURGY): C12N2510/02, C07K14/395, C12N15/62, C12N15/74, C07K2319/00, C12N15/66, C07K19/00, C12N15/79, C12N2310/3519, C07K2319/10, C12N15/70, C12N2310/3517, C12N2810/50, C07K14/245, C07K2319/50, C12N15/65

International classification (all under CHEMISTRY; METALLURGY): C12N15/62, C12N15/65, C12N15/66, C12N15/74, C12N15/70, C07K14/395, C07K14/245, C12N15/79, C07K19/00